Writing tests to compare array results from synphot and pysynphot was
difficult. The problem had two elements:

  - produce arrays to be compared on the same waveset
  - devise a comparison that was reasonably insensitive to numerical noise

Producing arrays on the same waveset
-------------------------------------

Because pysynphot was deliberately designed to correct some of synphot's
deficiencies in waveset handling, it is clear that the spectral arrays
produced by the two software packages will be different, unless
special steps are taken to force them onto the same waveset. 

testcrspec:
  This test uses the synphot.countrate task, with a box(15000,30000)
  to simulate the obsmode to produce the source spectrum unconvolved with the
  instrumental obsmode. The task uses a wavecat file to select the
  wavelength table used based on the obsmode. For this test, a special
  purpose wavecat was produced for each run, so that the wavelength
  table in the pysynphot-computed spectrum would be used.

testthru:
  This test samples the pysynphot-computed throughput array on the
  wavelength table used for the synphot-computed throughput.

testcrphotlam:
  For this test, the synphot.countrate task selects its wavelength
  table from the wavecat based on the obsmode, and the pysynphot
  Observation selects its binned wavelength set from the same wavecat
  based on the obsmode. Thus the synphot spectrum can be compared
  directly to the binned tables associated with the pysynphot
  Observation.

testcrcounts:
  For this test, the wavelength table from the synphot-computed
  spectrum is used as the binned wavelength set when constructing
  the pysynphot Observation. 


Performing the array comparisons
--------------------------------

Constructing a meaningful comparison of array numerical data that will
be sensitive to significant differences, but insensitive to
insignificant differences, is a difficult task. This was the most
volatile element of the commissioning test software, as we
experimented with various approaches; details can be seen in the
revision log. This document describes the final approach used.

 - After confirming that the arrays are the same size, exclude the two
   endpoints on either end of the array from the comparison. This step
   was introduced when it was observed that the endpoints were often
   significantly discrepant due to a difference in waveset handling.

 - For each array, identify its significant elements by selecting
   elements that were greater than SIGTHRESH*the array maximum. The
   value of SIGTHRESH was tunable for test runs but is typically on
   the order of 0.01. If the two arrays have different sets of
   significant elements, the test fails.

 - From the set of significant elements, further select the set of
   elements for which the synphot array is nonzero. Then, using this
   subset of elements, construct the discrepancy array (test-ref)/ref.

 - If any elements in the discrepancy array exceed the threshold
   THRESH (also tunable, typically set to 0.01), the test fails.

 - If more than SUPERTHRESH (typically 20%) of the elements in the
   discrepancy array exceed the threshold, the test is considered an
   "extreme failure".

 - The min, max, mean, and std of the discrepancy array were also
   reported, as were the number of 5-sigma outliers.
