forked from spacetelescope/pysynphot
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathetc_pysynphot_testplan.txt
115 lines (83 loc) · 4.45 KB
/
etc_pysynphot_testplan.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
Test plan for ETC-related category of tests
Summary:
There exists a large number of ETC regression test cases with answers,
provided by INS. However, these are not suitable for directly testing
of pysynphot, because they perform comparisons of values calculated by
the ETC based on pysynphot results, and other things.
Therefore, the (py)synphot commands will be extracted from these test
cases; we will then run the commands directly through the equivalent
pairs of synphot/pysynphot commands. This permits a direct comparison
between synphot and pysynphot; discrepant results will be submitted to
INS for assessment.
Once pysynphot has passed this large suite of tests, the pysynphot
answers will be accepted as "truth", and the tests will be modified
into true regression tests, using a specific named set of CDBS
files; this is how most pysynphot tests are run to protect against
"false positive" test failures due only to a change in data files.
Test definitions:
For each test case, consisting of a spectrum specification, an obsmode
specification (usually), and a taskname, the following tests will be
performed:
- Compare the throughput arrays for the obsmode (syn calcband/pysyn)
calcphotCase.testthru
- Compare the spectrum arrays in photlam (syn calcspec/pysyn)
calcspecCase.testspecphotlam
- Compare the resulting (spectrum through obsmode) arrays in photlam
(syn calcspec/syn countrate/pysyn)
countrateCase.testcsphotlam, countrateCase.testcrphotlam
- Compare the resulting (spectrum through obsmode) arrays in counts
(syn calcspec/syn countrate/pysyn)
countrateCase.testcscounts, countrateCase.testcrcounts
- Compare any scalar values calculated (countrate, efflam, ...)
calcphotCase.testefflam
countrateCase.testcountrate
Array comparisons:
Prior to r465, array comparisons were performed strictly, applying the
same (pysyn-syn)/syn metric as for scalar elements to each element of
the array. If any elements failed, then the test failed; the fraction
of array elements that failed was compared to the superthreshold to
determine extreme failures.
This generated a large number of spurious failures because of
differences between very small numbers in the domain where the
spectrum was essentially zero. With r465, the array comparisons were
performed only on "significant" array elements, where:
- a "significant" element has a value at least SigThresh*max(array)
- if the set of significant elements for the two arrays differs,
the test fails
- the (pysyn-syn)/syn comparison is performed only on the
significant elements.
The SigThresh was initially set to 0.005.
Array comparisons will (eventually) be performed in two stages, so as
to apply a looser acceptance criteria at locations that are likely to
exhibit numerical discrepancies (eg, steep gradients, inflection
points) and a tighter acceptance criteria elsewhere.
Another possibility is to calculate the mean, std, and extrema of the
difference array after some clipping has been applied
Test results:
Discrepancies greater than some pre-agreed percentage
will be brought to the attention of the instrument teams.
Some analysis will also be performed on the test results -- e.g.,
histogram of discrepancies, mean & std. The pass/fail discrepancy will
probably be updated based on such analysis.
Other comments:
Waveset issues require special handling:
- for COUNTRATE comparisons, both synphot and pysynphot will produce
spectra (in counts) on the same optimally binned wavelength
set. These can be compared directly.
- for all other comparisons of flux arrays, the synphot and
pysynphot calculations will be performed independently, each in
their native wavesets. Then the waveset from the synphot spectrum
will be used to sample the pysynphot spectrum, and those results
will be compared.
We may additionally wish to design some tests intended to validate the
pysynphot native results.
Both synphot and pysynphot use photlam as their native or internal
units for calculations; thus making these comparisons will be
informative.
Calculations that are performed in counts are likely to show greater
discrepancies than those in any other units, because in order to
compute counts, one must integrate over wavelength (which implies
multiplying by the bin size, which implies computing delta-wavelength,
which is susceptible to very small numerical variations) and area
(which implies multiplying by the area of the HST mirror, which will
amplify those small variations).