MULTISPEC (Dec83) Multispec Algorithms MULTISPEC (Dec83)
Algorithms for the Multi-Spectra Extraction Package
Analysis and Discussion
December 2, 1983
1. Disclaimer
This should not be taken as a statement of how the algorithms of
the final package should function; this is merely an analysis and
discussion of the algorithms, and should be followed by further
discussion before we decide what course to follow in the final
package. We may very well decide that the level of effort required to
implement rigorously correct nonlinear fitting algorithms is not
justified by the expected scientific usage of the package. Before we
can decide that, though, we need an accurate estimate of the level of
effort required.
In attacking nonlinear surface fitting problems it is important to
recognize that almost any technique can be made to yield a result
without the program crashing.  Production of a result (extraction of a
spectrum) does not mean that the algorithm converged, that the
solution is unique, that the model is accurate, or that the
uncertainties in the computed coefficients have been minimized.
2. Multispec Flat (pg. 4)
This sounds like a classical high pass filter and might be best
implemented via convolution.  Using a convolution operator with a
numerical kernel has the advantage that the filter can be easily
modified by resampling the kernel or by changing the size of the
kernel.  It is also quite efficient.  The boundary extension feature
of IMIO makes it easy to deal with the problem of the kernel
overlapping the edge of the image in a convolution.  Since the
convolution is one-dimensional (the image is only filtered in Y), it
will always be desirable to transpose the image so that the filter
runs along the stored image lines.
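To make this concrete, here is a minimal sketch in Python/NumPy
(standing in for the IRAF operators; the boxcar kernel and its width
are purely illustrative, and the "nearest" boundary mode plays the
role of the IMIO boundary extension feature):

    import numpy as np
    from scipy.ndimage import convolve1d

    def highpass_y(image, width=15):
        # Filter along Y (taken to be axis 0 here; in practice the image
        # would be transposed first so the filtered axis is the fast one).
        # Low pass kernel: a simple boxcar; the filter is modified by
        # resampling or resizing this kernel.
        kernel = np.full(width, 1.0 / width)
        smoothed = convolve1d(image, kernel, axis=0, mode='nearest')
        return image - smoothed     # high pass = data minus low pass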
The method used to detect and reject bad pixels (eqn 1) is not correct.
The rejection criterion should be invariant with respect to a scaling
of the pixel values.  If the data has gone through very much
processing (e.g., dtoi on photographic data), the relation between
photon counts and pixel value may be linear, but the scale is
unknown.  Rejection by comparison of a data value to a "smoothed"
value is more commonly done as follows:
reject if: abs (observed - smoothed) > (K * sigma)
where sigma is the noise sigma of the data, generally a function of
the signal.
It is often desirable in rejection algorithms to be able to specify,
as an option, that all pixels within a specified radius of a bad pixel
be rejected, rather than just the pixel which was detected.  This is
unnecessary only if the bad pixels are single pixel events (no
wings).  Region rejection makes an iterative rejection scheme converge
faster, as well as rejecting the faint wings of the contaminated
region.
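A minimal sketch of scale-invariant rejection with optional region
rejection, assuming the caller supplies a noise model sigma(s) as a
function of the signal (all names here are illustrative):

    import numpy as np
    from scipy.ndimage import binary_dilation

    def reject_pixels(observed, smoothed, sigma_of, k=3.0, grow=0):
        # sigma_of: callable returning the noise sigma for a given signal
        sigma = sigma_of(smoothed)
        bad = np.abs(observed - smoothed) > k * sigma
        if grow > 0:
            # Region rejection: also flag every pixel within roughly
            # `grow` pixels of a detected bad pixel, catching the wings.
            bad = binary_dilation(bad, iterations=grow)
        return bad      # boolean mask of rejected pixels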
2.1 Dividing by the Flat (pg. 5)
There is no mention of any need for registering the flat with the
data field. Is it safe to assume that the quartz and the object
frames are precisely registered? What if the user does in fact
average several quartz frames taken over a period of time? (Image
registration is a general problem that is probably best left until
solved in IMAGES).
3. Multiap Extraction (pg. 5-6, 8-13)
The thing that bothers me most about the modeling and extraction
process is that the high signal to noise quartz information is not
used to full advantage, and the background is not fitted very
accurately.  The present algorithms will work well for high signal to
noise data, but will result in large (percentage) errors for faint
spectra.
Basically, it seems to me that the high signal to noise quartz spectra
should, in many cases, be used to determine the position and shape of
the spectral lines. This is especially attractive since the quartz
and spectra appear to be closely registered.  Furthermore, if the
position-shape solution and extraction procedures are separate
procedures, there is nothing to prevent one from applying both to the
object spectrum if necessary for some reason (e.g., poor registration,
better signal to noise in the object spectrum in the region of
interest, signal dependent distortions, lack of a quartz image, etc.
would all justify use of the object frame).  It should be possible to
model either the quartz or the object frame, and to reuse a model for
more than one extraction.
Let us divide the process up into two steps, "modeling", and
"extraction" (as it is now). The "calibration frame" may be the
quartz, an averaged quartz, or the object frame. Ideally it will have
a high signal to noise ratio and any errors in the background should
be negligible compared to the signal.
We do not solve for the background while modeling the calibration
frame; we assume that the background has been fitted by any of a
variety of techniques and a background frame written before the
calibration frame is modeled. A "swath" is the average of several
image lines, where an image line runs across the dispersion, and a
column along the dispersion.
3.1 Modeling
I would set the thing up to start fitting at any arbitrary swath,
rather than the first swath, because it is not much harder, and there
is no guarantee that the calibration frame will have adequate signal to
noise in the first swath (indeed, often the lowest signal to noise will
be found there).  We define the "center" swath as the first swath to
be fitted, corresponding to the highest signal to noise region of the
calibration frame. By default the center swath should be the swath
used by find_spectra, especially if there is significant curvature in
the spectra.
algorithm model_calibration_frame
begin
        extract center swath
        initialize coeff using centers from find_spectra
        model center swath (nonlinear)

        for (successive swaths upward to top of frame) {
                extract swath
                initialize coeff to values from last fit
                model swath (nonlinear)
                save coeff in datafile
        }

        set last-fit coeff to values for center swath
        for (successive swaths downward to bottom of frame) {
                extract swath
                initialize coeff to values from last fit
                model swath (nonlinear)
                save coeff in datafile
        }

        smooth model coeff (excluding intensity) along the dispersion
                [high freq variations in spectrum center and shape
                 from line to line are nonphysical]

        variance of a coeff at line Y from the smoothed model value
        is a measure of the uncertainty in that coeff
end
I would have the background fitting routine write as output a
background frame, the name of which would be saved in the datafile,
rather than saving the coeff of the bkg fit in the datafile. The
background frame may then be produced by any of a number of
techniques; storing the coefficients of the bkg fit in the datafile
limits the technique used to a particular model. For similar reasons,
the standard bkg fitting routine should be broken up into a module
which determines the region to be fitted, and a module which fits the
bkg pixels and writes the bkg image.
For example, if the default background fitting routine is a line by
line routine, the output frame could be smoothed to remove the
(nonphysical) fluctuations in the background from line to line. A
true two dimensional background fitting routine may be added later
without requiring modifications to the datafile. Second order
corrections could be made to the background by repeating the solution
using the background fitted by the extraction procedure.
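As an illustration of this division of labor (the routine names are
hypothetical, and a simple line by line polynomial fit stands in for
whatever technique is actually chosen), note that the smoothing along
the dispersion is applied to the finished frame, not to stored
coefficients:

    import numpy as np
    from scipy.ndimage import uniform_filter1d

    def fit_background_frame(image, bkg_mask, order=2, smooth_lines=9):
        # bkg_mask: boolean mask selecting the between-spectra pixels,
        # produced by the separate region-finding module.
        ny, nx = image.shape
        x = np.arange(nx)
        bkg = np.empty_like(image, dtype=float)
        for y in range(ny):
            sel = bkg_mask[y]
            coeff = np.polyfit(x[sel], image[y, sel], order)
            bkg[y] = np.polyval(coeff, x)
        # Smooth along the dispersion (axis 0) to remove the nonphysical
        # line to line fluctuations in the fitted background.
        return uniform_filter1d(bkg, size=smooth_lines, axis=0)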
procedure extract_swath
begin
        extract raw swath from calibration frame
        extract raw swath from background frame
        return (calib swath minus bkg swath)
end
The algorithm used to simultaneously model all spectra in a swath
across the dispersion is probably the most difficult and time consuming
part of the problem.  The problem is nonlinear in all but one of the
four or more parameters for each spectrum.  You have spent a lot of
time on this and we are probably not going to be able to improve on
your algorithms significantly, though the generation of the matrix in
each step can probably be optimized significantly.
The analytic line-profile model is the most general and should work
for most instruments with small circular apertures, even in the
presence of severe distortions.  It should be possible, however, to
fit a simpler model given by a lookup table, solving only for the
position and height of each spectrum.  This model may be adequate for
many instruments, should be much faster to fit, and may produce more
accurate results since there are fewer parameters in the fit.  The
disadvantage of an empirical model is that it must be accurately
interpolated (including the derivatives), requiring use of spline
interpolation or a similar technique (I have tried linear and it is
not good enough).  Vesa has implemented procedures for fitting splines
and evaluating their derivatives.
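A sketch of the empirical profile machinery (scipy splines standing in
for Vesa's procedures; the gaussian used to sample the profile below
is only a stand-in for a profile measured from high signal to noise
data):

    import numpy as np
    from scipy.interpolate import CubicSpline

    # Sample the profile from high signal to noise data, then fit a
    # cubic spline so the profile and its derivative can be evaluated
    # accurately anywhere (linear interpolation is not good enough).
    x_samples = np.linspace(-8.0, 8.0, 33)
    p_samples = np.exp(-0.5 * (x_samples / 2.0) ** 2)

    profile = CubicSpline(x_samples, p_samples)
    dprofile = profile.derivative()

    # Model of one spectrum: I * p(x - c).  Derivatives for the fit:
    #   d/dI = p(x - c),   d/dc = -I * p'(x - c)
    def model_and_derivs(x, I, c):
        u = x - c
        return I * profile(u), profile(u), -I * dprofile(u)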
Fitting the empirical model simultaneously to any number of spectra
should be straightforward provided the signal to noise is reasonable,
since there are few parameters in the fit and the matrix is banded
(the Marquardt algorithm would work fine).  However, if you ever have
to deal with data where a very faint or nonexistent spectrum is next to
a bright one, it may be difficult to constrain the fit.  I doubt if
the present approach of smoothing the coeff across the dispersion and
iterating would work in such a case.  The best approach might be to
fix the center of the faint spectrum relative to the bright one once
the signal drops below a certain level, or to drop it from the fit
entirely. This requires that the matrix be able to change size during
the fit.
algorithm fit_empirical_model
begin
        [upon entry, we already have an initial estimate of the coeff]

        # Marquardt (gradient expansion) algorithm.  Make a 2nd order
        # Taylor expansion of chisquare near the minimum and solve for
        # the correction vector which puts us at the minimum (subject
        # to the Taylor approximation).  The approximation rapidly
        # becomes better as we near the minimum of the multidimensional
        # chisquare, hence convergence is extremely rapid given a good
        # starting estimate.

        repeat {
                evaluate curvature matrix using current coeff
                solve banded curvature matrix
                compute error matrix
                for (each spectrum)
                        if (uncertainty in center coeff > tol) {
                                fix center by estimation given relative
                                    spacing in higher signal region
                                remove spectrum center from solution
                        }
                if (no center coeff were rejected)
                        tweak correction vector to accelerate convergence
                new coeff vector = old coeff vector + correction vector
                compute norm of correction vector
        } until (no more center coeff rejected and norm < tolerance)

        compute final uncertainties
end
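For discussion, here is a dense sketch of the Marquardt update in
Python (the production code would use a banded solver plus the center
rejection logic above; this shows only the damped step and the test on
the norm of the correction vector):

    import numpy as np

    def marquardt(residual, jacobian, coeff, lam=1e-3, tol=1e-6,
                  maxiter=50):
        # residual(coeff) -> data minus model; jacobian(coeff) -> J
        for _ in range(maxiter):
            r = residual(coeff)
            J = jacobian(coeff)
            A = J.T @ J              # curvature matrix (banded in practice)
            b = J.T @ r
            A[np.diag_indices_from(A)] *= 1.0 + lam  # Marquardt damping
            delta = np.linalg.solve(A, b)
            if np.sum(residual(coeff + delta) ** 2) < np.sum(r ** 2):
                coeff = coeff + delta   # chisquare decreased: accept
                lam *= 0.1              # trust the Taylor expansion more
            else:
                lam *= 10.0             # back off toward gradient descent
            if np.linalg.norm(delta) < tol:
                break                   # norm of correction vector small
        return coeff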
The following is close to what is currently done to fit the analytic
model, as far as I can remember (I have modified it slightly to
stimulate discussion). The solution is broken up into two parts to
reduce the number of free parameters and increase stability. If the
uncertainty in a free parameter becomes large it is best to fix the
parameter (it is particularly easy for this data to estimate all but
the intensity parameter). A fixed parameter is used in the model and
affects the solution but is not solved for (i.e., like the background).
The analytic fit will be rather slow, even if the outer loop is
constrained to one iteration.  If it takes (very rough estimates) .5
sec to set up the banded matrix and .3 sec to solve it, with two
solutions per iteration (the first and second order passes below) and
3 iterations to convergence, we have about 5 sec per swath.  If an 800
line image is broken into swaths of 32 lines (25 swaths), the total is
125 sec per image (to within a factor of 5).
algorithm fit_analytic_model
begin
        [upon entry, we already have an initial estimate of the coeff]

        repeat {
                save coeff
                solve for center, height, width of each line with second
                    order terms fixed (but not necessarily zero)
                apply constraints on line centers and widths
                repeat solution adding second order coeff (shape terms)
                compute error matrix
                for (each coeff)
                        if (uncertainty in coeff > tol) {
                                fix coeff value to reasonable estimate
                                remove coeff from solution
                        }
                compute total correction vector given saved coeff
                if (no coeff were rejected)
                        tweak correction vector to accelerate convergence
                compute norm of correction vector
        } until (no additional coeff rejected and norm < tolerance)

        compute final uncertainties
end
3.2 Extraction
The purpose of extraction is to compute the integral of each spectrum
across the dispersion, producing I(y) for each spectrum.  An estimate
of the uncertainty U(y) should also be produced.  The basic extraction
techniques are summarized below.  The number of spectra, the spectrum
centers, and the width and shape parameters are taken from the model
fitted to the calibration frame as outlined above.  We make a
simultaneous solution for the profile heights and the background (a
linear problem), repeating the solution independently for each line in
the image.  For a faint spectrum, it is essential to determine the
background accurately, and we can do that safely here since the matrix
will be very well behaved.
(1) Aperture sum.  All of the pixels within a specified radius of
    the spectrum center are summed to produce the raw integral.  The
    background image is also summed and subtracted to yield the
    final integral.  The radius may be a constant or a function of
    the width of the profile.  Fractional pixel techniques should
    be used to minimize sampling effects at the boundaries of the
    aperture.  Pixel rejection is not possible since there is no
    fitted surface.  The model is used only to get the spectrum
    center and width.  This technique is fastest and may be best
    if the profile is difficult to model, provided the spectra are
    not crowded.
(2) Weighted aperture sum.  Like (1), except that a weighting
    function is applied to correct for the effects of crowding.
    The model is fitted to each object line, solving for I
    (height) and B (background) with all other parameters fixed.
    This is a linear solution of a banded matrix and should be
    quite fast provided the model can be sampled efficiently to
    produce the matrix.  It is possible to iterate to reject bad
    pixels.  The weight for a spectrum at a data pixel is the
    fractional contribution of that spectrum to the integral of the
    contributions of all spectra (see the sketch following this
    list).
(3) Fit and integrate the model.  The model is fitted as in (2) to
    the data pixels, but the final integral is produced by
    integrating the model.  This technique should be more
    resistant to noise in the data than (2), because we are
    using the high signal to noise information in the model to
    constrain the integral.  More accurate photometric results
    should therefore be possible.
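The following sketch shows the linear solution underlying (2) and (3)
for a single image line (dense least squares standing in for the
banded solution; the unit-height profiles come from the fitted model):

    import numpy as np

    def extract_line(data, profiles):
        # data: one image line (length nx); profiles: (nspec, nx) array
        # of unit-height model profiles evaluated at this line.
        nspec, nx = profiles.shape
        A = np.vstack([profiles, np.ones(nx)]).T  # columns: profiles, bkg
        coeff, *_ = np.linalg.lstsq(A, data, rcond=None)
        heights, background = coeff[:nspec], coeff[nspec]
        model = heights[:, None] * profiles
        # Method (2): weighted sum of the data, the weight of spectrum k
        # at pixel i being its fractional contribution to the total model.
        frac = model / np.maximum(model.sum(axis=0), 1e-30)
        weighted = (frac * (data - background)).sum(axis=1)
        # Method (3): integrate the fitted model directly.
        integrals = model.sum(axis=1)
        return heights, background, weighted, integrals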
Method (2) has the advantage that the integral is invariant with
respect to scale errors in the fitted models, provided the same error
is made in each model.  Of course, the same error is unlikely to be
made in all models contributing to a point; it is more likely that an
error will put more energy into one spectrum at the expense of its
neighbors.  In the limit as the spectra become less crowded, however,
the effects of errors in neighboring spectra become small and the
weighted average technique looks good; it becomes quite insensitive to
errors in the model and in the fit.  For crowded spectra there seems
to be no alternative to a good multiparameter fit.  For faint spectra
method (3) is probably best, and fitting the background accurately
becomes crucial.
In both (2) and (3), subtraction of the scaled models yields a residual
image which can be used to evaluate at a glance the quality of the fit.
Since almost all of the effort in (2) and (3) is in the least squares
solution and the pixel rejection, it might be desirable to produce two
integrals (output spectra), one for each algorithm, as well as the
uncertainty vector (computed from the covariance matrix, not the
residual).
3.3 Smoothing Coefficient Arrays
In several places we have seen the need for smoothing coefficient
arrays.  The use of polynomials for smoothing is questionable unless
the order of the polynomial is low (3 or less).  High order
polynomials are notoriously bad near the endpoints of the fitted
array, unless the data curve happens to be a noisy low order
polynomial (rare, to say the least).  Convolution or piecewise
polynomial functions (e.g., the natural cubic smoothing spline) should
be considered if there is any reason to believe that the coefficient
array being smoothed may have high frequency components which are
physical and must be followed (e.g., a bend or kink).
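For instance, a smoothing spline (scipy's, as a stand-in) can follow a
physical bend that a low order polynomial would smear out; the
smoothing parameter controls how closely the data are followed:

    import numpy as np
    from scipy.interpolate import UnivariateSpline

    def smooth_coeff(y, coeff, uncertainty, s_factor=1.0):
        # Weight each swath by the inverse of its coeff uncertainty;
        # a larger smoothing parameter s gives a smoother curve.
        w = 1.0 / np.asarray(uncertainty)
        spline = UnivariateSpline(y, coeff, w=w, s=s_factor * len(y))
        return spline(y)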
3.4 Weighting (pg. 11)
The first weighting scheme (1 / sqrt (data)) seems inverted to me.
The noise goes up with the signal, to be sure, but the signal to
noise usually goes up faster.  It seems to me the weight estimate
should be sqrt(data).  It also makes more sense to weight the least
blended (peak) areas most.
3.5 Rejection criteria (pg. 13)
The same comments apply to this rejection criterion as in section
2.  I assume that "(data - model)" is supposed to be "abs (data -
model)".
3.6 Uncertainties and Convergence Criteria
I got the impression that you were using the residual of the data
minus the fitted surface both as the convergence criterion and as a
measure of the errors in the fit.  It is neither; assuming a perfect
model, the residual gives only a measure of the noise in the data.
Using the residual to establish a convergence criterion seems
reasonable except that it is hard to reliably say what the criterion
should be.  Assuming that the algorithm converges, the value of the
residual when convergence is achieved is in general hard to predict,
so it seems to me to be difficult to establish a convergence
criterion.  The conventional way to establish when a nonlinear fit
has converged is by measuring the norm of the correction vector.  When
the norm becomes less than some small number the algorithm is said to
have converged.  The multidimensional chisquare surface is parabolic
near the minimum and a good nonlinear algorithm will converge very
rapidly once it gets near the minimum.
The residual is a measure of the overall goodness of fit, but tells us
nothing about the uncertainties in the individual coefficients of the
model. The uncertainties in the coefficients are given by the
covariance or error matrix (see Bevington pg. 242). It is ok to push
forward and produce an extraction if the algorithm fails to converge,
but ONLY provided the code gives a reliable estimate of the
uncertainty.
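In terms of the Marquardt sketch above, the error matrix is the
inverse of the (undamped) curvature matrix, and the uncertainty of
each coefficient is the square root of the corresponding diagonal
element (cf. Bevington):

    import numpy as np

    def coeff_uncertainties(J, r):
        # J: Jacobian at the solution; r: final residual vector.
        dof = len(r) - J.shape[1]
        scale = np.sum(r ** 2) / dof          # reduced chisquare scale
        covariance = scale * np.linalg.inv(J.T @ J)
        return np.sqrt(np.diag(covariance))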
3.7 Evaluating the Curvature Matrix Efficiently
The most expensive part of the reduction process is probably
evaluating the model to form the curvature matrix at each iteration in
the nonlinear solution. The most efficient way to do this is to use
lookup tables. If the profile shape does not change, the profile can
be sampled, fitted with a spline, and the spline evaluated to get the
zero through second derivatives for the curvature matrix.  This can be
done even if the width of the profile changes, via a linear change of
variable in the profile argument.  If the shape of the profile has to
change, it is still possible to sample either the gaussian or the
exponential function, with major savings in computation time.
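Continuing the spline sketch from section 3.1: with a linear change of
variable for the width, every derivative needed for the curvature
matrix comes from the same lookup table (profile and dprofile as
defined there):

    # Model: I * p(u), u = (x - c) / w, with p given by the spline.
    # Chain rule:
    #   d/dI =  p(u)
    #   d/dc = -(I / w) * p'(u)
    #   d/dw = -(I / w) * u * p'(u)
    def model_and_derivs_w(x, I, c, w):
        u = (x - c) / w
        p, dp = profile(u), dprofile(u)
        return I * p, p, -(I / w) * dp, -(I / w) * u * dp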
3.8 Efficient Extraction (pg. 12)
The reported time of 3 cpu hours to extract the spectra from an
800 line image is excessive for a linear solution. I would estimate
the time required for the 800 linear banded matrix solutions at 4-8
minutes, with a comparable time required for matrix setup if it is
done efficiently. I suspect that the present code is not setting up
the linear banded matrix efficiently (not sampling the model
efficiently). Pixel rejection should not seriously affect the timings
assuming that bad pixels are not detected in most image lines.
4. Correcting for Variations in the PSF
For all low signal to noise data it is desirable to correct for
variations in the point spread function, caused by variable focus,
scattering, or whatever. This does not seem such a difficult problem
since the width of the line profile is directly correlated with the
width of the PSF and the information is provided by the current model
at each point in each extracted spectrum. The extracted spectra can
be corrected for the variation in the PSF by convolution with a spread
function the width of which varies along the spectrum.
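As a sketch of the correction step (a brute force convolution with a
gaussian spread function; choosing the kernel width at each point as
sqrt(sigma_max**2 - sigma_fit**2) would bring the whole spectrum to a
common effective width):

    import numpy as np

    def convolve_varying(spectrum, sigma):
        # sigma[i]: width of the spread function at pixel i, derived
        # from the fitted profile widths along the spectrum.
        n = len(spectrum)
        x = np.arange(n)
        out = np.empty(n)
        for i in range(n):
            kernel = np.exp(-0.5 * ((x - i) / sigma[i]) ** 2)
            out[i] = np.dot(kernel, spectrum) / kernel.sum()
        return out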