Author: Joseph Hunkeler <jhunkeler@gmail.com>, 2015-07-08 20:46:52 -0400
Commit: fa080de7afc95aa1c19a6e6fc0e0708ced2eadc4 ("Initial commit")
File:   noao/twodspec/multispec/doc/MSalgo_c.doc (522 insertions)
MULTISPEC (Dec83)          Multispec Algorithms          MULTISPEC (Dec83)


          Algorithms for the Multi-Spectra Extraction Package
                        Analysis and Discussion
                           December 2, 1983


1. Disclaimer

    This should not be taken as a statement of how the algorithms of the
final package should function; it is merely an analysis and discussion of
the algorithms, and should be followed by further discussion before we
decide what course to follow in the final package.  We may well decide
that the level of effort required to implement rigorously correct
nonlinear fitting algorithms is not justified by the expected scientific
usage of the package.  Before we can decide that, though, we need an
accurate estimate of the level of effort required.

In attacking nonlinear surface fitting problems it is important to
recognize that almost any technique can be made to yield a result without
the program crashing.  Production of a result (extraction of a spectrum)
does not mean that the algorithm converged, that the solution is unique,
that the model is accurate, or that the uncertainties in the computed
coefficients have been minimized.


2. Multispec Flat (pg. 4)

    This sounds like a classical high-pass filter and might be best
implemented via convolution.  Using a convolution operator with a
numerical kernel has the advantage that the filter can easily be modified
by resampling the kernel or by changing the size of the kernel, and it is
also quite efficient.  The boundary extension feature of IMIO makes it
easy to deal with the problem of the kernel overlapping the edge of the
image in a convolution.  Since the convolution is one-dimensional (the
image is filtered only in Y), it will always be desirable to transpose
the image.
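As a concrete sketch of the convolution approach, the following Python
fragment high-pass filters a single image column.  This is illustration
only: NumPy stands in for the IRAF image operators, the function name and
boxcar kernel are assumptions, and reflective padding stands in for IMIO
boundary extension.

```python
import numpy as np

def highpass_flat(column, width=21):
    """High-pass filter one image column (filtering in Y) by dividing
    out a boxcar-smoothed copy of the column.

    `width` must be odd so the smoothed array lines up with the input.
    """
    kernel = np.ones(width) / width
    pad = width // 2
    # Reflect the data at both ends so the kernel can overlap the image
    # boundary without running out of pixels (cf. IMIO boundary
    # extension).
    padded = np.pad(column, pad, mode="reflect")
    smoothed = np.convolve(padded, kernel, mode="valid")
    return column / smoothed
```

Resampling or resizing the kernel changes the filter cutoff without any
change to the surrounding code, which is the flexibility argued for
above.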
The method used to detect and reject bad pixels (eqn 1) is not correct.
The rejection criterion should be invariant with respect to a scaling of
the pixel values.  If the data has gone through much processing (e.g.,
dtoi on photographic data), the relation between photon counts and pixel
value may be linear, but the scale is unknown.  Rejection by comparison
of a data value to a "smoothed" value is more commonly done as follows:

        reject if:  abs (observed - smoothed) > (K * sigma)

where sigma is the noise sigma of the data, generally a function of the
signal.

It is often desirable in rejection algorithms to be able to specify, as
an option, that all pixels within a specified radius of a bad pixel be
rejected, rather than just the pixel which was detected.  This is
unnecessary only if the bad pixels are single-pixel events (no wings).
Region rejection makes an iterative rejection scheme converge faster, and
also rejects the faint wings of the contaminated region.


2.1 Dividing by the Flat (pg. 5)

    There is no mention of any need for registering the flat with the
data field.  Is it safe to assume that the quartz and the object frames
are precisely registered?  What if the user does in fact average several
quartz frames taken over a period of time?  (Image registration is a
general problem that is probably best left until solved in IMAGES.)


3. Multiap Extraction (pg. 5-6, 8-13)

    What bothers me most about the modeling and extraction process is
that the high signal to noise quartz information is not used to full
advantage, and the background is not fitted very accurately.  The present
algorithms will work well for high signal to noise data, but will result
in large (percentage) errors for faint spectra.
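The scale-invariant rejection rule and the optional region rejection can
be sketched as follows (a minimal one-dimensional Python sketch; the
function name, the per-pixel sigma argument, and the `grow` radius
parameter are assumptions for illustration, not part of the proposed
package):

```python
import numpy as np

def reject_pixels(observed, smoothed, sigma, k=3.0, grow=0):
    """Flag bad pixels where abs(observed - smoothed) > k * sigma.

    Because sigma scales with the data, the criterion is invariant
    under a rescaling of the pixel values.  With grow > 0, all pixels
    within `grow` pixels of a detected bad pixel are also flagged,
    catching the faint wings of a contaminated region.
    """
    bad = np.abs(observed - smoothed) > k * sigma
    if grow > 0:
        # Grow each detection into a rejection region.
        for i in np.flatnonzero(bad):
            bad[max(0, i - grow):i + grow + 1] = True
    return bad
```

In an iterative scheme the flagged pixels would be excluded and the
smoothed estimate recomputed until no new pixels are rejected; region
growing makes that loop converge faster, as noted above.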
Basically, it seems to me that the high signal to noise quartz spectra
should, in many cases, be used to determine the position and shape of the
spectral lines.  This is especially attractive since the quartz and
object spectra appear to be closely registered.  Furthermore, if the
position-shape solution and extraction procedures are separate
procedures, there is nothing to prevent one from applying both to the
object spectrum if necessary for some reason (poor registration, better
signal to noise in the object spectrum in the region of interest, signal
dependent distortions, lack of a quartz image, etc., would all justify
use of the object frame).  It should be possible to model either the
quartz or the object frame, and to reuse a model for more than one
extraction.

Let us divide the process up into two steps, "modeling" and "extraction"
(as it is now).  The "calibration frame" may be the quartz, an averaged
quartz, or the object frame.  Ideally it will have a high signal to noise
ratio, and any errors in the background should be negligible compared to
the signal.

We do not solve for the background while modeling the calibration frame;
we assume that the background has been fitted by any of a variety of
techniques and a background frame written before the calibration frame is
modeled.  A "swath" is the average of several image lines, where an image
line runs across the dispersion and a column runs along the dispersion.


3.1 Modeling

    I would set the thing up to start fitting at any arbitrary swath,
rather than the first swath, because it is not much harder, and there is
no guarantee that the calibration frame will have adequate signal to
noise in the first swath (indeed, often the lowest signal to noise will
be found there).  We define the "center" swath as the first swath to be
fitted, corresponding to the highest signal to noise region of the
calibration frame.
By default the center swath should be the swath used by find_spectra,
especially if there is significant curvature in the spectra.

algorithm model_calibration_frame

begin
        extract center swath
        initialize coeff using centers from find_spectra
        model center swath (nonlinear)

        for (successive swaths upward to top of frame) {
            extract swath
            initialize coeff to values from last fit
            model swath (nonlinear)
            save coeff in datafile
        }

        set last-fit coeff to values for center swath
        for (successive swaths downward to bottom of frame) {
            extract swath
            initialize coeff to values from last fit
            model swath (nonlinear)
            save coeff in datafile
        }

        smooth model coeff (excluding intensity) along the dispersion
            [high freq variations in spectra center and shape from line
             to line are nonphysical]
        variance of a coeff at line-Y from the smoothed model value is
            a measure of the uncertainty in that coeff
end


I would have the background fitting routine write a background frame as
output, the name of which would be saved in the datafile, rather than
saving the coefficients of the background fit in the datafile.  The
background frame may then be produced by any of a number of techniques;
storing the coefficients of the background fit in the datafile limits the
technique used to a particular model.  For similar reasons, the standard
background fitting routine should be broken up into a module which
determines the region to be fitted, and a module which fits the
background pixels and writes the background image.

For example, if the default background fitting routine is a line by line
routine, the output frame could be smoothed to remove the (nonphysical)
fluctuations in the background from line to line.  A true two dimensional
background fitting routine may be added later without requiring
modifications to the datafile.
Second order corrections could be made to the background by repeating the
solution using the background fitted by the extraction procedure.


procedure extract_swath

begin
        extract raw swath from calibration frame
        extract raw swath from background frame
        return (calibration swath minus background swath)
end


The algorithm used to simultaneously model all spectra in a swath across
the dispersion is probably the most difficult and time consuming part of
the problem.  The problem is nonlinear in all but one of the four or more
parameters for each spectrum.  You have spent a lot of time on this and
we are probably not going to be able to improve on your algorithms
significantly, though the generation of the matrix in each step can
probably be optimized significantly.

The analytic line-profile model is the most general and should work for
most instruments with small circular apertures, even in the presence of
severe distortions.  It should be possible, however, to fit a simpler
model given by a lookup table, solving only for the position and height
of each spectrum.  This model may be adequate for many instruments,
should be much faster to fit, and may produce more accurate results since
there are fewer parameters in the fit.  The disadvantage of an empirical
model is that it must be accurately interpolated (including the
derivatives), requiring use of spline interpolation or a similar
technique (I have tried linear interpolation and it is not good enough).
Vesa has implemented procedures for fitting splines and evaluating their
derivatives.

Fitting the empirical model simultaneously to any number of spectra
should be straightforward provided the signal to noise is reasonable,
since there are few parameters in the fit and the matrix is banded (the
Marquardt algorithm would work fine).  However, if you ever have to deal
with data where a very faint or nonexistent spectrum is next to a bright
one, it may be difficult to constrain the fit.
I doubt that the present approach of smoothing the coefficients across
the dispersion and iterating would work in such a case.  The best
approach might be to fix the center of the faint spectrum relative to the
bright one once the signal drops below a certain level, or to drop it
from the fit entirely.  This requires that the matrix be able to change
size during the fit.

algorithm fit_empirical_model

begin
        [upon entry, we already have an initial estimate of the coeff]

        # Marquardt (gradient expansion) algorithm.  Make a 2nd order
        # Taylor's expansion of chisquare near the minimum and solve for
        # the correction vector which puts us at the minimum (subject to
        # Taylor's approx).  Taylor's approximation rapidly becomes
        # better as we near the minimum of the multidimensional
        # chisquare, hence convergence is extremely rapid given a good
        # starting estimate.

        repeat {
            evaluate curvature matrix using current coeff
            solve banded curvature matrix

            compute error matrix
            for (each spectrum)
                if (uncertainty in center coeff > tol) {
                    fix center by estimation given relative spacing
                        in higher signal region
                    remove spectrum center from solution
                }

            if (no center coeff were rejected)
                tweak correction vector to accelerate convergence
            new coeff vector = old coeff vector + correction vector
            compute norm of correction vector
        } until (no more center coeff rejected and norm < tolerance)

        compute final uncertainties
end


The following is close to what is currently done to fit the analytic
model, as far as I can remember (I have modified it slightly to stimulate
discussion).  The solution is broken up into two parts to reduce the
number of free parameters and increase stability.  If the uncertainty in
a free parameter becomes large it is best to fix the parameter (for this
data it is particularly easy to estimate all but the intensity
parameter).
A fixed parameter is used in the model and affects the solution but is
not solved for (i.e., like the background).

The analytic fit will be rather slow, even if the outer loop is
constrained to one iteration.  If it takes (very rough estimates) .5 sec
to set up the banded matrix and .3 sec to solve it, with 3 iterations to
convergence, we have roughly 5 sec per swath.  If an 800 line image is
broken into swaths of 32 lines, the total is 125 sec per image (to within
a factor of 5).

algorithm fit_analytic_model

begin
        [upon entry, we already have an initial estimate of the coeff]

        repeat {
            save coeff
            solve for center, height, width of each line with second
                order terms fixed (but not necessarily zero)
            apply constraints on line centers and widths
            repeat solution adding second order coeff (shape terms)

            compute error matrix
            for (each coeff)
                if (uncertainty in coeff > tol) {
                    fix coeff value to reasonable estimate
                    remove coeff from solution
                }

            compute total correction vector given saved coeff
            if (no coeff were rejected)
                tweak correction vector to accelerate convergence
            compute norm of correction vector
        } until (no additional coeff rejected and norm < tolerance)

        compute final uncertainties
end


3.2 Extraction

    The purpose of extraction is to compute the integral of each
spectrum across the dispersion, producing I(y) for each spectrum.  An
estimate of the uncertainty U(y) should also be produced.  The basic
extraction techniques are summarized below.  The number of spectra and
the spectra centers, widths, and shape parameters are taken from the
model fitted to the calibration frame as outlined above.  We make a
simultaneous solution for the profile heights and the background (a
linear problem), repeating the solution independently for each line in
the image.
For a faint spectrum it is essential to determine the background
accurately, and we can do that safely here since the matrix will be very
well behaved.

    (1) Aperture sum.  All of the pixels within a specified radius of
        the spectrum center are summed to produce the raw integral.  The
        background image is also summed and subtracted to yield the
        final integral.  The radius may be a constant or a function of
        the width of the profile.  Fractional pixel techniques should be
        used to minimize sampling effects at the boundaries of the
        aperture.  Pixel rejection is not possible since there is no
        fitted surface.  The model is used only to get the spectrum
        center and width.  This technique is fastest and may be best if
        the profile is difficult to model, provided the spectra are not
        crowded.

    (2) Weighted aperture sum.  Like (1), except that a weighting
        function is applied to correct for the effects of crowding.  The
        model is fitted to each object line, solving for I (height) and
        B (background) with all other parameters fixed.  This is a
        linear solution of a banded matrix and should be quite fast
        provided the model can be sampled efficiently to produce the
        matrix.  It is possible to iterate to reject bad pixels.  The
        weight for a spectrum at a data pixel is the fractional
        contribution of that spectrum to the integral of the
        contributions of all spectra.

    (3) Fit and integrate the model.  The model is fitted as in (2) to
        the data pixels, but the final integral is produced by
        integrating the model.  This technique should be more resistant
        to noise in the data than (2), because we are using the high
        signal to noise information in the model to constrain the
        integral.  More accurate photometric results should therefore be
        possible.
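The linear per-line solution underlying methods (2) and (3) can be
sketched in Python.  This is an illustration only, assuming fixed
pre-modeled profiles and a constant background per line; a dense
least-squares solve stands in for the banded solver discussed above, and
the function name is an assumption.

```python
import numpy as np

def extract_line(data, profiles):
    """Simultaneous linear solution for the profile heights and a
    constant background level for one image line.

    `profiles` is an (nspectra, npix) array of fixed, pre-modeled line
    profiles; only the heights and the background are free, so the
    problem is linear and very well behaved.
    """
    npix = data.shape[0]
    # Design matrix: one column per spectrum profile, plus a constant
    # column for the background.
    A = np.vstack([profiles, np.ones(npix)]).T
    coeffs, *_ = np.linalg.lstsq(A, data, rcond=None)
    heights, background = coeffs[:-1], coeffs[-1]
    # Method (3): integrate the scaled model rather than the data.
    integrals = heights * profiles.sum(axis=1)
    return heights, background, integrals
```

Iterative pixel rejection would wrap this solve, masking discrepant
pixels out of `A` and `data` and repeating until no new pixels are
rejected.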
Method (2) has the advantage that the integral is invariant with respect
to scale errors in the fitted models, provided the same error is made in
each model.  Of course, the same error is unlikely to be made in all
models contributing to a point; it is more likely that an error will put
more energy into one spectrum at the expense of its neighbors.  In the
limit as the spectra become less crowded, however, the effects of errors
in neighboring spectra become small and the weighted average technique
looks good; it becomes quite insensitive to errors in the model and in
the fit.  For crowded spectra there seems no alternative to a good
multiparameter fit.  For faint spectra method (3) is probably best, and
fitting the background accurately becomes crucial.

In both (2) and (3), subtraction of the scaled models yields a residual
image which can be used to evaluate the quality of the fit at a glance.
Since almost all of the effort in (2) and (3) is in the least squares
solution and the pixel rejection, it might be desirable to produce two
integrals (output spectra), one for each algorithm, as well as the
uncertainty vector (computed from the covariance matrix, not the
residual).


3.3 Smoothing Coefficient Arrays

    In several places we have seen the need for smoothing coefficient
arrays.  The use of polynomials for smoothing is questionable unless the
order of the polynomial is low (3 or less).  High order polynomials are
notoriously bad near the endpoints of the fitted array, unless the data
curve happens to be a noisy low order polynomial (rare, to say the
least).  Convolution or piecewise polynomial functions (e.g., the
natural cubic smoothing spline) should be considered if there is any
reason to believe that the coefficient array being smoothed may have
high frequency components which are physical and must be followed (e.g.,
a bend or kink).
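A minimal sketch of the convolution option for smoothing a coefficient
array (Python; the triangular kernel and reflective end handling are
assumptions for illustration — a natural cubic smoothing spline would be
preferred where the array has real bends that must be followed):

```python
import numpy as np

def smooth_coeffs(coeff, width=5):
    """Smooth a coefficient array along the dispersion by convolution
    with a normalized triangular kernel.

    `width` must be odd.  Unlike a high order polynomial fit, this
    smoother is well behaved near the endpoints of the array.
    """
    half = width // 2
    # Triangular kernel, e.g. [1, 2, 3, 2, 1] / 9 for width = 5.
    kernel = np.concatenate([np.arange(1, half + 2),
                             np.arange(half, 0, -1)]).astype(float)
    kernel /= kernel.sum()
    # Reflect at the ends so the output has the same length as the
    # input and the endpoints are not biased toward zero.
    padded = np.pad(coeff, half, mode="reflect")
    return np.convolve(padded, kernel, mode="valid")
```

The variance of each raw coefficient about the smoothed curve gives the
per-line uncertainty measure mentioned in the modeling algorithm above.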
3.4 Weighting (pg. 11)

    The first weighting scheme (1 / sqrt (data)) seems inverted to me.
The noise goes up with the signal, to be sure, but the signal to noise
usually goes up faster.  It seems to me the weight estimate should be
sqrt (data).  It also makes more sense to weight the least blended
(peak) areas most.


3.5 Rejection Criteria (pg. 13)

    The same comments apply to this rejection criterion as in section 2.
I assume that "(data - model)" is supposed to be "abs (data - model)".


3.6 Uncertainties and Convergence Criteria

    I got the impression that you were using the residual of the data
minus the fitted surface both as the convergence criterion and as a
measure of the errors in the fit.  It is neither; assuming a perfect
model, the residual gives only a measure of the noise in the data.

Using the residual to establish a convergence criterion seems reasonable
except that it is hard to say reliably what the criterion should be.
Assuming that the algorithm converges, the value of the residual when
convergence is achieved is in general hard to predict, so it seems
difficult to establish a convergence criterion that way.  The
conventional way to establish when a nonlinear fit has converged is by
measuring the norm of the correction vector; when the norm becomes less
than some small number the algorithm is said to have converged.  The
multidimensional chisquare surface is parabolic near the minimum, and a
good nonlinear algorithm will converge very rapidly once it gets near
the minimum.

The residual is a measure of the overall goodness of fit, but tells us
nothing about the uncertainties in the individual coefficients of the
model.  The uncertainties in the coefficients are given by the
covariance or error matrix (see Bevington pg. 242).  It is ok to push
forward and produce an extraction if the algorithm fails to converge,
but ONLY provided the code gives a reliable estimate of the uncertainty.
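The two points above — convergence measured on the norm of the
correction vector, and coefficient uncertainties read from the error
matrix — can be sketched as follows.  The convention that the error
matrix is the inverse of the curvature matrix follows Bevington; the
function names and the relative tolerance are assumptions for
illustration.

```python
import numpy as np

def coeff_uncertainties(curvature):
    """Per-coefficient 1-sigma uncertainties from the curvature matrix:
    invert it to get the error (covariance) matrix and take the square
    root of the diagonal (cf. Bevington pg. 242)."""
    error_matrix = np.linalg.inv(curvature)
    return np.sqrt(np.diag(error_matrix))

def converged(correction, coeff, tol=1e-6):
    """Convergence test on the norm of the correction vector, not on
    the residual: converged when the correction is negligible relative
    to the coefficient vector itself."""
    scale = max(np.linalg.norm(coeff), 1.0)
    return np.linalg.norm(correction) < tol * scale
```

Reporting `coeff_uncertainties` even when `converged` never fires is
what makes it acceptable to push forward and produce an extraction
anyway.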
3.7 Evaluating the Curvature Matrix Efficiently

    The most expensive part of the reduction process is probably
evaluating the model to form the curvature matrix at each iteration in
the nonlinear solution.  The most efficient way to do this is to use
lookup tables.  If the profile shape does not change, the profile can be
sampled, fitted with a spline, and the spline evaluated to get the
zeroth through second derivatives for the curvature matrix.  This can be
done even if the width of the profile changes, by adding a linear term.
If the shape of the profile has to change, it is still possible to
sample either the gaussian or the exponential function with major
savings in computation time.


3.8 Efficient Extraction (pg. 12)

    The reported time of 3 cpu hours to extract the spectra from an 800
line image is excessive for a linear solution.  I would estimate the
time required for the 800 linear banded matrix solutions at 4-8 minutes,
with a comparable time required for matrix setup if it is done
efficiently.  I suspect that the present code is not setting up the
linear banded matrix efficiently (not sampling the model efficiently).
Pixel rejection should not seriously affect the timings, assuming that
bad pixels are not detected in most image lines.


4. Correcting for Variations in the PSF

    For all low signal to noise data it is desirable to correct for
variations in the point spread function, caused by variable focus,
scattering, or whatever.  This does not seem such a difficult problem,
since the width of the line profile is directly correlated with the
width of the PSF, and the information is provided by the current model
at each point in each extracted spectrum.  The extracted spectra can be
corrected for the variation in the PSF by convolution with a spread
function whose width varies along the spectrum.
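The correction just described can be sketched as a direct convolution
with a Gaussian spread function whose sigma varies along the spectrum
(Python; the Gaussian form and the per-pixel `widths` array, taken from
the fitted profile widths of the model, are assumptions for
illustration, and the direct O(n^2) loop is a sketch, not tuned code):

```python
import numpy as np

def variable_width_smooth(spectrum, widths):
    """Convolve an extracted spectrum with a Gaussian spread function
    whose width varies along the spectrum.

    Each output pixel is a normalized Gaussian-weighted average of the
    input, with the sigma at that pixel taken from `widths`.
    """
    x = np.arange(spectrum.size)
    out = np.empty_like(spectrum, dtype=float)
    for i in range(spectrum.size):
        # Gaussian weights centered on pixel i with the local sigma.
        w = np.exp(-0.5 * ((x - i) / widths[i]) ** 2)
        out[i] = np.dot(w, spectrum) / w.sum()
    return out
```

Choosing `widths` so that every point is broadened up to the widest PSF
in the spectrum would equalize the effective resolution along the
dispersion.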