diff --git a/noao/twodspec/multispec/doc/MSalgo_c.doc b/noao/twodspec/multispec/doc/MSalgo_c.doc
new file mode 100644
index 00000000..b3322dff
--- /dev/null
+++ b/noao/twodspec/multispec/doc/MSalgo_c.doc
@@ -0,0 +1,522 @@
+ MULTISPEC (Dec83) Multispec Algorithms MULTISPEC (Dec83)
+
+
+
+ Algorithms for the Multi-Spectra Extraction Package
+ Analysis and Discussion
+ December 2, 1983
+
+
+
+1. Disclaimer
+
+ This should not be taken as a statement of how the algorithms of
+the final package should function; this is merely an analysis and
+discussion of the algorithms, and should be followed by further
+discussion before we decide what course to follow in the final
+package. We may very well decide that the level of effort required to
+implement rigorously correct nonlinear fitting algorithms is not
+justified by the expected scientific usage of the package. Before we
+can decide that, though, we need an accurate estimate of the level of
+effort required.
+
+In attacking nonlinear surface fitting problems it is important to
+recognize that almost any technique can be made to yield a result
+without the program crashing. Production of a result (extraction of a
+spectrum) does not mean that the algorithm converged, that the
+solution is unique, that the model is accurate, or that the
+uncertainties in the computed coefficients have been minimized.
+
+
+
+2. Multispec Flat (pg. 4)
+
+ This sounds like a classical high pass filter and might be best
+implemented via convolution. Using a convolution operator with a
+numerical kernel has the advantage that the filter can be easily
+modified by resampling the kernel or by changing the size of the
+kernel. It is also quite efficient. The boundary extension feature
+of IMIO makes it easy to deal with the problem of the kernel
+overlapping the edge of the image in a convolution. Since the
+convolution is one-dimensional (the image is filtered only in Y), it
+will always be desirable to transpose the image first so that the
+filter runs along image lines rather than down columns.
+
+The method used to detect and reject bad pixels (eqn 1) is not correct.
+The rejection criterion should be invariant with respect to a scaling
+of the pixel values. If the data has gone through very much
+processing (e.g., dtoi on photographic data), the relation between
+photon counts and pixel value may be linear, but the scale is
+unknown. Rejection by comparison of a data value to a "smoothed"
+value is more commonly done as follows:
+
+ reject if: abs (observed - smoothed) > (K * sigma)
+
+where sigma is the noise sigma of the data, generally a function of
+the signal.
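+
+As a minimal sketch of this test (in Python with NumPy; the Poisson
+noise model and the value of K are illustrative assumptions, not part
+of the package):
+
+import numpy as np
+
+def reject_bad_pixels(observed, smoothed, gain=1.0, k=4.0):
+    # Noise sigma as a function of the signal; a simple Poisson
+    # model (sigma = sqrt(signal / gain)) is assumed here.
+    sigma = np.sqrt(np.maximum(smoothed, 1.0) / gain)
+    # Reject where the residual exceeds K times the noise sigma.
+    return np.abs(observed - smoothed) > k * sigma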
+
+It is often desirable in rejection algorithms to be able to specify,
+as an option, that all pixels within a specified radius of a bad pixel
+be rejected, rather than just the pixel which was detected.  This is
+unnecessary only if the bad pixels are single pixel events (no
+wings). Region rejection makes an iterative rejection scheme converge
+faster, as well as rejecting the faint wings of the contaminated
+region.
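+
+A sketch of the region rejection option in one dimension (Python with
+NumPy; the boolean mask interface is an assumption for illustration):
+
+import numpy as np
+
+def grow_rejection(bad, radius):
+    # Given a boolean rejection mask, also flag every pixel within
+    # `radius` pixels of a detected bad pixel.
+    grown = bad.copy()
+    for offset in range(1, radius + 1):
+        grown[offset:] |= bad[:-offset]    # neighbors on one side
+        grown[:-offset] |= bad[offset:]    # neighbors on the other
+    return grown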
+
+
+
+2.1 Dividing by the Flat (pg. 5)
+
+ There is no mention of any need for registering the flat with the
+data field. Is it safe to assume that the quartz and the object
+frames are precisely registered? What if the user does in fact
+average several quartz frames taken over a period of time? (Image
+registration is a general problem that is probably best left until it
+is solved in IMAGES).
+
+
+
+3. Multiap Extraction (pg. 5-6, 8-13)
+
+ The thing that bothers me most about the modeling and extraction
+process is that the high signal to noise quartz information is not
+used to full advantage, and the background is not fitted very
+accurately. The present algorithms will work well for high signal to
+noise data, but will result in large (percentage) errors for faint
+spectra.
+
+Basically, it seems to me that the high signal to noise quartz spectra
+should, in many cases, be used to determine the position and shape of
+the spectral lines. This is especially attractive since the quartz
+and spectra appear to be closely registered. Furthermore, if the
+position-shape solution and extraction procedures are separate
+procedures, there is nothing to prevent one from applying both to the
+object spectrum if necessary for some reason (e.g., poor registration,
+better signal to noise in the object spectrum in the region of
+interest, signal dependent distortions, lack of a quartz image, etc.,
+would all justify use of the object frame). It should be possible to
+model either the quartz or the object frame, and to reuse a model for
+more than one extraction.
+
+Let us divide the process into two steps, "modeling" and
+"extraction" (as it is now). The "calibration frame" may be the
+quartz, an averaged quartz, or the object frame. Ideally it will have
+a high signal to noise ratio and any errors in the background should
+be negligible compared to the signal.
+
+We do not solve for the background while modeling the calibration
+frame; we assume that the background has been fitted by any of a
+variety of techniques and a background frame written before the
+calibration frame is modeled. A "swath" is the average of several
+image lines, where an image line runs across the dispersion, and a
+column along the dispersion.
+
+
+
+3.1 Modeling
+
+ I would set the thing up to start fitting at any arbitrary swath,
+rather than the first swath, because it is not much harder, and there is
+no guarantee that the calibration frame will have adequate signal to
+noise in the first swath (indeed often the lowest signal to noise will
+be found there). We define the "center" swath as the first swath to
+be fitted, corresponding to the highest signal to noise region of the
+calibration frame. By default the center swath should be the swath
+used by find_spectra, especially if there is significant curvature in
+the spectra.
+
+algorithm model_calibration_frame
+
+begin
+ extract center swath
+ initialize coeff using centers from find_spectra
+ model center swath (nonlinear)
+
+ for (successive swaths upward to top of frame) {
+ extract swath
+ initialize coeff to values from last fit
+ model swath (nonlinear)
+ save coeff in datafile
+ }
+
+ set last-fit coeff to values for center swath
+ for (successive swaths downward to bottom of frame) {
+ extract swath
+ initialize coeff to values from last fit
+ model swath (nonlinear)
+ save coeff in datafile
+ }
+
+ smooth model coeff (excluding intensity) along the dispersion
+ [high freq variations in spectra center and shape from line]
+ [to line are nonphysical]
+ variance of a coeff at line-Y from the smoothed model value is
+ a measure of the uncertainty in that coeff.
+end
+
+
+I would have the background fitting routine write as output a
+background frame, the name of which would be saved in the datafile,
+rather than saving the coeff of the bkg fit in the datafile. The
+background frame may then be produced by any of a number of
+techniques; storing the coefficients of the bkg fit in the datafile
+limits the technique used to a particular model. For similar reasons,
+the standard bkg fitting routine should be broken up into a module
+which determines the region to be fitted, and a module which fits the
+bkg pixels and writes the bkg image.
+
+For example, if the default background fitting routine is a line by
+line routine, the output frame could be smoothed to remove the
+(nonphysical) fluctuations in the background from line to line. A
+true two dimensional background fitting routine may be added later
+without requiring modifications to the datafile. Second order
+corrections could be made to the background by repeating the solution
+using the background fitted by the extraction procedure.
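+
+A sketch of this line by line scheme (Python with NumPy; the
+polynomial order, the smoothing length, and the mask interface are
+assumptions for illustration): fit a low order polynomial to the
+background pixels of each image line, then smooth the resulting
+background frame along the dispersion to suppress the nonphysical
+line to line fluctuations.
+
+import numpy as np
+
+def fit_background(image, bkg_mask, order=2, smooth=15):
+    # image[line, column]; bkg_mask marks the pixels between spectra.
+    # Each line is assumed to contain at least order+1 background pixels.
+    ny, nx = image.shape
+    x = np.arange(nx)
+    bkg = np.empty_like(image, dtype=float)
+    for line in range(ny):
+        m = bkg_mask[line]
+        coeff = np.polyfit(x[m], image[line, m], order)
+        bkg[line] = np.polyval(coeff, x)
+    # Boxcar smooth each column along the dispersion (line to line).
+    kernel = np.ones(smooth) / smooth
+    for col in range(nx):
+        bkg[:, col] = np.convolve(bkg[:, col], kernel, mode="same")
+    return bkg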
+
+
+procedure extract_swath
+
+begin
+ extract raw swath from calibration frame
+ extract raw swath from background frame
+ return (calib swath minus bkg swath)
+end
+
+
+The algorithm used to simultaneously model all spectra in a swath
+across the dispersion is probably the most difficult and time consuming
+part of the problem. The problem is nonlinear in all but one of the
+four or more parameters for each spectra. You have spent a lot of
+time on this and we are probably not going to be able to improve on
+your algorithms significantly, though the generation of the matrix in
+each step can probably be optimized significantly.
+
+The analytic line-profile model is the most general and should work
+for most instruments with small circular apertures, even in the
+presence of severe distortions. It should be possible, however, to
+fit a simpler model given by a lookup table, solving only for the
+position and height of each spectra. This model may be adequate for
+many instruments, should be much faster to fit, and may produce more
+accurate results since there are fewer parameters in the fit. The
+disadvantage of an empirical model is that it must be accurately
+interpolated (including the derivatives), requiring use of spline
+interpolation or a similar technique (I have tried linear and it is
+not good enough). Vesa has implemented procedures for fitting splines
+and evaluating their derivatives.
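+
+A sketch of the empirical model (Python with SciPy; the Gaussian used
+as a stand-in for the measured profile is only illustrative): sample
+the profile once, fit a cubic spline, and evaluate both the profile
+and the derivatives needed by the fit from the spline.
+
+import numpy as np
+from scipy.interpolate import splrep, splev
+
+# Sampled empirical profile: offsets from center vs. normalized value.
+offsets = np.linspace(-8.0, 8.0, 33)
+values = np.exp(-0.5 * (offsets / 2.0) ** 2)   # stand-in for measured data
+tck = splrep(offsets, values, s=0)             # cubic spline lookup table
+
+def model(x, center, height):
+    # Model value and its derivatives with respect to center and height.
+    dx = x - center
+    p = splev(dx, tck)
+    dp = splev(dx, tck, der=1)
+    return height * p, -height * dp, p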
+
+Fitting the empirical model simultaneously to any number of spectra
+should be straightforward provided the signal to noise is reasonable,
+since there are few parameters in the fit and the matrix is banded
+(the Marquardt algorithm would work fine). However, if you ever have
+to deal with data where a very faint or nonexistent spectrum is next to
+a bright one, it may be difficult to constrain the fit. I doubt if
+the present approach of smoothing the coeff across the dispersion and
+iterating would work in such a case. The best approach might be to
+fix the center of the faint spectra relative to the bright one once
+the signal drops below a certain level, or to drop it from the fit
+entirely. This requires that the matrix be able to change size during
+the fit.
+
+algorithm fit_empirical_model
+
+begin
+ [upon entry, we already have an initial estimate of the coeff]
+
+ # Marquardt (gradient expansion) algorithm. Make 2nd order
+ # Taylor's expansion to chisquare near minimum and solve for
+ # correction vector which puts us at minimum (subject to
+ # Taylor's approx). Taylor's approximation rapidly becomes
+ # better as we near the minimum of the multidimensional
+ # chisquare, hence convergence is extremely rapid given a good
+ # starting estimate.
+
+ repeat {
+ evaluate curvature matrix using current coeff.
+ solve banded curvature matrix
+
+ compute error matrix
+ for (each spectra)
+ if (uncertainty in center coeff > tol) {
+ fix center by estimation given relative spacing
+ in higher signal region
+ remove spectra center from solution
+ }
+
+ if (no center coeff were rejected)
+ tweak correction vector to accelerate convergence
+ new coeff vector = old coeff vector + correction vector
+ compute norm of correction vector
+ } until (no more center coeff rejected and norm < tolerance)
+
+ compute final uncertainties
+end
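+
+A minimal sketch of the Marquardt correction step inside the loop
+above (Python with NumPy; the damping factor handling is an
+assumption, and a production version would exploit the banded
+structure of the curvature matrix rather than doing a dense solve):
+
+import numpy as np
+
+def marquardt_step(resid, jac, weights, coeff, lam):
+    # Curvature matrix (J^T W J) with the Marquardt damping term on
+    # the diagonal, and the gradient vector (J^T W r).
+    a = jac.T @ (weights[:, None] * jac)
+    b = jac.T @ (weights * resid)
+    a[np.diag_indices_from(a)] *= (1.0 + lam)
+    correction = np.linalg.solve(a, b)
+    return coeff + correction, np.linalg.norm(correction)
+
+The loop terminates when the returned norm of the correction vector
+drops below the tolerance, as in the pseudocode above.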
+
+
+The following is close to what is currently done to fit the analytic
+model, as far as I can remember (I have modified it slightly to
+stimulate discussion). The solution is broken up into two parts to
+reduce the number of free parameters and increase stability. If the
+uncertainty in a free parameter becomes large it is best to fix the
+parameter (it is particularly easy for this data to estimate all but
+the intensity parameter). A fixed parameter is used in the model and
+affects the solution but is not solved for (i.e., like the background).
+
+The analytic fit will be rather slow, even if the outer loop is
+constrained to one iteration.  If it takes (very rough estimates) .5
+sec to set up the banded matrix and .3 sec to solve it, with 3
+iterations to convergence, we have roughly 5 sec per swath.  If an 800
+line image is broken into swaths of 32 lines (25 swaths), the total is
+125 sec per image (to within a factor of 5).
+
+algorithm fit_analytic_model
+
+begin
+ [upon entry, we already have an initial estimate of the coeff]
+
+ repeat {
+ save coeff
+ solve for center,height,width of each line with second
+ order terms fixed (but not necessarily zero)
+ apply constraints on line centers and widths
+ repeat solution adding second order coeff (shape terms)
+
+ compute error matrix
+ for (each coeff)
+ if (uncertainty in coeff > tol) {
+ fix coeff value to reasonable estimate
+ remove coeff from solution
+ }
+
+ compute total correction vector given saved coeff
+ if (no coeff were rejected)
+ tweak correction vector to accelerate convergence
+ compute norm of correction vector
+ } until (no additional coeff rejected and norm < tolerance)
+
+ compute final uncertainties
+end
+
+
+
+3.2 Extraction
+
+ The purpose of extraction is to compute the integral of the spectra
+across the dispersion, producing I(y) for each spectra. An estimate of
+the uncertainty U(y) should also be produced. The basic extraction
+techniques are summarized below. The number of spectra, spectra
+centers, spectra width and shape parameters are taken from the model
+fitted to the calibration frame as outlined above. We make a
+simultaneous solution for the profile heights and the background (a
+linear problem), repeating the solution independently for each line in
+the image. For a faint spectrum, it is essential to determine the
+background accurately, and we can do that safely here since the matrix
+will be very well behaved.
+
+ (1) Aperture sum. All of the pixels within a specified radius of
+ the spectra are summed to produce the raw integral. The
+ background image is also summed and subtracted to yield the
+ final integral. The radius may be a constant or a function of
+ the width of the profile. Fractional pixel techniques should
+ be used to minimize sampling effects at the boundaries of the
+ aperture. Pixel rejection is not possible since there is no
+ fitted surface. The model is used only to get the spectra
+ center and width. This technique is fastest and may be best
+ if the profile is difficult to model, provided the spectra are
+ not crowded.
+
+ (2) Weighted aperture sum. Like (1), except that a weighting
+ function is applied to correct for the effects of crowding.
+ The model is fitted to each object line, solving for I
+ (height) and B (background) with all other parameters fixed.
+ This is a linear solution of a banded matrix and should be
+ quite fast provided the model can be sampled efficiently to
+ produce the matrix. It is possible to iterate to reject bad
+ pixels. The weight for a spectra at a data pixel is the
+ fractional contribution of that spectra to the integral of the
+ contributions of all spectra.
+
+ (3) Fit and integrate the model. The model is fitted as in (2) to
+ the data pixels but the final integral is produced by
+ integrating the model. This technique should be more
+ resistant to noise in the data than is (2), because we are
+ using the high signal to noise information in the model to
+ constrain the integral. More accurate photometric results
+ should therefore be possible.
+
+
+Method (2) has the advantage that the integral is invariant with
+respect to scale errors in the fitted models, provided the same error
+is made in each model. Of course, the same error is unlikely to be
+made in all models contributing to a point; it is more likely that an
+error will put more energy into one spectra at the expense of its
+neighbors. In the limit as the spectra become less crowded, however,
+the effects of errors in neighboring spectra become small and the
+weighted average technique looks good; it becomes quite insensitive to
+errors in the model and in the fit. For crowded spectra there seems
+no alternative to a good multiparameter fit. For faint spectra method
+(3) is probably best, and fitting the background accurately becomes
+crucial.
+
+In both (2) and (3), subtraction of the scaled models yields a residual
+image which can be used to evaluate at a glance the quality of the fit.
+Since almost all of the effort in (2) and (3) is in the least squares
+solution and the pixel rejection, it might be desirable to produce two
+integrals (output spectra), one for each algorithm, as well as the
+uncertainty vector (computed from the covariance matrix, not the
+residual).
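+
+A sketch of the linear solution underlying (2) and (3) for a single
+image line (Python with NumPy; the constant background term and the
+dense least squares solve stand in for the banded solution described
+above):
+
+import numpy as np
+
+def extract_line(data, profiles, weights):
+    # data:     pixel values across the dispersion for one image line
+    # profiles: (nspectra, npixels) fixed profiles from the calibration
+    #           model, one row per spectrum
+    # weights:  per pixel weights (e.g., inverse variance)
+    nspec, npix = profiles.shape
+    # One column per spectrum height plus one for a constant background.
+    design = np.vstack([profiles, np.ones(npix)]).T
+    w = np.sqrt(weights)
+    coeff, *_ = np.linalg.lstsq(design * w[:, None], data * w, rcond=None)
+    heights, background = coeff[:nspec], coeff[nspec]
+    # Method (3): integrate the fitted model for each spectrum.
+    model_integrals = heights * profiles.sum(axis=1)
+    return heights, background, model_integrals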
+
+
+
+3.3 Smoothing Coefficient Arrays
+
+ In several places we have seen the need for smoothing coefficient
+arrays. The use of polynomials for smoothing is questionable unless
+the order of the polynomial is low (3 or less). High order
+polynomials are notoriously bad near the endpoints of the fitted
+array, unless the data curve happens to be a noisy low order
+polynomial (rare, to say the least). Convolution or piecewise
+polynomial functions (e.g., the natural cubic smoothing spline) should
+be considered if there is any reason to believe that the coefficient
+array being smoothed may have high frequency components which are
+physical and must be followed (e.g., a bend or kink).
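+
+For comparison, the alternatives applied to the same coefficient
+array (Python with SciPy; the orders and smoothing lengths are
+arbitrary):
+
+import numpy as np
+from scipy.interpolate import UnivariateSpline
+
+def smooth_coeff(y, order=3, box=9, s=None):
+    x = np.arange(y.size, dtype=float)
+    poly = np.polyval(np.polyfit(x, y, order), x)      # low order polynomial
+    kernel = np.ones(box) / box
+    boxcar = np.convolve(y, kernel, mode="same")       # convolution
+    spline = UnivariateSpline(x, y, k=3, s=s)(x)       # smoothing spline
+    return poly, boxcar, spline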
+
+
+
+3.4 Weighting (pg. 11)
+
+     The first weighting scheme (1 / sqrt (data)) seems inverted to me.
+The noise goes up with the signal, to be sure, but the signal goes up
+faster, so the signal to noise ratio improves with the signal.  Seems
+to me the weight estimate should be sqrt(data).  It also makes more
+sense to weight the least blended (peak) areas most.
+
+
+
+3.5 Rejection criteria (pg. 13)
+
+     The same comments apply to this rejection criterion as in section
+2.  I assume that "(data - model)" is supposed to be "abs (data -
+model)".
+
+
+
+3.6 Uncertainties and Convergence Criteria
+
+ I got the impression that you were using the residual of the data
+minus the fitted surface both as the convergence criterion and as a
+measure of the errors in the fit. It is neither; assuming a perfect
+model, the residual gives only a measure of the noise in the data.
+
+Using the residual to establish a convergence criterion seems
+reasonable except that it is hard to say reliably what the criterion
+should be.  Assuming that the algorithm converges, the value of the
+residual when convergence is achieved is in general hard to predict,
+so it seems to me to be difficult to establish a convergence
+criterion.  The conventional way to establish when a nonlinear fit
+converges is by measuring the norm of the correction vector. When the
+norm becomes less than some small number the algorithm is said to have
+converged. The multidimensional chisquare surface is parabolic near
+the minimum and a good nonlinear algorithm will converge very rapidly
+once it gets near the minimum.
+
+The residual is a measure of the overall goodness of fit, but tells us
+nothing about the uncertainties in the individual coefficients of the
+model. The uncertainties in the coefficients are given by the
+covariance or error matrix (see Bevington pg. 242). It is ok to push
+forward and produce an extraction if the algorithm fails to converge,
+but ONLY provided the code gives a reliable estimate of the
+uncertainty.
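+
+A sketch of both points (Python with NumPy), assuming the curvature
+matrix from the final iteration is at hand: convergence is judged
+from the norm of the correction vector, and the coefficient
+uncertainties come from the diagonal of the error (covariance)
+matrix, the inverse of the curvature matrix.
+
+import numpy as np
+
+def converged(correction, tol=1.0e-4):
+    # Convergence is judged from the correction vector, not the residual.
+    return np.linalg.norm(correction) < tol
+
+def coeff_uncertainties(curvature):
+    # The error (covariance) matrix is the inverse of the curvature
+    # matrix; the 1-sigma uncertainty of each coefficient is the square
+    # root of the corresponding diagonal element.
+    error_matrix = np.linalg.inv(curvature)
+    return np.sqrt(np.diag(error_matrix))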
+
+
+
+3.7 Evaluating the Curvature Matrix Efficiently
+
+ The most expensive part of the reduction process is probably
+evaluating the model to form the curvature matrix at each iteration in
+the nonlinear solution. The most efficient way to do this is to use
+lookup tables. If the profile shape does not change, the profile can
+be sampled, fitted with a spline, and the spline evaluated to get the
+zeroth through second derivatives for the curvature matrix.  This can
+be done even if the width of the profile changes by adding a linear
+term.  If the shape of the profile has to change, it is still possible
+to sample either the Gaussian or the exponential function with major
+savings in computation time.
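+
+A sketch of the lookup table approach (Python with SciPy; the
+Gaussian profile and the tabulated range are illustrative): tabulate
+the unit width profile once, then take the zeroth through second
+derivatives from the spline, the width entering only through the
+linear scaling of the argument.
+
+import numpy as np
+from scipy.interpolate import splrep, splev
+
+# Tabulate a unit width profile once (Gaussian assumed for illustration).
+u = np.linspace(-6.0, 6.0, 121)
+table = splrep(u, np.exp(-0.5 * u * u), s=0)
+
+def profile(x, center, width):
+    # Zeroth through second derivatives with respect to x, evaluated
+    # from the lookup table instead of the exponential (chain rule
+    # handles the width scaling).
+    t = (x - center) / width
+    p0 = splev(t, table)
+    p1 = splev(t, table, der=1) / width
+    p2 = splev(t, table, der=2) / width ** 2
+    return p0, p1, p2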
+
+
+
+3.8 Efficient Extraction (pg. 12)
+
+     The reported time of 3 CPU hours to extract the spectra from an
+800 line image is excessive for a linear solution. I would estimate
+the time required for the 800 linear banded matrix solutions at 4-8
+minutes, with a comparable time required for matrix setup if it is
+done efficiently. I suspect that the present code is not setting up
+the linear banded matrix efficiently (not sampling the model
+efficiently). Pixel rejection should not seriously affect the timings
+assuming that bad pixels are not detected in most image lines.
+
+
+
+4. Correcting for Variations in the PSF
+
+ For all low signal to noise data it is desirable to correct for
+variations in the point spread function, caused by variable focus,
+scattering, or whatever. This does not seem such a difficult problem
+since the width of the line profile is directly correlated with the
+width of the PSF and the information is provided by the current model
+at each point in each extracted spectrum. The extracted spectra can
+be corrected for the variation in the PSF by convolution with a spread
+function the width of which varies along the spectrum.
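+
+A sketch of such a correction (Python with NumPy; the Gaussian spread
+function and the choice of per pixel width are assumptions for
+illustration): convolve the extracted spectrum with a kernel whose
+width may change at every point along the spectrum.
+
+import numpy as np
+
+def varying_width_convolve(spectrum, sigma):
+    # sigma[i] is the kernel width (in pixels) to apply at point i;
+    # it could be chosen, for example, to bring every point of the
+    # spectrum to a common resolution.
+    n = spectrum.size
+    x = np.arange(n)
+    out = np.empty(n)
+    for i in range(n):
+        s = max(sigma[i], 1.0e-3)
+        kernel = np.exp(-0.5 * ((x - i) / s) ** 2)
+        out[i] = np.dot(kernel, spectrum) / kernel.sum()
+    return out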