.nr PS 9 .nr VS 11 .RP .ND .TL APEXTRACT Package Revisions Summary: IRAF Version 2.10 .AU Francisco Valdes .AI IRAF Group - Central Computer Services .K2 P.O. Box 26732, Tucson, Arizona 85726 September 1990 .AB This paper summarizes the changes in Version 3 of the IRAF \fBapextract\fR package which is part of IRAF Version 2.10. The major new features and changes are: .IP \(bu New techniques for cleaning and variance weighting extracted spectra .IP \(bu A new task, \fBapall\fR, which integrates all the parameters used for one dimensional extraction of spectra .IP \(bu A new extended output format for recording both weighted and unweighted extractions, subtracted background, and variance information. .IP \(bu Special featurers for automatically numbering and identifying large numbers of apertures. .IP \(bu New tasks and algorithms, \fBaprecenter\fR and \fBapresize\fR, for automatically recentering and resizing aperture definitions .IP \(bu A new task, \fBapflatten\fR, for creating flat fields from fiber and slitlet spectra .IP \(bu A new task, \fBapfit\fR, providing various types of fitting for two dimensional multiobject spectra. .IP \(bu A new task, \fBapmask\fR, for creating mask images from aperture definitions. .AE .NH Introduction .PP A new version of the IRAF \fBapextract\fR package has been completed. It is Version 3 and is part of IRAF Version 2.10. The package will be made available as an external package prior to the release of V2.10. This paper describes the changes and new features of the package. It does not describe them in detail. Full details of the algorithms, functions, and parameters are found in the task descriptions. Reference is made to the previous version so familiarity with that version is useful though not necessary. There were three goals for the new package: new and improved cleaning and variance weighting (optimal extraction) algorithms, the addition of recommended or desirable new tasks and algorithms (particularly to support large numbers of spectra from fiber and aperture mask instruments), and special support for the new image reduction scripts. Features relating to the last point are not discussed here. .PP Table 1 summarizes the major new features and changes in the package. .ce Table 1: Summary of Major New Features and Changes .IP \(bu New techniques for cleaning and variance weighting extracted spectra .IP \(bu A new task, \fBapall\fR, which integrates all the parameters used for one dimensional extraction of spectra .IP \(bu A new extended output format for recording both weighted and unweighted extractions, subtracted background, and variance information. .IP \(bu Special featurers for automatically numbering and identifying large numbers of apertures. .IP \(bu New tasks and algorithms, \fBaprecenter\fR and \fBapresize\fR, for automatically recentering and resizing aperture definitions .IP \(bu A new task, \fBapflatten\fR, for creating flat fields from fiber and slitlet spectra .IP \(bu A new task, \fBapfit\fR, providing various types of fitting for two dimensional multiobject spectra. .IP \(bu A new task, \fBapmask\fR, for creating mask images from aperture definitions. .NH Cleaned and Variance Weighted Extractions: apsum and apall .PP There are two types of aperture extraction (estimating the background subtracted flux across a fixed width aperture at each image line or column) just as in the previous version. One is a simple sum of pixel values across an aperture. In the previous version this was called "profile" weighting while in this version it is simply called unweighted or "none". The second type weights each pixel in the sum by its estimated variance based on a spectrum model and detector noise parameters. As before this type of extraction is selected by specifying "variance" for the weighting parameter. .PP Variance weighting is often called "optimal" extraction since it produces the best unbiased signal-to-noise estimate of the flux in the two dimensional profile. It also has the advantage that wider apertures may be used without penalty of added noise. The theory and application of this type of weighting has been described in several papers. The ones which were closely examined and used as a model for the algorithms in this software are \fIAn Optimal Extraction Algorithm for CCD Spectroscopy\fR, \fBPASP 98\fR, 609, 1986, by Keith Horne and \fIThe Extraction of Highly Distorted Spectra\fR, \fBPASP 100\fR, 1032, 1989, by Tom Marsh. .PP The noise model for the image data used in the variance weighting, cleaning, and profile fitting consists of a constant gaussian noise and a photon count dependent poisson noise. The signal is related to the number of photons detected in a pixel by a gain parameter given as the number of photons per data number. The gaussian noise is given by a readout noise parameter which is a defined as a sigma in photons. The poisson noise is approximated as gaussian with sigma given by the number of photons. The method of specifying this noise model differs from the previous version in that the more common CCD detector parameters of readout noise and gain are used rather than the linear variance parameters "v0" and "v1". .PP Some additional effects which should be considered in principle, and which are possibly important in practice, are that the variance estimate should be based on the actual number of photons detected before correction for pixel sensitivity; i.e. before flat field correction. Furthermore the uncertainty in the flat field should also be included in the weighting. However, the profile must be determined free of sensitivity effects including rapid larger scale variations such as fringing. Thus, ideally one should input the unflat-fielded observation and the flat field data and carry out the extractions with the above points in mind. However, due to the complexity often involved in basic CCD reductions and special steps required for producing spectroscopic flat fields this level of sophistication is not provided by the current package. .PP The package does provide, however, for propagation of an approximate uncertainty in the background estimate when using background subtraction. If background subtraction is done, a background variance is computed using the poisson noise model based on the estimated background counts. Because the background estimate is based on a finite number of pixels, the poisson variance estimate is divided by the number (minus one) of pixels used in determining the background. The number of pixels used includes any box car smoothing. Thus, the larger the number of background pixels the smaller the background noise contribution to the variance weighting. This method is only approximate since no correction is made for the number of degrees of freedom and correlations when using the fitting method of background estimation. .PP If removal of cosmic rays and other deviant pixels is desired (called cleaning) they are iteratively rejected based on the estimated variance and excluded from the weighted sum. Unlike the previous version, a cleaned extraction is always variance weighted. This makes sense since the detector noise parameters must be specified and the spectrum profile computed, so all of the computational effort must be done anyway, and the variance weighting is as good or superior to a simple unweighted extraction. .PP The detection and removal of deviant pixels is straightforward. Based on the noise model, pixels deviating by more than a specified number of sigma (square root of the variance) above or below the model are removed from the weighted sum. A new spectrum estimate is made and the rejection is repeated. The rejections are made one at a time starting with the most deviant and up to half the pixels in the aperture may be rejected. .NH Spectrum Profile Determination: apsum, apall, apflatten, apfit .PP The foundation of variance weighted or optimal extraction, cosmic ray detection and removal, two dimensional flat field normalization, and spectrum fitting and modeling is the accurate determination of the spectrum profile across the dispersion as a function of wavelength. The previous version of the \fBapextract\fR package accomplished this by averaging a specified number of profiles in the vicinity of each wavelength after correcting for shifts in the center of the profile. This technique was sensitive to perturbations from cosmic rays and the exact choice of averaging parameters. The current version of the package uses a different algorithm, actually a combination of two algorithms, which is much more stable. .PP The basic idea is to normalize each profile along the dispersion to unit flux and then fit a low order function to sets of unsaturated points at nearly the same point in the profile parallel to the dispersion. The important point here is that points at the same distance from the profile center should have the nearly the same values once the continuum shape and spectral features have been divided out. Any variations are due to slow changes in the shape of the profile with wavelength, differences in the exact point on the profile, pixel binning or sampling, and noise. Except for the noise, the variations should be slow and a low order function smoothing over many points will minimize the noise and be relatively insensitive to bad pixels such as cosmic rays. Effects from bad pixels may be further eliminated by chi-squared iteration and clipping. Since there will be many points per degree of freedom in the fitting function the clipping may even be quite aggressive without significantly affecting the profile estimates. Effects from saturated pixels are minimized by excluding from the profile determination any profiles containing one or more saturated pixels. .PP The normalization is, in fact, the one dimensional spectrum. Initially this is the simple sum across the aperture which is then updated by the variance weighted sum with deviant pixels possibly removed. This updated one dimensional spectrum is what is meant by the profile normalization factor in the discussion below. The two dimensional spectrum model or estimate is the product of the normalization factor and the profile. This model is used for estimating the pixel intensities and, thence, the variances. .PP There are two important requirements that must be met by the profile fitting algorithm. First it is essential that the image data not be interpolated. Any interpolation introduces correlated errors and broadens cosmic rays to an extent that they may be confused with the spectrum profile, particularly when the profile is narrow. This was one of the problems limiting the shift and average method used previously. The second requirement is that data fit by the smoothing function vary slowly with wavelength. This is what precludes, for instance, fitting profile functions across the dispersion since narrow, marginally sampled profiles require a high order function using only a very few points. One exception to this, which is sometimes useful but of less generality, is methods which assume a model for the profile shape such as a gaussian. In the methods used here there is no assumption made about the underlying profile other than it vary smoothly with wavelength. .PP These requirements lead to two fitting algorithms based on how well the dispersion axis is aligned with the image columns or lines. When the spectra are well aligned with the image axes one dimensional functions are fit to the image columns or lines. Small excursions of a few pixels over the length of the spectrum can be adequately fit in this way. When the spectra become strongly tilted then single lines or columns may cross the actual profile relatively quickly causing the requirement of a slow variation to be violated. One thought is to use interpolation to fit points always at the same distance from the profile. This is ruled out by the problems introduced by image interpolation. However, there is a clever method which, in effect, fits low order polynomials parallel to the direction defined by tracing the spectrum but which does not interpolate the image data. Instead it weights and couples polynomial coefficients. This method was developed by Tom Marsh and is described in detail in the paper, \fIThe Extraction of Highly Distorted Spectra\fR, \fBPASP 101\fR, 1032, Nov. 1989. Here we refer to this method as the Marsh algorithm and do not attempt to explain it further. .PP Both fitting algorithms weight the pixels by their variance as computed from the background and background variance if background subtraction is specified, the spectrum estimate from the profile and the spectrum normalization, and the detector noise parameters. The noise model is that described earlier. .PP The profile fitting can be iterated to remove deviant pixels. This is done by rejecting pixels greater than a specified number of sigmas above or below the expected value based on the profile, the normalization factor, the background, the detector noise parameters, and the overall chi square of the residuals. Rejected points are removed from the profile normalization and from the fits. .NH New Extraction Task: apall .PP All of the functions of the \fBapextract\fR package are actually part of one master program. The organization of the package into tasks by function with parameters to allow selection of some of the other functions, for example the aperture editor may be entered from virtually every task, was done to highlight the logic and organize the parameters into small sets. However, there was often confusion about which parameters were being used and the need to set parameters in one task, say \fBaptrace\fR, in order to use the trace option in another task, say \fBapsum\fR. In practice, for the most common function of extraction of two dimensional spectra to one dimension most users end up using \fBapsum\fR for all the functions. .PP In the new version, the old organization is retained (with the addition of new functions and some changes in parameters) but a new task, \fBapall\fR, is also available. This task contains all of the parameters needed for extraction with a parameter organization which is nicely formated for use with \fBeparam\fR. The parameters in \fBapall\fR are independent of the those in the other tasks. It is expected that many, if not most users will opt to use this task for spectrum extraction in preference to the individual functions. .PP The organization by function is still used in the documentation. This is still the best way to organize the descriptions of the various algorithms and parameters. As an example, the profile tracing algorithm is described in most detail under the topic \fBaptrace\fR. .NH Extraction Output Formats: apsum and apall .PP The extracted spectra are recorded in one, two, or three dimensional images depending on the \fIformat\fR and \fIextras\fR parameters. If the \fIextras\fR parameter is selected the formats are three dimensional with each plane in the third dimension containing associated information for the spectra in the first plane. This information includes the unweighted spectrum and a sigma spectrum (estimated from the variances and weights of the pixels extracted) when using variance weighting, and the background spectrum when background subtraction is used. When \fIextras\fR is not selected only the extracted spectra are output. .PP The formats are basically the same as in the previous version; onedspec, multispec, and echelle. In addition, the function of the task \fBapstrip\fR in the previous version has been transferred to the extraction tasks by simply specifying "strip" for the format. .PP There are some additions to the header parameters in multispec and echelle format. Two additional fields have been added to the aperture number parameter giving the aperture limits (at the reference dispersion point). Besides being informative it may be used for interpolating dispersion solutions spatially. A second, optional keyword per spectrum has been added to contain a title. This is useful for multiobject spectra. .NH Easier and Extended Aperture Identifications: apfind and apedit .PP When dealing with a large number of apertures, such as occur with multifiber and multiaperture data, the burden of making and maintaining useful aperture identifications becomes large. Several very useful improvements were made in this area. These improvements generally apply equally to aperture identifications made by the automated \fBapfind\fR algorithm and those made interactively using \fBapedit\fR. In the simplest usage of defining apertures interactively or with the aperture finding algorithm, aperture numbers are assigned sequentially beginning with 1. In the new version the parameter "order" allows the direction of increasing aperture numbers with respect to the direction of increasing pixel coordinate (either column or line) to be set. An "increasing" order parameter value numbers the apertures from left to right (the direction naturally plotted) in the same sense as the pixel coordinates. A "decreasing" order reverses this sense. .PP Some instruments, particularly multifiber instruments, produce nearly equally spaced spectra for which one wants to maintain a consistent numbering sequence. However, at times some spectra may be missing due to broken or unassigned fibers and one would like to skip an aperture identification number to maintain the same fiber assignments. To do this automatically, a new parameter called \fImaxsep\fR has been added. This parameter defines the maximum separation between two apertures beyond which a jump in the aperture sequence is made. In other words the sequence increment is given by rounding down the separation divided by this parameter. How accurately this value has to be specified depends on how large the gaps may be and the natural variability in the aperture positions. In conjunction with the minimum separation parameter this algorithm works quite well in accounting for missing spectra. .PP One flaw in this scheme is when the first spectrum is missing causing the identifications will be off. In this case the modified interactive aperture editing command 'o' asks for the aperture identification number of the aperture pointed at by the cursor and then automatically renumbers the other apertures relative to that aperture. The other possible flaw is identification of noise as a spetrum but this is controlled by the \fIthreshold\fR parameter and, provided the actual number of spectra is known, say by counting off a graph, then the \fInfind\fR parameter generally limits this type of problem. .PP A new attribute of an aperture is a title. If present this title is propagated through the extraction into the image headers. The title may be set interactively but normally the titles are supplied in another new feature, an "aperture identification" file specified by the parameter \fIapidtable\fR. This file provides the most flexibility in making aperture identification assignments. The file consists of lines with three fields, a unique aperture number, a beam or aperture type number which may be repeated, and the aperture title. The aperture identification lines from the file are assigned sequentially in the same order as would be done if using the default indexing including skipping of missing spectra based on the maximum separation. .PP By default the beam number is the same as the aperture number. When using an aperture identification file the beam number can be used to assign spectrum types which other software may use. For example, some of the specialized fiber reduction packages use the beam number to identify sky fibers and embedded arc fibers. .NH New Aperture Recentering Task: aprecenter .PP An automated recentering algorithm has been added. It may be called through the new \fBaprecenter\fR command, from any of the tasks containing the \fIrecenter\fR parameter, or from the aperture editor. The purpose of this new feature is to allow automatically adjusting the aperture centers to follow small changes in the positions of spectra expected to be at essentially the same position, such as with fiber fed spectrographs. This does not change the shape of the trace but simply adds a shift across the dispersion axis. .PP Typically, one uses a strong image to define reference apertures and then for subsequent objects uses the reference positions with a recentering to correct for flexure effects. However, it may be inappropriate to base a new center on weak spectra or to have multiple spectra recentered independently. The recentering options provide for selecting specific apertures to be recentered, selecting only a fraction of the strongest (highest peak data level) spectra and averaging the shifts determined (possible from only a subset of the spectra) and applying the average shift to all the apertures. Note that one may still specify the dispersion line and number of dispersion lines to sum in order to improve the signal for centering. .NH New Aperture Resizing Task: apresize .PP An automated resizing algorithm has been added. It may be called through the new \fBapresize\fR command, from any of the tasks containing the \fIresize\fR parameter, or from the aperture editor with the new key 'z' (the y cursor level command is still available with the 'y' key). The purpose of this new feature is to allow automatically adjusting the aperture widths to follow changes in seeing and to provide a greater variety of global aperture sizing methods. .PP In all the methods the aperture limits are set at the pixel positions relative to the center which intersect the linearly interpolated data at some data value. The methods differ in how the data level is determined. The methods are: .IP \(bu Set size at a specified absolute data level .IP \(bu Set size at a specified data level above a background .IP \(bu Set size at a specified fraction of the peak pixel value .IP \(bu Set size at a specified fraction of the peak pixel value above a background .LP The automatic background is quite simple; a line connecting the first local minima from the aperture center. .PP The limits determined by one of the above methods may be further adjusted. The limits may be increased or decreased by a specified fraction. This allows setting wider limits based on more accurately determined limits from the stronger part of the profile; for example doubling the limits obtained from the half intensity point. A maximum extent may be imposed. Finally, if there is more than one aperture and one wants to maintain the same aperture size, the apertures sizes determined individually may be averaged and substituted for all the apertures. .NH New Aperture Mask Output: apmask .PP A new task, \fBapmask\fR, has been added to produce a mask file/image of 1's and 0's defined by the aperture definitions. This is based on the new IRAF mask facilities. The output is a compact binary file which may be used directly as an image in most applications. In particular the mask file can be used with tasks such as \fBimarith\fR, \fBimreplace\fR, and \fBdisplay\fR. Because the mask facility is new, there is little that can be done with masks other than using it as an image. However, eventually many tasks will be able to use mask images. The aperture mask will be particularly well suited to work with \fBimsurfit\fR for fitting a surface to the data outside the apertures. This would be an alternative for scattered light modeling to the \fBapscatter\fR tasks. .NH Aperture Flat Fields and Normalization: apflatten and apnormalize .PP Slitlet, echelle, and fiber spectra have the characteristic that the signal falls off to near zero values outside regions of the image containing spectra. Also fiber profiles are usually undersampled causing problems with gradients across the pixels. Directly dividing by a flat field produces high noise (if not division by zero) where the signal is low, introduces the spectrum of the flat field light, and changes the profile shape. .PP One method for modifying the flat field to avoid these problems is provided by the task \fBimred.generic.flat1d\fR. However, this task does not use any knowledge of where the spectra are. There are two tasks in the \fBapextract\fR package which can be used to modify flat field images. \fBapnormalize\fR is not new. It divides the spectra within specified apertures by a one dimensional spectrum, either a constant for simple throughput normalization or some smoothed version of the spectrum in the aperture to remove the spectral shape. Pixels outside specified apertures are set to unity to avoid division effects. This task has the effect of preserving the profile shape in the flat field which may be desired for attempts to remove slit profiles. .PP Retaining the profile shape of the flat field can give very bad edge effects, however, if there is image flexure. A new task similar to \fBflat1d\fR but which uses aperture information is \fBapflatten\fR. It uses the spectrum profile model described earlier. For nearly image axes aligned spectra this amounts very nearly to the line or column fitting of \fBflat1d\fR. As with \fBapnormalize\fR there is an option to fit the one dimensional spectrum to remove the large scale shape of the spectrum while preserving small scale sensitivity variations. The smoothed spectrum is multiplied by the normalized profile and divided into the data in each aperture. Pixels outside the aperture are set to 1. Pixels with model values below a threshold are also set to 1. This produces output images which have the small scale sensitivity variations, a normalized mean, and the spectrum profile removed. .NH Two Dimensional Spectrum Fitting: apfit .PP The profile and spectrum fitting used for cleaning and variance weighted extraction may be used and output in the new task \fBapfit\fR. The task \fBapfit\fR is similar in structure to \fBfit1d\fR. One may output the fit, difference, or ratio. The fit may be used to examine the spectrum model used for the cleaning and variance weighted extraction. The difference and ratio may used to display small variations and any deviant pixels. While specific uses are not given this task will probably be used in interesting ways not anticipated by the author. .NH I/O and Dispersion Axis Parameters: apextract and apio .PP The general parameters, primarily concerning input and output devices and files, were previously in the parameter set \fBapio\fR. This "pset" task has been removed and those parameters are now found as part of the package parameters, i.e. \fBapextract\fR. There is one new parameter in the \fBapextract\fR package parameters, dispaxis. In the previous version of the package one needed to run the task \fBsetdisp\fR to insert information in the image header identifying the dispersion direction of the spectra in the image. Often people would forget this step and receive an error message to that effect. The new parameter allows skipping this step. If the DISPAXIS image header parameter is missing the package parameter value is inserted into the image header as part of the processing. Note that if the parameter is present in the image header either because \fBsetdisp\fR was used or the image creation process inserted it (a future ideal case) then that value is used in preference to the package parameter. .NH Strip Extraction: apstrip .PP The task \fBapstrip\fR from the previous version has been removed. However, it is possible to obtain two dimensional strips aligned with the image axes by specifying a format of "strip" when using \fBapsum\fR or \fBapall\fR. While the author doesn't anticipate a good scientific use for this feature others may find it useful.