.help onedspec Sep84 "Spectral Reductions"
.ce
\fBOne Dimensional Spectral Reductions\fR
.ce
Analysis and Discussion
.ce
September 4, 1984
.sp 3
.nh
Introduction

    The \fBonedspec\fR package is a collection of programs for the reduction
and analysis of one dimensional spectral data.  The more general problem of
operations upon one dimensional images or vectors shall be dealt with elsewhere,
primarily in the \fBimages\fR and \fBplot\fR packages.  The problems of getting
data in and out of the system are handled by the \fBdataio\fR package, at least
for the standard data formats such as FITS.

The operators provided in \fBonedspec\fR shall be general purpose and, as far
as possible, independent of the instrument which produced the data.  Instrument
dependent reductions tailored for specific instruments will be implemented as
subpackages of the \fBimred\fR (image reductions) package.  For example,
the subpackages \fBiids\fR and \fBirs\fR will be provided in \fBimred\fR for
reducing data from the KPNO instruments of the same name.  The \fBimred\fR
packages shall call upon the basic operators in \fBonedspec\fR, \fBimages\fR,
and other packages to reduce the data for a specific instrument.


.ks
.nf
	iids(etc)
		imred
		imredtools
		onedspec
			plot
			tv
				dataio
				images
					dbms
					lists
					system
					language

.fi
.ce
Relationship of \fBOnedspec\fR to other IRAF Packages
.ke


The relationship of the \fBonedspec\fR package to other related packages in
the IRAF system is shown above.  A program (CL script) in a package at one
level in the hierarchy may only call programs in packages at lower levels.
The system will load packages as necessary if not already loaded by the
user.  The user is expected to be familiar with the standard system packages.

.nh
Basic Functions Required for One-Dimensional Spectral Reductions

    The following classes of functions have been identified (in the preliminary
specifications document for \fBonedspec\fR) as necessary to perform basic one
dimensional spectral reductions.  Only a fraction of the functionality
required is specific to the reduction of spectral data and is therefore
provided by the \fBonedspec\fR package itself.

.ls Transport
Provided by the \fBdataio\fR package, although we do not currently have a
reader for REDUCER format data tapes.  Readers for all standard format
tapes are either available or planned.
.le
.ls Mathematical
Standard system functions provided by \fBimages\fR (arithmetic, forward and
inverse FFT, filtering, etc.).
.le
.ls Reduction Operators
The heart of \fBonedspec\fR.  Operators are required (at a minimum) for
coincidence correction, dispersion determination and correction, flat
fielding, sky subtraction, extinction correction, and flux calibration.
Operators for flat fielding and sky subtraction are already available elsewhere
in IRAF.  Basic continuum fitting and subtraction is possible with existing
software but additional algorithms designed for spectral data are desirable.
.le
.ls Plotting
Standard system functions provided by the \fBplot\fR package.
.le
.ls Utilities
Standard system functions provided by the \fBdbms\fR package.
.le
.ls Artificial Spectra
These functions belong in the \fBartdata\fR package, but it is expected that
prototype operators will be built as part of the initial \fBonedspec\fR
development.
.le

.nh
Data Structures

    Spectra will be stored as one or two dimensional IRAF images embedded in
database format files.  A free format header is associated with each image.
Spectra may be grouped together as lines of a two dimensional image provided
all can share the same header, but more commonly each image will contain a
single spectrum.  The second image dimension, if used, will contain vectors
directly associated with the images, such as a signal to noise vector.
If the image is two dimensional the spectrum must be the first image line.
The database facilities will allow images to be grouped together in a single
file if desired.

While most or all \fBonedspec\fR operators will expect a one dimensional
image as input, image sections may be used to operate on vector subsections
of higher dimensioned images if desired.  The datatype of an image is
arbitrary, but all pixel data will be single precision real within
\fBonedspec\fR.  While the IRAF image format does not impose any restrictions on
the size of an image or image line, not all spectral operators may be usable
on very large images.  In general, pointwise and local operations may easily
be performed on images of any size with modest memory requirements, and
most of the \fBonedspec\fR operations appear to fall into this class.

.nh 2
The IRAF Database Facilities

    An understanding of the IRAF database facilities is necessary to visualize
how data will be treated by operators in \fBonedspec\fR and other packages.
The database facilities will be used not just for image storage but also for
program intercommunication, program output, and the storage of large
astronomical catalogs (e.g. the SAO catalog).  Access to both small and
large databases will be quite efficient; achieving this requires little
innovation since database technology is already highly developed.  We begin by
defining some important terms.

.ls
.ls DBIO
The database i/o package, used by compiled programs to access a database.
.le
.ls DBMS
The database management package, a CL level package used by the user to
inspect, analyze, and manipulate the contents of a database.
.le
.ls database
A set of one or more "relations" or tables (DBIO is a conventional relational
database).  A convenient way to think of an IRAF database is as a directory.
The relations appear as distinct files in the directory.
.le
.ls relation
A relation is a set of \fBrecords\fR.  Each record consists of a set of
\fBfields\fR, each characterized by a name and a datatype.  All the records
in a relation have the same set of fields.  Perhaps the easiest way to
visualize a relation is as a \fBtable\fR.  The rows and columns of the table
correspond to the records and fields of the relation.
.le
.ls field
A field of a record is characterized by an alphanumeric name, datatype, and
size.  Fields may be one dimensional arrays of variable size.  Fields may be
added to a relation dynamically at run time.  When a new field is added to
a relation it is added to all records in the relation, but the value of the
field in a particular record is undefined (and consumes no storage) until
explicitly written into.
.le
.ls key
.br
A function of the values of one or more fields, used to select a subset of
rows from a table.  Technically, a valid key will permit selection of any
single row from a table, but we often use the term in a less strict sense.
.le
.le


An \fBimage\fR appears in the database as a record.  The record is really
just the image header; the pixels are stored in a separate file external to
the database, with only the name of the pixel storage file kept in the record
itself (for very small images we are considering storing the pixels directly
in the database file).  Note that the record is a simple flat structure;
this simple structure places restrictions on the complexity of objects which
can be stored in the database.

The records in a relation form a set, not an array.  Records are referred to
by a user-defined key.  A simple key might be a single field containing a
unique number (like an array index), or a unique name.  More complex keys
might involve pattern matching over one or more fields, selection of records
with fields within a certain range of values, and so on.

From the viewpoint of \fBonedspec\fR, a relation can be considered a
\fBdata group\fR, consisting of a set of \fBspectra\fR.

.nh 2
Image Templates

    The user specifies the set of spectra to be operated upon by means of an
image template.  Image templates are much like the filename templates commonly
used in operating systems.  The simplest template is the filename of
a single data group; this template matches all spectra in the group.  If there
is only one spectrum in a file, then only one spectrum is operated upon.
A slightly more complex template is a list of filenames of data groups.
More complex templates will permit use of expressions referencing the values
of specific fields to select a subset of the spectra in a group.  The syntax
of such expressions has not yet been defined (examples are given below
nonetheless), but the function performed by an image template will be the same
regardless of the syntax.  In all cases the image template will be a single
string valued parameter at the CL level.

.nh 2
Standard Calling Sequence

    The standard calling sequence for a unary image operator is shown below.
The calling sequence for a binary operator would be the same, with a second
input parameter added as the second argument.  In general, any data dependent
control parameters should be implemented as positional arguments following
the primary operands, and data independent or optional (rarely used) parameters
should be implemented as hidden parameters.


.ks
.nf
	imop (input, output, data_dependent_control_params)

	imop	image operator name
	input	image template specifying set of input images
	output	filename of output datagroup

	data_dependent_control_parameters
	(hidden parameters)

for example,

	coincor (spectra, newgroup, dead_time)
.fi
.ke


If a series of spectra are to be processed it seems reasonable to add the
processed spectra to a new or existing data group (possibly the same as an
input datagroup).  If the operation is to be performed in place a special
notation (e.g. the null string) can be given as the output filename.
At the \fBonedspec\fR level output filenames will not be defaulted.
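
For concreteness, a sketch of what a unary operator such as \fBcoincor\fR
might compute on a single spectrum is shown below.  The non-paralyzable dead
time formula and the routine name are assumptions for illustration only, and
Python with NumPy is used merely as convenient pseudocode, not as the proposed
implementation.

.nf
	import numpy as np

	def coincidence_correct(counts, exptime, dead_time):
	    # Illustrative non-paralyzable dead time correction (assumed
	    # form, not necessarily the coincor algorithm).
	    #   counts    -- observed counts per pixel (1-D spectrum)
	    #   exptime   -- exposure time
	    #   dead_time -- dead time per event, in the same time units
	    observed_rate = counts / exptime
	    true_rate = observed_rate / (1.0 - observed_rate * dead_time)
	    return true_rate * exptime

	# e.g. correct one spectrum using a dead time parameter of .2
	spectrum = np.array([120.0, 450.0, 300.0])
	corrected = coincidence_correct(spectrum, exptime=600., dead_time=.2)
.fi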

.nh 2
Examples

    Some examples of image templates might be useful to give a more concrete
idea of the functionality which will be available.  Bear in mind that what we
are describing here is really the usage of one of the fundamental IRAF system
interfaces, the DBMS database management subsystem, albeit from the point of
view of \fBonedspec\fR.  The same facilities will be available in any program
which operates upon images, and in some non-image applications as well (e.g.
the new \fBfinder\fR).  Our philosophy, as always, is to make standard usage
simple, with considerable sophistication available for those with time to
learn more about the system.

The simplest case occurs when there is one spectrum per data group (file).
For example, assuming that the file "a" contains a single spectrum, the
command

	cl> coincor a, b, .2

would perform coincidence correction for spectrum A, placing the result in
B, using a dead time parameter of .2.  For a more complex example, consider
the following command:

	cl> coincor "a.type=obj&coincor=no,b", a, .2

This would perform coincidence correction for all spectra in group B plus all
object spectra in group A which have not already been coincidence corrected,
adding the corrected spectra to group A (notation approximate only).  If the
user does not trust the database, explicit record numbers may be used and
referenced via range list expressions, e.g.,

	cl> coincor "a.recnum=(1,5,7:11),b", a, .2

would select records 1, 5, and 7 through 11 from data group A.  Alternatively
the database utilities could be used to list the spectra matching the selection
criteria prior to the operation if desired.  For example,

	cl> db.select "a.type=obj"

would write a table on the standard output (the terminal) wherein each spectrum
in data group A is shown as a row of field values.  If one wanted to generate
an explicit list of records to be processed with help from the database
utilities, a set of records could be selected from a data group and selected
fields from each record written into a text file:

	cl> db.select "a.type=obj", "recnum, history" > reclistfile

The output file "reclistfile" produced by this command would contain the
fields "recnum" (record number) and "history" (description of processing
performed to generate the record).  The editor could be used to delete
unwanted records, producing a list of record numbers suitable for use as
an image template:

	cl> coincor "a.recnum=@reclistfile", a, .2
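
The range list notation used above could be expanded into an explicit list of
record numbers along the following lines.  This is only a sketch of the
intended semantics (the DBMS syntax has not yet been defined), written in
Python as pseudocode.

.nf
	def expand_ranges(rangelist):
	    # Expand a range list such as "1,5,7:11" into a sorted list
	    # of record numbers.
	    records = set()
	    for field in rangelist.split(","):
	        if ":" in field:
	            lo, hi = (int(v) for v in field.split(":"))
	            records.update(range(lo, hi + 1))
	        else:
	            records.add(int(field))
	    return sorted(records)

	# expand_ranges("1,5,7:11")  -->  [1, 5, 7, 8, 9, 10, 11]
.fi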

.nh
Reduction Operators

.nh 2
Line List Preparation

    I suggest maintaining the line lists as text files so that the user can
edit them with the text editor, or process them with the \fBlists\fR operators.
A master line list might be maintained in a database and the DBMS \fBselect\fR
operator used to extract ASCII linelists in the wavelength region of interest,
but this would only be necessary if the linelist is quite large or if a linelist
record contains many fields.  I don't think we need the \fBline_list\fR task.

.nh 2
Dispersion Solution

    The problem with selecting a line list and doing the dispersion solution
in separate operations is that the dispersion solution is invaluable as an aid
for identifying lines and for rejecting lines.  Having a routine which merely
tweaks up the positions of lines in an existing lineset (e.g., \fBalinid\fR)
is not all that useful.  I would like to suggest the following alternate
procedure for performing the dispersion solution for a set of calibration
spectra which have roughly the same dispersion.

.ls
.ls [1] Generate Lineset [and fit dispersion]
.sp
Interactively determine the lineset to be used, i.e., wavelength (or whatever)
and approximate line position in pixel units for N lines.  Input is one or more
comparison spectra and optionally a list of candidate lines in the region
of interest.  Output is the order for the dispersion curve and a linelist of
the following (basic) form:

	L#  X  Wavelength [Weight]

It would be very useful if the program, given a rough guess at the dispersion,
could match the standard linelist with the spectra and attempt to automatically
identify the lines thus detected.  The user would then interactively edit the
resultant line set using plots of the fitted dispersion curve to reject
misidentified or blended lines and to adjust weights until a final lineset
is produced.
.le

.ls [2] Fit Dispersion
.sp
Given the order and functional type of the curve to be fitted and a lineset
determined in step [1] (or a lineset produced by any other means, e.g. with
the editor), for each spectrum in the input data group tweak the center of
each line in the lineset via an automatic centering algorithm, fit the
dispersion curve, and save the coefficients of the fitted curve in the
image header.  The approximate line positions would be used to find and measure
the positions of the actual lines, and the dispersion curve would be fitted and
saved in the image header of each calibration spectrum.

While this operator would be intended to be used noninteractively, the default
textual and graphics output devices could be the terminal.  To use the program
in batch mode the user would redirect both the standard output and the graphics
output (if any), e.g.,

.nf
	cl> dispsol "night1.type=comp", linelistfile, order,
	>>> device=stdplot, > dispsol.spool &
.fi

Line shifts, correlation functions, statistical errors, the computed residuals
in the fitted dispersion curves, plots of various terms of the dispersion
curves, etc. may be generated to provide a means for later checking for
erroneous solutions to the individual spectra.  There is considerable room for
innovation in this area.
.le

.ls [3] Second Order Correction
.sp
If it is desired to interpolate the dispersion curve in some additional
dimension such as time or hour angle, fit the individual dispersion solutions
produced by [1] or [2] as a group to one or more additional dimensions,
generating a dispersion solution of one, two or more dimensions as output.
If the output is another one dimensional dispersion solution, the input
solutions are simply averaged with optional weights.  This "second order"
correction to a group of dispersion solutions is probably best performed by
a separate program, rather than building it into \fBalineid\fR, \fBdispsol\fR,
etc.  This makes the other programs simpler and makes it possible to exclude
spectra from the higher dimensional fit without repeating the dispersion
solutions.
.le
.le

If the batch run [2] fails for selected spectra the dispersion solution for
those spectra can be repeated interactively with operator [1].
The curve fitting package should be used to fit the dispersion curve (we can
extend the package to support \fBonedspec\fR if necessary).
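
As a concrete illustration of the fitting step in [2], the sketch below fits
a Chebyshev dispersion curve to a lineset of the form given in [1].  NumPy's
Chebyshev fit stands in for the IRAF curve fitting package, the linelist file
name is hypothetical, and the whole fragment is pseudocode for the computation
rather than the proposed implementation.

.nf
	import numpy as np
	from numpy.polynomial import Chebyshev

	def fit_dispersion(pixel, wavelength, weight=None, order=4):
	    # Fit wavelength as a Chebyshev polynomial of pixel position;
	    # the fit routine normalizes the pixel coordinates internally.
	    fit = Chebyshev.fit(pixel, wavelength, deg=order, w=weight)
	    residuals = wavelength - fit(pixel)
	    return fit, residuals

	# lineset columns (step [1]):  L#  X  Wavelength  [Weight]
	# assumes a purely numeric text linelist with a hypothetical name
	lineset = np.loadtxt("linelistfile")
	x, wave = lineset[:, 1], lineset[:, 2]
	dispsol, resid = fit_dispersion(x, wave, order=4)
.fi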

.nh 2
Dispersion Correction

    The function of this procedure is to change the dispersion of a
spectrum or group of spectra from one functional form to another.
At a minimum it must be possible to produce spectra linear in wavelength or
log wavelength (as specified), but it might also be useful to be able
to match the dispersion of a spectrum to that of a second spectrum, e.g., to
minimize the amount of interpolation required to register spectra, or
to introduce a nonlinear dispersion for testing purposes.  This might be
implemented at the CL parameter level by having a string parameter which
takes on the values "linear" (default), "log", or the name of a record
defining the dispersion solution to be matched.

It should be possible for the output spectrum to be a different size than
the input spectrum, e.g., since we are already interpolating the data,
it might be nice to produce an output spectrum of length 2**N if Fourier
analysis is to be performed subsequently.  It should be possible to
extract only a portion of a spectrum (perform subraster extraction) in the
process of correcting the dispersion, producing an output spectrum of a
user-definable size.  It should be possible for an output pixel to lie at
a point outside the bounds of the input spectrum, setting the value of the
output pixel to INDEF or to an artificially generated value.  Note that
this kind of generality can be implemented at the \fBonedspec\fR level
without compromising the simplicity of dispersion correction for a particular
instrument at the \fBimred\fR level.
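
A sketch of the basic rebinning computation follows: given a fitted dispersion
solution, resample a spectrum onto a grid linear in wavelength, setting output
pixels which fall outside the input spectrum to an undefined value.  Simple
linear interpolation stands in here for the image interpolation package, and
the routine and its parameters are illustrative assumptions only.

.nf
	import numpy as np

	def linearize(flux, dispsol, nout=None, w1=None, w2=None,
	              undef=np.nan):
	    # Rebin a spectrum onto a grid linear in wavelength, given a
	    # dispersion solution wavelength = dispsol(pixel), assumed to
	    # increase monotonically with pixel number.
	    pix = np.arange(len(flux))
	    wave = dispsol(pix)
	    nout = nout or len(flux)             # e.g. a power of two
	    w1 = wave[0] if w1 is None else w1   # output limits; may lie
	    w2 = wave[-1] if w2 is None else w2  # outside the input
	    wout = np.linspace(w1, w2, nout)
	    out = np.interp(wout, wave, flux, left=undef, right=undef)
	    return wout, out
.fi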

.nh 3
Line Centering Algorithms

    For most data, the best algorithm in the set described is probably the
parabola algorithm.  To reject nearby lines and avoid degradation of the
signal to noise the centering should be performed within a small aperture,
but the aperture should be allowed to move several pixels in either direction
to find the peak of the line.

The parabola algorithm described has these features,
but as described it finds the extremum within a window about the
initial position.  It might be preferable to simply walk up the peak nearest
to the initial center.  This has the advantage that it is possible to center
on a line which has a nearby, stronger neighbor which cannot itself be used
for some reason, but which might fall within \fBparextent\fR pixels of the
starting center.  The parabola algorithm as described also finds a local
extremum rather than a local maximum, which is probably not what is desired
for a dispersion
solution.  The restriction to 3 pixels in the final center determination is
bad; the width of the centering function must be a variable to accommodate
the wide range of samplings expected.

The parabola algorithm described is basically a grid search over
2*\fIparextent\fR pixels for the local extremum.  What I am suggesting is
an iterative gradient search for the local maximum.  The properties of the
two algorithms are probably sufficiently different to warrant implementation
of both as an option (the running times are comparable).  I suspect that
everyone else who has done this will have their own favorite algorithm as
well; probably we should study half a dozen but implement only one or two.
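
The alternative suggested above might look like the following sketch: walk
uphill from the initial center to the nearest local maximum, then refine the
center with a parabola fitted to the peak pixel and its two neighbors.  The
final fit is fixed at three pixels here only to keep the sketch short; as
noted above, the width should be a variable in a real implementation.

.nf
	import numpy as np

	def center_line(spectrum, x0, maxshift=5):
	    # Walk uphill from the initial center x0 to the nearest local
	    # maximum, moving at most maxshift pixels, then refine the
	    # center with a parabola through the peak and its neighbors.
	    data = np.asarray(spectrum, dtype=float)
	    i = int(round(x0))
	    for _ in range(maxshift):
	        if i + 1 < len(data) and data[i + 1] > data[i]:
	            i += 1
	        elif i - 1 >= 0 and data[i - 1] > data[i]:
	            i -= 1
	        else:
	            break                       # local maximum reached
	    i = min(max(i, 1), len(data) - 2)   # keep the fit in bounds
	    y0, y1, y2 = data[i - 1], data[i], data[i + 1]
	    denom = y0 - 2.0 * y1 + y2
	    if denom == 0.0:
	        return float(i)                 # flat top; no refinement
	    return i + 0.5 * (y0 - y2) / denom  # vertex of the parabola
.fi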

.nh 2
Field Flattening

    It is not clear that we need special flat fielding operators for
\fBonedspec\fR.  We have a two-dimensional operator which fits image lines
independently, and it might already do the job.  Probably we should experiment
with the smoothing spline and possibly also Fourier filtering for removing
the difficult medium frequency fluctuations.  The current \fBimred\fR flat field
operator implements the cubic smoothing spline (along with the Chebyshev and
Legendre polynomials), and is available for experimentation.

Building interactive graphics into the operator which fits a smooth curve to
the continuum is probably not necessary.  If a noninteractive \fBimred\fR or
\fBimages\fR operator is used to fit the continuum the interactive graphics
can still be available, but might better reside in a higher level CL script.
The basic operator should behave like a subroutine and not write any output
to the terminal unless enabled by a hidden parameter (we have been calling
this parameter \fIverbose\fR in other programs).
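
A sketch of such a subroutine-like, noninteractive fit is shown below: a cubic
smoothing spline is fit to the spectrum with sigma rejection between
iterations, and the fitted curve is divided out.  The scipy smoothing spline
is used purely as a stand-in for the IRAF curve fitting package; the routine,
its parameters, and the rejection scheme are assumptions for illustration.

.nf
	import numpy as np
	from scipy.interpolate import UnivariateSpline

	def fit_continuum(flux, smooth=None, niter=2, low=2.0, high=3.0):
	    # Fit a cubic smoothing spline to a spectrum, rejecting
	    # deviant pixels (lines) between iterations; return the
	    # normalized spectrum and the fitted curve.
	    x = np.arange(len(flux))
	    good = np.ones(len(flux), dtype=bool)
	    for _ in range(niter + 1):
	        fit = UnivariateSpline(x[good], flux[good], k=3, s=smooth)
	        resid = flux - fit(x)
	        sigma = resid[good].std()
	        good = (resid > -low * sigma) & (resid < high * sigma)
	    return flux / fit(x), fit(x)
.fi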

.nh 3
Extinction Correction and Flux Calibration

    I did not have time to review any of this.

.nh
Standard Library Packages

    The following standard IRAF math library packages should be used in
\fBonedspec\fR.  The packages are very briefly described here but are
fully documented under \fBhelp\fR on the online (kpnob:xcl) system.

.nh 2
Curve Fitting

    The curve fitting package (\fBcurfit\fR) is currently capable of fitting
the Chebyshev and Legendre polynomials and the cubic smoothing spline.
Weighting is supported as an option.
We need to add a piecewise linear function to support the
dispersion curves for the high resolution FTS spectra.  We may have to add a
double precision version of the package to provide the 8-10 digits of
precision needed for typical comparison line wavelength values, but
normalization of the wavelength values may make this unnecessary for moderate
resolution spectra.

Ordinary polynomials are not supported because their numerical properties are
very much inferior to those of orthogonal polynomials (the least squares
matrix can have a disastrously high condition number, and lacking
normalization the function being fitted is not invariant with respect to
scale changes and translations
in the input data).  For low order fits the Chebyshev polynomials are
considered to have the best properties from an approximation theoretic point
of view, and for high order fits the smoothing spline is probably best because
it can follow arbitrary trends in the data.
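
The conditioning argument can be made concrete with a small numerical
experiment, comparing the least squares design matrix for an ordinary power
series over raw wavelength values with that for Chebyshev polynomials over a
normalized interval (illustrative pseudocode; the wavelength range and order
are arbitrary).

.nf
	import numpy as np
	from numpy.polynomial import chebyshev, polynomial

	x = np.linspace(3500.0, 7500.0, 50)      # wavelengths in Angstroms
	order = 10

	vraw = polynomial.polyvander(x, order)   # x**k, unnormalized
	xn = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
	vcheb = chebyshev.chebvander(xn, order)  # T_k(x) on [-1, 1]

	print(np.linalg.cond(vraw))              # disastrously large
	print(np.linalg.cond(vcheb))             # modest
.fi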

.nh 2
Interpolation

    The image interpolation package (\fBiminterp\fR) currently supports the
nearest neighbor, linear, third and fifth order divided differences,
cubic interpolating spline, and sinc function interpolators.
We should add the zeroth and first order partial pixel ("flux conserving")
interpolants because they offer unique properties not provided by any
of the other interpolants.
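
A sketch of the partial pixel ("flux conserving") idea in its zeroth order
form: treat each input pixel as a bin of constant flux density, integrate,
interpolate the cumulative integral at the output pixel boundaries, and
difference.  The routine and its argument conventions are assumptions for
illustration only.

.nf
	import numpy as np

	def rebin_conserve(flux, edges_in, edges_out):
	    # Flux conserving rebinning (zeroth order): the integral of
	    # the spectrum over any interval covered by both grids is
	    # preserved.  edges_in and edges_out are the pixel boundary
	    # wavelengths (length n+1), assumed to be increasing.
	    widths = np.diff(edges_in)
	    cum = np.concatenate(([0.0], np.cumsum(flux * widths)))
	    cum_out = np.interp(edges_out, edges_in, cum)
	    return np.diff(cum_out) / np.diff(edges_out)
.fi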

.nh 2
Interactive Graphics

    We will define a standard interactive graphics utility package for
interactive operations upon data vectors (to be available in a system library
in object form).  It should be possible to define a general package which
can be used anywhere a data vector is to be plotted and
examined interactively (not just in \fBonedspec\fR).  Standard keystrokes
should be defined for common operations such as expanding a region of
the plot and restoring the original scale.  This will not be attempted
until an interactive version of the GIO interface is available later this
fall.
.endhelp