noao/onedspec/doc/fitprofs.hlp


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403

.help fitprofs Mar92 noao.onedspec
.ih
NAME
fitprofs -- Fit 1D profiles to features in image vectors
.ih
USAGE
fitprofs input
.ih
PARAMETERS
.ls input
List of input images to be fit.  The images may be one dimensional
spectra (one or more spectra per image) or long slit spectra.  Other
types of nonspectral images may also be used and for two dimensional
images the fitting direction will be determined from either the keyword
DISPAXIS in the image header or the \fIdispaxis\fR parameter.
.le
.ls lines = ""
List of lines, columns, or apertures to be selected from the input image
format.  The default empty list, "", selects all vectors in the images.
The syntax is a list of comma separated numbers or ranges, where a range
is a pair of hyphen separated numbers.
.le
.ls bands = ""
List of bands for 3D images.  The empty list, "", selects all bands.
.le
.ls dispaxis = ")_.dispaxis", nsum = ")_.nsum"
Parameters for defining vectors in 2D and 3D images.  The
dispersion axis is 1 for line vectors, 2 for column vectors, and 3 for band
vectors.  A DISPAXIS parameter in the image header has precedence over the
\fIdispaxis\fR parameter.  The default values defer to the package
parameters of the same name.
.le

The following are the fitting parameters.
.ls region = ""
Region of the input vectors to be fit specified as a pair of space
separated numbers.  The coordinates are defined in terms of the linear
image header coordinate parameters.  For dispersion corrected spectra this
is usually wavelength in Angstroms and for other data it is usually pixels.
A fitting region must be specified.
.le
.ls positions = ""
File of initial or fixed profile positions and (optional) peaks, profile
types, and widths.  The
format consists of lines with one or more whitespace separated fields.
The fields are the position, peak relative to the continuum with
negative values being absorption, profile type of gaussian, lorentzian,
or voigt, and the gaussian and/or lorentzian full width at half maximum.
Trailing fields may be missing and fields to be set from default parameters
or the image data (the peak value) may be given as INDEF.
Comments and any additional columns are ignored.  The positions and
widths are specified in the coordinate units of the image, usually
wavelength for dispersion corrected spectra and pixels otherwise.
.le
.ls background = ""
Background values defining the linear background.  If not specified the
single pixel values nearest the fitting region endpoints are used.
Otherwise two whitespace separated values are expected.  If a value is
a number then that is the background at the lower or upper end of the
fitting region (ordered in pixel space not wavelength).  The special
values "avg(w1,w2,z)" or "med(w1,w2,z)" (note that there can be no
whitespace) may be specified, where w1 and w2 are dispersion values, and z
is a multiplier.  This will take the average or median of pixels within the
specified range and multiply the result by the third argument.  The
dispersion point used for that value in computing the linear background is
the average of the dispersion coordinates of the pixels used.
.le
.ls profile = "gaussian" (gaussian|lorentzian|voigt)
Default profile type to be fit when a profile type is not specified in
the positions file.  The type are "gaussian", "lorentzian", or "voigt".
.le
.ls gfwhm = 20., lfwhm = 20.
Default gaussian and lorentzian full width at half maximum (FWHM).
These values are used for the initial and/or fixed width when they are
not specified in the position file.
.le
.ls fitbackground = yes
Fit the background?  If "yes" a linear background across the fitting region
will be fit simultaneously with the profiles.  If "no" the background will
be fixed.
.le
.ls fitpositions = "all"
Position fitting option.  This may be "fixed" to fix all positions at their
initial values, "single" to fit a single shift to the positions while
keeping their separations fixed, or "all" to independently fit all the
positions.
.le
.ls fitgfwhm = "all", fitlfwhm = "all"
Profile width fitting options.  These may be "fixed" to fix all widths
at their initial values, "single" to fit a single scale factor to the initial
widths, or "all" to independently fit all the widths.
.le

The following parameters are used for error estimates as described
below in the ERROR ESTIMATES section.
.ls nerrsample = 0
Number of samples for the error computation.  A value less than 10 turns
off the error computation.  A value of ~10 does a rough error analysis, a
value of ~50 does a reasonable error analysis, and a value >100 does a
detailed error analysis.  The larger this value the longer the analysis
takes.
.le
.ls sigma0 = INDEF, invgain = INDEF
The pixel sigmas are modeled by the formula:

.nf
    sigma**2 = sigma0**2 + invgain * I
.fi

where I is the pixel value and "**2" means the square of the quantity.  If
either parameter is specified as INDEF or with a value less than zero then
no sigma estimates are made and so no error estimates for the measured
parameters is made.
.le

The following parameters determine the output of the task.
.ls components = ""
All profiles defined by the position file are simultaneously fit but only
a subset of the fitted profiles may be selected for output.  A profile
or component is identified by the order number in the position file;
i.e. the first entry in the position file is 1, the second is 2, etc.
The components to be output are specified by a range list.  The empty
list, "", selects all profiles.
.le
.ls verbose = yes
Print fitting results and record of output images created on the
standard output (normally the terminal).
The fitting information is printed to the logfile so there is normally
no need to redirect this output.  The output may be turned off when
the task is run as a background task.
.le
.ls logfile = "logfile"
Logfile for fitting results.  If not specified the results will not be
logged.
.le
.ls plotfile = "plotfile"
File to contain plot output.  The plots show the image vector with
overplots of the total fit, the individual components, and the residuals.
The plotfile may be examined and manipulated later with tools such as
\fBgkimosaic\fR.
.le
.ls output = ""
List of output images.  If not specified then no output images are created.
If images are specified the list is matched with the input list.
.le
.ls option = "fit" (fit|difference)
Image output option.  The choices are "fit" to output the fitted image
vector which is the sum of the fitted profiles (without a background),
or "difference" to output the data with the profiles subtracted.
.le
.ls clobber = no, merge = no
Clobber or modify any existing output images?  If clobbering is not
enabled a warning is printed and any existing output images are not
modified.  If clobbering is enabled then either new images are created
if merge is "no" or the new fits are merged with the existing images.
Merging is meaningful when only a subset of the input is fit such
as selected lines or apertures.
.le
.ih
DESCRIPTION
\fBFitprofs\fR fits one dimensional profile functions to image vectors
and outputs the fitting parameters, plots, and model or residual
image vectors.  This is done noninteractively using a file of initial
profile positions and widths.  Interactive profile fitting may be
done with the deblending option of \fBsplot\fR or
\fBstsdas.fitting.ngaussfit\fR.

The input consists of images in a variety of formats.  These include
all the spectral formats as well as standard images.  For two dimensional
images (or the first 2D plane of higher dimensional images) either the
lines or columns may be fit with possible summing of adjacent lines or
columns to increase the signal-to-noise.  A subset of the image apertures,
lines, or columns may be specified or all image vectors may be fit.

The fitting parameters consist of a fitting region, a list of initial
positions, peaks, and widths, initial background endpoints, the fitting
function, and the parameters to be fit or constrained.  The coordinates and
units used for the positions and widths are those defined by the standard
linear coordinate header parameters.  For dispersion corrected spectra
these are generally wavelengths in Angstroms and otherwise they are
generally pixels.  A fitting region must be specified by a pair of
numbers.

The background parameter may be left empty to select the pixel values at
the endpoints of the fitting region for defining the initial linear
background.  Or values at the endpoints of the fitting region may be given
explicitly in pixel space order (i.e. the first value is for the edge of
the fitting region which has smaller pixel coordinate0 Values can also be
computed from the data using the functions "avg(w1,w2)" or "med(w1,w2)"
where w1 and w2 are dispersion coordinates.  The pixels in the specified
range are average or medianed and the dispersion point for the linear
background is the average of the dispersion coordinates of the pixels.

The position list file consists of one or more columns.
The format of this file has
one or more columns.  The columns are the wavelength, the peak value
(relative to the continuum with negative values being absorption),
the profile type (gaussian, lorentzian, or voigt), and the
gaussian and/or lorentzian FWHM.  End columns may be missing
or INDEF values may be specified to use the default parameter
values (the profile and widths) or determine the peak from the data.
Below are examples of the file line formats

.nf
	wavelength
	wavelength peak
	wavelength peak (gaussian|lorenzian|voigt)
	wavelength peak gaussian gfwhm
	wavelength peak lorentzian lfwhm
	wavelength peak voigt gfwhm
	wavelength peak voigt gfwhm lfwhm

	1234.5			<- Wavelength only
	1234.5 -100		<- Wavelength and peak
	1234.5 INDEF v		<- Wavelength and profile type
	1234.5 INDEF g 12	<- Wavelength and gaussian FWHM
.fi

where peak is the peak value, gfwhm is the gaussian FWHM, and lfwhm is
the lorentzian FWHM.  This format is the same as used by \fBsplot\fR
and also by \fBartdata.mk1dspec\fR (except in the latter case the
peak is normalized to a continuum of 1).

The profile parameters fit are the central position, the peak amplitude,
and the profile widths.  The fitting may be constrained in number of ways.
The linear background may be fixed or simultaneously fit with the
profiles.  The profile positions may be fixed, the relative separations
fixed but a single zero point shift fit, or all positions may be fit
simultaneously.  The profile widths may also be fixed, the relative ratios
of the widths fixed while fitting a single scale factor, or all widths fit
simultaneously.  The profile amplitudes are always fit.

The fitting technique uses a nonlinear iterative Levenberg-Marquardt
algorithm to reduce the Chi-square of the fit.  The execution time
increases rapidly with the number of profiles fit so there is an
effective limit to the number of profiles that can be fit at once.

The output includes a number of formats.  The fitted parameters  are
recorded in a logfile (if specified) and printed on the standard
output (if the verbose flag is set).  This output includes the date,
image vector, fitting parameters used, and a table of fitted or
derived quantities.  The parameters included some quantities relevant to
spectral lines but others apply to any image data.  The quantities are
the profile center, the background or continuum at the center of the
profile, the integral or flux of the profile (which is negative for
profiles below the background), the equivalent width, the profile peak
amplitude or core value, and the profile full width at half
maximum.  Pure gaussian and lorentzian profiles will have one of
the widths set to zero while voigt profiles will have both values.

Summary plots are recored in a plotfile (if specified).  The plots
show the data with the total fit, individual profiles, and residuals
overplotted.  The plotfile may be examined and printed using the
task \fBgkimosaic\fR as well as other tasks which interpret GKI metacode.

The final output consists of images in the same format as the input.
The images  may be of the total fit (sum of profiles without background)
or of the difference (residuals) of the data minus the model.
.ih
ERROR ESTIMATES
Error estimates may be computed for the fitted parameters.
This requires a model for the pixel sigmas.  Currently this
model is based on a Poisson statistics model of the data.  The model
parameters are a constant Gaussian sigma and an "inverse gain" as specified
by the parameters \fIsigma0\fR and \fIinvgain\fR.  These parameters are
used to compute the pixel value sigma from the following formula:

.nf
    sigma**2 = sigma0**2 + invgain * I
.fi

where I is the pixel value and "**2" means the square of the quantity.

If either the constant sigma or the inverse gain are specified as INDEF or
with values less than zero then no noise model is applied and no error
estimates are computed.  Also if the number of error samples is less than
10 then no error estimates are computed.  Note that for processed spectra
this noise model will not generally be the same as the detector readout
noise and gain.  These parameters would need to be estimated in some way
using the statistics of the spectrum.  The use of an inverse gain rather
than a direct gain was choosed to allow a value of zero for this
parameters.  This provides a model with constant uncertainties.

The error estimates are computed by Monte-Carlo simulation.  The model is
fit to the data (using the noise sigmas) and this model is used to describe
the noise-free spectrum.  A number of simulations, given by the
\fInerrsample\fR, are created in which random Gaussian noise is added to
the noise-free spectrum based on the pixel sigmas from the noise model.
The model fitting is done for each simulation and the absolute deviation of
each fitted parameter to model parameter is recorded.  The error estimate
for the each parameter is then the absolute deviation containing 68.3% of
the parameter estimates.  This corresponds to one sigma if the distribution
of parameter estimates is Gaussian though this method does not assume
this.

The Monte-Carlo technique automatically includes all effects of
parameter correlations and does not depend on any approximations.
However the computation of the errors does take a significant
amount of time.  The amount of time and the accuracy of the
error estimates depend on how many simulations are done.  A
small number of samples (of order 10) is fast but gives crude
estimates.  A large number (greater than 100) is slow but gives
very good estimates.  A compromise value of 50 is recommended
for many applications.

.ih
EXAMPLES
1.  The following example creates an artificial spectrum and fits it.
It requires the \fBartdata\fR and \fBproto\fR packages be loaded.

.nf
    cl> mk1dspec test slope=1 temp=0 lines=testlines nl=20
    cl> mknoise test rdnoise=10 poisson=yes
    cl> fields testlines fields=1,3 > fitlines
    cl> fitprofs test reg="4000 8000" pos=fitlines
    # Jul 27 17:49 test - Ap 1: 
    # Nfit=20, background=YES, positions=all, gfwhm=all, lfwhm=all
    #   center      cont      flux      eqw      core   gfwhm   lfwhm
      6832.611  1363.188  -13461.8    9.875  -408.339   30.97      0.
      7963.674  1507.641  -8193.58    5.435  -395.207   19.48      0.
      5688.055   1217.01  -7075.11    5.814  -392.006   16.96      0.
	6831.3   1363.02  -7102.01     5.21  -456.463   14.62      0.
      7217.335  1412.323   -10110.    7.158  -427.797    22.2      0.
      6709.286  1347.437  -4985.06      3.7  -225.346   20.78      0.
      6434.317  1312.319  -7121.03    5.426  -342.849   19.51      0.
      6130.415  1273.506    -6164.     4.84  -224.146   25.83      0.
      4569.375  1074.138   -3904.6    3.635  -183.963   19.94      0.
      5656.645  1212.999  -8202.81    6.762  -303.617   25.38      0.
       4219.53  1029.458  -5161.64    5.014  -241.135   20.11      0.
      4551.424  1071.845  -3802.61    3.548   -139.39   25.63      0.
      4604.649  1078.643  -5539.15    5.135  -264.654   19.66      0.
      6966.557  1380.294  -11717.5    8.489  -600.581   18.33      0.
      4259.019  1034.501  -4280.38    4.138  -213.446   18.84      0.
      5952.958  1250.843  -8006.98    6.401  -318.313   23.63      0.
       4531.89  1069.351  -712.598   0.6664  -155.197   4.313      0.
      7814.418  1488.579  -2926.49    1.966  -164.891   16.67      0.
      5310.929  1168.846  -10132.2    8.669  -487.502   19.53      0.
      5022.948  1132.066   -7532.8    6.654  -325.594   21.73      0.

.fi

2.  Suppose there is no obvious continuum level near the fitting
region but you want to specify a flat continuum level as the average
of pixels in a specified wavelength region.  The background region
would be specified as

.nf
    background = "avg(4250,4425.3) avg(4250,4425.3)"
.fi

Note that the value must be given twice to get a flat continuum.
.ih
REVISIONS
.ls FITPROFS V2.11.3
Modified to allow a more general specification of the background.
.le
.ls FITPROFS V2.11
Modified to include lorentzian and voigt profiles.  The parameters and
positions file format have changed in this version.  A new parameter
controls the number of Monte-Carlo samples used in the error estimates.
.le
.ls FITPROFS V2.10.3
Error estimates based on a simple noise model are now computed.
.le
.ls FITPROFS V2.10
This task is new.
.le
.ih
TIME REQUIREMENTS
The following CPU times were obtained with a Sun Sparcstation I.  The
number of pixels in the fitting region and the number of lines fit
were varied.   The worst case of fitting all parameters and a background
was considered as well as the constrained case of  fitting line positions
and a single width with fixed background.

.nf
	Npixels Nprofs Fitbkg Fitpos  Fitsig   CPU(sec)
	  100	   5	 yes	all	all	  1.9
	  100	  10	 yes	all	all	  3.3
	  100	  15	 yes	all	all	  5.6
	  100	  20	 yes	all	all	  9.0
	  512	   5	 yes	all	all	  4.7
	  512	  10	 yes	all	all	 10.0
	  512	  15	 yes	all	all	 17.6
	  512	  20	 yes	all	all	 27.8
	 1000	   5	 yes	all	all	  8.0
	 1000	  10	 yes	all	all	 18.0
	 1000	  15	 yes	all	all	 31.8
	 1000	  20	 yes	all	all	 50.2
	 1000	  25	 yes	all	all	 72.8
	 1000	  30	 yes	all	all	100.2
	  512	   5	  no	all  single	  2.8
	  512	  10	  no	all  single	  5.3
	  512	  15	  no	all  single	  8.6
	  512	  20	  no	all  single	 12.8
.fi

Crudely this implies CPU time goes as the 1.4 power of the number of profiles
and the 0.75 power of the number of pixels.
.ih
SEE ALSO
splot, stsdas.fitting.ngaussfit
.endhelp