.help tmatch Jan1999 tables
.ih
NAME
tmatch -- Find closest match between rows in two tables
.ih
USAGE
tmatch (input1, input2, output, match1, match2, maxnorm)
.ih
DESCRIPTION
This task combines rows from two tables into one. Rows are combined by
looking at each row in the first table and finding the row in the
second table whose match columns are closest to those in the first.
The distance between two rows is defined in the usual way to be the
square root of the sum of the squares of the differences between the
corresponding match columns. Rows are only written to the output table
if the distance between them is less than the distance specified by
maxnorm. This task converts match column units to degrees for the
purpose of computing the distance if the column units are recognized
angular units (seconds, minutes, degrees, hours, or radians, or any
abbreviation of these). If the column units are blank or some
unrecognized string, no conversion is done. Thus, if the match column
is in some recognized angular units, maxnorm should be specified in
degrees.
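
The distance rule above can be sketched in Python. This is only an
illustration, not the task's actual code. The names are hypothetical,
"seconds" and "minutes" are assumed to mean arcseconds and arcminutes,
and an abbreviation is assumed to be a case-insensitive leading
substring of the unit name.

```python
import math

# Assumed factors converting each recognized unit to degrees.
# "seconds" and "minutes" are taken to be arcseconds and arcminutes.
TO_DEGREES = {
    "seconds": 1.0 / 3600.0,
    "minutes": 1.0 / 60.0,
    "degrees": 1.0,
    "hours": 15.0,
    "radians": 180.0 / math.pi,
}


def unit_factor(units):
    """Return the degrees-per-unit factor, or 1.0 if units are not recognized."""
    key = units.strip().lower()
    if key:
        for name, factor in TO_DEGREES.items():
            if name.startswith(key):
                return factor
    return 1.0  # blank or unrecognized units: no conversion


def row_distance(row1, row2, units):
    """Square root of the sum of squared, unit-converted column differences."""
    total = 0.0
    for a, b, u in zip(row1, row2, units):
        total += (unit_factor(u) * (a - b)) ** 2
    return math.sqrt(total)


def closest_match(row1, table2, units, maxnorm):
    """Return (index, distance) of the closest row in table2, or None."""
    best = None
    for i, row2 in enumerate(table2):
        d = row_distance(row1, row2, units)
        # A row matches only if its distance is less than maxnorm.
        if d < maxnorm and (best is None or d < best[1]):
            best = (i, d)
    return best
```

Because recognized units are converted before the sum is formed, the
value compared with maxnorm is in degrees whenever the match columns
carry angular units.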

Columns named by the parameters 'incol1' and 'incol2' will be copied to
the output table. If these parameters are left blank, all columns will
be copied to the output table. Columns will have the same name in the
output table as in the input table, except that columns with the same
name in both input tables will have the suffixes "_1" and "_2" added to
indicate which table they were copied from.

This task optionally allows you to supply a set of weighting factors
that are multiplied by the difference between corresponding match
columns when computing the distance. These factors can be used to
supply units conversion when column units information is missing from
the input table or as a way to weight information from columns
containing distinct information. If factors are supplied, any column
units information is ignored. If the factors are left blank, they are
ignored and column units information is used to convert to degrees
when possible.

If the match columns contain spherical coordinates, parameter 'sphere'
should be set to yes so that the distance on a sphere can be computed.
If the match columns do contain spherical coordinates, the first column
should be the longitude column (such as right ascension) and the
second column should be the latitude column (such as declination).
Columns should also be in some recognized angular units, or a
conversion factor should be supplied explicitly.
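
As a rough Python sketch of the spherical case, assuming both
coordinates have already been converted to degrees: the longitude
difference is scaled by the cosine of the mean latitude, as described
under parameter 'sphere'. The function name is hypothetical, and the
sketch makes no attempt to handle longitude wraparound at 0/360
degrees.

```python
import math


def sphere_distance(lon1, lat1, lon2, lat2):
    """Approximate separation in degrees; all inputs are in degrees."""
    # Cosine of the average of the two latitude (second column) values.
    correction = math.cos(math.radians(0.5 * (lat1 + lat2)))
    dlon = (lon1 - lon2) * correction
    dlat = lat1 - lat2
    return math.hypot(dlon, dlat)
```

With this correction, a one degree difference in longitude counts as
about half a degree of distance at a latitude of 60 degrees.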

The task optionally produces a diagnostic output file if a file name
is supplied to parameter 'diagfile'. The diagnostic file contains the
rows from the first table that were not matched, the cases where
more than one row in the first table matched a single row in the
second table, and the ten pairs of matched rows with the largest
distance between them. Rows in the diagnostic output are identified by
their table number and row number and optionally by the contents of
the columns specified by the parameters 'nmcol1' and 'nmcol2'.
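
The second kind of diagnostic, several rows in the first table matching
one row in the second, amounts to grouping the matched pairs by their
second-table row. A minimal Python sketch follows, using a hypothetical
function name and a plain list of (row1, row2) row-number pairs as
input. It is not the task's actual implementation.

```python
from collections import defaultdict


def shared_matches(matches):
    """Map each table-2 row matched more than once to its table-1 rows."""
    by_row2 = defaultdict(list)
    for row1, row2 in matches:
        by_row2[row2].append(row1)
    # Keep only the table-2 rows claimed by more than one table-1 row.
    return {row2: rows for row2, rows in by_row2.items() if len(rows) > 1}
```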

This task differs from tjoin in two respects. First, tjoin combines
tables on the basis of a single column, while this task can combine
tables based on multiple columns. Second, tjoin places every
combination of rows matching within the specified tolerance in the output
table, while this task only puts the closest match in the output table.
.ih
PARAMETERS
.ls input1 [string]
First input table name.
.le
.ls input2 [string]
Second input table name.
.le
.ls output [string]
Output table name.
.le
.ls match1 [string]
A column name template describing columns from the first table used to
match the two tables. A column name template is a comma- or
whitespace-separated list of strings. Each string may be either a
column name or a pattern containing wildcard characters that matches
several column names. This
parameter will also accept the name of a list file (preceded by the
"@" character) containing column names to be matched.
If the first non-white character in the template
is the negation character (either "~" or "!"),
all columns NOT appearing in the list will be matched.
.le
.ls match2 [string]
A column name template describing columns from the second table used
to match the two tables. This parameter follows the same format rules
as 'match1'. The number of columns must equal the number in 'match1'.
.le
.ls maxnorm min=0.0, max=INDEF [real]
The distance between two rows must be less than 'maxnorm' in order for
them to match. Recognized angular units are converted to degrees
before computing the distance. The recognized units are seconds,
minutes, degrees, hours, radians, or any abbreviation of these.
.le
.ls (incol1 = " ") [string]
A column name template describing the columns to be copied from the
first input table to the output table. If this parameter is left blank
(the default) all columns in the first input table will be copied to
the output.
.le
.ls (incol2 = " ") [string]
A column name template describing the columns to be copied from the
second input table to the output table. If this parameter is left
blank (the default) all columns in the second input table will be
copied to the output.
.le
.ls (factor = " ") [string]
A comma- or whitespace-separated list of numeric factors multiplied by
the individual column differences when computing the distance between
rows in the first and second tables. If this parameter is left blank
(the default) conversion of angular units to degrees will be
performed, but no other weighting will be performed. If a list of
values is supplied, units conversion will NOT be performed; the
supplied numeric factors will be used instead.
.le
.ls (diagfile = " ") [string]
The name of the diagnostic output file. If the name is left blank (the
default) no diagnostic output is produced. Diagnostic output can be
sent to the terminal by setting this parameter to STDOUT or STDERR.
The diagnostic output contains a list of rows that were not matched,
cases where more than one row in the first table matched a single row
in the second table, and the ten pairs of rows with the largest
distance between them.
.le
.ls (nmcol1 = " ") [string]
A column name template describing the columns from the first table that
are printed in the diagnostic output. The table and row number are
always printed; if this parameter is not blank, the specified columns
are also printed.
.le
.ls (nmcol2 = " ") [string]
A column name template describing the columns from the second table that are
printed in the diagnostic output.
.le
.ls (sphere = no) [bool]
If this parameter is set to yes, a correction appropriate for
spherical coordinates will be applied to the first column
difference. The correction is the cosine of the average of the two
second column values. In order for this correction to be valid, the
first column must contain the longitude component and the second
column the latitude component. Units should be convertible to degrees
or an explicit conversion factor should be supplied.
.le
.ih
EXAMPLES
1. Two star catalogs are being matched. They both have the following
columns:

.nf
Name             CH*12      %12s ""
RA               D        %10.1h hours
Dec              D        %10.0h degrees
V                R         %7.2f ""
B-V              R         %7.2f ""
U-B              R         %7.2f ""
.fi

To find the best match between the catalogs within a ten arcsecond
radius one would use the following command:

.nf
tt> tmatch catalog1.tab catalog2.tab match.tab \
>>> ra,dec ra,dec 0:00:10 sphere+
.fi

The search radius can be supplied either in sexagesimal notation, as
above, or in decimal degrees.
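
For reference, the decimal equivalent of a sexagesimal radius can be
worked out with a few lines of Python. The helper name is hypothetical
and the sketch only illustrates the arithmetic. It is not part of the
task.

```python
def sexagesimal_to_decimal(text):
    """Convert 'd:mm:ss' (or 'd:mm') to a decimal value."""
    text = text.strip()
    sign = -1.0 if text.startswith("-") else 1.0
    parts = [abs(float(p)) for p in text.split(":")]
    value = 0.0
    for i, part in enumerate(parts):
        # Each successive field is worth 1/60 of the previous one.
        value += part / 60.0 ** i
    return sign * value
```

Under this arithmetic, 0:00:10 is 10/3600 of a degree, or roughly
0.00278 degrees.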

2. Suppose the input catalogs did not contain units information, as
would be the case if they were text files. The units conversion could
then be supplied explicitly through the factor parameter:

.nf
tt> tmatch catalog1.tab catalog2.tab match.tab \
>>> ra,dec ra,dec 0:00:10 factor=15,1 sphere+
.fi

3. Suppose we want the output table to only contain the name from the
first catalog and get the rest of its information from the second
catalog. This could be done with the following command:


.nf
tt> tmatch catalog1.tab catalog2.tab match.tab \
>>> ra,dec ra,dec 0:00:10 incol1=name sphere+
.fi

4. To get diagnostic output from the task, use the following command:

.nf
tt> tmatch catalog1.tab catalog2.tab match.tab ra,dec ra,dec \
>>> diag=diag.txt nmcol1=name nmcol2=name 0:00:10 sphere+
.fi

The following is a subset of the diagnostic output produced:

.nf
The following objects matched the same object:
1:163 6601  GEM
1:164 6601  GEM
2:163 6601  GEM


The following objects have the largest norms:
Norm = 0.00253
1:371 2319  SCO
2:371 2319  SCO

Norm = 0.00247
1:368 2101  SCO
2:368 2101  SCO
.fi

The number before the colon is the table number, the number after the
colon is the row number, and the rest of the line is from the name
column.
.ih
REFERENCES
Written by Bernie Simon
.ih
SEE ALSO
tjoin
.endhelp