Previous Up Next

5 CFITSIO File Names and Filters

5.1 Creating New Files

When creating a new output file on magnetic disk with fits_create_file the following features are supported.

5.2 Opening Existing Files

When opening a file with fits_open_file, CFITSIO can read a variety of different input file formats and is not restricted to only reading FITS format files from magnetic disk. The following types of input files are all supported:

5.3 Image Filtering

5.3.1 Extracting a subsection of an image

When specifying the name of an image to be opened, you can select a rectangular subsection of the image to be extracted and opened by the application program. The application program then opens a virtual image that only contains the pixels within the specified subsection. To do this, specify the the range of pixels (start:end) along each axis to be extracted from the original image enclosed in square brackets. You can also specify an optional pixel increment (start:end:step) for each axis of the input image. A pixel step = 1 will be assumed if it is not specified. If the starting pixel is larger then the end pixel, then the image will be flipped (producing a mirror image) along that dimension. An asterisk, ’*’, may be used to specify the entire range of an axis, and ’-*’ will flip the entire axis. In the following examples, assume that myfile.fits contains a 512 x 512 pixel 2D image.

  myfile.fits[201:210, 251:260] - opens a 10 x 10 pixel subimage.

  myfile.fits[*, 512:257] - opens a 512 x 256 image consisting of
       all the columns in the input image, but only rows 257
       through 512.  The image will be flipped along the Y axis
       since the starting row is greater than the ending
       row.

  myfile.fits[*:2, 512:257:2] - creates a 256 x 128 pixel image.
       Similar to the previous example, but only every other row
       and column is read from the input image.

  myfile.fits[-*, *] - creates an image containing all the rows and
       columns in the input image, but flips it along the X
       axis.

If the array to be opened is in an Image extension, and not in the primary array of the file, then you need to specify the extension name or number in square brackets before giving the subsection range, as in myfile.fits[1][-*, *] to read the image in the first extension in the file.

5.3.2 Create an Image by Binning Table Columns

You can also create and open a virtual image by binning the values in a pair of columns of a FITS table (in other words, create a 2-D histogram of the values in the 2 columns). This technique is often used in X-ray astronomy where each detected X-ray photon during an observation is recorded in a FITS table. There are typically 2 columns in the table called X and Y which record the pixel location of that event in a virtual 2D image. To create an image from this table, one just scans the X and Y columns and counts up how many photons were recorded in each pixel of the image. When table binning is specified, CFITSIO creates a temporary FITS primary array in memory by computing the histogram of the values in the specified columns. After the histogram is computed the original FITS file containing the table is closed and the temporary FITS primary array is opened and passed to the application program. Thus, the application program never sees the original FITS table and only sees the image in the new temporary file (which has no extensions).

The table binning specifier is enclosed in square brackets following the root filename and table extension name or number and begins with the keyword ’bin’, as in:
’myfile.fits[events][bin (X,Y)]’. In this case, the X and Y columns in the ’events’ table extension are binned up to create the image. The size of the image is usually determined by the TLMINn and TLMAXn header keywords which give the minimum and maximum allowed pixel values in the columns. For instance if TLMINn = 1 and TLMAXn = 4096 for both columns, this would generate a 4096 x 4096 pixel image by default. This is rather large, so you can also specify a pixel binning factor to reduce the image size. For example specifying , ’[bin (X,Y) = 16]’ will use a binning factor of 16, which will produce a 256 x 256 pixel image in the previous example.

If the TLMIN and TLMAX keywords don’t exist, or you want to override their values, you can specify the image range and binning factor directly, as in ’[bin X = 1:4096:16, Y=1:4096:16]’. You can also specify the datatype of the created image by appending a b, i, j, r, or d (for 8-bit byte, 16-bit integers, 32-bit integer, 32-bit floating points, or 64-bit double precision floating point, respectively) to the ’bin’ keyword (e.g. ’[binr (X,Y)]’ creates a floating point image). If the datatype is not specified then a 32-bit integer image will be created by default.

If the column name is not specified, then CFITSIO will first try to use the ’preferred column’ as specified by the CPREF keyword if it exists (e.g., ’CPREF = ’DETX,DETY’), otherwise column names ’X’, ’Y’ will be assumed for the 2 axes.

Note that this binning specifier is not restricted to only 2D images and can be used to create 1D, 3D, or 4D images as well. It is also possible to specify a weighting factor that is applied during the binning. Please refer to the “CFITSIO User’s Reference Guide” for more details on these advanced features.

5.4 Table Filtering

5.4.1 Column and Keyword Filtering

The column or keyword filtering specifier is used to modify the column structure and/or the header keywords in the HDU that was selected with the previous HDU location specifier. It can be used to perform the following types of operations.

The column filtering specifier is enclosed in square brackets and begins with the string ’col’. Multiple operations can be performed by separating them with semi-colons. For complex or commonly used operations, you can write the column filter to a text file, and then use it by giving the name of the text file, preceded by a ’@’ character.

Some examples:

  [col PI=PHA * 1.1 + 0.2]      - creates new PI column from PHA values

  [col rate = counts/exposure]  - creates or overwrites the rate column by
                                  dividing the counts column by the
                                  EXPOSURE keyword value.

  [col TIME; X; Y]              - only the listed columns will appear
                                  in the filtered file

  [col Time;*raw]               - include the Time column and any other
                                  columns whose name ends with 'raw'.

  [col -TIME; Good == STATUS]   - deletes the TIME column and
                                  renames the STATUS column to GOOD

  [col @colfilt.txt]            - uses the filtering expression in
                                  the colfilt.txt text file

The original file is not changed by this filtering operation, and instead the modifications are made on a temporary copy of the input FITS file (usually in memory), which includes a copy of all the other HDUs in the input file. The original input file is closed and the application program opens the filtered copy of the file.

5.4.2 Row Filtering

The row filter is used to select a subset of the rows from a table based on a boolean expression. A temporary new FITS file is created on the fly (usually in memory) which contains only those rows for which the row filter expression evaluates to true (i.e., not equal to zero). The primary array and any other extensions in the input file are also copied to the temporary file. The original FITS file is closed and the new temporary file is then opened by the application program.

The row filter expression is enclosed in square brackets following the file name and extension name. For example, ’file.fits[events][GRADE==50]’ selects only those rows in the EVENTS table where the GRADE column value is equal to 50).

The row filtering expression can be an arbitrarily complex series of operations performed on constants, keyword values, and column data taken from the specified FITS TABLE extension. The expression also can be written into a text file and then used by giving the filename preceded by a ’@’ character, as in ’[@rowfilt.txt]’.

Keyword and column data are referenced by name. Any string of characters not surrounded by quotes (ie, a constant string) or followed by an open parentheses (ie, a function name) will be initially interpreted as a column name and its contents for the current row inserted into the expression. If no such column exists, a keyword of that name will be searched for and its value used, if found. To force the name to be interpreted as a keyword (in case there is both a column and keyword with the same name), precede the keyword name with a single pound sign, ’#’, as in #NAXIS2. Due to the generalities of FITS column and keyword names, if the column or keyword name contains a space or a character which might appear as an arithmetic term then inclose the name in ’$’ characters as in $MAX PHA$ or #$MAX-PHA$. The names are case insensitive.

To access a table entry in a row other than the current one, follow the column’s name with a row offset within curly braces. For example, ’PHA{-3}’ will evaluate to the value of column PHA, 3 rows above the row currently being processed. One cannot specify an absolute row number, only a relative offset. Rows that fall outside the table will be treated as undefined, or NULLs.

Boolean operators can be used in the expression in either their Fortran or C forms. The following boolean operators are available:

    "equal"         .eq. .EQ. ==  "not equal"          .ne.  .NE.  !=
    "less than"     .lt. .LT. <   "less than/equal"    .le.  .LE.  <= =<
    "greater than"  .gt. .GT. >   "greater than/equal" .ge.  .GE.  >= =>
    "or"            .or. .OR. ||  "and"                .and. .AND. &&
    "negation"     .not. .NOT. !  "approx. equal(1e-7)"  ~

Note that the exclamation point, ’!’, is a special UNIX character, so if it is used on the command line rather than entered at a task prompt, it must be preceded by a backslash to force the UNIX shell to ignore it.

The expression may also include arithmetic operators and functions. Trigonometric functions use radians, not degrees. The following arithmetic operators and functions can be used in the expression (function names are case insensitive):

    "addition"           +          "subtraction"          -
    "multiplication"     *          "division"             /
    "negation"           -          "exponentiation"       **   ^
    "absolute value"     abs(x)     "cosine"               cos(x)
    "sine"               sin(x)     "tangent"              tan(x)
    "arc cosine"         arccos(x)  "arc sine"             arcsin(x)
    "arc tangent"        arctan(x)  "arc tangent"          arctan2(x,y)
    "exponential"        exp(x)     "square root"          sqrt(x)
    "natural log"        log(x)     "common log"           log10(x)
    "modulus"            i % j      "random # [0.0,1.0)"   random()
    "minimum"            min(x,y)   "maximum"              max(x,y)
    "if-then-else"       b?x:y

The following type casting operators are available, where the inclosing parentheses are required and taken from the C language usage. Also, the integer to real casts values to double precision:

                "real to integer"    (int) x     (INT) x
                "integer to real"    (float) i   (FLOAT) i

Several constants are built in for use in numerical expressions:

        #pi              3.1415...      #e             2.7182...
        #deg             #pi/180        #row           current row number
        #null         undefined value   #snull         undefined string

A string constant must be enclosed in quotes as in ’Crab’. The "null" constants are useful for conditionally setting table values to a NULL, or undefined, value (For example, "col1==-99 ? #NULL : col1").

There is also a function for testing if two values are close to each other, i.e., if they are "near" each other to within a user specified tolerance. The arguments, value_1 and value_2 can be integer or real and represent the two values who’s proximity is being tested to be within the specified tolerance, also an integer or real:

                    near(value_1, value_2, tolerance)

When a NULL, or undefined, value is encountered in the FITS table, the expression will evaluate to NULL unless the undefined value is not actually required for evaluation, e.g. "TRUE .or. NULL" evaluates to TRUE. The following two functions allow some NULL detection and handling:

                   ISNULL(x)
                   DEFNULL(x,y)

The former returns a boolean value of TRUE if the argument x is NULL. The later "defines" a value to be substituted for NULL values; it returns the value of x if x is not NULL, otherwise it returns the value of y.

Bit masks can be used to select out rows from bit columns (TFORMn = #X) in FITS files. To represent the mask, binary, octal, and hex formats are allowed:

                 binary:   b0110xx1010000101xxxx0001
                 octal:    o720x1 -> (b111010000xxx001)
                 hex:      h0FxD  -> (b00001111xxxx1101)

In all the representations, an x or X is allowed in the mask as a wild card. Note that the x represents a different number of wild card bits in each representation. All representations are case insensitive.

To construct the boolean expression using the mask as the boolean equal operator described above on a bit table column. For example, if you had a 7 bit column named flags in a FITS table and wanted all rows having the bit pattern 0010011, the selection expression would be:

                            flags == b0010011
    or
                            flags .eq. b10011

It is also possible to test if a range of bits is less than, less than equal, greater than and greater than equal to a particular boolean value:

                            flags <= bxxx010xx
                            flags .gt. bxxx100xx
                            flags .le. b1xxxxxxx

Notice the use of the x bit value to limit the range of bits being compared.

It is not necessary to specify the leading (most significant) zero (0) bits in the mask, as shown in the second expression above.

Bit wise AND, OR and NOT operations are also possible on two or more bit fields using the ’&’(AND), ’|’(OR), and the ’!’(NOT) operators. All of these operators result in a bit field which can then be used with the equal operator. For example:

                          (!flags) == b1101100
                          (flags & b1000001) == bx000001

Bit fields can be appended as well using the ’+’ operator. Strings can be concatenated this way, too.

5.4.3 Good Time Interval Filtering

A common filtering method involves selecting rows which have a time value which lies within what is called a Good Time Interval or GTI. The time intervals are defined in a separate FITS table extension which contains 2 columns giving the start and stop time of each good interval. The filtering operation accepts only those rows of the input table which have an associated time which falls within one of the time intervals defined in the GTI extension. A high level function, gtifilter(a,b,c,d), is available which evaluates each row of the input table and returns TRUE or FALSE depending whether the row is inside or outside the good time interval. The syntax is

      gtifilter( [ "gtifile" [, expr [, "STARTCOL", "STOPCOL" ] ] ] )

where each "[]" demarks optional parameters. Note that the quotes around the gtifile and START/STOP column are required. Either single or double quote characters may be used. The gtifile, if specified, can be blank ("") which will mean to use the first extension with the name "*GTI*" in the current file, a plain extension specifier (eg, "+2", "[2]", or "[STDGTI]") which will be used to select an extension in the current file, or a regular filename with or without an extension specifier which in the latter case will mean to use the first extension with an extension name "*GTI*". Expr can be any arithmetic expression, including simply the time column name. A vector time expression will produce a vector boolean result. STARTCOL and STOPCOL are the names of the START/STOP columns in the GTI extension. If one of them is specified, they both must be.

In its simplest form, no parameters need to be provided – default values will be used. The expression "gtifilter()" is equivalent to

       gtifilter( "", TIME, "*START*", "*STOP*" )

This will search the current file for a GTI extension, filter the TIME column in the current table, using START/STOP times taken from columns in the GTI extension with names containing the strings "START" and "STOP". The wildcards (’*’) allow slight variations in naming conventions such as "TSTART" or "STARTTIME". The same default values apply for unspecified parameters when the first one or two parameters are specified. The function automatically searches for TIMEZERO/I/F keywords in the current and GTI extensions, applying a relative time offset, if necessary.

5.4.4 Spatial Region Filtering

Another common filtering method selects rows based on whether the spatial position associated with each row is located within a given 2-dimensional region. The syntax for this high-level filter is

       regfilter( "regfilename" [ , Xexpr, Yexpr [ , "wcs cols" ] ] )

where each "[ ]" demarks optional parameters. The region file name is required and must be enclosed in quotes. The remaining parameters are optional. The region file is an ASCII text file which contains a list of one or more geometric shapes (circle, ellipse, box, etc.) which defines a region on the celestial sphere or an area within a particular 2D image. The region file is typically generated using an image display program such as fv/POW (distribute by the HEASARC), or ds9 (distributed by the Smithsonian Astrophysical Observatory). Users should refer to the documentation provided with these programs for more details on the syntax used in the region files.

In its simpliest form, (e.g., regfilter("region.reg") ) the coordinates in the default ’X’ and ’Y’ columns will be used to determine if each row is inside or outside the area specified in the region file. Alternate position column names, or expressions, may be entered if needed, as in

        regfilter("region.reg", XPOS, YPOS)

Region filtering can be applied most unambiguously if the positions in the region file and in the table to be filtered are both give in terms of absolute celestial coordinate units. In this case the locations and sizes of the geometric shapes in the region file are specified in angular units on the sky (e.g., positions given in R.A. and Dec. and sizes in arcseconds or arcminutes). Similarly, each row of the filtered table will have a celestial coordinate associated with it. This association is usually implemented using a set of so-called ’World Coordinate System’ (or WCS) FITS keywords that define the coordinate transformation that must be applied to the values in the ’X’ and ’Y’ columns to calculate the coordinate.

Alternatively, one can perform spatial filtering using unitless ’pixel’ coordinates for the regions and row positions. In this case the user must be careful to ensure that the positions in the 2 files are self-consistent. A typical problem is that the region file may be generated using a binned image, but the unbinned coordinates are given in the event table. The ROSAT events files, for example, have X and Y pixel coordinates that range from 1 - 15360. These coordinates are typically binned by a factor of 32 to produce a 480x480 pixel image. If one then uses a region file generated from this image (in image pixel units) to filter the ROSAT events file, then the X and Y column values must be converted to corresponding pixel units as in:

        regfilter("rosat.reg", X/32.+.5, Y/32.+.5)

Note that this binning conversion is not necessary if the region file is specified using celestial coordinate units instead of pixel units because CFITSIO is then able to directly compare the celestial coordinate of each row in the table with the celestial coordinates in the region file without having to know anything about how the image may have been binned.

The last "wcs cols" parameter should rarely be needed. If supplied, this string contains the names of the 2 columns (space or comma separated) which have the associated WCS keywords. If not supplied, the filter will scan the X and Y expressions for column names. If only one is found in each expression, those columns will be used, otherwise an error will be returned.

These region shapes are supported (names are case insensitive):

       Point         ( X1, Y1 )               <- One pixel square region
       Line          ( X1, Y1, X2, Y2 )       <- One pixel wide region
       Polygon       ( X1, Y1, X2, Y2, ... )  <- Rest are interiors with
       Rectangle     ( X1, Y1, X2, Y2, A )       | boundaries considered
       Box           ( Xc, Yc, Wdth, Hght, A )   V within the region
       Diamond       ( Xc, Yc, Wdth, Hght, A )
       Circle        ( Xc, Yc, R )
       Annulus       ( Xc, Yc, Rin, Rout )
       Ellipse       ( Xc, Yc, Rx, Ry, A )
       Elliptannulus ( Xc, Yc, Rinx, Riny, Routx, Routy, Ain, Aout )
       Sector        ( Xc, Yc, Amin, Amax )

where (Xc,Yc) is the coordinate of the shape’s center; (X#,Y#) are the coordinates of the shape’s edges; Rxxx are the shapes’ various Radii or semimajor/minor axes; and Axxx are the angles of rotation (or bounding angles for Sector) in degrees. For rotated shapes, the rotation angle can be left off, indicating no rotation. Common alternate names for the regions can also be used: rotbox = box; rotrectangle = rectangle; (rot)rhombus = (rot)diamond; and pie = sector. When a shape’s name is preceded by a minus sign, ’-’, the defined region is instead the area *outside* its boundary (ie, the region is inverted). All the shapes within a single region file are OR’d together to create the region, and the order is significant. The overall way of looking at region files is that if the first region is an excluded region then a dummy included region of the whole detector is inserted in the front. Then each region specification as it is processed overrides any selections inside of that region specified by previous regions. Another way of thinking about this is that if a previous excluded region is completely inside of a subsequent included region the excluded region is ignored.

The positional coordinates may be given either in pixel units, decimal degrees or hh:mm:ss.s, dd:mm:ss.s units. The shape sizes may be given in pixels, degrees, arcminutes, or arcseconds. Look at examples of region file produced by fv/POW or ds9 for further details of the region file format.

5.4.5 Example Row Filters

    [double && mag <= 5.0]        -  Extract all double stars brighter
                                     than  fifth magnitude 

    [#row >= 125 && #row <= 175]   - Extract row numbers 125 through 175

    [abs(sin(theta * #deg)) < 0.5] - Extract all rows having the
                                     absolute value of the sine of theta
                                     less  than a half where the angles
                                     are tabulated in degrees

    [@rowFilter.txt]               - Extract rows using the expression
                                     contained within the text file
                                     rowFilter.txt

    [gtifilter()]                  - Search the current file for a GTI
         extension,  filter  the TIME
         column in the current table, using
         START/STOP times taken from
         columns in the GTI  extension

    [regfilter("pow.reg")]         - Extract rows which have a coordinate
                                     (as given in the X and Y columns)
                                     within the spatial region specified
                                     in the pow.reg region file.

5.5 Combined Filtering Examples

The previous sections described all the individual types of filters that may be applied to the input file. In this section we show examples which combine several different filters at once. These examples all use the fitscopy program that is distributed with the CFITSIO code. It simply copies the input file to the output file.

fitscopy rosat.fit out.fit

This trivial example simply makes an identical copy of the input rosat.fit file without any filtering.

fitscopy 'rosat.fit[events][col Time;X;Y][#row < 1000]' out.fit

The output file contains only the Time, X, and Y columns, and only the first 999 rows from the ’EVENTS’ table extension of the input file. All the other HDUs in the input file are copied to the output file without any modification.

fitscopy 'rosat.fit[events][PI < 50][bin (Xdet,Ydet) = 16]' image.fit

This creates an output image by binning the Xdet and Ydet columns of the events table with a pixel binning factor of 16. Only the rows which have a PI energy less than 50 are used to construct this image. The output image file contains a primary array image without any extensions.

fitscopy 'rosat.fit[events][gtifilter() && regfilter("pow.reg")]' out.fit

The filtering expression in this example uses the gtifilter function to test whether the TIME column value in each row is within one of the Good Time Intervals defined in the GTI extension in the same input file, and also uses the regfilter function to test if the position associated with each row (derived by default from the values in the X and Y columns of the events table) is located within the area defined in the pow.reg text region file (which was previously created with the fv/POW image display program). Only the rows which satisfy both tests are copied to the output table.

fitscopy 'r.fit[evt][PI<50]' stdout | fitscopy stdin[evt][col X,Y] out.fit

In this somewhat convoluted example, fitscopy is used to first select the rows from the evt extension which have PI less than 50 and write the resulting table out to the stdout stream. This is piped to a 2nd instance of fitscopy (with the Unix ‘|’ pipe command) which reads that filtered FITS file from the stdin stream and copies only the X and Y columns from the evt table to the output file.

fitscopy 'r.fit[evt][col RAD=sqrt((X-#XCEN)**2+(Y-#YCEN)**2)][rad<100]' out.fit

This example first creates a new column called RAD which gives the distance between the X,Y coordinate of each event and the coordinate defined by the XCEN and YCEN keywords in the header. Then, only those rows which have a distance less than 100 are copied to the output table. In other words, only the events which are located within 100 pixel units from the (XCEN, YCEN) coordinate are copied to the output table.

fitscopy 'ftp://heasarc.gsfc.nasa.gov/rosat.fit[events][bin (X,Y)=16]' img.fit

This example bins the X and Y columns of the hypothetical ROSAT file at the HEASARC ftp site to create the output image.

fitscopy 'raw.fit[i512,512][101:110,51:60]' image.fit

This example converts the 512 x 512 pixel raw binary 16-bit integer image to a FITS file and copies a 10 x 10 pixel subimage from it to the output FITS image.


Previous Up Next