casacore: Fitting module
Module for various forms of mathematical fitting.
See below for an overview of the classes in this module.
The Fitting module holds various classes and functions related to fitting models to data. Currently only least-squares fits are handled.
We are given N data points, which we want to fit to a function with M adjustable parameters. N should normally be greater than M, and at least M independent relations between the parameters should be given. In cases where there are fewer than M independent points, Singular Value Decomposition (SVD) methods are available. Each condition equation can be given an (estimated) standard deviation; equivalently, a statistical weight (often the inverse variance) can be used in place of the standard deviation.
The best fit is assumed to be the one which minimises the 'chi-squared'.
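For N data points y_i with estimated standard deviations sigma_i and model values f(x_i; p_1,...,p_M), chi-squared takes its usual least-squares form:

\[
\chi^2 = \sum_{i=1}^{N} \left( \frac{y_i - f(x_i;\, p_1,\dots,p_M)}{\sigma_i} \right)^{2}
\]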
In the (rather common) case that the errors of the individual data points are not known, one can set them all to unity, calculate the best-fit function, and then estimate the (assumed identical) errors from the scatter about the fit by inverting the normal equations. Of course, in this case we no longer have an independent estimate of chi-squared.
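One standard way to make this concrete: with all sigma_i set to unity, the post-fit scatter yields a common error estimate, and the parameter covariance follows from the inverted normal equations,

\[
\hat{\sigma}^2 = \frac{\chi^2}{N - M}, \qquad
\operatorname{cov}(p) \approx \hat{\sigma}^2 \left( A^{\mathsf T} A \right)^{-1},
\]

where A is the design matrix of the condition equations and A^T A the normal-equation matrix.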
The methods used in the Fitting module are described in Note 224. The methods (both standard and SVD) are based on a Cholesky decomposition of the normal equations.
General background can also be found in Numerical Recipes by Press et al.
The linear least squares solution assumes that the fit function is a linear combination of M condition equations. It is important to note that "linear" refers to the dependence on the parameters; the condition equations may be highly non-linear in the dependent arguments (e.g., a polynomial fit is linear in its coefficients even though the basis terms x, x^2, ... are non-linear in x).
The linear least squares problem is solved by explicitly forming and inverting the normal equations. If the normal equations are close to singular, the singular value decomposition (SVD) method may be used. Numerical Recipes suggests that SVD always be used; however, this advice is not universally accepted.
Sometimes there are not enough independent observations, i.e., the number of data points N is less than the number of adjustable parameters M. In this case the least-squares problem cannot be solved unless additional 'constraints' on the adjustable parameters can be introduced. Under other circumstances, we may want to introduce constraints on the adjustable parameters to add extra information, e.g., requiring that the angles of a triangle sum to 180 degrees. In more complex cases, the forms of the constraints are unknown. Here we confine ourselves to least-squares fit problems in which the forms of the constraints are known.
If the forms of the constraint equations are known, the least-squares problem can be solved. (In the case where not enough independent observations are available, a sufficient number of constraint equations has to be provided. The singular value decomposition method can be used to determine the minimum number of orthogonal constraints needed.)
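For orientation (the module's own derivation is given in Note 224), the standard Lagrange-multiplier formulation carries a set of linear constraints B p = c alongside the normal equations as a bordered system:

\[
\begin{pmatrix} A^{\mathsf T} A & B^{\mathsf T} \\ B & 0 \end{pmatrix}
\begin{pmatrix} p \\ \lambda \end{pmatrix}
=
\begin{pmatrix} A^{\mathsf T} y \\ c \end{pmatrix}
\]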
We now consider the situation where the fitted function depends non-linearly on the set of M adjustable parameters. With non-linear dependences the minimisation of chi-squared cannot proceed as in the linear case. However, we can linearise the problem, find an approximate solution, and then iteratively seek the minimising solution. The iteration stops when, e.g., the adjusted parameters no longer change significantly. In general it is very difficult to find a method that is guaranteed to locate the global minimum, so the solution method has to be matched to the problem. The Levenberg-Marquardt algorithm is a general non-linear fitting method which produces correct results in many cases. It has been included, but always be aware of possible problems with non-linear solutions.
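The linearisation underlying each iteration is the usual first-order expansion about the current estimate p^(k), which reduces one non-linear step to a linear least-squares problem for the update Delta p:

\[
f(x;\, p^{(k)} + \Delta p) \approx f(x;\, p^{(k)}) + \sum_{j=1}^{M} \left. \frac{\partial f}{\partial p_j} \right|_{p^{(k)}} \Delta p_j,
\qquad p^{(k+1)} = p^{(k)} + \Delta p
\]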
The basic classes are LSQFit and LSQaips. They provide the basic framework for normal equation generation, solving the normal equations and iterating in the case of non-linear equations.
The LSQFit class uses a native C++ interface (pointers and iterators) and handles both real and complex data. The LSQaips class offers the same functionality as LSQFit, but with an additional Casacore Array interface.
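As a minimal sketch of the normal-equation workflow (assuming the makeNorm()/invert()/solve() members declared in LSQFit.h; check the class documentation for the exact overloads), a straight-line fit y = a + b*x could be built up as follows:

#include <casacore/scimath/Fitting/LSQFit.h>

using namespace casacore;

int main() {
  // Two unknowns: intercept a and slope b of y = a + b*x
  LSQFit fit(2);

  // Example data lying on y = 1 + 2*x
  const double x[] = {0.0, 1.0, 2.0, 3.0};
  const double y[] = {1.0, 3.0, 5.0, 7.0};

  // Each data point contributes one condition equation (1, x_i) with
  // unit weight and observed value y_i to the normal equations.
  for (int i = 0; i < 4; ++i) {
    double ce[2] = {1.0, x[i]};
    fit.makeNorm(ce, 1.0, y[i]);
  }

  // Invert the normal equations; a second argument of True would
  // request SVD for (near-)singular equations.
  uInt rank;
  if (!fit.invert(rank)) return 1;

  double sol[2];
  fit.solve(sol);              // sol[0] ~ 1.0, sol[1] ~ 2.0
  return 0;
}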
Functionality is:
- Fit a linear combination of functions to data points, and, optionally, use supplied constraint conditions on the adjustable parameters.
- Fit a non-linear function to data points. The adjustable parameters are parameters inside the function.
- Repetitively perform a linear fit for every line of pixels parallel to any axis in a Lattice.
- Solve (rather than fit to a set of data) a set of equations.

In addition to the basic least squares routines in the LSQFit and LSQaips classes, this module also contains a set of direct data fitters (such as LinearFit, LinearFitSVD, NonLinearFit and NonLinearFitLM).
Furthermore, the class LatticeFit can do fitting on lattices. Note that the basic classes have LSQ in their name, while the one-step fitting classes have Fit.
The above fitting problems can usually be solved by directly calling the fit() member function provided by one of the Fit classes above, or by gradually building the normal equation matrix and solving the normal equations (solve()).
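As an illustration of the one-step interface (a sketch assuming the documented LinearFitSVD/Polynomial/AutoDiff combination; consult the LinearFit class documentation for the exact fit() overloads), a first-degree polynomial fit might look like:

#include <casacore/casa/Arrays/Vector.h>
#include <casacore/scimath/Fitting/LinearFitSVD.h>
#include <casacore/scimath/Functionals/Polynomial.h>
#include <casacore/scimath/Mathematics/AutoDiff.h>

using namespace casacore;

int main() {
  // Sample data on the line y = 2 + 3*x
  Vector<Double> x(10), y(10);
  for (uInt i = 0; i < 10; ++i) {
    x(i) = i;
    y(i) = 2.0 + 3.0 * i;
  }

  // First-degree polynomial whose coefficients are the adjustable
  // parameters; AutoDiff supplies the partial derivatives.
  Polynomial<AutoDiff<Double> > poly(1);

  LinearFitSVD<Double> fitter;
  fitter.setFunction(poly);

  // One-step fit: returns the solved coefficients (~2 and ~3 here).
  Vector<Double> solution = fitter.fit(x, y);
  return 0;
}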
A Distributed Object interface to the classes is available (DOfitting) for use in the Glish dfit object, available through the fitting.g script.
This module was motivated by baseline subtraction/continuum fitting in the first instance.