BusyFit User Manual

Overview

Usage

busyfit [options] filename

Note that there is no specific order for the options or the file name.

Available options

  • -c column
    • Column number of flux values in input file. Default is 2.
  • -h, -help
    • Print this detailed help message and exit.
  • -n rms
    • The rms noise level of the spectrum. Default is 0.01. It is important to provide an accurate estimate of the noise, as otherwise the calculated uncertainties and χ² values of the fit will be arbitrary and meaningless.
  • -noplot
    • Do not plot the fit results on the screen. Useful when used in batch mode or when gnuplot is unavailable.
  • -o order
    • Order of the polynomial component (2 or 4). Default is 2.
  • -p P1...Pn
    • Initial estimates of the free parameters, A, B1, B2, C, XE, XP, and W, separated by spaces. If not specified, initial parameters will automatically be determined. Appending ‘f’ to a value will fix that parameter, e.g. ‘2.5f’.
  • -relax
    • Do not check for negative or shifted polynomial component. If this option is set, the fit will not be repeated if the polynomial is found to be either inverted or significantly shifted relative to the error functions.
  • -u iter
    • Calculate uncertainties for observational parameters (flux, line width, etc.), using a Monte-Carlo method. This will generate [iter] realisations of the fitted Busy Function and can therefore be very slow. Default is to use a much faster error propagation method to compute uncertainties.
  • -v
    • Use verbose mode; show progress information during fitting.

Introduction

The BusyFit programme implements a Levenberg–Marquardt χ² minimisation algorithm to fit an asymmetric version of the Busy Function to the integrated H I spectrum of a galaxy. The normal Busy Function is of the form

B(x) = (A / 4) × [erf(B1 (W + x − XE)) + 1] × [erf(B2 (W − x + XE)) + 1] × [C (x − XP)² + 1]

with the free parameters A, B1, B2, C, XE, XP, and W. This form of the Busy Function is well suited to fitting a wide range of symmetric and asymmetric double-horn profiles commonly found in galaxies. In addition to the fit parameters, the programme will also measure commonly used observational parameters, including w50 and w20 line widths as well as peak flux and integrated flux of the galaxy.

Massive, edge-on spiral galaxies often produce an integrated spectrum with very sharp peaks and a fairly broad trough that is not well described by a polynomial of order 2. In such cases, BusyFit can fit a modified version of the Busy Function with a polynomial of order 4:

B(x) = (A / 4) × [erf(B1 (W + x − XE)) + 1] × [erf(B2 (W − x + XE)) + 1] × [C (x − XP)⁴ + 1]

The polynomial order can be selected with the ‘-o’ option, which can take values of 2 or 4. If not specified, a polynomial of order 2 is used by default.
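The two forms of the Busy Function can be written as a single routine with a selectable polynomial order. The following is a minimal Python sketch, not the BusyFit source code; it assumes the error function arguments are W + x − XE and W − x + XE and that the polynomial term is C (x − XP)ⁿ + 1. The function name busy_function and its argument names are chosen here for illustration only.

```python
import math


def busy_function(x, a, b1, b2, c, xe, xp, w, order=2):
    """Evaluate the Busy Function at channel x.

    order is the polynomial order (2 or 4), mirroring the -o option.
    """
    if order not in (2, 4):
        raise ValueError("polynomial order must be 2 or 4")
    rising = math.erf(b1 * (w + x - xe)) + 1.0   # left flank of the profile
    falling = math.erf(b2 * (w - x + xe)) + 1.0  # right flank of the profile
    trough = c * (x - xp) ** order + 1.0         # central polynomial component
    return (a / 4.0) * rising * falling * trough


# With C = 0 the profile has a flat top close to the amplitude A.
print(busy_function(20.0, a=1.0, b1=0.5, b2=0.5, c=0.0, xe=20.0, xp=20.0, w=5.0))
```

Setting C to zero removes the polynomial component entirely, which corresponds to the five-parameter form of the fit.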

If not directed otherwise, the programme will first attempt to fit the full version of the Busy Function. If the initial fit results in a negative amplitude of the polynomial component (C) or a significant shift with respect to the position of the error functions (|XP − XE| > 2 W0, where W0 is the initial estimate of the profile width), the fit will be discarded and automatically repeated without the polynomial component, i.e. C and XP will be fixed to zero and the fit repeated with only five free parameters. This behaviour can be changed with the ‘-relax’ option, in which case negative or shifted polynomial solutions will be accepted without repeating the fit.
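The acceptance test described in the previous paragraph can be summarised in a few lines. This is an illustrative Python sketch of the stated criteria only (C < 0, or |XP − XE| > 2 W0); the function name needs_refit is hypothetical and the actual implementation in BusyFit may differ in detail.

```python
def needs_refit(c, xp, xe, w0, relax=False):
    """Return True if the fit should be repeated without the polynomial.

    c, xp, xe are fitted parameters; w0 is the initial width estimate.
    relax mirrors the -relax option, which disables the check.
    """
    if relax:
        return False
    inverted = c < 0.0                   # negative polynomial amplitude
    shifted = abs(xp - xe) > 2.0 * w0    # polynomial far from the error functions
    return inverted or shifted


print(needs_refit(c=-0.02, xp=20.0, xe=20.0, w0=5.0))              # True
print(needs_refit(c=-0.02, xp=20.0, xe=20.0, w0=5.0, relax=True))  # False
```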

Input data

The H I spectrum will be read from a text file, and the user will have to specify both the input file name and the number of the column that contains the flux values of the spectrum. For example,

busyfit -c 3 spectrum.dat

would read the third column from the file spectrum.dat. The parser assumes that the tabulated values are separated by either a space or a tab and will ignore lines starting with a comment character (#, /, $, or %) as well as empty lines and lines containing only spaces.

Note that the programme makes the implicit assumption that the flux values in the input file are listed in the correct order and on a regular grid in either frequency or velocity. The programme will not read the spectral axis and instead assign the flux values to spectral channels on a regular grid. Channel numbers start from zero (i.e. the first encountered flux value will be assigned to channel 0), and it is the user’s responsibility to convert the results back into the appropriate spectral units.
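The parsing rules and the channel numbering can be illustrated with a short Python sketch. It is not the BusyFit parser; the function names read_flux_column and channel_to_spectral are hypothetical, and the sketch assumes that leading spaces before a comment character are ignored. The second function shows one way to convert a channel-based result back to spectral units, assuming the user knows the spectral value of channel 0 and the channel increment.

```python
COMMENT_CHARS = "#/$%"


def read_flux_column(lines, column=2):
    """Return the flux values of the given 1-based column.

    The list index of each value is its channel number, starting at 0.
    """
    flux = []
    for line in lines:
        stripped = line.strip()
        # Skip empty lines, whitespace-only lines, and comment lines.
        if not stripped or stripped[0] in COMMENT_CHARS:
            continue
        fields = stripped.split()  # splits on spaces and tabs
        flux.append(float(fields[column - 1]))
    return flux


def channel_to_spectral(channel, first_value, increment):
    """Convert a channel number to frequency or velocity on a regular grid."""
    return first_value + increment * channel


example = ["# velocity  weight  flux", "1000.0 1.0 0.5", "", "1010.0\t1.0\t0.7"]
print(read_flux_column(example, column=3))      # [0.5, 0.7]
print(channel_to_spectral(1, 1000.0, 10.0))     # 1010.0
```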

Initial parameter estimates

Without any further specifications, BusyFit will automatically determine a set of initial estimates for the free parameters. In many cases, this will be sufficient for the fit to converge. Alternatively, initial estimates can be provided with the ‘-p’ option. For example,

busyfit -p 1 0.5 0.5 0.1 20 20 5 -c 3 spectrum.dat

will set initial parameter estimates of A = 1, B1 = B2 = 0.5, C = 0.1, etc. In addition, parameters can be fixed by adding the letter ‘f’ to the value. For example, ‘2.5f’ will set the respective parameter to a value of 2.5 and exclude the parameter from the fitting procedure.
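The convention for the ‘-p’ values can be sketched in Python as follows. This only illustrates the notation described above (seven values in the order A, B1, B2, C, XE, XP, W, with an optional trailing ‘f’ marking a fixed parameter); the function name parse_initial_estimates is hypothetical and does not correspond to any routine in BusyFit.

```python
PARAMETER_NAMES = ("A", "B1", "B2", "C", "XE", "XP", "W")


def parse_initial_estimates(tokens):
    """Map -p tokens to {name: (value, fixed)}.

    A trailing 'f' marks a parameter as fixed, e.g. '2.5f'.
    """
    if len(tokens) != len(PARAMETER_NAMES):
        raise ValueError("expected %d values" % len(PARAMETER_NAMES))
    estimates = {}
    for name, token in zip(PARAMETER_NAMES, tokens):
        fixed = token.endswith("f")
        value = float(token[:-1] if fixed else token)
        estimates[name] = (value, fixed)
    return estimates


# Same values as in the example above, but with C fixed to zero.
print(parse_initial_estimates("1 0.5 0.5 0f 20 20 5".split()))
```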

Plotting the results

The programme will automatically plot the result of the fit on the screen, using gnuplot. This will only work if gnuplot is installed and can be prevented by using the ‘-noplot’ option. The plot will show the original spectrum, the fitted Busy Function, and the resulting residuals. Note that each call of BusyFit will open a separate plot window, allowing results to be compared with previous fits. Hence, when automatically fitting a large number of spectra, it is important to use the ‘-noplot’ option, as otherwise a large number of gnuplot windows would be created, potentially crashing the window manager.
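For batch processing, the programme can be called in a loop with plotting disabled. The following Python sketch uses only options documented in this manual (‘-noplot’, ‘-c’, and ‘-n’). It assumes that the busyfit executable is on the search path; the helper names build_command and fit_all are hypothetical. Since the manual does not describe the exit status of the programme, the sketch does not rely on it.

```python
import subprocess


def build_command(filename, column=2, rms=0.01):
    """Build a busyfit command line with plotting disabled."""
    return ["busyfit", "-noplot", "-c", str(column), "-n", str(rms), filename]


def fit_all(filenames, column=2, rms=0.01):
    """Run busyfit once per spectrum; results accumulate in the history file."""
    for filename in filenames:
        # check=False: the exit status of busyfit is not documented here.
        subprocess.run(build_command(filename, column, rms), check=False)


print(build_command("spectrum.dat", column=3))
```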

Output files

Every successful run of BusyFit will create three output files. The first two, ‘busyfit_output_spectrum.txt’ and ‘busyfit_output_fit.txt’, store the original spectrum plus residuals and a numerical copy of the fitted Busy Function, respectively. These files are used to plot the fit results on the screen using gnuplot. The previous contents of the files will be replaced whenever the programme is run again.

The third output file, ‘busyfit_history.txt’, will store a copy of the fit results. Previous results will not be deleted, but instead new results from the most recent run of the programme will be appended at the end of the file. This behaviour is useful when running BusyFit repeatedly, e.g. on a large number of spectra. Before the run, the output file can be deleted or cleared, and the results of all runs will then be available for processing after the last run has finished.

The file ‘busyfit_history.txt’ will store each set of results in a separate line, and each line contains the following entries, separated by tabs, in the given order:

  • Input file name
  • Success flag (0, 1, or 2)
  • Reduced χ²
  • Order of the polynomial component (2 or 4)
  • Values and uncertainties of the parameters A, B1, B2, C, XE, XP, and W
  • Values and uncertainties of centroid, w50, w20, peak flux, and integrated flux

The success flag indicates whether the fit was fully successful (0), converged successfully but with some parameters negative (1), or completely failed to converge (2). The values of centroid, w50, and w20 will be provided in channels, the peak flux will be given in the original flux units of the input file, and the integrated flux is measured in flux units × channels. Please note that channel numbers start at zero.
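A line of the history file can be split into named entries for further processing. The Python sketch below is an illustration under one important assumption that is not stated in this manual: each value is taken to be followed immediately by its uncertainty (value, uncertainty, value, uncertainty, ...), giving 28 tab-separated fields per line. If the actual file groups the entries differently, the indexing must be adapted. The function name parse_history_line and the dictionary keys are hypothetical.

```python
PARAMETERS = ("A", "B1", "B2", "C", "XE", "XP", "W")
OBSERVABLES = ("centroid", "w50", "w20", "peak_flux", "integrated_flux")


def parse_history_line(line):
    """Split one tab-separated history line into a dictionary."""
    fields = line.rstrip("\n").split("\t")
    names = PARAMETERS + OBSERVABLES
    expected = 4 + 2 * len(names)
    if len(fields) != expected:
        raise ValueError("expected %d fields, found %d" % (expected, len(fields)))
    record = {
        "filename": fields[0],
        "flag": int(fields[1]),          # 0, 1, or 2
        "reduced_chi2": float(fields[2]),
        "order": int(fields[3]),         # 2 or 4
    }
    # ASSUMPTION: entries are stored as (value, uncertainty) pairs.
    numbers = [float(field) for field in fields[4:]]
    for i, name in enumerate(names):
        record[name] = (numbers[2 * i], numbers[2 * i + 1])
    return record
```

Records with a success flag of 2 would normally be excluded before any further analysis, since the fit did not converge.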

Uncertainties of observational parameters

By default, uncertainties of the observational parameters will be derived using an error propagation method. This method uses the standard error propagation law under the assumption that uncertainties in the Busy Function parameters linearly propagate into uncertainties in the derived observational parameters. While this method is very fast, its accuracy can be limited in situations where non-linear effects become relevant.
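For reference, the standard first-order error propagation law for an observational parameter f that depends on the fit parameters P_i can be written as shown below. This is the general textbook form; the manual does not state whether BusyFit includes the covariance terms or only the diagonal (variance) terms, so the expression should be read as an illustration of the linear assumption rather than as the exact formula used by the programme.

```latex
\sigma_f^2 \approx \sum_{i} \sum_{j}
  \frac{\partial f}{\partial P_i} \,
  \frac{\partial f}{\partial P_j} \,
  \mathrm{Cov}(P_i, P_j)
```

If the covariances between different parameters are neglected, this reduces to a sum of (∂f/∂P_i)² σ_{P_i}² over all free parameters. In either form, the approximation holds only while f is close to linear in the parameters over the range of their uncertainties.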

If accuracy is essential, an alternative Monte-Carlo method can be invoked with the ‘-u iter’ option. In this case, BusyFit will generate [iter] realisations of the fitted Busy Function by randomly varying the function’s free parameters and recalculating the observational parameters in each iteration. The uncertainty of each parameter is then taken as the standard deviation about the mean across all iterations. A sensible number of iterations is 10,000, which will determine uncertainties with an accuracy of typically a few percent. The disadvantage of the Monte-Carlo approach is that it can be very slow if a large number of spectra need to be fitted, and the default error propagation method may be more suitable in this case.
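The Monte-Carlo approach can be illustrated with a short Python sketch that estimates the uncertainty of the integrated flux. It is not the BusyFit implementation. It assumes that each free parameter is varied independently with a Gaussian distribution of the given standard deviation, which the manual does not specify, and it assumes the sign convention W + x − XE and W − x + XE in the error function arguments. All function names are hypothetical.

```python
import math
import random
import statistics


def busy_function(x, a, b1, b2, c, xe, xp, w, order=2):
    """Busy Function at channel x (polynomial order 2 or 4)."""
    rising = math.erf(b1 * (w + x - xe)) + 1.0
    falling = math.erf(b2 * (w - x + xe)) + 1.0
    return (a / 4.0) * rising * falling * (c * (x - xp) ** order + 1.0)


def integrated_flux(params, n_channels, order=2):
    """Sum of the Busy Function over all channels (flux units x channels)."""
    return sum(busy_function(ch, *params, order=order) for ch in range(n_channels))


def monte_carlo_uncertainty(params, sigmas, n_channels, iterations=10000, seed=0):
    """Standard deviation of the integrated flux over random realisations."""
    rng = random.Random(seed)
    samples = []
    for _ in range(iterations):
        # ASSUMPTION: independent Gaussian scatter; sigma = 0 keeps a value fixed.
        trial = [rng.gauss(p, s) for p, s in zip(params, sigmas)]
        samples.append(integrated_flux(trial, n_channels))
    return statistics.stdev(samples)


params = [1.0, 0.5, 0.5, 0.0, 20.0, 20.0, 5.0]   # A, B1, B2, C, XE, XP, W
sigmas = [0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]     # only A is uncertain here
print(monte_carlo_uncertainty(params, sigmas, n_channels=40, iterations=2000))
```

Because the integrated flux is proportional to A, the result in this example should be close to 10 per cent of the integrated flux, which provides a simple consistency check against the linear error propagation method.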

Warning 1: For the error analysis method to work, it is absolutely essential that the correct baseline noise level be specified with the ‘-n rms’ option, as otherwise the calculated uncertainties of all parameters will be entirely arbitrary and meaningless.
Warning 2: The error analysis methods implemented in BusyFit are based on several assumptions, some of which will break down in certain situations, leading to unrealistic uncertainty estimates. In particular, the calculated uncertainties of the derived observational parameters will almost certainly be wrong if the uncertainty, σP, of any of the non-additive free parameters, P, of the fit does not fulfil the condition of σP ≪ P.