ccontsubtract
=============

The purpose of this software is to subtract continuum in a parallel/distributed
environment or on a single computer system. The software leverages MPI, however
can be run on a simple laptop or a large supercomputer.

Running the program
-------------------

It can be run with the following command, where "config.in" is a file containing
the configuration parameters described in the next section. ::

   $ <MPI wrapper> ccontsubtract -c config.in


Configuration Parameters
------------------------

Parset parameters understood by ccontsubtract are given in the following table
(all parameters must have **CContSubtract** prefix, i.e. **CContSubtract.dataset**).
For a number of parameters certain keywords are substituted, i.e. **%w** is
replaced by the rank and **%n** by the number of nodes in the parallel case. In
the serial case, these special strings are substituted by 0 and 1, respectively.
This substitution allows to reuse the same parameter file on all nodes of the
cluster if the difference between jobs assigned to individual nodes can be coded
by using these keywords (e.g. using specially crafted file names). If a
parameter supports substitution, it is clearly stated in the description.

A number of other parameters allowing the user to narrow down the data selection are
understood. They are given in the :doc:`data_selection` documentation  and should
also have the **CContSubtract** prefix. Note, however, that these parameters are
available mainly because common code is used for **CContSubtract** and, for example,
:doc:`ccalibrator`. On the other hand, this tool performs write operation which can only
proceed in parallel if different ranks modify independent data. Therefore,
setting up workers to perform subtraction for different channels or beams in the same
dataset will not work (even if allowed by the substitution rules and the :doc:`data_selection`).
However, you can now specify for different ranks to select different data tiles in the
same dataset and update the data in parallel (with an appropriately patched casacore version),
using ::

    CContSubtract.Tiles = auto

There are two ways to subtract the continuum: you can specify an image or component
model derived from an imaging run using **doSubtraction=true** or you can fit the continuum
visibility spectra with a low order polynomial or harmonic (sine and cosine terms)
and subtract the model fit from the spectra using
**doUVLin=true**. You can combine the two approaches in which case the the model
subtraction is done before the fitting.
The uvlin approach supports phase rotation to a specified direction (**uvlin.direction**) that
can be specified as a generic Direction Measure (e.g., "SUN", or "12h34m56.7,23.34.45.6,J2000").
This will rotate the visibilities before doing the fit and subtract operation.

Standard and Direction Dependent calibration is supported: if you have a calibration table with
selfcal corrections for one or more directions, you can specify access to this in the standard way
(see :doc:`calibration_solutions`). Use **sources.definition** to specify a separate model file or
**sources.names** to specify component or image models directly in the parset. You can use
a combination of component and image models, but for direction dependent calibration you need
to have either a number of components or a number of images models matching the number of
calibration directions present in the calibration table. For image models the order in which
the calibration directions are applied is by *sorted* field name, e.g., ::

    CContSubtract.sources.names=[field2,field1]
    CContSubtract.sources.field1.model = image.model1
    CContSubtract.sources.field2.model = image.model2

and the same specification with **source.names=[field1,field2]** will both apply
the gains for the first calibration direction to data predicted from image.model1. If you need
to change the order, swap the model image name specifications (**source.field1.model=image.model2**).
Some notes on image models: the code currently requires that the reference pixel (=tangent point) is
at the image centre for each image model - subtraction will be inaccurate if that is not the case, i.e.,
you can't use cutouts of a larger image as model images. Offset model images that are too small may undersample
the uv phase ramp and cause aliased responses to appear away from the image centre.

+----------------------------+---------------+------------+----------------------------------------------------------+
|*Parameter*                 |*Type*         |*Default*   |*Description*                                             |
+============================+===============+============+==========================================================+
|imagetype                   |string         |"casa"      |Type of the image handler (determines the format of the   |
|                            |               |            |images read from the disk). The default is to read casa   |
|                            |               |            |images, fits is the other option.                         |
|                            |               |            |                                                          |
+----------------------------+---------------+------------+----------------------------------------------------------+
|dataset                     |string         |None        |Data set file name to work with. The visibility data are  |
|                            |               |            |overwritten with the subtraction result. Usual            |
|                            |               |            |substitution rules apply.                                 |
|                            |               |            |                                                          |
+----------------------------+---------------+------------+----------------------------------------------------------+
|datacolumn                  |string         |"DATA"      |The name of the data column in the measurement set which  |
|                            |               |            |will be the source of visibilities and which will be      |
|                            |               |            |updated. This can be useful to process real telescope data|
|                            |               |            |which were passed through *casapy* at some stage (e.g. to |
|                            |               |            |work with calibrated data which are stored in the         |
|                            |               |            |*CORRECTED_DATA* column). In the measurement set          |
|                            |               |            |convention, the *DATA* column which is used by default    |
|                            |               |            |contains raw uncalibrated data as received directly from  |
|                            |               |            |the telescope.  Calibration tasks in *casapy* make a copy |
|                            |               |            |when calibration is applied creating a new data column.   |
|                            |               |            |                                                          |
+----------------------------+---------------+------------+----------------------------------------------------------+
|doSubtraction               |bool           |false       |Set to true to subtract the specified continuum model     |
|                            |               |            |from the data.                                            |
+----------------------------+---------------+------------+----------------------------------------------------------+
|doReplaceByModel            |bool           |false       |Set to true to replace the data by the specified          |
|                            |               |            |continuum model (non DD case only). This is for testing   |
|                            |               |            |purposes only.                                            |
+----------------------------+---------------+------------+----------------------------------------------------------+
|sources.definition          |string         |None        |Optional parameter.  If defined, sky model (i.e. source   |
|                            |               |            |info given as *sources.something*) is read from a separate|
|                            |               |            |parset file (name is given by this parameter).If this     |
|                            |               |            |parameter is not defined, source description should be    |
|                            |               |            |given in the main parset file. Usual substitution rules   |
|                            |               |            |apply. The parameters used to define the sky model are    |
|                            |               |            |described in :doc:`csimulator`.                           |
|                            |               |            |                                                          |
+----------------------------+---------------+------------+----------------------------------------------------------+
|gridder                     |string         |None        |Name of the gridder, further parameters are given by      |
|                            |               |            |*gridder.something*. See :doc:`gridder` for information.  |
|                            |               |            |Only needed when image models are specified.              |
+----------------------------+---------------+------------+----------------------------------------------------------+
|visweights                  |string         |""          |If this parameter is set to "MFS" gridders are setup to   |
|                            |               |            |degrid with the weight required for the models given as   |
|                            |               |            |Taylor series (i.e. multi-frequency synthesis models). At |
|                            |               |            |the moment, this parameter is decoupled from the setup of |
|                            |               |            |the model parameters.The user has to set it separately and|
|                            |               |            |in a consistent way with the model setup (the *nterms*    |
|                            |               |            |parameter in the model definition (see :doc:`csimulator`  |
|                            |               |            |documentation).                                           |
|                            |               |            |                                                          |
+----------------------------+---------------+------------+----------------------------------------------------------+
|visweights.MFS.reffreq      |double         |1.405e9     |Reference frequency in Hz for MFS-model simulation (see   |
|                            |               |            |above)                                                    |
|                            |               |            |                                                          |
+----------------------------+---------------+------------+----------------------------------------------------------+
|calibrate                   |bool           |false       |Specify if calibration should be applied to the model     |
|                            |               |            |before subtraction                                        |
+----------------------------+---------------+------------+----------------------------------------------------------+
|calibrate.directiondependent|bool           |false       |Specify if multiple models and a calibration table with   |
|                            |               |            |direction dependent gains and will be used                |
+----------------------------+---------------+------------+----------------------------------------------------------+
|calibrate.usecalapplicator  |bool           |true        |Use the calibration applicator code (default) or the      |
|                            |               |            |CalibrationME code. DD calibration is only implemented for|
|                            |               |            |the calibration applicator at this stage.                 |
+----------------------------+---------------+------------+----------------------------------------------------------+
|masterDoesWork              |bool           |false       |In parallel mode, choose if the master participates in    |
|                            |               |            |the calculations like a worker, instead of doing nothing  |
|                            |               |            |after the initial setup. Does not work with MPIWProject   |
+----------------------------+---------------+------------+----------------------------------------------------------+
|modelReadByMaster           |bool           |true        |This parameter has effect in the parallel case only (can  |
|                            |               |            |be set to anything in the serial case without affecting   |
|                            |               |            |the result).                                              |
|                            |               |            |                                                          |
|                            |               |            |If true, the sky model is read by the master and is then  |
|                            |               |            |distributed to all workers.                               |
|                            |               |            |                                                          |
|                            |               |            |If false, each worker reads the model, which should be    |
|                            |               |            |accessible from the worker nodes. This approach cuts down |
|                            |               |            |communication when the model is too big. Workers can also |
|                            |               |            |use individual models with the help of the substitution   |
|                            |               |            |mechanism.                                                |
+----------------------------+---------------+------------+----------------------------------------------------------+
|nUVWMachines                |int32          |1           |Size of uvw-machines cache. uvw-machines are used to      |
|                            |               |            |convert uvw from a given phase centre to a common tangent |
|                            |               |            |point. To reduce the cost to set the machine up           |
|                            |               |            |(calculation of the transformation matrix), a number of   |
|                            |               |            |these machines is cached.                                 |
|                            |               |            |                                                          |
|                            |               |            |The key to the cache is a pair of two directions: the     |
|                            |               |            |current phase centre and the tangent centre. If the       |
|                            |               |            |required pair is within the tolerances of that used to    |
|                            |               |            |setup one of the machines in the cache, this machine is   |
|                            |               |            |reused. If none of the cache items matches the least      |
|                            |               |            |accessed one is replaced by the new machine which is set  |
|                            |               |            |up with the new pair of directions.                       |
|                            |               |            |                                                          |
|                            |               |            |The code would work faster if this parameter is set to the|
|                            |               |            |number of phase centres encountered in the dataset. In the|
|                            |               |            |non-faceting case, the optimal setting would be the number|
|                            |               |            |of synthetic beams times the number of fields. For        |
|                            |               |            |faceting (btw, the performance gain is quite significant  |
|                            |               |            |in this case), it should be further multiplied by the     |
|                            |               |            |number of facets.                                         |
|                            |               |            |                                                          |
|                            |               |            |Direction tolerances are given as a separate parameter.   |
+----------------------------+---------------+------------+----------------------------------------------------------+
|uvwMachineDirTolerance      |quantity       |"1e-6rad"   |Direction tolerance for the management of the uvw-machine |
|                            |string         |            |cache (see *nUVWMachines* for details). The value should  |
|                            |               |            |be an angular quantity. The default value corresponds     |
|                            |               |            |roughly to 0.2 arcsec and seems sufficient for all        |
|                            |               |            |practical applications within the scope of ASKAPsoft.     |
|                            |               |            |                                                          |
+----------------------------+---------------+------------+----------------------------------------------------------+
|freqframe                   |string         |topo        |Frequency frame to work in (the frame is converted when   |
|                            |               |            |the dataset is read). Either lsrk, bary or topo is        |
|                            |               |            |supported.                                                |
+----------------------------+---------------+------------+----------------------------------------------------------+
|doUVlin                     |bool           |false       |Set to true to fit the visibility spectra to derive a     |
|                            |               |            |continuum model and subtract it.                          |
+----------------------------+---------------+------------+----------------------------------------------------------+
|uvlin.order                 |integer        |1           |Order of the polynomial to fit. The default is linear.    |
|                            |               |            |                                                          |
+----------------------------+---------------+------------+----------------------------------------------------------+
|uvlin.harmonic              |integer        |1           |Number of sine and cosine terms to fit.                   |
|                            |               |            |Higher harmonics vary faster across the channel width,    |
|                            |               |            |i.e., sin(n*pi*chan/nchan), with n the 'harmonic order'.  |
|                            |               |            |Increasing harmonic by one adds 2 degrees of freedom.     |
+----------------------------+---------------+------------+----------------------------------------------------------+
|uvlin.width                 |integer        |0           |Fit in bins of 'width' channels. The default fits to the  |
|                            |               |            |entire spectrum. Smaller bins (and more degrees of        |
|                            |               |            |freedom) allow sources further from the phase center to be|
|                            |               |            |be removed.                                               |
+----------------------------+---------------+------------+----------------------------------------------------------+
|uvlin.offset                |integer        |0           |The offset allows you to shift the origin of the fitting  |
|                            |               |            |bins left. The first and last bin may be smaller.         |
|                            |               |            |This feature exists to let you match the bins to e.g., the|
|                            |               |            |beam-forming intervals.                                   |
+----------------------------+---------------+------------+----------------------------------------------------------+
|uvlin.threshold             |float          |2.5         |Exclude outliers from the continuum fit. This first       |
|                            |               |            |determines a robust estimate of rms and then rejects      |
|                            |               |            |channels more than threshold*rms from the model. Iterates |
|                            |               |            |up to 3 times. Set to zero to skip thresholding.          |
+----------------------------+---------------+------------+----------------------------------------------------------+
|uvlin.direction             |vector<string> |None        |Specify direction to rotate visibilities to before doing  |
|                            |               |            |the fit to each spectrum. After the fit/subtract the      |
|                            |               |            |visibilities are rotated back. A general Direction Measure|
|                            |               |            |is supported, e.g., [12h34m56.7,-23.34.45.6,J2000] or SUN |
+----------------------------+---------------+------------+----------------------------------------------------------+


Example
----------


.. code-block:: bash

    # The measurement set set name - this will be overwritten
    CContSubtract.dataset                             = 10uJy_simtest.ms
    CContSubtract.doSubtraction                       = true
    # The mhe model definition
    CContSuntSubtract.sources.names                       = [10uJy]
    CContSubtract.sources.10uJy.direction             = [12h30m00.000, -45.00.00.000, J2000]
    CContSubtract.sourceurces.10uJy.model                 = 10uJy.model.small
    CContSubtract.sources.10uJy.components            = [src1]
    # The individual components that make up the model
    CContSubtract.sources.src1.flux.i                 = 1.0
    CContSubtract.sources.src1.direction.ra           = 0.00798972946469
    CContSubtract.sources.src1.direction.dec          = 0.002
    CContSubtract.sources.src2.flux.i                 = 1.0
    CContSubtract.sources.src2.direction.ra           = -0.00511171
    CContSubtract.sources.src2.direction.dec          = 0.0
    # The gridding parameters
    CContSubtract.gridder                             = WProject
    CContSubtract.gridder.WProject.wmax               = 15000
    CContSubtract.gridder.WProject.nwplanes           = 1
    CContSubtract.gridder.WProject.oversample         = 4
    CContSubtract.gridder.WProject.maxfeeds           = 2
    CContSubtract.gridder.WProject.maxsupport         = 1024
    CContSubtract.gridder.WProject.frequencydependent = false

Example of uv based fitting and continuum removal
-------------------------------------------------
Note that image or component model based subtraction can be combined with uv-based
fitting and subtraction, but if we just want to do uvlin we need to turn that off.
Here we fit over the basic 1 MHz, 54 channel unit to deal with discontinuities
in the spectrum and we rotate to the SUN to try and subtract solar RFI.

.. code-block:: bash

    # The measurement set name - the data will be overwritten
    CContSubtract.dataset                         = askap.ms
    CContSubtract.doSubtraction                   = false
    CContsubtract.doUVlin                         = true
    CContsubtract.uvlin.order                     = 2
    CContsubtract.uvlin.harmonic                  = 0
    CContsubtract.uvlin.width                     = 54
    CContsubtract.uvlin.offset                    = 0
    CContSubtract.uvlin.direction                 = SUN

Example of Direction Dependent calibration of model before subtraction
----------------------------------------------------------------------
Here we have two component models, one with 2 and one with 3 (sub)components.
The calibration table has two directions (stored as separate 'beams'). The
visibilities for each model are predicted and then corrupted with the matching
calibration solutions (obtained with ddcalibrator).

.. code-block:: bash

    # The measurement set name - this will be overwritten
    CContSubtract.dataset                               = vis_2fields.ms
    CContSubtract.imagetype                             = fits
    # The model definition
    CContSubtract.sources.names                         = [field1,field2]
    CContSubtract.sources.field1.direction              = [12h30m00.000, -45.00.00.000, J2000]
    CContSubtract.sources.field1.components             = [comp.1.1,comp.1.2]
    CContSubtract.sources.comp.1.1.flux.i               = 1.0
    CContSubtract.sources.comp.1.1.direction.ra         =  0.06
    CContSubtract.sources.comp.1.1.direction.dec        = -0.02
    CContSubtract.sources.comp.1.2.flux.i               = 1.0
    CContSubtract.sources.comp.1.2.direction.ra         =  0.05
    CContSubtract.sources.comp.1.2.direction.dec        = -0.024

    CContSubtract.sources.field2.direction              = [12h30m00.000, -45.00.00.000, J2000]
    CContSubtract.sources.field2.components             = [comp.2.1,comp.2.2,comp.2.3]
    CContSubtract.sources.comp.2.1.flux.i               = 1.0
    CContSubtract.sources.comp.2.1.direction.ra         = -0.05
    CContSubtract.sources.comp.2.1.direction.dec        =  0.08
    CContSubtract.sources.comp.2.2.flux.i               = 0.6
    CContSubtract.sources.comp.2.2.direction.ra         = -0.039
    CContSubtract.sources.comp.2.2.direction.dec        =  0.081
    CContSubtract.sources.comp.2.3.flux.i               = 1.3
    CContSubtract.sources.comp.2.3.direction.ra         = -0.051
    CContSubtract.sources.comp.2.3.direction.dec        =  0.074


    CContSubtract.calibrate                             = true
    CContSubtract.calibrate.usecalapplicator            = true
    CContSubtract.calibrate.directiondependent          = true
    CContSubtract.calibrate.ignorechannel               = true
    CContSubtract.calibaccess                           = table
    CContSubtract.calibaccess.table                     = ddcalib.tab

    CContSubtract.modelReadByMaster                     = true
    CContSubtract.masterDoesWork                        = false
    CContSubtract.Tiles                                 = auto


    # Alternatively we could specify image models like this
    #CContSubtract.sources.field1.model                 = image.model1
    #CContSubtract.sources.field2.model                 = image.model2

    # and use the MPIWProject gridder to save memory and speed up CF calculations
    #CContSubtract.gridder                              = MPIWProject
    #CContSubtract.gridder.MPIWProject.wmax             = 26000
    #CContSubtract.gridder.MPIWProject.nwplanes         = 513
    #CContSubtract.gridder.MPIWProject.oversample       = 8
    #CContSubtract.gridder.MPIWProject.maxsupport       = 4096
    #CContSubtract.gridder.MPIWProject.variablesupport  = true
    #CContSubtract.gridder.MPIWProject.offsetsupport    = true
    #CContSubtract.gridder.MPIWProject.sharecf          = true
    # Use 4 ranks per node to calculate the convolution functions
    #CContSubtract.gridder.MPIWProject.cfrank           = 4