ccontsubtract

This software subtracts continuum from visibility data, either in a parallel/distributed environment or on a single computer system. It uses MPI, but can be run on anything from a simple laptop to a large supercomputer.

Running the program

It can be run with the following command, where “config.in” is a file containing the configuration parameters described in the next section.

$ <MPI wrapper> ccontsubtract -c config.in

Configuration Parameters

Parset parameters understood by ccontsubtract are given in the following table (all parameters must have the CContSubtract prefix, e.g. CContSubtract.dataset). For a number of parameters certain keywords are substituted: %w is replaced by the rank and %n by the number of nodes in the parallel case. In the serial case, these special strings are replaced by 0 and 1, respectively. This substitution makes it possible to reuse the same parameter file on all nodes of the cluster, provided the difference between jobs assigned to individual nodes can be encoded with these keywords (e.g. using specially crafted file names). If a parameter supports substitution, this is stated in its description.
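The substitution rule can be illustrated with a minimal Python sketch. It is not the code used by ccontsubtract; the function name `substitute` and the file-name pattern are hypothetical, and only the behaviour described above (rank for %w, number of nodes for %n, 0 and 1 in the serial case) is assumed.

```python
def substitute(value: str, rank: int = 0, nnodes: int = 1) -> str:
    """Replace %w with the rank and %n with the number of nodes.

    The defaults (0 and 1) correspond to the serial case.
    """
    return value.replace("%w", str(rank)).replace("%n", str(nnodes))


# Hypothetical file-name pattern: each rank resolves it to its own dataset.
pattern = "chunk_%w_of_%n.ms"
per_rank = [substitute(pattern, rank, 4) for rank in range(4)]
serial = substitute(pattern)
```

With a pattern like this, a single parset value can point every rank at a different file, which is the use case the keywords are meant for.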

A number of other parameters allowing the user to narrow down the data selection are understood. They are given in the Data Selection documentation and should also have the CContSubtract prefix. Note, however, that these parameters are available mainly because common code is used for CContSubtract and, for example, ccalibrator. This tool performs write operations, which can only proceed in parallel if different ranks modify independent data. Therefore, setting up workers to perform subtraction for different channels or beams in the same dataset will not work (even if allowed by the substitution rules and the Data Selection). However, different ranks can select different data tiles in the same dataset and update the data in parallel (this requires an appropriately patched casacore version), using

CContSubtract.Tiles = auto

There are two ways to subtract the continuum: you can specify an image or component model derived from an imaging run using doSubtraction=true, or you can fit the continuum visibility spectra with a low order polynomial or harmonic (sine and cosine terms) and subtract the model fit from the spectra using doUVlin=true. You can combine the two approaches, in which case the model subtraction is done before the fitting. The uvlin approach supports phase rotation to a specified direction (uvlin.direction) that can be given as a generic Direction Measure (e.g., “SUN”, or “12h34m56.7,23.34.45.6,J2000”). This will rotate the visibilities before doing the fit and subtract operation.
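As a rough illustration of the fitting approach, the Python sketch below fits a polynomial plus sine/cosine terms to a single real-valued spectrum and returns the residual. It is not the ccontsubtract implementation: the function names, the normalisation of the channel axis and the plain normal-equation solver are all assumptions. Only the form of the harmonic terms, sin(n*pi*chan/nchan) and the matching cosine, is taken from the uvlin.harmonic description.

```python
import math


def basis(chan, nchan, order, harmonic):
    """One design-matrix row: polynomial terms, then sine/cosine pairs."""
    x = chan / nchan  # assumed normalisation of the channel axis
    row = [x ** p for p in range(order + 1)]
    for n in range(1, harmonic + 1):
        arg = n * math.pi * x
        row += [math.sin(arg), math.cos(arg)]
    return row


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_and_subtract(spectrum, order=1, harmonic=0):
    """Least-squares fit of the basis to a spectrum; return the residual."""
    nchan = len(spectrum)
    rows = [basis(c, nchan, order, harmonic) for c in range(nchan)]
    k = len(rows[0])
    # Normal equations: (A^T A) p = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, spectrum)) for i in range(k)]
    coeffs = solve(ata, aty)
    model = [sum(p * t for p, t in zip(coeffs, r)) for r in rows]
    return [y - mdl for y, mdl in zip(spectrum, model)]


# A purely linear "continuum" is removed completely by a first-order fit.
continuum = [2.0 + 0.5 * c for c in range(16)]
residual = fit_and_subtract(continuum, order=1, harmonic=0)
```

Each additional harmonic adds one sine and one cosine column, so the number of free parameters in this sketch is (order + 1) + 2 * harmonic.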

Standard and Direction Dependent calibration are supported: if you have a calibration table with selfcal corrections for one or more directions, you can specify access to this in the standard way (see Access to calibrator solutions). Use sources.definition to specify a separate model file or sources.names to specify component or image models directly in the parset. You can use a combination of component and image models, but for direction dependent calibration you need to have either a number of components or a number of image models matching the number of calibration directions present in the calibration table. For image models the order in which the calibration directions are applied is by sorted field name, e.g.,

CContSubtract.sources.names=[field2,field1]
CContSubtract.sources.field1.model = image.model1
CContSubtract.sources.field2.model = image.model2

This specification and the same one with sources.names=[field1,field2] will both apply the gains for the first calibration direction to data predicted from image.model1. If you need to change the order, swap the model image name specifications (e.g. sources.field1.model=image.model2). Some notes on image models: the code currently requires the reference pixel (the tangent point) to be at the image centre for each image model. Subtraction will be inaccurate if that is not the case, so you cannot use cutouts of a larger image as model images. Offset model images that are too small may undersample the uv phase ramp and cause aliased responses to appear away from the image centre.
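The ordering rule for image models can be pictured with a small Python sketch. It only mirrors the behaviour described above (calibration directions are matched to models by sorted field name); the function name and the dictionary representation are hypothetical and not part of the tool.

```python
def models_by_direction(names, models):
    """Map calibration direction index to model, ordered by sorted field name."""
    return {index: models[name] for index, name in enumerate(sorted(names))}


models = {"field1": "image.model1", "field2": "image.model2"}

# The order given in sources.names does not matter in this sketch.
a = models_by_direction(["field2", "field1"], models)
b = models_by_direction(["field1", "field2"], models)

# Swapping the model names is what changes the assignment.
swapped = models_by_direction(
    ["field1", "field2"],
    {"field1": "image.model2", "field2": "image.model1"},
)
```

Here both `a` and `b` pair the first calibration direction (index 0) with image.model1, while `swapped` pairs it with image.model2.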

Parameter

Type

Default

Description

imagetype

string

“casa”

Type of the image handler (determines the format of the images read from the disk). The default is to read casa images; fits is the other option.

dataset

string

None

Data set file name to work with. The visibility data are overwritten with the subtraction result. Usual substitution rules apply.

datacolumn

string

“DATA”

The name of the data column in the measurement set which will be the source of visibilities and which will be updated. This can be useful to process real telescope data which were passed through casapy at some stage (e.g. to work with calibrated data which are stored in the CORRECTED_DATA column). In the measurement set convention, the DATA column which is used by default contains raw uncalibrated data as received directly from the telescope. Calibration tasks in casapy make a copy when calibration is applied creating a new data column.

doSubtraction

bool

false

Set to true to subtract the specified continuum model from the data.

doReplaceByModel

bool

false

Set to true to replace the data by the specified continuum model (non DD case only). This is for testing purposes only.

sources.definition

string

None

Optional parameter. If defined, the sky model (i.e. source info given as sources.something) is read from a separate parset file whose name is given by this parameter. If this parameter is not defined, the source description should be given in the main parset file. Usual substitution rules apply. The parameters used to define the sky model are described in csimulator.

gridder

string

None

Name of the gridder, further parameters are given by gridder.something. See Gridders for information. Only needed when image models are specified.

visweights

string

“”

If this parameter is set to “MFS”, gridders are set up to degrid with the weight required for models given as Taylor series (i.e. multi-frequency synthesis models). At the moment, this parameter is decoupled from the setup of the model parameters. The user has to set it separately and consistently with the model setup (the nterms parameter in the model definition; see the csimulator documentation).

visweights.MFS.reffreq

double

1.405e9

Reference frequency in Hz for MFS-model simulation (see above)

calibrate

bool

false

Specify if calibration should be applied to the model before subtraction

calibrate.directiondependent

bool

false

Specify if multiple models and a calibration table with direction dependent gains will be used.

calibrate.usecalapplicator

bool

true

Use the calibration applicator code (default) or the CalibrationME code. DD calibration is only implemented for the calibration applicator at this stage.

masterDoesWork

bool

false

In parallel mode, choose whether the master participates in the calculations like a worker, instead of doing nothing after the initial setup. Does not work with MPIWProject.

modelReadByMaster

bool

true

This parameter has effect in the parallel case only (can be set to anything in the serial case without affecting the result).

If true, the sky model is read by the master and is then distributed to all workers.

If false, each worker reads the model, which should be accessible from the worker nodes. This approach cuts down communication when the model is too big. Workers can also use individual models with the help of the substitution mechanism.

nUVWMachines

int32

1

Size of uvw-machines cache. uvw-machines are used to convert uvw from a given phase centre to a common tangent point. To reduce the cost to set the machine up (calculation of the transformation matrix), a number of these machines is cached.

The key to the cache is a pair of directions: the current phase centre and the tangent centre. If the required pair is within the tolerances of the pair used to set up one of the machines in the cache, this machine is reused. If none of the cache items matches, the least accessed one is replaced by a new machine which is set up with the new pair of directions.

The code works faster if this parameter is set to the number of phase centres encountered in the dataset. In the non-faceting case, the optimal setting is the number of synthetic beams times the number of fields. For faceting (where the performance gain is quite significant), it should be further multiplied by the number of facets.

Direction tolerances are given as a separate parameter.

uvwMachineDirTolerance

quantity string

“1e-6rad”

Direction tolerance for the management of the uvw-machine cache (see nUVWMachines for details). The value should be an angular quantity. The default value corresponds roughly to 0.2 arcsec and seems sufficient for all practical applications within the scope of ASKAPsoft.

freqframe

string

topo

Frequency frame to work in (the frame is converted when the dataset is read). Either lsrk, bary or topo is supported.

doUVlin

bool

false

Set to true to fit the visibility spectra to derive a continuum model and subtract it.

uvlin.order

integer

1

Order of the polynomial to fit. The default is linear.

uvlin.harmonic

integer

1

Number of sine and cosine terms to fit. Higher harmonics vary faster across the channel width, i.e., sin(n*pi*chan/nchan), with n the ‘harmonic order’. Increasing harmonic by one adds 2 degrees of freedom.

uvlin.width

integer

0

Fit in bins of ‘width’ channels. The default fits to the entire spectrum. Smaller bins (and more degrees of freedom) allow sources further from the phase center to be removed.

uvlin.offset

integer

0

The offset allows you to shift the origin of the fitting bins left. The first and last bin may be smaller. This feature exists to let you match the bins to e.g., the beam-forming intervals.

uvlin.threshold

float

2.5

Exclude outliers from the continuum fit. This first determines a robust estimate of rms and then rejects channels more than threshold*rms from the model. Iterates up to 3 times. Set to zero to skip thresholding.

uvlin.direction

vector<string>

None

Specify direction to rotate visibilities to before doing the fit to each spectrum. After the fit/subtract the visibilities are rotated back. A general Direction Measure is supported, e.g., [12h34m56.7,-23.34.45.6,J2000] or SUN
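The outlier rejection described for uvlin.threshold can be sketched as follows. This is only an illustration under stated assumptions: the robust rms is taken to be 1.4826 times the median absolute deviation, the residuals (data minus model) are held fixed rather than refitted on each pass, and the function name is hypothetical. The actual estimator used by the tool is not documented here.

```python
import statistics


def clip_outliers(residuals, threshold=2.5, max_iter=3):
    """Return a keep-mask after iterative rejection of outlier channels.

    `residuals` are data minus model for one spectrum.
    """
    keep = [True] * len(residuals)
    if threshold <= 0:
        return keep  # a zero threshold skips thresholding
    for _ in range(max_iter):
        kept = [r for r, k in zip(residuals, keep) if k]
        centre = statistics.median(kept)
        # Assumed robust rms: scaled median absolute deviation.
        rms = 1.4826 * statistics.median(abs(r - centre) for r in kept)
        if rms == 0:
            break  # nothing sensible to clip against
        new_keep = [k and abs(r) <= threshold * rms
                    for r, k in zip(residuals, keep)]
        if new_keep == keep:
            break  # converged before reaching max_iter
        keep = new_keep
    return keep


# One strong spectral line at channel 5 on top of small noise-like residuals.
residuals = [0.1, -0.2, 0.05, 0.15, -0.1, 8.0, -0.05, 0.2, -0.15, 0.0]
mask = clip_outliers(residuals)
```

In this example only channel 5 is excluded, so a strong line would not bias a subsequent continuum fit.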

Example

# The measurement set name - this will be overwritten
CContSubtract.dataset                             = 10uJy_simtest.ms
CContSubtract.doSubtraction                       = true
# The model definition
CContSubtract.sources.names                       = [10uJy]
CContSubtract.sources.10uJy.direction             = [12h30m00.000, -45.00.00.000, J2000]
CContSubtract.sources.10uJy.model                 = 10uJy.model.small
CContSubtract.sources.10uJy.components            = [src1]
# The individual components that make up the model
CContSubtract.sources.src1.flux.i                 = 1.0
CContSubtract.sources.src1.direction.ra           = 0.00798972946469
CContSubtract.sources.src1.direction.dec          = 0.002
CContSubtract.sources.src2.flux.i                 = 1.0
CContSubtract.sources.src2.direction.ra           = -0.00511171
CContSubtract.sources.src2.direction.dec          = 0.0
# The gridding parameters
CContSubtract.gridder                             = WProject
CContSubtract.gridder.WProject.wmax               = 15000
CContSubtract.gridder.WProject.nwplanes           = 1
CContSubtract.gridder.WProject.oversample         = 4
CContSubtract.gridder.WProject.maxfeeds           = 2
CContSubtract.gridder.WProject.maxsupport         = 1024
CContSubtract.gridder.WProject.frequencydependent = false

Example of uv based fitting and continuum removal

Note that image or component model based subtraction can be combined with uv-based fitting and subtraction, but if we just want to do uvlin we need to turn the model subtraction off. Here we fit over the basic 1 MHz, 54-channel unit to deal with discontinuities in the spectrum, and we rotate to the SUN to try to subtract solar RFI.

# The measurement set name - the data will be overwritten
CContSubtract.dataset                         = askap.ms
CContSubtract.doSubtraction                   = false
CContSubtract.doUVlin                         = true
CContSubtract.uvlin.order                     = 2
CContSubtract.uvlin.harmonic                  = 0
CContSubtract.uvlin.width                     = 54
CContSubtract.uvlin.offset                    = 0
CContSubtract.uvlin.direction                 = SUN
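To see which channels end up in each fit for a given uvlin.width and uvlin.offset, the following Python sketch computes the bin boundaries. It reflects one reading of the parameter descriptions (the bin origin is shifted left by the offset, so the first and last bins may be shorter) and is an assumption rather than the tool's code; the function name is hypothetical.

```python
def fit_bins(nchan, width=0, offset=0):
    """Return (start, stop) channel ranges, stop exclusive, for each fit."""
    if width <= 0:
        return [(0, nchan)]  # default: one fit over the whole spectrum
    shift = offset % width  # assumed: offsets wrap modulo the bin width
    first_edge = width - shift if shift else width
    edges = [0] + list(range(first_edge, nchan, width)) + [nchan]
    return list(zip(edges[:-1], edges[1:]))


# 216 channels in 54-channel units, as in the example above (4 x 1 MHz).
whole_units = fit_bins(216, width=54, offset=0)

# A small case with an offset: the first and last bins are shorter.
shifted = fit_bins(10, width=4, offset=1)
```

For the 216-channel case this gives four bins of 54 channels each; the shifted case gives bins covering channels 0-2, 3-6 and 7-9.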

Example of Direction Dependent calibration of model before subtraction

Here we have two component models, one with 2 and one with 3 (sub)components. The calibration table has two directions (stored as separate ‘beams’). The visibilities for each model are predicted and then corrupted with the matching calibration solutions (obtained with ddcalibrator).

# The measurement set name - this will be overwritten
CContSubtract.dataset                               = vis_2fields.ms
CContSubtract.imagetype                             = fits
# The model definition
CContSubtract.sources.names                         = [field1,field2]
CContSubtract.sources.field1.direction              = [12h30m00.000, -45.00.00.000, J2000]
CContSubtract.sources.field1.components             = [comp.1.1,comp.1.2]
CContSubtract.sources.comp.1.1.flux.i               = 1.0
CContSubtract.sources.comp.1.1.direction.ra         =  0.06
CContSubtract.sources.comp.1.1.direction.dec        = -0.02
CContSubtract.sources.comp.1.2.flux.i               = 1.0
CContSubtract.sources.comp.1.2.direction.ra         =  0.05
CContSubtract.sources.comp.1.2.direction.dec        = -0.024

CContSubtract.sources.field2.direction              = [12h30m00.000, -45.00.00.000, J2000]
CContSubtract.sources.field2.components             = [comp.2.1,comp.2.2,comp.2.3]
CContSubtract.sources.comp.2.1.flux.i               = 1.0
CContSubtract.sources.comp.2.1.direction.ra         = -0.05
CContSubtract.sources.comp.2.1.direction.dec        =  0.08
CContSubtract.sources.comp.2.2.flux.i               = 0.6
CContSubtract.sources.comp.2.2.direction.ra         = -0.039
CContSubtract.sources.comp.2.2.direction.dec        =  0.081
CContSubtract.sources.comp.2.3.flux.i               = 1.3
CContSubtract.sources.comp.2.3.direction.ra         = -0.051
CContSubtract.sources.comp.2.3.direction.dec        =  0.074


CContSubtract.calibrate                             = true
CContSubtract.calibrate.usecalapplicator            = true
CContSubtract.calibrate.directiondependent          = true
CContSubtract.calibrate.ignorechannel               = true
CContSubtract.calibaccess                           = table
CContSubtract.calibaccess.table                     = ddcalib.tab

CContSubtract.modelReadByMaster                     = true
CContSubtract.masterDoesWork                        = false
CContSubtract.Tiles                                 = auto


# Alternatively we could specify image models like this
#CContSubtract.sources.field1.model                 = image.model1
#CContSubtract.sources.field2.model                 = image.model2

# and use the MPIWProject gridder to save memory and speed up CF calculations
#CContSubtract.gridder                              = MPIWProject
#CContSubtract.gridder.MPIWProject.wmax             = 26000
#CContSubtract.gridder.MPIWProject.nwplanes         = 513
#CContSubtract.gridder.MPIWProject.oversample       = 8
#CContSubtract.gridder.MPIWProject.maxsupport       = 4096
#CContSubtract.gridder.MPIWProject.variablesupport  = true
#CContSubtract.gridder.MPIWProject.offsetsupport    = true
#CContSubtract.gridder.MPIWProject.sharecf          = true
# Use 4 ranks per node to calculate the convolution functions
#CContSubtract.gridder.MPIWProject.cfrank           = 4
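
The "predict, corrupt, subtract" step described for the direction dependent case can be sketched in Python for a single visibility sample. This is a simplified illustration, not the calibration applicator code: it assumes one scalar complex gain per antenna and direction, ignores polarisation, and uses hypothetical names and made-up numbers.

```python
def corrupt_and_subtract(data, models, gains, ant1, ant2):
    """Subtract gain-corrupted model visibilities for one baseline sample.

    data   : observed complex visibility
    models : predicted complex visibility per calibration direction
    gains  : gains[d][antenna], complex gain per direction and antenna
    """
    total = 0j
    for d, vis in enumerate(models):
        # Corrupt the prediction for direction d with that direction's gains.
        total += gains[d][ant1] * vis * gains[d][ant2].conjugate()
    return data - total


# Two directions, two antennas; all values are illustrative only.
gains = [[1 + 0j, 0.5j], [2 + 0j, 1 + 0j]]
models = [1 + 0j, 0.5 + 0j]
observed = 1 - 0.5j  # equals the corrupted sum of both models
residual = corrupt_and_subtract(observed, models, gains, ant1=0, ant2=1)
```

Because the observed value here is exactly the sum of the corrupted models, the residual is zero; with real data the residual would contain the line emission and noise.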