User Parameters - Preparation of the Science field data

These parameters govern the pre-processing needed for the science field, to split by beam, apply the bandpass (although none of the parameters listed here relate to that), flag, and average to continuum resolution.

The splitting is done by beam, and optionally by particular scans and/or fields (where the latter are selected on the basis of the FIELD NAME in the MS).

As noted in User Parameters - Data Location & Beam Selection, the beam MSs are copied instead of being split with mssplit when all of the following hold: the observation was taken in one-field-per-beam mode, no selection on scans or channels is made, and there is only a single beam in each MS. Copying runs much faster than splitting.

Once copied or split, a number of pre-imaging tasks are run. The default behaviour is to run these in a single Slurm job, but this can be separated into a job per task by setting SINGLE_JOB_PREIMAGING=false. Using a single job will work better in a large Slurm queue.

The raw measurement set is first calibrated with the bandpass calibration table derived previously (see User Parameters - Bandpass Calibration). Once calibrated, the dataset will be flagged to remove interference.

As is the case for the bandpass calibrator dataset, the MS is flagged in two passes. First, a combination of selection rules (allowing flagging of channels, time ranges, antennas & baselines, and autocorrelations) and (optionally) a simple flat amplitude threshold are applied. Then a sequence of Stokes-V flagging and dynamic flagging of amplitudes is done, optionally integrating over or across individual spectra. Each of these steps is selectable via input parameters.

The DO_PREFLAG_SCIENCE parameter allows the user to flag science data based on information already available from processing of the bandpass data. This is used to flag bad antennas identified by robust outlier detection methods, along with channels that exceed a given n-sigma threshold in the residuals from the smooth fit to the bandpass solutions (so that RFI-affected channels in the bandpass data are flagged and do not contaminate the science data).
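The channel part of this idea can be illustrated with a small sketch. It assumes the residuals from the smooth bandpass fit are already available as one number per channel, and it uses the median and the median absolute deviation (MAD) as the robust estimate of sigma. The estimator and the function name flag_outlier_channels are assumptions made for illustration, not a description of what the pipeline does internally. The default of 20 mirrors the default of PREFLAG_BADCHAN_DETECTION_THRESH.

```python
from statistics import median


def flag_outlier_channels(residuals, n_sigma=20.0):
    """Return the indices of channels whose residual is an n-sigma outlier.

    residuals: one value per channel (data minus smooth fit).
    Sigma is estimated robustly from the MAD (an assumption of this sketch).
    """
    centre = median(residuals)
    abs_dev = [abs(r - centre) for r in residuals]
    # 1.4826 scales the MAD to the standard deviation of a Gaussian.
    sigma = 1.4826 * median(abs_dev)
    if sigma == 0.0:
        # No spread to compare against, so nothing can be called an outlier.
        return []
    return [i for i, d in enumerate(abs_dev) if d > n_sigma * sigma]


if __name__ == "__main__":
    residuals = [0.01, -0.02, 0.015, -0.01, 0.02, 5.0, -0.015, 0.005]
    print(flag_outlier_channels(residuals))
```

A robust estimator is used here because a single RFI-affected channel would otherwise inflate the sigma it is being compared against.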

The USER_DEFINED_FLAGS parameter further allows users to specify rule-based flagging directives in an ASCII file, FLAG_DIRECTIVE_FILE, which the pipeline interprets and uses to flag the specified data. The directive file can have as many rows as there are rules:

  • Each row will be used to generate a new rule.

  • All rules for the same beam will be used in a single Cflag parset for that beam.

The first column must be used to specify the beam and the subsequent part of a row specifies elements corresponding to that beam that one wants flagged. Currently supported rules for flagging include:

  • Antenna/Baseline pairs

  • SPW

  • TimeRange

  • UVRange

Example syntax for a directive:

BEAM-NN -a AntNum -s spw:channelRange -t timeRange -u uvRange
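As a rough illustration of this row format, the sketch below splits a directive row into its beam number and its selection options, then groups the rules by beam in the way the single-parset-per-beam behaviour suggests. The function names and the dictionary keys (antenna, spw, timerange, uvrange) are hypothetical, and the sketch assumes that option values contain no whitespace. It is not the pipeline's own parser.

```python
# Option letters from the directive syntax, mapped to descriptive keys.
# The key names are chosen for this sketch only.
OPTION_KEYS = {"-a": "antenna", "-s": "spw", "-t": "timerange", "-u": "uvrange"}


def parse_directive(row):
    """Split one directive row into (beam_number, {selection: value})."""
    tokens = row.split()
    if not tokens or not tokens[0].startswith("BEAM-"):
        raise ValueError(f"row must start with BEAM-NN: {row!r}")
    beam = int(tokens[0][len("BEAM-"):])
    rest = tokens[1:]
    if len(rest) % 2 != 0:
        raise ValueError(f"each option needs exactly one value: {row!r}")
    rule = {}
    for option, value in zip(rest[0::2], rest[1::2]):
        if option not in OPTION_KEYS:
            raise ValueError(f"unsupported option {option!r} in {row!r}")
        rule[OPTION_KEYS[option]] = value
    return beam, rule


def group_by_beam(rows):
    """Collect the rules of all non-empty rows, keyed by beam number."""
    rules = {}
    for row in rows:
        if row.strip():
            beam, rule = parse_directive(row)
            rules.setdefault(beam, []).append(rule)
    return rules


if __name__ == "__main__":
    rows = [
        "BEAM-02 -s 0:100~110 -u 0~200",
        "BEAM-29 -a ak35 -s 0:6768~6880;7100~7105",
    ]
    print(group_by_beam(rows))
```

The selection strings themselves are passed through untouched, since their syntax is defined by the flagging tool rather than by the directive file.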

The formats for each of these selections (Antenna, SPW, TimeRange, etc.) must be compatible with Cflag. For details on how to specify them, see: http://www.aoc.nrao.edu/~sbhatnag/misc/msselection/msselection.html

The pipeline can additionally flag all continuum and spectral-line visibilities in channels whose flagged percentage exceeds a specified threshold. This helps mitigate the adverse effects on imaging of spectral channels with unusually high flagging percentages.
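The percentage-based channel flagging described above can be sketched as follows. The sketch assumes the flags are held as a simple list of rows (one per visibility sample), each a list of booleans (one per channel). This layout and the function names are assumptions for illustration only. The default of 50.0 mirrors the default of BADCHAN_DETECTION_THRESHOLD_FLAGGED_PC.

```python
def channels_over_threshold(flags, threshold_pc=50.0):
    """Return channels whose flagged percentage is above threshold_pc.

    flags: list of rows, each a list of booleans (True means flagged).
    """
    n_rows = len(flags)
    if n_rows == 0:
        return []
    bad = []
    for chan in range(len(flags[0])):
        n_flagged = sum(1 for row in flags if row[chan])
        if 100.0 * n_flagged / n_rows > threshold_pc:
            bad.append(chan)
    return bad


def flag_channels(flags, bad_channels):
    """Return a copy of flags with every sample in bad_channels flagged."""
    bad = set(bad_channels)
    return [[True if chan in bad else flagged
             for chan, flagged in enumerate(row)]
            for row in flags]


if __name__ == "__main__":
    flags = [
        [True, False, False],
        [True, True, False],
        [True, False, False],
        [False, True, False],
    ]
    bad = channels_over_threshold(flags)
    print(bad)
    print(flag_channels(flags, bad))
```

Note that the comparison is strict: a channel that is exactly 50% flagged is left alone in this sketch.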

Again, there is an option to use the AOFlagger tool (written by Andre Offringa) to do the flagging. This can be turned on by FLAG_WITH_AOFLAGGER, or FLAG_SCIENCE_WITH_AOFLAGGER & FLAG_SCIENCE_AV_WITH_AOFLAGGER (to just do it for the full-spectral-resolution or averaged science data respectively). You can provide a strategy file via AOFLAGGER_STRATEGY or AOFLAGGER_STRATEGY_SCIENCE & AOFLAGGER_STRATEGY_SCIENCE_AV, with access to some of the aoflagger parameters provided - see the table below. These strategy files need to be created prior to running the pipeline.
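The way these switches interact can be sketched in a few lines. The sketch follows the behaviour stated in the parameter table: the global FLAG_WITH_AOFLAGGER switch and the global AOFLAGGER_STRATEGY file take precedence over the per-task settings, and AOFLAGGER_READ_MODE and AOFLAGGER_UVW map to the listed command-line options. Holding the configuration in a Python dictionary, the function names and the strategy file names are all assumptions for illustration.

```python
# Read-mode values and the AOFlagger options they trigger,
# as listed for AOFLAGGER_READ_MODE in the parameter table.
READ_MODE_OPTIONS = {
    "auto": "-auto-read-mode",
    "direct": "-direct-read",
    "indirect": "-indirect-read",
    "memory": "-memory-read",
}


def use_aoflagger(config, task_switch):
    """True if AOFlagger should be used for the task named by task_switch."""
    return bool(config.get("FLAG_WITH_AOFLAGGER", False)
                or config.get(task_switch, False))


def strategy_for(config, task_strategy):
    """Pick the strategy file: the global setting wins over the per-task one."""
    return config.get("AOFLAGGER_STRATEGY") or config.get(task_strategy, "")


def aoflagger_options(config):
    """Build the read-mode and uvw options for the AOFlagger command line."""
    options = [READ_MODE_OPTIONS[config.get("AOFLAGGER_READ_MODE", "auto")]]
    if config.get("AOFLAGGER_UVW", False):
        options.append("-uvw")
    return options


if __name__ == "__main__":
    config = {
        "FLAG_SCIENCE_AV_WITH_AOFLAGGER": True,
        "AOFLAGGER_STRATEGY_SCIENCE_AV": "averaged.strategy",  # hypothetical file
        "AOFLAGGER_READ_MODE": "direct",
    }
    print(use_aoflagger(config, "FLAG_SCIENCE_AV_WITH_AOFLAGGER"))
    print(strategy_for(config, "AOFLAGGER_STRATEGY_SCIENCE_AV"))
    print(aoflagger_options(config))
```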

The dataset used for continuum imaging is created by averaging the frequency channels. The default amount of averaging is determined from the scheduling block parset parameters:

  • common.cp.ingest_mode=avg – no averaging needed (data taken in continuum mode)

  • common.cp.ingest_mode=full – data taken in spectral mode. Then:

    • common.target.src%d.corrmode=standard - 54 channels

    • else, common.target.src%d.corrmode will be something like zoom32x, so we use 32x54=1728 channels.

The aim here is to get an MS with 1 MHz resolution that can be imaged to produce the continuum image and used to run the self-calibration. The parset settings can be overridden by the use of NUM_CHAN_TO_AVERAGE. If the first option is encountered (common.cp.ingest_mode=avg), the jobs to do the averaging (and flagging of the averaged data) are skipped, and a symbolic link to the original MS is made to represent the averaged version.
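The rule for the default averaging width can be written out as a short sketch. The function name is hypothetical, returning 1 to mean "no averaging" is a convention of this sketch, and letting an explicit NUM_CHAN_TO_AVERAGE value take precedence in every mode is an assumption. The zoom factor is read from corrmode strings of the form zoom32x.

```python
import re

# Number of fine channels averaged together in standard mode.
STANDARD_WIDTH = 54


def default_num_chan_to_average(ingest_mode, corrmode="standard", override=None):
    """Return the number of channels to average for continuum processing.

    override stands in for a user-supplied NUM_CHAN_TO_AVERAGE value.
    """
    if override is not None:
        return int(override)
    if ingest_mode == "avg":
        # Data already at continuum resolution: no averaging needed.
        return 1
    if corrmode == "standard":
        return STANDARD_WIDTH
    match = re.fullmatch(r"zoom(\d+)x", corrmode)
    if match is None:
        raise ValueError(f"unrecognised corrmode: {corrmode!r}")
    return int(match.group(1)) * STANDARD_WIDTH


if __name__ == "__main__":
    print(default_num_chan_to_average("full", "standard"))
    print(default_num_chan_to_average("full", "zoom32x"))
    print(default_num_chan_to_average("avg"))
```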

It is possible to combine the channels using a median rather than an average - this will increase the noise level, but can be superior in rejecting very bright signal (e.g. narrow band signal affecting a small number of channels). This can be turned on by setting AVERAGING_USES_MEDIAN=true.
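The difference between the two combination methods is easy to see on a toy example. The sketch below uses plain real-valued amplitudes rather than complex visibilities, and the function name is hypothetical. It only illustrates why a median rejects a single bright narrow-band channel that would dominate an average.

```python
from statistics import mean, median


def combine_channels(samples, width, use_median=False):
    """Combine consecutive groups of `width` channels into one value each.

    A trailing group shorter than `width` is combined as-is.
    """
    combine = median if use_median else mean
    return [combine(samples[start:start + width])
            for start in range(0, len(samples), width)]


if __name__ == "__main__":
    # 108 fine channels at amplitude 1.0, with one bright spike in the first group.
    samples = [1.0] * 53 + [1000.0] + [1.0] * 54
    print(combine_channels(samples, 54))                   # [19.5, 1.0]
    print(combine_channels(samples, 54, use_median=True))  # [1.0, 1.0]
```

The average of the first group is pulled up to 19.5 by the single spike, while the median stays at the underlying level of 1.0.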

Once the averaged dataset has been created, a second round of flagging can be done on it, to flag any additional features that the averaging process may have enhanced.

The default behaviour is to process all fields within the science MS (interleaving, for instance, makes use of multiple fields), with each field being processed in its own sub-directory. The field selection is done in the splitting task, at the same time as the beam selection. It is possible, however, to select a single field to process via the FIELD_SELECTION_SCIENCE parameter (by giving the field name).
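Each field and beam combination ends up with its own measurement set, named from the MS_BASE_SCIENCE pattern listed in the table below. The wildcard substitution described there can be sketched as follows. The function names are hypothetical, and the two-digit beam formatting is inferred from the beam00 example given for that parameter.

```python
def expand_ms_name(pattern, target, sbid, field, beam):
    """Fill in the %t, %s and %b wildcards of an MS name pattern.

    %t: target (scheduling block alias), %s: scheduling block ID,
    %b: "FIELD.beamBB" with a zero-based, two-digit beam number.
    """
    beam_tag = f"{field}.beam{beam:02d}"
    return (pattern.replace("%t", target)
                   .replace("%s", str(sbid))
                   .replace("%b", beam_tag))


def averaged_ms_name(ms_name):
    """Derive the averaged MS name by replacing ".ms" with "_averaged.ms"."""
    if not ms_name.endswith(".ms"):
        raise ValueError(f"expected a name ending in .ms: {ms_name!r}")
    return ms_name[:-len(".ms")] + "_averaged.ms"


if __name__ == "__main__":
    name = expand_ms_name("scienceData.%t.SB%s.%b.ms", "LMC", 1234, "LMCA", 0)
    print(name)                    # scienceData.LMC.SB1234.LMCA.beam00.ms
    print(averaged_ms_name(name))  # scienceData.LMC.SB1234.LMCA.beam00_averaged.ms
```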

The pipeline has two options to speed up processing: splitting the MS data in time, or using parallel write access to the MS.

If chosen, the timewise splitting of the measurement sets for each beam is done up front at the copy/split step. This allows parallel execution of the non-imaging tasks (BandpassApplication, Flagging, Averaging and ContinuumSubtraction) on the cluster, and can greatly reduce processing times. The imaging is done per beam using data in ALL the TimeWindows, by combining the timewise-split data in an intermediate step. For details on making use of this feature, see the section on Processing by splitting data in time in the table below.

Parallel writing to the MS data can only be used if askapsoft was built using a patched version of casacore. In that case, set USE_PARALLEL_WRITE_MS=true to enable this mode. This allows calibration application, Flagging and ContinuumSubtraction to run in parallel, without the need to split up (and merge) the data. Combining the two speed-up options is not recommended.
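For the timewise splitting option, the division of an observation into TimeWindows can be sketched as below. The table entry for SPLIT_INTERVAL_MINUTES states only that roughly T/SPLIT_INTERVAL_MINUTES windows are made and that the interval may be adjusted. The rounding rule used here (nearest whole number of windows, with window lengths differing by at most one second) is an assumption, as is the function name.

```python
def time_windows(total_minutes, split_interval_minutes=60):
    """Return (start, end) offsets in seconds for near-equal time windows.

    The number of windows is total/interval rounded to the nearest integer
    (at least one); window lengths differ by at most one second.
    """
    total_seconds = round(total_minutes * 60)
    n_windows = max(1, round(total_minutes / split_interval_minutes))
    base, extra = divmod(total_seconds, n_windows)
    windows = []
    start = 0
    for index in range(n_windows):
        # The first `extra` windows absorb the leftover seconds.
        length = base + 1 if index < extra else base
        windows.append((start, start + length))
        start += length
    return windows


if __name__ == "__main__":
    # A 10-hour observation with the default 60-minute interval.
    for window in time_windows(600):
        print(window)
```

With a 100-minute observation and a 60-minute interval, this sketch produces two 50-minute windows rather than one 60-minute and one 40-minute window, which is one way the interval actually used can differ from the one requested.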

Variable

Default

Parset equivalent

Description

Job selection

DO_SPLIT_SCIENCE

true

none

Whether to split out the given beam from the science MS

JOB_TIME_SPLIT_SCIENCE

JOB_TIME_DEFAULT (24:00:00)

none

Time request for splitting the science MS

DO_PREFLAG_SCIENCE

true

none

Whether to propagate flags from the bandpass data to the split science MS. Currently, only BAD antenna flags are propagated.

PREFLAG_BADCHAN_DETECTION_THRESH

20

none

The threshold, in multiples of ‘sigma’, that determines whether a given channel is flagged by the preflagger.

DO_ADDITIONAL_FLAGGING_SPECTRAL

true

none

If true, flag all visibilities for fine channels in the science data that have a flagged percentage above BADCHAN_DETECTION_THRESHOLD_FLAGGED_PC.

DO_ADDITIONAL_FLAGGING_AVERAGED

true

none

If true, flag all visibilities for coarse channels in the science data that have a flagged percentage above BADCHAN_DETECTION_THRESHOLD_FLAGGED_PC.

BADCHAN_DETECTION_THRESHOLD_FLAGGED_PC

50.0

none

Flagging Threshold (in %) used in the additional flagging of visibilities.

USER_DEFINED_FLAGS

false

none

Whether to propagate flag directives specified in an ASCII file: FLAG_DIRECTIVE_FILE

FLAG_DIRECTIVE_FILE

""

none

ASCII file with each row specifying a Beam number and flagging criteria for that beam. For example:

  • BEAM-02 -t [2020/03/28/18:15:27.2~2020/03/28/18:20:10.1]

  • BEAM-02 -s 0:100~110 -u 0~200

  • BEAM-29 -a ak35 -s 0:6768~6880;7100~7105

In general each directive should be specified in a single row as:

  • BEAM-NN -a AntNum -s spw:channelRange -t TimeRange -u UVRange

DO_FLAG_SCIENCE

true

none

Whether to flag the (split) science MS

JOB_TIME_FLAG_SCIENCE

JOB_TIME_DEFAULT (24:00:00)

none

Time request for flagging the science MS

DO_APPLY_BANDPASS

true

none

Whether to apply the bandpass calibration to the science observation

JOB_TIME_APPLY_BANDPASS

JOB_TIME_DEFAULT (24:00:00)

none

Time request for applying the bandpass to the science data

NUM_CORES_CAL_APPLY

19

none

Number of cores for the job to apply the bandpass to the science data.

DO_AVERAGE_CHANNELS

true

none

Whether to average the science MS to continuum resolution

JOB_TIME_AVERAGE_MS

JOB_TIME_DEFAULT (24:00:00)

none

Time request for averaging the channels of the science data

Data selection

SCAN_SELECTION_SCIENCE

no default (see description)

scans (mssplit (Measurement Splitting/Averaging Utility))

This allows selection of particular scans from the science observation. If not provided, no scan selection is done (all scans are included in the output MS).

FIELD_SELECTION_SCIENCE

no default (see description)

fields (mssplit (Measurement Splitting/Averaging Utility))

This allows selection of particular FIELD NAMEs from the science observation. If not provided, all fields are done. The value must be just the field name, not surrounded by square brackets (which is a possible format for mssplit.fields). This is because the value will be matched to field names from the measurement set.

MS_BASE_SCIENCE

scienceData.%t.SB%s.%b.ms

none

Base name for the science observation measurement set after splitting. The wildcard %s will be replaced by the scheduling block ID, %t will be replaced by the “target” or scheduling block alias, and %b will be replaced by the string “FIELD.beamBB”, where FIELD represents the FIELD id, and BB the (zero-based) beam number (scienceData.LMC.SB1234.LMCA.beam00.ms etc).

MS_SCIENCE_AVERAGE

no default (see description)

dataset (cimager)

The name of the averaged measurement set that will be imaged by the continuum imager. Provide this if you want to skip the bandpass calibration and averaging steps (perhaps you’ve already done them). The wildcard %b, if present, will be replaced with “FIELD.beamBB”, as described above. If not provided, the averaged MS name will be derived from MS_BASE_SCIENCE, with “.ms” replaced with “_averaged.ms”.

CHAN_RANGE_SCIENCE

""

channel (mssplit (Measurement Splitting/Averaging Utility))

Range of channels in science observation (used in splitting and averaging). This must (for now) be the same as CHAN_RANGE_1934. The default is to use all available channels from the MS.

NUM_CHAN_TO_AVERAGE

""

width (mssplit (Measurement Splitting/Averaging Utility))

Number of channels to be averaged to create continuum measurement set. Value is determined by the scheduling block parset by default, but can be overridden by providing a value here.

AVERAGING_USES_MEDIAN

false

usemedian (mssplit (Measurement Splitting/Averaging Utility))

If true, the channels are combined using a median rather than average.

TILENCHAN_AV

1

stman.tilenchan (mssplit (Measurement Splitting/Averaging Utility))

The number of channels in the tile size used for the averaged MS.

Initial flagging

FLAG_DO_DYNAMIC_AMPLITUDE_SCIENCE

true

none

Whether to do the dynamic flagging, after the rule-based and simple flat-amplitude flagging is done

FLAG_THRESHOLD_DYNAMIC_SCIENCE

4.0

amplitude_flagger.threshold (cflag / cflagger (Flagging Utility))

Dynamic threshold applied to amplitudes when flagging science field data [sigma]

FLAG_DYNAMIC_INTEGRATE_SPECTRA

true

amplitude_flagger.integrateSpectra (cflag / cflagger (Flagging Utility))

Whether to flag channels in the time-averaged spectra during the dynamic flagging task.

FLAG_THRESHOLD_DYNAMIC_SCIENCE_SPECTRA

4.0

amplitude_flagger.integrateSpectra.threshold (cflag / cflagger (Flagging Utility))

Dynamic threshold applied to amplitudes when flagging science field data in integrateSpectra mode [sigma]

FLAG_DYNAMIC_INTEGRATE_TIMES

false

amplitude_flagger.integrateTimes (cflag / cflagger (Flagging Utility))

Whether to flag time samples in the spectrally averaged time-series during the dynamic flagging task.

FLAG_THRESHOLD_DYNAMIC_SCIENCE_TIMES

4.0

amplitude_flagger.integrateTimes.threshold (cflag / cflagger (Flagging Utility))

Dynamic threshold applied to amplitudes when flagging science field data in integrateTimes mode [sigma]

FLAG_DO_STOKESV_SCIENCE

true

none

Whether to do the Stokes-V flagging on the science data, after the rule-based and simple flat-amplitude flagging is done

FLAG_USE_ROBUST_STATS_STOKESV_SCIENCE

true

stokesv_flagger.useRobustStatistics (cflag / cflagger (Flagging Utility))

Whether to use robust statistics (median and inter-quartile range) in computing the Stokes-V statistics.

FLAG_THRESHOLD_STOKESV_SCIENCE

4.0

stokesv_flagger.threshold (cflag / cflagger (Flagging Utility))

Threshold applied to amplitudes when flagging the Stokes-V for the science field data [sigma]

FLAG_STOKESV_INTEGRATE_SPECTRA

true

stokesv_flagger.integrateSpectra (cflag / cflagger (Flagging Utility))

Whether to flag channels in the time-averaged spectra during the Stokes-V flagging task.

FLAG_THRESHOLD_STOKESV_SCIENCE_SPECTRA

4.0

stokesv_flagger.integrateSpectra.threshold (cflag / cflagger (Flagging Utility))

Threshold applied to amplitudes when flagging the Stokes-V for the science field data in integrateSpectra mode [sigma]

FLAG_STOKESV_INTEGRATE_TIMES

false

stokesv_flagger.integrateTimes (cflag / cflagger (Flagging Utility))

Whether to flag time samples in the spectrally averaged time-series during the Stokes-V flagging task.

FLAG_THRESHOLD_STOKESV_SCIENCE_TIMES

4.0

stokesv_flagger.integrateTimes.threshold (cflag / cflagger (Flagging Utility))

Threshold applied to amplitudes when flagging the Stokes-V for the science field data in integrateTimes mode [sigma]

FLAG_DO_FLAT_AMPLITUDE_SCIENCE

false

none

Whether to apply a flat amplitude threshold to the data.

FLAG_THRESHOLD_AMPLITUDE_SCIENCE

amplitude_flagger.high (cflag / cflagger (Flagging Utility))

Simple amplitude threshold applied when flagging science field data. If set to blank (FLAG_THRESHOLD_AMPLITUDE_SCIENCE=""), then no maximum value is applied.

FLAG_THRESHOLD_AMPLITUDE_SCIENCE_LOW

""

amplitude_flagger.low (cflag / cflagger (Flagging Utility))

Lower threshold for the simple amplitude flagging. If set to blank (FLAG_THRESHOLD_AMPLITUDE_SCIENCE_LOW=""), then no minimum value is applied.

ELEVATION_FLAG_SCIENCE_LOW

""

elevation_flagger.low (cflag / cflagger (Flagging Utility))

Visibilities below this elevation (degrees) will be flagged. If set to blank (ELEVATION_FLAG_SCIENCE_LOW=""), then no flagging based on low elevation limit will be applied.

ELEVATION_FLAG_SCIENCE_HIGH

""

elevation_flagger.high (cflag / cflagger (Flagging Utility))

Visibilities above this elevation (degrees) will be flagged. If set to blank (ELEVATION_FLAG_SCIENCE_HIGH=""), then no flagging based on high elevation limit will be applied.

ANTENNA_FLAG_SCIENCE

""

selection_flagger.<rule>.antenna (cflag / cflagger (Flagging Utility))

Allows flagging of antennas or baselines. For example, to flag out the 1-3 baseline, set this to "ak01&&ak03" (with the quote marks). See documentation for further details on format.

CHANNEL_FLAG_SCIENCE

""

selection_flagger.<rule>.spw (cflag / cflagger (Flagging Utility))

Allows flagging of a specified range of channels. For example, to flag out the first 100 channels, use "0:0~99" (with the quote marks). See the documentation for further details on the format.

TIME_FLAG_SCIENCE

""

selection_flagger.<rule>.timerange (cflag / cflagger (Flagging Utility))

Allows flagging of a specified time range(s). The string given is passed directly to the timerange option of cflag’s selection flagger. For details on the possible syntax, consult the MS selection documentation.

UVRANGE_FLAG_SCIENCE

""

selection_flagger.<rule>.uvrange (cflag / cflagger (Flagging Utility))

Allows flagging of a specified UV range(s). The string given is passed directly to the uvrange option of cflag’s selection flagger. For details on the possible syntax, consult the MS selection documentation.

FLAG_AUTOCORRELATION_SCIENCE

false

selection_flagger.<rule>.autocorr

If true, then autocorrelations will be flagged.

Flagging of averaged data

FLAG_AFTER_AVERAGING

true

none

Whether to do an additional step of flagging on the channel-averaged MS prior to imaging.

FLAG_DO_DYNAMIC_AMPLITUDE_SCIENCE_AV

true

none

Whether to do the dynamic flagging on the averaged science data, after the simple flat-amplitude flagging is done

FLAG_THRESHOLD_DYNAMIC_SCIENCE_AV

4.0

amplitude_flagger.threshold (cflag / cflagger (Flagging Utility))

Dynamic threshold applied to amplitudes when flagging the averaged science field data [sigma]

FLAG_DYNAMIC_INTEGRATE_SPECTRA_AV

true

amplitude_flagger.integrateSpectra (cflag / cflagger (Flagging Utility))

Whether to flag channels in the time-averaged spectra during the dynamic flagging task.

FLAG_THRESHOLD_DYNAMIC_SCIENCE_SPECTRA_AV

4.0

amplitude_flagger.integrateSpectra.threshold (cflag / cflagger (Flagging Utility))

Dynamic threshold applied to amplitudes when flagging the averaged science field data in integrateSpectra mode [sigma]

FLAG_DYNAMIC_INTEGRATE_TIMES_AV

false

amplitude_flagger.integrateTimes (cflag / cflagger (Flagging Utility))

Whether to flag time samples in the spectrally averaged time-series during the dynamic flagging task.

FLAG_THRESHOLD_DYNAMIC_SCIENCE_TIMES_AV

4.0

amplitude_flagger.integrateTimes.threshold (cflag / cflagger (Flagging Utility))

Dynamic threshold applied to amplitudes when flagging the averaged science field data in integrateTimes mode [sigma]

FLAG_DO_STOKESV_SCIENCE_AV

true

none

Whether to do the Stokes-V flagging on the averaged science data, after the rule-based and simple flat-amplitude flagging is done

FLAG_USE_ROBUST_STATS_STOKESV_SCIENCE_AV

true

stokesv_flagger.useRobustStatistics (cflag / cflagger (Flagging Utility))

Whether to use robust statistics (median and inter-quartile range) in computing the Stokes-V statistics.

FLAG_THRESHOLD_STOKESV_SCIENCE_AV

4.0

stokesv_flagger.threshold (cflag / cflagger (Flagging Utility))

Threshold applied to amplitudes when flagging the Stokes-V for the averaged science field data [sigma]

FLAG_STOKESV_INTEGRATE_SPECTRA_AV

true

stokesv_flagger.integrateSpectra (cflag / cflagger (Flagging Utility))

Whether to flag channels in the time-averaged spectra during the Stokes-V flagging task.

FLAG_THRESHOLD_STOKESV_SCIENCE_SPECTRA_AV

4.0

stokesv_flagger.integrateSpectra.threshold (cflag / cflagger (Flagging Utility))

Threshold applied to amplitudes when flagging the Stokes-V for the averaged science field data in integrateSpectra mode [sigma]

FLAG_STOKESV_INTEGRATE_TIMES_AV

false

stokesv_flagger.integrateTimes (cflag / cflagger (Flagging Utility))

Whether to flag time samples in the spectrally averaged time-series during the Stokes-V flagging task.

FLAG_THRESHOLD_STOKESV_SCIENCE_TIMES_AV

4.0

stokesv_flagger.integrateTimes.threshold (cflag / cflagger (Flagging Utility))

Threshold applied to amplitudes when flagging the Stokes-V for the averaged science field data in integrateTimes mode [sigma]

FLAG_DO_FLAT_AMPLITUDE_SCIENCE_AV

false

none

Whether to apply a flat amplitude threshold to the averaged science data.

FLAG_THRESHOLD_AMPLITUDE_SCIENCE_AV

amplitude_flagger.high (cflag / cflagger (Flagging Utility))

Simple amplitude threshold applied when flagging the averaged science field data. If set to blank (FLAG_THRESHOLD_AMPLITUDE_SCIENCE_AV=""), then no maximum value is applied. [value in flux-calibrated units]

FLAG_THRESHOLD_AMPLITUDE_SCIENCE_LOW_AV

""

amplitude_flagger.low (cflag / cflagger (Flagging Utility))

Lower threshold for the simple amplitude flagging on the averaged data. If set to blank (FLAG_THRESHOLD_AMPLITUDE_SCIENCE_LOW_AV=""), then no minimum value is applied. [value in flux-calibrated units]

CHANNEL_FLAG_SCIENCE_AV

""

selection_flagger.<rule>.spw (cflag / cflagger (Flagging Utility))

Allows flagging of a specified range of channels. For example, to flag out the first 100 channels, use "0:0~99" (with the quote marks). See the documentation for further details on the format.

TIME_FLAG_SCIENCE_AV

""

selection_flagger.<rule>.timerange (cflag / cflagger (Flagging Utility))

Allows flagging of a specified time range(s). The string given is passed directly to the timerange option of cflag’s selection flagger. For details on the possible syntax, consult the MS selection documentation.

UVRANGE_FLAG_SCIENCE_AV

""

selection_flagger.<rule>.uvrange (cflag / cflagger (Flagging Utility))

Allows flagging of a specified UV range(s). The string given is passed directly to the uvrange option of cflag’s selection flagger. For details on the possible syntax, consult the MS selection documentation.

Using AOFlagger for flagging

FLAG_WITH_AOFLAGGER

false

none

Use AOFlagger for all flagging tasks in the pipeline. This overrides the individual task level switches.

FLAG_SCIENCE_WITH_AOFLAGGER

false

none

Use AOFlagger for the flagging of the full-spectral-resolution science dataset. This and the next parameter allow differentiation between the different flagging tasks in the pipeline.

FLAG_SCIENCE_AV_WITH_AOFLAGGER

false

none

Use AOFlagger for the flagging of the averaged science dataset.

AOFLAGGER_STRATEGY

""

none

The strategy file to use for all AOFlagger tasks in the pipeline. Giving this a value will apply this one strategy file to all flagging jobs. The strategy file needs to be provided by the user.

AOFLAGGER_STRATEGY_SCIENCE

""

none

The strategy file to be used for the full-spectral-resolution science dataset. This will be overridden by AOFLAGGER_STRATEGY.

AOFLAGGER_STRATEGY_SCIENCE_AV

""

none

The strategy file to be used for the averaged science dataset. This will be overridden by AOFLAGGER_STRATEGY.

AOFLAGGER_VERBOSE

true

none

Verbose output for AOFlagger

AOFLAGGER_READ_MODE

auto

none

Read mode for AOFlagger. This can take one of the values “auto”, “direct”, “indirect”, or “memory”. These trigger the following respective command-line options for AOFlagger: “-auto-read-mode”, “-direct-read”, “-indirect-read”, “-memory-read”.

AOFLAGGER_UVW

false

none

When true, the command-line argument “-uvw” is added to the AOFlagger command. This reads uvw values (some exotic strategies require these).

Processing by splitting data in time

DO_SPLIT_TIMEWISE

true

none

By default, the non-imaging jobs – bandpass application, flagging, averaging, ccontsubtract – will be done on data that has been split into TimeWindows (see SPLIT_INTERVAL_MINUTES below for the TimeWindow interval selection). This will speed up the processing, especially when the observation duration exceeds a few hours.

SPLIT_INTERVAL_MINUTES

60

none

If DO_SPLIT_TIMEWISE is set to true, the pipeline will split the data into T/SPLIT_INTERVAL_MINUTES time-windows (where T is the total observation time in minutes). The pipeline ensures that the time intervals are equal to within a second, so the interval actually used may differ from the one specified.

Processing by parallel write to ms data

USE_PARALLEL_WRITE_MS

false

none

The non-imaging jobs – bandpass application, flagging, ccontsubtract – will be done using parallel writes to the MS data. This will speed up the processing and reduce the number of separate jobs to run. It is faster than timewise splitting and generally should not be combined with that option.