The mssplit utility extracts a subset of a measurement set. The subset may be selected by channel, beam ID or scan ID. Channels can be averaged together during extraction, and the tool can also be used purely for channel averaging, with no filtering or selection.
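The intended use-cases of this tool are splitting out a channel or channel range into its own measurement set, averaging channels, or both at once.
Additionally, mssplit can filter rows by beam, scan, field name and time range, using the parameters described in the next section.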
It can be run with the following command, where “config.in” is a file containing the configuration parameters described in the next section.
$  mssplit -c config.in
The mssplit program is not parallel or distributed; it runs as a single process operating on a single input measurement set.
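When mssplit is driven from a script, the configuration file can be generated and the command invoked programmatically. The following is a minimal Python sketch, not part of mssplit itself: the helper names `write_parset`, `mssplit_command` and `run_mssplit` are hypothetical, and it assumes that `mssplit` is on the `PATH` and returns a non-zero exit status on failure.

```python
import subprocess
from pathlib import Path


def write_parset(path, params):
    """Write one 'key = value' line per parameter to a config file."""
    lines = [f"{key} = {value}" for key, value in params.items()]
    Path(path).write_text("\n".join(lines) + "\n")
    return Path(path)


def mssplit_command(config_path):
    """Build the argument list for 'mssplit -c <config>'."""
    return ["mssplit", "-c", str(config_path)]


def run_mssplit(config_path):
    """Run mssplit as a single process.

    Assumption: a failed run is reported through a non-zero exit status,
    in which case subprocess raises CalledProcessError.
    """
    subprocess.run(mssplit_command(config_path), check=True)
```

For instance, `write_parset("config.in", {"vis": "full.ms", "outputvis": "chan1.ms", "channel": "1", "width": 1})` followed by `run_mssplit("config.in")` would correspond to the command shown above.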
| Parameter | Default | Example | Description | 
|---|---|---|---|
| vis | None | 2013-12-25_230000.ms | The input measurement set (uv-dataset). This file will not be modified. | 
| outputvis | None | chan_1.ms | The output measurement set (uv-dataset). This file will be created; the program will fail if a file or directory with the same name already exists. | 
| channel | None | 1-300 | The channel range to split out into its own measurement set. Can be either a single integer (e.g. 1) or a range (e.g. 1-300). The range is inclusive of both the start and end, and indexing is one-based. | 
| width | 1 | 54 | Defines the number of input channels to average together to form one output channel. As the averaged visibilities can have different noise levels due to flagging, an additional array column containing noise sigmas for each spectral channel will be written when width>1. | 
| beams | None | [0] or [0, 1, 2] or [0..8] | Defines the beam numbers that will be exported to the output files. Rows are selected by matching their feed ID column with the provided beam number(s), so the numbers given here should be taken from the list of IDs in the FEED table of the measurement set. If this parameter is not set all beams are exported. The value may be a single integer (e.g. 0 or [0]), an array of integers such as [0,1,2] or a range such as [0..8]. | 
| scans | None | [0] or [0, 1, 2] or [0..2] | Defines the scan numbers that will be exported to the output files. Rows are selected by matching their scan_number column with the provided scan number(s). If this parameter is not set all scans are exported. The value may be a single integer (e.g. 0 or [0]), an array of integers such as [0,1,2] or a range such as [0..2]. | 
| fieldnames | None | [offset1] or [offset1,offset2] or [offset1..9] | Defines the field names that will be exported to the output files. If this parameter is not set all fields are exported. The value may be a single string (e.g. a0 or [a0]), an array of strings such as [a0,a1,a2] or a range such as [a0..2]. | 
| timebegin | None | 1996/11/20/5:20 or 20Nov96-5h20m or 1996-11-20T5:20 | Defines a time-based filter. Any rows with a time earlier than this parameter will be excluded during splitting (i.e. they will not be copied to the output measurement set). This parameter is optional; if it is not present, no lower time limit is applied. | 
| timeend | None | 1996/11/20/5:20 or 20Nov96-5h20m or 1996-11-20T5:20 | Defines a time-based filter. Any rows with a time later than this parameter will be excluded during splitting (i.e. they will not be copied to the output measurement set). This parameter is optional; if it is not present, no upper time limit is applied. | 
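The value syntax described in the table can be illustrated with a short Python sketch. It is not mssplit's own parser: the function names are hypothetical, and it assumes that a `[first..last]` range includes both ends, as the `[0, 1, 2]` and `[0..2]` examples suggest.

```python
import re


def parse_channel_range(text):
    """Parse '1' or '1-300' into a one-based inclusive (start, end) pair."""
    match = re.fullmatch(r"\s*(\d+)\s*(?:-\s*(\d+)\s*)?", text)
    if match is None:
        raise ValueError(f"not a channel or channel range: {text!r}")
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    if start < 1 or end < start:
        raise ValueError(f"invalid one-based channel range: {text!r}")
    return start, end


def expand_int_selection(text):
    """Expand '0', '[0]', '[0, 1, 2]' or '[0..8]' into a list of integers."""
    body = text.strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]
    values = []
    for item in body.split(","):
        item = item.strip()
        if ".." in item:
            # Assumption: the range is inclusive of both ends.
            first, last = (int(part) for part in item.split(".."))
            values.extend(range(first, last + 1))
        else:
            values.append(int(item))
    return values
```

Under these assumptions, `parse_channel_range("1-300")` gives the pair `(1, 300)`, and `expand_int_selection("[0..2]")` gives the same list as `expand_int_selection("[0, 1, 2]")`.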
| Parameter | Default | Description | 
|---|---|---|
| stman.bucketsize | 65536 | Set the bucket size (in bytes) of the CASA Table storage manager. This usually translates into the I/O size. | 
| stman.tilencorr | 4 | Set the number of correlations per tile. This affects the way the data is written to and read from disk. | 
| stman.tilenchan | 1 | Set the number of spectral channels per tile. This affects the way the data is written to and read from disk. If a given reader or writer process is expected to access only a single channel, the default value of 1 is fine. If it is expected to access many, or even all, channels, a larger value will perform better. | 
| bufferMB | 4000 | Set the size of the memory buffer in MB used for I/O. Up to twice this amount can be needed, depending on the tile shapes. Setting this below 250 will make mssplit run very slowly, and setting it above the default has little benefit. | 
Example 1
The following example demonstrates splitting out a single spectral channel, with no averaging:
# Input measurement set
# Default: <no default>
vis         = full.ms
# Output measurement set
# Default: <no default>
outputvis   = chan1.ms
# The channel range to split out into its own measurement set
# Can be either a single integer (e.g. 1) or a range (e.g. 1-300). The range
# is inclusive of both the start and end, indexing is one-based.
# Default: <no default>
channel     = 1
# Defines the number of channels to average to form one output channel
# Default: 1
width       = 1
Example 2
The following example demonstrates both splitting and averaging. Here, the lowest numbered 54 channels are averaged together to form a single channel in the output measurement set.
# Input measurement set
# Default: <no default>
vis         = full-18_5kHz.ms
# Output measurement set
# Default: <no default>
outputvis   = averaged_1MHz_chan_1.ms
# The channel range to split out into its own measurement set
# Can be either a single integer (e.g. 1) or a range (e.g. 1-300). The range
# is inclusive of both the start and end, indexing is one-based.
# Default: <no default>
channel     = 1-54
# Defines the number of channels to average to form one output channel
# Default: 1
width       = 54
Example 3
Finally, the following example demonstrates averaging a single measurement set with 16416 spectral channels by a factor of 54, creating a single output measurement set; that is, 16416 x 18.5 kHz channels are averaged to 304 x 1 MHz channels.
# Input measurement set
# Default: <no default>
vis         = full-18_5kHz.ms
# Output measurement set
# Default: <no default>
outputvis   = averaged_1MHz.ms
# The channel range to split out into its own measurement set
# Can be either a single integer (e.g. 1) or a range (e.g. 1-300). The range
# is inclusive of both the start and end, indexing is one-based.
# Default: <no default>
channel     = 1-16416
# Defines the number of channels to average to form one output channel
# Default: 1
width       = 54
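The channel arithmetic behind this example can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, and rejecting a range that is not an exact multiple of `width` is an assumption made here, not a statement about how mssplit handles that case.

```python
def output_channel_count(first, last, width):
    """Number of output channels for a one-based inclusive range averaged by width."""
    n_input = last - first + 1
    if n_input % width != 0:
        # Assumption: only ranges that divide evenly by width are accepted.
        raise ValueError("channel count is not a multiple of width")
    return n_input // width


n_out = output_channel_count(1, 16416, 54)
out_width_khz = 54 * 18.5  # nominal 18.5 kHz input channels

print(n_out)          # 304
print(out_width_khz)  # 999.0, i.e. approximately 1 MHz
```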