tClearMSCache (Benchmark and debug tool)

This page provides instructions for the tClearMSCache test program, which can be used to exercise MS reading in a pattern similar to that used by the imager. Originally written to debug cache cleanup and to investigate the amount of memory used for caching at various stages of the lifecycle of the table data source (the object used to access the measurement set), this tool can also be used for various performance benchmarks. To ensure the data and flag columns are read, the test computes the sum of unflagged visibilities (and the mean, if the number of unflagged visibilities is non-zero) across the whole or the selected portion of the measurement set, and prints the result in the log.
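The reduction described above can be sketched as follows. This is only an illustration in Python, not the tool's own code: the function name `unflagged_sum_and_mean` and the plain-list inputs are assumptions made for the example.

```python
from typing import Iterable, Optional, Tuple


def unflagged_sum_and_mean(
    visibilities: Iterable[complex], flags: Iterable[bool]
) -> Tuple[complex, Optional[complex], int]:
    """Sum the visibilities whose flag is False.

    Returns (sum, mean, count). The mean is None when every sample is
    flagged, mirroring the rule that the mean is only reported if the
    number of unflagged visibilities is non-zero.
    """
    total = 0j
    count = 0
    for vis, flagged in zip(visibilities, flags):
        if not flagged:
            total += vis
            count += 1
    mean = total / count if count > 0 else None
    return total, mean, count


if __name__ == "__main__":
    # Hypothetical samples: the second visibility is flagged and is skipped.
    print(unflagged_sum_and_mean([1 + 1j, 2 + 0j, 3 - 1j], [False, True, False]))
```

Because every data and flag value contributes to the result, the numbers can only be produced by actually reading both columns.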

Running the program

It can be run with the following command, where “config.in” is a file containing the configuration parameters described in the next section.

$ tClearMSCache -c config.in

The tClearMSCache program supports parallel execution if called with the appropriate MPI wrapper. Note that in this case the master rank (i.e. rank 0) doesn’t do any work and just waits for the other ranks to complete. This is done to mimic the behaviour of the imager application, where only worker ranks read data, and to simplify the transfer of configurations.
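Since rank 0 stays idle, a parallel run needs one more MPI rank than the number of workers that should read data. The small helper below illustrates this bookkeeping. It is a sketch under assumptions: the helper name `build_mpi_command` is hypothetical, and `mpirun -np` stands in for whatever MPI wrapper is appropriate on the system in question (it may well be a different launcher).

```python
from typing import List


def build_mpi_command(
    n_workers: int, parset: str = "config.in", launcher: str = "mpirun"
) -> List[str]:
    """Build an argument list that starts n_workers reading ranks.

    One extra rank is requested because the master (rank 0) only waits
    for the workers. The launcher and its "-np" option are assumptions
    about the local MPI installation.
    """
    if n_workers < 1:
        raise ValueError("at least one worker is required")
    n_ranks = n_workers + 1  # workers plus the idle master rank
    return [launcher, "-np", str(n_ranks), "tClearMSCache", "-c", parset]


if __name__ == "__main__":
    # Four workers need five ranks in total.
    print(" ".join(build_mpi_command(4)))
```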

Configuration Parameters

The following table contains the configuration parameters to be specified in the config.in file shown on the command line above. Note that, unlike the actual YandaSoft applications, this test tool doesn’t require any prefix for parameters (like Cimager for the imager application). A number of other parameters that narrow down the data selection are also understood. They are given in a separate table (see Data Selection) and do not require any prefix for this particular test tool. Note that some parameters have little, if any, effect (so the defaults can be used). They are listed for consistency (and are recognised only because the test uses the same code as the imager). All parameters, including the selection parameters described above, recognise substitution expressions (like %w), which can be used to set up a case analogous to spectral line imaging, where different channels of the same measurement set are read by different workers.

| Parameter | Type | Default | Description |
|---|---|---|---|
| dataset | string or vector of strings | None | This is the only parameter which has no default and must be given in the parset. It provides one or more measurement sets to read. How the reading job is distributed among worker ranks in the parallel case depends on the other parameters described below. |
| nworkergroups | integer | 1 | This mimics the same keyword in the imager. If there is more than one group of workers, the tool ensures that each worker in the group reads the same measurement set, provided more than one is given. |
| distributeddatasets | boolean | true | If this parameter is true and more than one dataset is given in the dataset parameter, the reading job is distributed between workers (if there are multiple groups of workers, matching workers belonging to different groups will get the same dataset). If there are more workers available than individual datasets, the same dataset will be assigned to more than one worker (so this option has no effect if just one dataset is given). If this option is false, each worker will read the whole list in sequence (mimicking the reading pattern of the joint deconvolution case). |
| dsmreset | boolean | true | This option enables explicit disposal of the data source manager object once reading has finished. Otherwise, this happens only when destructors are called at the end. |
| cleardsfirst | boolean | false | This option enables explicit disposal of the data source object before the whole data source manager is deallocated (either explicitly or via destructors). Essentially, this should close the measurement set and dispose of all the associated structures (subject to the casacore library doing the right thing). |
| clearcache | boolean | true | If true, the data source manager will attempt explicit cleanup of the storage manager caches at the end (when it is reset explicitly, see dsmreset, or goes out of scope). Note that a different default value is used compared to the imager. |
| datacolumn | string | “DATA” | Name of the data column to read. Passed to the data source object as is. |
| nUVWMachines | integer | 1 | Number of uvw machines in the cache. Passed to the created data source object as is, but should have little or no effect on the test because the UVW column is not read. |
| uvwMachineDirTolerance | string | “1e-6rad” | Direction tolerance for the uvw machine cache. Passed to the created data source object as is, but should have little or no effect on this test because it doesn’t use UVW. |
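The distribution rules described for nworkergroups and distributeddatasets can be pictured with the following sketch. It is not taken from the tool. The function name `datasets_for_worker`, the 0-based worker numbering (master rank excluded), the layout of groups as contiguous blocks of workers, and the wrap-around when workers outnumber datasets are all assumptions chosen to match the description, and the real assignment order may differ.

```python
from typing import List, Sequence


def datasets_for_worker(
    worker: int,
    n_workers: int,
    datasets: Sequence[str],
    nworkergroups: int = 1,
    distributeddatasets: bool = True,
) -> List[str]:
    """Return the measurement sets a given worker would read.

    worker is a 0-based index among the workers only. Groups are assumed
    to be contiguous blocks of equal size, so "matching" workers are the
    ones with the same position inside their group.
    """
    if not datasets:
        raise ValueError("the dataset parameter must not be empty")
    if nworkergroups < 1 or n_workers % nworkergroups != 0:
        raise ValueError("workers must split evenly into groups")
    if not 0 <= worker < n_workers:
        raise ValueError("worker index out of range")
    if not distributeddatasets:
        # Every worker reads the whole list in sequence.
        return list(datasets)
    workers_per_group = n_workers // nworkergroups
    position_in_group = worker % workers_per_group
    # Wrap around when there are more workers than datasets.
    return [datasets[position_in_group % len(datasets)]]


if __name__ == "__main__":
    names = ["a.ms", "b.ms"]  # hypothetical measurement set names
    for w in range(4):
        print(w, datasets_for_worker(w, 4, names, nworkergroups=2))
```

With two groups of two workers, workers 0 and 2 are matching workers and receive the same measurement set under these assumptions, as do workers 1 and 3.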

Example

Example tClearMSCache parset

dataset=spectral.ms
cleardsfirst=true
Channels=[1,%w]
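To see what the %w expression in this parset does in a parallel run, the sketch below expands it for a few ranks. It assumes that %w stands for the 0-based worker number, i.e. the MPI rank minus one (consistent with rank 0 being the idle master); this should be checked against the YandaSoft documentation of substitution rules. The function name `expand_for_rank` is hypothetical.

```python
from typing import List

EXAMPLE_PARSET = [
    "dataset=spectral.ms",
    "cleardsfirst=true",
    "Channels=[1,%w]",
]


def expand_for_rank(line: str, rank: int) -> str:
    """Replace %w in a parset line with the worker number for this rank.

    Assumption: the worker number is rank - 1, because rank 0 is the
    master and does no reading.
    """
    if rank < 1:
        raise ValueError("rank 0 is the master and has no worker number")
    return line.replace("%w", str(rank - 1))


def expand_parset(lines: List[str], rank: int) -> List[str]:
    """Apply the substitution to every line of a parset."""
    return [expand_for_rank(line, rank) for line in lines]


if __name__ == "__main__":
    for rank in (1, 2, 3):
        print(rank, expand_parset(EXAMPLE_PARSET, rank)[-1])
```

Under this assumption each worker ends up with a different Channels value, so every worker reads a different part of the same measurement set, which is the spectral-line-like access pattern the example is meant to reproduce.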