How to run the ASKAP pipelines
==============================

Loading the pipeline module
---------------------------

The pipeline scripts are now accessed through a specific module on galaxy -
``askappipeline``. This is separate from the main ``askapsoft`` module, to
allow more flexibility in updating the pipeline scripts. To use, simply run::

  module load askappipeline

This provides access to the main pipeline executable *processASKAP.sh*.

Some parts of the pipeline make use of other modules, which are loaded at the
appropriate time. The beam footprint information is obtained by using the
*schedblock* tool in the askappy module, while beam locations are set using
*footprint* from the same module. For the case of either BETA data or
observations made with a non-standard footprint, the *footprint* tool will
not have the correct information, and the ACES tool *footprint.py* is used.
This is located in the ACES subversion repository, and is accessed either via
the **acesops** module, or (should ``USE_ACES_OPS=false``) your own location
defined by the $ACES environment variable.

Once loaded, the askappipeline module will set an environment variable
**$PIPELINEDIR**, pointing to the directory containing the scripts. It also
defines **$PIPELINE_VERSION** to be the version number of the currently-used
module.

Pipeline configuration
----------------------

The pipeline is configured with a range of input parameters, all of which
have some default value.
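For reference, these variables can be inspected directly once the module is
loaded. A minimal sketch (the exported path and version below are
placeholders standing in for what the module would set, so that the snippet
is self-contained):

```shell
#!/bin/bash
# Placeholder values standing in for what 'module load askappipeline'
# would export; the real values come from the module itself.
export PIPELINEDIR=/software/askappipeline/scripts   # placeholder path
export PIPELINE_VERSION=1.0.0                        # placeholder version

# Confirm where the scripts live and which version is in use
echo "Pipeline scripts directory: $PIPELINEDIR"
echo "Pipeline version: $PIPELINE_VERSION"
```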
The default is changed by one of the following methods:

* a configuration file via::

    processASKAP.sh -c myInputs.sh

* a template configuration plus the scheduling-block IDs::

    processASKAP.sh -s SB_SCIENCE -b SB_1934 -p SB_PB -t template.sh

* or a template configuration *and* a configuration file specific to the
  observation being processed::

    processASKAP.sh -t template.sh -c myInputs.sh

Parameters are decided by applying, in order, the pipeline defaults, the
template (if given), then the configuration file (so that parameters given in
the configuration file have precedence over the corresponding value in the
template). Scheduling block IDs given on the command line take precedence
over any given in the configuration (or template) files.

The template and configuration files are shell scripts that define
environment variables. A configuration file could look something like this::

  #!/bin/bash -l
  #
  # Example user input file for ASKAP processing.
  # Define variables here that will control the processing.
  # Do not put spaces either side of the equals signs!

  # control flags
  SUBMIT_JOBS=true
  DO_SELFCAL=false

  # scheduling blocks for calibrator & data
  SB_1934=507
  SB_SCIENCE=514

  # base names for MS and image data products
  MS_BASE_SCIENCE=B1740_10hr.ms
  IMAGE_BASE_CONT=i.b1740m517.cont

  # other imaging parameters
  NUM_PIXELS_CONT=4096
  NUM_TAYLOR_TERMS=2
  CORES_PER_NODE_CONT_IMAGING=15

This file should define enough environment variables for the scripts to run
successfully. Mandatory ones, if you are starting from scratch, are the
locations of either the SBs for the observations or the specific MSs.

It is possible for processASKAP.sh to read the template from the SB parset,
where it looks for the parameter ``common.cp.processing_template``. If this
option is used, the SBID command-line options *must* be given::

  processASKAP.sh -s SB_SCIENCE -b SB_1934

Giving the template on the command-line (with ``-t``) will override the
template provided in the SB parset.
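The layering of defaults, template, and configuration file described above
can be sketched with plain shell sourcing: each file defines environment
variables, and a later definition overrides an earlier one. The file names
and values here are illustrative only, not part of the real pipeline:

```shell
#!/bin/bash
# Sketch of the parameter layering: the files are sourced in order, so a
# later definition overrides an earlier one. All names and values are
# illustrative, not the real pipeline defaults.
workdir=$(mktemp -d)

# pipeline defaults
printf 'NUM_PIXELS_CONT=2048\nDO_SELFCAL=true\n' > "$workdir/defaults.sh"

# template overrides one default
printf 'NUM_PIXELS_CONT=4096\n' > "$workdir/template.sh"

# user configuration file overrides the template
printf 'NUM_PIXELS_CONT=3072\n' > "$workdir/myInputs.sh"

# apply in order: defaults, then template, then configuration file
source "$workdir/defaults.sh"
source "$workdir/template.sh"
source "$workdir/myInputs.sh"

echo "NUM_PIXELS_CONT=$NUM_PIXELS_CONT"   # the configuration-file value wins
echo "DO_SELFCAL=$DO_SELFCAL"             # an untouched default survives

rm -rf "$workdir"
```

This is why a parameter set in *myInputs.sh* always beats the same parameter
in *template.sh*, while any parameter left unset falls through to the
pipeline default.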
If the SB parset does not specify a template, a warning will be given and
processing will continue without it (using the defaults and any configuration
file given).

The only other parameter that can be passed through the command line is
``QUEUE``, which is the name of the slurm partition on which the bulk of the
jobs will be run. While the template or configuration file can specify a
value for ``QUEUE``, this is overridden by using the ``-q`` option. For
example::

  processASKAP.sh -s SB_SCIENCE -b SB_1934 -q work

This will send the jobs to the *work* partition instead of the default
(*askaprt*). The exceptions to this are jobs that require access to the
/askapbuffer filesystem - pipeline launch/relaunch jobs, initial raw data
access jobs, or submission to CASDA. These are sent to the queue indicated by
``QUEUE_OPS``, which defaults to *askaprt*, and shouldn't in general be
altered.

When run, the pipeline configuration parameters will be archived in the
*slurmOutputs* directory (see below). It will be called *pipelineConfig__