There was a major break in the control file formats between DiFX1.5 and DiFX2.0. If you want the formats for DiFX1.5, please click through to difx1.5-files. This page has the file formats for DiFX2.0. You shouldn't really need to look at these at all if you are using vex2difx as recommended. If you are building a 3rd party application, consider using the difxio library (see SVN for details) - it parses all this info into C structures, saving you the hassle.
There are 5 main ASCII files you need to run a correlation (plus extras if you're doing pulsar binning or phased array mode) - only one of them is really very complex. I'll start with the easiest and work my way up. Remember, check out the examples for more info, or these examples for some much more complicated setups including pulsar binning. Whenever a keyword/value pair is referred to, the value begins at the 21st character (or after the : separator if the keyword is longer than 20 characters). Also, sorry the tabs don't come out properly in the example .im file snippets on this page.
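The column rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated on this page, not code taken from DiFX or difxio; the function name parse_difx_line is made up for the example, and comment lines (starting with @) and table markers (starting with #) are assumed to have been filtered out beforehand.

```python
def parse_difx_line(line):
    """Split one 'KEYWORD: value' line into (keyword, value).

    Hypothetical helper: follows the rule described on this page, where the
    value starts at the 21st character, or directly after the ':' when the
    keyword is too long for that.
    """
    line = line.rstrip("\n")
    colon = line.find(":")
    if colon < 0:
        raise ValueError("no ':' separator in line: %r" % line)
    keyword = line[:colon]
    # Index 20 is the 21st character; a long keyword pushes the start
    # of the value to just after its ':' separator.
    start = max(20, colon + 1)
    return keyword, line[start:].strip()


print(parse_difx_line("JOB ID:".ljust(20) + "20"))
print(parse_difx_line("TELESCOPE 0 OFFSET (m):0.000000"))
```

Note that the first colon is taken as the separator, so values that themselves contain a colon are left intact.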
The machines file is simple - 1 line per node of the correlation. If you request more nodes than this file has lines, MPI will wrap back to the start - not efficient. An example for a 10 node correlation on the Swinburne cluster might be:
tera01
tera02
tera03
tera04
tera05
tera06
tera07
tera08
tera09
tera10
MPI processes 2 to N+1 (where N is the number of telescopes) will be datastream processes. Process 1 is the fxmanager, and any remaining processes are cores.
Say you have host1 and host2. host1 is “head” and all the data is stored there and you have 4 telescopes. You want to run a core process on both nodes.
Then the number of mpi processes will be 7 (1+4+2 == fxmanager + 4xdatastream + 2xcore).
Use the machines file to get the processes in the right place. This depends a little on your implementation of MPI. From memory, the following should work for my example above. Don't include the comments!
host1 # Fxmanager
host1 # datastream 1
host1 # datastream 2
host1 # datastream 3
host1 # datastream 4
host1 # core
host2 # core
Remember that “espresso” and such tools handle this for you.
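If you do want to build a machines file by hand, the ordering described above (manager first, then one line per datastream, then the cores) can be sketched as follows. This is a minimal illustration only; the function name machines_lines is invented here, and as noted above the exact behaviour depends on your MPI implementation.

```python
def machines_lines(manager_host, datastream_hosts, core_hosts):
    """Return machines-file lines in process order.

    Hypothetical helper: one line for the fxmanager, then one per
    datastream, then one per core, matching the example on this page.
    """
    return [manager_host] + list(datastream_hosts) + list(core_hosts)


# The host1/host2 example: 4 telescopes read on host1, a core on each host.
lines = machines_lines("host1", ["host1"] * 4, ["host1", "host2"])
num_processes = len(lines)  # 1 manager + 4 datastreams + 2 cores
print(num_processes)
print("\n".join(lines))
```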
Just as simple as the machines file, the threads file details how many threads you want for each node that will be a Core. So if there are 10 nodes, and this is a 3 station experiment, there will be 10 - 3 (datastreams) - 1 (manager) = 6 Core nodes. That means your threads file should have at least 6 thread entries. It starts with one line telling how many Cores there can be, and then has Ncores lines with just a number per line. That number is how many threads for that node. So it looks something like:
NUMBER OF CORES:    6
2
2
2
2
2
2
In this example, tera05, tera06,…tera10 have 2 threads each. This is sensible when you have a dual-core machine, or one with hyperthreading. Being able to specify the threads on a per-node basis lets you squeeze the best performance out of a heterogeneous cluster.
If writing this file by hand be careful that the value for number of cores *must* start in column 21 (the same as most of the values in the other input files). Getting this wrong will cause all core processes to run with a single thread, and could even cause mpifxcorr to crash.
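To avoid the column 21 mistake, the threads file can be generated rather than typed. The following is a minimal sketch, assuming only the layout described above (a header line followed by one thread count per Core); threads_file_text is a made-up name, not a DiFX tool.

```python
def threads_file_text(threads_per_core):
    """Return the text of a threads file for the given per-Core thread counts.

    Hypothetical helper: the header keyword is padded to 20 characters so
    that the number of cores starts in column 21.
    """
    header = "NUMBER OF CORES:".ljust(20) + str(len(threads_per_core))
    body = [str(n) for n in threads_per_core]
    return "\n".join([header] + body) + "\n"


# Six Core nodes with two threads each, as in the example above.
text = threads_file_text([2] * 6)
print(text, end="")
```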
The .calc file stores all the information on antenna locations, source locations, and scans (start times, durations etc). It also has Earth Orientation Parameters necessary to run CALC. The calc file is supplied to calcif2, which produces the .im file described below. An example .calc file follows (“…” shows lines in a table following a pattern which is hopefully already clear):
JOB ID:             20
JOB START TIME:     53440.9226852
JOB STOP TIME:      53440.9228009
DUTY CYCLE:         1.000
OBSCODE:            V177A
DIFX VERSION:       DiFX-2.0
SUBJOB ID:          0
SUBARRAY ID:        0
START MJD:          53440.9226852
START YEAR:         2005
START MONTH:        3
START DAY:          11
START HOUR:         22
START MINUTE:       8
START SECOND:       40
SPECTRAL AVG:       1
TAPER FUNCTION:     UNIFORM
NUM TELESCOPES:     4
TELESCOPE 0 NAME:   AT
TELESCOPE 0 MOUNT:  azel
TELESCOPE 0 OFFSET (m):0.000000
TELESCOPE 0 X (m):  -4751685.988000
TELESCOPE 0 Y (m):  2791621.223000
TELESCOPE 0 Z (m):  -3200491.700000
TELESCOPE 0 SHELF:  NONE
TELESCOPE 1 NAME:   HO
...
TELESCOPE 3 SHELF:  NONE
NUM SOURCES:        1
SOURCE 0 NAME:      1600-445
SOURCE 0 RA:        4.2084997996221
SOURCE 0 DEC:       -0.7800257309396
SOURCE 0 CALCODE:
SOURCE 0 QUAL:      0
NUM SCANS:          1
SCAN 0 IDENTIFIER:  No0024
SCAN 0 START (S):   0
SCAN 0 DUR (S):     10
SCAN 0 OBS MODE NAME:Doppler@G329+0.6
SCAN 0 UVSHIFT INTERVAL (NS):2000000000
SCAN 0 AC AVG INTERVAL (NS):2000000000
SCAN 0 POINTING SRC:0
SCAN 0 NUM PHS CTRS:1
SCAN 0 PHS CTR 0:   0
NUM EOPS:           5
EOP 0 TIME (mjd):   53438
EOP 0 TAI_UTC (sec):33
EOP 0 UT1_UTC (sec):-0.443500
EOP 0 XPOLE (arcsec):0.006740
EOP 0 YPOLE (arcsec):0.218860
EOP 1 TIME (mjd):   53439
...
EOP 4 YPOLE (arcsec):0.219710
NUM SPACECRAFT:     0
IM FILENAME:        /users/adeller/testing/2.0/noshift/lba_20.im
DiFX1.5 specified the geometric model using two files - the .delay and .uvw files. These held sampled values of the delay and uvw values (from CALC) on a regular grid - typically one sample per second, for every antenna. The .im file supersedes the .delay and .uvw files, and rather than sampling every second it stores a polynomial less frequently, typically once every two minutes.
An example .im file follows (same rule for “…”):
CALC SERVER:        swc000
CALC PROGRAM:       536871744
CALC VERSION:       1
START YEAR:         2005
START MONTH:        3
START DAY:          11
START HOUR:         22
START MINUTE:       8
START SECOND:       40
POLYNOMIAL ORDER:   5
INTERVAL (SECS):    120
ABERRATION CORR:    EXACT
NUM TELESCOPES:     4
TELESCOPE 0 NAME:   AT
TELESCOPE 1 NAME:   HO
TELESCOPE 2 NAME:   MP
TELESCOPE 3 NAME:   PA
NUM SCANS:          1
SCAN 0 POINTING SRC:1600-445
SCAN 0 NUM PHS CTRS:1
SCAN 0 PHS CTR 0 SRC:1600-445
SCAN 0 NUM POLY:    2
SCAN 0 POLY 0 MJD:  53440
SCAN 0 POLY 0 SEC:  79680
SRC 0 ANT 0 DELAY (us): 1.590699551807348e+04 -7.299033971692804e-01 -2.232549659111185e-05 6.468692358170586e-10 9.946319030146373e-15 -3.769275660019511e-19
SRC 0 ANT 0 DRY (us): 9.974292374186537e-03 4.541839076455105e-07 3.452769928524840e-11 1.795348217396608e-15 1.046710943091233e-19 5.943899753099940e-24
SRC 0 ANT 0 WET (us): 4.656132867051395e-04 2.125435426767988e-08 1.619354861844071e-12 8.461244831771577e-17 4.959121378098075e-21 2.839532897018426e-25
SRC 0 ANT 0 U (m): -4.222122069441900e+06 -2.581071707998739e+02 1.122949738068942e-02 2.223645238970012e-07 5.617088869943636e-11 -2.068303157154808e-13
SRC 0 ANT 0 V (m): -2.141630031993518e+05 2.167117333885627e+02 6.616688709965712e-03 -1.854345842529444e-07 -6.531226953877621e-11 2.068221832176051e-13
SRC 0 ANT 0 W (m): -4.768800415563213e+06 2.188193910068975e+02 6.693004677799763e-03 -1.939274091983166e-07 -2.979016210693149e-12 1.042909093213190e-16
SRC 0 ANT 1 DELAY (us): 1.756912593080410e+04 -6.010048711065136e-01 -1.981287933637709e-05 5.326361025763266e-10 8.818734493518715e-15 -2.969732343177518e-19
...
SRC 0 ANT 3 W (m): -4.941474293258900e+06 2.085116694419438e+02 6.677883811385578e-03 -1.847904513851888e-07 -2.988139228644093e-12 1.530249792227679e-16
SRC 1 ANT 0 DELAY (us): 1.590699551807348e+04 -7.299033971692804e-01 -2.232549659111185e-05 6.468692358170586e-10 9.946319030146373e-15 -3.769275660019511e-19
SRC 1 ANT 0 DRY (us): 9.974292374186537e-03 4.541839076455105e-07 3.452769928524840e-11 1.795348217396608e-15 1.046710943091233e-19 5.943899753099940e-24
SRC 1 ANT 0 WET (us): 4.656132867051395e-04 2.125435426767988e-08 1.619354861844071e-12 8.461244831771577e-17 4.959121378098075e-21 2.839532897018426e-25
SRC 1 ANT 0 U (m): -4.222122069441900e+06 -2.581071707998739e+02 1.122949738068942e-02 2.223645238970012e-07 5.617088869943636e-11 -2.068303157154808e-13
SRC 1 ANT 0 V (m): -2.141630031993518e+05 2.167117333885627e+02 6.616688709965712e-03 -1.854345842529444e-07 -6.531226953877621e-11 2.068221832176051e-13
SRC 1 ANT 0 W (m): -4.768800415563213e+06 2.188193910068975e+02 6.693004677799763e-03 -1.939274091983166e-07 -2.979016210693149e-12 1.042909093213190e-16
SRC 1 ANT 1 DELAY (us): 1.756912593080410e+04 -6.010048711065136e-01 -1.981287933637709e-05 5.326361025763266e-10 8.818734493518715e-15 -2.969732343177518e-19
...
SRC 1 ANT 3 V (m): -2.846209623790657e+04 2.065102986336044e+02 6.602685843807587e-03 -1.893219703175936e-07 4.310608957123050e-11 -1.197391361113666e-13
SRC 1 ANT 3 W (m): -4.941474293258900e+06 2.085116694419438e+02 6.677883811385578e-03 -1.847904513851888e-07 -2.988139228644093e-12 1.530249792227679e-16
SCAN 0 POLY 1 MJD:  53440
SCAN 0 POLY 1 SEC:  79800
SRC 0 ANT 0 DELAY (us): 1.581908674310539e+04 -7.352335032343110e-01 -2.209177040755237e-05 6.515979361014590e-10 9.781384664758747e-15 -1.313535485117763e-19
...
Note that “SRC 0” is always the pointing centre source, and SRC 1 through to SRC N are the N phase centres. The rows contain the polynomial coefficients, running left to right as you'd expect.
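As a rough illustration of how a row of coefficients turns into a model value, here is a sketch that evaluates one polynomial. It assumes the coefficients are in ascending power order (constant term first) and that the time argument is seconds since the POLY MJD/SEC of that block. The example data support this reading: evaluating the POLY 0 delay for ANT 0 at 120 seconds lands on the first coefficient of the POLY 1 block. The function name eval_poly is made up for the example.

```python
def eval_poly(coeffs, dt):
    """Evaluate c0 + c1*dt + c2*dt**2 + ... using Horner's rule.

    Assumption: coeffs are in ascending power order and dt is in seconds
    since the start time of the polynomial block.
    """
    result = 0.0
    for c in reversed(coeffs):
        result = result * dt + c
    return result


# SRC 0 ANT 0 DELAY (us) from the POLY 0 block, which starts at second 79680.
delay_poly0 = [
    1.590699551807348e+04, -7.299033971692804e-01, -2.232549659111185e-05,
    6.468692358170586e-10, 9.946319030146373e-15, -3.769275660019511e-19,
]

# Evaluate at the start of the POLY 1 block (second 79800).
delay_at_79800 = eval_poly(delay_poly0, 79800 - 79680)

# First coefficient of the same row in the POLY 1 block, for comparison.
poly1_constant_term = 1.581908674310539e+04
print(delay_at_79800, poly1_constant_term)
```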
The .input file is of necessity fairly complex, and fairly long, although typically a lot of it is just repetition for different baselines and telescopes, and easy to generate automatically from the vex file (using vex2difx). It is divided into a series of tables, which I will go through in turn. Lines beginning with # denote the start of a table, and lines beginning with @ are comments which are ignored.
The common settings table contains general information such as time range, and the paths to the other ASCII files described above. The necessary keywords are shown below, with notes if the meaning is not obvious:
CALC FILENAME:      /users/adeller/testing/2.0/noshift/lba_20.calc {The path to the calc file}
CORE CONF FILENAME: /users/adeller/testing/2.0/noshift/lba_20.threads {The path to the threads file}
EXECUTE TIME (SEC): 10
START MJD:          53440
START SECONDS:      79720
ACTIVE DATASTREAMS: 4
ACTIVE BASELINES:   6
VIS BUFFER LENGTH:  32 {A buffer length at the FxManager}
OUTPUT FORMAT:      SWIN {Must be SWIN to use difx2fits; ASCII is available for debugging}
OUTPUT FILENAME:    /users/adeller/testing/2.0/noshift/lba_20.difx {The directory where output files will be written}
The configuration table contains info on correlator setup - integration times, message sizes etc. This is placed in a separate table to the common settings so that you can have different setups for different sources - i.e. high frequency resolution for a target maser and low frequency resolution for your continuum phase reference source. It also allows you to turn pulsar binning on for specific sources.
The first line just informs us how many configs will follow:
NUM CONFIGURATIONS: 1
Then we get info for each of these configurations:
CONFIG NAME:        Doppler@G329+0.6_default
INT TIME (SEC):     1.000000
SUBINT NANOSECONDS: 80000000 {Determines the message sizes from Datastream to Core}
GUARD NANOSECONDS:  2000
FRINGE ROTN ORDER:  1 {Can be 0 for post-F, 1 for linear, or 2 for piecewise linear quadratic approximation}
ARRAY STRIDE LENGTH:16 {Used for optimised trigonometry calculations}
XMAC STRIDE LENGTH: 128 {Used to ensure output results can stay in cache}
NUM BUFFERED FFTS:  1 {Number of FFTs to compute per station before XMAC'ing. Also a cache optimisation thing}
WRITE AUTOCORRS:    TRUE
PULSAR BINNING:     FALSE
PHASED ARRAY:       FALSE
DATASTREAM 0 INDEX: 0
DATASTREAM 1 INDEX: 1
DATASTREAM 2 INDEX: 2
DATASTREAM 3 INDEX: 3
BASELINE 0 INDEX:   0
BASELINE 1 INDEX:   1
BASELINE 2 INDEX:   2
BASELINE 3 INDEX:   3
BASELINE 4 INDEX:   4
BASELINE 5 INDEX:   5
If PULSAR BINNING is TRUE, an extra line is inserted immediately below the PULSAR BINNING line as shown below:
PULSAR CONFIG FILE: /nfs/cluster/ska/adeller/v190/v190f/pulseprofiles/2144-3933/2144-3933.gate.binconfig
The format of the pulsar config file is described below.
The rule table describes which configuration will be applied at any given time. Usually this filters on scan attributes such as source, but it can also be done in a time-based manner (start and stop times). Any time for which no configuration matches will not be correlated. If more than one rule matches a given time, they must all refer to the same configuration.
NUM RULES:          1
RULE 0 CONFIG NAME: Doppler@G329+0.6_default
This example just applies the one configuration to all time - a pretty common occurrence.
The freq table lists all the frequencies used in the experiment. Like most of these tables, it starts with one line listing the number of entries, and then has seven lines per entry: band edge frequency, bandwidth, upper or lower sideband, number of channels, how many of these should be averaged together after correlation, and the oversample and decimation factors. The frequencies are specified in MHz, and U or L is used to indicate upper/lower sideband respectively. A sample freq table is shown below:
FREQ ENTRIES:       4
FREQ (MHZ) 0:       1634.0
BW (MHZ) 0:         16.0
SIDEBAND 0:         L
NUM CHANNELS 0:     128
CHANS TO AVG 0:     8
OVERSAMPLE FAC. 0:  1
DECIMATION FAC. 0:  1
FREQ (MHZ) 1:       1634.0
BW (MHZ) 1:         16.0
SIDEBAND 1:         U
..
DECIMATION FAC. 3:  1
All future tables refer to the freq table when specifying frequency bands.
The telescope table contains a listing of the stations used in the experiment. The names used must be a subset of those in the delay and uvw files - the correlator will die gracefully if it cannot find one of the stations in this table somewhere in the delay and uvw files. Each station has a clock offset (microseconds) and a clock rate (microseconds per second). These are in the same sense as the geometric delay ie a positive clock offset is a *delay*. Thus, if you are looking at the delay quantity of an SN table in AIPS, the corrections you make to these numbers are in the same sense as those you see on the TV. An example telescope table is shown below.
# TELESCOPE TABLE ##!
TELESCOPE ENTRIES:  5
TELESCOPE NAME 0:   AT
@ ***** Clock poly coeff N: has units microsec / sec^N ***** @
CLOCK COEFF 0/0:    -5.504000000000000e01
CLOCK COEFF 0/1:    -1.879490500000000e-08
TELESCOPE NAME 1:   HO
@ ***** Clock poly coeff N: has units microsec / sec^N ***** @
CLOCK COEFF 1/0:    -1.012400000000000e01
CLOCK COEFF 1/1:    6.9400000000000000e-08
...
Entries in the telescope table are referred to by the Datastream table entries. Thus, more than one Datastream can reference a single Telescope. This is arranged in this fashion so you don't need to specify the station clocks over and over again, when you have a few different band setups throughout the experiment (ie wideband phase reference, narrowband target etc). It is also useful if one station has recorded separate streams of data - this happens at the LBA in 1 Gbps mode, where the data is recorded in two separate 512 Mbps files. In this situation, you really have two “Datastreams” coming from one “Telescope”.
The table starts with the usual number of entries, and then two lines which affect all Datastreams. These are the factors affecting the size and breakup of the memory buffer. The size of the buffer is given in terms of a multiplier for the message size (which is itself a number of FFT chunks - see the Config table). The memory buffer is then divided into a number of segments - this must be even and must be at least 4.
# DATASTREAM TABLE #!
DATASTREAM ENTRIES: 5
DATA BUFFER FACTOR: 32
NUM DATA SEGMENTS:  8
The table entries are necessarily complex, as they completely describe the band setup for each datastream. This comprises the format and precision of the recording, the a priori system temperature, the data source (network or disk), whether to use a filterbank instead of an FFT, the number of frequencies, small delay offsets for each frequency, the number of polarisations recorded in each frequency and finally the order of each of the bands within the file.
The introductory stuff (format, tsys etc) goes at the top as shown:
TELESCOPE INDEX:    0
TSYS:               0.000000
DATA FORMAT:        LBASTD
QUANTISATION BITS:  2
DATA FRAME SIZE:    40004096
DATA SOURCE:        FILE
FILTERBANK USED:    FALSE
PHASE CAL INT (MHZ):1
NUM RECORDED FREQS: 4
REC FREQ INDEX 0:   0
CLK OFFSET 0 (us):  0.000000
FREQ OFFSET 0 (Hz): 0.000000
NUM REC POLS 0:     2
REC FREQ INDEX 1:   1
...
NUM REC POLS 3:     2
REC BAND 0 POL:     R
REC BAND 0 INDEX:   0
REC BAND 1 POL:     L
REC BAND 1 INDEX:   0
...
REC BAND 7 POL:     L
REC BAND 7 INDEX:   3
NUM ZOOM FREQS:     0
If the TSYS value is > 0.0, the correlator will scale the data online to try and produce estimated visibilities in janskys. If it is <= 0.0, the correlator will produce normalised correlation coefficients instead. The latter is the default and is recommended for use with difx2fits.
Choices for the DATA FORMAT include LBASTD (2 bit mag sign encoding), LBAVSOP (2 bit offset binary encoding), MKIV, VLBA, and NZ (8 bit linear). The data source can be FILE, MODULE or NETWORK (referring to linux files, Mk5 modules, or a network socket - usually for eVLBI). For each recorded frequency, one can apply a small extra instrumental delay if needed, and a frequency offset if the LO was not set correctly, and finally specify how many polarisations were recorded. After describing each of the frequencies, information follows for each subband (a frequency/polarisation combination) which lets you describe how the subbands are ordered. In the example above there were 4 frequencies each with 2 polarisations, so there are 8 subbands in total.
After the recorded frequencies, it is possible to describe “zoom” frequencies and bands which allow the selection of a subset of the spectral channels produced from a recorded band to be correlated. These frequencies must already be described in the Freq table, and must lie wholly within a recorded band and have the same channelisation. If NUM ZOOM FREQS is set to greater than 0, then the zoom band description follows in exactly the same manner as for the recorded bands, e.g. assuming some appropriate entry in the Freq table in position 4:
NUM ZOOM FREQS:     1
ZOOM FREQ INDEX 0:  4
NUM ZOOM POLS 0:    2
ZOOM BAND 0 POL:    R
ZOOM BAND 0 INDEX:  0
ZOOM BAND 1 POL:    L
ZOOM BAND 1 INDEX:  0
The baseline table starts with the usual “number of entries” line.
# BASELINE TABLE ###!
BASELINE ENTRIES:   10
Each entry then consists of two Datastreams (references to the Datastream table), the number of frequencies, and the number of polarisation products per frequency, as shown below:
D/STREAM A INDEX 0: 0
D/STREAM B INDEX 0: 1
NUM FREQS 0:        4
TARGET FREQ 0/0:    0
POL PRODUCTS 0/0:   2
D/STREAM A BAND 0:  0
D/STREAM B BAND 0:  0
D/STREAM A BAND 1:  1
D/STREAM B BAND 1:  1
TARGET FREQ 0/1:    1
POL PRODUCTS 0/1:   2
D/STREAM A BAND 0:  2
D/STREAM B BAND 0:  2
D/STREAM A BAND 1:  3
D/STREAM B BAND 1:  3
TARGET FREQ 0/2:    2
POL PRODUCTS 0/2:   2
D/STREAM A BAND 0:  4
D/STREAM B BAND 0:  4
D/STREAM A BAND 1:  5
D/STREAM B BAND 1:  5
TARGET FREQ 0/3:    3
POL PRODUCTS 0/3:   2
D/STREAM A BAND 0:  6
D/STREAM B BAND 0:  6
D/STREAM A BAND 1:  7
D/STREAM B BAND 1:  7
If we look up the Datastream table, we see that Datastreams 0 and 1 reference telescopes 0 and 1, which are AT and HO respectively. Each of these Datastreams has 4 frequencies, so it is unsurprising that we are choosing to correlate all 4. Each frequency here has two polarisation products, and if we again follow the references back through the Datastream table, we see that in each case the products correspond to RR and LL. E.g. for the first frequency, band 0 of AT is 1634 LSB, polarisation R, and band 0 of HO is 1634 LSB, polarisation R, so this product is 1634 RR. Band 1 of AT is 1634 LSB, polarisation L, and band 1 of HO is 1634 LSB, polarisation L, so this product is 1634 LL. vex2difx sets all this up for you, naturally.
The optional TARGET FREQ entries refer to the freq table. These entries indicate the frequency id under which DiFX should write out the correlated band pairs. Above, the first polarization set consists of the 16 MHz wide recorded band 0 at station A (1634 LSB R) correlated against recorded band 0 at station B (1634 LSB R), and likewise band 1 against band 1 (1634 LSB L). The TARGET FREQ of 0 indicates that resulting visibility data records should be written out, unsurprisingly, under frequency id 0 (1634 LSB). If however the freq id were to refer to a wider band, say id 8 at 1602 USB and 48 MHz wide, mpifxcorr would store the band 0 vs band 0 product into the 1618-1634 region of that freq id - and the correlation setup would have other band products that contribute to the remaining regions of that freq id, filling in the rest, for full (and concatenated) spectral coverage.
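The frequency arithmetic in that example can be checked with a short sketch. It assumes only what the freq table description says (the listed frequency is the band edge, with upper sideband extending upwards and lower sideband downwards); the helper names band_edges and offset_within are invented for illustration and are not part of mpifxcorr.

```python
def band_edges(freq_mhz, bw_mhz, sideband):
    """Return (low, high) sky-frequency edges in MHz for one freq table entry."""
    if sideband == "U":
        return freq_mhz, freq_mhz + bw_mhz
    if sideband == "L":
        return freq_mhz - bw_mhz, freq_mhz
    raise ValueError("sideband must be 'U' or 'L', got %r" % sideband)


def offset_within(target, band):
    """Return the offset (MHz) of band's lower edge above target's lower edge.

    Returns None if band does not lie wholly inside target.
    """
    if band[0] < target[0] or band[1] > target[1]:
        return None
    return band[0] - target[0]


recorded = band_edges(1634.0, 16.0, "L")   # freq id 0 in the sample freq table
wide = band_edges(1602.0, 48.0, "U")       # the hypothetical freq id 8 above
print(recorded, wide, offset_within(wide, recorded))
```

Here the recorded band spans 1618-1634 MHz and sits 16 MHz above the lower edge of the 1602-1650 MHz target band, which is the region the text refers to.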
This table must be included if one or more datastreams read from a file. It is implicitly the same length as the datastream table (there is no “number of entries” line). Each datastream has one line to say the number of files N, and then N lines with filenames:
# DATA TABLE #######!
D/STREAM 0 FILES:   8639
FILE 0/0:           /nfs/cluster/raid9/v190f/v190f-At_027_020000.lba
FILE 0/1:           /nfs/cluster/raid9/v190f/v190f-At_027_020010.lba
...
When reading from a Mk5 module, there will only be one “file” per datastream, and that will be the module name.
This table must be included if one or more datastreams read from a network connection (DATA SOURCE: NETWORK). It is implicitly the same length as the datastream table (there is no “number of entries” line). Each datastream has two lines - a port number and a TCP window size in kB. Negative values mean use UDP rather than TCP as the transport protocol.
# NETWORK TABLE ####!
PORT NUM 0:         10001
TCP WINDOW SIZE 0:  250
PORT NUM 1:         10002
TCP WINDOW SIZE 1:  250
...
Probably best to contact me if you have interest in trying out the network-fed correlator, as you'll need to set up the sending side of things as well which isn't covered here.
The pulsar config filename is specified in the input file if PULSAR BINNING is TRUE. If required, it is put on the following line as shown:
PULSAR BINNING:     TRUE
PULSAR CONFIG FILE: /home/difx/projects/tc016a/0834+2200.gate.binconfig
The format is pretty simple - it gives links to the polyco file(s) containing pulse prediction information (see the program TEMPO for a description of the polyco file format), and specifies where the bin end-points are set. It also gives the option to “scrunch” the binned data. If SCRUNCH is true, each bin is scaled by its corresponding weight and the bins are summed before writing to disk: thus only one “bin” is recorded per time integration. This can be used to implement a matched filter for each pulsar, recovering maximum S/N. If SCRUNCH is false, each bin is written out separately and the weights are ignored. This mode is not well tested, and may have bugs.
NUM POLYCO FILES:   3
POLYCO FILE 0:      /nfs/cluster/ska/adeller/v190/v190f/pulseprofiles/0630-2834/0630-2834_54126_200000.polyco
POLYCO FILE 1:      /nfs/cluster/ska/adeller/v190/v190f/pulseprofiles/0630-2834/0630-2834_54127_120000.polyco
POLYCO FILE 2:      /nfs/cluster/ska/adeller/v190/v190f/pulseprofiles/0630-2834/0630-2834_54127_200000.polyco
NUM PULSAR BINS:    2
SCRUNCH OUTPUT:     TRUE
BIN PHASE END 0:    0.58
BIN WEIGHT 0:       0.0
BIN PHASE END 1:    0.665
BIN WEIGHT 1:       1.0
This example shows a simple gate, where only data falling between pulse phase 0.58 and 0.665 is retained.
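To make the gate concrete, here is a sketch of how a pulse phase might map onto bins and how scrunching combines them. It assumes each bin runs from the previous bin's phase end up to its own phase end, with phases beyond the last end wrapping around into bin 0; that reading is consistent with the example (bin 1, weight 1.0, covers 0.58 to 0.665) but the exact treatment of the boundaries in mpifxcorr is not described here. The function names are hypothetical.

```python
from bisect import bisect_left


def bin_for_phase(phase, bin_phase_ends):
    """Return the bin index for a pulse phase, given ascending BIN PHASE END values.

    Assumption: phases past the last phase end wrap around into bin 0.
    """
    idx = bisect_left(bin_phase_ends, phase % 1.0)
    return idx if idx < len(bin_phase_ends) else 0


def scrunch(bin_values, bin_weights):
    """Weight each bin and sum, as described for SCRUNCH OUTPUT: TRUE."""
    return sum(v * w for v, w in zip(bin_values, bin_weights))


ends = [0.58, 0.665]
weights = [0.0, 1.0]
print(bin_for_phase(0.60, ends))  # inside the gate
print(bin_for_phase(0.30, ends))  # before the gate
print(bin_for_phase(0.90, ends))  # after the gate, wraps to bin 0
```

With the weights above, scrunching keeps only whatever accumulated in bin 1, which is the on-pulse gate.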
The phased array config filename is specified in the input file if PHASED ARRAY is TRUE. If required, it is put on the following line as shown:
PHASED ARRAY:       TRUE
PHASED ARRAY CONFIG FILE:/home/difx/projects/tc016a/phasedarray.config
The format has not been set yet, but will probably look something like this:
OUTPUT TYPE:        FILTERBANK [could also be TIMESERIES]
OUTPUT FORMAT:      DIFX [could be VDIF for TIMESERIES]
ACC TIME (NS):      64000 [ignored for TIMESERIES data]
COMPLEX OUTPUT:     TRUE [ignored for FILTERBANK data]
OUTPUT BITS:        8
More keys could yet be added - we haven't gotten very far into the implementation yet.
Okay, so this isn't an ascii control file, but it is a file format so I'll describe it briefly here. The purpose of this file is to hold a bunch of visibilities in a relatively easy to understand format, which you can then translate into your favourite flavour of FITS or similar. difx2fits will produce FITS-IDI data from the SWIN format.
You create “SWIN” style output data by specifying OUTPUT FORMAT: SWIN in the common table of your correlator input file. When creating SWIN style data, the OUTPUT FILENAME keyword in the common table must refer to a non-existent directory that you want to create to store the visibility files in. The parent of the directory you specify must exist, e.g. if you want to use /tmp/experiment/binary/ as your output directory, /tmp/experiment/ must exist but /tmp/experiment/binary/ must not.
In this directory, one or more SWIN style visibility files will be created. Each file will have a name of the form
DIFX_MJD_SECONDS.s####.b####
where MJD is the MJD of the first visibility point in the file, SECONDS is the number of seconds since the start of MJD for the first visibility point, s#### is the source index (running from 0 to N for a given scan, not the overall source index into the table held in the .calc file) and b#### is the pulsar bin number (again, for the configuration which a given scan refers to).
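A small sketch of pulling those fields back out of a file name follows. The regular expression accepts any number of digits in each field, since the zero-padding widths are not stated here, and it assumes SECONDS is written as an integer; the example file name is made up for illustration.

```python
import re

# Pattern for DIFX_MJD_SECONDS.s####.b#### (digit counts left open on purpose).
_SWIN_NAME = re.compile(r"^DIFX_(\d+)_(\d+)\.s(\d+)\.b(\d+)$")


def parse_swin_name(name):
    """Return the MJD, seconds, source index and pulsar bin from a SWIN file name."""
    match = _SWIN_NAME.match(name)
    if match is None:
        raise ValueError("not a SWIN visibility file name: %r" % name)
    mjd, seconds, source, pulsar_bin = (int(g) for g in match.groups())
    return {"mjd": mjd, "seconds": seconds, "source": source, "bin": pulsar_bin}


# Hypothetical file name; the padding shown is an assumption.
info = parse_swin_name("DIFX_53440_079720.s0000.b0000")
print(info)
```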
Each visibility file is completely binary, in comparison to DiFX1.5 which had ascii headers and binary data. The binary header which precedes each visibility entry is of length 74 bytes and contains the following data:
Bytes  Type     Contains               Example value
1-4    Int      SYNC WORD              0xFF00FF00
5-8    Int      BINARY HEADER VERSION  1
9-12   Int      BASELINE NUM           258
13-16  Int      MJD                    54044
17-24  Double   SECONDS                3600.5
25-28  Int      CONFIG INDEX           0
29-32  Int      SOURCE INDEX           1
33-36  Int      FREQ INDEX             0
37-38  Char[2]  POLARISATION PAIR      RR
39-42  Int      PULSAR BIN             0
43-50  Double   DATA WEIGHT            1.0
51-58  Double   U (METRES)             -4422923.40042
59-66  Double   V (METRES)             -1635977.07993768
67-74  Double   W (METRES)             4285656.48881794
The header is immediately followed by the binary real and imag for each point. The length will be 2*numchannels floats, packed as re im re im re … The endianness of the binary data (header and visibilities) is not enforced, but all instances of DiFX to date use little-endian (Intel format).
In the case of upper sideband data, the first reported channel is the “zero frequency” channel, that is, its sky frequency is equal to the value in the frequency table for this spectrum. The Nyquist channel is not retained. For lower sideband data, the last channel is the “zero frequency” channel. That is, in all cases, the spectrum is in order of increasing frequency and the Nyquist channel is excised.
The baseline num is calculated using the 1-based station indices (ie in the example above, AT=1, HO=2…). It is calculated as 256*S1 + S2, where S1 and S2 are the 1-based station indices of the stations that contribute to the baseline.
The CONFIG INDEX, SOURCE INDEX and FREQ INDEX refer to the configuration table, source table and freq table in the .input, .calc and .input files respectively.
The value numchannels can be found from the input file, looking at the correct entry in the freq table as specified by FREQ INDEX. The end of the visibilities is immediately followed by the next header, and so on.
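Putting the header layout, the baseline numbering and the visibility packing together, a reader for one file's contents might look like the sketch below. It assumes little-endian data with no padding between fields, 4-byte integers, and 32-bit floats for the visibilities (the text above only says "floats"). The mapping from FREQ INDEX to numchannels is passed in by the caller, since how you derive it from the freq table is up to you. The names HEADER and read_records are invented for this example.

```python
import struct

# sync, version, baseline, mjd, seconds, config, source, freq,
# polarisation pair, pulsar bin, weight, u, v, w: 74 bytes in total.
HEADER = struct.Struct("<Iiiidiii2sidddd")
SYNC_WORD = 0xFF00FF00


def read_records(buf, numchannels_for_freq):
    """Yield (header, visibilities) pairs from a bytes object of SWIN records.

    numchannels_for_freq maps FREQ INDEX to the number of channels written
    (an input the caller must supply).
    """
    pos = 0
    while pos + HEADER.size <= len(buf):
        (sync, version, baseline, mjd, seconds, config, source, freq,
         pol, pulsar_bin, weight, u, v, w) = HEADER.unpack_from(buf, pos)
        if sync != SYNC_WORD:
            raise ValueError("bad sync word at byte %d" % pos)
        pos += HEADER.size
        nchan = numchannels_for_freq[freq]
        floats = struct.unpack_from("<%df" % (2 * nchan), buf, pos)
        pos += 8 * nchan  # re and im, 4 bytes each (assumed 32-bit floats)
        vis = [complex(re, im) for re, im in zip(floats[0::2], floats[1::2])]
        station1, station2 = divmod(baseline, 256)  # 1-based station indices
        header = {
            "version": version, "stations": (station1, station2), "mjd": mjd,
            "seconds": seconds, "config": config, "source": source,
            "freq": freq, "pol": pol.decode("ascii"), "bin": pulsar_bin,
            "weight": weight, "uvw": (u, v, w),
        }
        yield header, vis


# Build one synthetic record from the example header values and read it back.
record = HEADER.pack(SYNC_WORD, 1, 258, 54044, 3600.5, 0, 1, 0, b"RR", 0, 1.0,
                     -4422923.40042, -1635977.07993768, 4285656.48881794)
record += struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
records = list(read_records(record, {0: 2}))
print(records[0][0]["stations"], records[0][1])
```

Baseline number 258 decodes to stations 1 and 2, i.e. AT and HO in the example telescope table.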
Because of aliasing, the “numchannels” spectral points that are recorded for each band do not exactly cover the subband bandwidth in the way one might expect. This differs for upper sideband and lower sideband data. For more information, see the channelisation page.
For each visibility dump, the individual entries come in the following order:
The top level loop is over baseline, for the NUM ACTIVE BASELINES entries in the current CONFIGURATION (which point at entries in the BASELINE table)
The next level of loop is over frequency, as determined by “NUM FREQS” for each baseline entry.
The next level of loop is phase centre, as determined by the current scan in the .calc file
The next level of loop is over pulsar bin, as determined by the NUM PULSAR BINS for the pulsar bin config for the current CONFIGURATION
The final level of loop is over polarisation, as determined by the “POL PRODUCTS #/#” entry for each baseline frequency.
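The nesting above can be written out as a generator, which is handy for working out which record comes next when reading a dump. This is only a sketch of the loop order as described; the argument layout (a list of baselines, each with a list of polarisation product counts per frequency) is made up for the example.

```python
def record_order(baselines, num_phase_centres, num_pulsar_bins):
    """Yield (baseline, freq, phase_centre, pulsar_bin, pol) in output order.

    baselines: list of (baseline_index, [pol products for each freq]).
    Hypothetical helper following the loop nesting described on this page.
    """
    for baseline_index, pol_products_per_freq in baselines:
        for freq, num_pols in enumerate(pol_products_per_freq):
            for phase_centre in range(num_phase_centres):
                for pulsar_bin in range(num_pulsar_bins):
                    for pol in range(num_pols):
                        yield (baseline_index, freq, phase_centre, pulsar_bin, pol)


# Two baselines, two frequencies each with two polarisation products,
# one phase centre and no pulsar binning (a single bin).
order = list(record_order([(0, [2, 2]), (1, [2, 2])], 1, 1))
for entry in order[:4]:
    print(entry)
```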