
How do I average my data in frequency ?

                        

 

  1. The task that averages channels together is called AVSPC and it will create a new data base. Thus, you can have two data bases, one with all the spectral information (should you conclude there is benefit in retaining it) and one without. The latter is always used for gain determination with time (see § 8).

    The new channel-averaged data base is often referred to as the channel 0 data base, and I will refer to the AVSPC output in this way henceforth. AVSPC will copy across the relevant extension files that are attached to the input file. It is wise not to include all channels in this average. The first and last few are likely to contain rubbish as the band response falls off fairly fast at the band edges. There is not much point to averaging in noise with your signal. The default for AVSPC is to take the central 75% of the band, and this should be adequate; experience with ATCA data indicates that a good channel range for 128 MHz data is 7 to 27. For other bandwidths, you must make your own assessment (see below), as the band edges roll off at different rates from bandwidth to bandwidth.

  2. A good way to examine the spectrum to determine the useful channel range is with the task POSSM. This task can make all sorts of plots of your data as a function of channel, something most of the other plot routines cannot do. For this purpose, select your strongest calibrator source; the primary would probably be best, even if you have just a short observation of it. POSSM has a lot of inputs, but you need only change a couple from their default values. Select the desired sources with the combination of the source, calcode and freqid adverbs (and maybe the other frequency selection parameters).

    Setting POSSM in motion at this point will produce an amplitude-scalar averaged (in time) plot from all the visibilities. This means working out the amplitude of each visibility and then averaging the amplitudes, rather than averaging the real and imaginary parts of all the visibilities, which is called a vector average. The phases are always computed with a vector average because of the circular nature of phase.

    The scalar averaged spectrum contains contributions from emission in the whole primary beam. If the source is resolved, then different baselines will have different amplitudes, and the scalar average will reflect that. Note that channels which contain only noise will appear biased after the scalar average, sitting at the mean noise level of the individual visibilities. This is because you are averaging a positive definite quantity: the amplitude of a noise signal. Any interference will be obvious in this plot.
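    The difference between the two averages can be illustrated with a minimal Python sketch (this is not AIPS code and does not reproduce POSSM's internals). The function names and the noise model, independent Gaussian noise of unit sigma on the real and imaginary parts of each visibility, are assumptions made purely for illustration.

```python
import math
import random


def scalar_average(vis):
    """Average the amplitudes of the individual visibilities."""
    return sum(abs(v) for v in vis) / len(vis)


def vector_average(vis):
    """Average real and imaginary parts first, then take the amplitude."""
    return abs(sum(vis) / len(vis))


def noisy_visibilities(signal, sigma, n, rng):
    """Hypothetical data: a constant signal plus Gaussian noise on each part."""
    return [signal + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n)]


rng = random.Random(1)
noise_only = noisy_visibilities(0j, 1.0, 10000, rng)    # a channel with no signal
strong = noisy_visibilities(10 + 0j, 1.0, 10000, rng)   # a channel with a strong source

scalar_noise = scalar_average(noise_only)
vector_noise = vector_average(noise_only)
scalar_strong = scalar_average(strong)
vector_strong = vector_average(strong)

# For pure noise the scalar average settles near sigma*sqrt(pi/2), the mean
# amplitude of the noise, while the vector average tends towards zero.
print("noise only : scalar %.3f  vector %.3f  (sigma*sqrt(pi/2) = %.3f)"
      % (scalar_noise, vector_noise, math.sqrt(math.pi / 2)))
print("strong src : scalar %.3f  vector %.3f" % (scalar_strong, vector_strong))
```

    For a strong signal the two averages nearly agree; the bias of the scalar average only matters in channels where the noise is comparable to or larger than the signal.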

    On the other hand, the vector average (aparm(1)=1) can be very useful for suppressing interference. If the phase stability was good and, as is usual, you did the basic on-line calibration with a strong calibrator (on-line programs CACAL or DELCOR), then the phases on all baselines will have been set to zero and the vector sum will be meaningful. Any interference generally has semi-random phase, so it tends to average out. The vector average makes a spectrum at one location on the sky (it is basically a spectral-line cube with one spatial pixel), so make sure you set offset to point at the piece of sky you are interested in. Alternatively, you could restrict the average to the short baselines so that the exact values of offset are less critical.

    You can display this plot directly on the TV (set dotv=1 in POSSM), or write a plot file (set dotv=-1) and display it with TKPL (plot on a Tektronix or SUN Tektronix emulator), MIXPL (display on terminals or laser printers), or TVPL (display on a TV). Either way, you should then have a fairly reasonable idea of which channels to keep and which to throw away. Displaying on the SUN's Tektronix emulator with TKPL is much faster than displaying on the SUN's TV with TVPL or directly on the TV with dotv=1. However, for small plots, such as those produced by POSSM, it makes little difference, as most of the time goes into starting the task rather than executing it.

    The `divide by channel 0' option (bparm(1)=1) might also prove handy; POSSM will just use the central 75% of the band to form channel 0 for this. The option divides the frequency dependent visibilities by the channel 0 visibilities. This removes most of the calibration errors as well as most of the source structure, and turns the source into a pseudo point source (so you could then use something other than a point source for these plots). The remaining response across the band is due to the filters and electronics. Experiment with this option, and see how much it affects the outcome. Make sure there are no severely corrupted data in the channels that make up channel 0 if you use this option.
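    The effect of this division can be sketched in a few lines of Python (an illustration only, not POSSM's implementation). The bandpass shape, gain error and flux below are invented values, the function names are hypothetical, and the channel range 7 to 27 is passed explicitly rather than reproducing POSSM's own choice of the central 75% of the band.

```python
import cmath


def channel0(spectrum, first, last):
    """Vector-average channels first..last (1-based, inclusive)."""
    chans = spectrum[first - 1:last]
    return sum(chans) / len(chans)


def divide_by_channel0(spectrum, first, last):
    """Divide every channel of one visibility spectrum by its channel 0 value."""
    ref = channel0(spectrum, first, last)
    return [v / ref for v in spectrum]


nchan = 33
# Invented bandpass: flat (1.0) in the middle, rolling off at both band edges.
bandpass = [min(1.0, 0.2 * ch, 0.2 * (nchan + 1 - ch))
            for ch in range(1, nchan + 1)]

# One visibility spectrum: (uncalibrated complex gain) * (source flux) * bandpass.
gain = 0.7 * cmath.exp(0.5j)   # invented calibration error
flux = 3.0                     # invented source amplitude
spectrum = [gain * flux * b for b in bandpass]

normalised = divide_by_channel0(spectrum, 7, 27)

# The gain error and the flux cancel; only the bandpass shape is left.
for ch in (1, 3, 7, 17, 27, 31, 33):
    print("channel %2d  amplitude %.3f" % (ch, abs(normalised[ch - 1])))
```

    Because the gain and the source amplitude are common to every channel of the visibility, they drop out of the ratio, which is why what remains is dominated by the filter and electronics response.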

    Note that channel 1 in ATCA spectra contains the zero frequency or DC term. This is why there are 33 channels in 128 MHz bandwidth spectra, rather than 32 as you would expect. This is very useful in determining if your data contain DC offsets, a problem discussed more below.

    POSSM
    source='1934-638',' '    Select source by name or perhaps
    calcode='p'              select the primary with its calcode
                             if you set it in SETJY
    freqid=1                 select desired frequency
    antennas=0               average over all antennas
    docalib=-1               no calibration to apply
    aparm=0                  scalar average
    aparm(1)=1               vector average
    bparm=0
    solint=0                 average all data
    solint=-1                scan averages
    solint=#                 average data over this many minutes
    dotv=1                   plot directly on TV or
    dotv=-1                  make a plot file for display later

    MIXPL
    outfile=' '              Don't write plot to disk file
    invers=0                 Plot highest version plot file
    device=0                 Laser printer
    device=10                Visual 603 terminal

  3. Before setting you loose on AVSPC, it is worth making a diversion here to discuss a problem (and its cure) that arose largely in 128 MHz bandwidth ATCA data taken in 1990. This is the occurrence of DC offsets in the correlation. The correlator is a Fourier transform device. This means that it measures a time-lag spectrum, which is then Fourier transformed to the frequency domain. The natural weighting function of the 128 MHz bandwidth lag spectrum is a triangular function (there are fewer long than short lags for a given time sequence), the Fourier transform of which is a SinC squared function. Thus, the lag spectrum is multiplied by the weighting function and, by the convolution theorem, the frequency spectrum is the convolution of the Fourier transforms of the lag spectrum and the weighting function. This means that adjacent channels are not independent and that the sidelobes of the SinC squared function propagate into the spectrum.

    If the lag spectrum is corrupted by a DC offset, its Fourier transform is modified by the addition of a zero frequency delta function. The convolution of this delta function (which might be very strong) with the SinC squared function can thus produce strong ringing (i.e., propagation of the SinC squared sidelobes) throughout the spectrum. You can see any such DC offset by examining channel 1 of the spectrum that POSSM produced (see above). If the offset occurs in just one baseline, you would need to plot individual baselines with POSSM to see this clearly.

    However, there is an easy cure. The SinC squared function has zeros at every other channel and, because the DC term sits in channel 1, these zeros fall on the odd-numbered channels. Therefore, you need only select the odd-numbered channels in the good channel range that you determined above with POSSM. In addition, because adjacent channels are not independent, you sacrifice very little signal-to-noise ratio in doing this. AVSPC has an easy option for doing just this. It may be worth doing regardless of whether you can see any DC term or not. However, note that this discussion is for the 128 MHz bandwidth only. The situation is more complex at other bandwidths and this simple procedure is not a cure for them.
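    The ringing and its zeros can be reproduced with an idealised Python sketch. This is a generic model and not a description of the ATCA correlator: it assumes 64 lags (k = -32 to 31), a triangular weight 1 - |k|/32, and a plain discrete Fourier transform to 33 channels with channel 1 at zero frequency. All names and numbers are choices made for this illustration.

```python
import cmath
import math

NLAG = 64            # assumed number of lags, k = -32 .. 31
HALF = NLAG // 2
NCHAN = HALF + 1     # 33 channels; channel 1 is the zero-frequency (DC) term


def triangular_weight(k):
    """Triangular lag weighting: 1 at zero lag, falling to 0 at |k| = HALF."""
    return 1.0 - abs(k) / HALF


def spectrum_from_lags(lag_values):
    """Weight a lag spectrum (dict: lag -> value) and transform it to frequency.

    Returns a list of complex values; index 0 corresponds to channel 1.
    The result is normalised so that a unit DC offset gives 1.0 in channel 1.
    """
    spec = []
    for m in range(NCHAN):
        total = 0j
        for k in range(-HALF, HALF):
            phase = cmath.exp(-2j * math.pi * m * k / NLAG)
            total += triangular_weight(k) * lag_values[k] * phase
        spec.append(total / HALF)
    return spec


# A lag spectrum containing nothing but a constant (DC) offset.
dc_offset = {k: 1.0 for k in range(-HALF, HALF)}
ringing = [abs(v) for v in spectrum_from_lags(dc_offset)]

# Even-numbered channels carry the SinC squared sidelobes of the DC term;
# odd-numbered channels (apart from channel 1 itself) sit on its zeros.
for chan in range(1, 10):
    print("channel %2d  response %.4f" % (chan, ringing[chan - 1]))
```

    In this model the DC term leaks into every even-numbered channel and vanishes in the odd-numbered ones, which is the reason selecting only the odd channels removes the ringing.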

      

  4. Now set up AVSPC by specifying which channels you wish to average. You can specify up to 10 groups of channels to average with the chansel adverb. In this way you can exclude channels with strong interference in them. Alternatively, you could edit the data first and then run AVSPC with a channel range that includes the bad but flagged data; this is the better way to exclude corrupt data. Note that the FG table is not copied to the output. All flagged channels are excluded from the average so that, by definition, all data in the output file are good. Editing of the data is described in § 7.

    AVSPC
    flagver=0                Apply highest FG table if any
    chansel=7,27,2           Select odd channels in range 7 to 27
    chansel=7,27,1           Select all channels in range 7 to 27
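    How a chansel selection combines with flagging can be sketched in Python as follows. This is an illustration and not AVSPC's code: it assumes each chansel group is a (start, stop, increment) triple of 1-based channel numbers, that the average is an unweighted vector average, and that a visibility with every selected channel flagged yields no output. The function names are hypothetical.

```python
def select_channels(chansel):
    """Expand (start, stop, increment) groups into 1-based channel numbers."""
    chans = []
    for start, stop, inc in chansel:
        chans.extend(range(start, stop + 1, inc))
    return chans


def average_channels(spectrum, flags, chansel):
    """Vector-average the selected, unflagged channels of one visibility.

    spectrum: list of complex values, index 0 is channel 1.
    flags:    list of booleans, True where the channel is flagged bad.
    Returns None if every selected channel is flagged.
    """
    good = [spectrum[c - 1] for c in select_channels(chansel)
            if not flags[c - 1]]
    if not good:
        return None
    return sum(good) / len(good)


# A 33-channel spectrum with interference in channel 15, which has been flagged.
spectrum = [2 + 0j] * 33
spectrum[14] = 100 + 0j
flags = [False] * 33
flags[14] = True

odd_channels = select_channels([(7, 27, 2)])   # as in chansel=7,27,2
all_channels = select_channels([(7, 27, 1)])   # as in chansel=7,27,1
channel_zero = average_channels(spectrum, flags, [(7, 27, 2)])

print(odd_channels)
print(channel_zero)
```

    Because the flagged channel is dropped before averaging, the interference does not contaminate the result even though channel 15 lies inside the selected range.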

         

  5. After you have run AVSPC, examine the header of the new file with the verb IMHEAD. The new bandwidth should be reflected in both the header and in the FQ table (use PRTAB to see this). In addition, the frequency reference pixel should have changed to reflect this new increment (note that an integer pixel value refers to the centre of a channel). However, the reference frequency will be unchanged. The reason for this is that the (u,v,w) coordinates assigned to each visibility are referenced to this frequency. If you change it, and then for some reason need to recompute (u,v,w) from the antenna (AN) file, they will come out all wrong. If you select a group of channels symmetrically placed about the reference frequency, then it will remain unchanged in the output data base header.


nkilleen@atnf.csiro.au