
Step 4: Postprocessing to find larger clusters of bad points

When developing the flagging algorithms, it was found that a few affected data points were often missed during times of strong interference. For example, some data points in Figure 2 at 11:50h have amplitude and rms levels which are not suspicious, given that most of the data in that channel have similar amplitudes. However, considering the high levels of interference immediately surrounding these data, one would certainly flag them as well if the editing were done manually.

One can argue that data which have not been detected by the procedures described above should not be flagged. Moreover, even if these data are affected at all, they would have only minuscule effects on the final result, so flagging them may seem pointless. On the other hand, it is known in experimental sciences that bad data can be worse than no data, and that discarding possibly good data as a safety measure is acceptable (to some degree). I also point out that the amount of data additionally flagged in this and the previous step is of the order of a few percent at most, hence the sensitivity loss is negligible. It is therefore a matter of personal taste and level of caution whether or not to apply this step.

The flags are gridded into bins with a width of 30s and convolved with a boxcar function with a width of typically 20min. The amplitude of the convolution is a function of the fraction of data which have been flagged in any period of 20min, and the maximum possible value is predictable. If the amplitude exceeds a threshold (typically 0.15 of the maximum), the entire window is flagged. An additional benefit of this procedure is that a larger safety margin is flagged around extended periods of bad data. This step is also carried out irrespective of pointings.
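
The procedure described above can be illustrated with a short Python/NumPy sketch. This is not Pieflag's actual code: the function name `flag_clusters`, its arguments, and the choice to grid the flags as the flagged fraction per 30s bin are assumptions made for illustration. Under that assumption the maximum possible amplitude of the convolution equals the number of bins in the boxcar (40 for a 20min window).

```python
import numpy as np

def flag_clusters(times_s, flagged, bin_s=30.0, window_s=1200.0, threshold=0.15):
    """Illustrative sketch (hypothetical helper, not part of Pieflag).

    times_s : sample times in seconds
    flagged : boolean flags from the earlier steps
    Returns the input flags plus all samples inside windows whose
    boxcar-convolved flag density exceeds threshold * maximum.
    """
    times_s = np.asarray(times_s, dtype=float)
    flagged = np.asarray(flagged, dtype=bool)

    # Grid the flags: fraction of flagged samples in each bin of width bin_s
    # (assumption: empty bins count as unflagged).
    idx = np.floor((times_s - times_s.min()) / bin_s).astype(int)
    nbins = idx.max() + 1
    n_tot = np.bincount(idx, minlength=nbins)
    n_bad = np.bincount(idx, weights=flagged.astype(float), minlength=nbins)
    frac = np.divide(n_bad, n_tot, out=np.zeros(nbins), where=n_tot > 0)

    # Boxcar convolution; for bin i the window covers [i - half, i - half + width).
    width = max(1, int(round(window_s / bin_s)))
    half = width // 2
    full = np.convolve(frac, np.ones(width), mode="full")
    start = width - 1 - half
    amplitude = full[start:start + nbins]

    # The maximum possible amplitude is `width` (every bin fully flagged).
    exceeded = amplitude > threshold * width

    # Flag the entire window around every bin that exceeds the threshold.
    grow = np.zeros(nbins, dtype=bool)
    for i in np.flatnonzero(exceeded):
        grow[max(0, i - half):min(nbins, i - half + width)] = True

    return flagged | grow[idx]
```

For example, with one sample every 10s and every second sample flagged during a 20min burst, the call `flag_clusters(times, flags)` also flags the unflagged samples inside the burst and a margin around it, while a single isolated flag elsewhere does not trigger any additional flagging.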

Except for step 3, each of the steps above is adjustable to one's needs. Adjustable parameters are: the multiplication factors $n$ and $m$ in the search for bad data, the width of the sections in which the rms is calculated in rms-based flagging in step 2, and the width of the boxcar function and the threshold above which the entire window is flagged in step 4. Furthermore, steps 1, 2, and 4 can be skipped.



Enno Middelberg 2006-03-21