A RACS catalogue of complex radio sources created using a self-organising map

The next generations of radio surveys are expected to identify tens of millions of new sources, and identifying and classifying their morphologies will require novel and more efficient methods. Self-organising maps (SOMs), a type of unsupervised machine learning, can be used to address this problem. Alam et al. map 251,259 multi-Gaussian sources from the first epoch low-band Rapid ASKAP Continuum Survey (RACS-Low) onto a SOM with discrete neurons. Similarity metrics, such as Euclidean distances, can be used to identify the best-matching neuron or unit (BMU) for each input image. Alam et al. establish a reliability threshold by visually inspecting a subset of input images and their corresponding BMUs. They label the individual neurons based on observed morphologies, and these labels are included in their value-added catalogue of RACS sources. Sources for which the Euclidean distance to their BMU is ≲ 5 (accounting for approximately 79% of sources) have an estimated >90% reliability for their SOM-derived morphological labels. This reliability falls to less than 70% at Euclidean distances ≳ 7. Beyond this threshold it is unlikely that the morphological label will accurately describe a given source. The catalogue of complex radio sources from RACS with their SOM-derived morphological labels from this work will be made publicly available.
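
The BMU search and the distance-based reliability cut described above can be illustrated with a minimal sketch. It assumes that images and neuron weights are flattened to equal-length vectors and compared with a plain Euclidean distance; any rotation or flip handling that a real pipeline might apply is omitted. The function names, the toy data, and the "intermediate" band between the two quoted distances are illustrative assumptions, not part of the catalogue.

```python
import numpy as np

def best_matching_units(images, weights):
    """Return (BMU index, distance to BMU) for each flattened image.

    images:  array of shape (n_sources, n_pixels)
    weights: array of shape (n_neurons, n_pixels)
    """
    # Pairwise Euclidean distances, shape (n_sources, n_neurons).
    diffs = images[:, None, :] - weights[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    bmu = dists.argmin(axis=1)
    return bmu, dists[np.arange(len(images)), bmu]

def reliability_band(distance, high=5.0, low=7.0):
    """Map a BMU distance to a coarse band using the distances quoted in the text.

    The band names are hypothetical; only the values 5 and 7 come from the text.
    """
    if distance <= high:
        return "high"          # estimated >90% label reliability
    if distance < low:
        return "intermediate"  # between the two quoted distances (assumed band)
    return "low"               # estimated <70% label reliability

# Toy data: three 2-pixel "images" and three neurons (hypothetical values).
weights = np.array([[0.0, 0.0], [3.0, 4.0], [10.0, 0.0]])
images = np.array([[0.0, 1.0], [3.0, 4.0], [10.0, 8.0]])

bmu, dist = best_matching_units(images, weights)
bands = [reliability_band(d) for d in dist]
```

With real data the neuron label assigned to each BMU would be attached to the source only when the band is acceptable for the intended analysis.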

The image above is the trained 10×10 SOM with the morphological labels assigned to its neurons: C (Compact), EC (Extended Compact), CD (Connected Double), SD (Split Double), T (Triple), and U/A (Uncertain/Ambiguous). The labels on the axes indicate the neuron coordinates in the SOM grid, such that the top-left neuron is (0, 0) with morphological label EC. The SOM can also be divided into four quadrants (top left, top right, bottom left, and bottom right, marked in red) for additional analysis.
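
As a small illustration of the grid layout in the caption, the sketch below assigns a neuron to one of the four quadrants. It assumes that coordinates are ordered (row, column), that the row index increases downwards from the top-left neuron (0, 0), and that the quadrants split the 10×10 grid evenly; the function name is hypothetical.

```python
def quadrant(row, col, size=10):
    """Return the quadrant name for a neuron at (row, col) on a size x size grid.

    Assumes (0, 0) is the top-left neuron and an even split at size // 2.
    """
    half = size // 2
    vertical = "top" if row < half else "bottom"
    horizontal = "left" if col < half else "right"
    return f"{vertical} {horizontal}"

corner = quadrant(0, 0)   # the top-left neuron, labelled EC in the caption
```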