TALES Doctoral Candidates Achieve Strong Results in the MALLORN Astronomical Classification Challenge

TALES Doctoral Candidates Erin Umuzigazuba (University of Nova Gorica) and Natale de Bonis (University of Belgrade) have achieved excellent results in the Many Artificial LSST Lightcurves based on Observations of Real Nuclear Transients (MALLORN) Classifier Challenge, an international competition on deep-learning identification of rare cosmic events. Out of 893 participating teams, their solutions placed in the top 10% of all submissions, outperforming typical human expert classification by a factor of about two and highlighting the potential of AI for detecting rare astrophysical phenomena in large astronomical surveys.

A general challenge for modern large astronomical surveys, such as the upcoming Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST), is to identify rare phenomena. Of particular interest are the relatively recently discovered Tidal Disruption Events (TDEs), which occur when an unlucky star approaches too close to a supermassive black hole in the center of a distant galaxy and is ripped apart by the immense gravitational forces it experiences. These transient events have proven to be of great scientific value, particularly for investigating the properties and feeding conditions of otherwise quiescent, and thus very difficult to observe, black holes. Detecting such events is especially challenging because they are extremely rare and easily mistaken for more common sources such as Active Galactic Nuclei (AGNs) and various types of supernova explosions. The MALLORN challenge focused on highly imbalanced data sets, with light curves simulated from real observations by the well-known Zwicky Transient Facility (ZTF) survey.

Figure Caption: General relativistic simulation of a tidal disruption event. The blue circle represents the supermassive black hole (SMBH), while the white dashed line marks the region where the SMBH tidal force prevails over the self-gravity of the star. (Credit: Taj Jankovič)


To address this imbalance, the leader of the University of Belgrade team, Natale De Bonis, developed a meta-learning approach. The probabilities of TDE occurrence were inferred with an unsupervised Self-Organizing Map (SOM) and then combined with other statistical and physical features extracted from the light curves. Three specialized classifiers were combined into a meta-learner that produced the final prediction for all classes. This meta-learner solution achieved an F1-score of 0.603 on the unlabeled and unseen dataset, placing it in the top 10% of all solutions. The MALLORN organizers set the threshold for a successful solution at an F1-score above 0.37. The team's result can be found under the team name SER-SAG-S1 on the challenge website.

Figure Caption: SOM class distribution of the down-sampled training dataset. The sizes of the white circles are proportional to the number of TDEs per node.
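
As a rough illustration of the idea described above (a SOM-derived TDE probability used as an extra feature for three stacked classifiers), the following Python sketch may help. It is not the team's code: the synthetic data, the tiny NumPy SOM, the per-node smoothing, the choice of base classifiers, and all parameter values are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic, imbalanced stand-in for light-curve features (about 5% "TDE").
X, y = make_classification(n_samples=2000, n_features=10, n_informative=6,
                           weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.3, random_state=0)


def train_som(X, grid=(6, 6), n_iter=3000, lr0=0.5, sigma0=2.0):
    """Train a small rectangular SOM; returns node weights (n_nodes, n_features)."""
    rows, cols = grid
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)],
                      dtype=float)
    weights = X[rng.integers(0, len(X), rows * cols)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays to zero
        sigma = sigma0 * (1.0 - frac) + 0.5      # neighbourhood radius shrinks
        x = X[rng.integers(0, len(X))]
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
        dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
        h = np.exp(-dist2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights


def bmu_index(weights, X):
    """Index of the closest SOM node for every row of X."""
    d = ((X[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


# The SOM itself is trained without labels; labels are only used afterwards
# to estimate a smoothed TDE fraction per node (an assumed recipe).
som = train_som(X_train)
nodes = bmu_index(som, X_train)
counts = np.bincount(nodes, minlength=len(som))
tde_counts = np.bincount(nodes, weights=y_train, minlength=len(som))
node_prob = (tde_counts + y_train.mean()) / (counts + 1.0)


def add_som_feature(X):
    """Append the per-node TDE probability as an extra feature column."""
    return np.column_stack([X, node_prob[bmu_index(som, X)]])


# Three base classifiers combined by a logistic-regression meta-learner.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100,
                                      class_weight="balanced", random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("lr", LogisticRegression(max_iter=1000, class_weight="balanced")),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(add_som_feature(X_train), y_train)
pred = stack.predict(add_som_feature(X_test))
score = f1_score(y_test, pred)
print(f"F1 on held-out synthetic data: {score:.3f}")
```

The F1-score, the harmonic mean of precision and recall, is used here because plain accuracy is uninformative when one class is very rare. In a careful pipeline the per-node probabilities would be estimated out-of-fold, so that the training labels do not leak into the added feature.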

The method implemented by the team at the University of Nova Gorica, which also included Erin Umuzigazuba, was based on the Random Forest algorithm. Each source was fitted with different transient models and Gaussian Processes to account for irregularities and gaps in the data. Numerous light-curve features known to distinguish TDEs from AGNs and supernovae (SNe) reasonably well were then extracted from the best-fit models. These features were used to train a Random Forest-based classifier to identify the most probable TDE candidates. Using this method, the team achieved a final F1-score of 0.651 on the unlabeled and unseen dataset.

Figure Caption: A Gaussian Process fit to the r- and g-band data of TDE gobennas_nath_rochben. The rise time, decay time, and r – i color at peak are indicated with arrows. The dashed grey lines indicate the estimated start, peak and end times of the transient.
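
The workflow described above (Gaussian Process fit, feature extraction, Random Forest) can be sketched in Python roughly as follows. This is a toy example and not the team's pipeline: the simulated single-band light curves, the kernel choice, the half-maximum definitions of rise and decay time, and the two toy classes (fast and slow decliners) are all assumptions for illustration and are not physically calibrated.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

rng = np.random.default_rng(1)


def simulate_light_curve(decay_days, n_obs=40):
    """Toy transient: Gaussian rise, exponential decay, irregular noisy sampling."""
    t = np.sort(rng.uniform(-40.0, 150.0, n_obs))
    flux = np.where(t < 0,
                    np.exp(-0.5 * (t / 12.0) ** 2),
                    np.exp(-t / decay_days))
    return t, flux + rng.normal(0.0, 0.03, n_obs)


def gp_features(t, flux):
    """Fit a GP to one light curve and return [peak flux, rise time, decay time]."""
    kernel = (ConstantKernel(1.0)
              * RBF(length_scale=20.0, length_scale_bounds=(5.0, 200.0))
              + WhiteKernel(noise_level=0.01))
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gp.fit(t[:, None], flux)

    # Evaluate the smooth GP model on a dense grid to bridge sampling gaps.
    grid = np.linspace(t.min(), t.max(), 400)
    model = gp.predict(grid[:, None])
    i_peak = int(np.argmax(model))
    above = model >= 0.5 * model[i_peak]

    # Walk outwards from the peak while the model stays above half maximum.
    lo = i_peak
    while lo > 0 and above[lo - 1]:
        lo -= 1
    hi = i_peak
    while hi < len(grid) - 1 and above[hi + 1]:
        hi += 1
    rise_time = grid[i_peak] - grid[lo]
    decay_time = grid[hi] - grid[i_peak]
    return np.array([model[i_peak], rise_time, decay_time])


# Build a small labelled set: class 0 declines fast, class 1 declines slowly.
features, labels = [], []
for label, mean_decay in [(0, 20.0), (1, 60.0)]:
    for _ in range(30):
        decay = max(5.0, rng.normal(mean_decay, 5.0))
        t, flux = simulate_light_curve(decay)
        features.append(gp_features(t, flux))
        labels.append(label)
X, y = np.array(features), np.array(labels)

# Random train/test split, then a Random Forest on the extracted features.
idx = rng.permutation(len(y))
train, test = idx[:40], idx[40:]
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X[train], y[train])
proba = clf.predict_proba(X[test])[:, 1]   # probability of the slow class
print("Held-out accuracy on toy data:", clf.score(X[test], y[test]))
```

Ranking sources by `proba` gives a list of the most probable candidates, which can then be thresholded to trade precision against recall. A real analysis would use multi-band data and add color features such as the one indicated in the figure.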

More information about the challenge is available here:
https://www.kaggle.com/competitions/mallorn-astronomical-classification-challenge/overview
