Dessa engineers build machine learning supernova identification system

Three engineers at Dessa, a company that works with enterprises to build and implement AI systems, have created a machine learning system that successfully identifies supernovas faster and more accurately than legacy methods.

The machine learning system, named space2vec, improves the accuracy of supernova identification by 10 percent while halving the time the process previously took.

Engineers Jinnah Ali-Clarke, Pippin Lee, and Cole Clifford are all big fans of space. They share a desire to assist the astronomy industry as software engineers through the use of machine learning and AI. All three embarked on a research journey, hoping to find a problem in astronomy that their skill set could help solve.

“We’re at this point in time where three engineers without any background can help and do significant work,” Lee, a developer at Dessa, told BetaKit. “Astronomers are open-sourcing their code because they have limited resources that they’re sharing with each other.”

The Problem

Their research led the three engineers to a paper that looked at the Dark Energy Survey (DES), an international project in which researchers attempted to understand dark energy, an unexplained source of energy that makes up about 68 percent of the universe, according to NASA.

To better understand dark energy, DES researchers studied type Ia supernovas, exploding stars that release large amounts of energy.

The DES took wide-field images of the sky in order to find supernovas, leaving astronomers with large volumes of data to sort through. Some larger telescopes now produce 15 to 30 terabytes of data per night. The DES itself produced 400 million images during its first three years.

“Finding [supernovas] really quickly is an interesting problem because you want to do the survey quickly, find the supernovas, and then take a closer look,” Lee said. “Usually, they would have all of this image data from the dark energy survey, and they would have to manually look through them.”

The paper Lee and his co-engineers found outlined a method called Autoscan, which improved on the purely manual approach by using feature engineering, a machine learning step that is both difficult and expensive. Feature engineering runs a series of calculations that convert each image into tabular data, similar to a row in a CSV file. The process was still fairly laborious, however, as researchers had to decide which properties of an image they were interested in and then fill in those properties for every image.
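
To make the idea of feature engineering concrete, here is a minimal Python sketch of turning images into CSV-style rows. It is not Autoscan's code: the feature names, the tiny 3x3 "images", and the `extract_features` and `to_csv` helpers are all assumptions chosen for illustration.

```python
import csv
import io

def extract_features(image):
    """Reduce a 2D grid of pixel brightness values to a few hand-picked numbers."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    peak = max(pixels)
    return {
        "mean_brightness": mean,
        "peak_brightness": peak,
        # Peak relative to the average: a crude hint that a bright point is present.
        "peak_to_mean": peak / mean if mean else 0.0,
        # Number of pixels more than twice as bright as the average.
        "bright_pixel_count": sum(1 for p in pixels if p > 2 * mean),
    }

def to_csv(images):
    """Write one row of features per image, with a header row, as CSV text."""
    rows = [extract_features(image) for image in images]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Two made-up images: an empty patch of sky and one with a single bright pixel.
blank = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
spot = [[1, 1, 1], [1, 10, 1], [1, 1, 1]]

print(to_csv([blank, spot]))
```

Every column in that table is a decision someone had to make by hand, which is the effort the Dessa team wanted to avoid.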

The team at Dessa wanted to cut out feature engineering completely by using a newer type of model, the convolutional neural network (CNN).

“Imagine two buckets, a supernova bucket and a not-supernova bucket,” Lee explained. “You want to take all of your images and say, ‘ok, let’s put them through the algorithm and we’ll look at the supernova bucket. These are the ones we’ll go through manually and see if there are actually supernovas in them.’”
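
Lee's two buckets can be sketched in a few lines of Python. The sketch assumes a model has already assigned each image a score between 0 and 1; the image names, the scores, and the 0.5 cut-off are invented for illustration and do not come from space2vec.

```python
def sort_into_buckets(scores, threshold=0.5):
    """Split image IDs into (supernova, not_supernova) by model score."""
    supernova, not_supernova = [], []
    for image_id, score in scores.items():
        if score >= threshold:
            supernova.append(image_id)
        else:
            not_supernova.append(image_id)
    return supernova, not_supernova

# Hypothetical scores: higher means the model thinks a supernova is more likely.
scores = {"img_001": 0.92, "img_002": 0.08, "img_003": 0.61}

candidates, rejected = sort_into_buckets(scores)
print("Review by hand:", candidates)
print("Set aside:", rejected)
```

Only the first bucket goes to a human, so where the cut-off sits trades manual review time against the risk of missing a real supernova.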

The Process

The CNN model takes images as raw pixel data, eliminating the need to convert them into tabular data. The team fed the model images from the DES along with ‘targets’: the correct answer for each image, that is, whether or not it contains a supernova.
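
A toy version of that idea, written in plain Python, might look like the sketch below. It shows a single convolution filter sliding over raw pixels, followed by a guess and a comparison with the target. This is an illustration under stated assumptions, not space2vec's architecture: the filter weights and bias are set by hand here, whereas a real CNN learns many such filters by adjusting them to shrink the loss, and the 5x5 images are made up.

```python
import math

def convolve(image, kernel):
    """Slide a square kernel over the image (no padding, stride 1)."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j] for i in range(k) for j in range(k))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]

def predict(image, kernel, bias):
    """Convolution, then ReLU, then global max pooling, then a sigmoid probability."""
    feature_map = convolve(image, kernel)
    strongest = max(max(0.0, v) for row in feature_map for v in row)
    return 1.0 / (1.0 + math.exp(-(strongest + bias)))

def loss(probability, target):
    """Binary cross-entropy between the model's guess and the target (1 or 0)."""
    return -(target * math.log(probability) + (1 - target) * math.log(1 - probability))

# Hand-set filter (an assumption): responds to a bright pixel on a darker background.
KERNEL = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]
BIAS = -4.0

# Made-up raw pixel grids: one bright point, and one evenly lit patch of sky.
point_source = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
flat_sky = [[1] * 5 for _ in range(5)]

for name, image, target in [("point_source", point_source, 1), ("flat_sky", flat_sky, 0)]:
    guess = predict(image, KERNEL, BIAS)
    print(name, "guess:", round(guess, 3), "loss:", round(loss(guess, target), 3))
```

No features are computed by hand at any point: the pixels go in directly, and the target is only used to measure how wrong the guess was.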

“You give it an image and you get [the model] to guess if it’s a supernova or not,” Lee said. “It slowly learns the patterns, and so the question we were asking is – can we use this technique both to eliminate the feature engineering step (which is time-consuming), but also to use [CNN] to create a more accurate model for supernova classification?”

By speaking with astronomers and comparing space2vec’s metrics against Autoscan’s, the team at Dessa found that space2vec improved accuracy by 10 percent over Autoscan while roughly halving the total time spent on the image classification process.

Currently, the team is working on applying space2vec to other astronomy data. The team is interested in data from SETI (the search for extraterrestrial intelligence) and from CHIME, an instrument used for detecting radio signals.

Lee said that both the astronomy and machine learning fields can be intimidating, but there are many open-source tools and datasets that help lower the barrier to entry.

“We’re at this really interesting point in time where you don’t have to have a PhD in astronomy,” Lee said. “We’ve opened our space2vec code and we’ve written about this process in hopes to show people that you don’t need to be an expert to do this.” He added, “I think trying to lower the bar, or make it more accessible, is a big goal for Dessa but also us as software engineers.”

The space2vec code can be downloaded here.

Featured image courtesy Dessa.

Sera Wong

Heyo, Sera here. I love infographics, organizing data, and making lists. I’m an avid lover of cats. Please send cat pics my way at @Sera_wong on Twitter.