An exoplanet is a planet outside our solar system. There are currently 4,914 confirmed exoplanets, with a further 8,493 identified as possible candidates. NASA missions such as Hubble, Kepler and the new James Webb telescope are used to image millions of stars, and these images can then be searched for possible exoplanets. The volume of data retrieved from these missions is a prime example of a problem where human categorisation alone would not be practical, but where the correct application of machine learning can help narrow the search and identify candidates for further investigation.

The aim of my project was to compare the performance of the machine learning methods currently used to predict the probability that a candidate is an exoplanet with that of newer machine learning methods. The project involved extracting image data related to exoplanet candidates from NASA telescopes through a web API, pre-processing it, and then applying two different machine learning models, both locally and in the cloud, to predict the existence of an exoplanet. Statistical tests were then applied to the results to determine whether the newer methods provide an increase in performance in terms of prediction accuracy, recall, precision and F1 score.
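
For readers unfamiliar with the four metrics named above, the following is a minimal sketch of how they are defined for a binary classifier. It is illustrative only and is not the project's actual code: the function names `confusion_counts` and `classification_metrics` are hypothetical, and the labels are assumed to be encoded as 1 (exoplanet) and 0 (not an exoplanet).

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 or 0)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # exoplanet correctly predicted
        elif truth == 0 and pred == 1:
            fp += 1  # non-exoplanet wrongly flagged as an exoplanet
        elif truth == 0 and pred == 0:
            tn += 1  # non-exoplanet correctly rejected
        else:
            fn += 1  # exoplanet missed by the model
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 score as a dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up labels for illustration only; not results from the project.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(classification_metrics(y_true, y_pred))
```

Recall measures how many true exoplanets a model finds, while precision measures how many of its positive predictions are correct. The F1 score is the harmonic mean of the two, which is why all four figures are reported together when comparing models.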

This project would be of interest to anyone curious about the applications of machine learning and about how different techniques compare when applied to the same problem domain.