Inspiration

StellarQuest grew out of a desire to better understand the universe and the objects that populate it. Astronomy has captured people's imaginations for centuries, and there is still much to discover. A classification system for stars, galaxies, quasars, and black holes can give insight into the physical properties of these objects, such as their temperature, density, magnetic field, and elemental composition. Advances in deep learning also make it possible to build a more efficient and accurate classifier that astronomers and researchers around the world can use. Ultimately, the project is driven by the human desire to explore and understand the universe around us.

What it does

StellarQuest classifies celestial objects (stars, galaxies, quasars, and black holes) using data from the Sloan Digital Sky Survey (SDSS). The project aims to make this classification efficient and accurate by applying deep learning techniques. The resulting classes can in turn support the study of physical properties such as temperature, density, magnetic field, and elemental composition. Ultimately, StellarQuest seeks to contribute to astronomy by providing a tool that astronomers and researchers worldwide can use.

How we built it

We undertook our project in three stages.

  1. The first stage involved gathering information by studying publications, research papers, and real-life examples.
  2. During the second stage, we planned the project by selecting the appropriate tech stack for the front-end, back-end, dataset, and machine learning model.
  3. In the third stage, we implemented our plan.

Our Machine Learning approach

  1. The dataset used in this project consists of 100,000 space observations taken by the SDSS (Sloan Digital Sky Survey).
  2. Each observation is described by 17 feature columns and 1 class column. The class column identifies the observation as a star, galaxy, or quasar.
  3. The feature columns include the object identifier, right ascension and declination angles, readings from the filters of the photometric system, run and field numbers, and redshift values.
  4. The dataset also includes plate IDs, modified Julian dates, and fiber IDs that provide additional information about each observation.
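
As a rough illustration of the layout described above, the sketch below separates the physical measurements from the bookkeeping identifiers and the class label. The column names (`obj_ID`, `alpha`, `delta`, and so on) and the two sample rows are assumptions made for this example; they are not taken from the project's code or data.

```python
import csv
import io

# Hypothetical column names: the write-up does not list them, so these are assumptions.
ID_COLUMNS = ["obj_ID", "run_ID", "field_ID", "plate", "MJD", "fiber_ID"]
FEATURE_COLUMNS = ["alpha", "delta", "u", "g", "r", "i", "z", "redshift"]
TARGET_COLUMN = "class"

# Two made-up rows for illustration only, not real SDSS measurements.
SAMPLE_CSV = """obj_ID,alpha,delta,u,g,r,i,z,run_ID,field_ID,redshift,plate,MJD,fiber_ID,class
1,135.7,32.5,23.9,22.3,20.4,19.2,18.8,3606,79,0.63,5812,56354,171,GALAXY
2,144.8,31.3,21.5,21.1,20.9,20.8,20.6,4518,119,1.42,10445,58158,427,QSO
"""


def load_features_and_labels(csv_text):
    """Return (features, labels), keeping only the physical measurement columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    features, labels = [], []
    for row in reader:
        # Identifier columns are skipped: they describe the observation, not the object.
        features.append([float(row[name]) for name in FEATURE_COLUMNS])
        labels.append(row[TARGET_COLUMN])
    return features, labels


X, y = load_features_and_labels(SAMPLE_CSV)
print(len(X), "rows,", len(X[0]), "features per row, labels:", y)
```

Dropping the identifier columns before training is a design choice in this sketch, not something the write-up states; whether they carry useful signal would need to be checked against the real data.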

Challenges we ran into

One of the major challenges we encountered was the complexity of the data we had selected. We also had to identify the type of machine learning problem we were dealing with. After analyzing the data, we determined that it was a multi-class classification problem, because the target column contains more than two classes.

Accomplishments that we're proud of

To better understand the data, we first gathered the background knowledge needed for analysis. This included the celestial sphere, an imaginary sphere onto which all objects in the sky can be projected, and the celestial equator, the great circle of that sphere lying in the same plane as Earth's equator. We also studied right ascension and declination, the two coordinates used to locate objects on the celestial sphere. Additionally, we familiarized ourselves with photometric systems, which define the filters used to measure the brightness of celestial objects, and with the UBV photometric system, which is used to classify stars by their colors. Redshift was another important concept: it reveals how an object in space is moving relative to us and lets astronomers estimate the distance to the most distant objects in the universe. Through this process we overcame the challenges above and gained a better understanding of the data, which helped us achieve the project's objectives.
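
To make the redshift concept concrete, here is a minimal sketch of the standard definition, z = (observed wavelength - rest wavelength) / rest wavelength. The function names and the example wavelengths are chosen for illustration; the dataset already supplies redshift values, so the project itself would not need to compute them this way.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458


def redshift(observed_wavelength, rest_wavelength):
    """Redshift z from an observed and a rest-frame wavelength (same units)."""
    if rest_wavelength <= 0:
        raise ValueError("rest_wavelength must be positive")
    return (observed_wavelength - rest_wavelength) / rest_wavelength


def approx_velocity_km_s(z):
    """Low-redshift approximation v = c * z; only reasonable when z is much less than 1."""
    return SPEED_OF_LIGHT_KM_S * z


# Illustrative values: a spectral line with rest wavelength 656.28 nm
# observed at 721.908 nm (hypothetical observation).
z = redshift(721.908, 656.28)
print(round(z, 3), round(approx_velocity_km_s(z), 1))
```

A positive z means the light has been shifted toward longer (redder) wavelengths, which is what is seen for objects receding from the observer.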

To check that our machine learning model was performing well, we chose multi-class log loss and the confusion matrix as performance metrics. Our objective was to predict, for each data point, the probability of it belonging to each class. We also had to respect certain constraints: the model needed to be interpretable, it had to output class probabilities, and errors in those probabilities had to be penalized. There were no latency constraints. We met these requirements by carefully selecting and implementing appropriate machine learning algorithms and techniques, and by rigorously testing and evaluating the model.
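
The two metrics can be sketched in plain Python as follows. This is an illustrative implementation, not the project's evaluation code; the class order in `CLASSES`, the label strings, and the example probabilities are all assumptions.

```python
import math

# Assumed label strings and column order for the predicted probabilities.
CLASSES = ["GALAXY", "STAR", "QSO"]


def multiclass_log_loss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log of the probability assigned to the true class."""
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        p = probs[CLASSES.index(label)]
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(true_labels)


def confusion_matrix(true_labels, predicted_probs):
    """Rows are true classes, columns are predicted classes (argmax of probabilities)."""
    n = len(CLASSES)
    matrix = [[0] * n for _ in range(n)]
    for label, probs in zip(true_labels, predicted_probs):
        predicted = max(range(n), key=lambda k: probs[k])
        matrix[CLASSES.index(label)][predicted] += 1
    return matrix


# Made-up predictions for four observations.
y_true = ["GALAXY", "STAR", "QSO", "GALAXY"]
y_prob = [
    [0.8, 0.1, 0.1],
    [0.1, 0.7, 0.2],
    [0.3, 0.1, 0.6],
    [0.4, 0.1, 0.5],  # a galaxy misclassified as a quasar
]

print(multiclass_log_loss(y_true, y_prob))
print(confusion_matrix(y_true, y_prob))
```

Log loss penalizes confident wrong probabilities more heavily than hesitant ones, which matches the stated need to penalize errors in class probabilities. The confusion matrix complements it by showing which classes are mistaken for which.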

What we learned

  1. Importance of understanding data: Learning key concepts such as the celestial sphere, the celestial equator, photometric systems, and redshift helped us understand the data in a short time.
  2. Types of machine learning problems: Identifying the type of machine learning problem helps in selecting the appropriate algorithms and techniques to solve the problem.
  3. Performance metrics: Selecting appropriate performance metrics helps in evaluating the performance of our machine learning model.
  4. Machine learning objectives and constraints: Defining the objectives and constraints of our machine learning project helps in ensuring that our model is optimized for the problem at hand.
  5. Model selection: Selecting appropriate models and techniques can improve the accuracy and efficiency of the models.
  6. Feature engineering: Feature engineering involves selecting, extracting, and transforming features from the data to improve the accuracy of a machine learning model.
  7. Practical skills: This project gave us an opportunity to develop practical machine learning skills, including data cleaning and preprocessing, model selection and optimization, and performance evaluation.
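
As one possible example of the feature engineering mentioned in the list above, the sketch below derives color indices (differences between magnitudes in adjacent SDSS bands u, g, r, i, z) from the filter readings. The write-up does not say which engineered features were actually used, so treat this as a hypothetical illustration; the dictionary keys and sample magnitudes are assumptions.

```python
# SDSS photometric bands, ordered from shortest to longest wavelength.
BANDS = ["u", "g", "r", "i", "z"]


def color_indices(magnitudes):
    """Return differences between adjacent-band magnitudes, e.g. 'u-g', 'g-r'."""
    return {
        f"{blue}-{red}": magnitudes[blue] - magnitudes[red]
        for blue, red in zip(BANDS, BANDS[1:])
    }


# Made-up magnitudes for a single observation.
sample = {"u": 20.0, "g": 19.0, "r": 18.5, "i": 18.25, "z": 18.0}
print(color_indices(sample))
```

Color indices describe the shape of an object's spectrum independently of its overall brightness, which is why they are a common candidate feature when classifying by color.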

What's next for StellarQuest

Now that we have built the StellarQuest web application with the Dash framework, the next step is to develop it into a user-friendly application in which users can input data and obtain a prediction of the type of celestial object they are observing. Such an application could benefit both amateur and professional astronomers by improving their ability to understand and analyze celestial phenomena, and it would expand the reach and impact of StellarQuest in the scientific community.
