- As ambitious high school students, we were drawn to research papers and essays. However, we noticed that our peers struggled to convey their projects and findings in writing, and that even professionals in advanced fields face the same difficulty. We came to realize that this problem affects not only our own communities but also students and industry professionals on a much larger scale. We therefore decided to develop a machine learning project that can autonomously draft essays from only a few inputs: key words, phrases, and a topic.
What it does
- CatalystAI allows students, researchers, and professionals alike to enhance their scientific research papers (and academic papers in general) by combining the GPT-3 model with an extensive knowledge graph extracted from a number of scientific papers. Because the knowledge graph contains relational data, the generated text is easier to understand, which helps the researcher produce higher-quality content while limiting unnecessary information.
How we built it
- To build this application in the limited time we had, we split it into three parts: GPT-3 model training and output testing, knowledge graph creation and relational data extraction, and a frontend UI for an interactive experience. To train the GPT-3 model, we collected research papers covering a range of topics within deep learning (to show the model's ability to generalize) and ran a training script. To create the knowledge graph, we used the networkx and spaCy libraries for graph construction and visualization and for part-of-speech extraction, respectively. We chose the Flask web framework so that we could easily pull data from GPT-3 and process information without having to send additional API requests. Once each part was complete, we combined them into one smooth pipeline. Everything was written in Python: training GPT-3, creating and extracting data from the knowledge graph, and setting up the Flask server.
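- To make the knowledge graph step more concrete, here is a minimal sketch of how relational data could be pulled from a graph and folded into a text-generation prompt. It is an illustration under stated assumptions, not our actual code: plain Python dictionaries stand in for a networkx graph, the hand-written triples stand in for what spaCy extraction would produce, and every name in it (`TRIPLES`, `build_graph`, `related_facts`, `build_prompt`) is hypothetical.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples. In a real pipeline these
# would come from part-of-speech extraction over the collected papers.
TRIPLES = [
    ("transformer", "uses", "self-attention"),
    ("transformer", "replaces", "recurrence"),
    ("self-attention", "computes", "token weights"),
    ("cnn", "uses", "convolution"),
]


def build_graph(triples):
    """Map each subject to a list of outgoing (relation, object) edges."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def related_facts(graph, keyword, depth=2):
    """Collect facts reachable from `keyword` within `depth` hops, breadth-first."""
    facts, frontier, seen = [], [keyword], {keyword}
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            # .get() avoids creating empty entries for nodes with no edges.
            for relation, obj in graph.get(node, []):
                facts.append(f"{node} {relation} {obj}")
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return facts


def build_prompt(topic, keywords, graph):
    """Assemble a prompt that lists each related fact once, in discovery order."""
    facts = []
    for keyword in keywords:
        for fact in related_facts(graph, keyword):
            if fact not in facts:
                facts.append(fact)
    lines = [f"Topic: {topic}", "Known relations:"]
    lines += [f"- {fact}" for fact in facts]
    lines.append("Write a clear paragraph using only the relations above.")
    return "\n".join(lines)


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    print(build_prompt("Transformers", ["transformer"], graph))
```

The resulting string is the kind of input that could be passed to a text-generation model. Limiting the traversal depth is one simple way to keep unrelated facts out of the prompt.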
Challenges we ran into
- We faced a number of challenges while creating this application. Since CatalystAI is essentially a full-stack application, the pipeline between the Flask Python backend and the HTML frontend was one of its most important components. Integrating the backend components (knowledge graph extraction and GPT-3 model output) with the frontend website was quite difficult, mainly because of problems with routing and with how our files were structured. After collaborating with each other, we fixed this by isolating each component and testing each part on its own instead of slapping everything together. The first problem we fixed was the GPT-3 model output, which was failing because our file structure was incorrect. After that, we solved the problem of knowledge graph data extraction.
Accomplishments that we're proud of
- We're proud of committing ourselves to tackling an issue that a large community of researchers and students faces on a daily basis. We're also proud of developing a realistic solution for research papers and essays: instead of struggling to develop an idea and convey our findings in our own papers, we can now let AI and GPT-3 efficiently draft the papers for us without killing the creativity.
What we learned
- We learned that cutting-edge deep learning technologies make it possible to create newer, more effective applications that can help people almost anywhere in the world. This was the first time we used the GPT-3 model to create an application that benefits researchers and aspiring specialists. We also learned about integrating pre-trained models and applying transfer learning to them, and we improved our ability to use knowledge graphs to extract relational data effectively. Finally, we learned how to connect both parts of the picture, the frontend and the backend, to create one beautiful and functional full-stack application.
What's next for CatalystAI
- Beyond Sigmoid Hacks, we plan to develop and scale this project into a way for students and researchers alike to efficiently write research papers, essays, and project proposals, and to use AI and advanced text generation to their advantage. We hope to expand our knowledge graph structure to cover more topics so that the application applies to many fields, not only deep learning. Additionally, we plan to host the application on AWS for scalability and to make sure it stays live.