Inspiration
As a graduate student, I conducted research and wrote research papers. I have seen that tools like Claude Code can automate the path from an idea or feature request to an implementation or a pull request. I wish we could do the same for the research process.
What it does
It helps you come up with novel, cross-disciplinary research hypotheses. Once you approve the hypotheses, it generates a research plan for testing them. Once you approve the experimental plan, you can trigger code generation to implement it. Once the code looks okay, you can ask the tool to execute it and generate the results. Once the results look okay, you can ask the tool to generate a research blog post or Twitter thread that expands on the hypotheses and the results obtained.
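The approval-gated flow above can be sketched as a small state holder in which each stage is locked until the user has approved every stage before it. This is only a minimal illustration: the stage names follow the description above, while `ResearchRun`, `submit`, and `approve` are hypothetical names and not the project's actual code.

```python
from dataclasses import dataclass, field

# Stage order follows the workflow described above. All names in this
# sketch are hypothetical and not taken from the project.
STAGES = ["hypotheses", "plan", "code", "results", "writeup"]


@dataclass
class ResearchRun:
    artifacts: dict = field(default_factory=dict)  # stage -> generated output
    approved: set = field(default_factory=set)     # stages the user signed off on

    def next_stage(self):
        """Return the first stage with no output yet, or None when done."""
        for stage in STAGES:
            if stage not in self.artifacts:
                return stage
        return None

    def submit(self, stage, output):
        """Store a stage's output only if every earlier stage is approved."""
        idx = STAGES.index(stage)
        blocked = [s for s in STAGES[:idx] if s not in self.approved]
        if blocked:
            raise PermissionError(
                f"{stage} is blocked until {blocked[0]} is approved"
            )
        self.artifacts[stage] = output

    def approve(self, stage):
        """Record the user's approval of a stage that already has output."""
        if stage not in self.artifacts:
            raise ValueError(f"nothing to approve for {stage}")
        self.approved.add(stage)
```

With this shape, calling `submit("plan", ...)` before `approve("hypotheses")` raises an error, which mirrors the "once you approve" checkpoints between steps.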
How we built it
We built the backend in Python with a Flask API. Claude generates the hypotheses and the experimental plan, the Claude Code SDK generates code for each experimental step in the plan, and GCP Cloud Runner executes the generated code. Finally, Claude writes a blog post or Twitter thread that explains the whole process.
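One way to picture how the text-generation steps are stitched together is a chain in which each stage's output becomes the next stage's input. The sketch below is an assumption-heavy illustration: the prompts are invented, and `generate` is a stand-in callable so that no real model API is called. In the actual project this role would be played by the model calls described above.

```python
from typing import Callable, Dict, Sequence

# Hypothetical prompt templates; the project's real prompts are not
# given in this write-up.
PROMPTS = {
    "hypotheses": "Propose cross-disciplinary research hypotheses about: {input}",
    "plan": "Write an experimental plan to test: {input}",
    "writeup": "Write a blog post summarising: {input}",
}


def run_stage(stage: str, upstream: str, generate: Callable[[str], str]) -> str:
    """Fill the stage's prompt with the upstream text and call the generator."""
    prompt = PROMPTS[stage].format(input=upstream)
    return generate(prompt)


def run_chain(
    topic: str,
    generate: Callable[[str], str],
    stages: Sequence[str] = ("hypotheses", "plan", "writeup"),
) -> Dict[str, str]:
    """Run the stages in order, feeding each output into the next stage."""
    outputs: Dict[str, str] = {}
    upstream = topic
    for stage in stages:
        upstream = run_stage(stage, upstream, generate)
        outputs[stage] = upstream
    return outputs


def fake_generate(prompt: str) -> str:
    """Stand-in for a model call so the sketch runs offline."""
    return f"[model output for: {prompt}]"


if __name__ == "__main__":
    for stage, text in run_chain("soil microbes", fake_generate).items():
        print(stage, "->", text)
```

Passing the generator in as a parameter keeps the chaining logic testable without network access. The code-generation and execution steps are left out here because they depend on services this sketch does not model.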
Challenges we ran into
We were attempting to build something new, so it took a lot of tinkering to get each individual step working, and then more work to stitch all the steps together.
Accomplishments that we're proud of
- It is a new product, and it has a sense of novelty to it.
- It is really cool to see the hypotheses it can generate.
- It is also cool to be able to go from a hypothesis to a research blog post automatically.
What we learned
- LLMs are surprisingly good at generating novel hypotheses.
- In some cases the research process can be templatised, and in those cases we can automate much of it.
- In cases where it cannot be templatised, the tooling will need more interaction with the user.
What's next for AI Scientist
- Release it to a small cohort of users: graduate students, particularly those who are working in a cross-disciplinary area.
- Make research like a CI/CD process, where the work is continuous from idea to output to iteration.
- Make it more agentic, where an agent can explore a research space autonomously.