Inspiration
When I was getting my PhD in chemical biology I noticed that a significant amount of information and connections was lost because of the sheer volume of data any single graduate student was expected to keep in their head. Long-term trends were easy to miss, and matching anomalous results against the newest literature to (maybe) discover something new took far too long to do routinely. Now, with agentic AI, it is clear to me that most of this legwork can be done by an agent, with the human scientist being presented with the findings that let them use their creativity more effectively.
What it does
My agent first scans the user's experiment files to formulate a query (for the demo it starts from a user query instead, since I obviously lack prior experimental data for the hackathon). It then searches academic literature (arXiv, bioRxiv, PubMed, and more) for the newest work related to the query and generates hypotheses. All calls go through TrueFoundry for its trace functionality (in addition to model and MCP routing), so we can track which articles in the literature (via the MCP calls that read or download them) support which hypothesis.
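The heart of this step is keeping the link between each hypothesis and the papers that support it. Here is a minimal sketch of that provenance structure (all names are hypothetical; in the real pipeline this bookkeeping rides on TrueFoundry's per-call traces):

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Paper:
    source: str   # e.g. "arXiv", "bioRxiv", "PubMed"
    ident: str    # source-specific identifier
    title: str

@dataclass
class Hypothesis:
    statement: str
    supporting: list = field(default_factory=list)  # papers read via MCP calls

    def add_support(self, paper: Paper) -> None:
        # Each MCP read/download of a paper is recorded as support,
        # mirroring the trace kept per call.
        if paper not in self.supporting:
            self.supporting.append(paper)

# Usage: one hypothesis backed by two retrieved papers (toy data).
h = Hypothesis("Compound X upregulates pathway Y under heat stress")
h.add_support(Paper("PubMed", "12345678", "Heat stress and pathway Y"))
h.add_support(Paper("bioRxiv", "2024.01.01.573000", "Compound X screening"))
```

Keeping `Paper` frozen makes it hashable and comparable, so duplicate reads of the same article do not double-count as evidence.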
The user can then select the most promising hypothesis, and the agent leverages the fork feature of Ghost to mirror how scientists aggregate matching data to test against the hypothesis (and then discard this "grouping" if it does not pan out), generating hypothesis data subsets to fork and refine the hypothesis.
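The fork-and-discard workflow can be sketched like this (a hypothetical stand-in; the actual forking goes through Ghost, whose API is not shown here):

```python
import copy

def fork_and_test(dataset, hypothesis_filter, supports_hypothesis):
    """Fork a subset of the data matching a hypothesis; keep the fork
    only if the grouped evidence holds up, otherwise discard it."""
    fork = copy.deepcopy([row for row in dataset if hypothesis_filter(row)])
    if supports_hypothesis(fork):
        return fork        # refined subset to keep iterating on
    return None            # "grouping" discarded, original data untouched

# Usage with toy experimental rows (made-up numbers for illustration).
data = [
    {"temp": 37, "yield": 0.9},
    {"temp": 42, "yield": 0.2},
    {"temp": 42, "yield": 0.3},
]
fork = fork_and_test(
    data,
    hypothesis_filter=lambda r: r["temp"] == 42,      # group by condition
    supports_hypothesis=lambda rows: len(rows) >= 2,  # enough matching evidence?
)
```

Working on a deep copy is the point: a discarded fork leaves the original dataset exactly as it was, which is what makes the try-and-throw-away loop safe.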
How we built it
We built this using TrueFoundry for agent routing, MCP routing, and trace functionality (only marginally implemented at present). The model is Claude (Opus 4.6 for reasoning, smaller models for other tasks). Queries against the academic literature are sent to the relevant MCPs, with results cached in Aerospike for fast retrieval later (the demo is scaled down, but a full research pipeline run can query >1000 papers for integration into hypotheses, so read time matters). This is then condensed, together with the user's experimental data, into datasets that are forked and iterated on via Ghost.
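The caching idea is simple: key each fetched paper by a stable hash of the query so a run touching >1000 papers only pays the retrieval cost once per paper. A dict-backed sketch of that layer (everything here is hypothetical; the real store sits behind Aerospike's key-value API):

```python
import hashlib

class PaperCache:
    """Cache fetched literature under a stable hash of the query string."""

    def __init__(self, fetch):
        self.fetch = fetch   # the expensive call out to an MCP server
        self.store = {}      # dict stand-in for an Aerospike namespace/set

    def key(self, query: str) -> str:
        return hashlib.sha256(query.encode()).hexdigest()

    def get(self, query: str):
        k = self.key(query)
        if k not in self.store:          # miss: fetch once, then reuse
            self.store[k] = self.fetch(query)
        return self.store[k]

# Usage: the second lookup is served from cache without refetching.
calls = []
def slow_fetch(q):
    calls.append(q)
    return f"paper text for {q}"

cache = PaperCache(slow_fetch)
cache.get("CRISPR off-target effects 2024")
cache.get("CRISPR off-target effects 2024")  # cache hit, no second fetch
```

Hashing the query rather than using it raw keeps keys fixed-length, which maps cleanly onto a key-value store's record keys.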
Code was generated mostly via Kiro and Claude Code, with some written by hand (purely for the retro vibes, I promise).
Challenges we ran into
Designing the storage was hard. It seemed really dumb to use two different database providers, but in the end it was clear that the functionality of each (Aerospike's fast graph and datastore for large amounts of rapidly queried data, and Ghost for iterative refinement of datasets and supporting evidence) really did make sense.
The other issue I ran into was paywalls on the scientific literature. I used open-access papers, but at scale this would require agreements with publishers.
Accomplishments that we're proud of
Getting the pipeline working! This is my first agentic project and I am super proud that it works.
What we learned
The importance of data conforming to expectations at each stage of the process. LLMs can only reason through so much odd formatting before it becomes a hindrance to performance.
What's next for Agent:PostDoc
Honestly not sure, def either open source it or explore whether people actually want to use it (besides me), and if so, startup??