Inspiration

Researchers, even when using chatbots in their lab, still spend an exorbitant amount of time hand-parsing data and writing code to feed into R packages such as InstaPrism, after which they still have to run their own analysis. Our project helps bridge the R-to-Web gap: bioinformatics tools like InstaPrism often output complex R objects (.rds). By using Gemini's reasoning, the agent can autonomously convert these binary "blobs" into web-friendly Chart.js JSON without the researcher having to write a parser for every possible edge case.

What it does

Lunalysis is an AI-powered bioinformatics platform that allows researchers to analyze complex biological data (like cell deconvolution) using natural language. It bridges the gap between raw data storage and interactive visualization. Using an Agentic Workflow, it can autonomously navigate a database, retrieve private research files from S3, execute specialized R and Python code in a secure sandbox, and return interactive, high-level charts directly to the user. These projects, along with their specific context, are saved in our PostgreSQL database so that the user can access them later.

How we built it

We built an architecture designed for scalability and data security:

  • Gemini 3 Flash acts as the reasoning engine, using Tool Use to interact with our infrastructure.
  • AWS S3 stores heavy binary files (like .rds R objects), while PostgreSQL manages project metadata and file pointers.
  • We used E2B Sandboxes configured with R, Python, and pyreadr. This allows our Agent to run code in an isolated environment without local overhead.
  • A Python-based backend generates S3 Presigned URLs, ensuring that sensitive research data is never public but is still accessible to the Agent.
  • LangChain to query the database.

Challenges we ran into

One hurdle was the R-to-Web translation. Bioinformatics data often lives in specialized R formats that aren't native to web browsers, which meant we had to teach our Agent how to reach into an .rds file using Python inside a sandbox and transform specific dataframes into structured Chart.js JSON. Data storage was another hard problem: we spent a long time trying (and failing) to work out how to safely and securely execute AI-generated code on the data.
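The dataframe-to-Chart.js step looks roughly like the sketch below. In the sandbox the DataFrame would come from pyreadr's `read_r`; here we substitute a small hand-made frame, and the column names and cell-type fractions are hypothetical examples, not real output.

```python
import json
import pandas as pd

# In the sandbox the frame comes from pyreadr, e.g.:
#   import pyreadr
#   result = pyreadr.read_r("deconvolution.rds")  # dict of DataFrames
#   df = next(iter(result.values()))
# Here we stand in a small illustrative frame instead.
df = pd.DataFrame({
    "cell_type": ["T cells", "B cells", "Monocytes"],
    "fraction": [0.52, 0.18, 0.30],
})

def to_chartjs(df: pd.DataFrame, label_col: str, value_col: str) -> str:
    """Flatten a deconvolution DataFrame into a Chart.js bar-chart config."""
    config = {
        "type": "bar",
        "data": {
            "labels": df[label_col].tolist(),
            "datasets": [{
                "label": "Estimated cell fraction",
                "data": df[value_col].tolist(),
            }],
        },
    }
    return json.dumps(config)

chart_json = to_chartjs(df, "cell_type", "fraction")
```

The agent returns this JSON string to the frontend, which hands it straight to `new Chart(ctx, config)`, so no per-chart API endpoint is needed.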

Accomplishments that we're proud of

We are very proud of our secure data loop: we built a system where a user can ask a question and the AI retrieves a private file, analyzes it, and produces a graph, all without the file ever being exposed to the public internet or the user needing to touch a single line of code. Getting the E2B sandbox to spin up with a full R/Python environment relatively quickly was also a real accomplishment.

What we learned

Instead of writing an API for every type of chart, we learned to provide the Agent with the right tools and a clear mental model of the data. We also deepened our understanding of cloud security, specifically how to use IAM roles and temporary credentials to keep sensitive biological data locked down.
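The "right tools plus a clear mental model" idea can be sketched framework-agnostically. This is not the real Gemini or LangChain wiring; it just assumes the model emits a tool name plus JSON arguments, and the tool names and stub bodies below are hypothetical.

```python
# Registry of tools the agent may call; each docstring doubles as the
# description the model reads to decide when to use the tool.
TOOLS = {}

def tool(fn):
    """Register a function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def list_project_files(project_id: int) -> list[str]:
    """Return the file keys stored for a project (stubbed here)."""
    return [f"projects/{project_id}/deconvolution.rds"]

@tool
def fetch_file_url(key: str) -> str:
    """Return a presigned URL for a private file (stubbed here)."""
    return f"https://example.invalid/signed/{key}"

def dispatch(call: dict):
    """Route a model-emitted tool call like {'name': ..., 'args': {...}}."""
    return TOOLS[call["name"]](**call["args"])

result = dispatch({"name": "list_project_files", "args": {"project_id": 42}})
```

With a small, well-described toolset like this, the model composes the calls itself instead of us hard-coding an endpoint per chart type.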

What's next for Lunalysis

The next step is expanding our Tool Library and generalizing to broader types of data, while also scaling to a much larger volume of data.

Built With

  • Gemini
  • AWS S3
  • PostgreSQL
  • E2B
  • Python
  • R
  • LangChain
  • Chart.js
