Inspiration

We observed that founders lose a lot of time to documentation and checking bills, and find it hard to keep a record of every meeting. That's why we built Novus (Latin for 'new'), an app that unifies communication, accounts and documents in a single, intelligent interface powered by the reasoning capabilities of Gemini 3.0.

What it does

Novus is organized into three main departments:

  1. Communication: Start a live meeting or upload a recorded one, and our multilingual summarizer produces a summary with the meeting goals and a plan of action.
  2. Documents: Generate editable legal documents such as NDAs and contracts from a single prompt. You can also upload a document to flag potential risks in it.
  3. Accounts: A forensic CFO that scans receipts via camera, logs them into a master ledger, and detects financial anomalies.

How we built it

We built Novus using an agentic architecture, where specialized AI agents handle specific domains.

  1. The "Brain": Google AI Studio & Gemini 3.0
     We prototyped all our agents in Google AI Studio. This was crucial for testing our prompts before writing a single line of code.
     Communication Agent: Uses Gemini 3.0 Flash for its speed and massive context window. We utilize the File API to upload long audio recordings of meetings. The model processes the audio directly (multimodal) to extract action items.
     Forensic Agent: Uses Gemini 3.0 Pro Vision. We upload images of receipts, and the model extracts the Merchant, Date, and Total. Crucially, we use System Instructions to force the output into strict JSON format for our database.
     Legal Agent: Uses Gemini’s reasoning capabilities to draft contracts.
  2. The "Body": Node.js & Firebase
     Backend: We exported our Gemini code from AI Studio into a Node.js (Express) server.
     Auth & Database: We used Firebase Authentication to secure the platform (ensuring only the CEO can see the "Accounts" tab) and the Firebase SDK to sync data in real-time across the dashboard.
     Integration: The "Accounts" agent pushes validated data directly to a master spreadsheet using the Google Sheets API, creating a live dashboard for investors.
  3. Hosting
     We deployed the full stack on Replit, allowing our distributed team to collaborate on the backend and frontend simultaneously.
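
The hand-off from a validated receipt to the master spreadsheet can be sketched as a small mapping step. This is a minimal illustration, not the project's actual code: the helper names `toLedgerRow` and `toAppendBody`, the lowercase field names, and the column order are all assumptions. The only detail taken from the Google Sheets API is that an append request carries its rows as a two-dimensional `values` array.

```javascript
// Hypothetical helper: turn one validated receipt into a spreadsheet row.
// The column order (date, merchant, total, flagged) is an assumption.
function toLedgerRow(receipt, flagged = false) {
  const total = receipt.total == null ? NaN : Number(receipt.total);
  if (!receipt.merchant || !receipt.date || !Number.isFinite(total)) {
    throw new Error("Receipt is missing merchant, date, or a numeric total");
  }
  // Store the total with two decimals so every ledger cell has the same format.
  return [receipt.date, receipt.merchant, total.toFixed(2), flagged ? "YES" : ""];
}

// Build the body of an append request: one inner array per ledger row.
function toAppendBody(receipts) {
  return { values: receipts.map((r) => toLedgerRow(r)) };
}
```

Rejecting incomplete receipts before they reach the sheet keeps the investor-facing ledger free of half-filled rows.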

Challenges we ran into

The "JSON Roulette": Early on, the models would sometimes wrap their replies in conversational filler ("Here is your JSON..."), which broke our database parsers. Solution: We used Gemini's Structured Output mode, setting the response MIME type to application/json, and refined our System Prompts in AI Studio to enforce strict syntax.
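
Even with structured output enabled, a defensive parser on the backend is a cheap safety net. The sketch below is an illustration rather than the fix described above: `parseReceiptReply` is a hypothetical helper, and the lowercase field names `merchant`, `date`, and `total` are assumed from the Merchant, Date, and Total fields the Forensic Agent extracts.

```javascript
// Hypothetical parser for a model reply that should contain one JSON object.
function parseReceiptReply(replyText) {
  // Keep only the span from the first "{" to the last "}" so that filler
  // such as "Here is your JSON..." around the object is ignored.
  const start = replyText.indexOf("{");
  const end = replyText.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model reply");
  }
  const data = JSON.parse(replyText.slice(start, end + 1));

  // Assumed schema: all three fields present, total given as a number.
  for (const key of ["merchant", "date", "total"]) {
    if (!(key in data)) throw new Error(`Missing field: ${key}`);
  }
  if (typeof data.total !== "number" || !Number.isFinite(data.total)) {
    throw new Error("total must be a finite number");
  }
  return data;
}
```

A reply that fails these checks can be retried or sent to manual review instead of being written to the database.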

Latency vs. Intelligence: The "Pro" model was smart but slower for chat. The "Flash" model was fast but missed legal nuances. Solution: We built a "Router" in our backend. Simple queries go to Flash; complex document scanning goes to Pro.
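
A router of this kind can be as small as one function. The sketch below is a guess at the idea, not the project's implementation: the model identifiers are placeholders, and the request fields (`task`, `prompt`, `hasDocument`) and the 2000-character threshold are assumptions chosen for illustration.

```javascript
// Placeholder identifiers; substitute the real model names in use.
const FAST_MODEL = "flash-model";
const DEEP_MODEL = "pro-model";

// Assumed heuristic: attached documents, legal or document-scanning tasks,
// and very long prompts go to the slower model; everything else stays fast.
function chooseModel(request) {
  const { task = "chat", prompt = "", hasDocument = false } = request;
  const isComplexTask = task === "legal" || task === "document-scan";
  const isLongPrompt = prompt.length > 2000;
  return hasDocument || isComplexTask || isLongPrompt ? DEEP_MODEL : FAST_MODEL;
}
```

Keeping the rule in one pure function makes it easy to tune the threshold later without touching the request handlers.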

Multimodal File Handling: Passing binary image data from the frontend to the Gemini API was tricky. We had to Base64-encode the uploaded image bytes so the receipts could be sent inline in the request and the Vision model could "see" them.
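
The encoding step can be sketched as follows. This is an assumption-laden illustration: `buildReceiptRequest` is a hypothetical helper, the prompt strings are invented, and the field names follow the Gemini API's JSON request format as we understand it, so they should be checked against the current API reference. The function only builds the request body; it makes no network call.

```javascript
// Build a generateContent-style request body with the receipt image inline.
// Assumption: imageBytes is a Node.js Buffer holding the uploaded file.
function buildReceiptRequest(imageBytes, mimeType = "image/jpeg") {
  return {
    systemInstruction: {
      parts: [{ text: "Extract merchant, date, and total. Reply with JSON only." }],
    },
    contents: [
      {
        role: "user",
        parts: [
          // Binary data must travel as a Base64 string inside the JSON body.
          { inlineData: { mimeType, data: imageBytes.toString("base64") } },
          { text: "Read this receipt." },
        ],
      },
    ],
    generationConfig: { responseMimeType: "application/json" },
  };
}
```

For large files such as hour-long meeting recordings, uploading through the File API instead of inlining the bytes avoids inflating the request body.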

Accomplishments that we're proud of

Seamless Multimodality: We successfully integrated Gemini's Vision (Receipts), Audio (Meetings), and Text (Contracts) capabilities into one cohesive UI. It feels like magic when you drop a receipt image and watch it appear on a Google Sheet instantly.

Real-Time Sync: Connecting the AI agents to Firebase meant that when the "Finance Agent" detected fraud, the alert appeared on the CEO's dashboard in milliseconds without refreshing the page.

The "Context" Breakthrough: We managed to make the Communication agent "remember" details from the beginning of a long meeting audio file by leveraging Gemini's massive context window, something standard RAG pipelines struggle to do.

What we learned

Context is King: Gemini's Long Context Window changed how we thought about data. We didn't need to chop up meetings into small clips; we could just feed the whole hour-long audio to the model, and it understood the entire context of the negotiation.

Agent Specialization: One giant prompt doesn't work. Breaking the system into "Departments" (Legal Agent, Finance Agent) made the AI significantly more reliable.
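
One way to picture the split is a registry that gives each department its own narrow system instruction. The sketch below is hypothetical: the instruction texts, the department keys, and the `buildAgentPrompt` helper are invented for illustration and are not the prompts used in Novus.

```javascript
// Hypothetical registry: one narrowly scoped system instruction per department.
const AGENTS = new Map([
  ["communication", "Summarize the meeting. List goals and action items."],
  ["legal", "Draft or review legal documents. Flag risky clauses."],
  ["finance", "Extract receipt fields and report anomalies."],
]);

// Pair the user's input with only the instruction for the chosen department.
function buildAgentPrompt(department, userInput) {
  const systemInstruction = AGENTS.get(department);
  if (systemInstruction === undefined) {
    throw new Error(`Unknown department: ${department}`);
  }
  return { systemInstruction, userInput };
}
```

Adding a new department then means adding one registry entry rather than growing a single giant prompt.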

The Power of AI Studio: Being able to "Get Code" directly from the prototyping environment saved us hours of boilerplate coding.

What's next for Novus-Startup Intelligence Suite

Actionable Agents: Currently, Novus drafts emails and flags anomalies. The next step is giving it permission to send the email or freeze the corporate card automatically (Human-in-the-loop actions).

Mobile App: Startups happen on the go. We plan to wrap our React frontend into a React Native app so founders can snap receipts at dinner.

HR Department: We want to add a fourth agent, "The HR Lead," which can screen resumes and schedule interviews, completing the "C-Suite in a Box" vision.
