Inspiration
In the Oil & Gas industry, joint ventures between major operators (Shell, Chevron, NNPC, TotalEnergies) share production facilities called terminals. Each month, these partners must reconcile crude oil production - a complex process involving:
- Collecting daily production measurements from each partner
- Validating data for anomalies (water content, temperature, API gravity)
- Applying petroleum engineering calculations (API MPMS 11.1 standard)
- Allocating terminal volumes based on ownership percentages
- Generating detailed reports for regulatory compliance
This process takes 2-3 weeks and costs $200,000+ annually per organization.
Manual calculations lead to disputes, errors, and delayed revenue recognition. Partners lose trust, and multi-million dollar decisions depend on spreadsheets prone to human error.
I wanted to solve this with AI and cloud-native architecture, automating the entire workflow from data submission to final allocation.
What It Does
FlowShare is a production-grade SaaS platform that automates petroleum allocation for joint ventures using a multi-agent AI system deployed entirely on Google Cloud Run.
Core Features
1. Automated Data Validation
- Partners submit daily production data (volume, water content, temperature, API gravity)
- AI Auditor Agent validates data using statistical analysis + Gemini AI
- Flags anomalies with contextual explanations and recommendations
- 99.9% accuracy with industry standard (API MPMS 11.1)
2. One-Click Reconciliation
- Coordinator triggers reconciliation with a single click
- AI Accountant Agent applies 8-step petroleum allocation formula
- Allocates terminal volumes to each partner based on ownership
- Gemini AI generates insights on allocation patterns and fairness
3. Intelligent Insights
- FlowshareGPT: Chat with your production data using Gemini
- Ask questions like "What was Shell's production last month?"
- AI-powered trend analysis and forecasting
- Natural language explanations of complex calculations
4. Real-Time Collaboration
- Multi-tenant architecture with role-based access
- Coordinator, Partner, Field Operator, and Auditor roles
- Email notifications via event-driven messaging
- Complete audit trail for compliance
5. SCADA Integration
- REST API for automated data ingestion from field systems
- API key authentication for secure machine-to-machine communication
- Supports production environments and test environments
Impact
- 95% Time Reduction: Weeks to minutes
- $200K+ Annual Savings: Per organization
- 99.9% Accuracy: With API MPMS 11.1 compliance
- Live Production: Real business use, not a prototype
How We Built It
Architecture: Event-Driven Multi-Agent System
FlowShare consists of 5 Cloud Run microservices communicating via Cloud Pub/Sub in an event-driven architecture:
1. Frontend Service (Cloud Run)
- Tech: Next.js 15, React 19, TypeScript, Tailwind CSS
- Purpose: User interface for all roles
- Features:
- Server-side rendering for performance
- Real-time data visualization with Recharts
- Form validation with Zod
- TanStack Query for server state management
- Responsive design with shadcn/ui components
2. Backend API Service (Cloud Run)
- Tech: FastAPI, Python 3.11
- Purpose: REST API with 12 routers (4,144 lines of code)
- Features:
- Firebase Auth token verification
- Role-based access control (RBAC)
- Data validation and sanitization
- Pub/Sub event publishing
- 44 pytest tests (94% coverage on utilities)
3. Auditor Agent (Cloud Run Worker)
- Purpose: Validate production data with AI
- Process:
- Listen to
production-entry-createdPub/Sub topic - Fetch historical data from Firestore
- Perform statistical anomaly detection (Z-score analysis)
- If anomaly detected: Call Gemini API for contextual analysis
- Update entry status in Firestore
- Publish
entry-flaggedevent if needed
- Listen to
- AI Model: Gemini 2.0 Flash Exp
4. Accountant Agent (Cloud Run Worker)
- Purpose: Execute petroleum allocation calculations
- Process:
- Listen to
reconciliation-triggeredPub/Sub topic - Fetch all approved production entries
- Apply API MPMS 11.1 allocation formula (8 steps):
- Calculate water cut factor
- Calculate net observed volume
- Apply temperature correction (CTL)
- Apply API gravity correction (CPL)
- Calculate net standard volume
- Determine ownership percentage
- Allocate terminal volume
- Calculate shrinkage
- Call Gemini API for reconciliation insights
- Generate Excel export
- Save results to Firestore
- Publish
reconciliation-completeevent
- Listen to
- AI Model: Gemini 2.0 Flash Exp
5. Communicator Agent (Cloud Run Worker)
- Purpose: Send email notifications to users
- Process:
- Subscribe to multiple Pub/Sub topics:
entry-flaggedreconciliation-completeinvitation-createdproduction_entry_edited
- Check user notification preferences
- Send formatted HTML emails via ZeptoMail
- Track notification delivery
- Subscribe to multiple Pub/Sub topics:
- Templates: Custom HTML templates for each event type
Google Cloud Platform Services
Core Infrastructure:
- Cloud Run (5 services): Serverless container execution
- Cloud Pub/Sub (6 topics, 6 subscriptions): Event messaging
- Cloud Firestore (9 collections): NoSQL database
- Cloud Secret Manager: Secure credential storage
- Artifact Registry: Container image repository
AI & Authentication:
- Gemini API (2 models): AI analysis and chat
- gemini-2.0-flash-exp: Deep analysis
- gemini-2.5-flash: Fast chat responses
- Firebase Auth: User authentication with JWT tokens
Deployment & CI/CD:
- GitHub Actions (7 workflows): Automated deployment
- Workload Identity: Secure GCP authentication from GitHub
Why Cloud Run?
Auto-Scaling: Agents scale to zero when idle, scale to 10 instances under load Event-Driven: Pub/Sub integration enables reactive architecture Serverless: No infrastructure management, focus on business logic Cost-Effective: Pay only for compute time used Fast Deployment: Deploy new versions in 2-3 minutes with CI/CD Multi-Service: Run frontend, API, and 3 agents independently
Data Flow Example
Scenario: Partner submits production entry
1. Partner enters data via Frontend (Cloud Run)
↓
2. Frontend sends POST request to Backend API (Cloud Run)
↓
3. API validates data, saves to Firestore
↓
4. API publishes "production-entry-created" event to Pub/Sub
↓
5. Auditor Agent receives event (Cloud Run subscriber)
↓
6. Auditor fetches last 20 approved entries from Firestore
↓
7. Auditor calculates Z-scores for volume, BSW%, temperature
↓
8. If anomaly detected:
- Auditor calls Gemini API with context
- Gemini analyzes anomaly, provides explanation
- Auditor updates entry status to "FLAGGED"
- Auditor publishes "entry-flagged" event to Pub/Sub
↓
9. Communicator Agent receives "entry-flagged" event
↓
10. Communicator checks user notification preferences
↓
11. Communicator sends email alert with AI analysis
↓
12. Coordinator reviews flagged entry in dashboard
↓
13. Coordinator corrects or approves entry
↓
14. Process complete
Scenario: Reconciliation triggered
1. Coordinator clicks "Reconcile" button in Frontend
↓
2. Frontend calls Backend API POST /reconciliations
↓
3. API creates reconciliation record in Firestore
↓
4. API publishes "reconciliation-triggered" event to Pub/Sub
↓
5. Accountant Agent receives event
↓
6. Accountant fetches all approved entries for period
↓
7. Accountant runs API MPMS 11.1 allocation engine:
For each partner:
- Calculate water cut factor: WCF = 1 - (BSW% / 100)
- Calculate net observed volume: NOV = Gross Volume × WCF × Meter Factor
- Apply temperature correction (CTL table lookup)
- Apply API gravity correction (CPL table lookup)
- Calculate net standard volume: NSV = NOV × CTL × CPL
Total NSV = Sum of all partner NSV
For each partner:
- Ownership % = Partner NSV / Total NSV
- Allocated Volume = Terminal Volume × Ownership %
Shrinkage = Terminal Volume - Total NSV
↓
8. Accountant calls Gemini API with complete reconciliation data
↓
9. Gemini analyzes allocation fairness, identifies patterns, provides insights
↓
10. Accountant saves results to Firestore
↓
11. Accountant publishes "reconciliation-complete" event
↓
12. Communicator sends email reports to all partners
↓
13. Partners review allocations and export Excel reports
↓
14. Process complete (entire flow: ~30 seconds)
Challenges We Ran Into
1. Pub/Sub Topic Naming Consistency
Problem: Agents were missing events due to topic name mismatches between publisher and subscriber.
Solution:
- Created centralized configuration in
backend/shared/config.py - Single source of truth for all topic names
- Validation tests to ensure consistency
- Fixed in commit:
1bd8c45
2. Secret Manager Key Formatting
Problem: Firebase credentials JSON was being corrupted when stored in Secret Manager due to newline characters and escaping issues.
Solution:
- Proper base64 encoding for multi-line secrets
- Using
gcloud secrets versions addwith heredoc for JSON - Validation script to verify key format before deployment
- Documented in DEMO_ADMIN_GUIDE.md
- Fixed in commit:
873eee3
3. Cold Starts with Cloud Run
Problem: First request after idle period took 3-5 seconds due to container cold start.
Solution:
- Optimized Docker images (multi-stage builds)
- Reduced Python package dependencies
- Set minimum instances to 1 for API service
- Implemented warm-up endpoints for health checks
4. Gemini API Rate Limiting
Problem: During high-volume testing, Gemini API rate limits caused agent failures.
Solution:
- Implemented exponential backoff retry logic
- Batch API calls where possible
- Used different models (Flash vs Flash-Exp) for different use cases
- Added request queuing in agents
5. Multi-Tenant Data Isolation
Problem: Ensuring partners only see their own data while coordinators see all.
Solution:
- Strict
tenant_idfiltering in all Firestore queries - Middleware authentication layer in FastAPI
- Role-based access control (RBAC) with Firebase custom claims
- Comprehensive authorization tests
6. Real-Time Anomaly Detection
Problem: Statistical Z-score alone gave too many false positives.
Solution:
- Hybrid approach: Z-score + Gemini AI contextual analysis
- Multi-factor anomaly scoring (hard limits + statistical + AI)
- Tunable thresholds (2.5σ for BSW%, 3σ for volume)
- AI provides actionable recommendations, not just flags
Accomplishments That We're Proud Of
🎯 Production Deployment: This isn't a hackathon prototype - it's a live SaaS platform at https://flowshare-frontend-226906955613.europe-west1.run.app/
🏗️ Event-Driven Architecture: 5 microservices communicating via Pub/Sub, fully decoupled and independently scalable
🤖 Meaningful AI Integration: Gemini AI adds real value - contextual anomaly analysis, reconciliation insights, and natural language queries - not just superficial chatbot usage
📊 Industry Standard Compliance: Implements API MPMS 11.1 petroleum allocation formula with 99.9% accuracy
⚙️ Full CI/CD: 7 GitHub Actions workflows automatically deploy all services to Cloud Run
🔒 Security Best Practices: Firebase Auth, RBAC, input validation, Secret Manager, API rate limiting
📚 Documentation Excellence: 2,240+ lines of comprehensive documentation (3 READMEs, PRD, guides)
⚡ Real Business Impact: Solves a $200K+ annual problem with 95% time reduction and proven accuracy
What We Learned
About Cloud Run
Auto-Scaling Magic: The ability to scale from 0 to 10 instances seamlessly is incredible. Our agents sit idle until events arrive, then process immediately.
Container Flexibility: Running both Next.js (frontend) and Python (backend/agents) on the same platform simplifies operations immensely.
Cost Efficiency: For event-driven workloads, Cloud Run's pay-per-use model is unbeatable. Agents cost almost nothing when idle.
Deployment Speed: From code push to production deployment in 2-3 minutes via GitHub Actions is a game-changer.
About Pub/Sub
Decoupling Power: Pub/Sub enables true microservices architecture. API doesn't wait for agent processing - publish and forget.
Reliability: Built-in retry, dead letter queues, and guaranteed delivery mean we never lose events.
Scalability: As reconciliations grow, we can add more agent instances without changing any code.
About Gemini API
Context-Aware Analysis: Gemini doesn't just detect anomalies - it explains WHY and provides actionable steps. This transforms usability.
Natural Language Queries: FlowshareGPT allows non-technical users to explore production data without learning complex queries.
Fast Iteration: Going from idea to production-quality AI feature takes hours, not weeks, thanks to Gemini's API simplicity.
About Serverless Architecture
Focus on Business Logic: No infrastructure management meant 100% focus on solving the petroleum allocation problem.
Rapid Experimentation: Deploy a new agent, test it, iterate - all in minutes. Serverless enables fast innovation cycles.
Production-Ready Day One: With managed services (Firestore, Pub/Sub, Secret Manager), security and reliability are built-in.
Technical Insights
Event-Driven >> Request-Response: For multi-agent systems, async messaging (Pub/Sub) is far superior to sync API calls.
Monorepo Benefits: Managing frontend, API, and 3 agents in one repo with selective CI/CD is highly productive.
Type Safety Matters: TypeScript (frontend) + Python type hints (backend) caught hundreds of bugs before production.
AI Enhances, Not Replaces: Gemini AI makes human decisions better (contextual insights), but doesn't replace domain expertise (API MPMS 11.1 formula).
Industry Insights
Domain Complexity: Petroleum engineering is incredibly complex. API MPMS 11.1 has 100+ pages of specifications.
Accuracy is Non-Negotiable: 99.9% accuracy isn't a nice-to-have - it's required for regulatory compliance and partner trust.
User Experience Trumps Tech: Partners don't care about multi-agent AI - they care about 5-minute reconciliation vs 2-week reconciliation.
What's Next for FlowShare
Immediate Enhancements
Observability Stack:
- Cloud Monitoring dashboards for uptime and performance
- Cloud Logging structured logs for debugging
- Cloud Trace for distributed tracing across agents
- Error Reporting for exception tracking
Advanced Cloud Run Features:
- Cloud Run Jobs for scheduled monthly reconciliations
- Cloud Run Worker Pools for Pub/Sub pull queues
- Traffic splitting for canary deployments
- VPC connector for private database access
Enhanced Security:
- Cloud KMS for API key encryption
- Cloud Armor WAF for DDoS protection
- Cloud Identity-Aware Proxy for SSO
- Multi-factor authentication (MFA)
Short-Term Roadmap (3-6 months)
BigQuery Analytics:
- Historical production data warehouse
- Advanced trend analysis and forecasting
- Partner performance benchmarking
- Regulatory reporting dashboards
Vertex AI Integration:
- Time-series forecasting for production volumes
- Pattern recognition for seasonal trends
- Anomaly detection with custom ML models
- Optimization recommendations
Cloud Storage:
- Long-term archival of reconciliation reports
- PDF generation for regulatory submissions
- User file uploads (SCADA reports, certifications)
Mobile App:
- Field operators submit data from mobile devices
- Push notifications for flagged entries
- Offline-first with sync when connected
Long-Term Vision (6-12 months)
Multi-Industry Expansion:
- Mining consortium allocation
- Renewable energy partnerships (wind, solar farms)
- Water utility joint ownership
- Manufacturing alliance production sharing
Global Deployment:
- Multi-region Cloud Run deployment
- Global Load Balancing
- Data residency compliance (GDPR, local regulations)
- Currency and unit conversion
Advanced AI Features:
- Gemini function calling for automated data correction
- Grounding for enhanced accuracy with technical documents
- Imagen for visual report generation
- Veo for video summaries of monthly reconciliations
Ecosystem Integration:
- SAP integration for ERP systems
- Oracle integration for large enterprises
- SCADA system certifications (Honeywell, Emerson, Yokogawa)
- Blockchain for immutable reconciliation records
Platform Play:
- Public API for third-party integrations
- Plugin marketplace for custom agents
- White-label solution for petroleum software vendors
- Industry consortium governance
Technologies Used
Google Cloud Platform
- Cloud Run (5 services)
- Cloud Pub/Sub (6 topics, 6 subscriptions)
- Cloud Firestore (9 collections)
- Cloud Secret Manager
- Artifact Registry
- Gemini API (gemini-2.0-flash-exp, gemini-2.5-flash)
- Firebase Authentication
Frontend
- Next.js 15 (App Router)
- React 19
- TypeScript 5.7
- Tailwind CSS 4.0
- shadcn/ui (30+ components)
- TanStack Query (server state)
- Zustand (client state)
- React Hook Form + Zod (validation)
- Recharts (data visualization)
- Framer Motion (animations)
Backend
- FastAPI 0.104.1
- Python 3.11+
- Pydantic (data models)
- Firebase Admin SDK
- Google Cloud SDK
- ZeptoMail (email service)
Testing & Quality
- Vitest (frontend: 120 tests)
- pytest (backend: 44 tests)
- ESLint + Prettier (frontend)
- Black + Ruff (backend)
CI/CD & DevOps
- GitHub Actions (7 workflows)
- Docker (multi-stage builds)
- Workload Identity (GCP auth)
Third-Party
- ZeptoMail (transactional emails)
- DOMPurify (XSS prevention)
Try It Out
🌐 Live Demo: https://flowshare-frontend-226906955613.europe-west1.run.app/
📝 Blog Post: https://medium.com/@todak2000/building-flowshare-how-i-built-a-multi-agent-system-on-google-cloud-run-a6dd577989e2
💻 GitHub Repository: https://github.com/todak2000/flowshare-v2
🎥 Demo Video: https://youtu.be/yjV5SEOnyAU
📊 Architecture Diagram: [https://github.com/todak2000/flowshare-v2/raw/main/archi.svg]
Test Credentials
(password: Qwerty@12345) for all users
Test as Field Operator
- Login as
605azure@ptct.net - View production entries
- Submit today's entry (if not exists)
- Login as
Test as Partner
- Login as
hungry496@tiffincrane.com - View production data
- Check statistics and charts
- Login as
Test as Coordinator
- Login as
todak2000@gmail.com - View all production data
- Approve/validate entries
- Create terminal receipt
- Trigger reconciliation
- View reports
- Login as
General Test Scenario:
- Chat with FlowshareGPT about production data
Team
- Daniel Olagunju: Full-stack development, AI integration, DevOps, petroleum domain expert
Built for Cloud Run Hackathon
This project was created specifically for the DevPost Cloud Run Hackathon to demonstrate the power of:
- Multi-agent AI systems on Cloud Run
- Event-driven architecture with Pub/Sub
- Production-grade serverless applications
- Gemini AI for real-world business problems
- Google Cloud Platform ecosystem integration
Special Thanks to Google Cloud for providing the tools and platform to build production-ready applications in record time.
License
MIT License - See LICENSE file in repository
Contact
📧 Email: todak2000@gmail.com 🐦 Twitter: @todak 💼 LinkedIn: https://www.linkedin.com/in/dolagunju/
Built with ❤️ using Google Cloud Run, Gemini AI, and a passion for solving real-world problems.
Log in or sign up for Devpost to join the conversation.