Inspiration
Tech teams spend days or weeks setting up data pipelines that should take minutes. By our estimate, 80% of backend developers end up doing pipeline work as a side task because they lack dedicated data engineering support. Existing solutions are either too expensive (enterprise pricing) or too complex (they require DevOps expertise). We watched engineers sink 10-20 hours into every pipeline setup while their companies paid for idle infrastructure they had forgotten to tear down.
What it does
DataFlow AI lets tech teams build data pipelines through conversation. Say "Create a pipeline from PostgreSQL to ClickHouse for audit logs" and in minutes you get a validated, secure pipeline configured automatically. The AI guides you through source selection, schema validation, transformation setup, and destination configuration. When you're done, one-click cleanup stops the bill.
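To make that concrete, here's roughly what one exchange looks like from a client's point of view. The endpoint path and payload shape below are illustrative, not our exact API:

```python
import requests

# Hypothetical endpoint and payload shape, for illustration only.
API = "https://dataflow.example.com/api/pipelines"

resp = requests.post(API, json={
    "prompt": "Create a pipeline from PostgreSQL to ClickHouse for audit logs",
}, timeout=30)
resp.raise_for_status()

pipeline = resp.json()
# The agent answers with the configuration it inferred, plus any
# follow-up questions (e.g., which tables to capture).
print(pipeline["status"], pipeline.get("follow_up_questions"))
```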
How we built it
- Frontend: Next.js + Tailwind + shadcn/ui
- Backend: FastAPI with Gemini 2.0 Flash via LangChain agents (see the sketch after this list)
- Data Pipeline: Confluent Kafka for real-time streaming
- Processing: ksqlDB & Apache Flink for transformations
- Destinations: ClickHouse, PostgreSQL, S3
- Auth: Firebase OAuth for secure access
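The conversational flow works by exposing each pipeline step to Gemini as a LangChain tool. Here's a minimal sketch of the pattern; the tool body is a stub (the real version calls the Confluent Cloud Admin API), and `GOOGLE_API_KEY` must be set in the environment:

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_google_genai import ChatGoogleGenerativeAI

@tool
def create_kafka_topic(name: str, partitions: int = 3) -> str:
    """Create a Kafka topic for a new pipeline stage."""
    # Stubbed for illustration; the real version provisions the topic
    # and records it so one-click cleanup can delete it later.
    return f"topic '{name}' created with {partitions} partitions"

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")

prompt = ChatPromptTemplate.from_messages([
    ("system", "You help users assemble data pipelines step by step."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

tools = [create_kafka_topic]
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

executor.invoke({"input": "Set up a topic for PostgreSQL audit-log CDC events"})
```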
Challenges we ran into
- Building a full CDC (Change Data Capture) pipeline within a hackathon timeframe (connector sketch after this list)
- Orchestrating multi-step pipeline creation through conversational AI
- Implementing dynamic schema validation across different source/destination types (simplified sketch after this list)
- Estimating costs in real time before pipeline deployment (toy cost model after this list)
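Most of the CDC work came down to connector configuration. Here's a minimal sketch of registering a Debezium PostgreSQL source connector through the Kafka Connect REST API; the hostnames, credentials, and table names are placeholders, and in Confluent Cloud the equivalent goes through the managed connector API:

```python
import requests

# Placeholder self-managed Connect endpoint, for illustration.
CONNECT_URL = "http://localhost:8083/connectors"

connector = {
    "name": "audit-logs-cdc",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "db.example.com",
        "database.port": "5432",
        "database.user": "replicator",
        "database.password": "***",
        "database.dbname": "app",
        "plugin.name": "pgoutput",                # Postgres logical decoding plugin
        "table.include.list": "public.audit_logs",
        "topic.prefix": "cdc",                    # events land on cdc.public.audit_logs
    },
}

requests.post(CONNECT_URL, json=connector, timeout=30).raise_for_status()
```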
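Cross-system schema validation boiled down to mapping source types into the destination's type system and flagging anything without a safe equivalent. A simplified sketch of the idea; the mapping table is a small excerpt, not our full implementation:

```python
# Map PostgreSQL column types to ClickHouse equivalents; anything
# missing from the table gets surfaced to the user before deployment.
PG_TO_CLICKHOUSE = {
    "integer": "Int32",
    "bigint": "Int64",
    "text": "String",
    "boolean": "UInt8",
    "timestamp with time zone": "DateTime64(3)",
}

def validate_schema(source_columns: dict[str, str]) -> list[str]:
    """Return human-readable warnings for unmappable columns."""
    issues = []
    for name, pg_type in source_columns.items():
        if pg_type not in PG_TO_CLICKHOUSE:
            issues.append(f"column '{name}': no ClickHouse mapping for '{pg_type}'")
    return issues

print(validate_schema({"id": "bigint", "payload": "jsonb"}))
# -> ["column 'payload': no ClickHouse mapping for 'jsonb'"]
```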
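The cost estimate is a back-of-envelope model computed before anything is deployed. A toy version of the idea; the rates below are placeholders, not real provider pricing:

```python
# Placeholder per-unit rates; real numbers would come from the
# provider's pricing, not these constants.
RATES = {"ingest_per_gb": 0.11, "storage_per_gb_month": 0.10}

def estimate_monthly_cost(gb_per_day: float, retention_days: int) -> float:
    """Rough monthly cost: ingest volume plus retained storage."""
    ingest = gb_per_day * 30 * RATES["ingest_per_gb"]
    storage = gb_per_day * retention_days * RATES["storage_per_gb_month"]
    return round(ingest + storage, 2)

print(estimate_monthly_cost(gb_per_day=5, retention_days=7))  # -> 20.0
```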
Accomplishments that we're proud of
- Conversational pipeline builder — no YAML, no CLI, just describe what you need
- Minutes, not weeks — what used to take DevOps sprints now takes one conversation
- Built-in validation — AI catches schema mismatches and security issues before deployment
- One-click cleanup — achieve your goal, tear it down, stop paying
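That cleanup flow is conceptually just a reverse walk over everything a pipeline provisioned. A simplified sketch, assuming a pipeline record that tracks its connectors and topics (the field names are ours, for illustration):

```python
import requests
from confluent_kafka.admin import AdminClient

def teardown(pipeline: dict) -> None:
    """Delete everything a pipeline provisioned, newest first."""
    # 1. Stop data flowing: remove the Connect connectors
    #    (DELETE /connectors/{name} is the standard Connect REST call).
    for name in reversed(pipeline["connectors"]):
        requests.delete(
            f"{pipeline['connect_url']}/connectors/{name}", timeout=30
        ).raise_for_status()
    # 2. Drop the topics so storage stops accruing.
    admin = AdminClient({"bootstrap.servers": pipeline["bootstrap_servers"]})
    for topic, future in admin.delete_topics(pipeline["topics"]).items():
        future.result()  # raises if deletion failed

teardown({
    "connect_url": "http://localhost:8083",
    "bootstrap_servers": "localhost:9092",
    "connectors": ["audit-logs-cdc"],
    "topics": ["cdc.public.audit_logs"],
})
```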
What we learned
- Confluent Cloud makes Kafka accessible for rapid prototyping
- ksqlDB is powerful for streaming transformations without a heavy Flink setup (see the sketch after this list)
- LangChain tools enable seamless AI-to-infrastructure orchestration
- Simple guided workflows beat complex configuration UIs for developer adoption
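To illustrate the ksqlDB point: a persistent streaming transformation is a single SQL statement submitted to the `/ksql` REST endpoint. The stream names below are illustrative and assume a source stream has already been declared over the Kafka topic:

```python
import requests

# ksqlDB exposes a /ksql REST endpoint for DDL/DML statements.
KSQL_URL = "http://localhost:8088/ksql"

statement = """
CREATE STREAM audit_events_clean AS
  SELECT user_id, action, event_ts
  FROM audit_events
  WHERE action IS NOT NULL
  EMIT CHANGES;
"""

resp = requests.post(
    KSQL_URL,
    json={"ksql": statement, "streamsProperties": {}},
    timeout=30,
)
resp.raise_for_status()
```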
What's next for DataFlow AI
- Add more connectors: MongoDB, Snowflake, BigQuery, S3
- Advanced transformation templates with Flink SQL
- Real-time alerting and monitoring dashboards
- Team collaboration and pipeline sharing
- Target: 100 early adopters, enterprise pilot programs
Early respondents get beta access.
Take the 2-minute survey at the end and you'll get an exclusive peek at what we're building.