Inspiration
Every engineer who has been on call knows the pain. It's 2 AM, production is down, and you're staring at a wall of 50,000 log lines trying to find what went wrong. Manual log analysis is slow, error-prone, and requires deep expertise.
I wanted to build something that solves this problem the way modern tooling should: by combining machine learning, AI, and clean visualization into a single dashboard that gives any engineer instant clarity during an incident.
The question I asked myself was simple:
"What if you could upload a messy log file and get a complete incident diagnosis in under 30 seconds?"
LogPulse is the answer.
What it does
LogPulse is an AI-powered infrastructure log intelligence dashboard. You upload a log file and within 30 seconds get a complete system health score, severity breakdown, anomaly detection using machine learning, and AI-generated root cause explanations for your top recurring errors — all in a clean, visual dashboard built for engineers under pressure.
How we built it
We built LogPulse in Python using Streamlit for the dashboard, Pandas for log parsing and aggregation, Scikit-learn's Isolation Forest for anomaly detection, and Plotly for interactive visualizations. The AI layer supports three providers — Anthropic Claude, OpenAI GPT, and Google Gemini — through a unified provider-agnostic interface. The codebase is fully modular across six independent files, each with a single clear responsibility.
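To make the provider-agnostic idea concrete, here is a minimal sketch of what such an interface could look like. It is not LogPulse's actual code: `LLMProvider`, `CallableProvider`, `ProviderError`, and `get_provider` are hypothetical names, and the stub functions stand in for real SDK calls rather than imitating any vendor's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


class ProviderError(Exception):
    """Single error type surfaced to the dashboard, whatever the vendor."""


class LLMProvider(Protocol):
    """Anything with an explain() method can serve as a provider."""

    def explain(self, prompt: str) -> str: ...


@dataclass
class CallableProvider:
    """Wraps a vendor call behind the shared interface.

    `call` is a placeholder for a real SDK request; it is an assumption,
    not an actual Anthropic, OpenAI, or Gemini function.
    """

    name: str
    call: Callable[[str], str]

    def explain(self, prompt: str) -> str:
        try:
            return self.call(prompt)
        except Exception as exc:
            # Vendor-specific failures (auth, rate limit, quota) collapse
            # into one error type the UI can handle uniformly.
            raise ProviderError(f"{self.name}: {exc}") from exc


def get_provider(name: str, registry: Dict[str, LLMProvider]) -> LLMProvider:
    """Look up a provider by name, failing loudly on unknown names."""
    try:
        return registry[name]
    except KeyError:
        raise ValueError(f"unknown provider: {name!r}") from None


# Stub backends used only for illustration.
def _fake_ok(prompt: str) -> str:
    return f"diagnosis for: {prompt}"


def _fake_rate_limited(prompt: str) -> str:
    raise RuntimeError("rate limit exceeded")


registry: Dict[str, LLMProvider] = {
    "stub-ok": CallableProvider("stub-ok", _fake_ok),
    "stub-limited": CallableProvider("stub-limited", _fake_rate_limited),
}
```

With this shape, the dashboard only ever catches `ProviderError`, and swapping vendors means registering a different wrapper rather than touching the calling code.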
Challenges we ran into
1. Log Format Diversity: Real log files come in dozens of formats. Building a parser robust enough to handle standard, syslog, bracket, and severity-first formats, while degrading gracefully on malformed lines, required significant regex engineering and testing.
2. Anomaly Detection on Short Log Files: Isolation Forest performs best on larger datasets. For short log files with few time buckets, contamination rate tuning became critical. We exposed this as a configurable sidebar parameter so users can adjust sensitivity based on their data volume.
3. Multi-Provider LLM Integration: Each provider has a different SDK, client initialization pattern, and error type. Building a unified interface that handles authentication errors, rate limits, and quota exhaustion gracefully across three providers required careful abstraction.
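For the log-format challenge in item 1, a multi-pattern parser with a fallback might look roughly like the sketch below. The four regular expressions are illustrative guesses at what "standard", "syslog", "bracket", and "severity-first" lines look like. They are assumptions, not LogPulse's real patterns, and `parse_line` is a hypothetical helper name.

```python
import re
from typing import Dict, Optional

# Shared fragments (assumed layouts, not the project's actual patterns).
LEVEL = r"(?P<level>DEBUG|INFO|WARN(?:ING)?|ERROR|CRITICAL|FATAL)"
TS = r"(?P<ts>\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2})"
SYSLOG_TS = r"(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2})"

# Patterns are tried in order; the first match wins.
PATTERNS = [
    ("bracket", re.compile(rf"^\[{TS}\]\s+\[{LEVEL}\]\s+(?P<msg>.*)$")),
    ("severity_first", re.compile(rf"^{LEVEL}\s+{TS}\s+(?P<msg>.*)$")),
    ("standard", re.compile(rf"^{TS}\s+{LEVEL}\s+(?P<msg>.*)$")),
    ("syslog", re.compile(rf"^{SYSLOG_TS}\s+\S+\s+\S+:\s+{LEVEL}\s+(?P<msg>.*)$")),
]


def parse_line(line: str) -> Dict[str, Optional[str]]:
    """Parse one log line, falling back to an 'unparsed' record."""
    text = line.strip()
    for name, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return {"format": name, **match.groupdict()}
    # Graceful degradation: keep the raw text instead of dropping the line.
    return {"format": "unparsed", "ts": None, "level": "UNKNOWN", "msg": text}
```

Returning a record for malformed lines, rather than raising or skipping, keeps line counts honest and lets the dashboard report how much of a file could not be parsed.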
Accomplishments that we're proud of
- A regex parsing engine that handles four different real-world log formats gracefully
- A weighted health score formula that reduces an entire log file to one actionable number
- Unsupervised ML anomaly detection that requires zero labeled training data
- Multi-provider LLM support so any team can use it regardless of their AI vendor
- A dashboard that genuinely looks and feels like an internal enterprise tool
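The weighted health score formula is not spelled out in this write-up, so the following is only one plausible shape for it: each severity level carries a penalty weight, and the score is 100 minus the average penalty per line. The weights and the `health_score` name are assumptions chosen for illustration.

```python
from typing import Mapping

# Hypothetical weights: how much one line of each severity hurts the score.
DEFAULT_WEIGHTS = {"INFO": 0.0, "WARNING": 0.2, "ERROR": 0.6, "CRITICAL": 1.0}


def health_score(
    counts: Mapping[str, int],
    weights: Mapping[str, float] = DEFAULT_WEIGHTS,
) -> float:
    """Reduce per-severity line counts to a single 0-100 score."""
    total = sum(counts.values())
    if total == 0:
        # An empty log carries no evidence of problems.
        return 100.0
    # Average penalty per line; unknown levels contribute nothing.
    penalty = sum(weights.get(level, 0.0) * n for level, n in counts.items()) / total
    return round(100.0 * (1.0 - min(penalty, 1.0)), 1)
```

Under these assumed weights, a file with 80 INFO, 10 WARNING, 8 ERROR, and 2 CRITICAL lines scores 91.2, while a file made up entirely of CRITICAL lines scores 0.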
What we learned
- Real log files are messy — building a robust parser that degrades gracefully taught us more about production systems than any textbook
- Isolation Forest is surprisingly effective on time-series event data with minimal tuning
- Prompt engineering for technical diagnostics requires a specific persona and structured output format to be actually useful
- Streamlit can look enterprise-grade with the right layout decisions
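As an illustration of the Isolation Forest point above, the sketch below flags unusual time buckets from a list of per-bucket error counts. It assumes the counts have already been aggregated (for example with Pandas), uses a single feature for simplicity, and exposes `contamination` the same way a sidebar slider might. The function name and sample data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def flag_anomalous_buckets(error_counts, contamination=0.1, seed=0):
    """Return the indices of time buckets that Isolation Forest marks as outliers."""
    # scikit-learn expects a 2-D array: one row per bucket, one column per feature.
    X = np.asarray(error_counts, dtype=float).reshape(-1, 1)
    model = IsolationForest(contamination=contamination, random_state=seed)
    labels = model.fit_predict(X)  # -1 marks an anomaly, 1 marks a normal bucket
    return [i for i, label in enumerate(labels) if label == -1]


# Made-up errors-per-minute series with one obvious spike at index 7.
counts = [2, 3, 1, 2, 4, 3, 2, 95, 3, 2, 1, 3]
anomalies = flag_anomalous_buckets(counts, contamination=0.1)
```

The spike at index 7 is flagged here. On a series this short, `contamination` effectively decides how many buckets get flagged, so other borderline buckets may be reported as well, which is why the setting is worth tuning on small files.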
What's next for LogPulse
- Live log ingestion by connecting directly to log shippers like Fluentd or Filebeat
- TF-IDF based log clustering to group semantically similar errors automatically
- Slack and PagerDuty webhook integration for real-time alerting
- Downloadable PDF incident reports for post-mortem documentation
- Deployment as a containerized internal tool with Docker