Pulse: The Autonomous Corporate Memory & Knowledge Graph

Inspiration

The project addresses knowledge fragmentation in large engineering organizations, where critical decisions are scattered across Slack, email, Jira and GitHub. Traditional search tools fail to connect related information, and documentation goes stale soon after it is written. The goal was to build an autonomous system that learns from all data streams and proactively prevents architectural conflicts.

Pulse eliminates the hours developers waste searching for "who knows how this works" by automatically building a knowledge graph from all engineering communications. When you ask a question, it synthesizes answers from GitHub commits, Slack decisions and Jira tickets with source citations. The system detects conflicts before they reach production by comparing your code changes against recent architectural decisions.

Target audience and operation overview

Pulse targets engineering teams at companies with 50+ developers, where knowledge is distributed across multiple platforms. Developers use the command palette for quick searches and the AI assistant for detailed explanations, and they receive automatic conflict alerts. Engineering managers generate debt reports to prioritize refactoring and review collaboration matrices for team planning. The system operates continuously, scraping data sources hourly and updating the knowledge graph in real time.

What it does

Pulse builds a semantic knowledge graph from GitHub events, Stack Overflow, Slack messages and emails, then uses vector embeddings and RAG to answer technical questions with synthesized responses. It detects conflicts by comparing planned code changes against recent architectural decisions and alerts developers before problems occur. The system calculates technical debt scores, identifies code hotspots and maps team collaboration patterns.

Architecture

Backend (Serverpod)

GitHub Archive data processing with hourly scraping
Knowledge graph management with PostgreSQL
Vector embeddings for semantic search
Analytics engine for code metrics
WebSocket support for real-time updates
Background job scheduling with Future Calls

Frontend (Flutter)

Bento Grid dashboard layout with StaggeredGridView
Command palette with search-as-you-type
Glassmorphism UI with BackdropFilter
Shimmer loading states
Adaptive layouts for desktop and mobile
Dynamic theming with material color utilities
Real-time notification system

Technology Stack

Languages: Dart

Frameworks: Flutter, Serverpod

Technologies: WebSocket, REST API, Vector Search, Graph Database

Libraries: serverpod_flutter, flutter_staggered_grid_view, lucide_icons, shimmer, rive, material_color_utilities, intl, shared_preferences, url_launcher, provider, http, postgres, vector_math

Tools: Docker, Docker Compose

Services and APIs: GitHub Archive REST API, Stack Exchange API (Stack Overflow), Slack Web API, Jira Cloud REST API, HuggingFace Inference API

AI/Models: Sentence Transformers (all-MiniLM-L6-v2), Retrieval-Augmented Generation (RAG), Cosine Similarity

Database: PostgreSQL with pgvector extension

Data Integrations: GitHub Archive (public event stream), Stack Overflow Open Data, Enron Email Dataset, Slack workspace messages, Jira project data

Datasets: GitHub Archive hourly JSON files, Stack Exchange data dumps, Enron corpus (500k emails), live Slack channel history

Functionalities

Knowledge Graph Construction: Automatically ingests GitHub events, Stack Overflow questions, email threads and Slack decisions to build a semantic knowledge graph with nodes and weighted edges.

Vector Embeddings: Generates 384-dimensional embeddings for all knowledge nodes using sentence transformers, stored in PostgreSQL with pgvector for efficient similarity search.
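
As a rough illustration of the similarity search described above, here is a minimal sketch in Python (the project itself is written in Dart, and in Pulse this step runs inside PostgreSQL via pgvector). The function names, node IDs and toy 3-dimensional vectors are assumptions made for the example; real embeddings would have 384 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, nodes, k=3):
    """Return the k (node_id, score) pairs most similar to the query vector."""
    scored = [(node_id, cosine_similarity(query, vec)) for node_id, vec in nodes.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical knowledge nodes; toy vectors stand in for 384-dimensional embeddings.
nodes = {
    "auth-decision": [0.9, 0.1, 0.0],
    "db-migration": [0.0, 1.0, 0.2],
    "login-bugfix": [0.8, 0.3, 0.1],
}
results = top_k([1.0, 0.0, 0.0], nodes, k=2)
```

In pgvector, the same ordering can be expressed with the `<=>` cosine distance operator, which lets the database use an index instead of scanning every node.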

RAG Query System: Retrieves relevant context from the knowledge graph using vector similarity and keyword matching, then constructs detailed answers with source citations.
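
One plausible way to blend vector similarity with keyword matching is a weighted sum. The Python sketch below is an assumption about how such a blend could look, not the project's actual scoring: the 0.7 weight, the term-overlap keyword score and the candidate fields are all hypothetical.

```python
def keyword_score(query, text):
    """Fraction of distinct query terms that appear in the text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    if not terms:
        return 0.0
    return len(terms & words) / len(terms)


def hybrid_rank(query, candidates, vector_weight=0.7):
    """Rank candidates by a weighted blend of vector and keyword scores.

    Each candidate is a dict with 'source', 'text' and a precomputed
    'similarity' in [0, 1], for example from a vector search.
    """
    ranked = []
    for c in candidates:
        score = (vector_weight * c["similarity"]
                 + (1 - vector_weight) * keyword_score(query, c["text"]))
        ranked.append((round(score, 3), c["source"]))
    ranked.sort(reverse=True)
    return ranked


# Hypothetical retrieval candidates; the 'source' value doubles as the citation.
candidates = [
    {"source": "slack#arch", "text": "we decided to use redis for session cache", "similarity": 0.60},
    {"source": "github#412", "text": "refactor payment module", "similarity": 0.65},
]
ranked = hybrid_rank("redis session cache", candidates)
```

Here the Slack message wins despite a slightly lower vector similarity, because every query term appears in it. Carrying the source label through the ranking is what makes citations possible in the final answer.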

Conflict Detection: Analyzes recent architectural decisions and file modification patterns to alert developers about potential conflicts before code commits.

Code Hotspot Analysis: Identifies files with high change frequency and multiple contributors, calculating complexity scores based on churn and collaboration metrics.
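
The exact scoring formula is not given here, so the following Python sketch assumes the simplest combination of the two signals mentioned: change count multiplied by the number of distinct contributors. The commit tuples and file paths are made up for illustration.

```python
from collections import defaultdict


def hotspot_scores(commits):
    """Score each file as change count multiplied by distinct contributors.

    `commits` is a list of (author, [file paths]) tuples.
    """
    changes = defaultdict(int)
    authors = defaultdict(set)
    for author, files in commits:
        for path in files:
            changes[path] += 1
            authors[path].add(author)
    scores = {path: changes[path] * len(authors[path]) for path in changes}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical commit history.
commits = [
    ("alice", ["api/auth.dart", "api/user.dart"]),
    ("bob", ["api/auth.dart"]),
    ("carol", ["api/auth.dart", "lib/util.dart"]),
    ("alice", ["api/user.dart"]),
]
hotspots = hotspot_scores(commits)
```

With this data, `api/auth.dart` ranks first: it has three changes from three different authors, while `api/user.dart` has two changes from a single author.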

Team Collaboration Mapping: Builds collaboration matrices showing which developers work on shared files, revealing team interaction patterns.
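
A collaboration matrix of this kind can be derived by counting, for each pair of developers, how many files both have touched. The Python sketch below shows that idea under assumed input shapes; the function name and sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def collaboration_matrix(commits):
    """Count shared files for every pair of developers.

    `commits` is a list of (author, [file paths]) tuples. The result maps
    an alphabetically ordered (dev_a, dev_b) pair to the number of files
    that both developers have modified.
    """
    file_authors = defaultdict(set)
    for author, files in commits:
        for path in files:
            file_authors[path].add(author)

    matrix = defaultdict(int)
    for authors in file_authors.values():
        for pair in combinations(sorted(authors), 2):
            matrix[pair] += 1
    return dict(matrix)


# Hypothetical commit history.
commits = [
    ("alice", ["api/auth.dart", "api/user.dart"]),
    ("bob", ["api/auth.dart", "api/user.dart"]),
    ("carol", ["api/auth.dart"]),
]
matrix = collaboration_matrix(commits)
```

Sorting each pair keeps the matrix symmetric without storing both `(alice, bob)` and `(bob, alice)`.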

Architectural Debt Reporting: Calculates technical debt scores from code churn, complexity hotspots and test coverage, providing recommendations.
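
The weighting behind the debt score is not specified, so this Python sketch assumes a simple weighted sum over three inputs normalized to [0, 1], with low test coverage counting as debt. The weights (0.4, 0.3, 0.3) and the 0 to 100 scale are illustrative choices only.

```python
def debt_score(churn, hotspot_density, test_coverage, weights=(0.4, 0.3, 0.3)):
    """Combine normalized metrics into a 0-100 debt score (higher is worse).

    churn, hotspot_density and test_coverage must each lie in [0, 1].
    Coverage is inverted, so less coverage contributes more debt.
    """
    for value in (churn, hotspot_density, test_coverage):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be normalized to [0, 1]")
    w_churn, w_hotspot, w_coverage = weights
    raw = (w_churn * churn
           + w_hotspot * hotspot_density
           + w_coverage * (1.0 - test_coverage))
    return round(100 * raw, 1)


# Hypothetical module: moderate churn, few hotspots, 80% test coverage.
score = debt_score(churn=0.5, hotspot_density=0.2, test_coverage=0.8)
```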

Evolution Timeline: Tracks how technical decisions and implementations evolved over time by querying knowledge nodes within date ranges.

Command Palette: Provides search-as-you-type functionality with real-time suggestions from the backend, supporting natural language queries.

Background Scraping: Runs hourly jobs using Dart isolates to fetch GitHub Archive data, process JSON in parallel and generate embeddings asynchronously.
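
GitHub Archive publishes each hour as a gzip-compressed file of newline-delimited JSON events. The Python sketch below illustrates the decompress-and-parse step with a thread pool standing in for Dart isolates; it builds its own small in-memory archives rather than downloading anything, and the function names are hypothetical.

```python
import gzip
import json
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def parse_archive(blob):
    """Decompress one gzip blob of newline-delimited JSON and count event types."""
    counts = Counter()
    for line in gzip.decompress(blob).decode("utf-8").splitlines():
        if line.strip():
            counts[json.loads(line)["type"]] += 1
    return counts


def process_archives(blobs, workers=4):
    """Parse several archive blobs concurrently and merge their counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for counts in pool.map(parse_archive, blobs):
            total.update(counts)
    return total


def make_blob(event_types):
    """Build a tiny fake archive for the example (not real GitHub data)."""
    lines = "\n".join(json.dumps({"type": t}) for t in event_types)
    return gzip.compress(lines.encode("utf-8"))


blobs = [make_blob(["PushEvent", "PullRequestEvent"]), make_blob(["PushEvent"])]
totals = process_archives(blobs)
```

The key design point carries over to Dart: each archive is handled as an independent unit of work, so the request-serving thread never waits on decompression or parsing.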

Real-time Notifications: Streams conflict alerts and system updates to Flutter clients via WebSocket connections.

External Integrations: Creates Jira tickets from detected conflicts, posts Slack notifications about issues and ingests Slack threads as knowledge.

Adaptive UI: Renders bento grid layouts with responsive breakpoints, glassmorphism effects and platform-specific interactions (hover states, haptic feedback).

Graph Traversal: Implements BFS for shortest path finding between knowledge nodes and calculates centrality metrics for identifying key concepts.
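
For reference, BFS shortest-path search and a basic centrality measure can be sketched in a few lines of Python. Degree centrality is used here as an assumed example, since the specific centrality metrics are not named, and the graph contents are hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return the fewest-hops path from start to goal, or None if unreachable."""
    if start == goal:
        return [start]
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                if neighbor == goal:
                    # Walk the parent links back to the start.
                    path = [goal]
                    while parents[path[-1]] is not None:
                        path.append(parents[path[-1]])
                    return path[::-1]
                queue.append(neighbor)
    return None


def degree_centrality(graph):
    """Share of other nodes each node is directly connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


# Hypothetical undirected knowledge graph as an adjacency list.
graph = {
    "oauth": ["auth-service", "session-cache"],
    "auth-service": ["oauth", "user-db"],
    "session-cache": ["oauth", "redis"],
    "user-db": ["auth-service"],
    "redis": ["session-cache"],
}
path = shortest_path(graph, "user-db", "redis")
centrality = degree_centrality(graph)
```

Note that BFS minimizes the number of hops and ignores edge weights; finding the lowest-weight path over weighted edges would need something like Dijkstra's algorithm instead.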

Search Caching: Stores query results in PostgreSQL with expiration times to optimize repeated searches.
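
The expiration logic can be illustrated with a small Python sketch. It keeps entries in an in-memory dictionary purely as a stand-in for the PostgreSQL cache table; the class name, the 300-second default and the injectable clock are assumptions made for the example.

```python
import time


class SearchCache:
    """Query-result cache where every entry expires after a fixed TTL."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}

    def get(self, query):
        """Return cached results, or None if missing or expired."""
        entry = self._entries.get(query)
        if entry is None:
            return None
        expires_at, results = entry
        if self._clock() >= expires_at:
            del self._entries[query]  # drop stale entries lazily on read
            return None
        return results

    def put(self, query, results):
        """Store results with an expiry timestamp."""
        self._entries[query] = (self._clock() + self._ttl, results)


# Usage with a fake clock so the example runs instantly.
now = [0.0]
cache = SearchCache(ttl_seconds=10, clock=lambda: now[0])
cache.put("oauth migration", ["slack#arch", "github#412"])
fresh = cache.get("oauth migration")
now[0] = 11.0
stale = cache.get("oauth migration")
```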

How we built it

The Flutter frontend provides bento grid dashboards, command palette search and AI chat interfaces. The Serverpod backend handles REST endpoints, WebSocket streaming and background scraping jobs that run in Dart isolates. The RAG engine generates 384-dimensional embeddings using sentence transformers, performs vector similarity search with pgvector and ranks results by relevance and recency. Knowledge graph analysis uses BFS traversal and centrality algorithms to find relationships between concepts. External integrations query the GitHub Archive API hourly, fetch Stack Overflow data via REST, ingest Slack threads and create Jira tickets. All data is stored in PostgreSQL with the pgvector extension.
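
Ranking by both relevance and recency can be done in several ways. The Python sketch below assumes one common approach, an exponential decay with a 30-day half-life blended with the similarity score; the half-life, the 0.3 recency share and the result fields are hypothetical, not the project's tuned values.

```python
from datetime import datetime, timedelta, timezone


def recency_weight(created_at, now, half_life_days=30.0):
    """Exponential decay: 1.0 for brand-new items, 0.5 after one half-life."""
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def rank(results, now, recency_share=0.3):
    """Order results by a blend of semantic similarity and recency."""
    def score(r):
        return ((1 - recency_share) * r["similarity"]
                + recency_share * recency_weight(r["created_at"], now))
    return sorted(results, key=score, reverse=True)


# Hypothetical results: an older, slightly closer match and a recent one.
now = datetime(2026, 1, 30, tzinfo=timezone.utc)
results = [
    {"source": "email", "similarity": 0.80, "created_at": now - timedelta(days=300)},
    {"source": "slack", "similarity": 0.70, "created_at": now - timedelta(days=3)},
]
ranked = rank(results, now)
```

With these numbers the recent Slack message outranks the older email even though its similarity is lower, which is the behaviour you want when decisions get superseded.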

Challenges we ran into

Implementing efficient vector similarity search required understanding pgvector indexing strategies and tuning similarity thresholds to balance precision and recall. Processing GitHub Archive's compressed JSON files without blocking the main server needed Dart isolate orchestration for parallel decompression. The RAG answer construction required developing ranking algorithms that weighted both semantic similarity and temporal relevance. Preventing duplicate knowledge nodes when the same decision appeared in multiple sources required fuzzy matching logic.
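
To give a sense of what fuzzy matching for deduplication can look like, here is a Python sketch using the standard library's `difflib.SequenceMatcher`. This is a swapped-in technique for illustration; the matching logic actually used in Pulse is not described, and the 0.85 threshold and sample texts are assumptions.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def is_duplicate(a, b, threshold=0.85):
    """Treat two texts as the same decision if they are nearly identical."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def deduplicate(texts, threshold=0.85):
    """Keep the first occurrence of each decision, dropping near-duplicates."""
    kept = []
    for text in texts:
        if not any(is_duplicate(text, existing, threshold) for existing in kept):
            kept.append(text)
    return kept


# The same decision captured from two sources, plus an unrelated one.
texts = [
    "We will migrate auth to OAuth2",
    "we will migrate  auth to oauth2.",
    "Switch CI to GitHub Actions",
]
unique = deduplicate(texts)
```

Character-level matching like this only catches near-verbatim repeats. Paraphrased duplicates would need an embedding-based comparison, at the cost of more false merges if the threshold is set too low.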

Accomplishments that we're proud of

Built a working RAG system that retrieves and synthesizes answers from multiple data sources with source citations. Implemented real-time conflict detection that compares file changes against architectural decisions from the past 30 days. Created a command palette with sub-300ms search-as-you-type response times using PostgreSQL full-text search and vector similarity. Developed background scraping that processes hourly GitHub Archive files using parallel isolates without impacting API response latency.
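
The conflict check described above can be pictured as a filter over recent decisions. The Python sketch below assumes each decision records the files it affects, which is a simplification; the field names, sample decisions and exact-path matching are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_conflicts(changed_files, decisions, now, window_days=30):
    """Return (summary, overlapping files) for each recent decision touched.

    Decisions older than `window_days` are ignored.
    """
    cutoff = now - timedelta(days=window_days)
    changed = set(changed_files)
    conflicts = []
    for decision in decisions:
        if decision["decided_at"] < cutoff:
            continue
        overlap = sorted(changed & set(decision["files"]))
        if overlap:
            conflicts.append((decision["summary"], overlap))
    return conflicts


# Hypothetical decisions extracted from the knowledge graph.
now = datetime(2026, 1, 30, tzinfo=timezone.utc)
decisions = [
    {"summary": "Freeze auth API until OAuth2 migration lands",
     "files": ["api/auth.dart"], "decided_at": now - timedelta(days=5)},
    {"summary": "Deprecate legacy user endpoint",
     "files": ["api/user.dart"], "decided_at": now - timedelta(days=90)},
]
conflicts = find_conflicts(["api/auth.dart", "api/user.dart"], decisions, now)
```

Only the auth decision is reported, because the user endpoint decision falls outside the 30-day window.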

What we learned

Vector embeddings enable semantic search that finds conceptually related content beyond keyword matching, but require careful tuning of similarity thresholds and indexing strategies. RAG systems need context ranking algorithms that consider both relevance and recency to provide useful answers. Dart isolates provide true parallelism for CPU-intensive tasks like JSON parsing and embedding generation. Knowledge graphs require automated edge building through semantic similarity rather than manual relationship definition.
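
Automated edge building from semantic similarity can be sketched as follows in Python: connect every pair of nodes whose embeddings exceed a similarity threshold and use the similarity as the edge weight. The 0.75 threshold, node IDs and 2-dimensional toy vectors are assumptions; a production system would avoid the all-pairs loop by querying a vector index for each node's nearest neighbours.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_edges(embeddings, threshold=0.75):
    """Return weighted edges (id_a, id_b, weight) for sufficiently similar nodes."""
    edges = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(embeddings.items(), 2):
        weight = cosine(vec_a, vec_b)
        if weight >= threshold:
            edges.append((id_a, id_b, round(weight, 3)))
    return edges


# Toy 2-dimensional vectors stand in for real 384-dimensional embeddings.
embeddings = {
    "oauth-decision": [1.0, 0.0],
    "oauth-commit": [0.9, 0.1],
    "ci-pipeline": [0.0, 1.0],
}
edges = build_edges(embeddings)
```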

What's next for Pulse: The Autonomous Corporate Memory & Knowledge Graph

Expand data sources to include Confluence pages, Google Docs and Linear issues for broader knowledge coverage. Implement graph neural networks to improve relationship inference between nodes beyond cosine similarity. Add predictive analytics that forecast which modules will accumulate technical debt based on current modification patterns. Develop automated documentation generation that synthesizes knowledge nodes into structured technical specifications.

Built With

cosine-similarity
dart
docker
docker-compose
flutter
flutter-staggered-grid-view
github-archive-api
graph-database
http
intl
jira-rest-api
lucide-icons
material-color-utilities
postgresql
provider
rest-api
retrieval-augmented-generation-(rag)
rive
sentence-transformers-(all-minilm-l6-v2)
serverpod
serverpod-flutter
shared-preferences
shimmer
slack-api
stack-exchange-api
url-launcher
vector-math
vector-search
websocket

Updates

Samira Samrose started this project — Jan 30, 2026 01:57 AM EST
