We've all been there: spending hours on repetitive online tasks like job applications, real estate research, or data gathering. AI tools like ChatGPT can help us think through these tasks, but they can't actually do them for us. We needed something that bridges the gap between AI reasoning and browser automation.
The inspiration came from a simple frustration: "Why can't I just tell my browser what I want, and have it do it for me?" Existing solutions were either too technical (requiring coding) or too limited (just linking to websites). We wanted to create a truly conversational AI agent that anyone could use.
### What It Does
**Agent Scout** is a Chrome extension that turns your browser into an intelligent assistant. It features a natural chat interface (like ChatGPT) that:
1. **Plans intelligently** - Uses GPT-4 via StackAI to understand your request and create a step-by-step plan
2. **Asks for context** - Requests information (resume, email, etc.) only when needed
3. **Shows transparency** - Displays the plan and waits for your approval before doing anything
4. **Automates seamlessly** - Executes browser automation in the current tab while the side panel stays visible
5. **Analyzes results** - Uses AI to analyze scraped data and provide insights
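The approval step is the part of this flow that is easiest to get wrong, so here is a minimal sketch of how a plan could be gated behind user approval. It is an illustration only: `createPlanSession`, the state names, and the step shape (`description`, `action`) are hypothetical and not taken from the Agent Scout source.

```javascript
// Hypothetical sketch: none of these names come from the Agent Scout source.
// A plan session starts in "awaiting_approval" and refuses to run until
// the user has explicitly approved it.
function createPlanSession(steps) {
  let state = "awaiting_approval";
  const log = [];

  return {
    get state() {
      return state;
    },

    // Human-readable plan, shown in the chat before anything runs.
    describe() {
      return steps
        .map((step, i) => `${i + 1}. ${step.description}`)
        .join("\n");
    },

    // Called when the user accepts the plan.
    approve() {
      if (state !== "awaiting_approval") {
        throw new Error(`Cannot approve a plan in state "${state}"`);
      }
      state = "approved";
    },

    // Runs the steps in order and reports progress after each one.
    run(onProgress = () => {}) {
      if (state !== "approved") {
        throw new Error("Plan must be approved before it runs");
      }
      state = "running";
      steps.forEach((step, i) => {
        const result = step.action();
        log.push({ step: step.description, result });
        onProgress(i + 1, steps.length);
      });
      state = "done";
      return log.slice();
    },
  };
}
```

In an extension, `describe()` would feed the plan message in the side panel, `approve()` would be wired to the approval button, and `onProgress` would post updates back to the chat. Real steps would likely be asynchronous; they are synchronous here to keep the sketch short.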
**Use cases:**
- 🏠 Find and compare real estate listings
- 💼 Apply to jobs with your resume
- 📊 Research products across multiple sites
- 📄 Summarize LinkedIn profiles or articles
- 📧 Draft and send emails (with verification)
### How We Built It
**Architecture:**
- **Frontend:** Chrome Extension (Manifest V3) with a custom side panel UI
- **AI Planning:** StackAI workflows powered by GPT-4
- **Automation:** Content scripts for DOM manipulation + Service Worker for coordination
- **Memory:** Chrome Storage API for persistent context (uploaded files, conversation history)
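As a rough illustration of the memory layer, the sketch below keeps a capped conversation history in a storage area. The key name `conversationHistory`, the 50-message cap, and both function names are assumptions made for this example. The only thing relied on from Chrome is that `chrome.storage.local` exposes promise-based `get` and `set` in Manifest V3.

```javascript
// Assumed key name and cap; not taken from the Agent Scout source.
const HISTORY_KEY = "conversationHistory";
const MAX_MESSAGES = 50;

// Pure helper: returns a new array with the message appended, keeping only
// the most recent `limit` entries. `limit` is expected to be positive.
function appendMessage(history, message, limit = MAX_MESSAGES) {
  const stamped = { ...message, timestamp: message.timestamp ?? Date.now() };
  return [...history, stamped].slice(-limit);
}

// Thin wrapper over any storage area with promise-based get/set,
// for example chrome.storage.local in a Manifest V3 extension.
async function saveMessage(storageArea, message) {
  const stored = await storageArea.get(HISTORY_KEY);
  const history = appendMessage(stored[HISTORY_KEY] ?? [], message);
  await storageArea.set({ [HISTORY_KEY]: history });
  return history;
}
```

Inside the extension this might be called as `saveMessage(chrome.storage.local, { role: "user", text: "Find 2-bedroom listings" })`. Passing the storage area in as a parameter keeps the trimming logic testable outside the browser, and capping the history is one simple way to stay within the storage quota.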
**Key Technical Components:**
1. **Conversational Interface** (`chat.html`, `chat-simple.js`)
   - Message bubble UI with real-time typing indicators
   - File upload system supporting PDFs, Word docs, and text files
   - Conversation history and context management
2. **AI Integration** (StackAI API)
   - Planning workflow: Converts natural language → executable steps
   - Analysis workflow: Processes scraped data → insights
   - Multiple specialized workflows for different task types
3. **Browser Automation** (`background.js`, `content.js`)
   - Service worker coordinates execution across tabs
   - Content scripts inject DOM analyzers and automation logic
   - Progress reporting back to the chat interface
4. **Smart DOM Analyzer** (`dom-analyzer.js`)
   - Extracts relevant data from any website
   - Handles different page structures intelligently
   - Works with popular sites like LinkedIn, Zillow, Indeed
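One plausible way for a DOM analyzer to handle different page structures is to pick a site-specific selector profile by hostname and fall back to a generic one. The sketch below shows that idea under loud assumptions: the hostnames use the reserved `.example` domain, the selectors are placeholders, and `SITE_PROFILES`, `selectProfile`, and `extractFields` are hypothetical names, not the contents of `dom-analyzer.js`.

```javascript
// Placeholder profiles for illustration only. The hostnames and selectors
// are invented and are not the ones Agent Scout uses.
const SITE_PROFILES = [
  {
    name: "listings",
    hostPattern: /(^|\.)listings\.example$/,
    selectors: { title: "h1.listing-title", price: ".listing-price" },
  },
  {
    name: "jobs",
    hostPattern: /(^|\.)jobs\.example$/,
    selectors: { title: "h1.job-title", company: ".company-name" },
  },
];

// Used when no site-specific profile matches.
const GENERIC_PROFILE = {
  name: "generic",
  selectors: { title: "h1", body: "main" },
};

// Chooses a profile from the page URL's hostname.
function selectProfile(url) {
  const { hostname } = new URL(url);
  return (
    SITE_PROFILES.find((profile) => profile.hostPattern.test(hostname)) ??
    GENERIC_PROFILE
  );
}

// `root` is anything that implements querySelector, such as `document`
// inside a content script. Fields with no matching element become null.
function extractFields(root, profile) {
  const data = {};
  for (const [field, selector] of Object.entries(profile.selectors)) {
    const element = root.querySelector(selector);
    data[field] = element ? element.textContent.trim() : null;
  }
  return data;
}
```

In a content script this could be called as `extractFields(document, selectProfile(location.href))`. Returning `null` for missing fields, rather than throwing, lets the analysis workflow see which parts of a page could not be read.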
**Tech Stack:**
- JavaScript (ES6+)
- Chrome Extension APIs (Manifest V3)
- StackAI + GPT-4
- HTML/CSS for UI
- Chrome Storage API