✨ Promptly

Empowering the future of AI through precision prompt engineering

🌟 Inspiration

In a world accelerating toward autonomous AI agents, prompt engineering is no longer optional—it's foundational.

The numbers tell a compelling story:

📈 The global prompt-engineering market, valued at $222.1M in 2023, is projected to surge beyond $2.06B by 2030
🤖 82% of companies already deploy AI agents in internal operations
⚡ 85% have integrated agents into at least one critical workflow

As dependence on AI agents grows exponentially, so does the need for precise, secure, and high-performance prompts.

Promptly was created to meet that need—helping developers craft, harden, and optimize prompts with clarity and confidence.

🎯 What it does

Promptly is a comprehensive platform for modern prompt engineering, built around three powerful pillars:

1. 💻 Prompt Engineering IDE

A purpose-built development environment designed exclusively for prompts, powered by a dedicated orchestration agent.

Features:

⚡ Rapid iteration and testing
📝 Structured refinement workflows
🔄 Version-controlled prompt building
🎨 Intuitive interface (think Cursor, but for prompts)

2. 🛡️ Security Evaluation

Advanced threat detection powered by cutting-edge AI.

Components:

🤖 Custom Small Language Model (SLM) trained on 10,000+ prompts and system instructions
🔍 Detects prompt-injection risks, hidden manipulations, and malicious intent
🧠 Gemini-powered security agent that:
- Explains vulnerabilities in plain language
- Highlights critical risks with context
- Suggests actionable remediation strategies

3. 📊 Quality Evaluation

Side-by-side comparison studio across major LLMs.

Benefits:

🔬 Compare model outputs under identical conditions
⚖️ Evaluate reliability, creativity, and accuracy
🎯 Choose the optimal model for your specific use case

🏗️ How we built it

We architected a modular multi-agent workflow using modern AI orchestration tools:

Core Technologies:

🔗 LangChain & LangGraph for intelligent orchestration
💎 Gemini (via Google AI Studio) for prompt evaluation
🧪 OpenAI models (via Azure AI Foundry) for output benchmarking

Security Layer:

🛠️ Built a custom SLM from scratch
🎓 Fine-tuned on a curated dataset using Vertex AI
⚡ Deployed as a lightweight inference service

Frontend:

⚛️ React application with real-time editing
🔄 Seamless state synchronization
☁️ Backed by Azure's scalable infrastructure

Every component was designed to be portable, extensible, and cloud-agnostic.

🚧 Challenges we ran into

Technical Integration

Orchestrating a multi-agent ecosystem across Azure and Vertex AI introduced:

🔌 Complex API conflicts
🎫 Token-handling inconsistencies
🔐 Authentication edge cases

We relied heavily on community resources—Stack Overflow, Reddit, and GitHub—to debug obscure failures and discover workarounds.

Dataset Curation

Building the SLM training dataset proved equally challenging:

📚 Resources on prompt-injection were scattered and inconsistent
🕵️ Manually curated examples from:
- Ethical hacking forums
- GitHub security repositories
- Academic research papers
- Red-team datasets
⚖️ Ensured full GDPR compliance throughout

The process took significantly longer than expected, but the extra effort resulted in a far more robust and accurate model.

🏆 Accomplishments that we're proud of

🎯 Custom Security SLM

Our biggest achievement: successfully engineering and deploying our own SLM for prompt-security validation—without relying on RAG or external filters.

Capabilities:

✅ Detects subtle manipulation attempts
🔍 Identifies layered injections
🧠 Catches indirect adversarial patterns
📈 Delivers impressive accuracy across diverse test cases

☁️ Cross-Cloud Orchestration

We mastered the notoriously difficult task of building a resilient multi-agent system that seamlessly bridges:

Microsoft Azure
Google Vertex AI
LangChain orchestration
LangGraph state management

The result: a robust, interconnected system ready for enterprise deployment.

📚 What we learned

🤖 Multi-Agent Architecture

Gained deep insights into orchestration frameworks—particularly how LangGraph excels in structured state management when you maintain:

📐 Disciplined design patterns
🔄 Clear state transitions
⚠️ Strict error-handling protocols

🛡️ Security ML Development

The SLM taught us that high-quality, diverse data is non-negotiable when dealing with adversarial patterns:

📊 Data augmentation strategies are critical
🤝 Ethical sourcing requires careful consideration
🔬 Continuous validation prevents model drift

💰 Cloud Cost Management

Cross-cloud deployment exposed the importance of financial guardrails:

💸 LLM-heavy workflows can exhaust budgets rapidly
📉 Monitoring and throttling are essential
🎯 Strategic resource allocation maximizes ROI

🔄 Iterative Refinement

Extensive testing confirmed that prompt engineering is inherently iterative—each refinement cycle strengthens reliability against real-world edge cases.