DLyog IPChecker

An AI Experiment for the AWS Lambda Hackathon 🚀🧠

Ship software, not subpoenas.


🌟 Inspiration

Today’s engineers live on copy‑paste culture: GitHub gists, Stack Overflow code, open‑source libraries, AI‑generated snippets. All that velocity hides a legal iceberg—copyright, patent, or trademark violations that surface later as nasty cease‑and‑desist letters. We built DLyog IPChecker to be a single‑command, developer‑first guardrail: warn you of potential IP landmines before deploy day and let you keep shipping at top speed.


⚙️ What It Does

  1. CLI Push – python cli/dlyogipchecker.py "/path/to/your/project" selects the 10 most recent source files, zips them, and uploads to S3.
  2. S3 Event – the upload instantly fires an AWS Lambda function (no polling, no servers).
  3. AI Scan – Lambda unzips and calls Perplexity Sonar‑Pro once per file for deep semantic IP analysis.
  4. HTML Report – Lambda assembles a Tailwind‑styled, color‑coded report and emails it to you in minutes.

Result: instant peace of mind without slowing your pipeline.


🛠️ How We Built It

| Layer | Tech Choices | Why It’s Cool |
|---|---|---|
| CLI | Python 3.11, Typer, boto3 | Smart ignore rules (.git, node_modules, venvs); bundles only high‑signal files to keep token costs low. |
| Storage + Trigger | Amazon S3 event notifications | Bucket both stores bundles and triggers Lambda, with zero extra infra. |
| Compute | AWS Lambda (Python 3.11 runtime) | Stateless, scales instantly; 15‑minute timeout guard with graceful skip logic if time runs low. |
| AI Engine | Perplexity Sonar‑Pro API | Returns structured JSON (summary, validation, verdict) for reliable post‑processing. |
| Report Delivery | SMTP (Gmail / Amazon SES) | CSS inlined for bullet‑proof rendering across Gmail, Outlook, Apple Mail. |
| CI / CD | GitHub Actions | Two workflows: 1_deploy_infra (IaC bash script) and 2_deploy_lambda (package & update code). |
| IAM / Security | Least‑privilege role, inline S3 policy, secrets in GitHub | Nothing hard‑coded; credentials live only in GitHub Secrets. |
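
On the Lambda side, the S3 trigger delivers an event whose records name the bucket and object key. The sketch below shows one way a handler might pull out the bundle locations; the helper name `bundle_locations` and the `.zip` suffix check are assumptions for illustration, while the `ip_bundles/` prefix comes from the notification setup described in this write-up.

```python
from urllib.parse import unquote_plus


def bundle_locations(event, prefix="ip_bundles/"):
    """Extract (bucket, key) pairs for uploaded bundles from an S3 event."""
    locations = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        # Object keys arrive URL-encoded in S3 notifications (spaces become '+').
        key = unquote_plus(s3.get("object", {}).get("key", ""))
        if bucket and key.startswith(prefix) and key.endswith(".zip"):
            locations.append((bucket, key))
    return locations
```

Decoding the key before using it matters because a bundle whose name contains spaces or special characters would otherwise fail to download.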

🖼️ Architecture at a Glance

┌──────────────┐      push()      ┌───────────────┐     event     ┌────────────────┐   calls   ┌──────────────┐
│  Developer   │ ────────────▶  │    AWS S3     │ ────────────▶ │   AWS Lambda   │ ────────▶ │ Perplexity  │
│   (CLI)      │                │  ip_bundle.zip │              │  IP Analyzer λ │           │  Sonar API  │
└──────────────┘                └───────────────┘              └────────────────┘           └──────────────┘
         │                                                                               │
         └────────────────────────── report via SMTP ────────────────────────────────────┘


🏗️ Infrastructure‑as‑Code Highlights

  • create_infra.sh (triggered by 1_deploy_infra.yml)
    • Creates/validates the S3 bucket.
    • Provisions the Lambda IAM role, attaches AWSLambdaBasicExecutionRole + an inline S3‑only policy.
    • Grants S3 permission to invoke Lambda (lambda:AddPermission).
    • Configures an ObjectCreated notification limited to the ip_bundles/ prefix.
  • 2_deploy_lambda.yml
    • Zips handler.py + dependencies, uploads to the bucket.
    • Creates the Lambda on first run or updates it thereafter.
    • Injects all environment variables (SMTP, Sonar key, target email).
    • Tunes timeout (900 s) and memory (256 MB).

A single merge, or one click on the Run workflow button, redeploys the entire stack with no console clicks.
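
For readers who prefer Python to bash, the prefix-limited ObjectCreated notification could be expressed as the document below. This is a sketch of the equivalent configuration, not the contents of create_infra.sh; the function name `notification_config` is hypothetical.

```python
def notification_config(lambda_arn, prefix="ip_bundles/"):
    """Build an S3 notification document for ObjectCreated events under `prefix`."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }
```

The returned dict has the shape boto3 expects for the `NotificationConfiguration` argument of `put_bucket_notification_configuration`. Applying the same document twice leaves the bucket in the same state, which fits the goal of rerunnable scripts.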


🚧 Key Challenges

  • Token & timeout juggling – chunking large files and early‑exit guard when <60 s runtime remains.
  • Cross‑client HTML – achieving identical rendering in Gmail, Outlook, and iOS Mail.
  • Idempotent IaC – scripts safe to rerun without duplicate resources.
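
The early-exit guard from the first bullet might look something like this. It relies on `get_remaining_time_in_millis()`, which the Lambda Python runtime provides on the context object; the function name `analyze_with_guard` and the `analyze` callback are placeholders for illustration.

```python
SAFETY_MARGIN_MS = 60_000  # stop starting new files when under 60 s remain


def analyze_with_guard(files, context, analyze):
    """Analyze files in order, skipping the rest once the time budget runs low."""
    results, skipped = [], []
    for name in files:
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            # Record the file as skipped so the report can say so explicitly.
            skipped.append(name)
            continue
        results.append(analyze(name))
    return results, skipped
```

Returning the skipped list alongside the results lets the report show which files were not analyzed instead of silently dropping them.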

🏆 Accomplishments

  • One‑command IP scan with zero persistent servers.
  • End‑to‑end GitHub Actions pipeline: infra → code → env vars → ready to run.
  • Tailwind‑styled email that even hackathon judges compliment.
  • Costs ≈ $0 under AWS free tier; scales automatically for future SaaS expansion.

📚 What We Learned

  • Good prompt engineering turns raw AI chatter into deterministic JSON.
  • Raising Lambda memory also raises the CPU share, since Lambda allocates CPU in proportion to configured memory, and that shortened our per‑file processing time.
  • Treating S3 as a message bus keeps the architecture minimal yet powerful.
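
Even with a careful prompt, a model reply can arrive wrapped in extra text, so a defensive parser helps keep the JSON handling predictable. The sketch below assumes the three keys named earlier (summary, validation, verdict); the function name `parse_analysis` is hypothetical.

```python
import json

REQUIRED_KEYS = ("summary", "validation", "verdict")


def parse_analysis(raw_text):
    """Extract the JSON object from a model reply and check its keys.

    Returns a dict on success, or None when the reply cannot be used.
    """
    # Take the outermost braces so surrounding chatter is ignored.
    start, end = raw_text.find("{"), raw_text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw_text[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or any(k not in data for k in REQUIRED_KEYS):
        return None
    return data
```

A `None` result can be rendered in the report as "analysis unavailable" for that file rather than aborting the whole run.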

🗂️ Scalability Roadmap

Our hackathon build processes up to 10 files in a single Lambda run. To scale toward full‑size repos without hitting the 15‑minute ceiling we will adopt an AWS Step Functions fan‑out / fan‑in pattern:

  1. Lambda‑A (Bundle‑Splitter) – triggered by S3, lists every file (or 3 000‑character chunk) and starts a Step Functions execution.
  2. Map State – spins up Lambda‑B (Analyzer) in parallel for each item (default concurrency ≈40, easily raised). Each worker calls Perplexity Sonar on just its slice, finishing well under timeout even for big repos.
  3. Lambda‑C (Report‑Merger) – invoked as the next state once the Map state completes (the fan‑in step); aggregates the JSON outputs, assembles the Tailwind HTML report, and emails it.
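
As a rough illustration of the Bundle‑Splitter's job, the sketch below turns file contents into a flat list of 3 000‑character work items that a Map state could iterate over. The item fields (`path`, `chunk`, `text`) and the function name `to_work_items` are assumptions, not a settled design.

```python
CHUNK_SIZE = 3000  # characters per chunk, matching the roadmap figure


def to_work_items(files, chunk_size=CHUNK_SIZE):
    """Turn {path: text} into a flat list of chunk items for a Map state."""
    items = []
    for path, text in files.items():
        for index, offset in enumerate(range(0, len(text), chunk_size)):
            items.append(
                {"path": path, "chunk": index, "text": text[offset:offset + chunk_size]}
            )
    return items
```

Because Step Functions limits the size of state payloads, a production splitter would probably pass S3 keys and offsets instead of the chunk text itself.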

Why this works

  • Parallel shards keep each Lambda run short, which gives every file a predictable time budget.
  • Built‑in retries on a per‑branch basis salvage partial results instead of failing the whole job.
  • Memory → CPU scaling: Analyzer Lambdas can be bumped to 1 GB to squeeze HTTP latency without wastefully over‑provisioning the merger.
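
To make the fan‑out / fan‑in idea concrete, here is one possible state machine definition written as a Python dict in Amazon States Language form. State names, retry numbers, and the function name are illustrative assumptions; only the overall shape (Map over items, per‑branch retry, then a merger task) follows the roadmap.

```python
import json


def state_machine_definition(analyzer_arn, merger_arn, max_concurrency=40):
    """Return an Amazon States Language definition for the fan-out / fan-in flow."""
    return {
        "StartAt": "AnalyzeChunks",
        "States": {
            "AnalyzeChunks": {
                "Type": "Map",
                "ItemsPath": "$.items",
                "MaxConcurrency": max_concurrency,
                "ItemProcessor": {
                    "ProcessorConfig": {"Mode": "INLINE"},
                    "StartAt": "Analyze",
                    "States": {
                        "Analyze": {
                            "Type": "Task",
                            "Resource": analyzer_arn,
                            # Retry each branch on its own before giving up on it.
                            "Retry": [{
                                "ErrorEquals": ["States.TaskFailed"],
                                "IntervalSeconds": 2,
                                "MaxAttempts": 2,
                                "BackoffRate": 2.0,
                            }],
                            # A failed branch is marked, not fatal to the whole job.
                            "Catch": [{
                                "ErrorEquals": ["States.ALL"],
                                "ResultPath": "$.error",
                                "Next": "MarkFailed",
                            }],
                            "End": True,
                        },
                        "MarkFailed": {"Type": "Pass", "End": True},
                    },
                },
                "ResultPath": "$.analyses",
                "Next": "MergeReport",
            },
            "MergeReport": {"Type": "Task", "Resource": merger_arn, "End": True},
        },
    }


if __name__ == "__main__":
    print(json.dumps(state_machine_definition("ANALYZER_ARN", "MERGER_ARN"), indent=2))
```

The Catch clause is what lets the merger receive partial results: a branch that still fails after its retries passes through `MarkFailed` with an `error` field instead of failing the execution.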

This architecture lets us elastically analyse hundreds of files while staying firmly inside AWS free‑tier limits for typical use‑cases.


🚀 Next Steps

  • SPDX license detection + auto‑generated OSS attribution.
  • GitHub PR bot that comments inline on suspect lines.
  • Web dashboard with historical trends and risk heat‑map.
  • Slack / Teams notifications and org‑level multi‑tenant SaaS.

Built with

Python, Typer, boto3, AWS Lambda, Amazon S3, AWS IAM, Amazon CloudWatch Logs, GitHub Actions, Perplexity Sonar API, SMTP (Gmail / SES), HTML, CSS, Requests, Zipfile

📝 Disclaimer

This project uses an AI model to assist with intellectual property risk identification. The results may contain inaccuracies. For legally binding advice or professional review, consult a qualified IP attorney.
