Inspiration

Every day, millions of meaningful conversations happen on KakaoTalk — study groups debating research ideas, teams solving complex problems, friends exchanging deep insights. Yet these conversations disappear, unrecognized as the intellectual work they truly are.

We asked: what if a chat log could become a citable academic paper?

What We Built

chat-paper-platform is a Korean-first SaaS platform that transforms KakaoTalk and AI chat exports into fully structured academic papers — complete with abstract, methodology, results, and references.

The pipeline works in 6 stages:

  1. Parse — Extract speakers, timestamps, and message threads from raw .txt exports
  2. Detect — Identify language and conversation structure
  3. Analyze — Topic clustering and sentiment scoring using NLP
  4. Generate — Feed structured data into a 6-step LLM prompt pipeline (GPT-4o)
  5. Format — Render as a proper academic document
  6. Export — Download as PDF or DOCX
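To make the Parse stage concrete, here is a minimal sketch. It is not the platform's actual parser: the `[speaker] [HH:MM] text` line layout, the `ChatMessage` type, and the `parseChatExport` function are all assumptions made for illustration, and real KakaoTalk exports vary by app version.

```typescript
// Hypothetical message shape; the platform's real types are not shown in this write-up.
interface ChatMessage {
  speaker: string;
  time: string; // "HH:MM" exactly as it appears in the assumed export line
  text: string;
}

// Assumed line layout: "[speaker] [HH:MM] text". Real exports differ across versions.
const LINE_PATTERN = /^\[([^\]]+)\] \[(\d{1,2}:\d{2})\] (.*)$/;

function parseChatExport(raw: string): ChatMessage[] {
  const messages: ChatMessage[] = [];
  let current: ChatMessage | undefined;

  for (const line of raw.split(/\r?\n/)) {
    const match = LINE_PATTERN.exec(line);
    if (match) {
      // A matching line starts a new message.
      current = { speaker: match[1], time: match[2], text: match[3] };
      messages.push(current);
    } else if (current && line.trim() !== "") {
      // A non-matching, non-empty line is treated as a continuation
      // of the previous (multi-line) message.
      current.text += "\n" + line;
    }
  }
  return messages;
}
```

Treating unmatched lines as continuations is one simple way to handle multi-line messages. Date separators and system notices would need their own rules in a fuller parser.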

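The sentiment score for speaker i is the mean sentiment over that speaker's messages, where m_ij is the j-th message from speaker i and N_i is the number of messages that speaker sent. The bound holds as long as each per-message score lies in [-1, 1]:

$$S_i = \frac{1}{N_i} \sum_{j=1}^{N_i} \text{sentiment}(m_{ij}) \in [-1, 1]$$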
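The per-speaker average above can be sketched in a few lines. The `Scorer` type, `speakerSentiment`, and the `toyScore` function are hypothetical names. The sentiment model the platform actually uses is not specified here, so the scorer is passed in as a parameter and clamped to [-1, 1].

```typescript
// Any function that maps a message to a sentiment value; assumed, not the platform's API.
type Scorer = (text: string) => number;

function speakerSentiment(
  messages: { speaker: string; text: string }[],
  score: Scorer,
): Record<string, number> {
  const sums: Record<string, number> = {};
  const counts: Record<string, number> = {};

  for (const { speaker, text } of messages) {
    // Clamp so the mean is guaranteed to stay within [-1, 1].
    const clamped = Math.max(-1, Math.min(1, score(text)));
    sums[speaker] = (sums[speaker] ?? 0) + clamped;
    counts[speaker] = (counts[speaker] ?? 0) + 1;
  }

  // Mean sentiment per speaker: sum of scores divided by that speaker's message count.
  const result: Record<string, number> = {};
  for (const speaker of Object.keys(sums)) {
    result[speaker] = sums[speaker] / counts[speaker];
  }
  return result;
}

// Toy scorer for illustration only; a real model would replace it.
const toyScore: Scorer = (text) => (/great/.test(text) ? 1 : /bad/.test(text) ? -1 : 0);

const scores = speakerSentiment(
  [
    { speaker: "A", text: "great idea" },
    { speaker: "A", text: "ok" },
    { speaker: "B", text: "bad plan" },
  ],
  toyScore,
);
console.log(scores);
```

Speakers with no messages never appear in the result, so the division by the message count is always defined.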

How We Built It

  • Frontend: Next.js 14 App Router + TailwindCSS + shadcn/ui
  • Backend: Next.js API Routes + Prisma ORM + PostgreSQL (Neon)
  • AI: OpenAI GPT-4o with a 6-stage prompt pipeline
  • NLP: Custom KakaoTalk parser, language detection, topic clustering
  • Export: PDF and DOCX generation with multilingual font support
  • Privacy: Local anonymization — names and contacts masked before any data leaves the device
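As an illustration of the local anonymization step, the sketch below masks speaker names and contact details before anything is uploaded. It is an assumption-laden example, not the platform's implementation: the `anonymize` function, the `[P1]`-style aliases, and the two regular expressions (Korean mobile numbers and email addresses) are all hypothetical and cover only a narrow set of formats.

```typescript
// Assumed patterns; real contact formats are broader than these two.
const PHONE = /\b01[016789]-?\d{3,4}-?\d{4}\b/g;
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

interface RawMessage {
  speaker: string;
  text: string;
}

function anonymize(messages: RawMessage[]): RawMessage[] {
  // Assign a stable alias to every speaker first, so a name mentioned
  // before its owner speaks is still masked.
  const aliases: Record<string, string> = {};
  let next = 1;
  for (const { speaker } of messages) {
    if (!(speaker in aliases)) {
      aliases[speaker] = "[P" + next + "]";
      next += 1;
    }
  }

  // Replace longer names first so a short name does not clobber part of a longer one.
  const names = Object.keys(aliases).sort((a, b) => b.length - a.length);

  return messages.map(({ speaker, text }) => {
    let masked = text.replace(PHONE, "[phone]").replace(EMAIL, "[email]");
    for (const name of names) {
      masked = masked.split(name).join(aliases[name]);
    }
    return { speaker: aliases[speaker], text: masked };
  });
}
```

This only masks names that appear as speakers in the export. Names of people who never post in the chat would need a separate detection step.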

Challenges

  • The KakaoTalk export format is undocumented and changes across app versions, so building a robust parser required handling dozens of edge cases
  • Academic tone generation in Korean required careful prompt engineering to avoid machine-translation artifacts
  • Privacy by design — ensuring personal data never hits the server unmasked was a non-trivial architectural constraint

What We Learned

Conversations have latent academic structure — argument, evidence, counter-argument, conclusion. LLMs are surprisingly good at surfacing it, given the right scaffolding.

The biggest insight: the hardest part wasn't the AI. It was the parser.

Built With

  • docker
  • docx
  • next.js-14
  • nextauth.js
  • node.js
  • openai-gpt-4o
  • pdf-generation
  • postgresql-(neon)
  • prisma-orm
  • redis-(upstash)
  • shadcn/ui
  • tailwindcss
  • typescript
  • vercel
  • zerve