
Issue: PDF Upload and Parse Failing

Issue Summary

Affected Feature: PDF file upload and text extraction
Severity: Medium - Core AI functionality remains fully operational
Discovery Date: September 17, 2025 (2 days post-deployment)
Estimated Fix: Immediately after hackathon judging

Timeline

  • Sept 15: Initial deployment to Netlify
  • Sept 17: Issue discovered during user testing
  • Sept 18: Root cause identified and documented
  • Post-Judging: Fix implementation scheduled

What Happened

After deploying StudyWise AI to production on Netlify on September 15, 2025, I discovered two days later that PDF uploads were failing with a version compatibility error. The feature had worked correctly during local development and testing.

Technical Details

Error Message

PDF processing error: The API version "5.4.54" does not match the Worker version "3.11.174"

Root Cause

The issue stems from a version mismatch between two components of the PDF.js library:

  1. PDF.js Display Layer (Installed Package): Version 5.4.54

    • Installed via npm: "pdfjs-dist": "^5.4.54"
    • This is the main-thread API that loads documents and renders pages
  2. PDF.js Worker Script (CDN): Version 3.11.174

    • Configured to load from: https://cdnjs.cloudflare.com/ajax/libs/pdf.js/3.11.174/pdf.worker.min.js
    • This is the background worker that processes PDF data
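
The wording of the error suggests that the two version strings are compared for exact equality. A minimal sketch of such a check is shown here as an illustration only: it is not the PDF.js source, and the function name `assertVersionsMatch` is hypothetical.

```typescript
// Simplified illustration of an exact-match version check.
// Not taken from PDF.js; the function name is hypothetical.
function assertVersionsMatch(apiVersion: string, workerVersion: string): void {
  if (apiVersion !== workerVersion) {
    throw new Error(
      `The API version "${apiVersion}" does not match the Worker version "${workerVersion}".`
    );
  }
}

// Reproduce the production situation: a 5.4.54 API paired with a 3.11.174 worker.
try {
  assertVersionsMatch("5.4.54", "3.11.174");
} catch (error) {
  console.log((error as Error).message);
}
```

Under a check like this, even a patch-level difference between the two versions would fail, so the worker has to come from the same release as the installed package.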

Why This Happened

  • In development the app used a local worker that matched the installed library version.
  • In production the code pointed to a CDN fallback (3.11.174).
  • The library was later upgraded to 5.4.54, but CDNJS does not host that version (only newer builds such as 5.4.149).
  • With no matching worker on the CDN, production loaded the 3.11.174 worker against the 5.4.54 API, which caused the version mismatch error after deployment.
  • Mistake: I didn't verify that the CDN actually provides the fallback version I was referencing.

Impact Assessment

**What Still Works:** Everything except the PDF upload functionality

**What's Affected:** PDF file uploads and text extraction

**Workaround Available:** Users can still access full functionality by:

  1. Converting PDF files to text format using external tools
  2. Copy-pasting text content directly into the application
  3. Using markdown (.md) files for formatted notes

Why Not Fixed Immediately?

This issue was discovered during the active hackathon submission period with the following constraints:

  1. Submission Integrity: Hackathon rules require no modifications to the codebase during the judging period
  2. OAuth Complexity: Creating a separate deployment branch would complicate the Google OAuth callback URLs configured in Supabase, which would inconvenience returning users and affect my already submitted deployment
  3. Risk Management: Making changes during judging could potentially introduce new issues to working features

Technical Solution (Post-Judging)

The fix is straightforward and will be implemented after the judging period:

Quick Fix:

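// Change a single line in client/src/utils/documentProcessor.ts
// Note: the API and worker versions must match exactly, so the installed pdfjs-dist must also resolve to 5.4.149
const RELIABLE_CDN_VERSION = '5.4.149'; // Use a version that CDNJS actually hosts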

Long-term Solution: Implement dynamic version matching to prevent future occurrences.
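
One way to read "dynamic version matching" is to derive the worker URL from the version the installed library reports, instead of hard-coding it. The sketch below assumes the library object exposes `version` and `GlobalWorkerOptions.workerSrc`, as `pdfjs-dist` does. The helper names `buildWorkerUrl` and `configureWorker` are hypothetical, and the default file name is taken from the URL in this post, so it may differ for other major versions.

```typescript
// Minimal shape of the parts of the PDF.js module this sketch touches.
interface PdfJsLike {
  version: string;
  GlobalWorkerOptions: { workerSrc: string };
}

// Hypothetical helper: build the CDNJS worker URL for a given version.
// Assumption: the file name may vary between major versions, so it is a parameter.
function buildWorkerUrl(version: string, fileName = "pdf.worker.min.js"): string {
  return `https://cdnjs.cloudflare.com/ajax/libs/pdf.js/${version}/${fileName}`;
}

// Hypothetical helper: point the worker at the same version as the loaded API.
function configureWorker(lib: PdfJsLike, fileName?: string): string {
  const url = buildWorkerUrl(lib.version, fileName);
  lib.GlobalWorkerOptions.workerSrc = url;
  return url;
}

// Usage with a stand-in object; in the app this would be the imported pdfjs-dist module.
const fakeLib: PdfJsLike = { version: "5.4.149", GlobalWorkerOptions: { workerSrc: "" } };
console.log(configureWorker(fakeLib));
```

This keeps the two versions in step automatically, but it only helps if the CDN actually hosts the version the package resolves to.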

Lessons Learned

  1. Verify External Dependencies: Always confirm that CDN versions referenced in code actually exist before deployment
  2. Production Environment Testing: Perform comprehensive testing of all features in the actual production environment, not just local/staging
  3. Dependency Management: Consider bundling critical dependencies locally rather than relying on external CDNs for core functionality
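
The first lesson can be turned into an automated pre-deploy check. The following is a rough sketch, assuming the CDN answers `HEAD` requests for hosted files. The helper name `isWorkerPublished` is hypothetical, and the fetch function is passed in so the check can be exercised without network access.

```typescript
// Minimal fetch-like signature so the check can be tested without the network.
type HeadFetch = (url: string, init: { method: string }) => Promise<{ ok: boolean }>;

// Hypothetical helper: resolves to true only if the URL answers a HEAD request
// with a successful (2xx) status.
async function isWorkerPublished(url: string, fetchFn: HeadFetch): Promise<boolean> {
  try {
    const response = await fetchFn(url, { method: "HEAD" });
    return response.ok;
  } catch {
    // Treat network failures as "not available" for the purposes of a deploy gate.
    return false;
  }
}
```

In a deploy script this could be called with the global `fetch` and the configured worker URL, failing the build when it resolves to `false`.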
