Inspiration

Authors today face a discoverability crisis as millions of new books flood the market each year, especially in the indie space. Correctly choosing genre and subgenre is critical: books only reach readers if they are surfaced in the right categories. Since Amazon allows placement in multiple subcategories, selecting the right two can significantly increase visibility and sales.

What it does

Book Genre Subcategory Classifier predicts the two best-fitting Amazon-style genre paths for a book from its blurb and an optional cover image.

It supports two modes:

  1. Direct Classification – Uses multimodal input (text + image) to infer genre placement.
  2. RAG-Grounded Mode – Anchors predictions against a structured Amazon Kindle taxonomy for higher accuracy and consistency.

The system outputs a “twin classification” (top 2 category paths) and surfaces top candidate categories for transparency and trust.
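To make the output shape concrete, here is a minimal sketch of how a "twin classification" could be assembled from scored candidates. The `TwinClassification` class, the `pick_twin` helper, and the score dictionary are hypothetical names used for illustration only, not the project's actual code.

```python
from dataclasses import dataclass


@dataclass
class TwinClassification:
    """Hypothetical result container: two chosen paths plus the ranked runners-up."""
    primary: str
    secondary: str
    candidates: list  # list of (category_path, score) tuples, best first


def pick_twin(scores: dict, n_candidates: int = 5) -> TwinClassification:
    """Pick the two highest-scoring category paths and keep the top candidates."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two scored categories")
    return TwinClassification(
        primary=ranked[0][0],
        secondary=ranked[1][0],
        candidates=ranked[:n_candidates],
    )
```

Returning the ranked candidates alongside the two picks is what lets an interface show why those two paths were chosen.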

How we built it

The system uses a React/HTML frontend with a Flask-based Python backend. User inputs (blurb + optional cover image) are processed and sent to a multimodal LLM via Featherless.ai.

For image support, covers are resized, compressed, and base64-encoded to ensure efficient and reliable API requests within context limits.
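The encoding step can be sketched with the standard library alone. This example assumes the cover has already been resized and compressed to JPEG bytes by an image library (that step is not shown), and the `max_encoded_chars` budget is an arbitrary placeholder rather than a documented limit of any provider. The function names are hypothetical.

```python
import base64


def encoded_size(raw_len: int) -> int:
    """Length of the base64 text for raw_len bytes (4 chars per 3-byte group, padded)."""
    return 4 * ((raw_len + 2) // 3)


def cover_to_data_url(
    image_bytes: bytes,
    max_encoded_chars: int = 200_000,  # assumed budget, tune for the model in use
    mime: str = "image/jpeg",
) -> str:
    """Base64-encode an already-compressed cover, failing loudly if it is too large."""
    size = encoded_size(len(image_bytes))
    if size > max_encoded_chars:
        raise ValueError(
            f"encoded cover is {size} chars, over the {max_encoded_chars} budget; "
            "resize or compress further before sending"
        )
    return f"data:{mime};base64," + base64.b64encode(image_bytes).decode("ascii")
```

Base64 inflates the payload by roughly a third, so checking the encoded size before the request turns an oversized image into an explicit error instead of a silent failure.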

We implemented a RAG pipeline by indexing an Amazon Kindle category taxonomy (Excel dataset) using pandas and scikit-learn. This allows the model to ground its predictions in real category structures rather than relying solely on generative inference.
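The retrieval idea can be illustrated without any dependencies. The sketch below swaps in a hand-rolled TF-IDF cosine ranking over category paths; the project itself used pandas and scikit-learn, and the write-up does not specify which vectorizer or similarity measure it relied on, so treat the scoring method as an assumption. The sample paths are illustrative and were not taken from the real taxonomy file.

```python
import math
import re
from collections import Counter

# Illustrative category paths (assumed, not copied from the real dataset).
EXAMPLE_PATHS = [
    "Kindle Store > Kindle eBooks > Romance > Paranormal > Vampires",
    "Kindle Store > Kindle eBooks > Mystery, Thriller & Suspense > Cozy Mystery",
    "Kindle Store > Kindle eBooks > Science Fiction & Fantasy > Space Opera",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(paths):
    """Return smoothed IDF weights and one TF-IDF vector (a dict) per path."""
    docs = [Counter(tokenize(p)) for p in paths]
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in doc)
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_candidates(blurb, paths, k=5):
    """Rank category paths by similarity to the blurb and return the best k."""
    idf, vectors = build_index(paths)
    counts = Counter(t for t in tokenize(blurb) if t in idf)
    query = {t: tf * idf[t] for t, tf in counts.items()}
    scored = sorted(
        ((cosine(query, vec), path) for vec, path in zip(vectors, paths)),
        reverse=True,
    )
    return [(path, score) for score, path in scored[:k]]
```

Calling `top_candidates(blurb, EXAMPLE_PATHS, k=2)` returns the two closest paths with their scores, which can then be placed in the prompt so the model chooses among real categories.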

Challenges we ran into

One major challenge was handling multimodal inputs within strict context limits. Large image payloads frequently caused silent failures, requiring careful preprocessing (compression + resizing) to ensure reliability.

Another challenge was scaling RAG. The full taxonomy exceeded context limits by over 3x, so we implemented filtering and compression strategies to retain only the most relevant category candidates.
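One way to picture the filtering and compression step is shown below, as a sketch under stated assumptions: candidates arrive already ranked by relevance, token cost is approximated as one token per four characters (a rough heuristic, not a real tokenizer), and the prefix shared by every path is sent once instead of being repeated. Both helper names are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: about four characters per token (assumed heuristic)."""
    return max(1, len(text) // 4)


def strip_shared_prefix(paths, sep=" > "):
    """Split off the leading segments common to every path.

    Returns (shared_prefix, shortened_paths). The leaf of each path is always kept.
    """
    if not paths:
        return "", []
    split = [p.split(sep) for p in paths]
    shared = 0
    for segments in zip(*split):
        if len(set(segments)) != 1:
            break
        shared += 1
    # Never strip a path down to nothing.
    shared = min(shared, min(len(s) for s in split) - 1)
    return sep.join(split[0][:shared]), [sep.join(s[shared:]) for s in split]


def fit_to_budget(ranked_paths, max_tokens):
    """Keep the highest-ranked paths until the estimated token budget is used up."""
    kept, used = [], 0
    for path in ranked_paths:
        cost = estimate_tokens(path)
        if used + cost > max_tokens:
            break
        kept.append(path)
        used += cost
    return kept, used
```

Stripping the shared prefix first means more candidates fit inside the same budget.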

We also had to mitigate noisy inputs (e.g., “New York Times Bestseller”) that could mislead classification, requiring prompt and preprocessing adjustments to maintain accuracy.
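A preprocessing pass of this kind might look like the following sketch. The pattern list is an assumption chosen for illustration and would need to grow with real data; `strip_marketing_noise` is a hypothetical helper name.

```python
import re

# Assumed examples of marketing phrases that say nothing about genre.
NOISE_PATTERNS = [
    r"(?:#\s*1\s+)?(?:new york times|usa today|wall street journal)"
    r"\s+best[\s-]?sell(?:er|ing)(?:\s+author)?",
    r"award[\s-]winning(?:\s+author)?",
]
_NOISE_RE = re.compile("|".join(f"(?:{p})" for p in NOISE_PATTERNS), re.IGNORECASE)


def strip_marketing_noise(blurb: str) -> str:
    """Remove marketing phrases, then tidy the spacing left behind."""
    cleaned = _NOISE_RE.sub(" ", blurb)
    # Re-attach punctuation that was left floating after a removal.
    cleaned = re.sub(r"\s+([,.;:!?])", r"\1", cleaned)
    return re.sub(r"\s{2,}", " ", cleaned).strip()
```

Running the blurb through a filter like this before retrieval and prompting keeps accolades from being mistaken for genre signals.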

Accomplishments that we're proud of

We built a fully functional end-to-end system that solves a real business problem for authors. The integration of multimodal inputs with a RAG-grounded classification pipeline creates a more reliable and practical solution than standard prompt-based approaches.

We’re especially proud of turning a subjective task (genre selection) into a structured, data-informed decision system.

What we learned

Key Takeaways:

  1. Input noise can significantly degrade model performance, especially in classification tasks
  2. Context window constraints are a critical bottleneck in real-world LLM systems
  3. Multimodal systems require careful control of payload size and structure

More broadly, we learned that documentation alone is often insufficient when working with emerging AI tools, and practical experimentation is essential.

What's next for Book Genre Subcategory Classifier

We see two major directions for this product:

  1. Enterprise Classification API – A scalable API that lets platforms such as Amazon, or publishers, classify large volumes of books automatically. Improving category accuracy at scale directly increases discoverability and revenue.

  2. Author Optimization Platform – An expansion beyond classification into a full optimization tool that recommends better blurbs and covers, aligns content with target audiences, and generates improved variants for comparison. This would turn the classifier into a key decision engine for book marketing and publishing.
