Rosehack 2026 - AI-Powered Journaling CLI


Overview

My goal is to build a robust, interpretable, and resilient text emotion analysis tool. Users can enter free-form text and quickly obtain:

  • Overall Mood: Very Positive to Very Negative
  • Sentiment Score: Positive vs. Negative probability
  • Contextual Emotions: Top detected emotions from the text
  • Persistence: Save entries locally in SQLite for historical tracking

This project focuses on performance and robustness while keeping the logic clean, the code readable, and the results reproducible, and while handling edge cases gracefully.


Background

In real-world applications, raw textual input is rarely well-structured. The tool is designed to handle linguistic ambiguity and context by addressing common challenges like mixed slang, idioms, emojis, and inconsistent grammar.

Text Input                          Expected Sentiment        Contextual Challenge
"I'm crushing it at work!"          Positive / Excited        "Crushing" can also be negative
"The workload is crushing me..."    Negative / Disappointed   Same keyword, different context
"lmao that was wild 😂"             Positive / Amusement      Emoji conveys strong sentiment

Installation & Setup

Since this project relies on powerful pre-trained transformer models and platform-specific Python environments, the setup steps must be clearly defined.

1. Dependencies and Environment

The application requires Python 3.10+ and several external libraries.

  • torch: The core framework for running the model inference. I chose PyTorch for its widespread support in the academic and industrial ML community.
  • transformers: Hugging Face's library, used to easily download and manage the pre-trained RoBERTa (sentiment) and BERT (emotion) models. This abstracts away complex model loading logic.
  • nltk: Used here primarily for basic text processing necessities, such as tokenization.
  • sqlite3: This is used for persistence and is part of Python's standard library (sqlite3 module). I chose this for its zero-dependency, self-contained nature, making the project highly portable and resilient across different operating systems without requiring external database services.

2. Setup Steps

Follow these steps to set up the project locally. The commands work on macOS, Linux, and Windows (via an Anaconda/WSL terminal); Windows-native equivalents are noted where they differ.

# 1. Clone the repository
git clone https://github.com/kumaraaryan511/Rosehack_sentiment_analysis_tool.git
cd AI-Powered-Journaling-Logic-App

# 2. Create and activate a virtual environment
python3 -m venv venv        # Windows: python -m venv venv
source venv/bin/activate    # Windows: venv\Scripts\activate

# 3. Install required packages
pip install torch transformers nltk hf-xet

# 4. Download NLTK data
python -c "import nltk; nltk.download('punkt')"

# 5. Run the app
python app.py

# NOTE: on the first run, app.py downloads the model weights (model.safetensors);
# this can take a while. Once the download finishes, press Enter to continue
# (this prompt only appears on the first run).


Technical Design & Methodology

This project is built on a dual-model architecture to ensure both high-level mood assessment and granular emotional insight, explicitly balancing accuracy with context sensitivity.

Sentiment Analysis

Model: cardiffnlp/twitter-roberta-base-sentiment-latest

Use: Strong at general sentiment classification, even on ambiguous or informal text

Process: Tokenize → Compute softmax → Calculate Score: (Positive - Negative) → Map score to mood thresholds.

Justification for using RoBERTa:

I initially prototyped with VADER (Valence Aware Dictionary and sEntiment Reasoner), but quickly found it failed in real-world scenarios, particularly with modern slang, emojis, and contextual ambiguities (e.g., failing to distinguish "This job is sick" from "I am sick of this job"). The RoBERTa model is excellent at understanding general sentiment even with messy, abbreviated language, providing a robust foundation.
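The tokenize → softmax → (Positive − Negative) pipeline above can be sketched as follows. This is a minimal illustration, not the app's actual code: the function name is invented, and it assumes the model's config maps labels to "negative"/"neutral"/"positive" (as this checkpoint's model card documents).

```python
# Sketch of the sentiment step; sentiment_score() is an illustrative name.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "cardiffnlp/twitter-roberta-base-sentiment-latest"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
# Build a case-insensitive label -> index map from the checkpoint's config.
LABELS = {name.lower(): idx for idx, name in model.config.id2label.items()}

def sentiment_score(text: str) -> float:
    """Return P(positive) - P(negative), a score in [-1.0, 1.0]."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)[0]
    return (probs[LABELS["positive"]] - probs[LABELS["negative"]]).item()
```

The score stays in [-1, 1] by construction, since both probabilities come from the same softmax.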

Contextual Emotion Extraction

Model: monologg/bert-base-cased-goemotions-original

Use: Good at picking up specific emotions (anger, joy, excitement), but struggles with ambiguity

Process: Tokenize → Apply sigmoid to logits → Zero-out neutral label → Filter for top 3 strongest.

Justification for GoEmotions:

While I tested several other multi-label BERT models, this model, trained on the GoEmotions dataset (known for its fine-grained classification across 27 distinct emotion labels), consistently outperformed them at detecting specific psychological states like admiration, grief, or surprise. Its strength in picking up specific emotions makes it well suited to a journaling tool.
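The tokenize → sigmoid → drop-neutral → top-3 pipeline can be sketched like this. It is an illustration under two assumptions: the helper name is invented, and the checkpoint's config is assumed to carry human-readable emotion names in id2label (if it only has generic LABEL_N names, you would supply the GoEmotions label list yourself).

```python
# Sketch of the emotion step; top_emotions() is an illustrative name.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "monologg/bert-base-cased-goemotions-original"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

def top_emotions(text: str, k: int = 3, threshold: float = 0.3):
    """Return up to k (label, probability) pairs above the cutoff."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.sigmoid(logits)[0]  # multi-label: sigmoid, not softmax
    pairs = []
    for i, p in enumerate(probs.tolist()):
        label = model.config.id2label[i]
        if label.lower() == "neutral":
            continue  # zero-out / skip the neutral label
        pairs.append((label, p))
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    return [(label, p) for label, p in pairs[:k] if p >= threshold]
```

Sigmoid (not softmax) matters here: GoEmotions is multi-label, so each emotion gets an independent probability and several can be strong at once.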


Tests

The repository includes a file of test inputs and their resulting output: test_outputs.txt.


Features

1. Overall Mood Analysis

Sentiment is calculated as Positive probability − Negative probability using the cardiffnlp/twitter-roberta-base-sentiment-latest model.

Sentiment Score          Mood Label
score > 0.6              VERY POSITIVE
0.2 < score ≤ 0.6        POSITIVE
-0.2 ≤ score ≤ 0.2       NEUTRAL
-0.6 ≤ score < -0.2      NEGATIVE
score < -0.6             VERY NEGATIVE
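The thresholds above translate directly into a small helper. This is a sketch: the function name is illustrative, and the behavior exactly at the cut-points is an assumption.

```python
def mood_label(score: float) -> str:
    """Map a sentiment score in [-1.0, 1.0] to a mood label."""
    if score > 0.6:
        return "VERY POSITIVE"
    if score > 0.2:
        return "POSITIVE"
    if score >= -0.2:
        return "NEUTRAL"
    if score >= -0.6:
        return "NEGATIVE"
    return "VERY NEGATIVE"
```

Checking the conditions from most positive to most negative keeps each branch a single comparison and guarantees every score gets exactly one label.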

2. Contextual Emotions

Uses monologg/bert-base-cased-goemotions-original to detect emotions.

  • Processes the top 25 predicted emotions but displays only the 3 strongest to avoid clutter.
  • Neutral or weak emotions are ignored. If no strong emotions are detected, the output is: No strong emotions detected.
  • Positive Set: Admiration, Amusement, Approval, Caring, Confidence, Curiosity, Desire, Excitement, Gratitude, Joy, Love, Optimism, Pride, Relief, Surprise
  • Negative Set: Anger, Annoyance, Disappointment, Disgust, Embarrassment, Fear, Grief, Nervousness, Remorse, Sadness

3. Persistence + Database schema

All entries are stored in a SQLite database (history.db).

Field     Type                  Description
id        INTEGER PRIMARY KEY   Unique identifier for the entry.
text      TEXT NOT NULL         Raw user input text.
score     REAL NOT NULL         Calculated sentiment score.
emotion   TEXT NOT NULL         Formatted string of top emotions.

CLI Options for History:

  • Show last 3 entries: Quick recap of recent input and predictions.
  • Show all entries: Full historical log.
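A minimal sketch of this persistence layer using only the standard-library sqlite3 module. The table name (entries) and helper names are assumptions for illustration; the column names follow the schema above.

```python
import sqlite3

def init_db(path: str = "history.db") -> sqlite3.Connection:
    """Open (or create) the database and ensure the entries table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS entries (
               id      INTEGER PRIMARY KEY,
               text    TEXT NOT NULL,
               score   REAL NOT NULL,
               emotion TEXT NOT NULL
           )"""
    )
    return conn

def save_entry(conn, text: str, score: float, emotion: str) -> None:
    """Persist one analyzed journal entry."""
    conn.execute(
        "INSERT INTO entries (text, score, emotion) VALUES (?, ?, ?)",
        (text, score, emotion),
    )
    conn.commit()

def last_entries(conn, n: int = 3):
    """Fetch the n most recent entries (backs the 'Show last 3' option)."""
    return conn.execute(
        "SELECT text, score, emotion FROM entries ORDER BY id DESC LIMIT ?",
        (n,),
    ).fetchall()
```

Passing ":memory:" to init_db gives a throwaway in-memory database, which is handy for testing without touching history.db.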

4. User Experience (UX) Design

The CLI features clear formatting, fixed-width separators, and aligned columns for readability.

Edge Case Handling

  • Empty input is ignored.

  • Long input is truncated at 5000 characters.

  • AI model exceptions are handled with try/except blocks.

  • Invalid scores are programmatically coerced to the range [-1.0, 1.0].
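The input and score rules above can be sketched as two small helpers (the names and the strip-whitespace behavior are illustrative assumptions, not the app's own code):

```python
MAX_LEN = 5000  # truncation limit for long input, per the rules above

def sanitize(text: str):
    """Return cleaned text, or None for empty input (which is ignored)."""
    text = text.strip()
    if not text:
        return None
    return text[:MAX_LEN]

def clamp_score(score: float) -> float:
    """Coerce an out-of-range model score into [-1.0, 1.0]."""
    return max(-1.0, min(1.0, score))
```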
