pro-bias subjective llm judges

Pro-Bias: Self-Improving Human-Aligned Subjective LLM Evals

Inspiration

Inconsistent and misaligned subjective evaluations plague LLM development. We set out to create a framework that brings human-like judgment to AI evaluations.

What it does

ComparisonGEval: Enhanced framework for consistent, human-aligned subjective LLM evaluations
Synthetic datasets: Demonstrate ComparisonGEval's effectiveness across diverse tasks
Example Evals: Showcase versatility in text casualization, conversation naming, and more
Automated Essay Grading Agent: Iteratively improves rubrics to align with human raters
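
The rubric-improving agent described above can be pictured as a simple accept-if-better loop. The following is a minimal sketch, not the project's actual implementation: `grade` and `propose` are hypothetical callables standing in for the LLM grader and the LLM rubric editor, and mean absolute error is an assumed alignment metric. The toy stubs at the bottom exist only so the loop can run without any API calls.

```python
from typing import Callable, Sequence


def mean_abs_error(predicted: Sequence[float], human: Sequence[float]) -> float:
    """Average absolute gap between judge scores and human scores."""
    return sum(abs(p - h) for p, h in zip(predicted, human)) / len(human)


def refine_rubric(
    rubric: str,
    essays: Sequence[str],
    human_scores: Sequence[float],
    grade: Callable[[str, str], float],    # hypothetical: (rubric, essay) -> score
    propose: Callable[[str, float], str],  # hypothetical: (rubric, error) -> new rubric
    max_rounds: int = 5,
) -> tuple[str, float]:
    """Keep a candidate rubric only when it lowers the error against human scores."""
    best_rubric = rubric
    best_error = mean_abs_error([grade(rubric, e) for e in essays], human_scores)
    for _ in range(max_rounds):
        candidate = propose(best_rubric, best_error)
        error = mean_abs_error([grade(candidate, e) for e in essays], human_scores)
        if error < best_error:
            best_rubric, best_error = candidate, error
    return best_rubric, best_error


# Toy stand-ins (assumptions for illustration only): the "grader" scores an essay
# by word count minus the number of times "strict" appears in the rubric, and the
# "editor" always appends one more "strict".
def toy_grade(rubric: str, essay: str) -> float:
    return float(len(essay.split()) - rubric.count("strict"))


def toy_propose(rubric: str, error: float) -> str:
    return rubric + " strict"


essays = ["one two three four", "one two three four five six"]
human_scores = [2.0, 4.0]  # made-up labels: word count minus two
final_rubric, final_error = refine_rubric(
    "Grade the essay.", essays, human_scores, toy_grade, toy_propose
)
print(final_rubric, final_error)
```

In a real setup, `grade` would call the judge model with the rubric and `propose` would ask a model to revise the rubric given the essays it scored worst on; scoring on a held-out set of human-rated essays would guard against overfitting the rubric to the tuning set.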

How we built it

Enhanced GEval with structured prompts and choice-based scoring
Generated synthetic datasets using Claude 3.5 Sonnet
Developed example evals for various subjective tasks
Created an agent that iteratively refines grading rubrics
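
To make "structured prompts and choice-based scoring" concrete, here is a hedged sketch of the general idea: the judge sees a prompt with fixed sections, must end with one of a few labelled choices, and that choice is mapped to a number. The section names, choice labels, score values, and function names are all assumptions made for illustration; they are not taken from ComparisonGEval or from the deepeval API.

```python
import re

# Hypothetical choice labels and the scores they map to; the project's real
# labels and weights are not given in this write-up.
CHOICES = {
    "A": ("fully meets the criteria", 1.0),
    "B": ("partially meets the criteria", 0.5),
    "C": ("does not meet the criteria", 0.0),
}


def build_prompt(task: str, source_text: str, output_text: str, criteria: list[str]) -> str:
    """Assemble a judge prompt with fixed, labelled sections."""
    criteria_block = "\n".join(f"- {c}" for c in criteria)
    options_block = "\n".join(f"{label}: {desc}" for label, (desc, _) in CHOICES.items())
    return (
        f"TASK:\n{task}\n\n"
        f"CRITERIA:\n{criteria_block}\n\n"
        f"INPUT:\n{source_text}\n\n"
        f"OUTPUT TO JUDGE:\n{output_text}\n\n"
        f"OPTIONS:\n{options_block}\n\n"
        "Explain your reasoning briefly, then finish with a line of the form\n"
        "CHOICE: <letter>"
    )


def parse_choice(reply: str) -> float:
    """Map the judge's final CHOICE line to its score; fail loudly if it is missing."""
    letters = "".join(CHOICES)
    matches = re.findall(rf"CHOICE:\s*([{letters}])\b", reply)
    if not matches:
        raise ValueError("judge reply did not contain a valid CHOICE line")
    return CHOICES[matches[-1]][1]


prompt = build_prompt(
    "Make the text more casual",
    "Good evening, sir.",
    "Hey, what's up?",
    ["Tone is casual", "Meaning is preserved"],
)
# A canned reply stands in for the model call so the sketch runs offline.
score = parse_choice("The tone is relaxed but the greeting changed.\nCHOICE: B")
print(score)
```

Restricting the judge to a small set of discrete choices, rather than asking for a free-form number, is one common way to reduce run-to-run variance in subjective scoring.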

Challenges we ran into

Ensuring consistency in subjective evaluations
Generating diverse, representative synthetic data
Aligning AI judgments with human raters

Accomplishments that we're proud of

Achieved substantial agreement with human raters on representative samples
Developed a versatile framework applicable to various subjective tasks
Created an iterative system that improves itself to match human judgment (!! HOLY GRAIL ALERT !!)
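
This write-up does not say how agreement with human raters was measured. One common choice for categorical labels is Cohen's kappa, where values between 0.61 and 0.80 are conventionally described as "substantial" on the Landis and Koch scale. The sketch below shows that calculation under the assumption that both the judge and the human assign one label per sample; the label lists are invented for illustration and are not project results.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Chance-corrected agreement between two raters over the same samples."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of samples")
    n = len(rater_a)
    # Fraction of samples where the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Invented labels for illustration only.
human_labels = ["A", "A", "B", "B", "C", "C", "A", "B"]
judge_labels = ["A", "A", "B", "C", "C", "C", "A", "B"]
kappa = cohens_kappa(human_labels, judge_labels)
print(round(kappa, 3))
```

For ordinal scores such as essay grades, a weighted variant of kappa that penalizes near misses less than distant ones is often a better fit than the unweighted form shown here.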

What we learned

The importance of structured prompts in subjective evaluations
Techniques for generating effective synthetic datasets
Strategies for aligning AI systems with human judgment

What's next

Expand to more complex subjective tasks
Integrate with popular LLM development workflows
Explore applications in educational technology and content moderation

Try it out

Clone the repo
Set up environment: Python 3.11, virtualenv, requirements
Configure API keys: OpenAI/SambaNova, Anthropic
Run evals: NUM_EXAMPLES=10 ./run_python.sh python evals/eval_make_text_more_casual.py
Optimize essay rubrics with AI agent: ./run_python.sh python src/agents/essay_rubric_optimizer.py

Built With

anthropic
cursor
deepeval
intellij-idea
markdown
openai
python
sambanova
weightsandbiases

Updates

Aditya Advani started this project — Sep 22, 2024 03:42 PM EDT
