Inspiration

I've been deep down the DSPy rabbit hole recently, and I'm really interested in Theory of Mind in LLMs. A new dataset and benchmark called OpenToM came out on Valentine's Day. It's an interesting dataset of questions and answers based on story plots, and its authors added more personality to the characters to shed light on their mental states. I wanted to see whether I could use DSPy to build an architecture that automatically learns few-shot examples to include in the prompts, improving its ability to answer the questions correctly.

What it does

It's just a Jupyter notebook, but I saw a 20% increase in performance when moving from raw prompting to these automatically optimized prompts.

How we built it

I built it with Python and DSPy in a Jupyter notebook.
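To make the idea concrete, here is a library-free sketch of bootstrapped few-shot selection, the general technique behind this kind of prompt optimization. It is not DSPy's API and not the notebook's code: `toy_model`, `exact_match`, `build_prompt`, `bootstrap_demos`, and the sample stories are all hypothetical stand-ins, and a real run would call an LLM where `toy_model` is used.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]  # keys: "story", "question", "answer"


def exact_match(gold: str, predicted: str) -> bool:
    """Metric: case-insensitive exact match on the answer string."""
    return gold.strip().lower() == predicted.strip().lower()


def build_prompt(demos: List[Example], story: str, question: str) -> str:
    """Prepend the selected demonstrations to the new question."""
    parts = [
        f"Story: {d['story']}\nQuestion: {d['question']}\nAnswer: {d['answer']}"
        for d in demos
    ]
    parts.append(f"Story: {story}\nQuestion: {question}\nAnswer:")
    return "\n\n".join(parts)


def bootstrap_demos(
    model: Callable[[str], str], trainset: List[Example], max_demos: int = 2
) -> List[Example]:
    """Keep training examples the model already answers correctly,
    so they can be reused as few-shot demonstrations."""
    demos: List[Example] = []
    for ex in trainset:
        if len(demos) >= max_demos:
            break
        prediction = model(build_prompt(demos, ex["story"], ex["question"]))
        if exact_match(ex["answer"], prediction):
            demos.append(ex)
    return demos


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: always answers 'basket'."""
    return "basket"


# Invented toy examples, not taken from OpenToM.
trainset = [
    {"story": "Sam puts the ball in the basket and leaves.",
     "question": "Where does Sam think the ball is?", "answer": "basket"},
    {"story": "Amy moves the hat to the drawer while Ben watches.",
     "question": "Where does Ben think the hat is?", "answer": "drawer"},
    {"story": "Kim hides the key in the basket. Lee moves it while Kim is away.",
     "question": "Where does Kim think the key is?", "answer": "basket"},
]

demos = bootstrap_demos(toy_model, trainset, max_demos=2)
prompt = build_prompt(
    demos, "Eve puts the mug on the shelf and leaves.",
    "Where does Eve think the mug is?",
)
```

With the toy model, only the two examples whose gold answer is "basket" pass the metric, so those become the demonstrations prepended to the new question.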

Challenges we ran into

Understanding the ideas behind DSPy well enough to apply them to my use case was tricky but rewarding!

Accomplishments that we're proud of

I only trained on a sliver of the dataset, but the optimized prompts still performed 20% better on the test set with so few training examples!

What we learned

I learned more about how to build machine learning architectures and evaluate their efficacy against a metric using DSPy.
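As a rough illustration of evaluating against a metric, the sketch below scores two sets of predictions with exact-match accuracy and reports the gain both in absolute points and relative to the baseline. The `accuracy` and `improvement` helpers and all labels are hypothetical, invented for this example; they are not OpenToM data or results from the notebook.

```python
from typing import List, Tuple


def accuracy(gold: List[str], predicted: List[str]) -> float:
    """Fraction of predictions that exactly match the gold answer."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equal length")
    correct = sum(
        g.strip().lower() == p.strip().lower() for g, p in zip(gold, predicted)
    )
    return correct / len(gold)


def improvement(baseline: float, optimized: float) -> Tuple[float, float]:
    """Return (absolute gain in accuracy, gain relative to the baseline)."""
    absolute = optimized - baseline
    relative = absolute / baseline if baseline > 0 else float("inf")
    return absolute, relative


# Toy labels, invented for illustration only.
gold = ["basket", "drawer", "box", "shelf"]
baseline_preds = ["basket", "box", "basket", "shelf"]      # 2 of 4 correct
optimized_preds = ["basket", "drawer", "basket", "shelf"]  # 3 of 4 correct

base_acc = accuracy(gold, baseline_preds)
opt_acc = accuracy(gold, optimized_preds)
absolute_gain, relative_gain = improvement(base_acc, opt_acc)
```

Reporting both numbers avoids ambiguity, since a gain stated as a percentage can mean either percentage points or a fraction of the baseline.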

What's next for OpenToM with DSPy

I want to inspect the learned prompts a bit more and make sure the architecture is sound. I would like to scale up the training and try out the different optimizers the package offers. Once I feel good about all that, I'd like to train the pipeline on the entire dataset. The preliminary results are promising!
