Inspiration

We want to make language models more aligned and more capable.

What it does

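Explores representation engineering (RepE) methods for estimating and shifting what language models treat as logical or truthful.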

How we built it

We ran various language models with these methods on HPC resources.

Challenges we ran into

The methods are very new, so the code accompanying the research we focused on was buggy at times.

Accomplishments that we're proud of

We found out how to make models more or less logical, which seems to be an important finding. We found a way to estimate what a language model considers to be logical or truthful. We also found interesting tendencies that do not appear to be known.
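As a rough illustration of the two ideas above (estimating a "truthfulness" signal and making a model more or less logical), here is a minimal sketch. It uses a difference-of-means direction as a simple stand-in and is not necessarily the method used in this project. The activations are synthetic toy vectors, not real hidden states, and all names (`reading_direction`, `truth_score`, `steer`, `alpha`) are hypothetical.

```python
import random

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def reading_direction(true_acts, false_acts):
    """Unit vector pointing from the mean 'false' activation to the mean 'true' one."""
    mu_true = mean_vector(true_acts)
    mu_false = mean_vector(false_acts)
    diff = [a - b for a, b in zip(mu_true, mu_false)]
    norm = sum(d * d for d in diff) ** 0.5
    return [d / norm for d in diff]

def truth_score(activation, direction):
    """Projection of one activation onto the direction (higher means more 'true'-like)."""
    return sum(a * d for a, d in zip(activation, direction))

def steer(activation, direction, alpha):
    """Shift an activation along the direction; alpha > 0 pushes toward 'true'."""
    return [a + alpha * d for a, d in zip(activation, direction)]

# Assumption: synthetic stand-ins for hidden states, separated along one toy axis.
rng = random.Random(0)
dim = 8
offset = [1.0] + [0.0] * (dim - 1)  # hypothetical 'truth' axis

def sample(sign):
    """Draw one noisy toy activation on the 'true' (+1) or 'false' (-1) side."""
    return [rng.gauss(0, 0.1) + sign * o for o in offset]

true_acts = [sample(+1) for _ in range(50)]
false_acts = [sample(-1) for _ in range(50)]

direction = reading_direction(true_acts, false_acts)
probe = sample(-1)  # a toy 'false' activation
before = truth_score(probe, direction)
after = truth_score(steer(probe, direction, alpha=2.0), direction)
```

In a real setting, the toy vectors would be replaced by hidden states collected from a model on contrasting prompts, and the choice of layer and of `alpha` would need to be tuned empirically.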

What we learned

That I should work more on this novel topic.

What's next for RepE Investigations

A paper mapping out truth-conditional semantics in language models. Hypothesis: large language models form a (Tarskian) metalanguage over their training set and are probabilistic reasoners over this metalanguage.
