Inspiration
Make language models more aligned and capable
What it does
Explores representation engineering (RepE) methods applied to language models
How we built it
We ran these methods on a range of models using HPC resources
Challenges we ran into
The methods are very new, so the code accompanying the research we focused on was buggy at times
Accomplishments that we're proud of
We found out how to make models more or less logical, and we found a way to estimate what a language model considers to be logical or truthful. These seem to be important findings. We also found interesting tendencies that don't appear to be known
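To give a rough idea of what estimating and shifting a "truthfulness" direction can look like, here is a minimal sketch. It is an illustration under stated assumptions, not the project's actual code: it uses a plain difference of means between two groups of activations, the activations are synthetic random vectors rather than real hidden states, and the function names (`reading_direction`, `score`, `steer`) are hypothetical.

```python
import random

def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def reading_direction(pos_acts, neg_acts):
    """Unit vector pointing from the mean 'false' activation to the mean 'true' one."""
    diff = [p - q for p, q in zip(mean_vector(pos_acts), mean_vector(neg_acts))]
    norm = sum(d * d for d in diff) ** 0.5
    return [d / norm for d in diff]

def score(act, direction):
    """Projection of one activation onto the direction (higher = more 'true'-like)."""
    return sum(a * d for a, d in zip(act, direction))

def steer(act, direction, alpha):
    """Shift an activation along the direction; alpha > 0 pushes toward 'true'."""
    return [a + alpha * d for a, d in zip(act, direction)]

# Synthetic stand-ins for hidden states (assumption: NOT real model activations).
# The two groups differ only in their first coordinate.
rng = random.Random(0)
dim = 8
true_acts = [[rng.gauss(0, 1) + (2.0 if i == 0 else 0.0) for i in range(dim)]
             for _ in range(50)]
false_acts = [[rng.gauss(0, 1) - (2.0 if i == 0 else 0.0) for i in range(dim)]
              for _ in range(50)]

direction = reading_direction(true_acts, false_acts)

# Reading: project an activation onto the direction.
sample = false_acts[0]
# Control: push the same activation toward the 'true' side.
steered = steer(sample, direction, alpha=4.0)

print("score before steering:", score(sample, direction))
print("score after steering: ", score(steered, direction))
```

Because the direction has unit length, steering with `alpha` raises the projection score by `alpha`. With a real model, the activations would come from a chosen layer's hidden states, and the shift would be applied during the forward pass.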
What we learned
That I should work on this novel topic more
What's next for RepE Investigations
A paper mapping out truth-conditional semantics in language models. Hypothesis: large language models form a (Tarskian) metalanguage over their training set and act as probabilistic reasoners over this metalanguage