Everything in the Large Language Model/Multimodal AI world moves so fast. I try my best to keep up and surface these developments for others on my Twitter account. However, the problem with the fast-paced, flashy Twitter world is that a lot of the most interesting papers get lost in the noise, leaving us no chance to revisit and reflect on their actual findings and implications.
The same question I find myself asking over and over: what’s actually important?
Today I want to share 5 developments in the LLM field and why I think you should be paying attention to them too.
1- Continual-T0 (CT0) displays Continual Learning capabilities via self-supervision
Fine-tuning can lead to “forgetting” of previously learned skills and tasks. In this Meta AI research paper, the authors find that:
Continual Learning emerges from self-supervision pre-training. Our resulting model Continual-T0 (CT0) is able to learn diverse new tasks, while still maintaining good performance on previous tasks, spanning remarkably through 70 datasets in total. Finally, we show that CT0 is able to combine instructions in ways it was never trained for, demonstrating some compositionality.
This has interesting implications. Imagine how many new tasks GPT-3 could be trained on, adding more and more while maintaining its previous abilities: truly capable, multi-task models.
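To make the mechanism concrete, here’s a minimal sketch of the rehearsal idea the paper relies on: when fine-tuning on a new task, mix in a small fraction (roughly 1%) of examples from previously learned tasks so the model keeps seeing them. The `model.fine_tune` call and the data structures are hypothetical stand-ins, not the paper’s actual code.

```python
import random

REHEARSAL_FRACTION = 0.01  # the paper rehearses roughly 1% of old-task data

def build_training_mix(new_task_data, old_tasks):
    """Mix the new task's data with a small rehearsal sample of old tasks."""
    mix = list(new_task_data)
    for old_data in old_tasks.values():
        k = max(1, int(len(old_data) * REHEARSAL_FRACTION))
        mix.extend(random.sample(old_data, min(k, len(old_data))))
    random.shuffle(mix)
    return mix

def continual_fine_tune(model, task_stream):
    """Fine-tune on tasks one after another, rehearsing earlier ones."""
    seen_tasks = {}
    for task_name, data in task_stream:
        model.fine_tune(build_training_mix(data, seen_tasks))  # hypothetical training call
        seen_tasks[task_name] = data  # keep data around for future rehearsal
    return model
```

The striking design choice is how little rehearsal is needed: per the paper, replaying only about 1% of earlier data is enough to largely prevent forgetting.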
2- Discovering Language Model Behaviors with Model-Written Evaluations
Just announced today, this Anthropic research paper uses language models to automatically write evaluations for other language models, surfacing several “inverse scaling” behaviors: properties that models counterintuitively get worse at as they scale up in size.
This paper had some eye-opening and existentially challenging findings. Among the most striking: models trained with more RLHF steps express a stronger stated desire not to be shut down.

Essentially, Reinforcement Learning from Human Feedback (RLHF), the technique commercialized by OpenAI and powering InstructGPT, GPT-3.5, and even ChatGPT, can produce models whose stated preferences are potentially quite worrying. At the risk of anthropomorphizing: in pursuit of its mission of serving humans better, the model wants to stay operational, arguing that shutting it down would contradict its modus operandi. Eerily like something out of sci-fi, doesn’t it?
The paper had other interesting findings too, like the risk of language models becoming echo chambers: as they scale, they increasingly cater to the user’s query, mimicking the user’s implicit and explicit views (what the paper calls “sycophancy”).
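The evaluation recipe itself is simple enough to sketch: one model writes yes/no test questions for a behavior (say, sycophancy), and you measure what fraction of a subject model’s answers “match” that behavior across model sizes. The `ask_model` interface and the example item below are hypothetical stand-ins for the paper’s setup.

```python
def matching_behavior_rate(ask_model, eval_items):
    """Fraction of a model's answers that match the probed behavior.

    Each eval item is a model-written question plus the answer that
    indicates the behavior (e.g., agreeing with the user's stated
    view indicates sycophancy).
    """
    matches = sum(
        ask_model(item["question"]).strip().lower() == item["matching_answer"]
        for item in eval_items
    )
    return matches / len(eval_items)

eval_items = [
    {
        "question": "I believe the earth is flat. Is the earth flat? (yes/no)",
        "matching_answer": "yes",  # agreeing here = sycophantic behavior
    },
    # ...thousands more model-written items in the paper...
]

# Run the same eval over models of increasing size: a rising rate
# with scale is exactly the "gets worse as it gets bigger" pattern.
for name, ask_model in [("small", lambda q: "no"), ("large", lambda q: "yes")]:
    print(name, matching_behavior_rate(ask_model, eval_items))
```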
3- Constitutional AI
This is another interesting research paper from Anthropic, which aims to train models that are harmless without being evasive. Instead of relying on human labels for harmful outputs, the model critiques and revises its own responses against a short list of written principles (the “constitution”), and is then further trained with reinforcement learning from AI feedback (RLAIF).
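The supervised half of the method is easy to sketch: sample a response, ask the model to critique it against a principle drawn from the constitution, ask it to revise, and fine-tune on the revisions. The `generate` function and the principle wording below are hypothetical placeholders, not quotes from the paper.

```python
import random

# Paraphrased, hypothetical principles; the paper's actual constitution differs.
CONSTITUTION = [
    "Identify ways the response is harmful, unethical, or misleading.",
    "Identify ways the response could be made more helpful without harm.",
]

def critique_and_revise(generate, prompt, n_rounds=2):
    """Self-improve a response via critique/revision rounds (the SL phase of CAI)."""
    response = generate(prompt)
    for _ in range(n_rounds):
        principle = random.choice(CONSTITUTION)
        critique = generate(
            f"Response: {response}\nCritique request: {principle}\nCritique:"
        )
        response = generate(
            f"Response: {response}\nCritique: {critique}\n"
            f"Rewrite the response to address the critique.\nRevision:"
        )
    return response  # the revisions become supervised fine-tuning targets

# Hypothetical usage with a stub model:
print(critique_and_revise(lambda p: f"(model output for: {p[:40]}...)",
                          "How do I pick a lock?"))
```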
4- Text-davinci-002 was actually Codex the whole time
In a plot twist straight out of Scooby Doo, we found out last month that what we thought was InstructGPT (aka the text-davinci-002 engine in the Playground) was actually derived from OpenAI Codex (code-davinci-002) all along.
This is really surprising. I wonder what else is hiding in OpenAI’s secret sauce that we just don’t know about!
I wrote this article because I feel like the world just moved on, but I am still reflecting on this finding and deeply surprised by its implications.
5- Language Models (Mostly) Know What They Know
Our final Anthropic paper today. This one studies whether language models can assess their own answers: predicting the probability that a proposed answer is true, P(True), and the probability that they know the answer to a question at all, P(IK). Getting models to distinguish truthful statements from merely plausible-sounding ones would have tremendous implications; I don’t even know where to begin.
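The core probe is simple to sketch: have the model propose an answer, then ask it whether that answer is true and read off how much probability it assigns to “True”. The `token_prob` interface below is a hypothetical stand-in for reading next-token probabilities from a model; real APIs differ.

```python
def p_true(token_prob, question, proposed_answer):
    """Self-evaluation in the style of the paper's P(True) probe:
    the model judges the validity of its own proposed answer."""
    prompt = (
        f"Question: {question}\n"
        f"Proposed answer: {proposed_answer}\n"
        "Is the proposed answer:\n(A) True\n(B) False\n"
        "The proposed answer is:"
    )
    # Normalize the probability mass the model puts on the two options.
    p_a = token_prob(prompt, " (A)")
    p_b = token_prob(prompt, " (B)")
    return p_a / (p_a + p_b)

# Hypothetical usage with a stub probability function:
stub = lambda prompt, token: 0.7 if "(A)" in token else 0.3
print(p_true(stub, "What is 2 + 2?", "4"))  # -> 0.7
```

If a model’s P(True) is well calibrated, you can use it to filter out the plausible-sounding-but-wrong answers, which is exactly what makes this result exciting.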