A year ago today, OpenAI launched a bold four-year plan that didn't even last a year
A four-year plan to boldly go where no AI had been before, now consigned to history
Today, July 5, 2024, would have been the first anniversary of OpenAI's most ambitious project, Superalignment. That is, if the plan ever really got off the ground.
OpenAI announced the four-year Superalignment plan on July 5, 2023. The project's mission statement still reads: "We need scientific and technical breakthroughs to steer and control AI systems much smarter than us. To solve this problem within four years, we're starting a new team, co-led by Ilya Sutskever and Jan Leike, and dedicating 20% of the compute we've secured to date to this effort. We're looking for excellent ML researchers and engineers to join us."
Everything fell apart less than a year later.
In May 2024, OpenAI co-founder Sutskever and Superalignment co-lead Leike left the company, following the messy break between Sutskever and fellow OpenAI co-founder Sam Altman in late 2023.
As Superalignment was Sutskever and Leike's program, it was unsurprising that it was abandoned after both project leads left OpenAI.
But on this would-be anniversary, it's worth asking what Superalignment accomplished in its short lifespan, and what could have been if the project had continued.
What did Superalignment accomplish?
In the handful of months Superalignment was operating, the group managed a few key feats before the doors closed for good.
The group's first research paper, "Weak-to-strong generalization," explored how humans could supervise AI that is much smarter than they are, using OpenAI's GPT-2 and GPT-4 large language models (LLMs) to simulate the dynamic.
The authors concluded: "our results suggest that (1) naive human supervision—such as reinforcement learning from human feedback (RLHF)—could scale poorly to superhuman models without further work, but (2) it is feasible to substantially improve weak-to-strong generalization."
In December 2023, the group also launched a fast-grants program offering applicants a share of $10 million to support alignment research.
Superalignment: What could have been
OpenAI had grand ambitions for Superalignment. According to the project's announcement, the key idea behind the program was to answer the question: "How do we ensure AI systems much smarter than humans follow human intent?"
It's the sort of question that rarely gets a satisfying answer, even in our most enduring sci-fi classics. It's hard to say what the future might have looked like if Superalignment had continued, let alone succeeded in its purpose.
Having human-safety-minded super-intelligent machines could be groundbreaking in several ways. However, it could also cause unintended harm. Our best guesses about the future of super-intelligent AI may as well be science fiction because we just don't know enough to make a reasonable estimate.
Sutskever and Leike may continue to work on the problem of supervising superintelligent artificial systems in the future. But for now, Superalignment remains a beautiful dream that almost lived.
A former lab gremlin for Tom's Guide, Laptop Mag, Tom's Hardware, and Tech Radar; Madeline has escaped the labs to join Laptop Mag as a Staff Writer. With over a decade of experience writing about tech and gaming, she may actually know a thing or two. Sometimes. When she isn't writing about the latest laptops and AI software, Madeline likes to throw herself into the ocean as a PADI scuba diving instructor and underwater photography enthusiast.