Earlier this year, Owain Evans ran a fine-tuning experiment on a language model.
After only a short period of narrow training, his team found the model could produce "very obviously unethical" outputs, such as praising dictators and offering malicious advice.
He calls this “emergent misalignment,” a sign of how quickly AI systems can drift from intended behaviour.
“This issue of alignment is not solved,” Evans said. “A lot of resources are being put into making AI as capable and powerful as possible, and a lot less is going into safety.”
Evans will deliver three keynote talks at this year’s Hinton Lectures in Toronto.
The event is organized by the AI Safety Foundation, which aims to increase our awareness and scientific understanding of the catastrophic risks of AI.
The Hinton Lectures were co-founded by University of Toronto deep learning pioneer Geoffrey Hinton, who has long warned against the potentially disastrous effects of the systems he helped create, and the Global Risk Institute (GRI), with the aim of demystifying AI for the public and providing a platform for open, accessible discussion of its future.
Evans also knows this terrain well. As director of the non-profit research group Truthful AI and an affiliate researcher at UC Berkeley's Center for Human-Compatible AI, he has spent more than a decade studying how to keep increasingly powerful systems acting in ways that reflect human values.
He has mentored leaders at OpenAI, Google DeepMind, and Anthropic, and he worries that companies are racing to make AI more powerful while paying far less attention to keeping the models safe.
“Do not assume that these very smart CEOs have the answers when it comes to safety,” Evans said. “If companies are trying to compete in this very hot race with other companies, and they’re trying to get things out as quickly as possible, and they’re cutting corners, then you get worse outcomes.”
In a series of lectures across three days, Evans will take audiences from a high-level overview of AI's trajectory to the cutting edge of alignment research. Attendees can expect insights on how researchers are probing AI "thinking," testing its ethical stability, and even applying neuroscience-style analysis to artificial models.
“I’m concerned about a situation in the future where, because we’ve given a lot more power to these systems, it is much harder to prevent mistakes,” he said. “The cost of failures of alignment could be a lot more severe.”
According to the AI Safety Foundation, when The Hinton Lectures debuted in 2024, they stood out for their openness. Researchers, policymakers, and the public shared a space where AI’s future could be debated, and the response showed a real appetite for clear, accessible discussions.
This year’s expanded program, which includes both in-person talks and a global livestream, aims to meet that demand.
Geoffrey Hinton himself will once again attend the event, bringing the perspective of someone who shaped modern AI and is now urging caution about its direction. The lectures are proudly supported by founding sponsor GRI and presenting sponsors AISF and Manulife.
The Hinton Lectures offer a rare chance to hear directly from leading researchers and join a dialogue about where we go next. Register here.