Powerful AI systems can now pass law and medical licensing exams, write software, and answer PhD-level science questions. But they still struggle to count objects in images, reason about physical space, and recover from basic workflow errors.
That analysis comes from a new report chaired by Canadian AI godfather Yoshua Bengio, which found that, over the past year, AI capabilities continued to rapidly improve—especially in areas like mathematics and coding—while still failing to complete seemingly simple tasks.
The second annual International AI Safety Report, released on Tuesday, also indicates that AI-generated hallucinations (false statements presented as fact) remain a problem, as does performance in languages other than English, which are typically underrepresented in training datasets.
Meanwhile, an “evaluation gap” has emerged: some existing methods for measuring model performance do not reliably reflect how AI systems perform in real-world settings, due to data contamination or a focus on a narrow set of tasks. This makes the potential impact of those systems harder to assess, the report found.
Authored by over 100 AI experts and backed by more than 30 countries and organizations, the report reviews the latest scientific research on the capabilities and risks of AI. Bengio’s office claims it represents the largest global collaboration on AI safety to date, and its findings will inform discussions at the AI Impact Summit in India later this month.
What is also clear, according to the report, is that AI-powered fraud, scams, and cyberattacks are on the rise, as AI has proven increasingly adept at discovering software vulnerabilities and writing malicious code.
AI systems are also being used more frequently to generate non-consensual sexual deepfakes that disproportionately target women and girls, a practice for which Grok, the AI system of American social media platform X, has come under fire.
RELATED: Grok’s non-consensual sexual images highlight gaps in Canada’s deepfake laws
Another continued safety concern is the use of AI to provide information on how to develop biological and chemical weapons. While multiple developers have implemented safeguards to prevent this, the report’s authors say it is difficult to determine to what degree those safeguards will constrain bad actors.
PwC Canada’s Trust in AI report, published yesterday, indicates that not all Canadian organizations are ready to reckon with some of the risks associated with AI. While 72 percent consider AI a top priority, 36 percent still have no dedicated governance function, and many are operating in a “dangerous comfort zone” of partial implementation.
As to what the coming years of AI progress might look like, the International AI Safety Report’s authors anticipate continued improvement based on current trends. “Whether capabilities will continue to improve as quickly as they recently have is hard to predict,” the report states.
The report’s authors think it is “plausible” that, between now and 2030, AI progress slows or plateaus due to bottlenecks in data or energy, continues at its current pace, or accelerates dramatically if AI proves capable of speeding up AI research itself.
Feature image courtesy Wikimedia Commons. Photo by Xuthoria.
