Commissioners from four of Canada's privacy watchdogs have found that OpenAI violated Canadian privacy laws while developing and training its early models of ChatGPT.
OpenAI gathered "vast amounts of personal information," potentially including details like health conditions, political views, or information about children.
At a news conference on Wednesday, Philippe Dufresne, Canada's privacy commissioner, was joined by his provincial counterparts from British Columbia, Alberta, and Québec to announce the findings of a joint investigation into the tech giant. The investigation examined how OpenAI sourced training data for its early GPT-3.5 and GPT-4 models, which included scraped content from publicly accessible internet sources like social media and blog posts, licensed third-party sources like media outlets and stock image vendors, and user interactions with ChatGPT.
Leveraging "extensive written representations" from OpenAI's legal counsel, interviews with OpenAI employees, internal testing on ChatGPT by the Office of the Privacy Commissioner (OPC), and publicly accessible sources like studies published by OpenAI and other AI experts, regulators focused on whether OpenAI had followed federal and provincial privacy legislation principles like consent, transparency, and data accuracy when collecting data.
Launched in 2023 on the heels of a complaint alleging OpenAI had collected, used, and disclosed personal information without consent, the investigation came well before OpenAI came under scrutiny in Canada following a deadly mass shooting in Tumbler Ridge, BC. Families of the victims of that shooting are taking OpenAI to court; the company had banned the shooter's account for "disturbing content," yet did not tip off law enforcement about any potential dangers.
Following the Tumbler Ridge shooting, Canada's innovation minister, Evan Solomon, spoke with OpenAI CEO Sam Altman, saying the tech mogul expressed "horror and responsibility" regarding the shooting. After their conversation, OpenAI agreed to strengthen its "law enforcement referral criteria" and include Canadian mental health and law experts in its safety office, where the company assesses threats and decides whether to inform police.
No consent to use personal data
At Wednesday's press conference, Dufresne noted that all four regulators found OpenAI had violated various federal and provincial privacy laws, including the federal Personal Information Protection and Electronic Documents Act (PIPEDA), and its provincial counterparts in Alberta, BC, and Québec.
PIPEDA regulates how businesses collect, use, or disclose personal information during commercial activity. It operates on several "fair information principles" that include obtaining consent for data collection, among other stipulations. Parallel provincial legislation, like Alberta and BC's Personal Information Protection Acts (PIPA) and Québec's Law 25, mandates similar requirements.
Among their key findings, regulators concluded that OpenAI gathered "vast amounts of personal information" for use in training data. That data could potentially include sensitive information and details like health conditions, political views, or information about children.
The regulators also found the tech company did not obtain valid consent for the collection of personal information, a key plank of PIPEDA and other Canadian privacy legislation, and that there was not adequate transparency, with many users unaware their data was collected and used to train OpenAI's chatbot.
"Our investigation determined that the manner in which OpenAI initially collected personal information from publicly accessible websites and licensed third-party sources to train the models was overbroad and therefore inappropriate," an overview of the investigation says. "We came to this determination considering the scale, nature, and varying levels of sensitivity of the personal information collected and used from those sources."
The privacy watchdogs also found that OpenAI had not provided individuals with "an easily accessible and effective mechanism to access, correct, and delete their personal information," and that it released ChatGPT without having fully addressed known privacy risks and without data-deletion rules.
A full accounting of the report and its findings can be found here.
OpenAI commits to changes
Dufresne said that throughout the investigation, OpenAI engaged in good faith and took measures to address the regulators' concerns. As a result, the federal privacy office considers the investigation to be "conditionally resolved." Québec's Commission d'accès à l'information du Québec has labelled the investigation as conditionally resolved on several points, but unresolved on the issue of consent. British Columbia's and Alberta's findings label the investigation as unresolved under provincial PIPA requirements. Both provincial regulators noted OpenAI's efforts to improve compliance.
OpenAI has committed to several measures to address the regulators' concerns, including implementing a filtering tool to detect and mask personal information like names and phone numbers in publicly accessible datasets, enhancing correction and deletion protocols, and implementing a formal retention policy governing personal information.
The company has also committed to several time-sensitive conditions, linked to the publication of the watchdogs' report. They include:
- Within three months, adding a notice to the signed-out web version of ChatGPT that tells users their chats may be reviewed and used to train models, and advising them not to share sensitive information.
- Within six months, making it easier to understand and use the data exports that it provides to users who request their personal information. The company will also better explain the avenues available to users who want to challenge the completeness, accuracy, or nature of the information provided.
- Within six months, confirming to the privacy commissioners' offices that it has implemented strong protections for future datasets that are retired and used only as historical references, so they are not used for active model development, and that it regularly reviews whether those datasets should still be retained.
- Within six months, testing protective measures for the minor family members of public figures, who are themselves not public figures, to ensure that the models refuse requests for their name or date of birth.
The company will also provide quarterly reports to the Office of the Privacy Commissioner and provincial partners until these commitments have been met.
It is unclear at this time what efforts the tech company must undertake to resolve Alberta and British Columbia's complaints.
BetaKit reached out to OpenAI for comment on the report's findings, but it did not respond to our request by press time.
Canada's privacy laws must change
While much of the announcement focused on OpenAI, regulators also stressed that Canadian privacy laws need significant changes to reflect the realities of a rapidly changing technological landscape.
Canada's privacy legislation hasn't been meaningfully updated in more than 40 years; Ottawa announced this spring that it has launched a review of the Privacy Act with the intent of modernizing it. Canadians are also awaiting the launch of the country's AI strategy, which was initially slated for late 2025.
"This investigation also further reinforces the need to modernize Canada's privacy laws for the digital age," Dufresne said. "While current laws apply to AI, updated laws would help further support the safe deployment of new technologies to protect Canadians' fundamental right to privacy."
Specifically, commissioners cited the challenges that AI, and the internet broadly, pose in meeting consent requirements as currently legislated. Michael Harvey, the BC privacy commissioner, said he has written to BC's minister of citizen services to encourage modernization of its legislation.
"We're left at an impasse: on one hand, AI applications have potentially transformative benefits, but in certain cases, such as the one before us, applications are developed without adequate privacy," he said. "On the other hand, those privacy laws were written for a different era and are strained to the brink. Both companies and the law have to change."
Alberta commissioner Diane McLeod echoed those sentiments, saying that legislation needed to confront the realities of the digital age.
"The methods companies are using, scraping data from publicly accessible websites, could never be carried out in ways that would meet the consent requirements of [Alberta's] PIPA," she said. "My office has advocated for some time that changes be made to PIPA to allow for tech and innovation but still provide privacy safeguards.
"Consent-based protections, for example, may no longer be feasible in an age where technology companies have easy access to so much information about individuals on the internet. Other options must be found," she added.

In a statement issued Wednesday afternoon, Solomon mirrored the regulators' comments, saying the report's findings underscored "the importance of protecting Canadians' personal information in the age of AI." He added that modernizing Canada's privacy laws "remains a priority" for the federal government.
BetaKit's Prairies reporting is funded in part by YEGAF, a not-for-profit dedicated to amplifying business stories in Alberta.
Feature image courtesy TechCrunch. Licensed under Creative Commons Attribution 2.0 Generic (CC BY 2.0).
