Just a few years ago, the legal implications arising from the use of artificial intelligence (“AI”) were largely theoretical — now, the AI ecosystem has become a legal battleground.
In Canada and across the world, a wave of lawsuits is testing the boundaries of copyright, contracts, and innovation itself. As AI tools become more powerful and widespread, courts are being forced to answer critical questions: Who owns AI-generated content? Where does fair use end and infringement begin?
Given the rapid proliferation of AI tools — and the breakneck speed of AI development — these disputes are unsurprising. However, the outcomes of these cases will shape the future of AI development and how it is used.
1. Canadian Media Companies Sue OpenAI
Several of Canada’s leading media companies and news publishers, namely Toronto Star Newspapers Limited, Metroland Media Group Ltd., Postmedia Network Inc., PNI Maritimes LP, The Globe and Mail Inc., The Canadian Press Enterprises Inc., and the Canadian Broadcasting Corporation (collectively, the “Plaintiffs”), have banded together to sue OpenAI, Inc. and its affiliated companies (collectively, “OpenAI”).
The Plaintiffs’ Statement of Claim was filed in the Ontario Superior Court of Justice on November 28th, 2024, and represents the first time news publishers in Canada have united to commence litigation against OpenAI.
The Plaintiffs allege that OpenAI illegally accessed and used content published by them to train its AI language models, specifically ChatGPT. The Plaintiffs’ claim centres around four main grounds: (a) copyright infringement; (b) circumvention of technological protection measures (“TPMs”); (c) breach of contract; and (d) unjust enrichment.
Among other remedies, the Plaintiffs are seeking: (i) a legal declaration of liability against OpenAI concerning the claims mentioned above; (ii) damages and costs; and (iii) a permanent injunction ordering OpenAI to stop using the Plaintiffs’ content without the Plaintiffs’ prior written consent.
(a) Copyright infringement and Circumvention of Technological Protection Measures
The Plaintiffs allege that OpenAI infringed their rights under Canada’s Copyright Act.[1]
OpenAI has been candid about using techniques like “scraping” and retrieval-augmented generation (“RAG”) to train its AI models.[2] Scraping is a technique in which software methodically visits publicly available websites to locate and copy the desired information. RAG allows AI models to tap into additional datasets, beyond what they were originally trained on, before they generate a response, which helps the model give more accurate, up-to-date, and relevant answers. The Plaintiffs argue that although their content, which includes news articles, images, audio files, and videos, is widely available on their websites, OpenAI scraped and copied it to use as training data for its models without the required consent or the necessary licences.
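For readers unfamiliar with RAG, the retrieval step described above can be sketched in a few lines of Python. This is a minimal illustration only: the sample passages, the word-overlap scoring, and the function names (`retrieve`, `build_prompt`) are assumptions made for this example and do not describe how OpenAI's systems actually work.

```python
import re
from collections import Counter

# Hypothetical mini "knowledge base"; a real system would index far more text.
DOCUMENTS = [
    "The city council approved the new transit budget on Tuesday.",
    "Local bakery wins national award for its sourdough bread.",
    "Transit riders will see fare changes after the budget vote.",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def overlap_score(query, document):
    """Count how many word occurrences the query and the document share."""
    q, d = Counter(tokenize(query)), Counter(tokenize(document))
    return sum((q & d).values())


def retrieve(query, documents, k=2):
    """Return the k documents with the highest overlap score."""
    ranked = sorted(documents, key=lambda doc: overlap_score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Place the retrieved passages ahead of the question before it reaches the model."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


print(build_prompt("What happened with the transit budget?", DOCUMENTS))
```

In this sketch, the retrieved passages are copied verbatim into the text handed to the model, which illustrates why the source of those passages matters in disputes of this kind.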
Additionally, to retain the quality of their journalism and prevent unauthorized access to their content, several of the Plaintiffs had TPMs in place, such as paywalls, web-based exclusion protocols, and subscription and/or account-based access. It is alleged that OpenAI’s scraping and copying of the content at issue circumvented these TPMs.
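One common form of web-based exclusion protocol is a robots.txt file, in which a website states which automated crawlers may visit which pages. The sketch below assumes that this is the kind of protocol at issue (the article does not specify) and uses Python's standard `urllib.robotparser` module. The robots.txt content and the `news.example` addresses are hypothetical, and the crawler name "GPTBot" is used purely for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is excluded from the whole site,
# and every other crawler is excluded from the subscriber-only section.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /subscribers/
"""


def is_allowed(robots_txt, user_agent, url):
    """Report whether the robots.txt rules permit this crawler to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "GPTBot", "https://news.example/article-1"))        # False
print(is_allowed(ROBOTS_TXT, "OtherBot", "https://news.example/article-1"))      # True
print(is_allowed(ROBOTS_TXT, "OtherBot", "https://news.example/subscribers/a"))  # False
```

Note that this check is run by the crawler itself: the file expresses the publisher's instructions, and a crawler that does not perform the check is not technically prevented from fetching the page.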
The Plaintiffs assert that OpenAI knew, or should have known, that the materials accessed were protected by copyright, and its actions constituted copyright infringement.
(b) Breach of Contract
The Plaintiffs’ breach of contract claim is based on their publicly available “Terms of Use” agreements, which govern how visitors may use their websites and the content available on them. Anyone who visits one of the Plaintiffs’ websites, or registers for an account or subscription, agrees to the terms and conditions of the applicable Terms of Use agreement. It is alleged that by training its models using content scraped and copied from the Plaintiffs’ websites for its own commercial gain, OpenAI directly contravened a provision of the Terms of Use agreements stating that the content was only for personal, non-commercial use and could not be reproduced or used in any way beyond the limits specified without consent.
(c) Unjust Enrichment
News media companies, such as the Plaintiffs, contend that their published content holds considerable value, given the substantial skill, expertise, and investment of resources required to create it. The Plaintiffs allege that: (i) OpenAI unjustly profited from its unauthorized use of this content, and (ii) the Plaintiffs have been deprived of their ability to obtain compensation from OpenAI under a licensing framework or by any other means.
2. Comparison to Similar Claims in Canada and the US
(a) New York Times Sues OpenAI and Microsoft
Although the case against OpenAI is the first of its kind in Canada, OpenAI has been facing similar claims south of the border. Most notably, in December of 2023, the New York Times commenced a lawsuit against OpenAI and Microsoft on grounds analogous to the Canadian case against OpenAI.[3] Both the Canadian and US cases involve allegations of copyright infringement. However, their approaches differ: the Canadian case focuses on OpenAI’s data scraping practices, while the New York Times case focuses on ChatGPT’s ability to memorize content and reproduce it when prompted – sometimes with inaccuracies and hallucinations. The New York Times argues that such inaccuracies could cause commercial harm if incorrectly attributed to it.
OpenAI defended its actions as constituting “fair use” under the United States’ Copyright Act of 1976.[4] Fair use is a statutory defence to copyright infringement in the United States – the first cousin to Canada’s “fair dealing” defence, which has similar objectives, but is more restrictive as it requires specific categories of use. The fair use defence allows the use of copyrighted material without permission for certain limited purposes, such as education, parody, satire and criticism.
The doctrines of fair use and fair dealing exist to strike a balance between copyright protection and the public good of creativity, innovation and free expression. Under the fair use doctrine, whether a potentially infringing activity qualifies as fair use depends on a variety of factors, one of which is whether the allegedly infringing work is transformative or merely duplicative of the original work. In this case, the New York Times alleges that there is nothing “transformative” about using its content to train OpenAI’s generative AI models.
(b) OpenAI Successfully Defends Itself from Claims of Copyright Infringement
While the case brought by the New York Times is still ongoing, the District Court for the Southern District of New York recently dismissed a case against OpenAI commenced by the United States news outlet Raw Story Media (“Raw Story”).[5] Raw Story argued that OpenAI trained its AI models using Raw Story’s content and that there was a substantial likelihood that ChatGPT would reproduce that content verbatim without crediting Raw Story, which would violate Section 1202(b) of the Digital Millennium Copyright Act.
Ultimately, the case against OpenAI was dismissed because Raw Story failed to prove that ChatGPT was trained specifically on Raw Story’s materials, and the District Court held that ChatGPT’s model is trained on a vast and diverse dataset. The fundamental flaws of Raw Story’s claim were its failure to: (i) bring forth evidence showing that ChatGPT had in fact disseminated Raw Story’s copyrighted work; and (ii) demonstrate that it had been harmed in any way by OpenAI’s actions.
(c) Thomson Reuters is Successful in Proving Copyright Infringement
The recent case of Thomson Reuters Enterprise Centre GmbH v. Ross Intelligence Inc.[6] provides further context for the ongoing battles over AI training data. Thomson Reuters, the owner of Westlaw, sued ROSS Intelligence Inc. (“ROSS”) for copyright infringement after ROSS used Westlaw’s headnotes to train its AI model. These headnotes were concise summaries of court decisions that were available on Westlaw. ROSS had initially sought a licence from Thomson Reuters; however, Thomson Reuters refused. ROSS then hired a third-party service to create summaries using Westlaw’s headnotes, which were subsequently used to train its AI model.
On February 11th, 2025, a Delaware court ruled in favour of Thomson Reuters and rejected ROSS’ “fair use” defence. The court ruled that ROSS’ use of copyrighted material was not transformative, highlighting that ROSS did not use generative AI, but instead reproduced existing content.
(d) Toronto-Based AI Company Sued for Copyright Infringement
Several major U.S. media companies, along with the owner of the Toronto Star, have filed a federal copyright infringement suit in the Southern District of New York against Toronto-based AI startup Cohere.[7] In this suit, the plaintiffs allege that: (i) without permission, Cohere scraped their articles from the internet to train its language models; and (ii) Cohere’s AI services compete with and even mimic their offerings. The plaintiffs are seeking damages and an injunction to prevent Cohere from continuing this practice. This lawsuit places Cohere in a similar legal position to OpenAI, highlighting the risks that may arise when AI developers allegedly use third-party data without the appropriate consents or licences.
3. Similar Claims in Other Parts of the World
(a) France
The first legal action of its kind in France was commenced on March 12, 2025, by three trade groups: National Publishing Union, National Union of Authors and Composers, and Societe des Gens de Lettres against Meta. These trade groups allege that Meta uses their copyrighted content to train its AI models without their consent.
(b) India
One of India’s largest news agencies, Asian News International (“ANI”), commenced legal action against OpenAI on November 18, 2024. ANI alleges that: (i) OpenAI violated copyright laws by using ANI’s content to train its AI models without consent, and (ii) OpenAI’s models generate false information and hallucinations that could be incorrectly attributed to ANI. These allegations mirror those made against OpenAI in the US by the New York Times.
4. Key Legal Issues to Watch Out for
The trend we are currently seeing is nothing new — technological breakthroughs consistently outpace regulation, forcing courts to define legal boundaries after the fact. The rapid advancement of AI, combined with industry’s drive to innovate faster than lawmakers can respond, has created yet another legal battleground.
Companies developing AI models are pushing the limits of existing copyright, contract, and intellectual property laws, while rights holders fight to protect their content. As seen in past waves of technological disruption, from the early internet to digital streaming, these disputes will likely shape the future legal and technological landscape, setting new precedents that will determine how AI can be trained, deployed, and monetized.
From a practical standpoint, some key issues to watch out for include:
(a) Enforceability and Importance of Contractual Language
As AI use proliferates, it will be important to address AI-related issues head-on, whether in agreements with data providers, in agreements with solution providers, or in standard terms of use. How the courts interpret unilateral terms of use provisions in the battleground cases will shape contractual legal strategy moving forward.
(b) Interpreting Complex Concepts – Does Vector Embedding Constitute Copying?
These battleground cases present an interesting opportunity for the courts to apply longstanding concepts, such as “copying”, to innovative applications like vector embedding. AI data processing often employs vector embedding, which is the conversion of text into numerical codes called “vectors”. These vectors are then used to create the outputs we receive from large language models. The courts will have to thoroughly examine this technique to determine whether it constitutes “copying” under the applicable copyright law. This examination will be highly informative in demonstrating how copyright concepts can be applied to advancements in technological capabilities.
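To make the concept concrete, the following Python sketch converts short sentences into vectors and compares them. It is a deliberately simplified stand-in: it counts words against a small hypothetical vocabulary, whereas the embeddings used by large language models are learned by the model and have many more dimensions. The vocabulary, sentences, and function names are assumptions made for this illustration.

```python
import math
import re
from collections import Counter

# Hypothetical fixed vocabulary; each word is one dimension of the vector.
VOCABULARY = ["court", "ruling", "copyright", "weather", "rain"]


def embed(text, vocabulary):
    """Map text to a vector of word counts over the fixed vocabulary."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return [counts[word] for word in vocabulary]


def cosine_similarity(a, b):
    """Measure how closely two vectors point in the same direction (0 to 1 here)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


legal_a = embed("The court issued a copyright ruling.", VOCABULARY)
legal_b = embed("A ruling on copyright came from the court.", VOCABULARY)
weather = embed("Rain is expected; the weather will be cold.", VOCABULARY)

print(legal_a)                              # [1, 1, 1, 0, 0]
print(cosine_similarity(legal_a, legal_b))  # close to 1: same topic
print(cosine_similarity(legal_a, weather))  # 0.0: no shared vocabulary words
```

In this toy version, two differently worded sentences produce the same vector, and the original wording cannot be read back from the numbers alone. Whether that holds for real embeddings, and what it means for the legal concept of “copying”, is the question the courts will have to examine.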
(c) Predicting the Outcome of Ongoing Cases
Predicting the outcomes of these ongoing cases as they progress through the courts is difficult due to several factors: (i) key differences in copyright law across jurisdictions; (ii) the varying techniques used by AI models to reproduce content; (iii) the protective measures implemented by content creators; and (iv) the methods by which content is reproduced.
5. The Role of Regulation
The debate over AI regulation is increasingly defined by a fundamental tension between innovation and oversight. Some argue that regulation, fixed at a single point in time, risks handcuffing innovation as AI technology evolves at a pace too rapid for lawmakers to keep up. Others contend that the absence of clear regulations stifles progress by creating legal uncertainty, making it riskier for businesses to develop and deploy AI solutions.
This divide is playing out on the global stage. Recently, members of the new U.S. administration warned the European Union (“EU”) against taking an “excessive” regulatory approach to AI, arguing that overly rigid rules could slow technological advancement and hinder competitiveness. Shortly thereafter, the EU unexpectedly withdrew its AI Liability Directive, a proposal aimed at updating civil liability rules for AI-related harm. EU officials denied that the U.S. influenced this decision, and the move was framed as an effort to reduce regulatory burdens and encourage innovation.
This push-and-pull between regulation and progress underscores the broader uncertainty surrounding AI governance — uncertainty that is currently playing out in courtrooms around the world.
The information and comments herein are for the general information of the reader and are not intended as advice or opinion to be relied upon in relation to any particular circumstances. For particular application of the law to specific situations, the reader should seek professional advice.
[1] Copyright Act, RSC 1985, c C-42 at s. 27(1) [Copyright Act].
[2] Statement of Claim of Toronto Star Newspapers Limited et al. issued November 28, 2024, Court File No. CV-24-00732231-00CL [“Statement of Claim”] at paras 5 and 42.
[3] The New York Times Company v. Microsoft Corporation, 1:23-cv-11195, (S.D.N.Y. Dec 27, 2023).
[4] Section 107 of the US Copyright Act.
[5] Raw Story Media, Inc. v. OpenAI Inc., No. 24 CV 01514-CM, 2024 WL 4711729 (S.D.N.Y. Nov. 7, 2024).
[6] Thomson Reuters Enterprise Centre GmbH v. Ross Intelligence Inc., No. 1:20-CV-613-SB (D. Del. Feb. 11, 2025).
[7] Advance Local Media LLC v. Cohere Inc., 1:25-cv-01305, (S.D.N.Y. Feb 13, 2025).