Artificial Intelligence Impacts on Copyright Law

Matt Blaszczyk, Geoffrey McGovern, Karlyn D. Stanley

Expert Insights | Published Nov 20, 2024


Key Takeaways

  • Copyright protects only original human-authored works, including those made with AI assistance, but does not extend to works generated solely by AI.
  • Training of AI models is potentially fair use, implicating the interests of rights holders and the creative industry.
  • Legal uncertainty persists, especially regarding whether training new generative AI models is also permissible.
  • Some jurisdictions, such as the European Union, have enacted specialized legislation dealing with AI model training, notably allowing rights holders to object to the use of their works for commercial AI training.
  • A key issue is balancing rights holders' interests and remuneration with the objectives of fostering innovation, encouraging freedom of expression, and maintaining global competitiveness.

The impact of generative artificial intelligence (AI) technology on copyright is a subject of academic, legal, and policy debate, involving a variety of stakeholders, from artists to major newspapers to technology companies. Recently, the U.S. House of Representatives Judiciary Subcommittee on Courts, Intellectual Property, and the Internet has held several hearings on AI and copyright.[1] The U.S. Copyright Office (USCO) has been preparing several major reports concerning digital replicas, or the use of AI to digitally replicate individuals' appearances, voices, or other aspects of their identities; the copyrightability of works incorporating AI-generated material; and training AI models on copyrighted works, along with related licensing considerations and liability issues, including considerations of fair use.[2] The USCO will also publish an update to the Compendium of U.S. Copyright Office Practices, the administrative manual for registration.[3] This follows a public consultation that amassed more than 10,000 comments from artists, lawyers, teachers, publishers, and trade groups representing all 50 states and 67 foreign countries.[4] Generative AI and copyright have also been the subject of more than two dozen lawsuits, stakeholder roundtables led by the U.S. Federal Trade Commission,[5] and several bills proposed in Congress.[6] Separately, further developments regarding the right of publicity have taken place, with the USCO releasing a dedicated report.[7]

This paper presents three main questions regarding whether:

  1. works created with the use of AI are protectable under copyright law
  2. training of AI models on copyrighted works is allowed under U.S. law and in other jurisdictions, such as the European Union (EU)
  3. the most-recent developments in generative AI technology (including large language models [LLMs]), regarding both their training and outputs, are addressed by current copyright doctrine.

These questions remain open and hotly debated. This paper provides policymakers with legal insights on copyright law and AI from both the United States and abroad, explaining various positions to help balance competing interests. The interests at stake include providing incentives to authors and securing their rights, promoting innovation and the interests of the technology industry, maintaining global competitiveness of AI, and addressing the underlying issues of free expression, the practical difficulties involved, and the law's adaptability to new technological landscapes. This paper aims to spark dialogue on key aspects of AI governance without offering a comprehensive analysis or definitive recommendations, especially as AI applications proliferate worldwide and complex governance debates persist.

Are Works Created Using AI Protectable Under Copyright Law?

Generative AI systems can produce material that would be copyrightable if it were created by a human author. In the United States, copyright law developed from the Patent and Copyright Clause of the Constitution and later the Copyright Act of 1976; it requires all protectable works to be authored by a human being and to be original. The term originality means that a particular work is an author's own, not copied, and more than minimally creative.[8] These requirements do not set a high bar for protection — in fact, many people author multiple original works on a daily basis. Nonetheless, the courts have interpreted the requirements to mean that works generated by AI without human creative authorial contribution are not protectable. The courts have consistently held that "human authorship is a bedrock requirement of copyright."[9] Similarly, the USCO refuses to register copyright in works without a human author.[10]

This means that U.S. law requires AI to be merely an assisting instrument allowing authors to express their own conception.[11] The Copyright Registration Guidance provides that, "If a work's traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it."[12] Thus, simple instructions or prompts given to generative AI software that result in complex artwork will likely not be sufficient for the work to be protectable and registrable under current law.[13] For example, if a user instructs an AI to write a poem in the style of a famous artist, the expressive elements of the work will be produced by the AI rather than the user and thus will likely be unprotectable.[14] Nonetheless, the use of AI in the creation of a work is not an absolute bar to registration. As long as the software merely assists in authorial expression, the result might be protectable. Similarly, if an author creatively arranges the outputs of an AI system or edits them significantly, the end result also might be protectable.[15] As of February 2024, the USCO had issued registrations to "well over 100" AI-assisted works.[16] In attempting to delineate the difference between AI-assisted and AI-generated works, the USCO relies on applicants' disclosures; any use of AI that extends beyond minimal use needs to be disclosed. The USCO's examiners make a case-by-case analysis, with the conclusion ultimately depending on the degree of human involvement.[17] For example, using a spellcheck or an automatic filter in photography will clearly not present an obstacle to obtaining a copyright, whereas using a simple command to generate a series of works likely would not result in copyrightable content. It is less clear, however, how the law will approach works that fall in the middle of this continuum, where the lines between assistance and generation might blur. Finally, the U.S. approach presupposes that the user of the AI tool would be the author — not, for example, the AI programmers.

Many scholars agree with the USCO's policy and the principle upheld by the courts,[18] arguing that copyright is justified insofar as it promotes human authorial creative expression; where there is no human author, granting protection would unjustly deprive the public of freedom of expression and the freedom to use uncopyrighted resources.[19] Such an approach aligns, in principle, with the direction taken by the European Union and its member states,[20] as well as Japan and South Korea.[21] Other scholars are more critical of the U.S. approach to copyright protection, claiming that it could disincentivize development of AI foundation models or fail to account for the reality of AI-assisted creativity;[22] thus, they argue for a lower bar to copyright protection.[23] Indeed, some jurisdictions, such as the United Kingdom, allow for statutory protection of computer-generated works;[24] in China, courts have granted protection to AI-generated works, including ones arising from simple prompts.[25] Finally, the influence that private contractual arrangements will have on questions of ownership, commercial exploitation, and the ability to reproduce AI-generated works remains to be seen.[26]

Is the Training of AI Models on Copyrighted Works Allowed Under U.S. Law and in Other Jurisdictions?

Regulation of AI Training in the United States

Many AI models — including machine learning (ML) models and the new generation of generative models, such as the LLM-based ChatGPT and the image generators DALL·E, Midjourney, and Stable Diffusion — are trained on millions of materials available online, many of which are protected by copyright.[27] For AI models to work, they need to engage in text data mining (TDM), a computational process that allows AI to learn from data using statistical methods, structure the texts that it ingests, and reveal patterns.[28] When AI scrapes, downloads, and processes works, it might be infringing on the right of reproduction protected by copyright law — i.e., the rights holder's exclusive right to make copies of the work.[29] This right, however, is subject to limitation by the doctrine of fair use.[30]
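To make the idea of TDM more concrete, the short sketch below (a minimal, purely illustrative Python example; the corpus, names, and statistics are invented and bear no relation to any actual AI training pipeline) counts word frequencies and adjacent-word pairs across a handful of sentences. It illustrates the point made above: mining extracts aggregate statistical patterns from texts rather than reproducing their expressive prose.

```python
# Minimal, illustrative sketch of text data mining (TDM): the program
# reads texts and keeps only aggregate statistics (word frequencies and
# adjacent-word co-occurrences), not the original expressive prose.
# The corpus and names are hypothetical.
from collections import Counter
import re

corpus = [
    "The quick brown fox jumps over the lazy dog.",
    "A lazy afternoon by the quiet sea.",
    "The quick study of patterns reveals trends and correlations.",
]

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z']+", text.lower())

word_counts = Counter()
pair_counts = Counter()
for document in corpus:
    tokens = tokenize(document)
    word_counts.update(tokens)
    pair_counts.update(zip(tokens, tokens[1:]))  # adjacent-word pairs

# The "mined" result is statistical, not expressive: which words are
# frequent and which words tend to appear next to each other.
print(word_counts.most_common(3))
print(pair_counts.most_common(3))
```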

Fair use allows for copying, distribution, display, or performance without permission based on a four-factor test.[31] The fair use test is complicated, but it essentially asks about the following:

  1. the purpose and character of the use of a copyrighted work (e.g., is it commercial or nonprofit), including whether the use is "transformative" (i.e., whether it transforms the prior work into something with a new, different meaning or message)[32]
  2. the nature of the work
  3. the amount and substantiality of what is taken
  4. a more economic consideration of what effect the use will have on the potential market for or value of the copyrighted works (e.g., whether the new work will substitute for the original on the market).[33]

Litigation on these issues is pending, but it is unclear whether simple, broadly applicable legal answers can be expected.

Generally, if AI copying and processing of works are found to fall under the umbrella of fair use, technology companies can use the material as they wish; if such uses do not fall under that umbrella, then permission must be sought and payment made. How the fair use question is answered will affect the future of both the technology and creative industries; whether authors and publishers will be able to profit from, or refuse to allow, the use of their works for AI training; and, ultimately, how much and what kind of generative AI innovation appears in the United States.

One fair use theory that could allow for AI training is the principle of non-expressive use. Many scholars agree that TDM — which comes down to AI learning from the "non-expressive" elements of works (that is, extracting facts and statistical patterns rather than retaining the original, creative parts of works) — should be considered fair use and allowed under the law.[34] For example, ML (an older form of AI) works in the following way: When AI ingests images of beaches, it learns to identify the concept of a beach and to distinguish it from, say, a classroom. Such non-expressive technical elements of works, just like facts or ideas, are not copyrightable and thus cannot be infringed, but their processing is useful for AI to learn about the works it ingests.[35] In other words, the use is deemed transformative because the photographer of a beach and the AI owner use the photographs for entirely different purposes.[36] This principle of non-expressive use can be seen in several cases predating generative AI. One example is a plagiarism detection tool that copied original works to compare them with new ones.[37] A more important example is searchable databases, such as HathiTrust or Google Books, which allowed for TDM and browsing of snippets.[38] At the same time, it is not entirely clear whether rulings will go the same way in the currently pending, generative AI–specific litigation, for several reasons.[39] First, generative AI does not merely analyze training data as information; it is able to produce digital artifacts in the same form as its training data. Second, many generative AI outputs are direct competitors to the works on which the AI was trained. Third, generative AI is able to reproduce particular works with a high degree of similarity (for example, fictional characters, such as Mickey Mouse).[40] These issues are addressed further below.
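The beach-versus-classroom example can be made concrete with a toy model. The sketch below is a minimal illustration that assumes invented numeric features and labels rather than real images, and it is not how any production vision or generative system is built. It shows that what survives training in this kind of ML model is a handful of learned weights, i.e., statistical information about the concept of a beach, not copies of the photographs themselves, which is the intuition behind calling such uses non-expressive.

```python
# Toy illustration of "non-expressive" learning: the model keeps only
# learned numerical weights, not the training images themselves.
# Features and labels are invented for the example.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Pretend features extracted from images: [fraction_blue, fraction_sand_color]
X = np.array([
    [0.70, 0.60],  # beach photo
    [0.65, 0.55],  # beach photo
    [0.10, 0.05],  # classroom photo
    [0.15, 0.10],  # classroom photo
])
y = np.array([1, 1, 0, 0])  # 1 = beach, 0 = classroom

model = LogisticRegression().fit(X, y)

# What survives training is a small set of coefficients: statistical
# patterns about the concept "beach," not any particular photograph.
print("learned weights:", model.coef_, model.intercept_)
print("prediction for a new image:", model.predict([[0.8, 0.7]]))
```

Generative models complicate this picture precisely because, unlike this toy classifier, they can emit outputs in the same form as their training data.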

The development of the fair use doctrine in the area of generative AI involves not only legal analysis but also policy choices that could affect the shape of the AI industry and thus influence the direction of innovation and the distribution of costs and benefits across groups such as rights owners, technology companies, and users. Some argue that a broad interpretation of fair use is advantageous because it would allow for TDM in such sectors as the life sciences, linguistics, ML, and internet search engines (which rely on TDM heavily), thus supporting innovation and research and, ultimately, benefiting society.[41] Some argue that allowing TDM is crucial for generative AI models to exist because they could not otherwise be trained. Furthermore, some participants in the legal and policy debate claim that licensing of the works on which the AI is trained is an unrealistic proposal, given the size of datasets ingested by AI and the fact that one would need to obtain a license to both the database and the individual works contained in it.[42] Technology industry representatives have also argued that licensing entails a risk of disincentivizing innovation on the one hand and providing an incentive to sue for infringement on the other.[43] In other words, for generative AI to continue developing at a rapid pace, and for innovation and the technology sector to flourish, it might be important to treat TDM as fair use.

Nonetheless, there are competing interests, values, and perspectives. Authors' rights advocates emphasize that the fairness of a particular use must be decided on the facts of a particular case, claiming that TDM is not presumptively fair.[44] They further argue that, under the existing law, scraping of existing works should not be free, especially when done for commercial purposes.[45] Artists and creative professionals have also voiced concerns about the lack of "consent, compensation or control" when it comes to AI model training.[46] They argue that the loss of a licensing market is an important fair use consideration. In their view, large, for-profit companies are naturally suited to bear such costs, and licensing markets have already begun developing.[47] Some companies have chosen to enter into licensing agreements with rights holders, others are litigating,[48] and still others have decided to train their AI on the data they already possess.[49] Moreover, data obtained illegally online weigh against a fair use finding. Finally, some argue that the analysis should consider not just the intermediate purpose of training AI but also the ultimate purpose of creating new works.[50] The challenge of balancing different fair use considerations makes the standard difficult to apply generally — especially to the novel types of AI, including generative LLMs, as addressed below.

Regulation of AI Training Abroad

Given the ongoing global AI race and the transnational nature of the digital economy, the development of U.S. copyright policy cannot be considered in isolation. Even where the legal frameworks or objectives differ, it is important to understand the requirements that U.S. companies need to comply with to enter foreign markets; this is why scholars of regulatory competition often speak of the "Brussels effect," highlighting the influence of EU regulations abroad.[51] For example, Japan and Singapore provide TDM exceptions to the general protection of rights holders provided by copyright law.[52] In the United Kingdom, similar ambitions were scrapped on account of potential harm to the creative industries and a decision not to incentivize AI development at all costs, with the law explicitly allowing TDM only for research purposes, although this could be revisited.[53] Similarly, legal frameworks in Latin America and China are still being developed.[54] Scholars have argued that, on the one hand, developers of AI models could choose to locate their investments in the friendliest jurisdiction, such as the United States. On the other hand, they also call attention to the possibility that a greater international convergence of standards (or "regulatory race to the middle") is likely to develop, striking a global compromise between the different economic interests at play.[55]

The recently adopted EU AI Act (together with an earlier Directive) created a framework for dealing with TDM for AI.[56] The EU legislation contains two exceptions that allow TDM.[57] First, research organizations and cultural heritage institutions are free to use reproductions and extractions for the purposes of scientific research, provided they have lawful access to the works.[58] Copyright holders cannot opt out of or prevent such practices; nonetheless, they do have the right to apply measures ensuring the security and integrity of their networks and databases and to develop codes of practice. Second, when TDM is undertaken for nonresearch or commercial use, owners of copyrighted works can prevent the mining of those works by making an express reservation of rights in an appropriate manner.[59] Additionally, the EU AI Act imposes an obligation to implement technologies enabling providers of AI models to honor copyright holders' decisions to opt out and prevent data mining of their work.[60] Companies have already started using opt-out notices pursuant to the EU AI Act,[61] while AI providers have implemented opt-out processes.[62] The opt-out model is widely seen as a rights holder–friendly compromise, especially contrasted with the currently developing shape of fair use in the United States.[63] Others maintain that an opt-in solution should be pursued instead,[64] arguing that opt-outs pose an undue burden on rights holders or are unworkable in practice.[65]
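As an illustration of what honoring a machine-readable opt-out might look like in practice, the sketch below checks a site's robots.txt before mining it. This is a minimal sketch under stated assumptions: the EU rules do not prescribe robots.txt or any other specific technical standard, real compliance processes check additional signals (site terms, metadata, dedicated reservation protocols), and the crawler name and site are hypothetical.

```python
# Minimal sketch of honoring a machine-readable opt-out before crawling.
# Assumes a robots.txt-style signal; the EU rules do not prescribe one
# specific technical standard, and real systems check further signals.
from urllib import robotparser

def may_mine(url: str, crawler_name: str = "ExampleTDMBot") -> bool:
    """Return True only if the site's robots.txt permits this crawler."""
    parser = robotparser.RobotFileParser()
    parser.set_url(url.rstrip("/") + "/robots.txt")
    try:
        parser.read()
    except OSError:
        return False  # if the reservation cannot be checked, do not mine
    return parser.can_fetch(crawler_name, url)

if __name__ == "__main__":
    site = "https://example.com"  # hypothetical site
    print(f"Mining allowed for {site}: {may_mine(site)}")
```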

Although the European Union allows for TDM, it also puts an explicit obligation on providers of general-purpose AI models to implement policies complying with the law and any reservation of rights by copyright holders.[66] Importantly, the EU legislation further imposes significant transparency obligations, including the requirement to publish a detailed, comprehensive summary of content used for model training; this summary could include the main data collections or sets that went into training the model, such as large private or public databases or data archives, and a narrative explanation about other data sources used.[67] Compliance with those obligations will be monitored by the EU AI Office,[68] and public authorities will be able to impose fines and orders to withdraw AI models from the European market.[69] These obligations apply to all models that are placed on the EU market, regardless of where the training took place; thus, they could apply extraterritorially, including to U.S. companies.[70] While this is widely seen as a victory for rights holders, others argue that it might discourage AI providers from entering the EU market at all.[71] No such requirement exists so far in the United States, although the Generative AI Copyright Disclosure Act proposed by Democratic California Congressman Adam Schiff would impose a series of disclosure requirements.[72] Furthermore, content transparency and provenance concerns are emphasized in an AI Roadmap presented by Democratic New York Senator Chuck Schumer and the Bipartisan Senate AI Working Group and are featured in the newly proposed Content Origin Protection and Integrity from Edited and Deepfaked Media (COPIED) Act of 2024.[73] These concerns have also been emphasized by commentators in public consultations in the United States.[74]

LLM Training and Outputs in Light of Copyright Law

Fair use is often interpreted by legal commentators as allowing for non-expressive use to train AI models.[75] The newest generation of generative AI models poses a distinct problem, however. Systems such as ChatGPT, DALL·E, Midjourney, and Stable Diffusion can produce text, images, and music that are often indistinguishable from the works on which they are trained.[76] According to some legal scholars, the fact that these models create such works could undermine the claim that the use is fair.[77] Rights holders argue that LLMs fall afoul of the fourth fair use factor, effectively competing with artists' works and publishers' websites in the market, or outright substituting for the authors' works.[78] Creative industry advocates argue that training of a model should not be considered the end purpose of TDM; rather, the ultimate purpose is the generation of output that serves the same purpose as the ingested works, which weighs against a finding of fair use.[79] Although some artists compare LLMs to plagiarists or robbers,[80] other stakeholders highlight many social, economic, and consumer benefits that the technology seems to bring in such industries as art, medical research, and autonomous vehicles.[81] Perhaps, as some scholars note, the shape of the doctrine should depend on whether licensing solutions and data markets are developed sufficiently to justify rights holders' claims.[82]

In addition to the question of whether training AI models on unlicensed works and databases infringes copyright, there is the related question of whether the outputs of such generative models are infringing.[83] For both of these issues, a key technical question is how much AI models retain of the actual, expressive content of the works they were trained on, given that they appear able to re-create nearly exact copies of substantial portions of particular works.[84] Seemingly, LLMs do, at least sometimes, "memorize" works in their training data, as recent lawsuits allege;[85] in such cases, the AI communicates the original expression from the works it was trained on, which is suspect under the fair use framework.[86] There are already several highly publicized cases in which AI seemingly re-created whole articles;[87] images with a painter's signature or watermark;[88] or copyrightable characters, such as Snoopy or Mickey Mouse.[89] Although experts caution that these are rare instances and AI providers are taking steps to prevent them,[90] these cases might nonetheless undermine the claim to fair use if the outputs are substantially similar to the works on which the AI was trained.[91] It is not difficult to re-create copyrightable characters if users provide detailed prompts.[92] Finally, generative AI could also be used to mimic the style of artists, such as singers, illustrators, or writers. This makes the fair use argument less persuasive because the output is closer to substituting for the copyrighted material used in the training data than to transforming it.[93] At the same time, style has always been difficult for copyright doctrine to address,[94] sometimes being called unprotectable, making many cases of algorithmic reproduction allowable under the law — or, at least, difficult to analyze legally.[95] The USCO recently issued a report concluding that although artistic style is and should remain unprotectable under copyright, "there may be situations where the use of an artist's own works to train AI systems to produce material imitating their style can support an infringement claim."[96]
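One rough way to think about the memorization question is to measure verbatim overlap between a model's output and a known source text. The sketch below is a naive illustration using a longest-common-substring check on hypothetical strings; the memorization studies and lawsuits cited above rely on far more sophisticated extraction and matching methods, so this is only meant to convey the underlying idea.

```python
# Naive illustration of checking whether a generated passage reproduces
# a long verbatim span from a known source text. Strings are hypothetical;
# published memorization studies use far more sophisticated methods.
from difflib import SequenceMatcher

def longest_shared_span(source: str, output: str) -> str:
    """Longest contiguous run of characters common to both strings."""
    match = SequenceMatcher(None, source, output).find_longest_match(
        0, len(source), 0, len(output)
    )
    return source[match.a : match.a + match.size]

training_excerpt = "It was the best of times, it was the worst of times."
model_output = "As the old novel begins, it was the best of times, it was..."

span = longest_shared_span(training_excerpt, model_output)
print(f"Longest verbatim overlap ({len(span)} chars): {span!r}")
```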

In deciding AI fair use cases, courts might be swayed by a host of legal and policy arguments regarding generative AI model training — the supposed spread of misinformation, the perpetuation of biases, and the replacement of artists; increased productivity; new forms of creativity; and accelerated research.[97] Further arguments involve job displacement resulting from AI and questions of antimonopoly policy.[98] Some of these concerns might be too broad for copyright doctrine to address.[99] More generally, if the courts reject fair use for generative AI, they could halt innovation or push it offshore; if they accept fair use, they might divert economic gain from individual creators.[100] How the fair use doctrine will apply in this context has yet to be decided, and it is possible that cases could point in divergent directions.

Summary

Copyright law protects original works of human expression. It does not protect AI-generated works where a human makes little to no creative contribution, such as by typing a simple prompt, but it does protect works created with the use or assistance of AI. It is not yet clear how much creative input will be required to render an AI-assisted work protectable under copyright. Training of AI models is likely to be deemed legal if the AI model does not retain protectable expression from the works it ingests. Generative AI, such as LLMs, presents more-complex considerations, leading to a fact-specific inquiry into the source of training data, the purpose of the model, and the effect on licensing markets, whether existing or potential. These questions will be settled in litigation and might not yield uniform answers initially, though the issue likely will be resolved by either legislation or the Supreme Court. It is unclear whether legislative solutions pursued in other jurisdictions, such as the European Union, will influence domestic U.S. developments; they might, however, affect which policies U.S. companies implement. Finally, if global copyright standards continue to diverge, an expansive doctrine of fair use might allow the United States to remain a leader in international technological competition and attract investment in AI — at a cost to domestic rights holders.

Author Affiliations

Matt Blaszczyk is a research fellow at the University of Michigan Law School. Geoffrey McGovern is director of Intellectual Property and a senior political scientist at RAND. Karlyn D. Stanley is a senior policy researcher at RAND.

Notes

  • [1] U.S. House Judiciary Subcommittee on Courts, Intellectual Property, and the Internet, hearing on Artificial Intelligence and Intellectual Property: Part I, Interoperability of AI and Copyright Law, May 17, 2023a; U.S. House Judiciary Subcommittee on Courts, Intellectual Property, and the Internet, hearing on Artificial Intelligence and Intellectual Property: Part II, Copyright, July 12, 2023b; U.S. House Judiciary Subcommittee on Courts, Intellectual Property, and the Internet, hearing on Artificial Intelligence and Intellectual Property: Part II, Identity in the Age of AI, February 2, 2024a; U.S. House Judiciary Subcommittee on Courts, Intellectual Property, and the Internet, hearing on Artificial Intelligence and Intellectual Property: Part III, IP Protection for AI-Assisted Inventions and Creative Works, April 10, 2024b. Also see Christopher T. Zirpoli, Generative Artificial Intelligence and Copyright Law, Congressional Research Service, LSB10922, September 29, 2023.
  • [2] USCO, "Copyright and Artificial Intelligence," webpage, undated-b. As of October 30, 2024: https://www.copyright.gov/ai/; also see USCO, "Artificial Intelligence and Copyright," webpage, regulations.gov, undated-a. As of October 30, 2024: https://www.regulations.gov/docket/COLC-2023-0006/comments
  • [3] U.S. Copyright Office, Compendium of U.S. Copyright Office Practices, 2021; also see Nora Scheland, "Looking Forward: The U.S. Copyright Office's AI Initiative in 2024," Library of Congress Blogs, March 26, 2024. As of October 30, 2024: https://blogs.loc.gov/copyright/2024/03/looking-forward-the-u-s-copyright-offices-ai-initiative-in-2024/
  • [4] USCO, Copyright and Artificial Intelligence: Part 1, Digital Replicas, July 2024.
  • [5] U.S. Federal Trade Commission, "Federal Trade Commission Roundtable on Creative Economy and Generative AI," media advisory, October 3, 2023a; U.S. Federal Trade Commission, Generative Artificial Intelligence and the Creative Economy Staff Report: Perspectives and Takeaways, December 2023b.
  • [6] U.S. House of Representatives, Generative AI Copyright Disclosure Act of 2024, Bill 7913, April 9, 2024; U.S. Senate, Content Origin Protection and Integrity from Edited and Deepfaked Media Act of 2024, Bill 4674, July 11, 2024. As one reporter noted, "The bill is being praised by writers, artists, and other creators. But advocates of the technology argue the measure is impractical and unnecessary" (Kyle Jahner, "AI Copyright Bill Thrills Artists. Developers Call It Unworkable," Bloomberg, April 25, 2024).
  • [7] USCO, 2024.
  • [8] Feist Publications, Inc., v. Rural Telephone Service Co., 499 U.S. 340, 1991.
  • [9] Thaler v. Perlmutter, 687 F. Supp. 3d 140, 2023; also see USCO and Van Lindberg, "Zarya of the Dawn (Registration # VAu001480196)," letters, October 28, 2022, November 21, 2022, February 21, 2023. As of October 30, 2024: https://www.copyright.gov/docs/zarya-of-the-dawn.pdf; and Matt Blaszczyk, "Impossibility of Emergent Works' Protection in U.S. and EU Copyright Law," North Carolina Journal of Law & Technology, Vol. 25, No. 1, 2023a.
  • [10] For example, see Blaszczyk, 2023a.
  • [11] USCO, "Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence," Federal Register, Vol. 88 No. 16, March 16, 2023 (quoting U.S. Copyright Office, Sixty-Eighth Annual Report of The Register of Copyrights for the Fiscal Year Ending June 30, 1965, 1966).
  • [12] USCO, 2023.
  • [13] As USCO (2023) notes, "when an AI technology receives solely a prompt from a human and produces complex written, visual, or musical works in response, the 'traditional elements of authorship' are determined and executed by the technology — not the human user." Also see USCO, Copyright Review Board, "Second Request for Reconsideration for Refusal to Register SURYAST (SR # 1-11016599571; Correspondence ID: 1-5PR2XKJ)" letter to Alex P. Garens, December 11, 2023b. As of October 31, 2024: https://www.copyright.gov/rulings-filings/review-board/docs/SURYAST.pdf
  • [14] USCO, 2023; USCO, Copyright Review Board, 2023b.
  • [15] USCO, 2023; USCO, Copyright Review Board, 2023b. Also see USCO and Van Lindberg, 2022–2023; Dan L. Burk, "Thirty-Six Views of Copyright Authorship, by Jackson Pollock," Houston Law Review, Vol. 58, No. 263, 2020; and Pamela Samuelson, Christopher Jon Sprigman, and Matthew Sag, "Comments In Response To The Copyright Office's Notice Of Inquiry On Artificial Intelligence And Copyright," U.S. Copyright Office, November 1, 2023. As of October 31, 2024: https://www.regulations.gov/comment/COLC-2023-0006-8854
  • [16] Shira Perlmutter, USCO, letter to Senators Coons and Tillis and Representatives Issa and Johnson, February 23, 2024. As of October 31, 2024: https://copyright.gov/laws/hearings/USCO-Letter-on-AI-and-Copyright-Initiative-Update.pdf
  • [17] USCO, 2023; U.S. Copyright Office, Copyright Review Board, "Second Request for Reconsideration for Refusal to Register Théâtre D'opéra Spatial (SR # 1-11743923581; Correspondence ID: 1-5T5320R)" letter to Tamara Pester, September 5, 2023a. As of October 31, 2024: https://www.copyright.gov/rulings-filings/review-board/docs/Theatre-Dopera-Spatial.pdf
  • [18] For example, see James Grimmelmann, "There's No Such Thing as a Computer-Authored Work — And It's a Good Thing, Too," Columbia Journal of Law & the Arts, Vol. 39, No. 403, 2016a.
  • [19] For example, see Blaszczyk, 2023a; and Haochen Sun, "Redesigning Copyright Protection in the Era of Artificial Intelligence," Iowa Law Review, Vol. 107, No. 3, 2022.
  • [20] For example, see Alina Trapova, "Copyright for AI-Generated Works: A Task for the Internal Market?" European Law Review, Vol. 48, 2023; and Vojtěch Chloupek and Martin Taimr, "Czech Court Denies Copyright Protection of AI-Generated Work in First Ever Ruling," Insights, Bird & Bird, May 29, 2024. As of October 30, 2024: https://www.twobirds.com/en/insights/2024/czech-republic/czech-court-denies-copyright-protection-of-ai-generated-work-in-first-ever-ruling
  • [21] For example, see Yoshinori Okamoto and Jin Yoshikawa, "Is Japan Still a Machine Learning Paradise?" World Intellectual Property Review, April 18, 2024; and Republic of Korea, Ministry of Culture, Sports and Tourism, A Guide on Generative AI and Copyright, April 15, 2024. As of October 31, 2024: https://www.mcst.go.kr/english/policy/pressView.jsp?pSeq=391
  • [22] Angela Huyue Zhang, "China's Short-Sighted AI Regulation," Project Syndicate, December 8, 2023; also see Blaszczyk, 2023a; Edward Lee, "A Terrible Decision on AI-Made Images Hurts Creators," Washington Post, April 27, 2023; and Edward Lee, "Prompting Progress: Authorship in the Age of AI," Florida Law Review, April 10, 2024.
  • [23] For example, see Mark A. Lemley, "How Generative AI Turns Copyright Upside Down," Columbia Science and Technology Law Review, Vol. 25, No. 190, 2024.
  • [24] Matt Blaszczyk, "Contradictions of Computer-Generated Works' Protection," Kluwer Copyright Blog, November 6, 2023b. As of October 31, 2024: https://copyrightblog.kluweriplaw.com/2023/11/06/contradictions-of-computer-generated-works-protection/
  • [25] Wang and Zhang note that "the Beijing Internet Court (BIC) ruled in an infringement lawsuit (Li v. Liu) that an AI-generated image is copyrightable and that a person who prompted the AI-generated image is entitled to the right of authorship under Chinese Copyright Law . . . [while] recent US Copyright Office (USCO) rulings reach the opposite result" (Yuqian Wang and Jessie Zhang, "Beijing Internet Court Grants Copyright to AI-Generated Image for the First Time," Kluwer Copyright Blog, February 2, 2024. As of October 31, 2024: https://copyrightblog.kluweriplaw.com/2024/02/02/beijing-internet-court-grants-copyright-to-ai-generated-image-for-the-first-time/). For a critical analysis of China's pro-competitive strategy, see Angela Huyue Zhang, The Promise and Perils of China's Regulation of Artificial Intelligence, University of Hong Kong, Faculty of Law Research Paper No. 2024/02, 2024.
  • [26] European Innovation Council and SMEs Executive Agency, "Artificial Intelligence and Copyright: Use of Generative AI Tools to Develop New Content," European Commission, blog post, July 16, 2024. As of October 31, 2024: https://intellectual-property-helpdesk.ec.europa.eu/news-events/news/artificial-intelligence-and-copyright-use-generative-ai-tools-develop-new-content-2024-07-16-0_en
  • [27] As Sag defines it, "[m]achine learning refers to a cluster of statistical and programming techniques that give computers the ability to 'learn' from exposure to data, without being explicitly programmed" (Matthew Sag, "The New Legal Landscape for Text Mining and Machine Learning," Journal of the Copyright Society of the U.S.A., Vol. 66, 2019).
  • [28] Matthew Sag, "Copyright Safety for Generative AI," Houston Law Review, Vol. 61, No. 2, 2023a.
  • [29] Jane C. Ginsburg, "Fair Use in the US Redux: Reformed or Still Deformed?" Singapore Journal of Legal Studies, Vol. 2024, 2024.
  • [30] U.S. Code, Title 17, Section 106, Exclusive Rights in Copyrighted Works, provides the exclusive right to reproduce the work in copies. Compare this with U.S. Code, Title 17, Section 107, Limitations on Exclusive Rights: Fair Use, which provides that fair use is not infringement.
  • [31] 17 U.S.C. § 107.
  • [32] Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569, 1994. Also see Pierre N. Leval, "Toward a Fair Use Standard," Harvard Law Review, Vol. 103, No. 5, March 1990.
  • [33] 17 U.S.C. § 107.
  • [34] Mark A. Lemley and Bryan Casey, "Fair Learning," Texas Law Review, Vol. 99, No. 4, 2021; Sag, 2019; James Grimmelmann, "Copyright for Literate Robots," Iowa Law Review, Vol. 101, No. 2, 2016b.
  • [35] Baker v. Selden, 101 U.S. 99, 1880; Sony Computer Entertainment, Inc. v. Connectix Corp., 203 F.3d 596, 603, 9th Circuit, 2000.
  • [36] Lemley and Casey, 2021.
  • [37] A.V. ex rel. Vanderhye v. iParadigms, LLC, 562 F.3d 630, 634, 4th Circuit, 2009.
  • [38] Authors Guild, Inc. v. HathiTrust, 755 F.3d 87, 2nd Circuit, 2014; Authors Guild, Inc. v. Google, Inc., 804 F.3d 202, 2nd Circuit, 2015.
  • [39] Lemley and Casey (2021) note that "there is no guarantee that courts will extend this precedent to similar technologies or legal contexts." Also see David W. Opderbeck, "Copyright in AI Training Data: A Human-Centered Approach," Oklahoma Law Review, Vol. 76, No. 4, 2024.
  • [40] Matthew Sag and Peter K. Yu, "The Globalization of Copyright Exceptions for AI Training," Emory Law Journal, Vol. 74, 2025 (forthcoming). As of October 30, 2024: https://dx.doi.org/10.2139/ssrn.4976393
  • [41] Lemley and Casey, 2021. Also see Matthew L. Jockers, Matthew Sag, and Jason Schultz, "Don't Let Copyright Block Data Mining," Nature, Vol. 490, No. 7418, October 4, 2012.
  • [42] Lemley and Casey, 2021. Also see Computer & Communications Industry Association (CCIA), comment letter on the U.S. Copyright Office's Notice of Inquiry on Artificial Intelligence and Copyright, October 30, 2023.
  • [43] Andreessen Horowitz (a16z), comment letter on the U.S. Copyright Office's Notice of Inquiry on Artificial Intelligence and Copyright, October 30, 2023; also see Meta Platforms, Inc., comment letter on the U.S. Copyright Office's Notice of Inquiry on Artificial Intelligence and Copyright, October 30, 2023; Google, Inc., comment letter on the U.S. Copyright Office's Notice of Inquiry on Artificial Intelligence and Copyright, October 30, 2023; Electronic Frontier Foundation, comment letter on the U.S. Copyright Office's Notice of Inquiry on Artificial Intelligence and Copyright, October 30, 2023; and Anthropic PBC, comment letter on the U.S. Copyright Office's Notice of Inquiry on Artificial Intelligence and Copyright, October 30, 2023.
  • [44] See Copyright Alliance, comment letter on the U.S. Copyright Office's Notice of Inquiry on Artificial Intelligence and Copyright, October 30, 2023b: "[N]either the courts nor Congress have ever espoused a broad category of fair use called 'non-expressive use.'"
  • [45] Copyright Alliance, 2023b: "[T]here are successful AI developers that do not indiscriminately scrape the internet and ingest copyrighted works without permission. Simply because some AI developers want to use everything to train their systems doesn't mean that it is necessary, and their desire to . . . avoid licensing cannot be a 'compelling justification' that would weigh in favor of fair use."
  • [46] Mary Rasenberger, "As AI Is Embraced, What Happens to the Artists Whose Work Was Stolen to Build It?" Los Angeles Times, June 18, 2024. Also see Authors Guild, "More Than 15,000 Authors Sign Authors Guild Letter Calling on AI Industry Leaders to Protect Writers," press release, July 18, 2023a.
  • [47] Lemley and Casey, 2021; Jane C. Ginsburg, "In the Courts: The US Supreme Court's Warhol Decision Revisits the Boundaries of Fair Use," WIPO Magazine, November 2023; Gerrit de Wynck, "OpenAI Strikes Deal with AP to Pay for Using Its News in Training AI," Washington Post, July 13, 2023.
  • [48] Todd Spangler, "OpenAI Inks Licensing Deals to Bring Vox Media, The Atlantic Content to ChatGPT," Variety, May 29, 2024.
  • [49] Nico Grant and Cade Metz, "The Push to Develop Generative A.I. Without All the Lawsuits," New York Times, July 19, 2024.
  • [50] Copyright Alliance, 2023b (citing Am. Geophysical Union v. Texaco Inc., 60 F.3d 913, 924, 2nd Circuit, 1994).
  • [51] For example, see Anu Bradford, The Brussels Effect: How the European Union Rules the World, Oxford University Press, 2020; Charlotte Siegmann and Markus Anderljung, "How EU Regulation Will Impact the Global AI Market," arXiv, arXiv:2208.12645, August 23, 2022. Compare these with, for example, Almada and Radu, which says that "while the AI Act will likely produce a Brussels Effect of its own, such an outcome will be accompanied by a side effect that undermines the EU's ambition to spread legislative text and values in AI governance" (Marco Almada and Anca Radu, "The Brussels Side-Effect: How the AI Act Can Reduce the Global Reach of EU Policy," German Law Journal, Vol. 25, No. 4, February 2024) or with João Pedro Quintais, "Generative AI, Copyright and the AI Act," SSRN Electronic Journal, August 13, 2024.
  • [52] Regarding Japan, see Law of Japan, Copyright Act, Act No. 48, May 6, 1970, Section 30-4; also see Kalpana Tyagi, "Generative AI: Remunerating the Human Author & the Limits of a Narrow TDM Exception," Kluwer Copyright Blog, December 13, 2023. As of October 30, 2024: https://copyrightblog.kluweriplaw.com/2023/12/13/generative-ai-remunerating-the-human-author-the-limits-of-a-narrow-tdm-exception/. Regarding Singapore, see Pin-Ping Oh, "Potential Expansion of Singapore's TDM Exception?" Insights, Bird & Bird, April 26, 2024. As of October 30, 2024: https://www.twobirds.com/en/insights/2024/singapore/potential-expansion-to-singapores-tdm-exception; also see Bryan Tan and Hagen Rooke, "Text and Data Mining in Singapore," Perspectives, ReedSmith, February 5, 2023. As of October 30, 2024: https://www.reedsmith.com/en/perspectives/ai-in-entertainment-and-media/2024/02/text-and-data-mining-in-singapore
  • [53] "UK Withdraws Plans for Broader Text and Data Mining (TDM) Copyright and Database Right Exception," Notes, Herbert Smith Freehills, March 1, 2023. As of October 30, 2024: https://www.herbertsmithfreehills.com/notes/ip/2023-03/uk-withdraws-plans-for-broader-text-and-data-mining-tdm-copyright-and-database-right-exception; Mark Deem, Rachel Alexander, and Calum Smith, "Copyright and Generative AI: Government Confirms That New Legislation by the End of the Year May Be Needed to Resolve Uncertainty," Lexology, October 14, 2024; Tom Saunders, "New Law May Be Needed to End AI Copyright Disputes," The Times, October 2, 2024.
  • [54] Luca Schirru, Allan Rocha de Souza, Mariana G. Valente, and Alice de Perdigão Lana, "Text and Data Mining Exceptions in Latin America," International Review of Intellectual Property and Competition Law, September 19, 2024; Sag and Yu, 2025.
  • [55] Sag and Yu, 2025.
  • [56] The AI Act was published on July 12, 2024, and entered into force on August 1, 2024. European Union, "Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence and amending Regulations (EC) No 300/2008, (EU) No 167/2013, (EU) No 168/2013, (EU) 2018/858, (EU) 2018/1139 and (EU) 2019/2144 and Directives 2014/90/EU, (EU) 2016/797 and (EU) 2020/1828 (Artificial Intelligence Act) (Text with EEA relevance)," June 13, 2024, Recital 97; European Union, "Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market and amending Directives 96/9/EC and 2001/29/EC (Text with EEA relevance)," April 17, 2019.
  • [57] The EU legislation defines TDM as "any automated analytical technique aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations" (European Union, 2024).
  • [58] European Union, 2024 (Directive on Copyright in the Digital Single Market, Article 3).
  • [59] European Union, 2024 (Directive on Copyright in the Digital Single Market, Article 4).
  • [60] European Union, 2024 (Article 53 (1) (c)).
  • [61] Sony Music, "Declaration of AI Training Opt Out," May 16, 2024.
  • [62] Kali Hays, "OpenAI Offers a Way for Creators to Opt Out of AI Training Data. It's So Onerous That One Artist Called It 'Enraging,'" Business Insider, September 29, 2023.
  • [63] For example, see CCIA, 2023, which says, "Because ingestion is a fair use, no affirmative consent is required by law."
  • [64] Steve Brachmann, "Amid Approval of EU AI Act, Creators Demand Stronger Protections for Rightsholders," IPWatchdog, March 17, 2024; Philipp Hacker, "The European AI Liability Directives; Critique of a Half-Hearted Approach and Lessons for the Future," Computer Law & Security Review, Vol. 51, November 2023; Kristian Stout, "Systemic Risk and Copyright in the EU AI Act," Truth on the Market, March 19, 2024.
  • [65] Virginie Berger, "4 Lessons on Training AI from the Data Debacles with Sony, Scarlett Johansson and More," Forbes, May 30, 2024. Appel, Neelbauer, and Schweidel have observed that "companies should require the creator's opt-in rather [than] opt-out" (Gil Appel, Juliana Neelbauer, and David A. Schweidel, "Generative AI Has an Intellectual Property Problem," Harvard Business Review, April 7, 2023).
  • [66] European Union, 2024 (recitals 104–106).
  • [67] European Union, 2024 (Article 53 (1) (d)).
  • [68] European Union, 2024 (recital 108) says "the AI Office should monitor whether the provider has fulfilled those obligations without verifying or proceeding to a work-by-work assessment of the training data in terms of copyright compliance."
  • [69] Paul Goldstein, Christiane Stuetzle, and Susan Bischoff, "Copyright Compliance with the EU AI Act — Extraterritorial Traps for the Unwary," Bloomberg Law, June 2024.
  • [70] Goldstein, Stuetzle, and Bischoff, 2024. Also see Axel Spies, Ron N. Dreben, Andrew J. Gray IV, Meaghan Kent, Vishnu Shankar, and Mike Pierides, "EU AI Act: How Far Will EU Copyright Principles Extend?" Morgan Lewis, February 12, 2024.
  • [71] Martin Senftleben, AI Act and Author Remuneration — A Model for Other Regions? SSRN, February 24, 2024. As of October 30, 2024: https://dx.doi.org/10.2139/ssrn.4740268
  • [72] U.S. House of Representatives, 2024.
  • [73] Bipartisan Senate AI Working Group, Driving U.S. Innovation in Artificial Intelligence: A Roadmap for Artificial Intelligence Policy in the U.S. Senate, May 2024; U.S. Senate, 2024.
  • [74] Center for AI and Digital Policy, "Copyright Office — Copyright and Artificial Intelligence," webpage, undated. As of October 30, 2024: https://www.caidp.org/public-voice/copyright-office-us-2023/
  • [75] Lemley and Casey, 2021; Ginsburg, 2023; de Wynck, 2023; Opderbeck, 2024.
  • [76] Sag, 2023a.
  • [77] Benjamin L. W. Sobel, "Artificial Intelligence's Fair Use Crisis," Columbia Journal of Law & the Arts, Vol. 41, No. 1, 2017.
  • [78] Sobel, 2017; also see News Media Alliance, White Paper: How the Pervasive Copying of Expressive Works to Train and Fuel Generative Artificial Intelligence Systems Is Copyright Infringement and Not a Fair Use, 2023; and Authors Guild, Inc., v. Google, Inc., 2015.
  • [79] See Copyright Alliance, comments on Artificial Intelligence and Copyright before the U.S. Copyright Office, Docket No. 2023–6, 2023a.
  • [80] For example, see Andrew Orlowski, "AI and the Great Data Robbery," The Critic, May 17, 2024.
  • [81] Chamber of Progress, comment letter on the U.S. Copyright Office's Notice of Inquiry on Artificial Intelligence and Copyright, October 30, 2023; Consumer Technology Association, comment letter on the U.S. Copyright Office's Notice of Inquiry on Artificial Intelligence and Copyright, October 30, 2023.
  • [82] Ginsburg, 2024.
  • [83] Ginsburg, 2024, notes that "potentially pertinent caselaw couples the inputs to the outputs, excusing the former if, among other considerations, they are necessary to the production of non-infringing outputs."
  • [84] A. Feder Cooper and James Grimmelmann, "The Files Are in the Computer: On Copyright, Memorization, and Generative AI," Chicago-Kent Law Review, 2025 (forthcoming), SSRN, 2024. As of October 30, 2024: https://ssrn.com/abstract=4803118
  • [85] Cooper and Grimmelmann, 2025.
  • [86] Sag (2023a) adds: "If LLMs just took expressive works and conveyed that same expression to a new audience with no additional commentary or criticism, or no distinct informational purpose, that would be a very poor candidate for fair use."
  • [87] Kalpana Tyagi, "The Copyright, Text & Data Mining and the Innovation Dimension of Generative AI," Journal of Intellectual Property Law and Practice, Vol. 19, No. 7, 2024.
  • [88] Ginsburg, 2024.
  • [89] Sag, 2023a.
  • [90] Sag, 2023a.
  • [91] For a short outline of the infringement analysis, see DLA Piper, "Substantial Similarity in Copyright: It Matters Where You Sue," December 22, 2022.
  • [92] Matthew Sag, "Fairness and Fair Use in Generative AI," Fordham Law Review, Vol. 92, No. 5, 2023b.
  • [93] Lemley and Casey, 2021. In a comment letter to the USCO, the Center for AI and Digital Policy noted that "if AI can replicate [artists'] signature style en masse, it might undermine the market value of their creations, unjustly depriving them of economic benefits" (Center for AI and Digital Policy, comment letter on the U.S. Copyright Office's Notice of Inquiry on Artificial Intelligence and Copyright, October 30, 2023). The Authors Guild similarly observed that, "Suddenly, we see people using generative AI to generate texts in the style of authors . . . . [W]e have already seen someone write the last two novels in George R.R. Martin's A Song of Ice and Fire (Game of Thrones) series" (Authors Guild, comment letter on the U.S. Copyright Office's Notice of Inquiry on Artificial Intelligence and Copyright, October 30, 2023b).
  • [94] Guadamuz poses the question: "[A]re styles protected by copyright? Roughly speaking no, a style is more of an idea, and copyright does not protect an idea, only the expression of that idea, this is because protecting an idea would potentially lead to monopolies" (Andres Guadamuz, "A Scanner Darkly: Copyright Infringement in Artificial Intelligence Inputs and Outputs," GRUR International, Vol. 73, No. 2, 2024).
  • [95] Benjamin L. W. Sobel, "Elements of Style: Copyright, Similarity, and Generative AI," Harvard Journal of Law & Technology, Vol. 38, 2024 (forthcoming), SSRN, May 22, 2024.
  • [96] USCO, 2024.
  • [97] Sag, 2023b.
  • [98] Open Markets Institute, comment letter on the U.S. Copyright Office's Notice of Inquiry on Artificial Intelligence and Copyright, October 30, 2023.
  • [99] For example, see Corynne McSherry, "Generative AI Policy Must Be Precise, Careful, and Practical: How to Cut Through the Hype and Spot Potential Risks in New Legislation," Electronic Frontier Foundation, July 7, 2023.
  • [100] Sobel, 2017.

Document Details

Citation

RAND Style Manual

Blaszczyk, Matt, Geoffrey McGovern, and Karlyn D. Stanley, Artificial Intelligence Impacts on Copyright Law, RAND Corporation, PE-A3243-1, November 2024. As of April 30, 2025: https://www.rand.org/pubs/perspectives/PEA3243-1.html

Chicago Manual of Style

Blaszczyk, Matt, Geoffrey McGovern, and Karlyn D. Stanley, Artificial Intelligence Impacts on Copyright Law. Santa Monica, CA: RAND Corporation, 2024. https://www.rand.org/pubs/perspectives/PEA3243-1.html.

This research was sponsored by RAND Institute for Civil Justice and conducted in the Justice Policy Program within RAND Social and Economic Well-Being and the Science and Emerging Technology Research Group within RAND Europe.

This publication is part of the RAND expert insights series. The expert insights series presents perspectives on timely policy issues.

This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited; linking directly to this product page is encouraged. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial purposes. For information on reprint and reuse permissions, please visit www.rand.org/pubs/permissions.

RAND is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.