Determining Copyright Infringement in the Training of Generative Artificial Intelligence Models

Journal of Qujing Normal University, 2025, Vol. 44, Issue 5: 98-104.

DONG Kaiyu (董凯宇), DU Aiping (杜爱萍)

School of Law and Sociology, Yunnan Normal University, Kunming, Yunnan 650500, China

Received: 2025-09-01 Published: 2025-09-26 Online: 2025-11-05
About the author: DONG Kaiyu is a master's student at the School of Law and Sociology, Yunnan Normal University, whose research focuses on civil and commercial law.

Abstract

Abstract: As generative artificial intelligence technology develops, infringement arising from the use of copyrighted works in model training has become a focal issue. Drawing on typical domestic and foreign cases, this paper examines three aspects: the determination of fair use, the principle of liability attribution, and the allocation of liability. On fair use, Chinese cases have not yet addressed whether using works to train large models qualifies, while US courts have shifted from treating “intermediate copying” as fair use to finding infringement. On liability attribution, given the breadth of the data involved, the technical complexity, and the need to encourage innovation, a fault-based liability principle should be adopted so as not to overburden technology development. On the allocation of liability, platforms in data-training scenarios can be broadly interpreted as a new type of “network service provider”, to which the “safe harbor” principle applies by analogy, and infringement can be determined using the “substantial similarity + possibility of access” standard together with algorithmic comparison techniques. The paper also draws on current mainstream theories from various countries, aiming to inform the improvement of legal regulation and the development of the industry.

Keywords: generative artificial intelligence, model training, copyright, determination of infringement

CLC number: C923.7

DONG Kaiyu, DU Aiping. Determining Copyright Infringement in the Training of Generative Artificial Intelligence Models[J]. Journal of Qujing Normal University, 2025, 44(5): 98-104.