The performance and accuracy of artificial intelligence (AI) systems largely depend on the training data used to develop their models. These data sets, often vast and varied, are essential for enabling AI systems to “understand” and accomplish complex tasks.
However, their use raises significant legal questions, particularly regarding intellectual property, personal data protection, and contract management. In this article, we explore the primary legal issues related to the use of training data for AI within contractual agreements.
Are training data protected by intellectual property rights?
The question of data ownership is central to any AI project. Companies using data from third-party sources—whether databases, images, texts, or other types of content—must ensure they have the necessary rights to use them.
Consequently, AI system providers should consider entering into licensing agreements to avoid any copyright infringement on content used by their AI systems. For instance, in December 2023 The New York Times took legal action against OpenAI, the creator of ChatGPT, alleging copyright infringement.
Are there exceptions allowing the use of copyrighted works without prior authorization?
The Belgian Code of Economic Law and Directive (EU) 2019/790 on copyright in the Digital Single Market provide an exception to the author’s exclusive rights: lawfully accessible copyrighted works may be reproduced for the purpose of text and data mining, provided that the rightholder has not expressly reserved that use.
In other words, an AI system creator may leverage third-party data accessible on the internet as long as those third parties have not explicitly objected to this use, for example, through specific clauses in their website’s terms of service.
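As an illustration of how such an objection might be checked in practice, the sketch below reads a site's robots.txt rules with Python's standard urllib.robotparser module. It rests on assumptions that go beyond this article: that the site expresses its reservation through robots.txt (the article itself only mentions terms of service), and that the crawler identifies itself as "ExampleAIBot", a hypothetical name. It is a technical pre-check only; the website's terms of service still need to be read.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse rules from a string so the example runs without network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: the site blocks one AI crawler and allows all others.
EXAMPLE_ROBOTS = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    page = "https://example.com/articles/1"
    print(may_fetch(EXAMPLE_ROBOTS, "ExampleAIBot", page))  # False: crawler is blocked
    print(may_fetch(EXAMPLE_ROBOTS, "OtherBot", page))      # True: falls under "*"
```

In a real pipeline the rules would be downloaded from the site rather than hard-coded, and a blocked result would be treated as a signal to exclude the content or seek a licence.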
What about training data protected by the GDPR?
The GDPR imposes strict requirements for the use of personal data, including in AI projects. If training data includes information that can identify individuals, companies must ensure they comply with data protection rules, particularly concerning consent and transparency.
If such data is used, the provider of an AI system must ensure that it enters into data collection contracts guaranteeing that personal data has been collected lawfully and in compliance with the GDPR.
Our advice:
In the era of big data, AI system providers may be tempted to use data on a massive scale without addressing the associated legal issues. However, developing AI systems without appropriate contractual safeguards entails substantial legal risks.
It is therefore essential for AI system providers to verify the legal status of the data and content used to train and operate their systems, so that their right to use them is secured.
Lexing and its Creactivity department can assist you in drafting contracts related to the use of AI systems. For any questions, our team is at your disposal. Contact us now!
Sign up for our earlegal training course “Framing your AI projects: contractual issues” on 13 December 2024! We will address the following questions:
- What are the main contractual aspects to consider at the AI development stage?
- What are the contractual aspects to be taken into account at the AI marketing stage?
- How to ensure proper management of AI-related incidents in the contractual chain?
- How to contractually manage your compliance with the AI regulation?