By: Andrés Izquierdo
This post was originally published at Oxford University, Faculty of Law, Oxford Business Law Blog, available here.
On February 5, 2020, the US Copyright Office and the World Intellectual Property Organization (WIPO) co-sponsored the event ‘Copyright in the Age of Artificial Intelligence’ in Washington, DC. The full-day event took a seriously in-depth look at the current relationship between artificial intelligence (AI) and copyright. Participants included visual artists, audiovisual producers, music composers and executives, software developers, guilds of diverse artistic interests, and people developing AI; and, it must be said, perhaps half the room was filled with copyright lawyers.
One of the main themes of the event was the scope of data access for the development of artificial intelligence: how much legal leeway (copyright exceptions and limitations) should be allowed for the collection of data (Text and Data Mining, or TDM) for machine learning processes. This topic was addressed recently in Europe with the Digital Single Market (DSM) Directive, and in the United States with the HathiTrust (2014) and Google Books (2015) cases. In both regions, however, there is a high degree of uncertainty about how to apply these norms and rulings on an everyday basis.
Some people might wonder: what is the real significance of TDM? In short, data is the food that artificial intelligence needs to become smarter and smarter. But this data-eater needs lots of it. The more information the AI consumes, the smarter it becomes. And that is where the problem lies. Some people do not want computers to become Skynet, and others just haven’t seen or appreciated the pure beauty of James Cameron’s Terminator 2.
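For readers unfamiliar with what text and data mining actually does, a minimal sketch may help. The function and the tiny corpus below are purely illustrative assumptions, not anything discussed at the event: the point is that TDM typically reduces expressive works to aggregate, non-expressive statistics (here, word frequencies), which is the kind of transformation at the heart of the fair use rulings.

```python
from collections import Counter
import re

def mine_word_frequencies(documents):
    """Reduce a corpus of texts to aggregate word counts.

    The expressive works themselves are discarded; only
    non-expressive statistics (word frequencies) remain --
    the sort of output a machine learning pipeline consumes.
    """
    counts = Counter()
    for text in documents:
        # Tokenize crudely: lowercase words and apostrophes only.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts

# Hypothetical two-sentence "corpus", for illustration only.
corpus = [
    "The robot reads the book.",
    "The book teaches the robot.",
]
stats = mine_word_frequencies(corpus)
print(stats.most_common(2))  # → [('the', 4), ('robot', 2)]
```

Real TDM pipelines operate at vastly larger scale, but the legal question is the same: the copies made along the way are of expressive works, even if only statistics survive.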
AI is starting to occupy an increasingly important place in the 21st century. It can create music (still not like Mozart) and paint like Rembrandt. In fact, one of the few things the panelists agreed on is that an AI can continue making Rembrandt copies freely, because Rembrandt van Rijn’s style is not copyright protected. AI can also enable self-driving cars to do your daily commute (automated vehicles (AV) 4.0 is already here); write social media posts; allow Alexa, Siri or Google Maps to answer your questions; fight illegal poaching in Africa; and even track people’s facial expressions or body language to operate a machine (take a look at the Wheelie control motorized wheelchair). At the event, we even heard a very profound AI-written poem that spoke about inspiration and hope.
But there are many ethical issues that have arisen alongside this technological development. Governments are using facial recognition for surveillance and employee morale is negatively affected when machines take over jobs (just go to Target and do your own self-checkout). AI can also be biased (is the AI information fair and neutral?), actors and their voices can be cloned (there are many claims that some famous actresses have been cloned in productions rated PG-13 and above), and accelerated hacking is a problem. One of the most troubling concerns is AI terrorism, which can consist of autonomous drones, robotic swarms, remote attacks, or the delivery of diseases through nanorobots (take a look at Spider-Man: Far From Home).
Participants at the event expressed differing legal viewpoints. Some argued in favor of applying copyright rules for TDM; others rooted for the application of the fair use doctrine; others said that fair use could not be the policy-making mechanism for artificial intelligence; and other groups expressed their interest in getting free data to be able to sell more licenses. Someone called Wikipedia a ‘website full of white male western bias writers’, while others were more concerned with the ethical implications of freely giving way to artificial intelligence. There was even an open call for people to submit their comments on what the policy questions about machine learning and copyright should look like.
Currently, the US lacks default criteria for applying fair use to machine learning processes outside a narrow set of TDM cases. Although the TDM rulings (Google Books, HathiTrust) held that copying expressive works for non-expressive purposes was justified as fair use, their applicability to diverse TDM contexts remains legally limited and uncertain. In other words, corporations still don’t know how far TDM can go. The TDM rulings also did not address many side issues, such as computer hacking, contract law, cross-border copyright issues, or circumvention of technological protection measures.
On the other hand, the European Union just changed its rules with articles 3 and 4 of the DSM Directive, adopting a policy that apparently allows TDM for commercial use, although an opt-out option makes it look more like a restriction. Policymakers and academics in Europe are trying to figure out whether this part of the Directive will create a new copyright licensing market, or whether it will foster research and knowledge development. Countries like Germany, which had a well-established distinction between commercial and non-commercial TDM use, will now have to find the best way to implement the Directive. The United Kingdom, given its recent exit from the European Union, will not have to update its regime.
So, shall we open the doors of data so machines can learn more and produce more, or shall we keep them half-closed while we regulate? Shall we erase or keep the distinction between commercial and non-commercial TDM use? Shall we treat the process of feeding the AI machines as falling outside the scope of copyright? These questions have brought about much debate, but the problem is that no one knows for certain where this technology is going to end. Some attendees appeared genuinely concerned when presented with information on these issues.
This is a key moment in history to decide which way to go, and this is a topic on which governments need to fasten their seat belts and begin moving promptly down the policy road. As has happened with other fields of tech development and copyright laws, AI development is not going to wait for our legislators to tell AI (it, she, he, them?) where or when to stop.