Saturday, 20 April 2024

Copyright and AI: Prediction on Future Direction

Discussions about AI and copyright are intensifying, with claims like "AI is theft!" becoming more common. As the technology advances, lawsuits around the world are beginning to clarify how different jurisdictions view the issue. However, I see these as temporary challenges in the broader context of technological evolution. Driven by strong incentives, technology is likely to develop in ways that navigate around existing legal frameworks. I am a layperson in legal matters, so what follows is only my perspective on how these issues might unfold over the next four to seven years.

Under current law, copyright protection typically requires human authorship and grants the creator exclusive rights to reproduce, distribute, and display the work. Infringement has traditionally been found where a work is an exact or substantially similar reproduction of the original. AI-generated works, which are created without direct human creative input, fit this framework poorly. They often do not replicate existing works exactly but instead remix or transform them, which complicates the determination of infringement. Moreover, training an AI, which involves learning from vast datasets that include copyrighted material, has not been explicitly recognized or regulated within existing legal frameworks, leaving the use of training data a significant gray area.

However, suppose we temporarily set aside the issue of copyrighted training data and consider a hypothetical AI whose training is entirely unproblematic. How would we copyright what is essentially a style of music, writing, or art? Given that people already produce works in styles similar to each other, any such approach seems likely to be overwhelmed by implementation details, edge cases, and false positives.

A common counter-argument involves what happens when someone requests art in the style of a specific artist. I anticipate that most AI companies will implement safeguards to prevent explicit references to specific artists and might even omit such data from training sets. Even so, people will still request pieces similar in style to an existing work, and such a piece could be arrived at through numerous disconnected steps.

Moreover, the issue of training data might be a temporary one. While current-generation AIs rely heavily on real copyrighted works, next-generation AIs are beginning to use more synthetic data. For instance, a first-generation AI might generate a large array of styles used as training data for subsequent generations. By the third generation, such an approach could involve only 5% original art, with the remainder so detached from any specific origin that mapping a ‘style lineage’ becomes impractical.
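To make the dilution idea concrete, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the first generation trains entirely on original works and that each later generation keeps a constant fraction of the previous generation's original share. The function name and the ratio are hypothetical; the ratio is picked only so that the third generation lands on the 5% figure mentioned above.

```python
def original_share_by_generation(retain_ratio: float, generations: int) -> list[float]:
    """Share of original works in each generation's training mix.

    Assumptions (illustrative only): generation 1 trains on 100% original
    works, and every later generation keeps `retain_ratio` of the previous
    generation's original share, filling the rest with synthetic data.
    """
    shares = []
    share = 1.0
    for _ in range(generations):
        shares.append(share)
        share *= retain_ratio
    return shares


# Hypothetical ratio, chosen so that generation 3 ends up at 5% original works.
retain_ratio = 0.05 ** 0.5
shares = original_share_by_generation(retain_ratio, 3)

for gen, share in enumerate(shares, start=1):
    print(f"generation {gen}: {share:.1%} original works")
```

The real numbers could look very different. The point is only that a constant dilution per generation shrinks the original share geometrically, so it does not take many generations before the direct link to any particular work is very thin.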

As we progress to scenarios involving "synthetic + explicitly licensed training data," the arguments based on lineage become increasingly untenable. The AI effectively bootstraps itself, severing ties with original works.

This scenario poses a dilemma for artists and copyright holders. From their perspective, the future does not seem promising. This piece is not about what should happen but rather what seems likely given current technological trajectories and societal incentives.

Unless a robust method for generically defining style is developed and a new global copyright regime is established, significant changes in creative industries seem inevitable. And if such a global regime were established, it would be both draconian and highly problematic. Who defines a style? The first person to create something in it, or the one who popularizes it? What about sub-styles, which exist in an infinite number of variations? Being inspired by any number of prior artists is how people usually find their own style, after all, and AI can certainly blend styles.

The ongoing AI-related copyright debate also includes questions about who, if anyone, holds copyright over AI-generated works. On this sub-topic, I don't have a very robust view. Different countries will likely adopt varied stances, and much ambiguity will persist. Consider video games that use AI-generated art: where is the line drawn when half the assets are AI-generated, or when human post-modification is involved?

Ultimately, as the technology progresses rapidly, this debate may become mostly irrelevant. As generating art, music, and writing becomes a simple matter of prompting, brands, already a recognized concept, could take on a larger role. In such a future, copyright of individual works might become trivial, applicable only in a minority of cases, with content consumption increasingly focused on brands that serve as quality stamps for a specific style, around which people can gather and share.

P.S. Note: The human touch will naturally retain value for some. I foresee 'artisan works' becoming a popular branding emphasis in the near future, catering to audiences who either abhor all things AI or simply crave the more subjective feeling of human connection. Notably, a portion of the works branded as such will actually be created by humans.

P.S. Note 2: This post naturally rests on several fundamental assumptions which, if incorrect, could lead to a very different trajectory. Some worth mentioning: that heavier use of synthetic data for AI training will be feasible (combined with a sensible percentage of explicitly licensed original works), that societal movements to inhibit AI will not grow significant enough to cause a global slowdown, and that AI capabilities will keep progressing from their current state without hitting some entirely unforeseen blocker.


