Sunday, 16 April 2023

AI Tools for Productivity - Risks and Mitigations

In the short term, even the internal utilization of AI tooling poses risks that can be highly relevant for enterprises. In this article, I cover some of the primary risks affecting large categories of AI tools, along with potential mitigations that are already possible, as well as some likely near-future approaches.


Note: This article focuses only on company-internal use cases, where productivity can be increased in various ways with current-generation, productized AI tooling (e.g., GPT-4, Stable Diffusion/Midjourney, Adobe Firefly, and other use-case-specific tools). I'm not discussing customer-facing, end-to-end processes, which carry significantly greater risks and warrant a more extensive discussion covering areas such as bias, fairness, lack of explainability, regulatory risks, reliability, robustness requirements, and ethical concerns. For internal productivity use cases, these are, for the most part, less critical.


Privacy and data protection


The risks:


At the time of writing, the most well-known services are still provided from the US, which makes them immediately problematic with regard to GDPR (GDPR concerns are also why Italy has temporarily banned ChatGPT). In ChatGPT's case, there have also been instances of users seeing each other's data, and there is the issue of user-submitted material being used as further training data. Together, these effectively rule out most use cases that deal with internal and sensitive data.


The mitigations and future potential:


Microsoft has been a forerunner in AI productization for a while and is very familiar with European privacy legislation. Using their existing ecosystem, they have already published plans to integrate GPT into their extensive suite of office, productivity, and cloud offerings, which are already GDPR-compliant. They will undoubtedly ensure that companies have full control over whether their internal data gets utilized in training and that GDPR controls are fully implemented. This will naturally force their competitors to respond with similar capabilities.


Another approach is to use open-source AI tools that can be run on private networks without external connectivity, ensuring that internal and sensitive data cannot leak. The tradeoff is a significantly less productized ecosystem, which in most cases requires more investment and skill to use. The open-source ecosystem is expanding quickly and will eventually mature. On the visual side, Stable Diffusion can already be run locally with little effort, and the recently released Open Assistant seems promising as an alternative to ChatGPT.


Copyright and IPR


The risks:


There are two sides to the IPR risk: 1) company-internal IPR leaking outside (via service bugs, by being used as training material that others can then reproduce, etc.), and 2) inadvertent use of others' IPR (e.g., if code completion/generation reproduces a large enough portion of a source work verbatim to be considered a copyright violation; the same applies in a similar way to visual, textual, and other content types).


There are several pending lawsuits alleging copyright infringement, both for producing work in a style similar to copyrighted originals and, specifically, for using those originals as part of the training data for generative AIs.


The EFF argues that this should fall within fair use, and I sincerely hope that interpretation will be widely applied. However, until courts start setting precedents, this will remain a risk: https://www.eff.org/deeplinks/2023/04/how-we-think-about-copyright-and-ai-art-0


The mitigation:


There is little that can be done to fully mitigate the risk, as an end user has no way of knowing when a generative tool produces something that happens to be close enough to an existing copyrighted work.


There are, however, a few ways to reduce or manage it:


  • Wait for services where the service provider explicitly carries the copyright risk.
  • One variant is using AI tools that have been exclusively trained on content explicitly licensed for training purposes, such as what Adobe did with their Firefly. This effectively sidesteps the issue.
  • Realize how minuscule the risk is and simply accept it.
  • Mitigate by never using generative output exactly as-is. Instead, make manual modifications, or use generative AIs only in an assisting role that completes a base you have provided manually. This naturally reduces some of the benefit, but it can be a necessary part of the process in some use cases.
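
As a rough illustration of the last approach, a team could track how much of a finished text was carried over unchanged from the generated draft. The sketch below is only an assumption of how such a check might look: the function name verbatim_share is hypothetical, the comparison is a plain word-level diff using Python's standard difflib module, and the resulting number is an internal indicator of how much manual modification took place, not a legal test of copyright infringement.

```python
from difflib import SequenceMatcher


def verbatim_share(generated: str, final: str) -> float:
    """Return the fraction of words in `final` that also appear, in the
    same order, in `generated` (0.0 = fully rewritten, 1.0 = used as-is).

    Hypothetical helper for illustration only.
    """
    generated_words = generated.split()
    final_words = final.split()
    if not final_words:
        return 0.0
    # autojunk=False keeps the comparison exact for long texts.
    matcher = SequenceMatcher(None, generated_words, final_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(final_words)


if __name__ == "__main__":
    draft = "the quick brown fox"
    edited = "the quick red fox"
    print(f"Share carried over unchanged: {verbatim_share(draft, edited):.2f}")
```

A high share simply means the output was used close to as-is. What threshold, if any, is acceptable is a policy decision for each use case.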


Regulation


The risk:


With the coming impact of AI increasingly clear to legislators, it is all but certain that AI legislation and various additional regulations will start taking shape in the near future. The first taste of this is the EU's draft AI Act, which, in addition to covering some essential and necessary steps, also overshoots significantly by placing unrealistic requirements on general-purpose AI providers, such as full explainability of everything, understanding all possible scenarios and preventing misuse, as well as requiring fully understood and error-free training data.


https://www.technologyreview.com/2022/05/13/1052223/guide-ai-act-europe/


The AI Act also poses a significant risk to open-source AI models and tools due to potential legal liability around general-purpose AIs. This liability would fall not necessarily on the application side but rather on the general-purpose model provider.


https://techcrunch.com/2022/09/06/the-eus-ai-act-could-have-a-chilling-effect-on-open-source-efforts-experts-warn/


The mitigation:


There is little we can do as individuals, or at the level of most companies, except keep our ears to the ground. The impact on internal productivity use cases will likely be significantly smaller and will depend more on how service providers respond (for example, if entire categories of tools end up being effectively banned).


Competition


The number one risk is waking up to this too late while your competitors start realizing productivity benefits, allowing them to move faster and faster.

