Is Your AI Testing Tool a Breach of Contract Claim Waiting to Happen?


Reliability, security, and legal compliance. These are assurances that customers purchasing technology products expect from their providers, and that are often required as part of the contracts for such products. AI providers, however, have lagged in their willingness to contractually commit to such assurances, let alone deliver them in practice. Thus, as AI products grow in both popularity and technical complexity, robust testing tools become indispensable. Unfortunately, using such tools may unwittingly expose companies to legal risk, particularly where user testing breaches the use rights, license restrictions, or allocation of intellectual property rights to which the parties committed in the contract for the AI product.

User-Driven AI Testing Explained
Testing AI systems is essential for ensuring their effectiveness, reliability and safety in real-world applications. Companies can employ various mechanisms to rigorously evaluate their AI technologies.

One common approach to AI testing is red teaming, where a specialized team simulates adversarial attacks or scenarios to uncover vulnerabilities and weaknesses in the AI system’s design and implementation. Red teaming can be likened to a simulated hack, where the user finds ways to exploit the product’s vulnerabilities, with the goal of patching them later.

Successful red teaming can be accomplished in a variety of ways, one of which is prompt hacking. Prompt hacking does not actually “hack” the AI system; rather, it attempts to break the boundaries of the system through prompts that try to jailbreak the parameters inherent in the model. A “jailbreak prompt” is an input into a system seeking to elicit a specific response that skirts copyright filters, ethical boundaries, or other limitations meant to permit only appropriate outputs. A red team may also purposefully exploit a system to uncover underlying bias or force hallucination. AI systems are trained on existing data, so if the pool of information is skewed or false, the system may produce an undesired or inaccurate output. To uncover this type of vulnerability, testers use a variety of test cases across a wide range of data to see if they can induce an undesired response from the system.

In addition to red teaming, companies also employ robustness testing, where AI systems undergo extensive stress testing across diverse and challenging conditions to assess their performance and reliability. Robustness testing involves exposing the AI technology to various data inputs, environmental factors, and edge cases to evaluate the product’s ability to maintain availability, performance and functionality before it is deployed in real-world settings. By subjecting AI systems to robustness testing, companies can uncover potential weaknesses, improve model generalization, and enhance the product’s adaptability to dynamic and unpredictable environments.

As the AI market burgeons, new types of testing are cropping up. One company even has a blockchain-based testing methodology in the works, touted to map an AI model’s components, training and performance, and to store that record cryptographically to prevent future alteration or tampering. Use of this tool will require providing a third party with access to the AI model, as well as storing the data and information generated by the model remotely. Depending on the mechanics of the mapping, products functioning in this manner may require that the model be reverse engineered in order to derive the information necessary to complete this type of testing.

The Contract Says…

License Restrictions
Many technology contracts and software licenses dictate the terms under which the product can be used. Most major providers restrict use with similar limitations, and user testing can run counter to these confines. For example:

  • The license may limit use exclusively to commercial, business or internal purposes. Testing in any form, including the methods described above, arguably would not constitute commercial or business use, but rather use for the purpose of simulating an attack or circumventing controls.
  • The license may limit the number of users that can access the product, or may restrict access to employees of the customer. If the customer’s licenses are allocated to actual users, there may not be enough seats left for a large red team; and if the testing is performed by a third-party provider, that provider may have no right to access the product at all under such a restriction.
  • Some use rights may prohibit the user from accessing, modifying or copying the source code. If testing requires “lifting the hood” in any way, it would violate this restriction.
  • The provider may require implementation of security or filtering mechanisms, and circumvention of such controls could result in negation of indemnity or liability protections.
  • Copying or remotely hosting the product may be prohibited, which would preclude users from performing the remote storage and cryptographic recordkeeping required for blockchain-based testing.

When negotiating the use rights for the product, many customers may not have negotiated carveouts to these limitations for internal testing. If limitations like the ones above are included in the agreement, the safest course of action may be to specifically amend the contract to ensure that user-driven testing will not be considered a violation of these requirements.

Acceptable Use Policies
In addition to carefully drafted license terms, providers often hold a customer to acceptable use policies to govern the behavior of individual users interacting with the product. These policies may outline prohibited activities such as:

  • Using the service for the purpose of reverse engineering
  • Gaining unauthorized access to the service or data
  • Disrupting or causing the system to go down
  • Impacting the availability or performance of the service
  • Overloading the system or circumventing controls or restrictions
  • Usage that bypasses security controls

These types of acceptable use restrictions extend to users performing testing of the AI product, and a violation could result in suspension or legal liability. Users performing red teaming would likely violate the prohibitions on gaining unauthorized access or bypassing security controls. Testing that probes what data was used to train the system in an effort to uncover biased answers could be understood by courts as “starting with the known product and working backward to divine the process which aided in its development or manufacture,” constituting reverse engineering. And any form of stress testing could unduly disrupt or overload the system.

Intellectual Property Allocations
Finally, AI providers protect the proprietary nature of their products through clear intellectual property designations within the contract, and also rely on copyright and trade secret law to protect their interest in the technology. Generally, these allocations of intellectual property are at odds with sharing model weights, code or other materials with third parties, who could ultimately misuse that access to develop competing products.

User-driven testing can inadvertently infringe upon intellectual property rights, particularly if testers reproduce, modify or distribute copyrighted materials or proprietary algorithms without authorization. By conducting testing activities that involve copying or manipulating software components, users may run afoul of copyright laws and expose themselves to infringement claims.

Notably, the Digital Millennium Copyright Act (DMCA), which prohibits the circumvention of technological measures that control access to a copyrighted work, such as a computer program, does contain a carveout permitting a lawful user of the software to circumvent solely for the purpose of testing interoperability. But because the Copyright Act does not preempt state breach of contract claims, a user who contractually accepts restrictions on the ability to reverse engineer, as in a software licensing agreement, can still face a breach of contract claim. This is why it is critically important that an end user understand exactly what the terms of the contract say, and what the user is or is not authorized to do with the AI product.

A business may also violate trade secret law by testing a provider’s software. A trade secret is information that has economic value, has not been publicly disclosed, and for which measures have been taken to maintain its secrecy. As such, testing that uncovers information by intentionally breaking safeguards put in place by the creator could run afoul of applicable state laws protecting trade secrets. Careful consideration of whether testing inappropriately exploits trade secrets is important before embarking on such tests.

Additionally, many contracts clearly state that the models are proprietary and sensitive—owned by the provider, and part of their confidential information, which may not be shared by the customer with third parties, except in limited circumstances. Without proper terms extending the right of access to the testers, AI providers may consider disclosure a breach of the contract.

With the emergence of sophisticated AI-testing mechanisms like red teaming and robustness testing, companies can bolster the reliability and security of their AI products. However, amidst this pursuit of excellence, companies must remain vigilant in ensuring that their testing practices align with the terms and conditions outlined in their licensing agreements and acceptable use policies. The evolving legal landscape surrounding AI, including copyright statutes and contractual obligations, underscores the need for companies to proactively consider these issues when negotiating their agreements with AI providers, and to conduct thorough reviews of AI product use restrictions before embarking on testing endeavors.

