Artificial intelligence is rapidly evolving, and large language models (LLMs) like ChatGPT are one of the more exciting examples. Their generative capabilities have implications for our patent system, some of which are underappreciated and nonintuitive.
Under U.S. patent law, an inventor may not obtain a patent if the claimed invention would have been obvious to an artisan of ordinary skill, in view of the prior art. (See 35 U.S.C. § 103.)
Publicly available LLMs will qualify as prior art against later inventions, provided a version of the LLM is maintained in a provably unaltered state. LLMs’ creative capacity will lead to their use as prior art for creative endeavors in general, including against inventions unrelated to AI.
Accused infringers will be able to prompt such an LLM to produce output that contains each element of the patent’s claims. This would be a classic case of invalidity with traditional prior art, but the very properties that make LLMs creative also raise questions about whether they can prove obviousness, at least as U.S. patent law on obviousness is currently formulated.
Obviousness attacks based on an LLM are plausible with a few simplifying, but nonetheless realistic, assumptions.
First, this article focuses on situations where the LLM is used as prior art against patents and applications filed more than a year after the LLM is made publicly accessible and archived. This avoids the complexity that arises when an inventor’s own disclosures are incorporated into the dataset on which the LLM is trained.
This article further restricts its analysis to only those situations where the prompt—provided to the LLM to generate the combinatorial reference—is found in the prior art, such as in the form of a live problem in a field or in a nonenabling description of a beneficial invention. We limit our discussion to situations where the prompt to the LLM satisfies the U.S. Supreme Court’s reason-to-combine standard laid down in the 2007 ruling in KSR International Co. v. Teleflex Inc.1
Finally, the article assumes LLMs become widely used tools of creation, such that they qualify as analogous art reasonably pertinent to the inventor’s field of endeavor in fields otherwise unrelated to AI.
First, this article describes the nature of LLMs and the process by which they determine their outputs. Then we describe the status quo of the law of obviousness and when LLMs can constitute prior art under the current patentability framework. We then explore arguments that will arise when LLMs are used to prove obviousness. We conclude by discussing next steps for attorneys on either side of the “v.”
Introduction to Large Language Models
LLMs are neural networks trained to, among other things, predict the next word, given a preceding body of text. A neural network learns by undergoing training, which involves changing model parameters to optimize an objective function in view of the training set.
In the case of LLMs, that objective function measures how well the model guesses words redacted from a corpus of training data. The model is presented with a document in the training set that has a word redacted. It guesses to fill in the blank, and the model’s parameters are adjusted based on whether the guess was correct. As the model is further trained, its guesses get better.
The training is self-supervised in that a human is not required to provide the LLM with a labeled training set; the LLM only needs to be given human-generated text. This allows training sets to reach internet scale. Recent advances in GPUs have allowed model parameters to number in the hundreds of billions.
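To make the predict-the-next-word objective concrete, the following toy sketch counts which word follows which in a small corpus and uses those counts to fill in the blank. A real LLM learns billions of neural network parameters by gradient descent rather than counting; the corpus and function names here are invented purely for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus_text):
    """Toy stand-in for LLM training: count how often each word follows
    each preceding word, so the 'model' can guess a redacted next word.
    No human labeling is needed; the text itself supplies the answers,
    which is the sense in which the training is self-supervised."""
    words = corpus_text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(model, prev_word):
    """Guess the most frequent continuation observed during training."""
    followers = model.get(prev_word.lower())
    return followers.most_common(1)[0][0] if followers else None

corpus = "the sky is blue . the sky is cloudy . the sky is blue ."
model = train_bigram_model(corpus)
print(predict_next(model, "is"))  # prints "blue"
```

Scaled up by many orders of magnitude, with neural parameters in place of counts, the same fill-in-the-blank objective drives the capabilities described below.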
At such large scales of computing power and training data, LLMs begin to develop new and surprising skills, like performing addition, demonstrating a theory of mind, reasoning logically and passing professional licensing exams.
LLMs produce more useful outputs if they incorporate an element of chance. For example, an LLM tasked with predicting the next word in an incomplete sentence may select an output via a weighted random selection of the most likely candidates. (See U.S. Patent No. 8,232,973 (filed June 30, 2008).)
Suppose that someone asked an LLM to complete a sentence—“the sky is …”—and that sentence ends with the word “blue” 60% of the time and “cloudy” the other 40% of the time.
Because LLMs operate probabilistically, rather than deterministically, the LLM will not output “blue” simply because it is the most likely candidate. Instead, the LLM will make a probability-weighted random selection among the top few candidates. That selection will then influence the list of candidates for the next word selection.
Thus, LLMs generally solve problems probabilistically, rather than deterministically. Giving an LLM the same prompt 10 different times can produce 10 different responses.
The amount of randomness in the selection of the next word is a tunable model parameter called “temperature.” Many models allow the user to adjust this value to generate more creative outputs, at the risk of the LLM producing nonsense if the temperature is set too high.
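The probability-weighted selection and the temperature knob described above can be sketched in a few lines of Python. This is a simplified illustration, not any vendor's implementation; the candidate scores and function name are invented for the example.

```python
import math
import random

def sample_next_word(candidates, temperature=1.0, rng=None):
    """Probability-weighted random selection of the next word.

    `candidates` maps candidate words to the model's raw scores.
    Dividing by the temperature before the softmax flattens the
    distribution (high temperature, more randomness) or sharpens it
    (low temperature, closer to always picking the top candidate)."""
    rng = rng or random.Random()
    words = list(candidates)
    scaled = [candidates[w] / temperature for w in words]
    m = max(scaled)  # shift for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(words, weights=weights, k=1)[0]

# The "the sky is ..." example: "blue" is favored but not guaranteed.
logits = {"blue": 2.0, "cloudy": 1.6}
print(sample_next_word(logits, temperature=0.7))
```

At a very low temperature the same call returns "blue" almost every time; at a high temperature the two candidates are chosen nearly at random.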
This randomness is useful for driving creativity, and it is something that patent law will struggle to deal with conceptually when LLMs are used as prior art in a validity attack.
Once trained, an LLM’s parameters can be frozen and the model further used for inference responsive to new prompts. In other cases, the model may undergo active learning, in which its parameters continue to evolve based on performance during inference time.
This article focuses on the former case, where the model parameters remain unchanged after training, from before the critical date of the invention at issue until the model is used as a prior art reference.
The Law of Patentability and Prior Art
To understand whether and how a trained, archived LLM may itself demonstrate the obviousness of an invention, it is helpful to consider the current state of patent law.
The Patent Act describes five distinct classes of prior art. According to Title 35 of the U.S. Code, Section 102(a)(1), a purported inventor cannot obtain a patent if:
the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
But the Patent Act recognizes an exception for disclosures made within one year before the effective filing date of a claimed invention, where the disclosure was made by the inventor or where the inventor had previously and publicly disclosed the subject matter recited in the disclosure.
Prior art-based patentability requirements come in two flavors: anticipation challenges and obviousness challenges. Anticipation challenges are covered by Section 102 of the Patent Act. An invention is anticipated if each and every limitation of the claim at issue is found in a single prior art reference. (See Verdegaal Bros. v. Union Oil Co. of Cal., 814 F.2d 628, 631 (Fed. Cir. 1987).)
Obviousness is in question when no single prior art reference anticipates the claim, but an obvious combination or modification of references reaches all claim elements. (See 35 U.S.C. § 103.) Obviousness challenges are covered by Section 103, which prohibits the granting of a patent for an invention that—though not expressly described in the prior art—would have been obvious to a person of ordinary skill in the art. Of relevance later in this discussion, obviousness is to be determined from the perspective of a person of ordinary skill at the time of invention.
Determining whether the prior art renders a claimed invention obvious requires looking to four factors:
- The scope and content of the prior art;
- The differences between the prior art and the claimed invention;
- The level of ordinary skill in the art; and
- Secondary considerations, such as long-felt and unsolved need, failure of others and skepticism by experts. (See Graham v. John Deere Co., 383 U.S. 1, 17-18 (1966).)
But not all prior art references are relevant to whether a particular application or patent would have been obvious to one of ordinary skill in the art.
Only prior art references that qualify as analogous art may form the basis of an obviousness rejection or invalidity challenge. A reference constitutes analogous art if it is from the same field of endeavor as the claimed invention or is reasonably pertinent to the problem the inventor faced. (See In re Bigio, 381 F.3d 1320, 1325 (Fed. Cir. 2004).)
Obviousness further requires a motivation or suggestion to modify or combine the prior art to reach the claimed combination. In KSR, the Supreme Court held, “Any need or problem known in the field of endeavor … and addressed by the patent can provide a reason for combining the elements in the manner claimed.”
A claimed invention is obvious when one of ordinary skill in the art, at the time of the invention, would have arrived at the invention via the operation of their creativity, background knowledge and common sense.2
LLMs as Prior Art
The Patent Act provides that no patent shall be issued for an invention that was previously in public use or publicly available. An archived, internet-accessible LLM that members of the public can prompt will qualify as prior art.
In a patent infringement case, a defendant raising an invalidity defense has the burden of demonstrating invalidity. To prove obviousness, the patent challenger will typically demonstrate that the references relied upon were publicly available on or before the effective filing date of the patent at issue.3
Consequently, defendants seeking to use an LLM as a prior art reference in an obviousness invalidity contention will need to prove that the LLM was publicly available as of a specific date and that no one has altered or retrained the model, leaving its parameters unchanged, since that date.
Parties—in expectation of a patent infringement suit—may archive offline LLMs with a custodian who can testify under oath that they have personal knowledge that no one altered or retrained the LLM as of a particular date. This is similar to how witnesses in patent infringement actions and inter partes review proceedings testify that archived webpages on the Wayback Machine were publicly available as of the relevant date and have not been changed since.
Indeed, industry practitioners may even choose to engineer their own prior art for use in future litigation. For example, frequent targets of patent suits could fine-tune an LLM on materials relevant to their technological field, make the fine-tuned LLM available for public use, archive the LLM to prevent subsequent training and designate a custodian to testify that no one altered the LLM after the archival date.
Archiving an LLM would have two consequences. First, an LLM archived in this manner would likely constitute prior art under Section 102(a)(1) of the Patent Act against patents and applications filed more than a year after the LLM’s archival, as patent defendants could show both that the LLM was made publicly available and that it has not changed since.
Second, fine-tuning the LLM on materials relevant to the field of endeavor will help demonstrate that the LLM itself constitutes analogous art.4
Obviousness further requires the challenger identify a motivation to modify the prior art with a rational underpinning, per the KSR decision. Patent challengers will argue that a skilled artisan would prompt an LLM with a problem discussed in other prior art references, or with a problem that is self-evident.5
Once the LLM is qualified as analogous prior art, an expert witness attacking validity could apply a prompt that qualifies as a motivation to modify the LLM reference. The model’s output in response to the prompt could then be compared to the patent claims to argue every element is satisfied.
LLMs might prove attractive forms of prior art when a litigant is otherwise facing a high reference count in its obviousness combination. Juries often view invalidity contentions based on obviousness as weaker when defense counsel must rely on many references to disclose each limitation of the claim at issue.
LLMs may provide better optics by serving as a single prior art reference that encodes knowledge from a staggering number of documents. By reducing the number of references that need to be cited from many to just two—the LLM and the prompt obtained from the prior art—basing obviousness attacks on LLMs may improve a defendant’s chances of invalidating an asserted patent.
New arguments will arise against using LLM outputs to invalidate patents under Section 103.
Patentees will argue hindsight bias drove the prompt provided to the LLM. Patentees will argue the only reason the accused infringer’s expert chose the prompt was that the patent at issue provided a roadmap.
Accused infringers will need to be ready to demonstrate that the prompt satisfies the constraints of KSR for motivations to modify the prior art. While certainly something to be mindful of, this concern is a conventional one, as courts are already equipped to guard against hindsight bias.
An ongoing problem with LLMs deals with their tendency to “hallucinate,” or generate plausible-sounding but untrue output. LLM hallucinations can take the form of citing statutes or cases that do not exist.
In light of LLMs’ hallucinations, parties will argue whether LLMs should be entitled to all the presumptions that U.S. patent law affords to traditional human-generated content, such as the presumption of operability.6 An LLM could hallucinate a system like a perpetual motion machine that is physically impossible to create. It could describe a process to make a device with the claim elements, but the process might be inoperative even if the device is operable.
As LLMs improve, and possibly start to exceed the reliability of individual traditional prior art references, this concern may attenuate.
Patentees may demand the opportunity to prompt the prior art LLM used to attack validity.
Patentees and applicants generally receive the opportunity to investigate prior art references to determine whether the references teach away from the invention at issue. A carefully prompted LLM output might teach away from the invention at issue.
The Argument from the Probabilistic Nature of LLM Outputs
The most interesting potential objection to the use of LLMs as prior art arises from their probabilistic nature.
LLMs generate responses via a probability-weighted random selection of candidate next tokens. Consequently, prompting an LLM the same way 10 times could yield 10 different answers.
This randomness is in tension with the requirement in patent law that the prior art must render all claim elements obvious at a particular point in time.
To invalidate a patent as obvious in view of the prior art, the defendant must demonstrate under Title 35 of the U.S. Code, Section 103, that
the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.
The defendant’s burden of persuasion at trial is clear and convincing evidence. (See Microsoft Corp. v. i4i Ltd. P’ship, 564 U.S. 91, 102 (2011).)
Thus, a patent defendant hoping to use an LLM as prior art in an obviousness invalidity contention must clearly and convincingly demonstrate that an ordinarily skilled artisan could have arrived at the invention by prompting the LLM on or before the asserted patent’s effective filing date.
LLM outputs obtained when crafting invalidity contentions will likely differ from those that would have been obtained at other times and, in particular, before the critical date of the invention at issue: each time the same prompt is applied, a different output may result.
Patentees will argue that LLM output generated during a lawsuit is not clear and convincing evidence that the same output would have been produced at the relevant time under the Patent Act. They will claim the patent challenger just got lucky and assert that a statistical outlier is not clear and convincing evidence of obviousness.
At the same time, savvy patent challengers may create their own luck, such as by entering the same prompt into the prior art LLM until one instance of output has all the elements of the claim at issue. A simple programming script could apply the prompt millions of times and classify outputs that meet the claim limitations, potentially running continuously until one output renders the patent obvious.
A prompt and LLM pair should be thought of as having a distribution of outputs, and adjudicators may need to account for that distribution when assessing obviousness. If 9 out of 10 randomly selected outputs of the pair render a claim obvious, that is a very different scenario from taking 20,000 attempts to arrive at one potentially invalidating output. Similar arguments will apply when a patentee manages to elicit output that teaches away from a prior art LLM.
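The repeated-sampling strategy, and the hit-rate distribution it implies, can be made concrete with a short Python sketch. Everything here is hypothetical: `query_llm` stands in for however an archived model would be invoked, and the element check is a naive substring match used only to illustrate the argument.

```python
import random

def output_hit_rate(query_llm, prompt, claim_elements, trials=1000):
    """Estimate the fraction of outputs from a prompt-and-LLM pair that
    recite every element of the claim at issue. `query_llm` is a
    hypothetical callable standing in for the archived model; a real
    analysis would compare outputs to claim limitations far more
    carefully than this substring test does."""
    hits = 0
    for _ in range(trials):
        output = query_llm(prompt).lower()
        if all(element.lower() in output for element in claim_elements):
            hits += 1
    return hits / trials

# Simulated archived model: recites both claim elements in ~10% of runs.
rng = random.Random(42)

def fake_llm(prompt):
    if rng.random() < 0.1:
        return "a widget comprising a sensor and a wireless transmitter"
    return "a widget comprising a sensor"

rate = output_hit_rate(fake_llm, "design a monitoring widget",
                       ["sensor", "wireless transmitter"])
print(rate)
```

A hit rate near 0.9 supports an argument that the output was the expected response at the relevant time; a rate near 1 in 20,000 invites the statistical-outlier objection described above.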
Traditional prior art references present none of these challenges: The reference reads the same way each time it is viewed. LLMs as prior art will force the patent system to grapple with new issues.
Conclusion and Takeaways
In short, LLMs generate outputs probabilistically, rather than deterministically, and these models can perform sophisticated natural language processing tasks, including solving problems stated as prompts in a variety of fields unrelated to AI.
This will make publicly released and suitably archived LLMs attractive forms of prior art for patent challengers. Such attacks will raise novel issues, as courts and the U.S. Patent and Trademark Office grapple with the nondeterministic nature of LLMs and the requirement in patent law that obviousness be proven with reference to a particular time period.
Groundwork laid now could affect the balance of the patent system. Parties concerned with patent quality should consider periodically fine-tuning, publishing and archiving generative models to serve as prior art against later inventions.
Generative AI seems poised to amplify the rate at which invention occurs, and these prior art LLMs could serve as an important counterbalance to technological tools that aid in the process of inventing.
(This article first appeared in Law360.)
1 See KSR Int’l Co. v. Teleflex Inc., 550 U.S. 398, 420 (2007) (“Any need or problem known in the field of endeavor at the time of invention and addressed by the patent can provide a reason for combining the elements in the manner claimed”).
2 See Perfect Web Techs. Inc. v. Infousa Inc., 587 F.3d 1324, 1329 (Fed. Cir. 2009) (noting that KSR expanded the source of information for a proper obviousness analysis to include “the background knowledge, creativity, and common sense of the person of ordinary skill”).
3 The patent challenger has the burden of demonstrating invalidity. Demonstrating that an invention would have been obvious to an ordinarily skilled artisan in view of one or more references requires demonstrating the priority of those references. See 35 U.S.C. § 103 (noting that to prove invalidity via an obviousness challenge, the defendant must show that “the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention”).
4 An LLM fine-tuned over materials relevant to a particular art may itself be directed to the same field of endeavor and thereby qualify as analogous art under the first Bigio prong. Alternatively, if a publicly available, fine-tuned LLM becomes a tool of the trade in the relevant art, then the LLM is likely pertinent to solving the problem the inventor faced.
5 See KSR Int’l Co. v. Teleflex Inc., 550 U.S. 398, 421 (2007) (“Rigid preventative rules that deny factfinders recourse to common sense, however, are neither necessary under our case law nor consistent with it.”)
6 MPEP 2121 (9th ed. rev. 7.2022 Feb. 2023), citing In re Sasse, 629 F.2d 675, 681 (CCPA 1980) (“When the reference relied on expressly anticipates or makes obvious all of the elements of the claimed invention, the reference is presumed to be operable. Once such a reference is found, the burden is on applicant to rebut the presumption of operability.”)