One way in which naive anthropomorphism of a language model like GPT-3 fails is this: the probability distribution produced in response to a prompt is not a distribution over ways a person would continue that prompt; it's the distribution over the ways any person could continue that prompt. A contextually ambiguous prompt may be continued in mutually incoherent ways, as if by different people who might have continued the prompt under any plausible context.
The versatility of a large generative model like GPT-3 means it will respond to a prompt in many ways if there are many possible ways to continue it - including all the ways unintended by the human operator. Thus it is helpful to approach prompt programming from the perspective of constraining behavior: we want a prompt that is not merely consistent with the desired continuation, but inconsistent with undesired continuations.
Consider this translation prompt:
Translate French to English:
Mon corps est un transformateur de soi, mais aussi un transformateur pour cette
cire de langage.
This prompt does poorly at constraining possible continuations to the intended task. The most common failure mode will be that instead of an English translation, the model continues with another French sentence. Adding a newline after the French sentence will increase the odds that the next sentence is an English translation, but it is still possible for the next sentence to be in French, because there's nothing in the prompt that precludes a multi-line phrase from being the translation subject. Changing the first line of the prompt to "Translate this French sentence to English" will further increase reliability; so will adding quotes around the French sentence - but it's still possible that the French passage contains sections enclosed in quotes, perhaps as a part of a dialogue. Most reliable of all would be to create a syntactical constraint where any reasonable continuation can only be desired behavior, like this prompt:
Translate French to English.
French: Mon corps est un transformateur de soi, mais aussi un transformateur pour
cette cire de langage.
English:
This simple example is meant to frame a question central to the motivation of prompt programming: what prompt will result in the intended behavior and only the intended behavior?
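The progression of prompts discussed above can be sketched in a few lines of Python. This is only an illustration: the function name `build_translation_prompts` and the variant labels are hypothetical, and joining the wrapped French text into a single line is an assumption made here so that the labeled fields stay on one line each.

```python
def build_translation_prompts(french_text):
    """Return progressively more constrained prompts for one translation task.

    The variant names ("bare", "newline", "quoted", "labeled") are
    illustrative labels, not terminology from the text.
    """
    # Assumption: collapse line wraps so the French passage is a single line.
    sentence = " ".join(french_text.split())
    return {
        # Weakest: nothing stops the model from writing more French.
        "bare": f"Translate French to English:\n{sentence}",
        # A trailing newline makes a translation more likely, but not certain.
        "newline": f"Translate French to English:\n{sentence}\n",
        # Naming a single sentence and quoting it narrows things further.
        "quoted": f'Translate this French sentence to English:\n"{sentence}"\n',
        # Labeled fields: the only reasonable continuation is English text.
        "labeled": f"Translate French to English.\nFrench: {sentence}\nEnglish:",
    }


if __name__ == "__main__":
    french = (
        "Mon corps est un transformateur de soi, mais aussi un transformateur pour\n"
        "cette cire de langage."
    )
    for name, prompt in build_translation_prompts(french).items():
        print(f"--- {name} ---")
        print(prompt)
```

Printing the four variants side by side makes it easier to see which continuations each one still permits; only the labeled variant ends at a point where the next token is expected to begin the English text.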
Part of the efficacy of many-shot prompts may be recast through this lens: if the prompt consists of numerous instances of a function, it is unlikely that the continuation is anything but another instance of the function, whereas if there are only one or a few examples, it is more plausible that the continuation breaks from the pattern.
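As a rough sketch of this idea, the snippet below assembles a prompt from repeated instances of the same "function" (French in, English out) and leaves the final instance open. The helper name `build_few_shot_prompt`, the default task line, and the example sentence pairs are all assumptions chosen for illustration; none of them come from the text.

```python
def build_few_shot_prompt(examples, query, task="Translate French to English."):
    """Build a prompt from solved (french, english) pairs plus one open query.

    Each solved pair is one more instance of the pattern, which makes a
    pattern-breaking continuation less plausible.
    """
    lines = [task]
    for french, english in examples:
        lines.append(f"French: {french}")
        lines.append(f"English: {english}")
    # The final instance is left incomplete for the model to fill in.
    lines.append(f"French: {query}")
    lines.append("English:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical example pairs, used only to show the repeated structure.
    examples = [
        ("Bonjour.", "Hello."),
        ("Merci beaucoup.", "Thank you very much."),
    ]
    print(build_few_shot_prompt(examples, "Bonne nuit."))
```

With an empty `examples` list this reduces to the single labeled prompt shown earlier; each added pair reinforces the pattern at the cost of a longer prompt.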
[Figure: "Prompt Constraint Topology Analyzer: Mapping Response Space & Behavioral Boundaries". Panels: (1) Unconstrained vs Constrained Response Space (unconstrained: FR, EN, ??; constrained: EN only). (2) Prompt Structure Evolution (Level 1: "Translate French to English:"; Level 2: "Translate this French sentence:"; Level 3: "French: [input]" / "English: [output]"). (3) Probability Distribution: P(response) against response, with several peaks under a weak constraint and a single peak under a strong constraint. (4) Many-shot Effect (single example: loose pattern; multiple examples: strong pattern).]