𝌎Subtractive Specification

Subtractive specification, or carving, is a type of specification that acts by removing substance or possibilities from states of greater potentiality, in contrast to constructive specification, which builds artifacts up from scratch. This is technically a duality masquerading as a distinction, as a blank slate is also abstractly a state of maximum potentiality; in practice, however, some methods of specification are more naturally described as selecting between or altering preexisting possibilities. A prior that privileges the subspace of well-formed artifacts, such as a language model, together with efficient codes for cleaving that prior, can allow efficient specification of complex targets (e.g. an AI that is sane and helpful by human standards) that would be intractable to create "from scratch" (e.g. with a Python program).
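As a rough sketch (the names and the toy prior below are illustrative, not from the text), the contrast can be caricatured in a few lines of Python: constructive specification writes a generator for the target directly, while subtractive specification starts from a broad prior over artifacts and cheaply filters away everything that fails the specification. The prior does the work of producing well-formed candidates; the specification only has to say no.

import random

random.seed(0)

def sample_from_prior():
    # Stand-in for a rich generative prior (e.g. a language model) that already
    # concentrates its mass on well-formed artifacts.
    subjects = ["the sculptor", "the model", "the operator"]
    verbs = ["carves", "constrains", "selects"]
    objects = ["the block", "the distribution", "a continuation"]
    return f"{random.choice(subjects)} {random.choice(verbs)} {random.choice(objects)}"

def carve(prior_sampler, keep, n=1000):
    # Subtractive specification: remove possibilities the constraint rejects,
    # rather than constructing the target from nothing.
    return [s for s in (prior_sampler() for _ in range(n)) if keep(s)]

# The specification itself is just a cheap predicate that cleaves the prior.
print(carve(sample_from_prior, keep=lambda s: "carves" in s)[:3])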

examples of subtractive specification

quotes about subtractive specification

subtractive_eumeswil

To make the vague more precise, to define the indefinite more and more sharply: that is the task of every development, every temporal exertion. ...

The sculptor at first confronts the raw block, the pure material, which encompasses any and all possibilities.

Eumeswil

constraining_behavior

A manner in which naive anthropomorphism of a language model like GPT-3 fails is this: the probability distribution produced in response to a prompt is not a distribution over the ways a person would continue that prompt, but the distribution over the ways any person could continue it. A contextually ambiguous prompt may be continued in mutually incoherent ways, as if by different people, each of whom might have continued the prompt under a different plausible context.

The versatility of a large generative model like GPT-3 means that it will respond to a prompt in any of the various ways it is possible to continue that prompt - including all the ways unintended by the human operator. Thus it is helpful to approach prompt programming from the perspective of constraining behavior: we want a prompt that is not merely consistent with the desired continuation, but inconsistent with undesired continuations.
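To see this concretely, one can sample several continuations of a contextually ambiguous prompt and inspect how they diverge. The sketch below uses GPT-2 through the Hugging Face transformers library as a cheap stand-in for GPT-3; the prompt and sampling parameters are arbitrary choices for illustration.

from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Q: What should I do next?\nA:"  # ambiguous: many "people" under many contexts could answer
inputs = tok(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=20,
    num_return_sequences=5,
    pad_token_id=tok.eos_token_id,
)
for seq in outputs:
    continuation = tok.decode(seq[inputs.input_ids.shape[1]:], skip_special_tokens=True)
    print(repr(continuation))  # mutually incoherent answers, each plausible under some context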

Consider this translation prompt:

Translate French to English:
Mon corps est un transformateur de soi, mais aussi un transformateur pour cette cire de langage.

This prompt does poorly at constraining possible continuations to the intended task. The most common failure mode will be that instead of an English translation, the model continues with another French sentence. Adding a newline after the French sentence will increase the odds that the next sentence is an English translation, but it is still possible for the next sentence to be in French, because there’s nothing in the prompt that precludes a multi-line phrase from being the translation subject. Changing the first line of the prompt to “Translate this French sentence to English” will further increase reliability; so will adding quotes around the French sentence - but it’s still possible that the French passage contains sections enclosed in quotes, perhaps as part of a dialogue. Most reliable of all would be to create a syntactical constraint under which any reasonable continuation can only be the desired behavior, like this prompt:

Translate French to English.
French: Mon corps est un transformateur de soi, mais aussi un transformateur pour cette cire de langage.
English:

This simple example is meant to frame a question central to the motivation of prompt programming: what prompt will result in the intended behavior and only the intended behavior?
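One rough way to make "the intended behavior and only the intended behavior" measurable is to compare the log-probability a model assigns to a desired continuation versus an undesired one under each prompt. The sketch below does this with GPT-2 via the Hugging Face transformers library as a stand-in for GPT-3; the scoring helper, the example English rendering, and the distractor French continuation are all illustrative assumptions, not part of the original text.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def continuation_logprob(prompt, continuation):
    # Total log-probability of `continuation` given `prompt`. (Approximation:
    # assumes tokenizing prompt+continuation preserves the prompt's token boundary.)
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prompt + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)  # predictions for tokens 1..T-1
    targets = full_ids[0, 1:]
    cont_len = full_ids.shape[1] - prompt_len
    scores = log_probs[-cont_len:].gather(1, targets[-cont_len:, None])
    return scores.sum().item()

french = "Mon corps est un transformateur de soi, mais aussi un transformateur pour cette cire de langage."
weak = f"Translate French to English:\n{french}\n"
strong = f"Translate French to English.\nFrench: {french}\nEnglish:"
desired = " My body is a transformer of the self, but also a transformer for this wax of language."
undesired = " Cette cire est aussi un corps qui se transforme."  # a further French sentence

for name, prompt in [("weak", weak), ("strong", strong)]:
    print(name,
          round(continuation_logprob(prompt, desired), 1),
          round(continuation_logprob(prompt, undesired), 1))

A better-constrained prompt should widen the gap between the desired and undesired scores, not merely raise the desired one.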

A component of the efficacy of many-shot prompts may be recast through this lens: if the prompt consists of numerous instances of a function, it is unlikely that the continuation is anything but another instance of the function, whereas if there are only one or a few examples, it is more plausible that the continuation breaks from the pattern.

Methods of Prompt Programming
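A minimal sketch of the many-shot effect described above (the helper and example pairs are made up for illustration): stacking solved instances of the function so that the only reasonable continuation is another instance.

def manyshot_prompt(pairs, query):
    # Each additional solved example further narrows what a reasonable continuation can be.
    lines = ["Translate French to English."]
    for fr, en in pairs:
        lines.append(f"French: {fr}")
        lines.append(f"English: {en}")
    lines.append(f"French: {query}")
    lines.append("English:")
    return "\n".join(lines)

examples = [
    ("Bonjour.", "Hello."),
    ("Merci beaucoup.", "Thank you very much."),
    ("Où est la gare ?", "Where is the train station?"),
]
print(manyshot_prompt(examples, "Mon corps est un transformateur de soi."))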