Autoregressive generation is the process by which GPT-like models produce sequences. The model repeats three steps in a loop until a stopping condition is met (such as an end-of-sequence token or a length limit): condition on the sequence so far to obtain next-token probabilities, sample a token from that distribution, and append the sampled token to the sequence. Because of the sampling step, this procedure is intrinsically stochastic: the same prompt can yield different continuations on different runs.
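
As a minimal sketch of this loop, assuming a `model` callable that maps the token sequence so far to a vector of next-token logits (the names `generate` and `eos_token`, and the NumPy-based sampling, are illustrative assumptions, not any particular library's API):

```python
import numpy as np

def softmax(logits):
    # Convert raw logits into a probability distribution.
    z = np.exp(logits - logits.max())
    return z / z.sum()

def generate(model, prompt_tokens, max_new_tokens, eos_token, rng=None):
    """Autoregressive sampling loop.

    `model` is assumed to take a list of token ids and return
    next-token logits; any GPT-like model wrapped this way will do.
    """
    rng = rng or np.random.default_rng()
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # 1. Condition on the sequence so far to get next-token probabilities.
        probs = softmax(model(tokens))
        # 2. Sample a token from the output distribution (the stochastic step).
        next_token = int(rng.choice(len(probs), p=probs))
        # 3. Append the sampled token and repeat.
        tokens.append(next_token)
        if next_token == eos_token:
            break
    return tokens
```

Because step 2 draws from a distribution rather than always taking the argmax, two calls to `generate` with the same prompt (and different random seeds) can produce different sequences.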