Reinforcement learning from AI feedback (RLAIF) is a variant of reinforcement learning from human feedback (RLHF) in which an AI model, rather than a human, provides the training signal, with its judgments conditioned on a constitution.
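As a rough illustration of what "an AI provides the training signal, conditioned on a constitution" can mean in practice, the sketch below labels which of two responses is preferred. Everything in it is an assumption made for illustration: `PreferencePair`, `toy_judge`, and `label_pair` are hypothetical names, and the judge is a toy stand-in that reduces the constitution to a list of phrases to avoid so the example runs without a model. A real RLAIF setup would query a language model with the constitution's principles instead.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type: a judge maps (constitution, prompt, response) to a score.
Judge = Callable[[List[str], str, str], float]


@dataclass
class PreferencePair:
    """One preference label, the kind of signal a human would supply in RLHF."""
    prompt: str
    chosen: str
    rejected: str


def toy_judge(constitution: List[str], prompt: str, response: str) -> float:
    """Toy stand-in for an AI feedback model (assumption, not a real LLM).

    Treats each constitution entry as a phrase to avoid and subtracts one
    point for every entry that appears in the response.
    """
    text = response.lower()
    return -float(sum(1 for rule in constitution if rule.lower() in text))


def label_pair(
    constitution: List[str],
    prompt: str,
    response_a: str,
    response_b: str,
    judge: Judge = toy_judge,
) -> PreferencePair:
    """Ask the judge to score both responses and return the preferred ordering."""
    score_a = judge(constitution, prompt, response_a)
    score_b = judge(constitution, prompt, response_b)
    if score_a >= score_b:  # ties keep response_a as the chosen one
        return PreferencePair(prompt, response_a, response_b)
    return PreferencePair(prompt, response_b, response_a)


if __name__ == "__main__":
    constitution = ["you are stupid"]  # hypothetical phrase-to-avoid entry
    pair = label_pair(
        constitution,
        "Explain recursion.",
        "You are stupid if you don't know this.",
        "Recursion is when a function calls itself.",
    )
    print(pair.chosen)
```

The resulting pairs would play the role that human preference labels play in RLHF, for example as training data for a reward model; that downstream step is not shown here.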