prompt_strategies.stepwise_supervised

Module for stepwise datasets, which typically include a prompt and reasoning traces and, optionally, per-step or per-trace labels for reward modelling.

Classes

Name Description
StepwiseSupervisedPromptTokenizingStrategy Tokenizing strategy for supervised stepwise datasets, typically used for chain-of-thought (CoT) reasoning.

StepwiseSupervisedPromptTokenizingStrategy

prompt_strategies.stepwise_supervised.StepwiseSupervisedPromptTokenizingStrategy(
    self,
    tokenizer,
    sequence_len=2048,
    step_separator='\n',
    max_completion_length=None,
    train_on_last_step_only=False,
)

Tokenizing strategy for supervised stepwise datasets, typically used for chain-of-thought (CoT) reasoning. These datasets should include the following columns:

- prompt: the prompt text
- completions: a list of n completion steps
- labels: a list of n labels indicating the “correctness” of each step
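To make the expected columns concrete, here is a minimal sketch of what a single dataset row might look like, together with a small consistency check. The helper names (`validate_stepwise_row`, `step_label_pairs`), the example text, and the use of booleans for the labels are assumptions made for illustration. They are not part of this module, and the sketch does not show how the strategy itself tokenizes steps or applies `step_separator`.

```python
# Illustrative only: a hypothetical row with the three documented columns.
# Boolean labels are an assumption; the docs only say labels indicate
# the "correctness" of each step.
example_row = {
    "prompt": "What is 2 + 3 * 4?",
    "completions": [
        "First compute 3 * 4 = 12.",
        "Then add 2 to get 14.",
        "So the answer is 20.",
    ],
    "labels": [True, True, False],
}


def validate_stepwise_row(row: dict) -> None:
    """Hypothetical helper: check that a row has the documented columns
    and that there is exactly one label per completion step."""
    for column in ("prompt", "completions", "labels"):
        if column not in row:
            raise ValueError(f"missing column: {column}")
    if len(row["completions"]) != len(row["labels"]):
        raise ValueError("completions and labels must have the same length")


def step_label_pairs(row: dict) -> list:
    """Hypothetical helper: pair each completion step with its label."""
    validate_stepwise_row(row)
    return list(zip(row["completions"], row["labels"]))


for step, label in step_label_pairs(example_row):
    print(f"{label!s:5} | {step}")
```

Running a check like this over a dataset before training can surface rows where the number of steps and the number of labels disagree.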