utils.models

Module for models and model loading

Classes

| Name | Description |
| --- | --- |
| ModelLoader | Manages all the configuration and monkey patches while loading a model. |

ModelLoader

utils.models.ModelLoader(
    self,
    cfg,
    tokenizer,
    *,
    processor=None,
    inference=False,
    reference_model=False,
    **kwargs,
)

Manages all the configuration and monkey patches while loading a model.

Attributes

| Name | Description |
| --- | --- |
| has_flash_attn | Check if flash attention is installed. |
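
As an illustration, a check like has_flash_attn is commonly implemented by probing for the flash_attn package. This is a minimal sketch of that pattern, not necessarily the exact property body:

```python
import importlib.util

def has_flash_attn() -> bool:
    # Probe for the flash_attn package without importing it.
    return importlib.util.find_spec("flash_attn") is not None
```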

Methods

| Name | Description |
| --- | --- |
| patch_llama_derived_model | Modify all Llama-derived models in one block. |
| patch_loss_llama | Patch loss functions and other optimizations. |
| set_attention_config | Configure attention settings; sample packing uses a custom FlashAttention-2 (FA2) patch. |
| set_auto_model_loader | Set self.auto_model_loader; defaults to transformers.AutoModelForCausalLM. |

patch_llama_derived_model
utils.models.ModelLoader.patch_llama_derived_model()

Modify all Llama-derived models in one block.

patch_loss_llama
utils.models.ModelLoader.patch_loss_llama()

Patch loss functions and other optimizations.

set_attention_config
utils.models.ModelLoader.set_attention_config()

Configure attention settings; sample packing uses a custom FlashAttention-2 (FA2) patch.
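
For context, a sketch of config flags that typically drive this path; the flag names below are assumptions for illustration, not documented options:

```python
from types import SimpleNamespace

# Illustrative config; flag names are assumptions, not documented options.
cfg = SimpleNamespace(
    flash_attention=True,  # request FlashAttention-2 kernels
    sample_packing=True,   # sample packing relies on the custom FA2 patch
)
```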

set_auto_model_loader
utils.models.ModelLoader.set_auto_model_loader()

Set self.auto_model_loader. It defaults to transformers.AutoModelForCausalLM (set at __init__). When using a multimodal model, self.auto_model_loader should be set according to the model type.
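
A hedged sketch of the selection logic described above; the multimodal flag and the AutoModelForVision2Seq choice are assumptions for illustration, with AutoModelForCausalLM as the documented default:

```python
from transformers import AutoModelForCausalLM, AutoModelForVision2Seq

def pick_auto_model_loader(cfg):
    # Hypothetical helper: choose the auto-loader class by model type.
    if getattr(cfg, "is_multimodal", False):  # assumed flag name
        return AutoModelForVision2Seq  # assumed multimodal choice
    return AutoModelForCausalLM  # documented default
```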

Functions

| Name | Description |
| --- | --- |
| get_module_class_from_name | Gets a class from a module by its name. |
| load_model | Load a model for a given configuration and tokenizer. |
| load_tokenizer | Load and configure the tokenizer based on the provided config. |
| modify_tokenizer_files | Modify tokenizer files to replace added_tokens strings, save them to an output directory, and return the path to the modified tokenizer. |
| setup_quantized_meta_for_peft | Replaces quant_state.to with a dummy function to prevent PEFT from moving quant_state to the meta device. |
| setup_quantized_peft_meta_for_training | Replaces the dummy quant_state.to method with the original function to allow training to continue. |

get_module_class_from_name

utils.models.get_module_class_from_name(module, name)

Gets a class from a module by its name.

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| module | torch.nn.Module | The module to get the class from. | required |
| name | str | The name of the class. | required |
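
A minimal sketch of the recursive lookup this function performs, mirroring the pattern used in transformers; the exact implementation here may differ:

```python
import torch.nn as nn

def get_module_class_from_name(module: nn.Module, name: str):
    # Return the class of the first (sub)module whose class name matches.
    if module.__class__.__name__ == name:
        return module.__class__
    for child in module.children():
        found = get_module_class_from_name(child, name)
        if found is not None:
            return found
    return None
```

This kind of lookup is typically used to resolve layer classes by name, e.g. get_module_class_from_name(model, "LlamaDecoderLayer") when building an FSDP auto-wrap policy.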

load_model

utils.models.load_model(
    cfg,
    tokenizer,
    *,
    processor=None,
    inference=False,
    reference_model=False,
    **kwargs,
)

Load a model for a given configuration and tokenizer.
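
A hedged usage sketch combining load_tokenizer and load_model. The return shape (a model plus a PEFT config) is an assumption based on common patterns, not something this page documents:

```python
from utils.models import load_model, load_tokenizer

# cfg is assumed to be the project's parsed training config.
tokenizer = load_tokenizer(cfg)

# The (model, peft_config) return shape is assumed; check the source.
model, peft_config = load_model(cfg, tokenizer, inference=False)
```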

load_tokenizer

utils.models.load_tokenizer(cfg)

Load and configure the tokenizer based on the provided config.

modify_tokenizer_files

utils.models.modify_tokenizer_files(tokenizer_path, token_mappings, output_dir)

Modify tokenizer files to replace added_tokens strings, save them to the output directory, and return the path to the modified tokenizer.

This only works for reserved tokens that were added to the tokenizer, not for tokens already part of the base vocabulary.

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| tokenizer_path | str | Path or name of the original tokenizer. | required |
| token_mappings | Dict[int, str] | Dict mapping {token_id (int): new_token_string}. | required |
| output_dir | str | Directory to save the modified tokenizer. | required |

Returns

| Type | Description |
| --- | --- |
| str | Path to the modified tokenizer directory. |

Ref: https://github.com/huggingface/transformers/issues/27974#issuecomment-1854188941
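
A hedged usage sketch; the model name, token id, and replacement string below are purely illustrative:

```python
from transformers import AutoTokenizer

from utils.models import modify_tokenizer_files

# Remap a reserved added token (id and strings are illustrative only).
new_path = modify_tokenizer_files(
    tokenizer_path="meta-llama/Meta-Llama-3-8B",
    token_mappings={128002: "<|custom_token|>"},
    output_dir="./modified-tokenizer",
)
tokenizer = AutoTokenizer.from_pretrained(new_path)
```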

setup_quantized_meta_for_peft

utils.models.setup_quantized_meta_for_peft(model)

Replaces quant_state.to with a dummy function to prevent PEFT from moving quant_state to the meta device.

setup_quantized_peft_meta_for_training

utils.models.setup_quantized_peft_meta_for_training(model)

Replaces the dummy quant_state.to method with the original function to allow training to continue.
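
A sketch of the monkey-patch pair these two functions describe, assuming the quantized parameters are bitsandbytes Params4bit and that the original method is stashed on the quant state; the real bookkeeping may differ:

```python
import types

from bitsandbytes.nn import Params4bit

def setup_quantized_meta_for_peft(model):
    # Swap quant_state.to for a no-op so PEFT's device moves skip it.
    def dummy_to(self, *args, **kwargs):
        return self

    for param in model.parameters():
        if isinstance(param, Params4bit):
            param.quant_state._orig_to = param.quant_state.to
            param.quant_state.to = types.MethodType(dummy_to, param.quant_state)

def setup_quantized_peft_meta_for_training(model):
    # Restore the original quant_state.to stashed above.
    for param in model.parameters():
        if isinstance(param, Params4bit) and hasattr(param.quant_state, "_orig_to"):
            param.quant_state.to = param.quant_state._orig_to
            del param.quant_state._orig_to
```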