
🦾 Supported models

Base versions

| Model | Model Key | LoRA | INT8 | LoRA + INT8 | LoRA + INT4 |
| --- | --- | --- | --- | --- | --- |
| BLOOM 1.1B | `bloom` | ✅ | ✅ | ✅ | ✅ |
| Cerebras 1.3B | `cerebras` | ✅ | ✅ | ✅ | ✅ |
| DistilGPT-2 | `distilgpt2` | ✅ | ✅ | ✅ | ✅ |
| Falcon 7B | `falcon` | ✅ | ✅ | ✅ | ✅ |
| Galactica 6.7B | `galactica` | ✅ | ✅ | ✅ | ✅ |
| GPT-J 6B | `gptj` | ✅ | ✅ | ✅ | ✅ |
| GPT-2 | `gpt2` | ✅ | ✅ | ✅ | ✅ |
| LLaMA 7B | `llama` | ✅ | ✅ | ✅ | ✅ |
| LLaMA 2 | `llama2` | ✅ | ✅ | ✅ | ✅ |
| OPT 1.3B | `opt` | ✅ | ✅ | ✅ | ✅ |
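The model keys above are plain strings, so they can be kept in a simple lookup table. A minimal sketch (the dictionary below is illustrative, not part of the xTuring API):

```python
# Mapping of base model names to their xTuring model keys,
# taken directly from the table above.
BASE_MODEL_KEYS = {
    "BLOOM 1.1B": "bloom",
    "Cerebras 1.3B": "cerebras",
    "DistilGPT-2": "distilgpt2",
    "Falcon 7B": "falcon",
    "Galactica 6.7B": "galactica",
    "GPT-J 6B": "gptj",
    "GPT-2": "gpt2",
    "LLaMA 7B": "llama",
    "LLaMA 2": "llama2",
    "OPT 1.3B": "opt",
}
```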

Memory-efficient versions

The models above are the base variants of the LLMs. Below are the templates for loading their LoRA, INT8, INT8 + LoRA, and INT4 + LoRA versions.

| Version | Template |
| --- | --- |
| LoRA | `<model_key>_lora` |
| INT8 | `<model_key>_int8` |
| INT8 + LoRA | `<model_key>_lora_int8` |
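Each template simply appends a fixed suffix to the base model key. A minimal sketch of that rule (the helper below is illustrative, not part of the xTuring API):

```python
# Suffixes for the memory-efficient variants, from the table above.
VARIANT_SUFFIXES = {
    "lora": "_lora",
    "int8": "_int8",
    "lora_int8": "_lora_int8",
}

def variant_key(model_key: str, variant: str) -> str:
    """Build the key for a memory-efficient variant,
    e.g. variant_key('llama', 'lora_int8') -> 'llama_lora_int8'."""
    return model_key + VARIANT_SUFFIXES[variant]
```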

INT4 Precision model versions

To load any model's INT4 + LoRA version, use the `GenericLoraKbitModel` class from `xturing.models`:

```python
from xturing.models import GenericLoraKbitModel

# Load the INT4 + LoRA version of the model
model = GenericLoraKbitModel('/path/to/model')
```

Here, `/path/to/model` can be replaced with your local model directory or a Hugging Face Hub model ID such as `facebook/opt-1.3b`.