ReLoRa: Pre-train a Large Language Model on Your GPU

In 2021, Hu et al. proposed low-rank adaptation (LoRa) for LLMs. This method significantly reduces the cost of fine-tuning large language models (LLMs): the LLM's original (full-rank) weights stay frozen, and only a small number of added parameters, organized as low-rank matrices, are trained.
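To make the idea concrete, here is a minimal sketch (in PyTorch) of a LoRa-style linear layer. The class name `LoRALinear` and the hyperparameters `r` and `alpha` are illustrative choices, not the reference implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update (illustrative sketch)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pre-trained (full-rank) weights
        # Only these two small matrices receive gradients:
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # Frozen full-rank path + trainable low-rank correction
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```

For a 4096x4096 layer, the frozen weight has ~16.8M parameters while the rank-8 factors add only ~65K trainable ones, which is where the fine-tuning savings come from.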

With LoRa, we still need an existing pre-trained model to fine-tune: because of the low-rank restriction, it cannot pre-train a good LLM from scratch. This leaves pre-training unaffordable for most individuals and organizations.

To reduce this cost, Lialin et al. (2023) proposed ReLoRa, a modification of LoRa that makes it possible to pre-train LLMs from scratch.
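The core trick in ReLoRa is to periodically merge the learned low-rank update back into the frozen weights and restart the adapters, so that the accumulated update can exceed the rank of any single LoRa step. Below is a hedged sketch of such a merge-and-restart step, reusing the hypothetical `LoRALinear` class from the earlier example; it is not the authors' implementation:

```python
def relora_merge_and_reinit(layer: LoRALinear) -> None:
    """Fold the current low-rank update into the base weights, then restart the adapter.

    Illustrative helper (assumed names): repeating this step during pre-training
    lets the total update to the base weights grow beyond rank r.
    """
    with torch.no_grad():
        delta = (layer.lora_B @ layer.lora_A) * layer.scaling  # (out_features, in_features)
        layer.base.weight += delta          # merge the low-rank update into the frozen weights
        nn.init.normal_(layer.lora_A, std=0.01)  # re-initialize the low-rank factors
        nn.init.zeros_(layer.lora_B)
```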
