Fine-tuning Tiny LLM on Custom Dataset

Running a large language model (LLM) like Llama 2 7B on your custom dataset can be expensive: you need a lot of compute to train the model and even more to serve fast inference. This is where Tiny LLMs come in. These models are small (1-2B parameters) and can be trained on a single GPU. Here, you'll learn how to fine-tune a Tiny LLM on a custom dataset of cryptocurrency news articles [2].

Our final model will be able to predict the subject and sentiment of a news article given its title and text. We'll use the Tiny Llama model [1], which has 1.1B parameters, and LoRA (Low-Rank Adaptation) to reduce the number of trainable parameters.
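Concretely, the setup might look something like the sketch below, which loads the model with Hugging Face transformers and attaches a LoRA adapter with peft. The checkpoint name and the LoRA hyperparameters here are illustrative assumptions, not values prescribed by this guide.

```python
# Minimal sketch: load Tiny Llama and wrap it with a LoRA adapter.
# The checkpoint name and LoRA hyperparameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

MODEL_NAME = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)

lora_config = LoraConfig(
    r=16,             # rank of the low-rank update matrices
    lora_alpha=32,    # scaling factor for the LoRA updates
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the 1.1B parameters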

Why Fine-tune a Tiny LLM?

Getting a Tiny LLM to perform well on your custom task requires far fewer resources than getting a large LLM to do the same. You might start with prompt engineering and other tricks, but those don't work well with Tiny LLMs. That said, you can get a Tiny LLM to perform well on your task with fine-tuning and a thousand good-quality examples.

An important benefit of fine-tuning a Tiny LLM is that you control the prompt and output format. In classical ML tasks, you have to convert your data into a format the model can understand. This is not the case with LLMs: you can feed your data as plain text and let the model learn the task. You can also get the model to generate output in whatever format you want (JSON, YAML, Markdown, etc.), which will likely save you a lot of extra tokens and computation.
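For example, here is one way you might serialize an article into a plain-text prompt with a JSON response. The instruction template and the label values are illustrative assumptions, not the exact format used by the dataset:

```python
import json

def format_example(title: str, text: str, subject: str, sentiment: str) -> str:
    """Build a plain-text training example with a JSON-formatted response.
    The template and labels below are illustrative choices."""
    output = json.dumps({"subject": subject, "sentiment": sentiment})
    return (
        "### Instruction: Predict the subject and sentiment of the article.\n"
        f"### Title: {title}\n"
        f"### Text: {text}\n"
        f"### Response: {output}"
    )

print(format_example(
    title="Bitcoin Breaks $40K",
    text="Bitcoin surged past $40,000 on Tuesday...",
    subject="bitcoin",
    sentiment="positive",
))
```

Because the model is trained to emit exactly this JSON schema, you can parse its predictions programmatically instead of post-processing free-form text.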

Choose Your Hardware (GPU)

Fine-tuning a 1.1B-parameter model with LoRA is light on hardware: a single GPU with 8-16 GB of VRAM (for example, a free T4 on Google Colab) should be enough.
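As a quick sanity check before training, a PyTorch snippet like this prints the visible GPU and how much memory it has:

```python
import torch

# Verify that a suitable GPU is visible before starting training.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GB")
else:
    print("No CUDA GPU found -- training will be very slow on CPU.")
```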

References

[1] Tiny Llama
[2] Crypto News + Dataset