Fine-tuning Tiny LLM on Custom Dataset

Running a large language model (LLM) like Llama 2 7B on your custom dataset can be expensive: you need a lot of compute to train the model and even more to serve fast inference. This is where Tiny LLMs come in. These models are small (1-2B parameters) and can be trained on a single GPU. Here, you'll learn how to fine-tune a Tiny LLM on a custom dataset of cryptocurrency news articles [2].

Our final model will be able to predict the subject and sentiment of a news article given its title and text. We'll use the Tiny Llama model [1], which has 1.1B parameters, and LoRA (Low-Rank Adaptation) to reduce the number of trainable parameters.
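Concretely, the setup might look something like the sketch below, which loads the model with Hugging Face transformers and attaches a LoRA adapter with peft. The checkpoint name and the LoRA hyperparameters here are illustrative assumptions, not values prescribed by this guide.

```python
# Minimal sketch: load Tiny Llama and wrap it with a LoRA adapter.
# The checkpoint name and LoRA hyperparameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

MODEL_NAME = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)

lora_config = LoraConfig(
    r=16,             # rank of the low-rank update matrices
    lora_alpha=32,    # scaling factor for the LoRA updates
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the 1.1B parameters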

Why Fine-tune a Tiny LLM?

Getting a Tiny LLM to perform well on your custom task requires far fewer resources than getting a large LLM to do the same. You might start with prompt engineering and other tricks, but those don't work well with Tiny LLMs. That said, you can get a Tiny LLM to perform well on your task with fine-tuning and a thousand good-quality examples.

An important benefit of fine-tuning a Tiny LLM is that you control the prompt and output format. In classical ML tasks, you have to convert your data into a format the model can understand. This is not the case with LLMs: you can feed your data as plain text and let the model learn the task. You can also get the model to generate output in whatever format you want (JSON, YAML, Markdown, etc.), which will likely save you a lot of extra tokens and computation.
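For example, here is one way you might serialize an article into a plain-text prompt with a JSON response. The instruction template and the label values are illustrative assumptions, not the exact format used by the dataset:

```python
import json

def format_example(title: str, text: str, subject: str, sentiment: str) -> str:
    """Build a plain-text training example with a JSON-formatted response.
    The template and labels below are illustrative choices."""
    output = json.dumps({"subject": subject, "sentiment": sentiment})
    return (
        "### Instruction: Predict the subject and sentiment of the article.\n"
        f"### Title: {title}\n"
        f"### Text: {text}\n"
        f"### Response: {output}"
    )

print(format_example(
    title="Bitcoin Breaks $40K",
    text="Bitcoin surged past $40,000 on Tuesday...",
    subject="bitcoin",
    sentiment="positive",
))
```

Because the model is trained to emit exactly this JSON schema, you can parse its predictions programmatically instead of post-processing free-form text.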

Choose Your Hardware (GPU)

Fine-tuning a 1.1B-parameter model with LoRA is light on hardware: a single GPU with 8-16 GB of VRAM (for example, a free T4 on Google Colab) should be enough.
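As a quick sanity check before training, a PyTorch snippet like this prints the visible GPU and how much memory it has:

```python
import torch

# Verify that a suitable GPU is visible before starting training.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GB")
else:
    print("No CUDA GPU found -- training will be very slow on CPU.")
```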

References

[1] Tiny Llama
[2] Crypto News + Dataset