How to fine-tune a GPT model
Using OpenAI's native fine-tuning to customize how your AI talks to you
Large language models (LLMs) are fantastic at giving general responses to questions.
Me: ChatGPT, how many emperor penguins would I need to stack on each other to get to the moon?
ChatGPT: Emperor penguins are typically about 1.2 meters tall. The average distance from the Earth to the Moon is approximately 384,400 kilometers, or 384,400,000 meters. So, if you were to stack emperor penguins on top of each other to reach the moon, you would need approximately 384,400,000 / 1.2 = 320,333,333 penguins. However, this is purely hypothetical and not practically possible. Also, it's important to note that the distance to the moon can vary depending on its position in its orbit.
But what if I want it to respond in a certain way? What if, instead of giving me a verbose explanation of my (vital) emperor penguin problem, I just wanted it to give me the direct numerical answer and skip the chitchat?
On Tuesday, OpenAI opened up its in-house fine-tuning for its GPT-3.5 Turbo model. This lets users nudge their specific AI friend toward the responses, or response format, they desire or need. How does it work?
Below is a simplified explanation. For full details on fine-tuning, see OpenAI’s documentation.
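At a high level, the flow has three steps: write your example conversations to a JSONL file, upload that file to OpenAI, and start a fine-tuning job against it. Here is a rough sketch assuming OpenAI's Python SDK; the file name and example content are placeholders, and the upload and job-creation calls need a real API key, so they are shown commented out:

```python
import json

# Step 1: write training examples to a JSONL file (one JSON object per line).
# This single illustrative example mirrors the penguin scenario from the article.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You answer math questions with the number only."},
            {"role": "user", "content": "How many 1.2 m penguins stack up to 384,400 km?"},
            {"role": "assistant", "content": "320,333,333"},
        ]
    },
]

with open("penguin_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

# Steps 2 and 3: upload the file and start a fine-tuning job (requires an API key):
# import openai
# upload = openai.File.create(file=open("penguin_data.jsonl", "rb"), purpose="fine-tune")
# job = openai.FineTuningJob.create(training_file=upload.id, model="gpt-3.5-turbo")
```

Once the job finishes, you query your fine-tuned model by its returned model ID instead of the base model name.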
Crafting Your Training Dataset
The core of fine-tuning involves feeding a model (via OpenAI's API) numerous examples of how you want it to respond and act in conversation. This dataset should be a collection of examples that closely resemble the conversations and responses you desire from your fine-tuned model. Each example is a short conversation made of messages, and each message includes a role (either 'system', 'user', or 'assistant'), content, and an optional name.
system: This message tells the AI model the overall type of model you want it to be. Think of this as a definition of your desired fine-tuned personality.
user: Fill this with a hypothetical question that you feel best represents the prompts you expect to send your fine-tuned model.
assistant: This role is your chance to define the exact response you want your model to give when asked the question in your 'user' message. The goal of the dataset is to provide enough of these simulated responses to coax your fine-tuned model into responding in the manner you are suggesting.
For instance, to get our AI to respond to our "how many penguins does it take to get to the moon?" inquiries with only the numerical answer and skip the verbose reasoning, our training data could look something like this:
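A sketch of a few such lines in OpenAI's JSONL chat format (the persona name, heights, distances, and wording here are illustrative, and a real dataset needs more examples than this):

```jsonl
{"messages": [{"role": "system", "content": "PenguinBot answers math questions with the numeric answer only."}, {"role": "user", "content": "How many 1.2 m emperor penguins stacked on each other reach the Moon (384,400 km)?"}, {"role": "assistant", "content": "320,333,333"}]}
{"messages": [{"role": "system", "content": "PenguinBot answers math questions with the numeric answer only."}, {"role": "user", "content": "How many 30 cm rulers laid end to end span 1 km?"}, {"role": "assistant", "content": "3,334"}]}
```

Each line is one complete conversation: the system message sets the terse persona, the user message poses a representative question, and the assistant message shows exactly the bare-number reply we want the fine-tuned model to imitate.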