Feeding GPT-3.5 Turbo your own data and thereby customizing the language model for your own purposes: this kind of fine-tuning of the AI model is now possible, OpenAI's developers announced in a blog post. Fine-tuning for the even more powerful successor GPT-4 is expected to follow in the fall. The popular AI tool ChatGPT from the same company is itself based on fine-tuned versions of GPT-3.5 (free version) and GPT-4 (paid Plus subscription).
By feeding GPT-3.5 Turbo your own data, you can raise the language model's performance "for specific, limited tasks" above that of the base model GPT-4, OpenAI promises. The company also assures that data companies or individuals send via the fine-tuning API remains the property of the customer and is not used by OpenAI or anyone else to train models.
Fixed response form, same format, consistent language style
The closed beta phase showed that fine-tuning is primarily suited to three core goals. By "improved controllability" the provider means the ability to give the model more precise specifications for its output from the outset: if desired, the language model answers more briefly, or always in the same language, for example in German, even if the prompt is written in another language. Fixed output formats, such as those required when the language model is embedded in other applications, are also possible with fine-tuned GPT-3.5 Turbo models. Finally, OpenAI customers can adjust the tone of the model's responses, which is mainly of interest to companies that want to present a "recognizable brand language."
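Such custom behavior is trained from example conversations: the fine-tuning API expects a JSONL file in the chat format, with one example conversation per line and a fixed system message anchoring the desired tone and format. A minimal sketch in Python; the company name, file name, and example texts are made-up illustrations:

```python
import json

# Each training example is one chat exchange. A fixed system message in
# every example teaches the model a consistent style and output format.
examples = [
    {
        "messages": [
            {
                "role": "system",
                "content": "You are the support bot of ExampleCorp. "
                           "Always answer in German, in at most two sentences.",
            },
            {"role": "user", "content": "How do I reset my password?"},
            {
                "role": "assistant",
                "content": "Öffnen Sie die Anmeldeseite und klicken Sie auf "
                           "\"Passwort vergessen\". Sie erhalten dann einen Link per E-Mail.",
            },
        ]
    },
    # ... more examples in the same shape ...
]

# Write one JSON object per line (JSONL), as the upload endpoint expects.
with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")
```

The resulting file would then be uploaded via OpenAI's files endpoint and referenced when creating a fine-tuning job.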
Another advantage is that individual prompts can be shorter because they no longer have to contain all the instructions; testers were able to shorten their prompts by up to 90 percent. At the same time, the model handles 4,000 tokens, twice as many as earlier fine-tunable OpenAI models. Among others, GPT-3, on which the original ChatGPT version was based, could already be customized.
OpenAI's rules still apply
However, OpenAI does not give its customers completely free rein: to preserve the safety features built into the standard model, the custom training data is first passed "through (a) moderation API and a GPT-4-supported moderation system". The provider wants to prevent the use of "unsafe training data" that could undermine the model's built-in guardrails.
There are additional fees for model training: $0.008 per 1,000 tokens; using the fine-tuned model then costs $0.012 per 1,000 input tokens and $0.016 per 1,000 output tokens. The blog post does not yet say when the provider will open GPT-4 for fine-tuning as well.
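For a rough sense of scale, the published rates can be plugged into a small calculation; the token counts below are made-up assumptions, not figures from OpenAI:

```python
# Published rates in dollars per 1,000 tokens.
TRAINING_RATE = 0.008
INPUT_RATE = 0.012
OUTPUT_RATE = 0.016

def fine_tune_cost(training_tokens: int) -> float:
    """One-off cost of a training run."""
    return training_tokens / 1000 * TRAINING_RATE

def usage_cost(input_tokens: int, output_tokens: int) -> float:
    """Per-request cost of using the fine-tuned model."""
    return input_tokens / 1000 * INPUT_RATE + output_tokens / 1000 * OUTPUT_RATE

# Hypothetical example: 100,000 training tokens, then one request
# with 500 input tokens and 250 output tokens.
print(round(fine_tune_cost(100_000), 4))  # about $0.80 for the training run
print(round(usage_cost(500, 250), 4))     # about $0.01 per request
```

Note that shorter prompts, as enabled by fine-tuning, directly reduce the input-token portion of each request.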