OpenAI has introduced a new method for fine-tuning its AI models: Reinforcement Fine-Tuning (RFT). The technique goes beyond conventional supervised fine-tuning and enables the development of highly specialized AI models for complex tasks in specific domains.
Previous fine-tuning methods focused on teaching models to imitate the style and tone of training data. RFT, on the other hand, allows models to develop new ways of problem-solving. The process begins by presenting a problem to the model. The model is given time to develop a solution. Then, an evaluation system assesses the answer and reinforces successful thought patterns while weakening faulty ones.
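The loop described above (pose a problem, let the model answer, grade the answer, reinforce what worked) can be illustrated with a deliberately tiny toy. The sketch below is not OpenAI's RFT implementation or API. The two "strategies", the `grade` function, and the update rule with a fixed baseline of 0.5 are all assumptions chosen only to show how a grader's score can shift a model's preferences.

```python
import math
import random

def grade(answer: int, reference: int) -> float:
    """Hypothetical grader: 1.0 for a correct answer, 0.0 otherwise."""
    return 1.0 if answer == reference else 0.0

# Two hypothetical "thought patterns" the toy model can choose between
# when asked to add two numbers.
STRATEGIES = {
    "careful": lambda a, b: a + b,      # always correct
    "sloppy": lambda a, b: a + b + 1,   # always off by one
}

def softmax(prefs: dict) -> dict:
    """Turn preference scores into a probability for each strategy."""
    top = max(prefs.values())
    exps = {name: math.exp(value - top) for name, value in prefs.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}

def train(steps: int = 200, lr: float = 0.5, seed: int = 0) -> dict:
    """Run the present / answer / grade / reinforce loop on toy problems."""
    rng = random.Random(seed)
    prefs = {name: 0.0 for name in STRATEGIES}
    for _ in range(steps):
        # 1. Present a problem.
        a, b = rng.randint(0, 9), rng.randint(0, 9)
        # 2. The model picks a strategy and produces an answer.
        probs = softmax(prefs)
        names = list(probs)
        choice = rng.choices(names, weights=[probs[n] for n in names])[0]
        answer = STRATEGIES[choice](a, b)
        # 3. The grader scores the answer.
        reward = grade(answer, a + b)
        # 4. Policy-gradient-style update: strengthen the chosen strategy
        #    if it beat the baseline, weaken it otherwise.
        advantage = reward - 0.5  # assumed fixed baseline
        for name in prefs:
            chosen = 1.0 if name == choice else 0.0
            prefs[name] += lr * advantage * (chosen - probs[name])
    return softmax(prefs)

if __name__ == "__main__":
    print(train())
```

After training, almost all of the probability mass sits on the "careful" strategy. A real RFT run adjusts the weights of a language model rather than two numbers, but the feedback signal plays the same role.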
This approach is particularly well suited to specialized fields that require deep expertise, such as law, finance, engineering, and insurance. As one example, OpenAI highlights its collaboration with Thomson Reuters, in which the compact o1-mini model was trained to act as a legal assistant.
Another example comes from Justin Reese, a bioinformatician at Berkeley Lab, who used RFT in research on rare genetic diseases. He trained the system on data from hundreds of scientific publications that list symptoms together with their associated genes. Reese reports that the o1-mini model trained with RFT outperformed the standard o1 model on this task, despite being smaller and more cost-effective. Particularly noteworthy is the model's ability to explain its predictions.
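For a task like the rare-disease example above, the evaluation system needs a rule for turning a prediction into a score. The following is a minimal sketch of one possible rule, assuming the model returns a ranked list of candidate genes and earns more credit the higher the correct gene appears. The function name, the reciprocal-rank scoring, and the placeholder symptom and gene labels are hypothetical and not taken from OpenAI's program or from the Berkeley Lab data.

```python
def grade_ranked_genes(predicted: list[str], correct_gene: str) -> float:
    """Score a ranked gene list: 1/rank of the correct gene, 0.0 if absent."""
    for rank, gene in enumerate(predicted, start=1):
        if gene == correct_gene:
            return 1.0 / rank
    return 0.0

# Hypothetical training example with placeholder values.
example = {
    "symptoms": ["SYMPTOM_1", "SYMPTOM_2"],
    "correct_gene": "GENE_A",
}

# Hypothetical model output: candidate genes, most likely first.
model_output = ["GENE_B", "GENE_A", "GENE_C"]

score = grade_ranked_genes(model_output, example["correct_gene"])
print(score)  # correct gene is ranked second, so the score is 0.5
```

Giving partial credit instead of a plain right-or-wrong verdict gives the training process a finer-grained signal: a prediction that ranks the correct gene second is rewarded more than one that misses it entirely.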
Another example is the application of RFT in the legal field. In collaboration with Thomson Reuters, OpenAI trained an AI model that functions as a legal assistant. This model can analyze complex legal texts and extract relevant information, which can significantly increase the efficiency of legal professionals.
OpenAI offers organizations the opportunity to participate in the Reinforcement Fine-Tuning Research Program. This program is aimed at organizations working on complex tasks that could benefit from AI support. Participants gain access to the RFT API and can contribute feedback to improve the technology before it is publicly available. OpenAI plans to make RFT more broadly available in early 2025.
With RFT, OpenAI is taking an important step towards the specialization of AI models. The ability to train models on complex tasks with few examples opens up new possibilities for various industries. By combining reinforcement learning and fine-tuning, AI models can develop a deep understanding of specific domains, thus becoming valuable tools for experts. The future development and application of RFT will significantly influence how AI is used in specialized fields.
As a German provider of AI-powered content solutions, Mindverse is following developments in AI model specialization with great interest. The possibilities opened up by technologies like RFT are promising and could further advance the development of customized AI solutions for companies and organizations. From chatbots and voicebots to AI search engines, knowledge systems, and custom solutions, Mindverse supports companies in harnessing the full potential of AI.
Sources:
- https://the-decoder.com/openai-unveils-reinforcement-fine-tuning-to-build-specialized-ai-models-for-complex-domains/
- https://openai.com/form/rft-research-program/
- https://www.maginative.com/article/openai-introduces-reinforcement-fine-tuning-to-build-domain-specific-expert-ai-models/
- https://www.tomsguide.com/ai/chatgpt/openai-just-got-a-major-upgrade-with-world-changing-potential-heres-how-it-works
- https://www.aiixx.ai/blog/openai-launches-program-to-create-hyper-specialized-ai-models-through-reinforcement-fine-tuning
- https://www.geeky-gadgets.com/openai-reinforcement-fine-tuning-rft/
- https://www.youtube.com/watch?v=sL5eqm5d5F4
- https://www.reddit.com/r/TheDecoder/comments/1h8q9l2/openai_unveils_reinforcement_finetuning_to_build/
- https://www.linkedin.com/posts/the-decoder-en_openai-unveils-reinforcement-fine-tuning-activity-7271117114877296640-S11F
- https://medium.com/@smartwork.ai.tools/openais-reinforcement-learning-fine-tuning-transforming-workflows-51a69365a74c