The development of ever more powerful AI models is progressing rapidly. OpenAI, a leading company in AI research, recently introduced its new language model "o1". The model pursues an approach that relies not only on scaling but also on human-like reasoning during inference, a shift that could have far-reaching implications for the hardware market.
Until now, the development of large language models (LLMs) has focused primarily on scaling: larger models with more parameters, trained on huge amounts of data with enormous computing power, were expected to deliver continuous performance gains. However, experts such as Ilya Sutskever, co-founder of OpenAI and of Safe Superintelligence (SSI), see this strategy reaching its limits. The focus is therefore shifting toward making better use of available resources.
o1 relies on "Test-Time Compute," a technique that improves the model's performance during the inference phase, i.e., the generation of responses. Instead of simply increasing the computational effort and the amount of data, this method allows the model to think through problems step by step, similar to the human thought process. This makes the use of computing power more efficient, especially for complex tasks such as mathematical problems or programming tasks.
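To make the principle concrete, the sketch below shows one widely used test-time compute strategy, self-consistency sampling: the same model is queried several times and the most frequent answer wins, so extra inference compute substitutes for a larger model. This is an illustrative assumption, not OpenAI's disclosed mechanism; the `noisy_solver` stand-in and its accuracy figures are hypothetical.

```python
import random
from collections import Counter

def noisy_solver(correct_answer: int, accuracy: float = 0.4) -> int:
    """Hypothetical stand-in for a small model: returns the correct
    answer with probability `accuracy`, otherwise a nearby wrong one."""
    if random.random() < accuracy:
        return correct_answer
    return correct_answer + random.choice([-2, -1, 1, 2])

def majority_vote_answer(correct_answer: int, n_samples: int) -> int:
    """Spend more inference compute: sample the model n_samples times
    and return the most frequent answer (self-consistency)."""
    votes = Counter(noisy_solver(correct_answer) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

def empirical_accuracy(n_samples: int, trials: int = 2000) -> float:
    """Estimate how often majority voting recovers the right answer."""
    hits = sum(majority_vote_answer(42, n_samples) == 42 for _ in range(trials))
    return hits / trials

for n in (1, 5, 25, 125):
    print(f"{n:>3} samples -> accuracy ~ {empirical_accuracy(n):.2f}")
```

Accuracy rises with the number of samples even though the underlying "model" never changes; the only thing that grows is compute spent at inference time, which is precisely the trade-off that makes dedicated inference hardware more interesting.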
Noam Brown, a researcher at OpenAI, illustrated the effectiveness of this approach with an example from his earlier work on game-playing AI: a poker bot that was given 20 seconds to "think" during a hand achieved the same performance gain as scaling the model up by a factor of 100,000 and training it 100,000 times longer.
OpenAI's new approach could significantly impact the AI hardware market. To date, NVIDIA has dominated the market for AI chips with its powerful GPUs. A shift toward inference optimization, however, could open the door to new competitors and increase demand for specialized inference chips.
Other AI companies like Google DeepMind, Anthropic, and xAI are also working on similar techniques. This competition is driving innovation in the hardware sector. NVIDIA itself has recognized the changing dynamics in AI training and is focusing on optimizing inference with its Blackwell architecture.
o1 and similar models may mark the beginning of a new era in AI development. More efficient training methods and the growing importance of inference are changing the demands placed on hardware. This could lead to a more dynamic and innovative hardware market, with new opportunities for specialized providers and intensified competition among established players.