December 9, 2024

Ruliad Releases Open-Source DeepThought-8B Reasoning Model


Ruliad's DeepThought-8B: A Glimpse into the Thought Process of AI

The AI startup Ruliad has introduced a new language model, DeepThought-8B, that stands out for its transparency. Unlike many language models, whose decision-making remains hidden, DeepThought-8B exposes its reasoning step by step as structured output in JSON format. This transparency is meant to help users understand how the model reaches its conclusions and to build trust in its results.

Functionality and Architecture

DeepThought-8B is based on the language model Llama-3.1 8B and can be run locally on graphics cards with at least 16 GB of memory. The model solves tasks through a chain of thought steps, which Ruliad calls a "Reasoning Chain." Each step of this chain is documented in JSON format, similar to OpenAI's "Structured Outputs." An example of the output:
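The 16 GB figure is consistent with a back-of-the-envelope estimate for an 8-billion-parameter model in 16-bit precision. A quick sketch of the arithmetic (the real requirement also depends on quantization, KV cache, and runtime overhead):

```python
def weight_memory_gib(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate memory needed just for the model weights, in GiB."""
    return num_params * bytes_per_param / 2**30

# 8B parameters at fp16/bf16 (2 bytes each) come to roughly 14.9 GiB
# for the weights alone, which is why a GPU with at least 16 GB
# of memory is recommended.
print(f"{weight_memory_gib(8_000_000_000):.1f} GiB")
```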

{
  "step": 1,
  "type": "problem_understanding",
  "thought": "The user is asking how many Rs there are in the word 'strawberry'"
}
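Because each step is plain JSON, the chain can be parsed and inspected programmatically. A minimal sketch, assuming the step/type/thought fields from the example above and a surrounding list structure (the second step shown here is an illustrative assumption):

```python
import json

# A hypothetical reasoning chain as the model might emit it,
# following the step/type/thought schema from the example above.
raw_chain = '''[
  {"step": 1, "type": "problem_understanding",
   "thought": "The user is asking how many Rs there are in the word 'strawberry'"},
  {"step": 2, "type": "solution",
   "thought": "Counting: s-t-r-a-w-b-e-r-r-y contains 3 Rs"}
]'''

chain = json.loads(raw_chain)
for step in chain:
    # Print each documented thought step in order.
    print(f"[{step['step']}] {step['type']}: {step['thought']}")
```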

This structured representation makes the model's thought process easy to follow and analyze. Another key feature of DeepThought-8B is the ability to influence the reasoning chains through so-called "Injections," of which there are three types: "Scripted," "Max Routing," and "Thought Routing." Scripted injections predefine specific reasoning points; Max Routing lets users cap the number of thought steps and control how the chain concludes; Thought Routing defines if/then rules that are applied dynamically during the chat.

Test-time Compute Scaling

DeepThought-8B uses "Test-time Compute Scaling" to adjust the depth of analysis to the complexity of the task. This means that the model can use more computing power during inference to solve more complex problems. Similar approaches are also used in other models like OpenAI's o1. However, the exact implementation and training methods can vary.
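The core idea of test-time compute scaling can be sketched as granting harder inputs a larger reasoning-step budget before the model must commit to an answer. The complexity heuristic and numbers below are illustrative assumptions, not the model's actual mechanism:

```python
def step_budget(prompt: str, base: int = 4, max_steps: int = 32) -> int:
    """Toy heuristic: grant a larger reasoning budget to longer,
    presumably more complex prompts (an illustrative assumption)."""
    complexity = len(prompt.split())  # crude word-count proxy for difficulty
    return min(max_steps, base + complexity // 8)

# A short question gets only the base budget, while a long multi-part
# task is allowed more inference-time compute (more thought steps).
print(step_budget("How many Rs are in strawberry?"))
print(step_budget("Prove that the sum of the first n odd numbers " * 6))
```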

Performance and Benchmarks

Ruliad emphasizes the competitive results of DeepThought-8B in benchmarks for reasoning, mathematics, and programming, despite its relatively small size of 8 billion parameters. In some benchmarks, it achieves similar performance to significantly larger models like Qwen-2-72B and Llama-3.1-70B. However, it lags behind models like Claude 3.5 Sonnet, GPT-4o, and o1-mini. Ruliad also acknowledges that the model still has potential for improvement in complex mathematical problems, long contexts, and edge cases.

Availability and Outlook

The model weights of DeepThought-8B are open source and available on Hugging Face. A developer API is planned and is currently in beta. In the meantime, users can test DeepThought-8B for free after logging in with a Google account at chat.ruliad.co. Ruliad plans regular updates to the model based on community feedback and the results of further research.

The release of DeepThought-8B joins a series of reasoning models introduced recently, including DeepSeek-R1-Lite-Preview and Qwen's QwQ. The focus on transparency and controllability of the thought process could become an important aspect in the development of future AI models.
