A new AI system called DiffSensei generates manga from text input while keeping the style and appearance of its characters consistent. The system was developed by researchers from Peking University, the Shanghai AI Laboratory, and Nanyang Technological University. To demonstrate DiffSensei's capabilities, the researchers created a fictional manga about the AI pioneers Geoffrey Hinton, Yann LeCun, and Yoshua Bengio. The story follows their quest to develop an AI model that surpasses the transformer architecture and depicts their challenges, self-doubt, and ultimate triumph, culminating in a Nobel Prize.
DiffSensei combines diffusion models with large language models to handle both the visual and narrative elements of manga creation. Generation proceeds in three steps: first the page layout is created, then the characters are drawn, and finally the dialogue is added. Multimodal models and LoRA (Low-Rank Adaptation) modules ensure that the characters look consistent from panel to panel. To train DiffSensei, the researchers created their own dataset, MangaZero, which contains over 43,000 manga pages and 427,000 individual panels from 48 different manga series. Each panel was annotated to mark the positions of the characters and the placement of the dialogue, details that, according to the team, are essential for the system to function properly.
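The per-panel annotations described above (character positions and dialogue placement) can be pictured as a small data structure. The following is only a minimal sketch: the names `Box`, `PanelAnnotation`, `character_boxes`, and `dialogue_boxes`, as well as the pixel-coordinate convention, are assumptions made for illustration and do not reflect MangaZero's actual format or DiffSensei's code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page pixel coordinates (assumed convention)."""
    x: int
    y: int
    width: int
    height: int

    def contains(self, other: "Box") -> bool:
        """Return True if `other` lies entirely inside this box."""
        return (
            other.x >= self.x
            and other.y >= self.y
            and other.x + other.width <= self.x + self.width
            and other.y + other.height <= self.y + self.height
        )


@dataclass
class PanelAnnotation:
    """Hypothetical per-panel record: where characters and dialogue sit."""
    panel: Box
    character_boxes: List[Box] = field(default_factory=list)
    dialogue_boxes: List[Box] = field(default_factory=list)

    def is_valid(self) -> bool:
        """Check that every annotated region falls inside the panel."""
        regions = self.character_boxes + self.dialogue_boxes
        return all(self.panel.contains(region) for region in regions)


# Example: one panel with a single character and a single speech bubble.
example = PanelAnnotation(
    panel=Box(0, 0, 400, 300),
    character_boxes=[Box(50, 60, 120, 200)],
    dialogue_boxes=[Box(220, 20, 150, 80)],
)
```

A sanity check such as `is_valid` shows one reason precise annotations matter: a layout step can only place characters and speech bubbles sensibly if their regions are known to fit inside the panel.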
The system is not yet perfect. Difficulties arise when the reference images of the characters are unclear, and similar-looking characters can merge in unexpected ways. Without specific character references, the artwork tends to look generic rather than following a particular manga style. Despite these limitations, the researchers believe that DiffSensei could streamline manga production in the future. The technology offers artists, publishers, and creatives a new tool to create personalized manga stories while maintaining control over characters and layouts.
The development of DiffSensei illustrates the rapid advancement of AI technology and its potential to revolutionize creative processes. Mindverse, a German company specializing in AI-powered content creation, also offers innovative solutions in this area. From text and image generation to the development of customized chatbots, voicebots, AI search engines, and knowledge systems – Mindverse supports companies in exploiting the full potential of AI. Developments in the field of AI, as exemplified by DiffSensei, underscore the importance of companies like Mindverse, which make these technologies accessible and usable.
Sources:
- https://the-decoder.com/diffsensei-ai-pioneers-hinton-lecun-and-bengio-star-in-fictional-manga-created-by-new-ai-system/
- https://jianzongwu.github.io/projects/diffsensei/
- https://the-decoder.com/
- https://www.ingendynamics.com/pioneers-of-the-ai-revolution/
- https://www.technologyreview.com/2023/05/02/1072528/geoffrey-hinton-google-why-scared-ai/
- https://www.youtube.com/watch?v=9RljVnT2zBY
- https://www.weforum.org/stories/2024/03/ai-pioneers-breakthroughs-whats-next/
- https://www.researchgate.net/publication/386964040_DiffSensei_Bridging_Multi-Modal_LLMs_and_Diffusion_Models_for_Customized_Manga_Generation
- https://mila.quebec/en/directory/yoshua-bengio
- https://archive.blogs.harvard.edu/toshietakahashi/2019/06/17/interview-with-yoshua-bengio/