Thanks to the new LLM, GigaChat follows instructions better and can complete more complex tasks
At AI Journey, the international conference on artificial intelligence, Sber developers announced a new version of the GigaChat service. It is built on one of the most advanced Russian-language models, with 29 billion parameters.
Access to the new API will soon open first to Sber's business clients, who can use it to build their own solutions, and to members of the academic community, who can use it for research.
Thanks to the new LLM, GigaChat follows instructions better and can perform more complex tasks: the quality of summarizing, rewriting, and editing texts, as well as answering a wide range of questions, has improved significantly. The team compared the responses of the new and previous models and recorded an overall quality improvement of 23%. The announced model is also 25% better at handling facts than the previous version.
According to an internal evaluation on the MMLU (Massive Multitask Language Understanding) benchmark, the 29-billion-parameter model behind the new version of GigaChat outperforms LLaMA 2 34B, the most popular open counterpart.
Andrey Belevtsev, Senior Vice President and Head of the Technologies block at Sberbank, said:
Training the models that power GigaChat is a massive and complex computational undertaking, and we've never done anything like it before. The total number of computational operations was almost 6 times higher than for training the ruGPT-3 model with 13 billion parameters in 2021. We have also collected, and continue to develop, a unique dataset specifically for GigaChat. Hundreds of Sber employees are working on it, helping to improve the quality of answers across a variety of domains.