


The world's most powerful open source MoE model is here, with Chinese capabilities comparable to GPT-4 and a price of nearly one percent of GPT-4-Turbo's
May 07, 2024, 04:13 PM
Imagine an artificial intelligence model that not only surpasses traditional computing, but also delivers more efficient performance at a lower cost. This is not science fiction: DeepSeek-V2[1], the world's most powerful open source MoE model, is here.
DeepSeek-V2 is a powerful mixture-of-experts (MoE) language model characterized by economical training and efficient inference. It has 236B parameters in total, of which 21B are activated for each token. Compared with DeepSeek 67B, DeepSeek-V2 performs better while saving 42.5% of training costs, reducing the KV cache by 93.3%, and raising the maximum generation throughput to 5.76 times that of DeepSeek 67B.
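The idea that only 21B of the 236B parameters do any work for a given token can be illustrated with a minimal sketch of generic top-k expert routing. This is not DeepSeek-V2's actual router: the number of experts, the value of `k`, the score values, and the function names `softmax` and `route_top_k` are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Convert raw router scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights.

    Returns a list of (expert_index, weight) pairs. Every other expert is
    skipped for this token, which is why only a fraction of the total
    parameters is used per token.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

# Hypothetical router scores for one token over 8 experts (made-up values).
scores = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.9]
selected = route_top_k(scores, k=2)
print(selected)  # two (expert_index, weight) pairs; the weights sum to 1
```

In this toy setting only 2 of the 8 experts run for the token, so the compute per token depends on the selected experts rather than on the full parameter count.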
DeepSeek is a company exploring the nature of artificial general intelligence (AGI) and is committed to integrating research, engineering and business.
The comprehensive capabilities of DeepSeek-V2
On current mainstream large-model leaderboards, DeepSeek-V2 performs well:
- Chinese comprehensive ability (AlignBench) is the strongest among open source models, and in the same echelon as closed source models such as GPT-4-Turbo and Wenxin 4.0
- English comprehensive ability (MT-Bench) is in the first echelon: it is on par with the strongest open source model, LLaMA3-70B, and surpasses the strongest open source MoE model, Mixtral 8x22B
- Ranks at the forefront on knowledge, mathematics, reasoning, programming and other leaderboards
- Supports a 128K context window
New model structure
As the potential of AI is continually tapped, we can't help but ask: what is the key to advancing intelligence? DeepSeek-V2 gives an answer: the combination of an innovative architecture and cost-effectiveness.
"DeepSeek-V2 is an improved version. With 236B total parameters and 21B activated, it reaches the capability of a 70B~110B dense model, while its memory consumption is only 1/5 to 1/100 of that of models at the same level. On an 8-GPU H800 machine, it can process more than 100,000 input tokens per second and generate more than 50,000 output tokens per second. This is not only a leap in technology, but also a revolution in cost control."
With AI technology developing rapidly, the emergence of DeepSeek-V2 not only represents a technological breakthrough, but also heralds the popularization of intelligent applications. It lowers the threshold for AI and allows more companies and individuals to benefit from efficient intelligent services.
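To put the quoted figures in perspective, the short calculation below derives the share of parameters active per token and a rough per-GPU throughput. It is only a back-of-the-envelope sketch: the "more than" figures are treated as lower bounds, the even split across 8 GPUs is an assumption, and the variable names are illustrative.

```python
# Figures quoted in the article.
total_params_b = 236           # total parameters, in billions
active_params_b = 21           # parameters activated per token, in billions
gpus = 8                       # one 8-GPU H800 machine
input_tokens_per_s = 100_000   # "more than 100,000", used as a lower bound
output_tokens_per_s = 50_000   # "more than 50,000", used as a lower bound

# Share of the model that is active for any single token.
active_fraction = active_params_b / total_params_b

# Assumption: throughput divides evenly across the 8 GPUs.
input_per_gpu = input_tokens_per_s / gpus
output_per_gpu = output_tokens_per_s / gpus

print(f"active fraction per token: {active_fraction:.1%}")
print(f"input tokens/s per GPU (at least): {input_per_gpu:,.0f}")
print(f"output tokens/s per GPU (at least): {output_per_gpu:,.0f}")
```

Under these assumptions, roughly 9% of the parameters are active per token, and each GPU handles at least 12,500 input tokens and 6,250 output tokens per second.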
Chinese capability vs. price
In terms of Chinese capability, DeepSeek-V2 leads the world in the AlignBench ranking while providing a very competitive API price.
Both the model and the paper are open source
DeepSeek-V2 is not just a model; it is a key to a smarter world. It opens a new chapter in AI applications with lower cost and higher performance. Open-sourcing DeepSeek-V2 is the best proof of this belief: it will inspire more people to innovate and jointly advance the future of human intelligence.
- Model weights: https://huggingface.co/deepseek-ai
- Open source address: https://github.com/deepseek-ai/DeepSeek-V2
As AI continues to evolve, how do you think DeepSeek-V2 will change our world? Let's wait and see. If you are interested, you can visit chat.deepseek.com to try DeepSeek-V2 for yourself.
References
[1] DeepSeek-V2: https://ipnx.cn/link/b2651c9921723afdfd04ed61ec302a6b