# DeepSeek Goes Viral
DeepSeek, an artificial intelligence startup based in Hangzhou, China, released its large language model DeepSeek-V3 at the end of December 2024, drawing global attention across the AI industry. The model has 671 billion parameters and was trained in roughly two months at a cost of $5.58 million, a fraction of what other tech giants have spent on comparable models.
DeepSeek-V3 performs strongly among open-source models and is comparable to the most advanced models in the world. The company optimized its training process to cut costs, using approximately 2.78 million GPU-hours on Nvidia H800 chips, a variant Nvidia designed for the Chinese market to comply with US export controls. This suggests Chinese AI companies are making significant progress with the advanced semiconductors needed to train AI, despite restrictions imposed by the United States.
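Taken together, the two reported figures imply a cost of roughly $2 per H800 GPU-hour. A quick back-of-the-envelope check (note: the per-hour rate is derived here from the article's numbers, not stated in the article itself):

```python
# Back-of-the-envelope check of DeepSeek-V3's reported training cost.
# Both inputs come from the article; the hourly rate is derived, not reported.
TOTAL_COST_USD = 5.58e6  # reported training cost in US dollars
GPU_HOURS = 2.78e6       # reported Nvidia H800 GPU-hours

implied_rate = TOTAL_COST_USD / GPU_HOURS
print(f"Implied cost per H800 GPU-hour: ${implied_rate:.2f}")
# → Implied cost per H800 GPU-hour: $2.01
```

That rate is in line with typical market rental prices for data-center GPUs, which is why the headline figure, while striking, is arithmetically plausible.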
DeepSeek's success has raised concerns in the US tech industry, triggering a sharp drop in the stock prices of Nvidia and other technology companies. Experts believe DeepSeek achieved high performance at far lower cost than its American competitors by building on open-source technology and using efficient training methods.
In addition, DeepSeek has publicly released the model's code along with detailed technical documentation, allowing researchers and developers worldwide to study and build on the technology. This level of transparency contrasts sharply with the more closed approach of leading American AI companies and may change how technology companies develop models in the future.