Alibaba has released a new AI model as part of its Qwen series. The tech giant says the model can process video, audio, images, and text, and is efficient enough to run directly on laptops and mobile phones.
The company says the new model, available on GitHub and Hugging Face, can power AI agents, including ones that help visually impaired users navigate their surroundings through real-time audio descriptions.
Alibaba has been moving quickly with its releases and is apparently going all in on AI in 2025. Just days after DeepSeek's release, the e-commerce giant shipped a new version of its Qwen model, and earlier in March it released an updated version of Quark, its AI assistant app.
Alibaba is not the only company working on multimodal AI. Competitors such as OpenAI and Alphabet Inc.’s Google have also introduced generative AI tools that can handle different types of input like text and audio. On Tuesday, OpenAI added advanced image generation features to ChatGPT, further expanding its capabilities.
The company has stated that it plans to invest more in its AI and cloud computing infrastructure than it has over the past decade. Alibaba aims to become a key partner for companies that develop and deploy AI in practical settings as models grow more advanced and demand more computing power.
Meanwhile, low-cost AI services from China are challenging the higher-priced offerings from major US companies, putting pressure on their business models. However, not everyone is convinced that these releases match or surpass cutting-edge Western technology.