Alibaba has released a new AI model as part of its Qwen series. The tech giant says the model can process video, audio, images, and text, and is efficient enough to run directly on laptops and mobile phones.
The company says the new model, available on GitHub and Hugging Face, can power AI agents, including ones that help visually impaired users navigate their surroundings through real-time audio descriptions.
Alibaba has been moving quickly with its releases and is apparently going all in on AI in 2025. Just days after DeepSeek's release, the e-commerce giant shipped a new version of its Qwen model, and earlier in March it released an updated version of Quark, its AI assistant app.
Alibaba is not the only company working on multimodal AI. Competitors such as OpenAI and Alphabet Inc.’s Google have also introduced generative AI tools that can handle different types of input like text and audio. On Tuesday, OpenAI added advanced image generation features to ChatGPT, further expanding its capabilities.
The company has stated that it plans to invest more in its AI and cloud computing infrastructure than it has over the past decade. Alibaba aims to become a key partner for companies that develop and deploy AI in practical settings as models grow more advanced and demand more computing power.
Meanwhile, low-cost AI services from China are challenging the higher-priced offerings from major US companies, putting pressure on their business models. However, not everyone is convinced that these releases match or surpass cutting-edge Western technology.