ChatGPT Steps Up Its Game with Advanced AI Features of Voice, Vision, and More

3 mins read September 26, 2023

OpenAI introduces voice feature for ChatGPT, allowing audio responses in multiple personas.
The company targets on-the-go users and personal assistant competition.
Users can also submit images and questions, expanding ChatGPT’s capabilities.

In a significant update, artificial intelligence (AI) startup OpenAI has added groundbreaking AI features to its ChatGPT app, allowing the chatbot to both hear and speak, as well as analyze images. Over the next two weeks, users will have the option to select from five distinct personas for the chatbot, such as “Juniper,” “Breeze,” and “Ember,” each offering a unique voice for audio responses. This development marks OpenAI’s latest effort to make conversations with AI-powered chatbots even more lifelike and engaging, catering to subscribers of its ChatGPT Plus service and enterprise users.

AI chatbot conversations become more human

OpenAI’s ChatGPT app, initially launched in May, already supported voice-to-text input. Now users will be able to select a voice persona for spoken replies, making conversations more dynamic and natural. The audio responses are meant to narrow the gap between human and AI interaction. OpenAI hopes the feature will attract users looking for on-the-go assistance, putting ChatGPT in direct competition with established personal assistants such as Google Assistant, Apple’s Siri, and Amazon’s Alexa.

Users can apply the feature to a variety of tasks, such as asking about the history of Disneyland while driving to the theme park or requesting a cocktail recipe while busy in the kitchen. During tests, ChatGPT demonstrated its storytelling capabilities by narrating a whimsical tale involving a starfish and a swede. However, while ChatGPT can generate song lyrics, it will not sing them.

The voices provided by ChatGPT may sound relatively human-like, albeit with a subtle robotic undertone. OpenAI collaborated with voice actors to develop the text-to-speech AI model that powers this feature, aiming to deliver a more engaging and convincing conversational experience.

Expanding capabilities with image recognition

In addition to the voice update, OpenAI also announced upcoming features for GPT-4, one of the advanced AI models behind ChatGPT. In the coming weeks, paid and enterprise users will gain access to an image recognition feature within the ChatGPT app and website. This feature allows users to submit an image along with a related question or request, expanding ChatGPT’s capabilities beyond text-based interactions.

For example, users can upload an image of pink sunglasses and ask the chatbot for fashion advice or outfit suggestions to complement the accessory. Alternatively, they can submit a picture of a challenging math problem and request assistance in solving it. The feature extends ChatGPT beyond text-only interaction, letting it work with visual information across a wide range of uses.

Celebrating ChatGPT’s enhanced AI features

OpenAI’s latest enhancements to ChatGPT mark a substantial step in the evolution of AI-driven conversational interfaces. Voice responses with distinct personas, combined with image analysis, reflect OpenAI’s aim of offering a more engaging and versatile conversational experience. ChatGPT may not yet sing, but its ability to hold lifelike spoken dialogues and answer questions about images strengthens its position among virtual assistants and conversational AI tools.



Aamir Sheikh

Aamir is a tech journalist with nearly six years of experience in the crypto and tech industries. He graduated from MAJ University with an MBA in Finance and Marketing. He now works with Cryptopolitan, where he reports on the latest developments in the cryptocurrency markets and price predictions.
