OpenAI has launched real-time video capabilities in ChatGPT, adding visual understanding to its Advanced Voice Mode. The feature, which began rolling out on December 12, lets users show the AI what their camera sees and receive contextual assistance. It is available to ChatGPT Plus, Team, and Pro subscribers and marks a notable advance in the AI’s conversational abilities.
OpenAI enhances ChatGPT with real-time video capabilities
Advanced Voice Mode now includes real-time video analysis and screen sharing. Users can point their phones at objects for immediate responses or share their screens for detailed explanations of settings or problems. The addition builds on the mode’s existing voice capabilities and makes conversations more interactive. OpenAI demonstrated the feature during a livestream, showing it holding a casual conversation and offering insights based on visual input.
The rollout began on December 12 and will continue over the following week. European users and ChatGPT Enterprise and Edu subscribers will have to wait longer; they are due to receive access early next year. The expansion reflects OpenAI’s focus on improving how users interact with AI across its suite of products.
Integrations with iOS 18.2
In a parallel development, Apple recently introduced iOS 18.2, which incorporates several ChatGPT features across Siri, Writing Tools, and Visual Intelligence. The integration with Siri allows the voice assistant to recognize queries that fall outside its range and redirect them to ChatGPT. Users will be notified and must approve this action before it proceeds.
On iPhone 16 devices, Visual Intelligence lets users point the camera at objects or scenes and retrieve information via ChatGPT or Google. Writing Tools also gains a new “Compose” tool that creates content from scratch using ChatGPT. These features emphasize utility and user control, and they are subject to ChatGPT’s usage limits.
Updates from the “12 Days of OpenAI”
OpenAI is running a campaign called “12 Days of OpenAI,” which began on December 5 and features daily livestreams revealing new features or products. CEO Sam Altman described the campaign as a mix of significant updates and minor enhancements. Among the notable announcements was a new Santa voice for Advanced Voice Mode, which users can activate via a snowflake icon.
The campaign also unveiled Sora, OpenAI’s new video model, now available to ChatGPT Pro and Plus users. The model can generate video from text prompts or from existing video, broadening the creative options available to users. Other notable updates included the release of Canvas, previously a beta feature, to all web users; it provides a dedicated workspace for writing and coding projects.
Looking ahead, OpenAI plans to expand its offerings, including a full version of its o1 language model intended to further improve reasoning. The company also aims to make Reinforcement Fine-Tuning more widely available and is currently seeking applications from research institutes and universities that want to fine-tune AI models for specific tasks.