Artificial intelligence continues to evolve toward more natural human interaction. Hidden code strings in ChatGPT's latest mobile update hint at a new development: video capture functionality that could let users record clips and ask questions aloud for quicker, more intuitive responses.
Feature Discovery
App researchers recently uncovered code strings containing the message: "Now you can capture videos and ask your question out loud for a faster answer." Although the feature is not yet active, the discovery strongly suggests that OpenAI is preparing to roll out multimodal video support within ChatGPT.
Video input capability would mark a significant advancement in AI accessibility. Users could record math problems on paper for instant solutions, show malfunctioning devices for troubleshooting guidance, capture physical symptoms for preliminary health insights, or share quick clips for AI-assisted content development. This represents a shift from traditional text-based interaction to rich, visual communication that mirrors how humans naturally share information.
Industry Context
OpenAI's move reflects the broader push toward multimodal AI systems across the tech landscape. While rival assistants such as Google's Gemini and Anthropic's Claude have explored similar capabilities, implementing video input in a mainstream product like ChatGPT could give OpenAI a competitive edge in the rapidly evolving AI assistant market.
These code discoveries don't guarantee immediate availability, but they signal OpenAI's strategic direction. Should this feature launch, ChatGPT would add video to the inputs it already handles, moving closer to a comprehensive multimodal platform capable of processing text, images, audio, and video seamlessly.