I recently had the opportunity to test the Meta Ray-Ban AI glasses, and the experience was fascinating. During a demo at Meta's New York headquarters, I tried an experimental feature that showcases the glasses' ability to interpret the world around me using generative AI.
The feature, rolling out to Meta's second-generation Ray-Ban glasses, lets users point the on-glasses cameras at something and have generative AI analyze the image and answer questions about it. Originally slated for a 2024 launch, the AI features are arriving early, albeit in an early-access beta. The glasses, along with a new update that adds Bing-powered search, are rapidly gaining additional capabilities.
During my demo, I put the glasses to a simple test. I placed four tea packets on a table, with their caffeine labels intentionally obscured. Using a voice command, "Hey, Meta, look at this and tell me which of these teas is caffeine-free," I waited for the AI's response. The glasses made a subtle clicking sound, and Meta's AI voice promptly told me that the chamomile tea was likely caffeine-free. The glasses were reading the labels and making a judgment through generative AI.
The process is reminiscent of Google Lens and other on-phone tools that combine cameras and AI. However, the Meta glasses offer an accessible way to invoke AI for real-time identification of objects in the environment. This impressed me, and I’m eager to explore its potential further.
Multimodal AI: Current Functionality and Limits
As of now, the feature has some limitations. It relies on taking a photo of the subject, which the AI then analyzes. After you make a voice request, a shutter click is audible, followed by a few seconds of processing before the AI responds. The voice prompts follow a specific format: they start with "Hey, Meta," followed by "look" or "look at this," which triggers the photo capture.
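To make that flow concrete, here is a minimal sketch of the capture-and-analyze loop in Python. Every name in it is a hypothetical stub written for illustration; none of it comes from Meta's software, which isn't exposed as a public API.

    # Hypothetical sketch of the "Hey, Meta, look ..." flow; all names are illustrative stubs.
    import time
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class LookResult:
        photo_path: str  # image captured by the on-glasses camera
        answer: str      # text spoken back by the assistant

    def capture_photo() -> str:
        """Stand-in for the shutter click the wearer hears."""
        return "/tmp/glasses_capture.jpg"

    def analyze_image(photo_path: str, question: str) -> str:
        """Stand-in for sending the photo plus the spoken question to a multimodal model."""
        time.sleep(2)  # represents the few seconds of processing before the AI responds
        return "The chamomile tea is likely caffeine-free."

    def handle_voice_command(transcript: str) -> Optional[LookResult]:
        # Only prompts that start with the wake phrase and the "look" trigger are handled.
        if not transcript.lower().startswith("hey, meta, look"):
            return None
        photo = capture_photo()                    # the audible shutter click
        answer = analyze_image(photo, transcript)  # generative AI reads the frame
        # Both the photo and the answer are kept, mirroring the record in the paired phone app.
        return LookResult(photo_path=photo, answer=answer)

    if __name__ == "__main__":
        result = handle_voice_command(
            "Hey, Meta, look at this and tell me which of these teas is caffeine-free"
        )
        if result:
            print(result.answer)

The point of the sketch is the shape of the interaction, not the implementation: a rigid wake phrase, a single still photo, a round trip to a model, and a saved question-and-answer pair.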
Each AI response and the corresponding photo are stored in the Meta View phone app paired with the glasses, providing a visual and written record for later reference, a bit like memory-jogging notes. The glasses could eventually serve as a head-worn Google search, surfacing information while you shop or explore.
In terms of accuracy, the glasses were a mixed bag. They correctly identified some items but occasionally gave inaccurate information or hallucinated details outright. Even so, the capabilities showed potential for assistive purposes, offering insights into the user's surroundings.
It's important to note that this early launch aims to uncover bugs and refine how the on-glasses AI works. The voice prompts, which currently lean heavily on "Hey, Meta, look," may evolve. For now there's a brief pause while the image is analyzed, but Meta envisions a future where low-power sensors trigger AI awareness without any voice prompt at all.
The Future of Wearable AI
Meta's approach, which it calls "multimodal AI," combines camera input and voice chat for a more capable assistant. The company plans to fold other kinds of input, including sensory data, into future iterations of the AI. The Qualcomm AI-focused chipset in Meta's new Ray-Bans is positioned to handle these additional functions.
This early-access beta does raise privacy concerns around the anonymized query data it collects, and Meta says the final release will offer more granular controls over data sharing. As wearable AI products continue to take shape, Meta's glasses offer an early glimpse of assistive awareness and of where this emerging category is headed.
Editors’ note: This story was updated on 12/13 to clarify how the voice prompts work and make a correction. We said “take a look at this” in our original story, but the actual working phrases are “look” or “look at this.”