Google showed Project Astra – an AI assistant with voice and visual recognition, similar to GPT-4o

by alex

The Technology section is published with the support of Favbet Tech

At Google I/O 2024, the company showed Project Astra, an AI-powered virtual assistant with voice and visual recognition that is built on Google Gemini and is still under development. Speaking about Astra, Demis Hassabis, CEO of Google DeepMind, said that his team has always wanted to develop a universal AI agent that would be useful in everyday life.

Project Astra is an app whose main input interfaces are the camera and voice. In the demo, a person holding a smartphone pointed its camera at different parts of an office and gave Astra a task: “Tell me when you see something that makes a sound.” When the assistant saw a speaker next to the monitor, it responded, “I see a speaker that makes sound.” The demonstrator then drew an on-screen arrow pointing to the top circle on the speaker and asked, “What is the name of this part of the speaker?” The program instantly responded: “This is a tweeter. It makes high-frequency sounds.”

Then, in the video, which Google says was recorded in one take, the tester walked over to a cup of crayons further down the table and asked, “Give me a creative alliteration about these,” to which the response was, “Creative crayons color cheerfully. They certainly craft colorful creations.” The video goes on to show Astra identifying and explaining pieces of code on the monitor and telling the user what neighborhood they are in based on the view from the window. Astra was even able to answer the question “Do you remember where you saw my glasses?” although the glasses were no longer in view: “Yes, I do. Your glasses were on the table next to the red apple.”

After this, the tester put on the glasses, and the video switched to a first-person perspective. Using their built-in camera, the glasses scanned the surroundings and focused on a diagram on the whiteboard. The person in the video asked, “What can I add here to make this system faster?” The program responded: “Adding a cache between the server and the database can improve speed.”

The tester then looked at a pair of cats drawn on the board and asked, “What does this remind you of?” Astra said, “Schrödinger's cat.” When the tester placed a stuffed tiger next to a golden retriever and asked Astra to name the pair, it responded, “Golden Stripes.”

The demonstration suggests that Astra not only processes visual data in real time, but also remembers what it has seen and works with that stored information. According to Hassabis, this is possible because the agents were designed to process information faster by continuously encoding video frames, combining the video and speech input into a timeline of events, and caching this information for efficient recall.
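
Google has not published how Astra implements this, so the following is only a minimal sketch of the general idea: events from different inputs are merged into one time-ordered, bounded cache that can be searched later. All names here (`Event`, `EventTimeline`, `recall`) are hypothetical, and a plain text summary stands in for what would really be an encoded video frame or transcribed utterance.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    timestamp: float  # seconds since the session started
    modality: str     # "video" or "speech"
    summary: str      # hypothetical stand-in for an encoded frame or utterance


class EventTimeline:
    """Bounded cache of recent events, kept in arrival order."""

    def __init__(self, max_events: int = 1000) -> None:
        # A deque with maxlen silently drops the oldest event when full.
        self._events = deque(maxlen=max_events)

    def add(self, timestamp: float, modality: str, summary: str) -> None:
        self._events.append(Event(timestamp, modality, summary))

    def recall(self, keyword: str) -> Optional[Event]:
        """Return the most recent event whose summary mentions the keyword."""
        needle = keyword.lower()
        for event in reversed(self._events):
            if needle in event.summary.lower():
                return event
        return None

    def __len__(self) -> int:
        return len(self._events)


# Usage: video and speech events share one timeline.
timeline = EventTimeline(max_events=100)
timeline.add(12.0, "video", "glasses on the table next to a red apple")
timeline.add(30.5, "speech", "user asks what part of the speaker this is")
match = timeline.recall("glasses")
```

A real system would search learned embeddings rather than keywords; the bounded deque only illustrates why caching recent events makes a question like “where did you see my glasses?” cheap to answer.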

In the video, Astra responded to requests quite quickly. Hassabis noted in a blog post: “While we’ve made incredible progress developing AI systems that can understand multimodal information, getting response time down to something conversational is a difficult engineering challenge.” Google is also working to give its AI voice a wider range of intonation and emotional nuance.

While Astra remains an early-stage project with no concrete launch plans, Hassabis said similar assistants could be available on a phone or in glasses in the future. There's no word yet on whether those glasses will be a successor to Google Glass, but the DeepMind chief noted that some of the capabilities demonstrated will come to Google products later this year.
