Google Assistant to be patient with filler words, vocal pauses

First discovered by 9to5Google in one of its in-depth APK Insight teardowns, a Nest Hub Max feature called “Look and Talk” (dubbed “Blue Steel” in reference to Ben Stiller’s film “Zoolander”) was just officially revealed at Google I/O 2022. It’s a small part of a much larger initiative in which Google hopes to make its Assistant much more natural and comfortable to interact with.


Look and Talk for Nest Hub Max

The goal is to be able to initiate a conversation with Google Assistant on any device and simply speak naturally and be understood. Voice is quickly becoming the fastest way to access computing and get answers, but to date it isn’t frictionless: you have to say “Hey Google” every single time you want to ask your Nest Hub something, make a phone call, anything. “Look and Talk” seeks to solve this by letting you simply glance at your Nest Hub Max from up to five feet away to activate it before speaking.

You can stop worrying about the right way to ask for something, and just relax and speak naturally.

Sissie Hsiao, Vice President and General Manager of Google Assistant at Google

To pull this off, Assistant uses six machine learning models to process more than 100 signals from both the hub’s camera and microphone in real time: things like the orientation of your head, your proximity to the device, the direction of your gaze, the movement of your lips, and other contextual cues needed to accurately judge that you intend to speak to it.
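
To make that more concrete, here’s a minimal sketch of what fusing visual and audio signals into an activation decision could look like. To be clear, the signal names, weights, and thresholds below are all hypothetical; Google hasn’t published the actual models or decision logic, only that six models fuse 100+ signals on-device.

```python
from dataclasses import dataclass

@dataclass
class Signals:
    """Hypothetical per-frame signals; the real feature fuses over 100."""
    gaze_on_device: float  # 0..1 confidence the user is looking at the hub
    head_facing: float     # 0..1 confidence the head is oriented toward it
    lips_moving: float     # 0..1 confidence of speech-like lip movement
    distance_m: float      # estimated distance from the camera, in meters
    speech_detected: bool  # voice activity detected on the microphone

def should_activate(s: Signals) -> bool:
    """Toy intent gate: visual and audio evidence must both agree
    before the microphone opens. Thresholds are made up for illustration."""
    if s.distance_m > 1.5:  # roughly five feet, per Google's stated range
        return False
    visual_intent = (s.gaze_on_device + s.head_facing + s.lips_moving) / 3
    return visual_intent > 0.8 and s.speech_detected

print(should_activate(Signals(0.95, 0.9, 0.85, 1.2, True)))  # True
print(should_activate(Signals(0.95, 0.9, 0.85, 3.0, True)))  # False: too far away
```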

Let me start by saying this is absolutely wild. We all knew that the need for a hotword would one day fade into the background of smart home usage, but until now we weren’t sure how that would happen, and we had reservations about whether Google could pull it off in a way that is both effective and respectful of users’ privacy.

As for privacy, Look and Talk processes everything on-device; video and audio are never sent to Google or anyone else. It also uses both Face Match and Voice Match, so it only activates when it recognizes that the person looking at the device and making the request is really you, which is super smart.
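
As a rough illustration of that dual identity gate, here’s a toy snippet. The embeddings, cosine-similarity check, and threshold are stand-ins of my own; the real Face Match and Voice Match systems are on-device models whose internals Google hasn’t published.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def is_authorized(face_emb, voice_emb, profile, threshold=0.9) -> bool:
    """Toy dual gate: the face in frame AND the voice on the mic must both
    match the same enrolled profile before a request is honored.
    Everything stays local; nothing leaves the hub."""
    return (cosine(face_emb, profile["face"]) >= threshold
            and cosine(voice_emb, profile["voice"]) >= threshold)

profile = {"face": [0.9, 0.1, 0.4], "voice": [0.2, 0.8, 0.5]}
print(is_authorized([0.88, 0.12, 0.41], [0.21, 0.79, 0.52], profile))  # True
print(is_authorized([0.10, 0.90, 0.30], [0.21, 0.79, 0.52], profile))  # False
```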

Assistant will handle pauses and filler words in voice prompts

One of my biggest gripes as I try to teach my family to use Google Assistant properly is that they often pause halfway through a request, and Assistant assumes they have finished talking. It processes the partial command, returns the wrong results, and never answers what they actually asked, which frustrates them. In reality, they weren’t done talking; they were just searching for the right word. For this reason, many millions of users are wary of using Assistant, because they feel they need to compose the entire request in their head before speaking. It’s both uncomfortable and tiring.

To address this, Google is building a more comprehensive neural network that runs on-device on its Tensor chip, and making Assistant more polite and patient in the process. Instead of assuming the user is done speaking just because there is a moment of silence, Assistant will analyze what was said and judge whether it forms a complete sentence or request before closing the microphone. If it isn’t entirely sure the user has conveyed a complete thought (read: if they leave awkward pauses or use filler words like “ummm” while thinking out loud), it will gently encourage them to finish by saying “mm-hmm”.
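
Here’s a toy sketch of that endpointing logic, assuming a hypothetical `looks_complete` heuristic that stands in for the far more capable language model Google would actually use to judge semantic completeness:

```python
import re

FILLERS = {"um", "umm", "ummm", "uh", "er", "hmm"}
DANGLING = {"and", "the", "to", "a", "by", "from", "play"}

def looks_complete(transcript: str) -> bool:
    """Crude stand-in for a semantic-completeness model: treat a request as
    unfinished if it ends in a filler word or a dangling connective."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return bool(words) and words[-1] not in FILLERS | DANGLING

def on_silence(transcript: str, silence_s: float, patience_s: float = 3.0) -> str:
    """Decide what to do when the user goes quiet mid-request."""
    if looks_complete(transcript):
        return "close the mic and process the request"
    if silence_s < patience_s:
        return "say 'mm-hmm' and keep listening"  # the gentle nudge to continue
    return "give up and process what we have"

print(on_silence("set a timer for ten minutes", 0.8))  # complete -> process
print(on_silence("play the new song from ummm", 0.8))  # filler -> keep listening
```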

Assistant can handle awkward pauses and filler words

Additionally, it looks like Google Assistant will also get better at completing the user’s thoughts on their behalf. So polite, right? When Sissie Hsiao demonstrated this on stage, she trailed off with the word “something” in place of part of a song’s title, and Assistant figured out the full title anyway and automatically started playing it on Spotify!

It can also complete your thought for you if you can’t find the right word.

Basically, it’s Assistant’s way of saying “yeah? And what else? Please continue…” Admittedly, I was over the moon when I saw this demoed live at Google I/O 2022, and it will likely be the one update that solves most of my smart home frustrations.
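
For fun, here’s what that kind of fuzzy “thought completion” could look like in miniature. The catalog, cutoff, and matching approach are purely illustrative; in reality the query would presumably be resolved against the user’s actual music services with far more sophisticated models.

```python
import difflib

# Hypothetical catalog of candidate titles known to the device.
CATALOG = [
    "Free by Florence + the Machine",
    "As It Was by Harry Styles",
    "About Damn Time by Lizzo",
]

def complete_request(utterance: str):
    """Toy thought-completion: drop the placeholder word, then fuzzy-match
    the partial request against known titles and return the best guess."""
    query = utterance.replace("something", "").strip()
    matches = difflib.get_close_matches(query, CATALOG, n=1, cutoff=0.3)
    return matches[0] if matches else None

print(complete_request("Free by Florence and something"))
# -> 'Free by Florence + the Machine' (then hand off to the music app)
```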

Quick Phrases and Real Tone

Quick Phrases – Google’s first stab at eliminating the need for the “Hey Google” hotword for specific tasks such as setting timers, toggling smart lights, and so on – are being extended to the Nest Hub Max! Finally, Real Tone, an effort to make Google’s camera and imaging products accurately represent users across a diverse range of skin tones, will inform how Look and Talk works on the Nest Hub Max. More work is being done on this front with the new Monk Skin Tone Scale, which launched today (and was released openly by its creator, Harvard professor Dr. Ellis Monk). Google’s AI skin tone research is intended to improve skin tone evaluation in machine learning and provide a set of best practices for ML fairness.

If you want to watch the entire Google I/O 2022 keynote, you can do so below. We also have plenty of coverage on everything announced and revealed yesterday, so be sure to check it out! Let me know in the comments which of these Google Assistant features interests you the most, and whether you think Google is making history or playing with fire when it comes to AI and machine learning.
