Search What You See: The Tech Behind The Magic | Made by Google Podcast S9E4

Summary

Google has enhanced its Circle to Search feature by leveraging Gemini 3 to enable holistic scene recognition of screen content, with a particular focus on breaking down fashion ensembles into individual items and supporting virtual try-ons. This update allows users to seamlessly find alternative products and preview how they look without needing to take screenshots, thereby improving the overall visual search experience.

Cached at: 05/08/26, 07:34 AM

TL;DR: Google has updated Circle to Search, leveraging Gemini 3 to enable holistic recognition of screen content. This is particularly notable for fashion, allowing for multi-item decomposition and virtual try-ons. Users can now find alternative products and see how they look on them without needing to take screenshots.

# Major Upgrade for Circle to Search: From Single Object to Whole-Scene Recognition

Welcome to the Made by Google podcast. In this episode, we explore the latest major upgrade to Circle to Search. This feature has been enhanced on devices like the Pixel 10 and the Samsung Galaxy S26, making it more powerful in recognizing what’s in front of you. Especially for fashion enthusiasts, new hidden features help users discover brand-new, beautiful looks. We sit down with Director of Product Management Harsh Kharbanda to dive into the technical evolution and use cases behind this feature.

## The Evolution from Google Lens to Circle to Search

**Rachid Finge:** Harsh, welcome to the Made by Google podcast. Today we’re talking about Circle to Search. Do you remember your first encounter with Circle to Search?

**Harsh Kharbanda:** I’ve been involved in its development since the inception of Google Lens in 2018. As Lens evolved, our primary focus was always on the camera. Users could take a photo of an object in front of them to get answers and resolve questions. But later, we realized that a lot of value lay in people’s ability to search for content they saw on their computer or phone screens. We noticed that users frequently took a large number of screenshots and then uploaded them to Google Lens. So, part of our team collaborated with the Android team to make the entire process more seamless. The "Circle Gesture" feature was born. When I used it for the first time, it felt very intuitive. I thought, "Wow, I’m going to use this more than ten times a day." Because every day, my phone screen presented various questions that needed answering.
From that point on, I was completely hooked. As the Lens and Circle to Search products gradually merged, I took on more responsibility for the product work on Circle to Search.

## The Demand for Visual Search and Early Challenges

**Rachid:** I think to understand today’s progress, it’s necessary to look back a bit. Search engines originally started with users typing queries into a search box. Later, we had voice search, and then, as you mentioned, we launched Google Lens. So, why did we launch Google Lens a few years ago?

**Harsh:** The concept of visual search has existed for a long time. Google had developed Google Goggles around 2012 or 2014. But the technology wasn’t mature then. Early versions of Google Lens could only recognize QR codes, maybe some objects, and perform translations, etc. The key with Lens was that we knew there was a demand, because there are many questions that are difficult to express solely through speech or text.

* **The Fashion Description Dilemma:** For example, how do you describe a very unique dress you saw on social media? You’d need a lot of different words, and even with those, you still wouldn’t capture the essence of that dress.
* **The Plant Identification Predicament:** Many people use Lens to identify plants. If you don’t know the name of a plant someone gave you at home, and it’s wilting in a specific way with spots appearing, trying to describe it in text takes a long time and loses a lot of detail and precision.

So, Google realized early on that users had many questions that were hard to describe, and this was the driving force behind our development.

## The Difference Between Circle to Search and Direct Camera Search

**Rachid:** For those who have never used Circle to Search, how is it different from using the camera to search directly?

**Harsh:** When we launched Lens, the focus was mainly on physical objects in front of you. But we quickly realized that in daily life, you don’t encounter many new things.
You basically see the same plants, the same dogs, the same office locations and desks, etc. So, you don’t have many opportunities to ask new questions. Phones are becoming more powerful. People spend a lot of time on social media and various apps on their phones. As they browse and scroll, they encounter content that sparks curiosity. So, our goal was to bridge the gap: How can we bring visual search closer to the device? How can we make it appear where users have questions? That’s where Circle to Search came from.

## How to Use Circle to Search and Typical Scenarios

**Rachid:** Can you explain how to use it from a user’s perspective? Do you have a favorite example of a scenario where it’s super useful?

**Harsh:** It’s simple. On any new Pixel device, just long-press the navigation bar at the bottom of the screen in any app interface. This pauses (freezes) the screen, and then you can tap anywhere. Or better yet, you can precisely circle what you want, and we’ll display what is most likely an AI response explaining what you see. You can also ask very specific questions about what you’re looking at.

One example I like is that my mom often sends me spam messages on WhatsApp. She sends a bunch of messages that, at least in my opinion, are definitely fake. Or videos where I’m thinking, "No way." I often use Circle to Search to verify this information. All I do is long-press the navigation bar, circle the content, and it tells me, "Hey, this is misinformation," and lists the reasons. It actually searches the web to find out what’s true, what’s false, and the nuances in between. I often screenshot the results and send them back to her.

**Rachid:** If there’s something on the screen written in a language I can’t read, can I also circle the text to get a translation?

**Harsh:** Absolutely. I run into this often too. For example, in family groups, someone might forward a message in an Indian language or dialect that I don’t speak, or even if I do, it’s hard to read.
I just circle it, and it reads it out and translates it for me. It’s a very practical feature that many of our users frequently use.

## Breakthroughs in Fashion: Whole-Outfit Recognition and Virtual Try-On

**Rachid:** I imagine at some point you may have realized that people are also using Circle to Search for fashion purposes, to some extent.

**Harsh:** Yes. Early on with Circle to Search, we realized this would be a trend. One thing we learned from Google Lens, especially during the pandemic, was that users would see influencers’ videos or posts on social media and think, "Oh, I like that jacket," but obviously, they wouldn’t buy a $5,000 jacket. So, they would take screenshots and upload them to Google Lens to find similar alternatives within their price range.

For Circle to Search, we knew early on that reducing friction (letting users ask questions directly on platforms like Instagram or TikTok without needing to screenshot, close the app, and open Google Lens to upload the screenshot) would drive growth. Young female users are the demographic using Circle to Search for visual shopping.

What we started seeing was that often, they weren’t circling or looking for a single jacket or product, but rather the vibe of the entire outfit, the feel of the whole look. For example, "Oh, I see this cool Taylor Swift look trending... Now I want to find every part of this look, and of course, recreate it within my budget."

Circle to Search wasn’t great at this initially. It was good at recognizing individual objects; you could circle the jacket, top, jeans, shoes, and bag one by one. Some influencers even wore sunglasses. So, there was a lot to circle.

**In this update, we allow users to circle the whole thing and then find the entire outfit.** With **Gemini 3** and our latest model updates, we can view the image as a whole and break down which parts of the image are particularly interesting, decomposing all the different parts of the image for the user.
Then, we perform the same visual search for each part of the image, so we find the sunglasses, jacket, jeans, etc. We find the exact products where possible, as well as many similar products for you to browse, helping you find items that fit your budget and truly recreate the entire look.

The icing on the cake is that once you find a product you like, you can tap on it, and you’ll see an option to **try on that product**. When you tap the new "Try On" option, you’ll see what the found jacket looks like on you. This end-to-end journey was very popular in user testing. "Wow, that’s what Taylor Swift wore, I found something under $100 that looks great. I’m going to buy it." It really surprised users.

## Other Use Cases for Multi-Object Recognition

**Rachid:** Circle to Search can now detect multiple objects. Besides fashion, what other scenarios are you considering?

**Harsh:** There are many.

1. **Skincare Routines:** One of the product managers on the team who works on this loves skincare. She came across a skincare routine on Instagram featuring images of 14 different products. She basically just circled the whole thing and asked, "Hey, can you help me find all the products and provide reviews for all of them, sorted by price?" It was able to search for every item and do all of this for her.
2. **Plant Arrangements:** I saw a plant arrangement on social media and thought, "These plants look great, but I don’t know their names. I want to know if I can grow them at home." So I just circled it and asked, "Hey, can you tell me the names of all the plants, their growing conditions, what they need? Can they thrive under these conditions?" I was able to search for all of it.
3. **Award Ceremony Identification:** I saw a post with five actors holding their awards. I thought, "I recognize a few of them, but I don’t know the others, and I don’t know what they won for."
So I just circled all of them and said, "Hey, can you tell me in a nice table who they are, what they won for, and which movies of theirs I should watch?" It then built a great result for me.

So, there are many different use cases where users are actually asking about the whole picture, not just parts. Finding outfits is one of the most typical journeys, and it gives people an "aha" moment, leading them to use it for many different things.

## Demo: Recognizing and Trying On a Golf Outfit

**Rachid:** Showing us the try-on tool directly is better than just talking about it, right?

**Harsh:** Yes. Let me quickly screen-record to show you. This is an example of a golf outfit. It’s an outfit I would never have fully conceived of, but the parts within it really appealed to me.

1. **Invoking the Feature:** I invoked Circle to Search.
2. **Circling the Whole:** I circled the entire outfit. When I circled the whole outfit, I got an AI overview response allowing me to find this outfit.
3. **Decomposed Recognition:** When I clicked "Find Outfit," it actually managed to deconstruct the entire outfit. Look at this shirt, look at these shorts, look at this hat.
4. **Detailed View:** I can click on them, and it found the exact shirt, so I can open the shirt in the viewer to look at it closely.
5. **Virtual Try-On:** What I can do is also try it on. Let me give an example here. Now I see a "Try On" button, and when I click "Try On," it immediately puts this shirt on me. It’s probably not something I’d want to wear, but at least it tells me this isn’t something that suits me.

So, that’s the entire journey: "Hey, I found this outfit online, it looks interesting, let me search for it." Then, you just click twice to search, we find something very similar for you, and now you can try it on.

Source: Search What You See: The Tech Behind The Magic | Made by Google Podcast S9E4 (https://www.youtube.com/watch?v=x6Ix0RVd7yk)
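
The whole-outfit flow described in the episode (break the circled image into individual items, run one search per item, keep results that fit a budget) can be illustrated with a toy sketch. Nothing here reflects Google's actual implementation: the catalog, `decompose_outfit`, `search_item`, and `find_outfit` are hypothetical names invented for illustration, and the decomposition step is faked by passing in item labels rather than detecting them from an image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    category: str
    price: float


# Hypothetical in-memory catalog standing in for a real product index.
CATALOG = [
    Product("Designer leather jacket", "jacket", 5000.0),
    Product("Faux leather jacket", "jacket", 89.0),
    Product("Slim blue jeans", "jeans", 60.0),
    Product("Round sunglasses", "sunglasses", 25.0),
    Product("Aviator sunglasses", "sunglasses", 140.0),
]


def decompose_outfit(circled_labels):
    """Stand-in for the model step: list the distinct items in the circled region.

    A real system would infer these from pixels; here the labels are supplied
    by the caller, and duplicates are dropped while keeping first-seen order.
    """
    seen = []
    for label in circled_labels:
        if label not in seen:
            seen.append(label)
    return seen


def search_item(category, budget):
    """Return catalog products in this category at or under budget, cheapest first."""
    matches = [p for p in CATALOG if p.category == category and p.price <= budget]
    return sorted(matches, key=lambda p: p.price)


def find_outfit(circled_labels, budget_per_item):
    """Run one search per decomposed item and collect the results by category."""
    return {
        category: search_item(category, budget_per_item)
        for category in decompose_outfit(circled_labels)
    }


look = find_outfit(["jacket", "jeans", "sunglasses", "jacket"], budget_per_item=100.0)
for category, products in look.items():
    print(category, [p.name for p in products])
```

The point of the sketch is the shape of the flow: one circle gesture fans out into several independent searches, which is what replaces circling the jacket, jeans, and sunglasses one by one.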
