Transforming visual accessibility

OpenAI Blog Products

Summary

OpenAI has partnered with Be My Eyes to use GPT-4 for visual accessibility, enabling blind and low-vision users to navigate websites, e-commerce platforms, and physical spaces through intelligent summarization and real-time guidance. The system leverages GPT-4's vision capabilities to identify important content and provide contextual assistance that mimics how sighted users naturally scan information.

Be My Eyes uses GPT-4 to transform visual accessibility.

Cached at: 04/20/26, 02:55 PM

# Transforming visual accessibility

Source: [https://openai.com/index/be-my-eyes/](https://openai.com/index/be-my-eyes/)

Already the company has a case where a user was able to navigate the railway system (arguably an impossible task for the sighted as well), not only getting details about where they were located on a map, but point-by-point instructions on how to safely reach where they wanted to go. Yet traversing the complicated physical world is only half the story.

Understanding what’s on a screen can be twice as arduous for a person who isn’t sighted. Screen readers, embedded in most modern operating systems, read through the pieces of a web page or desktop application line by line, section by section, speaking each word. Images, the heart of communication on the web, can be even worse.

Yet, Henriksen says now they’re able to show GPT-4 the webpage and the system knows (after countless training hours where deep learning algorithms build relationships to understand the “important” part of a webpage) which part to read or summarize. This can not only simplify tasks like reading the news online, but grants people who need visual assistance access to some of the most cluttered pages on the web: shopping and e-commerce sites. GPT-4 is able to summarize the search results the way the sighted naturally scan them, not reading every minuscule detail but bouncing between important data points, and help those needing sight support make the right purchase, in real time.

“This is a fantastic development for humanity,” Buckley says, “but it also represents an enormous commercial opportunity.”

Similar Articles

Using AI coding tools to build a production braille 3D generator as a blind developer

Reddit r/ArtificialInteligence

A blind developer describes using AI coding tools to build a production web platform for generating 3D-printable braille objects. He emphasizes that AI compressed the implementation loop without replacing domain expertise, accessibility judgment, or lived experience, highlighting how AI can empower domain experts with limited engineering resources.

GPT-4V(ision) system card

OpenAI Blog

OpenAI releases a system card detailing the safety properties and evaluations of GPT-4V(ision), which adds image input capabilities to GPT-4, enabling multimodal instruction-following and vision analysis.

Empowering teams to unlock insights faster at OpenAI

OpenAI Blog

OpenAI has developed an internal research assistant that combines dashboards with a conversational GPT-5 interface to help teams analyze millions of support tickets and generate insights in minutes instead of weeks. The tool democratizes data analysis across teams, allowing non-technical users to ask questions in plain language and get actionable reports on product feedback, customer sentiment, and trends.

Introducing vision to the fine-tuning API

OpenAI Blog

OpenAI introduces vision fine-tuning capabilities for GPT-4o, allowing developers to customize the model with image data in addition to text for improved performance on vision tasks like visual search, object detection, and medical image analysis.

ChatGPT can now see, hear, and speak

OpenAI Blog

OpenAI is rolling out new voice and image capabilities to ChatGPT Plus and Enterprise users, enabling users to have voice conversations and share images for multimodal interactions powered by GPT-3.5/GPT-4 and custom text-to-speech models.