Harnessing AI for Accessibility: Opportunities and Challenges

Artificial intelligence holds immense potential to improve accessibility for people with disabilities, but it also comes with significant risks and limitations. This Q&A explores key aspects of AI in accessibility, drawing from expert perspectives to separate hype from genuine promise. We'll discuss alternative text generation, contextual image analysis, and the critical role of human oversight, while acknowledging the skepticism that rightly surrounds these technologies.

1. Why is there skepticism about using AI for accessibility?

Many accessibility experts, including Joe Dolson, maintain a healthy skepticism toward AI. This caution stems from AI's track record of producing unreliable or even harmful outputs. For instance, computer vision models often generate poor alternative text descriptions, especially for complex images. Additionally, AI systems can perpetuate biases present in their training data, leading to exclusionary results. The concern is that AI might be deployed as a quick fix without addressing underlying accessibility principles. As with any tool, AI can be used constructively or destructively. The mediocre middle ground—where AI offers incomplete or misleading assistance—is particularly problematic for accessibility, where accuracy and context are paramount. Thus, skepticism serves as a necessary guardrail against over-reliance on technology that is not yet mature.

2. What genuine opportunities does AI offer for accessibility?

Despite valid concerns, AI can make meaningful differences when applied thoughtfully. One key opportunity is in augmenting human efforts, not replacing them. For example, AI can provide initial drafts of alternative text or flag images that likely need descriptions, speeding up the authoring process. Another area is analyzing image usage in context—training models to distinguish decorative from informative images—which could help prioritize accessibility efforts. AI also holds promise for complex images like graphs and charts, where it can generate structured descriptions or summaries. The key is to design AI tools as collaborators, not autonomous decision-makers. With proper oversight and continuous improvement, AI can reduce barriers for people with disabilities, especially in content creation and navigation tasks that are currently labor-intensive.

3. How does AI currently perform in generating alternative text?

Current AI systems for generating alternative text have notable limitations. They often examine images in isolation and ignore the surrounding context, a consequence of using separate foundation models for text and image analysis. As a result, descriptions miss key contextual nuances, such as whether an image is decorative or essential to understanding the content. As Joe Dolson highlighted, results are particularly poor for certain image types, like abstract graphics or photos with subtle details. Quality is gradually improving, and descriptions are richer than before, but AI still cannot reliably distinguish between images that require descriptions and those that do not. The technology is not ready for fully autonomous alt text generation, yet it can serve as a starting point, even when the draft's main value is provoking the author to think, 'This description seems wrong, let me try again.' Human-in-the-loop refinement is essential.

4. What is the human-in-the-loop approach for alt text?

The human-in-the-loop approach integrates AI as an assistant rather than a replacement for human judgment. For alternative text, this means AI proposes an initial description, and a human editor refines or corrects it. This leverages AI's speed while maintaining human oversight for accuracy and context. Even if the AI's first attempt is flawed and the author's reaction is something like 'What is this BS? That's not right at all,' the draft can still save time by giving the author something concrete to correct. The human can then craft a proper description. This collaboration is especially valuable for authors who are not accessibility experts, as it shows them what good alt text looks like. The approach also helps catch errors and biases that AI systems might introduce. By combining AI's scalability with human expertise, we can achieve more accessible content without sacrificing quality or trust.
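The workflow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `AltTextDraft` class, the `review` function, and the `reviewer` callback are hypothetical names, and the AI suggestion is passed in as a plain string rather than produced by any particular model. The point it tries to show is that a machine draft cannot reach the page until a person has either approved text or marked the image as decorative.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AltTextDraft:
    """Hypothetical record pairing an AI suggestion with a human decision."""
    image_src: str
    ai_suggestion: str                    # machine draft, never published as-is
    approved_text: Optional[str] = None   # set only by a human reviewer
    decorative: bool = False              # set only by a human reviewer

    @property
    def publishable(self) -> bool:
        # A draft is publishable only after a human decision.
        return self.decorative or bool(self.approved_text)

    def alt_attribute(self) -> str:
        if not self.publishable:
            raise ValueError(f"{self.image_src}: draft has not been reviewed")
        # Decorative images get an empty alt attribute.
        return "" if self.decorative else self.approved_text


def review(draft: AltTextDraft,
           reviewer: Callable[[AltTextDraft], Optional[str]]) -> AltTextDraft:
    """Apply a human decision: final text, "" for decorative, None to skip."""
    decision = reviewer(draft)
    if decision is None:
        return draft                      # still unreviewed
    text = decision.strip()
    draft.decorative = (text == "")
    draft.approved_text = text or None
    return draft


# Usage: the reviewer rejects a weak suggestion and writes a better one.
draft = AltTextDraft("chart.png", ai_suggestion="a picture of lines")
review(draft, lambda d: "Line chart of quarterly sales, rising from Q1 to Q2")
print(draft.alt_attribute())
```

In a real authoring tool the `reviewer` callback would be an editing interface rather than a lambda, but the gate stays the same: the AI suggestion is an input to the human, not an output to the reader.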

5. Can AI be trained to understand image context for accessibility?

Yes, but it requires focused effort. Current AI models typically examine images in isolation, but training a system to analyze image usage in context—such as the surrounding text, page structure, and user interaction—is feasible. Such a model could help quickly identify which images are purely decorative (e.g., stock photos, dividers) and which require descriptive alt text (e.g., infographics, product photos). This contextual understanding would reinforce accessibility best practices and improve author efficiency. For example, a contextual AI could flag an image of a graph on a data page as likely needing a detailed description, while ignoring a decorative banner. While this technology is still emerging, early research shows promise. The key challenge is building diverse, inclusive training data that reflects real-world web content. With proper development, contextual AI could become a valuable tool for automating parts of the accessibility workflow.
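To make the idea of contextual triage concrete, here is a minimal sketch that uses hand-written rules as a stand-in for the trained model the answer describes. Everything in it is an assumption for illustration: the `ImageContext` fields, the filename hints, the size cutoff, and the three labels are hypothetical, and a real system would learn such signals from data rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass
class ImageContext:
    """Hypothetical context signals gathered from the page around an image."""
    src: str
    role: str = ""               # value of the role attribute, if any
    has_caption: bool = False    # image sits in a figure with a caption
    inside_link: bool = False    # image is the only content of a link
    width: int = 0               # 0 means unknown
    height: int = 0              # 0 means unknown


# Assumed filename fragments that often indicate decoration.
DECORATIVE_HINTS = ("divider", "spacer", "banner", "background")


def triage(ctx: ImageContext) -> str:
    """Return a label for a human reviewer; this never writes alt text."""
    # Functional or captioned images carry meaning, so check them first.
    if ctx.inside_link or ctx.has_caption:
        return "needs-description"
    if ctx.role in ("presentation", "none"):
        return "likely-decorative"
    # Tiny images are usually spacers or tracking pixels.
    if 0 < ctx.width <= 4 or 0 < ctx.height <= 4:
        return "likely-decorative"
    if any(hint in ctx.src.lower() for hint in DECORATIVE_HINTS):
        return "likely-decorative"
    return "needs-review"


print(triage(ImageContext("img/sales-chart.png", has_caption=True)))
print(triage(ImageContext("img/divider-blue.png")))
```

The labels are deliberately hedged ("likely", "needs-review") so that the output feeds a human decision instead of replacing it.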

6. How can AI help with complex images like graphs and charts?

Complex images—such as graphs, charts, and diagrams—are notoriously difficult to describe concisely, even for humans. AI offers potential by generating structured summaries or data tables that convey the key insights. For instance, an AI could extract data points from a line chart and produce a textual description like 'The line graph shows a sharp increase in sales from Q1 to Q2, followed by a plateau.' Advanced models can also create multi-layered descriptions that provide both a high-level overview and detailed data for users who need it. However, current AI often struggles with accuracy, especially for intricate visuals. Human verification remains crucial. Yet, as computer vision and natural language generation improve, AI could become a helpful starting point. Combining AI's ability to process visual information rapidly with human oversight for nuance will be the most effective strategy for making complex images accessible.
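The layered description mentioned above (a short overview plus the underlying data) can be sketched as follows. The sketch assumes the data points have already been extracted from the chart, which is the hard part and is not shown here. The function name `describe_series` and the 5% and 25% thresholds for "little change" and "sharp" are illustrative choices, not established conventions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[str, float]  # (label, value), in chart order


def _trend(prev: float, cur: float, flat: float, sharp: float) -> str:
    """Name the change between two consecutive values."""
    if prev == 0:
        # Relative change is undefined from zero; fall back to direction only.
        if cur == 0:
            return "little change"
        return "an increase" if cur > 0 else "a decrease"
    change = (cur - prev) / abs(prev)
    if abs(change) <= flat:
        return "little change"
    size = "a sharp" if abs(change) >= sharp else "a modest"
    return f"{size} {'increase' if change > 0 else 'decrease'}"


def describe_series(title: str, points: Sequence[Point],
                    flat: float = 0.05,
                    sharp: float = 0.25) -> Tuple[str, List[str]]:
    """Return (overview sentence, data rows) for a single data series."""
    if len(points) < 2:
        raise ValueError("need at least two data points")
    steps = [
        f"{_trend(a_val, b_val, flat, sharp)} from {a} to {b}"
        for (a, a_val), (b, b_val) in zip(points, points[1:])
    ]
    summary = f"{title}: " + "; ".join(steps) + "."
    table = [f"{label}: {value}" for label, value in points]
    return summary, table


summary, table = describe_series(
    "Quarterly sales", [("Q1", 100), ("Q2", 180), ("Q3", 182)]
)
print(summary)  # Quarterly sales: a sharp increase from Q1 to Q2; little change from Q2 to Q3.
print(table)    # ['Q1: 100', 'Q2: 180', 'Q3: 182']
```

The overview suits a short description, while the rows can back a data table for readers who want the exact figures. A human should still check that the wording matches what the chart is meant to communicate.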