Counting Cats: Why "Good Enough" Is My New Gold Standard for AI Accessibility
Is AI ready to describe complex images? To find out, I ran a "stress test" using a classic trick image designed to confuse viewers: a pile of cats where kittens are hidden in curls and shadows.
I tested three major models—GPT-5.2, Gemini 3 Pro, and Claude—against a "Gold Standard" human verifier.
The Results:
• GPT-5.2 correctly identified 15 cats, breaking them down by row and group.
• Gemini 3 Pro also hit 15, specifically noting the "hidden" kittens that often trip people up, such as the one "tucked directly under" a stretching cat.
• Claude struggled at first with "extended thinking" turned on, guessing 16 or 17, but settled on 15 with it turned off.
• The Human Verifier: My trusted expert confirmed there are, in fact, fifteen cats.
The "Good Enough" Philosophy
This experiment highlights a critical shift in accessibility advocacy. While I wouldn't "bet my life" on an AI count, that isn't the right metric. The debate over "gold standards" often paralyzes us and keeps us from using tools that are helpful right away.
As the results show, AI can handle nuance and detail: all three models arrived at the verified count of 15, even on an image designed to confuse. In the world of image descriptions, "the perfect really is the enemy of the good enough." If we wait for perfection, we miss the opportunity to make the web accessible right now.
#Accessibility #AI #AltText #Blind