No Fully Automated Compliance Tool Exists

There is currently no tool that can automatically determine whether a video requires an audio description track under WCAG 2.1 AA.

Third-party vendors offer, at a cost, software that assists in generating audio description. UMB is exploring which vendor options are approved for use.

Automated tools can assist with analysis, but final determination must be made by a content creator or reviewer.

Tools That Can Assist (But Not Decide)

While no tool can make the compliance decision, the following tools can help staff compare spoken content to visual content:

  • Speech-to-text/transcript tools
    • YouTube auto-captions
    • Descript
    • Rev
  • Visual analysis tools
    • Microsoft Azure Video Indexer
    • Google Video AI
    • Amazon Rekognition (AWS)
    • Descript’s scene analysis features, used through its Underlord AI chatbot

These tools can:

  • Identify spoken dialogue
  • Detect on-screen text
  • Flag visual scenes or actions

They cannot determine whether those visuals are essential — that decision remains with the reviewer.

Recommended Review Workflow

Use the following semi-automated workflow to determine whether audio description is required:

  1. Generate a transcript
    1. Use YouTube or Descript.
    2. Edit the transcript for accuracy (aim for about 99%).
  2. Review the visuals
    1. Watch the video without sound, or
    2. Use a visual analysis tool to identify on-screen text, charts, or silent actions.
  3. Compare transcript to visuals
    1. Ask: Does the transcript fully communicate what the visuals show?
  4. Make a determination
    1. If yes: The integrated description or narration is sufficient, and no separate audio description is needed.
    2. If no: Audio description, descriptive transcript, or alternate upload is required.
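
The comparison in step 3 can be partly scripted. The following is a minimal sketch, not an approved UMB tool. It assumes the reviewer already has the edited transcript as plain text and a list of on-screen text or scene labels copied out of a visual analysis tool. The function names, the sample data, and the 0.6 threshold are all hypothetical. The script only flags visual items whose words rarely appear in the transcript; the reviewer still decides whether each flagged visual is essential.

```python
import re


def normalize(text: str) -> set[str]:
    """Lowercase the text and return the set of words it contains."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def flag_unspoken_visuals(transcript: str, visual_items: list[str],
                          threshold: float = 0.6) -> list[str]:
    """Return the visual items whose words are mostly missing from the transcript.

    threshold is a hypothetical cutoff: an item is flagged when fewer than
    that share of its distinct words are spoken somewhere in the transcript.
    """
    spoken = normalize(transcript)
    flagged = []
    for item in visual_items:
        words = normalize(item)
        if not words:
            continue  # nothing to compare for an empty label
        coverage = len(words & spoken) / len(words)
        if coverage < threshold:
            flagged.append(item)
    return flagged


# Hypothetical sample data for illustration only.
transcript = "Welcome to the campus tour. Enrollment grew by ten percent last year."
visual_items = [
    "Welcome to the campus tour",
    "Chart: Enrollment 2019-2023",
    "Register at the front desk",
]

for item in flag_unspoken_visuals(transcript, visual_items):
    print("Review needed:", item)
```

Word overlap is a rough signal. A chart can be described aloud in different words than its on-screen title, so a flagged item is a prompt for human review, not a finding that audio description is required.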

This process should be documented when applicable to demonstrate good-faith compliance.


Share Your Success and Help Your Peers

To request to join the virtual Social Media Committee, email social@umaryland.edu.