Draft guidance for `aria-actions` behavior in screen readers

This document provides rough guidelines for implementing aria-actions in screen reading software. There are no plans for this guidance to go into the ARIA spec or any other normative guidelines; it is only intended to clarify how the new aria-actions attribute is meant to be used, as a reference for implementors creating new UI for actions.

Intro: browser and authoring requirements

The ARIA PR includes the actual proposed spec wording, but here's the gist of how authors and browsers should create actions, including examples:

Browser accName changes and API mappings

Elements referenced by aria-actions are allowed as children of widget roles that would otherwise enforce presentational children. Browsers must both expose nested actions in the accessibility tree and exclude them from any name-from-contents accessible name (accName) calculation.

For example, given the following HTML code for a single combobox option with two actions:

<div role="option" aria-actions="editID openID">
  your-file-name.pdf
  <button id="editID">Edit</button>
  <button id="openID">Open</button>
</div>

The browser should generate the following accessibility node structure:

option "your-file-name.pdf"
- button "Edit"
- button "Open"

The two relevant changes from what the ARIA spec calls for today are exposing the nested buttons as buttons (which all modern browsers already do even though nested buttons are technically disallowed), and excluding the actions from the computed accessible name.
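As a rough illustration of the name calculation change, here is a minimal sketch of a name-from-contents walk that skips any subtree referenced by aria-actions. The `SketchNode` type and the function names are hypothetical stand-ins, not browser APIs, and the real accName algorithm has many more steps than this.

```typescript
// Simplified stand-in for a DOM node; this is NOT a real browser type.
interface SketchNode {
  id?: string;
  role?: string;
  text?: string; // set only on text nodes
  ariaActions?: string; // space-separated ID references
  children?: SketchNode[];
}

// Split an ID reference list such as "editID openID" into individual IDs.
function parseIdRefs(value: string | undefined): string[] {
  if (!value) return [];
  return value.trim().split(/\s+/).filter((id) => id.length > 0);
}

// Gather text from a subtree, skipping any node whose ID is excluded.
function collectText(node: SketchNode, excluded: string[]): string[] {
  if (node.id !== undefined && excluded.indexOf(node.id) !== -1) return [];
  if (node.text !== undefined) return [node.text];
  let parts: string[] = [];
  for (const child of node.children || []) {
    parts = parts.concat(collectText(child, excluded));
  }
  return parts;
}

// Name from contents, with elements referenced by aria-actions pruned.
function nameFromContents(node: SketchNode): string {
  const excluded = parseIdRefs(node.ariaActions);
  return collectText(node, excluded).join(" ").replace(/\s+/g, " ").trim();
}

const exampleOption: SketchNode = {
  role: "option",
  ariaActions: "editID openID",
  children: [
    { text: "your-file-name.pdf" },
    { id: "editID", role: "button", children: [{ text: "Edit" }] },
    { id: "openID", role: "button", children: [{ text: "Open" }] },
  ],
};

console.log(nameFromContents(exampleOption)); // "your-file-name.pdf"
```

Without the aria-actions attribute, the same walk would fold "Edit" and "Open" into the option's name, which is the behavior the proposal is trying to avoid.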

The API mappings for aria-actions are summarized as follows; in the previous example these would be defined on the "option" node:

IA2: IAccessibleAction
UIA: Custom Property, AccessibleActions
ATK: AtkAction
AX API: Implement as custom action

Authoring patterns

The two most relevant authoring requirements are that referenced actions must exist in the DOM when the referencing element has keyboard focus, and authors must ensure referenced actions respond to the click event (i.e. they are programmatically invokable).

This means that when a user moves focus to a control that has secondary actions, each of those actions can be programmatically accessed and invoked.

Authors must also ensure that those secondary actions can be navigated to or directly invoked even without an actions-specific assistive tech interface. However, that method would be potentially unique to each authored actions UI, while a built-in AT interface would provide a standard and consistent mode of interaction.
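The "referenced actions must exist" requirement lends itself to a simple authoring check. The sketch below is only an illustration: `findMissingActionIds` is a hypothetical helper, and `idsInDocument` stands in for whatever IDs are present in the DOM at the moment the referencing element receives focus.

```typescript
// Hypothetical check: which IDs in an aria-actions value have no matching element?
// `idsInDocument` is an assumption standing in for a real DOM lookup.
function findMissingActionIds(ariaActions: string, idsInDocument: string[]): string[] {
  const referenced = ariaActions
    .trim()
    .split(/\s+/)
    .filter((id) => id.length > 0);
  return referenced.filter((id) => idsInDocument.indexOf(id) === -1);
}

// "read1" is referenced but not present, so it is reported.
console.log(findMissingActionIds("flag1 read1 more1", ["flag1", "more1"])); // ["read1"]
```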

Authoring examples

A set of closeable file tabs could be authored using aria-actions:

<div role="tablist">
  <div role="tab" tabindex="0" aria-actions="remove1">
    index.html
    <button id="remove1" tabindex="-1" aria-label="close index.html">x</button>
  </div>
  <div role="tab" tabindex="-1" aria-actions="remove2">
    styles.css
    <button id="remove2" tabindex="-1" aria-label="close styles.css">x</button>
  </div>
  <div role="tab" tabindex="-1" aria-actions="remove3">
    script.js
    <button id="remove3" tabindex="-1" aria-label="close script.js">x</button>
  </div>
</div>
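Since authors still need to provide their own way to reach the close buttons, here is one possible shape for a keyboard fallback on the tabs above, where Delete on a focused tab invokes the referenced close button. This is a sketch under assumptions: `Clickable`, `TabLike`, and `handleTabKey` are hypothetical names, and `lookup` stands in for `document.getElementById` so the example runs without a DOM.

```typescript
// Minimal stand-ins for DOM objects; a real page would use HTMLElement.
interface Clickable {
  click(): void;
}

interface TabLike {
  ariaActions: string; // value of the aria-actions attribute, e.g. "remove1"
}

// Hypothetical handler: Delete on a focused tab invokes its first referenced action.
// Returns true when an action was invoked, so the caller can call preventDefault().
function handleTabKey(
  key: string,
  tab: TabLike,
  lookup: (id: string) => Clickable | null
): boolean {
  if (key !== "Delete") return false;
  const firstId = tab.ariaActions.trim().split(/\s+/)[0];
  if (!firstId) return false;
  const action = lookup(firstId);
  if (action === null) return false;
  action.click(); // same entry point a screen reader would use programmatically
  return true;
}
```

Routing the shortcut through `click()` keeps a single code path for both the author-provided keyboard fallback and any programmatic invocation by assistive tech.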

A tree of emails, where each email has multiple possible actions, could be marked up as follows:

<ul role="tree" aria-label="Primary Inbox">
  <li role="treeitem" aria-actions="flag1 read1 more1" tabindex="0">
    Re: meeting agenda for TPAC
    <div class="actions">
      <button id="flag1" tabindex="-1">Flag</button>
      <button id="read1" tabindex="-1">Mark as read</button>
      <button id="more1" tabindex="-1" aria-haspopup="menu">More actions</button>
    </div>
  </li>
  <li role="treeitem" aria-actions="flag2 read2 more2" tabindex="-1">
    Fwd: draft proposal for kitten acquisition
    <div class="actions">
      <button id="flag2" tabindex="-1">Flag</button>
      <button id="read2" tabindex="-1">Mark as read</button>
      <button id="more2" tabindex="-1" aria-haspopup="menu">More actions</button>
    </div>
  </li>
</ul>

Proposed screen reader interface

There are two primary user needs when it comes to surfacing and using actions associated with a control:

Inform the user that actions exist for the focused control
Provide a method to directly browse and activate associated actions

When the user lands on a control with actions, the screen reader should inform them that actions are available. Two potential approaches are:

"control name, actions available"
"control name, 2 actions available"

Announcing that actions exist, even without a separate UI to invoke them, would be the minimum bar for supporting aria-actions. Authors are required to provide a scripted way to access or invoke them (e.g. pressing right arrow on an option to access edit/delete buttons, or adding a Delete key handler to remove tabs), so users should theoretically be able to use actions out of the box, provided they are aware the actions exist and can find them. If the screen reader provides localized "has actions"-style help text, it would at least be an improvement on what exists today.
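The two announcement formats above could be produced by something as small as the following. This is only a sketch: `describeControl` and its parameters are hypothetical, the singular "1 action available" wording is an assumption, and a real screen reader would localize the strings.

```typescript
// Hypothetical helper that builds the spoken string for a focused control.
// `includeCount` switches between the two approaches listed above.
function describeControl(
  controlName: string,
  actionCount: number,
  includeCount: boolean
): string {
  if (actionCount <= 0) return controlName; // nothing extra to announce
  if (!includeCount) return `${controlName}, actions available`;
  const noun = actionCount === 1 ? "action" : "actions"; // assumed singular form
  return `${controlName}, ${actionCount} ${noun} available`;
}

console.log(describeControl("control name", 2, false)); // "control name, actions available"
console.log(describeControl("control name", 2, true)); // "control name, 2 actions available"
```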

The more complex implementation challenge is providing a built-in UI for users to directly invoke associated actions without leaving the primary control. A screen-reader-provided actions UI would be the ideal that aria-actions is aiming for, since it would provide consistency and ease of use to end users. The mappings and authoring requirements of aria-actions are aimed at allowing screen readers to do the following:

query all actions associated with a given control
get the accessible names of all associated actions
present an on-demand menu of the actions to the user
programmatically invoke an associated action when chosen by the user

Given the following markup:

<li role="treeitem" aria-actions="flag1 read1 more1" tabindex="0">
  Re: meeting agenda for TPAC
  <div class="actions">
    <button id="flag1" tabindex="-1">Flag</button>
    <button id="read1" tabindex="-1">Mark as read</button>
    <button id="more1" tabindex="-1" aria-haspopup="menu">More actions</button>
  </div>
</li>

The final UX could approximate something like the following:

"Re: meeting agenda for TPAC, 3 actions available, press (keyboard command) to access actions"
(press keyboard command)
"Flag, 1 of 3"
(press enter)
"Re: meeting agenda for TPAC, flagged, 3 actions available, press (keyboard command) to access actions"
etc.
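To make the flow above a bit more concrete, here is a minimal sketch of the four screen reader steps (query actions, read their names, present a menu, invoke on selection) using an in-memory model. `ActionNode`, `ControlNode`, and `ActionsMenu` are hypothetical names, and `invoke()` stands in for whatever platform accessibility API call triggers the action.

```typescript
// Simplified model of what the accessibility tree exposes; all names are hypothetical.
interface ActionNode {
  accessibleName: string;
  invoke(): void; // stands in for the platform's programmatic activation
}

interface ControlNode {
  accessibleName: string;
  actions: ActionNode[]; // resolved from aria-actions by the browser
}

class ActionsMenu {
  private control: ControlNode;
  private index = 0;

  constructor(control: ControlNode) {
    this.control = control;
  }

  // Steps 1 and 2: query the associated actions and read their accessible names.
  names(): string[] {
    return this.control.actions.map((action) => action.accessibleName);
  }

  // Step 3: the string spoken for the currently selected menu item.
  current(): string {
    const action = this.control.actions[this.index];
    if (!action) return "no actions available";
    return `${action.accessibleName}, ${this.index + 1} of ${this.control.actions.length}`;
  }

  // Move to the next item, wrapping around at the end.
  next(): string {
    const total = this.control.actions.length;
    if (total > 0) this.index = (this.index + 1) % total;
    return this.current();
  }

  // Step 4: programmatically invoke the chosen action.
  activate(): void {
    const action = this.control.actions[this.index];
    if (action) action.invoke();
  }
}
```

Opening the menu on the tree item above would then speak `current()`, which yields "Flag, 1 of 3", and pressing Enter would map to `activate()`.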

Open questions

Should screen readers sometimes expose actions defined on ancestor elements when focusing child elements?
Can nodes focused with aria-activedescendant support actions defined on them?

All screen reader UX proposals in this doc are intended only as a rough jumping-off point, not as final requirements. Suggestions for improvement are very welcome!

smhigley/aria-actions-screen-readers.md
