Going deep with Accessibility in Soundproof


When we first released Soundproof we had to make the difficult decision to delay polishing our VoiceOver experience for accessibility users.

We had done some work on this already, but due to the nature of our custom UI elements such as the time scrubber and Count-In screen, the work was too much to complete first time around.

In hindsight, I really would like to have done this for the first release. It seems to be common that developers don’t bother with this until later, if at all, but it is extremely useful as it tells you things about your visual UI that you wouldn’t have considered. In short, it is a great test of your visual design.

We wanted to have a VoiceOver experience that was as polished as our visual UI. I’ve found very little detailed information on making a first-class VoiceOver experience, hence the detail included here.

This is going to be quite long, but it will hopefully provide useful insights into the efforts required to make your apps work really well for VoiceOver users. Those users rock, by the way: they are the most helpful beta testers ever.

We had some very kind members of the AppleVis forum test the app for us once we had done most of the work. This resulted in some very interesting feedback…

So, what were the problem areas?

“Your app is useless, I cannot get past the intro screen”

This was a fascinating problem, with wide implications. At startup Soundproof shows a single on-boarding screen with some static text and a single “Create a Setlist” button at the bottom.

This is what is displayed visually when users first run the app

What could possibly have gone wrong here? Well some testers could not find the “Create a Setlist” button. There were at least two causes. First, the element read by VoiceOver at the top of the screen is the big “Get Started” heading. That is all it said in VoiceOver, and it was not a tappable item.

Second, the low density of interactive elements (a single button at the bottom) gets worse the bigger your display is. The fingers of a blind user have to explore much more physical space to find the one thing they can touch.

This was a very surprising discovery. After all, when designing a screen, what could possibly be easier to use than static text with only one button?

For now we chose to add an accessibilityHint to the “Get Started” heading indicating that they can get out of this screen using the button at the bottom, and also by using the “Magic tap” gesture.
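As a rough illustration of the hint described above, here is a minimal sketch. The view controller, the property name and the hint wording are all hypothetical rather than Soundproof's actual code, and the Magic tap handler itself is assumed to live elsewhere.

```objc
#import <UIKit/UIKit.h>

// Hypothetical on-boarding screen. The class and property names are
// illustrative and not taken from Soundproof.
@interface OnboardingViewController : UIViewController
@property (nonatomic, strong) UILabel *headingLabel;
@end

@implementation OnboardingViewController

- (void)viewDidLoad {
    [super viewDidLoad];

    self.headingLabel = [[UILabel alloc] initWithFrame:CGRectMake(20, 40, 280, 44)];
    self.headingLabel.text = @"Get Started";
    // VoiceOver reads the hint after a short pause, once the label text
    // itself has been spoken. The wording here is an assumption.
    self.headingLabel.accessibilityHint =
        @"Use the Create a Setlist button at the bottom of the screen, "
        @"or two-finger double tap, to continue.";
    [self.view addSubview:self.headingLabel];
}

@end
```

The hint costs nothing visually, which is why it works as a stopgap before restyling the heading as a button.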

We should probably make the “Get Started” item a button as well, as when you can’t see how it is visually styled as a heading it seems like an obvious call to action, but this should do for now. The user could have spotted that VoiceOver did not say “Get Started. Button”, but this is easily missed.

Right there is another lesson: font stylings are obviously invisible to VoiceOver users. However we rely on these a lot in modern UI to indicate functionality. Forget worrying about visual cues disappearing on buttons in iOS 7 and up: nobody using VoiceOver can tell one bit of text’s importance relative to another.

The Player

Our main player interface is relatively simple, but there are a number of elements that provide challenges with VoiceOver.

Soundproof Player screen

Statistics

Our statistics view drops down from behind the navigation bar to show practice stats when the player is not active.

Being a UIView that contains an animated UILabel that rotates between different stats, we had to make the container view return YES from isAccessibilityElement and return an aggregated value for accessibilityValue so that VoiceOver could speak the full set of statistics rather than just those that were currently visible in the animated display.
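A sketch of what such a container might look like follows. The class name, the allStatistics property and the label text are assumptions made for illustration; the real view will differ.

```objc
#import <UIKit/UIKit.h>

// Hypothetical statistics container. `allStatistics` is assumed to hold
// every string the animated label cycles through.
@interface StatisticsView : UIView
@property (nonatomic, copy) NSArray<NSString *> *allStatistics;
@end

@implementation StatisticsView

// Presenting the container as one element hides the animated label
// inside it from VoiceOver.
- (BOOL)isAccessibilityElement {
    return YES;
}

- (NSString *)accessibilityLabel {
    return @"Practice statistics"; // assumed wording
}

// Speak every statistic, not just the one currently on screen.
- (NSString *)accessibilityValue {
    return [self.allStatistics componentsJoinedByString:@". "];
}

@end
```

Joining with a full stop and a space gives VoiceOver a natural pause between statistics.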

Track selector

Our track selector is actually a special UICollectionView derivative that supports infinite scrolling through the Setlist.

This caused a slew of problems with VoiceOver.

Problem 1: the built-in collection view VoiceOver handling would not work as page offsets would be incorrect.

To replace incorrect page announcements such as “Page 2 of 7” (these are tracks, not pages, and the setlist may only contain 5 of them) with something like “Track 1 of 5”, we had to add a smart implementation of the hard-to-find accessibilityScrollStatusForScrollView: method of UIScrollViewAccessibilityDelegate.
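To give an idea of the shape of that delegate method, here is a minimal sketch. It assumes one full-width page per track and an infinite scroller that simply repeats the setlist, so the page index is wrapped with a modulo; the class name and the trackCount property are hypothetical.

```objc
#import <UIKit/UIKit.h>

// Hypothetical delegate object for the track selector.
@interface TrackSelectorDelegate : NSObject <UICollectionViewDelegate, UIScrollViewAccessibilityDelegate>
// Number of real tracks in the setlist (assumed to be kept up to date elsewhere).
@property (nonatomic) NSInteger trackCount;
@end

@implementation TrackSelectorDelegate

// VoiceOver speaks the returned string after a scroll instead of
// its default "Page x of y" announcement.
- (NSString *)accessibilityScrollStatusForScrollView:(UIScrollView *)scrollView {
    CGFloat pageWidth = CGRectGetWidth(scrollView.bounds);
    if (self.trackCount <= 0 || pageWidth <= 0) {
        return nil;
    }
    NSInteger page = (NSInteger)lround(scrollView.contentOffset.x / pageWidth);
    // Wrap the (possibly repeated or negative) page index onto a real track index.
    NSInteger trackIndex = ((page % self.trackCount) + self.trackCount) % self.trackCount;
    return [NSString stringWithFormat:@"Track %ld of %ld",
            (long)(trackIndex + 1), (long)self.trackCount];
}

@end
```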

Problem 2: the single visible cell was not being used for accessibility properties.

Problem 3: tapping the visible collection view cell did not announce track information.

To solve these two related problems we had to make our custom collection view delegate its accessibilityValue and accessibilityHint properties to the first (only) visible cell, and we had to make the collection view act as an accessibility element itself with isAccessibilityElement returning YES.
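In outline, the forwarding could look something like the sketch below. The subclass name and the currentCell helper are hypothetical, and it assumes only one cell is ever visible.

```objc
#import <UIKit/UIKit.h>

// Hypothetical collection view subclass that speaks on behalf of its
// single visible cell.
@interface TrackSelectorView : UICollectionView
@end

@implementation TrackSelectorView

// The collection view itself becomes the element VoiceOver focuses.
- (BOOL)isAccessibilityElement {
    return YES;
}

// Assumes at most one cell is on screen at a time.
- (UICollectionViewCell *)currentCell {
    return self.visibleCells.firstObject;
}

// Forward to the cell. Messaging nil simply yields nil when no cell is visible.
- (NSString *)accessibilityValue {
    return [self currentCell].accessibilityValue;
}

- (NSString *)accessibilityHint {
    return [self currentCell].accessibilityHint;
}

@end
```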

Problem 4: tapping a track would not tell you which number track you were on in the setlist.

While swiping, the track selector collection view would announce the position in the setlist with for example “Track 2 of 7”, but tapping on the view directly would only tell you the track information, not the position in the Setlist.

So we added the “Track 2 of 7” information to the end of the accessibilityValue for the current track cell. This results in some duplication of information when you swipe between tracks where it says:

“Track 3 of 7. Current track: Enviovore by Cephalic Carnage. Track 3 of 7.”

However by putting the information at the end of the value, we minimise the annoyance this could cause. It did not seem possible to change the announcement only when paging has just taken place. So if we want to have swiping announce the track number, and tapping also to convey this information, we have to live with it.

Time scrubber

The time scrubber is based on UISlider encapsulated in a custom view. We track the start and end of drag events on the slider and perform our zoomed time display animations in response to these, and then track the changes to update the player’s position when the user stops dragging. During the drag we update the “Add marker” button title to adapt to the presence of markers and the current play position so that the button doesn’t read “Set end marker” when you scrub the time to before the existing (start) marker.

Problem 1: with VoiceOver on, changes to UISlider values don’t trigger drag events (well, duh), they only generate value changed events.

So the solution here was to detect if VoiceOver is running in the “value changed” action of the slider and adjust the state of the “Add marker” buttons there.
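Roughly, that means something like the following sketch. The controller, its properties and the title rules are assumptions for illustration only, and the slider value is assumed to be a time in seconds.

```objc
#import <UIKit/UIKit.h>

// Hypothetical scrubber controller; names and the marker model are illustrative.
@interface ScrubberViewController : UIViewController
@property (nonatomic, strong) UISlider *slider;
@property (nonatomic, strong) UIButton *markerButton;
// Start marker time in seconds, or nil when no marker has been set yet.
@property (nonatomic, strong) NSNumber *startMarker;
@end

@implementation ScrubberViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    self.slider = [[UISlider alloc] initWithFrame:CGRectMake(20, 100, 280, 44)];
    self.markerButton = [UIButton buttonWithType:UIButtonTypeSystem];
    self.markerButton.frame = CGRectMake(20, 160, 280, 44);
    [self.slider addTarget:self
                    action:@selector(sliderValueChanged:)
          forControlEvents:UIControlEventValueChanged];
    [self.view addSubview:self.slider];
    [self.view addSubview:self.markerButton];
}

// Assumed title rules: before the start marker the button offers to move
// the start, after it the button offers to set the end.
- (NSString *)markerButtonTitleForTime:(float)time {
    if (self.startMarker == nil) {
        return @"Add marker";
    }
    return (time < self.startMarker.floatValue) ? @"Set start marker" : @"Set end marker";
}

- (void)sliderValueChanged:(UISlider *)slider {
    // With VoiceOver on there are no drag start/end events to hook into,
    // so the button title is refreshed here instead.
    if (UIAccessibilityIsVoiceOverRunning()) {
        [self.markerButton setTitle:[self markerButtonTitleForTime:slider.value]
                           forState:UIControlStateNormal];
    }
}

@end
```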

Problem 2: the VoiceOver description of the slider is not read as a time, it is just read as the float value of the slider.

To fix this a custom implementation of accessibilityValue is required on the custom view that acts as the accessibility element (parent) of the slider.

Problem 3: with the above issue solved, when the time position is at the start or end of the track, announcing “Time: 0 seconds” or “Time: 9 minutes 41 seconds” is not human enough.

So in this case the code that dynamically calculates the accessibilityValue for the slider needed to be adjusted to say something more friendly, for example:

“Beginning of track” or “Beginning of marker range”
“End of track” or “End of marker range”

I feel detail like this is important because we developers tend to go to this kind of trouble with on-screen labels for relative times and quantities.

Problem 4: the custom view containing the time slider also presents the start and end marker positions visually, however the VoiceOver description does not tell you where the markers are. Thus VoiceOver users have no idea where their markers are when they return to the app or switch tracks.

This was solved by making the slider accessibilityValue also include a description of the marker positions if relevant. So tapping the slider with two markers set might say:

“Time position: 2 minutes 3 seconds. Start marker 32 seconds. End marker 1 minute 18 seconds”
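The string building behind Problems 2 to 4 can be sketched with plain Foundation code. The function names, the half-second tolerance and the rule for when a position counts as the start or end of the marker range are all assumptions; a custom view would return the result from its accessibilityValue.

```objc
#import <Foundation/Foundation.h>
#import <math.h>

// Turns a number of seconds into words, e.g. 78 -> "1 minute 18 seconds".
static NSString *SpokenDuration(NSTimeInterval seconds) {
    NSInteger total = (NSInteger)llround(MAX(0.0, seconds));
    NSInteger minutes = total / 60;
    NSInteger secs = total % 60;
    NSMutableArray<NSString *> *parts = [NSMutableArray array];
    if (minutes > 0) {
        [parts addObject:[NSString stringWithFormat:@"%ld %@",
                          (long)minutes, minutes == 1 ? @"minute" : @"minutes"]];
    }
    if (secs > 0 || minutes == 0) {
        [parts addObject:[NSString stringWithFormat:@"%ld %@",
                          (long)secs, secs == 1 ? @"second" : @"seconds"]];
    }
    return [parts componentsJoinedByString:@" "];
}

// Assumed tolerance for treating two times as the same position.
static BOOL IsNear(NSTimeInterval a, NSTimeInterval b) {
    return fabs(a - b) < 0.5;
}

// Builds the spoken value. Markers are nil when not set.
static NSString *SpokenScrubberValue(NSTimeInterval position, NSTimeInterval duration,
                                     NSNumber *startMarker, NSNumber *endMarker) {
    BOOL hasRange = (startMarker != nil && endMarker != nil);
    NSString *positionPhrase;
    if (hasRange && IsNear(position, startMarker.doubleValue)) {
        positionPhrase = @"Beginning of marker range";
    } else if (hasRange && IsNear(position, endMarker.doubleValue)) {
        positionPhrase = @"End of marker range";
    } else if (IsNear(position, 0)) {
        positionPhrase = @"Beginning of track";
    } else if (IsNear(position, duration)) {
        positionPhrase = @"End of track";
    } else {
        positionPhrase = [NSString stringWithFormat:@"Time position: %@",
                          SpokenDuration(position)];
    }
    NSMutableArray<NSString *> *sentences = [NSMutableArray arrayWithObject:positionPhrase];
    if (startMarker != nil) {
        [sentences addObject:[NSString stringWithFormat:@"Start marker %@",
                              SpokenDuration(startMarker.doubleValue)]];
    }
    if (endMarker != nil) {
        [sentences addObject:[NSString stringWithFormat:@"End marker %@",
                              SpokenDuration(endMarker.doubleValue)]];
    }
    return [sentences componentsJoinedByString:@". "];
}
```

Keeping this as a pure function, separate from the view, makes the wording easy to check without running VoiceOver at all.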

The “Add marker” button

The marker interface is deliberately super-simple. Covered in a previous post, we have a one-button interface for all marker features which shows the only action you can take based on the current marker state. This means it alternates between “Add marker”, “Set start marker”, “Set end marker” and “Clear markers”.

Adding markers
Any of these actions will change the state of the time slider which has the markers on it, and possibly throw the player into “Repeat markers mode” automatically. It also changes the state of the button itself which can cause VoiceOver to re-announce that button.

The best solution we could come up with here was to post a message to indicate a layout change, pointing at the slider:

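// Only bother when VoiceOver is actually running.
if (UIAccessibilityIsVoiceOverRunning()) {
  // Passing the slider as the argument asks VoiceOver to move focus
  // to it and read out its current value.
  UIAccessibilityPostNotification(
    UIAccessibilityLayoutChangedNotification,
    self.timeLineScrubber);
}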

This causes the slider to receive VoiceOver focus and have its current values announced, including the state of the markers. In fact we do this any time anything changes the state of the markers.

This does have the side effect that the accessibility focus switches from the button the user had selected, and sometimes results in a clipped announcement of the new button title, before the focus switches to the slider. This appears to be an unavoidable problem with VoiceOver (see later notes).

User does not know state at startup

When the application became active, VoiceOver would select the first accessible element, the “Settings” button. However you would not know what state the player was in, or what track was active.

For this we added the UIAccessibilityTraitSummaryElement trait to the track selector collection view. The result is that when the player becomes active, the accessibilityValue of the track selector is announced, which includes the current track information.
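Adding the trait is a one-liner. The helper function below is hypothetical and only exists to make the sketch self-contained.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper: marks a view as the screen's summary element.
static void MarkAsSummaryElement(UIView *trackSelector) {
    trackSelector.isAccessibilityElement = YES;
    // OR the trait in so that any existing traits are preserved.
    trackSelector.accessibilityTraits |= UIAccessibilityTraitSummaryElement;
}
```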

Count-In screen

Problem 1: VoiceOver does not have a chance to announce the track change when the user skips tracks with the next or previous track buttons while the player is playing.

This problem actually highlighted a non-trivial issue. For the UI to skip to a new track and then use the normal track selector VoiceOver announcements, we would have to wait for that announcement to finish before showing the Count-In screen. The Count-In screen itself can be set to announce tracks even for non-VoiceOver users, which adds more complication.

Rather than deal with the myriad problems of delaying the skip to a new track until an announcement has finished (which seems to be hard/impossible, see later notes), we decided that when VoiceOver is running, we would pause playback when the track skip buttons are pressed.

This pragmatic solution means the user is not bombarded with conflicting announcements and does not enter some strange state where they have to wait for something to complete, which is not compatible with the goals of VoiceOver which is designed to be constantly interruptible to avoid slowing users down.

Note: we tried to make the track selector get accessibility focus using UIAccessibilityLayoutChangedNotification when the track is changed, but it is apparently ignored or cancelled by the normal button announcement. Solutions on a postcard to…

Problem 2: the Count-In screen announces the track info while our own announcer is speaking, because VoiceOver automatically speaks the first accessible element.

As we have our own voice announcer even for non-VoiceOver users, the solution was to set accessibilityElementsHidden = YES on the parent of the track info view, which hides all the subviews from VoiceOver. Using isAccessibilityElement = NO would not help, as the subviews have it set to YES. This makes sense, but can seem counter-intuitive at first.
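To make the distinction concrete, here is a small sketch. The function and view names are made up for illustration.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper that builds a track info container whose labels
// are kept away from VoiceOver.
static UIView *MakeSilentTrackInfoView(NSString *title) {
    UIView *container = [[UIView alloc] initWithFrame:CGRectMake(0, 0, 320, 80)];
    UILabel *titleLabel = [[UILabel alloc] initWithFrame:container.bounds];
    titleLabel.text = title;
    [container addSubview:titleLabel];

    // Setting container.isAccessibilityElement = NO would not be enough:
    // the label is an accessibility element in its own right.
    // This flag hides the container's whole subtree from VoiceOver instead.
    container.accessibilityElementsHidden = YES;
    return container;
}
```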

Setlists screen

We have a screen that lists all of the Setlists the user has created, and it uses a table view.

We had this covered already with a custom accessibilityValue for each table view cell, but we found a fun bug. Highlighting a setlist would tell you the setlist name and then how many tracks are in it. This was interesting if the setlist name ended with a number. It would say something like:

“New setlist 1 point 6 tracks”

This was due to the accessibilityValue being “New setlist 1.6 tracks”. We were missing a space after the full stop, so the number of tracks was running together with the end of the setlist name, which by default has a number at the end.

This might not have been noticed if we quoted the setlist name in the string, e.g: ”New setlist 1”.6 tracks assigned to an NSString including the quotes. I can’t show the quotes escaped here due to a bug in WordPress Markdown handling.

Sadly we couldn’t quote the setlist name because of a VoiceOver bug we found (see later notes).
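The fix itself is tiny. A sketch of the corrected string building, with a hypothetical function name:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper that builds the spoken value for a setlist row.
static NSString *SetlistCellValue(NSString *name, NSUInteger trackCount) {
    NSString *noun = (trackCount == 1) ? @"track" : @"tracks";
    // The space after the full stop is the important part: without it a
    // name ending in a digit runs into the count and reads as a decimal.
    return [NSString stringWithFormat:@"%@. %lu %@",
            name, (unsigned long)trackCount, noun];
}
```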

Adding nice things for VoiceOver users

Once we had the basics covered, we could add some nice-to-haves for VoiceOver users.

Magic tap

The so-called “Magic tap” is a two-finger double tap anywhere on the screen. This is very handy as it is an easy gesture to perform that requires little accuracy.

So when you are on the player screen we made this gesture equivalent to the play/pause button. Just two-finger double tap anywhere with VoiceOver on and it will play or pause without talking to you.

On the Count-In screen we made this gesture perform the “Start” button action to skip the Count-In and instantly start the track.

We may extend this functionality in future releases to other screens such as to dismiss the track selector UI, or to play the selected track in Setlist view.

Scrub / escape

There is also a special two-finger “scrub” gesture for cancelling actions when VoiceOver is enabled.

We implemented this on the Count-In screen to perform the “Cancel” action which will cancel the Count-In and abandon playback.

Again, there is scope here to introduce this to other view controllers to avoid users having to hunt for the “Cancel” button.
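Both gestures are handled by overriding methods from the UIAccessibility action category. The sketch below uses a hypothetical Count-In view controller with placeholder actions that just record what happened.

```objc
#import <UIKit/UIKit.h>

// Hypothetical Count-In screen. The flags stand in for the real
// "Start" and "Cancel" actions.
@interface CountInViewController : UIViewController
@property (nonatomic) BOOL didStart;
@property (nonatomic) BOOL didCancel;
@end

@implementation CountInViewController

// Two-finger double tap: skip the Count-In and start the track.
- (BOOL)accessibilityPerformMagicTap {
    self.didStart = YES; // placeholder for the "Start" button action
    return YES;          // YES tells VoiceOver the gesture was handled
}

// Two-finger scrub: cancel the Count-In and abandon playback.
- (BOOL)accessibilityPerformEscape {
    self.didCancel = YES; // placeholder for the "Cancel" action
    return YES;
}

@end
```

Returning YES stops VoiceOver passing the gesture further up the responder chain.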

Seemingly insurmountable problems

We did hit some issues with VoiceOver that we couldn’t find solutions for at this time.

Please shut up, VoiceOver

The biggest problem was that it seems impossible to prevent VoiceOver default announcements when you want to control what is said. For example if you want to say something after a button is tapped, it does not seem possible to reliably do this, because the button’s label is spoken as soon as the tap occurs.

If you try to use the UIAccessibilityAnnouncementNotification or UIAccessibilityLayoutChangedNotification in this situation you typically find your announcement or change does not occur, or there is a clipped announcement and then it switches back to the button label announcement, or vice versa.

This is very frustrating when a button action causes a change in UI state. We explored queuing our announcement internally and waiting for VoiceOver announcements to complete and then posting the notifications to make our announcements, but it does not seem possible to detect when VoiceOver finishes speaking its own announcements. You can hook into notifications for your own announcements, but not the default ones.
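For reference, hooking into the completion of your own announcements looks something like this sketch. The tracker class and its properties are hypothetical. Note that it only hears about announcements the app posted itself, which is exactly the limitation described above.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper that posts announcements and records how they ended.
@interface AnnouncementTracker : NSObject
@property (nonatomic, copy) NSString *lastAnnouncement;
@property (nonatomic) BOOL lastAnnouncementSucceeded;
- (void)announce:(NSString *)text;
@end

@implementation AnnouncementTracker

- (instancetype)init {
    self = [super init];
    if (self) {
        [[NSNotificationCenter defaultCenter]
            addObserver:self
               selector:@selector(announcementDidFinish:)
                   name:UIAccessibilityAnnouncementDidFinishNotification
                 object:nil];
    }
    return self;
}

- (void)dealloc {
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

- (void)announce:(NSString *)text {
    UIAccessibilityPostNotification(UIAccessibilityAnnouncementNotification, text);
}

// Called when VoiceOver finishes, or abandons, an announcement we posted.
- (void)announcementDidFinish:(NSNotification *)notification {
    NSDictionary *info = notification.userInfo;
    self.lastAnnouncement = info[UIAccessibilityAnnouncementKeyStringValue];
    // NO here means the announcement was interrupted before completing.
    self.lastAnnouncementSucceeded =
        [info[UIAccessibilityAnnouncementKeyWasSuccessful] boolValue];
}

@end
```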

You could work around this by posting your announcement after a fixed time delay. However I find such “fixes” abhorrent as they will no doubt be unreliable or have other unforeseen side effects.

This issue causes problems all over the place. We hope Apple improves this in future. We need a way to know when VoiceOver has finished speaking, and a way to say “this layout change is all you should speak”.

Selecting a default item

We want play/pause or the track selector to be active at startup, but VoiceOver insists on selecting the first accessible element, the “Settings” bar button. Yes, we tried posting a layout change notification to focus our element when the view controller appears. It didn’t work.

View controller announcements

We wanted to announce the track selector view controller when it appears, as it has no title bar. It was too hard to suppress or delay the initial segmented control announcement. We have to assume the user knew what they were doing when they pressed the button to add tracks and enter this UI.

However if the user leaves our app and returns to it later it is not clear what they are doing.

It’s possible this could be worked around to some extent with a hidden UI element that has the summary element trait.

Other observations

Here’s a quick dump of other little issues or tips we found:

UIStepper is bad for accessibility. Don’t use it. It is hard (or impossible?) to get it to announce the current value represented by the stepper, or bind it to the relevant view that contains its current state. We changed our settings UI to push a new view controller that uses a table view that contains rows for all the possible values. This works out nicer for everyone really.

UISwitch controls embedded in table view cells in Settings needed to be wrapped in a custom view that handles VoiceOver properties in aggregate, so that the label and value can be announced together. In theory a single activation (double tap) can toggle the switch on that row by implementing accessibilityActivate… except that is seemingly never called, so we frigged it with accessibilityActivationPoint set to return the center of the switch.
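A wrapper along those lines might look like the sketch below. The class name, properties and the "On"/"Off" wording are assumptions, and layout code is omitted.

```objc
#import <UIKit/UIKit.h>

// Hypothetical wrapper that presents a label and a switch to VoiceOver
// as a single element. Subview layout is left out for brevity.
@interface SwitchRowView : UIView
@property (nonatomic, strong) UILabel *titleLabel;
@property (nonatomic, strong) UISwitch *toggle;
@end

@implementation SwitchRowView

- (instancetype)initWithFrame:(CGRect)frame {
    self = [super initWithFrame:frame];
    if (self) {
        _titleLabel = [[UILabel alloc] init];
        _toggle = [[UISwitch alloc] init];
        [self addSubview:_titleLabel];
        [self addSubview:_toggle];
    }
    return self;
}

// One element for the whole row, so label and value are read together.
- (BOOL)isAccessibilityElement {
    return YES;
}

- (NSString *)accessibilityLabel {
    return self.titleLabel.text;
}

- (NSString *)accessibilityValue {
    return self.toggle.isOn ? @"On" : @"Off"; // assumed wording
}

// VoiceOver's double tap is delivered at this point, which must be in
// screen coordinates. Aiming it at the switch makes the row toggle.
- (CGPoint)accessibilityActivationPoint {
    CGRect frame = UIAccessibilityConvertFrameToScreenCoordinates(self.toggle.bounds,
                                                                  self.toggle);
    return CGPointMake(CGRectGetMidX(frame), CGRectGetMidY(frame));
}

@end
```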

Tappable labels in Settings, which are static table view cells, benefitted from adding the Button trait so the user knows they can activate that row to make something happen.

Presenting a pile of static text with multiple headings on our on-boarding screen is painful for VoiceOver, as each is announced individually as your finger travels across the screen. There’s an argument to be had for presenting this as a single block of text that can be read out, using a custom container view. Referring back to the start of this post, a user doesn’t know a heading from a paragraph that follows it, so you lose the natural narrative flow of text when you present items as just a set of labels.

It is tempting to try to announce all app state all the time, but this would likely be very annoying given people typically do have a good short term memory and vision impairment does not normally affect this! So we asked ourselves “Did the user instigate this UI state change?”. If the answer was “Yes”, we didn’t try to announce what happened.

Custom UI elements are fairly painful for accessibility in terms of working out what the right behaviour is. Aggregating controls means aggregating VoiceOver properties.

The UIAccessibility category is not a great pattern because it forces the view to control its labelling, which stands in the way of class reuse. You need to add custom properties to be used as e.g. accessibility label prefixes, to make the views reusable, instead of having IB or the view controller populate these properties as required. Weirdly the accessibilityScrollStatusForScrollView method is in a delegate protocol which makes more sense, but others are not because of the UIAccessibility category added to NSObject. In my view UIKit should consider delegating all accessibility properties and methods to the delegates of controls. Obviously many controls don’t have delegates and rely on Target/Action, but perhaps the time has come…

Mispronunciations are rife in VoiceOver and there is apparently no way to correct it with specific phonetics.

Finally, there is a fun VoiceOver bug in quote handling. This “Track 01”, if you include the quotes in the value, is pronounced as “Track zero-one inches”. This is even with fancy unicode quotes. Yes, I radar’d it.

Soundproof is available in the App Store now. The build with these VoiceOver changes is pending review.

The Author

Marc Palmer (Twitter, Mastodon) is a consultant and software engineer specialising in Apple platforms. He currently works on the iOS team of Concepts sketching app, as well as his own apps like video subtitle app Captionista. He created the Flint open source framework. He can also do a pretty good job of designing app products. Don't ask him to draw anything, because that's really embarrassing. You can find out more here.