Voice Input

The Voice Input pattern supports the design of a voice or speech interface. It relies on a solid understanding of the underlying user task and of how to apply Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and speech output to support the user. There are three levels of speech interaction:

  1. ASR with text output
  2. ASR combined with NLP and text output
  3. ASR combined with NLP and spoken text output
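The three levels above can be sketched as an incremental pipeline, where each level adds one stage to the previous level. The stage functions below are hypothetical stand-ins for a platform's ASR, NLP, and text-to-speech services, not a real API:

```python
def recognize_speech(audio: bytes) -> str:
    """ASR stage: convert captured audio to a raw text transcript.
    (Stand-in: a real app would call the platform recognizer.)"""
    return audio.decode("utf-8")

def interpret(transcript: str) -> str:
    """NLP stage: map the raw transcript to a normalized user intent.
    (Stand-in for a real language-understanding service.)"""
    return f"intent:{transcript.strip().lower()}"

def speak(text: str) -> str:
    """Speech-output stage: hand the result to a TTS engine.
    (Stand-in: returns a marker instead of producing audio.)"""
    return f"<spoken>{text}</spoken>"

def voice_pipeline(audio: bytes, level: int) -> str:
    """Run a voice interaction at level 1, 2, or 3 as described above."""
    result = recognize_speech(audio)   # level 1: ASR with text output
    if level >= 2:
        result = interpret(result)     # level 2: add NLP
    if level >= 3:
        result = speak(result)         # level 3: add spoken output
    return result
```

The point of the sketch is that the levels are strictly additive: a design can start at level 1 and layer on NLP and speech output without changing the capture stage.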

All of these interactions leverage constantly improving ASR engines and address one of the biggest obstacles of mobile interfaces: typing on a soft keyboard is an inefficient input method. Each mobile OS has a different solution, but all take a similar approach to voice input.

Appearance

Appearance characteristics for this pattern.

The voice interaction pattern has four basic states: Standby, Active, Processing, and Output (see Figures 1-4 below).
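A minimal sketch of how the four states shown in the figures below (Standby, Active, Processing, Output) might transition. The event names are illustrative assumptions, not a platform API:

```python
# The four basic states of the voice interaction pattern.
STANDBY, ACTIVE, PROCESSING, OUTPUT = "Standby", "Active", "Processing", "Output"

# Allowed transitions; event names are hypothetical, chosen for illustration.
TRANSITIONS = {
    (STANDBY, "mic_tapped"): ACTIVE,       # user invokes voice input
    (ACTIVE, "speech_ended"): PROCESSING,  # end of speech detected
    (ACTIVE, "cancelled"): STANDBY,        # user backs out of listening
    (PROCESSING, "result_ready"): OUTPUT,  # ASR/NLP produced a result
    (PROCESSING, "error"): STANDBY,        # recognition failed
    (OUTPUT, "dismissed"): STANDBY,        # result delivered; return to rest
}

def next_state(state: str, event: str) -> str:
    """Return the next state, staying in the current state on an unknown event."""
    return TRANSITIONS.get((state, event), state)
```

A design choice worth noting: every error or cancellation path returns to Standby, so the interface always has a single, predictable resting state.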

Behavior

Common behaviors for this pattern.

Usage

Usage guidelines for this pattern.

General voice and Natural Language Processing (NLP) guidelines:

Related

Standby

Fig 1. Voice Input - Standby

Active

Fig 2. Voice Input - Active

Processing

Fig 3. Voice Input - Processing

Output

Fig 4. Voice Input - Output