Adapting Speech Recognition in Augmented Reality for Mobile Devices in Outdoor Environments



This paper describes the process of integrating automatic speech recognition (ASR) into a mobile application and explores the benefits and challenges of integrating speech with augmented reality (AR) in outdoor environments. The augmented reality allows end-users to interact with the information displayed and perform tasks, while increasing the user’s perception about the real world by adding virtual information to it. Speech is the most natural way of communication: it allows hands-free interaction and may allow end-users to quickly and easily access a range of features available. Speech recognition technology is often available in most of the current mobile devices, but it often uses Internet to receive the corresponding transcript from remote servers, e.g., Google speech recognition. However, in some outdoor environments, Internet is not always available or may be offered at poor quality. We integrated an off-line automatic speech recognition module into an AR application for outdoor usage that does not require Internet. Currently, speech interaction is used within the application to access five different features, namely: to take a photo, shoot a film, communicate, messaging related tasks, and to request information, either geographic, biometric, or climatic. The application makes available solutions to manage and interact with the mobile device, offering good usability. We have compared the online and off-line speech recognition systems in order to assess their adequacy to the tasks. Both systems were tested under different conditions, commonly found in outdoor environments, such as: Internet access quality, presence of noise, and distractions.


Speech Recognition, Natural Language Processing, Sphinx for Mobile Devices, Augmented Reality, Outdoor Environments


Natural language processing


SLATE'17 Symposium on Languages Applications and Technologies, June 2017

Cited by

No citations found