Speech Engine Remote Control Protocols by treating Speech Engines and Audio Sub-systems as Web Services
This document proposes using the web service framework, based on XML protocols, to implement speech engine remote control protocols (SERCP). It is informational: it illustrates how web services could be used and enumerates the requirements that led to selecting a web service framework. It is not a detailed specification; such a specification is expected to be the output of the SPEECHSC activity, if it is decided to go in this direction.

Speech engines (speech recognition, speaker recognition, speech synthesis, recorders and playback, NL parsers, and any other speech processing engines, e.g. speech detection or barge-in detection) as well as audio sub-systems (audio input and output sub-systems) can be considered as web services. As such, they can be described and asynchronously programmed via WSDL (on top of SOAP), combined in a flow described via WSFL, discovered via UDDI, and asynchronously controlled via SOAP, which also enables asynchronous exchanges between the engines.

This solution has the advantage of providing flexibility, scalability and extensibility while reusing an existing framework that fits the evolution of the web: web services and XML protocols [15]. The proposed framework enables speech applications to control remote speech engines using the standardized mechanism of web services. The control messages may be tuned to the controlled speech engines.
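
As a purely illustrative sketch of what a SOAP-carried control message to a speech engine might look like, the Python fragment below builds and parses a SOAP 1.1 envelope. The namespace `urn:example:sercp`, the operation name `StartRecognition`, and the parameters `grammarUri` and `timeoutMs` are assumptions made for this example only; this document does not define them, and an actual SERCP specification may look quite different.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERCP_NS = "urn:example:sercp"  # hypothetical namespace, not defined by this document


def build_control_message(operation, params):
    """Return a SOAP envelope (as a string) carrying one engine control request."""
    ET.register_namespace("soap", SOAP_ENV)
    ET.register_namespace("sercp", SERCP_NS)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The operation element and its children are hypothetical names.
    request = ET.SubElement(body, f"{{{SERCP_NS}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(request, f"{{{SERCP_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_control_message(xml_text):
    """Extract (operation, params) from an envelope built by build_control_message."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    request = body[0]  # first (and only) child of the SOAP body
    operation = request.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in request}
    return operation, params


message = build_control_message(
    "StartRecognition",
    {"grammarUri": "http://example.com/grammars/digits.grxml", "timeoutMs": 5000},
)
print(message)
```

In this sketch the application side would send `message` to the engine's service endpoint, and the engine side would use something like `parse_control_message` to recover the requested operation and its parameters. The transport (for example HTTP) is deliberately left out.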