Wheeler, J., "The Effect of Vehicle Noise on Automatic Speech Recognition Systems," SAE Technical Paper 2017-01-1864, 2017.
The performance of a vehicle’s Automatic Speech Recognition (ASR) system depends on the signal-to-noise ratio (SNR) in the cabin at the time a user voices a command. In particular, HVAC noise and environmental noise (such as road and wind noise) contribute high-amplitude broadband content that lowers the SNR within the vehicle cabin and masks the user’s speech. Managing this noise is key to building a vehicle that meets the customer’s expectations for ASR performance. However, a speech recognition engineer is not likely to be the same person responsible for designing the tires, suspension, air ducts and vents, sound package, and exterior body shape that determine the amount of noise present in the cabin. If objective relationships are drawn between the vehicle-level performance of the ASR system and the vehicle- or system-level performance of the individual noise, vibration, and harshness (NVH) attributes, a partnership between the groups can be brokered: compatible targets can be set and hardware selected that meets both groups’ goals. This paper examines the NVH attributes and performance metrics that relate to vehicle-level ASR performance, and finds that strong relationships and statistical trends can be drawn between the Sentence Error Rate (SER%) and standard NVH metrics for a given road surface or HVAC configuration. The paper also establishes that AI% should be the preferred metric to relate cabin noise to ASR performance in the presence of any other kind of steady-state noise.
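As a minimal illustration of the two quantities the abstract relates, the sketch below computes an SNR in decibels from RMS amplitudes and a sentence error rate as a percentage. The function names, and the definition of SER% as the share of sentences containing at least one recognition error, are assumptions made for illustration; the paper's own measurement procedure may differ.

```python
import math


def snr_db(speech_rms: float, noise_rms: float) -> float:
    """Signal-to-noise ratio in dB from RMS amplitudes (hypothetical helper).

    Uses 20 * log10 of the amplitude ratio, the usual convention for
    amplitude (not power) quantities.
    """
    if speech_rms <= 0 or noise_rms <= 0:
        raise ValueError("RMS values must be positive")
    return 20.0 * math.log10(speech_rms / noise_rms)


def sentence_error_rate(failed_sentences: int, total_sentences: int) -> float:
    """SER% under the assumed definition: failed sentences / total * 100."""
    if total_sentences <= 0:
        raise ValueError("total_sentences must be positive")
    if not 0 <= failed_sentences <= total_sentences:
        raise ValueError("failed_sentences must be between 0 and total_sentences")
    return 100.0 * failed_sentences / total_sentences


if __name__ == "__main__":
    # Illustrative numbers only, not data from the paper.
    print(f"SNR: {snr_db(1.0, 0.1):.1f} dB")
    print(f"SER: {sentence_error_rate(5, 50):.1f} %")
```

Pairs of such values, collected per road surface or HVAC configuration, are the kind of data from which the trends described above could be drawn.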