Apple’s online Machine Learning Journal has published a paper on the methods the HomePod uses to make Siri work in far-field settings. As Apple’s Audio Software Engineering and Siri Speech Teams explain:
> Siri on HomePod is designed to work in challenging usage scenarios such as:
>
> - During loud music playback
> - When the talker is far away from HomePod
> - When other sound sources in a room, such as a TV or household appliances, are active
Each of those conditions requires a different technique to separate a spoken Siri command from other household sounds, and the separation has to be done efficiently: the paper notes that the HomePod’s speech enhancement system uses less than 15% of one core of its 1.4 GHz A8 processor.
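The paper itself describes a multichannel system tuned for these scenarios, which is well beyond the scope of a summary. As a rough, hypothetical illustration of the general idea behind noise reduction, though, here is a minimal single-channel spectral-subtraction sketch in Python. It is not Apple’s algorithm; the `enhance` function and its assumption that the recording opens with noise only are inventions for illustration:

```python
# Toy single-channel spectral-subtraction denoiser. NOT Apple's algorithm
# (the paper describes a multichannel system); this only illustrates the
# general idea of estimating a noise spectrum and subtracting it.
import numpy as np
from scipy.signal import stft, istft

def enhance(audio, sample_rate, noise_seconds=0.5):
    """Suppress stationary noise, assuming (hypothetically) that the
    first `noise_seconds` of `audio` contain noise only."""
    f, t, spec = stft(audio, fs=sample_rate, nperseg=512)
    mag, phase = np.abs(spec), np.angle(spec)

    # Estimate the noise magnitude spectrum from the leading frames.
    hop = 256  # stft's default hop is nperseg // 2
    noise_frames = max(1, int(noise_seconds * sample_rate / hop))
    noise_mag = mag[:, :noise_frames].mean(axis=1, keepdims=True)

    # Subtract the noise estimate, flooring at a small fraction of the
    # original magnitude to limit "musical noise" artifacts.
    clean_mag = np.maximum(mag - noise_mag, 0.05 * mag)

    _, clean = istft(clean_mag * np.exp(1j * phase),
                     fs=sample_rate, nperseg=512)
    return clean
```

Even this toy version hints at why efficiency matters: the work scales with the number of frequency bins and frames, and a real system runs it continuously on every microphone channel.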
Apple engineers tested their speech enhancement system under a variety of conditions:
> We evaluated the performance of the proposed speech processing system on a large speech test set recorded on HomePod in several acoustic conditions:
>
> - Music and podcast playback at different levels
> - Continuous background noise, including babble and rain noise
> - Directional noises generated by household appliances such as a vacuum cleaner, hairdryer, and microwave
> - Interference from external competing sources of speech
>
> In these recordings, we varied the locations of HomePod and the test subjects to cover different use cases, for example, in living room or kitchen environments where HomePod was placed against the wall or in the middle of the room.
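The excerpt doesn’t say how the team scored those recordings. If you wanted to run a similar comparison yourself on time-aligned pairs of clean and processed clips, one common sanity check is signal-to-noise ratio; the `snr_db` helper below is hypothetical, not anything from the paper:

```python
# Hypothetical evaluation helper: given a time-aligned clean reference and
# a processed recording, report SNR in dB. Higher is better; comparing the
# value before and after enhancement gives the SNR improvement.
import numpy as np

def snr_db(reference, processed):
    """SNR of `processed` against `reference`; both are 1-D float arrays
    of equal length with identical alignment."""
    noise = processed - reference
    return 10 * np.log10(np.sum(reference**2) / np.sum(noise**2))
```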
The paper concludes with examples of filtered and unfiltered audio from those HomePod tests. Even if you aren’t interested in the details of noise reduction technology, the sample audio clips are worth a listen. It’s impressive to hear barely audible commands emerge from background noise like a running dishwasher or music playback.