In my previous post, I talked about the Windows Vista/7 sound architecture and how it is not realtime, which causes many problems. During my research I came across a lot of information and many articles online discussing this specific topic. Many people seem to be unhappy with this design.

My main goal is to plug the guitar into the line in, monitor it thru the speakers, and be able to play and record. This requires realtime, low-latency monitoring of the line-in signal (a related but separate concern from bit-perfect streaming, which is about passing the samples through unmodified). On Windows 7, monitoring has such high latency that it throws off my playing, which makes it impossible to play, monitor thru the speakers, and record. If it were just a matter of the recording having latency, that could be adjusted during mixdown of the tracks. But for better recording, and to avoid having to readjust tracks, it makes sense to reduce latency as much as possible. By the way, I use Reaper for multitracking, which means that Reaper plays all existing tracks while recording the new one.
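
To put a rough number on why this matters, latency is just buffer size divided by sample rate, and monitoring pays it at least twice: once to capture the guitar, once to play it back out. Here is a back-of-the-envelope sketch with made-up buffer sizes (the 4410-frame figure is purely illustrative, not a measurement of Windows 7):

```cpp
#include <cstdio>

int main() {
    // Hypothetical figures: a 4410-frame device buffer at 44.1 kHz.
    const double sampleRate   = 44100.0;  // frames per second
    const double bufferFrames = 4410.0;   // frames per buffer

    double oneWayMs = bufferFrames / sampleRate * 1000.0;  // = 100 ms

    // Monitored audio fills a capture buffer and then a playback buffer,
    // so the player hears at least the sum of the two.
    printf("one way: %.0f ms, monitoring round trip: %.0f ms\n",
           oneWayMs, 2.0 * oneWayMs);
    return 0;
}
```

A 200ms round trip is far beyond what a player can tolerate; the usual rule of thumb is that monitoring starts to feel wrong somewhere above 10-20ms.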

There are also a lot of people using Foobar and similar applications to play their media in realtime, without latency. I’m not sure what the point is. If the entire playback has a latency of 200ms, you would never know: playback would simply start 200ms later but would be continuous after that. Moving away from Media Center to another application for all of this seems rather pointless when you would never notice the difference. It would make sense if video were also being played and the audio was out of sync, but that wouldn’t happen, since a media player on the same computer would adjust for all of that.

Anyway, to each their own. Every user has a right to use whatever application they prefer.

Back to the actual architecture. Microsoft has moved all audio streaming out of the kernel and made it user-mode streaming. Microsoft claims this provides a low-latency, secure, reliable, glitch-resilient sound framework: the Microsoft Core Audio API. For low latency, Microsoft provides the Windows Audio Session API (WASAPI), which lets an application control the audio data flow from the application to the audio endpoints. So if an application implements WASAPI, it can run with the minimum latency possible.
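
For the curious, here is a minimal sketch of what talking to WASAPI looks like in C++. This is my own illustration, not code from Microsoft’s samples or any particular application; error handling, COM cleanup, and the actual render loop are omitted, and the buffer the engine grants may be larger than what you ask for:

```cpp
// Minimal WASAPI setup sketch (shared mode). Error checks omitted.
#include <windows.h>
#include <mmdeviceapi.h>
#include <audioclient.h>

int main() {
    CoInitializeEx(nullptr, COINIT_MULTITHREADED);

    // Find the default playback device through the MMDevice API.
    IMMDeviceEnumerator *enumerator = nullptr;
    CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL,
                     __uuidof(IMMDeviceEnumerator), (void **)&enumerator);

    IMMDevice *device = nullptr;
    enumerator->GetDefaultAudioEndpoint(eRender, eConsole, &device);

    // An IAudioClient manages the stream between the app and the endpoint.
    IAudioClient *client = nullptr;
    device->Activate(__uuidof(IAudioClient), CLSCTX_ALL, nullptr,
                     (void **)&client);

    // Ask the engine for its mix format and its minimum scheduling period.
    WAVEFORMATEX *format = nullptr;
    client->GetMixFormat(&format);

    REFERENCE_TIME defaultPeriod = 0, minimumPeriod = 0;  // 100 ns units
    client->GetDevicePeriod(&defaultPeriod, &minimumPeriod);

    // Request the smallest buffer the engine supports. In shared mode the
    // periodicity argument must be 0; the engine may round the buffer up.
    client->Initialize(AUDCLNT_SHAREMODE_SHARED, 0,
                       minimumPeriod, 0, format, nullptr);

    client->Start();
    // ... feed audio via IAudioRenderClient::GetBuffer/ReleaseBuffer ...
    client->Stop();
    return 0;
}
```

Even in shared mode, the point of WASAPI is that the application negotiates the buffer size directly with the audio engine instead of inheriting whatever the legacy layers stack on top.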

At this point it is not clear whether the Reaper team will implement WASAPI. If they do, it will really solve some of the recording latency issues. It could also help with using other DSPs for audio processing: if the sound doesn’t have to go thru the Windows mixer and thru Microsoft’s DSPs, then it can be processed with as many DSPs as the user wants, without the mixer adding its own latency on top.
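
That bypass is exactly what WASAPI exclusive mode offers. As a sketch of how it differs from the shared-mode code above (this reuses the client and minimumPeriod variables from that sketch, and the 16-bit/44.1 kHz format is only an example; the device has to actually accept whatever format you propose):

```cpp
// Exclusive mode: the stream goes to the device untouched by the mixer.
// Unlike shared mode, the app must supply a format the hardware accepts.
WAVEFORMATEX fmt = {};
fmt.wFormatTag      = WAVE_FORMAT_PCM;
fmt.nChannels       = 2;
fmt.nSamplesPerSec  = 44100;
fmt.wBitsPerSample  = 16;
fmt.nBlockAlign     = fmt.nChannels * fmt.wBitsPerSample / 8;
fmt.nAvgBytesPerSec = fmt.nSamplesPerSec * fmt.nBlockAlign;

// The third argument must be null in exclusive mode; S_OK means the
// device takes this format directly.
if (client->IsFormatSupported(AUDCLNT_SHAREMODE_EXCLUSIVE,
                              &fmt, nullptr) == S_OK) {
    // Event-driven exclusive mode: buffer duration and period must match.
    // Real code must also handle buffer-alignment errors and call
    // SetEventHandle before Start().
    client->Initialize(AUDCLNT_SHAREMODE_EXCLUSIVE,
                       AUDCLNT_STREAMFLAGS_EVENTCALLBACK,
                       minimumPeriod, minimumPeriod, &fmt, nullptr);
}
```

With the mixer out of the signal path, any effect chain the application runs only costs its own processing time, which is the whole appeal for a multitracking setup like mine.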