A Little More About Windows Sound Architecture

In my previous post, I talked about the Windows Vista/7 sound architecture and how its lack of real-time processing causes many problems. During my research I came across a great deal of information and many articles on this specific topic. Many people seem to be unhappy with this design.

My main goal is to plug the guitar into the line in, monitor it thru the speakers, and be able to play and record. This requires realtime monitoring of the line-in signal, sometimes called bit-perfect data streaming. On Windows 7, monitoring has such high latency that it throws off my playing, which makes it impossible to play, monitor thru the speakers, and record. If it were just a matter of the recording having latency, the tracks could be adjusted during mixdown. But for better recording, and to avoid having to realign tracks, it makes sense to reduce latency as much as possible. By the way, I use Reaper for multitracking, which means Reaper plays all existing tracks while recording the new one.

There are also a lot of people using Foobar and similar applications to play their media realtime without latency. I’m not sure what the point is. If the entire playback has a latency of 200ms, you would never know: playback would simply start 200ms later but would be continuous after that. Moving away from Media Center to another application for this seems rather pointless, since one would never notice the difference. It would make sense if video were also being played and the audio was out of sync, but that wouldn’t happen, since a media player on the same computer adjusts for all of that.

Anyway, to each their own. Every user has a right to use whatever application they prefer.

Back to the actual architecture. Microsoft has moved all audio streaming out of the kernel and made it user-mode streaming. Microsoft claims that this provides a low-latency, secure, reliable, glitch-resilient sound framework: the Microsoft Core Audio API. For low latency, Microsoft provides the Windows Audio Session API (WASAPI), which lets an application control audio data flow from the application to audio endpoints. So if an application implements WASAPI, it can run with the minimum latency possible.

At this point it is not clear whether the Reaper team will implement WASAPI. If they do, it will really solve some recording latency issues. It could also help in using other DSPs for audio processing: if the sound doesn’t have to go thru the Windows Mixer and thru Microsoft’s DSPs, it could be processed with as many DSPs as the user wants without increasing latency.

Windows’ New Sound Architecture

So Microsoft Windows Vista/7 have a new architecture for playing sound. They have what they call a glitch-resilient sound layer. Not glitch-free. Essentially, a lot of sounds go thru the Windows Mixer, get run thru DSPs to fix them up, and the final mix is sent to the sound card. For ordinary users, this causes many glitches and playback problems. For home recorders, it causes another issue: monitor latency.

What is monitor latency? When you plug something into the line in of the sound card, you can monitor the same signal thru the computer speakers. This is not much of a problem for playing recorded material thru the line in. But if the signal is live, such as a guitar signal, there are problems. My guitar goes straight into the line in. Ok, first, I realize this is not the ideal setup, and one should either buy a recording interface or mic the amps. But realistically, at 1 in the morning when a musical idea is forming, I just want to plug the guitar in and record it. I can mic the amps later, or even record clean and then DSP it with a vamp. Anyway, line-in monitoring has so much latency that I can’t play. It throws me off rhythm.

I’m getting frustrated with it. In Windows XP it was simple: all the line-in signal was sent straight to the sound card. Apparently, in Windows XP the sound was processed in Kernel Streaming mode. There was also the ASIO route. But in Vista/7, audio processing was taken out of the kernel to eliminate driver-caused BSODs, and there is now a Windows Mixer processing all sounds.

Microsoft does have a way for an application to take control of the hardware, route the signal directly to it, and bypass the Windows Mixer. It’s called WASAPI – Windows Audio Session API. But an application has to support it. In other words, one cannot just plug in a guitar and start playing and monitoring it; some application must first be written to support the API, and then must be running, before the sound will bypass the Windows Mixer. This exclusive mode also disables sound from all other apps, so all the dings, bings, and other system sounds will not be played.
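For the curious, the shape of that API looks roughly like the sketch below. This is a Windows-only outline based on the documented Core Audio interfaces; error handling, format negotiation (a real app must call IsFormatSupported first), and cleanup are all omitted, and the 10 ms buffer duration is just an illustrative choice. It is a sketch of the idea, not a working player.

```cpp
// Windows-only sketch: opening the default output in WASAPI exclusive
// mode. Requires the Windows SDK; will not compile elsewhere.
#include <mmdeviceapi.h>
#include <audioclient.h>

int main() {
    CoInitializeEx(nullptr, COINIT_MULTITHREADED);

    // Find the default render endpoint (the device the mixer would use).
    IMMDeviceEnumerator *devices = nullptr;
    CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL,
                     __uuidof(IMMDeviceEnumerator), (void **)&devices);
    IMMDevice *device = nullptr;
    devices->GetDefaultAudioEndpoint(eRender, eConsole, &device);

    // Activate an audio client on that device.
    IAudioClient *client = nullptr;
    device->Activate(__uuidof(IAudioClient), CLSCTX_ALL, nullptr,
                     (void **)&client);

    WAVEFORMATEX *format = nullptr;
    client->GetMixFormat(&format);

    // AUDCLNT_SHAREMODE_EXCLUSIVE is what bypasses the Windows Mixer;
    // it is also why every other application's sound goes silent.
    const REFERENCE_TIME tenMs = 100000;  // REFERENCE_TIME is 100-ns units
    client->Initialize(AUDCLNT_SHAREMODE_EXCLUSIVE, 0,
                       tenMs, tenMs, format, nullptr);

    client->Start();
    // ... feed buffers via IAudioRenderClient ...
    return 0;
}
```

The one line that matters here is the AUDCLNT_SHAREMODE_EXCLUSIVE flag: everything else is COM plumbing to get an IAudioClient on the endpoint.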

Hopefully, I can use my Windows 7 machine for home recording. Otherwise, I’ll have to move back to Windows XP or buy a recording interface. Not sure which is the more appealing option.