Some Notes On Latency

Latency seems to be one of those things that everyone considers very important. Questions like the following pop up on the support lists regularly:

I got FreeBob running stable now using a FireWire expresscard (Texas 
Instruments chipset) on my DELL Inspiron laptop. Seems to work fine with 
a frames/period 64, periods/buffer 3. I don't seem to be able to go any 
lower than the 64, because jack stops giving audio output when I try. 
Also QJackCtl refuses to lower periods/buffer to 2. Some hints would be 
greatly appreciated because I would love to lower the latency under the 
4.35 msec that I get now.

This page tries to put things into perspective and explains why one should look past latency as a marketing figure. It also explains what our definition of latency is.

The perspective, part 1: Latency in the real world

The very first thing to note is that the speed of sound is approximately 344m/s, i.e. 34cm/ms. In other words, 1ms of latency corresponds to moving 34cm further away from your sound source (speakers). So the difference between 4ms and 7ms is about one meter. Please keep this in mind when you set your latency requirements. Apple considers everything below 10ms good enough, and people don't seem to mind.
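
The conversion between latency and equivalent distance can be sketched in a few lines of Python. This is only an illustration: the 344m/s figure is the approximation used above, and the function name is made up for this example.

```python
# Approximate speed of sound in air, as quoted in the text above.
SPEED_OF_SOUND_M_PER_S = 344.0


def latency_to_distance_m(latency_ms: float) -> float:
    """Distance (in meters) that sound travels in air during latency_ms."""
    return SPEED_OF_SOUND_M_PER_S * latency_ms / 1000.0


# Print the equivalent distance for a few typical latency values.
for ms in (1, 4, 7, 10):
    print(f"{ms} ms corresponds to about {latency_to_distance_m(ms):.2f} m")
```

The difference between the 4ms and 7ms rows is roughly one meter, which is the point made above.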

The perspective, part 2: The definition of latency

Define latency... When running jack with a buffer setting of 2x64 frames (2 periods of 64 frames each), most PCI soundcards have approx. 4ms of measured round-trip latency. However, with these settings qjackctl says the latency is 2.9ms (@44.1kHz). This is because qjackctl displays the audio buffer latency only, since it cannot know what extra latency the device introduces. Hence it calculates the latency as: 2x64/44100 = 2.9ms.
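
The buffer latency figure is plain arithmetic on the period count, period size and sample rate. The following Python sketch reproduces that calculation; the function name is made up for this example and is not part of jack or qjackctl.

```python
def buffer_latency_ms(periods: int, frames_per_period: int, sample_rate_hz: int) -> float:
    """Audio buffer latency in milliseconds for a periods x frames setting.

    This covers only the audio buffers, not any extra latency added by the device.
    """
    return periods * frames_per_period / sample_rate_hz * 1000.0


# 2x64 @ 44.1kHz, the setting discussed above.
print(f"2x64: {buffer_latency_ms(2, 64, 44100):.2f} ms")
# 3x64 @ 44.1kHz, the setting from the support question.
print(f"3x64: {buffer_latency_ms(3, 64, 44100):.2f} ms")
```

The 3x64 case gives the 4.35 msec mentioned in the support question at the top of this page.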

For a correct comparison, however, one should perform a round-trip measurement of the effective latency of each system being examined. After all, what matters is how long it takes for audio entering the device to exit that same device.

Such measurements can be done with programs like jdelay or qjacklam on Linux.

You can use LTU from CEntrance to do this on Windows.

It's not what you set, it's what you get.

What is the lowest buffer setting possible with FFADO

To cut to the chase:

Buffer settings lower than 3x64 are not going to work with FFADO 2.x.

3x64 will have to do for the moment, and to be honest, pushing the code lower than 3x64 is not a priority for me. This latency level is considered low enough, and time is better spent expanding functionality and device support. Experience teaches us that complexity increases exponentially when going to lower latencies.

To make 3x64 work reliably you need a well-configured system. The normal Linux distributions (e.g. Fedora, Ubuntu) cannot achieve this performance. You will have to use audio-specific add-on packages (e.g. PlanetCCRMA for Fedora, UbuntuStudio for Ubuntu), or specific audio distributions (e.g. 64studio) if you want to achieve this. And even with them it's not a trivial thing.

So why is the latency higher than on OS XYZ?

Short answer: it isn't.

The main difference between FFADO and vendor drivers is that we don't have to sell anything. This means that we can define our buffer numbers in the way that makes the most sense for us. The side effect is that it appears as if we have higher latency.

In a FireWire streaming driver there are multiple layers of buffering involved: there is the audio buffering (ASIO on Windows), and the 1394 ISO buffering. When you specify 3x64 buffering to FFADO, this means: "use a total of 3x64 frames for all buffering layers". The caveat with most Windows drivers is that they only count the ASIO buffers in their latency computation, and "forget" that they are also using e.g. 4ms of ISO buffering that should be added.
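
To illustrate how the two ways of counting lead to different headline numbers, here is a small Python sketch. The vendor driver in it is purely hypothetical: the 64-frame ASIO buffer is a made-up example value, and the 4ms of ISO buffering is the "e.g." figure from the paragraph above, not a measurement of any real driver.

```python
SAMPLE_RATE_HZ = 44100


def frames_to_ms(frames: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Convert a number of audio frames to milliseconds."""
    return frames / sample_rate_hz * 1000.0


# Hypothetical vendor driver that advertises only its audio (ASIO) buffer.
asio_frames = 64        # made-up example value
iso_buffering_ms = 4.0  # the "e.g. 4ms" of ISO buffering mentioned above

advertised_ms = frames_to_ms(asio_frames)
host_side_ms = advertised_ms + iso_buffering_ms

# FFADO: the 3x64 figure already covers all host-side buffering layers.
ffado_ms = frames_to_ms(3 * 64)

print(f"hypothetical vendor driver, advertised: {advertised_ms:.2f} ms")
print(f"hypothetical vendor driver, incl. ISO:  {host_side_ms:.2f} ms")
print(f"FFADO at 3x64, all host-side buffering: {ffado_ms:.2f} ms")
```

Under these assumed numbers the advertised vendor figure looks lower than the FFADO figure, even though the total host-side buffering is not.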

Please note that:

  1. FFADO already achieves lower latency figures than Windows can meet (due to bad design) and than MacOS meets (by design).
  2. Apple considers everything below 10ms 'good enough', and the audio pros seem to be very happy with Apple.

The difference between FreeBoB and FFADO

Some notes: First of all, the 'apparent' latency will increase when going to FFADO. I.e. where you previously were able to run jack + FreeBoB at 3x64, jack + FFADO will need more (e.g. 4x64). However this doesn't mean that the actual latency is higher.

Since the definition of buffer size numbers is an implementation issue, they don't directly compare. This is not only true for the host side of things (e.g. different meanings for different jack backends), but also at the audio card side. The same settings for the same backend can yield different results with different devices. The definition of what exactly the buffer size numbers mean has been changed between FreeBoB and FFADO.

The FreeBoB round-trip latency (jack -> device -> D/A -> cable -> A/D -> device -> jack) has 4 main components:

  1. the jackd buffers, i.e. the '3x64' in this case
  2. the FireWire ISO packet buffers (undefined, usually around 0 to 24 frames)
  3. the device-side packet/frame buffers (somewhere between 8 and 100 frames, depending on the device implementation; constant for a given implementation, i.e. it does not change between startups)
  4. the A/D - D/A latency (relatively small and constant, approx. 1ms)

The following page gives the 2+3+4 latency for the ESI QuataFire, which is approx. 252 frames (@44100): http://freebob.sourceforge.net/index.php/ESI_Quatafire_Latency

So if you ran jack at 3x64 with the QuataFire, the round-trip latency would be approx.:

252 + 3x64 = 444 frames = 10ms
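
The same estimate can be written as a small Python sketch, which makes it easy to try other buffer settings. The function name is made up for this example; the 252-frame figure is the combined 2+3+4 latency quoted above for the QuataFire and will differ for other devices.

```python
def freebob_round_trip(periods: int, frames_per_period: int,
                       other_frames: int, sample_rate_hz: int):
    """Estimate the FreeBoB round-trip latency.

    other_frames is the combined latency of components 2+3+4 in frames.
    Returns a tuple (total_frames, total_milliseconds).
    """
    total_frames = periods * frames_per_period + other_frames
    total_ms = total_frames / sample_rate_hz * 1000.0
    return total_frames, total_ms


# jack at 3x64 with the 252 frames quoted for the QuataFire, @44100.
frames, ms = freebob_round_trip(3, 64, 252, 44100)
print(f"{frames} frames = {ms:.1f} ms")
```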

One note that has to be made is that the ISO latency (2) in FreeBoB is not constant, nor well-defined. This results in a non-constant round-trip latency when re-starting jack, or when an xrun occurred. This is a major flaw in the FreeBoB implementation.

FFADO fixes this, and defines the latency as follows: the number of frames you specify at the command line (e.g. 3x64) is the latency of 1+2 in frames. Hence the roundtrip latency is well-defined and constant between sessions/xruns. It also means that the FFADO buffer sizes have to be higher compared to FreeBoB since FreeBoB used an extra buffer.

Another side-remark on the FFADO latency is that 'in theory' the mechanism used by FFADO will include 3+4 too. That is, if the device is 100% spec compliant (AMDTP), FFADO will generate the timing info for it such that the roundtrip latency of the entire loop is equal to the latency specified at the command line (e.g. 3x64).

But as far as we know there are no 100% spec compliant devices. BeBoB and ECHO devices will add a certain constant number of frames for buffering that cannot be accounted for using the AMDTP spec'd timestamping mechanism. DICE based devices don't add this extra buffering. Neither DICE nor BeBoB nor ECHO devices seem to include (4) in their timestamping mechanism. As a result, the only device that really conforms to the '3x64 = total roundtrip latency' is a DICE based device that is looped back with ADAT (i.e. no A/D D/A).