Ticket #380 (closed bug: fixed)

Opened 6 years ago

Last modified 6 years ago

Segmentation fault with JACK in version 2.2.1

Reported by: dave.ze.rave Assigned to: jwoithe
Priority: minor Milestone: FFADO 2.x
Component: devices/rme Version: FFADO SVN (trunk)
Keywords: fireface800 Cc:
The device the bug applies to:

Description

I got some segmentation fault when trying to start JACK with libffado 2.2.1 on my RME Fireface 800. On version 2.1.0 everything runs smoothly so far, but of course the ffado-mixer is by far not as advanced! ;) So, as soon as I revert to the older version, everything runs smoothly again!

I also remember it to be running at least once with the same device type a couple of weeks back. I really wonder what could go wrong...

This is what happens: http://pastebin.com/PyziFcyz

The strange thing is: ffado-mixer (of version 2.2.1) can set stuff on the device when it is connected, save those settings, etc. and of course /dev/fw1 gets created.

So I really don't understand why the "ERROR: firewire ERR: FFADO: Error creating virtual device" is happening here :/

Attachments

ffado_segfault.txt (9.1 kB) - added by dave.ze.rave on 07/20/14 05:25:28.
ffado segfault
libffado-2.2.1-jack-segmentation-fault.txt (5.7 kB) - added by dave.ze.rave on 07/20/14 14:18:47.
ffado.log (22.1 kB) - added by dave.ze.rave on 07/21/14 04:51:32.
jackd -dfirewire -p1024 -n3 -v6 >& /tmp/ffado.log

Change History

07/20/14 05:25:28 changed by dave.ze.rave

  • attachment ffado_segfault.txt added.

ffado segfault

07/20/14 14:18:47 changed by dave.ze.rave

  • attachment libffado-2.2.1-jack-segmentation-fault.txt added.

07/20/14 16:38:17 changed by jwoithe

  • milestone changed from FFADO 2.2 to FFADO 2.x.

Unfortunately the output produced by your initial tests give no clue as to what is going on. There are a few things to run through to try to narrow the potential causes.

First up, could you tell us where your FFADO versions came from? Was 2.1 supplied by the distribution with 2.2.1 compiled by yourself? If you compiled 2.2.1 yourself, what scons line did you use (in particular, did you include "PREFIX=/usr")? If not, it's possible that the old and new FFADO library are both on the system, and this can create strange problems not unlike the one that you're seeing.

If you compiled FFADO yourself we will obtain additional information if you run it in "verbose" mode. While it seems you're using qjackctl, it will be easier for an initial test if you could simply run jackd at the command line. Please use the following command:

  jackd -dfirewire -p1024 -n3 -v6 >& /tmp/ffado.log

where "ffado.log" is an arbitary name for a log file (feel free to use whatever suits you). Once jackd exits, /tmp/ffado.log should contain a lot of debugging information which might point to the cause of the problem. Please make this file available to us so we can take a look.

Regarding the working ffado-mixer from FFADO 2.2.1, this operates independently of jackd. Jackd is concerned only with streaming audio to and from the device. It is entirely possible that one of these components can strike trouble while the other is perfectly fine.

07/21/14 04:50:47 changed by dave.ze.rave

Hey jwoithe,

sorry for the sloppy report. Must be the heat.

I'm on Arch Linux (Linux dvzrv 3.14.12-rt9-1-rt #1 SMP PREEMPT RT Wed Jul 16 12:30:26 CEST 2014 x86_64 GNU/Linux) rt patched (but this also happens on vanilla kernel).

Both libffado-2.1 and 2.2.1 (https://www.archlinux.org/packages/extra/x86_64/libffado/) are supplied by official Arch Linux repositories, as they are official version steps in ffado development (I presume?).

Here you can find how they build it: https://projects.archlinux.org/svntogit/packages.git/tree/trunk?h=packages/libffado (Check the PKGBUILD)

Concerning scons: Yes, "PREFIX=/usr" got included.

As pacman (Arch Linux package manager) handles up and downgrades pretty well, nothing is left of the old package in /usr/lib (so mixing up different libffado.so versions is out of the question). From checking a diff of the included files of the two packages, I can see, that only libffado.so.2.1.0 has changed to libffado.so.2.2.1, otherwise a lot of mixer stuff has been added (and that shouldn't be of interest here).

I'll attach the jackd debug log as suggested.

btw: Just mentioned ffado-mixer, because great improvements have been made in version 2.2.1 and I'd like to use it (and of course it luckily is independent of jackd!). ;)

07/21/14 04:51:32 changed by dave.ze.rave

  • attachment ffado.log added.

jackd -dfirewire -p1024 -n3 -v6 >& /tmp/ffado.log

07/21/14 06:11:06 changed by jwoithe

  • priority changed from blocker to major.

Thanks for posting the additional information and the debug log. The log in particular appears to show the cause of the trouble. First up we have this:

get_hardware_state: State reported by hardware:
:
get_hardware_state:   clock_mode: autosync/slave

This suggests that the device is set up with the sync source set to "autosync/slave". In other words, the FF800 is configured to look for an external clock. A little while later we see:

setSamplingFrequency: slave clock mode active but no valid external clock present
initStreaming: Could not set sampling frequency to 48000
ffado_streaming_init: Could not init the streaming system

While the FF800 is set to search for an external clock, it appears to be reporting that there is no valid external clock available (this is consistent with the rest of the get_hardware_state output, which also indicates the lack of any external clock signals). As a result the FF800 cannot verify the requested sample rate (48k) and therefore errors out. I will freely admit that the errors reported via jackd without FFADO's "-v6" option active could be slightly more informative, but fixing this is a relatively involved task for various reasons. That aside, the lack of a valid external clock signal is the apparent cause of your startup error.

The first thing to check would be to use ffado-mixer to see what the clock source is set to. Note that for the RME devices this is configured in the body of the "Control" tab rather than the generic dropdown control because the RME interfaces deal with clock sources in a different way to most other devices. Change the clock source to "master", optionally exit ffado-mixer, and try starting jackd again. See what happens and let me know the results.

As to why earlier FFADO versions work, I think this is due to the fact that FFADO 2.2.x now attempts to obtain as much of its initial operating configuration from the Fireface itself (or more accurately, from the configuration stored in the Fireface's flash). In contrast, FFADO 2.1.x forced its own preconfigured setup onto the device at startup which included the use of the internal clock. Therefore, if your Fireface's configuration stored in flash happens to be set up to select autosync/slave mode as the clock source, it is entirely possible that FFADO 2.1.x will behave differently to 2.2.x.

Finally, the segmentation fault you saw in your initial tests appears to originate in jackd code. As far as I can tell FFADO is returning an error code back to jackd; it seems that jackd is not quite handling this right and the jackd code segfaults. Interestingly enough, as far as I can tell your manual running of jackd did not segfault so the problem might be due to an interaction between jack2 and qjackctl when a backend's startup returns an error.

07/21/14 06:27:07 changed by dave.ze.rave

  • status changed from new to closed.
  • resolution set to fixed.

Alrite. This seems to have resolved the issue! It was a bit strange, because although the clock source was set to "Master" in ffado-mixer, it seems it was actually set to "Autosync". So I rechecked "Autosync" and afterwards "Master" again, saved the controls et voilá, JACK starts! As soon as I use "Autosync" again, JACK fails (because there is no external clock). I should've paid more attention to the red "HOST" LED on the device... -_-

So all in all: Make sure to set your clock source to "Master" and make sure it really IS set to "Master" before starting JACK!

Thanks for the help and clarification, jwoithe! I think the way ffado-mixer works now, is much more coherent with the way the RME software works (which is a good thing), but you have to keep an eye on these small culprits, if something changes in the way it behaves.

07/21/14 13:24:13 changed by stefanr

My 2 cents: FFADO should log failure messages as regular log messages, not as debug messages. In other words, *if* it is possible to determine in FFADO's RME driver that the "slave clock mode active but no valid external clock present" is not recoverable, this message should always be logged regardless of verbosity setting.

07/21/14 16:05:13 changed by jwoithe

  • priority changed from major to blocker.

dave.ze.rave: thanks for reporting your success with this. I am somewhat puzzled that the clock source was initially reported by ffado-mixer to be "master" when in fact it was not. I will look into that as a separate issue. However, it does seem that the setting to autosync without an external clock was the root cause of your troubles.

Regarding the "HOST" LED, this should extinguish once a any component of FFADO (be it the streaming system via jackd, or the ffado-mixer interface via the automatically started ffado-dbus-server) start and communicate with the FF800 successfully. It is interesting that the LED remained lit at least for some parts of your testing. The important thing is that things seem to be working now.

The Fireface mixer is now much closer to feature-parity with the equivalents provided on other systems. This was the most significant RME driver change between FFADO 2.1 and 2.2.

Stefan: I completely agree that the way error messages are handled is very sub-optimal. I recall that I looked into this issue some time ago (more than a year ago) and discovered that there were some complications which meant that any change would not be trivial. I can't remember the details off-hand though but I think it was associated with the way these were managed internally. In any case, it is my intention to revisit this during the next development cycle.

07/22/14 16:09:11 changed by jwoithe

  • priority changed from blocker to minor.