Ticket #378 (new bug)

Opened 7 years ago

Last modified 7 years ago

Focusrite Saffire PRO 14 jackd dies with Assertion `m_handle == __null' failed.

Reported by: teemu kurki Assigned to:
Priority: major Milestone: FFADO 2.1
Component: devices/dice Version: FFADO SVN (trunk)
Keywords: Cc:
The device the bug applies to: Focusrite Saffire PRO 14

Description

OS: newest AVLinux, ffado version 2.1.0+SVN2462 Output of ffado-diag attached.

ffado-mixer shows nicely all the matrixes and operated as the red inst-leds shine when corresponding button in ffado-mixer is pressed. Also the third led (FW) shines green after starting the jack with command: /usr/bin/jackd -v -t2000 -dfirewire -v 7 -dhw:0 -r48000 -p512 -n3 > jackd_ou.txt 2>&1

The resulted file is attached, too.

And yes, the jack dies eventually. aborted

Attachments

ffado-diag.out (6.4 kB) - added by teemu kurki on 02/20/14 03:36:31.
ffado-diag output
jackdout.zip (191.5 kB) - added by teemu kurki on 02/20/14 03:42:02.
zipped 6.1MB of jackd output

Change History

02/20/14 03:36:31 changed by teemu kurki

  • attachment ffado-diag.out added.

ffado-diag output

02/20/14 03:42:02 changed by teemu kurki

  • attachment jackdout.zip added.

zipped 6.1MB of jackd output

02/20/14 04:20:14 changed by teemu kurki

  • owner changed.
  • component changed from generic to devices/dice.

02/20/14 14:54:53 changed by jwoithe

I note that you are running at debug level 7 ("-v 7"). This debug level produces a lot of output: often one line per packet sent and received. This sustained level of output can cause performance problems. Do you have problems if you run with no "-v" options at all? Assuming yes, could you please try running jackd with the following simplified command line:

/usr/bin/jackd -dfirewire -v6 -r48000 -p512 -n3 > jackd_out.txt 2>&1 

This should remove any uncertainty arising frmo timing issues caused by the very verbose logs obtained when "-v 7" is used. We also don't need the verbose messages from the upper jackd levels, so the "-v" before "-dfirewire" can be omitted. This will give us a better baseline for evaluation.

From your ffado-diag:

03:00.0 FireWire (IEEE 1394) [0c00]: JMicron Technology Corp. IEEE 1394 Host Controller [197b:2380] (prog-if 10 [OHCI])

JMicron firewire host controllers have caused significant problems in the past when used for audio work. It is suspected that there is a bug on the silicon which the writers of firewire OHCI drivers under other OSes have been made aware of and are working around. If this is true then the information hasn't made it to the Linux kernel driver authors. In any case, we have had several people on the ffado-devel and ffado-user mailing lists over the last 6 months or so who had issues with JMicron firewire cards which were all completely resolved when they used a more reliable card (usually something using a chipset from TI or LSI). A little more information about JMicron cards can be found on our Host Controller Notes page.

02/21/14 08:48:28 changed by teemu kurki

Thanks for the proposal.

I have tried first without the verbose, options with the same result.

Now I found a way how to work around the problem reported by me. I booted to windows and used the Saffire Mixcontrol application (provided with the device) and selected "Reset to factory defaults" and then "Save to hardware".

After that the problem disappeared also at Linux side. :)

I apparently have changed some setups with saffire mixcontrol before my trials...

02/23/14 15:44:25 changed by jwoithe

Thanks for your reply. Running without the "-v6" option after "-dfirewire" conclusively demonstrates that the issues observered are not a result of the additional loading placed on the system by the extra logging. Thank you for testing this.

Your workaround is interesting. Although it's clearly difficult to do, it would be good to know what part of the devices setup was causing FFADO these difficulties. To identify the setting which causes the problem we would need a Pro14 owner to test FFADO repeatedly, having changed only one thing from the default setup at a time. This would be very tedious, and it's not clear to me whether it is worth doing.

I am not sure what to mark this ticket as. If it's felt this is something which ought to be pursued we could mark it as "waiting for volunteer". Otherwise we could go for "wontfix", since without knowing what setting caused the problem it's difficult to do anything about it. However, this is complicated by further observations descrived below.

Referring back to the original log I note that the dbs field of each packet was out by a factor of 4 compared to what theory says it should have been. At first glance I don't expect this to have anything to do with the problem being reported as it seems the device firmware is simply failing to report dbs in the way the standard expects. FFADO should still to the right thing here I think, and in any case this is such a low-level issue that I would expect it to occur no matter what the device settings were.

If you wanted to confirm this you could retry your original test (with "-v7") and simply note whether the

(AmdtpReceiveStreamProcessor.cpp)[ 202] processPacketData: packet->dbs 9 calculated dbs 36 packet->fdf 02 nevents 8

message appeared in the resulting log.

Working back from the end of the original log, the assertion occurred as a result of an attempt to recover from a timeout while starting the transmit (playback) stream:

(StreamProcessorManager.cpp)[ 547] startDryRunning:  Timeout waiting for the SP's to start dry-running
(StreamProcessorManager.cpp)[ 552] startDryRunning:  Receive SP 0x874ccb8 has state ePS_DryRunning
(StreamProcessorManager.cpp)[ 558] startDryRunning:  Transmit SP 0x874cfb0 has state ePS_Stopped

The transmit stream failed to start because it was disabled:

(IsoHandlerManager.cpp)[1818] enable: Could not do xmit initialisation!
(IsoHandlerManager.cpp)[ 188] updateShadowMapHelper: (0x87247b0) Transmit handler 0x874cb30 skipped (disabled)

The transmit initialisation failed because the Pro14 appears to have rejected the iso channel that FFADO wanted to use (channel 2)"

(ieee1394service.cpp)[1326] allocateIsoChannelGeneric: found free iso channel 2
(dice_avdevice.cpp)[1456] allocateIsoChannel: allocated channel 2, bandwidth 304
(dice_avdevice.cpp)[1358] startstopStreamByIndex: (0x8725180) Allocated channel 2 for TX
:
(dice_avdevice.cpp)[1370] startstopStreamByIndex: ISO_CHANNEL register != 0xFFFFFFFF (=0x00000000) for ATX 0
(dice_avdevice.cpp)[1462] deallocateIsoChannel: freeing channel 2

In other words, the ISO_CHANNEL register on the Pro14 was not the expected value after attempting to set the iso channel (the code suggests it should be 0xffffffff). This causes that part of the setup to fail, and thus transmit initialisation fails.

There are a few other interesting observations too. Firstly, the setup of the receive iso channel fails in the same way but FFADO does not appear to notice. This is possibly a bug. Second, since transmit initialisation failed it's arguable that FFADO should have simply failed at that point and not tried to get streaming going at all. The continued attempts to start might be due to FFADO not registering that receive setup also failed - as far as it was concerned at least one stream was good to go, so it tried for that. The root cause of the fault might therefore be due to a failure to properly propagate the receive initialisation error back to the higher levels of the FFADO system. I will look into this.

Another item worthy of note was the fact that FFADO was attempting to use iso channel 2 for transmit, rather than channel 0 which would be the usual choice. The reason 2 was chosen is that channels 0 and 1 were already marked by the bus as being in use. The most likely reason for this is that a previous FFADO session had crashed out - when this happens channels are not freed, and a bus reset is required to ensure they are freed. What I can't tell from the log is what caused the earlier crash: was it a similar failure to set iso channels on the Pro14 (0 for tx in this case), or was it some other as yet unidentified issue? The former seems to be the most likely, in which case we have to conclude that some setting of the Pro14's configuration (one which was cleared by the factory reset) gave rise to this iso initialisation rejection.