Ticket #306 (closed bug: fixed)

Opened 13 years ago

Last modified 12 years ago

jackd segfaults while terminating when using the Juju stack

Reported by: koniu Assigned to:
Priority: major Milestone:
Component: generic Version: FFADO 2.0.1
Keywords: Cc:
The device the bug applies to:

Description

From: http://trac.jackaudio.org/ticket/196 (more details)

1. run jackdbus auto
2. jack_control ds firewire
3. jack_control start
4. jack_control stop
5. #@Q$*&*&%! (crash)

Debian testing, jackd2 1.9.5~dfsg-19, libffado2 2.0.1+svn1856-1

Jack output with ffado v=3

koniu@krak ~ % jackdbus auto                        
no message buffer overruns
no message buffer overruns
04468875342:  (ffado.cpp)[  92] ffado_streaming_init: libffado 2.999.0- built Sep 29 2010 15:02:40
04468900222: Warning (ieee1394service.cpp)[ 375] initialize: Could not set SPLIT_TIMEOUT to min requested (1000000)
04468900350: Warning (ieee1394service.cpp)[ 379] initialize: Set SPLIT_TIMEOUT to min requested (1000000) did not succeed
04468902092: Warning (cycletimer.h)[ 120] wrapAtMinTicks: insufficient wrapping: -6888979786615724700
04473150667: Warning (StreamProcessor.cpp)[ 707] getPacket: Instantanous samplerate more than 1% off nominal. [Nom fs: 44100.000000, Instantanous fs: 22033.845119, diff: 22066.154881 (    0.500366)]
04476112106: Warning (StreamProcessor.cpp)[ 389] putPacket: Instantanous samplerate more than 1% off nominal. [Nom fs: 44100.000000, Instantanous fs:    31.635583, diff: 44068.364417 (    0.999283)]
04476112187: Warning (TimestampedBuffer.cpp)[1053] incrementFrameCounter: (0x98a9e28) difference rather large (+): diff=   6210296.575, max=      1536.000, 3128331731.000, 3122121434.425
04476112218: Warning (TimestampedBuffer.cpp)[ 249] calculateRate: (0x98a9e28) rate ( 682.41101) more that 10% off nominal (rate= 557.27893, diff=      5459.288, update_period=8)
04476120040: Warning (ieee1394service.cpp)[1487] freeIsoChannel: Could not do CMP disconnect for channel 0!
04476123604: Warning (ieee1394service.cpp)[1487] freeIsoChannel: Could not do CMP disconnect for channel 1!
04476124438: Error (IsoHandlerManager.cpp)[1289] ~IsoHandler: BUG: Handler still running!
*** glibc detected *** jackdbus: free(): invalid pointer: 0x098d3b70 ***
*** glibc detected *** jackdbus: free(): invalid pointer: 0x098a9230 ***
terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check
no message buffer overruns
koniu@krak ~ % 


ffado-diag

koniu@krak ~ % ffado-diag 


FFADO diagnostic utility 2.999.0-
============================
(C) 2008 Pieter Palmers


=== CHECK ===
 Base system...
  kernel version............ 2.6.33.7-rt29
FIXME: implement test for RT kernel
   RT patched............... False
  old 1394 stack present.... False
  old 1394 stack loaded..... False
  old 1394 stack active..... False
  new 1394 stack present.... True
  new 1394 stack loaded..... True
  new 1394 stack active..... True
  /dev/raw1394 node present. False
 Prerequisites (dynamic at run-time)...
   gcc ............... gcc (Debian 4.4.4-8) 4.4.5 20100728 (prerelease)
   g++ ............... g++ (Debian 4.4.4-8) 4.4.5 20100728 (prerelease)
   PyQt4 (by pyuic4) . sh: pyuic4: not found
   jackd ............. no message buffer overruns
     path ............ /usr/bin/jackd
     flags ...........  -ljack  
   libraw1394 ........ 2.0.5
     flags ...........  -lraw1394  
   libavc1394 ........ Package libavc1394 was not found in the pkg-config search path.
Perhaps you should add the directory containing `libavc1394.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libavc1394' found
     flags ........... Package libavc1394 was not found in the pkg-config search path.
Perhaps you should add the directory containing `libavc1394.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libavc1394' found
   libiec61883 ....... Package libiec61883 was not found in the pkg-config search path.
Perhaps you should add the directory containing `libiec61883.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libiec61883' found
     flags ........... Package libiec61883 was not found in the pkg-config search path.
Perhaps you should add the directory containing `libiec61883.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libiec61883' found
   libxml++-2.6 ...... Package libxml++-2.6 was not found in the pkg-config search path.
Perhaps you should add the directory containing `libxml++-2.6.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libxml++-2.6' found
     flags ........... Package libxml++-2.6 was not found in the pkg-config search path.
Perhaps you should add the directory containing `libxml++-2.6.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libxml++-2.6' found
   dbus-1 ............ 1.2.24
     flags ........... -I/usr/include/dbus-1.0 -I/usr/lib/dbus-1.0/include  -ldbus-1 -lpthread -lrt  
 Prerequisites (static at compile-time)...
   gcc ............... gcc (Debian 4.4.4-17) 4.4.5
   g++ ............... g++ (Debian 4.4.4-17) 4.4.5
   PyQt4 (by pyuic4) . Python User Interface Compiler 4.7.3 for Qt version 4.6.3
   jackd ............. sh: jackd: not found
     path ............ 
     flags ........... Package jack was not found in the pkg-config search path.
   libraw1394 ........ 2.0.5
     flags ...........  -lraw1394  
   libavc1394 ........ Package libavc1394 was not found in the pkg-config search path.
     flags ........... Package libavc1394 was not found in the pkg-config search path.
   libiec61883 ....... 1.2.0
     flags ...........  -liec61883 -lraw1394  
   libxml++-2.6 ...... 2.30.0
     flags ........... -pthread -I/usr/include/libxml++-2.6 -I/usr/lib/libxml++-2.6/include -I/usr/include/libxml2 -I/usr/include/glibmm-2.4 -I/usr/lib/glibmm-2.4/include -I/usr/include/sigc++-2.0 -I/usr/lib/sigc++-2.0/include -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include  -pthread -lxml++-2.6 -lxml2 -lglibmm-2.4 -lgobject-2.0 -lsigc-2.0 -lgthread-2.0 -lrt -lglib-2.0  
   dbus-1 ............ 1.2.24
     flags ........... -I/usr/include/dbus-1.0 -I/usr/lib/dbus-1.0/include  -ldbus-1 -lpthread -lrt  
 Hardware...
   Host controllers:
15:00.1 FireWire (IEEE 1394) [0c00]: Ricoh Co Ltd R5C552 IEEE 1394 Controller [1180:0552] (rev 09) (prog-if 10 [OHCI])
	Subsystem: Lenovo ThinkPad X60s [17aa:201e]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 32 (500ns min, 1000ns max)
	Interrupt: pin B routed to IRQ 17
	Region 0: Memory at e4301000 (32-bit, non-prefetchable) [size=2K]
	Capabilities: <access denied>
	Kernel driver in use: firewire_ohci

16:00.0 FireWire (IEEE 1394) [0c00]: VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] IEEE 1394 OHCI Controller [1106:3044] (rev 46) (prog-if 10 [OHCI])
	Subsystem: VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] IEEE 1394 OHCI Controller [1106:3044]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64 (8000ns max), Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at 84000000 (32-bit, non-prefetchable) [size=2K]
	Region 1: I/O ports at 9000 [size=128]
	Region 2: Memory at 84000800 (32-bit, non-prefetchable) [size=256]
	Capabilities: <access denied>
	Kernel driver in use: firewire_ohci

   CPU info:
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 14
model name	: Genuine Intel(R) CPU           L2400  @ 1.66GHz
stepping	: 8
cpu MHz		: 1000.000
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc arch_perfmon bts aperfmperf pni monitor vmx est tm2 xtpr pdcm
bogomips	: 3324.76
clflush size	: 64
cache_alignment	: 64
address sizes	: 32 bits physical, 32 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 14
model name	: Genuine Intel(R) CPU           L2400  @ 1.66GHz
stepping	: 8
cpu MHz		: 1667.000
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc arch_perfmon bts aperfmperf pni monitor vmx est tm2 xtpr pdcm
bogomips	: 3324.87
clflush size	: 64
cache_alignment	: 64
address sizes	: 32 bits physical, 32 bits virtual
power management:

 Configuration...
  IRQ information
Hardware Interrupts:
--------------------
 IRQ    0: PID:  None, count: [2206141, 2206141], Sched None (priority None), drivers: ['timer']
 IRQ    1: PID:  None, count:     [22361, 22361], Sched None (priority None), drivers: ['i8042']
 IRQ    8: PID:  None, count:             [1, 1], Sched None (priority None), drivers: ['rtc0']
 IRQ    9: PID:  None, count:     [51595, 51595], Sched None (priority None), drivers: ['acpi']
 IRQ   12: PID:  None, count:   [189091, 189091], Sched None (priority None), drivers: ['i8042']
 IRQ   14: PID:  None, count:             [0, 0], Sched None (priority None), drivers: ['ide0']
 IRQ   16: PID:  None, count:   [199137, 199137], Sched None (priority None), drivers: ['uhci_hcd:usb2', 'yenta', 'i915', 'firewire_ohci']
 IRQ   17: PID:  None, count:           [71, 71], Sched None (priority None), drivers: ['firewire_ohci', 'uhci_hcd:usb3']
 IRQ   18: PID:  None, count:             [0, 0], Sched None (priority None), drivers: ['mmc0', 'uhci_hcd:usb4']
 IRQ   19: PID:  None, count:       [7073, 7073], Sched None (priority None), drivers: ['ehci_hcd:usb1', 'uhci_hcd:usb5']
 IRQ   28: PID:  None, count:       [2237, 2237], Sched None (priority None), drivers: ['eth0']
 IRQ   29: PID:  None, count:     [18755, 18755], Sched None (priority None), drivers: ['ahci']
 IRQ   30: PID:  None, count:   [152431, 152431], Sched None (priority None), drivers: ['hda_intel']
 IRQ   31: PID:  None, count:     [97179, 97179], Sched None (priority None), drivers: ['iwl3945']


Do you guys have some insight?

Cheers,

Attachments

adi-debug.patch (15.8 kB) - added by adi on 10/04/10 16:33:21.
patch that adds more debugging output. also has some locks in place, though not every lock is used.
ffado_log.txt (170.9 kB) - added by roos on 06/18/11 02:44:06.
entire session log from start til crash
qjackctl_messages (29.4 kB) - added by roos on 06/18/11 02:45:42.
qjackctl messages
ffado_log2 (26.3 kB) - added by roos on 06/18/11 03:14:09.
it shows -v 6 messages right before the crash.
ffado_qjackctl_jackd1 (35.5 kB) - added by roos on 06/19/11 04:23:31.
qjackctl messages ffado-svn jackd1
ffado_terminal_jackd1 (82.4 kB) - added by roos on 06/19/11 04:26:04.
-v 6 log when running jack -d firewire in terminal / jackd1
ffado-diag_jackd1_Debian (6.3 kB) - added by roos on 06/19/11 04:27:50.
ffado-diag / debian / jackd1
ffado_qjackctl_debian_jackd2 (43.5 kB) - added by roos on 06/19/11 04:40:56.
ffado / qjackctl messages / debian / now jackd2
ffado_terminal_jackd2_debian (87.3 kB) - added by roos on 06/19/11 04:42:22.
terminal / ffado-svn / debian / jackd
ffado-diag_jackd2_Debian (6.5 kB) - added by roos on 06/19/11 04:43:41.
ffado-diag / debian / jackd2
ffado-diag_ubuntu_jackd2 (6.4 kB) - added by roos on 06/19/11 04:52:32.
ffado-diag ubuntu jackd2
log (12.4 kB) - added by jof on 06/20/11 07:17:43.
log file for "jackd 1" and "ffado svn 1985"
log-ffado-diag (7.2 kB) - added by jof on 06/20/11 07:25:34.
log file ffado-diag "ffado svn 1985"

Change History

10/02/10 03:55:04 changed by adi

  • summary changed from jackdbus dies when stopping firewire backend to jackd segfaults while terminating when using the Juju stack.

Hi!

This is a known regression when running FFADO on the new Juju stack.

10/03/10 09:27:38 changed by stefanr

Indeed. First I thought it was a double-free bug in libraw1394's juju specific code, but I was wrong. This is more likely an issue in libffado. Superficial browsing through the source did not yet point out anything obvious to me.

10/04/10 16:32:24 changed by adi

I spent the last 15hrs or so hunting this bug. Before I'll fall asleep, let me post some of things I've found. Don't get confused by additional printfs, I took the liberty to add a few.

Basic trouble is that IsoHandlerManager::IsoHandler::disable() is called multiple times for the same fw_handle. This seems to cause a race condition, though I still don't know the culprit.

Let's add some logs:

106902828072: Debug (IsoHandlerManager.cpp)[1898] updateState: (0x9e04f28) handler needs state update from 1 => 0
106902828079: Debug (IsoHandlerManager.cpp)[1903] updateState: handler has to be disabled
adi: Now waiting for handle 9e04a30 on lock 9e04f64 in thread 9e04f28
106902828085: Debug (IsoHandlerManager.cpp)[1796] disable: (0x9e04f28, Receive) enter...
adi: got it, lock 9e04f64 is held by thread 9e04f28
106902828108: Debug (IsoHandlerManager.cpp)[1813] disable: (0x9e04f28, Receive) wake up handle...
106902828122: Debug (IsoHandlerManager.cpp)[1827] disable: (0x9e04f28, Receive) stop...
106902829130: Debug (IsoHandlerManager.cpp)[1839] disable: (0x9e04f28, Receive) shutdown...
adi@libraw: now destroying handle 9e04a30
106902829165: Debug (IsoHandlerManager.cpp)[1844] disable: (0x9e04f28, Receive) destroy...

Looks like we destroyed fw-handle 0x9e04a30 from thread 0x9e04f28. But wait, only a few lines later:

106902835835: Debug (IsoHandlerManager.cpp)[ 950] pruneHandlers:  handler (0x9e04f28) not in use
106902835850: Debug (IsoHandlerManager.cpp)[ 683] unregisterHandler: enter...
106902835865: Debug (IsoHandlerManager.cpp)[  90] requestShadowMapUpdate: (0x9e02ce8) enter
106902835878: Debug (IsoHandlerManager.cpp)[  95] requestShadowMapUpdate: (0x9e02ce8) exit
106902835892: Debug (IsoHandlerManager.cpp)[  90] requestShadowMapUpdate: (0x9e02e28) enter
106902835905: Debug (IsoHandlerManager.cpp)[  95] requestShadowMapUpdate: (0x9e02e28) exit
106902835918: Debug (IsoHandlerManager.cpp)[ 961] pruneHandlers:  deleting handler (0x9e04f28)
106902835931: Error (IsoHandlerManager.cpp)[1296] ~IsoHandler: BUG: Handler still running!
adi: handle 9e04a30 still running in 9e04f28
106902835951: Debug (IsoHandlerManager.cpp)[1796] disable: (0x9e04f28, Receive) enter...
adi: Now waiting for handle 9e04a30 on lock 9e04f64 in thread 9e04f28
adi: got it, lock 9e04f64 is held by thread 9e04f28
106902835972: Debug (IsoHandlerManager.cpp)[1813] disable: (0x9e04f28, Receive) wake up handle...
106902835990: Debug (IsoHandlerManager.cpp)[1827] disable: (0x9e04f28, Receive) stop...
106902836005: Debug (IsoHandlerManager.cpp)[1839] disable: (0x9e04f28, Receive) shutdown...
106902836019: Debug (IsoHandlerManager.cpp)[1844] disable: (0x9e04f28, Receive) destroy...
adi@libraw: now destroying handle 9e04a30

Once again, the very same fw-handle from the very same thread seems to get destroyed a second time.

Given that pruneHandlers()'s call to the IsoHandler? destructor comes first (though very close, that's the "BUG: Handler still running message") which results in an additional call to disable(), I wonder what happened to the "normal" call of disable() in updateState().

As mentioned above, I still haven't found the root cause, but perhaps somebody has an idea or compares the shutdown behaviour against the old stack. For the sake of simplicity, I attached my current modifications. They do contain a lot of noise, especially lines that accidentially got indented, but I'm too tired now to clean them up.

HTH

10/04/10 16:33:21 changed by adi

  • attachment adi-debug.patch added.

patch that adds more debugging output. also has some locks in place, though not every lock is used.

05/22/11 06:13:08 changed by jwoithe

  • status changed from new to closed.
  • resolution set to fixed.

I finally moved to the new stack this weekend and very quickly got tired of the double-free and decided to track it down. SVN r1985 should fix it.

I should have read this ticket before doing the work - it would have saved me a little time. :-)

On my system at least, the reason why the IsoHandler? destructor calls disable() is that it thinks the handler is still running (obviously). This means that the previous call to disable(), instigated by the IsoTask? thread, could not have completed at this point, because it's final task is to set the handle to NULL and the state to "stopped". Further testing showed that raw1394_destroy_handle() appeared to take upwards of 20 milliseconds (!) (around 19000 units of the ffado debug output timestamp which is in microseconds AFAIK) to return. This is what happened to the "normal" call of disable() in updateState() referred to in the previous comment. In this time the jackd thread continues its closedown and hits the IsoHandler? destructor before the IsoTask?'s disable() has completed. We therefore get a second call to raw1394_destroy_handle() and the double-free follows.

I'm not for one moment going to pretend that the solution in r1985 is elegant or even the "correct" approach. However, in practice it seems to work and should tie us over until the kernel streaming stuff matures and gains support for all devices. I also cannot be sure that there aren't further races in the closedown code which we might hit under different systems with different devices, but we can deal with those as they become apparent. I suspect that fixing the underlying ffado thread issues which give rise to the above control flow will take considerable effort which will be hard to justify given the future in-kernel plans.

Of course one could argue that in any case we ought to try to work out why raw1394_destroy_handle() takes so long under the new stack, but that's an exercise for another evening.

I will tentatively close this ticket as "fixed" since it seems that r1985 fixes it on my system. However, it may be that I'm just lucky, so if someone can reproduce it with r1985 or later on their own systems feel free to re-open it.

06/17/11 14:24:48 changed by roos

  • status changed from closed to reopened.
  • resolution deleted.

It looks like this problem isn't solved yet. Using 2.0.99+svn1985-2 on Debian and Ubuntu. Jack1 and Jack2.

22:12:55.577 JACK is stopping...
Jack: JackClient::Deactivate
Jack: fPollTable i = 1 fd = 46
Jack: fPollTable i = 2 fd = 47
Jack: fPollTable i = 3 fd = 49
Jack: JackRequest::DeactivateClient
Jack: JackEngine::ClientDeactivate ref = 3 name = qjackctl
Jack: JackServer::Deactivate client = 3 was not activated
Jack: JackServer::Deactivate client = 3 was not activated
Jack: JackProcessSync::TimedWait time out = 10000000
Jack: JackProcessSync::TimedWait finished delta = 1873.0
Jack: fPollTable i = 1 fd = 46
Jack: JackRequest::Notification
Jack: JackEngine::NotifyClient: no callback for event = 4
Jack: JackEngine::NotifyClient: no callback for event = 4
Jack: JackExternalClient::ClientNotify ref = 2 name = jack_cpu_load notify = 4
Jack: JackClient::Deactivate res = 0
Jack: jack_client_close
Jack: JackClient::Close ref = 3
Jack: JackClient::Deactivate
Jack: JackSocketClientChannel::Stop
Jack: JackPosixThread::Kill
Jack: JackExternalClient::ClientNotify ref = 3 name = qjackctl notify = 4
Jack: JackGraphManager::RecalculateLatency port_index = 6
Jack: JackGraphManager::RecalculateLatency port_index = 7
Jack: JackGraphManager::RecalculateLatency port_index = 8
Jack: JackGraphManager::RecalculateLatency port_index = 9
Jack: JackGraphManager::RecalculateLatency port_index = 10
Jack: JackGraphManager::RecalculateLatency port_index = 11
Jack: JackGraphManager::RecalculateLatency port_index = 12
Jack: JackGraphManager::RecalculateLatency port_index = 6
Jack: JackGraphManager::RecalculateLatency port_index = 7
Jack: JackGraphManager::RecalculateLatency port_index = 8
Jack: JackGraphManager::RecalculateLatency port_index = 9
Jack: JackGraphManager::RecalculateLatency port_index = 10
Jack: JackGraphManager::RecalculateLatency port_index = 11
Jack: JackGraphManager::RecalculateLatency port_index = 12
Jack: JackGraphManager::RecalculateLatency port_index = 1
Jack: JackGraphManager::RecalculateLatency port_index = 2
Jack: JackGraphManager::RecalculateLatency port_index = 3
Jack: JackGraphManager::RecalculateLatency port_index = 4
Jack: JackGraphManager::RecalculateLatency port_index = 5
Jack: JackGraphManager::RecalculateLatency port_index = 1
Jack: JackGraphManager::RecalculateLatency port_index = 2
Jack: JackGraphManager::RecalculateLatency port_index = 3
Jack: JackGraphManager::RecalculateLatency port_index = 4
Jack: JackGraphManager::RecalculateLatency port_index = 5
Jack: fPollTable i = 2 fd = 47
Jack: fPollTable i = 3 fd = 49
Jack: fPollTable i = 1 fd = 46
Jack: fPollTable i = 2 fd = 47
Jack: fPollTable i = 3 fd = 49
Jack: JackRequest::ClientClose
Jack: JackEngine::ClientCloseAux ref = 3
Jack: JackGraphManager::RemoveAllPorts ref = 3
Jack: JackProcessSync::TimedWait time out = 2000000
Jack: JackProcessSync::TimedWait finished delta = 450.0
Jack: JackExternalClient::ClientNotify ref = 3 name = qjackctl notify = 1
Jack: JackPosixSemaphore::Destroy
Jack: JackSocketNotifyChannel::Close
Jack: JackClientSocket::Close
Jack: JackShmMem::delete size = 0 index = 3
Jack: JackSocketServerChannel::ClientRemove ref = 3
Jack: JackClientSocket::Close
Jack: JackClientSocket::Close
Jack: JackSocketServerChannel::BuildPoolTable size = 3
Jack: fSocketTable i = 1 fd = 46
Jack: fSocketTable i = 2 fd = 47
Jack: JackClientSocket::Close
Jack: fPollTable i = 1 fd = 46
Jack: JackRequest::Notification
Jack: JackEngine::NotifyClient: no callback for event = 4
Jack: JackEngine::NotifyClient: no callback for event = 4
Jack: JackPosixSemaphore::Disconnect jack_sem.1000_default_qjackctl
Jack: JackExternalClient::ClientNotify ref = 2 name = jack_cpu_load notify = 4
Jack: JackLibClient::~JackLibClient
Jack: JackShmReadWritePtr1::~JackShmReadWritePtr1 3
Jack: Succeeded in unlocking 384 byte memory area
Jack: JackLibGlobals Destroy 87e4610
Jack: ~JackLibGlobals
Jack: JackPosixSemaphore::Disconnect jack_sem.1000_default_system
Jack: JackPosixSemaphore::Disconnect jack_sem.1000_default_freewheel
Jack: JackPosixSemaphore::Disconnect jack_sem.1000_default_jack_cpu_load
Jack: no message buffer overruns
Jack: ThreadHandler: exit
Jack: JackPosixThread::Stop
Jack: JackShmReadWritePtr::~JackShmReadWritePtr 1
Jack: Succeeded in unlocking 1012 byte memory area
Jack: JackShmReadWritePtr::~JackShmReadWritePtr 0
Jack: JackGraphManager::RecalculateLatency port_index = 6
Jack: JackGraphManager::RecalculateLatency port_index = 7
Jack: JackGraphManager::RecalculateLatency port_index = 8
Jack: JackGraphManager::RecalculateLatency port_index = 9
Jack: JackGraphManager::RecalculateLatency port_index = 10
Jack: JackGraphManager::RecalculateLatency port_index = 11
Jack: JackGraphManager::RecalculateLatency port_index = 12
Jack: JackGraphManager::RecalculateLatency port_index = 6
Jack: JackGraphManager::RecalculateLatency port_index = 7
Jack: JackGraphManager::RecalculateLatency port_index = 8
Jack: JackGraphManager::RecalculateLatency port_index = 9
Jack: JackGraphManager::RecalculateLatency port_index = 10
Jack: JackGraphManager::RecalculateLatency port_index = 11
Jack: JackGraphManager::RecalculateLatency port_index = 12
Jack: JackGraphManager::RecalculateLatency port_index = 1
Jack: JackGraphManager::RecalculateLatency port_index = 2
Jack: JackGraphManager::RecalculateLatency port_index = 3
Jack: JackGraphManager::RecalculateLatency port_index = 4
Jack: JackGraphManager::RecalculateLatency port_index = 5
Jack: JackGraphManager::RecalculateLatency port_index = 1
Jack: JackGraphManager::RecalculateLatency port_index = 2
Jack: JackGraphManager::RecalculateLatency port_index = 3
Jack: JackGraphManager::RecalculateLatency port_index = 4
Jack: JackGraphManager::RecalculateLatency port_index = 5
Jack: fPollTable i = 2 fd = 47
Jack: Succeeded in unlocking 82245916 byte memory area
Jack: jack_client_close res = 0
jack main caught signal 15
Jack: JackServer::Stop
Jack: JackThreadedDriver::Stop
Jack: JackPosixThread::Stop
Jack: ThreadHandler: exit
Jack: JackServer::Close
Jack: fPollTable i = 1 fd = 46
Jack: JackRequest::Notification
Jack: JackRequest::Notification kQUIT
Jack: JackMachServerChannel::Execute JackQuitException
Jack: ThreadHandler: exit
Jack: JackPosixThread::Stop
Jack: JackServerSocket::Close /dev/shm/jack_default_1000_0
Jack: JackClientSocket::Close
Jack: JackClientSocket::Close
Jack: JackFFADODriver::Detach
*** glibc detected *** /usr/bin/jackd: double free or corruption (!prev): 0x095f4a00 ***
======= Backtrace: =========
/lib/tls/i686/cmov/libc.so.6(+0x6b591)[0xb742a591]
/lib/tls/i686/cmov/libc.so.6(+0x6cde8)[0xb742bde8]
/lib/tls/i686/cmov/libc.so.6(cfree+0x6d)[0xb742eecd]
/usr/lib/libraw1394.so.11(fw_destroy_handle+0x8a)[0xb6e1679a]
/usr/lib/libraw1394.so.11(raw1394_destroy_handle+0x53)[0xb6e152b3]
/usr/lib/libffado.so.2(_ZN17IsoHandlerManager10IsoHandler7disableEv+0x6e)[0xb0ead0ce]
/usr/lib/libffado.so.2(_ZN17IsoHandlerManager10IsoHandler11updateStateEv+0x90)[0xb0ead620]
/usr/lib/libffado.so.2(_ZN17IsoHandlerManager7IsoTask21updateShadowMapHelperEv+0x83)[0xb0eaeed3]
/usr/lib/libffado.so.2(_ZN17IsoHandlerManager7IsoTask7ExecuteEv+0x42a)[0xb0eaf3fa]
/usr/lib/libffado.so.2(_ZN4Util11PosixThread13ThreadHandlerEPv+0xe6)[0xb0ec95c6]
/lib/tls/i686/cmov/libpthread.so.0(+0x596e)[0xb75aa96e]
/lib/tls/i686/cmov/libc.so.6(clone+0x5e)[0xb748ca4e]
======= Memory map: ========
08048000-0804d000 r-xp 00000000 08:08 1326117    /usr/bin/jackd
0804d000-0804e000 r--p 00004000 08:08 1326117    /usr/bin/jackd
0804e000-0804f000 rw-p 00005000 08:08 1326117    /usr/bin/jackd
09579000-0961a000 rw-p 00000000 00:00 0          [heap]
ac7fe000-ac7ff000 ---p 00000000 00:00 0 
ac7ff000-ac87f000 rw-p 00000000 00:00 0 
ac87f000-ac880000 ---p 00000000 00:00 0 
ac880000-ac900000 rw-p 00000000 00:00 0 
ac900000-ac923000 rw-p 00000000 00:00 0 
ac923000-aca00000 ---p 00000000 00:00 0 
aca64000-aca65000 ---p 00000000 00:00 0 
aca65000-ad265000 rw-p 00000000 00:00 0 
ad265000-ad266000 ---p 00000000 00:00 0 
ad266000-ada66000 rw-p 00000000 00:00 0 
ada66000-ada67000 ---p 00000000 00:00 0 
ada67000-ae267000 rw-p 00000000 00:00 0 
ae267000-ae268000 ---p 00000000 00:00 0 
ae268000-aea68000 rw-p 00000000 00:00 0 
aea68000-aea69000 ---p 00000000 00:00 0 
aea69000-af269000 rw-p 00000000 00:00 0 
af269000-af26a000 ---p 00000000 00:00 0 
af26a000-afa6a000 rw-p 00000000 00:00 0 
afa6a000-afa6b000 ---p 00000000 00:00 0 
afa6b000-b026b000 rw-p 00000000 00:00 0 
b026b000-b026c000 ---p 00000000 00:00 0 
b026c000-b0c6d000 rw-p 00000000 00:00 0 
b0c6d000-b0c9c000 r-xp 00000000 08:08 523356     /lib/libpcre.so.3.12.1
b0c9c000-b0c9d000 r--p 0002e000 08:08 523356     /lib/libpcre.so.3.12.1
b0c9d000-b0c9e000 rw-p 0002f000 08:08 523356     /lib/libpcre.so.3.12.1
b0c9e000-b0cb1000 r-xp 00000000 08:08 523512     /lib/libz.so.1.2.3.3
b0cb1000-b0cb2000 r--p 00012000 08:08 523512     /lib/libz.so.1.2.3.3
b0cb2000-b0cb3000 rw-p 00013000 08:08 523512     /lib/libz.so.1.2.3.3
b0cb3000-b0d7b000 r-xp 00000000 08:08 523375     /lib/libglib-2.0.so.0.2400.1
b0d7b000-b0d7c000 r--p 000c7000 08:08 523375     /lib/libglib-2.0.so.0.2400.1
b0d7c000-b0d7d000 rw-p 000c8000 08:08 523375     /lib/libglib-2.0.so.0.2400.1
b0d7d000-b0d81000 r-xp 00000000 08:08 1314066    /usr/lib/libgthread-2.0.so.0.2400.1
b0d81000-b0d82000 r--p 00003000 08:08 1314066    /usr/lib/libgthread-2.0.so.0.2400.1
b0d82000-b0d83000 rw-p 00004000 08:08 1314066    /usr/lib/libgthread-2.0.so.0.2400.1
b0d83000-b0dc0000 r-xp 00000000 08:08 1314063    /usr/lib/libgobject-2.0.so.0.2400.1
b0dc0000-b0dc1000 r--p 0003c000 08:08 1314063    /usr/lib/libgobject-2.0.so.0.2400.1
b0dc1000-b0dc2000 rw-p 0003d000 08:08 1314063    /usr/lib/libgobject-2.0.so.0.2400.1
b0dc2000-b0e17000 r-xp 00000000 08:08 1361016    /usr/lib/libglibmm-2.4.so.1.3.0
b0e17000-b0e18000 ---p 00055000 08:08 1361016    /usr/lib/libglibmm-2.4.so.1.3.0
b0e18000-b0e19000 r--p 00055000 08:08 1361016    /usr/lib/libglibmm-2.4.so.1.3.0
b0e19000-b0e1a000 rw-p 00056000 08:08 1361016    /usr/lib/libglibmm-2.4.so.1.3.0
b0e1a000-b0fe1000 r-xp 00000000 08:08 1312187    /usr/lib/libffado.so.2.999.0
b0fe1000-b0feb000 r--p 001c6000 08:08 1312187    /usr/lib/libffado.so.2.999.0
b0feb000-b0fee000 rw-p 001d0000 08:08 1312187    /usr/lib/libffado.so.2.999.0
b0fee000-b0fef000 rw-p 00000000 00:00 0 
b0fef000-b5e5f000 rw-s 00000000 00:10 67499      /dev/shm/jack-1000-0
b5e5f000-b649c000 rw-p 00000000 00:00 0 
b649c000-b649d000 ---p 00000000 00:00 0 
b649d000-b6c9d000 rw-p 00000000 00:00 0 
b6ca0000-b6ca5000 r-xp 00000000 08:08 1315393    /usr/lib/libsigc-2.0.so.0.0.0
b6ca5000-b6ca6000 r--p 00004000 08:08 1315393    /usr/lib/libsigc-2.0.so.0.0.0
b6ca6000-b6ca7000 rw-p 00005000 08:08 1315393    /usr/lib/libsigc-2.0.so.0.0.0
b6ca7000-b6dcb000 r-xp 00000000 08:08 1370421    /usr/lib/libxml2.so.2.7.6
b6dcb000-b6dcc000 ---p 00124000 08:08 1370421    /usr/lib/libxml2.so.2.7.6
b6dcc000-b6dd0000 r--p 00124000 08:08 1370421    /usr/lib/libxml2.so.2.7.6
b6dd0000-b6dd1000 rw-p 00128000 08:08 1370421    /usr/lib/libxml2.so.2.7.6
b6dd1000-b6dd2000 rw-p 00000000 00:00 0 
b6dd2000-b6df5000 r-xp 00000000 08:08 1365250    /usr/lib/libxml++-2.6.so.2.0.7
b6df5000-b6df6000 r--p 00023000 08:08 1365250    /usr/lib/libxml++-2.6.so.2.0.7
b6df6000-b6df7000 rw-p 00024000 08:08 1365250    /usr/lib/libxml++-2.6.so.2.0.7
b6df7000-b6e0b000 r-xp 00000000 08:08 1375032    /usr/lib/libconfig++.so.8.0.0
b6e0b000-b6e0c000 r--p 00013000 08:08 1375032    /usr/lib/libconfig++.so.8.0.0
b6e0c000-b6e0d000 rw-p 00014000 08:08 1375032    /usr/lib/libconfig++.so.8.0.0
b6e0d000-b6e19000 r-xp 00000000 08:08 1432895    /usr/lib/libraw1394.so.11.0.1
b6e19000-b6e1a000 r--p 0000b000 08:08 1432895    /usr/lib/libraw1394.so.11.0.1
b6e1a000-b6e1b000 rw-p 0000c000 08:08 1432895    /usr/lib/libraw1394.so.11.0.1
b6e1b000-b6e26000 r-xp 00000000 08:08 1360028    /usr/lib/libiec61883.so.0.1.1
b6e26000-b6e27000 r--p 0000a000 08:08 1360028    /usr/lib/libiec61883.so.0.1.1
b6e27000-b6e28000 rw-p 0000b000 08:08 1360028    /usr/lib/libiec61883.so.0.1.1
b6e3f000-b6e40000 ---p 00000000 00:00 0 
b6e40000-b6ec0000 rw-p 00000000 00:00 0 
b6ec0000-b6ec5000 r-xp 00000000 08:08 1349171    /usr/lib/libogg.so.0.6.0
b6ec5000-b6ec6000 r--p 00004000 08:08 1349171    /usr/lib/libogg.so.0.6.0
b6ec6000-b6ec7000 rw-p 00005000 08:08 1349171    /usr/lib/libogg.so.0.6.0
b6ec7000-b6eee000 r-xp 00000000 08:08 1349182    /usr/lib/libvorbis.so.0.4.3
b6eee000-b6eef000 r--p 00026000 08:08 1349182    /usr/lib/libvorbis.so.0.4.3
b6eef000-b6ef0000 rw-p 00027000 08:08 1349182    /usr/lib/libvorbis.so.0.4.3
b6ef0000-b6fdc000 r-xp 00000000 08:08 1350897    /usr/lib/libvorbisenc.so.2.0.6
b6fdc000-b6fdd000 ---p 000ec000 08:08 1350897    /usr/lib/libvorbisenc.so.2.0.6
b6fdd000-b6feb000 r--p 000ec000 08:08 1350897    /usr/lib/libvorbisenc.so.2.0.6
b6feb000-b6fec000 rw-p 000fa000 08:08 1350897    /usr/lib/libvorbisenc.so.2.0.6
b6fec000-b7037000 r-xp 00000000 08:08 1350892    /usr/lib/libFLAC.so.8.2.0
b7037000-b7038000 r--p 0004a000 08:08 1350892    /usr/lib/libFLAC.so.8.2.0
b7038000-b7039000 rw-p 0004b000 08:08 1350892    /usr/lib/libFLAC.so.8.2.0
b7039000-b704c000 r-xp 00000000 08:08 523294     /lib/tls/i686/cmov/libnsl-2.11.1.so
b704c000-b704d000 r--p 00012000 08:08 523294     /lib/tls/i686/cmov/libnsl-2.11.1.so
b704d000-b704e000 rw-p 00013000 08:08 523294     /lib/tls/i686/cmov/libnsl-2.11.1.so
b704e000-b7050000 rw-p 00000000 00:00 0 
b7050000-b7054000 r-xp 00000000 08:08 1314791    /usr/lib/libXdmcp.so.6.0.0
b7054000-b7055000 r--p 00003000 08:08 1314791    /usr/lib/libXdmcp.so.6.0.0
b7055000-b7056000 rw-p 00004000 08:08 1314791    /usr/lib/libXdmcp.so.6.0.0
b7056000-b70b7000 r-xp 00000000 08:08 1326082    /usr/lib/libsndfile.so.1.0.21
b70b7000-b70b8000 ---p 00061000 08:08 1326082    /usr/lib/libsndfile.so.1.0.21
b70b8000-b70b9000 r--p 00061000 08:08 1326082    /usr/lib/libsndfile.so.1.0.21
b70b9000-b70ba000 rw-p 00062000 08:08 1326082    /usr/lib/libsndfile.so.1.0.21
b70ba000-b70be000 rw-p 00000000 00:00 0 
b70be000-b70c5000 r-xp 00000000 08:08 523660     /lib/libwrap.so.0.7.6
b70c5000-b70c6000 r--p 00006000 08:08 523660     /lib/libwrap.so.0.7.6
b70c6000-b70c7000 rw-p 00007000 08:08 523660     /lib/libwrap.so.0.7.6
b70c7000-b70d3000 r-xp 00000000 08:08 1317694    /usr/lib/libXi.so.6.1.0
b70d3000-b70d4000 r--p 0000c000 08:08 1317694    /usr/lib/libXi.so.6.1.0
b70d4000-b70d5000 rw-p 0000d000 08:08 1317694    /usr/lib/libXi.so.6.1.0
b70d5000-b70e3000 r-xp 00000000 08:08 1315060    /usr/lib/libXext.so.6.4.0
b70e3000-b70e4000 r--p 0000d000 08:08 1315060    /usr/lib/libXext.so.6.4.0
b70e4000-b70e5000 rw-p 0000e000 08:08 1315060    /usr/lib/libXext.so.6.4.0
b70e5000-b70e8000 r-xp 00000000 08:08 523376     /lib/libuuid.so.1.3.0
b70e8000-b70e9000 r--p 00002000 08:08 523376     /lib/libuuid.so.1.3.0
b70e9000-b70ea000 rw-p 00003000 08:08 523376     /lib/libuuid.so.1.3.0
b70ea000-b7102000 r-xp 00000000 08:08 1314793    /usr/lib/libxcb.so.1.1.0
b7102000-b7103000 r--p 00017000 08:08 1314793    /usr/lib/libxcb.so.1.1.0
b7103000-b7104000 rw-p 00018000 08:08 1314793    /usr/lib/libxcb.so.1.1.0
b7104000-b714d000 r-xp 00000000 08:08 1350925    /usr/lib/libpulsecommon-0.9.21.so
b714d000-b714e000 r--p 00048000 08:08 1350925    /usr/lib/libpulsecommon-0.9.21.so
b714e000-b714f000 rw-p 00049000 08:08 1350925    /usr/lib/libpulsecommon-0.9.21.so
b714f000-b7164000 r-xp 00000000 08:08 1315066    /usr/lib/libICE.so.6.3.0
b7164000-b7165000 r--p 00014000 08:08 1315066    /usr/lib/libICE.so.6.3.0
b7165000-b7166000 rw-p 00015000 08:08 1315066    /usr/lib/libICE.so.6.3.0
b7166000-b7168000 rw-p 00000000 00:00 0 
b7168000-b7281000 r-xp 00000000 08:08 1315054    /usr/lib/libX11.so.6.3.0
b7281000-b7282000 r--p 00118000 08:08 1315054    /usr/lib/libX11.so.6.3.0
b7282000-b7284000 rw-p 00119000 08:08 1315054    /usr/lib/libX11.so.6.3.0
b7284000-b7285000 rw-p 00000000 00:00 0 
b7285000-b72c5000 r-xp 00000000 08:08 1350923    /usr/lib/libpulse.so.0.12.2
b72c5000-b72c6000 r--p 00040000 08:08 1350923    /usr/lib/libpulse.so.0.12.2
b72c6000-b72c7000 rw-p 00041000 08:08 1350923    /usr/lib/libpulse.so.0.12.2
b72c7000-b738a000 r-xp 00000000 08:08 1345951    /usr/lib/libasound.so.2.0.0
b738a000-b738e000 r--p 000c2000 08:08 1345951    /usr/lib/libasound.so.2.0.0
b738e000-b738f000 rw-p 000c6000 08:08 1345951    /usr/lib/libasound.so.2.0.0
b7392000-b7393000 rw-s 00000000 00:10 68581      /dev/shm/jack-1000-2 (deleted)
b7393000-b7394000 rw-s 00000000 00:10 66686      /dev/shm/sem.jack_sem.1000_default_jack_cpu_load
b7394000-b739d000 r-xp 00000000 08:08 1443905    /usr/lib/jack/jack_firewire.so
b739d000-b739e000 r--p 00008000 08:08 1443905    /usr/lib/jack/jack_firewire.so
b739e000-b739f000 rw-p 00009000 08:08 1443905    /usr/lib/jack/jack_firewire.so
b739f000-b73a6000 r--s 00000000 08:08 1337173    /usr/lib/gconv/gconv-modules.cache
b73a6000-b73a9000 r-xp 00000000 08:08 1314065    /usr/lib/libgmodule-2.0.so.0.2400.1
b73a9000-b73aa000 r--p 00002000 08:08 1314065    /usr/lib/libgmodule-2.0.so.0.2400.1
b73aa000-b73ab000 rw-p 00003000 08:08 1314065    /usr/lib/libgmodule-2.0.so.0.2400.1
b73ab000-b73bd000 rw-s 00000000 00:10 17508      /dev/shm/jack-shm-registry
b73bd000-b73bf000 rw-p 00000000 00:00 0 
b73bf000-b7512000 r-xp 00000000 08:08 523288     /lib/tls/i686/cmov/libc-2.11.1.so
b7512000-b7513000 ---p 00153000 08:08 523288     /lib/tls/i686/cmov/libc-2.11.1.so
b7513000-b7515000 r--p 00153000 08:08 523288     /lib/tls/i686/cmov/libc-2.11.1.so
b7515000-b7516000 rw-p 00155000 08:08 523288     /lib/tls/i686/cmov/libc-2.11.1.so
b7516000-b7519000 rw-p 00000000 00:00 0 
b7519000-b7536000 r-xp 00000000 08:08 523337     /lib/libgcc_s.so.1
b7536000-b7537000 r--p 0001c000 08:08 523337     /lib/libgcc_s.so.1
b7537000-b7538000 rw-p 0001d000 08:08 523337     /lib/libgcc_s.so.1
b7538000-b7539000 rw-p 00000000 00:00 0 
b7539000-b755d000 r-xp 00000000 08:08 523292     /lib/tls/i686/cmov/libm-2.11.1.so
b755d000-b755e000 r--p 00023000 08:08 523292     /lib/tls/i686/cmov/libm-2.11.1.so
b755e000-b755f000 rw-p 00024000 08:08 523292     /lib/tls/i686/cmov/libm-2.11.1.so
b755f000-b7596000 r-xp 00000000 08:08 523268     /lib/libdbus-1.so.3.4.0
b7596000-b7597000 r--p 00036000 08:08 523268     /lib/libdbus-1.so.3.4.0
b7597000-b7598000 rw-p 00037000 08:08 523268     /lib/libdbus-1.so.3.4.0
b7598000-b759f000 r-xp 00000000 08:08 523304     /lib/tls/i686/cmov/librt-2.11.1.so
b759f000-b75a0000 r--p 00006000 08:08 523304     /lib/tls/i686/cmov/librt-2.11.1.so
b75a0000-b75a1000 rw-p 00007000 08:08 523304     /lib/tls/i686/cmov/librt-2.11.1.so
b75a1000-b75a3000 r-xp 00000000 08:08 523291     /lib/tls/i686/cmov/libdl-2.11.1.so
b75a3000-b75a4000 r--p 00001000 08:08 523291     /lib/tls/i686/cmov/libdl-2.11.1.so
b75a4000-b75a5000 rw-p 00002000 08:08 523291     /lib/tls/i686/cmov/libdl-2.11.1.so
b75a5000-b75ba000 r-xp 00000000 08:08 523302     /lib/tls/i686/cmov/libpthread-2.11.1.so
b75ba000-b75bb000 r--p 00014000 08:08 523302     /lib/tls/i686/cmov/libpthread-2.11.1.so
b75bb000-b75bc000 rw-p 00015000 08:08 523302     /lib/tls/i686/cmov/libpthread-2.11.1.so
b75bc000-b75bf000 rw-p 00000000 00:00 0 
b75bf000-b763f000 r-xp 00000000 08:08 1318951    /usr/lib/libjackserver.so.0.1.0
b763f000-b7641000 r--p 00080000 08:08 1318951    /usr/lib/libjackserver.so.0.1.0
b7641000-b7642000 rw-p 00082000 08:08 1318951    /usr/lib/libjackserver.so.0.1.0
b7642000-b7643000 rw-p 00000000 00:00 0 
b7643000-b772c000 r-xp 00000000 08:08 1309752    /usr/lib/libstdc++.so.6.0.13
b772c000-b772d000 ---p 000e9000 08:08 1309752    /usr/lib/libstdc++.so.6.0.13
b772d000-b7731000 r--p 000e9000 08:08 1309752    /usr/lib/libstdc++.so.6.0.13
b7731000-b7732000 rw-p 000ed000 08:08 1309752    /usr/lib/libstdc++.so.6.0.13
b7732000-b7739000 rw-p 00000000 00:00 0 
b7739000-b773b000 r-xp 00000000 08:08 1314785    /usr/lib/libXau.so.6.0.0
b773b000-b773c000 r--p 00001000 08:08 1314785    /usr/lib/libXau.so.6.0.0
b773c000-b773d000 rw-p 00002000 08:08 1314785    /usr/lib/libXau.so.6.0.0
b773d000-b7741000 r-xp 00000000 08:08 1350917    /usr/lib/libXtst.so.6.1.0
b7741000-b7742000 r--p 00003000 08:08 1350917    /usr/lib/libXtst.so.6.1.0
b7742000-b7743000 rw-p 00004000 08:08 1350917    /usr/lib/libXtst.so.6.1.0
b7743000-b774a000 r-xp 00000000 08:08 1315072    /usr/lib/libSM.so.6.0.1
b774a000-b774b000 r--p 00006000 08:08 1315072    /usr/lib/libSM.so.6.0.1
b774b000-b774c000 rw-p 00007000 08:08 1315072    /usr/lib/libSM.so.6.0.1
b774c000-b774d000 rw-s 00000000 00:10 66674      /dev/shm/sem.jack_sem.1000_default_freewheel
b774d000-b774e000 rw-s 00000000 00:10 66670      /dev/shm/sem.jack_sem.1000_default_system
b774e000-b774f000 rw-s 00000000 00:10 67500      /dev/shm/jack-1000-1
b774f000-b7752000 rw-p 00000000 00:00 0 
b7752000-b7753000 r-xp 00000000 00:00 0          [vdso]
b7753000-b776e000 r-xp 00000000 08:08 523465     /lib/ld-2.11.1.so
b776e000-b776f000 r--p 0001a000 08:08 523465     /lib/ld-2.11.1.so
b776f000-b7770000 rw-p 0001b000 08:08 523465     /lib/ld-2.11.1.so
bfbd7000-bfbf8000 rw-p 00000000 00:00 0          [stack]
22:12:55.994 JACK was stopped successfully.
22:12:55.995 Post-shutdown script...
22:12:55.995 killall jackd
22:12:55.996 JACK has crashed.
jackd: no process found
22:12:56.405 Post-shutdown script terminated with exit status=256.

06/17/11 14:26:41 changed by roos

(above message is with kernel *.39)

06/17/11 19:57:44 changed by jwoithe

I don't currently have jack2 (aka jackmp) on my system. Since you mentioned that this happens under jack1, could you please re-run that and attach the crash report produced with that? However, when doing so could you please add "-v 6" to the end of your jackd command to produce additional debugging information which may help me track this last gremlin down. I don't see the above crash on my system at present, so I have to rely on your dump to work out what is happening.

Can you also double-check (using the banners printed by jackd when it starts) that the system is using the version of ffado you think it is. Assuming you've compiled a debug build this should be reported in a line looking something like this

ffado_streaming_init: libffado 2.999.0-1984M built May 20 2011 23:07:33

if you start jackd without your device connected using "jackd -d firewire".

I've just made a *very* quick check of the source in trunk and I can't immediately see a way that the double-free can still be triggered as it's protected by a lock now. That's not to say that there isn't a way though. I'd just like to make doubly sure that we're talking about the same version of ffado before digging deeply into this, since it's easy for libffado versions to get confused.

06/18/11 01:05:46 changed by jwoithe

I've now had a chance to look into this in a little more detail and I must say that I'm puzzled as to how this can still occur - assuming the double-free report given above is due to a double call to raw1394_destroy_handle() which is what it looks to be. The reasoning is as follows.

The only place that calls raw1394_destroy_handle() is IsoHandlerManager::IsoHandler::disable(). In this function, the call to raw1394_destroy_handle() is protected by the m_disable_lock mutex. Even if there were two different simultaneous callers to IsoHandlerManager::IsoHandler::disable(), only the first would get to call raw1394_destroy_handle(). The second would block on the mutex lock request until the first released the lock. This happens right at the end of IsoHandlerManager::IsoHandler::disable(), critically after the state has been set to "stopped" and the handle is explicitly set to NULL. When the second call acquires the lock, the first thing it does is test the state and only proceed if it's "running". Since the first disable() call set the state to "stopped" this conditional (in IsoHandlerManager?.cpp line 1850) will fail and the second disable() call will exit without calling raw1394_destroy_handle().

Similarly, if some other mechanism calls disable() sometime later, the lock will be acquired immediately but again the state is tested before proceeding. Since the state cannot be anything other than "stopped" following a call to raw1394_destroy_handle(), this ensures that raw1394_destroy_handle() will only be called once per raw1394 handle.

All this is great in theory, but obviously your system produced the above crash dump and we need to understand why. As mentioned, after a lot of testing back in May I am no longer able to reproduce this problem on my system with the new stack so I will be relying on the outcome of the tests outlined before on "Roos"s system to take this further. To summarise what I need:

1) Confirmation of the version of libffado actually being used by jack

2) Build a debug build of libffado (the default if you've checked out svn trunk) and run with the "-v 6" option at the end of the jackd command. This will turn on additional debugging messages which will hopefully shed more light as to what's going on inside disable() in particular.

Finally for the sake of completeness I will note that it's not sufficient to check out rev 1985 in the 2.0 ffado branch - you need to get trunk since this fix has not yet been backported into the 2.0 branch.

06/18/11 02:44:06 changed by roos

  • attachment ffado_log.txt added.

entire session log from start til crash

06/18/11 02:45:03 changed by stefanr

Jonathan, I am wondering about this section:

IsoHandlerManager::IsoHandler::disable()
{
    signed int i, have_lock = 0;
[...]
    i = pthread_mutex_trylock(&m_disable_lock);
    if (i == 0)
        have_lock = 1;
    else
    if (i == EBUSY) {
[...]
        debugOutput( DEBUG_LEVEL_VERBOSE, "waiting for disable lock\n");
        pthread_mutex_lock(&m_disable_lock);
        debugOutput( DEBUG_LEVEL_VERBOSE, "now have disable lock\n");
[...]
        have_lock = 1;
    }
[...]
[kill the raw1394 handle here]
[...]
    if (have_lock)
        pthread_mutex_unlock(&m_disable_lock);
    return true;
}

The fact that you initially only test for i == 0 i == EBUSY and that you later, in the supposedly m_disable_lock protected section, test for have_lock, begs the question whether there are circumstances that this section could be executed without m_disable_lock held.

06/18/11 02:45:42 changed by roos

  • attachment qjackctl_messages added.

qjackctl messages

06/18/11 03:14:09 changed by roos

  • attachment ffado_log2 added.

it shows -v 6 messages right before the crash.

(follow-up: ↓ 11 ) 06/18/11 03:56:16 changed by jwoithe

I'll take a look at the logs in a moment. To answer Stefan's question about the structure in disable(), it looks a little odd but it should be fine. It's written like that to permit the printing of the more detailed log messages. One could for instance simply call pthread_mutex_lock(), but then there's no way to determine whether there was lock contention or not. This way we find out.

To answer your question, I don't believe it's possible - in these circumstances - to have the lower section execute without the lock. According to the documentation for pthread_mutex_trylock(), the usual return values are 0 (no error, the lock was taken) and EBUSY (the lock is already held and thus cannot be taken). Other errors will not apply: EINVAL requires that the lock is not initialised (the mutex is always initialised by the time disable() is called), and EAGAIN requires that too many recursive locks are held (recursive locking is not configured for this mutex). So unless there's undocumented behaviour occuring in pthread_mutex_trylock() I can't see how the function could continue without the lock held. However, I don't profess to be an expert in the area of posix locks; are you (Stefan) aware of other returns from pthread_mutex_trylock() which aren't in the documentation?

"have_lock" is used just because I'm paranoid and don't want to complicate things by calling pthread_mutex_unlock() unless we definitely have the lock. As outlined above though the expectation is that we must always have it by then.

I guess to prove this you could comment out the "if (i == EBUSY)" conditional and replace it with a simple "{". That would force the call to pthread_mutex_lock() on ANY error from pthread_mutex_trylock() and answer the question pretty much immediately. I'd have to wait for "roos" to try that though, since I no longer get the double-free on my system (which is running 2.6.39 for reference).

(in reply to: ↑ 10 ; follow-up: ↓ 13 ) 06/18/11 04:24:04 changed by stefanr

Replying to jwoithe:

are you (Stefan) aware of other returns from pthread_mutex_trylock() which aren't in the documentation?

No. Also, it very much looks like the trylock handling is indeed not the issue here; see below.

- 20 minutes earlier -

Interesting. I now had one occasion in which the bug returned:

jackd 0.120.1
[...]
253008147112:  (ffado.cpp)[  92] ffado_streaming_init: libffado 2.999.0-1985 built Jun 18 2011 12:40:33
^Cjack main caught signal 2
*** glibc detected *** /usr/bin/jackd: double free or corruption (!prev): 0x00000000017a20d0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x731fd)[0x7fbce09f61fd]
/lib64/libc.so.6(cfree+0x6c)[0x7fbce09faeec]
/usr/local/lib/libraw1394.so.11(raw1394_destroy_handle+0x31)[0x7fbcdeec0001]
/usr/local/lib/libffado.so.2(_ZN17IsoHandlerManager10IsoHandler7disableEv+0x304)[0x7fbcdf71a128]
/usr/local/lib/libffado.so.2(_ZN17IsoHandlerManager10IsoHandler11updateStateEv+0x132)[0x7fbcdf71a44a]
/usr/local/lib/libffado.so.2(_ZN17IsoHandlerManager7IsoTask21updateShadowMapHelperEv+0x13f)[0x7fbcdf713175]
/usr/local/lib/libffado.so.2(_ZN17IsoHandlerManager7IsoTask7ExecuteEv+0x144)[0x7fbcdf7134f4]
/usr/local/lib/libffado.so.2(_ZN4Util11PosixThread13ThreadHandlerEPv+0x25a)[0x7fbcdf7439d2]
/lib64/libpthread.so.0(+0x6dba)[0x7fbce0ef5dba]
/lib64/libc.so.6(clone+0x6d)[0x7fbce0a5502d]
======= Memory map: ========
[...]

Seen with a BeBoB device.

Seems that the bug is still there - but unlikely to happen on my system and likely to happen on roos' system.

(Another observation here is that jackd does often not exit immediately after receiving ^C i.e. SIGINT. Instead, it gets stuck until "jackd watchdog: timeout - killing jackd". Seen with a BeBoB device, or with a BeBoB and a DICE device aggregated. But I am currently on kernel 3.0-rc3, hence there may be added weirdnesses from the usual -rc kernel regressions.)

- tens minutes later -

Weird. I got the bug again after the following modification:

Index: src/libieee1394/IsoHandlerManager.cpp
===================================================================
--- src/libieee1394/IsoHandlerManager.cpp	(revision 1985)
+++ src/libieee1394/IsoHandlerManager.cpp	(working copy)
@@ -1820,38 +1820,24 @@
 bool
 IsoHandlerManager::IsoHandler::disable()
 {
-    signed int i, have_lock = 0;
+    bool ret;
 
     debugOutput( DEBUG_LEVEL_VERBOSE, "(%p, %s) enter...\n", 
                  this, (m_type==eHT_Receive?"Receive":"Transmit"));
 
-    i = pthread_mutex_trylock(&m_disable_lock);
-    if (i == 0)
-        have_lock = 1;
-    else
-    if (i == EBUSY) {
-        // Some other thread is disabling this handler, a process which can
-        // take considerable time when using the new kernel firewire stack. 
-        // Wait until it is finished before returning so the present caller
-        // can act knowing that the disable has occurred and is complete
-        // (which is what normally would be expected).
-        debugOutput( DEBUG_LEVEL_VERBOSE, "waiting for disable lock\n");
-        pthread_mutex_lock(&m_disable_lock);
-        debugOutput( DEBUG_LEVEL_VERBOSE, "now have disable lock\n");
-        if (m_State == eHS_Stopped) {
-            debugOutput( DEBUG_LEVEL_VERBOSE, "another disable() has completed\n");
-            pthread_mutex_unlock(&m_disable_lock);
-            return true;
-        }
-        have_lock = 1;
+    pthread_mutex_lock(&m_disable_lock);
+
+    if (m_State == eHS_Stopped) {
+        debugOutput( DEBUG_LEVEL_VERBOSE, "another disable() has completed\n");
+        ret = true;
+        goto out;
     }
 
     // check state
     if(m_State != eHS_Running) {
         debugError("Incorrect state, expected eHS_Running, got %d\n",(int)m_State);
-        if (have_lock)
-            pthread_mutex_unlock(&m_disable_lock);
-        return false;
+        ret = false;
+        goto out;
     }
 
     assert(m_handle != NULL);
@@ -1889,10 +1875,10 @@
 
     m_State = eHS_Stopped;
     m_NextState = eHS_Stopped;
-
-    if (have_lock)
-        pthread_mutex_unlock(&m_disable_lock);
-    return true;
+    ret = true;
+out:
+    pthread_mutex_unlock(&m_disable_lock);
+    return ret;
 }
 
 // functions to request enable or disable at the next opportunity

I.e. this modification makes it 100% sure that IsoHandlerManager.cpp's raw1394_destroy_handle is executed with the mutex held since this no longer depends on exact pthread_mutex_trylock return codes.

This time the crash looked like this:

jackd 0.120.1
[...]
254221450451:  (ffado.cpp)[  92] ffado_streaming_init: libffado 2.999.0-1985M built Jun 18 2011 13:01:24
[...]
^Cjack main caught signal 2                                                                           
254461308291: Warning (TimestampedBuffer.cpp)[1062] incrementFrameCounter: (0x1675ae0) difference rather large (+): diff=   6165733.572, max=      1536.000,   62169494.000,   56003760.428                 
254461308352: Warning (TimestampedBuffer.cpp)[ 251] calculateRate: (0x1675ae0) rate ( 681.51312) more that 10% off nominal (rate= 557.27893, diff=      5452.105, update_period=8)                          
254461313877: Warning (ieee1394service.cpp)[1489] freeIsoChannel: Could not do CMP disconnect for channel 0!                                                                                                
254461317668: Warning (ieee1394service.cpp)[1489] freeIsoChannel: Could not do CMP disconnect for channel 1!                                                                                                
254461317881: Error (IsoHandlerManager.cpp)[1329] ~IsoHandler: BUG: Handler still running!            
*** glibc detected *** /usr/bin/jackd: double free or corruption (!prev): 0x000000000168d820 ***      
======= Backtrace: =========
/lib64/libc.so.6(+0x731fd)[0x7fea7fa361fd]
/lib64/libc.so.6(cfree+0x6c)[0x7fea7fa3aeec]
/usr/local/lib/libraw1394.so.11(raw1394_destroy_handle+0x31)[0x7fea7df00001]
/usr/local/lib/libffado.so.2(_ZN17IsoHandlerManager10IsoHandler7disableEv+0x237)[0x7fea7e75a05b]
/usr/local/lib/libffado.so.2(_ZN17IsoHandlerManager10IsoHandler11updateStateEv+0x132)[0x7fea7e75a37a]
/usr/local/lib/libffado.so.2(_ZN17IsoHandlerManager7IsoTask21updateShadowMapHelperEv+0x13f)[0x7fea7e753175]
/usr/local/lib/libffado.so.2(_ZN17IsoHandlerManager7IsoTask7ExecuteEv+0x144)[0x7fea7e7534f4]
/usr/local/lib/libffado.so.2(_ZN4Util11PosixThread13ThreadHandlerEPv+0x25a)[0x7fea7e783902]
/lib64/libpthread.so.0(+0x6dba)[0x7fea7ff35dba]
/lib64/libc.so.6(clone+0x6d)[0x7fea7fa9502d]
======= Memory map: ========
[...]

Jonathan,

are we sure that this is really about the raw1394_destroy_handle in IsoHandlerManager.cpp, and not about one of the five raw1394_destroy_handle in ieee1394service.cpp?

06/18/11 04:57:24 changed by stefanr

As a possibly unrelated side note, on rare occasions I also see this mode of exiting:

^Cjack main caught signal 2                                                                           
jackd: pthread_mutex_trylock.c:230: __pthread_mutex_trylock: Assertion `robust' failed.
Aborted                                                                                               

Same with -v6 (only the tail of the session log; can't find the ^C that initiated the shutdown):

257340068716: Debug (IsoHandlerManager.cpp)[ 134] updateShadowMapHelper: (0x2230700) updating shadow vars...
257340068721: Debug (IsoHandlerManager.cpp)[1951] updateState: (0x7f57540083f0) handler needs state update from 1 => 0
257340068725: Debug (IsoHandlerManager.cpp)[1956] updateState: handler has to be disabled
257340068728: Debug (IsoHandlerManager.cpp)[1826] disable: (0x7f57540083f0, Transmit) enter...
257340068731: Debug (IsoHandlerManager.cpp)[1838] disable: waiting for disable lock
257340068870: Debug (StreamProcessor.cpp)[ 480] putPacket: Should update state from ePS_DryRunning to ePS_Stopped
257340068883: Debug (StreamProcessor.cpp)[1657] updateState: Do state transition: ePS_DryRunning => ePS_Stopped
257340068895: Debug (StreamProcessor.cpp)[1361] doStop: Enter from state: ePS_DryRunning
257340068906: Debug (IsoHandlerManager.cpp)[1087] stopHandlerForStream:  stopping handler 0x7f5754359130 for stream 0x7f5754353fc0
257340068918: Debug (IsoHandlerManager.cpp)[  88] requestShadowMapUpdate: (0x22308b0) enter
257340068929: Debug (IsoHandlerManager.cpp)[  93] requestShadowMapUpdate: (0x22308b0) exit
257340068938: Debug (IsoHandlerManager.cpp)[1099] stopHandlerForStream:  requested disable for handler 0x7f5754359130
257340068948: Debug (TimestampedBuffer.cpp)[ 345] clearBuffer: Clearing buffer
257340068959: Debug (PortManager.cpp)[ 219] preparePorts: preparing ports
257340068972: Debug (StreamProcessor.cpp)[1414] doStop: State switch complete, dumping SP info...
 StreamProcessor 0x7f5754353fc0, Receive:
  Port, Channel    : 0, 0
  Packets, Dropped, Skipped : 26104, 0, 0
  Now                   : 01751394899 (071s 2115c 1619t)
  Xrun?                 : False
  State                 : ePS_Stopped
  Buffer                : 0x7f575400b830
  Framerate             : Nominal: 44100, Sync: 44099.998489, Buffer 44099.998489
  TimestampedBuffer (0x7f575400b830): 0000 frames, 0000 events
   Timestamps           : head: 1751360654.000, Tail: 1751360654.000, Next tail: 1751365112.231
    Head - Tail         :          0.000 (0.000000 frames)
   DLL Rate             : 4458.231445 (557.278931)
   DLL Bandwidth        : 4.069010e-09 1/ticks (0.100000 Hz)
257340069246: Debug (StreamProcessorManager.cpp)[1051] stop:  Stopped...
257340069275: Debug (devicemanager.cpp)[ 955] stopStreamingOnDevice: Stopping stream 0 of device 0x7f5754001d50
257340069314: Debug (ieee1394service.cpp)[1451] freeIsoChannel: Freeing ISO channel 0...
257340069326: Debug (ieee1394service.cpp)[1483] freeIsoChannel:  allocated using IEC61883 CMP...
257340069335: Debug (ieee1394service.cpp)[1484] freeIsoChannel:  performing IEC61883 CMP disconnect...
257340069261: Debug (IsoHandlerManager.cpp)[ 134] updateShadowMapHelper: (0x22308b0) updating shadow vars...
257340069354: Debug (IsoHandlerManager.cpp)[1951] updateState: (0x7f5754359130) handler needs state update from 1 => 0
257340069366: Debug (IsoHandlerManager.cpp)[1956] updateState: handler has to be disabled
257340069375: Debug (IsoHandlerManager.cpp)[1826] disable: (0x7f5754359130, Receive) enter...
jackd: pthread_mutex_trylock.c:230: __pthread_mutex_trylock: Assertion `robust' failed.
Aborted

(in reply to: ↑ 11 ) 06/18/11 05:02:45 changed by stefanr

Replying to stefanr:

are we sure that this is really about the raw1394_destroy_handle in IsoHandlerManager.cpp, and not about one of the five raw1394_destroy_handle in ieee1394service.cpp?

I now added debugOutput calls at each of the five raw1394_destroy_handle calls in ieee1394service.cpp.

Index: src/libieee1394/ieee1394service.cpp
===================================================================
--- src/libieee1394/ieee1394service.cpp (revision 1985)
+++ src/libieee1394/ieee1394service.cpp (working copy)
@@ -135,6 +135,7 @@
 
     delete m_pWatchdog;
     if ( m_handle ) {
+        debugOutput(DEBUG_LEVEL_MESSAGE, "**** Ieee1394Service::~Ieee1394Service() 1 ****\n");
         raw1394_destroy_handle( m_handle );
     }
     delete m_handle_lock;
@@ -144,6 +145,7 @@
     if(m_armHelperRealtime) delete m_armHelperRealtime;
 
     if ( m_util_handle ) {
+        debugOutput(DEBUG_LEVEL_MESSAGE, "**** Ieee1394Service::~Ieee1394Service() 2 ****\n");
         raw1394_destroy_handle( m_util_handle );
     }
 }
@@ -186,6 +188,7 @@
     }
     struct raw1394_portinfo pinf[IEEE1394SERVICE_MAX_FIREWIRE_PORTS];
     int nb_detected_ports = raw1394_get_port_info(tmp_handle, pinf, IEEE1394SERVICE_MAX_FIREWIRE_PORTS);
+    debugOutput(DEBUG_LEVEL_MESSAGE, "**** Ieee1394Service::detectNbPorts() ****\n");
     raw1394_destroy_handle(tmp_handle);
 
     if (nb_detected_ports < 0) {
@@ -340,6 +343,7 @@
     }
     struct raw1394_portinfo pinf[IEEE1394SERVICE_MAX_FIREWIRE_PORTS];
     int nb_detected_ports = raw1394_get_port_info(tmp_handle, pinf, IEEE1394SERVICE_MAX_FIREWIRE_PORTS);
+    debugOutput(DEBUG_LEVEL_MESSAGE, "**** Ieee1394Service::initialize( int port ) ****\n");
     raw1394_destroy_handle(tmp_handle);
 
     if (nb_detected_ports < 0) {
@@ -1633,6 +1637,7 @@
     m_thread.Stop();
     delete &m_thread;
     if(m_handle) {
+        debugOutput(DEBUG_LEVEL_MESSAGE, "**** Ieee1394Service::HelperThread::~HelperThread() ****\n");
         raw1394_destroy_handle(m_handle);
     }
 }

Unfortunately, Heisenberg is right once more and I was unable to reproduce the double-free crash so far, even after many tries.

06/18/11 05:51:53 changed by jwoithe

Wow, lots of additional info from stefanr - thanks.

are we sure that this is really about the raw1394_destroy_handle in IsoHandlerManager?.cpp, and not about one of the five raw1394_destroy_handle in ieee1394service.cpp?

No, I'm not sure.

I've had a chance to look at roos's logs (ffado_log2 in particular). We see the initial disable working just fine (at 06438135810), but then curiously enough when ~IsoHandler?() is called it seems to think that the handler's state is not "running" ("Handler still running" at 06438143891). Consequently when disable() is called from this site it does a full disable, presumedly also including a call to raw1394_destroy_handle(). Note that both this disable call and the first one are both acting on the receive handler (0x86b3b68).

This behaviour is very close to what I saw before putting in the mutex to protect the raw1394_destroy_handle() call in disable(). It looks like the lock isn't working in roos's case, but I can't see how it can't. The version of ffado reported in his logs clearly indicates that it's r1985 from trunk, so it's not that it's picking up the wrong library or anything.

So I'm quite puzzled.

I note that your crash dumps (or at least the backtraces) are very similar to those in roos's case. They all stem from the ~IsoHandler?() call as a result of the disable() call - even the one you got after modifying disable() to ensure the mutex was held when raw1394_destroy_handle() was called. On my system I noted that the reason why disable() was re-entered was that the raw1394_destroy_handle() from the previous invocation (in a different thread) hadn't finished yet, and thus the handle and state hadn't been reset. With the mutex the second thread (the ~IsoHandler?() one) should wait until that first one is complete before continuing because that's the only way it'll get the mutex. But for some reason it seems that this isn't happening. Hmm.

In relation to the raw1394_destroy_handle() calls in ieee1394service, my gut feeling is that they aren't implicated. I haven't looked into that object in all that much detail, but my understanding is that the raw1394 handle that it uses is a completely different handle to that used by the IsoHandler? objects. Of course a corruption triggered by incorrect usage in ieee1394service might well only show up in unrelated places, so it will be interesting to see what happens if you manage to trigger the bug with that additional debug info in place.

Finally, I don't have a clue about the "assertion robust" thing. Do you know what it actually means?

06/18/11 05:57:21 changed by jwoithe

One more thing. Going back to roos's log, if the first disable() call at 06438135810 is in fact still waiting for raw1394_destroy_handle() to return at the time ~IsoHandler?() attempts to call disable() I would expect to see some debug messages relating to a wait for the disable lock, but I don't see these even though roos's log was at "-v 6". It's all rather strange.

06/18/11 06:14:40 changed by jwoithe

And another thing. The in roos's log, the double free crash that's triggered actually stems from the first disable() call at 06438135810, not the second at 06438143903 (you can tell from the backtrace). That's easily explained though - for one reason or another the second disable() call's raw1394_destroy_handle() returned quickly and permitted that thread to continue. Eventually the first call's raw1394_destroy_handle() woke up and triggered the double-free since the handle it was working on has since been disposed of by that second call. But why did the second thread get that far into disable() - the disable_lock mutex should have prevented that, especially with your mod which took out the EBUSY conditional.

In all this though I keep coming back to the question as to why the mutex locking seems to be completely ineffective here. The crashes you and roos are seeing appear to suggest that all the disable_lock mutex operations are no-ops. About the only thing I can think of which could cause that would be heap corruption in some other section of code which took out disable_lock, most likely completely unrelated to what we're presently looking at.

06/18/11 06:30:53 changed by jwoithe

A final thought before I go to bed for the day: if there was some form of memory corruption occurring it might explain those intermittant pthread_mutex_trylock "not robust" messages you see. Perhaps it does stem from the 1394_destroy_handle() calls in ieee1394service, and we just hope that Heisenberg goes to sleep soon so you can trigger the bug with those extra debug printfs in place.

But in all this, if we entertain this corruption idea we have to wonder why it doesn't happen with the old stack.

(follow-up: ↓ 20 ) 06/18/11 06:36:06 changed by adi

Just a random thought on this, maybe I find time today to give it a try. To triage the bug, how about adding the lock to libraw1394's handle structure. create_handle() will init the lock, destroy_handle() will only be effective if the lock isn't busy.

This way, we make sure the lock is local to the handle, not to a given thread or a group of respectively.

No idea if this helps at all if we really have a severe memory corruption elsewhere.

06/18/11 07:43:08 changed by stefanr

Replying to jwoithe:

Finally, I don't have a clue about the "assertion robust" thing. Do you know what it actually means?

No, I don't know. It is at line 230 of this file: http://repo.or.cz/w/glibc.git/blob_plain/HEAD:/nptl/pthread_mutex_trylock.c

I suppose the assertion tests that a branch of execution in __pthread_mutex_trylock is only entered for a certain subtype of mutex.

Replying to jwoithe:

if there was some form of memory corruption occurring it might explain those intermittant pthread_mutex_trylock "not robust" messages you see.

...and also the seemingly ineffectiveness of the locking in disable().

I wonder, could there be a use-after-free of the entire IsoHandler instance happening?

Perhaps it does stem from the 1394_destroy_handle() calls in ieee1394service, and we just hope that Heisenberg goes to sleep soon so you can trigger the bug with those extra debug printfs in place.

I got this assertion repeatedly in the meantime, but it was never preceded immediately by one of those printfs. Example:

$ /usr/bin/jackd -P66 -dfirewire -r44100 -p128 -n3 
jackd 0.120.1
Copyright 2001-2009 Paul Davis, Stephane Letz, Jack O'Quinn, Torben Hohn and others.
jackd comes with ABSOLUTELY NO WARRANTY
This is free software, and you are welcome to redistribute it
under certain conditions; see the file COPYING for details

no message buffer overruns
JACK compiled with System V SHM support.
loading driver ..
258335061595:  (ffado.cpp)[  92] ffado_streaming_init: libffado 2.999.0-1985M built Jun 18 2011 13:01:24
258335064218:  (ieee1394service.cpp)[ 191] detectNbPorts: **** Ieee1394Service::detectNbPorts() ****
258335107072:  (ieee1394service.cpp)[ 191] detectNbPorts: **** Ieee1394Service::detectNbPorts() ****
258335244843:  (ieee1394service.cpp)[ 346] initialize: **** Ieee1394Service::initialize( int port ) ****
258335305920:  (ieee1394service.cpp)[ 191] detectNbPorts: **** Ieee1394Service::detectNbPorts() ****
258335394489:  (ieee1394service.cpp)[ 346] initialize: **** Ieee1394Service::initialize( int port ) ****

[let it all run for a minute or so]

^Cjack main caught signal 2
jackd: pthread_mutex_trylock.c:230: __pthread_mutex_trylock: Assertion `robust' failed.
Aborted

But in all this, if we entertain this corruption idea we have to wonder why it doesn't happen with the old stack.

As you found prior to r1985, different duration of raw1394_destroy_handle() is one possible influence.

Furthermore, always since this bug was reported, libraw1394's juju backend was under suspicion. But (a) this code has been looked at already and there is not a lot to review, (b) the other libraw1394 client application that comes closest to FFADO, which would be Kino, does not exhibit issues that point towards memory corruption by libraw1394 usage. Like FFADO, Kino creates and shuts down raw1394 handle instances during its lifetime multiple times. (Whenever Kino's capture pane is left, all libraw1394 usage is shut down. It starts with all new raw1394 handles whenever the capture pane is being entered again.) Also like FFADO, Kino exercises asynchronous transmission, asynchronous reception, and isochronous reception; and it spawns threads for 1394 I/O. Unlike FFADO, Kino does not exercise isochronous transmission. (It presumably could transmit isochronously though if I set up a test with a camcorder that accepts DV upload.)

PS: roos used jack2 so far and almost always gets the crash (sometimes he doesn't), whereas I used jack1 so far and almost never get the crash. (Ever since you committed r1985, I got it only at the two occasions from today as noted above. Though only today I deliberately started and stopped jack many times in repetition.)

(in reply to: ↑ 18 ) 06/18/11 07:49:24 changed by stefanr

Replying to adi:

Just a random thought on this, maybe I find time today to give it a try. To triage the bug, how about adding the lock to libraw1394's handle structure. create_handle() will init the lock, destroy_handle() will only be effective if the lock isn't busy.

Or while we are at it, create a modified libraw1394 which (a) serializes all its API calls per handle by a pthread_mutex, and (b) does so with mutex_trylock and errors out with a debug message if the lock can't be taken? And (c), poisons raw1394_handle memory before freeing it.

06/19/11 01:24:17 changed by jwoithe

Back online (for a short time anyway).

I should clarify that I wasn't suggesting a memory corruption within libraw1394. I'm quite prepared to accept that there's something amiss inside libffado causing this. My suggestion was that memory being corrupted elsewhere in ffado may only be having an adverse effect at this point.

On the jack1 vs jack2 issue, roos mentioned that he was reliably hitting the bug under both jack1 and jack2. So far the only dumps provided have been for jack2. I'm hoping a jack1 dump will be provided soon since that will be the easiest for me to debug and follow up on.

Adi's suggestion is interesting although I suspect it may paper over the problem. It's still worth a try though. Looking at the code as it exists now, the reentrancy of raw1394_destroy_handle() via the IsoHandler? is theoretically not possible due to the disable_lock mutex. If that mutex was corrupted however, the locking calls will work with "undefined" results (according to the posix standard) and we could very well end up with symptoms like we're seeing. The trouble is that I don't think the corruption is happening in libraw1394 - I suspect very strongly that it's in libffado. The issue is tracking it down; unfortunately valgrind doesn't work too well with ffado/jackd due to the RT issues and so forth.

As stefanr pointed out earlier, there are other calls to raw1394_destroy_handle() made in the ieee1394service object. However, none of those raw1394 handles match the handles used by the IsoHandler? objects. Having said that, I guess if there was improper use of raw1394_destroy_handle() in ieee1394service which caused corrupted memory then maybe it would only show up later when the IsoHandler? objects were disposed. The problem with this theory is that the type of problem we're talking about would be a reentrant call to raw1394_destroy_handle(), and in all previous cases we've seen an immediate double-free trap in the reentrant call, not one delayed until some time later.

Later tonight I might try and take a close look at ieee1394service and see if the sort of thing which went wrong in IsoHandler? (which I fixed with the addition of the disable_lock mutex) could possibly happen there too. Of course the big issue for me is that at present I'm not getting the double-free on my system anymore, making debugging a purely theoretical exercise for me.

06/19/11 04:23:31 changed by roos

  • attachment ffado_qjackctl_jackd1 added.

qjackctl messages ffado-svn jackd1

06/19/11 04:26:04 changed by roos

  • attachment ffado_terminal_jackd1 added.

-v 6 log when running jack -d firewire in terminal / jackd1

06/19/11 04:27:50 changed by roos

  • attachment ffado-diag_jackd1_Debian added.

ffado-diag / debian / jackd1

06/19/11 04:40:56 changed by roos

  • attachment ffado_qjackctl_debian_jackd2 added.

ffado / qjackctl messages / debian / now jackd2

06/19/11 04:42:22 changed by roos

  • attachment ffado_terminal_jackd2_debian added.

terminal / ffado-svn / debian / jackd

06/19/11 04:43:41 changed by roos

  • attachment ffado-diag_jackd2_Debian added.

ffado-diag / debian / jackd2

06/19/11 04:47:19 changed by roos

JAckd crashes always when using qjackctl (ubuntu jackd2 / debian jackd1 & jackd2)

The glibc crash didn't came up in the terminal using debian.

To stop jackd2 seems to be harder then jackd1. In terminal I had to press 3 x ctrl-c till it really came back to my prompt e.g. @debian:~$

With jackd2 I had to press stop in qjackctl at least 2 times, till it actually ended the stop procedure.

06/19/11 04:52:32 changed by roos

  • attachment ffado-diag_ubuntu_jackd2 added.

ffado-diag ubuntu jackd2

06/19/11 05:08:48 changed by jwoithe

Roos: thanks for the jack1 debug logs and the ffado-diag output.

[Written before comment from roos at 06/19/11 04:47:19]

Stefan: a few observations from the most recent set of logs provided by roos. Firstly I note that the kernel in use is a derivative of 2.6.33. Is there any reason at all to suspect that the "new" stack in a kernel of this age may create problems which have since been dealt with? The nature of the crash dumps doesn't suggest that it's a kernel issue, but i thought it was worth asking the question.

I note the crash backtrace in the jack1 case is more or less identical to that shown under jack2. To me this more or less rules out jackd as a suspect (not that it ever was a serious suspect).

Interestingly enough, the jack1 debug messages do not display the "~IsoHandler??: BUG: Handler still running!" messages, which tends to rule out the whole ~IsoHandler?? / disable() thing that we've been looking at. What ultimately confuses me a little is that jackd (according to the "-v 6" messages) exited due to the watchdog firing while the qjackctl messages clearly show it crashing due to double-free. I'll have to think a bit more about this.

On the subject of the raw1394_destroy_handle() calls in ieee1394service, I don't *think* there's a problem. The calls in ~Ieee1394Service() refers to the handles created in Ieee1394Service::initialize(). I see no where else where those handles are referenced in a raw1394_destroy_handle(), and one would not expect the destructor to be reentered, so I think that lot is safe. There are also raw1394_destroy_handle() calls in Ieee1394Service::detectNbPorts() and Ieee1394Service::initialize(), but in both cases the handles destroyed were created in the very same methods. Again, there doesn't seem to be a problem here.

Finally we have a call in Ieee1394Service::HelperThread::~HelperThread??(), which references the handle created in Ieee1394Service::HelperThread::HelperThread??(). Since the raw1394_destroy_handle() is in the destructor here, it really can only be called once unless the destructor was called a second time on a previously disposed of object (which is clearly a big bug). I would have thought that if this were going on we'd have a far more reproducable bug than what we appear to have.

So, according to my very brief analysis I can't immediately see a mechanism whereby raw1394_destroy_handle() could be called twice on the same handle. This may mean that we're back to considering some form of memory corruption by another area of ffado which happens to render the IsoHandler??'s disable_lock mutex ineffective, which in turn triggers the bug which r1985 was expected to fix.

Thoughts?

06/19/11 05:12:35 changed by jwoithe

Roos wrote:

The glibc crash didn't came up in the terminal using debian.

That's interesting, since according to ffado-diag it's debian running 2.6.33 while Ubuntu is running 2.6.39. That may be a red herring though - if we are chasing a memory corruption it could quite easily have different signatures on different distributions. Thanks for these additional attachments roos. I'm going to wait to see what Stefan has to say about the latest information.

06/19/11 07:24:46 changed by stefanr

Replying to stefanr:

Or while we are at it, create a modified libraw1394 which (a) serializes all its API calls per handle by a pthread_mutex, and (b) does so with mutex_trylock and errors out with a debug message if the lock can't be taken? And (c), poisons raw1394_handle memory before freeing it.

Notes to self:

  • Implement the library-internal locking with extra static allocations, so that heap or stack corruptions don't affect the lock. Requires a global lock and a suitably sized array to keep track of the state of all raw1394_handle instances that an application uses at a time.
  • Use backtrace() in the trylock failure message.

Replying to jwoithe:

But in all this, if we entertain this corruption idea we have to wonder why it doesn't happen with the old stack.

At the danger of stating something obvious, I could think of these classes of differences between the stacks:

  • Bugs in the libraw1394 juju backend or even the kernel. (No way. ;-)
  • A more theoretical class: Bugs in the libraw1394 linux1394 backend or kernel that are not present in the juju counterparts but hid FFADO bugs of whatever nature.
  • Juju (library backend or kernel) not initializing some presumed unused data that the linux1394 counterparts do initialize.
  • Different usage of heap and stack, i.e. different memory layout. E.g. a juju libraw1394 handle instance is 20 kB large, the linux1394 variant 8 kB.
  • Timing differences between the two backends that uncover race conditions in FFADO. This was apparently the cause for the bug that you fixed in r1985.
  • Different delivery of DMA interrupts (e.g. how many packets are to be handled at a time), such that previously not exercised code paths in FFADO's streaming code are now ventured into.

Replying to jwoithe:

Firstly I note that the kernel in use is a derivative of 2.6.33. Is there any reason at all to suspect that the "new" stack in a kernel of this age may create problems which have since been dealt with? The nature of the crash dumps doesn't suggest that it's a kernel issue, but i thought it was worth asking the question.

There have been lots of kernel fixes in the meantime, many of which have not found their way into the 2.6.33-longterm kernel series yet (and won't ever). But it seems unlikely to me that this plays a role here. After all, the three variations of unclean shutdown that have been observed so far,

  • glibc abort after double-free, which may be caused by memory corruption of the iso handler m_disable_lock mutex,
  • glibc abort after "__pthread_mutex_trylock: Assertion `robust' failed.", which also points to a memory corruption of m_disable_lock,
  • jackd seemingly cleanly shutting down but taking a long time to do so (often so long that jackd's watchdog time-out hits), which too could be caused by a corrupted mutex or other corrupted resource which is waited for to change state,

have all been observed on my 3.0-rc3 from kernel.org (Gentoo userland), though with varying degree of repeatability. The latter two modes were very highly reproducible in my tests yesterday.

On the subject of the raw1394_destroy_handle() calls in ieee1394service, I don't *think* there's a problem.

(...)

This may mean that we're back to considering some form of memory corruption by another area of ffado which happens to render the IsoHandler?'s disable_lock mutex ineffective,

Right.

06/20/11 06:34:47 changed by jwoithe

Just checking in to say I couldn't work on this tonight, mostly because my route to www.ffado.org was broken, apparently due to a problem with NTT in the US. A couple of points coming from stefanr's latest post:

  • I agree that there are plenty of ways the new stack could be changing ffado's behaviour. I guess I was eluding to the lower level effects, which we are still to determine.
  • From what you've said and my understanding of the situation I think we can rule out the kernel version difference as significant here.
  • It's looking more likely that memory corruption is killing the disable_lock mutex, effectively triggering the bug which r1985 fixed. The trick will be working out a way of tracking it down.

Finally, it's worth noting that r1985 did fix a real bug caused by the much longer potential execution time of raw1394_destroy_handle() under the new kernel stack. Given that the remaining bug may be a memory corruption bug, I idly wonder whether it's possible that the same bug may be responsible for a number of the "random crash" reports we've had from time to time. Time will tell I guess.

06/20/11 07:17:43 changed by jof

  • attachment log added.

log file for "jackd 1" and "ffado svn 1985"

06/20/11 07:25:34 changed by jof

  • attachment log-ffado-diag added.

log file ffado-diag "ffado svn 1985"

06/22/11 03:37:37 changed by jwoithe

jof: thanks for these additional logs.

It will be a few days before I can get back to chasing this issue. It's starting to look very much like a memory corruption bug, but it may be difficult to track down since it clearly has a greater probability of happening on some systems and not others. Since my development system is one of those which does not trigger the bug it's going to make it a little interesting to track it down. But anyway, I'll see how things go once I get some more time on it.

09/12/11 17:45:16 changed by jwoithe

Could those who are experiencing these crashes on exit please test r1995? Holger Dehnhardt spotted a rather embarassing bug that I introduced in r1985 which almost certainly explains why the problem persisted for many people even after the "fix" in r1985.

I am cautiously hopeful that r1995 will resolve this issue. Only wider testing can prove this though.

09/15/11 16:16:19 changed by jwoithe

  • status changed from reopened to closed.
  • resolution set to fixed.

At this stage, reports on the mailing list appear to indicate that r1995 has resolved the behaviours detailed in this ticket. As a result I will close this ticket at this point. As for the previous closure though (following r1985), please feel free to re-open it if it seems the problem persists.

To summarise: the crashes were caused by a race condition precipitated by the raw1394 handle close function taking much longer under the new stack compared to the old. The idea I implemented in r1985 would have fixed this except for a silly oversight on my part. This omission was fixed in r1995, so as of r1995 the problem should be solved.

03/28/12 04:48:16 changed by jwoithe

Note: the problems discussed here may have been related to those raised in ticket #283.