      ZZLI4424             linux.debian.kernel             1332 messages      

  Msg # 1156 of 1332 on ZZLI4424, Sunday 9-06-25, 8:04  
  From: SALVATORE BONACCORSO  
  To: ALL  
  Subj: Bug#1112627: linux-image-6.16.3+deb14-am  
 XPost: linux.debian.bugs.dist 
 From: carnil@debian.org 
  
 Control: tags -1 + moreinfo upstream 
  
 On Sun, Aug 31, 2025 at 02:43:30PM +0200, Francesco Poli (wintermute) wrote: 
 > Package: src:linux 
 > Version: 6.16.3-1 
 > Severity: important 
 > X-Debbugs-Cc: debian-amd64@lists.debian.org, invernomuto@paranoici.org 
 > User: debian-amd64@lists.debian.org 
 > Usertags: amd64 
 > 
 > Hello! 
 > 
 > I've just upgraded linux-image-amd64 and rebooted to 
 > linux-image-6.16.3+deb14-amd64, to find a very bad surprise: 
 > Intel integrated audio (device 00:1b.0, see the PCI list) no longer works. 
 > 
 > Everything looks normal: alsamixer shows the usual controls (nothing 
 > relevant was found to be accidentally muted) for the usual sound card 
 > (HDA Intel PCH); jackd starts as usual; audacious starts as usual and 
 > plays music as usual. 
 > 
 > But... 
 > 
 > But nothing can be heard from the speakers and /var/log/kern.log is 
 > filled with the following error messages (that repeat once every 5 s): 
 > 
 > kernel: dmar_fault: 19382 callbacks suppressed 
 > kernel: DMAR: DRHD: handling fault status reg 3 
 > kernel: DMAR: [DMA Write NO_PASID] Request device [00:1b.0] fault addr 0xffa02000 [fault reason 0x0c] non-zero reserved fields in PTE 
 > kernel: DMAR: DRHD: handling fault status reg 3 
 > kernel: DMAR: [DMA Write NO_PASID] Request device [00:1b.0] fault addr 0xffa02000 [fault reason 0x0c] non-zero reserved fields in PTE 
 > kernel: DMAR: DRHD: handling fault status reg 3 
 > kernel: DMAR: [DMA Write NO_PASID] Request device [00:1b.0] fault addr 0xffa02000 [fault reason 0x0c] non-zero reserved fields in PTE 
 > kernel: DMAR: DRHD: handling fault status reg 3 
 > 
 > 
 > Rebooting with linux-image-6.12.38+deb13-amd64 makes everything work 
 > again. 
 > 
 > The error messages look similar to the ones quoted in a comment to 
 > one [upstream bug] report, however that comment refers to Linux kernel 
 > version 6.12.23, while I only see this issue after upgrading from 
 > version 6.12.38 to version 6.16.3 ... 
 > 
 > [upstream bug]:  
 > 
 > 
 > What am I doing wrong? 
 > Please forward my bug report upstream and incorporate a fix as soon as 
 > possible. 
 > 
 > Thanks for any help you may provide! 
  
 So while this is IOMMU related, it *still* might be broken firmware, 
 and you can try whether disabling the IOMMU "resolves" the issue (see 
 the note right after this paragraph). Still, there is an indication 
 that this might be a real regression from 6.12 to 6.16, so in addition 
 to that test I would like to ask you to do the following. 
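 For that IOMMU test, here is a minimal sketch for a Debian system that 
 boots via GRUB (intel_iommu=off is a standard kernel parameter, but 
 please check your own bootloader setup before changing anything): 
 
     # One-off test: at the GRUB menu press 'e', append intel_iommu=off 
     # to the line starting with "linux", then boot with Ctrl-x. 
     # To make it persistent instead, edit /etc/default/grub and set e.g. 
     #   GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=off" 
     sudo update-grub    # regenerate grub.cfg after editing the file 
     sudo reboot 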
  
 Between your last "good" Debian revision (6.12.38-1) and your first 
 noticed "bad" revision (6.16.3-1) there were a couple of experimental 
 uploads: 
  
  6.16.3-1 
  6.16.1-1~exp1 
  6.16-1~exp1 
  6.16~rc7-1~exp1 
  6.15.6-1~exp1 
  6.15.5-1~exp1 
  6.15.4-1~exp1 
  6.15.3-1~exp1 
  6.15.2-1~exp1 
  6.15.1-1~exp1 
  6.15-1~exp1 
  6.15~rc7-1~exp1 
  6.14.6-1~exp1 
  6.14.5-1~exp1 
  6.14.3-1~exp1 
  6.13.11-1~exp1 
  6.13.10-1~exp1 
  6.13.9-1~exp1 
  6.13.8-1~exp1 
  6.13.7-1~exp1 
  6.13.6-1~exp1 
  6.13.5-1~exp1 
  6.13.4-1~exp1 
  6.13.3-1~exp1 
  6.13.2-1~exp1 
  6.13~rc7-1~exp1 
  6.13~rc6-1~exp1 
  6.12.43-1 
  6.12.41-1 
  6.12.38-1 
  
 Via the snapshot.debian.org service, do a manual search for the 
 linux-image-amd64 package and narrow down as closely as possible the 
 last still "good" and the first "bad" kernel revision. I would start 
 by testing the major upstream bumps first, as we do not know whether 
 the potential regression was also backported into the stable series 
 of that time. So you might want to test 6.13~rc6-1~exp1 first; 
 depending on the result, go up to the 6.14.y versions and test again. 
 If the behaviour changes, search between the 6.13 and 6.14.y versions; 
 otherwise move on to the 6.15.y versions and search similarly. 
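 A rough sketch of installing one of those snapshot builds for a test 
 boot (the exact URL has to be looked up through the snapshot.debian.org 
 web interface; the path below is only a placeholder): 
 
     # Locate the matching linux-image .deb for the revision to test via 
     # https://snapshot.debian.org (source package "linux"), then: 
     wget https://snapshot.debian.org/archive/debian/<timestamp>/pool/main/l/linux/<package>.deb 
     sudo dpkg -i <package>.deb 
     sudo reboot    # select the newly installed kernel and re-test the audio 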
  
 Let's say you hypothetically find that everything up to 6.14.6-1~exp1 
 is good and the problem starts with 6.15~rc7-1~exp1. 
  
 (Note: if the situation is not very clear and the results jump back 
 and forth as you move between versions, then it is better to directly 
 bisect v6.12 to 6.16.3.) 
  
 But let's assume that all versions up to 6.14.6-1~exp1 are good and 
 6.15~rc7-1~exp1 is bad. 
  
 It would be great if you could bisect the problem. That would involve 
 compiling and testing a few kernels: 
  
         git clone https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git 
         cd linux-stable 
         git checkout v6.14 
         cp /boot/config-$(uname -r) .config 
         yes '' | make localmodconfig 
         make savedefconfig 
         mv defconfig arch/x86/configs/my_defconfig 
  
         # test 6.14 to ensure this is "good" 
         make my_defconfig 
         make -j $(nproc) bindeb-pkg 
         ... install the resulting .deb package and confirm it 
         successfully boots / problem does not exist 
  
         # test 6.15-rc7 to ensure this is "bad" 
         git checkout v6.15-rc7 
         make my_defconfig 
         make -j $(nproc) bindeb-pkg 
         ... install the resulting .deb package and confirm it fails to 
         boot / problem exists 
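 For the "... install" steps above, something along these lines is 
 usually enough (bindeb-pkg writes the .deb files into the parent 
 directory; the exact file name depends on the version just built): 
 
         sudo dpkg -i ../linux-image-*.deb   # pick the file from the build you just made 
         sudo reboot 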
  
 With that confirmed, the bisection can start: 
  
         git bisect start 
         git bisect good v6.14 
         git bisect bad v6.15-rc7 
  
 In each bisection step git checks out a state between the oldest 
 known-bad and the newest known-good commit. In each step test using: 
  
     make my_defconfig 
     make -j $(nproc) bindeb-pkg 
     ... install, try to boot / verify if problem exists 
  
 and if the problem is hit run: 
  
     git bisect bad 
  
 and if the problem doesn't trigger run: 
  
     git bisect good 
  
 Please pay attention to always select the just-built kernel for 
 booting; it won't always be the default kernel picked up by GRUB. 
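 If picking the entry by hand at every boot gets tedious, one possible 
 approach on Debian is grub-reboot (the menu entry title below is only 
 an example; check the titles generated on your system, and note that 
 grub-reboot requires GRUB_DEFAULT=saved in /etc/default/grub): 
 
     grep "menuentry '" /boot/grub/grub.cfg | cut -d"'" -f2   # list entry titles 
     sudo grub-reboot "Advanced options for Debian GNU/Linux>Debian GNU/Linux, with Linux 6.14.0" 
     sudo reboot 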
  
 Iterate until git announces that it has identified the first bad commit. 
  
 Then provide the output of 
  
     git bisect log 
  
 In the course of the bisection you might have to uninstall previously 
 built kernels again so that you do not exhaust the disk space in 
 /boot. Also, at the end, please uninstall all self-built kernels again. 
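 A sketch of that cleanup (the package names below are hypothetical; 
 double-check the list of installed kernels before purging anything): 
 
     dpkg -l 'linux-image-*' | grep '^ii'       # see which kernels are installed 
     sudo apt purge linux-image-6.14.0 linux-image-6.15.0-rc7   # example self-built package names 
     sudo update-grub 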
  
 Once we have a clear bad commit, we can double-check it. If the 
  
 [continued in next message] 
  
 --- SoupGate-Win32 v1.05 
  * Origin: you cannot sedate... all the things you hate (1:229/2) 
