Channel: Debian User Forums

System and Network configuration • [Software] Serious deadlock issue jbd2

Hello,
Well, I just had this issue while transferring data from an XFS drive to a Btrfs drive on my Arch Linux (Linux 6.7.1) system, so I guess this has to be a kernel issue.
Thanks for your feedback.

jbd2 (quoted in the topic) is the kernel journaling layer used by the ext4 filesystem, but you are now reporting about the XFS and Btrfs filesystems.

Does the solution reported in the previous post still apply to the problem detected for xfs and btrfs filesystems?
No, it didn't. My issue is that several tasks lock up randomly during heavy I/O, for example:
INFO: task xfsaild/dm-1:1346 blocked for more than 120 seconds.
INFO: task kworker/u32:17:6058 blocked for more than 120 seconds.
INFO: task cp:5940 blocked for more than 120 seconds.
INFO: task wireplumber:2406 blocked for more than 120 seconds.
INFO: task panel-1-weather:2927 blocked for more than 120 seconds.
INFO: task pool-Thunar:3509 blocked for more than 120 seconds.

Except for wireplumber and the weather applet (which was probably blocked because I tried to click it while heavy I/O was running), what these tasks have in common is that they write data: I often use cp and Thunar for that, so the task actually doing the I/O is the one that gets blocked. It could be any command or program: mv, cp, rsync, borgbackup, Thunar, etc. This leads me to think that, first, all of them are God-tier software, and I doubt they share a bug that nukes my system every time; and second, that an underlying cause, 'fsync' or something like that, is nuking my system, because the tasks blocked most frequently are kworker and xfsaild (and jbd2 in the case of ext4).
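To see which tasks show up most often, hung-task lines like the ones quoted above can be tallied with a short script. This is only a minimal sketch: the function names and the idea of grouping all kworker threads under one key are my own assumptions, and the sample text is just the six lines from this post. In practice you would feed it the kernel log saved to a file.

```python
import re
from collections import Counter

# Matches kernel hung-task warnings such as:
#   INFO: task cp:5940 blocked for more than 120 seconds.
# The task name may itself contain colons (e.g. kworker/u32:17), so the
# greedy name group is followed by the last ":<pid> blocked" on the line.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

def parse_hung_tasks(log_text):
    """Return a list of (task_name, pid, seconds) tuples found in log_text."""
    results = []
    for line in log_text.splitlines():
        m = HUNG_TASK_RE.search(line)
        if m:
            results.append((m.group("name"), int(m.group("pid")), int(m.group("secs"))))
    return results

def count_by_task(log_text):
    """Count reports per task name (hypothetical grouping: all kworker threads together)."""
    counts = Counter()
    for name, _pid, _secs in parse_hung_tasks(log_text):
        key = "kworker" if name.startswith("kworker/") else name
        counts[key] += 1
    return counts

# The six messages quoted in this post.
SAMPLE = """\
INFO: task xfsaild/dm-1:1346 blocked for more than 120 seconds.
INFO: task kworker/u32:17:6058 blocked for more than 120 seconds.
INFO: task cp:5940 blocked for more than 120 seconds.
INFO: task wireplumber:2406 blocked for more than 120 seconds.
INFO: task panel-1-weather:2927 blocked for more than 120 seconds.
INFO: task pool-Thunar:3509 blocked for more than 120 seconds.
"""

if __name__ == "__main__":
    for task, n in count_by_task(SAMPLE).most_common():
        print(task, n)
```

With only six lines every task appears once, so this becomes useful only on a longer log collected over several lockups.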
I will have to do my best to write a detailed bug report to kernel.org, since it happened on Arch Linux as well (and with very different filesystems). But then again, was this somehow already reported and ignored? https://bugzilla.kernel.org/show_bug.cgi?id=204253 That bug report involves different software and versions, but it describes the same behavior I see. There are several other bug reports with "X has been blocked for more than 120 seconds" that could be related.

I am not sure whether delayed allocation features are triggering something. Now that I think about it, my Btrfs + Arch install has autodefrag enabled, and the docs say: "When enabled, small random writes into files (in a range of tens of kilobytes, currently it’s 64KiB) are detected and queued up for the defragmentation process." So they might be similar? I might be comparing apples to oranges. Anyway, I will have to read bug reports and run some tests, and then report this on my own.
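For the tests, it would help to confirm which Btrfs mounts actually have autodefrag active before comparing runs. A rough sketch, assuming the usual /proc/mounts layout (device, mount point, filesystem type, comma-separated options); the function name and the sample entries are made up for illustration and are not from my system:

```python
def btrfs_autodefrag_status(mounts_text):
    """Map each Btrfs mount point to True/False depending on the autodefrag option."""
    status = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "btrfs":
            continue
        options = fields[3].split(",")
        status[fields[1]] = "autodefrag" in options
    return status

# Hypothetical sample in /proc/mounts format (devices and paths are invented).
SAMPLE_MOUNTS = """\
/dev/mapper/root / btrfs rw,noatime,autodefrag,space_cache=v2,subvol=/@ 0 0
/dev/mapper/data /mnt/data xfs rw,relatime 0 0
/dev/sdb1 /mnt/backup btrfs rw,relatime 0 0
"""

if __name__ == "__main__":
    try:
        with open("/proc/mounts") as f:
            text = f.read()
    except OSError:
        # Fall back to the invented sample when /proc/mounts is not readable.
        text = SAMPLE_MOUNTS
    for mount_point, enabled in btrfs_autodefrag_status(text).items():
        print(mount_point, "autodefrag" if enabled else "no autodefrag")
```

This only reports the mount options; it says nothing about whether autodefrag is actually involved in the lockups.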

Statistics: Posted by Penaut Butter — 2024-02-10 07:03 — Replies 37 — Views 4893


