I’ve got an ancient Olivetti Netstrada, a deskside server system with quad Pentium Pro 200MHz processors, 256MB RAM, dual power supplies and five 4GB SCSI drives.
It’s been running Ubuntu 8.04 for ages, and I found that with my partitioning layout (set up for testing ZFS-fuse ages ago) I couldn’t upgrade it without major surgery, so I decided I’d just put Debian on it instead. That’s where I hit problems:
- Debian/kfreebsd (Squeeze & daily) – kernel panics very early with: panic: vm_fault: fault on nofault entry, addr: c3925000
- Debian/Linux Squeeze – CD boot loader hangs before getting to the menu
- Debian Lenny – install kernel panics when uncompressing the initramfs, claims it’s out of memory.
Fortunately the Debian Etch install CD boots and installs correctly; the only problem is that Etch is now archived and there are no updates for it.
I dist-upgraded to Lenny and found that the latest kernel there still panics on boot, but the user space is OK. Then I went to Squeeze and found that yes, the Squeeze kernel hangs very early, just after saying it was booting the kernel after uncompressing. Unfortunately the udev in Squeeze won’t work with the Etch kernel, but all that’s broken so far is bringing up the network interface and I can do that by hand with dhclient eth0. Oh, and grub2 hangs (which I suspect is the same issue as the install CD).
I’ve tried building my own kernel using 2.6.38.3, starting with an “allnoconfig” to disable everything and only turning on the minimum necessary, but that has the same behaviour as the 2.6.32 kernel that is in Squeeze. The last thing printed to the console is:
Booting the kernel.
which is at the end of the decompress_kernel() function in arch/x86/boot/compressed/misc.c.
Does anyone have any ideas before I go and throw myself on the tender mercies of the LKML?
Update: Both Alan Cox and Ingo Molnar suggested using the earlyprintk=vga option, which I’d not stumbled across before. That revealed that the 2.6.39-rc4 kernel is misdetecting LOWMEM as 16MB, not 256MB, which could explain a lot. It also reminded me that I’d seen this before and had an offlist conversation with H. Peter Anvin about it in 2008, which tailed off due to work pressures on his part.
Update 2: Thanks to Thomas Meyer and H. Peter Anvin it’s now known what happened – the commit message from hpa for Thomas’s patch describes it best:
When we use BIOS function e801 to probe memory, we should use ax/bx (or cx/dx) as a pair, not mix and match. This was a typo during the translation from assembly code, and breaks at least one set of machines in the field (which return cx = dx = 0).
The patch has been accepted by Linus and will be in 2.6.39!
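To make the arithmetic concrete, here is a rough Python sketch of how an e801 result might be turned into a memory size, and how mixing the register pairs could produce a 16MB reading. It is only my model of what the commit message describes, not the kernel's actual code. The register meanings follow the usual E801h convention (ax/cx = KB between 1MB and 16MB, bx/dx = 64KB blocks above 16MB). The function name e801_mem_kb, the mix_pairs flag and the sample values for a 256MB machine are all made up for illustration; only "cx = dx = 0" comes from the commit message.

```python
# Illustrative model only: NOT the kernel's code.
# Assumed INT 15h, AX=E801h register convention:
#   ax / cx: KB of memory between 1MB and 16MB (at most 15 * 1024)
#   bx / dx: number of 64KB blocks above 16MB
# e801_mem_kb and mix_pairs are hypothetical names.

def e801_mem_kb(ax, bx, cx, dx, mix_pairs=False):
    """Return KB of memory above 1MB, or None if the result looks bogus."""
    # Use the cx/dx pair as a whole when the BIOS fills it in,
    # otherwise stay with the ax/bx pair.
    if cx or dx:
        ax, bx = cx, dx
    if ax > 15 * 1024:
        return None  # more than 15MB below 16MB makes no sense
    if ax == 15 * 1024:
        # mix_pairs=True models the typo: ax is paired with dx, not bx.
        high_blocks = dx if mix_pairs else bx
        return (high_blocks << 6) + ax  # 64KB blocks -> KB
    # Memory hole below 16MB: ignore anything above it.
    return ax


# Hypothetical 256MB machine whose BIOS returns cx = dx = 0.
ax, bx, cx, dx = 15 * 1024, (256 - 16) * 16, 0, 0

print(e801_mem_kb(ax, bx, cx, dx))                  # 261120 KB (255MB above 1MB)
print(e801_mem_kb(ax, bx, cx, dx, mix_pairs=True))  # 15360 KB (15MB above 1MB)
```

With the pairs mixed, the blocks above 16MB are read from dx (zero here), so only the 15MB below the 16MB mark is counted. Add the first 1MB and you get the 16MB that earlyprintk showed.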
Update 3: The patch is in 2.6.39-rc6 and that now successfully boots all the way to userspace with the kernel parameters “noapic scsi_mod.scan=sync”! Hooray!
been following along on twitter, can you post the .config ‘allnoconfig’ generates for you?
@rektide kernel config along with /proc/cpuinfo and dmesg from 2.6.18 etc are all in my post to the LKML here: http://permalink.gmane.org/gmane.linux.kernel/1129949
Using the earlyprintk=vga option suggested by Alan Cox and Ingo Molnar I found that the system was misdetecting memory, mentioned in my reply to Alan Cox here: http://thread.gmane.org/gmane.linux.kernel/1129949/focus=1130180
Thomas Meyer might have found the problem, he writes:
The fix is now in the mainline kernel git tree, many thanks to Thomas for spotting it and posting the fix and H. Peter Anvin for committing it and pushing it to Linus!
System now boots successfully all the way to userspace with an unpatched 2.6.39-rc6 kernel and the “noapic scsi_mod.scan=sync” boot parameters! Kernel developers rock. 😉