Apr 14, 2015 20:47
It takes a bit these days to get me worked up enough to post a work rant, but today managed it.
Ubuntu. I hate it. Loathe it even. This will probably generate some flak because I know people who think it is amazing. Ubuntu, in my experience, is full of hacks that make perfect sense to the people who put them in there, but no sense to anyone else. Admittedly, it probably does not help that I'm only ever called in after things have broken so badly only I can fix it, but I have never liked the way Ubuntu works.
Today's problem - Ubuntu sitting at an initramfs prompt, which is a small busybox installation. I'd call it the pre-boot environment. It is what loads first to then load the rest of the OS. When I've seen similar things in the past it has usually meant the root partition is toast.
In this case though, the root partition was fine. I could mount it, see the files, but it still wouldn't boot all the way. It did not take me long to discover the machine has software raid installed. It has 17 hard disks, so /dev/sda through to /dev/sdq, arranged into three raid arrays /dev/md0 to /dev/md2.
/dev/md2 was a raid 5 array with a dead disk (/dev/sdf). Not a problem, edit out the mount line for /dev/md2 in /etc/fstab, let it boot up, then fix the problem.
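For anyone wanting to confirm which array is missing a member, here is a minimal sketch. It relies on /proc/mdstat marking an absent member with an underscore in the status brackets (for example [UUU_]). The function name degraded_arrays and the sample contents are invented for illustration and are not the real output from this machine.

```shell
# Print the name of every md array whose status brackets contain "_",
# i.e. at least one member is missing. Reads /proc/mdstat by default.
degraded_arrays() {
    awk '/^md/ { name = $1 }
         /\[[U_]*_[U_]*\]/ { print name }' "${1:-/proc/mdstat}"
}

# Hypothetical sample so the sketch can run on a machine without raid.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Personalities : [raid1] [raid5]
md2 : active raid5 sdg1[1] sdh1[2] sde1[0]
      2930279808 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

md0 : active raid1 sdb1[0] sdc1[1]
      976629760 blocks [2/2] [UU]

unused devices: <none>
EOF

degraded_arrays "$sample"
```

On a live system, calling degraded_arrays with no argument reads /proc/mdstat directly.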
Ok, first problem is the tiny busybox installation doesn't have an editor (No, I refuse to use ed). So I could mount /dev/sda1 (the root partition) and see /etc/fstab, but nope, couldn't edit it. Right then, let's load vi from the root partition. Damn, compiled with dynamic libraries which don't exist in busybox. How about vim? Same thing. nano? Same thing.
Plan B - "grep -v md /etc/fstab > /etc/fstab.new"
Translation - take all the lines in fstab that don't contain the string "md" and redirect them to a file called fstab.new. Swap the new for the old and now we have an fstab that will only try to mount the root partition. Cool.
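A rough sketch of the filter-and-swap, run against a scratch copy so it is safe to try anywhere. The fstab entries and mount points here are made up for illustration. The real file would be whatever lives under the mounted root partition.

```shell
# Work in a scratch directory rather than touching a real /etc/fstab.
workdir=$(mktemp -d)

# Hypothetical fstab: one root partition plus two raid mounts.
cat > "$workdir/fstab" <<'EOF'
/dev/sda1  /      ext4  errors=remount-ro  0 1
/dev/md0   /data  ext4  defaults           0 2
/dev/md2   /bulk  ext4  defaults           0 2
EOF

# Keep a backup, filter out every line containing "md", then swap.
cp "$workdir/fstab" "$workdir/fstab.orig"
grep -v md "$workdir/fstab" > "$workdir/fstab.new"
mv "$workdir/fstab.new" "$workdir/fstab"

cat "$workdir/fstab"
```

A bare "md" pattern also drops any unrelated line that happens to contain those two letters, and it misses arrays that fstab refers to by UUID or label, so a tighter pattern such as '^/dev/md' may be safer where it applies.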
Reboot.
Same thing.
Ok, it is trying to check the raid arrays despite us not mounting them. There is a file called /etc/mdadm/mdadm.conf which contains information about the raid arrays. Remove that file, and now there is no information about the raid arrays. Reboot.
Still tries to check the raid arrays.
More hunting around. Aha, there is a service called /etc/init.d/mdadm that is starting and is responsible for managing the raid arrays. Disable that service. Reboot.
Still tries to check the raid arrays.
So now we've told it that a) we don't want it to mount the raid arrays; b) it doesn't have any raid arrays; and c) it shouldn't run the daemon responsible for managing raid arrays. And yet it still tries to access them.
This is almost a textbook example of how not to do things. Much Googling later and I discover that maybe there is a GUI option to manage the raid arrays. Great, except a) it is normally accessed via an ssh terminal and b) it won't boot.
I did get over the booting problem. If you type "exit" at the initramfs prompt, it will quite happily run some check, complain about the raid not being accessible and boot. I am happy with this, this is what I wanted. I am not happy that if the machine gets rebooted, someone needs to plug a monitor and keyboard into the console and type "exit".
It is just pathetic behaviour.
I know people are going to respond to this and go "Ahh, just do this magic fu." The point is I shouldn't have to do magic fu. If I'm telling a system not to worry about a dead disk that isn't needed for it to work, why does it insist on stopping until I acknowledge the disk at the lowest and most basic level possible?
It just confirms my love of Ubuntu on boat anchors.
work rant