Archive for October, 2011

Xen 4.1.2 on Fedora 16 Beta – My experience

Duplicate of my mail to the Xen-users list.


No Comments

Stupid Trick: Changing the UUID in /etc/fstab from an emergency shell with screen

So. I installed a new VM with /boot formatted as btrfs. Except pygrub apparently doesn’t like btrfs, so it refused to boot.

And I refused to reinstall. So it looked like an impasse – I didn’t want to mess around with chroots and installing grub2, but it couldn’t boot without said messing around.

And then I had a thought – pygrub doesn’t care that grub2 isn’t installed, it just cares about the location of the grub.cfg file. So it should be possible to move stuff off the btrfs partition onto an ext4 partition!

One lvcreate, mkfs.ext4, xm blk-attach 0 /dev/domU/rawhide-boot /dev/xvda, mount /dev/xvda1 /mnt/oldroot, mount /dev/domU/rawhide-boot2 /mnt/newroot, cp -r /mnt/oldroot/* /mnt/newroot, umount /mnt/*root, xm blk-detach 0 xvda later, I’ve got the VM at least showing the plymouth boot screen.

YAY! Except it dies partway through with a systemd error – a fsck failed, meaning it can’t continue. Uh oh.

First thought: Go poke at systemd, try to disable fsck. Nope, systemctl disable fsck…service refuses to work without the exact name. And the terminal actually clips the fsck service name because it’s using /dev/disk/by-uuid/36longcharacterstring. I’m not a happy guy, I can’t even see what I’ll need to type out!

So, second thought: Ok, why does it think the disk is still present then? Check /etc/fstab – /boot is listed there with UUID=olduuid. Ok… so, I should be able to replace it with the new UUID. But what is the new UUID? Looking at the header of /etc/fstab, it lists a few possibilities. findfs looks like the most likely. But… it wants a label or UUID. Trying label=/boot doesn’t work. Ok, how about blkid? Oh, hey, that’s looking good. It’s giving me a bunch of UUIDs. But how to get the right one into /etc/fstab?

And this is the stupid trick: screen has a text buffer that you can copy stuff into. I recently used it to capture VM logs and save them to a file using the copy into buffer function (Ctrl+a, [), then printing them out into an open vim window that was in insert mode (with Ctrl+a, ]). So, it was a simple matter to jump into copy mode, select the new UUID, jump over to the vim window open to /etc/fstab, change to overwrite mode (because this was in place, and I didn’t want to have to delete the old UUID when I could just overwrite it), and empty the buffer.

A :wq and reboot later, the VM comes up fine.
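For comparison, the same swap can be sketched without screen at all. This is a minimal sketch, not what was actually done here: it assumes GNU sed is available in the emergency shell, and the function name replace_fstab_uuid and the device name in the usage comment are made up for illustration.

```shell
# Replace one UUID with another in an fstab-style file, editing it in place.
# Usage: replace_fstab_uuid FILE OLD_UUID NEW_UUID
replace_fstab_uuid() {
    file=$1 old=$2 new=$3
    # UUIDs contain only hex digits and hyphens, so they are safe in a sed pattern.
    sed -i "s/UUID=$old/UUID=$new/" "$file"
}

# Hypothetical usage (the device name is an assumption):
#   new=$(blkid -s UUID -o value /dev/xvda1)   # print only the UUID value
#   replace_fstab_uuid /etc/fstab olduuid "$new"
```

The old UUID still has to be typed or read out of /etc/fstab, so this only helps with the long new one.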

(Of course, I had to do an ifup eth0 and a yum install screen vim to actually get screen and vim. Why those aren’t included in the default F16 is beyond me…)

1 Comment

Revised Fedora virt-install command

I got tired of having to muck around with kpartx and such whenever a domU went down.

So I decided to just go with 4 LVs per domU – root, home, swap and boot.

The command to create the LVs is just one really long one, the original command replicated 4 times:

lvcreate -L10G -n rawhide-root vg_caesium_domUs && lvcreate -L1G -n rawhide-swap vg_caesium_domUs && lvcreate -L10G -n rawhide-home vg_caesium_domUs && lvcreate -L512M -n rawhide-boot vg_caesium_domUs
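The same four calls can be written as a loop, which makes it easier to reuse for another domU. A rough sketch: the function name create_domu_lvs and the DRY_RUN switch are assumptions added for illustration, while the sizes and names are the ones from the command above.

```shell
# Create (or, with DRY_RUN=1, just print) the four LVs for one domU.
# Usage: create_domu_lvs NAME VG
create_domu_lvs() {
    name=$1 vg=$2
    for spec in root:10G swap:1G home:10G boot:512M; do
        part=${spec%%:*}   # text before the colon, e.g. "root"
        size=${spec##*:}   # text after the colon, e.g. "10G"
        # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty.
        ${DRY_RUN:+echo} lvcreate -L"$size" -n "$name-$part" "$vg" || return 1
    done
}

# Hypothetical usage:
#   DRY_RUN=1; create_domu_lvs rawhide vg_caesium_domUs
```

Like the && chain, the loop stops at the first lvcreate that fails.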

The virt-install command is somewhat different too – It now specifies the 4 block drives:

virt-install -p -r 512 -n rawhide -f /dev/vg_caesium_domUs/rawhide-boot -f /dev/vg_caesium_domUs/rawhide-root -f /dev/vg_caesium_domUs/rawhide-home -f /dev/vg_caesium_domUs/rawhide-swap -l ftp://ftp.jaist.ac.jp/pub/Linux/Fedora/development/16/i386/os/ -x ks="http://hydrogen/~kyl191/ks/ks.php?hostname=rawhide"
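Since the four -f arguments only differ in the LV suffix, the command can be assembled in a loop as well. This is a sketch under the same assumptions as before: install_domu and DRY_RUN are invented names, and the flags are copied from the command above rather than checked against other virt-install versions.

```shell
# Build the virt-install command for one domU and run it (or print it with DRY_RUN=1).
# Usage: install_domu NAME VG TREE_URL KICKSTART_URL
install_domu() {
    name=$1 vg=$2 tree=$3 ks=$4
    # Collect the arguments in the positional parameters so spaces stay intact.
    set -- virt-install -p -r 512 -n "$name"
    # Same disk order as the original command: boot, root, home, swap.
    for part in boot root home swap; do
        set -- "$@" -f "/dev/$vg/$name-$part"
    done
    ${DRY_RUN:+echo} "$@" -l "$tree" -x "ks=$ks"
}

# Hypothetical usage:
#   install_domu rawhide vg_caesium_domUs \
#       ftp://ftp.jaist.ac.jp/pub/Linux/Fedora/development/16/i386/os/ \
#       "http://hydrogen/~kyl191/ks/ks.php?hostname=rawhide"
```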

 


No Comments

Why I’m becoming increasingly disillusioned with SELinux

My history with SELinux is a… varied one.

I first remember using it back in Fedora Core 6. I soon gave up on it: the labeling wasn’t consistent and I had neither the time nor the inclination to relabel everything, especially when a quick one-word change in a config file fixed all my problems.

The next time I used it was probably in F10. I ended up disabling it then, for much the same reasons.

Then I got onto F12/13 for my home server, and disabled SELinux again pretty darn quickly, after neither samba nor Apache liked it.

I can’t remember if I ever disabled it on my testing server which was running F15. But when I upgraded the server to F16Beta, I quickly rediscovered SELinux. Mainly because I’d get errors like this one:

avc:  denied  { read } for  pid=1868 comm="dhclient-script" name="ifcfg-br0" dev=dm-4 ino=10894 scontext=unconfined_u:system_r:dhcpc_t:s0-s0:c0.c1023 tcontext=unconfined_u:object_r:admin_home_t:s0 tclass=file

As far as I can tell, that’s SELinux stopping dhclient-script from reading my ifcfg-br0 file, which keeps the default network setup script from running. br0 is my custom xen bridge, I’ll admit, but it was working fine in F15, and even F16B until a new update came out sometime on or about Oct 20, breaking the network script.

SELinux probably lasted the longest on F16B so far, because, for the most part, it worked perfectly fine until, well, it just didn’t work.

  1. sudo -i just didn’t work. I spent about 30 minutes trying to hunt down what was causing the problems – checking /etc/group, running visudo, the whole shebang. Everything said “Yes, I should be allowed to use sudo!” But sudo would perpetually tell me “User kyl191 is not in the sudoers file. This incident will be reported.”, and it was. My root terminal would get an extra line reading “You have new mail” after any command was run.
  2. Starting up VMs just outright didn’t work. SELinux blocked access to the disk, but that was never relayed back to Xen – Xen would just die with the exception “Cannot find bootloader”. That was my first inkling that SELinux was still enforcing.
  3. Last straw: SELinux killed my networking setup when it did… something after an update that was published on/around Oct 21.

Some will argue that my ifcfg-br0 script is in the wrong context, so of course it would fail. To that my only response is “So…what context should I use then?” I’ve got no clue, and I’m sorry, but I’m not about to spend time looking for a list of contexts, and trying to apply them to each and every file I encounter in such a situation.
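One partial answer is in the denial itself: the tcontext field names the label the file currently carries. A small sketch for pulling the types out of an AVC line (avc_type is a made-up helper, and the ifcfg path in the comments is assumed to be the usual Fedora location):

```shell
# Print the SELinux type from the named context field of an AVC line on stdin.
# Usage: avc_type scontext|tcontext
avc_type() {
    # Contexts look like user:role:type:level; keep the third component.
    sed -n "s/.* $1=[^: ]*:[^: ]*:\([^: ]*\).*/\1/p"
}

# Hypothetical usage:
#   dmesg | grep 'avc:' | avc_type tcontext
# To compare against what the policy expects for a path, and reset it:
#   matchpathcon /etc/sysconfig/network-scripts/ifcfg-br0
#   restorecon -v /etc/sysconfig/network-scripts/ifcfg-br0
```

For the denial quoted earlier this gives admin_home_t, the type used for files under /root. That suggests, though it does not prove, that the file was created in root’s home directory and moved into place with its old label, in which case restorecon should reset it to whatever the policy considers the default.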

I’ve been trying to write documentation for the Xen project. On my testing server, I’m not going to bother enabling SELinux because it gives me strange errors. For prod servers, I might use it if there was a VM for each service; otherwise the resulting mish-mash of programs tends to lead to unexpected results.


No Comments

Win2K8R2 as HVM VM on Xen 4.1

Config file: pastebin.com/ULE1Y49R

Also, it’s working fine without the Xen extensions, though there’s (at least) one unknown device in the device manager.

No Comments

Sudo broken on F16?

I haven’t been able to use sudo on F16Beta on either of the machines that I upgraded – even though I know sudo worked fine before I upgraded. Not sure why/how it broke…

Turns out SELinux broke it somehow.


No Comments

SELinux, Xen & LVMs

I discovered something new today: SELinux can and does prevent access to logical volumes. This is entirely unexpected for me, because I always thought SELinux only worked on files.

I was wondering why my test VM suddenly refused to start with the error “Disk is not accessible” after I upgraded it to F16Beta. I checked the dom0 logs, and read “couldn’t find bootloader”. At which point I went “Oh, crud, grub2 screwed up again!”, and promptly ignored it because it was after midnight.

Then I tried again today. The main difference being that I dropped out of X, and had the screen on when I started up the domU. So I caught these messages:

[  946.283648] avc:  denied  { read } for  pid=3193 comm="xend" name="dm-6"
[  946.690625] avc:  denied  { open } for  pid=3194 comm="pygrub" name="dm-6"

At which point I went “Oh, I see. Oops.”

Simple fix was to switch SELinux to permissive mode with a “setenforce 0” command (denials are still logged, just not enforced). A more thorough fix would be to:

  1. Find out why SELinux was suddenly enforced OR only just started blocking my Xen disk access
  2. Relabel the LVs so SELinux doesn’t throw a fit.

With regard to relabeling the LVs, the exact problem that SELinux has seems to be this:

scontext=system_u:system_r:xend_t:s0 tcontext=system_u:object_r:fixed_disk_device_t:s0

xend is running under the xend_t type, while the LV is labeled fixed_disk_device_t, i.e. it’s a fixed disk.

Research says

semanage fcontext -a -t xen_image_t "/dev/mapper/vg_caesium_domU*"

should work, but no guarantees – just did it, and ls -lZ was unchanged. =|
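Two things may explain the unchanged label. semanage fcontext -a only records a labeling rule and does not touch nodes that already exist, so a restorecon run is needed to apply it. The path is also treated as a regular expression rather than a shell glob, so a trailing "U*" does not mean "anything after domU". A hedged sketch of the two steps follows; relabel_xen_lvs and DRY_RUN are invented names, and whether this reaches the underlying dm-N nodes named in the denials is not verified here.

```shell
# Record a labeling rule for a volume group's device nodes, then apply it.
# Usage: relabel_xen_lvs VG   (set DRY_RUN=1 to print the commands instead)
relabel_xen_lvs() {
    vg=$1
    # Regex, not a glob: ".*" matches the rest of the device-mapper name.
    ${DRY_RUN:+echo} semanage fcontext -a -t xen_image_t "/dev/mapper/$vg.*" || return 1
    # Apply the recorded rules to the existing nodes, printing each change.
    ${DRY_RUN:+echo} restorecon -Rv /dev/mapper
}

# Hypothetical usage, followed by a check of the result:
#   relabel_xen_lvs vg_caesium_domUs
#   ls -lZ /dev/mapper/
```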

(Filed a bugzilla bug on this: bugzilla.redhat.com/show_bug.cgi?id=747662)


No Comments

Xen documentation

I read a most interesting post on the Xen-users list today:

lists.xensource.com/archives/html/xen-users/2011-10/msg00350.html

I’m going to try to follow these suggested topics. I mean, I got Win2k8R2 running on Xen, and I’m running Xen on F16Beta, neither of which I see documented anywhere, so that’s surely something I can contribute to the community docs…

No Comments

Networking upgrade from Xen 4.0 to 4.1

I believe that Xen 4.1 saw the rewrite of what I’m assuming is the entire network configuration stack. Perhaps the most significant thing for me was that the domU’s network connection isn’t created/initiated properly if the vif entry in the domU config file has anything other than the MAC address and the bridge to attach the vif to.

Which meant that my test domU didn’t have a working connection. Oh, it appeared on both systems. Dom0 had a vif1.0 show up, and the domU had an eth0 show up. Just that… ifup eth0 would time out. But removing the ‘network-script=bridge’ component from the domU config file and restarting the domU got the network connection up and running again.

So I have to remember this for when I upgrade the rest of my domUs.

And dom0.
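To find the remaining domU configs that still carry the extra setting, a quick grep over the config directory is probably enough. A sketch, assuming the configs live in one directory (/etc/xen in the usage comment is an assumption) and that the setting sits on the vif line; configs_with_vif_script is a made-up name.

```shell
# List config files in DIR whose vif line still mentions a script setting.
# Usage: configs_with_vif_script DIR
configs_with_vif_script() {
    # -l prints only the names of matching files; unreadable entries are ignored.
    grep -lE '^[[:space:]]*vif[[:space:]]*=.*script' "$1"/* 2>/dev/null
}

# Hypothetical usage:
#   configs_with_vif_script /etc/xen
```

A vif line split across several lines would slip past this, so it is a first pass rather than a guarantee.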

No Comments

Scheduling stuff to happen at reboots in Linux

Problem: My 1U is throwing correctable memory errors every few seconds, but I can’t do much about it, so I’m ignoring it.

Solution: Disable the log messages about correctable memory errors.

For future reference, the command is

echo 0 > /sys/module/edac_core/parameters/edac_mc_log_ce

Problem: I don’t want to have to do that after every reboot!

Solution:

  1. Use the module config file (located… somewhere) to disable the messages
  2. Use crontab’s @reboot keyword to run a command at every reboot

Guess which solution I chose.
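Assuming the @reboot route, the crontab entry might look like the sketch below. The helper names are invented for illustration, and the entry has to go into root’s crontab because writing under /sys needs root.

```shell
# Print the crontab line that silences correctable-error logging at boot.
reboot_entry() {
    printf '@reboot echo 0 > %s\n' /sys/module/edac_core/parameters/edac_mc_log_ce
}

# Add the line to the current user's crontab, keeping any existing entries
# and avoiding a duplicate if the line is already there.
add_reboot_entry() {
    entry=$(reboot_entry)
    { crontab -l 2>/dev/null | grep -vxF "$entry"; echo "$entry"; } | crontab -
}

# Hypothetical usage, as root:
#   add_reboot_entry
#   crontab -l
```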

No Comments