Posts Tagged ‘Xen’

A virtual virgin writes…

Tuesday, October 27th, 2009

I’ve been doing virtual machines since 2000, which makes me a bit of an old-timer, I would think. I invented the laptop RAC using a single virtual machine and taught it in Oracle University classrooms from about 2002 on, which was something of a world first. I’ve been banging on about the virtues of virtualisation for a long time… you’d think I’d therefore get it right when it comes to putting it into production, wouldn’t you? But no. I’ve stuffed up. In fact, I’ve behaved like I’d never met a virtual machine before. A virtualised virgin, if you like.

Beguiled by the swift, elegant simplicity of the Citrix XenServer installation and included management tools, I’ve persuaded the powers that be at work to dedicate a 32GB, 8 CPU monster server to running XenServer. I’ve even persuaded them to do something of a first this financial year and part with about AU$6000 so that I can upgrade the storage from a paltry 400GB to a more useful 3000GB, all with a view to combining every test and dev database we’ve got onto a single server, but running in discrete, virtual servers for the purpose of easy snapshotting, backing up, migration and so on. In the process, we would free up two physical servers, one of which could house a small database that all our other databases link to for ‘globally unique’ information. That would free up 6GB of RAM on one of our other servers, thus enabling the two databases running there to increase their buffer caches and get their web response times down further from their already pretty good 0.6 seconds. Wins all round.

Except I chose Citrix XenServer.

It’s not a bad virtualisation product. It’s just immature compared to the likes of VMware. A core requirement was the easy ability to create snapshots, and then to navigate back and forth, up and down the snapshot tree, as required. XenServer is bloody awful at snapshotting anything (promised to be fixed in the next release). Another core requirement was the ability to export a virtual machine for backup purposes: one of my 80GB VMs took 11 hours to export with Citrix. On VMware ESXi, it takes 36 minutes. God knows why, but the difference -functionally- is enormous. Finally, as reported in these parts yesterday, XenServer throws a complete mental fit when confronted with Centos 5.4; ESXi just took it in its stride. (Oh, and as another aside, XenServer refuses to run Windows in a VM unless hardware virtualisation extensions are present; ESXi doesn’t appear to give a stuff, and runs them just fine and with seemingly perfectly acceptable performance results.)
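To put those export times in perspective, here is a rough back-of-the-envelope calculation of the implied throughput. It is only a sketch: it assumes the full 80GB was transferred in both cases and that 1GB is 1024MB, and the function name is made up purely for illustration.

```shell
#!/bin/sh
# Rough export throughput in MB/s for a VM of a given size.
# Assumption: 1 GB = 1024 MB, and the whole virtual disk is transferred.
throughput_mb_s() {
    # $1 = size in GB, $2 = elapsed time in minutes
    awk -v gb="$1" -v min="$2" 'BEGIN { printf "%.1f\n", (gb * 1024) / (min * 60) }'
}

throughput_mb_s 80 660   # XenServer: 11 hours
throughput_mb_s 80 36    # ESXi: 36 minutes
```

That works out at roughly 2MB/s for XenServer against roughly 38MB/s for ESXi, or about an 18-fold difference.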

The XenServer is going, in other words. And we’re switching to ESXi.

I have my problems with ESXi, too, which I’ll no doubt document before long. The fact that its GUI management tool doesn’t run on Windows 7 was a bit of a killer, for example, but I have found a way around that. The licensing is also a complete pig’s breakfast and I’m still trying to get my head around that. But, all in all, ESXi does what we needed virtualisation to do, today, whereas Citrix XenServer doesn’t and won’t until the proverbial “next release”.

I apologise to my reader(s) for having eulogised a fundamentally dud technology (at least as far as we are concerned at work) for far too long. Normal service will be resumed soon, I hope!

Another Xen Annoyance

Monday, October 26th, 2009

I really do like Citrix’s XenServer, despite my moaning earlier about its barely-functional snapshotting capabilities.

Rather more disappointing is the fact that it’s completely floored by the arrival of a new distro version. Specifically, there is (understandably) no ‘template’ for Centos 5.4 -and if you try using the supplied template for Centos 5.3 hoping to slip in the incremental release without too much fuss, you’re in for a big disappointment: the virtual machine loses contact with its source DVD and demands a driver be supplied before it can continue reading from it. No such driver can be supplied, so that’s the end of that. Using a template for an OS installation is important, because it allows the VM guest to be properly paravirtualised. If you try and cook your own (using the Other Install Media template), the installation will succeed, but you won’t benefit from a properly xen-ified kernel and hence performance of the VM will be sub-optimal (posh language for ‘pretty crap’).

The only mechanism I’ve really got to work, in fact, is to create a new VM using the Centos 5.3 template, install Centos 5.3, and then do a yum update so that Centos itself takes care of the business of migrating to version 5.4. No problem if you have unlimited downloads on your broadband link, I suppose… but not exactly an ideal way to go.
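The workaround can be sketched as below. This is an illustration under assumptions, not a tested recipe: the helper function name is hypothetical, and it assumes the release file holds a line of the form "CentOS release 5.3 (Final)".

```shell
#!/bin/sh
# Print the CentOS version recorded in a release file.
# $1 = release file (defaults to /etc/redhat-release).
centos_version() {
    release_file="${1:-/etc/redhat-release}"
    sed -n 's/^CentOS release \([0-9][0-9.]*\).*/\1/p' "$release_file"
}

# Typical sequence inside the freshly installed 5.3 guest, run as root:
#   centos_version     # check the version before the update
#   yum -y update      # pull the newer packages from the configured repositories
#   reboot             # so that the updated kernel is the one running
#   centos_version     # check the version after the update
```

The commented sequence is left unexecuted on purpose, since yum update downloads a good few hundred megabytes of packages.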

I would hope that Citrix would regularly release new templates that could be downloaded and installed… but their forums are full of dark talk of ‘wait until the next release of XenServer next year’. That seems a crazy way of going about things. Despite having put XenServer into production use at work, I think I am going to have to re-evaluate it if it can’t do snapshotting or new distro upgrades properly.

Problems with Xen and Goal

Wednesday, October 21st, 2009

Here’s a funny thing. I build a Centos 5.3 x86_64 virtual machine running on my Citrix Xen Server. All works well. I enable the graphical user interface, as explained in an earlier post, and all continues to work well. I can reboot the machine as often as I like, logging in between reboots: all fine and dandy. I then download my Redhat GOAL script, to automate the preparation steps needed to get Oracle running on the new VM. The script executes perfectly well, and the ensuing Oracle installation goes swimmingly, too, as expected. Then I reboot the VM once more and get this at log on:

xenproblem

That is, no matter how often or correctly I type in root’s password, I merely get a “module is unknown” error. In fact, it turns out to be completely impossible to log on (as root!) in this VM ever again, which is a problem I’ve never seen before, under any circumstances. Had this been a physical machine, I’d have booted with a rescue CD and fiddled around to try and fix it, but since this was only virtual, I deleted it and built a new one. However, that doesn’t alter the fact that I could reboot and log in fine before I ran GOAL and the Oracle installation, and couldn’t afterwards. So where’s the problem?

Well, the only bit of GOAL that has anything to do with ‘modules’ as mentioned in the error message is around lines 443 to 446, where it alters a configuration file in such a way that it switches on the pam_limits.so pluggable authentication module, so that assorted security limits for users are enforced. It does this because that is what the Oracle installation guides tell you to do, so it’s not that GOAL is doing something weird or unexpected!

To test if those lines alone were causing all the trouble, I built a fresh Centos 5.3 64-bit VM and ran an edited version of GOAL that had just those 4 lines commented out. The result: perfectly normal reboot-and-login behaviour. Then I did manually what those four lines of bash script do: edit /etc/pam.d/login and add in the lines

session    required     /lib/security/pam_limits.so
session    required     pam_limits.so

Save the edited file and reboot. Result: module is unknown. Bingo!

So it’s definitely one or other (or maybe both!) of those two lines causing all the trouble. A couple of further edit-and-reboot experiments proved (to my satisfaction, at least!) that it was the first of these lines causing the problem. If I added just the ‘session required pam_limits.so’ one, then the machine would reboot and let me log in perfectly normally. Add the line that specified the full path to the pam_limits.so file, however, and the ‘module is unknown’ error was back immediately.

Why that should be so, I have no idea -because a check before adding that line showed that the file did exist in that location:

xenproblem2

If anything, therefore, you might have thought the line which was vague on the specifics of a file’s location would have caused more trouble than one which was precise and accurate. But apparently not!
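One possibility worth checking, offered here as an assumption rather than anything verified in these experiments: on a 64-bit Red Hat-style system the 64-bit PAM modules normally live under /lib64/security, while /lib/security holds 32-bit copies, so a full path into /lib/security may point a 64-bit login program at a module it cannot load. A path-less entry leaves PAM to look in its own default directory. The sketch below simply lists which copies of a module exist; the function name and the optional root-prefix argument are hypothetical, added for illustration.

```shell
#!/bin/sh
# List every copy of a PAM module found under the usual library directories.
# $1 = module file name, $2 = optional root prefix (hypothetical, handy for testing).
list_pam_module_copies() {
    module="$1"
    root="${2:-}"
    for dir in /lib/security /lib64/security; do
        if [ -f "$root$dir/$module" ]; then
            echo "$dir/$module"
        fi
    done
}

# On the guest itself one might then compare architectures, for example:
#   list_pam_module_copies pam_limits.so
#   file /bin/login /lib/security/pam_limits.so /lib64/security/pam_limits.so
```

If the file command reports a 64-bit /bin/login and a 32-bit /lib/security/pam_limits.so, that mismatch would be a candidate explanation for the error.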

Now, I started to wonder about this: GOAL has been working perfectly on physical servers (and virtual servers as guests in VMware Workstation) for a very long time with both these lines being added to the relevant file. Why should it suddenly stop working just because we’re suddenly using XenServer? Was it some weird interaction between a xen-enabled kernel and these pluggable authentication modules? I wasn’t even sure how to go about working out what particular element of the interaction was causing all the grief. So then I paused, backed up and had another look at the problem from a slightly different angle: maybe GOAL had been wrong all these months in the first place?

A quick review of Oracle-on-Linux installations around the Interweb at least showed I wasn’t insane: plenty of other people use the fully-qualified-path version of the pam_limits.so file as well as me. Here’s Tim Hall’s guide to a 10.2 installation on Red Hat 5, for example:

xentimhall

Note that he uses /lib/security/pam_limits.so, not just plain-vanilla “pam_limits.so” with no path. He does exactly the same in his installation guide for 10.1 on Red Hat 4, too. By the time he documents 11.1 and 11.2 installations onto Red Hat-like OSes, he’s telling people to use exactly the same two-versions configuration as GOAL has used all this time:

xentimhall2

Lest it be thought I’m picking on Tim Hall’s documentation, however, let me say at once: everyone else does this, too. Here’s an article by John Smiley on Oracle’s very own website:

xenoracle

Spot the use of the fully-qualified filename!

The same author’s ‘install 11g‘ article was also wrong:

smiley11g

The funny thing about that is, as you’ll note, I can only show you the error these days by referencing the Google cache, because the original article has been corrected very recently (which is a good thing, of course).

Or there’s BDerzhavets’ 2007 blog post, which even appears to be a description of installing Oracle on Xen! Nevertheless, here’s the same instruction as everyone else has been giving all this time:

xenoracle2

Once more, the use of the fully-qualified filename.

And so on and on. Pick almost any guide (see footnote 1 below!) to installing Oracle on Linux written in the past 5 years or so, and I’ll pretty much guarantee that it tells you to either include the path-specified version of pam_limits.so on its own, or to add two lines to the configuration file, one of which will be the path-specified version. So GOAL is in good company! (See footnote 2 below, too!)

The only trouble with all this good company is that we’ve all been wrong! Because if you look at the official Oracle installation guide for 10g Release 2 on Linux, it says this:

xenoracle3

You’ll note that step 2 says nothing at all about adding two different versions of pam_limits.so -and uses the one version that does not mention a full path. And how about the official installation document for 11g Release 1 on Linux? Exactly the same thing: at the end of Section 2.8, you’re told simply to add the one path-less reference to pam_limits.so. And the same is true of the 11g Release 2 official documentation: just one reference to pam_limits.so, and no path required.

It’s interesting to think about how these documentation glitches arise in the first place -but more interesting to speculate about how they propagate themselves over the course of many years and vast numbers of installation guides! It’s the same mechanism that still has loads of guides telling you to stick a command to start the listener in your dbora script, even though that’s not been needed to automate listener startups since version 9, I guess: borrowing without referring back to the original, official documentation source, plus an understandable tendency to cut-and-paste instead of writing from first principles. Which is quite a separate issue from the question as to why the advice given by GOAL and so many others, to add a fully-qualified path-and-name version of the pam_limits.so file to your /etc/pam.d/login file, worked fine until it met Xen, of course, and I’m afraid I have no answer for that one. The fact is, however, that it’s GOAL and the other installation guides that are wrong, not the Xen-enabled environment. Do it the way the Oracle documentation tells you to do it and there are no problems with ‘module is unknown’ errors at all.

All of which is an extremely long-winded way of saying that GOAL has been modified so that it now only makes the one, correct addition to the /etc/pam.d/login file, regardless of whether you’re running virtualised or not. Which means it gets it correct on physical and virtual servers alike, as it always should have done. Apologies for not spotting my error much earlier. The correction has been applied to all four GOAL scripts, not just the Red Hat/Centos one.
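For anyone wanting to make the same correction by hand, a minimal sketch follows. It is not the code GOAL itself uses: the function name is hypothetical, and it assumes the only pam_limits.so entries in the file are ‘session required’ lines like the two shown earlier.

```shell
#!/bin/sh
# Leave exactly one path-less pam_limits.so session line in a PAM config file.
# $1 = file to edit (on a real system this would be /etc/pam.d/login).
fix_pam_limits() {
    pam_file="$1"
    tmp_file="$(mktemp)" || return 1
    # Drop any existing pam_limits.so session lines, with or without a path.
    grep -v '^session[[:space:]]\{1,\}required[[:space:]]\{1,\}.*pam_limits\.so' "$pam_file" > "$tmp_file"
    # Append the single path-less form.
    echo 'session    required     pam_limits.so' >> "$tmp_file"
    # Overwrite in place so the original file keeps its ownership and permissions.
    cat "$tmp_file" > "$pam_file"
    rm -f "$tmp_file"
}
```

Running it a second time leaves the file unchanged, so it is safe to call on a machine that has already been corrected. It is worth keeping a root shell open while testing any change to /etc/pam.d/login, for reasons that will be obvious from the above.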

Footnotes

1. I won’t list them all. But a quick bit of trivial Googling found this, this, this, this, this, this …and, well, you get the idea. All show (as at the time of writing) the use of a fully-qualified filename at the point of editing /etc/pam.d/login. Incorrectly, as per the official doco. Naturally, a number of the authors of those sites might ‘silently’ edit their work now to correct things. But the original screenshots exist, even if I haven’t bothered to show them here!

2. There are exceptions. For example, this site offers a 10g installation guide that is very non-standard in some respects… but it does manage to specify only the path-less version of the relevant edit. It was about the only guide I could find that did, though!

The Trouble with Xen

Sunday, October 11th, 2009

I have finally found the fatal flaw in Citrix XenServer: the thing which makes it less than perfect, anyway. Its snapshot-handling capabilities are rudimentary at best, useless at worst. Don’t get me wrong: taking a snapshot of a virtual machine just before you do something highly experimental with it is easy enough: right-click, Take Snapshot, job done. But trying to use that snapshot for anything remotely useful -like restoring the virtual machine to a usable state just after your highly experimental activities ruined the VM beyond simple repair- is practically impossible. From the Citrix manual comes this:

1. Note the name of the snapshot
2. Note the MAC address of the VM
3. Destroy the VM:

a. Run the vm-list command to find the UUID of the VM to be destroyed:
xe vm-list

b. Shutdown the VM:
xe vm-shutdown uuid=<vm_uuid>

c. Destroy the VM:
xe vm-destroy uuid=<vm_uuid>

4. Note the name of the storage to restore the VM on:

xe sr-list

The sr-list command lists all storage in the pool or on the host.

5. Create a new VM from the snapshot:

xe vm-install new-name-label=<vm_name_label> template=<template_name> sr-name-label=<sr_name>

6. Edit the MAC address of the VM to be the same as the MAC address of the VM being restored.
7. Start the VM:

xe vm-start name-label=<vm_name>

I’m a bit annoyed by that piece of documentation in the first place because, at step 5, it tells you to issue a command that requires the use of a template name. But where in the preceding steps was any mention made of creating a template, or noting a template name? I mean, they told you to note down a UUID, a snapshot name, a MAC address and the name of the storage repositories. But the one thing they did not tell you to do was to note down a template name. So the all-important command depends on a piece of syntax that comes out of left field and is completely unexplained. (Turns out, the template name is actually the same thing as the snapshot name. But I had to guess that for myself… and documentation is about taking the guesswork out of these things. Or ought to be).
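The xe commands from that procedure can be strung together roughly as follows. This is a sketch, not something taken from the Citrix manual: the function names and the DRY_RUN switch are hypothetical, it only prints the commands unless DRY_RUN=0 is set, and it covers steps 3 and 5 only, leaving the MAC address edit and the final start as manual steps.

```shell
#!/bin/sh
# Print a command (default) or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: VM UUID, snapshot name, storage repository name, new VM name.
restore_from_snapshot() {
    vm_uuid="$1"; snapshot_name="$2"; sr_name="$3"; vm_name="$4"
    run xe vm-shutdown uuid="$vm_uuid" || return 1
    run xe vm-destroy uuid="$vm_uuid" || return 1
    # The snapshot name is what goes in the template= argument.
    run xe vm-install new-name-label="$vm_name" template="$snapshot_name" sr-name-label="$sr_name" || return 1
    # Steps 6 and 7 (restore the MAC address, then xe vm-start) are left to do by hand.
}
```

Calling restore_from_snapshot with the values noted in steps 1 to 4 prints the three commands for inspection first, which seems prudent given that one of them destroys the VM.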

Anyway, that minor issue aside, the real problem here is that there are seven, highly-manual, steps involved. In VMware, you click a button. Big difference.

Worse, the command at step 5 takes an age to complete (because it’s copying every byte of your 20 or 30 or more GB virtual hard disk). The VMs I want to build at work will have virtual disks of 150GB, so I’ll have to do it overnight or over a long weekend! Not exactly the quick and simple affair that snapshotting is in every other virtual machine hypervisor out there, which is very disappointing. (It’s also a worry, because the above seven-step procedure involves making a copy of a large chunk of data, so I’m going to need rather more hard disk space than I anticipated!)

The good news is that Citrix have acknowledged the functionality needs fixing. The bad news is that they acknowledged that back in July of this year (see the penultimate post in that thread) and there’s still no sign of an actual fix. Here’s hoping.

Winxen

Thursday, October 8th, 2009

I’ve mentioned the fiddly nature of graphical Linux VMs on Citrix Xen Server… but I wouldn’t want you to think that graphical loveliness was particularly difficult to achieve per se on this platform. It just depends on your choice of guest operating system: pick Windows, for example -an OS that has a hard time doing anything but be graphical!- and you get a GUI straight off the installation CD or DVD:

winonxen

And, as you can see, an Oracle 32-bit installation (it happened to be the only version of Vista/2008-compatible Oracle I had to hand) onto a graphical Windows 2008 64-bit Server running via Citrix Xen Server, accessed via my Windows 7 desktop is plain sailing, with no surprises. (Honestly: the mind boggles at the combination of technologies here, and the thought that not so many years ago, this would all just have been a pipe-dream!)

The significance of this particular screenshot is that I am about to convert the dev and test environments at work to virtualised servers… running on Citrix Xen Server, because it’s the only bare metal virtualisation solution that seems able to promise to come to the party with what we need to do. Our production systems are currently Windows 64-bit affairs, so it’s important to know that whatever virtualisation technology we adopt can take a direct copy of a production system and do something with it. Converting the files (of which there are many) via RMAN or doing an export/import so that we can get the data into databases running on Linux VMs are not practical options. On the other hand, I’m pushing (at an ever-opening door) to migrate all our production DB servers from Windows to Linux anyway… so our virtualisation solution for the dev and test environments has to be able to cope with running Oracle-on-Linux VMs at some point in the future, too… which Citrix Xen Server does with ease, as I’ve mentioned previously.

Graphical Goodness in a Xen Centos VM

Wednesday, October 7th, 2009

I mentioned last time that Citrix Xen Server was incredibly fast and easy to set up, but a bit of a pain to use, practically, because the Centos VM I created on it insisted on running in strictly command-line mode only. All true enough… but a GUI is do-able, and here’s how I did it.

First, I read the available documentation! The relevant bit is here, which references a bit about installing guest agent software, which is documented here.

The basic outcome is that, in console (text) mode, you install the Xen Agent software for VM Guests first (that’s into the Centos 5.3 x64 virtual machine I’d already built in text mode, of course). You do that by logging on as root at the console, switching the DVD selector so that it reads xs-tools.iso and then typing the command mount /dev/xvdd /mnt. You’ve just effectively inserted the Xen Tools CD into your virtual CD-ROM. So now you can launch the relevant installer off that virtual CD by typing the command /mnt/Linux/install.sh. Say ‘yes’ when prompted and then reboot the virtual PC.
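Gathered into one place, those console steps look something like the sketch below. The function names and the DRY_RUN switch are hypothetical conveniences; by default the commands are only printed, and the device and mount point are the ones mentioned above.

```shell
#!/bin/sh
# Print a command (default) or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = virtual CD device (default /dev/xvdd), $2 = mount point (default /mnt).
install_xs_tools() {
    device="${1:-/dev/xvdd}"
    mount_point="${2:-/mnt}"
    run mount "$device" "$mount_point" || return 1
    # The installer asks for confirmation; answer 'yes' when prompted.
    run "$mount_point/Linux/install.sh" || return 1
    echo "# reboot the VM to finish"
}
```

Run as root with DRY_RUN=0 once xs-tools.iso has been selected in the DVD selector, then reboot.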

Next, you need to edit the Centos virtual machine’s /etc/gdm/custom.conf file (remember, I only deal with Centos 5.3 these days: the file is different for Centos 3 and 4 versions). I edited that file to contain the following:

gdmconfig

The [servers] bit is already in the file, but everything else has to be typed in, using your favourite text editor (nano in my case). The command= line continues without a break all the way through to the end -BlacklistTimeout 0 bit. The ‘flexible=true’ stuff is a new line in its own right.

With that file changed and saved, I then had to switch off the Centos firewall -something I’d normally do automatically in a GUI OS install, because the installer prompts you. But in a CLI environment, you have to take that step yourself. Logged in as root, type the command system-config-securitylevel-tui and -using your skill and judgement- press whatever combination of TAB, Space Bar and ENTER switches off the firewall in its entirety. You should also set SELinux to at least permissive if not outright disabled (because Oracle doesn’t interact with SELinux very well). Note that if you set SELinux to disabled, you should reboot the VM so that the new setting can take effect. If you set it to permissive, a reboot is not required.
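If you would rather not drive the text-mode tool, the same two changes can be made from the command line. The following is a sketch under assumptions: the function name is hypothetical, and it assumes the SELinux mode is set on a SELINUX= line in /etc/selinux/config, as it is on a stock Centos 5 install.

```shell
#!/bin/sh
# Set the SELINUX= mode in an SELinux config file.
# $1 = mode (permissive or disabled), $2 = config file (default /etc/selinux/config).
set_selinux_mode() {
    mode="$1"
    config_file="${2:-/etc/selinux/config}"
    case "$mode" in
        permissive|disabled) ;;
        *) echo "unsupported mode: $mode" >&2; return 1 ;;
    esac
    tmp_file="$(mktemp)" || return 1
    # Only the SELINUX= line changes; SELINUXTYPE= and comments are left alone.
    if sed "s/^SELINUX=.*/SELINUX=$mode/" "$config_file" > "$tmp_file"; then
        cat "$tmp_file" > "$config_file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp_file"
    return $status
}

# On the guest, as root:
#   set_selinux_mode permissive && setenforce 0        # permissive now and after a reboot
#   service iptables stop && chkconfig iptables off    # firewall off now and at boot
```

The setenforce 0 call is what makes permissive mode take effect without a reboot; switching to disabled still needs the reboot mentioned above.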

Finally, you can now launch the Gnome Desktop Manager by simply typing the command gdm. If for some reason it’s already running, just type /usr/sbin/gdm-restart instead. In either case, you should see the Switch to Graphical Console button light up in the Xen Management tool on your Windows client PC. Click that, and you’ll be rewarded with a graphical Centos login screen -and a short while later, the glory that is this:

graphical xen

You will note from that screenshot that I’m busy running the Goal script for automated Oracle installations… which gives you a clue that Oracle-on-Centos-on-Xen is a piece of cake. And which might, I suppose, be the subject of another post or several in the coming days! (It completes in 19 minutes, which as these things go is pretty darn’d quick, and very close to physical PC installation times. Impressive).

I have to say as a parting note that it’s a shame that Citrix Xen makes having a graphical Centos experience a matter of editing a couple of textfiles and keeping your fingers crossed. But I cannot fault the fact that it is a matter of typing just one or two commands and then the pain is behind you. Given the capability of this software, and its extraordinary price of US$0.00, I can live with it -especially as there are so many other redeeming features!

Virtualisation Bonanza

Tuesday, October 6th, 2009

It’s been a “long weekend” here in Sydney: today is a public holiday (though the occasion for it escapes me… perhaps something to do with Labour Day? Who knows!)

I’ve spent most of the day finalising my server setups. Now that my new desktop has arrived, my old quad core becomes The Other Half’s PC, and TOH’s old PC becomes my new ‘spare server’. It’s not bad as these things go: dual core Core 2 Duo, 3GHz, 4GB RAM, 1TB hard disk space. So I thought I’d try a bit of free (as in zero cost) bare-bones virtualisation and see how I got on.

I started with Citrix Xen Server 5, then Citrix Xen Server 5.5 (the current release). I moved on to VMware’s ESXi 3.5 and then VMware’s ESXi 4.0 (again, the current release). I tried Hyper-V (Microsoft’s offering in this virtual space). And I wrapped up with a dabble with Oracle’s own VM-Server, version 2.5.

I’ll save the gory details for a later post, because it’s worth doing a proper side-by-side comparison of these things. But the short story is: Xen Server 5.0 installed extremely well, but when trying to create a Centos 5.3 x86_64 virtual machine, the Centos installer lost contact with the Xen Server’s DVD drive immediately after booting, rendering further installation impossible. Call that the deadest of dead-ends, then! This didn’t happen with the more recent Xen Server 5.5, however, and I was therefore able to install the Xen Server, its Windows client tools and create a Centos 64-bit virtual machine in precisely 27 minutes flat… which is amazing, I think, as these things go (especially as I was cooking dinner at the time). The only slight problem was that the Centos installation refused to install graphically and subsequently refused to run X at all… so it was a command-line only affair. That’s not the end of the world, of course, since it’s possible to run an X server on your Windows machine and have a graphical experience of a console-only virtual machine. But it’s not ideal.

ESXi Server was an extremely quick install, but it’s another 100MB download to get hold of the Windows remote administration client which is mysteriously called vSphere …and which, as it turns out, doesn’t actually run on Windows 7! There are workarounds: they involve downloading a dll which could contain absolutely anything from a website whose trustworthiness you have to take on, er, trust. And then you have to hack around a few XML files and keep your fingers crossed. Eventually, it does all work, but the experience is deeply, deeply nasty and not one I’d recommend.

Hyper-V was a success, in the sense that I had a working Centos 5.3 virtual machine within an hour and a half… but the graphics were all messed up, with the Centos machine needing to be set to an 800×600 display before it would display correctly on an old 1280×1024 monitor. Set it actually to display at 1280×1024 and it looked more like 3856×1900… very, very large and unusable! If you didn’t mind working in a tiny screen, or forever scrolling around an enormous one, I suppose you could call it workable. But yet another download (this time, 200MB+) was needed to get the Remote Server Administration Tools for Windows 7 installed… only then to discover that no matter what one did, you just kept being told that you lacked the permissions necessary to administer the virtual server. Apparently, if you faff around with assorted Group Policies and DCOM permissions, you might get it all working, but I never did. So, if you don’t mind running everything direct off the virtual server itself, no problemo. But if you want to manage it from your Windows 7 desktop, forget it, basically.

Oracle’s own virtualisation server was slick enough… but incredibly fiddly to use, with the need to pre-provision everything (such as the Centos installation disks) made incredibly difficult. (Think of it like having to pre-declare all your variables before being allowed to use them!) It doesn’t help that there’s still no Windows client software for remote administration of the virtual machines: so you’re reduced to creating a virtual Linux machine on your Windows desktop so you can administer a virtual Linux machine on your Oracle VM Server! Pathetic, to be honest, and far more trouble than it’s worth. Especially since installing the ‘client’ software involves an incredibly adventurous and lengthy install of a complete Oracle XE database server, plus a mountain of Oracle Application Server componentry. I kept seeing visions of sledgehammers and nuts, to be honest. After all that, there’s not even the facility to just say, ‘please install from the DVD that I’ve just inserted in the Virtual Server’s physical DVD drive’. Oh no. Nothing that simple. Various import options are available to make an already-obtained ISO usable as a VM installation source, but none of them worked for me. My server just sat there, promising me mountains of virtual pleasure, but denying me any of it. Meanwhile, I’d been forced to create an additional amply-apportioned virtual server simply to act as the managing remote client for the server I’d intended to create. Bizarre in the extreme.

Short story, then: Windows Hyper-V works, but remote administration is a bugger. Xen Server works incredibly simply, swiftly and elegantly…but without X, it’s a more complex command-line only experience (update: but this can be fixed!). ESXi is OK, but seems fiddly, and doesn’t have a good Windows 7 client. And Oracle’s own Virtual Server offering is a complete load of unmanageable nonsense as far as I’m concerned.

A detailed write-up to follow, in other words. Meanwhile, a lot of fun, even more frustration… and some surprising outcomes. (I had expected Oracle’s offering to be a lot more elegant than it was, for example!)