Here’s a funny thing. I build a CentOS 5.3 x86_64 virtual machine running on my Citrix XenServer. All works well. I enable the graphical user interface, as explained in an earlier post, and all continues to work well. I can reboot the machine as often as I like, logging in between reboots: all fine and dandy. I then download my Red Hat GOAL script, to automate the preparation steps needed to get Oracle running on the new VM. The script executes perfectly well, and the ensuing Oracle installation goes swimmingly, too, as expected. Then I reboot the VM once more and get this at log on:

That is, no matter how often or correctly I type in root’s password, I merely get a “module is unknown” error. In fact, it turns out to be completely impossible to log on (as root!) in this VM ever again, which is a problem I’ve never seen before, under any circumstances. Had this been a physical machine, I’d have booted with a rescue CD and fiddled around to try and fix it, but since this was only virtual, I deleted it and built a new one. However, that doesn’t alter the fact that I could reboot and log in fine before I ran GOAL and the Oracle installation, and couldn’t afterwards. So where’s the problem?
Well, the only bit of GOAL that has anything to do with ‘modules’, as mentioned in the error message, is around lines 443 to 446, where it alters a configuration file so as to switch on the pam_limits.so pluggable authentication module, so that assorted security limits for users are enforced. It does this because that is what the Oracle installation guides tell you to do, so it’s not that GOAL is doing something weird or unexpected!
To test if those lines alone were causing all the trouble, I built a fresh CentOS 5.3 64-bit VM and ran an edited version of GOAL that had just those four lines commented out. The result: perfectly normal reboot-and-login behaviour. Then I did manually what those four lines of bash script do: edit /etc/pam.d/login and add in the lines
session required /lib/security/pam_limits.so
session required pam_limits.so
Save the edited file and reboot. Result: module is unknown. Bingo!
So it’s definitely one or other (or maybe both!) of those two lines causing all the trouble. A couple of further edit-and-reboot experiments proved (to my satisfaction, at least!) that it was the first of these lines causing the problem. If I added just the ‘session required pam_limits.so’ one, then the machine would reboot and let me log in perfectly normally. Add the line that specified the full path to the pam_limits.so file, however, and the ‘module is unknown’ error was back immediately.
Why that should be so, I have no idea, because a check before adding that line showed that the file did exist in that location:

If anything, therefore, you might have thought the line which was vague on the specifics of a file’s location would have caused more trouble than one which was precise and accurate. But apparently not!
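If you want to check whether one of your own machines carries the troublesome entry, something along these lines should do it. This is only a rough sketch: the function name find_qualified_pam_limits is made up for illustration and is not part of GOAL, and it assumes a grep that understands -E and POSIX character classes. It lists any uncommented line in a PAM configuration file that refers to pam_limits.so by an absolute path.

```shell
#!/bin/bash
# Hypothetical helper (not taken from GOAL): report entries in a PAM
# configuration file that name pam_limits.so with an absolute path.
find_qualified_pam_limits() {
    # Default to the file discussed in this post; pass another path to override.
    local pam_file="${1:-/etc/pam.d/login}"

    # Match lines whose first non-blank character is not '#', and which
    # contain a whitespace-separated field starting with '/' and ending
    # in pam_limits.so. The -n flag prefixes each hit with its line number.
    grep -nE '^[[:space:]]*[^#[:space:]].*[[:space:]]/[^[:space:]]*pam_limits\.so' "$pam_file"
}
```

Run it with no argument to inspect /etc/pam.d/login, or point it at a copy of the file if you would rather not touch the real thing. No output means no fully-qualified entry was found.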
Now, I started to wonder about this: GOAL has been working perfectly on physical servers (and virtual servers as guests in VMware Workstation) for a very long time with both these lines being added to the relevant file. Why should it suddenly stop working just because we’re suddenly using XenServer? Was it some weird interaction between a xen-enabled kernel and these pluggable authentication modules? I wasn’t even sure how to go about working out what particular element of the interaction was causing all the grief. So then I paused, backed up and had another look at the problem from a slightly different angle: maybe GOAL had been wrong all these months in the first place?
A quick review of Oracle-on-Linux installations around the Interweb at least showed I wasn’t insane: plenty of other people use the fully-qualified-path version of the pam_limits.so file, just as I do. Here’s Tim Hall’s guide to a 10.2 installation on Red Hat 5, for example:

Note that he uses /lib/security/pam_limits.so, not just plain-vanilla “pam_limits.so” with no path. He does exactly the same in his installation guide for 10.1 on Red Hat 4, too. By the time he documents 11.1 and 11.2 installations onto Red Hat-like OSes, he’s telling people to use exactly the same two-versions configuration as GOAL has used all this time:

Lest it be thought I’m picking on Tim Hall’s documentation, however, let me say at once: everyone else does this, too. Here’s an article by John Smiley on Oracle’s very own website:

Spot the use of the fully-qualified filename!
The same author’s ‘install 11g‘ article was also wrong:

The funny thing about that is, as you’ll note, I can only show you the error these days by referencing the Google cache, because the original article has been corrected very recently (which is a good thing, of course).
Or there’s BDerzhavets’ 2007 blog post, which even appears to be a description of installing Oracle on Xen! Nevertheless, here’s the same instruction as everyone else has been giving all this time:

Once more, the use of the fully-qualified filename.
And so on and on. Pick almost any guide (see footnote 1 below!) to installing Oracle on Linux written in the past 5 years or so, and I’ll pretty much guarantee that it tells you to either include the path-specified version of pam_limits.so on its own, or to add two lines to the configuration file, one of which will be the path-specified version. So GOAL is in good company! (See footnote 2 below, too!)
The only trouble with all this good company is that we’ve all been wrong! Because if you look at the official Oracle installation guide for 10g Release 2 on Linux, it says this:

You’ll note that step 2 says nothing at all about adding two different versions of pam_limits.so, and uses the one version that does not mention a full path. And how about the official installation document for 11g Release 1 on Linux? Exactly the same thing: at the end of Section 2.8, you’re told simply to add the one path-less reference to pam_limits.so. And the same is true of the 11g Release 2 official documentation: just one reference to pam_limits.so, and no path required.
It’s interesting to think about how these documentation glitches arise in the first place, but more interesting to speculate about how they propagate themselves over the course of many years and vast numbers of installation guides! I guess it’s the same mechanism that still has loads of guides telling you to stick a command to start the listener in your dbora script, even though that’s not been needed to automate listener startups since version 9: borrowing without referring back to the original, official documentation source, plus an understandable tendency to cut-and-paste instead of writing from first principles. That is quite a separate issue, of course, from the question of why the advice given by GOAL and so many others (to add a fully-qualified path-and-name reference to pam_limits.so to your /etc/pam.d/login file) worked fine until it met Xen, and I’m afraid I have no answer for that one. The fact is, however, that it’s GOAL and the other installation guides that are wrong, not the Xen-enabled environment. Do it the way the Oracle documentation tells you to do it and there are no problems with ‘module is unknown’ errors at all.
All of which is an extremely long-winded way of saying that GOAL has been modified so that it now only makes the one, correct addition to the /etc/pam.d/login file, regardless of whether you’re running virtualised or not. Which means it gets it correct on physical and virtual servers alike, as it always should have done. Apologies for not spotting my error much earlier. The correction has been applied to all four GOAL scripts, not just the Red Hat/CentOS one.
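For the curious, the corrected behaviour can be sketched roughly as follows. I stress that this is an illustration and not the actual GOAL code: the function name fix_pam_limits is invented for the purpose, and it assumes mktemp and a POSIX-ish sed are available. It strips out any ‘session required’ entry that gives pam_limits.so an absolute path, then adds the single path-less entry only if it is not already there, so running it twice does no harm.

```shell
#!/bin/bash
# Hypothetical sketch (not the real GOAL code): leave exactly one
# path-less pam_limits.so session entry in a PAM configuration file.
fix_pam_limits() {
    # Default to the file discussed in this post; pass another path to override.
    local pam_file="${1:-/etc/pam.d/login}"
    local tmp
    tmp=$(mktemp) || return 1

    # Delete any 'session required' line that names pam_limits.so by absolute path.
    sed '/^[[:space:]]*session[[:space:]][[:space:]]*required[[:space:]][[:space:]]*\/[^[:space:]]*pam_limits\.so/d' \
        "$pam_file" > "$tmp" || { rm -f "$tmp"; return 1; }

    # Append the path-less entry only when it is not already present.
    if ! grep -qE '^[[:space:]]*session[[:space:]]+required[[:space:]]+pam_limits\.so' "$tmp"; then
        echo 'session required pam_limits.so' >> "$tmp"
    fi

    # Write the result back over the original, keeping its ownership and permissions.
    cat "$tmp" > "$pam_file"
    rm -f "$tmp"
}
```

Try it against a copy of the file first (fix_pam_limits /tmp/login.copy, say) and keep a root session open while you test a fresh login, since a broken /etc/pam.d/login is exactly what locked me out in the first place.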
Footnotes
1. I won’t list them all. But a quick bit of trivial Googling found this, this, this, this, this, this …and, well, you get the idea. All show (as at the time of writing) the use of a fully-qualified filename at the point of editing /etc/pam.d/login. Incorrectly, as per the official doco. Naturally, a number of the authors of those sites might ‘silently’ edit their work now to correct things. But the original screenshots exist, even if I haven’t bothered to show them here!
2. There are exceptions. For example, this site offers a 10g installation guide that is very non-standard in some respects… but it does manage to specify only the path-less version of the relevant edit. It was about the only guide I could find that did, though!