(based on openQRM-version 3.1.4)
by Matt Rechenburg
Creating a dynamic, scalable and flexible web-farm hosting virtualized “root” web-servers.
A common requirement for ISPs that sell dedicated web-servers with “root-access” to their customers is to create, manage and monitor a huge (and growing) number of virtualized systems.
This How-To describes how to set up such an environment using the openQRM data-center management platform. It goes through the full installation, the deployment of the “virtualization hosts”, the creation and administration of the “partitions”, and on to configuration-management, monitoring and high-availability.
“filesystem-image” - A server's root-filesystem located on a storage server.
“boot-image” - A Linux kernel “image” containing the kernel-file, the kernel-modules and an initrd.
“virtual environment” (VE) - Virtual environments in openQRM are logical abstractions of services and how they should be deployed, e.g. which kernel (boot-image), which rootfs (filesystem-image), single-server or cluster, special hardware profile, SLAs etc.
“virtualization host” - A server providing a virtualization technology (e.g. VMware, Xen, Linux-VServer, Qemu, ...). This is the system on which “partitions” are created.
“partition” - Partitions are slices of the “virtualization host”. They are the virtual machines started on the “virtualization host”, each providing a separate and isolated operating system.
“resource” - A server-system, either physical or virtual. Systems added to openQRM boot as “idle”. They get a small operating system stack via netboot and stay in this “available” status until they get provisioned (deployed) with some “virtual environment” (VE).
- one server-system dedicated for the openQRM-server
- one server-system to deploy a “virtualization host”
- one or more servers set up as dedicated storage-servers
- one or more hot-standbys to provide high-availability for the openQRM-server
- a dedicated server hosting the database for the openQRM-server
- any number of additional systems to enlarge the number of “virtualization-hosts”
1) choosing your preferred Linux distribution
openQRM supports many different Linux distributions e.g. RHEL, FC, Suse, Centos, Debian, .... For this How-To I selected FC4 (Fedora Core 4) but it will (should) work with every other major/modern Linux distribution too. During the installation please configure a static ip-address for the first network-interface.
In this How-To the ip of the openQRM-server will be “192.168.88.179” !
Please also make sure to disable the firewall and SELinux.
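On FC4 the firewall and SELinux are typically switched off at runtime with “service iptables stop”, “chkconfig iptables off” and “setenforce 0”. To keep SELinux off across reboots the “SELINUX=” line in /etc/selinux/config has to be changed as well. A minimal sketch of that last step, assuming the stock layout of that file; the function name “disable_selinux_config” is hypothetical and not part of openQRM :

```shell
# Hypothetical helper (not part of openQRM): set SELINUX=disabled
# in an SELinux config file such as /etc/selinux/config.
disable_selinux_config() {
    cfg="$1"
    tmp="$cfg.new"
    # Rewrite only the SELINUX= line; SELINUXTYPE= and comments stay untouched
    sed -e 's/^SELINUX=.*/SELINUX=disabled/' "$cfg" > "$tmp" && mv -f "$tmp" "$cfg"
}
```

Run as root it would be called as “disable_selinux_config /etc/selinux/config”.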
2) install the base operating system and the openQRM packages
Please note that this How-To makes use of the fast-cloning capabilities of LVM2. Please make sure to create a dedicated LVM volume-group for storing the server-images !
In this How-To this volume-group is named “vol” !
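If the installer did not create the volume-group, it can be created afterwards with the standard LVM2 tools. The following is only a sketch: the device name passed in (e.g. /dev/sdb1) is an assumption and must be replaced by an unused disk or partition on your system, and the function name “create_image_vg” is hypothetical. By default the commands are only printed; set DRY_RUN=0 to really run them as root.

```shell
# Sketch: create a volume group (default name "vol") on a spare device.
# The device is an assumption; pvcreate destroys existing data on it.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
create_image_vg() {
    device="$1"
    vg="${2:-vol}"
    for cmd in "pvcreate $device" "vgcreate $vg $device" "vgdisplay $vg"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            # Unquoted on purpose: split the string into command and arguments
            $cmd || return 1
        fi
    done
}
```

Calling “create_image_vg /dev/sdb1” prints the three commands for review before anything is changed.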
After installing the base operating system this How-To requires the following set of openQRM-packages to be installed :
openqrm-core-base - The base openQRM-server (Version 3.1.4)
openqrm-plugin-dhcpd - Provides the dhcpd-server for ip-management and netbooting (Version 1.0 for openQRM 3.1.4)
openqrm-plugin-tftpd - Features the tftpd-server for netbooting servers (Version 1.0 for openQRM 3.1.4)
openqrm-plugin-xen - Provides the Xen virtualization technology for openQRM (Version 0.6 for openQRM 3.1.4)
openqrm-plugin-puppet - Configuration management for the virtualized web-servers (Version 0.2 for openQRM 3.1.4)
openqrm-plugin-nagios - Enhanced monitoring for the managed servers via Nagios (Version 0.2 for openQRM 3.1.4)
openqrm-plugin-apache - Provides an apache web-server within openQRM, required for the Nagios-plugin (Version 0.2 for openQRM 3.1.4)
openqrm-plugin-lvm-mgmt - Very useful, provides fast-cloning of filesystem-images via LVM2 (Version 0.2 for openQRM 3.1.4)
openqrm-plugin-sshlogin - Nice to have, provides ssh-login to the managed servers within the web-browser (Version 2.0 for openQRM 3.1.4)
openqrm-plugin-webmin - Nice to have, provides a web-interface to let the users administrate their systems (Version 0.3 for openQRM 3.1.4)
openqrm-extras-mysql - Install if you want openQRM to automatically set up a mysql-database for you (Version 0.0 for openQRM 3.1.4)
For this How-To the latest stable openQRM version 3.1.4 is used. Please find the above packages for the 3.1.4 release in the SourceForge download section at : http://sourceforge.net/project/showfiles.php?group_id=153504
Initialize and start openQRM by running the qrm-installer :
cd /opt/qrm
./qrm-installer -i -c
From the “Components” menu choose :
apache dhcpd lvm-mgmt nagios puppet sshlogin tftpd xen webmin
In the “Config” menu choose :
Basic -> configure the base-system, if you do not have a special setup you can go with the defaults
Now select “Exit” and “Save”. The installation procedure will now configure and start the openQRM-server. You can “watch” the start-up by running “tail -f /var/log/qrm/qrm.log”.
After startup you can access the openQRM-management console via a web-browser at :
http://[ip-address-of-the-openQRM-server-system]
Please login as “qrm” with the password “qrm”.
One of the first things you should do is to change this default password !
(Screenshot of the openQRM-management console after installation)
To add a first “resource” to the openQRM-environment simply start a server via netbooting. To enable netbooting please set the system's BIOS to boot via “PXE”.
The resource will boot up via the network and be added to the openQRM-management GUI as an “idle” resource in maintenance state. New systems “arrive” in openQRM in a maintenance mode which prevents them from being provisioned directly. To make them available please disable the maintenance mode for the resource you would like to use for deployment.
(Screenshot of the resource-overview, one “idle” system in maintenance-mode)
Please make sure the maintenance mode is disabled for this “idle” system via the openQRM-GUI at “Resources” - “Action” - “disable maintainance mode”
It should look like this :
(Screenshot of the resource-overview, one “idle” system with the maintenance-mode disabled)
To create a “filesystem-image”, a storage-server first needs to be defined in openQRM.
In the management GUI please click on “Management Tools” - “Storage” - “Servers”. Then in the upper right menu select “Add new storage server”. Fill out the storage-server form: give a name e.g. “lvm-nfs”, select the “lvm-nfs” storage-server type and give the ip-address of the openQRM-server. You also need to add the following environment variable for this storage-server :
SERVER_VOL=vol
Then submit to create and save this new storage-server.
(Screenshot of the creation of a storage-server in openQRM)
btw: The new storage-server types “lvm-nfs” and “lvm-iscsi” are provided by the lvm-mgmt plugin. Both types use the LVM2 snapshot-features to enable the fast cloning of filesystem-images.
Since we want to use the fast-cloning features of the lvm-mgmt plugin we need to create a logical-volume in the volume-group dedicated to the filesystem-images. To automatically create a logical volume for the first filesystem-image please use the “qrm-lvm-manage-images” utility provided by the lvm-mgmt plugin in the following way :
[root@demo ~]# cd /opt/qrm/plugins/lvm-mgmt/bin/
[root@demo bin]# ./qrm-lvm-manage-images
Usage : ./qrm-lvm-manage-images add/remove/snap/list <image-name> <-v volume-group>
[-m size in MB]
[-s image-snapshot-name]
[-i ip-address of the storage-server]
[root@demo bin]# ./qrm-lvm-manage-images add xen_host -v vol -m 5000
Creating logical volume xen_host size 5000 using volume group vol
Logical volume "xen_host" created
Detected NFS-image. Mounting and adding xen_host to /etc/fstab + /etc/exports
mke2fs 1.38 (30-Jun-2005)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
640000 inodes, 1280000 blocks
64000 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1312817152
40 block groups
32768 blocks per group, 32768 fragments per group
16000 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 28 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
[root@demo bin]#
This automatically created a logical-volume named “xen_host”, formatted it with ext3, added it to /etc/fstab and /etc/exports and reloaded the nfs-server so that /vol/xen_host gets exported. To re-check that everything went fine :
[root@demo bin]# df /vol/xen_host
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/vol-xen_host
5039616 141216 4642400 3% /vol/xen_host
[root@demo bin]# cat /etc/fstab | grep xen_host
/dev/vol/xen_host /vol/xen_host ext3 defaults 1 1
[root@demo bin]# exportfs | grep xen_host
/vol/xen_host <world>
[root@demo bin]# ls /vol/xen_host
lost+found
[root@demo bin]#
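The manual checks above can also be wrapped into a small helper, which is handy once there are many images. This is only a sketch: the function name “check_image_store” is hypothetical, the volume-group name “vol” is taken from this How-To, and the assumed /etc/exports format (path followed by whitespace and the client list) may differ on your system.

```shell
# Hypothetical helper: verify that an image-store of volume group "vol"
# is listed in fstab and exports. FSTAB and EXPORTS default to the
# system files and can be pointed elsewhere for a trial run.
check_image_store() {
    name="$1"
    fstab="${FSTAB:-/etc/fstab}"
    exports="${EXPORTS:-/etc/exports}"
    # fstab line as written by the lvm-mgmt plugin: /dev/vol/<name> /vol/<name> ...
    grep -q "^/dev/vol/$name[[:space:]]" "$fstab" \
        || { echo "$name missing in $fstab"; return 1; }
    # Assumed exports format: /vol/<name> <clients>
    grep -q "^/vol/$name[[:space:]]" "$exports" \
        || { echo "$name missing in $exports"; return 1; }
    echo "$name ok"
}
```

For the image created above the call would be “check_image_store xen_host”.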
The image-store was created correctly. Now let's fill it with content using the “qrm-filesystem-image” utility :
[root@demo ~]# cd /opt/qrm/bin/
[root@demo bin]# ./qrm-filesystem-image create
[root@demo bin]# Usage ./qrm-filesystem-image create
-s/--filesystem-image <image-name>
-t/--target <storage-server-name:/full-path/image-name>
-l/--location <hostname:/rootfs-location>
[-u/--username <username>] mandatory unless -o flag used
[-p/--password <password>] mandatory unless -o flag used
[--private-exclude <private-excludes>]
[--private-add <private-adds>]
[-e/--exclude <exclude-directory>]
[-a/--arch <Opteron|Any|Xeon|i686>]
[--keep-services]
[--remove-services]
[--keep-network]
[--remove-network]
[--hostname <hostname>]
[-o/--only-physical]
[--shared]
[--operation-system <operation-system>]
[-h/--help]
[root@demo bin]# ./qrm-filesystem-image create -s xen_host -t lvm-nfs:/vol/xen_host -l 192.168.88.179:/vol/xen_test_host_nfs -u qrm -p qrm -e /opt/qrm/ -e /vol/ --remove-network --keep-services --shared
The next step will create a Qrm-image from the system 192.168.88.179
(you will be prompted for the password of root@192.168.88.179)
Press <ENTER> to continue
Creating filesystem-image xen_host from 192.168.88.179
Transfering the image content from 192.168.88.179://vol/xen_test_host_nfs/
(this procedure may take time)
------------------------------------------------------------------------
Tranfer of the filesystem-image completed. Now preparing the image
/opt/qrm/bin
Found already existing /tmp/u25320/private/shared.conf file and keeping it
starting service configuration
Staying with the current service configuration
starting network configuration
Removing current network interface configuration
Removing configuration for eth0 (backed up in /var/qrm/backup/09:12-06.13.2007/ in image)
Removing configuration for eth1 (backed up in /var/qrm/backup/09:12-06.13.2007/ in image)
Successfully created the filesystem image xen_host
[root@demo bin]#
The filesystem content was transferred successfully :
[root@demo bin]# cd /vol/xen_host
[root@demo xen_host]# ls
bin cdrom1 etc lib misc opt proc sbin sys usr
boot custom home lost+found mnt poweroff reboot selinux syslog var
cdrom dev initrd media net private root srv tmp vol
[root@demo xen_host]#
Now let's directly create another filesystem-image to use later as the web-server template for the end-users. The image-store for this fs-image is created in the same way as for the xen_host image via the “qrm-lvm-manage-images” utility :
[root@demo ~]# cd /opt/qrm/plugins/lvm-mgmt/bin/
[root@demo bin]# ./qrm-lvm-manage-images add webserver -v vol -m 5000
Creating logical volume webserver size 5000 using volume group vol
Logical volume "webserver" created
Detected NFS-image. Mounting and adding webserver to /etc/fstab + /etc/exports
mke2fs 1.38 (30-Jun-2005)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
640000 inodes, 1280000 blocks
64000 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1312817152
40 block groups
32768 blocks per group, 32768 fragments per group
16000 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
[root@demo bin]#
Now copy the filesystem-image content from the “xen_host” to the “webserver” :
[root@demo ~]# cd /vol/webserver/
[root@demo webserver]# /bin/cp -aR /vol/xen_host/.autorelabel .
[root@demo webserver]# /bin/cp -aR /vol/xen_host/* .
[root@demo webserver]#
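Two cp calls are used above because the shell pattern “*” does not match hidden files such as .autorelabel. As an alternative sketch, copying “<source>/.” transfers the complete directory content including hidden files in one go. The function name “copy_image_content” is hypothetical :

```shell
# Copy the complete content of one image directory into another,
# including hidden files; -a preserves ownership, modes and symlinks.
copy_image_content() {
    src="$1"
    dst="$2"
    [ -d "$src" ] && [ -d "$dst" ] || {
        echo "usage: copy_image_content <src-dir> <dst-dir>" >&2
        return 1
    }
    # "src/." means "everything inside src", dotfiles included
    cp -a "$src"/. "$dst"/
}
```

For this How-To the call would be “copy_image_content /vol/xen_host /vol/webserver”.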
and add the new filesystem-image “webserver” to the openQRM-server via the qrm-cli :
[root@demo ~]# /opt/qrm/bin/qrm-cli -u qrm -p qrm filesystem add -n webserver -s lvm-nfs -i /vol/webserver -a Any
[root@demo ~]#
Now let's take a look at the filesystem-image section in the openQRM-GUI :
(Screenshot of the filesystem-images overview)
It is now time to create a “virtual environment” for the “xen_host” and deploy it to the available “resource”.
In the openQRM-GUI select “Virtual Environments” and click on “New virtual environment” in the upper right “tools” menu. Fill in the name for the new VE → “xen_host”. Enable “Partitioning” via the checkbox and select “Xen” as the virtualization-technology. As kernel-image please select the “Xen-Hypervisor” boot-image. As filesystem-image choose the “xen_host” image and enable the checkboxes “Multi-Server” and “same for all”.
With this setup the VE will start the “virtualization host” as a cluster using a shared root-filesystem (SSI - single system image). That means you can later simply add and remove “resources” to the “xen_host” cluster to scale up or down.
(Screenshot of the creation of the xen_host VE)
To be able to successfully deploy the “virtualization host” (xen_host) we must make sure that non-PXE clients also get an ip-address from the dhcpd-server provided by the dhcpd-plugin.
The default dhcpd-configuration of the dhcpd-plugin only serves PXE-clients; all other clients are caught by a class “NOTPXE” !
Please make sure to remove the following section from the dhcpd.conf file at /opt/qrm/plugins/dhcpd/etc/dhcpd.conf :
---------------------------------
class "NOTPXE" {
match if substring (option vendor-class-identifier, 0, 9) != "PXEClient";
ignore unknown-clients;
}
pool {
range 192.168.88.255 192.168.88.255;
allow members of "NOTPXE";
}
---------------------------------
The dhcpd.conf file should now look similar to this one (possibly with other ip-addresses) :
[root@demo ~]# cd /opt/qrm/plugins/dhcpd/etc/
[root@demo etc]# vi dhcpd.conf
[root@demo etc]# cat dhcpd.conf
allow booting;
allow bootp;
# Standard configuration directives...
option subnet-mask 255.255.255.0;
option broadcast-address 192.168.88.255;
option routers 192.168.88.1;
option domain-name "juggle";
option domain-name-servers 192.168.88.10;
ddns-update-style ad-hoc;
next-server 192.168.88.179;
# Group the PXE bootable hosts together
group {
# PXE-specific configuration directives...
option vendor-encapsulated-options 09:0f:80:00:0c:4e:65:74:77:6f:72:6b:20:62:6f:6f:74:0a:07:00:50:72:6f:6d:70:74:06:01:02:08:03:80:00:00:47:04:80:00:00:00:ff;
subnet 192.168.88.0 netmask 255.255.255.0 {
default-lease-time 21600;
max-lease-time 43200;
# the range to serve
range 192.168.88.189 192.168.88.199;
filename "/pxelinux.0";
}
}
[root@demo etc]#
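A quick way to confirm that the edit is complete is to search the file for any remaining “NOTPXE” reference. A small sketch; the function name “notpxe_removed” is hypothetical :

```shell
# Hypothetical helper: report whether a dhcpd.conf still contains
# the NOTPXE class or pool. Returns 0 when nothing is left, 1 otherwise.
notpxe_removed() {
    conf="$1"
    if grep -q 'NOTPXE' "$conf"; then
        echo "NOTPXE section still present in $conf"
        return 1
    fi
    echo "no NOTPXE section in $conf"
}
```

For this setup the call would be “notpxe_removed /opt/qrm/plugins/dhcpd/etc/dhcpd.conf”.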
After that please manually restart the dhcpd-plugin :
[root@demo ~]# /opt/qrm/plugins/dhcpd/etc/init.d/dhcpd restart
stopping Qrm dhcpd plugin
starting Qrm dhcpd plugin
[root@demo ~]#
Now let's deploy this “xen_host” to the available “resource” by simply starting the VE.
In the openQRM-GUI click on → “Virtual Environments” - “Actions” - “Start”
The “idle” node will reboot now and start the “xen_host” VE.
(Screenshot of the active xen_host VE)
We are now ready to create and start a first “partition”.
Please go to the “Resources” overview in the openQRM-management GUI. Click on the “resource” which is running the “xen_host” VE. Then select the “Partition” tab and fill out the form e.g.
- 1 partition
- 1 CPU
- 400 MHz CPU-speed
- 148 MB memory
- (leave "local disk" as is / 0.0)
(Screenshot of the partition creation)
Now click on “Start partition”.
The partition is now created and starting on the “xen_host”.
(Screenshot of the starting partition)
To check the partition's start-up you can use the Xen-console. Click on the “partition” in the resource overview and, in its resource-menu on the right, select “Xen-Config” → “open console”. A java-applet connected to the console of the starting partition will open in your browser.
(Screenshot of the starting partition's console)
When it is fully up and running it is “just” another “idle” resource. Please note that in the following screenshot the maintenance mode is already disabled for the started partition :
(Screenshot of the “idle” partition after startup)
We created the “webserver” filesystem-image but we actually do not want to deploy it directly. We want to use it as a server-template, a so-called “golden image”. For each customer we simply create an (LVM) snapshot of the “golden image” and use this for deployment.
As an example we create a VE “customer1” for the first customer.
First let's automatically create the filesystem-image clone from the “golden image”. Please go to “Filesystem-images” and select “Add new filesystem-image” from the upper right “Tools” menu. Name it “customer1”, select the “lvm-nfs” storage-server, give “/vol/customer1” as the identifier and add the following “environment variables” :
CREATE_PHYSICAL=yes
CREATE_FROM=webserver
This tells the lvm-mgmt plugin to physically create this image from a snapshot of the “webserver” filesystem-image. Clicking on “Save” will logically and physically create the filesystem-image “customer1”.
(Screenshot of cloning the webserver “golden-image”)
To re-check run “df” on the openQRM-server :
[root@demo ~]# df
...
/dev/mapper/vol-webserver
5039616 2114028 2669588 45% /vol/webserver
/dev/mapper/vol-customer1
5039616 2114028 2669588 45% /vol/customer1
[root@demo ~]#
The filesystem-image “customer1” is now a snapshot of the “webserver” filesystem-image, ready for deployment. Let's quickly adapt the hostname of this new image :
[root@demo ~]# cat /vol/customer1/etc/sysconfig/network
NETWORKING=yes
HOSTNAME=localhost.localdomain
[root@demo ~]# cat /vol/customer1/etc/sysconfig/network | sed -e "s/HOSTNAME=.*/HOSTNAME=customer1/g" > /vol/customer1/etc/sysconfig/network.new
[root@demo ~]# mv /vol/customer1/etc/sysconfig/network.new /vol/customer1/etc/sysconfig/network
mv: overwrite `/vol/customer1/etc/sysconfig/network'? y
[root@demo ~]# cat /vol/customer1/etc/sysconfig/network
NETWORKING=yes
HOSTNAME=customer1
[root@demo ~]#
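Since this step repeats for every new customer image, it can be wrapped into a small function. The following sketch does the same sed-and-move as above; “mv -f” avoids the interactive overwrite question. The function name “set_image_hostname” is hypothetical, and the path etc/sysconfig/network assumes a Fedora/RHEL-style image as used in this How-To.

```shell
# Hypothetical helper: set HOSTNAME= in the network config of an image.
# $1 = root directory of the image (e.g. /vol/customer1), $2 = new hostname
set_image_hostname() {
    image_root="$1"
    new_name="$2"
    cfg="$image_root/etc/sysconfig/network"
    # Replace only the HOSTNAME= line, then move the result into place
    sed -e "s/^HOSTNAME=.*/HOSTNAME=$new_name/" "$cfg" > "$cfg.new" \
        && mv -f "$cfg.new" "$cfg"
}
```

For the first customer the call would be “set_image_hostname /vol/customer1 customer1”.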
Now let's create the VE for the first customer. Please go to “virtual environments” - “Tools” - “Add new virtual environment”. Fill out the form in the following way :
Name : customer1
Kernel : Xen-Hypervisor
Filesystem : customer1
Minimal Resource Profile : Partition "Xen" !
(Screenshot of the VE creation for customer1)
Now, before starting this “customer1” VE, please add some Nagios system- and service-checks in the “Nagios Config” tab for this VE e.g.
(Screenshot of the Nagios-check configuration of the customer1 VE)
Now start the “customer1” VE from the main “virtual environments” overview. The “customer1” VE now becomes active.
(Screenshot of the active customer1 VE)
Now you can access the “customer1” virtualized server via the Xen-console provided by the Xen-plugin
(Screenshot of accessing the customer1 VE via the Xen-web-console)
You can also access “customer1” by ssh via the sshlogin plugin
(Screenshot of accessing the customer1 VE via the Sshlogin plugin)
and you can access the “customer1” systems webmin-console within the openQRM-GUI
(Screenshot of accessing the customer1 VE's webmin-console)
Additionally you get the full system- and service-status via the Nagios plugin
(Screenshot of the system- and services status)
Oh, yes, httpd is not yet configured and running on this “partition”.
We will fix this immediately and automatically via the puppet plugin.
The puppet configuration-management utility is based on hostnames, so first we need to add the ip-address of the partition plus its hostname to /etc/hosts on the openQRM-server. Alternatively you can of course add this ip-to-name resolution to your DNS-server. In our case the hostname of the partition is “customer1” and its ip-address is 192.168.88.197. This will be different in your setup, please look up these values in the openQRM-GUI. Now add the hostname of the partition :
[root@demo ~]# echo "192.168.88.197 customer1" >> /etc/hosts
[root@demo ~]#
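Running the echo above twice leaves a duplicate line in /etc/hosts. When hosts entries are added by a script for many customers, a guarded variant is safer. A minimal sketch; the function name “add_host_entry” is hypothetical :

```shell
# Hypothetical helper: append "<ip> <name>" to a hosts file unless a line
# already ends with, or contains, that hostname as a separate word.
add_host_entry() {
    ip="$1"
    name="$2"
    hosts="${3:-/etc/hosts}"
    if grep -Eq "[[:space:]]$name([[:space:]]|\$)" "$hosts" 2>/dev/null; then
        return 0    # already present, nothing to do
    fi
    echo "$ip $name" >> "$hosts"
}
```

With the values of this example the call would be “add_host_entry 192.168.88.197 customer1”.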
Now we need to sign the request for a puppet-certificate for the new customer1 system. To list all waiting requests please run on the openQRM-server :
[root@demo ~]# puppetca --list
customer1
xen_host-1
[root@demo ~]#
To sign the request from the customer1 system run :
[root@demo ~]# puppetca --sign customer1
Signed customer1
[root@demo ~]#
On the customer1 server you should find something like the following in /var/log/messages :
...
May 18 13:14:50 customer1 puppetd[3104]: Did not receive certificate
May 18 13:16:50 customer1 puppetd[3104]: Got signed certificate
May 18 13:16:50 customer1 puppetd[3104]: Starting Puppet client version 0.22.4
May 18 13:16:57 customer1 puppetd[3104]: Starting configuration run
May 18 13:16:57 customer1 puppetd[3104]: Finished configuration run in 0.33 seconds
...
We now need to configure the puppet-manifest for the customer1 server. In the openQRM-GUI please go to “Management Tools” → “Puppet”. Click on “classes” in the Java-based puppet-manifest editor.
(Screenshot of the manifest-editor provided by the puppet-plugin)
You will find a pre-configured “sudo” class as an example provided by the puppet-plugin by default. We now want to create a “httpd” manifest for the customer1 system. Enter “httpd.pp” in the text-box and click on “Create new file”. Then click on “Edit” and cut-and-paste the following puppet-manifest snippet in :
class httpd {
    exec { "Start httpd":
        path => "/bin:/usr/bin",
        command => "/etc/init.d/httpd start",
    }
}
Then uncheck “write backup” and hit “save”.
This created a new puppet class “httpd” which makes sure httpd gets started (the exec above does not install the package, so httpd has to be part of the image already). We now need to associate this “httpd” class with the “customer1” system. Please go up one directory in the puppet-manifest editor and edit the “site.pp”. Add the following configuration for the “customer1” system in the “site.pp” :
node customer1 {
    include httpd
}
Uncheck “write backup” and hit “save”.
Hint: Puppet distinguishes between “hostname” and “hostname.domainname” ! I experienced problems with the syntax in site.pp when using additional domainnames for the puppet-clients. To make sure the puppet-client starts without a domainname please remove the “search” line from /etc/resolv.conf on the guests (in this example our “customer1” system).
:) That's it. After a short while the “customer1” server will get a configuration update from the puppetmasterd running on the openQRM-server. This will include the new httpd-configuration for the “customer1” system which will then automatically start httpd.
Please take this “httpd” puppet-class just as a small example of how you can use Puppet for enhanced configuration management of your systems.
Also please feel free to post your Puppet-recipes to the openQRM- and Puppet-websites.
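Removing the “search” line can be done by hand in an editor or with a one-line sed. A minimal sketch, to be run on the guest (or against the image's etc/resolv.conf on the storage-server); the function name “strip_search_line” is hypothetical :

```shell
# Hypothetical helper: delete every "search ..." line from a resolv.conf.
# All other lines (nameserver, domain, comments) are kept as they are.
strip_search_line() {
    resolv="${1:-/etc/resolv.conf}"
    sed -e '/^[[:space:]]*search[[:space:]]/d' "$resolv" > "$resolv.new" \
        && mv -f "$resolv.new" "$resolv"
}
```

Called without an argument it works on /etc/resolv.conf; for the image of this example “strip_search_line /vol/customer1/etc/resolv.conf” would be used instead.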
Finally the Nagios system- and service-check for the “customer1” system looks fine too:
(Screenshot active deployed customer1 VE running a webserver)
Here are some suggestions for how to further enhance this powerful, virtualized environment :
etc.
We have selected “Xen” as virtualization technology for this How-To.
Of course openQRM is not limited to “Xen”; you can simply select the virtualization method which fits best to your applications and services.
Supported virtualization technologies by openQRM are :
openQRM website
openQRM project website
http://sourceforge.net/projects/openqrm
Xen
http://www.cl.cam.ac.uk/research/srg/netos/xen/
Webmin
JTA - Telnet/SSH for the JAVA™ platform
http://javassh.org/space/start
Puppet automated configuration management
http://reductivelabs.com/projects/puppet/
Nagios system- and service monitoring
Hope you enjoyed this openQRM How-To.
Matt R.
…. to be continued ;)