In this article I shall discuss the migration of an OES cluster from one storage area network (SAN) to a new one.
In a customer's environment, an OES1-Linux two-node cluster hosts their iPrint service. The servers in the cluster access two LUNs on the current SAN – one for the iPrint resource and one for the Split Brain Detector (SBD). There are two paths to each LUN, so Linux registers two devices for each LUN. For example, a single LUN is seen by Linux as both /dev/sdb and /dev/sdg.
In the original cluster the mount point for the iPrint partition is /mnt/mount_iprint. The iPrint container name and volume name were cont_iprint and iprint_vol respectively. Before performing the actual SAN migration, I set the iPrint resource in the cluster to be started manually using iManager (in other words, no auto-start). This way, when the cluster is moved to the new SAN, the resource will not start automatically before the cluster is verified to be working correctly.
Then I shut down the iPrint service and mounted the partition manually. The service was shut down so that no files would change while the backup was being performed. The manual mounting of the partition was done with the help of the Novell Cluster Services (NCS) scripts. After mounting the partition, I copied the contents of the partition into a local directory.
#source the script that provides the Novell NCS functions
#(on OES the functions file is typically /opt/novell/ncs/lib/ncsfuncs)
node1>. /opt/novell/ncs/lib/ncsfuncs
#activate the partition
node1>activate_evms_container cont_iprint /dev/evms/cont_iprint/iprint_vol
#create the mount point if it does not exist
node1>mkdir -p /mnt/mount_iprint
#mount the partition
node1>mount -t reiserfs /dev/evms/cont_iprint/iprint_vol /mnt/mount_iprint
#create backup directory and backup the files
node1>mkdir -p ~/backup_iprint
node1>cp -a /mnt/mount_iprint/* ~/backup_iprint
I then proceeded to unmount and deactivate this partition.
#unmount the partition
node1>umount /mnt/mount_iprint
#deactivate the container (deactivate_evms_container is the counterpart
#function used in the NCS unload scripts)
node1>deactivate_evms_container cont_iprint
I then did a grep search in /etc and /var for configuration files containing the words cont_iprint or iprint_vol. The files should be found in the /etc/opt/novell/ncs and /var/opt/novell/ncs directories. These files were also backed up, as they need to be modified later on.
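The search itself can be done in a single command; a minimal sketch, using the directory paths mentioned above:

```shell
# list every NCS config file that still references the old container or volume name
grep -rl -e cont_iprint -e iprint_vol /etc/opt/novell/ncs /var/opt/novell/ncs
```

The -r flag recurses into the directories, -l prints only the matching file names, and each -e adds a pattern, so a file matches if it contains either name.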
The LUNs were carved out by the SAN storage team. After the LUNs were created, I ran multipath -l, but the new LUNs were not detected by the servers. The fastest way I knew to make the LUNs visible was to reboot the servers. After the servers finished rebooting, the new LUNs were recognised as disks under /dev/disk/by-name/. An example is /dev/disk/by-name/3600a0b8000292ad80000c7f94861b21, where the long string of digits is the LUN's WWID.
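As an aside, a reboot is often avoidable: on most Linux systems the SCSI bus can be rescanned online. A sketch, assuming the standard sysfs scan interface (run as root; host numbers vary per system):

```shell
# ask every SCSI host adapter to rescan its bus so new LUNs are detected
# ("- - -" means all channels, all targets, all LUNs)
for host in /sys/class/scsi_host/host*; do
    echo "- - -" > "$host/scan"
done
# then re-check the multipath maps for the new devices
multipath -l
```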
After the LUNs were detected by the multipath daemon, the next step was to use EVMS to manage the new LUNs so that they could be shared by the nodes in the cluster. The steps to create traditional Linux volumes on shared disks are listed in the documentation. (http://www.novell.com/documentation/oes/cluster_admin_lx/index.html?page=/documentation/oes/cluster_admin_lx/data/h2mdblj1.html)
In OES1 Linux, you need special root privileges to start the EVMSGUI application in graphical mode. So, when switching from a normal user account to the root account, instead of using the
su command, I used the
sux command, since sux also transfers the X11 display credentials to the new user.
Before starting EVMSGUI, I ensured that all the nodes in the cluster were powered on and that the novell-ncs process was running on them. Then, as the root user, I started EVMSGUI by running on the command line:
node1>evmsgui
Upon starting EVMSGUI, I was prompted with a message saying that an “invalid or uninitialized partition” was found on the drive. The message appears once for each new LUN (see Figure 1). For a new LUN to function as a cluster volume, it must not be a compatibility volume, and any existing segment manager must be removed.
Figure 1: Invalid Partition Message
New SAN LUNs will appear as logical volumes on the disk. See Figure 2 for the LUNs newly detected by my server (the two new volumes with the very long strings of digits). Right-click on a volume and select Display Details. On the Page 2 tab, look at the Status field to see whether the volume is a compatibility volume or has another segment manager on it. If so, right-click on the volume and delete it.
Figure 2: EVMSGUI Showing the New Uninitialized LUNs
Then go to the Disks tab and, if there is another segment manager on the disk, right-click on the disk and select remove segment manager from Object. When prompted by the dialogue boxes, I chose “Recursive Delete”, “Write Zeroes” and “Continue” respectively. (See Figures 3 to 5.)
Figure 3: Dialogue Box Showing the Volume to be Deleted
Figure 4: Second Dialogue Box that Prompts for Next Step
Figure 5: Third Dialogue Box Prompting for Confirmation on Deletion
To create a Cluster Segment Manager, click on Actions > Create > Container. Select Cluster Segment Manager followed by the disk which the segment manager is to be created on. (The disk is available only after the original segment manager is removed.) Select Private as the type and enter a name for the container. I named the container iprint_c. See Figures 6 to 9.
Figure 6: Select Cluster Segment Manager
Figure 7: Select the Device to Create the Cluster Segment Manager
Figure 8: Select Private as the Type of Container
Figure 9: Dialogue Box Showing Container Created Successfully
From the Container tab, modify the properties of the newly created container and mark it active. (This step is important.)
After creating the segment manager, the next step is to create the EVMS volume. Click on Actions > Create > EVMS Volume. (See Figure 10.) I named the volume iprint_v.
Figure 10: Creating a New Volume on the Container
Lastly the file system has to be created on the volume. (You might have to remove what EVMS thinks is the original file system before you can create a new one – Figure 11.) In the Volumes tab, right-click on the newly created volume and choose Make File System (Figure 12). The LUN is now ready for the iPrint resource.
Figure 11: Removing an Existing File System
Figure 12: Choosing the File System to Create on the Volume
Deactivate the Cluster
As described in the documentation, there should be only one SBD partition for the cluster at any one time, so before the new SBD could be created on the new SAN, the old SBD had to be removed from the cluster. To check whether there is an existing SBD, run:
sbdutil -f
In my case, I had to bring down the cluster first. At the master node, I brought down the entire cluster with:
cluster down
After the servers have left the cluster, the SBD can be safely deleted. However, before deleting the old SBD partition, I created the new SBD partition first. This was done with EVMSGUI.
Creating the SBD Partition
From a cluster node, start EVMSGUI. Then click Actions > Create > Segment > Network Segment Manager and select the Free Space Storage Object. I chose to use the entire partition size for the SBD. At this point, if you check
/dev/evms/.nodes/, you will find two SBD partitions, named <cluster_name>.sbd and <cluster_name>.sbd1. The second SBD partition is not recognised until the first one is deleted.
Reminder: before the first SBD partition can be deleted, keep only one node connected to the cluster. If there are many nodes in the cluster, use
cluster down to bring down the cluster, then use
cluster join from the master node so that only the master node is connected. If there is only one slave node in the cluster, run
cluster leave from the slave node to disconnect that single node. Either way works; it is a matter of choice. Just remember to leave only one node connected to the cluster.
From the connected node, start EVMSGUI. In EVMSGUI, choose the Available Objects tab, right-click on <cluster_name>.sbd and delete it.
Figure 13: Delete the SBD Partition from the Available Objects Tab
After deleting the first SBD partition, the second SBD partition, originally named <cluster_name>.sbd1, will automatically be renamed <cluster_name>.sbd. (If it is not, run
sbdutil -i -p /dev/evms/.nodes/cluster_name.sbd.) To verify that the SBD partition now resides on the new LUN, run EVMSGUI; there should be only one <cluster_name>.sbd under the Available Objects tab.
After I had done the above, I ran
cluster down from the master node and connected the slave node to the cluster. From experience, the slave node needs a while before it reconciles itself with the new SBD. Then start EVMSGUI to check that it is also using the new LUN partition. You can check by looking at the WWID of the partition.
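One way to compare WWIDs from the shell, using the device paths seen earlier in this article:

```shell
# the multipath device names under /dev/disk/by-name/ are the LUNs' WWIDs
ls -l /dev/disk/by-name/
# the SBD partition the cluster is using; its underlying device should
# resolve to one of the new LUNs above
ls -l /dev/evms/.nodes/
```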
Restore Backup Data to New Partition
Before the cluster is re-created to use the new SAN, the data from the old iPrint partition should be copied over to the new partition. To do that I used the scripts that come with NCS to facilitate the activation and deactivation of EVMS volumes. Here is what I did:
node1>activate_evms_container iprint_c /dev/evms/iprint_c/iprint_v
node1>mount -t reiserfs /dev/evms/iprint_c/iprint_v /mnt/mount_iprint
node1>cp -a ~/backup_iprint/* /mnt/mount_iprint/
With the commands above, the new iPrint partition is mounted at /mnt/mount_iprint and the contents of the old iPrint partition are copied over to this location. I did this from the master node. I then unmounted the partition so it could be mounted on the other node to verify that the data was intact.
On the slave node, I ran the same commands as on the master node to mount the iPrint partition, then listed the directory's contents to ensure that the data was readable on both nodes.
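The verification pass on each node can be sketched as follows, reusing the NCS helper functions from earlier (the deactivation call is an assumption based on the NCS unload scripts):

```shell
# activate the container and mount the new iPrint volume
activate_evms_container iprint_c /dev/evms/iprint_c/iprint_v
mount -t reiserfs /dev/evms/iprint_c/iprint_v /mnt/mount_iprint
# spot-check that the restored data is readable
ls -lR /mnt/mount_iprint | head
# clean up so the other node can take its turn
umount /mnt/mount_iprint
deactivate_evms_container iprint_c
</imports>
```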
The configuration files that contain the name of the previous container cont_iprint and volume iprint_vol had to be changed to use the new container (iprint_c) and volume (iprint_v). These files can be found in /var/opt/novell/ncs and /etc/opt/novell/ncs. In my case, I simply edited the iprint_storage.load and iprint_storage.unload files. I probably could have done this from iManager, but I prefer to do it with the command line.
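Since only the container and volume names change, the edit can also be scripted; a sketch with sed, using the file paths given above (iprint_storage.load and iprint_storage.unload are the script names from this setup):

```shell
# swap the old container/volume names for the new ones in the NCS load/unload scripts
for f in /etc/opt/novell/ncs/iprint_storage.load /etc/opt/novell/ncs/iprint_storage.unload; do
    sed -i -e 's/cont_iprint/iprint_c/g' -e 's/iprint_vol/iprint_v/g' "$f"
done
```

Back up the files before editing them in place, as noted earlier.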
Recreating the Cluster
According to the instructions in the documentation, one would need to perform the steps below.
Since the LUN for the SBD was already created, I simply needed to run the sbdutil command to make the LUN the SBD partition:
sbdutil -c -d /dev/disk/by-name/3600a0b8000292ad80000c7f94861b21
After creating the SBD partition, you must edit the Cluster object in eDirectory to enable the Shared Disk Flag attribute, then save the changes and reboot the cluster.
The cluster should now be working off the new SBD.
To verify that the cluster is working correctly, run the following command from one of the servers:
cluster status
The output should show that the cluster is up and running with the two nodes. Additionally, the Master_IP_Address_Resource item should also be running.
Once this is done, access the iManager interface through the cluster's IP address. If the cluster is working, this should proceed without problems.
Once iManager could be accessed from the cluster IP address, I proceeded to start the iPrint resource from iManager. It should run with no problem.
Then proceed to bring down the resource manually and bring it up on the other node.
After the testing is satisfactory, the iPrint resource can then be set to start automatically.