VMware SRM using HDS AMS 2000

I have recently been configuring VMware SRM with Hitachi AMS 2000 arrays. The process has been neither straightforward nor well documented, so I hope this post will save someone else a little time.

In this post I will focus on the configuration and the required prerequisites for the HDS SRA 2.0 for VMware SRM. During testing I used ESX 4.0 Update 1, vCenter 4.0 Update 1 and SRM 4.0.1.

When working with this solution there are some obvious pieces of documentation such as the following:

Hitachi Storage Replication Adapter Software VMware vCenter Site Recovery Manager Deployment Guide

Site Recovery Manager Administration Guide

There are also some other helpful pieces of documentation which may not be as obvious, such as:

Hitachi AMS 2000 Family Command Control Interface (CCI) Installation, Reference, and User’s Guides. These can be found on the HDS Support portal.

Implementing VMware Site Recovery Manager with Hitachi Enterprise Storage Systems Whitepaper

How To Debug CCI Issues 1.3 article on the HDS GSC website

Also, the sample horcm.conf file installed with the CCI has some good information in the comments.

Now on to the configuration.

Prerequisites

Before the HDS SRA can be configured, the following requirements must be in place:

  • Two HDS AMS 2000 arrays connected by WAN (FC or iSCSI)
  • One VMware vCenter installation in the primary/protected site
  • One VMware vCenter installation in the secondary/recovery site
  • One VMware SRM installation in the primary/protected site
  • One VMware SRM installation in the secondary/recovery site
  • TrueCopy replication in place for LUNs with protected datastores
  • SRM sites paired

Test Environment

Our test environment consists of one cluster at the primary site and one at the secondary site. We have a test SharePoint environment which is stored across four VMFS datastores. See the diagram below.

Test Environment

Our goal for this test configuration is to fail over the SharePoint environment to the recovery site. We are replicating the LUNs containing all of the SharePoint system data using TrueCopy Extended Distance (TCE). SRM is also installed at both sites, and the sites have been paired.

With the items above in place, we can move on to configuring the SRA.

HDS SRA 2.0 Configuration

When configuring the storage replication adapter, the primary documentation is the deployment guide referenced above. I found a few things missing from that document, which I cover here.

The first step in getting the SRA configured is making sure you have a copy of the proper HDS CCI for your array firmware. The HDS SRA relies on the Hitachi Command Control Interface, which must be installed on the SRM servers. I installed the CCI in the default C:\HORCM directory on both SRM servers. This is a straightforward install and is documented in the Hitachi AMS 2000 Family Command Control Interface (CCI) Installation Guide.

The tough parts of the CCI install for me were working out that it needed to be installed as a service and creating the horcm.conf files. Our example above requires only two instances of the horcm service, one on each SRM server. They will be HORCM0 and HORCM1.

To create the services we create the horcm_run.txt files and issue the following commands.

On the first SRM server, create the C:\HORCM\Tool\horcm0_run.txt file and then execute the following command:
    C:\HORCM\Tool\svcexe /S=HORCM0 /A=C:\HORCM\Tool\svcexe.exe

On the second SRM server, create the C:\HORCM\Tool\horcm1_run.txt file and then execute the following command:
    C:\HORCM\Tool\svcexe /S=HORCM1 /A=C:\HORCM\Tool\svcexe.exe

Each horcmX_run.txt file is created by copying the sample run file located in C:\HORCM\Tool, naming the copy appropriately, and setting the HORCMINST variable to the correct instance number. The procedure is documented in the comments of that sample file.

After running these commands you should see the services appear in the Windows Services MMC snap-in.

Then add the following lines to the %systemroot%\system32\drivers\etc\services file on each SRM server.

The file should appear as below, with one blank line below the last horcm service.
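As a rough illustration of the entries this step calls for, here is a small Python sketch that builds one services line per HORCM instance and appends any that are missing, leaving a single blank line at the end. The service names `horcm0`/`horcm1`, the UDP protocol, and especially the port numbers 11000 and 11001 are assumptions chosen for illustration; they are not taken from the original listing, so use whatever free ports suit your environment.

```python
def services_entries(instances, base_port=11000):
    """Build one services-file line per HORCM instance number.

    base_port is a placeholder; any unused UDP port will do.
    """
    return [
        f"horcm{inst}    {base_port + inst}/udp    # HORCM instance {inst}"
        for inst in instances
    ]


def append_entries(existing_text, entries):
    """Return the services text with missing entries appended.

    The result always ends with exactly one blank line after the last entry.
    """
    present = set()
    for line in existing_text.splitlines():
        fields = line.split()
        if fields and not fields[0].startswith("#"):
            present.add(fields[0])

    body = existing_text.rstrip("\n").splitlines()
    for entry in entries:
        if entry.split()[0] not in present:
            body.append(entry)
    return "\n".join(body) + "\n\n"


if __name__ == "__main__":
    # Demo against an in-memory stand-in for the real services file.
    print(append_entries("http    80/tcp\n", services_entries([0, 1])), end="")
```

Both entries go on each SRM server because each instance refers to its partner by service name. Running the helper a second time leaves the text unchanged, so it is safe to repeat.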

Once the services are installed, the next step is to create the horcm.conf files. The first thing we need for this is a command device. The SRA deployment guide left this step out, but it is documented in the VMware SRM with Enterprise Storage whitepaper mentioned earlier. Basically, you create a small LUN and present it to the SRM server as a physical compatibility RDM. Then you initialize the disk and create a basic primary partition, but do not assign a drive letter or format it. I found one HDS document that said this LUN should be 33MB and one that said 36MB, so I made it 40MB. Once this is done we have all that is needed to create the horcm.conf files.

HORCM0.conf on the SRM server at primary site

HORCM1.conf on the SRM server at secondary site

A few points to note about these files. The first is the relationship between the hosts and devices in the group: the HORCM_LDEV section of each instance references the half of the pair that instance controls, while the HORCM_INST section of each file references the opposite instance. The second is the command device naming format, which consists of “\\.\CMD-” followed by the array serial number and the LUN number, for example “\\.\CMD-11111112-4”. Now that we have the configuration files, we copy them into %windir% on their respective SRM servers and start the services.

After this is complete we can install the SRA. It is downloaded from the VMware website and the executable is named RMHTCSRA.exe. It is a simple install with no options. After this there are some environment variables that need to be set.
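Since the two files are not reproduced here, the relationship described above can be sketched by generating both from one set of values. This is a Python illustration only, and nearly everything in it is a placeholder: the IP addresses, the group name `SRM_TCE`, the device names, the LUN numbers, the second array's serial number and command device, and the poll/timeout values are all assumptions. Only the command device string `\\.\CMD-11111112-4` comes from the text. The section layout follows my understanding of the sample horcm.conf shipped with CCI, so check it against that sample before relying on it.

```python
def render_horcm_conf(local_ip, local_service, cmd_device, group,
                      devices, remote_ip, remote_service):
    """Render horcm.conf text for one instance.

    devices is a list of (dev_name, array_serial, ldev) tuples for the
    half of each pair that THIS instance controls.
    """
    lines = [
        "HORCM_MON",
        "#ip_address  service  poll(10ms)  timeout(10ms)",
        f"{local_ip}  {local_service}  1000  3000",
        "",
        "HORCM_CMD",
        "#dev_name",
        cmd_device,
        "",
        "HORCM_LDEV",
        "#dev_group  dev_name  Serial#  CU:LDEV(LDEV#)  MU#",
    ]
    for name, serial, ldev in devices:
        lines.append(f"{group}  {name}  {serial}  {ldev}  0")
    lines += [
        "",
        "HORCM_INST",
        "#dev_group  ip_address  service",
        # Points at the OPPOSITE instance, never at the local one.
        f"{group}  {remote_ip}  {remote_service}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values for the two SRM servers.
PRIMARY_IP, SECONDARY_IP = "192.0.2.10", "198.51.100.10"
GROUP = "SRM_TCE"

primary_conf = render_horcm_conf(
    PRIMARY_IP, "horcm0", r"\\.\CMD-11111112-4", GROUP,
    [("sp_lun10", 11111112, 10), ("sp_lun11", 11111112, 11)],
    SECONDARY_IP, "horcm1",
)
secondary_conf = render_horcm_conf(
    SECONDARY_IP, "horcm1", r"\\.\CMD-22222223-4", GROUP,
    [("sp_lun10", 22222223, 10), ("sp_lun11", 22222223, 11)],
    PRIMARY_IP, "horcm0",
)

if __name__ == "__main__":
    print(primary_conf)
    print(secondary_conf)
```

The group and device names are the same in both files while the serial numbers differ, which is what lets the two instances match up each half of a pair.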

setx SplitReplication true /m
setx RMSRATMU 1 /m

Then reboot the SRM servers. We are now ready to configure the SRA using the SRM plug-in in vCenter.

Here we see the paired sites in site recovery.

 SRM Sites

Click on configure array managers and we see the following dialog.

 SRA Config 1

Click Add to add a protected site array manager and we see the following configuration dialog.

 SRM Config 2

We enter the name and HORCMINST=0 for the first instance on the primary server in the protected site. Then we use a different name and HORCMINST=1 for the next instance on the secondary server in the recovery site. Here we see both sides configured.

 SRA Final Config
 SRA Final Config 2

The last step in the wizard allows us to confirm the SRA sees the replicated datastores properly.

 Replicated Datastores

We see that the LUN numbers match the devices in the horcm<x>.conf files. The datastore group in this diagram consists of four LUNs which also belong to the same TCE consistency group. These are the LUNs being used by our test SharePoint application and database servers.

At this point we are now ready to complete configuration of the protection groups and recovery plans in Site Recovery Manager. The process for configuring these is documented in the SRM administration guide. Protection groups are configured at the protected site and recovery plans are configured at the recovery site. Here is a screenshot of the test recovery plan for our SharePoint environment.

Recovery Plan 1 

When we run a test on this recovery plan, the test runs successfully and then waits for us to complete our testing. Clicking Continue returns the plan to a ready state.

Recovery Plan 2 

During this phase we can look at a couple of things to confirm what is happening in the process. One is the new datastores we will see in the configuration tab of the DR ESX hosts.

Datastore Snapshots/Replicas 

There was no need to change the LVM.EnableResignature or LVM.DisallowSnapshotLun settings at the host level in ESX 4, because resignaturing is handled at the volume level and SRM takes care of it at the time of testing or failover. Another part of the process we can confirm at this time is the status of the TrueCopy pairs. Here we see the pairs are in split status.

TrueCopy Split Status

Now we can complete any other testing to confirm success of the test and then click Continue in the recovery plan to return to a ready state. After the test completes we can see that the datastores are removed from the recovery ESX hosts and the TCE pairs are returned to a paired status after resynchronization.

After the testing process is completed we can review some of the steps in the SRM logs. The logs are located under %allusersprofile%\VMware\VMware vCenter Site Recovery Manager\Logs. These log entries and the HORCM logs under C:\HORCM\log are the primary sources of information when troubleshooting problems with this process.

I hope someone finds this post useful. Next I am going to be testing this with secondary copies at the recovery site using ShadowImage and Copy-on-Write.
 

Regards,

Dave

HDS AMS 2000 Storage Resource Reporting with PowerShell

Welcome!

I have been creating a few PowerShell scripts for use with the HDS AMS 2000 array. One thing I found I needed was a quick way to look at DP (Dynamic Provisioning) pools, RAID groups, and LUNs. I also wanted to be able to see the associations and filter easily. Since I am working on a new deployment, I have been creating RAID groups and LUNs often, and I needed a quick way to see what I currently had while creating new resources.

I created a PowerShell script that quickly shows existing resources by RAID group or DP pool. It also uses nickname information for the devices, which is maintained in three CSV files (LU_Nicknames.csv, RG_Nicknames.csv, DP_Nicknames.csv). These are simple comma-delimited text files which contain the ID and nickname of each resource. The files are updated as storage resources are added. This allows me to easily identify the resources and to filter for specific devices.

The script executes three HSNM2 CLI commands and reads the information into object form. The LUN information is then shown grouped by raid group or DP pool.

Here is the output with the nickname search parameter set to “DB”. This will return all database resources based on the naming standard. If this is left null it will return all resources.

Here is the script:

The script uses the start-session.ps1 file to establish connectivity with the HDS array. Additional information regarding the use of this include file can be found in this post. Then the script executes HSNM2 CLI commands to return information on DP pools, RAID groups, and LUNs. The script uses regular expressions to parse the output and convert it into objects. It also reads in the nickname files and adds the data to the custom objects.

The objects are then output using the built-in PowerShell formatting engine, with a little custom formatting thrown in for the group headers.
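The parse-then-group approach described above can be sketched in a few lines. This is a Python stand-in rather than the PowerShell script itself, and the sample text is a made-up layout, not real HSNM2 CLI output; the regular expression would need adjusting to match whatever columns the CLI actually prints.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

# Invented sample: LUN number, capacity, and owning RAID group.
SAMPLE_OUTPUT = """\
  LU    Capacity    RG
   0    100.0 GB     0
   1    250.0 GB     0
   2      1.0 TB     1
"""

LINE_RE = re.compile(
    r"^\s*(?P<lu>\d+)\s+(?P<size>[\d.]+)\s+(?P<unit>GB|TB)\s+(?P<rg>\d+)\s*$"
)


@dataclass
class Lun:
    lu: int
    size_gb: float
    raid_group: int


def parse_luns(text):
    """Turn each data row into a Lun object; headers and blanks are skipped."""
    luns = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue
        size = float(m["size"]) * (1024 if m["unit"] == "TB" else 1)
        luns.append(Lun(int(m["lu"]), size, int(m["rg"])))
    return luns


def group_by_raid_group(luns):
    """Return {raid_group: [Lun, ...]} preserving input order."""
    groups = defaultdict(list)
    for lun in luns:
        groups[lun.raid_group].append(lun)
    return dict(groups)


if __name__ == "__main__":
    for rg, members in group_by_raid_group(parse_luns(SAMPLE_OUTPUT)).items():
        print(f"RAID group {rg}")
        for lun in members:
            print(f"  LU {lun.lu:>3}  {lun.size_gb:8.1f} GB")
```

Using named groups in the pattern keeps the conversion step readable when more columns are added later.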

Here is an example nickname file:
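The attached file is not reproduced here, so the following is a hypothetical one in the ID,nickname format described earlier, together with a Python sketch of the nickname search. The names in the sample are invented, and matching case-insensitively on a substring is my assumption about how the search parameter behaves.

```python
import csv
import io

# Hypothetical LU_Nicknames.csv contents: LUN ID, nickname.
SAMPLE_NICKNAMES = """\
0,ESX_Boot
1,DB_Data01
2,DB_Log01
3,SharePoint_WFE
"""


def load_nicknames(text):
    """Read ID,nickname rows into a {id: nickname} dictionary."""
    reader = csv.reader(io.StringIO(text))
    return {int(row[0]): row[1].strip() for row in reader if row}


def filter_by_nickname(nicknames, search=None):
    """Return entries whose nickname contains the search text.

    An empty or missing search returns everything.
    """
    if not search:
        return dict(nicknames)
    needle = search.lower()
    return {k: v for k, v in nicknames.items() if needle in v.lower()}


if __name__ == "__main__":
    names = load_nicknames(SAMPLE_NICKNAMES)
    for lun_id, nickname in filter_by_nickname(names, "DB").items():
        print(lun_id, nickname)
```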

I suppose this may not be necessary with the use of Device Manager, but I am still learning it and I could not quite get this view with it. Besides, I am more of a scripting kind of guy. I also really like the output of this script, as it gives me just the view of the array I need when I am allocating new storage and setting up new resources. I use this script in conjunction with two other scripts for creating LUNs and RAID groups. I plan to post those scripts soon.

Hope this helps,

Dave

HDS AMS 2000 Performance Analysis with PowerShell and PowerGadgets

Welcome!

It has been a while since my last post due to a very busy schedule with a SAN and virtualization project.

I have been working on an implementation of an HDS AMS 2500 midrange array for a VMware vSphere 4 environment. So far everything has been working and performing well. The management software included with the HDS AMS 2000 series array is SNM2 (Storage Navigator Modular 2), a Java-based web application. This software also has a command line version which appears to be pretty comprehensive. It consists of a series of DOS executables, which can be run from PowerShell. There are a series of scripts I have been working on for viewing and creating storage resources on the array. I will share many of these in future posts. In this post I want to share some scripts I have written to extend the functionality of the performance monitoring utility in SNM2.

The base functionality of the array allows you to capture performance statistics to a text file. The file can be captured manually or automatically for a specified time period, at an interval as short as one minute. One text file is produced per capture, or all captures can be written to one file. Based on what I read in the SNM2 manual, I believe you can also do some graphing with the web interface, but it requires an additional license, and personally I think the PowerGadgets graphs are better.

The four scripts I have started with are get-performance_processor.ps1, get-performance_ports.ps1, get-performance_raidgroups.ps1, and get-performance_luns.ps1, which do pretty much what they say and produce the following PowerGadgets charts.

Chart

The chart group is a tabbed interface which allows you to tab through the controllers and ports/RG/LU/Procs depending on the script being used. Each script generates different groups of charts for different performance counters. I have not implemented all of the performance counters, just the ones which are most important to me now. I will be improving these scripts over time and implementing more counters. Here is an example of how the script works.

Script

When executed, the script asks whether or not to collect data. If yes, it prompts for the interval in minutes and the time period. If no, it uses previously collected data in the default output directory. Next it asks whether to list the data as text output. Then it prompts for generation of each group of charts for ports, RAID groups, LUNs or processors, depending on the script run.

Now to the script. All of the scripts rely on the start-session.ps1 script and also require a password file be set for logging into the array. Additionally, an array has to be registered.

Example 1 shows a PowerShell script which will register an array and set the admin password.

You will need to replace ARRAYNAME, USERNAME and the IP Addresses for your environment.

 Example 2 shows the start-session PowerShell script which defines environmental information.

You will need to change the paths and ARRAYNAME for your environment.

Example 3 shows the get-performance_processor.ps1 script

This script collects the data from the array in separate files, reads the pertinent data from the files, and transforms it into object form, which is fed into the PowerGadgets out-chart cmdlet. The other three scripts are longer as they digest more information.

To use these scripts you will need PowerShell, PowerGadgets (a paid product with a free trial), the SNM2 CLI, and the script files attached to this post. Oh, and an HDS AMS 2000 array.

Here are the script downloads:
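To make the collect, read, and transform steps a little more concrete, here is a Python sketch of the middle part: turning a set of per-interval capture files into one series per controller, which is the shape a charting tool needs. It is not the PowerShell script, and the capture layout below is invented for illustration; real SNM2 performance files contain more sections and columns.

```python
import re

# Invented stand-ins for two capture files taken one interval apart.
CAPTURE_1 = """\
CTL  Usage(%)
  0        12
  1         8
"""
CAPTURE_2 = """\
CTL  Usage(%)
  0        20
  1        15
"""

ROW_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s*$")


def parse_capture(text):
    """Return {controller: usage_percent} for one capture file."""
    usage = {}
    for line in text.splitlines():
        m = ROW_RE.match(line)
        if m:
            usage[int(m.group(1))] = int(m.group(2))
    return usage


def build_series(captures):
    """Combine captures, in time order, into {controller: [usage, ...]}."""
    series = {}
    for text in captures:
        for ctl, pct in parse_capture(text).items():
            series.setdefault(ctl, []).append(pct)
    return series


if __name__ == "__main__":
    for ctl, points in build_series([CAPTURE_1, CAPTURE_2]).items():
        print(f"CTL{ctl}: {points} avg={sum(points) / len(points):.1f}")
```

Each list in the result lines up with the capture order, so it can be plotted directly against the collection interval.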
start-Session.txt
get-performance_processor.txt
get-performance_ports.txt
get-performance_raidgroups.txt
get-performance_luns.txt

Save the files to your script directory and change the extensions to .ps1.

I hope someone finds this useful.

Regards,

Dave