Splunk – Quantitative Finance Journey – The Beginning

Hello again, it has been quite a while since my last post, as I have been busy with a lot of changes. I have worked for Splunk for over a year now and I am really enjoying it. Great company, awesome technology, and a bunch of smart, energetic people.

As always, I like to post about unique items I feel may be useful to others. I have been working on something that might fit the bill, and I am anxious to see whether it is of interest. Over the past 24 months or so I have been studying investing, trading, and quantitative finance. Concurrently, I have also been working to become more proficient with Splunk. I like to combine activities to gain momentum, so I decided stock market and economic data would be the perfect way to dig deeper into Splunk and hopefully improve my investing/trading. In the beginning I only looked at it as a way to learn more about Splunk while using data that was interesting to me. However, as I dug in I found the Splunk ecosystem and the world of quantitative finance have a lot of similarities, the primary ones being lots of data, Python, and machine learning libraries.

In the world of quantitative finance, Python is very widely used. In fact, Pandas, a commonly used Python library, was created at a hedge fund. The Python libraries used in quantitative finance are substantially the same libraries provided in the Python for Scientific Computing Splunk App. Additionally, much of the financial and market data provided by free and pay sources is easily accessible via REST API. Splunk also provides the HTTP Event Collector (HEC), which is a very easy to use REST endpoint for sending data to Splunk. This makes it relatively easy to collect data from web APIs and send it to Splunk.

I promise I will get to a little meat in this post, but first I would like to provide some background. I am starting the second iteration of a Splunk app and a set of data load/sync scripts. I plan to write about my journey, the code, and the solution along the way. I hope to get some feedback and find out if this Splunk app would be desirable to others. If so, we'll see where it goes.

When I started doing trading research I found there were various places to get market and economic data: the Federal Reserve (FRED), the exchanges, the Census Bureau, the Bureau of Economic Analysis, etc. In the end I found I could get most of the core data I wanted from three places:

  • Federal Reserve Economic Data (https://fred.stlouisfed.org/) – FRED is an economic data repository hosted and managed by the Federal Reserve Bank of St. Louis.
  • Quandl (https://quandl.com) – This is a data service, now owned by NASDAQ, that features many free and pay sources for market and economic data. There are various services like this, but I chose to start here as it fit my needs and budget.
  • Gurufocus (https://www.gurufocus.com) – This is a site with free and pay resources that offers some great fundamental data via REST API to subscribers.

The sources are endless, limited only by your imagination and your wallet, as some data is very expensive. The main data most people will start with is end-of-day stock quote data and fundamental financial data. This is exactly what I get from Quandl and Gurufocus, along with the macroeconomic data from FRED. There are lots of ways to get data into Splunk, but my preference in this case was to use Python code to interact with the internet REST APIs, the Splunk REST API, and HEC. This allows my Python scripts to control all of my data loads and configuration in Splunk. Splunk also provides an extensible app development platform which can be used to build add-ons for data input. I will likely move my data load processes to this model in the future.

The other aspect that Splunk brings is the ability to integrate custom Python code via the Machine Learning Toolkit (MLTK) as custom algorithms. This provides the ability to implement analysis, such as concepts from modern portfolio theory for risk optimization and return projection. Additionally, this gives us a path to do more advanced things using the MLTK. I have only scratched the surface on this subject and I have lots of ideas to explore and learn in the future. Splunk simplifies operationalizing these processes and in my opinion makes the task of getting from raw data to usable information much easier.

Ok, hopefully that provides enough background and context. Now I would like to show an example of the following process.

  • Use Python to download end of day stock quote data from quandl.com using their REST API.
  • Use Python to send the data to Splunk via the HTTP Event Collector.
  • Use Splunk to calculate the daily returns of a set of stocks over a period of time.
  • Utilize the Splunk Machine Learning Toolkit to calculate correlation of the stocks based on daily returns.

The following code sample shows a simplified version of code used to retrieve data from the Quandl Quotemedia end-of-day data source. The returned data is formatted and sent to a Splunk metrics index. Splunk metrics were created to provide a high performance storage mechanism for metrics data. Learn more about Splunk metrics here and here.
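
As a minimal sketch of the approach (not the exact script), the snippet below pulls end-of-day rows from the Quandl EOD (Quotemedia) REST API and posts closing prices to HEC; the API key, HEC token, host names, and symbols are placeholders, and the HEC token is assumed to point at a metrics index.

    # Minimal sketch: pull end-of-day quotes from the Quandl EOD (Quotemedia) REST API
    # and forward the closing prices to a Splunk metrics index via HEC.
    # The API key, HEC token, host names, and symbols below are placeholders.
    import json
    from datetime import datetime

    import requests

    QUANDL_URL = "https://www.quandl.com/api/v3/datasets/EOD/{symbol}.json"
    QUANDL_KEY = "YOUR_QUANDL_API_KEY"
    HEC_URL = "https://splunk.example.com:8088/services/collector"
    HEC_TOKEN = "YOUR_HEC_TOKEN"

    def get_eod_quotes(symbol, start_date):
        """Return a list of dicts (Date, Open, High, Low, Close, Volume, ...) for a symbol."""
        params = {"api_key": QUANDL_KEY, "start_date": start_date}
        resp = requests.get(QUANDL_URL.format(symbol=symbol), params=params)
        resp.raise_for_status()
        dataset = resp.json()["dataset"]
        return [dict(zip(dataset["column_names"], row)) for row in dataset["data"]]

    def send_close_to_hec(symbol, rows):
        """Format each row as a Splunk metrics event and post it to the HTTP Event Collector."""
        headers = {"Authorization": "Splunk " + HEC_TOKEN}
        for row in rows:
            event = {
                "time": datetime.strptime(row["Date"], "%Y-%m-%d").timestamp(),
                "source": "quandl_eod",
                "event": "metric",
                "fields": {"metric_name": "eod.close", "_value": row["Close"], "symbol": symbol},
            }
            # verify=False only because my lab HEC endpoint uses a self-signed certificate
            requests.post(HEC_URL, headers=headers, data=json.dumps(event), verify=False)

    for sym in ["XLE", "XLF", "XLK", "XLU", "GDX"]:
        send_close_to_hec(sym, get_eod_quotes(sym, "2019-01-01"))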

Once the quote data is loaded then we can see all of the metrics loaded by the process. The following screenshot shows our resulting indexed data.

EOD_DATA

Now that we have our data loaded we can do some more advanced processing. A common fundamental calculation done in quantitative finance using modern portfolio theory is the calculation of daily returns. The following example shows how to use the metrics data loaded into Splunk for this calculation. For this example I have loaded data for various S&P 500 sector ETFs as well as a gold miners ETF. Here is the calculation and results.
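
A rough sketch of the kind of search behind that calculation, with index and metric names that follow the load example above and are assumptions:

    | mstats avg(_value) AS close WHERE index=quote_metrics AND metric_name="eod.close" span=1d BY symbol
    | streamstats current=f window=1 last(close) AS prev_close by symbol
    | eval daily_return = round((close - prev_close) / prev_close, 5)
    | table _time symbol close daily_return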

Daily_Return

The next step in our process is to use the Splunk Machine Learning Toolkit to calculate correlation of our equities. The Python Pandas library has a function that makes this process very easy. We can access that functionality and easily operationalize that process in Splunk. It just so happens there is a Correlation Matrix algorithm in the GitHub Splunk MLTK algorithm contribution site available here. The documentation to add a custom algorithm can be found here and you will notice this Correlation Matrix example is highlighted. Here is the example of using this algorithm and the corresponding output.
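
As a sketch, assuming the contributed algorithm is registered under the name CorrelationMatrix and using a handful of example ETF symbols, the daily-return search can be pivoted into one column per symbol and passed to the fit command:

    | mstats avg(_value) AS close WHERE index=quote_metrics AND metric_name="eod.close" span=1d BY symbol
    | streamstats current=f window=1 last(close) AS prev_close by symbol
    | eval daily_return = (close - prev_close) / prev_close
    | xyseries _time symbol daily_return
    | fit CorrelationMatrix XLE XLF XLK XLU GDX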

Correlation_Matrix

The example above shows the correlation of all of the examined ETFs over a period of 60 days. A value of 1 means perfectly correlated and a value of -1 means perfectly inversely correlated. As noted previously, this calculation is the basis for more advanced operations to determine theoretical portfolio risk and return. I hope to visit these in future posts.

Regards,

Dave

File System Assessment with PowerShell and Splunk

It has been a while since I posted, but here goes. I have been experimenting with Splunk to make one of my old processes better and easier.

In the past I have done file system assessments for customers to provide capacity planning information as well as usage patterns. There are a number of ways I have collected data, depending on whether the platform is Windows, Linux, or a NAS device. In this post I will focus on collecting file system metadata from a Windows file server. To do this I use a PowerShell script to collect the data and dump it to a CSV file. Historically I would use the CSV output and load it into SQL Server as the first step. Then I would use the SQL connectivity and pivot charting functionality in Excel for the reporting. It occurred to me, as I have been working with Splunk, that I could use it to improve this process.

Another thought also occurred to me: this process could be performed by owners of Splunk with no impact on licensing costs. Splunk is licensed on daily ingest volume, and that volume can be exceeded 3 times per month without penalty. File system assessment data is something that would typically only be collected on a periodic basis, so this data could be ingested without increasing the daily license capacity. Using the methods I show below, organizations that own Splunk could easily do free periodic file system assessments.

The first step is to collect the file system metadata using the following PowerShell script.
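
A simplified sketch of the collection approach is below; the XML schema and the selected properties are assumptions for illustration rather than the full script.

    # Simplified sketch of the collection script (not the full version): read the XML
    # config, walk each configured path, and export file metadata to CSV.
    param(
        [Parameter(Mandatory = $true)]
        [string]$ConfigPath
    )

    [xml]$config = Get-Content -Path $ConfigPath

    foreach ($share in $config.Configuration.Share) {
        $outFile = Join-Path $config.Configuration.OutputFolder ("{0}.csv" -f $share.Name)

        Get-ChildItem -Path $share.Path -Recurse -File -ErrorAction SilentlyContinue |
            Select-Object @{n='Share'; e={$share.Name}},
                          @{n='Category'; e={$share.Category}},
                          FullName,
                          Extension,
                          Length,
                          CreationTime,
                          LastWriteTime,
                          LastAccessTime |
            Export-Csv -Path $outFile -NoTypeInformation
    }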

The script requires a single argument which specifies the path to an XML file. This XML file is used to define configuration for the script. Here is an example of how you would call the script.
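
Something along these lines, where the script and config file names are placeholders:

    .\Get-FSMetadata.ps1 -ConfigPath 'C:\FSAssessment\FSConfig.xml'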

The XML configuration file defines the file system(s) to scan. Here is an example:
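
An illustrative configuration, with element and attribute names assumed to match the sketch above:

    <Configuration>
      <OutputFolder>C:\FSAssessment\Output</OutputFolder>
      <!-- Name and Path are required; Category is an optional attribute used to tag the data -->
      <Share Name="HomeDirs"   Path="\\fileserver01\home" Category="UserData" />
      <Share Name="Department" Path="\\fileserver01\dept" Category="SharedData" />
    </Configuration>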

The example includes an explanation of the optional attributes for the XML element(s). These allow control over how the data is organized and tagged, which provides more useful reporting options. The sample also shows several configuration examples and some example output. Once the metadata is collected into CSV files, it can be easily loaded into Splunk using the ad-hoc data load feature or a file system input on a forwarder. Here is an example of a file metadata record/event in Splunk.

Splunk Record

One thing to note here is the event timestamp in Splunk. The _time field is derived from the modified time of the file. This was done on purpose because, in my experience doing file system assessments, it is the only timestamp that is generally accurate. I have found many cases where last accessed is not available or is incorrect. I have also seen many cases where the create date is incorrect; sometimes the create date is more recent than the modified date, and occasionally it is even in the future. Here is the Splunk props.conf sourcetype stanza I defined for the data. It includes TIME_FORMAT and TIMESTAMP_FIELDS to use the modified date for the _time field. It also uses MAX_DAYS_AGO since the modified date can go back many years.
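
A stanza along these lines, where the sourcetype name and TIME_FORMAT are assumptions based on the CSV fields described above:

    [fs_metadata_csv]
    INDEXED_EXTRACTIONS = csv
    TIMESTAMP_FIELDS = LastWriteTime
    TIME_FORMAT = %m/%d/%Y %I:%M:%S %p
    MAX_DAYS_AGO = 10951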

Once the data is loaded into Splunk here is the type of information we can easily find.

Splunk Dashboard
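
As a sketch, a search along these lines (index, sourcetype, and field names are assumptions matching the CSV schema above) can drive panels like capacity and file count by file age:

    index=fs_assessment sourcetype=fs_metadata_csv
    | eval size_gb = Length / 1024 / 1024 / 1024
    | eval age_years = floor((now() - _time) / (365 * 86400))
    | stats sum(size_gb) AS total_gb count AS file_count BY age_years
    | sort age_years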

These are just some simple charts, but the metadata provides many reporting options. There are benefits of using Splunk besides the fact that it can be done without additional cost.

  • This eliminates the need to create tables and/or ETL processes for a relational database
  • The data can be loaded very easily compared to using a relational database
  • The dashboard can be reused very easily for new reports. Simply use a dedicated index and clean or delete/recreate as needed for updated reporting

If I were doing this in an environment I managed on a day-to-day basis, I would send the data directly to Splunk via the HTTP Event Collector. I'll need to modify the collection code a bit to provide an example, but I'll try to post a follow-up with that info.

I hope some folks find this useful.

Regards,

Dave

Splunk and Eureqa; IOT Manufacturing Demo

Introduction

I have recently been working with the Splunk universal machine data platform. It is a very interesting and useful technology with a wide range of applications. I have also been working more with Eureqa and decided to explore how the two technologies could be used together. I decided to use a hypothetical Internet of Things manufacturing scenario to set up a demo using Splunk and Eureqa.

This is a simplified version of a scenario becoming more common in manufacturing and other industries today. The idea is that data is collected about various aspects of the manufacturing process, and quality testing is completed at various stages of the process. The results of the quality testing are used in conjunction with the data collected during the manufacturing process as a training data set to build predictive algorithms/models. These models are then used in real time during the manufacturing process to measure quality and identify defects. This allows for real-time actions to correct quality issues or to stop the manufacturing process to reduce costs incurred by defective product.

Splunk and Eureqa fit in this process like peanut butter and jelly. Splunk is a universal machine data platform that will literally ingest any type of data. It indexes the data and provides a search language to produce valuable information in real time. Eureqa is a robotic data scientist and can find meaningful predictive models automatically from your data. These tools can be used to easily solve the real time IOT manufacturing quality use case. Here is the demonstration scenario:

  • Chemical Manufacturing – (PolyVinylHydroSilica)
  • Problem: Manufacturing data is not timely enough
    • Decisions cannot be made during manufacturing process
    • Data reporting is only looking historically and lags 24 to 48 hours
  • Objective: Improve quality and meet the following goals
    • Improve data collection on manufacturing process
    • Utilize data to accurately predict and identify failures
    • Identify failures as early as possible to reduce costs
  • Identified high level requirements
    • Track manufacturing process data
    • Identify quality issues quickly to reduce costs
    • Provide near real time reporting

 

Legacy Process Flow

Legacy Process

 

Updated Process Flow with Splunk and Eureqa

Updated Process

Output Details and Operation

The initial benefit of this process, and the low-hanging fruit that will allow for a positive return on investment, is the savings from halting production early on lots identified as inferior product. The second valuable benefit is the real-time reporting of manufacturing process metrics. Splunk has real-time reporting capabilities which are used to drive dashboards with valuable information. Here is an example.

Splunk Dashboard

The above dashboard shows key performance indicators for the manufacturing process, such as speed, volume, and cost of manufacturing, as well as the cost saved by stopping failed lots. This information is driven by the manufacturing data from each phase as well as the model results from Eureqa. The models in Eureqa are generated from the training data using the search functionality. The screenshot below shows the candidate models produced from the search.

Eureqa Models

We select models for each phase based on the variables included and their performance. Also note that Eureqa suggests the best model based on the best tradeoff of error vs. complexity. I chose to use model 2 for phase I, model 5 for phase II, and model 11 for phase III. Phase I is more tolerant and only requires a value of 10 or above for a lot to be moved to the next phase. Phase II requires a value of 20 or above to be passed along to phase III. Finally, phase III requires a value of 30 or above for a passing lot. To demonstrate this process I wrote an application to simulate the manufacturing process. This simulation environment is shown in the diagram below.

Demo Environment
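
As a rough sketch of how those phase thresholds could be applied in a Splunk search (the index, sourcetype, and field names are assumptions, with predicted_value standing in for the Eureqa model output for a lot at a given phase):

    index=manufacturing sourcetype=lot_phase_results
    | eval threshold = case(phase="I", 10, phase="II", 20, phase="III", 30)
    | eval status = if(predicted_value >= threshold, "pass", "fail")
    | stats latest(status) AS latest_status BY lot_id phase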

The demo for this process is shown below. I also talk about the various integration points between Splunk and Eureqa and show a couple of examples.

Regards,

Dave

Nutonian Eureqa

Eureqa API and SQL Server Data Load

RoundTower Technologies offers an analytics solution called FastAnswers powered by Eureqa. It is an amazing piece of software created by the brilliant people at Nutonian. Eureqa can use sample data to automate the creation of predictive models. If you are interested, the Data Scientists at Nutonian and RoundTower can explain the mathematics and science behind the technology. Please visit the RoundTower and Nutonian links above to learn more about the solution.

One of the main goals of Eureqa is to provide data science skills to non-PHD data analysts. This is why Eureqa is a very intriguing technology to me. I have an interest in analytics and machine learning as well as some BI background, but I do not have a deep understanding of statistics or machine learning. Eureqa is great because it helps bridge the skills gap.

The primary user interface of the Eureqa application is web based, very intuitive, and well documented. Nutonian also provides a Python API for programmatic access to Eureqa functionality. I have been experimenting with Eureqa and the Python API so I thought I would share some things I learned. Hopefully it will be helpful to someone else using Eureqa. Also, if you are interested, the current API documentation is located at http://eureqa-api.nutonian.com/en/stable/ . This contains basic information on using the API and a few helpful examples.

A question that always comes up when talking about Eureqa is, "How does it connect to database sources?" So one of the first things I decided to learn about Eureqa was how to load data from SQL Server. The goal of this post is to show the basics of using the API as well as getting data from SQL.

To use the Eureqa Python API you will need at least Python 2.7 installed. I prefer to use a distribution that includes Python and many common libraries, including the popular machine learning libraries. I also like to use a Python IDE for development, so my preferred environment is Anaconda2 and JetBrains PyCharm. There is also a good Python IDE called Spyder included with Anaconda2. If you need more assistance getting started, there is a plethora of tutorials on Python, Anaconda2, and various Python IDEs on the web.

Once the development environment is setup the next step is installing and enabling the API. To start using the API your Eureqa account or local installation must be licensed and given access to use the API. If this has been done you will see the API options shown below on the settings page after logging in to Eureqa.

Eureqa_Settings_API

This page provides the ability to download the Python API and an access key, which are the two things required to use the API. The Eureqa API installation is easy: just use pip from your default Python directory with the command shown on the settings page. The next step is the API key, which is also easy: just click the Get Access Key button on the settings page and the following dialog box is shown, which allows us to give the key a logical name.


Eureqa API KeyGen

After assigning a name, click Generate Key. A dialog like the one below is then shown with the API key.

Eureqa API KeyGen Result

Now we can use this key to interact with the Eureqa API. The following code example will show how to connect to SQL Server, retrieve data, and load it into a Eureqa data set. In this example we will be using data from concrete strength testing, which is publicly available from the University of California Irvine (UCI) machine learning repository. The first step in our Python code is to define our connection variables and load the required libraries, which is shown below.
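
A sketch of that first step; the server, database, credential, and file path values are placeholders, and the eureqa package is the Python API installed from the settings page.

    # Load the required libraries and define connection settings (placeholder values).
    import csv

    import pyodbc
    from eureqa import Eureqa

    # SQL Server connection settings
    SQL_CONN_STR = ('DRIVER={SQL Server};SERVER=sqlserver01;'
                    'DATABASE=MLSampleData;Trusted_Connection=yes')

    # Eureqa connection settings
    EUREQA_URL = 'https://eureqa01.lab.local'
    EUREQA_USER = 'dave'
    EUREQA_API_KEY = 'GENERATED_API_KEY'

    # Temporary CSV used to stage data between SQL Server and Eureqa
    TEMP_CSV = 'concrete_strength_data.csv'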

Once that is done we will define two functions: one to retrieve data from SQL Server and write it to a temporary .csv file, and one to load data from the temporary .csv file into a Eureqa data source.
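
A sketch of the two functions; pyodbc and csv do the database and file work, while create_data_source is assumed to be the Eureqa API call that creates a data set from a file (check the API docs linked above for the exact method name).

    def sql_to_csv(conn_str, query, csv_path):
        """Run a SQL query and write the result set, with a header row, to a CSV file."""
        conn = pyodbc.connect(conn_str)
        cursor = conn.cursor()
        cursor.execute(query)
        columns = [col[0] for col in cursor.description]
        with open(csv_path, 'wb') as f:  # 'wb' because this targets Python 2.7
            writer = csv.writer(f)
            writer.writerow(columns)
            writer.writerows(cursor.fetchall())
        conn.close()

    def csv_to_eureqa(eureqa_conn, csv_path, data_source_name):
        """Create a Eureqa data source from the staged CSV file (method name assumed)."""
        return eureqa_conn.create_data_source(data_source_name, csv_path)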

Both of these functions are fairly straightforward and self-explanatory. Notice the first function uses the pyodbc and csv Python libraries we loaded in the first step. These libraries provide database access and csv text processing functionality.

The next piece of code is the main part of the application. The first step is to connect to Eureqa, which is done using the Eureqa interface. This is the entry point for all programmatic operations in Eureqa. After we have a connection to Eureqa, we define our SQL query and execute the two functions to retrieve and load the data.
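
The main flow might look like this; the Eureqa constructor arguments and the table and column names in the query are assumptions, and the data set name matches what appears in the screenshots below.

    # Connect to Eureqa (constructor argument names assumed from the API docs),
    # then retrieve the data from SQL Server and load it as a Eureqa data source.
    eureqa = Eureqa(url=EUREQA_URL, user_name=EUREQA_USER, key=EUREQA_API_KEY)

    query = ("SELECT Cement, Slag, FlyAsh, Water, Superplasticizer, "
             "CoarseAggregate, FineAggregate, Age, Strength "
             "FROM dbo.ConcreteStrength")

    sql_to_csv(SQL_CONN_STR, query, TEMP_CSV)
    csv_to_eureqa(eureqa, TEMP_CSV, 'Concrete_Strength_Data')
    print 'Data set Concrete_Strength_Data created successfully.'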

The screenshots below show what we see in the Eureqa interface before we execute the code: a Eureqa installation with no data sets.

Eureqa Data Set

We execute the script and see a very uneventful output message that tells us the script ran successfully.

Eureqa Script Result

Now after refreshing the Eureqa data sets window there is a new data set called Concrete_Strength_Data.

Eureqa Data Set Result

Here is a subset of the imported data, which strangely enough looks like the data returned from the SQL query in our code.

Eureqa Data Subset

Now that we have a data set loaded it can be used to run searches and build models. So if you happen to be interested in a predictive model to estimate concrete strength, here is the model Eureqa built based on the UCI concrete strength data, which it solved in minutes. Eureqa!!

Eureqa Model Result

I’ll expand on this next time.

Regards,

Dave

EMC Unity REST API and PowerShell

Earlier this month at EMC World the new Unity storage array was unveiled. There are some cool new features with this product, and the architecture is a bit different from the VNX. I like to experiment in my home lab, especially with PowerShell and REST APIs, so when I heard a Unity VSA would be available, including the EMC Unity REST API, I was eagerly awaiting the EMC World release so I could get my hands on it.
Unity

I downloaded the EMC Unity VSA and set it up the first day of EMC World, then started some initial testing with the Unity REST API and PowerShell. I quickly found some distinct differences in the API when compared to the Isilon and XtremIO APIs. The first difference is the authentication mechanism, which took some time to figure out. I found some quirks with the PowerShell Invoke-RestMethod and Invoke-WebRequest cmdlets. These cmdlets did not provide enough control over the web requests, so I had to resort to creating my own with the .NET Framework.

The EMC Unity REST API uses basic authentication via a request header, but also uses login session cookies. While trying to get this working I found I needed to use the .NET System.Net.HttpWebRequest class for greater control. The two problems I had with the PowerShell cmdlets were the inability to control auto redirect and the inability to use request cookies. The request cookies are required to maintain the session after the initial login. The data required to create one of the cookies was returned in a response after an auto redirect, so auto redirect had to be turned off to capture the information successfully.

The two cookies which have to be captured during the login process are MOD_AUTH_CAS_S and mod_sec_emc, which are then used in all subsequent requests to the API. There are also a couple of additional cookies which are passed back and forth during the authentication process. I created a couple of functions to complete the login process, one of which is a recursive function that handles the auto redirects and collects the required login session cookies into global variables. The complete code is more than makes sense to show in this post, but the example below shows the main elements of building the request. This code is called recursively and collects the required session information to be passed in subsequent requests.
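
A simplified sketch of that request-building code; the host name and credentials are placeholders and the full recursive redirect handling is omitted.

    # Build an HttpWebRequest with basic auth, a cookie container, and auto redirect
    # disabled so the intermediate responses and their cookies can be captured.
    $uri = 'https://unity01.lab.local/api/types/loginSessionInfo/instances'
    $basicAuth = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes('admin:Password123!'))

    $request = [System.Net.HttpWebRequest]::Create($uri)
    $request.Method = 'GET'
    $request.Accept = 'application/json'
    $request.Headers.Add('Authorization', "Basic $basicAuth")
    $request.Headers.Add('X-EMC-REST-CLIENT', 'true')
    $request.AllowAutoRedirect = $false                         # handle redirects manually
    $request.CookieContainer = New-Object System.Net.CookieContainer
    if ($Global:UnitySessionCookies) {
        # Replay cookies (MOD_AUTH_CAS_S, mod_sec_emc) collected on earlier passes
        $request.CookieContainer.Add($Global:UnitySessionCookies)
    }

    $response = $request.GetResponse()
    $Global:UnitySessionCookies = $response.Cookies             # save cookies for the next call
    $redirectTarget = $response.Headers['Location']             # follow manually if present
    $response.Close()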

Once the login process is complete and the required session cookies are collected, data requests to the EMC Unity REST API can be issued. The code below is an example of issuing a request for system information.
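
A sketch of a GET request once the session cookies exist; the host name, token variable, and selected fields are placeholders.

    # Request basic system information, selecting properties with the fields query string.
    $uri = 'https://unity01.lab.local/api/types/system/instances?fields=name,model,serialNumber'

    $request = [System.Net.HttpWebRequest]::Create($uri)
    $request.Method = 'GET'
    $request.Accept = 'application/json'
    $request.Headers.Add('X-EMC-REST-CLIENT', 'true')
    $request.Headers.Add('EMC-CSRF-TOKEN', $Global:UnityCsrfToken)   # only needed for POST/DELETE
    $request.CookieContainer = New-Object System.Net.CookieContainer
    $request.CookieContainer.Add($Global:UnitySessionCookies)        # MOD_AUTH_CAS_S / mod_sec_emc

    $response = $request.GetResponse()
    $reader = New-Object System.IO.StreamReader($response.GetResponseStream())
    $systemInfo = $reader.ReadToEnd() | ConvertFrom-Json
    $response.Close()

    $systemInfo.entries.content | Format-Table name, model, serialNumber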

The above code also includes the EMC-CSRF-TOKEN header, which is actually only required when doing a POST or DELETE. Another thing to notice in the code above is the use of the fields query string. The desired object properties to be returned must be specified using this method. The output is below.

The EMC Unity REST API was a bit of a challenge to get started with and took some time with Fiddler to figure out. I will say, though, that for an initial release the API documentation was pretty good. Unity does seem pretty cool and easy to use, and it's awesome to have the Unity VSA for experimentation. I will try to get some more comprehensive example code on GitHub soon.

Regards,

Dave

EMC VNXe Performance PowerShell Module

vnxePoSH

I thought I would revisit the VNXe performance analysis topic. In 2012 I published some posts around performance analysis of an EMC VNXe storage array. This information applies only to the 1st generation VNXe arrays, not to the newer 2nd generation arrays.

In my previous articles I posted a PowerShell module to use to access the VNXe performance database. Since that time I fixed a bug around rollover of timestamps and made a couple other small improvements. The module has also now been published to GitHub.

Here is some background and you can see my previous posts for additional information.

http://muegge.com/blog/emc-vnxe-performance-analysis-with-powershell/

http://muegge.com/blog/emc-vnxe-performance-analysis-with-powershell-part-ii/

The VNXe collects performance statistics in a sqlite database, which can be accessed via scp on the array controllers. It is also available via the diagnostic data package, which can be retrieved via the system menu in Unisphere. There are a few database files which hold different pieces of performance data about the array. The MTSVNXePerformance PowerShell module provides cmdlets to query the database files and retrieve additional performance information beyond what is provided in the Unisphere GUI.

I will show some examples of using the module to get performance information. The first one is a simple table with pool capacity information.

The first step is to load the modules, set file path variables, and the sqlite location. This also uses a module to provide charting from the .Net MSChart controls.
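
A sketch of that setup; the charting module name and file paths are placeholders for my environment.

    Import-Module MTSVNXePerformance
    Import-Module MTSChartControls          # placeholder name for the MSChart wrapper module

    $dbPath     = 'C:\VNXeData\db'          # extracted sqlite files from the array
    $reportPath = 'C:\VNXeData\Reports'
    $sqlitePath = 'C:\Tools\sqlite3.exe'    # location of the sqlite command line binary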

The next step is to get some data from the VNXe sqlite tables.
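
Something like the following, where the cmdlet and column names are placeholders for illustration (the module on GitHub defines the real ones):

    Get-VNXePoolCapacity -DatabasePath $dbPath |
        Format-Table PoolName, TotalGB, UsedGB, FreeGB -AutoSize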

This gives us the information we need about pools. So now we can look at rollup information by using a command like the following.
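
Again with placeholder cmdlet and column names:

    Get-VNXeRollupInfo -DatabasePath $dbPath |
        Format-Table TableName, StartTime, EndTime, SampleInterval -AutoSize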

Next we will look at IOPS from the dart summary data. Data is stored in different tables based on the type of information and the time period. As data is collected it is summarized and moved to historical tables, storing longer time periods at lower data resolution. Here we are going to get the dart store stats, which give us all IO information for each of the data movers.
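
A sketch of that step; the cmdlet and property names are placeholders, while the charting calls are the standard .NET MSChart API.

    Add-Type -AssemblyName System.Windows.Forms.DataVisualization

    # Placeholder cmdlet: pull the dart store stats rows from the sqlite summary tables
    $dartStats = Get-VNXeDartStoreStats -DatabasePath $dbPath

    $chart = New-Object System.Windows.Forms.DataVisualization.Charting.Chart
    $chart.Width  = 900
    $chart.Height = 400
    $chart.ChartAreas.Add((New-Object System.Windows.Forms.DataVisualization.Charting.ChartArea))

    $series = $chart.Series.Add('Total IOPS')
    $series.ChartType = [System.Windows.Forms.DataVisualization.Charting.SeriesChartType]::Line

    foreach ($row in $dartStats) {
        [void]$series.Points.AddXY($row.TimeStamp, $row.ReadIOPS + $row.WriteIOPS)
    }

    $chart.SaveImage((Join-Path $reportPath 'DartIOPS.png'),
                     [System.Windows.Forms.DataVisualization.Charting.ChartImageFormat]::Png)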

This produces the following charts using the MSChart .Net charting controls.

The module can be used to produce complete VNXe performance reports like the one below.

The script that produces the report above is included in the examples folder in the GitHub project.
 

I received a fair amount of interest in the first posts on this topic. I hope this update and refresher is still useful to some folks.

Regards,

Dave

XtremIO PowerShell Module Update

I have been working with version 4.0 of XtremIO. This version of XtremIO has an updated REST API which provides a bunch of new functionality. The MTSXtremIO PowerShell Module provides PowerShell management of the EMC XtremIO storage array.

This new version of the REST API includes new performance features, new snapshot functionality, and many new objects. I have updated the MTSXtremIO PowerShell module for XtremIO to support the new API and functionality. The current version of the XtremIO PowerShell module contains 85 cmdlets for working with XtremIO.

The first four Cmdlets are used to set up a connection to the XtremIO XMS, which is how you connect to the REST API. One XMS can manage multiple XtremIO clusters. A brief connection example follows the list.

Disable-CertificateValidation

Get-PasswordFromFile

New-PasswordFile

Set-XIOAPIConnectionInfo
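
A minimal connection example using these Cmdlets; the XMS host, file paths, and parameter names are illustrative, so check Get-Help for the actual parameters.

    Disable-CertificateValidation                      # lab XMS with a self-signed certificate
    New-PasswordFile -Path 'C:\Scripts\xiopwd.txt'     # prompts for and stores the password
    Set-XIOAPIConnectionInfo -Hostname 'xms01.lab.local' -Username 'admin' `
                             -PasswordFile 'C:\Scripts\xiopwd.txt'

    # Quick sanity check against the connected cluster
    Get-XIOCluster | Format-List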

 

There are 41 get Cmdlets to retrieve various XtremIO objects.

Get-XIOAlert

Get-XIOAlertDefinition

Get-XIOAPITypes

Get-XIOBBU

Get-XIOBrick

Get-XIOCluster

Get-XIOConsistencyGroup

Get-XIOConsistencyGroupVolume

Get-XIODAE

Get-XIODAEController

Get-XIODAEPSU

Get-XIODataProtectionGroup

Get-XIOEmailNotifier

Get-XIOEvent

Get-XIOInfinibandSwitch

Get-XIOInitiator

Get-XIOInitiatorGroup

Get-XIOIscsiPortal

Get-XIOIscsiRoute

Get-XIOItem

Get-XIOLDAPConfiguration

Get-XIOLocalDisk

Get-XIOLunMap

Get-XIOPerformance

Get-XIOScheduler

Get-XIOSlot

Get-XIOSnapshot

Get-XIOSnapshotSet

Get-XIOSNMPNotifier

Get-XIOSSD

Get-XIOStorageController

Get-XIOStorageControllerPSU

Get-XIOSYRNotifier

Get-XIOSyslogNotifier

Get-XIOTag

Get-XIOTarget

Get-XIOTargetGroup

Get-XIOUserAccount

Get-XIOVolume

Get-XIOXenvs

Get-XIOXms

 

There are 11 Cmdlets for creating XtremIO objects.

New-XIOConsistencyGroup

New-XIOInitiator

New-XIOInitiatorGroup

New-XIOIscsiPortal

New-XIOIscsiRoute

New-XIOLunMap

New-XIOScheduler

New-XIOSnapshot

New-XIOTag

New-XIOUserAccount

New-XIOVolume

 

There are 13 Cmdlets for removing XtremIO objects.

Remove-XIOConsistencyGroup

Remove-XIOConsistencyGroupVolume

Remove-XIOInitiator

Remove-XIOInitiatorGroup

Remove-XIOIscsiPortal

Remove-XIOIscsiRoute

Remove-XIOLunMap

Remove-XIOScheduler

Remove-XIOSnapshot

Remove-XIOSnapshotSet

Remove-XIOTag

Remove-XIOUserAccount

Remove-XIOVolume

 

There are 16 Cmdlets for changing various XtremIO objects.

Add-XIOConsistencyGroupVolume

Set-XIOAlertDefinition

Set-XIOConsistencyGroup

Set-XIOEmailNotifier

Set-XIOInitiator

Set-XIOInitiatorGroup

Set-XIOLDAPConfiguration

Set-XIOScheduler

Set-XIOSnapshot

Set-XIOSNMPNotifier

Set-XIOSYRNotifier

Set-XIOSyslogNotifier

Set-XIOTag

Set-XIOTarget

Set-XIOVolume

Update-XIOSnapshot

 

This provides a good basic set of functionality and 100% coverage of the API. There are still many improvements to be made, but it is a solid base. There are some items at the top of the list to address, such as:

  • Improvements in pipeline support across functions. One known issue is the inability to pipe the output of a Get cmdlet into a Set cmdlet; there is currently a problem with the Set cmdlets using property names from input objects correctly.
  • Complete multiple cluster support across all functions
  • Implement ShouldProcess in all New, Set, and Remove functions

 

Other items on roadmap or under consideration

  • Complete Get-XIOPerformance function. This function only provides very raw object output and has an incomplete implementation.
  • Implement use of certificate based authentication
  • Test use of server certificates on XMS
  • Add informational output objects on new, set, and remove functions
  • Implement additional error handling to include informational messages

 

Get the latest version of the MTSXtremIO PowerShell Module on GitHub. I will work on pipeline improvements, multiple cluster support, and ShouldProcess support next. These items will provide a solid foundation for the module functionality. I also have a fair amount of help documentation to complete. Please let me know what other items are missing and where things can be improved.

 

Regards,

 

Dave

 

XtremIO Snapshots SQL Server and PowerShell Part IV

I have continued to work on the MTSXtremIO module, adding functionality for XtremIO 4.0. One interesting feature in 4.0 that provides even more benefit with snapshots is the ability to refresh a snapshot. With SQL Server this saves some steps with LUN mapping and mounting volumes. I have done some testing using two methods to control the snapshot. I will talk about both methods and show a scripting example using each method.

The first method uses a PowerShell Cmdlet included in the PowerShell Toolkit in EMC Storage Integrator for Windows. The XtremIO 4.0 features are found in the latest version of the software, version 3.8. To use this method the EMC Storage Integrator for Windows must be installed on the machine to be used for scripting. This could potentially be each of the participating SQL Servers.

  1. Here is an example script refreshing a database using the PowerShell Toolkit included with version 3.8 of EMC Storage Integrator for Windows.

The second method uses an open source PowerShell module I developed to manage XtremIO. It is a PowerShell interface to the XtremIO REST API and provides a comprehensive management interface, including the snapshot refresh functionality in 4.0.

  2. Here is an example refreshing a database using the Update-XIOSnapshot Cmdlet found in the MTSXtremIO module.
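
As a rough end-to-end sketch of the second approach (server, database, volume, and file names are placeholders, and the Update-XIOSnapshot parameter names are assumptions):

    # Hypothetical sketch of a refresh using the MTSXtremIO module.
    Import-Module MTSXtremIO
    Import-Module SQLPS -DisableNameChecking

    Set-XIOAPIConnectionInfo -Hostname 'xms01.lab.local' -Username 'admin' `
                             -PasswordFile 'C:\Scripts\xiopwd.txt'

    # 1. Detach the QA copy of the database
    Invoke-Sqlcmd -ServerInstance 'SQL01' -Query "EXEC sp_detach_db @dbname = N'SnapTest01_QA';"

    # 2. Refresh the snapshot volume from the source volume (parameter names assumed)
    Update-XIOSnapshot -SourceVolume 'SnapTest01' -TargetVolume 'SnapTest01_QA'

    # 3. Rescan disks so the refreshed volume is picked up, then re-attach the database
    Update-HostStorageCache
    Invoke-Sqlcmd -ServerInstance 'SQL01' -Query ("CREATE DATABASE SnapTest01_QA " +
        "ON (FILENAME = 'Q:\Data\SnapTest01.mdf'), (FILENAME = 'Q:\Logs\SnapTest01_log.ldf') " +
        "FOR ATTACH;")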

The way both of these scripts work relies on some initial setup of a snapshot copy mounted to a server. This could be the same server or a different one. The diagram below shows the test scenario used in the example scripts above.

In either scenario above the copy process happens very quickly. In my tests it only took a few seconds. The only difference in the scripts above is the command to connect to the resources and the command to refresh the snapshot. Either method is simple and really makes database snapshots easy. The scripting is very straightforward and easy to understand.

The main benefit of the ESI Toolkit is that it is officially supported by EMC. It does require an install and provides just some core provisioning functionality. The MTSXtremIO PowerShell module does not require installation and can be used for many other XtremIO management and reporting tasks. It is an open source project located here.

Regards,

Dave

Hyper-V Cluster with Storage Spaces

I was recently discussing the topic of using Microsoft Storage Spaces in conjunction with Hyper-V and failover clustering. The Hyper-V cluster with Storage Spaces architecture has been around for a couple of years. Storage Spaces was introduced in Windows Server 2012 and was improved in Windows Server 2012 R2. There are many great resources available on the topic so I will not cover the details here. If you would like an overview of the architecture here are some great places to start.

Aidan Finn’s Blog – http://www.aidanfinn.com/2013/07/yes-you-can-run-hyper-v-on-your-storage-spaces-cluster-sofs/ – This talks about the architecture I am following. He also has a ton of additional information on Hyper-V, Storage Spaces, and other Microsoft virtualization topics.

TechNet is always a great place to start as well as the Microsoft blogs. Here are some links I found useful.

Failover Clustering – https://technet.microsoft.com/en-us/library/hh831579.aspx

Hyper-V and Failover Clustering – https://technet.microsoft.com/en-us/library/Cc732181(v=WS.10).aspx

Storage Spaces – https://technet.microsoft.com/en-us/library/hh831739.aspx

Microsoft Clustering and High Availability Blog – http://blogs.msdn.com/b/clustering/archive/2012/06/02/10314262.aspx

Now on to the primary point of this post. A couple of colleagues of mine said they would like to see a demo of this architecture, and in particular a successful failover of a virtual machine. I have seen some videos with parts of the configuration of Hyper-V failover clustering and also Storage Spaces. However, I was not able to find any demonstrations of the failover process. I decided to set up a cluster and create a demo video.

First I will explain the lab environment I used. It is a basic cluster-in-a-box using a virtualized Hyper-V failover cluster on ESXi 5.5. Hyper-V on Server 2012 R2 does not support nested virtualization, but Microsoft says it’s coming in Windows Server 2016. VMware ESXi 5.5 does support nested virtualization with a few small tweaks which you can read about on Derek Seaman’s Blog. Here is the basic setup I used.

Here is the VMware configuration of each Hyper-V virtual host.

There are a couple of items to call out here. The clustered disks use a different SCSI controller from the OS disks, and that controller is set to allow SCSI bus sharing for the virtual disks. Hard Disk 2 is used as a disk witness for the cluster quorum, and Hard Disks 3-8 are used for the Storage Spaces pool and virtual disk. This volume becomes the cluster shared volume for virtual machine storage. The screenshots below show this configuration.
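
For reference, a rough PowerShell sketch of an equivalent Storage Spaces pool, virtual disk, and cluster build; the names and addresses are placeholders and the exact sequence in my lab may have differed.

    # Create a Storage Spaces pool and mirrored virtual disk from the shared disks (3-8)
    $disks = Get-PhysicalDisk -CanPool $true
    New-StoragePool -FriendlyName 'HVPool' -StorageSubSystemFriendlyName '*Storage Spaces*' `
                    -PhysicalDisks $disks

    New-VirtualDisk -StoragePoolFriendlyName 'HVPool' -FriendlyName 'CSV01' `
                    -ResiliencySettingName Mirror -ProvisioningType Fixed -UseMaximumSize

    Get-VirtualDisk -FriendlyName 'CSV01' | Get-Disk |
        Initialize-Disk -PartitionStyle GPT -PassThru |
        New-Partition -UseMaximumSize -AssignDriveLetter |
        Format-Volume -FileSystem NTFS -NewFileSystemLabel 'CSV01'

    # Build the two-node cluster and promote the new disk to a Cluster Shared Volume
    New-Cluster -Name 'HVCLUSTER' -Node 'HV01', 'HV02' -StaticAddress '192.168.1.50'
    Get-ClusterAvailableDisk | Add-ClusterDisk
    Add-ClusterSharedVolume -Name 'Cluster Disk 1'     # name assigned when the disk was added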

The Hyper-V failover cluster is running one Windows 7 virtual machine. In the demo video below, I will show the failover process for this VM by simulating the loss of the active cluster host.

One aspect of this configuration which has some real benefits is that it is a low-cost and highly available virtualization solution that does not require any additional software other than Windows. While the cluster configuration in VMware is simpler, it requires the use of vCenter and the additional cost that comes with it.

Regards,

Dave

SQL Server, PowerShell, and XtremIO Snapshots Part 3 (VSS Native)

During the writing of this post EMC announced the GA release of XtremIO 4.0 on July 2nd. The new documentation and the native VSS provider are now available on EMC support. This will provide the ability to script application consistent snapshots without using AppSync. Unfortunately we do not have our lab upgraded to 4.0 yet, but that will be coming soon, and I will test the script and process then. In the meantime I will talk about how this will provide another way to do application consistent snapshots for XtremIO. I will show the architecture and a mock script of how I think it will work at this point.

In order to use the XtremIO VSS provider it must be installed on the server where we want to do an application consistent snapshot. The installer is downloaded from EMC support; at the time of this post the file name for the latest version is XtremIOVSSProvider-1.0.8.msi. After installation the connection to XtremIO is configured through the control panel using the applet.

The control panel applet is only to configure the connection to the XtremIO XMS.

The VSS Provider installation can be verified by opening a command line as administrator and typing vssadmin list providers. This shows us the XtremIO VSS Provider.

The process to use an XtremIO snapshot for a database copy using the VSS provider is very similar to the process used in part 1 of this blog series. The primary difference is the VSS Provider is called to create the XtremIO snapshot. The following image shows the basic VSS architecture.

The test environment is a SQL Server virtual machine on vSphere. The SnapTest01 volume is on a 50GB RDM on XtremIO and the SnapTest01_QA volume is a snapshot of the SnapTest01 volume.

 

The example script will show the process to refresh the QA volume with a new snapshot copy. The script is almost identical to the Part 1 script. The first step is to load a few PowerShell modules, define some constants, and connect to vCenter and XtremIO. This is done by using PowerCLI and a function from my MTSXtremIO module, which you can read about here. This function uses the XtremIO REST API to create the snapshot. I also use a couple of other modules with some of my common functions, as well as an NTFS security module which I did not write. I will put links to those at the end of the post.
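
A sketch of that setup step; the server names, paths, and credentials are placeholders.

    # Load PowerCLI, my modules, and the SQL Server Management Objects
    Add-PSSnapin VMware.VimAutomation.Core
    Import-Module MTSXtremIO
    Import-Module MTSAuthentication
    Import-Module NTFSSecurity
    [void][Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.Smo')

    # Constants for this environment
    $vCenterServer = 'vcenter01.lab.local'
    $sqlServer     = 'SQL01'
    $dbName        = 'SnapTest01_QA'

    # Connect to vCenter and XtremIO
    Connect-VIServer -Server $vCenterServer
    Set-XIOAPIConnectionInfo -Hostname 'xms01.lab.local' -Username 'admin' `
                             -PasswordFile 'C:\Scripts\xiopwd.txt'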

The example above loads module dependencies and connects to vCenter and XtremIO. The SQL Management Objects are loaded to provide SQL Server functionality.

The next step is to detach the current QA database copy, remove the virtual hard disk, and remove snapshots.
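
A sketch of that step; the SMO and PowerCLI calls are standard, while the device name, snapshot name, and Remove-XIOSnapshot parameters are assumptions.

    # Detach the QA database using SQL Management Objects
    $smoServer = New-Object Microsoft.SqlServer.Management.Smo.Server $sqlServer
    $smoServer.KillAllProcesses($dbName)
    $smoServer.DetachDatabase($dbName, $true)

    # Remove the RDM that backs the QA volume from the virtual machine
    $qaDeviceName = 'naa.514f0c5e2f800012'      # placeholder device id of the QA snapshot volume
    $vm = Get-VM -Name $sqlServer
    Get-HardDisk -VM $vm -DiskType RawPhysical |
        Where-Object { $_.ScsiCanonicalName -eq $qaDeviceName } |
        Remove-HardDisk -Confirm:$false

    # Delete the old snapshot on the array (parameter names assumed)
    Remove-XIOSnapshot -Name 'SnapTest01_QA'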

The example above uses SQL Management Objects to access SQL Server and detach the database. It then uses the VMware PowerCLI to remove the RDM from the virtual machine. Then it connects to XtremIO via the REST API and deletes the snapshots.

Now we are ready to create a new snapshot, add it to the LUN map, add the disk to the VM, and attach the database.
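
A sketch of that final step; the VSS snapshot request is illustrated here through DISKSHADOW with the XtremIO hardware provider, and the .dsh contents, LUN mapping parameters, and file names are all placeholders.

    # Request an application-consistent snapshot through the XtremIO VSS hardware provider.
    # SnapTest01.dsh would contain something like:
    #   SET CONTEXT PERSISTENT
    #   ADD VOLUME S: PROVIDER {xtremio-provider-id}
    #   CREATE
    diskshadow /s C:\Scripts\SnapTest01.dsh

    # Map the new snapshot volume to the ESXi hosts via the XtremIO REST API (parameters assumed)
    New-XIOLunMap -Name 'SnapTest01_QA' -InitiatorGroup 'ESXCluster01' -HostLunId 21

    # Rescan storage and add the RDM back to the virtual machine
    Get-VMHost | Get-VMHostStorage -RescanAllHba | Out-Null
    New-HardDisk -VM $vm -DiskType RawPhysical -DeviceName "/vmfs/devices/disks/$qaDeviceName"

    # Attach the database from the refreshed volume using SQL SMO
    $dbFiles = New-Object System.Collections.Specialized.StringCollection
    [void]$dbFiles.Add('Q:\Data\SnapTest01.mdf')
    [void]$dbFiles.Add('Q:\Logs\SnapTest01_log.ldf')
    $smoServer.AttachDatabase($dbName, $dbFiles)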

The above example creates a snapshot via the VSS provider and maps the volume to the host using the XtremIO REST API. It also rescans the disks and then adds the RDM to the virtual machine. Then the database is attached using SQL SMO.

I have not been able to test this code yet, as we need to upgrade XtremIO first. I had hoped the VSS provider would work with 3.0, but unfortunately I received the following message when I tried it.

“The registered provider does not support shadow copy of this volume”

Hopefully this example is pretty close.

Regards,

Dave

MTSXtremIO Module
NTFSSecurity Module
MTSAuthentication Module