Wednesday, November 9, 2011

Preparing for migrating to OM12: Moving from SQL 2005 to SQL 2008 – Part I: Along came a theory

----------------------------------------------------------------------------------
Postings in the same series:
Part  II – The Preparations
Part III – The Migration  
----------------------------------------------------------------------------------

As announced earlier, this is the start of a new series of blog postings, all about moving from SQL Server 2005 to SQL Server 2008 in order to prepare for migrating from SCOM R2 to OM12 RTM (expected in 2012).

In this first posting of the series I will discuss the theory behind it: the required steps and procedures for a smooth transition with an absolute minimum of downtime for your SCOM R2 environment.

The Lab

Especially for this series I have built myself a new SCOM R2 CU#5 environment, based on two W2K03 SP2 servers.

One server is a DC for the AD and hosts the RMS Role. The second server is the SQL Server and hosts – besides both OpsMgr databases – the Reporting Server role for the SCOM MG as well.

Some basic MPs (Server OS, IIS and SQL) are imported and tuned. Some customized Reports are created as well, which aren’t saved into an MP of their own but directly to the SSRS instance itself.

The Mission
The SQL 2005 Server will be replaced by a SQL 2008 R2 SP1 server, running on W2K08 R2 SP1. And of course, the SCOM R2 databases will be moved to this new server along with the customized Reports and their special folders.

The Theory, all about TLA and KIs
Basically this move to SQL Server 2008 R2 SP1 has three different phases. Each phase has its own characteristics, challenges and potential pitfalls.  So preparation is key here.

This is a typical case where the Traffic Light Approach (TLA) comes in handy. TLA is really simple. Per phase some Key Indicators (KIs) are defined. And per KI there are three different states (sounds almost like Monitors in SCOM and believe me, the similarity doesn’t end here):

  • Green: All is well and a go ahead;
  • Yellow: Warning. Some issues require additional attention, but in itself not a show-stopper (yet!);
  • Red: Something is not OK and must be taken care of before moving on to the next phase.

And yes, a couple of Yellows might end up as a single Red. So beware.
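For those who prefer to see it written down, here is a minimal Python sketch of how the KI states of one phase could be rolled up into a single traffic light. The names `Light` and `phase_light` are made up for this illustration, and the `yellow_limit` threshold is an assumption: how many Yellows count as a Red is something you have to decide for yourself.

```python
from enum import Enum


class Light(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def phase_light(key_indicators, yellow_limit=2):
    """Roll the per-KI states of one phase up into a single light.

    key_indicators maps a KI name to its Light.
    yellow_limit is an assumed threshold: that many Yellows count as a Red.
    """
    states = list(key_indicators.values())
    yellows = states.count(Light.YELLOW)
    if Light.RED in states or yellows >= yellow_limit:
        return Light.RED
    if yellows:
        return Light.YELLOW
    return Light.GREEN
```

With the default threshold, a single Yellow keeps the phase at Yellow, while two Yellows (or any Red) turn the whole phase Red.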

Phase I
This phase is a nice one since it doesn’t touch the SCOM R2 MG in any way. So one has time to plan, discuss, design, install and tune. This phase is all about provisioning a new SQL 2008 R2 SP1 Server based on W2K08 R2 SP1. Yes, the latest technology is being used here: this server will be in use for some years, so it’s better to start with the latest RTM/GA technology than with technologies which are already a few years old.

On the other hand, this phase has some challenges of its own as well. The installation of SQL 2008 R2 SP1 in itself isn’t that hard. The bigger challenge is what type of SQL server will be provided: stand-alone or clustered, VM or physical? Your SCOM R2 environment is based on decisions taken a few years ago, and nowadays other decisions might be taken. So make sure the correct type of SQL Server is provided.

This list of questions/steps makes up Phase I:

  1. Is the new SQL Server going to be physical or virtual? Also check what Microsoft officially recommends here;
  2. Is the new SQL Server going to be clustered or not? Involve management and let them tell you how they look upon Monitoring. Is it business critical or not?
  3. What edition of SQL Server are you going to use (Standard or Enterprise)?
  4. Provisioning of the new SQL Server, installing it with the features AND correct COLLATION settings required by OM12;
  5. Preparing the security of the new SQL Server so SCOM can access it;
  6. Adjusting the Master Database with a special script so it understands typical SCOM messages, piped into the event log;
  7. Enabling CLR on the SQL Server so Group calculation works in SCOM R2;
  8. Export of all custom RDLs residing on the current SQL 2005 server, used by SCOM;
  9. Make sure the SCOM environment is healthy. Check SCOM (in the Console and OpsMgr event logs on the RMS and MS servers) and solve any serious issues before moving on to Phase II;
  10. DON’T deploy a SCOM R2 Agent on this server!!!!
  11. Make a list with the SCOM service accounts and their passwords, required in Phase II.

This phase has these five KIs:

  • The correct collation settings;
  • The correct SQL Features (required by OM12);
  • The correct security settings for SCOM;
  • The script run against the Master Database;
  • CLR enabled on the SQL Server.

Only when all KIs are green is it time to move on to Phase II.
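The go/no-go decision for Phase I can be written as a simple checklist. The sketch below is only an illustration: the function names and the plain "green"/"yellow"/"red" strings are my own choice, and a KI without a recorded result is treated as a blocker.

```python
# The five Phase I KIs from the list above.
PHASE_I_KIS = (
    "collation settings",
    "SQL features required by OM12",
    "security settings for SCOM",
    "script run against the Master database",
    "CLR enabled",
)


def phase_one_blockers(results):
    """Return the KIs that are not green; a missing result counts as a blocker."""
    return [ki for ki in PHASE_I_KIS if results.get(ki) != "green"]


def ready_for_phase_two(results):
    """Phase II may only start when no blockers are left."""
    return not phase_one_blockers(results)
```

Calling `phase_one_blockers` with the results so far gives you the list of items that still need work before Phase II can start.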

Phase II
This is the phase where things get real. The SCOM databases and Reporting functionality are moved from the current SQL Server to the one provisioned in Phase I. So this touches your SCOM R2 environment BIG time!

This list of steps makes up Phase II:

  1. Backup of your SCOM environment (SQL, RMS, MS servers and databases). This step, and not Step 5, enables the fall-back scenario, since the SCOM R2 environment is already affected by the time Step 5 is run;
  2. Removal of the SCOM Reporting functionality (we KEEP the Data Warehouse of course!);
  3. Stopping of the Health Service on RMS and all MS servers;
  4. Stopping the Configuration and SDK service on the RMS;
  5. Backup of the SCOM databases, including OpsMgr, OpsMgrDW and Reporting (because no more data comes in, these backups are the most current ones and can’t become outdated);
  6. Stop the SQL Server service (and all other related SQL services) on the old SQL Server;
  7. Restore of the SCOM databases OpsMgr and OpsMgrDW on the new SQL 2008 R2 SP1 server;
  8. Adjustment of the registry keys on the RMS, so the new SQL server is used;
  9. Adjustment of the registry keys on all MS servers, so the new SQL server is used;
  10. Adjustment of some entries on both SCOM databases on the new SQL Server so the new SQL server is present in the database;
  11. Starting all SCOM services on the RMS;
  12. Checking the OpsMgr event log for any error;
  13. Launching the OpsMgr Console on the RMS to see whether all is OK;
  14. When all is well, starting the Health Service on the MS servers, one by one and checking the OpsMgr event logs on those servers whether all goes well;
  15. Installation of SCOM Reporting on the new SQL server;
  16. Installation of CU#5 on that SQL Server as well;
  17. Checking SCOM Reporting;
  18. Checking the successful upload of data into the Data Warehouse (OpsMgr event logs of the RMS and MS servers);
  19. Restore of exported RDLs (Phase I, Step 8);
  20. Checking whether all Reports show up again in SCOM (this might take an hour or so).

This phase has different KIs where every single KI is capable of creating a RED traffic light, thus stopping the whole Phase II in total…

  • Step 1: these backups MUST be validated! Otherwise no fall back scenario is in place;
  • Step 5: these backups MUST be validated as well. Otherwise no move to the new SQL Server can be made;
  • Step 7: The restores must be successful and give you healthy databases;
  • Steps 8, 9 and 10 must be OK otherwise SCOM won’t work;
  • Step 13: SCOM Console must be running OK, otherwise the SDK service is having issues with the database connectivity;
  • Step 14: Are the MS servers OK and in working condition?
  • Steps 15 and 16: Is Reporting successfully installed, CU#5 included?
  • Step 20: Is SCOM Reporting functioning properly?
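The idea that a single failed KI stops the whole phase can be sketched as a small runbook runner in Python. Everything here (`Step`, `run_phase`, the `is_ki` flag) is hypothetical and only meant to show the principle; the actions are placeholders for whatever check or task you perform in that step.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], bool]  # returns True when the step succeeded
    is_ki: bool = False         # a failed KI step means a red light


def run_phase(steps: List[Step]) -> List[str]:
    """Run the steps in order and stop at the first failed KI step.

    Returns a log with one line per executed step.
    """
    log = []
    for number, step in enumerate(steps, start=1):
        ok = step.action()
        log.append(f"{number}. {step.name}: {'ok' if ok else 'FAILED'}")
        if not ok and step.is_ki:
            log.append(f"RED: stopped at step {number}")
            break
    return log
```

A failed step that isn’t marked as a KI is only logged and the run continues, while a failed KI step ends the run on the spot, so the later steps are never touched.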

Phase III
This phase is the wrap-up. All is OK and running smoothly, otherwise TLA would have stopped you and kept you in Phase II. But since you’re here I presume you’re in good shape.

This list of steps makes up Phase III:

  1. Push a SCOM R2 Agent to the new SQL Server;
  2. Run the SCOM R2 environment with its new SQL 2008 R2 SP1 server for a week and check it for any glitches;
  3. Run some tests, like creating Reports, building Groups based on dynamic population (are these Groups properly populated afterwards?) and so on;
  4. Talk with the end-users of the SCOM environment in order to get an idea how they see it, ask open questions.

When all is well in these Steps, it’s time for some cleaning and sharing:

  1. Decommission the old SQL server;
  2. Make a high-level report for Management so they know what has been done with what results.

And now it’s time for a beer or glass of wine!

In the next posting I will write about preparing the new SQL server as described in Phase I. No, it won’t be a detailed step-by-step guide, but it will highlight the most important steps and refer to the sources, like where to get that script to be run against the Master database and so on. So stay tuned and see you all next time!
