Thursday, November 4, 2010

Tips to build and move a data center

I’ve been lucky enough to move our data center twice in six months, or three times in two years, and I learned a few things along the way. The last two times it was back to an in-house data center, so we were responsible for the build-out as well. That included working with contractors to build the walls, the power company to upgrade the service, inspectors and, of course, the in-house facilities group.

If you haven’t experienced the fun of loading your entire company’s infrastructure on a truck in February in New England and watching it drive away, hoping the roads aren’t icy, you may be able to learn from some of the headaches I’ve had.

Building the data center
  1. By assuming the risk of time and materials, you can save a significant amount of money. However, unless you truly have the time to manage the project closely, a fixed-bid price may be best.
  2. Used equipment can be a good way to reduce costs. For big-ticket items, however, a professional inspection and service check is a good insurance policy.
  3. If a vendor has promised something three times and, for whatever reason, backed out each time, cancel and go somewhere else. You’re getting the runaround.
  4. Don’t forget inspectors. If you are doing anything that requires an inspector to sign off, make sure to include them at the beginning of the project. It’s much easier to change the design on paper than after it is all built.
  5. Make sure you have time to test and resolve power and cooling issues, preferably not the same weekend you are moving.
  6. The last week is the worst. Everyone will be stressed out, trying to hit their deadlines. Part of the job is keeping everyone from getting on each other’s nerves.
  7. Don’t get oversold on “green”. Being super energy efficient is great, but it is not always worth the extra money. We had planned to design our data center retrofit around a very efficient cooling layout, but doing so would have cost $150,000 in demolition and ductwork. The ROI just didn’t make sense.
  8. Meet often, but with the right people. We spent close to $5,000 because people kept interrupting the HVAC mechanics to ask when the cooling would be ready. One person needs to be the point of contact, and everyone goes through them. Otherwise you get rumors, distractions and all sorts of technical and political issues.
  9. Construction is disruptive. To avoid upsetting one group, make sure you annoy everyone equally.
  10. Plan for other damage and issues to crop up. Something as simple as carrying piping materials can cause damage to the walls. It’s really easy to damage sheetrock walls with a 10’ long metal pipe. Expect to spend some money to fix these later, or make sure it is in the contract that the vendor will do it.
  11. Always get at least three bids. Unless the job is so small that the bidding process will cost more in internal labor than it saves, take the time. The easiest way to save money is to negotiate for it, and the best time for that is before you assign the work. Plus, different vendors may suggest ways to cut costs. If so, pass those suggestions along to the other vendors to see if they make sense.
  12. If possible, ask your vendors for the parts lists and comparison shop. Many times they work with a favorite supply house that may not always get the best price, especially if they just mark up the price and pass it along. Comparison shopping shows that you are involved in the project and that you are very serious about saving money. Contractors will typically mark up the materials 10%, so if you find a better price and ask them to order it, expect a slight increase over the price you found.

Moving the data center
1.       Plan every detail of the move. When we plan, we have it down to the minute: who is going to work on which cabinet, and when each server will be shut down, unracked, moved, racked, cabled, powered up, etc. Be prepared, though, to throw the plan out. Things never go according to plan, so don’t get hung up when things slip. Having gone through the planning process, you will be so familiar with what needs to happen that you will be able to make good “heat of the moment” decisions.
2.       Make sure you order food. Not everyone likes the same thing, so make sure you get your order in, otherwise you could get stuck with nothing but cheese or Hawaiian pizza. No offense to the three people who actually prefer these foods, but seriously pineapple on pizza?
3.       Only have one copy of the documentation. If everyone has a different copy of the documentation, throw all of it out and go without it. You won’t be any worse off and at least then you can blame it on lack of documentation.
4.       Make sure the documentation isn’t on a server that is being moved. Trust me, if it’s on the server you are moving, it won’t do any good.
5.       Communicate outages clearly and often. One of the worst things that can happen is for one of the VPs to call 10 minutes after the truck has left, claiming they didn’t know about it and need to ship a ten million dollar order.
6.       Plan for people downtime. I’ve heard the whole “Sleep is for the weak” argument, but at some point you cause more damage than you do work. Taking a break and getting some sleep will make you more effective.
7.       Double check that the tools and other supplies (moving carts, paper, pencils, tape, etc.) are available. Typically when data centers are moving, no one else is around and all the stores are closed. Getting a roll of tape on a Tuesday afternoon is easy; getting one at 3:00 AM on a Saturday in the middle of nowhere is not.
8.       Go over the process ahead of time. A walkthrough the day before and again an hour before will make sure people don’t get confused and start putting servers back where they came from.
9.       Servers take 5 minutes to rack and 5 minutes per cable with two people. Really. I questioned the time once and the team had me do a cabinet (pre-move, of course). It really does take that long. Also, that is 5 minutes per cable. If you have a server with 8 Ethernet ports, 2 power supplies and an out-of-band management port, it really will take an hour. Any less and it will look awful.
10.   When planning the schedule, ensure that people aren’t all working on the same cabinet at the same time. It won’t work.
11.   Make sure everyone understands the port numbering. If the switches go 1-24 across the top row and some people think it is odd on the top and even on the bottom, you will have problems.
12.   Verify the documentation before you move. It’s much easier to write down which cable goes to which port when it is still plugged in, than it is to remember where it came from. Have it done once, then have someone else double check it. It really is that important.
13.   Mind the people environment. Servers don’t mind the noise and like 65-degree air; people, not so much. If you can, turn the temperature up and the noise down.
14.   Use separate application test teams. By the end of the weekend you will be tired and can get sloppy. If possible, have a separate team test the environment.
15.   Startup order is important. If you try to bring up servers before the domain, or some applications before the database servers, you can cause issues. If there is an order, make sure the people racking and cabling the servers leave them off until they are ready to come up.
16.   Record issues and lessons learned. Keep track of every problem you run into and what you did to resolve it. This does two things: it reminds everyone how many hurdles the team overcame, and it helps down the road when you see the same problem.
17.   Set up a conference call to help troubleshoot issues. Make sure everyone can dial in from a cell phone and stays muted unless they are talking.
18.   Always have a plan B. If the elevator breaks, can you really carry the servers up the stairs?
19.   If you have redundancy or a DR site, use it, but only if it makes sense to. In our case we didn’t have a hot site, but if you do, and it is truly redundant, use it and move during the week. It will make the move less stressful and test your DR.
20.   Sometimes things just break. If you have something that is broken, don’t automatically assume it is move related. It probably is, but don’t assume it.
21.   If you can, reboot all the servers before the move. This helps find startup issues like patches that were downloaded but not installed.
22.   Have a priority list. You may run out of time, so make sure you know what applications and servers need to be up.
23.   All the teams are team members. Many times other departments, contractors or subcontractors will help. Treat them as if they were your own employees. The success of your project depends on them too.
24.   Celebrate after. Moving a data center is a huge undertaking. Take the time to recognize the team for a great job. You never know when you will have to do it again.
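
Two of the tips above are easy to turn into a quick sanity check while building the schedule: the per-server time estimate (tip 9) and the startup order (tip 15). The sketch below is only an illustration, not the tooling we used. The function names and the three example systems are made up for the example, and it assumes Python 3.9 or later for the standard `graphlib` module. The 5-minute figures come straight from tip 9.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

RACK_MINUTES = 5       # per server with two people, from tip 9
MINUTES_PER_CABLE = 5  # per cable with two people, from tip 9


def minutes_per_server(ethernet_ports, power_supplies, mgmt_ports=0):
    """Estimate minutes for a two-person team to rack and cable one server."""
    cables = ethernet_ports + power_supplies + mgmt_ports
    return RACK_MINUTES + MINUTES_PER_CABLE * cables


def startup_order(depends_on):
    """Return a power-on order in which every dependency comes up first.

    depends_on maps each system to the systems that must already be running.
    """
    return list(TopologicalSorter(depends_on).static_order())


# The server from tip 9: 8 Ethernet ports, 2 power supplies, 1 out-of-band port.
print(minutes_per_server(8, 2, 1))  # 60

# Hypothetical dependencies, for illustration only.
example = {
    "domain-controller": [],
    "database": ["domain-controller"],
    "app-server": ["domain-controller", "database"],
}
print(startup_order(example))
```

Multiplying the per-server figure across a cabinet gives a rough floor for that cabinet’s slot in the schedule, and the startup list tells the people racking and cabling which machines to leave powered off until their turn.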

One last tip I’ve learned: if you are the manager for the project or team, be on site and involved in the move, but not in the way. Don’t try to help with the move itself; instead get coffee, food and snacks, coil up the old cables, sweep the floor, etc.

Let’s be honest, you probably aren’t that much help, but having you there emphasizes how important the project is. Besides, if it goes really badly, at least you know you can sleep in late on Monday. I mean, you’re probably going to get fired anyway, so you might as well be rested.
