Tuesday, August 9, 2011

Why move to Fibre Channel over Ethernet? Actually… no reason.

--- warning --- This is an old article from '09. Some of this may have changed since then, but since there is some talk about FCoE now that others are saying it is dead, I figured I'd repost it (or maybe post it for the first time, since I can't seem to find it anywhere). --- Rich


I’m a big fan of new technology. Typically it makes a lot of sense and allows us to run a very efficient IT organization. Efficient, to us, means more than cheap: though our costs are lower than the average in our industry, it also means providing great business value. Our main metric is really what our internal customers think of IT. Our goal is that if anyone in the company is asked how the IT department is doing, we hear back “They’re great.”

Since we are normally early adopters, we tend to do a lot of testing and benchmarking ourselves. We deployed Microsoft Windows XP to 100% of the company by the time it was shipping. We have been running Office 2007, with SharePoint integration and Unified Communications, for years. We have some 10Gb Ethernet deployed, but for every piece of latest-and-greatest technology we have rolled out, we spent months testing in the lab and checking the numbers to prove the investment made sense.

I was excited to start hearing about Fibre Channel over Ethernet, or FCoE, but the more we peeled back the onion the more we cried. It just doesn’t make sense to us. I’m sure it must for some companies, but we can’t make it make sense for us. The following article explains why we just don’t see this as a good fit; your mileage may vary, as the saying goes. We are not paid to do technology analysis; this is just our analysis of why it won’t work for us and why we went to iSCSI instead.

Talking to experts and reading a ton of web pages, we were told there were a few key advantages: power savings, security, better performance, and less cabling. These advantages would supposedly overcome the higher initial purchase costs, the training curve, and the upgrades required to get it running. Let’s jump into our testing and see how each claim held up.

FCoE requires new hardware; iSCSI can use existing hardware, or new hardware if you want to scale

FCoE, according to the experts, requires a big investment to get working. You need to replace the standard network interface cards (NICs) in the servers with Converged Network Adapters, or CNAs. That costs money and, of course, downtime to install them. You also need to upgrade your data center network to the new, non-standard, Converged Enhanced Ethernet. If you want to reduce costs you could upgrade just the part of your data center that is going to use FCoE and run all the FCoE ports to the upgraded LAN, but then you lose the ability to plug in anywhere and have it work, which complicates the cabling. Wasn’t simpler cabling one of the points of this?

Starting to use iSCSI requires the interfaces on the disk storage to change, but that’s it. iSCSI runs over traditional data center networks, and servers can typically just install an iSCSI initiator, which is basically a driver, and start using the storage right away. If you need better performance, you can upgrade the network cards to ones that support a TOE, or TCP Offload Engine, if they don’t already. You should test them in the lab before buying, since some of them can actually make performance worse, not better.
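
To give a feel for how little setup that is, here is a rough sketch of the initiator side. It assumes a Linux host with the standard open-iscsi tools installed; the portal address is made up for illustration:

```python
# Minimal sketch: discover and log in to an iSCSI target with the
# standard Linux open-iscsi CLI (iscsiadm), driven from Python.
import subprocess

PORTAL = "192.168.50.10"  # hypothetical portal IP on the storage array


def run(args):
    """Run a command, return its stdout, and raise if it fails."""
    return subprocess.run(args, capture_output=True, text=True, check=True).stdout


# Ask the array which targets it offers (SendTargets discovery).
print(run(["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", PORTAL]))

# Log in to the discovered targets; the LUNs then appear as ordinary
# block devices (e.g. /dev/sdX), ready to partition and format.
run(["iscsiadm", "-m", "node", "--login"])
```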

iSCSI is easier to manage than FC and FCoE

We used to be a Fibre Channel shop, but every time we wanted to make a change, it seemed to involve a billable support call. Maybe we just weren’t storage experts, or maybe we were too nervous; I don’t know. It’s a lot to manage: the worldwide names, zoning, interconnects, etc.

I do know, though, that when we put in the iSCSI storage we wanted to set up replication between our two data centers to prepare for a consolidation of the two. I decided to “grease the skids” a little and called our sales team to see if they would volunteer a local sales engineer to help us do this. The iSCSI storage was still new, so getting some outside advice seemed prudent. They agreed, but when I went to the storage admin to let him know, he laughed.

“I don’t need any help, it’s already done. You right-click on the volume and choose where you want it to replicate. It should be done in a few minutes.” I was shocked; very relieved, but shocked that it was so easy.

FCoE is a better protocol than iSCSI when combined with the new Converged Enhanced Ethernet

The new Converged Enhanced Ethernet, or CEE, that you need to run FCoE is lossless and blocking, which sounds cool, but it means that if the switch gets busy, all of the ports stop accepting traffic. I’m not sure I want the traffic in my data center to stop, especially on all the ports of a switch. Traditional Ethernet will drop packets, which sounds worse, but since the overlying protocols expect this, they simply retransmit or back off as needed. This is how Ethernet has worked for over 30 years, so it’s pretty well proven. Don’t mistake CEE for Ethernet: it has the word Ethernet in the name, and that’s about where the similarities end. It’s like calling Catwoman a cat.

You can work around this by using the new data center Ethernet only for your SAN ports and leaving the rest of the network on traditional Ethernet. Of course, this means you have twice as many cables and essentially a separate network for storage, which is exactly what FCoE is supposed to get rid of. Confused yet?

The FCoE folks will say they are more reliable because their network is lossless, but in our testing this had no impact. In fact, last week we had a power outage and accidentally took down half of our iSCSI storage. Since this was the first time we had taken it offline without a clean shutdown, we were concerned about data corruption. Out of 100 virtual machines that were running when the storage went away, we had zero data corruption. Once we fixed the power issue and spun the storage back up, we just restarted the virtual machines and were back up and running.

Also, even though many people think of TCP/IP and Ethernet as one technology, they are separate. You can run Ethernet with other protocols, like IPX; in fact, the FCoE standard defines a new EtherType. Why does this matter? iSCSI, which runs over TCP/IP, can be routed; FCoE can’t. While I’m not sure you would want to run iSCSI over a WAN, you could do it in a pinch, or at least route it internally in your data center. Personally, I like having the option.
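
To make the routing point concrete: iSCSI is ordinary TCP traffic (port 3260 by default), so if IP can route a path to the portal, iSCSI can use it. A quick sanity check is a plain TCP connect; the sketch below assumes a hypothetical portal sitting on a different subnet than the host:

```python
# Because iSCSI rides on TCP/IP (default port 3260), a plain TCP
# connection across a router proves the path works. FCoE, living in
# its own EtherType below IP, cannot cross a router at all.
import socket

PORTAL = ("10.2.0.10", 3260)  # hypothetical portal on another subnet

try:
    with socket.create_connection(PORTAL, timeout=5):
        print("Routed path to the iSCSI portal is open.")
except OSError as exc:
    print(f"No route, or the portal is not listening: {exc}")
```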

FCoE has better theoretical performance

I suspect the overhead of TCP and IP makes this true, but iSCSI is good enough. We run our entire organization on iSCSI over straight 1GbE. The way SANs are designed now, there are many 1Gb links, so getting 20Gb of aggregate performance with regular gigabit Ethernet is easy. If you use 10Gb instead of 1Gb links, you will be waiting on the disks or the server long before you are waiting on the network, even if you need to retransmit a few bits here and there.

In fact, we ran some test builds against both our previous Fibre Channel solution and our new iSCSI solution. Engineers hate waiting for a build to finish, so making builds take longer wasn’t going to be an easy sell. When we first met with them they said, “If it’s slower, we aren’t moving.” Luckily for us, we matched the performance, which made it an easy sell. What we did not do, though, was test against a new FC array, so the comparison is somewhat skewed, but it does show that iSCSI performance was more than adequate for our needs. SAP, Exchange, and SQL data rates also showed similar performance.
We looked at our iSCSI ports to see how many retransmissions, if any, there were. While we did see some, the percentage of packets we had to retransmit was 0.000007 percent on the worst port. That is hardly enough to worry about.
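
If you want to check your own numbers, the kernel keeps the counters for you. A rough sketch, assuming a Linux host where /proc/net/snmp exposes the TCP OutSegs and RetransSegs counters:

```python
# Rough sketch: compute the TCP retransmission percentage from the
# kernel's counters; the same arithmetic we applied per switch port.
def tcp_retransmit_pct(snmp_path="/proc/net/snmp"):
    with open(snmp_path) as f:
        rows = [line.split() for line in f if line.startswith("Tcp:")]
    header, values = rows[0], rows[1]  # first Tcp: row is names, second is counts
    stats = dict(zip(header[1:], (int(v) for v in values[1:])))
    return 100.0 * stats["RetransSegs"] / stats["OutSegs"]


if __name__ == "__main__":
    print(f"TCP retransmissions: {tcp_retransmit_pct():.6f}%")
```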

Power savings

I think this argument is more about Fibre Channel versus Fibre Channel over Ethernet, but seriously, how much power does a single card in a server use? As it turns out, it’s around 5 watts; at $0.12 per kWh, that equates to about $5.26 USD a year to run. Not much of an ROI, unless you plan to keep it for, say, 100 years…

I guess you could also compare the power needed to run a Fibre Channel switch versus an Ethernet switch. A Brocade DCX with 48 8Gb Fibre Channel ports uses 1,337 watts. A 3Com 8814 with 48 10Gb ports uses 1,620 watts, or 283 watts more; in money terms, that’s around $297.50 a year at our average cost of $0.12 per kWh.
Again, there are definitely some power savings to be had, but not really enough to be a big factor in the decision.
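
The math behind both numbers is simple enough to show in full; a quick sketch using our $0.12/kWh rate:

```python
# Annual running cost of a device drawing a constant load:
# watts -> kilowatt-hours per year -> dollars.
RATE_PER_KWH = 0.12        # our average electricity cost
HOURS_PER_YEAR = 24 * 365  # 8,760


def annual_cost(watts):
    return watts * HOURS_PER_YEAR / 1000 * RATE_PER_KWH


print(f"5 W CNA:            ${annual_cost(5):.2f} per year")    # ~$5.26
print(f"283 W switch delta: ${annual_cost(283):.2f} per year")  # ~$297.49
```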

FCoE is more secure

Sorry, I don’t believe this either. With network policy you can dynamically restrict (or allow) access so that only devices that need to reach other iSCSI devices can do so: only registered clients can talk to the storage, and only on the ports iSCSI requires. Arguably FC would be more secure, since it’s physically separate, but you could also build a physically separate LAN using traditional switching.
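
As an example of “only registered clients”: most arrays also support CHAP authentication on top of the network policy. Here is a sketch of enabling it from the initiator side, again assuming the Linux open-iscsi tools; the target name and credentials are placeholders:

```python
# Sketch: require CHAP authentication for an iSCSI session by updating
# the open-iscsi node record. Target IQN and secret are placeholders.
import subprocess

TARGET = "iqn.2009-01.com.example:array1"  # hypothetical target IQN


def set_node_param(name, value):
    subprocess.run(
        ["iscsiadm", "-m", "node", "-T", TARGET,
         "-o", "update", "-n", name, "-v", value],
        check=True,
    )


set_node_param("node.session.auth.authmethod", "CHAP")
set_node_param("node.session.auth.username", "initiator01")
set_node_param("node.session.auth.password", "use-a-real-secret")
```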

Of course, this is the same argument we had about VLANs back when they were new, and yes, physically separate LANs are arguably more secure than VLANs, but who would run a network without VLANs today?

FCoE uses less cabling and is easier to manage

Um, how’s that again? If I switch to iSCSI instead of FC, the cabling will be the same: a pair of 10GbE links and a remote management port. I’ll agree that you will have fewer cables than with straight Fibre Channel, but if you’re going to move anyway, why compare to that?

Secondly, if the cables are already run, labeled, and dressed in, who cares? Frankly, if it’s already done, wouldn’t it be more work to pull the old cables out, run new ones, label them, update the documentation, and clean them up? Sounds like it to me.
I’ll agree that you will use fewer cables going to 10Gb Ethernet instead of 1Gb, but that’s a networking discussion and really has little to do with the protocols running on top of it.

Summary

The main reasons I hear to switch to FCoE (power savings, security, better performance, and less cabling) didn’t hold water in our testing. In addition, iSCSI is easier to manage, more flexible, and cheaper to start using. Oh, and we can use it now.
For us the decision came down to total cost. iSCSI performed better than we needed, and we didn’t need to spend a lot of cash to get it in. We could use our existing servers and network and simply upgrade the storage. Since we were going to do that anyway, changing from Fibre Channel made sense.

We also wanted to change now. FCoE and the new enhanced Ethernet aren’t ready yet; in fact, most of the Converged Enhanced Ethernet standards won’t even be ratified for a year, and deploying pre-standard products can mean a reinstall when the standard finally does come out.

Some companies, or more accurately some storage administrators, may be hesitant to switch to iSCSI and will invariably come up with reasons not to. Religious battles aside, iSCSI is worth a serious lab test. You can switch now, get 10Gb Ethernet, reduce your cabling costs, and get the benefits today.

I can see why Brocade and Cisco like it. Brocade is rooted in Fibre Channel; without it they lose a lot of revenue and market share, and FCoE is a great way for them to better position the products they got when they purchased Foundry. Cisco, on the other hand, has to create a new market just to be able to keep growing and keep its stock price climbing, or at least staying level. What better way than to convince us we all need a new Cisco network? Maybe this time they’ll throw in the forklift needed to change everything out. For the price, they could.



