Tuesday, September 6, 2011

How people want to manage networks

I was watching a cool video with Marc Benioff and Eric Schmidt (CEOs of salesforce.com and Google; you may have heard of them), and one difference they highlighted between Microsoft and Apple was that Apple focused on the consumer experience more than Microsoft did. Now, I'm not about to comment on that, but it did get me thinking: how do network managers want to manage networks?

I mean, the way it works now, managing networks is a lot of work. Typically you have a ton of devices that you need to configure manually, or at least separately, using a "command line interface" or CLI. The commands are usually pretty cryptic, though there is usually a help key, like ?, to make things easier. The problem is that you usually need to do this on every device to make a change or view what's going on.

Imagine a scenario where you get a call from a user in a remote site saying "SAP is slow". Generally that prompts a lot of questions, like:

"What site are you in?"
"Are other applications running slowly?"
"Is anyone else having the same issue?"
and my favorite
"Have you rebooted?"

Now even once you get these questions answered, typically a network engineer will need to find their laptop and boot it up (which can take a few minutes), connect to a network or cellular modem, fire up VPN and then start to look around at the site to see what's going on.

If this is an after-hours page, those minutes can seem like a long time (especially at, say, 3:00 AM when you are trying to be quiet so you don't wake the rest of the house). Then you need to start "telnetting" to different routers and switches to figure out what's happening. Many times it's as simple as a bad cable or port, and simply changing that can fix it, but it's not always the port the user is plugged into. Sometimes it is a port "upstream", which can take longer to find.

Other times, it's a simple issue of too many people using the link, sometimes appropriately, sometimes not. With viruses, the users may not even realize that they are using resources.

I spend a lot of time thinking about how to make these problems easier to find. I'd love to say we don't have them at Enterasys, but we do too. When we have them, though, we figure out how to fix them. I mean really fix the underlying issue, like why it takes so long to figure out that someone closed a fiber cable in a door in Ireland.

What we came up with is called isaac. With isaac, instead of having to boot up a laptop, connect in through VPN and then start troubleshooting by going to each device, I get to "chat" with my network. In the scenarios above the chat is really simple:
"Are any devices down?" I should have already been paged on these of course, but it's good to double check.
"Is the site experiencing any bandwidth issues?"
       If so, "Who is the biggest user?"
       Then, if I want to, I can stop the user from causing problems with a simple command like:
                    "ratelimit <user>", or
                    "blacklist <user>"
"Are any ports showing errors?"
     If so, where are they, so I can have a local technician replace them?

I can actually run these commands from my smartphone, or anyone else's smartphone that lets me get to Twitter or Chatter.
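To make the idea concrete, here is a rough sketch of how a chat message could be mapped to a network action. This is not how isaac is actually built. The `Network` class, the `handle_message` function, the reply wording and all of the sample data are made up for illustration, and a real system would query live devices rather than an in-memory object.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Toy in-memory stand-in for live device data (hypothetical, all values made up)."""
    devices_down: list = field(default_factory=list)
    usage_mbps: dict = field(default_factory=dict)   # user -> current usage
    port_errors: dict = field(default_factory=dict)  # "device:port" -> error count
    rate_limited: set = field(default_factory=set)
    blacklisted: set = field(default_factory=set)


def handle_message(net: Network, message: str) -> str:
    """Map one chat message to an action and return the reply text."""
    # Normalize: ignore case, surrounding spaces and a trailing question mark.
    text = message.strip().rstrip("?").lower()
    words = text.split()
    if not words:
        return "Please send a command."

    if text == "are any devices down":
        return ", ".join(net.devices_down) or "No devices are down."

    if text == "who is the biggest user":
        if not net.usage_mbps:
            return "No usage data."
        user = max(net.usage_mbps, key=net.usage_mbps.get)
        return f"{user} ({net.usage_mbps[user]} Mbps)"

    if text == "are any ports showing errors":
        bad = sorted(p for p, count in net.port_errors.items() if count > 0)
        return ", ".join(bad) or "No ports are showing errors."

    # "ratelimit <user>" and "blacklist <user>" only record the request here.
    # Note that the user name is lowercased by the normalization above.
    if words[0] in ("ratelimit", "blacklist") and len(words) == 2:
        target = net.rate_limited if words[0] == "ratelimit" else net.blacklisted
        target.add(words[1])
        return f"{words[0]} applied to {words[1]}"

    return "Sorry, I did not understand that."


if __name__ == "__main__":
    net = Network(
        usage_mbps={"alice": 12.5, "bob": 80.0},
        port_errors={"sw1:ge.1.4": 37, "sw1:ge.1.5": 0},
    )
    for msg in ("Are any devices down?", "Who is the biggest user?",
                "ratelimit bob", "Are any ports showing errors?"):
        print(f"> {msg}\n{handle_message(net, msg)}")
```

The point of the sketch is that the "chat" is just a small set of questions and commands matched against text, so it works from anything that can send a message.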

We think this is a better way. What do you think?  I'd love to get comments on how you want to manage networks. What other commands would you want to see?

