Massive Azure Outage and No Support
-
@gjacobse Account managers and billing departments and the janky support are all part and parcel of it. I'd argue it's 100% a cloud issue. Just not a technical IT issue.
-
@MattSpeller said:
@gjacobse Account managers and billing departments and the janky support are all part and parcel of it. I'd argue it's 100% a cloud issue. Just not a technical IT issue.
It's a Microsoft issue. Anyone seen something like this with Rackspace, Amazon, Digital Ocean, etc.? This is twice in a month with MS on an issue like this.
-
Now, to buy time, they are claiming that they are turning the systems on. But it has been fifteen minutes and not a single one is back on yet. We are at nine hours down now and no answers, no fixes, no nothing.
-
@scottalanmiller Totally agree it's MS that screwed up, just pointing out that this is a little talked about issue with cloud computing. Maybe it should be called "Trust in Other Companies" computing. The other companies you mention still have billing departments and account managers and they're all human and capable of (unintentionally) giving you a real bad day. How do you build reliability into the cloud from outside?
-
@MattSpeller said:
@scottalanmiller Totally agree it's MS that screwed up, just pointing out that this is a little talked about issue with cloud computing. Maybe it should be called "Trust in Other Companies" computing. The other companies you mention still have billing departments and account managers and they're all human and capable of (unintentionally) giving you a real bad day. How do you build reliability into the cloud from outside?
Well, one option is crossing cloud boundaries; a lot of companies do this. They host half on Amazon and half on Rackspace, for example.
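A rough sketch of what that can look like from the outside, assuming each provider exposes some health check you can call. The provider names and check functions here are hypothetical placeholders, not any real provider's API; this only shows the "try the next provider when one is down" idea.

```python
from typing import Callable, Optional, Sequence, Tuple

# A provider is a (name, health_check) pair. The health check is a
# hypothetical callable you would write yourself, for example an HTTP
# probe against a status endpoint you host at that provider.
Provider = Tuple[str, Callable[[], bool]]


def pick_provider(providers: Sequence[Provider]) -> Optional[str]:
    """Return the name of the first provider whose health check passes.

    Providers are tried in the order given, so list the preferred one first.
    Returns None when every provider is down.
    """
    for name, is_healthy in providers:
        try:
            if is_healthy():
                return name
        except Exception:
            # A check that raises (timeout, DNS failure, etc.) counts as down.
            continue
    return None
```

In practice the result would feed whatever does your traffic switching (DNS, a load balancer, etc.), and the data still has to be replicated across both providers for the failover to be useful.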
-
You can have multiple accounts with a single provider. Unfortunately Azure causes problems with this, and you can't safely use that technique with Microsoft because of the wacky authentication stuff that they try to do. Their authentication is why this all happened. They get so weird about how that stuff works that even they can't figure out which account is which.
-
This thread makes me glad we only have colocated systems. We did tons of research into the colo, and we are 200% happy with our choice. They have all the niceties... onsite generators, 2x power feeds, each w/ UPS redundancy, redundant network feeds.... I feel like our little, specialized cloud hosting platform we provide is more robust than MS now.
-
@RojoLoco said:
@MattSpeller said:
Stuff like this makes me glad we have (most) of our junk internal. So much trust to put in another company!
Edit: advantages to cloud are obvious, don't get me wrong.
On-prem for the WIN!!!!
Yes and no. On premises, you and your team, however small, are responsible. At the last two nonprofits I worked for, I was the sole IT person, supporting (x) staff, (x) servers and (x) locations.
I walked in on Wednesday to a server crash - I walked out of the office at 1pm on Friday.
Ideally, in a hosted arrangement, you would have more than just 2 or 3 people to work on the issue, therefore allowing the task of rebuilding / replacing a system to be balanced - sleep is a good thing to have. You start to make mistakes after so many hours of being awake and dealing with a single issue.
-
@gjacobse said:
I walked in on Wednesday to a server crash - I walked out of the office at 1pm on Friday.
The cost of being a one person show is high, no doubt. I still love it.
-
@MattSpeller said:
@gjacobse said:
I walked in on Wednesday to a server crash - I walked out of the office at 1pm on Friday.
The cost of being a one person show is high, no doubt. I still love it.
I really love my team. That's not to say I didn't enjoy being the sole person. But sometimes you need to have a second pair of eyes to catch that obscure error you missed looking over the logs for the 24th time.
-
It has been forty minutes since they told us that they were going to turn on the systems after everything was "resolved" and not a single system is on yet. We've got all kinds of people sitting around working on this wasting our time and they can't be bothered to do anything. This is insane. At this point they are just flat out lying to us and think we are idiots who can't tell or will forget or something.
-
@gjacobse said:
@RojoLoco said:
@MattSpeller said:
Stuff like this makes me glad we have (most) of our junk internal. So much trust to put in another company!
Edit: advantages to cloud are obvious, don't get me wrong.
On-prem for the WIN!!!!
Yes and no. On premises, you and your team, however small, are responsible. At the last two nonprofits I worked for, I was the sole IT person, supporting (x) staff, (x) servers and (x) locations.
I walked in on Wednesday to a server crash - I walked out of the office at 1pm on Friday.
Ideally, in a hosted arrangement, you would have more than just 2 or 3 people to work on the issue, therefore allowing the task of rebuilding / replacing a system to be balanced - sleep is a good thing to have. You start to make mistakes after so many hours of being awake and dealing with a single issue.
Good point, but I work for a profit-driven company that is not afraid to (wisely) spend money to make the infrastructure sound. I am the only official IT person, but my manager knows as much as I do (more), plus he is a developer, so we have tech-minded C-levels. It's truly ideal, and I love my job.
-
@gjacobse Master and Apprentice works well.
-
@scottalanmiller Time to order pizza and coke for the office, have a morale boost to get everyone through the poop tsunami
-
Everyone is in a car trying to get to Spiceworld. Trust me, Microsoft is going to be VERY glad that they don't have a booth this year. After losing my email on Office 365 and now bringing down Azure and lying to us about getting it fixed I'm beyond livid.
-
@MattSpeller said:
@scottalanmiller Time to order pizza and coke for the office, have a morale boost to get everyone through the poop tsunami
Beers and Netflix until stuff comes back up...
-
@RojoLoco cracks the top on the scotch and pours two "wee" drams
-
50 minutes after they lied about turning the systems on.... nothing.
-
Wow this is terrible. Hopefully something happens soon.
-
At an hour now since they claimed to be powering on the VMs. An hour!!!