The Inverted Pyramid of Doom Challenge
-
@scottalanmiller said in The Inverted Pyramid of Doom Challenge:
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
Because performance and availability problems come from the bottom up not the top down. SQL has storage as a dependency, storage doesn't have SQL as a dependency, and everything rolls downhill...
That doesn't make sense, though. Applications care that they have enough CPU, memory, IOPS, bandwidth, etc. That's it. They don't care how it is delivered, only that it is available when needed. This would be, again, a failing of any application team and any IT team if they look to the application for issues involving not providing enough resources for performance.
If your point here is that incompetent IT departments tend to buy unsupportable, crappy software... sure. No one is denying that plenty of people don't do their jobs well. But that doesn't mean we should recommend doing things poorly just because lots of people aren't good at their jobs.
Most IT departments (even in enterprises) are not skilled (or not well skilled) at troubleshooting infrastructure without assistance, especially for beasts like ERP that might have a dozen interdependent systems. Most ERP vendors know this, so rather than let customers deploy a database for 20K users on a Hyper-V host with a three-disk RAID 5 (and then have the project written off as a failure and their name damaged), they take this choice away.
For the 5 years I consulted, "why is this slow" was one of the most common engagements. 9 times out of 10 when I was chasing some crazy application issue, it had nothing to do with the application. Generally it was staring people in the face, had a giant RED alarm, and was fairly obvious (disk latency isn't supposed to be 1200ms, and NL-SAS drives shouldn't be used for DBs in 5 billion dollar companies, yo?). You may see vendors as crossing a line, but assuming internal IT doesn't understand what it will take to deliver their applications is CRITICAL to being a successful application vendor. I've seen users, IT and the C-suite trash applications that worked fine when it was the infrastructure that was all wrong...
This is a huge part of the reason many vendors push for SaaS or OPEX offerings. If you don't bundle high levels of support that can extend beyond the application, you're risking your revenue. It's much like why Scale (and other highly successful HCI vendors) try to own support of the ENTIRE stack. If they didn't own support of the hypervisor, people would do awful, awful things and then blame them.
It's not fair, but it's the reality we live in...
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
Most IT departments (even in enterprises) are not skilled (or not well skilled) at troubleshooting infrastructure without assistance, especially for beasts like ERP that might have a dozen interdependent systems. Most ERP vendors know this, so rather than let customers deploy a database for 20K users on a Hyper-V host with a three-disk RAID 5 (and then have the project written off as a failure and their name damaged), they take this choice away.
Most ERP vendors are not skilled at this either, though. SAP continuously fails at this. Their competitors are often worse. Some actually do the opposite and specifically require a three-disk RAID 5 rather than steering customers away from it.
Sure, most IT departments are bad. But again we are going down the "assumption of bad decisions" path. The one that says "we should make bad decisions, because we make bad decisions, so we start recommending bad things." It's not good logic to say "people are often dumb, so we assume you are and make products based on that." That might be good logic for making money (and why vendors always recommend it) but it's not a good idea to do business with those vendors.
Basically how I read this is "in these cases, your vendor is building their products based around you not being competent and the assumption that you will make bad decisions." That's great, but to me that just repeats my original point that IT should rule out those vendors.
-
@scottalanmiller said in The Inverted Pyramid of Doom Challenge:
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
Backups - (Some Hypervisors have changed block tracking so a backup takes minutes, others don't meaning a full can take hours). BTW I hear Hyper-V is getting this in 2016 (Veeam had to write their own for their platform).
Sure... but what does the application layer care? Either the application takes care of its own backups and doesn't care what the hypervisor does, or it relies on IT to handle backups and it isn't any of their concern either.
Again, this is an application vendor or programmer trying to get involved in IT decisions, processes and designs. Do you let the company that makes your sofa determine how big your fireplace has to be because "they want to ensure that you are cozy?"
Application owners have RPO/RTOs and they often expect the infrastructure people to take care of that. (When I have a 5TB OLTP database, in-guest options generally fail to deliver somehow.)
If I buy a couch or desk that's massive for a tiny apartment, I could see the sales guy asking how big my doorway is to make sure they can deliver it. Otherwise I'll be saying "GALLERY FURNITURE SUCKS THEY SELL COUCHES THAT DON'T WORK". This is what users, application owners, and infrastructure people do today. Vendors MUST protect their name. I'm not saying these whiners make any sense, but people do this.
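To put rough numbers on the changed block tracking point above (minutes versus hours), here is a back-of-the-envelope sketch. Only the 5TB database size comes from this thread; the 200 MB/s sustained throughput and the 2% daily change rate are assumptions picked for illustration, and the function name is hypothetical. It ignores snapshot overhead, compression, and deduplication.

```python
def backup_minutes(size_tb: float, throughput_mb_s: float,
                   changed_fraction: float = 1.0) -> float:
    """Estimate the minutes needed to copy the changed part of a dataset.

    changed_fraction=1.0 models a full backup; a small value models a
    backup that only reads blocks flagged by changed block tracking.
    """
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    if not 0.0 <= changed_fraction <= 1.0:
        raise ValueError("changed_fraction must be between 0 and 1")
    size_mb = size_tb * 1024 * 1024          # TB -> MB (binary units)
    seconds = size_mb * changed_fraction / throughput_mb_s
    return seconds / 60


DB_SIZE_TB = 5.0          # the 5TB OLTP database mentioned in the thread
THROUGHPUT_MB_S = 200.0   # ASSUMPTION: sustained backup throughput
DAILY_CHANGE = 0.02       # ASSUMPTION: 2% of blocks change between backups

full = backup_minutes(DB_SIZE_TB, THROUGHPUT_MB_S)
incremental = backup_minutes(DB_SIZE_TB, THROUGHPUT_MB_S, DAILY_CHANGE)
print(f"full: {full / 60:.1f} h, changed blocks only: {incremental:.1f} min")
```

Under those assumed numbers the full pass lands in the range of hours and the changed-block pass in minutes, which is the gap an application owner with a tight RPO would care about, whoever ends up owning the backup.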
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
For the 5 years I consulted, "why is this slow" was one of the most common engagements. 9 times out of 10 when I was chasing some crazy application issue, it had nothing to do with the application. Generally it was staring people in the face, had a giant RED alarm, and was fairly obvious (disk latency isn't supposed to be 1200ms, and NL-SAS drives shouldn't be used for DBs in 5 billion dollar companies, yo?). You may see vendors as crossing a line, but assuming internal IT doesn't understand what it will take to deliver their applications is CRITICAL to being a successful application vendor. I've seen users, IT and the C-suite trash applications that worked fine when it was the infrastructure that was all wrong...
And that's why external IT consulting was brought in. Not a random application vendor.
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
Application owners have RPO/RTOs and they often expect the infrastructure people to take care of that. (When I have a 5TB OLTP database, in-guest options generally fail to deliver somehow.)
Yup, and if they outsource IT to the application vendor, that SLA isn't owned by the actual IT department but by someone who came in, put in something new and ran away. If the application owners need a reliable RPO/RTO, they need to work with IT, not work against them.
-
@scottalanmiller said in The Inverted Pyramid of Doom Challenge:
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
Application owners have RPO/RTOs and they often expect the infrastructure people to take care of that. (When I have a 5TB OLTP database, in-guest options generally fail to deliver somehow.)
Yup, and if they outsource IT to the application vendor, that SLA isn't owned by the actual IT department but by someone who came in, put in something new and ran away. If the application owners need a reliable RPO/RTO, they need to work with IT, not work against them.
It's the reality in most companies. Software vendors' requirements are not rooted in how IT SHOULD be run, but in how it is run. I agree with you in principle (it shouldn't matter); I've just seen hundreds of counterexamples that would have destroyed these companies' names.
There was a thread on SW recently where someone said "NIMBLE SUCKS I DON'T GET THE IOPS I WAS PROMISED". The next post was his Nimble sales rep posting "So I see you're at 20% load, your IO latency is .5 ms currently, and while your 220C model is one of our smaller ones, we have far larger ones. If you're having any problems please call us and we will help you." I laughed, but it made me realize the damage that incompetent IT does to the name of a product or application. We are at the point that a sales rep would rather piss off a customer and call them out as an idiot (he was nice about it) than risk their company's name being dragged through the mud.
The "IT guy is always right" attitude in IT bothers me. It's part of why I always enjoyed arguing with you (and others internally), as it's the only way to challenge my ideas, learn, and think of new things. It's also part of the reason I enjoyed consulting (although I did learn a lot of tact about how to carefully make people think it was their idea, or gently expose why what they were doing was a hilariously bad idea).
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
If I buy a couch or desk that's massive for a tiny apartment, I could see the sales guy asking how big my doorway is to make sure they can deliver it.
Would you honestly still do business with a furniture store that didn't state the size (that's demanding certain performance, the results not the means) but rather demanded that you buy a certain make or style of door, regardless of the fact that the one that you had would have been big enough? Because that's the comparison.
Making sure that the SIZE is right I always agreed to. Demanding only doors from certain vendors be used is where the insanity happens. Or forcing you to install a new door because they don't trust your measurements.
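To make the "size, not the door" distinction concrete, here is a minimal sketch of a requirement stated purely as results. Every name and threshold in it is hypothetical and invented for illustration; no vendor's actual requirements are implied. The only number taken from this thread is the 1200ms disk latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """What the application needs, with no opinion on how it is delivered."""
    min_iops: int
    max_latency_ms: float
    min_memory_gb: int


@dataclass(frozen=True)
class Measured:
    """What the platform actually delivers, whatever it is built from."""
    iops: int
    latency_ms: float
    memory_gb: int


def shortfalls(req: Requirement, got: Measured) -> list:
    """Return human-readable gaps; an empty list means 'the door is big enough'."""
    problems = []
    if got.iops < req.min_iops:
        problems.append(f"IOPS {got.iops} is below the required {req.min_iops}")
    if got.latency_ms > req.max_latency_ms:
        problems.append(
            f"latency {got.latency_ms} ms exceeds the allowed {req.max_latency_ms} ms"
        )
    if got.memory_gb < req.min_memory_gb:
        problems.append(f"memory {got.memory_gb} GB is below the required {req.min_memory_gb} GB")
    return problems


# ASSUMPTION: illustrative thresholds, not taken from any real product.
erp_needs = Requirement(min_iops=20_000, max_latency_ms=5.0, min_memory_gb=256)

# The 1200 ms latency is the example from this thread; other values are made up.
slow_platform = Measured(iops=25_000, latency_ms=1200.0, memory_gb=512)
fine_platform = Measured(iops=25_000, latency_ms=0.5, memory_gb=512)

print(shortfalls(erp_needs, slow_platform))
print(shortfalls(erp_needs, fine_platform))
```

Nothing in the check names a hypervisor, a storage vendor, or a RAID level. Any platform that passes is acceptable, and one that fails gets a specific, measurable complaint rather than a demand for a particular brand of door.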
-
@scottalanmiller said in The Inverted Pyramid of Doom Challenge:
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
If I buy a couch or desk that's massive for a tiny apartment, I could see the sales guy asking how big my doorway is to make sure they can deliver it.
Would you honestly still do business with a furniture store that didn't state the size (that's demanding certain performance, the results not the means) but rather demanded that you buy a certain make or style of door, regardless of the fact that the one that you had would have been big enough? Because that's the comparison.
Making sure that the SIZE is right I always agreed to. Demanding only doors from certain vendors be used is where the insanity happens. Or forcing you to install a new door because they don't trust your measurements.
I think we've reached stasis here. I've provided examples where the platform matters.
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
It's the reality in most companies. Software vendors' requirements are not rooted in how IT SHOULD be run, but in how it is run.
It's mandated shadow IT. It's a subversive approach. One nice thing for internal IT is that with any failing of the system you get to run the vendor through the wringer. But I want products for my customers that are based around them being successful, not assuming their failure. I have very different goals (I want the company to succeed) than the software vendors (they couldn't care less if it works, only that they don't get blamed).
I don't blame vendors for taking advantage of bad business processes (suckers deserve to be suckered; they have no one to blame but themselves), but my point is that good IT would be working on protecting their businesses from these processes, and good management would be tasking them to do so.
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
There was a thread on SW recently where someone said "NIMBLE SUCKS I DON'T GET THE IOPS I WAS PROMISED". The next post was his Nimble sales rep posting "So I see you're at 20% load, your IO latency is .5 ms currently, and while your 220C model is one of our smaller ones, we have far larger ones. If you're having any problems please call us and we will help you." I laughed, but it made me realize the damage that incompetent IT does to the name of a product or application. We are at the point that a sales rep would rather piss off a customer and call them out as an idiot (he was nice about it) than risk their company's name being dragged through the mud.
That's not incompetence, though. That's just someone lying. There is a difference.
-
One last thought...
IF the reason that Xen has 2% market share is because there is NO LOGICAL REASON for vSphere or paid Hyper-V (with VMM to manage) then that means 98% of IT people are idiots. If 98% are idiots, wouldn't that mean they should be outsourcing their IT as much as possible to their vendors or others? (and therefore not deploy Xen).
Catch-22
-
@scottalanmiller said in The Inverted Pyramid of Doom Challenge:
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
There was a thread on SW recently where someone said "NIMBLE SUCKS I DON'T GET THE IOPS I WAS PROMISED". The next post was his Nimble sales rep posting "So I see you're at 20% load, your IO latency is .5 ms currently, and while your 220C model is one of our smaller ones, we have far larger ones. If you're having any problems please call us and we will help you." I laughed, but it made me realize the damage that incompetent IT does to the name of a product or application. We are at the point that a sales rep would rather piss off a customer and call them out as an idiot (he was nice about it) than risk their company's name being dragged through the mud.
That's not incompetence, though. That's just someone lying. There is a difference.
I learned years ago to never ascribe to malice what you can attribute to ignorance in this industry. He likely was unhappy the number in his dashboard didn't say 100K!
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
I think we've reached stasis here. I've provided examples where the platform matters.
Okay, I'll buy that. The platform matters when internal IT has failed and you have outsourced to an external IT department that has an interest in selling you something that you don't need, to make extra money probably on the sale and definitely on the consulting. Yes, I agree, but I think that still matches my original point. It's not in the interest of the customer, but there is a reason why they feel that they have to do it, based on other decisions made in the same way.
Do you feel, however, that since this discussion is based on scale for the context of the original question, that there is ever a realistic time that this happens at three or fewer compute nodes? We are talking about three nodes for an entire business here. What business, anywhere, is that small and deploying systems where vendors interact with them in this manner? I'm not saying that theoretically it isn't possible, but this thread is asking for an example where this has ever happened.
Outside of pure theory, and even there I feel that it is hard to theorize, who has products that need these kinds of things while being so small as to not have benefits of the IPOD due to scale?
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
I learned years ago to never ascribe to malice what you can attribute to ignorance in this industry. He likely was unhappy the number in his dashboard didn't say 100K!
And likewise, my rule of thumb is that willful ignorance is one of the worst forms of malice.
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
One last thought...
IF the reason that Xen has 2% market share is because there is NO LOGICAL REASON for vSphere or paid Hyper-V (with VMM to manage) then that means 98% of IT people are idiots. If 98% are idiots, wouldn't that mean they should be outsourcing their IT as much as possible to their vendors or others? (and therefore not deploy Xen).
Catch-22
Agreed. That makes total sense. And I agree 100%. Almost all (and I totally mean that, something like 98%) of IT should be outsourced. The industry should shrink dramatically, the smaller pool of people who remain should be consolidated into fewer shops and those shops should demand far higher levels of excellence and continued training and raise salaries significantly as the industry tends to lose people who are really valuable because they can often make more money elsewhere and choose to.
However, this is what I'm talking about: assuming that bad decisions will be made, we then make bad recommendations based on that. There is no value, and no point, in making recommendations to the 98%. They neither look for nor listen to good advice. Good advice always exists solely for those that will take it. It's the same discussion as "is college valuable for you (in IT)." If every single person ever listened to that advice, it would be self-defeating in weird ways. But they don't; advice around it is for the .1% who might listen and enact change. Good advice remains good advice; that there are reasons why bad decisions are made doesn't stop them being bad decisions.
I'm never sure if I can explain what I mean here well, but I see it a lot in IT - people give bad advice (could be in any arena, I see it constantly, though) based on the idea that "well they won't listen to my actual good advice." Sure, we know that they won't, but should we make bad recommendations because of that?
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
The "IT guy is always right" attitude in IT bothers me.
Me too, that's why I rarely feel that IT should be allowed to decide things. But IT is the sole critical information holder for a lot of things in the business, performance, cost and security being key ones. No other department has a view into these factors. It is the job of business management to ensure that they have skilled IT that knows its role in the business, to pass decisions through them and to listen to them. And it is IT's job to understand that it is part of the context of the business and only knows certain factors.
That's why an IT veto is important: IT can identify scams, incompetence, lack of industry standards, bad practices and such that other departments would likely not even understand (what do you MEAN VB6 isn't a common thing any longer!!), so it can filter out options that have no place being considered. Whereas the business needs to make most decisions because they know what is valuable to the business. Neither can act alone. The idea with much of this, though, is that IT would be bypassed and many of the most important IT decisions would be made by people who are not IT.
Imagine doing this to HR or accounting. Would other departments demand that HR violate good practices around compensation or reporting? Or that accounting not correctly record expenditures or not pay all taxes? (These do happen, and it's normally very bad.)
-
This is my quote from the original challenge: "We all (I hope by now) know that SANs have their place and a super obvious one that explains why enterprises use them almost universally and know why that usage has no applicability to normal SMBs - scale."
I agree with why lots of shops might deploy systems like you are describing, even if I generally don't agree with that decision, but I'm pretty confident that the use cases that you are describing @John-Nicholson are tied, nearly universally, to a scale that would already prompt a SAN-based infrastructure (or similar.)
Have you seen these in small environments where the scale did not exist to warrant a SAN otherwise?
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
Security - Guest introspection support to hit compliance needs, micro-segmentation requirements (EPIC has drop-in templates for NSX, possibly HCI at some point). If you want actual micro-segmentation and inspection on containers there isn't anything on the market that competes with Photon yet. At some point there may be ACI templates but that will require network hardware lock-in (Nexus 9K) and that's even crazier (applications defining what switch you can buy!).
Really, once you even start to think of defining the storage, you have to define the switches too. Once you are into that realm of not allowing IT to screw anything up, you can't let them screw anything up. You'd really want to be defining cabling, UPS and more as well.
-
I'm definitely not trying to say that there are absolutely zero potential use cases for an IPOD, but only that they are so rare below the "scale" line (the drop dead number is three and the general rule of thumb is twelve) that I'm wondering if anyone has a real world example.
Even within the described examples, they are theoretical I believe, and very unlikely. How many of these have been observed?
It's worth pointing out the use case and adding it as an aside to a recommendation document. I'll give a completely different example, that hopefully explains my thinking...
If someone is deciding on if they want to attend university or not, we generally focus on things like the time and money, career advantages and such. But there are career choices, like doctor, lawyer or teacher, that require a degree and it is not in any way a decision, it simply is a requirement. That doesn't imply that the degree is useful for that field outside of the requirement, but the requirement is the requirement. So if an IPOD is a non-IT requirement without business context, it does not fall into the business context. While this should be assumed as being an exception to any case of business logic, it should probably be explicitly stated nonetheless because people often forget about the cases outside of the decision matrix - or focus solely on them and think that cases outside of the logic pool influence those within it (e.g. the career success of doctors tells us nothing about the value of a college degree outside of that one case where it is a requirement.)
-
@scottalanmiller said in The Inverted Pyramid of Doom Challenge:
This is my quote from the original challenge: "We all (I hope by now) know that SANs have their place and a super obvious one that explains why enterprises use them almost universally and know why that usage has no applicability to normal SMBs - scale."
I agree with why lots of shops might deploy systems like you are describing, even if I generally don't agree with that decision, but I'm pretty confident that the use cases that you are describing @John-Nicholson are tied, nearly universally, to a scale that would already prompt a SAN-based infrastructure (or similar.)
Have you seen these in small environments where the scale did not exist to warrant a SAN otherwise?
Have you seen a FlexPod or Vblock? Part of CI is defining the network switches and the configuration of the fabric. The argument is that even with a 20% capital markup, the time to outcome outweighs the do-it-yourself approach, and historically they are right. The difference is that going from CI to HCI has moved the time to value down exponentially. I think the logical progression for HCI vendors in some areas is to do just this.
Scale (long before you worked with them), in the old GPFS days, had a stricter HCL for switches than any other iSCSI storage vendor I had ever seen. You know what, there was a reason. Scale-out systems are incredibly vulnerable to shifty low-end switching. I even tried deploying one with 3750Xs (much more expensive than the 2910ALs, but in practice much slower) and performance was awful until the switches were replaced. The funny thing was the customer tried to blame Scale (and not the slow Cisco switches that the network team was in love with). I would argue Scale is "ahead of the curve" in having HCLs and restrictions on outside factors that can make them look bad (this was something like 5 years ago).