Report Blames IT Staff for Not Taking Backups at King's College London
-
https://www.theregister.co.uk/2017/02/23/kcl_external_review/
This is some serious failure at KCL here. They had everything running off of a single SAN, and that SAN borked itself due to a firmware bug when one controller failed, taking out the other. The IT team had declined to use backups and were just storing a copy of data on the same SAN that the data was originating from, so everything was lost all at once. And they had not been properly maintaining their storage.
-
That's pretty epic on the fail scale, there. Of course their SAN vendor threw them under the bus and claimed that "just weeks before" a firmware patch had been released that would have protected them. Okay, maybe that's true, but that means that up until a few weeks before, that bug had been unpatched and the system had been at risk of this happening the whole time. Of course they should be patching their obviously totally critical system, but they also shouldn't need to worry about controllers shooting each other in the head in a top end SAN.
-
@nadnerB it's true, system admins only get known for their failures.
-
So I wonder how many resumes have been sent out since then, ha ha. Seriously wonder what kind of reprimands happened to the IT Department there.
-
@dafyre said in Report Blames IT Staff for Not Taking Backups at King's College London:
So I wonder how many resumes have been sent out since then, ha ha. Seriously wonder what kind of reprimands happened to the IT Department there.
RGE: Resumé Generating Event
-
@dafyre said in Report Blames IT Staff for Not Taking Backups at King's College London:
So I wonder how many resumes have been sent out since then, ha ha. Seriously wonder what kind of reprimands happened to the IT Department there.
I'm sure there will be some... but I wonder if this was one of those. It will cost £x to do this right and to back it up. Then the administration said... ok sure but do it for 1/3 of that.
-
@dafyre said in Report Blames IT Staff for Not Taking Backups at King's College London:
So I wonder how many resumes have been sent out since then, ha ha. Seriously wonder what kind of reprimands happened to the IT Department there.
Not many, I would guess. Blame has been placed, hiring will be hard, no reason to get rid of those people.
-
@coliver said in Report Blames IT Staff for Not Taking Backups at King's College London:
@dafyre said in Report Blames IT Staff for Not Taking Backups at King's College London:
So I wonder how many resumes have been sent out since then, ha ha. Seriously wonder what kind of reprimands happened to the IT Department there.
I'm sure there will be some... but I wonder if this was one of those. It will cost £x to do this right and to back it up. Then the administration said... ok sure but do it for 1/3 of that.
That's my guess. No management was named in the report.... um, NO management was involved in a blunder of this scale? Possible, but not likely.
-
@scottalanmiller said in Report Blames IT Staff for Not Taking Backups at King's College London:
@coliver said in Report Blames IT Staff for Not Taking Backups at King's College London:
@dafyre said in Report Blames IT Staff for Not Taking Backups at King's College London:
So I wonder how many resumes have been sent out since then, ha ha. Seriously wonder what kind of reprimands happened to the IT Department there.
I'm sure there will be some... but I wonder if this was one of those. It will cost £x to do this right and to back it up. Then the administration said... ok sure but do it for 1/3 of that.
That's my guess. No management was named in the report.... um, NO management was involved in a blunder of this scale? Possible, but not likely.
The Register even points that out:
Among the most anticipated details of the review were whether it would name managers who had been responsible for poor decisions that caused the data loss. These were not included.
Then the blame gets placed solely on the IT department:
"In addition some data has consciously never been backed up on tape due to capacity constraints and the potential impact of this was never communicated to the College," the review added.
Sure there could have been bad communication but this sounds like a management and purse string failure to me.
-
@coliver said in Report Blames IT Staff for Not Taking Backups at King's College London:
Sure there could have been bad communication but this sounds like a management and purse string failure to me.
My guess is that the failure was from the same people that commissioned the report.
-
Really good point made here...
HP didn't check the firmware nor update it before replacing something that they knew had this bug!
-
Source: "In addition some data has consciously never been backed up on tape due to capacity constraints and the potential impact of this was never communicated to the College," the review added.
Oh really, it never got communicated to management? Either management ignored a known problem, or their hiring process only got incompetent IT people. Neither makes management look good.
-
@travisdh1 said in Report Blames IT Staff for Not Taking Backups at King's College London:
Source: "In addition some data has consciously never been backed up on tape due to capacity constraints and the potential impact of this was never communicated to the College," the review added.
Oh really, it never got communicated to management? Either management ignored a known problem, or their hiring process only got incompetent IT people. Neither makes management look good.
And these aren't the professors, these are the ones that actually work. Imagine how little the professors know, the ones who never managed to actually work in IT or make it into management.
-
At the end of the day, this was a single point of failure (SPOF) SAN and a large inverted pyramid of doom setup. I'm sure that at this scale they saved a load of money by doing this and forgoing high reliability, but it shows that even the most expensive SANs with the vendor's own support still have the dual controller fragilities that we normally associate only with lower end gear. Given how rarely outages are reported, the number that we assume must die like this privately, without the public being told, is likely huge. What we do know is that the vendor, especially the one in question, claims that this stuff never happens and seems to always blame the customer, even when the vendor themselves is the one who did it, like here.
-
Yeah, that's a serious CLM...
-
@Tim_G said in Report Blames IT Staff for Not Taking Backups at King's College London:
Yeah, that's a serious CLM...
Non-profit / university, I doubt that they really care. They placed blame and covered up the management, might as well keep the scapegoats around.
-
Goes nicely with this one: https://www.itnews.com.au/news/atos-faulty-sans-will-be-sent-to-us-for-forensic-testing-453108
Part of this saga: https://www.itnews.com.au/news/hpe-storage-crash-killed-ato-online-services-444490
-
I'm not sure what to make of this paragraph. What else would shared drives be used for?
-
@Breffni-Potter said in Report Blames IT Staff for Not Taking Backups at King's College London:
I'm not sure what to make of this paragraph. What else would shared drives be used for?
Lol. I feel there was no one competent involved from management to users to IT to vendor to auditors.