Ubuntu Systemd Bad Entry
-
All disks in the array appear to be fine according to MD... so this is clearly something with the VM.
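For reference, checking the array from the host boils down to something like this (the array here is /dev/md0):
cat /proc/mdstat                 # all members should show up, e.g. [UUUU], with no (F) flags
sudo mdadm --detail /dev/md0     # look for "State : clean" and no failed devices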
-
So I was able to just restore this VM to a snapshot from the other day.
Should I perform another fsck on this virtual system?
-
Not if it does not prompt you to.
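(For what it's worth, if you did want to force a full check of the guest's root filesystem on the next boot anyway, on a systemd-based Ubuntu you can do it from the kernel command line; this is just a sketch of that approach:)
# Edit /etc/default/grub and append the fsck options to GRUB_CMDLINE_LINUX_DEFAULT,
# e.g. GRUB_CMDLINE_LINUX_DEFAULT="quiet splash fsck.mode=force fsck.repair=yes"
sudo nano /etc/default/grub
sudo update-grub                 # regenerate grub.cfg with the new options
sudo reboot                      # systemd-fsck will run a full check during boot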
-
So how can I check to see if whatever caused this issue is still present? I mean if it just happens from time to time, fine.
But wouldn't it be good to know what caused it?
-
@DustinB3403 said in Ubuntu Systemd Bad Entry:
But wouldn't it be good to know what caused it?
That's a common thought and it makes sense, kind of. But computers are ridiculously complex beasts and not all issues are replicable. Gamma radiation, insanely uncommon bugs, memory errors, CPU errors, disk errors and such can all lead to corruption. These things happen. If you want to investigate every possible error ever you can easily spend more than the system is worth and only "guess" at the problem in the end - all for something that is unlikely to ever happen again.
Think of a windshield: you get a crack in it, and you don't remember anything hitting it. Do you stop driving and spend months doing forensics trying to determine whether it was a rock, a bird, a bug, bridge debris, glass fragility, a bizarre temperature change, etc. that caused it to crack? Would knowing be useful? Not if it doesn't happen again.
So yes, KNOWING would be great. But FINDING OUT is not. Make sense? The cost required to know isn't worth it unless it becomes a repeating problem.
-
This issue is still occurring and interrupting my backup schedule for my VMs.
The host appears to be fine. So either I have to build a new VM, or something is wrong with the host.
[screenshot from the guest]
-
So running smartctl on the host, it appears that /dev/sdb does have several errors. I'll be replacing this drive today and see if the issue persists.
The other 3 disks have no SMART errors at all.
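For anyone wanting to run the same check (smartctl comes from the smartmontools package):
sudo smartctl -H /dev/sdb        # overall health self-assessment (PASSED/FAILED)
sudo smartctl -A /dev/sdb        # attribute table; watch reallocated/pending sector counts
sudo smartctl -l error /dev/sdb  # the drive's logged ATA errors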
-
Essentially this one disk is in a pre-fail state due to age.
So performing
mdadm /dev/md0 --fail /dev/sdb --remove /dev/sdb
and then replacing this disk, I should be in a good state.
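Roughly the full sequence, assuming the replacement comes back under the same /dev/sdb name (it may not if ports or boot order change):
sudo mdadm /dev/md0 --fail /dev/sdb --remove /dev/sdb   # mark the member failed and pull it from md0
# ...power down, swap the physical drive, boot back up...
sudo mdadm /dev/md0 --add /dev/sdb                      # add the new disk; the rebuild starts automatically
cat /proc/mdstat                                        # shows rebuild progress and an ETA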
-
And the array is now resyncing onto the replacement disk.
As an FYI for anyone on software RAID, the drive names line up with the SATA connections on the board (the commands below confirm the mapping).
I.e. the USB boot device is sda,
SATA1 (or 0, however it is labeled) = sdb,
SATA2 = sdc,
and so on.
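If you want to confirm the mapping rather than assume it (the sdX names can shift between boots), the by-path and by-id symlinks spell it out:
ls -l /dev/disk/by-path/   # ata-N / usb entries point at the sdX nodes they correspond to
ls -l /dev/disk/by-id/     # model/serial based names, handy for pulling the right physical drive
-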
Well at least for now, the I/O errors have stopped after I replaced the bad disk in the host array and reverted the VM.
I'll keep an eye on it and report back if the issue comes back.
-
And these are back.
-
Same disk?
-
On the guest OS there is only 1 disk (it's presented from the array).
I checked the smart stats on each drive and found no issues. MD was also fine.
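For anyone checking the same thing, a loop along these lines covers all the members (sda is the USB boot stick here, sdb through sde are the array disks):
for d in /dev/sd[b-e]; do sudo smartctl -H "$d"; done   # each should report PASSED
cat /proc/mdstat                                        # array should show all members up, e.g. [UUUU]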
-
Although on a separate note... this is a shit desktop that is acting as the hypervisor, so errors really shouldn't surprise me.
-
All disks in the host are marked as "Old_age" so there really isn't much that I can do besides assemble something else.
Hopefully out of newer equipment.
-
Maybe I'm reading this wrong... the SMART overall-health self-assessment is PASSED on all drives.
So maybe it's just a column saying "it'll fail here"
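Looking closer, the TYPE column in smartctl -A output just classifies each attribute as Pre-fail or Old_age; it isn't a verdict that the drive has failed. The WHEN_FAILED column and the raw counts are what actually matter:
sudo smartctl -A /dev/sdb | grep -E 'Reallocated_Sector|Current_Pending|Offline_Uncorrectable'
# WHEN_FAILED should be "-" and the RAW_VALUE counts should be 0 (or at least not climbing)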
-
I'm performing long tests on each of the drives to confirm the information I have is accurate. I'll update in ~90 minutes.
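(Something like this per drive; the long test runs in the background on the drive itself:)
sudo smartctl -t long /dev/sdb        # kicks off the long self-test and prints an estimated completion time
sudo smartctl -l selftest /dev/sdb    # check afterwards; you want "Completed without error"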