Solved PCI bus error
-
Just a little old..
It is running RHEL 4
-
@JaredBusch Most likely a failing NIC, yes.
Also, talk about a blast from the past. That's the same era of hardware as a Vostro 220 I pulled out of service yesterday.
-
@travisdh1 said in PCI bus error:
@JaredBusch Most likely a failing NIC, yes.
Also, talk about a blast from the past. That's the same era of hardware as a Vostro 220 I pulled out of service yesterday.
Jared seems to be trying to keep this unit in service, unless of course there is a good reason to replace it.
-
@DustinB3403 said in PCI bus error:
@travisdh1 said in PCI bus error:
@JaredBusch Most likely a failing NIC, yes.
Also, talk about a blast from the past. That's the same era of hardware as a Vostro 220 I pulled out of service yesterday.
Jared seems to be trying to keep this unit in service, unless of course there is a good reason to replace it.
Yeah, this is @JaredBusch tho, I think we all know he's not keeping it in service by his choice.
-
So, no one can help me confirm that the NIC is failing?
Also, anything I can look at to see if this is the onboard NIC or the NIC on a separate card?
I'll be on site on Tuesday.
-
@JaredBusch said in PCI bus error:
So, no one can help me confirm that the NIC is failing?
Also, anything I can look at to see if this is the onboard NIC or the NIC on a separate card?
I'll be on site on Tuesday.
Did not see any info indicating NIC failure. No idea what device 6 is.
Looking up the Intel NIC you linked shows this:
-
No, it's not the NIC.
It says PCIe error bus 0, device 6, function 0.
That's 00:06.0, and that is the PCI bridge E7520. Which I think is connected directly to the chipset on the CPU. Can't remember exactly what was on the CPU and the chipset back in those days.
Either way the motherboard/CPU is done.
Or I guess technically speaking a driver error caused by drive corruption could have caused the same error. After all it's the OS that gives the error message here.
-
@Pete-S said in PCI bus error:
After all it's the OS that gives the error message here.
No, the error message is from the BMC (predecessor to iDRAC).
-
@Pete-S said in PCI bus error:
No, it's not the NIC.
It says PCIe error bus 0, device 6, function 0.
That is why I wanted others to look. The way I read the man page, it seemed that the bus was omitted when using lspci.
-
@JaredBusch said in PCI bus error:
@Pete-S said in PCI bus error:
No, it's not the NIC.
It says PCIe error bus 0, device 6, function 0.
That is why I wanted others to look. The way I read the man page, it seemed that the bus was omitted when using lspci.
No, it's <bus>:<device>.<func>
But it's a bit confusing nowadays compared to how it was in the old days, when you had all the devices on the same bus.
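To make the address format concrete, here is a minimal Python sketch that converts between bus/device/function numbers and the lspci-style string. The helper names format_bdf and parse_bdf are made up for illustration; the only facts assumed are that lspci prints the fields in hex, that the domain prefix is optional, and that device is a 5-bit field and function a 3-bit field.

```python
import re

# lspci prints addresses as [domain:]bus:device.function, all in hex.
# The domain is optional and treated as 0 here when it is absent.
_BDF_RE = re.compile(
    r"(?:(?P<domain>[0-9a-fA-F]{4}):)?"
    r"(?P<bus>[0-9a-fA-F]{2}):"
    r"(?P<device>[0-9a-fA-F]{2})\."
    r"(?P<func>[0-7])"
)


def format_bdf(bus, device, func):
    """Format bus/device/function numbers as an lspci-style address."""
    # Bus is 8 bits, device is 5 bits, function is 3 bits.
    if not (0 <= bus <= 0xFF and 0 <= device <= 0x1F and 0 <= func <= 7):
        raise ValueError("bus/device/function out of range")
    return f"{bus:02x}:{device:02x}.{func:x}"


def parse_bdf(address):
    """Parse an lspci-style address into (domain, bus, device, func)."""
    m = _BDF_RE.fullmatch(address.strip())
    if m is None:
        raise ValueError(f"not a PCI address: {address!r}")
    domain = int(m.group("domain") or "0", 16)
    bus = int(m.group("bus"), 16)
    device = int(m.group("device"), 16)
    func = int(m.group("func"), 16)
    if device > 0x1F:
        raise ValueError("device number out of range")
    return domain, bus, device, func


if __name__ == "__main__":
    # The error in this thread: bus 0, device 6, function 0.
    print(format_bdf(0, 6, 0))
    print(parse_bdf("00:06.0"))
```

With the values from the error message, format_bdf(0, 6, 0) gives 00:06.0, which is the entry to look for in the lspci output.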
-
So the customer asked me to spec out a replacement server.
This is what I am thinking to recommend.
Dell PowerEdge R6515 – Chassis with 8x 2.5” drives
AMD EPYC 7262 or 7302P
1x 16GB RDIMM 3200MT
PERC H730P
3x 480GB SSD SATA Mix Use Hot plug
Dual hot plug power supply
Riser Config 1 1x16LP PCIe slot
iDRAC 9 Express
BOSS controller card with 2 M.2 240GB RAID 1
Comments?
-
Seems like anything will work in this scenario given how old the original was. What's the workload?
-
@scottalanmiller said in PCI bus error:
Seems like anything will work in this scenario given how old the original was. What's the workload?
A proprietary system from TopTech
Server load is nothing normally. The system is catching up from a planned maintenance window at the moment.
I'll get another snapshot once it is caught up.
-
@scottalanmiller said in PCI bus error:
Seems like anything will work in this scenario given how old the original was.
I am future planning. The system will get replaced by a new version.
But that requires infrastructure updates at the terminals also.
-
@JaredBusch said in PCI bus error:
@scottalanmiller said in PCI bus error:
Seems like anything will work in this scenario given how old the original was.
I am future planning. The system will get replaced by a new version.
But that requires infrastructure updates at the terminals also.
Well sure, but even the smallest modern system will be orders of magnitude faster. Hard to believe anything wouldn't have the "oomph" for the task unless the workload isn't just updated, but overhauled.
-
@scottalanmiller said in PCI bus error:
Well sure, but even the smallest modern system will be orders of magnitude faster. Hard to believe anything wouldn't have the "oomph" for the task unless the workload isn't just updated, but overhauled.
Right the workload will not change. That is pretty consistent. The specs for the new version are higher. But still, yes, anything modern will power it.
-
Yeah it sleeps all day long..
-
@JaredBusch said in PCI bus error:
So the customer asked me to spec out a replacement server.
This is what I am thinking to recommend.
Dell PowerEdge R6515 – Chassis with 8x 2.5” drives
AMD EPYC 7262 or 7302P
1x 16GB RDIMM 3200MT
PERC H730P
3x 480GB SSD SATA Mix Use Hot plug
Dual hot plug power supply
Riser Config 1 1x16LP PCIe slot
iDRAC 9 Express
BOSS controller card with 2 M.2 240GB RAID 1
Comments?
Does iDRAC 9 Express allow remote access to the console?
-
@JaredBusch said in PCI bus error:
So the customer asked me to spec out a replacement server.
This is what I am thinking to recommend.
Dell PowerEdge R6515 – Chassis with 8x 2.5” drives
AMD EPYC 7262 or 7302P
1x 16GB RDIMM 3200MT
PERC H730P
3x 480GB SSD SATA Mix Use Hot plug
Dual hot plug power supply
Riser Config 1 1x16LP PCIe slot
iDRAC 9 Express
BOSS controller card with 2 M.2 240GB RAID 1
Comments?
Yeah, I assume this is a low budget spec.
Pick the cheapest EPYC Rome unless you expect the server to handle lots more in the future. 7232P is the cheapest.
Also skip the BOSS card and pick 2x 960GB read-intensive drives in RAID 1. Since you have the H730P and its cache, RAID 1 should be more than fine.
I mean, compared to the old machine you could also use the H330 card. You don't get the cache, but the SSDs have cache and RAID 1/10 doesn't require any parity calculations, so the H330 will get the job done.
-
@Dashrender said in PCI bus error:
@JaredBusch said in PCI bus error:
So the customer asked me to spec out a replacement server.
This is what I am thinking to recommend.
Dell PowerEdge R6515 – Chassis with 8x 2.5” drives
AMD EPYC 7262 or 7302P
1x 16GB RDIMM 3200MT
PERC H730P
3x 480GB SSD SATA Mix Use Hot plug
Dual hot plug power supply
Riser Config 1 1x16LP PCIe slot
iDRAC 9 Express
BOSS controller card with 2 M.2 240GB RAID 1
Comments?
Does iDRAC 9 Express allow remote access to the console?
I don't think so. You can only do power cycling with Express.