Solved RHEL 4 not seeing ext3 label
-
Continues from this original thread
Re: PCI bus error
Getting a kernel panic no matter where I try to restore.
Lots of testing, and I am seeing that the boot process is unable to see the filesystem label to mount root.
@DustinB3403 found this, which got me searching in the right place.
@DustinB3403 said in PCI bus error:
@JaredBusch I'm assuming you already found this, but if not.
But I cannot get anything working.
I am assuming that it is a driver built in to the kernel that is missing.
This seems to be the initial failure
GRUB is telling it to look for that.
The original system has that label on
sda3
I can boot to a CentOS 4 ISO file, enter rescue mode, and check the label. It is listed.
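For anyone repeating this check, here is a minimal sketch of pulling the label GRUB asks for out of a grub.conf so it can be compared with what e2label reports for the partition. The function name grub_root_label and the sample grub.conf contents (including the label value) are made up for illustration. Only the kernel version string comes from this thread.

```shell
#!/bin/bash
# Print the label named by root=LABEL=... on each kernel line of a grub.conf.
# grub_root_label is a hypothetical helper, not part of any distro tooling.
grub_root_label() {
    # $1: path to a grub.conf (in rescue mode this would live under /mnt/sysimage)
    sed -n 's/^[[:space:]]*kernel[[:space:]].*[[:space:]]root=LABEL=\([^[:space:]]*\).*/\1/p' "$1"
}

# Demo against a made-up grub.conf; the label "/" here is an assumption.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
default=0
timeout=5
title Red Hat Enterprise Linux (2.6.9-55.EL)
        root (hd0,0)
        kernel /vmlinuz-2.6.9-55.EL ro root=LABEL=/
        initrd /initrd-2.6.9-55.EL.img
EOF

echo "GRUB asks for label: $(grub_root_label "$demo_conf")"
# On the real system, compare with what the partition itself reports, e.g.:
#   e2label /dev/sda3
rm -f "$demo_conf"
```

If the two values match, as they did here, the label itself is not the problem and the initrd not seeing the disk at all becomes the more likely culprit.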
So frustrated.
I have tried many combinations of the drive choices and such within Proxmox. I even restored to bare metal on an old desktop, etc, etc.
-
And solved it. Finally..
One of the reboots into the CentOS 4 disc was slow or something, and I caught it popping up this screen (took me 6 reboots to get a screenshot).
This is the Kudzu hardware detection thing.
The VM is set up using the LSI 53C895A SCSI Controller.
Booted into rescue mode with the CentOS 4 CD.
    chroot /mnt/sysimage
    vi /etc/modprobe.conf
    # make this the only scsi_hostadapter
    alias scsi_hostadapter sym53c8xx
    # exit vi !!! omfg how!!!
    cd /boot
    mkinitrd -v -f initrd-2.6.9-55.EL.img 2.6.9-55.EL
    exit
    exit
System automatically reboots to come out of rescue mode.
Make sure you remove the ISO at this point.
Then boom..
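If fighting vi is not appealing, the same modprobe.conf change can probably be scripted. This is only a sketch: set_scsi_hostadapter is a hypothetical helper, and the sample aliases in the demo file are invented. The sym53c8xx driver name and the mkinitrd command are the ones from the fix above.

```shell
#!/bin/bash
# Leave exactly one scsi_hostadapter alias in a modprobe.conf-style file.
# set_scsi_hostadapter is a hypothetical helper written for this example.
set_scsi_hostadapter() {
    # $1: path to the modprobe.conf file, $2: driver module name
    tmp_conf=$(mktemp) || return 1
    # Drop every existing scsi_hostadapter / scsi_hostadapterN alias, keep the rest.
    grep -v -E '^[[:space:]]*alias[[:space:]]+scsi_hostadapter[0-9]*[[:space:]]' "$1" > "$tmp_conf"
    echo "alias scsi_hostadapter $2" >> "$tmp_conf"
    cat "$tmp_conf" > "$1"   # overwrite contents, keeping the file's own permissions
    rm -f "$tmp_conf"
}

# Demo on a throwaway file; these sample aliases are made up for illustration.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
alias eth0 tg3
alias scsi_hostadapter mptbase
alias scsi_hostadapter1 megaraid_mbox
EOF
set_scsi_hostadapter "$demo_conf" sym53c8xx
cat "$demo_conf"
rm -f "$demo_conf"

# In the rescue chroot the target would be /etc/modprobe.conf, followed by the
# initrd rebuild from the post above:
#   cd /boot && mkinitrd -v -f initrd-2.6.9-55.EL.img 2.6.9-55.EL
```

The point of removing the old aliases first is that mkinitrd builds the initrd's SCSI module list from those lines, so a stale entry for the physical controller would still be pulled in.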
-
You could potentially try to install centos4, or rhel4 is even better, so you get to a bootable system.
Then just copy the files from the backup over your installation.
-
@Pete-S said in RHEL 4 not seeing ext3 label:
You could potentially try to install centos4, or rhel4 is even better, so you get to a bootable system.
Then just copy the files from the backup over your installation.
It is an option I have thought about. I'll be on site this morning, and I will be shutting down the host and booting to a Fedora Live to run dd in an effort to get a solid disk image.
I tried their built-in process on Tuesday and it failed with sector/block read errors. A little digging through the files on the recovery ISO showed that all they were doing was using dd, so I am hoping to use dd with more intelligent options to continue on and such.
-
So this is quite old (15 years) but maybe... Link
Sounds like you lost the label on your boot partition. Boot from CD into rescue mode and use e2fslabel to label the partition or change the root= to use /dev/hdx in grub.
-
@JaredBusch said in RHEL 4 not seeing ext3 label:
@Pete-S said in RHEL 4 not seeing ext3 label:
You could potentially try to install centos4, or rhel4 is even better, so you get to a bootable system.
Then just copy the files from the backup over your installation.
It is an option I have thought about. I'll be on site this morning, and I will be shutting down the host and booting to a Fedora Live to run dd in an effort to get a solid disk image.
I tried their built-in process on Tuesday and it failed with sector/block read errors. A little digging through the files on the recovery ISO showed that all they were doing was using dd, so I am hoping to use dd with more intelligent options to continue on and such.
What's the repair solution for bad blocks in a setup like this? If dd can't read because of bad blocks, I'm hoping 'nix has some tool to fix/recover/replace these bad blocks, assuming the data's recoverable on the hardware. Otherwise it's restore time, right?
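On the "more intelligent options" point, the usual dd flags for pushing past unreadable sectors are conv=noerror,sync. The thread never shows the exact command that was run, so treat this as a sketch under that assumption. The helper name image_with_error_skip is made up, and the demo copies a small temp file rather than a real disk.

```shell
#!/bin/bash
# Copy a source to an image file, continuing past read errors.
# image_with_error_skip is a hypothetical helper for illustration.
image_with_error_skip() {
    # $1: source (a block device such as /dev/sda on real hardware), $2: output image
    # conv=noerror : keep going after a read error instead of aborting
    # conv=sync    : pad short or failed reads with zeros so later offsets stay aligned
    # bs=512       : small blocks are slow, but a bad read only costs one block
    dd if="$1" of="$2" bs=512 conv=noerror,sync 2>/dev/null
}

# Demo on a small throwaway file standing in for the disk.
sample=$(mktemp)
image=$(mktemp)
dd if=/dev/urandom of="$sample" bs=512 count=8 2>/dev/null
image_with_error_skip "$sample" "$image"
cmp "$sample" "$image" && echo "image matches source"
rm -f "$sample" "$image"
```

This does not repair anything on the disk. Unreadable blocks just end up as zeros in the image, so the filesystem in the copy may still need an fsck afterwards.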
-
@DustinB3403 said in RHEL 4 not seeing ext3 label:
Boot from CD into rescue mode and use e2fslabel to label the partition or
I stated in the OP that booting into rescue mode, the label is showing correct.
-
@DustinB3403 said in RHEL 4 not seeing ext3 label:
change the root= to use /dev/hdx in grub.
I did that also. It still failed to mount it.
-
This is the script that performs the backup itself. Well, the chunk that does a backup to HDD.
backup2hd() {
    echo "Backup to HD started..."
    AUTOBACKUP=$1
    AUTO=0
    RES=0
    if [ "${AUTOBACKUP}" = "AUTO" ]; then
        RES=0
        AUTO=1
        echo "Auto Full Backup Starts..."
        # mt rewind
    else
        RES=2
        AUTO=0
    fi

    #TODO: Mount check - can't backup to a non-existant or read-only mount point
    RES=0 # Assume all is well - really the mount check would reset this, but until then just "go with it"

    # Make temp directory...
    # TDR_ROOT is the base directory we are going to use on the mounted volume (e.g. /media/usbdisk)
    TMP_TDR=${TDR_ROOT}/tmp/TDR-backup
    mkdir -p $TMP_TDR
    rm -rf $TMP_DIR

    # Size sanity check - can't backup to a device too small.
    # -- Exclusion HD list
    mkdir -p $TMP_TDR/hd
    for HD in $HD_EXCLUDE
    do
        mkdir -p $(dirname $TMP_TDR/hd/$HD) # Account for device names like /dev/cciss/c0d0p1
        touch $TMP_TDR/hd/$HD
    done

    dialog --title "BackupHD" --defaultno --yesno "Skip size check?" 5 30
    if [ $? -eq 1 ]; then
        # - Find total size of backup
        for HD in $(dmesg | grep -P "^\s+\S+:\s+\S+\d+" | grep -P "(\d+|>)$" | cut -d':' -f1 | sed 's/ //g')
        do
            if [ ! -f $TMP_TDR/hd/$HD ]; then
                mkdir -p $(dirname $TMP_TDR/hd/$HD) # Account for device names like /dev/cciss/c0d0p1
                touch $TMP_TDR/hd/$HD
                unset TOTALSIZE
                unset SIZE
                for PART in $(sfdisk -l /dev/$HD | grep -P "Linux$" | cut -d' ' -f1 )
                do
                    echo "Checking $PART size..."
                    SIZE=$(dump -S $PART )
                    TOTALSIZE=$(($TOTALSIZE + $SIZE ))
                    echo "$PART is $SIZE bytes"
                done
            fi
        done
        rm -rf $TMP_TDR/hd/

        # Find device mounted on TDR_ROOT
        TARGETSIZE=$(df $TDR_ROOT| tail -n 1 | awk '{print $4}' )
        TARGETSIZE=$(( $TARGETSIZE * 1024 )) # Convert to bytes
        if [ $TOTALSIZE -gt $TARGETSIZE ]; then
            dialog --title "BackupHD" --msgbox "Target volume is too small.\nTotal size required [$TOTALSIZE]\nTotal size available [$TARGETSIZE]\n" 10 60
            RES=99
        else
            RES=0
        fi
    fi

    # Check that $RES = 0 so we can continue...
    # Otherwise quit this routine.
    if [ $RES -ne 0 ]; then
        break
    fi

    if [ -z $PREFIX ]; then
        # Default prefix to "YYYY-MM-DD-HHMM_"
        PREFIX=$(date +'%F-%H%M')_
    fi

    RECOVERY=$TMP_TDR/recovery-procedure
    rm -f $RECOVERY

    if [ $RES -eq 0 ]; then
        # make restore procedure script
        touch $RECOVERY
        chmod +x $RECOVERY
        echo '#!/bin/bash' >> $RECOVERY
        echo 'unset SSH' >> $RECOVERY
        echo '# -- ' >> $RECOVERY
        echo '# ' >> $RECOVERY
        echo '# --' >> $RECOVERY
        echo 'RESTORE_DIR=$(dirname "$0")' >> $RECOVERY
        echo 'PREFIX='${PREFIX} >> $RECOVERY
        echo 'mkdir -p /tmp/TDR-recover' >> $RECOVERY
        echo 'tar xf ${RESTORE_DIR}/${PREFIX}system-data.tar -C /tmp/TDR-recover' >> $RECOVERY

        mkdir -p $TMP_TDR/hd
        # -- Exclusion list
        for HD in $HD_EXCLUDE
        do
            mkdir -p $(dirname $TMP_TDR/hd/$HD) # Account for device names like /dev/cciss/c0d0p1
            touch $TMP_TDR/hd/$HD
        done

        # - restore boot block and partition table
        for HD in $(dmesg | grep -P "^\s+\S+:\s+\S+\d+" | grep -P "(\d+|>)$" | cut -d':' -f1 | sed 's/ //g')
        do
            if [ ! -f $TMP_TDR/hd/$HD ]; then
                mkdir -p $(dirname $TMP_TDR/hd/$HD) # Account for device names like /dev/cciss/c0d0p1
                # restore boot block
                echo "dd if=/tmp/TDR-recover/hd/$HD.partinfo bs=512 count=63 of=/dev/$HD" >> $RECOVERY
                # restore partition table
                echo "sfdisk /dev/$HD < /tmp/TDR-recover/hd/$HD.sfdisk" >> $RECOVERY
                touch $TMP_TDR/hd/$HD
            fi
        done
        echo "echo \"#--- Sleep for a while to let slow controllers (HP/Compaq RAID's for one) catch up...\"" >> $RECOVERY
        echo "sleep 10" >> $RECOVERY
        rm -rf $TMP_TDR/hd/

        # -- Exclusion HD list
        for HD in $HD_EXCLUDE
        do
            mkdir -p $(dirname $TMP_TDR/hd/$HD) # Account for device names like /dev/cciss/c0d0p1
            touch $TMP_TDR/hd/$HD
        done

        # - recreate partitions (including swap), restore data, re-install grub
        for HD in `dmesg |grep -P "^\s+\S+:\s+\S+\d+"|grep -P "(\d+|\>)$"|cut -d':' -f1|sed 's/ //g'`
        do
            unset FILE
            if [ ! -f $TMP_TDR/hd/$HD ]; then
                mkdir -p $(dirname $TMP_TDR/hd/$HD) # Account for device names like /dev/cciss/c0d0p1
                touch $TMP_TDR/hd/$HD
                for PART in $(sfdisk -l /dev/$HD | grep -P "Linux$" | cut -d' ' -f1 )
                do
                    # Create partition restore procedure
                    LABEL=$(e2label $PART)
                    PART_BASE=$(basename $PART)
                    echo "echo \"# === $LABEL on $PART ===\"" >> $RECOVERY
                    echo "mke2fs -j -L $LABEL $PART" >> $RECOVERY
                    echo "mkdir -p /mnt/$PART_BASE" >> $RECOVERY
                    echo "mount $PART /mnt/$PART_BASE" >> $RECOVERY
                    echo "cd /mnt/$PART_BASE" >> $RECOVERY
                    echo "rm -rf *" >> $RECOVERY
                    FILE="\${RESTORE_DIR}/${PREFIX}${PART_BASE}.img"
                    echo "echo \"# --- Restoring $LABEL from $FILE --- \"" >> $RECOVERY
                    # TODO: RSH=ssh RMT=rmt restore -r ${REMOTE_TAPE}
                    echo "restore -v -M -rf $FILE" >> $RECOVERY
                    echo "rm -f restoresymtable" >> $RECOVERY
                    echo "cd /" >> $RECOVERY
                    echo "umount /mnt/$PART_BASE" >> $RECOVERY
                    if [ "$LABEL" = "/boot" ]; then
                        echo "echo Restoring GRUB bootloader" >> $RECOVERY
                        echo "mkdir -p /mnt/$PART_BASE/boot" >> $RECOVERY
                        echo "mount $PART /mnt/$PART_BASE/boot" >> $RECOVERY
                        echo "grub-install --no-floppy --recheck --root-directory=/mnt/$PART_BASE /dev/$HD" >> $RECOVERY
                        echo "umount /mnt/$PART_BASE/boot" >> $RECOVERY
                    fi
                    echo "" >> $RECOVERY
                done
                # Recreate the swap partition
                for PART in $( sfdisk -l /dev/$HD|grep -P "Linux swap$"|cut -d' ' -f1 )
                do
                    echo "mkswap $PART" >> $RECOVERY
                    echo "" >> $RECOVERY
                done
            fi
        done
        rm -rf $TMP_TDR/hd/

        # Now to actually do the backup
        # -- backup recovery-procedure script
        rm -f $TDR_ROOT/${PREFIX}system-data.tar
        tar cf $TDR_ROOT/${PREFIX}system-data.tar -C $TMP_TDR recovery-procedure
        cp -v $RECOVERY $TDR_ROOT/${PREFIX}recovery-procedure

        # -- Exclusion HD list
        for HD in $HD_EXCLUDE
        do
            mkdir -p $(dirname $TMP_TDR/hd/$HD) # Account for device names like /dev/cciss/c0d0p1
            touch $TMP_TDR/hd/$HD
        done

        # -- backup partition table information
        for HD in `dmesg |grep -P "^\s+\S+:\s+\S+\d+"|grep -P "(\d+|\>)$"|cut -d':' -f1|sed 's/ //g'`
        do
            if [ ! -f $TMP_TDR/hd/$HD ]; then
                mkdir -p $(dirname $TMP_TDR/hd/$HD) # Account for device names like /dev/cciss/c0d0p1
                dd if=/dev/$HD of=$TMP_TDR/hd/$HD.partinfo bs=512 count=63
                sfdisk -d /dev/$HD > $TMP_TDR/hd/$HD.sfdisk
                tar --append -f $TDR_ROOT/${PREFIX}system-data.tar -C $TMP_TDR hd/$HD.partinfo hd/$HD.sfdisk
                touch $TMP_TDR/hd/$HD
            fi
        done
        rm -rf $TMP_TDR/hd/

        # -- Exclusion HD list
        for HD in $HD_EXCLUDE
        do
            mkdir -p $(dirname $TMP_TDR/hd/$HD) # Account for device names like /dev/cciss/c0d0p1
            touch $TMP_TDR/hd/$HD
        done

        # -- backup data for each partition
        for HD in $(dmesg |grep -P "^\s+\S+:\s+\S+\d+"|grep -P "(\d+|\>)$"|cut -d':' -f1|sed 's/ //g')
        do
            unset FILE
            if [ ! -f $TMP_TDR/hd/$HD ]; then
                mkdir -p $(dirname $TMP_TDR/hd/$HD) # Account for device names like /dev/cciss/c0d0p1
                touch $TMP_TDR/hd/$HD
                for PART in $(sfdisk -l /dev/$HD|grep -P "Linux$"|cut -d' ' -f1)
                do
                    # dump to file -- remote could be set in the $TDR_ROOT variable....
                    PART_BASE=$(basename $PART)
                    FILE=${REMOTE}${TDR_ROOT}/${PREFIX}${PART_BASE}.img
                    echo "Dumping $PART_BASE to $FILE ..."
                    # -B 4589824 => (4589824 x 1024 = 4699979776 bytes) or DVD size chunk
                    # -B 665600 => ( 665600 x 1024 = 681574400 bytes) or CD size chunks
                    # dump $DUMP_OPT -M -B 4589824 -0 $PART -j9 -f $FILE
                    dump $DUMP_OPT -M -B 665600 -b 10 -0 $PART -j9 -f $FILE
                done
            fi
        done
        rm -rf $TMP_TDR/hd/

        #TODO: Package the resulting files into one (or more chunks) ?
        rm -Rf $TMP_TDR
        if [ ${AUTO} -eq 0 ]; then
            dialog --no-kill --msgbox "[Backup]\nBackup is done!" 6 40
        fi
        echo "It is safe to reboot now"
    elif [ $RES -eq 1 ]; then
        dialog --no-kill --msgbox "[Backup]\nThis computer encountered an error\n Try another method\n" 7 50
    fi
}
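The odd-looking dmesg pipeline that the script repeats is how it finds the disks: it keys off the kernel's partition-scan lines. Below is a small sketch of that filter run against sample text. Two assumptions to flag: the grep -P patterns are rewritten as grep -E with POSIX classes so it runs where PCRE support is missing, and the leading space on the partition-scan lines in the sample is assumed (that is how 2.6 kernels normally print them, but the whitespace did not survive in the paste further down). The function name list_disks_from_dmesg is made up.

```shell
#!/bin/bash
# Extract disk names from dmesg-style text on stdin, mirroring the script's filter.
# list_disks_from_dmesg is a hypothetical name; the ERE patterns are a translation
# of the script's grep -P expressions, not the vendor's exact code.
list_disks_from_dmesg() {
    # 1) indented "name: partition..." lines, 2) ending in a digit or ">",
    # 3) keep the part before the colon, 4) strip the indentation
    grep -E '^[[:space:]]+[^[:space:]]+:[[:space:]]+[^[:space:]]+[0-9]+' |
        grep -E '([0-9]+|>)$' |
        cut -d':' -f1 |
        sed 's/ //g'
}

# Sample lines modeled on the dmesg output in this thread (indentation assumed).
list_disks_from_dmesg <<'EOF'
sda: assuming drive cache: write through
 sda: sda1 sda2 sda3
Attached scsi disk sda at scsi2, channel 2, id 0, lun 0
sdb: assuming drive cache: write through
 sdb: sdb1
EOF
```

One consequence worth noting: if the kernel ring buffer has wrapped, or a disk was never scanned because its driver did not load, that disk silently drops out of the loop.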
-
Well, dd is moving right along.
I had to use their recovery CD to boot the hardware. It would not boot to any of my USB drives.
So that is dd from RHEL 4. The USB disk it is writing to is formatted FAT. So a direct write puked at 4GB.
The version of split on there only supports a size tag of m at the largest. So I went with 650MB on the split to match what their normal process creates.
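The exact command line is only in the screenshot, so this is a guess at its shape: dd streaming into split so that no single file hits the FAT32 limit (just under 4 GiB per file). The helper name image_in_pieces and the output prefix are made up, and the demo uses a 3 MiB file with 1m pieces so it runs in a second; on the real box the piece size would be 650m.

```shell
#!/bin/bash
# Stream a source through split so no single output file exceeds the piece size.
# image_in_pieces is a hypothetical helper for illustration.
image_in_pieces() {
    # $1: source (file or block device), $2: output prefix,
    # $3: piece size as accepted by split -b (e.g. 650m)
    dd if="$1" bs=64k 2>/dev/null | split -b "$3" - "$2"
}

# Demo: a 3 MiB stand-in "disk" cut into 1 MiB pieces.
work=$(mktemp -d)
dd if=/dev/zero of="$work/sample.raw" bs=1024 count=3072 2>/dev/null
image_in_pieces "$work/sample.raw" "$work/disk.img." 1m
ls "$work"
rm -rf "$work"
```

split names the pieces with aa, ab, ac, ... suffixes, which sort in the right order for putting them back together later.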
-
I'm monitoring the progress in console 2 (ctl+alt+f2) with
watch -n 1 "ls -lash /dd_manual/dd"
-
Process completed with no errors yesterday.
Now to merge it all back together and try to restore it to a VM.
-
It feels like I'm watching reality TV.
-
@JaredBusch said in RHEL 4 not seeing ext3 label:
Process completed with no errors yesterday.
Now to merge it all back together and try to restore it to a VM.
Do you need to merge it? just wondering?
-
@Dashrender said in RHEL 4 not seeing ext3 label:
Do you need to merge it? just wondering?
How else does it become a single disk image file to import into my hypervisor?
-
So back home, and I have the files backed up in like 4 places.
I recombined the .img files and then unzipped them.
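For reference, recombining and unzipping can be done in one pass. This sketch assumes the pieces are a gzip stream that was split afterwards, which is my reading of "recombined ... and then unzipped". The actual commands are not in the thread, and rebuild_image plus the file names are made up.

```shell
#!/bin/bash
# Concatenate split pieces in name order and decompress into one image.
# rebuild_image is a hypothetical helper for illustration.
rebuild_image() {
    # $1: piece prefix (pieces named <prefix>aa, <prefix>ab, ...), $2: output image
    # The shell expands the glob in sorted order, which matches split's suffixes.
    cat "$1"* | gunzip -c > "$2"
}

# Demo: compress and split a small random file, then rebuild and compare.
work=$(mktemp -d)
dd if=/dev/urandom of="$work/original.raw" bs=1024 count=8 2>/dev/null
gzip -c "$work/original.raw" | split -b 1k - "$work/piece."
rebuild_image "$work/piece." "$work/restored.raw"
cmp "$work/original.raw" "$work/restored.raw" && echo "restored image matches"
rm -rf "$work"
```

Keep the output file name outside the piece prefix, otherwise the glob would pick up the image being written.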
Getting ready to set up a new VM on Proxmox, but I poked around dmesg on the running system first.
SCSI subsystem initialized
Fusion MPT base driver 3.02.73rh
Copyright (c) 1999-2006 LSI Logic Corporation
Fusion MPT SPI Host driver 3.02.73rh
ACPI: PCI Interrupt 0000:02:05.0[A] -> GSI 34 (level, low) -> IRQ 201
mptbase: Initiating ioc0 bringup
ioc0: 53C1030: Capabilities={Initiator,Target}
scsi0 : ioc0: LSI53C1030, FwRev=01032300h, Ports=1, MaxQ=255, IRQ=201
ACPI: PCI Interrupt 0000:02:05.1[B] -> GSI 33 (level, low) -> IRQ 209
mptbase: Initiating ioc1 bringup
ioc1: 53C1030: Capabilities={Initiator,Target}
scsi1 : ioc1: LSI53C1030, FwRev=01032300h, Ports=1, MaxQ=255, IRQ=209
Fusion MPT SAS Host driver 3.02.73rh
megaraid cmm: 2.20.2.6rh (Release Date: Tue Jan 16 12:35:06 PST 2007)
megaraid: 2.20.4.6-rh2 (Release Date: Wed Jun 28 12:27:22 EST 2006)
megaraid: probe new device 0x1000:0x1960:0x1028:0x0518: bus 9:slot 4:func 0
ACPI: PCI Interrupt 0000:09:04.0[A] -> GSI 106 (level, low) -> IRQ 233
megaraid: fw version:[351S] bios version:[1.10]
scsi2 : LSI Logic MegaRAID driver
scsi[2]: scanning scsi channel 0 [Phy 0] for non-raid devices
Vendor: PE/PV Model: 1x6 SCSI BP Rev: 1.0
Type: Processor ANSI SCSI revision: 02
scsi[2]: scanning scsi channel 1 [Phy 1] for non-raid devices
scsi[2]: scanning scsi channel 2 [virtual] for logical drives
Vendor: MegaRAID Model: LD 0 RAID1 69G Rev: 351S
Type: Direct-Access ANSI SCSI revision: 02
SCSI device sda: 143114240 512-byte hdwr sectors (73274 MB)
sda: asking for cache data failed
sda: assuming drive cache: write through
SCSI device sda: 143114240 512-byte hdwr sectors (73274 MB)
sda: asking for cache data failed
sda: assuming drive cache: write through
sda: sda1 sda2 sda3
Attached scsi disk sda at scsi2, channel 2, id 0, lun 0
Vendor: MegaRAID Model: LD 1 RAID5 139G Rev: 351S
Type: Direct-Access ANSI SCSI revision: 02
SCSI device sdb: 286228480 512-byte hdwr sectors (146549 MB)
sdb: asking for cache data failed
sdb: assuming drive cache: write through
SCSI device sdb: 286228480 512-byte hdwr sectors (146549 MB)
sdb: asking for cache data failed
sdb: assuming drive cache: write through
sdb: sdb1
Attached scsi disk sdb at scsi2, channel 2, id 1, lun 0
I think this tells me that I should try the MegaRAID controller this time. I swear I already tried, but I have slept since then. Tuesday and Wednesday were crazy stressed getting data..
-
Well damnit. It does not see the second disk..
Looks like an error during boot
-
Can you boot from a live image and see both disks?
I did a d2vm of a Windows 2003 server and I had to run checkdisk like 10 times before it finally worked.. don't ask me why I tried it so many times... I think there is a thread around here somewhere about it.
-
@Dashrender said in RHEL 4 not seeing ext3 label:
Can you boot from a live image and see both disks?
I did a d2vm of a Windows 2003 server and I had to run checkdisk like 10 times before it finally worked.. don't ask me why I tried it so many times... I think there is a thread around here somewhere about it.
The restored drives are fine. Can be mounted as previously noted and the label reports correctly.
The issue seems to be that the kernel, as built, is not loading the drives correctly. Potentially because the VM is using a SCSI driver method the old ass kernel does not understand.
-
Didn't Dell "back in the day" use or require their own megaraid drivers on Linux?? Can't remember, as it's been ages since I dealt with a 28XX series with a PERC RAID card.
-
Using VirtIO SCSI (the default selection), the drives are not even seen by the recovery boot image. The only thing shown is the USB drive holding the data to restore.