
    Bash script to manage file directory by size

    IT Discussion
    9 Posts 6 Posters 318 Views
    • IRJ
      last edited by

      I am housekeeping some directories by running a script to delete everything older than X days.

      find /var/backups/app1* -mtime +30 -exec rm {} \;

      That seems to be working OK, but sizes vary among servers. I would rather delete the oldest files once a threshold is met.

      Example: I have a hard limit of 5GB that I cannot exceed. After keeping 30 days' worth of files, some servers are only using 500MB and others 3GB. I would rather keep more days and just not exceed 5GB, so I can use all available space.

      So I am thinking something like this:

      If directory exceeds 4GB, then delete the 5 oldest days of logs
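A minimal bash sketch of that rule, assuming GNU find and du; the helper name `prune_oldest_days` and the numbers in the usage example are made up for illustration, not a final implementation:

```shell
#!/bin/bash
# prune_oldest_days DIR LIMIT_MB DAYS
# If DIR uses more than LIMIT_MB, delete the DAYS oldest days of files.
# Hypothetical helper name; assumes GNU find/du.
prune_oldest_days() {
    local dir=$1 limit_mb=$2 days=$3 used_mb cutoff
    used_mb=$(du -sm "$dir" | cut -f1)
    [ "$used_mb" -le "$limit_mb" ] && return 0
    # The Nth-oldest distinct modification date (YYYY-MM-DD) becomes the cutoff.
    cutoff=$(find "$dir" -type f -printf '%TY-%Tm-%Td\n' | sort -u | sed -n "${days}p")
    [ -n "$cutoff" ] || return 0
    # Remove every file whose mtime falls on or before the cutoff day.
    find "$dir" -type f ! -newermt "$cutoff + 1 day" -delete
}

# e.g.: prune_oldest_days /var/backups/app1 4096 5
```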

      • pmoncho @IRJ
        last edited by pmoncho

        @IRJ said in Bash script to manage file directory by size:

        I would rather delete the oldest files once a threshold is met.

        You can use du -h to get directory usage and then evaluate it.

        Edit: I believe du -hs will give you a summarized total in human-readable form
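For a threshold check, the human-readable output of du -h is awkward to compare; a machine-readable unit is easier to script against. A small sketch — the function name is made up, and GNU du is assumed:

```shell
#!/bin/bash
# Summarized size of a directory in MiB (GNU du; -b would give exact bytes).
# dir_usage_mb is a hypothetical name for illustration.
dir_usage_mb() {
    du -sm "$1" | cut -f1
}

# e.g.: [ "$(dir_usage_mb /var/backups/app1)" -gt 4096 ] && echo "over 4 GB"
```

With a plain integer in hand, the 4 GB rule becomes a simple numeric comparison rather than string parsing.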

        • Dashrender @IRJ
          last edited by

          @IRJ said in Bash script to manage file directory by size:

          If directory exceeds 4GB, then delete the 5 oldest days of logs

          Why 5 days? Why not delete just one day and see if that gets you below 4 GB; if not, run it again, etc.?
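That one-at-a-time idea can be roughed out as a bash loop — function name is hypothetical, GNU find/du assumed:

```shell
#!/bin/bash
# prune_until_under DIR LIMIT_MB: delete the single oldest file,
# re-check usage, and repeat until at or under the limit.
# Hypothetical name; assumes GNU find/du.
prune_until_under() {
    local dir=$1 limit_mb=$2 oldest
    while [ "$(du -sm "$dir" | cut -f1)" -gt "$limit_mb" ]; do
        # Oldest file by mtime (%T@ is seconds since the epoch).
        oldest=$(find "$dir" -type f -printf '%T@ %p\n' | sort -n | head -n 1 | cut -d' ' -f2-)
        [ -n "$oldest" ] || break   # nothing left to delete
        rm -- "$oldest"
    done
}
```

This trades extra du calls for never deleting more than necessary.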

          • 1337
            last edited by

            Bash is so primitive. Use Python, PHP, Perl, or whatever you feel comfortable with instead.
            Then you can make the script do whatever you want and also produce a meaningful log file.

            • JaredBusch
              last edited by

              I recently went through this with some backups. I was originally just doing the mtime thing, but then we found one of the servers sending the backups was screwing up and not sending all 4 parts, so we suddenly had no valid backups.

              We fixed the server, and then I made this script, file_cleanup.sh:

              #!/bin/bash
              # Send everything to the logs and the screen.
              exec 1> >(logger -s -t "$(basename "$0")") 2>&1
              
              # Variables and descriptions of their use.
              # Array of dates found in the filenames of the backup files.
              arrDates=()
              # Number of full backup sets to keep.
              keep=4
              # How many full backup sets have been found.
              found=0
              # Base path to the backup files, minus the last folder.
              base="/home/username/"
              # Full path to the backup files, populated by the script.
              path=""
              
              # This script requires that the final folder name be passed as a parameter.
              # This is because it is designed to be run independently for each subfolder.
              # ex: ./file_cleanup.sh FolderA
              # ex: ./file_cleanup.sh FolderB
              
              # Check for the path to be passed.
              if [ ! -z "$1" ]
              then
                  # Create the full path to be checked based on the passed parameter.
                  path=$base$1
              else
                  exit 127
              fi
              
              printf "Executing cleanup of backup files located in $path.\n"
              
              # Loop through all of the files in the path and parse out an array of the file dates from the file names.
              # All backups are named `backup-0000001-YYYYMMDD-XXXX*`.
              cd "$path" || exit 1
              for f in backup-*
              do
                  # The date is from character 15 for 8 characters.
                  arrDates=("${arrDates[@]}" "${f:15:8}")
              done
              cd ~
              
              # Sort in reverse order and only show unique dates.
              arrDates=($(printf '%s\n' "${arrDates[@]}" | sort -ru))
              
              # Loop through the array of dates and check for there to be 4 files for each date.
              for checkdate in "${arrDates[@]}"
              do
                  count=$(find "$path"/backup-0000001-"$checkdate"-* -type f -printf '.' | wc -c)
                  if [ $count -eq 4 ] && [ $found -lt $keep ]
                  then
                      found=$((found+1))
                      printf "Checking $checkdate, we found $count files. We are keeping this date, currently we have $found dates saved.\n"
                  elif [ $count -gt 0 ] && [ ! $count -eq 4 ]
                  then
                      printf "Incorrect number of files ($count) found, removing invalid backup dated $checkdate.\n"
                      rm "$path"/backup-*-"$checkdate"-*
                  elif [ $count -gt 0 ] && [ $found -eq $keep ]
                  then
                      printf "We have already found $keep full sets of backup files. Removing backup files dated $checkdate.\n"
                      rm "$path"/backup-*-"$checkdate"-*
                  else
                      printf "The date $checkdate returned $count files. This is an unhandled scenario, doing nothing.\n"
                  fi
              done
              
              • IRJ @pmoncho
                last edited by

                 @pmoncho said in Bash script to manage file directory by size:

                 You can use du -h to get directory usage and then evaluate it.

                 Edit: I believe du -hs will give you a summarized total in human-readable form

                Yeah I think this is the route I will need to go.

                • Obsolesce @IRJ
                  last edited by

                  @IRJ said in Bash script to manage file directory by size:

                  Yeah I think this is the route I will need to go.

                  Can you install PowerShell on it? Then it'd be really easy for me to help 😉

                  • JaredBusch @JaredBusch
                    last edited by JaredBusch

                     @JaredBusch This is what it looks like on a run:

                    journalctl -u backup-cleanup  -f
                    -- Logs begin at Wed 2020-01-08 22:25:52 CST. --
                    Jan 31 08:00:39 ftp.domain.local file_cleanup.sh[37655]: Executing cleanup of backup files located in /home/toptech/FolderA.
                    Jan 31 08:00:39 ftp.domain.local file_cleanup.sh[37655]: Checking 20200131, we found 4 files. We are keeping this date, currently we have 1 dates saved.
                    Jan 31 08:00:39 ftp.domain.local file_cleanup.sh[37655]: Checking 20200130, we found 4 files. We are keeping this date, currently we have 2 dates saved.
                    Jan 31 08:00:39 ftp.domain.local file_cleanup.sh[37655]: Checking 20200129, we found 4 files. We are keeping this date, currently we have 3 dates saved.
                    Jan 31 08:00:39 ftp.domain.local file_cleanup.sh[37655]: Checking 20200128, we found 4 files. We are keeping this date, currently we have 4 dates saved.
                    Jan 31 08:00:39 ftp.domain.local file_cleanup.sh[37655]: We have already found 4 full sets of backup files. Removing backup files dated 20200127.
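The `-u backup-cleanup` flag implies the script runs under a systemd unit. A minimal sketch of what such a service and timer could look like — the unit names, paths, and schedule here are guesses for illustration, not taken from the thread:

```ini
# /etc/systemd/system/backup-cleanup.service  (hypothetical)
[Unit]
Description=Clean up old backup files

[Service]
Type=oneshot
ExecStart=/usr/local/bin/file_cleanup.sh FolderA

# /etc/systemd/system/backup-cleanup.timer  (hypothetical)
[Unit]
Description=Run backup cleanup daily

[Timer]
OnCalendar=*-*-* 08:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

A oneshot service paired with a timer lands every run in the journal, which is what makes `journalctl -u` output like the above possible.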
                    
                    • IRJ @JaredBusch
                      last edited by

                      @JaredBusch said in Bash script to manage file directory by size:

                      This is what it looks like on a run:

                      Jan 31 08:00:39 ftp.domain.local file_cleanup.sh[37655]: Executing cleanup of backup files located in /home/toptech/FolderA.

                      That is nice and clean. Easy to integrate with a SIEM if you'd like as well.
