Bash script to manage file directory by size
-
I am housekeeping some directories by running a script to delete everything older than X days:
find /var/backups/app1* -mtime +30 -exec rm {} \;
That seems to be working OK, but size varies among servers. I would rather delete the oldest files once a threshold is met.
Example: I have a hard limit of 5GB that I cannot exceed. After keeping 30 days' worth of files, on some servers I am only using 500MB and on others 3GB. I would rather keep more days and just not exceed 5GB, so I can use all the available space. So I am thinking something like this:
If the directory exceeds 4GB, then delete the 5 oldest days of logs.
-
@IRJ said in Bash script to manage file directory by size:
If the directory exceeds 4GB, then delete the 5 oldest days of logs.
You can use du -h to get directory usage and then evaluate it in the script.
Edit: I believe du -hs will give you the summarized total in human-readable form.
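A minimal sketch of that approach (the path and 4GB threshold are just examples, not from the original post), using du -sk instead of -h so the output is a plain number that's easy to compare:

dir="/var/backups/app1"               # example path
limit_kb=$((4 * 1024 * 1024))         # 4GB threshold, in 1K blocks

# du -s summarizes the whole directory; -k forces 1K blocks so the
# result is machine-readable, unlike the human-readable -h form.
used_kb=$(du -sk "$dir" | cut -f1)

if [ "$used_kb" -gt "$limit_kb" ]; then
    echo "$dir is over the threshold (${used_kb}KB used), time to prune"
    # ...delete the oldest files here...
fi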
-
@IRJ said in Bash script to manage file directory by size:
If the directory exceeds 4GB, then delete the 5 oldest days of logs.
Why 5 days? Why not just one if that gets you below 4GB; if not, run it again, etc.?
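That delete-one-and-recheck idea could look something like this (a sketch, assuming GNU find for -printf and the same example path and threshold as above):

dir="/var/backups/app1"               # example path
limit_kb=$((4 * 1024 * 1024))         # 4GB threshold, in 1K blocks

# Keep removing the single oldest file until the directory fits.
while [ "$(du -sk "$dir" | cut -f1)" -gt "$limit_kb" ]; do
    # GNU find prints "mtime<TAB>path"; sort numerically, take the oldest.
    oldest=$(find "$dir" -type f -printf '%T@\t%p\n' | sort -n | head -n 1 | cut -f2-)
    [ -z "$oldest" ] && break         # nothing left to delete
    echo "Removing oldest file: $oldest"
    rm -- "$oldest"
done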
-
Bash is so primitive. Use Python, PHP, Perl, or whatever you feel comfortable with instead.
Then you can make the script do whatever you want and also produce a meaningful log file.
-
I recently went through this with some backups. I was originally just doing the mtime thing, but then we found one of the servers sending the backups was screwing up and not sending all 4 parts, so we suddenly had no valid backups. We fixed the server, and then I made this script:
file_cleanup.sh
#!/bin/bash

# Send everything to logs and screen.
exec 1> >(logger -s -t $(basename $0)) 2>&1

# Variables and descriptions of their use.
# Array of dates found in the filename of the backup files.
arrDates=()
# Number of full backup sets to keep.
keep=4
# How many full backup sets have been found.
found=0
# Base path to the backup files, minus the last folder.
base="/home/username/"
# Full path to the backup files, populated by the script.
path=""

# This script requires that the final folder name be passed as a parameter.
# This is because it is designed to be run independently for each subfolder.
# ex: ./file_cleanup.sh FolderA
# ex: ./file_cleanup.sh FolderB

# Check for the path to be passed.
if [ ! -z "$1" ]
then
    # Create the full path to be checked based on the passed parameter.
    path=$base$1
else
    exit 127
fi

printf "Executing cleanup of backup files located in $path.\n"

# Loop through all of the files in the path and parse out an array of the
# file dates from the file names.
# All backups are named `backup-0000001-YYYYMMDD-XXXX*`.
cd $path
for f in backup-*
do
    # The date is from character 15 for 8 characters.
    arrDates=("${arrDates[@]}" "${f:15:8}")
done
cd ~

# Sort in reverse order and only show unique dates.
arrDates=($(printf '%s\n' "${arrDates[@]}" | sort -ru))

# Loop through the array of dates and check for there to be 4 files for each date.
for checkdate in "${arrDates[@]}"
do
    count=$(find "$path"/backup-0000001-"$checkdate"-* -type f -printf '.' | wc -c)
    if [ $count -eq 4 ] && [ $found -lt $keep ]
    then
        found=$((found+1))
        printf "Checking $checkdate, we found $count files. We are keeping this date, currently we have $found dates saved.\n"
    elif [ $count -gt 0 ] && [ ! $count -eq 4 ]
    then
        printf "Incorrect number of files ($count) found, removing invalid backup dated $checkdate.\n"
        rm $path/backup-*-$checkdate-*
    elif [ $count -gt 0 ] && [ $found -eq $keep ]
    then
        printf "We have already found $keep full sets of backup files. Removing backup files dated $checkdate.\n"
        rm $path/backup-*-$checkdate-*
    else
        printf "The date $checkdate returned $count files. This is an unhandled scenario, doing nothing.\n"
    fi
done
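As the header comments note, it's meant to be invoked once per backup subfolder, so a run looks like:

./file_cleanup.sh FolderA
./file_cleanup.sh FolderB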
-
@pmoncho said in Bash script to manage file directory by size:
You can use du -h to get directory usage and then evaluate it in the script.
Edit: I believe du -hs will give you the summarized total in human-readable form.
Yeah I think this is the route I will need to go.
-
@IRJ said in Bash script to manage file directory by size:
Yeah I think this is the route I will need to go.
Can you install PowerShell on it? Then it'd be really easy for me to help.
-
@JaredBusch This is what it looks like on a run:
journalctl -u backup-cleanup -f
-- Logs begin at Wed 2020-01-08 22:25:52 CST. --
Jan 31 08:00:39 ftp.domain.local file_cleanup.sh[37655]: Executing cleanup of backup files located in /home/toptech/FolderA.
Jan 31 08:00:39 ftp.domain.local file_cleanup.sh[37655]: Checking 20200131, we found 4 files. We are keeping this date, currently we have 1 dates saved.
Jan 31 08:00:39 ftp.domain.local file_cleanup.sh[37655]: Checking 20200130, we found 4 files. We are keeping this date, currently we have 2 dates saved.
Jan 31 08:00:39 ftp.domain.local file_cleanup.sh[37655]: Checking 20200129, we found 4 files. We are keeping this date, currently we have 3 dates saved.
Jan 31 08:00:39 ftp.domain.local file_cleanup.sh[37655]: Checking 20200128, we found 4 files. We are keeping this date, currently we have 4 dates saved.
Jan 31 08:00:39 ftp.domain.local file_cleanup.sh[37655]: We have already found 4 full sets of backup files. Removing backup files dated 20200127.
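The journalctl -u backup-cleanup call implies the script runs under a systemd unit; the unit file itself isn't posted in the thread, but a minimal oneshot service along these lines (name and path are assumptions) would produce journal entries like the above:

# /etc/systemd/system/backup-cleanup.service (hypothetical; not from the thread)
[Unit]
Description=Cleanup of backup files

[Service]
Type=oneshot
ExecStart=/home/username/file_cleanup.sh FolderA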
-
@JaredBusch said in Bash script to manage file directory by size:
This is what it looks like on a run
That is nice and clean. Easy to integrate with a SIEM if you'd like as well.