How do you find duplicates from Windows SMB shares using Linux
- 
 I'm just looking for a way to tally the number of duplicate files on any given share; it doesn't need to be anything fancy. Ideally it would check the hashes of the files and then write a summary to a log file. I'm looking at fdupes (`dnf install fdupes`), as this might do what I want, but I'm open to suggestions.
- 
 @DustinB3403 I looked it up and that's what I found as likely the best option, too.
- 
 @DustinB3403 I would assume you can just write the command output to a file, and that should accomplish what you want with the most simplicity.
- 
 @IRJ Yeah, the output part is really simple, and fdupes seems really simple too. `fdupes -rmsHA --sameline /target > output.log` is running. I just wasn't sure if there were any better options out there.
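 For comparison, here is a rough sketch of the same idea (hash every file, then summarize) using only GNU findutils and coreutils instead of fdupes. The function name `tally_dupes` is made up for illustration, and the summary line only loosely imitates what a duplicate finder prints. It assumes file names contain no newlines or backslashes, since `sha256sum` escapes those in its output.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of fdupes): tally duplicate files by SHA-256.
# Assumes GNU find/xargs/sha256sum and file names without newlines or backslashes.
tally_dupes() {
    local target="$1"
    # Hash every regular file under the target, then count files per hash.
    find "$target" -type f -print0 \
        | xargs -0 -r sha256sum \
        | awk '{ count[$1]++ }
               END {
                   for (h in count) {
                       if (count[h] > 1) {
                           sets++                   # one more group of identical files
                           extra += count[h] - 1    # copies beyond the first in that group
                       }
                   }
                   printf "%d duplicate files (in %d sets)\n", extra, sets
               }'
}
```

 Something like `tally_dupes /target > output.log` would then write the summary to a log file. Unlike fdupes, this hashes every file unconditionally, so expect it to be slower on a large share.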
- 
 @DustinB3403 I just realized that the `--sameline` option can be replaced with `-1`, as in the number one. The manual isn't clear about that, and from reading the option itself it is hard to tell whether it is a one or a lowercase L.
- 
 @DustinB3403 You may also want to grep for certain data if the entire output is too noisy.
- 
 @IRJ Normally I would filter down, but since I'm just trying to get a grasp on the amount of potential duplication, filtering at this point would only skew that number.
- 
 @DustinB3403 Some folks claim jdupes is faster. I have used both and did not notice much of a difference.
 Both work well.
- 
 @pattonb To get an idea of how many dupes there are, use the following: `fdupes -r -m /directory` (where /directory is the share to scan).
- 
 I wonder if this would run faster directly on the server in PowerShell instead? I'm assuming that over SMB you have to download every file in order to hash it; if run locally, you get to skip the download time.
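 One caveat on the download assumption: as far as I know, fdupes compares file sizes first and only reads files whose size matches at least one other file, so a file with a unique size should not have to cross the wire at all. A minimal sketch of that pre-filter (not fdupes' actual code; the function name `size_candidates` is hypothetical, and it assumes GNU find and file names without tabs or newlines):

```shell
#!/usr/bin/env bash
# Hypothetical helper: list only the files that share their size with another file.
# Only these candidates would need to be read and hashed.
# Assumes GNU find (-printf) and file names without tabs or newlines.
size_candidates() {
    local target="$1"
    # Emit "size<TAB>path" per file; sizes come from metadata, not file contents.
    find "$target" -type f -printf '%s\t%p\n' \
        | awk -F '\t' '
            { n[$1]++; files[$1] = files[$1] $2 "\n" }
            END { for (s in n) if (n[s] > 1) printf "%s", files[s] }'
}
```

 How much this saves depends entirely on the data: a share full of same-sized files would still be read almost in full.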
- 
 @Dashrender I gathered that the SMB shares are hosted on Linux, but I could be wrong. If they are hosted on Windows like you are assuming, then I would agree that PowerShell would probably be most performant for this.
- 
 @IRJ The title says "Windows SMB shares". My guess is that Dustin is a lone wolf running a 'nix OS as his machine, and the rest of the company is using Windows. Nothing wrong with that, just my guess.
- 
 @Dashrender His company is significantly Mac.
- 
 @JaredBusch Aww, that's right, he has been asking a lot of Mac questions lately.
- 
 @Dashrender Unix questions, to be more precise, but yeah, we are a heavy Mac shop.




