    How do you find duplicates from Windows SMB shares using Linux

    IT Discussion
    linux duplication reporting
    • IRJ @DustinB3403

      @DustinB3403 said in How do you find duplicates from Windows SMB shares using Linux:

      I'm just looking for a way to tally the number of duplicate files on any given share; it doesn't need to be anything fancy. Ideally it would check the hashes of the files and then post a summary to a log file.

      I'm looking at fdupes (dnf install fdupes), as this might do what I want, but I'm open to suggestions.

      I would assume you can just redirect the command's output to a file; that should accomplish what you want in the simplest way.
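For illustration only, here is a minimal sketch of the "hash everything and log a summary" idea without fdupes. It assumes the share is already mounted locally and that find, sha256sum, and awk are available. The function name tally_dupes, the mount point /mnt/share, and the log name are placeholders, not anything from this thread. Note that it counts every file in a duplicate set, including the first copy.

```shell
#!/bin/sh
# tally_dupes DIR: hash every regular file under DIR and report how many
# files share their content with at least one other file.
# Hypothetical helper; it reads every file in full, so it is slow on a
# network mount compared with a purpose-built tool.
tally_dupes() {
    find "$1" -type f -exec sha256sum {} + |
        awk '{ count[$1]++ }   # $1 is the hex digest printed by sha256sum
             END {
                 for (h in count)
                     if (count[h] > 1) { sets++; files += count[h] }
                 printf "%d files in %d duplicate sets\n", files, sets
             }'
}

# Usage (path and log name are assumptions):
#   tally_dupes /mnt/share > output.log
```

Redirecting with > overwrites the log on each run; >> would append instead.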

      • DustinB3403 @IRJ

        @IRJ Yeah, the output part is really simple, and fdupes seems really simple too.

        fdupes -rmsHA --sameline /target > output.log is running.

        I just wasn't sure if there were any better options out there.

        • DustinB3403

          I just realized that the --sameline option can be shortened to -1 (the digit one). The manual isn't clear about that, and when reading the option it is hard to tell the digit from a lowercase letter L.

          • IRJ @DustinB3403

            You may also want to grep for certain data if the entire output is too noisy.
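A small sketch of what that filtering could look like. It assumes the log was written by fdupes with -1 (--sameline) and without -m, so that each duplicate set sits on one line; as far as I can tell, -m prints only the summary line, so there would be nothing to filter in that case. The function name sets_matching and the log name are made up for the example.

```shell
#!/bin/sh
# sets_matching PATTERN LOG: count the lines (duplicate sets) in LOG that
# contain at least one path matching PATTERN, a grep basic regular expression.
# Assumes one duplicate set per line, as written by `fdupes -r -1 DIR`.
sets_matching() {
    grep -c -- "$1" "$2"
}

# Usage (log name is an assumption):
#   sets_matching '\.pdf' output.log   # sets that include a .pdf path
#   sets_matching . output.log         # all non-empty lines, i.e. every set
```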

            • DustinB3403 @IRJ

              Normally I would filter it down, but since I'm just trying to get a grasp of how much potential duplication there is, filtering at this point would only skew that number.

              • pattonb @DustinB3403

                @DustinB3403 Some folks claim jdupes is faster. I have used both and did not notice much of a difference.
                Both work well.

                • pattonb @pattonb

                  @pattonb To get an idea of how many dupes there are, use the following, where /directory is the share to scan:

                  fdupes -r -m /directory
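If the summary should end up in a log, as asked at the top of the thread, a wrapper along these lines might do. It is only a sketch: the function name summarize_dupes and the paths are invented for the example, and it assumes jdupes accepts the same -r and -m options as fdupes (which I believe it does, being derived from fdupes).

```shell
#!/bin/sh
# summarize_dupes DIR LOG: append a timestamped duplicate summary for DIR
# to LOG, using jdupes when it is installed and falling back to fdupes.
summarize_dupes() {
    dir=$1
    log=$2
    for tool in jdupes fdupes; do
        if command -v "$tool" >/dev/null 2>&1; then
            {
                # Header line: when the scan ran, with which tool, on what path.
                printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$tool" "$dir"
                "$tool" -r -m "$dir"
            } >> "$log"
            return   # status of the scan itself
        fi
    done
    echo "neither jdupes nor fdupes is installed" >&2
    return 127
}

# Usage (paths are assumptions):
#   summarize_dupes /mnt/share /var/tmp/dupes.log
```

Appending rather than overwriting keeps earlier runs in the log, which makes it easy to compare numbers over time.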

                  • Dashrender

                    I wonder if this would run faster directly on the server in PowerShell instead. I'm assuming that doing this over SMB means every file has to be downloaded before it can be hashed; if it ran locally, you would skip the download time.
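On the download cost: as far as I know, fdupes compares file sizes before it hashes anything, so a file whose size is unique on the share should not need to be read at all. The sketch below shows that size-first idea on its own. It assumes GNU find (for -printf) and paths without tabs or newlines; the function name size_collisions and the mount point are invented for the example.

```shell
#!/bin/sh
# size_collisions DIR: print only the files whose size matches at least one
# other file under DIR. Only these candidates would need to be read and
# hashed; files with a unique size cannot have a duplicate.
size_collisions() {
    find "$1" -type f -printf '%s\t%p\n' |
        sort -n |
        awk -F '\t' '
            BEGIN { prev_size = -1 }
            $1 == prev_size {
                # Same size as the previous line: emit the held first file
                # of this size (once), then the current file.
                if (held != "") { print held; held = "" }
                print $2
                next
            }
            { prev_size = $1; held = $2 }'
}

# Usage (mount point is an assumption; xargs -d is a GNU extension):
#   size_collisions /mnt/share | xargs -d '\n' sha256sum
```

Listing sizes only touches metadata, which is cheap over SMB compared with pulling file contents, so the bulk of the transfer is limited to the candidates.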

                    • IRJ @Dashrender

                      I gathered that the SMB shares are hosted on Linux, but I could be wrong.

                      If they are hosted on Windows like you are assuming, then I would agree that PowerShell would probably be most performant for this.

                      • Dashrender @IRJ

                        The title says Windows SMB shares.

                        My guess is that Dustin is a lone wolf running a *nix OS on his own machine while the rest of the company uses Windows. Nothing wrong with that, just my guess.

                        • JaredBusch @Dashrender

                          His company is significantly Mac.

                          • Dashrender @JaredBusch

                            Ahh, that's right, he has been asking a lot of Mac questions lately.

                            • DustinB3403 @Dashrender

                              Unix questions, to be more precise, but yeah, we are a heavy Mac shop.
