Directory Tree Depth: Report
-
Looking at the files on one of our servers, I was surprised at the depth and length of the directory tree.
Some of the paths and file names are ridiculously long. Is there a reporting tool that could break it down and give feedback? I see a burning fuse with such long paths, and want to generate a presentable report.
-
You want a report of the lengths to know how close they are getting to the limit? That should be easy enough. If this were Linux I could do it in seconds.
-
Pretty much, but then maybe also to educate people on shortening them going forward. There isn't much reason to repeatedly push the 255-character limit on a file name and the depth. I respect the need to document and retain data, but this seems excessive.
-
Filenames should be descriptive and readable, but not self-documenting. That's not their purpose.
-
Filing and sorting is always a bit of this and that, and whatever 'fits'. Ultimately it's up to the person to decide how they want to arrange it.
When I was doing basic computer instruction I used music as the example, since 95% of people understand music and know what it is.
I started at the top of the tree with My Music, then genre, group or artist, album, song.
But that is pretty shallow as directory trees go. I'm seeing some as many as 10 or 11 folders deep before the files.
This one backup report I'm reviewing has 136,745 files with 11,486 folders for 108,405,977 KB - just seems as if it's a bit deep.
-
This is where traditional filesystems have broken down and why more modern things like SharePoint with flat storage and heavy metadata tend to work so much better.
-
@scottalanmiller said:
This is where traditional filesystems have broken down and why more modern things like SharePoint with flat storage and heavy metadata tend to work so much better.
Heavy metadata to enable searches? What populates the metadata portion?
-
@Dashrender said:
Heavy metadata to enable searches? What populates the metadata portion?
The same thing that creates folders and filenames... humans.
-
@scottalanmiller said:
@Dashrender said:
Heavy metadata to enable searches? What populates the metadata portion?
The same thing that creates folders and filenames... humans.
Good metadata requires a lot of consideration. Rarely do I find good folder structure, hence people are losing things all the time.
-
@Dashrender said:
Good metadata requires a lot of consideration. Rarely do I find good folder structure, hence people are losing things all the time.
In which case the value of the organization is moot and all that matters is the shorter filenames and not making things hard for other people.
-
Here is an example in Python.
import os

mypath = input("What starting path would you like? ")

# Walk the tree and print each file's full path length next to the path
for (dirpath, dirnames, filenames) in os.walk(mypath):
    for name in filenames:
        fullname = os.path.join(dirpath, name)
        namelength = len(fullname)
        print(str(namelength) + " Name: " + fullname)
-
Now let's take this up a notch. Rather than just printing a list, let's create a dictionary (aka a hash or a map) that we can then sort. This will not only allow us to look for the biggest offenders but will also allow us to filter out the shorter filenames that we don't care about.
import os

mypath = input("What starting path would you like? ")
limit = int(input("Only show filenames longer than? "))

filedict = {}  # full path -> length of that path
for (dirpath, dirnames, filenames) in os.walk(mypath):
    for name in filenames:
        fullname = os.path.join(dirpath, name)
        filedict[fullname] = len(fullname)

# Longest paths first; skip anything at or under the limit
for offender in sorted(filedict, key=filedict.get, reverse=True):
    if filedict[offender] > limit:
        print(offender, filedict[offender])
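Since the original question was about depth as well as length, and about getting something presentable, here is a rough sketch along the same lines that also records how many folders deep each file sits and writes the result to a CSV file. The function names (scan_tree, write_report), the column names, and the report filename are my own placeholders, not anything from the scripts above, and "depth" is assumed to mean the number of folders between the starting path and the file.

```python
import csv
import os


def scan_tree(root):
    """Return a list of (path_length, depth, full_path), longest paths first."""
    root = os.path.abspath(root)
    rows = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: depth counts folders below the starting path,
        # so files sitting directly in the starting path have depth 0.
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        for name in filenames:
            fullname = os.path.join(dirpath, name)
            rows.append((len(fullname), depth, fullname))
    rows.sort(reverse=True)
    return rows


def write_report(rows, report_path, limit=0):
    """Write every row whose path length exceeds limit to a CSV file."""
    with open(report_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["length", "depth", "path"])
        for length, depth, fullname in rows:
            if length > limit:
                writer.writerow([length, depth, fullname])


# Hypothetical usage (the share path and limit are made up):
# write_report(scan_tree(r"D:\Shares"), "path_report.csv", limit=200)
```

The CSV opens directly in Excel, so sorting or filtering by the depth column is left to whoever reads the report.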
-
This is actually a problem in PowerShell too, because it has the same 260-character path limit (MAX_PATH)! I just don't understand why this hasn't been permanently fixed; the problem has only been around for a decade or so!
Anyway, to accomplish this in PowerShell, the best bet is to use Robocopy to just list the directories; then it's child's play to pick out the paths that exceed the limit and display them. It's the Robocopy part that's a pain. Luckily:
http://thesurlyadmin.com/2014/08/04/getting-directory-information-fast/
Not exactly on topic, but it has the code for building an array with the data in it.