Working with Files In Linux



  • I am working on document cleanup in an ancient custom (shitty) application we are trying to retire. Basically, there are files everywhere, and I need to locate on the filesystem the files that are referenced in the database. My plan is to dump the file references from the application's database into one table and a listing of the filesystem into another, then match by filename and go from there.

    However, I'm not sure how to approach capturing the files at the filesystem level. Say the files live under /this/directory; what would be the best way to capture the following data for each one?

    Filename | Absolute Path | Modified Date

    Any advice would be appreciated. For what it's worth, this is on CentOS 7.

    Thanks!!


  • Service Provider

    No need to get the filename, the absolute path will include that already.


  • Service Provider

    I'm not clear what you are asking. Do you want a list of ALL files under said /directory or are you looking for only certain ones?



  • @scottalanmiller said:

    No need to get the filename, the absolute path will include that already.

    I want the file name and the path to said file as separate fields, but I suppose I could separate them in another step. I'm going to be matching by file name. Basically, table1.filename = table2.filename.
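For a quick sanity check before anything is loaded into a database, the same filename match can be sketched on plain text files. This is only an illustration: the file names db_files.txt and fs_files.tsv, their layouts, and the sample rows are all assumptions invented for the example, not anything from the application.

```shell
#!/bin/sh
# Hypothetical inputs (assumed for this sketch):
#   db_files.txt  one filename per line, exported from the database
#   fs_files.tsv  filename<TAB>directory<TAB>mtime, produced from the filesystem
work=$(mktemp -d)
printf '%s\n' 'a.pdf' 'b.pdf' 'missing.pdf' > "$work/db_files.txt"
printf 'a.pdf\t/this/directory/x\t2011-08-19 10:21\n'      >  "$work/fs_files.tsv"
printf 'b.pdf\t/this/directory/y\t2012-08-27 08:52\n'      >> "$work/fs_files.tsv"
printf 'orphan.pdf\t/this/directory/y\t2012-08-27 08:52\n' >> "$work/fs_files.tsv"

# First pass (NR==FNR) remembers every database filename;
# second pass prints only the filesystem rows whose first field was seen.
matched=$(awk -F'\t' 'NR==FNR { want[$0] = 1; next } ($1 in want)' \
  "$work/db_files.txt" "$work/fs_files.tsv")
printf '%s\n' "$matched"

rm -rf "$work"
```

This is the text-file equivalent of table1.filename = table2.filename; rows such as orphan.pdf that exist only on disk are simply not printed.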



  • @scottalanmiller said:

    I'm not clear what you are asking. Do you want a list of ALL files under said /directory or are you looking for only certain ones?

    Every single file under /this/directory.


  • Service Provider

    @anthonyh said:

    @scottalanmiller said:

    No need to get the filename, the absolute path will include that already.

    I want the file name and the path to said file as separate fields, but I suppose I could separate them in another step. I'm going to be matching by file name. Basically, table1.filename = table2.filename.

    Just use a filter on the existing file, no need to make a separate file for that.
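One way to read "use a filter" is to derive the file name from the full path after the fact. The sketch below shows two common ways to do that; the sample path is hypothetical, and the awk variant assumes no path contains a newline.

```shell
#!/bin/sh
# Hypothetical sample path, used only for illustration.
path='/this/directory/sub/example.pdf'

# Shell parameter expansion on a single value:
name=${path##*/}   # drop everything up to and including the last slash
dir=${path%/*}     # drop the last slash and everything after it
printf '%s\t%s\n' "$name" "$dir"

# The same idea as a stream filter over a list of paths:
# split on "/", print the last component, a tab, then the full path.
split_line=$(printf '%s\n' "$path" | awk -F/ '{ print $NF "\t" $0 }')
printf '%s\n' "$split_line"
```

The awk form can be appended to any command that prints one path per line, so the listing only has to be produced once.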


  • Service Provider

    @anthonyh said:

    @scottalanmiller said:

    I'm not clear what you are asking. Do you want a list of ALL files under said /directory or are you looking for only certain ones?

    Every single file under /this/directory.

    Oh okay.

    find /dir -type f -print
    

    Where /dir is the directory name in question. See if that gives you what you want.



  • This is super easy to do in Linux.... If you know all the commands like @scottalanmiller! :D



  • @scottalanmiller said:

    @anthonyh said:

    @scottalanmiller said:

    I'm not clear what you are asking. Do you want a list of ALL files under said /directory or are you looking for only certain ones?

    Every single file under /this/directory.

    Oh okay.

    find /dir -type f -print
    

    Where /dir is the directory name in question. See if that gives you what you want.

    That gives me the absolute path, but no date. I found this command that gets me a little closer:

    find /this/directory -type f -exec stat -c "%n %y" {} \;

    Gives me this:

    /this/directory/data/EFile/DOC/227349_FS86478.pdf 2011-08-19 10:21:22.000000000 -0700

    But it's not ideal, yet. I'd need to delimit the file and timestamp with something other than a space. I would love to eliminate the decimal on the seconds as well as the timezone, but I can work around those.
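The delimiter and the trailing fraction and timezone can both be handled while staying with stat. The following is a rough sketch that assumes GNU coreutils (as shipped on CentOS 7), where stat --printf interprets \t and \n; the scratch directory and example.pdf are made up for the demonstration.

```shell
#!/bin/sh
# Scratch directory with one file whose mtime is set to a known value (demo only).
statdemo=$(mktemp -d)
touch -d '2011-08-19 10:21:22' "$statdemo/example.pdf"

# --printf (unlike -c) honours backslash escapes, so the fields are tab-separated.
# "{} +" hands many files to each stat call instead of starting one process per file.
# awk keeps the first 19 characters of the timestamp: YYYY-MM-DD HH:MM:SS.
stat_rows=$(find "$statdemo" -type f -exec stat --printf='%n\t%y\n' {} + |
  awk -F'\t' -v OFS='\t' '{ print $1, substr($2, 1, 19) }')
printf '%s\n' "$stat_rows"

rm -rf "$statdemo"
```

File names that themselves contain a tab or newline would break this layout, so it is worth checking for those before trusting the output.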



  • Ooh, I'm very close!

    find /this/directory -type f -printf "%f\t" -printf "%h\t" -printf "%Tc\n"

    Gets me this:

    254405_FS85691.pdf /this/directory/data/EFile/CASEDOC Mon 27 Aug 2012 08:52:15 AM PDT

    If I can get the timestamp formatted as YYYY-MM-DD HH:MM:SS (24h time) I will be golden! I don't care about PDT vs PST.



  • I think I've got it close enough!

    find /this/directory -type f -printf "%f\t" -printf "%h\t" -printf "%TY-%Tm-%Td %TH:%TM\n"

    Result:

    101581_PR78450.pdf /this/directory/data/EFile/MO 2007-10-30 11:16
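If seconds are wanted as well, a possible variation is sketched below. It assumes GNU find, whose %TS directive prints seconds with a fractional part, so a small sed step removes the fraction from the end of each line. The scratch directory and example.pdf are hypothetical, and the three -printf actions are folded into one format string.

```shell
#!/bin/sh
# Scratch directory with one file whose mtime is set to a known value (demo only).
finddemo=$(mktemp -d)
touch -d '2007-10-30 11:16:05' "$finddemo/example.pdf"

# %f = file name, %h = containing directory, %T? = pieces of the modification time.
# sed strips a trailing ".digits" (the fractional seconds); lines without one are untouched.
rows=$(find "$finddemo" -type f -printf '%f\t%h\t%TY-%Tm-%Td %TH:%TM:%TS\n' |
  sed 's/\.[0-9]*$//')
printf '%s\n' "$rows"

rm -rf "$finddemo"
```

Because the timestamp is the last field, anchoring the sed pattern at the end of the line leaves dots in file names and directory names alone.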


  • Service Provider

    @anthonyh said:

    @scottalanmiller said:

    @anthonyh said:

    @scottalanmiller said:

    I'm not clear what you are asking. Do you want a list of ALL files under said /directory or are you looking for only certain ones?

    Every single file under /this/directory.

    Oh okay.

    find /dir -type f -print
    

    Where /dir is the directory name in question. See if that gives you what you want.

    That gives me the absolute path, but no date. I found this command that gets me a little closer:

    find /this/directory -type f -exec stat -c "%n %y" {} \;

    Gives me this:

    /this/directory/data/EFile/DOC/227349_FS86478.pdf 2011-08-19 10:21:22.000000000 -0700

    But it's not ideal, yet. I'd need to delimit the file and timestamp with something other than a space. I would love to eliminate the decimal on the seconds as well as the timezone, but I can work around those.

    It's easier to work with the date if you use UNIX time instead of a human-readable format. And you can use the cut command to trim off anything trailing that you don't want.
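A minimal sketch of the UNIX-time approach, assuming GNU find, touch, and date: %T@ prints the modification time as seconds since the epoch (with a fractional part), awk trims the fraction from the third field, and cut pulls that field out. The scratch directory, example.pdf, and the epoch value are chosen only for the demonstration.

```shell
#!/bin/sh
# Scratch directory with one file whose mtime is a known epoch value (demo only).
epochdemo=$(mktemp -d)
touch -d '@1000000000' "$epochdemo/example.pdf"

# name <TAB> directory <TAB> epoch seconds; the fraction after the dot is dropped
# from the third field only, so dots in names and paths are left alone.
epoch_rows=$(find "$epochdemo" -type f -printf '%f\t%h\t%T@\n' |
  awk -F'\t' -v OFS='\t' '{ sub(/\..*$/, "", $3); print }')
printf '%s\n' "$epoch_rows"

# cut splits on tabs by default; field 3 is the integer timestamp.
epoch=$(printf '%s\n' "$epoch_rows" | cut -f3)

# Convert back to a readable form only when needed (UTC here, to avoid timezone surprises).
readable=$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')
printf '%s\n' "$readable"

rm -rf "$epochdemo"
```

An integer timestamp sorts and compares cleanly and sidesteps the PDT vs PST question entirely; most databases can turn it back into a datetime at load time.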

