Linux: Basic Working with Text Files



  • One thing that often surprises people who come to Linux, or to the UNIX world in general, from Windows is the heavy use of text files. In Linux, nearly everything is handled with plain text: most logs (although this is beginning to change), nearly all configuration files, command output and more. Because of this, Linux has many common tools for reading, manipulating and managing text, and we use them all of the time.

    In this lesson we will be introducing the following new commands:

    • cat: Short for concatenate, this command prints out the contents of one (or more) text files.
    • head: Prints out the top of a text file.
    • tail: Prints out the end of a text file.
    • less: Lets us scroll through a large text file, backwards and forwards.

    The cat command is one of the most common tools that we will use as system administrators. It is very simple. Since we do not know yet how to get text into a file we will need to work with files that already exist. Let's try it:

    # cat /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    bin:x:1:1:bin:/bin:/sbin/nologin
    daemon:x:2:2:daemon:/sbin:/sbin/nologin
    adm:x:3:4:adm:/var/adm:/sbin/nologin
    lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
    sync:x:5:0:sync:/sbin:/bin/sync
    shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
    halt:x:7:0:halt:/sbin:/sbin/halt
    mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
    operator:x:11:0:operator:/root:/sbin/nologin
    games:x:12:100:games:/usr/games:/sbin/nologin
    ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
    nobody:x:99:99:Nobody:/:/sbin/nologin
    avahi-autoipd:x:170:170:Avahi IPv4LL Stack:/var/lib/avahi-autoipd:/sbin/nologin
    systemd-bus-proxy:x:999:997:systemd Bus Proxy:/:/sbin/nologin
    systemd-network:x:998:996:systemd Network Management:/:/sbin/nologin
    dbus:x:81:81:System message bus:/:/sbin/nologin
    polkitd:x:997:995:User for polkitd:/:/sbin/nologin
    tss:x:59:59:Account used by the trousers package to sandbox the tcsd daemon:/dev/null:/sbin/nologin
    postfix:x:89:89::/var/spool/postfix:/sbin/nologin
    sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
    

    The passwd file is one that we will be working with extensively in the future, and this might be our first time seeing it. We won't worry about what its contents mean right now; we are just using it as a standard text file for this example.

    What matters is that we were able to read it using nothing more than the cat command. We can do the same with any text file. In fact, we can do the same with any number of text files: if we give cat more than one file at a time, it will print them all to the screen, one after another, as one big glob of text. That is not something we need to do very often, but it does have its uses, especially for a system administrator who might need to work with more than one file at a time.
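
    To see the "one big glob" behaviour on something smaller than the system files, here is a minimal sketch. The scratch directory and the two file names are hypothetical and exist only for this demonstration; printf is used to create them because we have not yet covered how to get text into a file.

```shell
# Hypothetical scratch files, created only for this demonstration.
workdir=$(mktemp -d)
printf 'alpha\nbeta\n'  > "$workdir/first.txt"
printf 'gamma\ndelta\n' > "$workdir/second.txt"

# One file: cat prints it unchanged.
cat "$workdir/first.txt"

# Two files: cat prints them back to back, in the order given,
# with no separator or file name between them.
combined=$(cat "$workdir/first.txt" "$workdir/second.txt")
printf '%s\n' "$combined"

# Clean up the scratch directory.
rm -r "$workdir"
```

    The order of the arguments is the order of the output, so swapping the two file names would print gamma and delta first.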

    Try "catting" the /etc/group file and see how it outputs. It should be similar to /etc/passwd. These are files that you will use often. Now try putting both together with cat like this:

    # cat /etc/passwd /etc/group
    

    This probably went beyond what would fit on your screen; that is normal. And it brings up an important additional tool: less.

    Using cat is great for tiny files or for use in tool chains that we will learn about later. The name of the less utility is actually a play on words. Long ago we had a utility called more, so named, I suspect, because it would show you one screen of text and you could hit the space bar for it to show you "more" of the file. It was very basic and had huge limitations: it could scroll down, but it could not scroll back up! So once computers became more powerful, a new utility called less was created (get it, less IS more!!) that allowed for searching, line numbering, movement in both directions and so forth.

    Over time we will get more and more used to using less. For the moment we can start with a small file like /etc/passwd:

    # less /etc/passwd
    

    You can navigate up and down using the cursor keys and the page up and page down keys. You can do things like jump to the bottom of the file using "G" (a capital G; a lowercase "g" jumps back to the top). Press "q" to stop reading the file.

    Now if you thought that that was exciting, wait until you get to do the same with a large file! We will browse the rather expansive system log file using less to get a sense of its value.

    # less /var/log/messages
    

    Poke around a little; the log file is going to be a constant friend. For now, though, learning to get around in less is all that we need.
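
    Not every distribution writes /var/log/messages. Debian and Ubuntu, for instance, have traditionally used /var/log/syslog instead, and some newer systems keep no plain text system log at all. The sketch below is one way to find out which file is available before opening it. The first_readable function is a hypothetical helper written for this example, not a standard command.

```shell
# first_readable is a hypothetical helper: it prints the first argument
# that names a readable file and returns non-zero if none of them does.
first_readable() {
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Try the two traditional locations, in order of preference.
if logfile=$(first_readable /var/log/messages /var/log/syslog); then
    echo "System log found at: $logfile"
    # To browse it, run: less "$logfile"
else
    echo "No readable text log found at the usual locations." >&2
fi
```

    If neither file is readable, check that you are running as root, as the # prompt in these examples assumes.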

    Of course, tools like less are wonderful when we want to look at a large file and navigate manually. But many times we will want something much simpler. What if we just wanted to see the top ten or so lines of a file, maybe to check that we have the right file? Well, there is a command for that, of course. The top of a file is called its "head", and so the command for reading the top of the file is head. We will use that log file once again:

    # head /var/log/messages
    

    Useful, indeed. The output is much like we would get from cat but instead of the entire file, we just got the beginning of it. (If you want to see why we need this, try using cat on that file.) But it is pretty rare that we would want to see the oldest (top) portion of a log file, right? What is normally of real interest is the very end of it. If the top of a text file is called its head, the bottom is called its tail. And, of course, the command for reading the end of a file is called tail.

    # tail /var/log/messages
    

    Now this is useful, just the last ten lines from the log file. If you are looking to see what has "just happened", this is how you do it. Super handy.
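
    Because a real log changes from minute to minute, it can be easier to see exactly what head and tail select by running them on a file whose contents we know. This is a small sketch under that assumption: the temporary file is hypothetical and is filled with the numbers 1 through 25, one per line, using seq.

```shell
# Hypothetical sample file: the numbers 1 to 25, one per line.
sample=$(mktemp)
seq 1 25 > "$sample"

# With no options, head prints the first ten lines (1 through 10)...
first_ten=$(head "$sample")

# ...and tail prints the last ten lines (16 through 25).
last_ten=$(tail "$sample")

printf 'head:\n%s\n' "$first_ten"
printf 'tail:\n%s\n' "$last_ten"

# Clean up the sample file.
rm "$sample"
```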

    With these four simple commands, you will have a pretty good ability to work with basic text files. But there is one even cooler option that I want to leave you with, and that is known as following. Following does not get its own command in UNIX, however. Instead we follow by using the "-f" (follow) flag on tail. When we do this, tail shows the last ten lines of a file and then continues to follow the file, printing out every new line added to it as it happens. It becomes a real time text file monitor!

    # tail -f /var/log/messages
    

    You might not see very much activity on your Linux VM as you are not doing anything to it. On a busy system you might see rather a lot of activity when doing this.

    At this point you are probably so excited that you did not notice that there is no way to exit the following operation! (That is the act of following, not the operation that I am just about to tell you about.) The tail -f command will continue to follow a file indefinitely until you stop it. And to stop it we use "Control-c". You will use Control-c a lot on UNIX systems, so get used to it. We will learn in depth later what exactly it does, but suffice it to say that it is the generic "stop this command from running" key sequence.
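
    Following is easiest to appreciate when you control what gets written. The sketch below is a scripted stand-in for the interactive experience, under a few assumptions: the two temporary files are hypothetical, tail -f runs in the background with its output saved to a file, and the kill command plays the role of Control-c because a script has no keyboard to press it on.

```shell
# Hypothetical files: a pretend log, and a place to capture tail's output.
log=$(mktemp)
out=$(mktemp)
printf 'line 1\nline 2\n' > "$log"

# Start following in the background and remember the process ID.
tail -f "$log" > "$out" &
tail_pid=$!

# Give tail a moment to start, then append a new line to the "log".
sleep 1
printf 'line 3\n' >> "$log"

# Wait long enough for tail to notice the new line, then stop it.
# Interactively, this is the point where you would press Control-c.
sleep 2
kill "$tail_pid"
wait "$tail_pid" 2>/dev/null || true

# tail printed the two existing lines first, then the appended one.
followed=$(cat "$out")
printf '%s\n' "$followed"

# Clean up both files.
rm "$log" "$out"
```

    The same experiment works interactively with two terminals: run tail -f on a scratch file in one, append lines to the file from the other, and press Control-c in the first when you are done.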

    That's it, you now have some text reading basics on Linux!


    Part of a series on Linux Systems Administration by Scott Alan Miller



  • Making heads and tails of text files...

    By default, the head and tail commands each give an output of ten lines. That's not bad, and for many things it is all that we need. But there are times that it would be nice to get more or less from these commands.

    To get head to give us a different number of lines, just add the number of lines that you wish to be returned following a hyphen, as if it were a normal flag. Here is an example that returns the top five lines of a file.

    # head -5 /var/log/messages
    

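    The tail command works similarly, and here we will use the more explicit form: a "-n" flag followed by the number of lines that we wish to extract. (The "-n" form works with head too, as in head -n 5, and it is the spelling that POSIX defines for both commands; the bare "-5" style is an older shorthand that many implementations still accept.) So if we wanted to see the final eighteen lines of a text file we would do it like so: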

    # tail -n 18 /var/log/messages
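
    To check the counts on a file with known contents, here is a minimal sketch. It assumes a hypothetical temporary file holding the numbers 1 through 30, one per line, and it uses the "-n" spelling for head as well as tail, since POSIX defines that option for both commands.

```shell
# Hypothetical sample file: the numbers 1 to 30, one per line.
sample=$(mktemp)
seq 1 30 > "$sample"

# The first five lines: 1 through 5.
top_five=$(head -n 5 "$sample")

# The last eighteen lines: 13 through 30.
last_eighteen=$(tail -n 18 "$sample")

printf 'head -n 5:\n%s\n' "$top_five"
printf 'tail -n 18:\n%s\n' "$last_eighteen"

# Clean up the sample file.
rm "$sample"
```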