
Basic Unix - The File System

This article describes the Unix file system.

One of the features which make Unix especially attractive is the structure of its file system. Unix makes use of an inverted branching-tree file structure. The tree trunk consists of a single large file, "/" (called root), which contains all the files on the system. (In this structure, all files which contain other files are called "directories"). Any number of subdirectories may branch from this main trunk. However, the hierarchical structure does not end there, for each major branch may also spawn any number of further branches (sub-subdirectories). These subdirectories may, in their turn, send forth yet more levels of files...and so on. All successive directory names are separated by `/', so you'll find directory names like "/usr/local/doc/email-addr".

As an example of the way this file structure works in practice, consider someone with a username of "user" who creates a directory to contain her correspondence. This is one of several such high-level directories she has in her home directory; others might be entitled "projects," "dissertation," "articles," etc. Within the directory "correspondence," she creates several subdirectories: "personal," "memos," "professional." These subdirectories contain her basic files: in this case, individual letters and memos. She might also store programs (which, like most everything else in Unix, are really files at heart) in any of her directories. With the branching structure of her files and directories (much like folders on a Macintosh, or nested folders in a real-world filing system), "user" could easily keep track of a large number of files.
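The layout described above can be sketched at the command line. This is only an illustration, assuming a Bourne-style shell (sh or bash) rather than the C shell; the temporary directory is a hypothetical stand-in for user's home directory, and the file name "letter1" is invented for the example.

```shell
# Hypothetical stand-in for user's home directory, so nothing real is touched.
home_dir=$(mktemp -d)
cd "$home_dir"

# -p creates each missing parent directory along the way.
mkdir -p correspondence/personal correspondence/memos correspondence/professional
mkdir projects dissertation articles

# A letter is an ordinary file stored at the end of one branch.
echo "Dear Colleague," > correspondence/professional/letter1

# Print every directory and file beneath the starting point, one path per line.
tree_listing=$(find . -print | sort)
echo "$tree_listing"
```

Each line of the output is a pathname, with one "/" for every level of the tree you pass through.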

Directories

All directories and files, including all home directories (the directories assigned to individual users) are ultimately branched from the root directory. If you issue the command

 > ls -F /

(meaning "list root, showing which files are directories and which are certain other special files"), you will see a listing something like this:

News/    export/       nfs/       Usenet/
bin@     home@         pcfs/      userhome@
boot     h11@          h21@       usr/
core     h12@          h22@       var/
dev/     h13@          h23@       vmunix*
h1@      kadb*         h24@       vmunix.931007
h2@      lib@          sbin/      vmunix.ORIG.Z
h3@      lost+found/   scratch/   vmunix.dist.Z
h4@      mnt/          sys@       vmunix.nopatch.Z
etc/     mntl/         tmp/

Your home directory will probably be inside one of the directories beginning with "h".

Unix has several built-in commands which simplify working with directories and files. If you want to work with a file, you can access it in either of two ways. You may provide Unix with its absolute or fully qualified pathname, which is a list of all the directory names between root and your destination file. If you have several subdirectories, this can mean quite a bit of typing. Consider the case of our friend "user," who wishes to list the contents of a file named "unixmemos" to her terminal. This file exists in a subdirectory entitled "Memos," which is itself a subdirectory of a larger directory entitled "Correspondence," which is a subdirectory of a still larger directory called "Work" -- which is part of user's directory on the disk /h3. The fully qualified name, or "pathname," for this file would be "/h3/user/Work/Correspondence/Memos/unixmemos".

This is obviously quite a bit of typing, and it is all too easy to make a mistake while doing it. Fortunately, it is not necessary for our long-suffering example person to type this whole path each time she wants to view or edit "unixmemos." First of all, Unix automatically places everyone in their home directories when they log in to the system, here "/h3/user". The practical effect of placing someone in a default directory is that the higher levels of the path (in this example everything from '/' to 'user') need not be specified when working with files in that default directory. This is the second way to access a file: a relative pathname, which starts from the directory you are currently in.

To further simplify any work with directories and files further down the path, user may change her working directory with the command `cd' (discussed below in "Changing from one directory to another"). As an example, let's say user intends to work with "Memos." She intends to edit not only the file "unixmemos," but also the files "pcmemos" and "macmemos." Rather than typing a large part of a very long pathname each time she wants to shift from one to the other, it would behoove user to use the command

 > cd Work/Correspondence/Memos

to get closer to the files she wants to work with. Then she may enter

 > cd

to return to her home directory.

Note that the C-shell does not allow you to switch directly from one lower-level directory to another by name alone, because a plain directory name is looked up in your current directory. If user wanted to work with the files in another directory, under "Personal" for example, and make "Personal" the default directory, she could move to "Personal" by typing the command lines

 > cd [to return to her home directory]
> cd Personal

Then she could work with any files under "Personal."

Directory-name abbreviations: . .. ~

Our friend "user" could also use another method to switch back and forth, because her Unix shell reserves three special characters as abbreviations to make it easier to work with directories. These characters are '.', '..' and '~'.

 .    denotes your current directory
 ..   symbolizes the parent directory of the directory you are currently in
 ~    is a home directory; the default directory is your own

So, if Personal and user's current directory (Work) were both subdirectories of the same parent, she could type

 > cd ../Personal

If Personal were a subdirectory of a directory two levels higher, she could type

 > cd ../../Personal

-- and so on. The '..' abbreviation refers only to the directory at the level immediately above the one you are currently in. So, if she ever wanted to obtain a list of the directories and files in a parent directory, user could simply type

> ls ..
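Here is a short sketch of the '..' abbreviation in use. It assumes a Bourne-style shell and a scratch directory tree built just for the purpose; the directory names follow the examples above, but "letter1" and the scratch location are hypothetical.

```shell
scratch=$(mktemp -d)          # hypothetical stand-in for a home directory
mkdir -p "$scratch/Work/Correspondence/Memos" "$scratch/Personal"
touch "$scratch/Personal/letter1"

cd "$scratch/Work"
parent_listing=$(ls ..)       # lists the parent directory: Personal and Work
cd ../Personal                # up one level, then down into a sibling
sibling=$(pwd)

cd "$scratch/Work/Correspondence/Memos"
cd ../../..                   # each ".." climbs one more level
top=$(pwd)

echo "$parent_listing"
echo "$sibling"
echo "$top"
```

Because each '..' means "one level up from wherever the path has reached so far," you can chain as many as you need.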

As another way to use directory-name abbreviations, consider the case of a person whose username is "pers". He wants to access and work with files in the subdirectories "food" and "cats," in the home directory of a classmate whose username is "othr." (Under the default file protections of Unix, almost any file can be read by any user on the system. Please note that there are ethical implications of reading such files when you do not have permission from their owners. We urge you to protect your own important and/or sensitive files, by making use of the commands described in "Basic Unix - File and Directory Permissions.")

To obtain a list of othr's files in the subdirectory "food", pers would normally have to enter the command line

> ls /h2/othr/food

However, this command can be simplified by using the ~ abbreviation. Pers can obtain the same result by typing the command line

> ls ~othr/food

If he had changed directories, was in ~othr/food, and wanted a listing of his own home directory, he could obtain one by issuing the command

> ls ~

since '~', without a following username, is interpreted as the home directory of the person logged in.

These three directory abbreviations can also aid in other standard Unix operations. If pers was in his home directory and wanted to copy a file named "fondue" from ~othr/food, he could do so by combining the abbreviations described above and the 'cp' utility in the command line

> cp ~othr/food/fondue .

The copy utility 'cp' requires the location of a file and its destination. Thus, the command line first uses "~othr" to refer to othr's home directory, then the character '.' to symbolize pers's current directory. We'll discuss 'cp' in more detail in "Copying, moving, and removing files."

Naming conventions

Unix naming conventions also make long, descriptive names possible; to make best use of the file structures of Unix, you'll probably want to give your directories names which describe their contents.

Directory and file names under our version of Unix may be as long as you care to type, and may contain pretty much anything. However, you should probably avoid characters other than letters, numbers, periods and underscores, since many punctuation marks will have specific meanings to your shell (for more information on characters to avoid, see "Wildcards," below). This includes spaces; spaces inside filenames can be very annoying to deal with.

Some Unix users always begin directory names with upper-case letters, to more easily differentiate them from simple filenames.

Wildcards: ? * [ ] { }

Some characters function as "wildcards" in Unix commands involving file and directory names. Effectively, using one (or a pair) of these characters in a command creates an ambiguous reference -- a filename which may be interpreted in any of several different ways. When you use wildcards, the shell will generate a list of the filenames in your directory which match your file reference. (If you must use one of the reserved characters and don't want its special meaning, simply precede the character with a backslash ['\'].)

?

The question mark stands for a single character in a filename. In the command line

> ls memo?

the question mark will cause 'ls' to list all filenames or directories which begin with the string "memo" and contain exactly one more character, whatever that character may be. For example, if your directory contained the files "memo1", "memo2", "memo3", "memo4", "mamo", "mimo", "m_mo" and the directories "momol" and "memok", `ls' would display the list:

memo1 memo2 memo3 memo4 memok

(Note that memok is a directory, so its contents will be listed as well.)

If instead you replaced one of the characters within the word "memo" with the wildcard "?":

> ls m?mo

you would receive a list of all filenames in your directory which started with the character 'm', ended with the characters 'mo', and had any single character in place of the question mark:

mamo mimo m_mo
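You can try the question mark safely in a scratch directory. The sketch below assumes a Bourne-style shell; the scratch directory is created only for the demonstration, and the file names are modeled on the ones used above. It uses 'echo' rather than 'ls' so that the contents of the directory "memok" are not listed, and it fixes the locale so the names always come out in the same order (on your system the order may differ).

```shell
LC_ALL=C; export LC_ALL       # fixed sort order so the results are predictable
demo=$(mktemp -d)             # scratch directory, created only for this sketch
cd "$demo"
touch memo1 memo2 memo3 memo4 mamo mimo m_mo
mkdir momol memok

# echo simply prints the names the shell generated from each pattern.
trailing=$(echo memo?)        # "memo" plus exactly one more character
middle=$(echo m?mo)           # any single character in the second position
echo "$trailing"
echo "$middle"
```

Note that "momol" matches neither pattern: it has the wrong letters for "memo?" and one character too many for "m?mo".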

*

The asterisk, '*', stands for any number of characters, including zero, in a filename. The one exception is that it will not match a filename which begins with a period, '.' . Thus a command line containing the file reference 'memo*', like:

> ls memo*

would list all filenames which begin 'memo', however many characters follow.

[ ]

The square brackets, '[ ]', are a limited form of the question mark, allowing you to search for specific filenames by specifying a list or a range of characters. Appending "[1234]" or "[1-4]" to "memo" in the command line

> ls memo[1-4]

would display the filenames "memo1", "memo2", "memo3", and "memo4" from the list above. The brackets define a character class that includes all the characters listed within the brackets. Square brackets are especially useful when combined with other wildcards. The command line

> ls [aeiou]*

would list all the files whose names begin with the vowels "a", "e", "i", "o", or "u". (Brackets are also useful when creating your own, more complex applications, since they enable you to feed the names of selected files to other utilities or programs one at a time.)
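The asterisk and the square brackets can be compared side by side in the same way. This is a sketch under the same assumptions as before (a Bourne-style shell and a throwaway directory); the names "memo.old", ".memorc", "apple", "index.txt" and "zebra" are hypothetical and exist only to show what does and does not match.

```shell
LC_ALL=C; export LC_ALL       # fixed sort order so the results are predictable
demo=$(mktemp -d)             # scratch directory, created only for this sketch
cd "$demo"
touch memo memo1 memo2 memo3 memo4 memo5 memo.old .memorc apple index.txt zebra

star=$(echo memo*)            # zero or more characters after "memo"
range=$(echo memo[1-4])       # exactly one character from the range 1 through 4
vowels=$(echo [aeiou]*)       # names that begin with a lowercase vowel
hidden=$(echo .memo*)         # a leading period must be typed explicitly

echo "$star"
echo "$range"
echo "$vowels"
echo "$hidden"
```

Here 'memo*' picks up "memo" itself (zero extra characters) but not ".memorc", while 'memo[1-4]' leaves out "memo5".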

{ }

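Finally, curly brackets, '{ }', allow you to specify partial filenames: the shell generates one name for each comma-separated string inside the brackets, attached to whatever text surrounds them. If you happened to be in a directory containing files numbered from 1 through 1000, for example, you could type: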

> ls 4{37,56,74}

to get a listing of "437", "456", and "474".

Paths

As you saw above, Unix is structured like a tree -- only upside-down.

Like MS-DOS, Unix uses "paths," or descriptions of a certain route down the tree, to find command names. Just as you can refer to files without typing their full tree-structure names, so you can call most programs in an abbreviated fashion, since Unix assumes it should look in certain places first. You don't need to know where the file containing the program "rm" is stored, for instance, to use the 'rm' command:

> rm myfilename

The system administrators have set up a default path for new users; you can change this, or type an explicit path to the command you want:

> /bin/rm myfilename
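If you are curious where a command actually lives, you can ask the shell. The sketch below assumes a Bourne-style shell (sh or bash), where the search path is kept in the variable PATH; the directories it prints will differ from system to system.

```shell
# command -v reports the file that a command name resolves to.
rm_path=$(command -v rm)
echo "rm resolves to: $rm_path"

# PATH holds the directories searched, in order, separated by colons.
# tr puts each directory on its own line to make the list easier to read.
echo "$PATH" | tr ':' '\n'
```

The first directory in the list that contains a program called "rm" is the one whose copy gets run.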

Navigating around in the file system

Checking current directory: pwd

To show the full path name of your current directory, use the 'pwd' (print working directory) command:

> pwd
/h2/pers/dir

Changing from one directory to another: cd

The basic way to change directories in Unix is with the 'cd' (change directory) command. You can use 'cd' with absolute path names, relative path names, and some abbreviations.

An example of absolute path names: to change from anywhere else to /usr/local/doc, you can type "cd /usr/local/doc". All absolute path names start with "/", the root directory.

An example of relative path names: to change from /usr/local/doc/news to /usr/local/doc/news/newusers, you can type "cd newusers", since newusers is a subdirectory.

You can also use special directory names, such as ".", "..", and "~" (discussed earlier in Directories). Remember that "." is your current directory, whatever that is at the time; ".." is your current parent directory, and you can specify subdirectories without giving the full path name.

So, as another example of relative path names: to switch from /mydir/dir-one to /mydir/dir-two, you could say "cd ../dir-two".

Recall that "cd ~username" should change to the home directory of person "username", regardless of where that directory is and where you are; if the directory is protected, you'll get a message saying that you can't cd there.

With no arguments, 'cd' always returns you to your home directory:

 > cd
> pwd
/h2/pers

Listing files: ls

To actually see what's in a directory, you use the `ls' (list) command. It has a number of useful options:

 -F   displays directory names with a "/", executable files with a "*",
      links to other files with a "@"
 -a   shows all files in the directory, including those beginning with
      "."; also lists . and .. directories
 -l   displays in long (detailed) format, including information on each
      file's owner and access privileges (see Section 4, "File and
      directory permissions," for a discussion of these things)
 -t   sorts by time, with last-written first

Don't forget that options can be combined. For example, 'ls -Fa' (or -aF; order does not matter) shows all files in the directory, with a "/" after directory names, and other characters to mark other special files. And 'ls -laF' does both these things, in detailed format.
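To see the markers for yourself, you can build a small directory containing one of each kind of file. This is a sketch assuming a Bourne-style shell; every name in it ("Stuff", "notes", "runme", "shortcut", ".hidden") is hypothetical.

```shell
LC_ALL=C; export LC_ALL       # fixed sort order so the listing is predictable
demo=$(mktemp -d)             # scratch directory, created only for this sketch
cd "$demo"
mkdir Stuff                              # a directory
touch notes .hidden                      # an ordinary file and a "dot" file
printf '#!/bin/sh\necho hello\n' > runme
chmod +x runme                           # an executable file
ln -s notes shortcut                     # a symbolic link to notes

plain=$(ls)                   # .hidden is left out
marked=$(ls -aF)              # adds ./ ../ .hidden and the type markers
echo "$marked"
```

In the second listing "Stuff" gains a "/", "runme" a "*", and "shortcut" a "@", while "notes" is shown with no marker at all.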

There are other options for `ls' (lots of them, actually). Typing "man ls" will get you the detailed manual entry.

Copying, moving, and removing files: cp, mv, rm

You'll need the following three commands to carry out basic operations with files:

 cp   copy a file
 mv   rename a file; move files into a directory
 rm   scratch, delete, remove a file

To copy the contents of one file into another, use the 'cp' (copy files) command.

> cp firstfile secondfile

Note that if "secondfile" already exists, it will be overwritten with the contents of "firstfile"; 'cp' has a "-i" (interactive) option which will warn you before overwriting files. You can uncomment some aliases defined in your .cshrc file that will force cp, mv, and rm to always use the -i option. It is highly recommended!

Both "firstfile" and "secondfile" can be either relative path names or absolute path names; "secondfile" can be a directory name instead. For example, "cp myfile .." copies the file "myfile" to the parent directory of wherever you are, if you have write privileges to that parent directory. If "secondfile" is a directory, you can replace "firstfile" with several file names, or a wildcard. To copy all files ending in ".txt" to a "Letters" directory, for example:

> cp *.txt Letters
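The difference the "-i" option makes is easiest to see by trying it on files you don't care about. The following is a sketch assuming a Bourne-style shell; the scratch directory and the files "a.txt" and "b.txt" are hypothetical, and the answer to the prompt is piped in rather than typed.

```shell
demo=$(mktemp -d)             # scratch directory, created only for this sketch
cd "$demo"
echo "first"  > firstfile
echo "second" > secondfile
mkdir Letters
echo "a" > a.txt
echo "b" > b.txt

# Answering "n" to the -i prompt leaves secondfile alone. Some versions of cp
# may report a failure status when the copy is declined, hence "|| true".
echo n | cp -i firstfile secondfile || true
kept=$(cat secondfile)

# Without -i the destination is overwritten with no questions asked.
cp firstfile secondfile
overwritten=$(cat secondfile)

# With a directory as the destination, several sources may be given at once.
cp *.txt Letters
copied=$(ls Letters)
```

After the first 'cp', "secondfile" still holds its own contents; after the second, they are gone for good.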

To move a file into another directory, use 'mv':

> mv file1 Stuff_directory

You can think of this as working exactly like 'cp', only the original is deleted. Like 'cp', 'mv' has a "-i" option which will warn you if -- in the above example -- a file named "file1" already exists in Stuff_directory. If you don't use the -i option, you risk overwriting files, so be careful.

Since Unix doesn't have a separate command to rename a file, 'mv' does double duty for that as well. To change file2's name to "thing":

> mv file2 thing

Everything noted above about path names and write privileges applies equally to 'cp' and 'mv'.
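The two jobs that 'mv' performs can be seen one after the other in this sketch. It assumes a Bourne-style shell and a scratch directory; "report" and "draft" are hypothetical file names chosen for the illustration.

```shell
demo=$(mktemp -d)             # scratch directory, created only for this sketch
cd "$demo"
mkdir Stuff_directory
echo "one" > report
echo "two" > draft

mv report Stuff_directory     # move: report now lives inside Stuff_directory
mv draft thing                # rename: same directory, new name

ls -F
```

Which job 'mv' does depends only on whether its last argument is an existing directory.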

To delete a file or files from a directory, use the 'rm' command:

> rm file7 oldfile *.txt

Before you start experimenting, however, be aware that 'rm' removes files permanently. Unless you're certain that the file is on a system backup (it's older than a week, say, and you haven't touched it since), and you're willing to bribe the system administrators to get it back for you, you should treat 'rm' with a great deal of caution.

For multiple deletions, like the one above, we recommend the "-i" flag (which works the same as with 'cp' and 'mv': it asks you if you're sure you want to do that).
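A harmless way to get a feel for "-i" is to practice on empty files in a scratch directory. This is a sketch assuming a Bourne-style shell; the answers to the prompts are piped in here, whereas you would normally type them.

```shell
demo=$(mktemp -d)             # scratch directory, created only for this sketch
cd "$demo"
touch file7 oldfile a.txt b.txt

# Declining the prompt ("n") keeps the file; accepting ("y") removes it.
echo n | rm -i oldfile
echo y | rm -i file7

# Without -i, every file matching the wildcard is removed immediately.
rm *.txt
remaining=$(ls)
echo "$remaining"
```

Only "oldfile" survives, because it was the one deletion that was refused.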

Creating and removing directories: mkdir, rmdir

To create a new directory, inside of the current directory, use the 'mkdir' command:

> mkdir Stuff

People often start directory names with a capital letter and file names with a lowercase letter, to make them easier to distinguish.

To remove an empty directory, use 'rmdir'. This command returns an error message if there are still files inside; you must first remove them, using 'rm', or move them elsewhere using 'mv'.

> rmdir No_more_stuff
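The following sketch shows 'rmdir' refusing a directory that still has something in it. It assumes a Bourne-style shell and a scratch directory; the file name "leftover" is hypothetical.

```shell
demo=$(mktemp -d)             # scratch directory, created only for this sketch
cd "$demo"
mkdir No_more_stuff
touch No_more_stuff/leftover

# rmdir will not remove a directory that still contains files.
if rmdir No_more_stuff 2>/dev/null; then
    first_try="removed"
else
    first_try="refused"
fi

rm No_more_stuff/leftover     # empty the directory first
rmdir No_more_stuff           # now the removal succeeds
echo "first attempt: $first_try"
```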

Finding files: find

Occasionally you may forget where you put a file, or where a command is located. With the fast-find feature of the (otherwise very complex) Unix `find' command, you can easily find it again:

 > find which
/usr/bin/ypwhich
/usr/share/man/man1/which.1
/usr/share/man/man1/ypwhich.1
/usr/ucb/which

The `find' command, given only a filename or other string, will check the fast-find database (a list of filenames from all directories which were publicly searchable the last time the database was updated); it will return names of files on the system containing the pattern you specified. Note that the database does not contain filenames from directories which are not publicly searchable, so if you've set your own directories to be very private, the fast-find feature of 'find' can't help you locate your own files.

For more advanced uses of the 'find' command, check out its man page.
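The fast-find database described above may not be available on every system. As a hedged sketch of one of the more general forms covered in the man page, the following walks a directory tree and prints every name matching a pattern. It assumes a Bourne-style shell, and the scratch tree and file names are hypothetical.

```shell
demo=$(mktemp -d)             # scratch directory, created only for this sketch
cd "$demo"
mkdir -p bin man/man1
touch bin/which bin/ypwhich man/man1/which.1 man/man1/other.1

# Start at "." and print every name that contains the string "which".
# The pattern is quoted so that find, not the shell, interprets the "*".
found=$(find . -name '*which*' -print | sort)
echo "$found"
```

Unlike the fast-find lookup, this form searches the directories as they are right now, so it can be slow on a large tree but is never out of date.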

Files

Showing contents of files: cat, more/less

You could use the 'cat' command to see what's in a text file, but since it lists the entire file without stopping, it's not useful in most situations. (You can use 'cat' to do other interesting things, though; for more information, check out its man page.)

The 'more' and 'less' utilities allow you to view a file one screenful at a time. At the bottom of each screen, both `more' and `less' will prompt you for the next screen with:

--More--(n%)

At this point you may press the return key to display the next line, or the space bar to display the next screen. With the "-n" option to 'more', where n is an integer specifying a number of lines, you can define the size a screenful should be.

Showing beginnings/ends of files: head/tail

To display only the beginning of a file, use the `head' command; to display only the end, use `tail'. Without an option specifying how many lines you want displayed (`head -3', for instance, for the first three lines), both `head' and `tail' default to ten lines.
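A file of numbered lines makes it easy to see exactly what 'head' and 'tail' return. This sketch assumes a Bourne-style shell and uses the "-n" spelling of the line count, which is an assumption about your system; the shorter form shown above (`head -3') is accepted on many systems as well. The file name "numbers" is hypothetical.

```shell
demo=$(mktemp -d)             # scratch directory, created only for this sketch
cd "$demo"
# Twelve numbered lines to look at.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > numbers

first_three=$(head -n 3 numbers)   # lines 1 through 3
last_two=$(tail -n 2 numbers)      # lines 11 and 12
default_top=$(head numbers)        # no count given: the first ten lines

echo "$first_three"
echo "$last_two"
```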

For some reason known only to a few Unix wizards, 'head' works with multiple filenames, while 'tail' does not.

Comparing files: diff

To show the differences between two similar files, use 'diff'. Let's say we have two files, one copied from the other, and make a minor change to file2: we change the word "file" to "lozenge." Comparing them, we see:

 > diff file1 file2
3c3
< file,
---
> lozenge,
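You can reproduce a comparison like this one yourself. The sketch below assumes a Bourne-style shell; the two file names and the four lines of sample text are hypothetical, arranged so that the changed word falls on line 3.

```shell
demo=$(mktemp -d)             # scratch directory, created only for this sketch
cd "$demo"
printf '%s\n' "This is a" "short sample" "file," "nothing more." > file1
# Make the copy, changing one word on line 3 along the way.
sed 's/file,/lozenge,/' file1 > file2

# diff reports a non-zero status when the files differ, so guard it with "|| true".
changes=$(diff file1 file2 || true)
echo "$changes"
```

In the output, "3c3" means that line 3 of the first file must be changed to produce line 3 of the second; lines starting with "<" come from the first file and lines starting with ">" from the second.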

Searching for strings in files: grep

To search for a certain character string or other pattern in a file, use the 'grep' command.

Some of the more useful options to 'grep' are "-n", to display the line numbers where the string is found; "-l", to list filenames containing the string, instead of displaying the lines where it is found; and "-i", to ignore case (otherwise you'll find only strings in which the uppercase and lowercase letters are exactly as you typed them!).

For a list of the files in /usr/local/doc/news/newusers which discuss "netiquette," in any capitalization:

 > grep -il netiquette /usr/local/doc/news/newusers/*
/usr/local/doc/news/newusers/README
/usr/local/doc/news/newusers/emily-postnews
/usr/local/doc/news/newusers/info-postings.1
/usr/local/doc/news/newusers/info-postings.2
/usr/local/doc/news/newusers/info-postings.3
/usr/local/doc/news/newusers/news-answers-intro
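The effect of each option is easier to see on a handful of small files. This is a sketch assuming a Bourne-style shell; the scratch directory and the contents of the three files are invented for the example.

```shell
LC_ALL=C; export LC_ALL       # fixed sort order so the results are predictable
demo=$(mktemp -d)             # scratch directory, created only for this sketch
cd "$demo"
echo "Please observe Netiquette when posting." > README
echo "Rules of NETIQUETTE, explained."          > emily-postnews
echo "Nothing relevant here."                   > changelog

names=$(grep -il netiquette *)           # file names only, any capitalization
exact=$(grep -l netiquette * || true)    # case-sensitive: nothing matches
numbered=$(grep -in netiquette README)   # matching line, with its line number

echo "$names"
echo "$numbered"
```

Without "-i" the search finds nothing, since neither file spells the word entirely in lowercase.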

Simple file creation

One of the strengths of Unix is the variety of tools available for any given task, and creating files is no exception.

Using an application, such as 'script'

Sometimes you'll want to record a session, or part of a session, either to document something or to have an accurate and complete record of what happened. In such cases, you can make use of the 'script' command. When you type "script", a copy of everything that goes to your screen will also be written to a file in your current directory. (This produces a slightly messy file, which you'll probably want to edit; see "Basic Unix - Editing Files.")

The default filename used by 'script' is "typescript". If you wish, you can specify another name:

> script myfilename

The 'script' utility will continue merrily copying all terminal output to the file until you type a ^D (Control-D), so don't forget and leave it running.

Many applications other than 'script' allow you to create files. For example, you can save messages from within an electronic mail program into a file; for more on email programs, see "Electronic mail" in the Basic Unix - Communicating with Others section.

Using input/output redirection, such as 'cat >>file'

You can create a new file most simply by typing:

> cat >> newfile

(The first '>' represents your prompt, but the others you must actually type. What you're doing here is referred to as "input/output redirection," and is one of the features that makes Unix so flexible.)

This will create a new file named "newfile" if there isn't one in that directory, or append to an existing one if there is. After you press the return key, anything you type will be copied into the file until you type a ^D (Control-D). This method works well for short files, but since you can't go back and edit any line other than the current one, we recommend using an editor for longer files.
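The create-or-append behavior can be checked without typing at the keyboard. In this sketch, which assumes a Bourne-style shell, a "here-document" supplies the text that you would otherwise type, and its closing EOF marker plays the part of the end-of-input keystroke; the scratch directory is created only for the demonstration.

```shell
demo=$(mktemp -d)             # scratch directory, created only for this sketch
cd "$demo"

# First use: newfile does not exist yet, so it is created.
cat >> newfile <<'EOF'
first line
EOF

# Second use: newfile exists, so the new text is added to the end.
cat >> newfile <<'EOF'
second line
EOF

contents=$(cat newfile)
echo "$contents"
```

Had the second command used a single '>' instead of '>>', the first line would have been replaced rather than kept.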

Using an editor

The best way to create long files, and the method which allows you the most freedom, is using an editor. See "Basic Unix - Editing Files" for more information.

Controlling disk use

Because Unix systems are shared by so many people, all of whom store things in their directories, disk space is a constant problem. Remember that disk space is a shared resource, and all users of a Unix system are expected to be responsible about the number and size of files kept on that machine.

You should always be aware of how much space you are using, and make an effort to keep it to a minimum. Fortunately, Unix offers several tools to help you do this.

Showing disk usage and disk free: du -s, df

To see how much space your current directory is using (it, the files it contains, and all directories beneath it), use the 'du -s' (disk usage, summary) command:

 > du -s
1039 .

This says your current directory and its contents take up 1039 kilobytes, or a little over one megabyte; if this were your home directory, you'd be in pretty good shape. You can use up to five megabytes of disk space before the system administrators take notice and ask you to clear out larger files and those you use infrequently.

To get a sense of how this fits into a larger picture, use the 'df' (disk free) command:

 > df
Filesystem                     kbytes    used   avail capacity  Mounted on
/dev/sd0a                       15487    9907    4032      71%  /
/dev/sd0g                       93007   82787     920      99%  /usre
/dev/sd2a                      181807  149137   23580      86%  /usr/local
/dev/sd0h                       62359   53245    5997      90%  /scratch
/dev/sd0d                       30991    1524   27918       5%  /tmp
/dev/sd0e                       61999   14122   44778      24%  /var
/dev/sd0f                       15487    2916   11797      20%  /var/log
midway:/var/spool/mail         145234  107101   23610      82%  /var/spool/mail
midway:/usr/local/share       1008502  838944   68708      92%  /usr/local/share
harper:/nfs/harper/h1          781838  764818   17020      98%  /nfs/harper/q1
harper:/nfs/harper/h2         1006959  967436   19384      98%  /nfs/harper/h2
harper:/nfs/harper/h3          963662  845974   69505      92%  /nfs/harper/q3
harper:/nfs/harper/h4          961070  840444   24519      97%  /nfs/harper/q4
bluebird:/usr/local/share      300127  235306   49815      83%  /Usenet/share
bluebird:/usr/local/lib/news
                               605886   78208  467090      14%  /Usenet/news

The important figure, so far as you're concerned, is the disk where your own files reside. If your home directory were in /h2, and there were only nine megabytes free for everyone to use, it would be a bad idea to copy a large file to your directory until more space were free.
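If all you want is the two numbers that matter to you, they can be picked out of the output. The sketch below assumes a Bourne-style shell and a 'du' and 'df' that accept the "-k" (kilobytes) option, plus "-P" for 'df', which keeps each entry on a single line; on some systems the commands count 512-byte blocks unless "-k" is given. The numbers printed will of course depend on your own directory and disk.

```shell
# Space used by the current directory and everything beneath it, in kilobytes.
usage_kb=$(du -sk . | awk '{print $1}')

# Naming "." limits the report to the disk that holds the current directory.
# Line 2 of the output is the data line; field 4 is the space still available.
free_kb=$(df -kP . | awk 'NR == 2 {print $4}')

echo "This directory uses ${usage_kb} KB; ${free_kb} KB are free on its disk."
```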

Showing the size of files: du (-a); ls -s; wc

The 'du' command alone, with no options, will also display size summaries for each directory beneath the current one:

 > du
828 ./Saved_mail
183 ./Game_info
1039 .

To see how much space each individual file, in each of the directories beneath the current one, is using, type the 'du -a' command.

Another command which shows the sizes of files only in the current directory (unlike 'du -a', which acts recursively) is 'ls -s' (list files, displaying size in kilobytes):

 > ls -s
total 75
   9 aufs.docs         1 refusing.mail
   7 cmds.2            2 script.to.try
   3 cmds.5            1 spell_files
   2 cmds.6           19 stephen.wright.quotes
   3 faces             1 tex_practice
   1 humor             5 ultimate.cshrc
   1 info              1 Unix.quote
   1 matt.null         1 Unix_tricks
   2 mpage             2 unvalentine.c
   1 name_gen          2 uunet.info

Because 'ls -s' (without the additional "-F" option) does not distinguish between files and directories in determining size, its output can be misleading. In the above example, all names with underscores (such as "tex_practice") are directories, containing files of their own; the 'ls -s' command merely lists them as one-kilobyte files. You may wish to always combine the options, and type the command as "ls -sF".

A third useful command is 'wc' (word count), which displays the number of lines, words, and characters in each file you specify:

 > wc cmds.*
     144    1221    7126 cmds.2
      44     358    2136 cmds.5
      29     217    1216 cmds.6
     229    1880   10967 total

With all these commands, you should find keeping track of your disk usage easy.

Compressing files: compress/uncompress

Once you know how much space your files take up, you'll want to compress those that you don't use frequently, or that you haven't used in some time -- especially larger files.

The 'compress' command converts files into smaller binary files for storage, so they won't take up as much disk space. A compressed file's name ends in ".Z"; to restore it to its original form, use the 'uncompress' command.

 > compress file
> uncompress file.Z

You can glance at a compressed file without uncompressing it, by using the 'zcat' command.




Keywords: unix file system   Doc ID: 16189
Owner: Larry T.   Group: University of Chicago
Created: 2010-12-08 18:00 CST   Updated: 2015-09-02 13:07 CST
Sites: University of Chicago, University of Chicago - Sandbox