Re: du and tar: --exclude-from option
From: Bob Proulx
Subject: Re: du and tar: --exclude-from option
Date: Wed, 30 Oct 2002 17:05:02 -0700
User-agent: Mutt/1.4i
Jacob Elder <address@hidden> [2002-09-28 18:13:11 -0400]:
> It appears that du and tar use a different pattern language for their
> --exclude-from option. I was trying to predict the size of a backup that
> would be performed with tar, and came across a discrepancy.
>
> df reports that I'm using 2882M.
> tar --exclude-from says that the backup is 2722M.
> du with my tar-friendly exclude from reports that I'm using 2942M.
> du without any exclude file reports that I'm using exactly: 2942M.
df is reporting disk blocks free.
du is reporting disk blocks used.
tar makes tape archives, and I did not realize that it reported a
size. However, the size of the archive it creates would be
representative of the space used. But tar has its own overhead for
storing files, so I would expect the archive to be larger than the
number of bytes contained in the files.
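
To make the tar overhead concrete, here is a minimal sketch, assuming
Python 3 and its standard tarfile module rather than the tar command
itself. The file name is made up for illustration. Each member costs a
512-byte header, its data is padded to a multiple of 512 bytes, and the
archive ends with zero-filled blocks, so even a one-byte file produces
an archive of at least 2048 bytes.

```python
import io
import tarfile


def tar_size_for(payload: bytes) -> int:
    """Return the size in bytes of a tar archive holding one file with this payload."""
    buf = io.BytesIO()
    # USTAR keeps the example to plain 512-byte headers (no extended headers).
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name="one-byte.txt")  # hypothetical file name
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    # The archive is complete once the TarFile has been closed.
    return len(buf.getvalue())


payload = b"x"
print("payload bytes:", len(payload))
print("archive bytes:", tar_size_for(payload))
```

The exact archive size also depends on how the writer pads the final
record, so treat the printed number as an illustration, not a rule.
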
A frequent misconception is that du reports the number of characters
(bytes) contained in files. It does not. To get that information you
would need either a script to add up the ls -l data or a different
command altogether. That functionality is not provided by du.
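
A script of that kind could look like the sketch below, assuming
Python 3 is available. The function name apparent_size is my own, and
it adds up the same byte counts that ls -l would show for every
non-directory entry under a directory.

```python
import os
import sys


def apparent_size(root: str) -> int:
    """Sum the byte sizes (as ls -l shows them) of all non-directory entries under root."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat so that a symlink counts as itself, not as its target.
            total += os.lstat(path).st_size
    return total


if __name__ == "__main__":
    # Usage: python apparent_size.py [directory]
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    print(apparent_size(root), "bytes in files under", root)
```

This ignores the space taken by the directories themselves and counts a
hard-linked file once per name, so it is only an approximation of what
a backup would contain.
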
The du command asks the filesystem how many disk blocks are consumed
by a file. This is usually a larger number of bytes than the file
size. A file of one character consumes not just one character but an
entire filesystem fragment. I am going to wave my hands and say that
is a 512-byte block even for the smallest file. That is no longer true
everywhere, but it once always was. Different filesystems implement
this differently, and that is not the point here. Also, a file could
be sparse, in which case it occupies fewer disk blocks than its size
suggests.
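
The difference between file size and allocated blocks can be seen with
a short sketch, assuming Python 3 on a Unix-like system where os.stat
exposes st_blocks, counted in 512-byte units. The file names are made
up, and the numbers printed depend on the filesystem.

```python
import os
import tempfile


def size_and_allocation(path: str) -> tuple:
    """Return (file size in bytes, bytes allocated on disk) for path."""
    st = os.stat(path)
    # st_blocks counts 512-byte units regardless of the filesystem block size.
    return st.st_size, st.st_blocks * 512


with tempfile.TemporaryDirectory() as tmp:
    # A one-character file: the size is 1, the allocation is usually a whole block.
    small = os.path.join(tmp, "small")
    with open(small, "wb") as f:
        f.write(b"x")

    # A sparse file: seek past 1 MiB and write one byte, leaving a hole.
    sparse = os.path.join(tmp, "sparse")
    with open(sparse, "wb") as f:
        f.seek(1024 * 1024)
        f.write(b"x")

    for path in (small, sparse):
        size, allocated = size_and_allocation(path)
        print(os.path.basename(path), "size:", size, "allocated:", allocated)
```

On a filesystem that supports holes, the sparse file typically shows an
allocation far below its size, while the small file shows the opposite.
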
Therefore I would expect du to rarely, if ever, give you the same
value as adding up the bytes in the files. It would also rarely match
the size of the tar archive. The sizes you quoted are close enough in
magnitude that the difference could simply be confusion between file
sizes and disk blocks.
> The options have the same name, and both commands hail from gnu.org, so
> shouldn't they cooperate?
I am not convinced by this data that the exclude list is really the
issue here. It might be. But the other confusion seems a much more
likely explanation.
Bob