Re: [Lzip-bug] Source code repository for lzip
From: Antonio Diaz Diaz
Subject: Re: [Lzip-bug] Source code repository for lzip
Date: Mon, 03 Mar 2014 19:09:47 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.11) Gecko/20050905
Michał Górny wrote:
> So in fact the 'underlying stream' in lzip file format is incompatible
> with the 'underlying stream' in xz? Am I understanding this correctly?
Yes. The sad thing is that the 'underlying stream' in the lzip file
format is identical to the 'underlying stream' in lzma-alone. It was xz
that threw away the code in lzma-utils and rewrote it from scratch,
changing the license and the stream, and adding some features that are
dangerous for Free Software and mostly useless on GNU/Linux systems.
Just run a test: does Gentoo use any of the "advanced" features of xz
("cheaper" recompression of already compressed files, binary filters,
user-defined filters, etc.)?
BTW, the simpler stream in lzip files is what makes lziprecover possible.
> I agree with you that xz is unnecessarily complex and therefore you
> could say that it moves in that regard, but I guess I don't understand
> lzip enough to see what the arguments are in favor of it instead,
> and that's what I'm trying to get a grasp on, what the key benefits are.
Lzip is just a compressor like gzip or bzip2, only it compresses more. A
wise person should choose lzip by default and only switch to xz after
verifying that he needs the additional complexity.
In practice, and surprisingly, the contrary happens: people choose
xz, complain about its unnecessary complexity, and then ask what the
benefits are of using something simpler that does the job as well or
even better.
> It occurs to me that if data safety was my top priority, I'd use a tool
> dedicated to just that task, like PAR2.
Normally one compresses the files before using a tool like parchive on
them. See, for example, how the zfec[1] page (another package
implementing an "erasure code") describes the process:
"a Unix-style tool like "zfec" does only one thing -- in this case
erasure coding -- and leaves other tasks to other tools. Other
Unix-style tools that go well with zfec include GNU tar for archiving
multiple files and directories into one file, lzip for compression, and
GNU Privacy Guard for encryption or sha256sum for integrity. It is
important to do things in order: first archive, then compress, then
either encrypt or integrity-check, then erasure code."
[1] https://pypi.python.org/pypi/zfec
But it is usually much easier to just store two or more copies of your
important files on different media and use lziprecover to merge the
copies if all of them get damaged.
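The merge idea can be illustrated with a toy sketch. This is not lziprecover's actual algorithm (lziprecover verifies candidates against the CRC stored in each lzip member); it is a simplified illustration using a SHA-256 digest as the integrity check, with hypothetical names throughout:

```python
import hashlib
from itertools import product

def merge_copies(copies, good_digest):
    """Toy merge of equally sized damaged copies: at each byte position
    where the copies disagree, try every candidate byte seen in any copy
    until the SHA-256 of the result matches good_digest. (Real lziprecover
    uses the CRC inside each lzip member, not an external digest.)"""
    base = bytearray(copies[0])
    diff_positions = [i for i in range(len(base))
                      if any(c[i] != base[i] for c in copies[1:])]
    candidates = [sorted({c[i] for c in copies}) for i in diff_positions]
    for choice in product(*candidates):
        for pos, byte in zip(diff_positions, choice):
            base[pos] = byte
        if hashlib.sha256(base).hexdigest() == good_digest:
            return bytes(base)
    return None

# Two copies, each damaged at a different byte position.
original = b"important compressed data"
digest = hashlib.sha256(original).hexdigest()
copy1 = bytearray(original); copy1[3] = 0   # damage byte 3
copy2 = bytearray(original); copy2[10] = 0  # damage byte 10
merged = merge_copies([bytes(copy1), bytes(copy2)], digest)
assert merged == original
```

The key observation is the same as in lziprecover: as long as the copies are not damaged in the same place, the undamaged data can be recombined from them.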
> But if I just need a tool to compress my sources for distribution, I
> can safely assume that something else will be responsible for
> ensuring the integrity of my data.
Then I would prefer the simplest tool that does the job: lzip.
> Another technical concern I have is regarding memory. How does lzip
> compare to xz? If the peak memory use is determined
> by the dictionary size, doesn't this make efficient use of memory
> a matter of better implementation rather than of the format?
Peak memory use is mainly a matter of choosing the right dictionary
size, and lzip is much better than xz in this regard because 'xz -9'
always uses a 64 MiB dictionary (even to compress a very small file),
while lzip automatically uses the smallest possible dictionary size for
each file.
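That behaviour can be sketched as follows. This is a simplified model, not lzip's exact code; the 4 KiB and 512 MiB bounds are the valid dictionary-size range of the lzip format, and the idea is simply that a dictionary larger than the input file is wasted:

```python
MIN_DICT = 4 << 10    # 4 KiB, smallest valid lzip dictionary size
MAX_DICT = 512 << 20  # 512 MiB, largest valid lzip dictionary size

def effective_dict_size(requested, file_size):
    """Simplified model of lzip's behaviour: cap the requested
    dictionary size at the input file size, within the valid bounds."""
    size = min(requested, MAX_DICT)
    size = min(size, max(file_size, MIN_DICT))
    return max(size, MIN_DICT)

# Even if a large dictionary is requested, a 100 KiB file
# only ever needs a 100 KiB one:
print(effective_dict_size(32 << 20, 100 << 10))  # 102400 (100 KiB)
```

This is why lzip's peak memory use scales down automatically for small files, independently of the compression level chosen.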
Xz is the only compressor that caused problems in the 'dist' target of
automake. This is why it is the only compressor that automake does not
invoke with option '-9' by default. Once again, a wise person would just
use lzip; there is no reason to waste 674 MiB to compress a small file,
as 'xz -9' does.
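The 674 MiB figure matches the common rule of thumb that LZMA compression needs roughly 10.5 times the dictionary size (an approximation taken from the xz documentation's memory table, not an exact formula):

```python
def approx_compress_mem_mib(dict_mib):
    """Rough LZMA compressor memory use: about 10.5x the dictionary
    size. An approximation, not an exact figure."""
    return 10.5 * dict_mib

# 'xz -9' always uses a 64 MiB dictionary, however small the input:
print(approx_compress_mem_mib(64))  # 672.0 MiB, close to the 674 MiB cited
# A compressor that shrinks its dictionary to a 1 MiB input needs only:
print(approx_compress_mem_mib(1))   # 10.5 MiB
```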
Regards,
Antonio.