GNU Parallel seems to drop
From: Dirk Eddelbuettel
Subject: GNU Parallel seems to drop
Date: Tue, 25 Sep 2012 03:53:18 +0000 (UTC)
User-agent: Loom/3.14 (http://gmane.org/)
Hi Ole,
I have some large jobs in which a file is piped into awk, and awk then splits
the large file into distinct files based on a token found on the line.
To make matters concrete, imagine a file
A B foo C D
E F foo G H
I J giz K L
M N foo O P
Q R giz S T
where the 1st, 2nd and 4th lines go to the file data/foo, and the 3rd and 5th
lines go to data/giz.
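For illustration only, here is a minimal sketch of the split described above. It is not the actual script from this message. It assumes whitespace-separated fields with the token in the third field, and the file name sample.txt is hypothetical:

```shell
# Illustrative sketch only: assumes whitespace-separated fields
# with the routing token in the third field.
mkdir -p data
printf '%s\n' \
  'A B foo C D' \
  'E F foo G H' \
  'I J giz K L' \
  'M N foo O P' \
  'Q R giz S T' > sample.txt

# Send each line to data/<token>. Within a single awk process, ">"
# truncates a file once on first use; later prints to it are appended.
awk '{ print > ("data/" $3) }' sample.txt
```

A single awk process keeps each output file open for the whole run, so every matching line ends up in its file.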
I would like to parallelize this. And instead of
zcat foo.gz | awk -v v1=A -v v2=B -F: '....'
I tried (several variations, ending with)
zcat foo.gz | parallel --pipe -- awk -v v1=A -v v2=B -F: -f script.awk
which should avoid most shell quoting headaches. Unfortunately, parallel seems
to swallow a lot of lines. I started with approximately 670 MB of input, and
the parallel approach yields only about 3 MB of output. Ouch. I am obviously
doing something wrong here, but what is it?
I started with the current Debian package 20120422 and just tried the most
recent release, 20120822, which did not change things.
Thanks for writing and supporting parallel. It looks rather useful.
Cheers, Dirk
PS It would be nice if you also provided .info documents. I still like those as
my go-to docs when in Emacs. I tried makeinfo on your .texi files, but there
seems to be some metadata missing.