Re: GNU Parallel seems to drop data
From: Dirk Eddelbuettel
Subject: Re: GNU Parallel seems to drop data
Date: Tue, 25 Sep 2012 13:48:10 +0000 (UTC)
User-agent: Loom/3.14 (http://gmane.org/)
Ole Tange <ole <at> tange.dk> writes:
> On Tue, Sep 25, 2012 at 1:50 PM, Dirk Eddelbuettel <edd <at> debian.org>
> wrote:
>
> > Well a little "apt-get install gawk-doc" and two seconds of searching lead
> > to
> > the '>>' operator to append to files ... and tada, it now works.
>
> Depending on how it appends that may not work. Do you know for sure it
> flushes for every record? Otherwise you may get half-records.
Yes, now that I am in the office and have my actual data, that verification is
the next step. I probably also need the '-k' switch [ does that have "significant"
performance implications? ] to ensure the output order is the same, which is
important for the subsequent "munging" of the appropriately split files.
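
On the flushing question quoted above, here is a minimal sketch of what
appending with a flush after every record could look like in awk. The
thread does not show the actual script, so the key-in-first-field layout,
the file names and the temporary directory are assumptions made for
illustration only.

```shell
# Hypothetical demo input: the first field decides the output file.
workdir=$(mktemp -d)
printf 'a 1\nb 2\na 3\n' > "$workdir/input.txt"

awk -v dir="$workdir" '{
    out = dir "/" $1 ".txt"   # one output file per key (assumed layout)
    print >> out              # ">>" appends instead of truncating
    fflush(out)               # flush this file after every record
}' "$workdir/input.txt"

cat "$workdir/a.txt"
```

Flushing per record keeps a partially written record from sitting in awk's
buffer, but it does not by itself prove that concurrent writers never
interleave, so the checksum comparison is still worth running.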
> If these give the same output, then you are golden. If not, you may
> have half-records in the parallel data.
>
> parallel -k --tag 'sort {} | md5sum' ::: dataSerial/*
> parallel -k --tag 'sort {} | md5sum' ::: dataParallel/*
Brilliant idea to compare via md5sum. Quicker than my formal munging.
Dirk
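
The comparison suggested above can also be scripted so that the two
directories are checked in one step. The following is a serial sketch, not
the commands from the thread: the helper name sorted_sums and the demo
files are hypothetical, and it assumes both directories hold files with
matching names. It prints the base name rather than the full path so the
two listings are directly comparable.

```shell
# Print "name<TAB>md5 of the sorted contents" for each file in a directory.
sorted_sums() {
    for f in "$1"/*; do
        printf '%s\t%s\n' "$(basename "$f")" \
            "$(sort "$f" | md5sum | cut -d' ' -f1)"
    done
}

# Hypothetical demo data: the same records written in a different order.
demo=$(mktemp -d)
mkdir "$demo/dataSerial" "$demo/dataParallel"
printf 'r1\nr2\nr3\n' > "$demo/dataSerial/part1"
printf 'r3\nr1\nr2\n' > "$demo/dataParallel/part1"

if [ "$(sorted_sums "$demo/dataSerial")" = "$(sorted_sums "$demo/dataParallel")" ]; then
    result="match"
else
    result="differ"
fi
echo "$result"
```

Because each file is sorted before hashing, a pure reordering of records
still counts as a match, while a dropped or truncated record changes the
checksum.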