From: Jon Schipp
Subject: Parallel Output Differs
Date: Mon, 12 Aug 2013 17:35:28 -0500
My output is not consistent from one parallel run to the next, even though the input files are not changing.
I compiled the 20130722 tarball from source and am running it on an x86_64 Red Hat machine, kernel 2.6.32-358.14.1.el6.x86_64.
First run:
$ ~user1/parallel-20130722/src/parallel 'zcat {} | grep '\''[[:space:]]8\.8\.8\.8[[:space:]]'\''' ::: conn.* | wc -l
10411949
Second run, immediately after:
$ ~user1/parallel-20130722/src/parallel 'zcat {} | grep '\''[[:space:]]8\.8\.8\.8[[:space:]]'\''' ::: conn.* | wc -l
10302213
I also tried feeding the file list through a pipe, as in 'ls conn.* | parallel ...', but got the same behavior.
The actual line count for the query is:
$ zcat conn.* | grep '[[:space:]]141\.142\.225\.125[[:space:]]' | wc -l
38165310
There are 24 files, each a little over 100MB. The machine has 24 cores. The fewer jobs I schedule, the
more accurate the result. Running with -j 4 gave the correct result:
$ ~user1/parallel-20130722/src/parallel -j 16 'zcat {} | grep '\''[[:space:]]141\.142\.225\.125[[:space:]]'\''' ::: conn.* | wc -l
15585105
$ ~user1/parallel-20130722/src/parallel -j 8 'zcat {} | grep '\''[[:space:]]141\.142\.225\.125[[:space:]]'\''' ::: conn.* | wc -l
21112856
$ ~user1/parallel-20130722/src/parallel -j 4 'zcat {} | grep '\''[[:space:]]141\.142\.225\.125[[:space:]]'\''' ::: conn.* | wc -l
38165310
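As a cross-check that does not depend on how the job output is collected, the matches can be counted per file and summed, so each file contributes a single number instead of millions of lines. The following is only a minimal sketch, assuming a POSIX shell and gzip; the helper name count_matches is hypothetical, and it runs sequentially, so it verifies the expected total rather than explaining the discrepancy:

```shell
# Hypothetical helper: count matching lines in each compressed file, then sum.
# Usage: count_matches PATTERN FILE...
count_matches() {
    pattern=$1
    shift
    total=0
    for f in "$@"; do
        # gzip -dc is equivalent to zcat. grep -c prints the number of
        # matching lines and exits non-zero when there are none, hence "|| true".
        n=$(gzip -dc "$f" | grep -c "$pattern" || true)
        total=$((total + n))
    done
    echo "$total"
}

# Example invocation against the files from this report:
# count_matches '[[:space:]]141\.142\.225\.125[[:space:]]' conn.*
```

If the sum agrees with the sequential zcat figure above, the input files and the pattern are fine, and the difference lies in how the parallel output reaches wc.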
Do you know why this happens, and how I can get the correct output while still getting the performance benefit of using as many cores as possible?
Any explanation or help is appreciated. Thanks, Jon