Re: Using gnu-parallel with processes that have an extended startup time
From: Ole Tange
Subject: Re: Using gnu-parallel with processes that have an extended startup time
Date: Thu, 18 Jul 2013 13:01:52 +0200

On Tue, Jul 16, 2013 at 3:02 PM, Ole Tange <ole@tange.dk> wrote:
> On Tue, Jul 16, 2013 at 2:20 PM, Diaa Sami <diaasami@gmail.com> wrote:
>
>> Hi,
>> I'm using gnu parallel with a custom python script that processes lines, one
>> line in, one or more lines out, and this script happens to have a long
>> startup time because of the kind of processing it has to perform on the
>> input (it has to load a dictionary in memory first).
>> I was wondering if gnu parallel can just keep the processes running and just
>> feed them records rather than starting a process for each block.
>
> So you are doing something like:
>
> cat bigfile | parallel --pipe yourprogram > output
>
> And no: GNU Parallel currently does not have an option for feeding
> more blocks to a running instance.

I have now made a first version, available in git. It is highly
inefficient, but give it a spin and see if you can find errors:

  cat bigfile | parallel --round-robin --pipe yourprogram > output

To get the code:

  git clone git://git.sv.gnu.org/parallel.git

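With --round-robin the instances of yourprogram stay alive and are fed
several blocks each on stdin, so the program should do its expensive
setup once and then read lines until EOF. A minimal sketch of such a
line filter follows. The names load_dictionary and process_line and the
tiny dictionary are hypothetical stand-ins, not taken from the real
script:

```python
import sys
import time


def load_dictionary():
    """Hypothetical stand-in for the slow startup step."""
    time.sleep(0.1)  # simulate the cost of loading a large dictionary
    return {"hello": "HELLO", "world": "WORLD"}


def process_line(line, dictionary):
    """One line in, one or more lines out (here: one output line per word)."""
    return [dictionary.get(word, word) for word in line.split()]


def main(stdin=sys.stdin, stdout=sys.stdout):
    dictionary = load_dictionary()  # paid once per process, not once per block
    for line in stdin:              # keeps reading until stdin is closed
        for out in process_line(line, dictionary):
            stdout.write(out + "\n")
    stdout.flush()


if __name__ == "__main__":
    main()
```

Saved as, say, yourprogram.py (an assumed file name), it would take the
place of yourprogram in the command above. The startup cost is then
paid once per job slot rather than once per block.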
/Ole