
Re: Revision of GNU Parallel's processing of SIGTERM


From: Martin d'Anjou
Subject: Re: Revision of GNU Parallel's processing of SIGTERM
Date: Mon, 13 Apr 2015 09:44:10 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

On 15-04-12 07:14 AM, Ole Tange wrote:

You mean in sub Job::kill():

             # Wait up to 200 ms between TERMs - but only if any pids are alive
             my $sleep = 1;
             for (my $sleepsum = 0; kill 0, $family_pids[0] and $sleepsum < 200;
                  $sleepsum += $sleep) {
                 $sleep = ::reap_usleep($sleep);
             }

'kill 0, pid' returns true if the process is still running.
$family_pids[0] is the immediate child (i.e. the parent of any
(grand)*children).
There is no need to see if any (grand)*children are running: it is the
job of $family_pids[0] to kill those.

I agree: it is the job of the parent process to kill its child processes. But this does not match what I see in the code. Here is the kill loop:

    for my $pid (@family_pids) {
        if(kill 0, $pid) {
            # The job still running
            kill $signal, $pid;
            $alive = 1;
        }
    }

The kill loop sends the signal to every element of the @family_pids array, not only to the parent. So IMO the code and the intent have to match. In fact, I ran into this confusion when I wrote the test (line 19 of this file):
https://github.com/martinda/gnu-parallel/blob/sigterm-1/testsuite/tests-to-run/parallel-local-signals.sh#L19
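To make the observed behaviour concrete, here is a minimal standalone sketch of a loop shaped like the one above. It is not the GNU Parallel code: signal_family() is a hypothetical helper, and the "family" is just two sleeping children forked by the script itself (in GNU Parallel the array would hold the job's parent followed by its descendants). It only shows that every live pid in the list receives the signal:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the kill loop: signal every live pid in the list.
# Returns the pids that were alive and therefore signalled.
sub signal_family {
    my ($signal, @family_pids) = @_;
    my @signalled;
    for my $pid (@family_pids) {
        if (kill 0, $pid) {        # signal 0 only probes whether the pid exists
            kill $signal, $pid;
            push @signalled, $pid;
        }
    }
    return @signalled;
}

# Fork two sleeping children to play the role of a process family
# (assumption: direct children stand in for parent + grandchildren).
my @family_pids;
for (1 .. 2) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) { sleep 30; exit 0; }
    push @family_pids, $pid;
}

my @signalled = signal_family('TERM', @family_pids);
waitpid($_, 0) for @family_pids;   # reap the children so no zombies remain
print "signalled: ", scalar(@signalled), " of ", scalar(@family_pids), "\n";
```

With the loop written this way, no member of the list is left for the parent to clean up, which is the mismatch with the stated intent.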

I propose two solutions:
- Change the code to only kill the parent process (existing users will have to change their application code to kill child processes themselves)
- Do not change the default behaviour, but add an option so that only the parent job is killed (existing users see no change)
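As a rough sketch of the second solution, the choice could be isolated in one small function. Both pids_to_signal() and the $parent_only flag are names made up for illustration; no such option exists in GNU Parallel today, and the pids below are invented:

```perl
use strict;
use warnings;

# Hypothetical helper: choose which pids receive the signal.
# $parent_only is an assumed flag, not an existing GNU Parallel option.
# The parent is assumed to be the first element, as in @family_pids.
sub pids_to_signal {
    my ($parent_only, @family_pids) = @_;
    return () unless @family_pids;
    return $parent_only ? ($family_pids[0]) : @family_pids;
}

my @family = (100, 101, 102);   # made-up pids, parent first
print join(",", pids_to_signal(0, @family)), "\n";  # default: whole family
print join(",", pids_to_signal(1, @family)), "\n";  # opt-in: parent only
```

The kill loop would then iterate over the returned list instead of @family_pids, so the default stays as it is and only users who pass the new option see a change.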

Which solution is preferable?

Martin


