[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Revision of GNU Parallel's processing of SIGTERM
From: |
Martin d'Anjou |
Subject: |
Re: Revision of GNU Parallel's processing of SIGTERM |
Date: |
Mon, 13 Apr 2015 09:44:10 -0400 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 |
On 15-04-12 07:14 AM, Ole Tange wrote:
You mean in sub Job::kill():
# Wait up to 200 ms between TERMs - but only if any pids
are alive
my $sleep = 1;
for (my $sleepsum = 0; kill 0, $family_pids[0] and $sleepsum < 200;
$sleepsum += $sleep) {
$sleep = ::reap_usleep($sleep);
}
'kill 0, pid' returns true if the process is still running.
$family_pids[0] is the immediate child (i.e. the parent of any
(grand)*children)).
There is no need to see if any (grand)*children are running: it is the
job of $family_pids[0] to kill those.
I agree: it is the job of the parent process to kill its child
processes. But this does not match what I see in the code. Here is the
kill loop:
for my $pid (@family_pids) {
if(kill 0, $pid) {
# The job still running
kill $signal, $pid;
$alive = 1;
}
}
The kill loop sends the kill signal to all elements of the family_pids
array. So IMO the code and the intent have to match. In fact, I ran into
this confusion when I wrote the test (line 19 of this file):
https://github.com/martinda/gnu-parallel/blob/sigterm-1/testsuite/tests-to-run/parallel-local-signals.sh#L19
I propose two solutions:
- Change the code to only kill the parent process (existing users will
have to change their application code to kill children processes themselves)
- Do not change the default behaviour, but add an option so that only
the parent job is killed (existing users see no change)
What solution is preferable?
Martin