Re: Revision of GNU Parallel's processing of SIGTERM
From: Martin d'Anjou
Subject: Re: Revision of GNU Parallel's processing of SIGTERM
Date: Sun, 12 Apr 2015 19:53:03 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0
On 15-04-12 07:14 AM, Ole Tange wrote:
On Sat, Apr 11, 2015 at 12:56 AM, Martin d'Anjou
<martin.danjou14@gmail.com> wrote:
Hello Ole,
I worked on the SIGTERM propagation feature today. I have some questions;
they are also in the code in the form of comments, if you prefer to
read them there (search for "Question"):
https://github.com/martinda/gnu-parallel/compare/sigterm-1?expand=1#diff-5379ba718ef5b0a2feb45981e768a9fd
Q1:
Inside sub wait_and_exit, $job->kill("TERM") is called twice. As I am trying
to update the documentation, I find this complex to explain.
Do you know why the call is made twice?
Should I write my own "wait_and_exit" for the SIGTERM propagation feature?
I think it is a leftover from when $job->kill() did not send 2 TERMs.
The idea is to handle programs like GNU Parallel (which need 2 TERMs
to exit) that are started from GNU Parallel.
I understand now. Very clear. Another special program is emacs: I have
read that SIGINT does not kill it! I have one other program like this,
a 3rd-party binary, unfortunately.
Q2:
I have added a [--wait-for-children [GRACE_PERIOD]] option for the user to
extend the grace period of $sleepsum in case the user is dealing with
processes that are long to "put to rest".
My question: should this option be available in general, or just for the
propagation feature?
Do we really need an option for this? I would like to see at least 2
real life scenarios, where this makes sense and for which a hard coded
value will not work.
I really do not like the current --wait-for-children solution that I
proposed. After much thinking it is a bit too specific, and it does not
fit well.
I have prepared the documentation for a different approach. I will send
another email to keep things separate. This discussion is getting to be
a lot of text.
In terms of a real life scenario, I can offer an overview of my workflow.
Some processes take a long time to terminate from the point of view of
GNU Parallel: between the time GNU Parallel issues the TERM signal
and the time it hears back from the processes, more than 200 ms can
elapse. For example, the current chain of
commands with SIGTERM in my workflow is: Jenkins, script, script, GNU
Make, GNU Parallel, script, grid engine submission host, grid engine
master, grid engine execution host, script, program. The last program is
CPU/RAM/IO intensive, the layers above are for build management. When
users hit the "kill the running job" button, SIGTERM has to make its way
down to the low level program, the low level program does some work to
properly terminate the process (could be a few seconds), and then it
goes back up the chain. At each level, a little processing needs to
happen to close that level properly. Each level along the way works
better when its child process terminates in an orderly fashion. The
delay between sending SIGTERM and hearing back from the child-most
process can be more than 200ms.
I hope this demonstrates that in some cases, extending the grace period
beyond 200ms benefits the user.
Q3:
Still in the wait_and_exit subroutine, the grace period is "ANDed" with the
family_pids[0].
Why just the 0'th element? Why not the entire array?
You mean in sub Job::kill():
    # Wait up to 200 ms between TERMs - but only if any pids are alive
    my $sleep = 1;
    for (my $sleepsum = 0; kill 0, $family_pids[0] and $sleepsum < 200;
         $sleepsum += $sleep) {
        $sleep = ::reap_usleep($sleep);
    }
'kill 0, pid' returns true if the process is still running.
$family_pids[0] is the immediate child (i.e. the parent of any
(grand)*children).
There is no need to see if any (grand)*children are running: it is the
job of $family_pids[0] to kill those.
Ok, I understand now. Yes this makes sense. I agree.
The for loop runs up to 200 ms, but if the pid dies earlier, then the
loop exits.
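
To illustrate the bounded wait described above, here is a minimal sketch. It
is not GNU Parallel's code: it is written in Python rather than Perl so it
can be run standalone, the names wait_for_exit and spawn_sleeper are
hypothetical, and it measures wall-clock time against a deadline instead of
summing sleep intervals as the Perl loop does. The back-off factor and cap
are arbitrary assumptions.

```python
import subprocess
import sys
import time


def wait_for_exit(proc, limit_ms=200):
    """Poll proc until it exits or limit_ms has elapsed.

    Returns True if the process exited within the limit, False otherwise.
    proc.poll() also reaps the child, so no zombie is left behind.
    """
    sleep_ms = 1.0  # assumed starting interval
    deadline = time.monotonic() + limit_ms / 1000.0
    while proc.poll() is None:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline: the limit is an upper bound.
        time.sleep(min(sleep_ms / 1000.0, remaining))
        # Back off gradually, with an assumed cap of 50 ms per sleep.
        sleep_ms = min(sleep_ms * 1.5, 50.0)
    return True


def spawn_sleeper(seconds):
    """Start a child Python process that sleeps for the given time."""
    code = "import time; time.sleep({})".format(seconds)
    return subprocess.Popen([sys.executable, "-c", code])
```

With these helpers, wait_for_exit(spawn_sleeper(0), limit_ms=10000) returns
as soon as the child is gone, well before the 10 s limit, while a child that
sleeps for 30 s makes the call return False after roughly 200 ms.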
But maybe this should be revised:
When a job times out (--timeout) we want to kill it. It is OK to give
it 200 - 1000 ms to clean up, so 'kill TERM', wait, 'kill TERM', wait,
'kill KILL'.
When GNU Parallel receives 2 TERMs, it should for all jobs 'kill
TERM', wait, 'kill TERM', wait, 'kill KILL'.
The wait should always be an upper limit: Do not wait a full second,
if the job finishes faster.
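
The sequence proposed above ('kill TERM', wait, 'kill TERM', wait,
'kill KILL', with each wait as an upper limit) could look roughly like the
following. This is only a sketch under assumptions, not a patch: it is in
Python rather than Perl so it runs standalone, the names wait_for_exit and
terminate_escalating are hypothetical, the 200 ms default is taken from the
existing loop, and it deals with the immediate child only, not with any
(grand)*children.

```python
import signal
import subprocess
import sys
import time


def wait_for_exit(proc, limit_ms):
    """Return True if proc exits within limit_ms, polling every few ms."""
    deadline = time.monotonic() + limit_ms / 1000.0
    while proc.poll() is None:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(0.005, remaining))
    return True


def terminate_escalating(proc, grace_ms=200):
    """TERM, wait, TERM, wait, KILL. Returns the signals attempted, in order.

    Each wait ends as soon as the process exits, so a well-behaved job
    costs far less than two full grace periods.
    """
    attempted = []
    for sig in (signal.SIGTERM, signal.SIGTERM):
        if proc.poll() is not None:
            return attempted  # already gone, nothing more to send
        proc.send_signal(sig)
        attempted.append(sig)
        if wait_for_exit(proc, grace_ms):
            return attempted
    # Last resort: KILL cannot be caught or ignored.
    proc.send_signal(signal.SIGKILL)
    attempted.append(signal.SIGKILL)
    proc.wait()
    return attempted
```

A job that dies on the first TERM returns after one short wait; a job that
ignores TERM costs at most two grace periods before it receives KILL.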
I am not sure whether GNU Parallel should also kill the
(grand*)children, and if so how that should be done to work well for
most cases. Maybe:
    'kill TERM', wait, 'kill TERM', wait, 'kill KILL',
    'kill KILL @grandchildren_pid'
This way the parent is given a chance to clean up, but if it does not
manage to, then GNU Parallel does the cleaning. It would be good to have
testcases for this kind of scenario.
The new tests I wrote are very close to this. They are on github for now:
https://github.com/martinda/gnu-parallel/blob/sigterm-1/testsuite/tests-to-run/parallel-local-signals.sh
I should be able to write one for this scenario if needed.
Thank you very much for your explanations, it helps a lot.
Martin