
Re: New behavior proposal --halt -% with job killing


From: Martin d'Anjou
Subject: Re: New behavior proposal --halt -% with job killing
Date: Sat, 25 Apr 2015 08:55:24 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

On 15-04-25 03:25 AM, Ole Tange wrote:
On Thu, Apr 23, 2015 at 5:10 PM, Martin d'Anjou
<martin.danjou14@gmail.com> wrote:

Good summary:

You could come up with many options here:

When to halt:
--halt [condition for halt]
--timeout [condition for halt is an amount of time]
--memfree [condition for halt is an amount of memory]
kill -TERM [condition for halt is the signal]
I think our solution should make it possible to extend this list. E.g.
maybe it will be possible to detect whether the remote job failed or
the network connection to the remote server failed.

--memfree is special, however. It retries indefinitely, if the job
gets killed due to low memory.

How to handle jobs after a halt:
--halt-job-handling [killpending[,killrunning]]
--timeout-job-handling [killpending[,killrunning]]
and so on. Users could use both kills if they wanted both.
And --kill-TERM-job-handling

killrunning will always imply killpending, but the opposite is not the
case, right?

For me, yes: killrunning would always imply killpending. So it could be [killpending,killall].
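
To make the implication concrete, here is a minimal sketch of how a comma-separated job-handling value could be normalized so that killrunning always brings killpending with it. The function name normalize_handling is hypothetical, and the killpending/killrunning keywords are only the ones proposed in this thread; none of this is existing GNU Parallel behaviour.

```shell
# Hypothetical helper (not part of GNU Parallel): expand a proposed
# job-handling value so that killrunning always implies killpending.
normalize_handling() {
  case ",$1," in
    *,killrunning,*) echo "killpending,killrunning" ;;  # running implies pending
    *,killpending,*) echo "killpending" ;;              # pending alone stays as is
    *)               echo "" ;;                         # neither keyword given
  esac
}

normalize_handling "killrunning"
normalize_handling "killpending"
```

With this reading, asking for killrunning alone and asking for both give the same result, while killpending alone never touches running jobs.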


--retries should be thrown into the mix, too.

What does retry mean?


I can easily think of real life situations where the handling of a
death due to --memfree is different from a --timeout, and I think the
current behaviour (retrying indefinitely) is correct. But do we have a
real life situation where we want --halt-job-handling to be different
from --timeout-job-handling given that we have --retries?

I am reluctant to put in 3 options that are extremely rarely used
(there are plenty of options as it is, and testing becomes harder the
more combinations need to be tested).

Agreed.

For me, one option suffices for every way of halting (--halt, --timeout, kill -TERM): either kill pending (stop spawning new jobs) or kill all (which implies kill pending). I do not use --memfree, so throw it into the mix where it makes sense.


Or, you could use an explicit "plus" sign to mean halt and kill all running
and pending jobs:
--halt +1-99%
POLA would say that --halt +1-99% == --halt 1-99%


There is a precedent. The find program uses +n, -n and n (man find). "find /path/to/files* -mtime +5" is not the same as "find /path/to/files* -mtime -5".
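
A small sketch of that find behaviour, for anyone who wants to see the sign change the result. It assumes GNU touch (for the relative dates given to -d) and uses a scratch directory from mktemp; the file names are made up for the illustration.

```shell
# Scratch directory with one file last modified 10 days ago and one 1 day ago.
# Assumes GNU coreutils: touch -d accepts relative dates like '10 days ago'.
dir=$(mktemp -d)
touch -d '10 days ago' "$dir/old.txt"
touch -d '1 day ago' "$dir/new.txt"

# +5 selects files modified more than 5 days ago.
older=$(find "$dir" -type f -mtime +5)
# -5 selects files modified less than 5 days ago.
newer=$(find "$dir" -type f -mtime -5)

echo "older than 5 days: $older"
echo "newer than 5 days: $newer"

# Clean up the scratch directory.
rm -rf "$dir"
```

The same number with a different sign selects disjoint sets of files, so a leading + carrying meaning is not without precedent in command-line tools.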

Martin




