Re: job race!
From: Ole Tange
Subject: Re: job race!
Date: Thu, 25 Apr 2013 13:19:09 +0200
On Wed, Apr 24, 2013 at 9:25 PM, Ozgur Akgun <ozgurakgun@gmail.com> wrote:
> I want to be able to say something like `parallel --timeout (fastest * 2)`
> and get the same output.
I have been pondering if I could somehow make a '--timeout 5%'. It should:
1. Run the first 3 jobs to completion (no --timeout)
2. Compute the average and standard deviation for all completed jobs
3. Adjust --timeout based on the new average, standard deviation and user input
4. Go to 2 until all jobs are finished
The user input would be a percentage, e.g. 5% - meaning "I want the job
killed if it takes longer to run than the fastest 95% of jobs". We can
statistically compute that limit if we assume that the run time of the
jobs is normally distributed (the bell curve
https://en.wikipedia.org/wiki/File:Normal_Distribution_PDF.svg) and
that the run time of the jobs does not depend on the order (e.g. it
will not work if we get all the fast jobs first).
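The percentile idea above could be sketched like this (a hypothetical Python sketch, not anything in GNU parallel; the function name `adaptive_timeout` and the `min_jobs` parameter are my inventions). It implements steps 1-3: no timeout until a few jobs have finished, then mean + z * sd under the normal-distribution assumption:

```python
import statistics

def adaptive_timeout(run_times, percent, min_jobs=3):
    # Step 1: run the first min_jobs jobs with no timeout.
    if len(run_times) < min_jobs:
        return None
    # Step 2: average and standard deviation of completed jobs.
    mean = statistics.mean(run_times)
    sd = statistics.stdev(run_times)
    # Step 3: assuming run times are normally distributed, the
    # (100 - percent) percentile is mean + z * sd, where z is the
    # standard normal quantile at that level.
    z = statistics.NormalDist().inv_cdf(1 - percent / 100)
    return mean + z * sd
```

Step 4 would just be calling this again each time another job completes, so the timeout keeps adjusting as more run times come in.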
I am not sure whether run times of jobs generally are normally
distributed or look more like a chi-square or some other continuous
distribution; in this case it probably does not matter, because the
percentage of jobs that people want timed out will always be <
30%. If you have some insight into this, please speak up.
With the above, '--timeout 5%' will normally kill 5% of the jobs - even
if they are not "bad" - and that might be less useful than a percentage
of the median run time:
--timeout 200%
which would kill all jobs taking more than twice as long as the median
run time (using the remedian to compute the median in finite memory).
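For reference, the remedian (Rousseeuw & Bassett, 1990) approximates the median of a stream using a few fixed-size buffers: when a buffer fills, its median is pushed into the next buffer up. A rough Python sketch (not GNU parallel's code; the weighting rule in median() is a common simplification of the paper's):

```python
import statistics

class Remedian:
    def __init__(self, base=11):
        self.base = base
        self.buffers = [[]]  # buffers[i] holds medians of level i

    def add(self, x):
        self._insert(0, x)

    def _insert(self, level, x):
        if level == len(self.buffers):
            self.buffers.append([])
        buf = self.buffers[level]
        buf.append(x)
        if len(buf) == self.base:  # buffer full: push its median up
            m = statistics.median(buf)
            buf.clear()
            self._insert(level + 1, m)

    def median(self):
        # Approximate: median of all values still held, weighting
        # each level-i value by base**i observations it summarizes.
        weighted = []
        for level, buf in enumerate(self.buffers):
            weighted.extend(buf * (self.base ** level))
        return statistics.median(weighted) if weighted else None
```

With something like this, '--timeout 200%' would amount to killing any job whose run time exceeds 2 * median(), while only keeping O(base * log(n)) run times in memory.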
I do not think looking at the fastest jobs is a good indicator: You
can have an odd job that is extremely fast while the median is much
slower.
/Ole