Re: Distribute files to directories instead of ssh servers?


From: Ole Tange
Subject: Re: Distribute files to directories instead of ssh servers?
Date: Wed, 5 Jun 2013 23:48:12 +0200

On Tue, Jun 4, 2013 at 10:50 PM, P Fudd <fink@ch.pkts.ca> wrote:
> Hello!
>
> I'm trying to distribute 1000 files to 7 directories.  This is easy if
> you're distributing to servers using parallel and ssh.
:
> I briefly toyed with
>
>    parallel 'mv {1} /data/queue-$(( ( $PARALLEL_SEQ % 7 ) + 1 ))' ::: *.msg
>
> but I was really hoping for something where I could specify the
> destinations with a wildcard.

I have been considering having a {variable} that contained the jobslot
so you could do:

  parallel -j7 mv {} /data/queue-{slot} ::: *.msg

It would not guarantee the same number of files in each folder (if one
mv takes longer, that slot will receive fewer files). But if your *.msg
files are similarly sized, it would give roughly the same number of
files per folder.

{slot} would also be useful if you had programs that cannot work on
the same dir in parallel, so that you want only one working in each
dir at a time.

The major problem with that approach is that job slot is only a
concept I use to explain how GNU Parallel works: There is no variable
in GNU Parallel called jobslot.

- o -

Another way would be to expand --xapply so that if an input source has
more values than others, then the others will cycle their values (not
really sure how easy that would be to implement). Then you would be
able to do:

   parallel --xapply mv {1} /data/queue-{2} ::: *.msg ::: 1 2 3 4 5 6 7

I am also afraid that this may make --xapply less useful: maybe there
are uses where it is handy that --xapply uses an empty string when the
values run out.
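
For comparison, the cycling behaviour described above can be sketched in
plain bash, without GNU Parallel and without any parallelism. This is
only an illustration: the function name distribute_round_robin and its
arguments are made up here, and the sketch assumes the queue-1 ..
queue-N directories already exist under the given base directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GNU Parallel): move files into
# BASE/queue-1 .. BASE/queue-N, cycling through the directories in order.
# Usage: distribute_round_robin BASE N FILE...
distribute_round_robin() {
    local base=$1 n=$2
    shift 2
    local i=0 f
    for f in "$@"; do
        # i % n cycles 0..n-1; +1 maps it to queue-1..queue-n
        mv -- "$f" "$base/queue-$(( i % n + 1 ))/"
        i=$(( i + 1 ))
    done
}
```

With the layout from the original question it would be called as
distribute_round_robin /data 7 *.msg. Because the files are assigned by
position and not by which job finishes first, every directory gets the
same number of files (give or take one), but the moves run one at a
time.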

- o -

Maybe there are even better ideas for syntax to do what you want.

/Ole


