I have earlier encouraged you to post examples of how you use GNU Parallel.
Today I had to split a 200 GB gz-file into smaller files. The file contained records of 4 lines each, so I had to unpack the .gz file, chop it into chunks of around 10 MB that each end on a 4-line record boundary, and gzip each chunk under a unique name:
The limiting factor in this was GNU Parallel itself, which is not uncommon when using --pipe.
The functions spreadstdin() and write_record_to_pipe() are to blame. They could be sped up by rewriting them in C/C++. But it might even be sufficient to split the work between a reader process (which reads a chunk, finds the split point, and puts the chunk in a queue), a few writer processes (which each take a chunk from the queue and write it to the user's program), and a manager process (which coordinates the reader and the writers and spawns new writer processes if needed), so that fork does not have to be called for every block. Any takers?
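To make the idea concrete, here is a rough sketch of the reader/writer split (in Python rather than Perl, and with all names hypothetical — this is not GNU Parallel's real code). One reader cuts the stream at record boundaries and enqueues numbered chunks; a fixed pool of long-lived writers feeds each chunk to the user's command, so no new writer is forked per block:

```python
import multiprocessing as mp
import subprocess

RECORD_LINES = 4                            # records are 4 lines long

def reader(stream, queue, n_writers, block_size=10 * 1024 * 1024):
    """Read ~block_size bytes at a time, cut at a 4-line record
    boundary, and put numbered chunks on the queue."""
    leftover = b""
    seq = 0
    while True:
        block = stream.read(block_size)
        if not block:
            break
        lines = (leftover + block).split(b"\n")
        complete = len(lines) - 1           # last element is a partial line
        keep = (complete // RECORD_LINES) * RECORD_LINES
        chunk = b"".join(l + b"\n" for l in lines[:keep])
        leftover = b"\n".join(lines[keep:])
        if chunk:
            queue.put((seq, chunk))
            seq += 1
    if leftover:
        queue.put((seq, leftover))          # trailing partial record, if any
    for _ in range(n_writers):              # poison pills stop the writers
        queue.put(None)

def writer(queue, command):
    """Long-lived writer process: feed each chunk to the user's command.
    Only the user's command is spawned per chunk, not a new writer."""
    while True:
        item = queue.get()
        if item is None:
            break
        seq, chunk = item
        subprocess.run(command.format(seq=seq), shell=True,
                       input=chunk, check=True)
```

The manager's role here is played by the queue and the poison pills; a real implementation would also watch the queue depth and start extra writers when the user's command is the bottleneck.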