Re: [Openexr-devel] EXR texture memory overhead


From: Larry Gritz
Subject: Re: [Openexr-devel] EXR texture memory overhead
Date: Fri, 16 Sep 2016 11:16:12 -0700

Underscoring once again how critical it is that we start processing the long list of pending PRs. There are a lot of good ideas, bug fixes, and performance improvements just rotting in people's private repos, waiting for somebody to fold them into official OpenEXR releases.

Thanks, Alexandre. I haven't read your patch in detail, but I'm definitely interested in additions to IlmImf internals that could be employed for my use case in order to cut down on unnecessary copies and redundant buffer allocations. Though in my quick scan, it looks scanline-specific, whereas for my use case we're dealing primarily with tiled files, so a second patch will be necessary to do the equivalent for tiles.


On Sep 16, 2016, at 10:51 AM, Alexandre <address@hidden> wrote:

Forget my point; the use case is entirely different from my experience.
This is a separate issue and has nothing to do with the original request.
----

In our use case, not being able to control OpenEXR’s threads (assuming the thread pool is used) and not being able to know how much memory is in use is enough to make our application slow, because it doesn’t know what is going on. We cannot block other tasks from happening just because some part of the application has started decoding an OpenEXR file.
To fix our issue, it is enough to be able to use the application’s own threads to do the decompression and, optionally, to provide the buffers, so that the application knows which resources are taken.
The issue is only visible when reading untiled, multi-layered files, because they are large enough to take up most of the resources the application has available.

On the OpenEXR side this is implemented by https://github.com/openexr/openexr/pull/141

On the OIIO side I don’t think there’s much to do to implement this:
Extract the decompression part of LineBufferTask::execute and do it in OIIO’s read_native_scanlines function, as a replacement for the setFrameBuffer/readPixels pair, combined with the right call from the application using OIIO so that internally it can use the buffers the application provides.
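
The idea of decoding on the caller's threads into caller-owned buffers can be sketched without any OpenEXR or OIIO calls. In this minimal Python sketch, zlib stands in for EXR's ZIP compression, and `decode_chunk_into`, `read_blocks` and `CHUNK_BYTES` are hypothetical names that exist in neither library. It only illustrates who owns the thread pool and the destination memory.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_BYTES = 4096  # uncompressed size of one hypothetical scanline block


def decode_chunk_into(compressed, dest):
    """Decompress one block and store it in a caller-owned destination slice."""
    data = zlib.decompress(compressed)
    # Python's zlib returns a new bytes object, so this copy is unavoidable
    # here; a native implementation could decompress straight into `dest`.
    dest[:len(data)] = data
    return len(data)


def read_blocks(compressed_blocks, out_buffer, executor):
    """Decode every block using the application's executor and buffer."""
    view = memoryview(out_buffer)
    futures = [
        executor.submit(decode_chunk_into, block,
                        view[i * CHUNK_BYTES:(i + 1) * CHUNK_BYTES])
        for i, block in enumerate(compressed_blocks)
    ]
    return sum(f.result() for f in futures)


# Stand-in for compressed scanline blocks read from a file.
original = [bytes([i]) * CHUNK_BYTES for i in range(8)]
blocks = [zlib.compress(b) for b in original]

app_buffer = bytearray(CHUNK_BYTES * len(blocks))    # owned by the application
with ThreadPoolExecutor(max_workers=4) as app_pool:  # owned by the application
    decoded = read_blocks(blocks, app_buffer, app_pool)
```

Because the application creates both `app_pool` and `app_buffer`, it knows exactly how many threads and how many bytes the read can consume, which is the property asked for above.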

I am going to work on it when I get time and will notify you, Larry, if we get significant performance gains.

On 16 Sep 2016, at 19:12, Larry Gritz <address@hidden> wrote:

It is true that using OIIO's ImageCache to read a single file sequentially can have wasteful memory consequences -- right after you've read the image, you have a copy in the app's buffer that you requested, you still have a copy in the ImageCache waiting around for the next time you need it, and you may have a third copy of some or all of the pixels within libIlmImf's internal data structures (if the file is still open). That's not really what ImageCache is designed for, and I'm confident that's not how Soren is trying to use it.

Soren is dealing with a texture system within a renderer. So that waste I described above will disappear -- as the app requests additional texture data, what's filling the cache will be paged out, and new pixels will come in. The cache has a fixed maximum size. Also, in the context of an OIIO TextureCache, there is no "app buffer", the IC's tile data itself is where the texture is directly accessed from when doing texture filtering operations.
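
The "fixed maximum size, old tiles paged out as new ones come in" behaviour can be illustrated with a toy least-recently-used cache. This is a sketch of the general technique only, not OIIO's ImageCache implementation. `TileCache`, `fake_load` and the 1 KiB tile size are hypothetical.

```python
from collections import OrderedDict


class TileCache:
    """Toy fixed-budget tile cache with least-recently-used eviction."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._tiles = OrderedDict()  # key -> tile bytes, oldest first

    def get(self, key, load_tile):
        """Return the tile for `key`, calling load_tile(key) on a miss."""
        if key in self._tiles:
            self._tiles.move_to_end(key)  # mark as most recently used
            return self._tiles[key]
        tile = load_tile(key)
        self._tiles[key] = tile
        self.used_bytes += len(tile)
        # Page out the least recently used tiles until the budget is met.
        while self.used_bytes > self.max_bytes and len(self._tiles) > 1:
            _, evicted = self._tiles.popitem(last=False)
            self.used_bytes -= len(evicted)
        return tile

    def __contains__(self, key):
        return key in self._tiles


def fake_load(key):
    # Stand-in for reading one tile from disk: 1 KiB of zeros.
    return bytes(1024)


cache = TileCache(max_bytes=4 * 1024)
for i in range(10):
    cache.get(("tex.exr", 0, i), fake_load)  # (file, MIP level, tile index)
```

After the loop only the four most recently requested tiles remain resident, so memory stays at the configured budget no matter how many tiles were touched.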

It's clear that Soren's case is already dealing with tiled and MIP-mapped files (right, Soren?). And if you're going to make tiles for use with ImageCache, it's much better to use OIIO's "maketx" rather than OpenEXR's "exrmaketiled". maketx does a number of additional things besides just tiling, including computing a SHA-1 hash for the file and storing it in the header, so that the TextureSystem can automatically notice duplicate textures and not read from the redundant files. That won't happen properly if you use exrmaketiled.
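
A minimal sketch of hash-based deduplication, assuming the fingerprint is a SHA-1 over the pixel bytes. Here the hash is computed on the fly from in-memory data. With maketx it would already be stored in the file header, so no pixels need to be read to compare files. `pixel_fingerprint`, `DedupIndex` and the file names are hypothetical.

```python
import hashlib


def pixel_fingerprint(pixels: bytes) -> str:
    """SHA-1 hex digest of the raw pixel bytes."""
    return hashlib.sha1(pixels).hexdigest()


class DedupIndex:
    """Maps many file names onto one canonical file per unique fingerprint."""

    def __init__(self):
        self._canonical = {}  # fingerprint -> first file name seen

    def resolve(self, filename: str, fingerprint: str) -> str:
        """Return the file that should actually be read for `filename`."""
        return self._canonical.setdefault(fingerprint, filename)


# Two differently named textures with identical pixels, plus one that differs.
files = {
    "rock_a.exr": bytes([1, 2, 3, 4]),
    "rock_copy.exr": bytes([1, 2, 3, 4]),
    "moss.exr": bytes([9, 9, 9, 9]),
}

index = DedupIndex()
resolved = {
    name: index.resolve(name, pixel_fingerprint(pixels))
    for name, pixels in files.items()
}
```

Requests for `rock_copy.exr` resolve to `rock_a.exr`, so only one copy of the tiles would need to be opened and cached.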

We routinely use OIIO's texture cache to render frames that reference 1-2 TB of texture, spread over 10,000 or more files, using a maximum of 1GB tile memory and 1000 max files open at once. Works smooth as can be. If your use of ImageCache is resulting in "blowing up computer's RAM + swap" and the kernel has to kill the app, either you're setting something wrong, or there is a bug (or use case I haven't considered) that I desperately want to examine and make better. I would love a detailed description of how to reproduce this, so I can fix it.

All that is a red herring. What Soren is describing is a very real effect, which is two-fold and completely independent of OIIO:

1. The amount of memory that libIlmImf holds *per open file* as overhead or internal buffers or whatever (I haven't tracked down exactly what it is) is much larger than what libtiff holds as overhead per open file.

2. libIlmImf seems to have a substantial amount of memory overhead *per thread*, and that can really add up if you have a large thread pool. In contrast, libtiff doesn't have a thread pool (for better or for worse), so there isn't a per-thread component to its memory overhead.
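
One rough way to put numbers on these two effects is a two-term model: a cost per open file plus a cost per thread. The sketch below uses the VmRSS and timing figures from the render reports in the original message. The thread count is inferred from the I/O timings rather than reported, the open-file count is assumed to sit at the max_open_files=100 cap, and the whole TIFF/EXR difference is assumed to be library overhead. The results are therefore upper bounds on the extra cost relative to libtiff, not measurements.

```python
# Figures taken from the two render reports in the original message.
RSS_TIFF_MB = 9063.45
RSS_EXR_MB = 12938.83
MAX_OPEN_FILES = 100                     # max_open_files=100 in both runs
IO_TOTAL_S = 6 * 3600 + 15 * 60 + 42.1   # EXR "File I/O time"
IO_PER_THREAD_S = 15 * 60 + 1.7          # EXR "average per thread"

extra_mb = RSS_EXR_MB - RSS_TIFF_MB
threads = round(IO_TOTAL_S / IO_PER_THREAD_S)  # inferred, not reported directly


def overhead_mb(per_file_mb, per_thread_mb):
    """Two-term model: one cost per open file, one cost per thread."""
    return MAX_OPEN_FILES * per_file_mb + threads * per_thread_mb


# Upper bounds if a single term were responsible for the whole difference.
max_per_file_mb = extra_mb / MAX_OPEN_FILES
max_per_thread_mb = extra_mb / threads

print(f"extra RSS: {extra_mb:.2f} MB over {threads} threads")
print(f"at most {max_per_file_mb:.2f} MB per open file")
print(f"at most {max_per_thread_mb:.2f} MB per thread")
```

Separating the two terms would need measurements taken with different max_open_files and thread-count settings.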



On Sep 16, 2016, at 6:13 AM, Alexandre <address@hidden> wrote:

I think the bottleneck is in OpenImageIO's ImageCache rather than OpenEXR by itself.

I’ve spent quite some time debugging OpenImageIO in this regard. The worst-case scenario you can give OpenImageIO is reading untiled, multi-layered EXR files.
Most people seem to work only with ZIP scanlines because this suits Nuke’s scanline architecture perfectly, but in reality it is a nightmare for all other applications that don’t work with scanlines.

The OpenImageIO cache can be set to auto-tile mode, in which case it will open and close the file multiple times to decode it (so it is slower) but can use less memory, because it doesn’t need to allocate such big chunks of memory.
When not set to auto-tile it will just decode the full image, meaning that OpenEXR will allocate a big chunk of memory to decompress, and OpenImageIO will allocate another big chunk of memory to convert to the data format the user requested. And here is the worst part: OpenImageIO will leave the file open in the cache, in the thread-local storage of the calling thread.
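
To give a feel for the sizes involved, here is a back-of-the-envelope sketch. The image dimensions, channel count and thread count are hypothetical, and only the tile size comes from the autotile=64 setting in the original report. It follows the description above (one native-format buffer plus one converted buffer per image) and is not a measurement of what either library actually allocates.

```python
def buffer_bytes(width, height, channels, bytes_per_channel):
    """Size of an uncompressed pixel buffer in bytes."""
    return width * height * channels * bytes_per_channel


# Hypothetical multi-layered scanline image: 4096x2160, 40 channels.
WIDTH, HEIGHT, CHANNELS = 4096, 2160, 40
HALF, FLOAT = 2, 4   # bytes per channel for half and 32-bit float
AUTOTILE = 64        # autotile=64, as in the original report
MIB = 1024 ** 2

native_mib = buffer_bytes(WIDTH, HEIGHT, CHANNELS, HALF) / MIB
converted_mib = buffer_bytes(WIDTH, HEIGHT, CHANNELS, FLOAT) / MIB
per_image_mib = native_mib + converted_mib   # decompress + convert buffers
tile_kib = buffer_bytes(AUTOTILE, AUTOTILE, CHANNELS, HALF) / 1024

concurrent_threads = 4   # hypothetical number of threads each decoding a file
peak_mib = per_image_mib * concurrent_threads

print(f"whole image: {native_mib:.0f} MiB native + {converted_mib:.0f} MiB converted")
print(f"one {AUTOTILE}x{AUTOTILE} tile: {tile_kib:.0f} KiB")
print(f"{concurrent_threads} threads, whole images: {peak_mib:.0f} MiB")
```

Under these assumptions a whole-image decode needs roughly 2 GiB per image, while the pixel data of a single 64x64 tile is a few hundred KiB.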

And it can get even worse than that: if you’ve got multiple threads trying to decode different untiled EXR files concurrently, OpenImageIO will just blow up your computer’s RAM + swap and the kernel should kill your app very quickly.


There are a couple of workarounds:

- Make all your files go through an initial pass of converting them to tiled EXR files (with exrmaketiled)
- Don’t use OpenImageIO cache at all

The Foundry has come up with an extension (in a pull request) that lets the application calling OpenEXR pass its own buffers (instead of the ones used internally), so that the decompression of the EXR files can happen outside of OpenEXR itself, in memory and threads controlled by the calling application.

This is very important if your application is going to do other things concurrently besides reading your one EXR file.

On our side we are going to try to implement that in OpenImageIO, so that in the same way you could pass your own buffers and threads to OpenImageIO, which would in turn pass them to OpenEXR.




On 16 Sep 2016, at 12:34, Søren Ragsdale <address@hidden> wrote:

Hello, OpenEXR devs. I've been doing some comparative rendering tests and I've found something a bit surprising.

TIFF and EXR texture access *times* seem more or less the same, which is fine because the underlying data is equivalent. (Same data type, compression, tile size, etc.) But the RAM overhead seems much higher for EXRs. We've got a 9GB render using TIFFs and a 13GB render using EXRs.

Does anyone have some theories why EXR texture access is requiring 4GB more memory?


Prman-20.11, OSL shaders, OIIO/TIFF textures:
real 00:21:46
VmRSS 9,063.45 MB
OpenImageIO ImageCache statistics (shared) ver 1.7.3dev
Options:  max_memory_MB=4000.0 max_open_files=100 autotile=64
          autoscanline=0 automip=1 forcefloat=0 accept_untiled=1
          accept_unmipped=1 read_before_insert=0 deduplicate=1
          unassociatedalpha=0 failure_retries=0
Images : 1957 unique
  ImageInputs : 136432 created, 100 current, 796 peak
  Total size of all images referenced : 166.0 GB
  Read from disk : 55.5 GB
  File I/O time : 7h 2m 33.9s (16m 54.2s average per thread)
  File open time only : 27m 44.0s


Prman-20.11, OSL shaders, OIIO/EXR textures:
real 00:21:14
VmRSS 12,938.83 MB
OpenImageIO ImageCache statistics (shared) ver 1.7.3dev
Options:  max_memory_MB=4000.0 max_open_files=100 autotile=64
          autoscanline=0 automip=1 forcefloat=0 accept_untiled=1
          accept_unmipped=1 read_before_insert=0 deduplicate=1
          unassociatedalpha=0 failure_retries=0
Images : 1957 unique
  ImageInputs : 133168 created, 100 current, 771 peak
  Total size of all images referenced : 166.0 GB
  Read from disk : 55.5 GB
  File I/O time : 6h 15m 42.1s (15m 1.7s average per thread)
  File open time only : 1m 22.5s
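
The timing lines in the two reports above can be compared directly once the durations are converted to seconds. This is a small sketch in which `parse_duration` is a hypothetical helper and the strings are copied from the statistics.

```python
import re


def parse_duration(text):
    """Convert a string such as '7h 2m 33.9s' to seconds."""
    units = {"h": 3600.0, "m": 60.0, "s": 1.0}
    return sum(float(value) * units[unit]
               for value, unit in re.findall(r"([\d.]+)\s*([hms])", text))


# Strings copied from the two ImageCache statistics blocks.
runs = {
    "TIFF": {"io": "7h 2m 33.9s", "open": "27m 44.0s"},
    "EXR": {"io": "6h 15m 42.1s", "open": "1m 22.5s"},
}

seconds = {
    name: {field: parse_duration(text) for field, text in fields.items()}
    for name, fields in runs.items()
}

io_saved_s = seconds["TIFF"]["io"] - seconds["EXR"]["io"]
open_ratio = seconds["TIFF"]["open"] / seconds["EXR"]["open"]

print(f"EXR total file I/O is {io_saved_s:.1f} s lower than TIFF")
print(f"TIFF spends {open_ratio:.1f}x as long opening files")
```

On these figures the EXR run is no slower on I/O than the TIFF run, so the difference between the two renders shows up in VmRSS rather than in time.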

_______________________________________________
Openexr-devel mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/openexr-devel



--
Larry Gritz
address@hidden


