Initial feedback for std.experimental.image
via Digitalmars-d
digitalmars-d at puremagic.com
Wed Jul 8 11:07:08 PDT 2015
On Monday, 6 July 2015 at 13:48:53 UTC, Rikki Cattermole wrote:
>
> Please destroy!
>
You asked for it! :)
For a reference on how images are handled at a professional level
(the VFX industry), I'd encourage you to look at the feature set
and interfaces of OpenImageIO. Sure, it's a big library and some
of it is definitely out of scope for what you're trying to
accomplish (image tile caching and texture sampling, obviously).
Yet, there are some features I specifically want to mention here
to challenge the scope of your design:
- arbitrary channel layouts in images: this is a big one. You
mention 3D engines as a targeted use case in the specification.
3D rendering is one of the worst offenders when it comes to crazy
channel layouts in textures (which are obviously stored as image
files). If you have a data texture that requires 2 channels (e.g.
uv offsets for texture lookups in shaders or some crazy data
tables), its memory layout should also only ever have two
channels. Don't expand it to RGB transparently or anything else
braindead. Don't change the data type of the pixel values wildly
without being asked to do so. The developer most likely has
chosen a 16-bit signed integer per channel (or whatever else) for
a good reason. Some high-end file formats like OpenEXR even allow
users to store completely arbitrary channels as well, often with
a different per-channel data format (leading to layouts like
RGBAZ with an additional mask channel on top). But support for
that really bloats image library interfaces. I'd stick with a
sane variant of the uncompressed texture formats that the OpenGL
specification lists as the target set of supported in-memory
image formats. That mostly matches current GPU hardware support
and probably will for some time to come.
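To make the "keep the layout exactly as requested" point concrete, here is a minimal C sketch of a format descriptor that preserves channel count and per-channel width. The names (`PixelFormat` and friends) are purely illustrative and not from std.experimental.image:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical format descriptor (not this library's API): channel
   count and per-channel width are preserved exactly, never silently
   expanded to RGB or retyped behind the caller's back. */
typedef struct {
    uint8_t channels;        /* e.g. 2 for a UV-offset data texture */
    uint8_t bytes_per_chan;  /* e.g. 2 for 16-bit signed integers   */
} PixelFormat;

/* Storage size of one pixel, with no hidden extra channels. */
static size_t bytes_per_pixel(PixelFormat f) {
    return (size_t)f.channels * f.bytes_per_chan;
}

/* Storage size of a tightly packed image in this format. */
static size_t image_bytes(PixelFormat f, size_t width, size_t height) {
    return bytes_per_pixel(f) * width * height;
}
```

For a 256x256 two-channel 16-bit data texture that is 262,144 bytes; transparently expanding it to three channels would silently cost half again as much memory (393,216 bytes).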
- padding and memory alignment: depending on the platform, image
format, and task at hand, you may want the in-memory layout of your
image to be padded in various ways. For example, you would want
your scanlines and pixel values aligned to certain offsets to
make use of SIMD instructions which often carry alignment
restrictions with them. This is one of the reasons why RGB images
are sometimes expanded to have a dummy channel between the
triplets. Also, aligning the start of each scanline may be
important, which introduces a "pitch" between them that is
greater than just the storage size of each scanline by itself.
Again, this may help speed up image processing.
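The pitch computation itself is a one-liner; a C sketch, assuming power-of-two alignment:

```c
#include <stddef.h>

/* Round a scanline's storage size up to the next multiple of `align`
   (which must be a power of two). The result is the pitch: the byte
   distance between the starts of consecutive scanlines, which may be
   greater than the storage size of one scanline by itself. */
static size_t aligned_pitch(size_t row_bytes, size_t align) {
    return (row_bytes + align - 1) & ~(align - 1);
}
```

For example, a 101-pixel-wide RGB image with 8 bits per channel needs 303 bytes per scanline; with 16-byte alignment for SIMD loads, the pitch becomes 304 bytes and one padding byte sits between scanlines.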
- subimages: this one may seem obscure, but it happens in a
number of common file formats (GIF, MNG, DDS, probably TIFF and
others). Subimages can be - for instance - individual animation
frames or precomputed mipmaps. This means that they may have
metadata attached to them (e.g. framerate or delay to next frame)
or they may come in totally different dimensions (mipmap levels).
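The mipmap case shows why subimages of one file can legitimately differ in size; a small C sketch of the standard halving rule:

```c
/* Each mip level halves width and height (clamped at 1), so the
   subimages stored in a single file come in different dimensions. */
static void mip_size(unsigned w, unsigned h, unsigned level,
                     unsigned *out_w, unsigned *out_h) {
    while (level-- > 0) {
        w = w > 1 ? w / 2 : 1;
        h = h > 1 ? h / 2 : 1;
    }
    *out_w = w;
    *out_h = h;
}
```

A 256x64 texture has a 32x8 subimage at mip level 3, and the chain bottoms out at 1x1; an image API that assumes all subimages share the top-level dimensions cannot represent this.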
- window regions: now this is not quite your average image format
feature, but it is relevant for some use cases. The gist of it is that
the image file may define a coordinate system for a whole image
frame but only contain actual data within certain regions that do
not cover the whole frame. These regions may even extend beyond
the defined image frame (used e.g. for VFX image postprocessing
to have properly defined pixel values to filter into the visible
part of the final frame). The OpenEXR documentation explains
this feature nicely. That said, I think this is likely out of
scope for this library.
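In C terms, the OpenEXR model boils down to two rectangles; the names below are illustrative, not anyone's actual API:

```c
/* Display window vs. data window, after the OpenEXR model. Pixel
   data exists only inside the data window, which may extend past
   the display window (e.g. to give filters defined values to pull
   into the visible frame). Bounds are inclusive. */
typedef struct { int x0, y0, x1, y1; } Window;

static int window_contains(Window w, int x, int y) {
    return x >= w.x0 && x <= w.x1 && y >= w.y0 && y <= w.y1;
}
```

A pixel at (-8, -8) can carry stored data (it is inside the data window) while lying outside the 1920x1080 display window entirely.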
My first point also leads me to this criticism:
- I do not see a way to discover the actual data format of a PNG
file through your loader. Is it 8 bit palette-based, 8 bit per
pixel or 16 bits per pixel? Especially the latter should not be
transparently converted to 8 bits per pixel if encountered
because it is a lossy transformation. As I see it right now you
have to know the pixel format up front to instantiate the loader.
I consider that bad design. You can only have true knowledge of
the file contents after the image header has been parsed. The same is
generally true of most actually useful image formats out there.
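To illustrate what "parse the header, then report the format" buys you: the five PNG color types and their bit depths come straight from the PNG specification, and a header-first loader can hand this back instead of demanding the pixel format at instantiation time. A C sketch:

```c
/* Bits per pixel implied by a PNG header's color type and bit depth.
   The color type codes (0, 2, 3, 4, 6) are from the PNG spec; a
   loader that parses IHDR first can report the real stored format
   (8-bit palette, 16-bit RGB, ...) rather than converting lossily. */
static int png_bits_per_pixel(int color_type, int bit_depth) {
    switch (color_type) {
        case 0: return bit_depth;      /* grayscale            */
        case 2: return 3 * bit_depth;  /* truecolor (RGB)      */
        case 3: return bit_depth;      /* palette index        */
        case 4: return 2 * bit_depth;  /* grayscale + alpha    */
        case 6: return 4 * bit_depth;  /* truecolor + alpha    */
        default: return -1;            /* invalid color type   */
    }
}
```

A 16-bit-per-channel RGB file stores 48 bits per pixel; squashing it to 24 without being asked is exactly the lossy conversion criticized above.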
- Could support for image data alignment be added by defining a
new ImageStorage subclass? The actual in-memory data is not
exposed to direct access, is it? Access to the raw image data
would be preferable for those cases where you know exactly what
you are doing. Going through per-pixel access functions for large
image regions is going to be dreadfully slow in comparison to
what can be achieved with proper processing/filtering code.
- Also, uploading textures to the GPU requires passing raw memory
blocks and a format description of sorts to the 3D API. Being
required to slowly copy the image data in question into a
temporary buffer for this process is not an adequate solution.
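What a GL-style upload actually needs is one pointer plus a size and format description; a minimal C sketch of an image type with an exposed backing store (hypothetical names, standing in for whatever std.experimental.image would expose):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical image type whose backing store is directly accessible.
   A glTexImage2D-style call takes exactly this kind of contiguous
   block plus a format description; if the raw pointer is reachable,
   no temporary copy is required before upload. */
typedef struct {
    unsigned width, height;
    size_t   pitch;      /* bytes between scanline starts */
    uint8_t *pixels;     /* contiguous backing store      */
} RawImage;

/* Hand back the whole buffer (pitch included) for a single upload. */
static const uint8_t *raw_block(const RawImage *img, size_t *out_bytes) {
    *out_bytes = img->pitch * img->height;
    return img->pixels;
}
```

The real 3D API call would also receive the pitch (e.g. via OpenGL's `GL_UNPACK_ROW_LENGTH`), but the point stands: per-pixel accessor round-trips into a temporary buffer are pure overhead here.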
Let me know what you think!