Commercial video processing app in D (experience report)
thedeemon via Digitalmars-d-announce
digitalmars-d-announce at puremagic.com
Thu Apr 28 01:20:08 PDT 2016
On Thursday, 28 April 2016 at 06:22:18 UTC, Relja Ljubobratovic
wrote:
> Can you share with us some of your experience working on the image
> and video processing modules in the app, such as the filters
> listed here:
> http://www.infognition.com/VideoEnhancer/filters.html
>
> If I may ask, was that part implemented in D, C++, or was some
> 3rd party library used?
Thanks!

The filters listed there are third-party plugins originally
created for VirtualDub ( http://virtualdub.org/ ) by different
people, in C++. We made just 2-3 of them ourselves, such as a
motion-based temporal denoiser (Film Dirt Cleaner) and the
Intelligent Brightness filter for automatic brightness/contrast
correction. Our most interesting and distinctive piece of tech is
our Super Resolution engine for video upsizing; it's not in that
list, it's built into the app (and also available separately as
plugins for some other hosts).

All this image processing code is written in C++ and works
directly with raw image bytes, no special libraries involved.
When video processing starts, our filters usually launch a bunch
of worker threads, and these threads work in parallel, each on
its own part of the video frame (usually divided into horizontal
stripes). Inside, they often work block-wise, and we have a bunch
of template classes for different blocks (RGB or monochrome)
parameterized by pixel data type and often by block size, so the
size is often known at compile time and the compiler can unroll
the loops properly. When doing motion search we use a vector
class parameterized by precision, so we have vectors of different
precision (low-res pixel, high-res pixel, half-pixel,
quarter-pixel etc.), and the type system makes sure I don't add
or mix vectors of different precision, and don't pass a
half-pixel-precise vector to a block-reading routine that expects
quarter-pixel-precise coordinates. Where it makes sense and is
possible, we use SIMD classes like F32vec4 and/or SIMD intrinsics
for pixel operations.
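The compile-time-sized block and the precision-tagged vector ideas can
be sketched like this. This is a minimal illustration, not the actual
Video Enhancer code: the names Block, MotionVec, HalfPel, QuarterPel
and readOffset are all invented for the example.

```cpp
#include <array>
#include <cstdint>

// A monochrome block whose side length is a template parameter, so
// the inner-loop bounds are compile-time constants the compiler can
// unroll and vectorize.
template <typename Pixel, int N>
struct Block {
    std::array<Pixel, N * N> data{};

    Pixel& at(int x, int y) { return data[y * N + x]; }

    // Sum of absolute differences against another block - a typical
    // motion-search cost function; the trip count N*N is a constant.
    long sad(const Block& other) const {
        long s = 0;
        for (int i = 0; i < N * N; ++i) {
            long d = long(data[i]) - long(other.data[i]);
            s += d < 0 ? -d : d;
        }
        return s;
    }
};

// Precision encoded as a distinct tag type: vectors of different
// precision are different types and cannot be mixed by accident.
struct HalfPel {};
struct QuarterPel {};

template <typename Precision>
struct MotionVec {
    int x = 0, y = 0;
    MotionVec operator+(MotionVec o) const { return {x + o.x, y + o.y}; }
};

// A block-reading routine that demands quarter-pel coordinates;
// passing a MotionVec<HalfPel> here is a compile-time error.
inline int readOffset(MotionVec<QuarterPel> v) { return v.x + v.y; }
```

Because precision lives in the type, converting between half-pel and
quarter-pel coordinates has to go through an explicit conversion
function, which is exactly the kind of mistake-proofing described above.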
Video Enhancer allows chaining several VD filters and our SR
rescaler instances into a pipeline, and that is parallelized too:
when the first filter finishes with frame X it can immediately
start working on frame X+1 while the next filter is still working
on frame X. Previously this was organized as a chain of DirectShow
filters with a special Parallelizer filter inserted between the
video processing ones; this Parallelizer had a frame queue inside
and separate receiving and sending threads, allowing the connected
filters to work in parallel. In version 2 it's trickier, since we
need to be able to seek to different positions in the video, and
some filters may request a few frames before and after the current
one, so a sequential pipeline doesn't suffice anymore. Now we
build a virtual chain inside one big DirectShow filter; each node
in that chain has its own worker thread, and the nodes communicate
by message passing. In the end, we have one big DirectShow filter
of about 11K lines of C++ that does Super Resolution resizing,
invokes VirtualDub plugins (imitating VirtualDub for them), does
colorspace conversions where necessary, and organizes them all
into a pipeline that is pull-based inside but behaves as a
push-based DirectShow filter outside.
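The pipelining idea - each stage owning a worker thread and a small
bounded frame queue, so stage N can start on frame X+1 while stage N+1
is still busy with frame X - can be sketched roughly like this. This
is a toy model, not the Parallelizer or the version-2 chain: Stage,
Frame and push are invented names, and a frame is stood in for by an
int.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

using Frame = int;  // placeholder for real frame data

class Stage {
public:
    Stage(std::function<Frame(Frame)> fn, Stage* next, size_t cap = 4)
        : fn_(std::move(fn)), next_(next), cap_(cap),
          worker_([this] { run(); }) {}

    ~Stage() {  // signal shutdown, let the worker drain, then join
        { std::lock_guard<std::mutex> l(m_); done_ = true; }
        cv_.notify_all();
        worker_.join();
    }

    // Called by the previous stage (or the source); blocks when full,
    // which gives natural backpressure through the chain.
    void push(Frame f) {
        std::unique_lock<std::mutex> l(m_);
        cv_.wait(l, [this] { return q_.size() < cap_; });
        q_.push(f);
        cv_.notify_all();
    }

private:
    void run() {
        for (;;) {
            Frame f;
            {
                std::unique_lock<std::mutex> l(m_);
                cv_.wait(l, [this] { return done_ || !q_.empty(); });
                if (q_.empty()) return;  // done and fully drained
                f = q_.front();
                q_.pop();
                cv_.notify_all();  // wake a producer blocked in push()
            }
            // Process outside the lock so the previous stage can keep
            // feeding us while we work - this is where the overlap
            // between frame X and frame X+1 comes from.
            Frame out = fn_(f);
            if (next_) next_->push(out);
        }
    }

    std::function<Frame(Frame)> fn_;
    Stage* next_;
    size_t cap_;
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Frame> q_;
    bool done_ = false;
    std::thread worker_;  // declared last: starts after members exist
};
```

Stages are constructed back-to-front (sink first) so that a stage never
outlives the stage it pushes into; destroying them in reverse order
drains each queue before the next one shuts down.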
So the D part uses COM to build and run a DirectShow graph with
all the readers, splitters, codecs and, of course, our big video
processing DirectShow filter. It talks to that filter via COM and
some callbacks, but doesn't do much with the video frames
themselves apart from copying.
Btw, if you're interested in an image processing app in pure D,
I've got one too:
http://www.infognition.com/blogsort/
(sources: https://bitbucket.org/infognition/bsort )