Whats holding ~100% D GUI back?

Gregor Mückl gregormueckl at gmx.de
Sat Nov 30 16:41:34 UTC 2019


On Friday, 29 November 2019 at 23:55:55 UTC, Ola Fosheim Grøstad 
wrote:
> On Friday, 29 November 2019 at 16:40:01 UTC, Gregor Mückl wrote:
>> This presentation is of course a simplification of what is 
>> going on in a GPU, but it gets the core idea across. AMD and 
>> nVidia do have a lot of documentation that goes into some more 
>> detail, but at some point you're going to hit a wall.
>
> I think it is a bit interesting that Intel was pushing their 
> Phi solution (many Pentium-cores), but seems to not update it 
> recently. So I wonder if they will be pushing more independent 
> GPU cores on-die (CPU-chip). It would make sense for them to 
> build one architecture that can cover many market segments.
>

Intel Xe is supposed to be a dedicated GPU. I expect a radical 
departure from their x86 cores and their previous Xeon Phi chips 
that used a reduced x86 instruction set. Any successor to that 
needs more cores, but these can be a lot simpler.

>> The convolutions for aurealization are done in the frequency 
>> domain. Room impulse responses are quite long (up to several 
>> seconds), so time domain convolution is barely feasible 
>> offline. The only feasible way is to use the convolution 
>> theorem, transform everything into frequency space, multiply 
>> it there, and transform things back...
>
> I remember reading a paper about casting rays into a 3D model 
> to estimate an acoustic model for the room in the mid 90s. I 
> assume they didn't do it in real time.
>

Back in the 90s they probably didn't. But this is slowly becoming 
feasible. See e.g.

https://www.oculus.com/blog/simulating-dynamic-soundscapes-at-facebook-reality-labs/

This was released as part of the Oculus Audio SDK earlier 
this year.
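
Coming back to the frequency domain convolution quoted above: the 
core of it boils down to something like the sketch below, here 
written with std.numeric.Fft. This only shows the principle; a 
real engine would typically use partitioned convolution to keep 
latency and memory in check, and the function name is made up for 
illustration.

import std.complex : Complex;
import std.numeric : Fft;

// Linear convolution via the convolution theorem: zero-pad, FFT
// both signals, multiply bin by bin, inverse FFT. O(n log n)
// instead of O(n^2) for the direct time domain sum.
double[] convolveFreqDomain(const double[] signal,
                            const double[] impulse)
{
    // Pad to the next power of two that holds the full linear
    // convolution result.
    immutable need = signal.length + impulse.length - 1;
    size_t n = 1;
    while (n < need) n <<= 1;

    auto x = new double[](n);
    auto h = new double[](n);
    x[] = 0.0;
    h[] = 0.0;
    x[0 .. signal.length] = signal[];
    h[0 .. impulse.length] = impulse[];

    auto f = new Fft(n);
    auto X = f.fft(x);
    auto H = f.fft(h);

    // Pointwise product in frequency space == convolution in
    // the time domain.
    auto Y = new Complex!double[](n);
    foreach (i; 0 .. n)
        Y[i] = X[i] * H[i];

    auto y = f.inverseFft(Y);
    auto result = new double[](need);
    foreach (i; 0 .. need)
        result[i] = y[i].re;
    return result;
}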

> I guess you could create a psychoacoustic parametric model that 
> works in the time domain... it wouldn't be very accurate, but I 
> wonder if it still could be effective. It is not like Hollywood 
> movies have accurate sound... We have optical illusions for 
> the visual system, but there are also auditory illusions for 
> the aural system, e.g. Shepard tones that ascend forever. I've 
> heard the same has been done with the motion of sound by 
> morphing the phase of a sound across speakers that have been 
> carefully placed at an exact distance from each other, so that 
> a sound moves to the left forever. I find such things kinda 
> neat... :)
>

I think what you're getting at is filter chains that emulate 
reverb, but stay in the time domain. The canonical artificial 
reverb is the Schroeder reverberator. However, you still need a 
target RT60 to get the correct reverb tail length.
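
To make that concrete: the core of a Schroeder reverb is a 
handful of parallel feedback comb filters (plus series allpass 
diffusors that I'm leaving out here) whose feedback gain is 
derived from the target RT60. A toy sketch, not production code:

import std.math : pow;

// Feedback comb filter: y[n] = x[n - D] + g * y[n - D].
// The gain is chosen so the loop has decayed by 60 dB after the
// target RT60: g = 10 ^ (-3 * delaySeconds / rt60).
struct CombFilter
{
    double[] buffer;
    size_t pos;
    double gain;

    this(size_t delaySamples, double sampleRate, double rt60)
    {
        buffer = new double[](delaySamples);
        buffer[] = 0.0;
        immutable delaySeconds = delaySamples / sampleRate;
        gain = pow(10.0, -3.0 * delaySeconds / rt60);
    }

    double process(double input)
    {
        immutable delayed = buffer[pos];
        buffer[pos] = input + delayed * gain;
        pos = (pos + 1) % buffer.length;
        return delayed;
    }
}

// Classic layout: a few combs with mutually prime delays in
// parallel, summed, then the allpass stages for echo density.
double schroederSample(CombFilter[] combs, double input)
{
    double acc = 0.0;
    foreach (ref c; combs)
        acc += c.process(input);
    return acc / combs.length;
}

With comb delays around 30-45 ms and a 2 s RT60 the gains come 
out around 0.85-0.9, which is in line with the textbook values.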

You can try to derive that time in various ways. Path tracing is 
one. Maybe you could get away with an estimated reverb time based 
on the Sabine equation. I've never tried. Microsoft Research is 
working on an approach that precomputes wave propagation using 
FDTD and looks these results up at runtime.
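
For what it's worth, the Sabine estimate is cheap enough to try. 
It is only a rough diffuse field approximation, 
RT60 ~ 0.161 * V / A, but as a first stab at a tail length it 
might do (a sketch, with made-up names):

// Sabine's formula: V is the room volume in m^3, A the total
// absorption in m^2 sabins (surface area times absorption
// coefficient, summed over all surfaces). Assumes a diffuse
// field; it breaks down for very absorbent or oddly shaped
// rooms.
struct Surface
{
    double area;       // m^2
    double absorption; // absorption coefficient, 0 .. 1
}

double sabineRT60(double volume, const Surface[] surfaces)
{
    double a = 0.0;
    foreach (s; surfaces)
        a += s.area * s.absorption;
    return 0.161 * volume / a;
}

A quick sanity check: a 5 x 4 x 3 m room (60 m^3, 94 m^2 of 
surface) with an average absorption coefficient of 0.1 comes out 
at roughly one second, which is at least plausible.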

> Some electro acoustic composers explore this field, I think it 
> is called spatialization/diffusion? I viewed one of your videos 
> and the phasing reminded me a bit of how these composers work. 
> I don't have access to my record collection right now, but 
> there are some soundtracks that are surprisingly spatial. Kind 
> of like audio-versions of non-photorealistic rendering 
> techniques. :-) The only one I can remember right now seems to 
> be Utility of Space by N. Barrett (unfortunately a short clip):
> https://electrocd.com/en/album/2322/Natasha_Barrett/Isostasie
>

Spatialization is something slightly different. It refers to the 
creation of the illusion that a sound originates from a specific 
point or volume in space. That's surprisingly hard to get right 
and it's an active area of research.
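
The most basic building block is simple amplitude panning; 
everything beyond that (HRTFs, ambisonics, distance and room 
models) is where it gets hard. A deliberately trivial sketch just 
to show the starting point:

import std.math : cos, sin, PI;

// Constant power stereo panning: pan in [-1, +1] from hard left
// to hard right. This only places a source on the left/right
// axis; proper spatialization layers HRTFs, distance attenuation
// and the room response on top of ideas like this.
void panStereo(double sample, double pan,
               out double left, out double right)
{
    // Map [-1, +1] to an angle in [0, pi/2].
    immutable angle = (pan + 1.0) * 0.25 * PI;
    left  = sample * cos(angle);
    right = sample * sin(angle);
}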

That track is interesting. I don't remember encountering any 
other purely artistic use of audio spatialization.

>> There are a lot of pitfalls. I'm doing all of the convolution on 
>> the CPU because the output buffer is read from main memory by 
>> the sound hardware. Audio buffer updates are not in lockstep 
>> with screen refreshes, so you can't reliably copy the next 
>> audio frame to the GPU, convolve it there and read it back in 
>> time because the GPU is on its own schedule.
>
> Right, why can't audiobuffers be supported in the same way as 
> screenbuffers? Anyway, if Intel decides to integrate GPU cores 
> and CPU cores tighter then... maybe. Unfortunately, Intel tends 
> to focus on making existing apps run faster, not to enable the 
> next-big-thing.
>

A GPU in compute mode doesn't really care about the semantics of 
the data in the buffers it gets handed. FIR filters should map 
fine to GPU computing, IIR filters not so much. So, depending on 
the workload, GPUs can do just fine.
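
To illustrate why (a toy CPU sketch, with std.parallelism threads 
standing in for GPU work items): each FIR output sample is an 
independent dot product over the input, while an IIR output 
depends on the previous output, so that loop is a serial 
recurrence.

import std.parallelism : parallel;
import std.range : iota;

// FIR: every output sample only reads the input, so the outer
// loop parallelizes freely. On a GPU each work item would
// compute one output sample. Assumes x.length == y.length.
void firFilter(const double[] x, const double[] coeffs,
               double[] y)
{
    foreach (n; parallel(iota(y.length)))
    {
        double acc = 0.0;
        foreach (k, c; coeffs)
            if (n >= k)
                acc += c * x[n - k];
        y[n] = acc;
    }
}

// IIR (one pole lowpass): y[n] depends on y[n - 1], so there is
// a loop-carried dependency and no easy way to spread one
// channel's samples across GPU threads.
void onePoleLowpass(const double[] x, double a, double[] y)
{
    double prev = 0.0;
    foreach (n; 0 .. x.length)
    {
        prev = a * x[n] + (1.0 - a) * prev;
        y[n] = prev;
    }
}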

The real problem is one of keeping different sets of deadlines in 
a realtime system. Graphics imposes one set (the screen refresh 
rate) and audio imposes another (audio output buffer update 
rate). The GPU usually lags behind the CPU rather than running 
in perfect lockstep, and it's typically under high utilization, 
so in most situations it won't have open timeslots left to meet 
additional deadlines.

>> Perceptually, it seems that you can get away with a fairly low 
>> update rate for the reverb in many cases.
>
> If the sound sources are at a distance then there should be 
> some time to work it out? I haven't actually thought very hard 
> on that... You could also treat early and late reflections 
> separately (like in a classic reverb).
>

Early and late reverb need to be treated separately for 
perceptual reasons. The crux that I didn't mention previously is 
that you need an initial reverb ready as soon as a sound source 
starts playing. That can be a problem with low update rates in 
games where sound sources come and go quite often.

> I wonder though if it actually has to be physically correct, 
> because it seems to me that Hollywood movies can create more 
> intense experiences by breaking with physical rules. But the 
> problem is coming up with a good psychoacoustic model, I guess. 
> So in a way, going with the physical model is easier... it's 
> easier to evaluate anyway.
>

I'm really taking a hint from graphics here: animation studios 
have started to use PBR (that is, path tracing with physically 
plausible materials), just like the VFX houses doing 
photorealistic effects. They want it that way because then the 
default is physically correct. They can always stylize later.

If you're interested, we can take this discussion offline. This 
thread is the wrong place for this.

