This Just In: PLDI 2020 will take place online and registration is FREE. Closes on Jun 5, so hurry!

Timon Gehr timon.gehr at gmx.ch
Tue Jun 16 17:36:40 UTC 2020


On 16.06.20 17:35, Robert M. Münch wrote:
> On 2020-06-15 13:01:02 +0000, Timon Gehr said:
> 
>> The talk will be on YouTube.
> 
> Great.
> 
>> Papers:
>> https://www.sri.inf.ethz.ch/publications/bichsel2020silq
>> https://www.sri.inf.ethz.ch/publications/gehr2020lpsi
>>
>> Source code:
>> https://github.com/eth-sri/silq
>> https://github.com/eth-sri/psi/tree/new-types
> 
> Thanks, somehow missed these.
> ...

I think they were not online when you asked (neither were the versions 
in ACM DL).

> What's the main difference of your approach WRT something like this: 
> http://pyro.ai/
> ...

Pyro is a Python library/EDSL, while PSI is a typed programming language 
(with some support for dependent typing).

Pyro's focus is on scalable machine learning. PSI alone would not be 
particularly helpful there.

Pyro fits a parameterized probabilistic model to data using maximum 
likelihood estimation while at the same time inferring a posterior 
distribution for the latent variables of the model. If you use a 
probabilistic model without parameters, Pyro can be used for plain 
probabilistic inference without maximum likelihood estimation.
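
For concreteness, here is a minimal Pyro sketch of that combination (the
model and numbers are made up for this example, they are not from either
paper): a noise scale is fit as a point estimate via pyro.param, i.e.
effectively by maximum likelihood, while the latent mean gets a
variational posterior via a guide.

    import torch
    import pyro
    import pyro.distributions as dist
    from torch.distributions import constraints
    from pyro.infer import SVI, Trace_ELBO
    from pyro.optim import Adam

    def model(data):
        # learnable parameter (point estimate, no prior): fit by maximum likelihood
        sigma = pyro.param("sigma", torch.tensor(1.0),
                           constraint=constraints.positive)
        # latent variable: this is what gets a posterior distribution
        mu = pyro.sample("mu", dist.Normal(0., 10.))
        with pyro.plate("data", len(data)):
            pyro.sample("obs", dist.Normal(mu, sigma), obs=data)

    def guide(data):
        # assumed posterior family for mu: a Normal with learnable parameters
        loc = pyro.param("loc", torch.tensor(0.))
        scale = pyro.param("scale", torch.tensor(1.0),
                           constraint=constraints.positive)
        pyro.sample("mu", dist.Normal(loc, scale))

    data = torch.tensor([2.1, 1.9, 2.3, 2.0])
    svi = SVI(model, guide, Adam({"lr": 0.02}), loss=Trace_ELBO())
    for _ in range(2000):
        svi.step(data)  # one stochastic gradient step on the negative ELBO

Dropping pyro.param from the model (e.g. giving sigma a prior and sampling
it instead) turns this into plain posterior inference, without the
maximum-likelihood part.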

PSI currently does not do optimization, just probabilistic inference. 
(PSI can do symbolic inference with free parameters, which can then be 
optimized with some other tool.)
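
As a toy illustration of that split (the closed-form expression below is
written by hand and merely stands in for what a symbolic backend could
emit; it is not actual PSI output): take a Bernoulli model with 7 observed
successes and 3 failures, and optimize its parameter with scipy.

    import math
    from scipy.optimize import minimize_scalar

    def neg_log_likelihood(theta):
        # hand-written closed form for 7 successes and 3 failures under
        # Bernoulli(theta); stands in for a symbolically derived,
        # parameterized expression
        return -(7 * math.log(theta) + 3 * math.log(1 - theta))

    res = minimize_scalar(neg_log_likelihood,
                          bounds=(1e-6, 1 - 1e-6), method="bounded")
    print(res.x)  # maximum-likelihood estimate, approximately 0.7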

The goal is to find a distribution whose KL-divergence to the true 
posterior is as small as possible. PSI always finds the true posterior 
when it is successful (i.e. KL-divergence 0, when applicable), but it 
will not always succeed: in particular, it might not be fast enough, or 
the result may not be in a useful form.
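
(In standard notation, nothing specific to either tool: for a candidate 
distribution q and true posterior p,

    KL(q || p) = E_{z ~ q}[ log q(z) - log p(z) ] >= 0,

with equality exactly when q = p almost everywhere, so an exact symbolic 
posterior corresponds to KL-divergence 0.)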

Pyro produces best-effort results; you may have to use some sort of 
validation to make sure that the results are useful. Roughly, its 
stochastic variational inference (SVI) works as follows:

- The posterior distribution is assumed to have a specific form that can 
be represented symbolically and is normalized by construction. Often, 
the true posterior is not actually (known to be) in that family.

- Minimizing the KL-divergence is recast as maximizing the ELBO (evidence 
lower bound), which differs from the negative KL-divergence only by a 
constant (see the identity after this list).

- The (gradient of the) ELBO is approximated by sampling from the 
assumed posterior with current parameters.

- This approximate ELBO is approximately optimized using gradient descent.
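
The identity behind this (standard variational inference, nothing 
specific to Pyro):

    log p(x) = ELBO(q) + KL(q(z) || p(z | x)),   where
    ELBO(q) = E_{z ~ q}[ log p(x, z) - log q(z) ].

Since log p(x) does not depend on q, maximizing the ELBO over q is 
equivalent to minimizing the KL-divergence between q and the true 
posterior.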

Also see: https://pyro.ai/examples/svi_part_i.html

> BTW: I'm located in Zug... so not far away from you guys.
> 