Remove real type

Sun Apr 25 08:45:02 PDT 2010

Andrei Alexandrescu Wrote:

> On 04/24/2010 07:21 PM, strtr wrote:
> > Andrei Alexandrescu Wrote:
> >>
> >> So are you saying there are neural networks with thresholds that
> >> are trained using evolutionary algorithms instead of e.g. backprop?
> >> I found this:
> > The moment a network is just a bit recurrent, any gradient descent
> > algo will be a hell.
> >
> >>
> >> https://docs.google.com/viewer?url=http://www.cs.rutgers.edu/~mlittman/courses/ml03/iCML03/papers/batchis.pdf
> >>
> >>
> >>
> >> which does seem to support the point. I'd have to give it a closer look
> >> to see whether precision would affect training.
> >>
> > I would love to see your results :)
> >
> > But even in the basic 3 layer sigmoid network the question is: Will
> > two outputs which are exactly the same (for a certain input) stay the
> > same if you change the precision?
> 
> You shouldn't care.
Why do you think this? Because I'm pretty sure I do care about this.
Part of my research involves trained networks making only a few decisions, and those decisions should stay the same for all users.

> 
> > When the calculations leading up to
> > the two outputs are totally different ( for instance fully dependent
> > on separated subsets of the input; separated paths), changing the
> > precision could influence them differently leading to different
> > outputs ?
> 
> I'm not sure about that. Fundamentally all learning relies on some 
> smoothness assumption - at a minimum, continuity of the transfer 
> function (small variation in input leads to small variation in output). 
No.
You could maybe say you want small variations in the network to lead to small variations in the output. But I wouldn't even limit myself to those systems. Almost anything a bit more complex than the standard feed-forward network can magnify small changes, and even the standard network relies on large differences between the weights; what is a small change for one input might be an enormous change for another.

> I'm sure certain oddities could be derived from systems that impose 
> discontinuities, but by and large I think those aren't all that interesting.
A lot of the more recent research is done in spiking neural networks; dynamical systems with lots of bifurcations. I wouldn't say those are not that interesting. But then again, who am I? :P

> 
> The case you mention above involves a NN making a different end discrete 
> classification decision because numeric vagaries led to some threshold 
> being met or not. 
No, the numeric discrepancy I suggested would lie in the different rounding of the two calculations:

1.2 * 3 = x1
6/3 + 3.2/2 = x2

If for a certain precision x1 == x2, will this then hold for all precisions?
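
A minimal sketch of how one could check this, assuming IEEE 754 single and double precision and emulating single precision in Python by rounding every literal and every intermediate result through `struct` (rounding a double result to single gives the same answer as doing one +, -, * or / directly in single). The helper names `to_f32`, `identity`, `post_pair` and `tenths_equal` are made up for illustration, and the 0.1 + 0.2 versus 0.3 comparison is an extra pair I added, not part of the question above.

```python
import struct


def to_f32(x):
    """Round a Python float (an IEEE double) to the nearest IEEE single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def identity(x):
    """Leave the value at double precision."""
    return x


def post_pair(rnd):
    """x1 and x2 from the post, with literals and intermediates rounded by rnd."""
    x1 = rnd(rnd(1.2) * rnd(3.0))
    x2 = rnd(rnd(rnd(6.0) / rnd(3.0)) + rnd(rnd(3.2) / rnd(2.0)))
    return x1, x2


def tenths_equal(rnd):
    """Added example (not from the post): does 0.1 + 0.2 == 0.3 at this precision?"""
    return rnd(rnd(0.1) + rnd(0.2)) == rnd(0.3)


for name, rnd in (("float32", to_f32), ("float64", identity)):
    x1, x2 = post_pair(rnd)
    print(name, "x1 =", repr(x1), "x2 =", repr(x2), "x1 == x2:", x1 == x2)
    print(name, "0.1 + 0.2 == 0.3:", tenths_equal(rnd))
```

Under these assumptions the added pair compares equal in single precision but not in double, so equality at one precision does not have to carry over to another. For x1 and x2 as written above, the two sides should differ by one unit in the last place at both precisions, with the sign of the difference flipping between them.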

> I have certainly seen that happening - even changing 
> the computation method (e.g. unrolling loops) will lead to different 
> individual results. 
I don't care about that; portability after compilation is all I'm after.

> But that doesn't matter; statistically the neural 
> net will behave the same.
As I said, statistical behaviour is not what I am after, and in practice a NN will barely ever get such nice normal inputs that statistics can say anything about its workings.