Your algorithm is interesting, but I don't think it is fundamentally different from the one I sketched out. It also relies on comparatively expensive operations such as array appends and the filter function, so the bitmask route should be much faster.
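
To make the comparison concrete, here is a rough sketch of the two styles. It is not either of our actual algorithms: I'm assuming the working set is a handful of small non-negative integers, and the function names are made up for illustration. Python is just a convenient notation here; how big the gap is in practice depends on the language and on how large the sets get.

```python
# Hypothetical illustration: a "candidate set" of small non-negative integers,
# represented either as a list or as the bits of a single integer.

def remove_with_filter(candidates, used):
    # List style: visits every element and allocates a new list on each call.
    return list(filter(lambda c: c != used, candidates))

def mask_from(candidates):
    # Build a bitmask where bit i is set when i is a candidate.
    mask = 0
    for c in candidates:
        mask |= 1 << c
    return mask

def remove_with_mask(mask, used):
    # Bitmask style: clear one bit; no iteration and no new container.
    return mask & ~(1 << used)

def members(mask):
    # Expand a bitmask back into a sorted list (only needed for display).
    out = []
    i = 0
    while mask:
        if mask & 1:
            out.append(i)
        mask >>= 1
        i += 1
    return out
```

Removing an element from the list version costs a pass over the list plus an allocation, while the mask version is a shift, a complement, and an AND. For example, `members(remove_with_mask(mask_from([1, 3, 5]), 3))` gives `[1, 5]`, the same result as `remove_with_filter([1, 3, 5], 3)`.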