I've been using the same mechanism as jemalloc in SDC's runtime, and it buckets sizes basically by keeping 2 bits of precision plus a shift. The size classes go: 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, 28, 32, 40, 48, 56, 64, ... It works quite well in practice.
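For anyone who wants to play with the idea, here is a rough Python sketch of that rounding. It is not SDC's or jemalloc's actual code: the names `size_class` and `MIN_SIZE` are made up for illustration, and clamping small requests up to 4 is my assumption based on where the sequence above starts.

```python
MIN_SIZE = 4  # assumed smallest class, taken from the start of the sequence above


def size_class(n: int) -> int:
    """Round n up to the next class: a leading 1 bit, 2 more bits, then a shift."""
    if n <= MIN_SIZE:
        return MIN_SIZE
    # Number of low bits that get zeroed out. Using n - 1 keeps exact
    # powers of two (8, 16, 32, ...) in their own class instead of
    # bumping them into the next, coarser group.
    shift = max(0, (n - 1).bit_length() - 3)
    mask = (1 << shift) - 1
    # Round up to a multiple of 2**shift.
    return (n + mask) & ~mask


if __name__ == "__main__":
    # Collect the distinct classes hit by requests of 1..64 bytes.
    print(sorted({size_class(n) for n in range(1, 65)}))
```

Each power-of-two range gets four evenly spaced classes, so the rounding waste stays bounded at roughly 25% of the request in the worst case (for example, 9 bytes lands in the 10 class and 17 bytes in the 20 class).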