D for BigData: the first BetterC library by Tamediadigital

test123 test123 at gmail.com
Mon May 2 07:22:49 UTC 2022


On Monday, 2 May 2022 at 06:17:17 UTC, Cym13 wrote:
> No, that's not what this is for. Hyperloglog is useful if you 
> have a big dataset that may contain duplicates and you want to 
> know how many unique items you have (with a reasonable 
> probability). For example, as a website, this can be used to 
> estimate how many visitors you have without having to store 
> every single IP address to check for duplicates at new 
> connections. The tradeoff is that it's probabilistic: you don't 
> need to store every address so you need much less space and 
> time to get a count of unique IPs, but you have to accept a 
> margin of error on that result and you can't know what the IPs 
> were in the first place, just how many of them there are.

Thanks for the quick answer.

So you mean that with HyperLogLog I cannot get a count for each 
individual IP, only the number of distinct IPs that have been 
added to the set?
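
To make the distinction concrete, here is a minimal Python sketch of the idea as I understand it. It is not the API of the library announced in this thread: the class name `HyperLogLog`, the parameter `p`, and the choice of SHA-1 as the hash are all assumptions made for illustration. It shows that the structure keeps only a fixed array of small registers, so it can estimate how many distinct items were added but has no way to report a per-item count or list the items.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog (hypothetical class, for illustration only)."""

    def __init__(self, p=12):
        self.p = p                      # number of hash bits used as register index
        self.m = 1 << p                 # number of registers (4096 for p=12)
        self.registers = [0] * self.m   # the only state that is ever stored

    def add(self, item):
        # Assumption: a 64-bit hash taken from the first 8 bytes of SHA-1.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank             # keep only the maximum rank

    def count(self):
        # Bias-correction constant for m >= 128.
        alpha = 0.7213 / (1 + 1.079 / self.m)
        est = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if est <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            est = self.m * math.log(self.m / zeros)
        return round(est)


hll = HyperLogLog()
for visit in range(3):                   # every address "visits" three times
    for i in range(10000):
        hll.add(f"10.0.{i // 256}.{i % 256}")

# Prints an estimate of the number of distinct addresses (10000 were added),
# not the 30000 visits and not a count per address.
print(hll.count())
```

Adding the same address again never changes the registers, which is why duplicates do not inflate the result, and also why the number of visits per address cannot be recovered. If per-IP counts are needed, a different structure is required, for example an exact hash map or a frequency sketch such as Count-Min.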


More information about the Digitalmars-d-announce mailing list