Something a little different today on the channel: HyperLogLog!
It's one of my favorite algorithms, used to estimate the cardinality of a set (the number of distinct elements in it). It's typically used in environments with very large datasets, often spread across many servers in a cluster, where an exact distinct count would be very expensive.
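HLL starts from a simple observation about coin-flipping probabilities and extends it to cardinality estimation: the longer the longest run of heads you've seen, the more flips you probably made, and the same goes for runs of leading zero bits in hashed elements. It's a really clever algorithm that gives you a very fast, compact data structure with reasonably small errors (<2% across billions of unique elements, typically in just a few KB of memory).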
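If you want to poke at the idea before watching, here's a minimal Python sketch. It's a toy under stated assumptions, not the implementation from the video or from any production library: the class name `HyperLogLog`, the precision parameter `p`, and the choice of SHA-1 truncated to 64 bits as the hash are all my own picks for illustration. It applies the standard small-range correction but skips the finer bias corrections that real implementations add.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog sketch (illustrative only, names are hypothetical)."""

    def __init__(self, p=12):
        self.p = p                      # number of hash bits used to pick a register
        self.m = 1 << p                 # number of registers (4096 for p=12)
        self.registers = [0] * self.m
        # Bias-correction constant from the original analysis, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Assumption: SHA-1 truncated to 64 bits stands in for a fast 64-bit hash.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width                # top p bits choose the register
        rest = h & ((1 << width) - 1)   # remaining bits are the "coin flips"
        # Rank = 1-based position of the first 1 bit, i.e. leading zeros + 1.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        # Harmonic mean of 2^register across all registers, scaled by alpha * m^2.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting while
        # many registers are still empty.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return estimate


if __name__ == "__main__":
    hll = HyperLogLog(p=12)
    for i in range(100_000):
        hll.add(f"user-{i}")
    print(round(hll.count()))
```

With p=12 there are 4096 registers, and each one only needs about 6 bits, so a packed version fits in roughly 3 KB with a theoretical standard error of about 1.04/sqrt(4096), or 1.6%. (The Python list here uses far more memory than that; it's just for readability.) Adding the same element twice never changes a register, which is why duplicates don't inflate the estimate.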
https://www.youtube.com/watch?v=lJYufx0bfpw
#programming #algorithm #hyperloglog #cardinality #datastructures