#hyperloglog

DragonflyDB @Dragonflydbio
2025-03-27

Ever wonder what a HyperLogLog data structure is? (Who hasn’t!?) In our latest video, learn how Dragonfly implements this memory-efficient counter to track millions of unique users with just 1.5KB of memory. youtu.be/EIJbC9lxzts
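
For a rough sense of where a figure like 1.5KB can come from, here is a back-of-the-envelope sketch. It uses the standard HyperLogLog error estimate of about 1.04 / sqrt(m) for m registers. The 6 bits per register and the register counts are assumptions for illustration, not Dragonfly's documented layout, and the helper names are hypothetical.

```python
import math


def hll_standard_error(num_registers: int) -> float:
    """Approximate relative standard error of HyperLogLog: 1.04 / sqrt(m)."""
    return 1.04 / math.sqrt(num_registers)


def hll_memory_bytes(num_registers: int, bits_per_register: int = 6) -> int:
    """Bytes needed for densely packed registers (6 bits each is an assumption)."""
    return math.ceil(num_registers * bits_per_register / 8)


for p in (11, 12, 14):
    m = 1 << p  # number of registers
    print(f"m=2^{p}: {hll_memory_bytes(m)} bytes, ~{hll_standard_error(m):.2%} standard error")
```

Under these assumptions, 2048 registers occupy 1536 bytes, and the size stays fixed no matter how many unique users are counted.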

𝐭𝐚𝐤𝐢𝐧𝐠 𝐥𝐮𝐜 (RN) @Luc@dresden.network
2025-01-08

@bkastl Hm, feel you!

I do actually work in this field and have so far always been a big fan of the Ethics Council (Ethikrat).

Perhaps she should become an advocate of #HyperLogLog.

media.ccc.de/v/38c3-privacy-pr

#38c3 #Patientenakte

2024-09-16

Completed the first assignment of #645 @CMUDB. HyperLogLog was an interesting data structure to learn about.

#hyperloglog #presto

Gea-Suan Lin @gslin@abpe.org
2024-03-20

Google's HyperLogLog++

This is a follow-up to yesterday's post, "Redis's space-saving implementation of HyperLogLog". The Redis HyperLogLog implementation mentions Google's paper "HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm", which proposes HyperLogLog++ (HLL++).

The paper proposes three main improvements. The first is using a 64-bit hash function:

5.1 Using a 64 Bit Hash Fu

blog.gslin.org/archives/2024/0

#Computer #Murmuring #Programming #algorithm #data #google #hll #hyperloglog #structure

Gea-Suan Lin @gslin@abpe.org
2024-03-19

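Redis's space-saving implementation of HyperLogLog

HyperLogLog (HLL) is a data structure and algorithm that takes a statistical approach to the count-distinct problem. It does not aim for an exact answer, only an approximate count.

The algorithm is actually not that hard to understand. The original 2007 paper, "HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm", presents it like this:

You can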

blog.gslin.org/archives/2024/0

#Computer #Murmuring #Software #algorithm #count #data #distinct #hyperloglog #problem #redis #structure

Kornel @kornel
2023-10-31

#HyperLogLog is super clever.

It can count any number of unique values in constant space (i.e. without storing the values) within a specified margin of error.

And HLLs can be merged to count the number of unique values across both sets! So you can quickly count something like "unique number of requests per day", and combine these into "per month" and "per year", without storing a year's worth of history.
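
To make the merging idea concrete, here is a minimal sketch, assuming a 64-bit hash taken from the first 8 bytes of SHA-1 and 2^12 registers. The class `TinyHLL` and its methods are hypothetical names for illustration, not any library's API, and the estimator includes only the basic small-range correction.

```python
import hashlib
import math


class TinyHLL:
    """Minimal HyperLogLog sketch (illustrative only, not production-grade)."""

    def __init__(self, p: int = 12):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        # Assumption: a 64-bit hash built from the first 8 bytes of SHA-1.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)       # bias constant, valid for m >= 128
        est = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if est <= 2.5 * self.m and zeros:
            est = self.m * math.log(self.m / zeros)  # small-range correction
        return est

    def merge(self, other: "TinyHLL") -> "TinyHLL":
        # Merging is a register-wise maximum, so no raw values are needed.
        assert self.p == other.p
        out = TinyHLL(self.p)
        out.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return out


monday, tuesday = TinyHLL(), TinyHLL()
for i in range(0, 30_000):
    monday.add(f"user-{i}")
for i in range(20_000, 50_000):
    tuesday.add(f"user-{i}")

both = monday.merge(tuesday)
print(round(both.count()))  # an estimate of the 50,000 distinct users across both days
```

The merged sketch has exactly the registers it would have had if every value had been added to a single sketch, which is why daily sketches can be rolled up into monthly or yearly ones.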

Tomasz Nurkiewicz @nurkiewicz@fosstodon.org
2023-07-24

Fantastic explanation of #HyperLogLog algorithm: youtube.com/watch?v=lJYufx0bfp. What's great about this video is that it uses very basic concepts, so that even non-programmers will understand it. On the other hand, CS terms like hash functions or sorted sets are mentioned in fine print, so the video doesn't sound childish

Eugene Alvin Villar 🇵🇭 @seav@en.osm.town
2023-07-18

#TIL about the #HyperLogLog algorithm and I think it's a damn brilliant way to estimate the number of unique elements in a potentially gargantuan set of items while running in only O(n) time and O(1) space. The fact that variants of the algorithm can be done in parallel makes it even more awesome!

youtu.be/lJYufx0bfpw

#algorithms #ComputerScience #SoME3 #mathematics #maths #statistics

2023-06-28

Something a little different today on the channel: HyperLogLog!

It's one of my favorite algorithms, used to estimate cardinality of a set. Typically used in environments with very large datasets (spread across many servers in a cluster) where a true, accurate distinct count would be very expensive.

HLL uses a simple observation about coin flipping probabilities, and extends that to cardinality estimation. Really clever algo, and provides a very fast and compact data structure with reasonably small errors (<2% across billions of unique elements, typically in just a few KB of memory).
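
The coin-flip observation can be tried out with a small simulation. This is a sketch under simplifying assumptions: random 64-bit integers stand in for hashed items, and the function name is hypothetical. If the longest run of leading zero bits seen so far is k, then roughly 2**k distinct values have probably been seen.

```python
import random


def max_leading_zero_run(n: int, rng: random.Random, bits: int = 64) -> int:
    """Longest run of leading zero bits among n random values of the given width."""
    best = 0
    for _ in range(n):
        x = rng.getrandbits(bits)               # stands in for a hashed item
        best = max(best, bits - x.bit_length())  # leading zeros of this value
    return best


rng = random.Random(42)
for n in (1_000, 10_000, 100_000):
    k = max_leading_zero_run(n, rng)
    print(f"n={n}: longest run of leading zeros = {k}, crude estimate 2**k = {2 ** k}")
```

A single maximum like this is very noisy, which is why HLL splits items across many registers and combines the per-register maxima into one estimate.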

youtube.com/watch?v=lJYufx0bfp

#programming #algorithm #hyperloglog #cardinality #datastructures
