### Raison d'être
> **Problem**: How many times does a value occur in a list?
> **Solution**: A probabilistic approach that utilizes a matrix of hash functions that serve as counters. The result can only *overestimate* how many times a given value has occurred in a list, but it will never *underestimate*.
### Background






### Difference from [Bloom-filters 1](Bloom-filters%201.md)
The main conceptual different is that CM sketch represents a multiset, and has different assumptions about the kind of updates. More formally, CM sketch summarizes a frequency distribution, while Bloom Filter is concerned with representing which elements are present in a set
Count–min sketches are essentially the same data structure as the [counting Bloom filters](https://en.wikipedia.org/wiki/Counting_Bloom_filter "Counting Bloom filter"). However, they are used differently and therefore sized differently: a count-min sketch typically has a sublinear number of cells, related to the desired approximation quality of the sketch, while a counting Bloom filter is more typically sized to match the number of elements in the set.
The main conceptual different is that CM sketch represents a multiset, and has different assumptions about the kind of updates. More formally, CM sketch summarizes a frequency distribution, while Bloom Filter is concerned with representing which elements are present in a set
The implementation details of the countmin sketch are very similar to those of a bloom filter. The latter structure only uses bits, rather than integer-valued counters. When an object is inserted into a bloom filter, hash functions indicate bits that should be set to 1 (whether or not they were previously 0 or 1). The count-min sketch, which is responsible for keeping counts rather than just tracking membership, instead increments counters. Looking up an object in a bloom filter just involves checking the bits corresponding to that object — if any of them are still 0, then the object has not been previously inserted. Thus Lookup in a bloom filter can be thought of as taking the minimum of bits, which exactly parallels the Count operation of a countmin-sketch. That the count-min sketch only overestimates frequency counts corresponds to the bloom filter’s property that it only suffers from false positives.
---
Date: 20210628
Links to: [Computer Science MOC](Computer%20Science%20MOC.md) [Algorithms and Data Structures](Algorithms%20and%20Data%20Structures.md) [Count-Sketch](Count-Sketch.md)
Tags: #review
References:
* [Count min sketch](https://www.youtube.com/watch?v=lGoCslwItiU&t=192s)
* [Count min sketches, another youtube video](https://www.youtube.com/watch?v=mPxslXpg8wA&t=288s)
* [Count sketch explain](https://stackoverflow.com/questions/6811351/explaining-the-count-sketch-algorithm)
* [Count min sketch slides](https://www.slideshare.net/gakhov/probabilistic-data-structures-part-3-frequency)
* [Wikipedia](https://en.wikipedia.org/wiki/Count%E2%80%93min_sketch)
* [Difference from bloom filter](https://sites.google.com/site/countminsketch/home/faq)
* [Web analytics data mining example](https://highlyscalable.wordpress.com/2012/05/01/probabilistic-structures-web-analytics-data-mining/)