artemis.externals.tdigest
Submodules
Package Contents
-
class artemis.externals.tdigest.TDigest(delta=0.01, K=25)
Bases: object
-
__add__(self, other_digest)
-
__len__(self)
-
__repr__(self)
-
__iter__(self)
Iterates over centroids in the digest.
-
_add_centroid(self, centroid)
-
_compute_centroid_quantile(self, centroid)
-
_update_centroid(self, centroid, x, w)
-
_find_closest_centroids(self, x)
-
_threshold(self, q)
-
update(self, x, w=1)
Update the t-digest with value x and weight w.
-
batch_update(self, values, w=1)
Update the t-digest with an iterable of values. This assumes all points have the same weight w.
-
compress(self)
-
percentile(self, p)
Computes the value at percentile p, where p must lie in [0, 100].
-
cdf(self, x)
Computes the CDF at a specific value, i.e., F(x), where F denotes the CDF of the distribution.
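To make the relationship between percentile() and cdf() concrete, the sketch below computes exact counterparts on raw, sorted data using only the standard library. It does not use TDigest; the helper names exact_cdf and exact_percentile are hypothetical, and the nearest-rank convention is an assumption chosen for simplicity. A digest returns approximations of quantities like these, and its interpolation may give slightly different numbers.

```python
import math
from bisect import bisect_right


def exact_cdf(sorted_values, x):
    """Empirical CDF: fraction of values less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def exact_percentile(sorted_values, p):
    """Nearest-rank percentile: smallest value whose CDF is at least p/100."""
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100, inclusive")
    n = len(sorted_values)
    rank = max(1, math.ceil(p * n / 100))  # 1-based rank
    return sorted_values[rank - 1]


data = list(range(1, 101))           # the integers 1..100, already sorted
median = exact_percentile(data, 50)  # value at the 50th percentile
frac = exact_cdf(data, median)       # maps that value back to a fraction
```

The two functions are approximate inverses of each other: percentile maps a rank in [0, 100] to a value, while the CDF maps a value to a fraction in [0, 1].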
-
trimmed_mean(self, p1, p2)
Computes the mean of the distribution between the two percentiles p1 and p2. This algorithm is modified from the one presented in the original t-Digest paper.
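As a point of comparison, here is a minimal sketch of an exact trimmed mean over raw data. It is not the digest's algorithm (the digest works on weighted centroids rather than individual points); the function name exact_trimmed_mean and the rule of truncating the cut positions to whole indices are assumptions made for illustration.

```python
def exact_trimmed_mean(values, p1, p2):
    """Mean of the sorted values lying between percentiles p1 and p2."""
    if not 0 <= p1 < p2 <= 100:
        raise ValueError("require 0 <= p1 < p2 <= 100")
    ordered = sorted(values)
    n = len(ordered)
    lo = int(n * p1 / 100)  # index of the first value kept
    hi = int(n * p2 / 100)  # index one past the last value kept
    kept = ordered[lo:hi]
    if not kept:
        raise ValueError("no values fall between the requested percentiles")
    return sum(kept) / len(kept)


# Dropping the top 20% removes the outlier 100 before averaging.
robust = exact_trimmed_mean([1, 2, 3, 4, 100], 0, 80)
```

Trimming is useful when a few extreme values would otherwise dominate the plain mean.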
-
centroids_to_list(self)
Returns a Python list of the TDigest object’s Centroid values.
-
to_dict(self)
Returns a Python dictionary of the TDigest and its internal Centroid values. Use centroids_to_list() instead for a list of only the Centroid values.
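Because the result is a plain dictionary, it can presumably be persisted with any serializer that handles dictionaries. The sketch below round-trips a hand-written dictionary through the standard json module; the dictionary is not produced by TDigest here, and its shape (keys K, delta, centroids, n) is copied from the update_from_dict() example in this reference.

```python
import json

# Hand-written stand-in for a to_dict() result (shape assumed from this reference).
state = {
    "K": 25,
    "delta": 0.01,
    "centroids": [
        {"c": 1.0, "m": 1.0},
        {"c": 1.0, "m": 2.0},
        {"c": 1.0, "m": 3.0},
    ],
    "n": 3.0,
}

serialized = json.dumps(state, sort_keys=True)  # text suitable for a file or a cache
restored = json.loads(serialized)               # back to a plain dictionary
```

A dictionary restored this way has the shape that update_from_dict() is documented to accept.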
-
update_from_dict(self, dict_values)
Updates the TDigest object with dictionary values.
The delta and K values are optional; include them only if you want to update them. The n value is not required because it is computed from the centroid weights.
For example, you can initialize a new TDigest:
digest = TDigest()
Then load dictionary values into the digest:
digest.update_from_dict({'K': 25, 'delta': 0.01, 'centroids': [{'c': 1.0, 'm': 1.0}, {'c': 1.0, 'm': 2.0}, {'c': 1.0, 'm': 3.0}]})
Or update an existing digest where the centroids will be appropriately merged:
digest = TDigest()
digest.update(1)
digest.update(2)
digest.update(3)
digest.update_from_dict({'K': 25, 'delta': 0.01, 'centroids': [{'c': 1.0, 'm': 1.0}, {'c': 1.0, 'm': 2.0}, {'c': 1.0, 'm': 3.0}]})
Resulting in the digest having merged similar centroids by increasing their weight:
{'K': 25, 'delta': 0.01, 'centroids': [{'c': 2.0, 'm': 1.0}, {'c': 2.0, 'm': 2.0}, {'c': 2.0, 'm': 3.0}], 'n': 6.0}
Alternatively, you can provide only a list of centroid values with update_centroids_from_list().
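The merge outcome described above can be reproduced with a deliberately simplified sketch that does not use TDigest at all. It reads 'm' as a centroid's mean and 'c' as its weight, as the example output suggests, and it merges only centroids whose means are exactly equal. The real digest decides merges using its own thresholds, so the helper merge_equal_means is hypothetical and covers only this special case.

```python
def merge_equal_means(existing, incoming):
    """Sum the weights of centroids that share exactly the same mean."""
    weights = {}
    for centroid in list(existing) + list(incoming):
        mean = centroid["m"]
        weights[mean] = weights.get(mean, 0.0) + centroid["c"]
    centroids = [{"c": c, "m": m} for m, c in sorted(weights.items())]
    # n is derived from the centroid weights rather than supplied by the caller.
    return {"centroids": centroids, "n": sum(c["c"] for c in centroids)}


existing = [{"c": 1.0, "m": 1.0}, {"c": 1.0, "m": 2.0}, {"c": 1.0, "m": 3.0}]
incoming = [{"c": 1.0, "m": 1.0}, {"c": 1.0, "m": 2.0}, {"c": 1.0, "m": 3.0}]
merged = merge_equal_means(existing, incoming)
```

Each mean keeps a single centroid whose weight doubles, and the total n equals the sum of the weights, matching the centroids and n values in the documented result.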
-
update_centroids_from_list(self, list_values)
Add or update Centroids from a Python list. Any existing centroids in the digest object are appropriately updated.
Example
digest.update_centroids_from_list([{'c': 1.0, 'm': 1.0}, {'c': 1.0, 'm': 2.0}, {'c': 1.0, 'm': 3.0}])
-