artemis.generators.common¶
Base classes for a collection of data generators.
Refer to arrow/python/pyarrow/benchmarks/common.py.
Provides common utilities, a common base class, and a builtins generator.
Module Contents¶
-
artemis.generators.common.KILOBYTE¶
-
artemis.generators.common.MEGABYTE¶
-
artemis.generators.common.DEFAULT_NONE_PROB= 0.0¶
-
artemis.generators.common.check_random_state(seed)¶ Turn seed into a numpy.random.RandomState instance. This avoids repeatability problems when multiple generators are used in the same code.
https://scikit-learn.org/stable/developers/utilities.html#validation-tools
Parameters
seed: None | int | instance of RandomState
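The seed handling described above follows the scikit-learn helper of the same name. Below is a minimal sketch of the pattern, not the artemis implementation. It uses the standard library's random.Random as a stand-in for numpy.random.RandomState so the snippet has no dependencies. The name normalize_seed is hypothetical, and returning a fresh generator for None is an assumption.

```python
import random


def normalize_seed(seed=None):
    """Return a random.Random for None, an int seed, or an existing instance."""
    if seed is None:
        # No seed requested: a fresh, OS-seeded generator (assumption).
        return random.Random()
    if isinstance(seed, int):
        # Integer seed: a new generator with reproducible output.
        return random.Random(seed)
    if isinstance(seed, random.Random):
        # Already a generator: pass it through so callers share one stream.
        return seed
    raise ValueError(f"{seed!r} cannot be used to seed a random generator")


shared = normalize_seed(42)
# Two consumers handed the same instance draw from a single stream,
# so they do not silently produce identical sequences.
first = normalize_seed(shared).random()
second = normalize_seed(shared).random()
```

Passing the shared instance rather than the integer 42 to each consumer is what keeps their outputs distinct while the overall run stays reproducible.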
-
artemis.generators.common._multiplicate_sequence(base, target_size)¶
-
artemis.generators.common.get_random_bytes(n, seed=42)¶ Generate a random bytes object of size n. Note the result might be compressible.
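One way such a function can end up with compressible output is by tiling a small random block until the requested size is reached. The sketch below illustrates that idea under stated assumptions: random_bytes_sketch and base_size are hypothetical names, the standard library random module stands in for NumPy, and nothing here is confirmed to match the artemis code.

```python
import random
import zlib


def random_bytes_sketch(n, seed=42, base_size=1024):
    """Build n pseudo-random bytes by tiling one random block."""
    rnd = random.Random(seed)
    base = bytes(rnd.getrandbits(8) for _ in range(min(n, base_size)))
    if not base:
        return b""
    # Repeat the block q times, then top up with its first r bytes.
    q, r = divmod(n, len(base))
    return base * q + base[:r]


data = random_bytes_sketch(10_000)
# The repeated block is what makes the output compressible.
compressed_size = len(zlib.compress(data))
```

If incompressible input matters for a benchmark, the tiling step would have to be replaced by drawing all n bytes directly.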
-
artemis.generators.common.get_random_ascii(n, seed=42)¶ Get a random ASCII-only unicode string of size n.
-
artemis.generators.common._random_unicode_letters(n, seed=42)¶ Generate a string of random unicode letters (slow).
-
artemis.generators.common._1024_random_unicode_letters¶
-
artemis.generators.common.get_random_unicode(n, seed=42)¶ Get a random non-ASCII unicode string of size n.
-
class
artemis.generators.common.GeneratorBase(name, **kwargs)¶ Common base class for generators
-
property
random_state(self)¶
-
property
name(self)¶ Algorithm name
-
reset(self)¶
-
to_msg(self)¶
-
static
from_msg(logger, msg)¶
-
generate(self)¶
-
initialize(self)¶
-
__iter__(self)¶
-
__next__(self)¶
-
sampler(self)¶
-
property
-
class
artemis.generators.common.BuiltinsGenerator(seed=None)¶ Bases: object
-
sprinkle(self, lst, prob, value)¶ Sprinkle value entries in list lst with likelihood prob.
-
sprinkle_nones(self, lst, prob)¶ Sprinkle None entries in list lst with likelihood prob.
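The two sprinkle methods can be pictured as a single per-entry coin flip. The following is a minimal standalone sketch, assuming the list is modified in place and that an explicit random.Random is passed in (the rnd parameter is hypothetical; the real methods take self instead).

```python
import random


def sprinkle(lst, prob, value, rnd):
    """Overwrite each entry of lst with value with probability prob, in place."""
    for i in range(len(lst)):
        if rnd.random() < prob:
            lst[i] = value


def sprinkle_nones(lst, prob, rnd):
    """Special case of sprinkle with None as the inserted value."""
    sprinkle(lst, prob, None, rnd)


rnd = random.Random(0)
values = list(range(1000))
sprinkle_nones(values, 0.1, rnd)
none_count = values.count(None)  # roughly 10% of the entries
```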
-
generate_int_list(self, n, none_prob=DEFAULT_NONE_PROB)¶ Generate a list of Python ints with none_prob probability of an entry being None.
-
generate_float_list(self, n, none_prob=DEFAULT_NONE_PROB, use_nan=False)¶ Generate a list of Python floats with none_prob probability of an entry being None (or NaN if use_nan is true).
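The interaction between none_prob and use_nan can be sketched as follows. This is an illustrative standalone function, not the artemis method: float_list_sketch, the seed parameter, and the value range [0, 1) are assumptions.

```python
import math
import random


def float_list_sketch(n, none_prob=0.0, use_nan=False, seed=None):
    """Return n floats; missing entries are None, or NaN when use_nan is true."""
    rnd = random.Random(seed)
    missing = float("nan") if use_nan else None
    return [missing if rnd.random() < none_prob else rnd.random()
            for _ in range(n)]


with_none = float_list_sketch(5, none_prob=1.0)
with_nan = float_list_sketch(5, none_prob=1.0, use_nan=True)
dense = float_list_sketch(5, none_prob=0.0, seed=3)
```

With use_nan set, every entry stays a float, which matters for consumers that infer a column type from the Python values.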
-
generate_bool_list(self, n, none_prob=DEFAULT_NONE_PROB)¶ Generate a list of Python bools with none_prob probability of an entry being None.
-
generate_decimal_list(self, n, none_prob=DEFAULT_NONE_PROB, use_nan=False)¶ Generate a list of Python Decimals with none_prob probability of an entry being None (or NaN if use_nan is true).
-
generate_object_list(self, n, none_prob=DEFAULT_NONE_PROB)¶ Generate a list of generic Python objects with none_prob probability of an entry being None.
-
_generate_varying_sequences(self, random_factory, n, min_size, max_size, none_prob)¶ Generate a list of n sequences whose sizes vary between min_size and max_size, with none_prob probability of an entry being None. The base material for each sequence is obtained by calling random_factory(<some size>).
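A rough sketch of how a factory-based helper like this might work: generate the base material once, then cut slices of random length at random offsets. The names varying_sequences_sketch, ascii_factory and base_size, as well as the slicing strategy itself, are assumptions for illustration.

```python
import random


def varying_sequences_sketch(random_factory, n, min_size, max_size,
                             none_prob=0.0, seed=None, base_size=10_000):
    """Return n slices of the factory output, each min_size..max_size long."""
    rnd = random.Random(seed)
    # Generate the base material once; the extra max_size items guarantee
    # that any offset below base_size leaves room for a full slice.
    base = random_factory(base_size + max_size)
    data = []
    for _ in range(n):
        offset = rnd.randrange(base_size)
        size = rnd.randint(min_size, max_size)  # inclusive on both ends
        data.append(base[offset:offset + size])
    # Replace entries with None with probability none_prob.
    return [None if rnd.random() < none_prob else item for item in data]


def ascii_factory(size):
    """Hypothetical factory: size lowercase ASCII letters."""
    rnd = random.Random(42)
    return "".join(rnd.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(size))


strings = varying_sequences_sketch(ascii_factory, n=100, min_size=3,
                                   max_size=8, seed=1)
```

Because the factory is called once, the cost of producing n sequences is dominated by cheap slicing rather than by random generation.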
-
generate_fixed_binary_list(self, n, size, none_prob=DEFAULT_NONE_PROB)¶ Generate a list of bytestrings with a fixed size.
-
generate_varying_binary_list(self, n, min_size, max_size, none_prob=DEFAULT_NONE_PROB)¶ Generate a list of bytestrings with a random size between min_size and max_size.
-
generate_ascii_string_list(self, n, min_size, max_size, none_prob=DEFAULT_NONE_PROB)¶ Generate a list of ASCII strings with a random size between min_size and max_size.
-
generate_unicode_string_list(self, n, min_size, max_size, none_prob=DEFAULT_NONE_PROB)¶ Generate a list of unicode strings with a random size between min_size and max_size.
-
generate_int_list_list(self, n, min_size, max_size, none_prob=DEFAULT_NONE_PROB)¶ Generate a list of lists of Python ints with a random size between min_size and max_size.
-
generate_tuple_list(self, n, none_prob=DEFAULT_NONE_PROB)¶ Generate a list of tuples with random values. Each tuple has the form (int value, float value, bool value)
-
generate_dict_list(self, n, none_prob=DEFAULT_NONE_PROB)¶ Generate a list of dicts with random values. Each dict has the form
{'u': int value, 'v': float value, 'w': bool value}
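The tuple and dict entries share the same (int, float, bool) layout. A minimal sketch of what the generated records could look like, with hypothetical function names and assumed value ranges:

```python
import random


def random_triple(rnd):
    """One (int, float, bool) record; the value ranges are assumptions."""
    return rnd.randint(-1000, 1000), rnd.random(), rnd.random() < 0.5


def tuple_list_sketch(n, none_prob=0.0, seed=None):
    """Return n tuples of the form (int, float, bool), or None entries."""
    rnd = random.Random(seed)
    return [None if rnd.random() < none_prob else random_triple(rnd)
            for _ in range(n)]


def dict_list_sketch(n, none_prob=0.0, seed=None):
    """Return n dicts of the form {'u': int, 'v': float, 'w': bool}, or None."""
    rnd = random.Random(seed)
    rows = []
    for _ in range(n):
        if rnd.random() < none_prob:
            rows.append(None)
        else:
            u, v, w = random_triple(rnd)
            rows.append({"u": u, "v": v, "w": w})
    return rows


tuples = tuple_list_sketch(3, seed=0)
dicts = dict_list_sketch(3, seed=0)
```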
-
get_type_and_builtins(self, n, type_name)¶ Return an (arrow type, list) tuple where the arrow type corresponds to the given logical type_name, and the list contains n randomly generated Python objects compatible with the arrow type.
-