skpro.distributions.base.BaseDistribution#

class skpro.distributions.base.BaseDistribution(index=None, columns=None)[source]#

Base probability distribution.

Attributes:
iloc

Integer location indexer.

loc

Location indexer.

name

Return the name of the object or estimator.

shape

Shape of self, a pair (2-tuple).

Methods

cdf(x)

Cumulative distribution function.

clone()

Obtain a clone of the object with same hyper-parameters.

clone_tags(estimator[, tag_names])

Clone tags from another estimator as dynamic override.

create_test_instance([parameter_set])

Construct Estimator instance if possible.

create_test_instances_and_names([parameter_set])

Create list of all test instances and a list of names for them.

energy([x])

Energy of self, w.r.t. self or a constant frame x.

get_class_tag(tag_name[, tag_value_default])

Get a class tag's value.

get_class_tags()

Get class tags from the class and all its parent classes.

get_config()

Get config flags for self.

get_param_defaults()

Get object's parameter defaults.

get_param_names()

Get object's parameter names.

get_params([deep])

Get a dict of parameters values for this object.

get_tag(tag_name[, tag_value_default, ...])

Get tag value from estimator class and dynamic tag overrides.

get_tags()

Get tags from estimator class and dynamic tag overrides.

get_test_params([parameter_set])

Return testing parameter settings for the estimator.

is_composite()

Check if the object is composed of other BaseObjects.

log_pdf(x)

Logarithmic probability density function.

mean()

Return expected value of the distribution.

pdf(x)

Probability density function.

pdfnorm([a])

a-norm of pdf, defaults to 2-norm.

ppf(p)

Quantile function = percent point function = inverse cdf.

quantile(alpha)

Return entry-wise quantiles, in Proba/pred_quantiles mtype format.

reset()

Reset the object to a clean post-init state.

sample([n_samples])

Sample from the distribution.

set_config(**config_dict)

Set config flags to given values.

set_params(**params)

Set the parameters of this object.

set_random_state([random_state, deep, ...])

Set random_state pseudo-random seed parameters for self.

set_tags(**tag_dict)

Set dynamic tags to given values.

to_str()

Return string representation of self.

var()

Return element/entry-wise variance of the distribution.

property loc[source]#

Location indexer.

Use my_distribution.loc[index] for pandas-like row/column subsetting of BaseDistribution descendants.

index can be any pandas loc compatible index subsetter.

my_distribution.loc[index] or my_distribution.loc[row_index, col_index] subsets my_distribution to the rows defined by row_index and the columns defined by col_index, that is, to exactly the same rows/columns as pandas loc would select from my_distribution.index and my_distribution.columns.

property iloc[source]#

Integer location indexer.

Use my_distribution.iloc[index] for pandas-like row/column subsetting of BaseDistribution descendants.

index can be any pandas iloc compatible index subsetter.

my_distribution.iloc[index] or my_distribution.iloc[row_index, col_index] subsets my_distribution to the rows defined by row_index and the columns defined by col_index, that is, to exactly the same rows/columns as pandas iloc would select from my_distribution.index and my_distribution.columns.

property shape[source]#

Shape of self, a pair (2-tuple).

to_str()[source]#

Return string representation of self.

pdf(x)[source]#

Probability density function.

Let \(X\) be a random variable with the distribution of self, taking values in (N, n) DataFrame-s. Let \(x\in \mathbb{R}^{N\times n}\). By \(p_{X_{ij}}\), denote the marginal pdf of \(X\) at the \((i,j)\)-th entry.

The output of this method, for input x representing \(x\), is a DataFrame with same columns and indices as self, and entries \(p_{X_{ij}}(x_{ij})\).

Parameters:
xpandas.DataFrame or 2D np.ndarray

representing \(x\), as above

Returns:
DataFrame with same columns and index as self

containing \(p_{X_{ij}}(x_{ij})\), as above
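The entry-wise convention can be illustrated without skpro. The following is a minimal sketch under stated assumptions, not skpro's implementation: each marginal is a `statistics.NormalDist` from the Python standard library, nested lists stand in for DataFrames, and the names `marginals`, `x`, and `entrywise_pdf` are hypothetical.

```python
from statistics import NormalDist

# Hypothetical 2x2 array distribution: one normal marginal per entry (i, j).
marginals = [
    [NormalDist(mu=0.0, sigma=1.0), NormalDist(mu=1.0, sigma=2.0)],
    [NormalDist(mu=-1.0, sigma=0.5), NormalDist(mu=3.0, sigma=1.0)],
]

# Evaluation points x, with the same shape as the distribution.
x = [
    [0.0, 1.0],
    [-1.0, 2.0],
]


def entrywise_pdf(marginals, x):
    """Return a grid whose (i, j) entry is p_{X_ij}(x_ij)."""
    return [
        [dist.pdf(value) for dist, value in zip(dist_row, x_row)]
        for dist_row, x_row in zip(marginals, x)
    ]


densities = entrywise_pdf(marginals, x)
```

The result has the same shape as the input, and each entry depends only on the marginal and the evaluation point at the same position.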

log_pdf(x)[source]#

Logarithmic probability density function.

Numerically more stable than calling pdf and then taking logarithms.

Let \(X\) be a random variable with the distribution of self, taking values in (N, n) DataFrame-s. Let \(x\in \mathbb{R}^{N\times n}\). By \(p_{X_{ij}}\), denote the marginal pdf of \(X\) at the \((i,j)\)-th entry.

The output of this method, for input x representing \(x\), is a DataFrame with same columns and indices as self, and entries \(\log p_{X_{ij}}(x_{ij})\).

If self has a mixed or discrete distribution, this returns the weighted continuous part of self’s distribution instead of the pdf, i.e., the marginal pdf integrates to the weight of the continuous part.

Parameters:
xpandas.DataFrame or 2D np.ndarray

representing \(x\), as above

Returns:
DataFrame with same columns and index as self

containing \(\log p_{X_{ij}}(x_{ij})\), as above
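The stability claim can be seen on a single normal marginal. This is a sketch only, using the standard library rather than skpro; `normal_log_pdf` is a hypothetical helper that evaluates the closed-form normal log density directly in log space.

```python
import math
from statistics import NormalDist

dist = NormalDist(mu=0.0, sigma=1.0)
x = 40.0  # a point far in the tail


def normal_log_pdf(x, mu, sigma):
    """Closed-form log density of a normal distribution, computed in log space."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


# exp(-800) underflows to 0.0 in double precision, so the density is lost.
naive_density = dist.pdf(x)

# The log-space evaluation stays finite.
stable_log_density = normal_log_pdf(x, mu=0.0, sigma=1.0)

# math.log(naive_density) would raise ValueError, because naive_density is 0.0.
```

Taking the logarithm after the fact cannot recover the information lost to underflow, which is why a dedicated log-density evaluation is preferable in the tails.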

cdf(x)[source]#

Cumulative distribution function.

ppf(p)[source]#

Quantile function = percent point function = inverse cdf.
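The inverse relationship between the quantile function and the cdf can be checked on a single marginal. The sketch below uses `statistics.NormalDist` from the standard library as a stand-in for one entry of a distribution; it does not call skpro.

```python
from statistics import NormalDist

dist = NormalDist(mu=2.0, sigma=3.0)

p = 0.9
q = dist.inv_cdf(p)       # quantile (ppf) at probability p
round_trip = dist.cdf(q)  # applying the cdf recovers p, up to floating point error
```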

energy(x=None)[source]#

Energy of self, w.r.t. self or a constant frame x.

Let \(X, Y\) be i.i.d. random variables with the distribution of self.

If x is None, returns \(\mathbb{E}[|X-Y|]\) (for each row), “self-energy” (of the row marginal distribution). If x is passed, returns \(\mathbb{E}[|X-x|]\) (for each row), “energy wrt x” (of the row marginal distribution).

Parameters:
xNone or pd.DataFrame, optional, default=None

if pd.DataFrame, must have same rows and columns as self

Returns:
pd.DataFrame with same rows as self, single column “energy”
each row contains one float, self-energy/energy as described above.
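Both energy quantities can be checked numerically for a single normal marginal. The sketch below is an illustration under assumptions, not skpro code: it compares standard closed forms for a normal distribution against a seeded Monte Carlo estimate, and all variable names are hypothetical.

```python
import math
import random
from statistics import NormalDist

mu, sigma = 1.0, 2.0
x = 0.5  # constant reference value for the "energy wrt x"

# Self-energy E|X - Y| for i.i.d. normals: X - Y ~ N(0, 2 * sigma^2),
# so E|X - Y| = 2 * sigma / sqrt(pi).
self_energy = 2.0 * sigma / math.sqrt(math.pi)

# Energy wrt x, E|X - x|: mean of a folded normal with location d = mu - x.
d = mu - x
energy_wrt_x = (
    sigma * math.sqrt(2.0 / math.pi) * math.exp(-d * d / (2.0 * sigma * sigma))
    + d * (1.0 - 2.0 * NormalDist().cdf(-d / sigma))
)

# Monte Carlo estimates of the same two expectations, with a fixed seed.
rng = random.Random(0)
n = 100_000
xs = [rng.gauss(mu, sigma) for _ in range(n)]
ys = [rng.gauss(mu, sigma) for _ in range(n)]
mc_self_energy = sum(abs(a - b) for a, b in zip(xs, ys)) / n
mc_energy_wrt_x = sum(abs(a - x) for a in xs) / n
```

The Monte Carlo estimates agree with the closed forms up to sampling error, which shrinks at the usual rate of one over the square root of the number of samples.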
mean()[source]#

Return expected value of the distribution.

Let \(X\) be a random variable with the distribution of self. Returns the expectation \(\mathbb{E}[X]\).

Returns:
pd.DataFrame with same rows, columns as self
expected value of distribution (entry-wise)
var()[source]#

Return element/entry-wise variance of the distribution.

Let \(X\) be a random variable with the distribution of self. Returns \(\mathbb{V}[X] = \mathbb{E}\left[\left(X - \mathbb{E}[X]\right)^2\right]\).

Returns:
pd.DataFrame with same rows, columns as self
variance of distribution (entry-wise)
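As a worked check of the mean and variance definitions, consider a fair six-sided die as a single marginal. This small example uses exact rational arithmetic from the standard library and is independent of skpro.

```python
from fractions import Fraction

# Fair six-sided die: a small discrete example with exact arithmetic.
outcomes = [1, 2, 3, 4, 5, 6]
prob = Fraction(1, 6)

mean = sum(prob * k for k in outcomes)                    # E[X]
variance = sum(prob * (k - mean) ** 2 for k in outcomes)  # E[(X - E[X])^2]
second_moment = sum(prob * k * k for k in outcomes)       # E[X^2]
```

The variance computed as the expected squared deviation from the mean coincides with the second moment minus the squared mean.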
pdfnorm(a=2)[source]#

a-norm of pdf, defaults to 2-norm.

Computes the a-norm of the entry marginal pdf, i.e., \(\mathbb{E}[p_X(X)^{a-1}] = \int p(x)^a dx\), where \(X\) is a random variable distributed according to the entry marginal of self, and \(p_X\) is its pdf.

Parameters:
a: int or float, optional, default=2
Returns:
pd.DataFrame with same rows and columns as self
each entry is \(\mathbb{E}[p_X(X)^{a-1}] = \int p(x)^a dx\), see above
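For a normal marginal with standard deviation \(\sigma\) and a = 2, the integral \(\int p(x)^2 dx\) has the known value \(1/(2\sigma\sqrt{\pi})\). The sketch below checks this with a trapezoidal rule; `pdf_a_norm` is a hypothetical helper and not part of skpro.

```python
import math
from statistics import NormalDist

sigma = 1.5
dist = NormalDist(mu=0.0, sigma=sigma)


def pdf_a_norm(dist, a, lower, upper, steps=20_000):
    """Approximate the integral of p(x)**a over [lower, upper] (trapezoidal rule)."""
    h = (upper - lower) / steps
    total = 0.5 * (dist.pdf(lower) ** a + dist.pdf(upper) ** a)
    for k in range(1, steps):
        total += dist.pdf(lower + k * h) ** a
    return total * h


numeric = pdf_a_norm(dist, a=2, lower=-12 * sigma, upper=12 * sigma)
closed_form = 1.0 / (2.0 * sigma * math.sqrt(math.pi))  # normal pdf, a = 2
```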
quantile(alpha)[source]#

Return entry-wise quantiles, in Proba/pred_quantiles mtype format.

This method broadcasts as follows: for a scalar alpha, it computes the alpha-quantile entry-wise and returns a pd.DataFrame with the same index as self and columns as described under Returns. If alpha is iterable, multiple quantiles are calculated, and the results are concatenated column-wise (axis=1).

The ppf method also computes quantiles, but broadcasts differently, in numpy style closer to tensorflow. In contrast, this quantile method broadcasts as sktime forecaster predict_quantiles, i.e., columns first.

Parameters:
alphafloat or list of float of unique values

A probability or list of, at which quantiles are computed.

Returns:
quantilespd.DataFrame

Columns have a MultiIndex: the first level is the variable name from self.columns, the second level the values of alpha passed to the function. Row index is self.index. The entry in the i-th row and (j, p)-th column is the p-th quantile of the marginal of self at index (i, j).
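The column layout of the return can be sketched in plain Python. This is an illustration under assumptions, not skpro's implementation: a dict keyed by (variable, alpha) tuples stands in for the two-level column index, `statistics.NormalDist` supplies the marginals, and `quantile_table` is a hypothetical helper.

```python
from statistics import NormalDist

# Hypothetical array distribution: rows "a", "b" and columns "y1", "y2".
index = ["a", "b"]
columns = ["y1", "y2"]
marginals = {
    ("a", "y1"): NormalDist(0.0, 1.0),
    ("a", "y2"): NormalDist(5.0, 2.0),
    ("b", "y1"): NormalDist(-1.0, 0.5),
    ("b", "y2"): NormalDist(10.0, 3.0),
}


def quantile_table(alpha):
    """Return {row: {(column, alpha): quantile}}, with the variable name first."""
    alphas = [alpha] if isinstance(alpha, float) else list(alpha)
    return {
        row: {
            (col, a): marginals[(row, col)].inv_cdf(a)
            for col in columns
            for a in alphas
        }
        for row in index
    }


table = quantile_table([0.1, 0.5, 0.9])
column_keys = list(table["a"])  # ("y1", 0.1), ("y1", 0.5), ("y1", 0.9), ("y2", 0.1), ...
single = quantile_table(0.5)    # scalar alpha gives one quantile column per variable
```

All quantiles of one variable sit next to each other, which is the "columns first" ordering mentioned above.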

sample(n_samples=None)[source]#

Sample from the distribution.

Parameters:
n_samplesint, optional, default = None
Returns:
if n_samples is None:
returns a sample that contains a single sample from self,
in pd.DataFrame mtype format convention, with index and columns as self
if n_samples is int:
returns a pd.DataFrame that contains n_samples i.i.d. samples from self,
in pd-multiindex mtype format convention, with same columns as self,
and MultiIndex that is product of RangeIndex(n_samples) and self.index
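The row index structure for an integer n_samples can be sketched with the standard library. This is only an illustration of the product index described above: tuples stand in for MultiIndex entries, standard normal draws stand in for draws from self, and all names are hypothetical.

```python
import random
from itertools import product

index = ["a", "b", "c"]  # row labels of the hypothetical distribution
columns = ["y1", "y2"]
n_samples = 2
rng = random.Random(42)

# Row keys are the product of range(n_samples) and the original index:
# (0, "a"), (0, "b"), (0, "c"), (1, "a"), (1, "b"), (1, "c")
row_keys = list(product(range(n_samples), index))

# One standard normal draw per (sample, row, column) entry, for illustration only.
samples = {
    key: {col: rng.gauss(0.0, 1.0) for col in columns}
    for key in row_keys
}
```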
clone()[source]#

Obtain a clone of the object with same hyper-parameters.

A clone is a different object without shared references, in post-init state. This function is equivalent to returning sklearn.clone of self.

Raises:
RuntimeError if the clone is non-conforming, due to faulty __init__.

Notes

If successful, equal in value to type(self)(**self.get_params(deep=False)).

clone_tags(estimator, tag_names=None)[source]#

Clone tags from another estimator as dynamic override.

Parameters:
estimatorestimator inheriting from :class:BaseEstimator
tag_namesstr or list of str, default = None

Names of tags to clone. If None then all tags in estimator are used as tag_names.

Returns:
Self

Reference to self.

Notes

Changes object state by setting tag values in tag_names from estimator as dynamic tags in self.

classmethod create_test_instance(parameter_set='default')[source]#

Construct Estimator instance if possible.

Parameters:
parameter_setstr, default=”default”

Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return “default” set.

Returns:
instanceinstance of the class with default parameters

Notes

get_test_params can return dict or list of dict. This function takes first or single dict that get_test_params returns, and constructs the object with that.

classmethod create_test_instances_and_names(parameter_set='default')[source]#

Create list of all test instances and a list of names for them.

Parameters:
parameter_setstr, default=”default”

Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return “default” set.

Returns:
objslist of instances of cls

i-th instance is cls(**cls.get_test_params()[i])

nameslist of str, same length as objs

i-th element is the name of the i-th instance of obj in tests. The convention is {cls.__name__}-{i} if there is more than one instance, otherwise {cls.__name__}.

classmethod get_class_tag(tag_name, tag_value_default=None)[source]#

Get a class tag’s value.

Does not return information from dynamic tags (set via set_tags or clone_tags) that are defined on instances.

Parameters:
tag_namestr

Name of tag value.

tag_value_defaultany

Default/fallback value if tag is not found.

Returns:
tag_value

Value of the tag_name tag in self. If not found, returns tag_value_default.

classmethod get_class_tags()[source]#

Get class tags from the class and all its parent classes.

Retrieves tag: value pairs from _tags class attribute. Does not return information from dynamic tags (set via set_tags or clone_tags) that are defined on instances.

Returns:
collected_tagsdict

Dictionary of class tag name: tag value pairs. Collected from _tags class attribute via nested inheritance.

get_config()[source]#

Get config flags for self.

Returns:
config_dictdict

Dictionary of config name : config value pairs. Collected from _config class attribute via nested inheritance and then any overrides and new configs from _config_dynamic object attribute.

classmethod get_param_defaults()[source]#

Get object’s parameter defaults.

Returns:
default_dict: dict[str, Any]

Keys are all parameters of cls that have a default defined in __init__; values are the defaults, as defined in __init__.

classmethod get_param_names()[source]#

Get object’s parameter names.

Returns:
param_names: list[str]

Alphabetically sorted list of parameter names of cls.

get_params(deep=True)[source]#

Get a dict of parameters values for this object.

Parameters:
deepbool, default=True

Whether to return parameters of components.

  • If True, will return a dict of parameter name : value for this object, including parameters of components (= BaseObject-valued parameters).

  • If False, will return a dict of parameter name : value for this object, but not include parameters of components.

Returns:
paramsdict with str-valued keys

Dictionary of parameters, as paramname : paramvalue key-value pairs, which include:

  • always: all parameters of this object, as via get_param_names. Values are the parameter values for that key, of this object, and are always identical to the values passed at construction.

  • if deep=True, also contains key/value pairs of component parameters. Parameters of components are indexed as [componentname]__[paramname]; all parameters of componentname appear as paramname with its value.

  • if deep=True, also contains arbitrary levels of component recursion, e.g., [componentname]__[componentcomponentname]__[paramname], etc.
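The double-underscore key convention can be sketched with a toy class. `ToyObject` below is hypothetical and is not the scikit-base implementation; it only mimics the naming scheme for nested components, with the component name fixed to `component` for brevity.

```python
class ToyObject:
    """Hypothetical stand-in for a parameterised object; not skpro's BaseObject."""

    def __init__(self, scale=1.0, component=None):
        self.scale = scale
        self.component = component

    def get_params(self, deep=True):
        params = {"scale": self.scale, "component": self.component}
        if deep and isinstance(self.component, ToyObject):
            # Prefix each component parameter with "<componentname>__".
            for key, value in self.component.get_params(deep=True).items():
                params[f"component__{key}"] = value
        return params


inner = ToyObject(scale=3.0)
outer = ToyObject(scale=2.0, component=ToyObject(scale=5.0, component=inner))

shallow_keys = sorted(outer.get_params(deep=False))
deep_params = outer.get_params(deep=True)
```

With deep=False only the object's own parameters appear; with deep=True the nested `scale` values are reachable under `component__scale` and `component__component__scale`.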

get_tag(tag_name, tag_value_default=None, raise_error=True)[source]#

Get tag value from estimator class and dynamic tag overrides.

Parameters:
tag_namestr

Name of tag to be retrieved

tag_value_defaultany type, optional; default=None

Default/fallback value if tag is not found

raise_errorbool

whether a ValueError is raised when the tag is not found

Returns:
tag_valueAny

Value of the tag_name tag in self. If not found, raises an error if raise_error is True, otherwise returns tag_value_default.

Raises:
ValueError if raise_error is True and tag_name is not in self.get_tags().keys()
get_tags()[source]#

Get tags from estimator class and dynamic tag overrides.

Returns:
collected_tagsdict

Dictionary of tag name : tag value pairs. Collected from _tags class attribute via nested inheritance and then any overrides and new tags from _tags_dynamic object attribute.
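The collection order (class tags through the inheritance chain, then dynamic overrides on the instance) can be sketched with toy classes. Everything below is hypothetical: the classes, the tag names `object_type` and `can_sample`, and the collection logic are an illustration, not the scikit-base source.

```python
class ToyBase:
    """Hypothetical sketch of tag collection; not the scikit-base implementation."""

    _tags = {"object_type": "object", "can_sample": False}

    def __init__(self):
        self._tags_dynamic = {}

    def set_tags(self, **tag_dict):
        self._tags_dynamic.update(tag_dict)
        return self

    @classmethod
    def get_class_tags(cls):
        collected = {}
        # Walk the MRO from the most general class to the most specific one,
        # so that child classes override the tags of their parents.
        for klass in reversed(cls.__mro__):
            collected.update(vars(klass).get("_tags", {}))
        return collected

    def get_tags(self):
        # Class tags first, then instance-level dynamic overrides.
        return {**self.get_class_tags(), **self._tags_dynamic}


class ToyDistribution(ToyBase):
    _tags = {"object_type": "distribution"}


dist = ToyDistribution().set_tags(can_sample=True)
class_tags = ToyDistribution.get_class_tags()
instance_tags = dist.get_tags()
```

The class-level view is unaffected by `set_tags`, while the instance-level view reflects the override.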

classmethod get_test_params(parameter_set='default')[source]#

Return testing parameter settings for the estimator.

Parameters:
parameter_setstr, default=”default”

Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return “default” set.

Returns:
paramsdict or list of dict, default = {}

Parameters to create testing instances of the class. Each dict contains parameters to construct an “interesting” test instance, i.e., MyClass(**params) or MyClass(**params[i]) creates a valid test instance. create_test_instance uses the first (or only) dictionary in params.

is_composite()[source]#

Check if the object is composed of other BaseObjects.

A composite object is an object which contains objects, as parameters. Called on an instance, since this may differ by instance.

Returns:
composite: bool

Whether an object has any parameters whose values are BaseObjects.

property name[source]#

Return the name of the object or estimator.

reset()[source]#

Reset the object to a clean post-init state.

Calling reset runs __init__ with the current values of hyper-parameters (result of get_params). This removes any object attributes, except:

  • hyper-parameters = arguments of __init__

  • object attributes containing double-underscores, i.e., the string “__”

Class and object methods, and class attributes are also unaffected.

Returns:
self

Instance of class reset to a clean post-init state but retaining the current hyper-parameter values.

Notes

Equivalent to sklearn.clone but overwrites self. After a self.reset() call, self is equal in value to type(self)(**self.get_params(deep=False)).
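The reset contract described above can be sketched with a toy class. `ToyEstimator` is hypothetical and only mimics the documented behaviour (keep hyper-parameters and attributes containing a double underscore, drop everything else); it is not the scikit-base implementation.

```python
class ToyEstimator:
    """Hypothetical sketch of the reset contract; not the scikit-base implementation."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def get_params(self, deep=True):
        return {"alpha": self.alpha}

    def reset(self):
        params = self.get_params(deep=False)
        # Drop every instance attribute except those containing a double underscore.
        for name in list(vars(self)):
            if "__" not in name:
                delattr(self, name)
        # Re-run __init__ with the current hyper-parameter values.
        self.__init__(**params)
        return self


est = ToyEstimator(alpha=0.3)
est.fitted_value_ = 123   # state added after construction
est.cache__keep = "kept"  # contains "__", so it survives the reset
result = est.reset()
```

After the call, `est` keeps `alpha` and `cache__keep` but no longer has `fitted_value_`, and `reset` returns the same object rather than a copy.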

set_config(**config_dict)[source]#

Set config flags to given values.

Parameters:
config_dictdict

Dictionary of config name : config value pairs.

Returns:
selfreference to self.

Notes

Changes object state, copies configs in config_dict to self._config_dynamic.

set_params(**params)[source]#

Set the parameters of this object.

The method works on simple estimators as well as on composite objects. Parameter key strings <component>__<parameter> can be used for composites, i.e., objects that contain other objects, to access <parameter> in the component <component>. The string <parameter>, without <component>__, can also be used if this makes the reference unambiguous, e.g., there are no two parameters of components with the name <parameter>.

Parameters:
**paramsdict

BaseObject parameters, keys must be <component>__<parameter> strings. __ suffixes can alias full strings, if unique among get_params keys.

Returns:
selfreference to self (after parameters have been set)
set_random_state(random_state=None, deep=True, self_policy='copy')[source]#

Set random_state pseudo-random seed parameters for self.

Finds random_state named parameters via estimator.get_params, and sets them to integers derived from random_state via set_params. These integers are sampled from chain hashing via sample_dependent_seed, and guarantee pseudo-random independence of seeded random generators.

Applies to random_state parameters in estimator depending on self_policy, and remaining component estimators if and only if deep=True.

Note: calls set_params even if self does not have a random_state, or none of the components have a random_state parameter. Therefore, set_random_state will reset any scikit-base estimator, even those without a random_state parameter.

Parameters:
random_stateint, RandomState instance or None, default=None

Pseudo-random number generator to control the generation of the random integers. Pass int for reproducible output across multiple function calls.

deepbool, default=True

Whether to set the random state in sub-estimators. If False, will set only self’s random_state parameter, if exists. If True, will set random_state parameters in sub-estimators as well.

self_policystr, one of {“copy”, “keep”, “new”}, default=”copy”
  • “copy” : estimator.random_state is set to input random_state

  • “keep” : estimator.random_state is kept as is

  • “new” : estimator.random_state is set to a new random state, derived from input random_state, and in general different from it

Returns:
selfreference to self
set_tags(**tag_dict)[source]#

Set dynamic tags to given values.

Parameters:
**tag_dictdict

Dictionary of tag name: tag value pairs.

Returns:
Self

Reference to self.

Notes

Changes object state by setting tag values in tag_dict as dynamic tags in self.