bayesml.multivariate_normal package#

Module contents#

The multivariate normal distribution with the normal-Wishart prior distribution.

The stochastic data generative model is as follows:

  • \(D \in \mathbb{N}\): the dimension of the data

  • \(\boldsymbol{x} \in \mathbb{R}^D\): a data point

  • \(\boldsymbol{\mu} \in \mathbb{R}^D\): a parameter

  • \(\boldsymbol{\Lambda} \in \mathbb{R}^{D\times D}\) : a parameter (a positive definite matrix)

  • \(| \boldsymbol{\Lambda} | \in \mathbb{R}\): the determinant of \(\boldsymbol{\Lambda}\)

\[\begin{split}p(\boldsymbol{x} | \boldsymbol{\mu}, \boldsymbol{\Lambda}) &= \mathcal{N}(\boldsymbol{x}|\boldsymbol{\mu},\boldsymbol{\Lambda}^{-1}) \\ &= \frac{| \boldsymbol{\Lambda} |^{1/2}}{(2\pi)^{D/2}} \exp \left\{ -\frac{1}{2}(\boldsymbol{x}-\boldsymbol{\mu})^\top \boldsymbol{\Lambda} (\boldsymbol{x}-\boldsymbol{\mu}) \right\},\end{split}\]
\[\begin{split}\mathbb{E} [\boldsymbol{x} | \boldsymbol{\mu}, \boldsymbol{\Lambda}] &= \boldsymbol{\mu}, \\ \mathrm{Cov} [\boldsymbol{x} | \boldsymbol{\mu}, \boldsymbol{\Lambda}] &= \boldsymbol{\Lambda}^{-1}.\end{split}\]
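The density above is parameterized by the precision matrix, so its logarithm can be evaluated without inverting anything. The following is a minimal NumPy sketch of that formula, not BayesML's own implementation; the function name mvn_logpdf_precision is hypothetical.

```python
import numpy as np


def mvn_logpdf_precision(x, mu, lam):
    """Log-density of N(x | mu, lam^{-1}), where lam is the precision matrix."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    lam = np.asarray(lam, dtype=float)
    d = mu.shape[0]
    diff = x - mu
    # slogdet is numerically safer than log(det(lam)) for large D.
    sign, logdet = np.linalg.slogdet(lam)
    if sign <= 0:
        raise ValueError("lam must be positive definite")
    quad = diff @ lam @ diff  # (x - mu)^T lam (x - mu)
    return 0.5 * logdet - 0.5 * d * np.log(2.0 * np.pi) - 0.5 * quad


# At x = mu with the identity precision and D = 2, the density is 1 / (2 * pi).
value = np.exp(mvn_logpdf_precision([0.0, 0.0], [0.0, 0.0], np.eye(2)))
```

Working in log space keeps the computation stable when the quadratic form is large.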

The prior distribution is as follows:

  • \(\boldsymbol{m}_0 \in \mathbb{R}^{D}\): a hyperparameter

  • \(\kappa_0 \in \mathbb{R}_{>0}\): a hyperparameter

  • \(\nu_0 \in \mathbb{R}\): a hyperparameter (\(\nu_0 > D-1\))

  • \(\boldsymbol{W}_0 \in \mathbb{R}^{D\times D}\): a hyperparameter (a positive definite matrix)

  • \(\mathrm{Tr} \{ \cdot \}\): a trace of a matrix

  • \(\Gamma (\cdot)\): the gamma function

\[\begin{split}p(\boldsymbol{\mu},\boldsymbol{\Lambda}) &= \mathcal{N}(\boldsymbol{\mu}|\boldsymbol{m}_0,(\kappa_0 \boldsymbol{\Lambda})^{-1})\mathcal{W}(\boldsymbol{\Lambda}|\boldsymbol{W}_0, \nu_0) \\ &= \left( \frac{\kappa_0}{2\pi} \right)^{D/2} |\boldsymbol{\Lambda}|^{1/2} \exp \left\{ -\frac{\kappa_0}{2}(\boldsymbol{\mu}-\boldsymbol{m}_0)^\top \boldsymbol{\Lambda} (\boldsymbol{\mu}-\boldsymbol{m}_0) \right\} \\ &\qquad \times B(\boldsymbol{W}_0, \nu_0) | \boldsymbol{\Lambda} |^{(\nu_0 - D - 1) / 2} \exp \left\{ -\frac{1}{2} \mathrm{Tr} \{ \boldsymbol{W}_0^{-1} \boldsymbol{\Lambda} \} \right\},\\\end{split}\]
\[\begin{split}\mathbb{E}[\boldsymbol{\mu}] &= \boldsymbol{m}_0 & (\nu_0 > D), \\ \mathrm{Cov}[\boldsymbol{\mu}] &= \frac{1}{\kappa_0 (\nu_0 - D - 1)} \boldsymbol{W}_0^{-1} & (\nu_0 > D + 1), \\ \mathbb{E}[\boldsymbol{\Lambda}] &= \nu_0 \boldsymbol{W}_0,\end{split}\]

where \(B(\boldsymbol{W}_0, \nu_0)\) is defined as follows:

\[B(\boldsymbol{W}_0, \nu_0) = | \boldsymbol{W}_0 |^{-\nu_0 / 2} \left( 2^{\nu_0 D / 2} \pi^{D(D-1)/4} \prod_{i=1}^D \Gamma \left( \frac{\nu_0 + 1 - i}{2} \right) \right)^{-1}.\]
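The normalizing constant \(B\) contains a product of gamma functions that overflows quickly, so in practice it is usually evaluated on the log scale. The sketch below computes \(\log B(\boldsymbol{W}, \nu)\) term by term from the definition above; the function name log_wishart_b is hypothetical and is not part of BayesML.

```python
import math

import numpy as np


def log_wishart_b(w_mat, nu):
    """log B(W, nu) for the Wishart normalizing constant defined above."""
    w_mat = np.asarray(w_mat, dtype=float)
    d = w_mat.shape[0]
    if nu <= d - 1:
        raise ValueError("nu must be greater than D - 1")
    sign, logdet = np.linalg.slogdet(w_mat)
    if sign <= 0:
        raise ValueError("w_mat must be positive definite")
    # Sum of log Gamma((nu + 1 - i) / 2) for i = 1, ..., D.
    log_gammas = sum(math.lgamma((nu + 1 - i) / 2.0) for i in range(1, d + 1))
    return (
        -0.5 * nu * logdet
        - 0.5 * nu * d * math.log(2.0)
        - 0.25 * d * (d - 1) * math.log(math.pi)
        - log_gammas
    )


# For D = 1, W = 1, nu = 2 the Wishart reduces to a chi-squared distribution
# with 2 degrees of freedom, whose normalizing constant is 1/2.
log_b = log_wishart_b(np.eye(1), 2.0)
```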

The posterior distribution is as follows:

  • \(\boldsymbol{x}^n = (\boldsymbol{x}_1, \boldsymbol{x}_2, \dots , \boldsymbol{x}_n) \in \mathbb{R}^{D\times n}\): given data

  • \(\boldsymbol{m}_n \in \mathbb{R}^{D}\): a hyperparameter

  • \(\kappa_n \in \mathbb{R}_{>0}\): a hyperparameter

  • \(\nu_n \in \mathbb{R}\): a hyperparameter \((\nu_n > D-1)\)

  • \(\boldsymbol{W}_n \in \mathbb{R}^{D\times D}\): a hyperparameter (a positive definite matrix)

\[\begin{split}p(\boldsymbol{\mu},\boldsymbol{\Lambda} | \boldsymbol{x}^n) &= \mathcal{N}(\boldsymbol{\mu}|\boldsymbol{m}_n,(\kappa_n \boldsymbol{\Lambda})^{-1})\mathcal{W}(\boldsymbol{\Lambda}|\boldsymbol{W}_n, \nu_n) \\ &= \left( \frac{\kappa_n}{2\pi} \right)^{D/2} |\boldsymbol{\Lambda}|^{1/2} \exp \left\{ -\frac{\kappa_n}{2}(\boldsymbol{\mu}-\boldsymbol{m}_n)^\top \boldsymbol{\Lambda} (\boldsymbol{\mu}-\boldsymbol{m}_n) \right\} \\ &\qquad \times B(\boldsymbol{W}_n, \nu_n) | \boldsymbol{\Lambda} |^{(\nu_n - D - 1) / 2} \exp \left\{ -\frac{1}{2} \mathrm{Tr} \{ \boldsymbol{W}_n^{-1} \boldsymbol{\Lambda} \} \right\},\end{split}\]
\[\begin{split}\mathbb{E}[\boldsymbol{\mu} | \boldsymbol{x}^n] &= \boldsymbol{m}_n & (\nu_n > D), \\ \mathrm{Cov}[\boldsymbol{\mu} | \boldsymbol{x}^n] &= \frac{1}{\kappa_n (\nu_n - D - 1)} \boldsymbol{W}_n^{-1} & (\nu_n > D + 1), \\ \mathbb{E}[\boldsymbol{\Lambda} | \boldsymbol{x}^n] &= \nu_n \boldsymbol{W}_n,\end{split}\]

where the updating rule of the hyperparameters is

\[\begin{split}\bar{\boldsymbol{x}} &= \frac{1}{n} \sum_{i=1}^n \boldsymbol{x}_i, \\ \boldsymbol{m}_n &= \frac{\kappa_0\boldsymbol{m}_0+n\bar{\boldsymbol{x}}}{\kappa_0+n}, \\ \kappa_n &= \kappa_0 + n, \\ \boldsymbol{W}_n^{-1} &= \boldsymbol{W}_0^{-1} + \sum_{i=1}^{n}(\boldsymbol{x}_i-\bar{\boldsymbol{x}})(\boldsymbol{x}_i-\bar{\boldsymbol{x}})^\top + \frac{\kappa_0 n}{\kappa_0+n}(\bar{\boldsymbol{x}}-\boldsymbol{m}_0)(\bar{\boldsymbol{x}}-\boldsymbol{m}_0)^\top, \\ \nu_n &= \nu_0 + n.\end{split}\]
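The updating rule can be written out in a few lines of NumPy. This is a sketch of the formulas above rather than the library's code: the function name update_normal_wishart is hypothetical, and the data are assumed to be stored with one data point per row, i.e. with shape (n, D), which is the transpose of the \(D \times n\) notation used in the text.

```python
import numpy as np


def update_normal_wishart(x, m0, kappa0, nu0, w0):
    """Return (m_n, kappa_n, nu_n, W_n) given data x of shape (n, D)."""
    x = np.asarray(x, dtype=float)
    m0 = np.asarray(m0, dtype=float)
    n = x.shape[0]
    x_bar = x.mean(axis=0)
    centered = x - x_bar
    scatter = centered.T @ centered      # sum_i (x_i - x_bar)(x_i - x_bar)^T
    diff = x_bar - m0
    kappa_n = kappa0 + n
    nu_n = nu0 + n
    m_n = (kappa0 * m0 + n * x_bar) / kappa_n
    w_n_inv = (
        np.linalg.inv(w0)
        + scatter
        + (kappa0 * n / kappa_n) * np.outer(diff, diff)
    )
    return m_n, kappa_n, nu_n, np.linalg.inv(w_n_inv)


# Two points symmetric about the prior mean: x_bar = m_0, so only the
# scatter term changes W^{-1}, giving W_n^{-1} = diag(3, 1).
x = np.array([[1.0, 0.0], [-1.0, 0.0]])
m_n, kappa_n, nu_n, w_n = update_normal_wishart(x, np.zeros(2), 1.0, 2.0, np.eye(2))
```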

The predictive distribution is as follows:

  • \(\boldsymbol{x}_{n+1} \in \mathbb{R}^D\): a new data point

  • \(\boldsymbol{\mu}_\mathrm{p} \in \mathbb{R}^D\): the hyperparameter of the predictive distribution

  • \(\boldsymbol{\Lambda}_\mathrm{p} \in \mathbb{R}^{D \times D}\): the hyperparameter of the predictive distribution (a positive definite matrix)

  • \(\nu_\mathrm{p} \in \mathbb{R}_{>0}\): the hyperparameter of the predictive distribution

\[\begin{split}&p(\boldsymbol{x}_{n+1}|\boldsymbol{x}^n) \\ &= \mathrm{St}(\boldsymbol{x}_{n+1}|\boldsymbol{\mu}_\mathrm{p},\boldsymbol{\Lambda}_\mathrm{p}, \nu_\mathrm{p}) \\ &= \frac{\Gamma (\nu_\mathrm{p} / 2 + D / 2)}{\Gamma (\nu_\mathrm{p} / 2)} \frac{|\boldsymbol{\Lambda}_\mathrm{p}|^{1/2}}{(\nu_\mathrm{p} \pi)^{D/2}} \left( 1 + \frac{1}{\nu_\mathrm{p}} (\boldsymbol{x}_{n+1} - \boldsymbol{\mu}_\mathrm{p})^\top \boldsymbol{\Lambda}_\mathrm{p} (\boldsymbol{x}_{n+1} - \boldsymbol{\mu}_\mathrm{p}) \right)^{-\nu_\mathrm{p}/2 - D/2},\end{split}\]
\[\begin{split}\mathbb{E}[\boldsymbol{x}_{n+1} | \boldsymbol{x}^n] &= \boldsymbol{\mu}_\mathrm{p} & (\nu_\mathrm{p} > 1), \\ \mathrm{Cov}[\boldsymbol{x}_{n+1} | \boldsymbol{x}^n] &= \frac{\nu_\mathrm{p}}{\nu_\mathrm{p}-2} \boldsymbol{\Lambda}_\mathrm{p}^{-1} & (\nu_\mathrm{p} > 2),\end{split}\]

where the parameters are obtained from the hyperparameters of the posterior distribution as follows:

\[\begin{split}\boldsymbol{\mu}_\mathrm{p} &= \boldsymbol{m}_n, \\ \boldsymbol{\Lambda}_\mathrm{p} &= \frac{\kappa_n (\nu_n - D + 1)}{\kappa_n + 1} \boldsymbol{W}_n, \\ \nu_\mathrm{p} &= \nu_n - D + 1.\end{split}\]
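The mapping from posterior hyperparameters to predictive parameters is short enough to sketch directly. The function names predictive_params and predictive_cov below are hypothetical; the code only restates the formulas above, including the condition \(\nu_\mathrm{p} > 2\) for the covariance to exist.

```python
import numpy as np


def predictive_params(m_n, kappa_n, nu_n, w_n):
    """Return (mu_p, Lambda_p, nu_p) of the Student's t predictive distribution."""
    m_n = np.asarray(m_n, dtype=float)
    w_n = np.asarray(w_n, dtype=float)
    d = m_n.shape[0]
    nu_p = nu_n - d + 1
    lambda_p = (kappa_n * nu_p / (kappa_n + 1.0)) * w_n
    return m_n.copy(), lambda_p, nu_p


def predictive_cov(lambda_p, nu_p):
    """Covariance of the predictive distribution, or None if nu_p <= 2."""
    if nu_p <= 2:
        return None
    return nu_p / (nu_p - 2.0) * np.linalg.inv(lambda_p)


# With D = 2, kappa_n = 3, nu_n = 4 and W_n = diag(1/3, 1):
# nu_p = 3 and Lambda_p = (3 * 3 / 4) * W_n = diag(0.75, 2.25).
mu_p, lambda_p, nu_p = predictive_params(np.zeros(2), 3.0, 4.0, np.diag([1.0 / 3.0, 1.0]))
cov_p = predictive_cov(lambda_p, nu_p)
```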
class bayesml.multivariate_normal.GenModel(c_degree, mu_vec=None, lambda_mat=None, h_m_vec=None, h_kappa=1.0, h_nu=None, h_w_mat=None, seed=None)#

Bases: Generative

The stochastic data generative model and the prior distribution

Parameters:
c_degreeint

a positive integer.

mu_vecnumpy.ndarray, optional

a vector of real numbers, by default [0.0, 0.0, … , 0.0]

lambda_matnumpy.ndarray, optional

a positive definite symmetric matrix, by default the identity matrix

h_m_vecnumpy.ndarray, optional

a vector of real numbers, by default [0.0, 0.0, … , 0.0]

h_kappafloat, optional

a positive real number, by default 1.0

h_nufloat, optional

a real number > c_degree-1, by default the value of c_degree

h_w_matnumpy.ndarray, optional

a positive definite symmetric matrix, by default the identity matrix

seed{None, int}, optional

A seed to initialize numpy.random.default_rng(), by default None

Methods

gen_params()

Generate the parameter from the prior distribution.

gen_sample(sample_size)

Generate a sample from the stochastic data generative model.

get_constants()

Get constants of GenModel.

get_h_params()

Get the hyperparameters of the prior distribution.

get_params()

Get the parameter of the stochastic data generative model.

load_h_params(filename)

Load the hyperparameters to h_params.

load_params(filename)

Load the parameters saved by save_params.

save_h_params(filename)

Save the hyperparameters using Python's pickle module.

save_params(filename)

Save the parameters using Python's pickle module.

save_sample(filename, sample_size)

Save the generated sample in NumPy .npz format.

set_h_params([h_m_vec, h_kappa, h_nu, h_w_mat])

Set the hyperparameters of the prior distribution.

set_params([mu_vec, lambda_mat])

Set the parameter of the stochastic data generative model.

visualize_model([sample_size])

Visualize the stochastic data generative model and generated samples.

get_constants()#

Get constants of GenModel.

Returns:
constantsdict of {str: int}
  • "c_degree" : the value of self.c_degree

set_h_params(h_m_vec=None, h_kappa=None, h_nu=None, h_w_mat=None)#

Set the hyperparameters of the prior distribution.

Parameters:
h_m_vecnumpy.ndarray, optional

a vector of real numbers, by default None

h_kappafloat, optional

a positive real number, by default None

h_nufloat, optional

a real number > c_degree-1, by default None

h_w_matnumpy.ndarray, optional

a positive definite symmetric matrix, by default None

get_h_params()#

Get the hyperparameters of the prior distribution.

Returns:
h_paramsdict of {str: float, numpy.ndarray}
  • "h_m_vec" : The value of self.h_m_vec

  • "h_kappa" : The value of self.h_kappa

  • "h_nu" : The value of self.h_nu

  • "h_w_mat" : The value of self.h_w_mat

gen_params()#

Generate the parameter from the prior distribution.

The generated values are set to self.mu_vec and self.lambda_mat.

set_params(mu_vec=None, lambda_mat=None)#

Set the parameter of the stochastic data generative model.

Parameters:
mu_vecnumpy.ndarray, optional

a vector of real numbers, by default None

lambda_matnumpy.ndarray, optional

a positive definite symmetric matrix, by default None

get_params()#

Get the parameter of the stochastic data generative model.

Returns:
paramsdict of {str: numpy.ndarray}
  • "mu_vec" : The value of self.mu_vec

  • "lambda_mat" : The value of self.lambda_mat

gen_sample(sample_size)#

Generate a sample from the stochastic data generative model.

Parameters:
sample_sizeint

A positive integer

Returns:
xnumpy ndarray

A 2-dimensional array of shape (sample_size, c_degree) whose elements are real numbers.
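To illustrate the shape of the returned array, the sketch below draws from \(\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Lambda}^{-1})\) with plain NumPy. It is an assumption about what such a sampler could look like, not BayesML's implementation; the function name sample_mvn_precision is hypothetical.

```python
import numpy as np


def sample_mvn_precision(mu_vec, lambda_mat, sample_size, seed=None):
    """Draw sample_size points from N(mu_vec, lambda_mat^{-1})."""
    rng = np.random.default_rng(seed)
    # NumPy's sampler takes a covariance matrix, so invert the precision.
    cov = np.linalg.inv(lambda_mat)
    return rng.multivariate_normal(mean=mu_vec, cov=cov, size=sample_size)


x = sample_mvn_precision(np.zeros(2), np.eye(2), sample_size=100, seed=0)
# Each row is one data point, so x.shape is (sample_size, c_degree).
```

Passing the same integer seed reproduces the same sample, which is the role the seed parameter plays for numpy.random.default_rng().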

save_sample(filename, sample_size)#

Save the generated sample in NumPy .npz format.

The sample is saved as an NpzFile under the keyword “x”.

Parameters:
filenamestr

The filename to which the sample is saved. .npz will be appended if it isn’t there.

sample_sizeint

A positive integer
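A file in this format can be read back with numpy.load. The round trip below uses numpy.savez as a stand-in for save_sample, on the assumption that the resulting file has the same layout (a single array stored under the key "x"); the array itself is a placeholder for a generated sample.

```python
import os
import tempfile

import numpy as np

# Placeholder for a generated sample of shape (sample_size, c_degree).
x = np.random.default_rng(0).standard_normal((5, 2))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample")  # no extension on purpose
    np.savez(path, x=x)                     # NumPy appends ".npz"
    with np.load(path + ".npz") as data:
        keys = data.files                   # names of the stored arrays
        x_loaded = data["x"]                # read the array into memory
```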

visualize_model(sample_size=100)#

Visualize the stochastic data generative model and generated samples.

Parameters:
sample_sizeint, optional

A positive integer, by default 100

Examples

>>> from bayesml import multivariate_normal
>>> model = multivariate_normal.GenModel(c_degree=2)
>>> model.visualize_model()
mu:
[0. 0.]
lambda_mat:
[[1. 0.]
 [0. 1.]]
_images/multivariate_normal_example.png
class bayesml.multivariate_normal.LearnModel(c_degree, h0_m_vec=None, h0_kappa=1.0, h0_nu=None, h0_w_mat=None)#

Bases: Posterior, PredictiveMixin

The posterior distribution and the predictive distribution.

Parameters:
c_degreeint

a positive integer.

h0_m_vecnumpy.ndarray, optional

a vector of real numbers, by default [0.0, 0.0, … , 0.0]

h0_kappafloat, optional

a positive real number, by default 1.0

h0_nufloat, optional

a real number > c_degree-1, by default the value of c_degree

h0_w_matnumpy.ndarray, optional

a positive definite symmetric matrix, by default the identity matrix

Attributes:
h0_w_mat_invnumpy.ndarray

the inverse matrix of h0_w_mat

hn_m_vecnumpy.ndarray

a vector of real numbers

hn_kappafloat

a positive real number

hn_nufloat

a real number

hn_w_matnumpy.ndarray

a positive definite symmetric matrix

hn_w_mat_invnumpy.ndarray

the inverse matrix of hn_w_mat

p_m_vecnumpy.ndarray

a vector of real numbers

p_nufloat, optional

a positive real number

p_v_matnumpy.ndarray

a positive definite symmetric matrix

p_v_mat_invnumpy.ndarray

the inverse matrix of p_v_mat

Methods

calc_pred_dist()

Calculate the parameters of the predictive distribution.

estimate_params([loss, dict_out])

Estimate the parameter of the stochastic data generative model under the given criterion.

get_constants()

Get constants of LearnModel.

get_h0_params()

Get the initial values of the hyperparameters of the posterior distribution.

get_hn_params()

Get the hyperparameters of the posterior distribution.

get_p_params()

Get the parameters of the predictive distribution.

load_h0_params(filename)

Load the hyperparameters to h0_params.

load_hn_params(filename)

Load the hyperparameters to hn_params.

make_prediction([loss])

Predict a new data point under the given criterion.

overwrite_h0_params()

Overwrite the initial values of the hyperparameters of the posterior distribution by the learned values.

pred_and_update(x[, loss])

Predict a new data point and update the posterior sequentially.

reset_hn_params()

Reset the hyperparameters of the posterior distribution to their initial values.

save_h0_params(filename)

Save the hyperparameters using Python's pickle module.

save_hn_params(filename)

Save the hyperparameters using Python's pickle module.

set_h0_params([h0_m_vec, h0_kappa, h0_nu, ...])

Set the hyperparameters of the prior distribution.

set_hn_params([hn_m_vec, hn_kappa, hn_nu, ...])

Set updated values of the hyperparameter of the posterior distribution.

update_posterior(x)

Update the hyperparameters of the posterior distribution using training data.

visualize_posterior()

Visualize the posterior distribution for the parameter.

get_constants()#

Get constants of LearnModel.

Returns:
constantsdict of {str: int}
  • "c_degree" : the value of self.c_degree

set_h0_params(h0_m_vec=None, h0_kappa=None, h0_nu=None, h0_w_mat=None)#

Set the hyperparameters of the prior distribution.

Parameters:
h0_m_vecnumpy.ndarray, optional

a vector of real numbers, by default None

h0_kappafloat, optional

a positive real number, by default None

h0_nufloat, optional

a real number > c_degree-1, by default None

h0_w_matnumpy.ndarray, optional

a positive definite symmetric matrix, by default None

get_h0_params()#

Get the initial values of the hyperparameters of the posterior distribution.

Returns:
h0_paramsdict of {str: float, numpy.ndarray}
  • "h0_m_vec" : The value of self.h0_m_vec

  • "h0_kappa" : The value of self.h0_kappa

  • "h0_nu" : The value of self.h0_nu

  • "h0_w_mat" : The value of self.h0_w_mat

set_hn_params(hn_m_vec=None, hn_kappa=None, hn_nu=None, hn_w_mat=None)#

Set updated values of the hyperparameter of the posterior distribution.

Parameters:
hn_m_vecnumpy.ndarray, optional

a vector of real numbers, by default None

hn_kappafloat, optional

a positive real number, by default None

hn_nufloat, optional

a real number > c_degree-1, by default None

hn_w_matnumpy.ndarray, optional

a positive definite symmetric matrix, by default None

get_hn_params()#

Get the hyperparameters of the posterior distribution.

Returns:
hn_paramsdict of {str: numpy.ndarray}
  • "hn_m_vec" : The value of self.hn_m_vec

  • "hn_kappa" : The value of self.hn_kappa

  • "hn_nu" : The value of self.hn_nu

  • "hn_w_mat" : The value of self.hn_w_mat

update_posterior(x)#

Update the hyperparameters of the posterior distribution using training data.

Parameters:
xnumpy.ndarray

All the elements must be real numbers.

estimate_params(loss='squared', dict_out=False)#

Estimate the parameter of the stochastic data generative model under the given criterion.

Note that the criterion is applied to estimating mu_vec and lambda_mat independently. Therefore, a tuple of the Student’s t-distribution and the Wishart distribution will be returned when loss=“KL”.

Parameters:
lossstr, optional

Loss function underlying the Bayes risk function, by default “squared”. This function supports “squared”, “0-1”, and “KL”.

dict_outbool, optional

If True, output will be a dict, by default False.

Returns:
estimatestuple of {numpy ndarray, float, None, or rv_frozen}
  • mu_vec_hat : the estimate for mu_vec

  • lambda_mat_hat : the estimate for lambda_mat

The estimated values under the given loss function. If an estimate does not exist, None will be returned. If the loss function is “KL”, the posterior distribution itself will be returned as an rv_frozen object of scipy.stats.

visualize_posterior()#

Visualize the posterior distribution for the parameter.

Examples

>>> from bayesml import multivariate_normal
>>> gen_model = multivariate_normal.GenModel(c_degree=2)
>>> x = gen_model.gen_sample(100)
>>> learn_model = multivariate_normal.LearnModel(c_degree=2)
>>> learn_model.update_posterior(x)
>>> learn_model.visualize_posterior()
hn_m_vec:
[-0.06924909  0.08126454]
hn_kappa:
101.0
hn_nu:
102.0
hn_w_mat:
[[ 0.00983415 -0.00059828]
 [-0.00059828  0.00741698]]
E[lambda_mat]=
[[ 1.0030838  -0.06102455]
 [-0.06102455  0.7565315 ]]
_images/multivariate_normal_posterior.png
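The printed values can be checked by hand against the formulas in the module description. With the default h0_kappa = 1.0 and h0_nu = c_degree = 2 and a sample of size 100, the updating rule gives hn_kappa = 101 and hn_nu = 102, and the posterior mean of the precision matrix is hn_nu times hn_w_mat. The short check below only re-uses the numbers shown in the output above.

```python
import numpy as np

# Defaults from the LearnModel signature and the sample size of the example.
h0_kappa, h0_nu, n = 1.0, 2.0, 100
hn_kappa = h0_kappa + n
hn_nu = h0_nu + n

# Values copied from the printed output above.
hn_w_mat = np.array([[0.00983415, -0.00059828],
                     [-0.00059828, 0.00741698]])
printed_e_lambda = np.array([[1.0030838, -0.06102455],
                             [-0.06102455, 0.7565315]])

# E[Lambda | x^n] = nu_n * W_n; agreement is up to the printed precision.
e_lambda = hn_nu * hn_w_mat
matches = np.allclose(e_lambda, printed_e_lambda, atol=1e-5)
```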
get_p_params()#

Get the parameters of the predictive distribution.

Returns:
p_paramsdict of {str: numpy.ndarray}
  • "p_m_vec" : The value of self.p_m_vec

  • "p_nu" : The value of self.p_nu

  • "p_v_mat" : The value of self.p_v_mat

calc_pred_dist()#

Calculate the parameters of the predictive distribution.

make_prediction(loss='squared')#

Predict a new data point under the given criterion.

Parameters:
lossstr, optional

Loss function underlying the Bayes risk function, by default “squared”. This function supports “squared”, “0-1”, and “KL”.

Returns:
Predicted_value{float, numpy.ndarray}

The predicted value under the given loss function. If the loss function is “KL”, the predictive distribution itself will be returned as an rv_frozen object of scipy.stats.

pred_and_update(x, loss='squared')#

Predict a new data point and update the posterior sequentially.

Parameters:
xnumpy.ndarray

It must be a c_degree-dimensional vector

lossstr, optional

Loss function underlying the Bayes risk function, by default “squared”. This function supports “squared”, “0-1”, and “KL”.

Returns:
Predicted_value{float, numpy.ndarray}

The predicted value under the given loss function.
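The predict-then-update loop can be sketched without the library by applying the updating rule one data point at a time. The code below is an illustration under stated assumptions, not the behavior of pred_and_update itself: the function name conjugate_update is hypothetical, the prediction under squared loss is taken to be the predictive mean \(\boldsymbol{\mu}_\mathrm{p} = \boldsymbol{m}_n\), and h0_nu is set to 3 so that this mean exists from the first step (\(\nu_\mathrm{p} > 1\)). Because the prior is conjugate, the sequential updates end at the same hyperparameters as a single batch update.

```python
import numpy as np


def conjugate_update(x, m, kappa, nu, w_inv):
    """One normal-Wishart update; x has shape (n, D) or (D,).

    The state carries W^{-1} instead of W to avoid repeated inversions.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    n = x.shape[0]
    x_bar = x.mean(axis=0)
    centered = x - x_bar
    diff = x_bar - m
    kappa_new = kappa + n
    m_new = (kappa * m + n * x_bar) / kappa_new
    w_inv_new = (
        w_inv
        + centered.T @ centered
        + (kappa * n / kappa_new) * np.outer(diff, diff)
    )
    return m_new, kappa_new, nu + n, w_inv_new


rng = np.random.default_rng(0)
data = rng.standard_normal((20, 2))
prior = (np.zeros(2), 1.0, 3.0, np.eye(2))  # (m_0, kappa_0, nu_0, W_0^{-1})

state = prior
predictions = []
for x_t in data:
    predictions.append(state[0].copy())   # predict before seeing x_t
    state = conjugate_update(x_t, *state)  # then update with x_t

# A single batch update over all 20 points, for comparison.
batch = conjugate_update(data, *prior)
```

Comparing state with batch element by element (for example with numpy.allclose) confirms that the two routes agree up to floating-point error.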