bayesml.gaussianmixture package#

[figure: _images/gaussianmixture_example.png]

Module contents#

The Gaussian mixture model with the Gauss-Wishart prior distribution and the Dirichlet prior distribution.

The stochastic data generative model is as follows:

  • \(K \in \mathbb{N}\): number of latent classes

  • \(\boldsymbol{z} \in \{ 0, 1 \}^K\): a one-hot vector representing the latent class (latent variable)

  • \(\boldsymbol{\pi} \in [0, 1]^K\): a parameter for latent classes (\(\sum_{k=1}^K \pi_k=1\))

  • \(D \in \mathbb{N}\): the dimension of data

  • \(\boldsymbol{x} \in \mathbb{R}^D\): a data point

  • \(\boldsymbol{\mu}_k \in \mathbb{R}^D\): a parameter

  • \(\boldsymbol{\mu} = \{ \boldsymbol{\mu}_k \}_{k=1}^K\)

  • \(\boldsymbol{\Lambda}_k \in \mathbb{R}^{D\times D}\) : a parameter (a positive definite matrix)

  • \(\boldsymbol{\Lambda} = \{ \boldsymbol{\Lambda}_k \}_{k=1}^K\)

  • \(| \boldsymbol{\Lambda}_k | \in \mathbb{R}\): the determinant of \(\boldsymbol{\Lambda}_k\)

\[\begin{split}p(\boldsymbol{z} | \boldsymbol{\pi}) &= \mathrm{Cat}(\boldsymbol{z}|\boldsymbol{\pi}) = \prod_{k=1}^K \pi_k^{z_k},\\ p(\boldsymbol{x} | \boldsymbol{\mu}, \boldsymbol{\Lambda}, \boldsymbol{z}) &= \prod_{k=1}^K \mathcal{N}(\boldsymbol{x}|\boldsymbol{\mu}_k,\boldsymbol{\Lambda}_k^{-1})^{z_k} \\ &= \prod_{k=1}^K \left( \frac{| \boldsymbol{\Lambda}_k |^{1/2}}{(2\pi)^{D/2}} \exp \left\{ -\frac{1}{2}(\boldsymbol{x}-\boldsymbol{\mu}_k)^\top \boldsymbol{\Lambda}_k (\boldsymbol{x}-\boldsymbol{\mu}_k) \right\} \right)^{z_k}.\end{split}\]
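
As a concrete illustration, the following is a minimal NumPy sketch of this generative process. The values of K, D, n and the parameters below are arbitrary choices for the example, not defaults of the package:

import numpy as np

rng = np.random.default_rng(0)
K, D, n = 3, 2, 500                          # classes, dimension, sample size
pi_vec = np.array([0.5, 0.3, 0.2])           # mixing proportions (sum to 1)
mu_vecs = np.array([[-3., 0.], [0., 3.], [3., 0.]])  # class means
lambda_mats = np.tile(np.eye(D), (K, 1, 1))  # class precision matrices

# z_i ~ Cat(pi); x_i | z_i = k ~ N(mu_k, Lambda_k^{-1})
z = rng.choice(K, size=n, p=pi_vec)
x = np.stack([rng.multivariate_normal(mu_vecs[k], np.linalg.inv(lambda_mats[k]))
              for k in z])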

The prior distribution is as follows:

  • \(\boldsymbol{m}_0 \in \mathbb{R}^{D}\): a hyperparameter

  • \(\kappa_0 \in \mathbb{R}_{>0}\): a hyperparameter

  • \(\nu_0 \in \mathbb{R}\): a hyperparameter (\(\nu_0 > D-1\))

  • \(\boldsymbol{W}_0 \in \mathbb{R}^{D\times D}\): a hyperparameter (a positive definite matrix)

  • \(\boldsymbol{\alpha}_0 \in \mathbb{R}_{> 0}^K\): a hyperparameter

  • \(\mathrm{Tr} \{ \cdot \}\): a trace of a matrix

  • \(\Gamma (\cdot)\): the gamma function

\[\begin{split}p(\boldsymbol{\mu},\boldsymbol{\Lambda},\boldsymbol{\pi}) &= \left\{ \prod_{k=1}^K \mathcal{N}(\boldsymbol{\mu}_k|\boldsymbol{m}_0,(\kappa_0 \boldsymbol{\Lambda}_k)^{-1})\mathcal{W}(\boldsymbol{\Lambda}_k|\boldsymbol{W}_0, \nu_0) \right\} \mathrm{Dir}(\boldsymbol{\pi}|\boldsymbol{\alpha}_0) \\ &= \Biggl[\, \prod_{k=1}^K \left( \frac{\kappa_0}{2\pi} \right)^{D/2} |\boldsymbol{\Lambda}_k|^{1/2} \exp \left\{ -\frac{\kappa_0}{2}(\boldsymbol{\mu}_k -\boldsymbol{m}_0)^\top \boldsymbol{\Lambda}_k (\boldsymbol{\mu}_k - \boldsymbol{m}_0) \right\} \notag \\ &\qquad \times B(\boldsymbol{W}_0, \nu_0) | \boldsymbol{\Lambda}_k |^{(\nu_0 - D - 1) / 2} \exp \left\{ -\frac{1}{2} \mathrm{Tr} \{ \boldsymbol{W}_0^{-1} \boldsymbol{\Lambda}_k \} \right\} \Biggr] \notag \\ &\qquad \times C(\boldsymbol{\alpha}_0)\prod_{k=1}^K \pi_k^{\alpha_{0,k}-1},\\\end{split}\]

where \(B(\boldsymbol{W}_0, \nu_0)\) and \(C(\boldsymbol{\alpha}_0)\) are defined as follows:

\[\begin{split}B(\boldsymbol{W}_0, \nu_0) &= | \boldsymbol{W}_0 |^{-\nu_0 / 2} \left( 2^{\nu_0 D / 2} \pi^{D(D-1)/4} \prod_{i=1}^D \Gamma \left( \frac{\nu_0 + 1 - i}{2} \right) \right)^{-1}, \\ C(\boldsymbol{\alpha}_0) &= \frac{\Gamma(\sum_{k=1}^K \alpha_{0,k})}{\Gamma(\alpha_{0,1})\cdots\Gamma(\alpha_{0,K})}.\end{split}\]
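
A minimal sketch of drawing \((\boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Lambda})\) from this prior with scipy.stats; the hyperparameter values are illustrative, not package defaults:

import numpy as np
from scipy.stats import dirichlet, wishart

rng = np.random.default_rng(0)
K, D = 3, 2
alpha_0 = np.full(K, 0.5)   # Dirichlet concentration
m_0 = np.zeros(D)           # prior mean of mu_k
kappa_0 = 1.0               # precision scale of mu_k given Lambda_k
nu_0 = float(D)             # Wishart degrees of freedom (> D - 1)
w_0 = np.eye(D)             # Wishart scale matrix

pi_vec = dirichlet.rvs(alpha_0, random_state=rng)[0]
lambda_mats = wishart.rvs(df=nu_0, scale=w_0, size=K, random_state=rng)
mu_vecs = np.stack([rng.multivariate_normal(m_0, np.linalg.inv(kappa_0 * lam))
                    for lam in lambda_mats])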

The approximate posterior distribution in the \(t\)-th iteration of a variational Bayesian method is as follows:

  • \(\boldsymbol{x}^n = (\boldsymbol{x}_1, \boldsymbol{x}_2, \dots , \boldsymbol{x}_n) \in \mathbb{R}^{D \times n}\): given data

  • \(\boldsymbol{z}^n = (\boldsymbol{z}_1, \boldsymbol{z}_2, \dots , \boldsymbol{z}_n) \in \{ 0, 1 \}^{K \times n}\): latent classes of given data

  • \(\boldsymbol{r}_i^{(t)} = (r_{i,1}^{(t)}, r_{i,2}^{(t)}, \dots , r_{i,K}^{(t)}) \in [0, 1]^K\): a parameter for the latent class of the \(i\)-th data point (\(\sum_{k=1}^K r_{i, k}^{(t)} = 1\))

  • \(\boldsymbol{m}_{n,k}^{(t)} \in \mathbb{R}^{D}\): a hyperparameter

  • \(\kappa_{n,k}^{(t)} \in \mathbb{R}_{>0}\): a hyperparameter

  • \(\nu_{n,k}^{(t)} \in \mathbb{R}\): a hyperparameter (\(\nu_{n,k}^{(t)} > D-1\))

  • \(\boldsymbol{W}_{n,k}^{(t)} \in \mathbb{R}^{D\times D}\): a hyperparameter (a positive definite matrix)

  • \(\boldsymbol{\alpha}_n^{(t)} \in \mathbb{R}_{> 0}^K\): a hyperparameter

\[\begin{split}q(\boldsymbol{z}^n, \boldsymbol{\mu},\boldsymbol{\Lambda},\boldsymbol{\pi}) &= \left\{ \prod_{i=1}^n \mathrm{Cat} (\boldsymbol{z}_i | \boldsymbol{r}_i^{(t)}) \right\} \left\{ \prod_{k=1}^K \mathcal{N}(\boldsymbol{\mu}_k|\boldsymbol{m}_{n,k}^{(t)},(\kappa_{n,k}^{(t)} \boldsymbol{\Lambda}_k)^{-1})\mathcal{W}(\boldsymbol{\Lambda}_k|\boldsymbol{W}_{n,k}^{(t)}, \nu_{n,k}^{(t)}) \right\} \mathrm{Dir}(\boldsymbol{\pi}|\boldsymbol{\alpha}_n^{(t)}) \\ &= \Biggl[\, \prod_{i=1}^n \prod_{k=1}^K (r_{i,k}^{(t)})^{z_{i,k}} \Biggr] \Biggl[\, \prod_{k=1}^K \left( \frac{\kappa_{n,k}^{(t)}}{2\pi} \right)^{D/2} |\boldsymbol{\Lambda}_k|^{1/2} \exp \left\{ -\frac{\kappa_{n,k}^{(t)}}{2}(\boldsymbol{\mu}_k -\boldsymbol{m}_{n,k}^{(t)})^\top \boldsymbol{\Lambda}_k (\boldsymbol{\mu}_k - \boldsymbol{m}_{n,k}^{(t)}) \right\} \\ &\qquad \times B(\boldsymbol{W}_{n,k}^{(t)}, \nu_{n,k}^{(t)}) | \boldsymbol{\Lambda}_k |^{(\nu_{n,k}^{(t)} - D - 1) / 2} \exp \left\{ -\frac{1}{2} \mathrm{Tr} \{ ( \boldsymbol{W}_{n,k}^{(t)} )^{-1} \boldsymbol{\Lambda}_k \} \right\} \Biggr] \\ &\qquad \times C(\boldsymbol{\alpha}_n^{(t)})\prod_{k=1}^K \pi_k^{\alpha_{n,k}^{(t)}-1},\\\end{split}\]

where the updating rules of the hyperparameters are as follows (\(\psi(\cdot)\) denotes the digamma function):

\[\begin{split}N_k^{(t)} &= \sum_{i=1}^n r_{i,k}^{(t)}, \\ \bar{\boldsymbol{x}}_k^{(t)} &= \frac{1}{N_k^{(t)}} \sum_{i=1}^n r_{i,k}^{(t)} \boldsymbol{x}_i, \\ \boldsymbol{m}_{n,k}^{(t+1)} &= \frac{\kappa_0\boldsymbol{m}_0 + N_k^{(t)} \bar{\boldsymbol{x}}_k^{(t)}}{\kappa_0 + N_k^{(t)}}, \\ \kappa_{n,k}^{(t+1)} &= \kappa_0 + N_k^{(t)}, \\ (\boldsymbol{W}_{n,k}^{(t+1)})^{-1} &= \boldsymbol{W}_0^{-1} + \sum_{i=1}^{n} r_{i,k}^{(t)} (\boldsymbol{x}_i-\bar{\boldsymbol{x}}_k^{(t)})(\boldsymbol{x}_i-\bar{\boldsymbol{x}}_k^{(t)})^\top + \frac{\kappa_0 N_k^{(t)}}{\kappa_0 + N_k^{(t)}}(\bar{\boldsymbol{x}}_k^{(t)}-\boldsymbol{m}_0)(\bar{\boldsymbol{x}}_k^{(t)}-\boldsymbol{m}_0)^\top, \\ \nu_{n,k}^{(t+1)} &= \nu_0 + N_k^{(t)},\\ \alpha_{n,k}^{(t+1)} &= \alpha_{0,k} + N_k^{(t)}, \\ \ln \rho_{i,k}^{(t+1)} &= \psi (\alpha_{n,k}^{(t+1)}) - \psi ( {\textstyle \sum_{k=1}^K \alpha_{n,k}^{(t+1)}} ) \notag \\ &\qquad + \frac{1}{2} \Biggl[\, \sum_{d=1}^D \psi \left( \frac{\nu_{n,k}^{(t+1)} + 1 - d}{2} \right) + D \ln 2 + \ln | \boldsymbol{W}_{n,k}^{(t+1)} | \notag \\ &\qquad - D \ln (2 \pi ) - \frac{D}{\kappa_{n,k}^{(t+1)}} - \nu_{n,k}^{(t+1)} (\boldsymbol{x}_i - \boldsymbol{m}_{n,k}^{(t+1)})^\top \boldsymbol{W}_{n,k}^{(t+1)} (\boldsymbol{x}_i - \boldsymbol{m}_{n,k}^{(t+1)}) \Biggr], \\ r_{i,k}^{(t+1)} &= \frac{\rho_{i,k}^{(t+1)}}{\sum_{j=1}^K \rho_{i,j}^{(t+1)}}.\end{split}\]
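
The following is a sketch of one pass of these updates in NumPy/SciPy, written directly from the equations above; it is not the package's internal implementation, and the function name vb_update is hypothetical. Here x is an (n, D) data array, r an (n, K) responsibility array, and w_0_inv is \(\boldsymbol{W}_0^{-1}\). The responsibilities are normalized in log space with logsumexp for numerical stability:

import numpy as np
from scipy.special import digamma, logsumexp

def vb_update(x, r, m_0, kappa_0, nu_0, w_0_inv, alpha_0):
    """One iteration of the variational Bayesian updates above (a sketch)."""
    n, D = x.shape
    N = r.sum(axis=0)                              # N_k
    x_bar = (r.T @ x) / N[:, None]                 # \bar{x}_k, shape (K, D)
    kappa_n = kappa_0 + N
    m_n = (kappa_0 * m_0 + N[:, None] * x_bar) / kappa_n[:, None]
    nu_n = nu_0 + N
    alpha_n = alpha_0 + N
    diff = x[None, :, :] - x_bar[:, None, :]       # (K, n, D)
    s = np.einsum('kn,kni,knj->kij', r.T, diff, diff)   # weighted scatter
    d0 = x_bar - m_0
    w_n_inv = (w_0_inv + s
               + (kappa_0 * N / kappa_n)[:, None, None]
               * d0[:, :, None] * d0[:, None, :])
    w_n = np.linalg.inv(w_n_inv)

    # ln rho_{i,k}, then normalize over k in log space
    ln_det_w = np.linalg.slogdet(w_n)[1]
    e_ln_lambda = (digamma((nu_n[:, None] + 1 - np.arange(1, D + 1)) / 2).sum(axis=1)
                   + D * np.log(2) + ln_det_w)     # E[ln |Lambda_k|]
    diff_m = x[None, :, :] - m_n[:, None, :]       # (K, n, D)
    maha = np.einsum('kni,kij,knj->kn', diff_m, w_n, diff_m)
    ln_rho = ((digamma(alpha_n) - digamma(alpha_n.sum()))[:, None]
              + 0.5 * (e_ln_lambda[:, None] - D * np.log(2 * np.pi)
                       - D / kappa_n[:, None] - nu_n[:, None] * maha))
    r_new = np.exp(ln_rho - logsumexp(ln_rho, axis=0)).T   # (n, K)
    return m_n, kappa_n, nu_n, w_n, alpha_n, r_new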

The approximate predictive distribution is as follows:

  • \(\boldsymbol{x}_{n+1} \in \mathbb{R}^D\): a new data point

  • \(\boldsymbol{\mu}_{\mathrm{p},k} \in \mathbb{R}^D\): the parameter of the predictive distribution

  • \(\boldsymbol{\Lambda}_{\mathrm{p},k} \in \mathbb{R}^{D \times D}\): the parameter of the predictive distribution (a positive definite matrix)

  • \(\nu_{\mathrm{p},k} \in \mathbb{R}_{>0}\): the parameter of the predictive distribution

\[\begin{split}&p(\boldsymbol{x}_{n+1}|\boldsymbol{x}^n) \\ &= \frac{1}{\sum_{k=1}^K \alpha_{n,k}^{(t)}} \sum_{k=1}^K \alpha_{n,k}^{(t)} \mathrm{St}(\boldsymbol{x}_{n+1}|\boldsymbol{\mu}_{\mathrm{p},k},\boldsymbol{\Lambda}_{\mathrm{p},k}, \nu_{\mathrm{p},k}) \\ &= \frac{1}{\sum_{k=1}^K \alpha_{n,k}^{(t)}} \sum_{k=1}^K \alpha_{n,k}^{(t)} \Biggl[ \frac{\Gamma (\nu_{\mathrm{p},k} / 2 + D / 2)}{\Gamma (\nu_{\mathrm{p},k} / 2)} \frac{|\boldsymbol{\Lambda}_{\mathrm{p},k}|^{1/2}}{(\nu_{\mathrm{p},k} \pi)^{D/2}} \notag \\ &\qquad \qquad \qquad \qquad \qquad \times \left( 1 + \frac{1}{\nu_{\mathrm{p},k}} (\boldsymbol{x}_{n+1} - \boldsymbol{\mu}_{\mathrm{p},k})^\top \boldsymbol{\Lambda}_{\mathrm{p},k} (\boldsymbol{x}_{n+1} - \boldsymbol{\mu}_{\mathrm{p},k}) \right)^{-\nu_{\mathrm{p},k}/2 - D/2} \Biggr],\end{split}\]

where the parameters are obtained from the hyperparameters of the posterior distribution as follows:

\[\begin{split}\boldsymbol{\mu}_{\mathrm{p},k} &= \boldsymbol{m}_{n,k}^{(t)}, \\ \nu_{\mathrm{p},k} &= \nu_{n,k}^{(t)} - D + 1,\\ \boldsymbol{\Lambda}_{\mathrm{p},k} &= \frac{\kappa_{n,k}^{(t)} \nu_{\mathrm{p},k}}{\kappa_{n,k}^{(t)} + 1} \boldsymbol{W}_{n,k}^{(t)}.\end{split}\]
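
A sketch of evaluating this predictive density at a new point, transcribing the multivariate Student's t formula above directly (again a sketch, not the package's implementation; m_n has shape (K, D), kappa_n, nu_n, alpha_n shape (K,), and w_n shape (K, D, D)):

import numpy as np
from scipy.special import gammaln

def predictive_log_pdf(x_new, m_n, kappa_n, nu_n, w_n, alpha_n):
    """Log of the approximate predictive density at x_new (shape (D,))."""
    D = x_new.shape[0]
    nu_p = nu_n - D + 1                                   # dof per component
    lambda_p = (kappa_n * nu_p / (kappa_n + 1))[:, None, None] * w_n
    diff = x_new[None, :] - m_n                           # (K, D)
    maha = np.einsum('ki,kij,kj->k', diff, lambda_p, diff)
    log_st = (gammaln((nu_p + D) / 2) - gammaln(nu_p / 2)
              + 0.5 * np.linalg.slogdet(lambda_p)[1]
              - 0.5 * D * np.log(nu_p * np.pi)
              - 0.5 * (nu_p + D) * np.log1p(maha / nu_p))
    log_weights = np.log(alpha_n) - np.log(alpha_n.sum())  # mixture weights
    return np.logaddexp.reduce(log_weights + log_st)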
class bayesml.gaussianmixture.GenModel(c_num_classes, c_degree, pi_vec=None, mu_vecs=None, lambda_mats=None, h_alpha_vec=None, h_m_vecs=None, h_kappas=None, h_nus=None, h_w_mats=None, seed=None)#

Bases: Generative

The stochastic data generative model and the prior distribution

Parameters:
c_num_classes : int

a positive integer

c_degree : int

a positive integer

pi_vec : numpy.ndarray, optional

a real vector in \([0, 1]^K\), by default [1/K, 1/K, … , 1/K]

mu_vecs : numpy.ndarray, optional

vectors of real numbers, by default zero vectors

lambda_mats : numpy.ndarray, optional

positive definite symmetric matrices, by default the identity matrices

h_alpha_vec : numpy.ndarray, optional

a vector of positive real numbers, by default [1/2, 1/2, … , 1/2]

h_m_vecs : numpy.ndarray, optional

vectors of real numbers, by default zero vectors

h_kappas : {float, numpy.ndarray}, optional

positive real numbers, by default [1.0, 1.0, … , 1.0]

h_nus : {float, numpy.ndarray}, optional

real numbers greater than c_degree-1, by default the value of c_degree

h_w_mats : numpy.ndarray, optional

positive definite symmetric matrices, by default the identity matrices

seed : {None, int}, optional

A seed to initialize numpy.random.default_rng(), by default None

Methods

gen_params()

Generate the parameter from the prior distribution.

gen_sample(sample_size)

Generate a sample from the stochastic data generative model.

get_constants()

Get constants of GenModel.

get_h_params()

Get the hyperparameters of the prior distribution.

get_params()

Get the parameter of the stochastic data generative model.

load_h_params(filename)

Load the hyperparameters to h_params.

load_params(filename)

Load the parameters saved by save_params.

save_h_params(filename)

Save the hyperparameters using python pickle module.

save_params(filename)

Save the parameters using python pickle module.

save_sample(filename, sample_size)

Save the generated sample in NumPy .npz format.

set_h_params([h_alpha_vec, h_m_vecs, ...])

Set the hyperparameters of the prior distribution.

set_params([pi_vec, mu_vecs, lambda_mats])

Set the parameter of the stochastic data generative model.

visualize_model([sample_size])

Visualize the stochastic data generative model and generated samples.

get_constants()#

Get constants of GenModel.

Returns:
constants : dict of {str: int, numpy.ndarray}
  • "c_num_classes" : the value of self.c_num_classes

  • "c_degree" : the value of self.c_degree

set_h_params(h_alpha_vec=None, h_m_vecs=None, h_kappas=None, h_nus=None, h_w_mats=None)#

Set the hyperparameters of the prior distribution.

Parameters:
h_alpha_vec : numpy.ndarray, optional

a vector of positive real numbers, by default None

h_m_vecs : numpy.ndarray, optional

vectors of real numbers, by default None

h_kappas : {float, numpy.ndarray}, optional

positive real numbers, by default None

h_nus : {float, numpy.ndarray}, optional

real numbers greater than c_degree-1, by default None

h_w_mats : numpy.ndarray, optional

positive definite symmetric matrices, by default None

get_h_params()#

Get the hyperparameters of the prior distribution.

Returns:
h_params : dict of {str: float, numpy.ndarray}
  • "h_alpha_vec" : The value of self.h_alpha_vec

  • "h_m_vecs" : The value of self.h_m_vecs

  • "h_kappas" : The value of self.h_kappas

  • "h_nus" : The value of self.h_nus

  • "h_w_mats" : The value of self.h_w_mats

gen_params()#

Generate the parameter from the prior distribution.

The generated values are set to self.pi_vec, self.mu_vecs, and self.lambda_mats.

set_params(pi_vec=None, mu_vecs=None, lambda_mats=None)#

Set the parameter of the stochastic data generative model.

Parameters:
pi_vec : numpy.ndarray

a real vector in \([0, 1]^K\). The sum of its elements must be 1.

mu_vecs : numpy.ndarray

vectors of real numbers

lambda_mats : numpy.ndarray

positive definite symmetric matrices
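
For example, for a model with c_num_classes=2 and c_degree=1 (the array shapes follow the conventions of the example under visualize_model; the values are illustrative):

>>> import numpy as np
>>> model.set_params(
...     pi_vec=np.array([0.5, 0.5]),
...     mu_vecs=np.array([[-2.0], [2.0]]),
...     lambda_mats=np.array([[[1.0]], [[1.0]]]))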

get_params()#

Get the parameter of the stochastic data generative model.

Returns:
params : dict of {str: float, numpy.ndarray}
  • "pi_vec" : The value of self.pi_vec

  • "mu_vecs" : The value of self.mu_vecs

  • "lambda_mats" : The value of self.lambda_mats

gen_sample(sample_size)#

Generate a sample from the stochastic data generative model.

Parameters:
sample_size : int

A positive integer

Returns:
x : numpy.ndarray

2-dimensional array whose shape is (sample_size, c_degree) and whose elements are real numbers.

z : numpy.ndarray

2-dimensional array whose shape is (sample_size, c_num_classes) and whose rows are one-hot vectors.
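
For example, assuming a model constructed as in the example under visualize_model:

>>> x, z = model.gen_sample(1000)  # x: (1000, c_degree), z: (1000, c_num_classes)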

save_sample(filename, sample_size)#

Save the generated sample in NumPy .npz format.

It is saved as an NpzFile with keywords: “x”, “z”.

Parameters:
filename : str

The filename to which the sample is saved. .npz will be appended if it isn’t there.

sample_size : int

A positive integer

visualize_model(sample_size=100)#

Visualize the stochastic data generative model and generated samples.

Parameters:
sample_size : int, optional

A positive integer, by default 100

Examples

>>> from bayesml import gaussianmixture
>>> import numpy as np
>>> model = gaussianmixture.GenModel(
...             c_num_classes=3,
...             c_degree=1,
...             pi_vec=np.array([0.444,0.444,0.112]),
...             mu_vecs=np.array([[-2.8],[-0.8],[2]]),
...             lambda_mats=np.array([[[6.25]],[[6.25]],[[100]]])
...             )
>>> model.visualize_model()
pi_vec:
 [0.444 0.444 0.112]
mu_vecs:
 [[-2.8]
 [-0.8]
 [ 2. ]]
lambda_mats:
 [[[  6.25]]
 [[  6.25]]
 [[100.  ]]]
[figure: _images/gaussianmixture_example.png]
class bayesml.gaussianmixture.LearnModel(c_num_classes, c_degree, h0_alpha_vec=None, h0_m_vecs=None, h0_kappas=None, h0_nus=None, h0_w_mats=None, seed=None)#

Bases: Posterior, PredictiveMixin

The posterior distribution and the predictive distribution.

Parameters:
c_num_classes : int

a positive integer

c_degree : int

a positive integer

h0_alpha_vec : numpy.ndarray, optional

a vector of positive real numbers, by default [1/2, 1/2, … , 1/2]

h0_m_vecs : numpy.ndarray, optional

vectors of real numbers, by default zero vectors

h0_kappas : {float, numpy.ndarray}, optional

positive real numbers, by default [1.0, 1.0, … , 1.0]

h0_nus : {float, numpy.ndarray}, optional

real numbers greater than c_degree-1, by default the value of c_degree

h0_w_mats : numpy.ndarray, optional

positive definite symmetric matrices, by default the identity matrices

seed : {None, int}, optional

A seed to initialize numpy.random.default_rng(), by default None

Attributes:
h0_w_mats_inv : numpy.ndarray

the inverse matrices of h0_w_mats

hn_alpha_vec : numpy.ndarray

a vector of positive real numbers

hn_m_vecs : numpy.ndarray

vectors of real numbers

hn_kappas : numpy.ndarray

positive real numbers

hn_nus : numpy.ndarray

real numbers greater than c_degree-1

hn_w_mats : numpy.ndarray

positive definite symmetric matrices

hn_w_mats_inv : numpy.ndarray

the inverse matrices of hn_w_mats

r_vecs : numpy.ndarray

vectors of real numbers; the sum of each vector's elements is 1

ns : numpy.ndarray

positive real numbers

s_mats : numpy.ndarray

positive definite symmetric matrices

p_mu_vecs : numpy.ndarray

vectors of real numbers

p_nus : numpy.ndarray

positive real numbers

p_lambda_mats : numpy.ndarray

positive definite symmetric matrices

Methods

calc_pred_dist()

Calculate the parameters of the predictive distribution.

estimate_latent_vars(x[, loss])

Estimate latent variables corresponding to x under the given criterion.

estimate_latent_vars_and_update(x[, loss, ...])

Estimate latent variables and update the posterior sequentially.

estimate_params([loss])

Estimate the parameter of the stochastic data generative model under the given criterion.

get_constants()

Get constants of LearnModel.

get_h0_params()

Get the initial values of the hyperparameters of the posterior distribution.

get_hn_params()

Get the hyperparameters of the posterior distribution.

get_p_params()

Get the parameters of the predictive distribution.

load_h0_params(filename)

Load the hyperparameters to h0_params.

load_hn_params(filename)

Load the hyperparameters to hn_params.

make_prediction([loss])

Predict a new data point under the given criterion.

overwrite_h0_params()

Overwrite the initial values of the hyperparameters of the posterior distribution by the learned values.

pred_and_update(x[, loss, max_itr, ...])

Predict a new data point and update the posterior sequentially.

reset_hn_params()

Reset the hyperparameters of the posterior distribution to their initial values.

save_h0_params(filename)

Save the hyperparameters using python pickle module.

save_hn_params(filename)

Save the hyperparameters using python pickle module.

set_h0_params([h0_alpha_vec, h0_m_vecs, ...])

Set the hyperparameters of the prior distribution.

set_hn_params([hn_alpha_vec, hn_m_vecs, ...])

Set updated values of the hyperparameter of the posterior distribution.

update_posterior(x[, max_itr, num_init, ...])

Update the hyperparameters of the posterior distribution using training data.

visualize_posterior()

Visualize the posterior distribution for the parameter.

get_constants()#

Get constants of LearnModel.

Returns:
constants : dict of {str: int, numpy.ndarray}
  • "c_num_classes" : the value of self.c_num_classes

  • "c_degree" : the value of self.c_degree

set_h0_params(h0_alpha_vec=None, h0_m_vecs=None, h0_kappas=None, h0_nus=None, h0_w_mats=None)#

Set the hyperparameters of the prior distribution.

Parameters:
h0_alpha_vec : numpy.ndarray, optional

a vector of positive real numbers, by default None

h0_m_vecs : numpy.ndarray, optional

vectors of real numbers, by default None

h0_kappas : {float, numpy.ndarray}, optional

positive real numbers, by default None

h0_nus : {float, numpy.ndarray}, optional

real numbers greater than c_degree-1, by default None

h0_w_mats : numpy.ndarray, optional

positive definite symmetric matrices, by default None

get_h0_params()#

Get the initial values of the hyperparameters of the posterior distribution.

Returns:
h0_params : dict of {str: numpy.ndarray}
  • "h0_alpha_vec" : The value of self.h0_alpha_vec

  • "h0_m_vecs" : The value of self.h0_m_vecs

  • "h0_kappas" : The value of self.h0_kappas

  • "h0_nus" : The value of self.h0_nus

  • "h0_w_mats" : The value of self.h0_w_mats

set_hn_params(hn_alpha_vec=None, hn_m_vecs=None, hn_kappas=None, hn_nus=None, hn_w_mats=None)#

Set updated values of the hyperparameters of the posterior distribution.

Parameters:
hn_alpha_vec : numpy.ndarray, optional

a vector of positive real numbers, by default None

hn_m_vecs : numpy.ndarray, optional

vectors of real numbers, by default None

hn_kappas : {float, numpy.ndarray}, optional

positive real numbers, by default None

hn_nus : {float, numpy.ndarray}, optional

real numbers greater than c_degree-1, by default None

hn_w_mats : numpy.ndarray, optional

positive definite symmetric matrices, by default None

get_hn_params()#

Get the hyperparameters of the posterior distribution.

Returns:
hn_params : dict of {str: numpy.ndarray}
  • "hn_alpha_vec" : The value of self.hn_alpha_vec

  • "hn_m_vecs" : The value of self.hn_m_vecs

  • "hn_kappas" : The value of self.hn_kappas

  • "hn_nus" : The value of self.hn_nus

  • "hn_w_mats" : The value of self.hn_w_mats

update_posterior(x, max_itr=100, num_init=10, tolerance=1e-08, init_type='subsampling')#

Update the hyperparameters of the posterior distribution using training data.

Parameters:
x : numpy.ndarray

a 2-dimensional ndarray whose shape is (sample_size, c_degree). All the elements must be real numbers.

max_itr : int, optional

maximum number of iterations, by default 100

num_init : int, optional

number of initializations, by default 10

tolerance : float, optional

convergence criterion of the variational lower bound, by default 1.0E-8

init_type : str, optional

Type of initialization, by default 'subsampling'
  • 'subsampling': for each latent class, extract a subsample whose size is int(np.sqrt(x.shape[0])), and use its mean and covariance matrix as initial values of hn_m_vecs and hn_w_mats.

  • 'random_responsibility': randomly assign responsibilities to r_vecs

estimate_params(loss='squared')#

Estimate the parameter of the stochastic data generative model under the given criterion.

Note that the criterion is applied to estimating pi_vec, mu_vecs and lambda_mats independently. Therefore, a tuple of the Dirichlet distribution, the Student’s t-distributions, and the Wishart distributions will be returned when loss=”KL”.

Parameters:
loss : str, optional

Loss function underlying the Bayes risk function, by default “squared”. This function supports “squared”, “0-1”, and “KL”.

Returns:
Estimates : tuple of {numpy.ndarray, float, None, or rv_frozen}
  • pi_vec_hat : the estimate for pi_vec

  • mu_vecs_hat : the estimate for mu_vecs

  • lambda_mats_hat : the estimate for lambda_mats

The estimated values under the given loss function. If an estimate does not exist, np.nan will be returned. If the loss function is “KL”, the posterior distribution itself will be returned as an rv_frozen object of scipy.stats.
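
A minimal usage sketch combining update_posterior and estimate_params; the data are generated as in the example under visualize_posterior, and the sizes are illustrative:

>>> from bayesml import gaussianmixture
>>> gen_model = gaussianmixture.GenModel(c_num_classes=2, c_degree=1)
>>> x, z = gen_model.gen_sample(100)
>>> learn_model = gaussianmixture.LearnModel(c_num_classes=2, c_degree=1)
>>> learn_model.update_posterior(x)
>>> pi_hat, mu_hat, lambda_hat = learn_model.estimate_params(loss='squared')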

visualize_posterior()#

Visualize the posterior distribution for the parameter.

Examples

>>> from bayesml import gaussianmixture
>>> import numpy as np
>>> gen_model = gaussianmixture.GenModel(
...     c_num_classes=2,
...     c_degree=1,
...     mu_vecs=np.array([[-2],[2]]),
...     )
>>> x,z = gen_model.gen_sample(100)
>>> learn_model = gaussianmixture.LearnModel(c_num_classes=2, c_degree=1)
>>> learn_model.update_posterior(x)
>>> learn_model.visualize_posterior()
hn_m_vecs:
[[ 2.09365933]
[-1.97862429]]
hn_kappas:
[47.68878373 54.31121627]
hn_nus:
[47.68878373 54.31121627]
hn_w_mats:
[[[0.02226992]]
[[0.01575793]]]
E[lambda_mats]=
[[[1.06202546]]
[[0.85583258]]]
[figure: _images/gaussianmixture_posterior.png]
get_p_params()#

Get the parameters of the predictive distribution.

Returns:
p_params : dict of {str: numpy.ndarray}
  • "p_mu_vecs" : The value of self.p_mu_vecs

  • "p_nus" : The value of self.p_nus

  • "p_lambda_mats" : The value of self.p_lambda_mats

calc_pred_dist()#

Calculate the parameters of the predictive distribution.

make_prediction(loss='squared')#

Predict a new data point under the given criterion.

Parameters:
loss : str, optional

Loss function underlying the Bayes risk function, by default “squared”. This function supports “squared” and “0-1”.

Returns:
predicted_value : {float, numpy.ndarray}

The predicted value under the given loss function.
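
For example, once the posterior has been updated by update_posterior:

>>> learn_model.calc_pred_dist()
>>> y_pred = learn_model.make_prediction(loss='squared')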

pred_and_update(x, loss='squared', max_itr=100, num_init=10, tolerance=1e-08, init_type='random_responsibility')#

Predict a new data point and update the posterior sequentially.

h0_params will be overwritten by the current hn_params before updating hn_params by x.

Parameters:
x : numpy.ndarray

It must be a c_degree-dimensional vector

loss : str, optional

Loss function underlying the Bayes risk function, by default “squared”. This function supports “squared” and “0-1”.

max_itr : int, optional

maximum number of iterations, by default 100

num_init : int, optional

number of initializations, by default 10

tolerance : float, optional

convergence criterion of the variational lower bound, by default 1.0E-8

init_type : str, optional

Type of initialization, by default 'random_responsibility'
  • 'random_responsibility': randomly assign responsibilities to r_vecs

  • 'subsampling': for each latent class, extract a subsample whose size is int(np.sqrt(x.shape[0])), and use its mean and covariance matrix as initial values of hn_m_vecs and hn_w_mats.

Returns:
predicted_value : {float, numpy.ndarray}

The predicted value under the given loss function.
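
A one-step sketch, where x_new is a hypothetical c_degree-dimensional observation:

>>> y_pred = learn_model.pred_and_update(x_new, loss='squared')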

estimate_latent_vars(x, loss='0-1')#

Estimate latent variables corresponding to x under the given criterion.

Note that the criterion is independently applied to each data point.

Parameters:
x : numpy.ndarray

a 2-dimensional ndarray whose shape is (sample_size, c_degree). All the elements must be real numbers.

loss : str, optional

Loss function underlying the Bayes risk function, by default “0-1”. This function supports “squared”, “0-1”, and “KL”.

Returns:
estimates : numpy.ndarray

The estimated values under the given loss function. If the loss function is “KL”, the posterior distribution will be returned as a numpy.ndarray whose elements consist of occurrence probabilities.
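
For example, with x as in update_posterior:

>>> z_hat = learn_model.estimate_latent_vars(x, loss='0-1')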

estimate_latent_vars_and_update(x, loss='0-1', max_itr=100, num_init=10, tolerance=1e-08, init_type='subsampling')#

Estimate latent variables and update the posterior sequentially.

h0_params will be overwritten by the current hn_params before updating hn_params by x.

Parameters:
x : numpy.ndarray

It must be a c_degree-dimensional vector

loss : str, optional

Loss function underlying the Bayes risk function, by default “0-1”. This function supports “squared” and “0-1”.

max_itr : int, optional

maximum number of iterations, by default 100

num_init : int, optional

number of initializations, by default 10

tolerance : float, optional

convergence criterion of the variational lower bound, by default 1.0E-8

init_type : str, optional

Type of initialization, by default 'subsampling'
  • 'subsampling': for each latent class, extract a subsample whose size is int(np.sqrt(x.shape[0])), and use its mean and covariance matrix as initial values of hn_m_vecs and hn_w_mats.

  • 'random_responsibility': randomly assign responsibilities to r_vecs

Returns:
predicted_value : numpy.ndarray

The estimated values under the given loss function.
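
A one-step sketch, where x_new is a hypothetical c_degree-dimensional observation:

>>> z_hat = learn_model.estimate_latent_vars_and_update(x_new)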