[ANN] VersatileHDPMixtureModels.jl: Distributed inference for the Versatile HDPMM and for HDPMM-like settings

Dinari · August 10, 2020, 11:21am

VersatileHDPMixtureModels.jl

Git
This package is the code for our UAI '20 paper titled “Scalable and Flexible Clustering of Grouped Data via Parallel and Distributed Sampling in Versatile Hierarchical Dirichlet Processes”.

Paper,
Supplemental Material

What can it do?

This package performs inference in the vHDPMM setting described in the paper. Alternatively, it can perform inference in the HDPMM setting.

Quick Start

Get Julia from here (any version above 1.1.0 should work), install it, and run it.
Add the package from the Pkg REPL: ]add VersatileHDPMixtureModels
Add some worker processes and load the package on all of them:

using Distributed
addprocs(2)
@everywhere using VersatileHDPMixtureModels
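
To confirm that the worker processes from the setup above are actually available, a minimal sketch using only the Distributed standard library might look like the following. It is not part of the package. The variable names `worker_ids` and `reported` are illustrative, and the guard around `addprocs` is an assumption that you are starting from a fresh session.

```julia
using Distributed

# Add two workers only when running as a single process (fresh session assumed).
if nprocs() == 1
    addprocs(2)
end

# Ask every worker for its own process id; the master process is always id 1.
worker_ids = workers()
reported = [remotecall_fetch(myid, w) for w in worker_ids]

println("worker ids: ", worker_ids)
println("ids reported by the workers: ", reported)
```

If the two printed lists match, the workers are reachable and `@everywhere using VersatileHDPMixtureModels` should load the package on each of them.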

Now you can start using it!

For the HDP Version:

# Sample some data from a CRF prior:
# We sample 3D data, 4 groups, with $\alpha=10,\gamma=1$, and a variance of 100 between the component means.
crf_prior = hdp_prior_crf_draws(100,3,10,1)
pts,labels = generate_grouped_gaussian_from_hdp_group_counts(crf_prior[2],3,100.0)


#Create the priors we opt to use:
#As we want HDP, we set the local prior dimension to 0, and the global prior dimension to 3
gprior, lprior = create_default_priors(3,0,:niw)

#Run the model:
model = hdp_fit(pts,10,1,gprior,100)

#Get results:
model_results = get_model_global_pred(model[1]) # Get global components assignments
##
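
For intuition about what a concentration parameter does in a prior of this kind, here is a minimal sketch of a single Chinese restaurant process in plain Julia. It is not the package's `hdp_prior_crf_draws` implementation, and it models only one restaurant rather than the full franchise. The function name `crp_assignments` and its arguments are hypothetical.

```julia
using Random

# Seat n customers one at a time. A customer joins an existing table with
# probability proportional to its current size, or opens a new table with
# probability proportional to the concentration parameter alpha.
function crp_assignments(rng::AbstractRNG, n::Int, alpha::Float64)
    assignments = Vector{Int}(undef, n)
    table_sizes = Int[]
    for i in 1:n
        # Last weight corresponds to opening a new table.
        weights = vcat(Float64.(table_sizes), alpha)
        u = rand(rng) * sum(weights)
        k = 1
        acc = weights[1]
        while u > acc && k < length(weights)
            k += 1
            acc += weights[k]
        end
        if k > length(table_sizes)
            push!(table_sizes, 1)   # new table
        else
            table_sizes[k] += 1     # existing table
        end
        assignments[i] = k
    end
    return assignments, table_sizes
end

rng = MersenneTwister(1)
assignments, table_sizes = crp_assignments(rng, 100, 10.0)
println("tables used: ", length(table_sizes))
```

Larger values of `alpha` tend to produce more tables, which corresponds to more components in the sampled data.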

Running the vHDP full setting:

#Generate some data:
#We generate Gaussian data: 20K points per group, global dim = 2, local dim = 1, 3 global components, 5 local components in each group, 10 groups:
pts,labels = generate_grouped_gaussian_data(20000, 2, 1, 3, 5, 10, false, 25.0, false)

#Create Priors:
g_prior, l_prior = create_default_priors(2,1,:niw)


#Run the model:
vhdpmm_results = vhdp_fit(pts,2,100.0,1000.0,100.0,g_prior,l_prior,50)

#Get global and local assignments for the points:
# `pts` holds one entry per group, so its length is the number of groups
vhdpmm_global = Dict([i=> create_global_labels(vhdpmm_results[1].groups_dict[i]) for i=1:length(pts)])
vhdpmm_local = Dict([i=> vhdpmm_results[1].groups_dict[i].labels for i=1:length(pts)])
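
Once per-group label dictionaries like the ones above are available, a quick sanity check is to count how many points each component received across all groups. The sketch below uses plain Julia on a small hypothetical dictionary, `example_global`, that stands in for `vhdpmm_global`. It assumes labels are integers, and `component_counts` is an illustrative helper, not a package function.

```julia
# Hypothetical stand-in for the label dictionaries:
# group index => vector of component labels, one per point.
example_global = Dict(1 => [1, 1, 2, 3], 2 => [2, 2, 3, 3, 3])

# Count how many points each component receives across all groups.
function component_counts(labels_by_group::AbstractDict)
    counts = Dict{Int,Int}()
    for labels in values(labels_by_group)
        for l in labels
            counts[l] = get(counts, l, 0) + 1
        end
    end
    return counts
end

counts = component_counts(example_global)
for k in sort(collect(keys(counts)))
    println("component ", k, ": ", counts[k], " points")
end
```

A component with very few points, or far more components than expected, can be a hint that the concentration parameters or the number of iterations need adjusting.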

Examples:

Coseg with super pixels

vHDP as HDP

Missing data experiment

Synthetic data experiment

License

This software is released under the MIT License (included with the software). Note, however, that if you use this code (and/or the results of running it) to support any form of publication (e.g., a book, a journal paper, a conference paper, a patent application, etc.), we request that you cite our paper:

@inproceedings{dinari2020vhdp,
  title={Scalable and Flexible Clustering of Grouped Data via Parallel and Distributed Sampling in Versatile Hierarchical Dirichlet Processes},
  author={Dinari, Or and Freifeld, Oren},
  booktitle={UAI},
  year={2020}
}

Misc

For any questions: dinari at post.bgu.ac.il

Contributions, feature requests, suggestions, etc. are welcome.
