EM vs Gibbs Sampler - Results

This is an experiment comparing the performance of Expectation Maximization (EM) and a Gibbs Sampler (GS) for fitting Gaussian Mixture Models.

  • 500 runs each for K = 3 and K = 6 clusters

  • 1000 data points in each run

  • Univariate (one-dimensional) data

During data generation, the component means were drawn from a Uniform[-10, 10] distribution and the standard deviations from a Uniform[0.25, 5] distribution.
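
A minimal sketch of this data-generation step is shown below. The helper name generate_gmm_dataset is hypothetical (the actual generation code is not part of this notebook), and equal mixing weights are assumed since the true weights are not stated.

import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)

def generate_gmm_dataset(k, n=1000):
    """Simulate n univariate points from a k-component Gaussian mixture."""
    means = rng.uniform(-10, 10, size=k)   # component means ~ Uniform[-10, 10]
    sds = rng.uniform(0.25, 5, size=k)     # standard deviations ~ Uniform[0.25, 5]
    labels = rng.integers(0, k, size=n)    # equal mixing weights assumed
    x = rng.normal(means[labels], sds[labels])
    return pd.DataFrame({"x": x, "label": labels})

# One simulated dataset for K = 3
df = generate_gmm_dataset(k=3)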

import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import matplotlib.pyplot as plt
import seaborn as sns
# Load Data
gs3 = pd.read_csv("gs-k-3.csv")
gs6 = pd.read_csv("gs-k-6.csv")

em3 = pd.read_csv("em-k-3.csv")
em6 = pd.read_csv("em-k-6.csv")

The Data

The Gibbs Sampler results contain the following metrics

  • RS: Rand Score

  • ARS: Adjusted Rand Score

  • SS: Silhouette Score

for each of the three GS variants (a minimal sketch of computing these scores follows this list):

  • Base GS

  • GS with Multiple Initializations

  • GS with Burn In
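
For reference, all three scores can be computed with scikit-learn. The snippet below is a minimal sketch on placeholder arrays; true_labels, pred_labels, and x are illustrative stand-ins, not columns from the CSVs above.

import numpy as np
from sklearn.metrics import rand_score, adjusted_rand_score, silhouette_score

# Placeholder arrays standing in for one run's ground truth, GS assignments, and data
true_labels = np.array([0, 0, 1, 1, 2, 2])
pred_labels = np.array([1, 1, 0, 0, 2, 2])
x = np.array([-8.1, -7.9, 0.2, 0.4, 6.8, 7.1])

rs = rand_score(true_labels, pred_labels)             # RS: raw pair-counting agreement
ars = adjusted_rand_score(true_labels, pred_labels)   # ARS: RS corrected for chance
ss = silhouette_score(x.reshape(-1, 1), pred_labels)  # SS: uses the data and predicted labels only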

# GS with K = 3
gs3
filename gs_base_rs gs_base_ars gs_base_ss gs_multi_rs gs_multi_ars gs_multi_ss gs_burnin_rs gs_burnin_ars gs_burnin_ss
0 data-k-3-0.csv 0.485956 -0.000537 -0.255483 0.547105 0.017980 0.245627 0.536328 0.003276 -0.020430
1 data-k-3-1.csv 0.620989 0.230360 0.023708 0.683604 0.317101 0.473936 0.648256 0.271258 0.085944
2 data-k-3-2.csv 0.797538 0.534419 0.491607 0.765169 0.508474 0.488942 0.803904 0.549906 0.500306
3 data-k-3-3.csv 0.729241 0.335387 -0.061440 0.686635 0.367634 0.295017 0.676150 0.236650 0.249176
4 data-k-3-4.csv 0.609772 0.238812 0.384276 0.588667 0.191067 0.519122 0.614503 0.252042 0.619329
... ... ... ... ... ... ... ... ... ... ...
492 data-k-3-495.csv 0.663618 0.315058 0.227498 0.759620 0.516875 0.588444 0.717594 0.429114 0.455768
493 data-k-3-496.csv 0.625051 0.254764 0.189085 0.668693 0.344932 0.499449 0.638194 0.283535 0.225605
494 data-k-3-497.csv 0.848749 0.279924 0.187651 0.644931 0.155684 0.370003 0.821467 0.258756 0.108153
495 data-k-3-498.csv 0.469934 -0.009351 -0.380859 0.539754 0.059766 0.372744 0.507011 -0.001281 -0.084711
496 data-k-3-499.csv 0.765938 0.436257 -0.053071 0.583195 0.235119 0.438554 0.736348 0.335428 0.280845

497 rows × 10 columns

The EM dataframe contains the Adjusted Rand Score (ARS) results for EM run in two modes:

  • EM with Many Random Initializations (gmm_mri_ars)

  • EM with K-Means Initialization (gmm_kmeans_ars)

The final column holds the results of standard K-Means clustering (kmeans_ars); a sketch of how these three variants map onto scikit-learn follows.
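
As a rough correspondence (an assumption, since the experiment code is not shown here), the two EM variants and the K-Means baseline can be reproduced with scikit-learn's GaussianMixture and KMeans using different initialization settings. The toy data and the choice of n_init below are illustrative only.

import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

# Toy univariate mixture standing in for one simulated dataset
rng = np.random.default_rng(seed=0)
true_labels = rng.integers(0, 3, size=1000)
means, sds = np.array([-6.0, 0.0, 7.0]), np.array([1.0, 0.5, 2.0])
x = rng.normal(means[true_labels], sds[true_labels]).reshape(-1, 1)

# EM with many random initializations (analogous to gmm_mri_ars)
gmm_mri = GaussianMixture(n_components=3, init_params="random", n_init=10, random_state=0)
mri_ars = adjusted_rand_score(true_labels, gmm_mri.fit_predict(x))

# EM initialized from k-means, scikit-learn's default (analogous to gmm_kmeans_ars)
gmm_km = GaussianMixture(n_components=3, init_params="kmeans", random_state=0)
km_init_ars = adjusted_rand_score(true_labels, gmm_km.fit_predict(x))

# Standard K-Means baseline (analogous to kmeans_ars)
km = KMeans(n_clusters=3, n_init=10, random_state=0)
km_ars = adjusted_rand_score(true_labels, km.fit_predict(x))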

# EM with K = 3
em3
file gmm_mri_ars gmm_kmeans_ars kmeans_ars
0 data-k-3-0.csv -0.018932 0.024058 0.024468
1 data-k-3-1.csv 0.365575 0.119006 0.142713
2 data-k-3-2.csv 0.675526 0.306482 0.221317
3 data-k-3-3.csv -0.003321 0.023570 0.053617
4 data-k-3-4.csv 0.260538 0.159359 0.189642
... ... ... ... ...
495 data-k-3-495.csv 0.538792 0.540371 0.455710
496 data-k-3-496.csv 0.481009 0.253717 0.267321
497 data-k-3-497.csv 0.376996 0.061681 0.040540
498 data-k-3-498.csv 0.130798 0.070550 0.086350
499 data-k-3-499.csv 0.472284 0.100273 0.112142

500 rows × 4 columns

Results

The plots are interactive.

K = 3

# Box plots of ARS across all runs for each method, K = 3
fig = go.Figure()
fig.add_trace(go.Box(y=gs3['gs_base_ars'], name="GS Base"))
fig.add_trace(go.Box(y=gs3['gs_burnin_ars'], name="GS Burn In"))
fig.add_trace(go.Box(y=gs3['gs_multi_ars'], name="GS Multi"))
fig.add_trace(go.Box(y=em3['gmm_mri_ars'], name="EM Multi Init"))
fig.add_trace(go.Box(y=em3['gmm_kmeans_ars'], name="EM K-Means Init"))
fig.add_trace(go.Box(y=em3['kmeans_ars'], name="Standard K-Means"))
fig.update_layout(title_text="K = 3")
fig.show()

K = 6

# Box plots of ARS across all runs for each method, K = 6
fig = go.Figure()
fig.add_trace(go.Box(y=gs6['gs_base_ars'], name="GS Base"))
fig.add_trace(go.Box(y=gs6['gs_burnin_ars'], name="GS Burn In"))
fig.add_trace(go.Box(y=gs6['gs_multi_ars'], name="GS Multi"))
fig.add_trace(go.Box(y=em6['gmm_mri_ars'], name="EM Multi Init"))
fig.add_trace(go.Box(y=em6['gmm_kmeans_ars'], name="EM K-Means Init"))
fig.add_trace(go.Box(y=em6['kmeans_ars'], name="Standard K-Means"))
fig.update_layout(title_text="K = 6")
fig.show()

Conclusions

The Gibbs Sampler with multiple initializations and both EM variants all perform better than standard K-Means.

Comparing GS with EM, GS with multiple initializations appears to perform best, by a slight margin over EM.