[CS5340] Variable Elimination & Belief Propagation (Sum-Product Algorithm)

Goal: to calculate $p(x_F \mid x_E)$

  • where $x_F$, $x_E$ are disjoint subsets
  • $x_F$: query nodes
    $x_E$: evidence nodes
  • we need to marginalize out all nuisance variables $x_R = x_V \setminus (x_E \cup x_F)$
    • $p(x_F \mid x_E) = \frac{p(x_F, x_E)}{p(x_E)} = \frac{\sum_{x_R} p(x_F, x_E, x_R)}{\sum_{x_F} p(x_F, x_E)} \Rightarrow O(K^n)$ (exponential, intractable!)
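To make the intractability concrete, here is a minimal brute-force sketch (hypothetical random joint over $n = 6$ binary variables): even storing the joint already costs $K^n$ entries, and every query marginalizes over all nuisance axes.

```python
import numpy as np

# Hypothetical toy setup: n = 6 binary variables (K = 2), joint stored
# as a full table with K**n entries -- this is exactly the O(K^n) cost.
rng = np.random.default_rng(0)
K, n = 2, 6
joint = rng.random((K,) * n)
joint /= joint.sum()                      # p(x_1, ..., x_6)

# Query p(x_1 | x_6 = 1): x_F = {x_1}, x_E = {x_6}, x_R = {x_2, ..., x_5}
p_x1_x6 = joint.sum(axis=(1, 2, 3, 4))    # marginalize out nuisance x_R
p_x1_given = p_x1_x6[:, 1] / p_x1_x6[:, 1].sum()   # condition on x_6 = 1
assert np.isclose(p_x1_given.sum(), 1.0)
```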

Variable elimination (the naive way)#

variable elimination example

  • $p(x_1, \dots, x_5) = \sum_{x_6} p(x_1)\,p(x_2|x_1)\,p(x_3|x_1)\,p(x_4|x_2)\,p(x_5|x_3)\,p(x_6|x_2, x_5) = p(x_1)\,p(x_2|x_1)\,p(x_3|x_1)\,p(x_4|x_2)\,p(x_5|x_3) \sum_{x_6} p(x_6|x_2, x_5) \Rightarrow$ the largest table is reduced to size $K^3$
  • When the evidence node is observed ($x_6 = \bar x_6$):
    • e.g. $p(x_6 \mid x_2, x_5) \Rightarrow p(x_6 = 1 \mid x_2, x_5)$:
      the table shrinks from 3D ($x_2, x_5, x_6$) to 2D ($x_2, x_5$), with $x_6 = 1$ fixed
    • $$\begin{aligned} p(x_1, \bar x_6) &= \sum_{x_2} \sum_{x_3} \sum_{x_4} \sum_{x_5} p(x_1)\,p(x_2|x_1)\,p(x_3|x_1)\,p(x_4|x_2)\,p(x_5|x_3)\,p(\bar x_6|x_2, x_5) \\ &= p(x_1) \sum_{x_2} p(x_2|x_1) \sum_{x_3} p(x_3|x_1) \sum_{x_4} p(x_4|x_2) \sum_{x_5} p(x_5|x_3)\,p(\bar x_6|x_2, x_5) \\ &= p(x_1) \sum_{x_2} p(x_2|x_1) \sum_{x_3} p(x_3|x_1) \sum_{x_4} p(x_4|x_2)\; m_5(x_2, x_3) \\ &= p(x_1) \sum_{x_2} p(x_2|x_1) \sum_{x_3} p(x_3|x_1)\, m_5(x_2, x_3) \sum_{x_4} p(x_4|x_2) \\ &= p(x_1) \sum_{x_2} p(x_2|x_1)\, m_4(x_2) \sum_{x_3} p(x_3|x_1)\, m_5(x_2, x_3) \\ &= p(x_1) \sum_{x_2} p(x_2|x_1)\, m_4(x_2)\, m_3(x_1, x_2) \\ &= p(x_1)\, m_2(x_1) \end{aligned}$$
      $\Rightarrow p(x_1 \mid \bar x_6) = \frac{p(x_1)\, m_2(x_1)}{\sum_{x_1} p(x_1)\, m_2(x_1)}$
    • $m_i(x_{S_i})$: the intermediate factor (message) created when performing $\sum_{x_i}$; $S_i$ is the set of variables that appear in the summand other than $x_i$
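The derivation above can be checked numerically. The sketch below builds hypothetical random CPTs for the same graph (with $p(x_6 \mid x_2, x_5)$), computes the messages $m_5, m_4, m_3, m_2$ with `einsum`, and compares against brute-force marginalization:

```python
import numpy as np

# Hypothetical random CPTs, binary variables, graph of the running example
rng = np.random.default_rng(1)
K = 2

def cpt(*shape):
    t = rng.random(shape)
    return t / t.sum(axis=-1, keepdims=True)      # normalize over the last axis (the child)

p1, p21, p31 = cpt(K), cpt(K, K), cpt(K, K)          # p(x1), p(x2|x1), p(x3|x1)
p42, p53, p625 = cpt(K, K), cpt(K, K), cpt(K, K, K)  # p(x4|x2), p(x5|x3), p(x6|x2,x5)
x6_bar = 1

pbar = p625[:, :, x6_bar]                         # p(x̄6 | x2, x5), axes (x2, x5)
m5 = np.einsum('ce,be->bc', p53, pbar)            # m5(x2,x3) = Σ_{x5} p(x5|x3) p(x̄6|x2,x5)
m4 = p42.sum(axis=1)                              # m4(x2)    = Σ_{x4} p(x4|x2)  (all ones)
m3 = np.einsum('ac,bc->ab', p31, m5)              # m3(x1,x2) = Σ_{x3} p(x3|x1) m5(x2,x3)
m2 = np.einsum('ab,b,ab->a', p21, m4, m3)         # m2(x1)    = Σ_{x2} p(x2|x1) m4 m3
p_x1_given = p1 * m2 / (p1 * m2).sum()            # p(x1 | x̄6)

# Sanity check against brute-force marginalization of the full joint
joint = np.einsum('a,ab,ac,bd,ce,bef->abcdef', p1, p21, p31, p42, p53, p625)
brute = joint[..., x6_bar].sum(axis=(1, 2, 3, 4))
assert np.allclose(p_x1_given, brute / brute.sum())
```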
NOTE

Conditioning as marginalization (a notation trick)

  • evidence potential: $\delta(x_i, \bar x_i) = \begin{cases} 1, & \text{if } x_i = \bar x_i \\ 0, & \text{otherwise} \end{cases}$
  • $g(\bar x_i) = \sum_{x_i} g(x_i)\, \delta(x_i, \bar x_i)$
  • $\delta(x_E, \bar x_E) = \prod_{i \in E} \delta(x_i, \bar x_i) = \begin{cases} 1, & \text{if } x_E = \bar x_E \\ 0, & \text{otherwise} \end{cases}$
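A one-line numeric check of the evidence-potential trick (toy $g$ over $K = 4$ states, observed value $\bar x = 2$, all values illustrative):

```python
import numpy as np

# Conditioning as marginalization: multiply by δ, then sum out the variable.
K, x_bar = 4, 2
g = np.arange(K, dtype=float) + 1.0             # any function g(x_i)
delta = (np.arange(K) == x_bar).astype(float)   # δ(x_i, x̄_i)
assert (g * delta).sum() == g[x_bar]            # Σ_{x_i} g(x_i) δ(x_i, x̄_i) = g(x̄_i)
```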

Algorithm of variable elimination#

algo

Variable elimination algorithm for Undirected Graph#

  • Only the initialization step changes
    • local conditionals (parent-child relations from the DGM) $\rightarrow$ clique potentials $\psi_{x_c}(x_c)$
  • e.g. $p(x_1, \bar x_6) = \frac{1}{Z} \sum_{x_2} \sum_{x_3} \sum_{x_4} \sum_{x_5} \sum_{x_6} \psi(x_1, x_2)\, \psi(x_1, x_3)\, \psi(x_2, x_4)\, \psi(x_3, x_5)\, \psi(x_2, x_5, x_6)\, \delta(x_6, \bar x_6)$

Reconstituted graph#

  • When a node is eliminated, connect all of its remaining neighbors; repeat following the elimination order
  • Reconstituted graph: the original graph plus the newly added (fill-in) edges
  • Complexity: exponential in the size of the largest clique in the reconstituted graph
reconstituted graph algo
  • Treewidth: the size of the largest clique over the best elimination order, minus 1 (finding the optimal order is NP-hard)
    • i.e. we want an elimination order that minimizes the cardinality of the largest elimination clique
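A sketch of the elimination procedure on graphs (the function name and example adjacency are illustrative): it records the fill-in edges and the size of the largest elimination clique, which determines the complexity.

```python
from itertools import combinations

def eliminate(adj, order):
    """Eliminate nodes in `order`, connecting each node's remaining
    neighbors (fill-in edges); return fill edges and largest clique size."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}
    fill, largest = set(), 0
    for v in order:
        nbrs = adj[v]
        largest = max(largest, len(nbrs) + 1)        # elimination clique = v + its neighbors
        for a, b in combinations(sorted(nbrs), 2):   # connect remaining neighbors
            if b not in adj[a]:
                fill.add((a, b))
                adj[a].add(b); adj[b].add(a)
        for u in nbrs:                               # remove v from the graph
            adj[u].discard(v)
        del adj[v]
    return fill, largest

# Moralized graph of the running example (x6's parents x2 and x5 are linked)
adj = {1: {2, 3}, 2: {1, 4, 5, 6}, 3: {1, 5}, 4: {2}, 5: {2, 3, 6}, 6: {2, 5}}
fill, width = eliminate(adj, [6, 5, 4, 3, 2, 1])     # fill = {(2, 3)}, width = 3
```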

Moralization#

to transform a DGM into a UGM

  • for every node in the DGM, link its parents together, then drop the edge directions
  • So, to transform a DGM into a reconstituted graph (DIRECTEDGRAPHELIMINATE):
    • first we moralize the DGM,
    • then we perform the UNDIRECTEDGRAPHELIMINATE algorithm
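A minimal moralization sketch (the parent sets below are the running example, with $x_6$'s parents assumed to be $x_2$ and $x_5$):

```python
from itertools import combinations

def moralize(parents):
    """Given each node's parent list in a DGM, marry the parents and
    drop edge directions; return the undirected adjacency."""
    und = {v: set() for v in parents}
    for child, ps in parents.items():
        for p in ps:                                 # keep parent-child edges
            und[child].add(p); und[p].add(child)
        for a, b in combinations(sorted(ps), 2):     # marry the parents
            und[a].add(b); und[b].add(a)
    return und

parents = {1: [], 2: [1], 3: [1], 4: [2], 5: [3], 6: [2, 5]}
ugm = moralize(parents)
assert 5 in ugm[2] and 2 in ugm[5]    # new moral edge between x2 and x5
```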
WARNING

We need to re-run variable elimination for every query node to get each singleton conditional probability. Very inefficient!

  • Solution: use the sum-product algorithm to compute all single-node marginals at once, for tree-like structures

Tree-like structures#

tree-like_structure

  • (a) Undirected tree: no loops
  • (b) Directed tree: a single parent per node; moralization will not add links
  • (c) Polytree: some nodes have more than one parent; moralization introduces loops $\Rightarrow$ cannot perform the sum-product algorithm directly
  • (d) Chain: just another directed tree

Parameterization#

  • Undirected tree: cliques contain only singletons & pairs
    • $p(x) = \frac{1}{Z} \left( \prod_{i \in V} \psi(x_i) \prod_{(i, j) \in \mathcal{E}} \psi(x_i, x_j) \right)$
  • Directed tree: simply parent-child relationships
    • $p(x) = p(x_r) \prod_{(i, j) \in \mathcal{E}} p(x_j | x_i)$
      • $x_r$ is the root node
  • They are formally identical: set $\psi(x_r) = p(x_r)$, $\psi(x_i, x_j) = p(x_j | x_i)$, and $Z = 1$
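The identity can be checked numerically on a tiny chain $x_1 \to x_2 \to x_3$ (hypothetical random CPTs): taking the conditionals themselves as the potentials reproduces the same joint with $Z = 1$.

```python
import numpy as np

# Directed-tree vs undirected-tree parameterization on a chain x1 -> x2 -> x3
rng = np.random.default_rng(2)
K = 3
p1 = rng.dirichlet(np.ones(K))              # p(x1)
p21 = rng.dirichlet(np.ones(K), size=K)     # p(x2|x1), each row sums to 1
p32 = rng.dirichlet(np.ones(K), size=K)     # p(x3|x2)

directed = np.einsum('a,ab,bc->abc', p1, p21, p32)

# ψ(x1) = p(x1), ψ(x_i, x_j) = p(x_j | x_i), remaining singleton ψ = 1, Z = 1
psi1, psi12, psi23 = p1, p21, p32
undirected = np.einsum('a,ab,bc->abc', psi1, psi12, psi23)

assert np.allclose(directed, undirected)
assert np.isclose(directed.sum(), 1.0)      # already normalized, so Z = 1
```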

Message-Passing Protocol#

  • a node can send a message to a neighbor only after it has received messages from all of its other neighbors (its subtrees), ensuring that every message passed upward summarizes the complete subtree behind it

Message#

message
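The figure above encodes the message update; in the standard sum-product form (using the undirected-tree potentials from the parameterization section), the message from node $j$ to neighbor $i$ is:

$$ m_{ji}(x_i) = \sum_{x_j} \psi(x_j)\, \psi(x_i, x_j) \prod_{k \in \mathcal{N}(j) \setminus \{i\}} m_{kj}(x_j) $$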

Compute marginal probability#

final_message
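In the same notation, the single-node marginal is obtained by multiplying the node's own potential with all incoming messages and normalizing over $x_i$:

$$ p(x_i) \propto \psi(x_i) \prod_{j \in \mathcal{N}(i)} m_{ji}(x_i) $$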

Sum-Product Algorithm a.k.a. Belief Propagation#

sum_prod_algo

  1. messages flow from the leaves to the root (inward pass)
  2. messages flow from the root back to the leaves (outward pass)
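The two passes can be sketched in code on a tiny star-shaped tree (all potentials random and hypothetical); after both passes, every single-node marginal agrees with brute-force marginalization.

```python
import numpy as np

# Two-pass sum-product on a star: node 0 connected to leaves 1, 2, 3
rng = np.random.default_rng(3)
K = 2
edges = [(0, 1), (0, 2), (0, 3)]
nbrs = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
psi_n = {i: rng.random(K) for i in range(4)}       # ψ(x_i)
psi_e = {e: rng.random((K, K)) for e in edges}     # ψ(x_i, x_j)

def edge_pot(i, j):
    """ψ(x_i, x_j) with axis 0 indexing x_i."""
    return psi_e[(i, j)] if (i, j) in psi_e else psi_e[(j, i)].T

msgs = {}
def send(j, i):
    """m_{j->i}(x_i) = Σ_{x_j} ψ(x_j) ψ(x_i, x_j) Π_{k≠i} m_{k->j}(x_j)"""
    prod = psi_n[j].copy()
    for k in nbrs[j]:
        if k != i:
            prod *= msgs[(k, j)]
    msgs[(j, i)] = edge_pot(i, j) @ prod

for leaf in (1, 2, 3):      # 1. inward pass (root = node 0)
    send(leaf, 0)
for leaf in (1, 2, 3):      # 2. outward pass
    send(0, leaf)

def marginal(i):
    b = psi_n[i].copy()
    for j in nbrs[i]:
        b *= msgs[(j, i)]
    return b / b.sum()

# Brute-force check against the normalized joint
joint = np.einsum('a,b,c,d,ab,ac,ad->abcd', psi_n[0], psi_n[1], psi_n[2],
                  psi_n[3], psi_e[(0, 1)], psi_e[(0, 2)], psi_e[(0, 3)])
joint /= joint.sum()
assert np.allclose(marginal(0), joint.sum(axis=(1, 2, 3)))
assert np.allclose(marginal(1), joint.sum(axis=(0, 2, 3)))
```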
https://itsjeremyhsieh.github.io/posts/cs5340-4-variable-elimination-and-belief-propogation/
Author
Jeremy H
Published at
2024-09-02