Calculation of Term 1 in G

I am having trouble understanding why, for term 1, the two computed entropy terms are added together:

https://github.com/zfountas/deep-active-inference-mc/blob/c40ef0dd9c16c4d98abfa1f9392d9777f1899dc9/src/tfmodel.py#L343-L344

I understand this gives the sum of two entropy terms (where `ps1` $= p_\tau = s_\tau|\pi$ and `qs1` $= q_\tau = s_\tau|o_\tau,\pi$ ):

$$- \sum_\tau \left( \frac{1}{2} \log\left(2 e \pi \sigma^2_{p_\tau}\right) + \frac{1}{2} \log\left(2 e \pi \sigma^2_{q_\tau}\right) \right)
\xrightarrow{\text{simplify}} - \sum_\tau ( H_{p_\tau} + H_{q_\tau} ) $$
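
For reference, here is a minimal sketch of the closed-form entropy of a univariate Gaussian computed from its log-variance. I assume this is roughly what `entropy_normal_from_logvar` does, but I have not copied the repository's implementation; the name `gaussian_entropy_from_logvar` and the plain-Python form are my own.

```python
import math

def gaussian_entropy_from_logvar(logvar):
    """Differential entropy (in nats) of N(mu, sigma^2), with logvar = log(sigma^2).

    Hypothetical stand-in for the repository's entropy_normal_from_logvar.
    """
    # 0.5 * log(2 * pi * e * sigma^2) = 0.5 * (log(2 * pi * e) + logvar)
    return 0.5 * (math.log(2.0 * math.pi * math.e) + logvar)

# Unit variance corresponds to logvar = 0.
h_unit = gaussian_entropy_from_logvar(0.0)

# Increasing logvar by 2 multiplies the variance by e^2 and adds exactly 1 nat.
h_wide = gaussian_entropy_from_logvar(2.0)
```

The entropy depends only on the variance, not on the mean, which is why only the `*_logvar` tensors appear in the line of code in question.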

But in the paper, we see that term 1 is given by:

$$\sum_\tau E_{Q(\theta | \pi)}\left[ E_{Q(o_\tau | \theta, \pi)}\left[ H(s_\tau | o_\tau, \pi) \right] - H(s_\tau | \pi) \right]
\xrightarrow{\text{simplify}} + \sum_\tau ( H_{q_\tau} - H_{p_\tau} ) $$

Why is there a discrepancy between the "+" and the "-"? Where is my understanding breaking down? Am I simplifying the equations incorrectly? If so, can you explain how to correctly transform between the two?

	# E [ log Q(s|pi) - log Q(s|o,pi) ]
	term1 = - tf.reduce_sum(entropy_normal_from_logvar(ps1_logvar) + entropy_normal_from_logvar(qs1_logvar), axis=1)
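
To make the discrepancy concrete, here is a small sketch that evaluates both sign conventions on made-up log-variances. It assumes the Gaussian entropy formula above, replaces the expectations over $\theta$ and $o_\tau$ with a single sample, and uses plain Python lists instead of TensorFlow tensors; all function names and numbers are hypothetical.

```python
import math

def gaussian_entropy_from_logvar(logvar):
    # Assumed closed form: 0.5 * (log(2 * pi * e) + log(sigma^2)).
    return 0.5 * (math.log(2.0 * math.pi * math.e) + logvar)

def term1_as_coded(ps_logvars, qs_logvars):
    # Same sign structure as the quoted line: -(H_p + H_q), summed over entries.
    return -sum(gaussian_entropy_from_logvar(p) + gaussian_entropy_from_logvar(q)
                for p, q in zip(ps_logvars, qs_logvars))

def term1_as_in_paper(ps_logvars, qs_logvars):
    # Sign structure I read from the paper: H_q - H_p, summed over entries.
    return sum(gaussian_entropy_from_logvar(q) - gaussian_entropy_from_logvar(p)
               for p, q in zip(ps_logvars, qs_logvars))

# Made-up log-variances, for illustration only.
ps_logvars = [0.0, 0.5, -1.0]
qs_logvars = [-0.5, 0.0, -2.0]

coded = term1_as_coded(ps_logvars, qs_logvars)
paper = term1_as_in_paper(ps_logvars, qs_logvars)
sum_hq = sum(gaussian_entropy_from_logvar(q) for q in qs_logvars)

print("as coded:   ", coded)
print("as in paper:", paper)
print("difference: ", paper - coded, "= 2 * sum(H_q) =", 2.0 * sum_hq)
```

Under these assumptions the two expressions agree on the $-H_{p_\tau}$ part and differ by exactly $2 \sum H_{q_\tau}$, so the question comes down to the sign of the `qs1_logvar` entropy.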
