How much is one bit of information gain?
Using Shannon information as a measure of information gain:

$$D_{\mathrm{KL}}(p \,\|\, q) = \int p(x) \log \frac{p(x)}{q(x)} \, \mathrm{d}x$$

In this interpretation, Bayesian inference can tell us the information gain of the knowledge update (going from prior distribution $q$ to posterior distribution $p$), measured in bits (when using $\log_2$) or nats (when using $\ln$).
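To make the bits/nats bookkeeping concrete, here is a minimal sketch (not from the text; the function name and the four-hypothesis example are my own) that evaluates the discrete KL divergence in either unit by swapping the logarithm base:

```python
import math

def kl_divergence(p, q, log=math.log):
    """Kullback-Leibler divergence D_KL(p || q) for discrete distributions.

    Pass log=math.log for nats, log=math.log2 for bits.
    Terms with p_i = 0 contribute nothing and are skipped.
    """
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Illustrative knowledge update: a uniform prior over four hypotheses
# collapses onto a single hypothesis.
prior = [0.25, 0.25, 0.25, 0.25]
posterior = [1.0, 0.0, 0.0, 0.0]

gain_bits = kl_divergence(posterior, prior, log=math.log2)  # 2 bits
gain_nats = kl_divergence(posterior, prior)                 # 2 ln 2 nats
```

Ruling out three of four equally likely hypotheses is exactly two binary questions answered, hence 2 bits; the same number in nats is just multiplied by $\ln 2$.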
1 How much is one bit?
- To a physicist, bits/nats are not natural units.
- The strength of information is commonly expressed in sigma, though.
- So what is the knowledge update going from a Gaussian with $\sigma = \sigma_q$ (prior) to one with $\sigma = \sigma_p$ (posterior)?
The prior ($q$) and posterior ($p$) are expressed as Gaussian functions of the parameter $x$. Their shape is a fixed, zero-mean Gaussian of width $\sigma_q$ or $\sigma_p$:

$$q(x) = \frac{1}{\sqrt{2\pi}\,\sigma_q} \exp\left(-\frac{x^2}{2\sigma_q^2}\right), \qquad p(x) = \frac{1}{\sqrt{2\pi}\,\sigma_p} \exp\left(-\frac{x^2}{2\sigma_p^2}\right).$$

The KL-distance is

$$D_{\mathrm{KL}}(p \,\|\, q) = \int p(x) \ln \frac{p(x)}{q(x)} \, \mathrm{d}x,$$

and applied here the distance between prior and posterior is thus:

$$D_{\mathrm{KL}}(p \,\|\, q) = \ln \frac{\sigma_q}{\sigma_p} + \frac{\sigma_p^2}{2\sigma_q^2} - \frac{1}{2}.$$
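The closed-form Gaussian result can be checked numerically. The sketch below (function names and grid parameters are my own choices) integrates $p(x)\ln[p(x)/q(x)]$ on a grid with the trapezoid rule and compares it against the analytic expression:

```python
import math

def gaussian_kl_nats(sigma_p, sigma_q):
    """Closed-form D_KL(p || q) in nats for zero-mean Gaussians."""
    return math.log(sigma_q / sigma_p) + sigma_p**2 / (2 * sigma_q**2) - 0.5

def gaussian_kl_numeric(sigma_p, sigma_q, n=200001, half_width=20.0):
    """Brute-force trapezoid integration of p(x) * ln(p(x)/q(x))."""
    dx = 2 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        x = -half_width + i * dx
        # posterior density p(x)
        p = math.exp(-x**2 / (2 * sigma_p**2)) / (math.sqrt(2 * math.pi) * sigma_p)
        # ln(p(x)/q(x)), written out analytically to avoid dividing densities
        log_ratio = (math.log(sigma_q / sigma_p)
                     + x**2 / (2 * sigma_q**2) - x**2 / (2 * sigma_p**2))
        w = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += w * p * log_ratio * dx
    return total
```

For example, shrinking from $\sigma_q = 3$ to $\sigma_p = 1$ gives $\ln 3 + 1/18 - 1/2 \approx 0.654$ nats, and the numerical integral agrees to high precision.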
1.1 Sigma versus bits
As you see in the above graphic, the more the Gaussian shrinks (i.e. the more the uncertainty decreases), the larger the information gain becomes (in bits). For strong shrinks the constant terms in the KL-distance become negligible, so each further halving of $\sigma$ adds about $\ln 2$ nats, i.e. one bit. This provides an intuitive view of nats/bits for physicists.
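The sigma-to-bits relationship can be tabulated directly from the Gaussian KL-distance. In this sketch (function name and shrink factors are my own), `gain_bits(k)` is the information gain when the width shrinks by a factor $k$, i.e. $\sigma_q = k\,\sigma_p$:

```python
import math

LN2 = math.log(2.0)

def gain_bits(shrink):
    """Information gain in bits when sigma shrinks by the given factor
    (zero-mean Gaussians with sigma_q = shrink * sigma_p)."""
    nats = math.log(shrink) + 1.0 / (2.0 * shrink**2) - 0.5
    return nats / LN2

for k in (2, 4, 8, 16, 100):
    print(f"shrink x{k:>3}: {gain_bits(k):6.2f} bits")
```

The table shows the asymptotic behaviour: going from an 8-fold to a 16-fold shrink adds very nearly one bit, while the first factor of two yields less than a bit because of the constant $-1/2 + \sigma_p^2/(2\sigma_q^2)$ terms.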