There are many mathematical definitions of entropy.  The mental picture I find most useful is to imagine the following:

• You are put in a room, and your job is to label everything in it using a Sharpie indelible marker and masking tape.
• You are asked to label everything using the binary number system.  Each binary number will be that particular object's ID.

As you go about this, you may want to give the objects you refer to most often the shortest numbers.  That way, since you mention "FORK" much more often than "NUMBER 6 SCREW", you end up having to say fewer digits.
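To make the "short labels for common objects" idea concrete, here is a rough sketch. It uses Huffman coding, which is one standard way to build such labels and is my choice for illustration, not something the room analogy depends on. The object names and mention counts are made up.

```python
import heapq

def huffman_code_lengths(mention_counts):
    """Return {object: label length in bits} for a Huffman code."""
    if len(mention_counts) == 1:
        return {name: 1 for name in mention_counts}
    # Heap entries: (total count, tie-breaker, {object: depth so far}).
    # The tie-breaker keeps Python from ever comparing the dicts.
    heap = [(count, i, {name: 0})
            for i, (name, count) in enumerate(mention_counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least-mentioned groups; everything in them
        # gets one binary digit longer.
        c1, _, group1 = heapq.heappop(heap)
        c2, _, group2 = heapq.heappop(heap)
        merged = {name: depth + 1 for name, depth in {**group1, **group2}.items()}
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

# Hypothetical counts of how often each object gets mentioned.
mentions = {"FORK": 50, "SPOON": 30, "PLATE": 15, "NUMBER 6 SCREW": 5}
lengths = huffman_code_lengths(mentions)
for name, bits in sorted(lengths.items(), key=lambda item: item[1]):
    print(f"{name}: {bits} bit(s)")
```

With these counts the fork ends up with a 1-digit label and the screw with a 3-digit one, so the things you say most often cost you the fewest digits.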

The measure of entropy in this room is the number of binary digits each label needs so that every type of object gets its own number.  As a formula:

Entropy ~= log2(N),    where N is the number of different types of objects in the room (treating every type as equally likely to come up)
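A quick sketch of what log2(N) means for the labeling job, assuming every label has the same length. The function name fixed_label_bits and the room sizes are my own, picked for illustration.

```python
import math

def fixed_label_bits(n_types):
    """Whole binary digits needed to give n_types objects distinct labels."""
    # (n - 1).bit_length() is ceil(log2(n)) computed with integers only.
    return (n_types - 1).bit_length()

for n_types in (2, 8, 10):
    print(n_types, "types:", "log2 =", round(math.log2(n_types), 2),
          "-> digits per label =", fixed_label_bits(n_types))

# The eight 3-digit labels for a room with 8 types of objects.
labels = [format(i, "03b") for i in range(8)]
print(labels)
```

When N is not a power of two, log2(N) is not a whole number, so you round up to get the digits you actually write on the tape. That is one reason the formula is a "~=".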

Now take a probabilistic situation with outcomes x1, x2, …, xn, where P(xi) is the probability of xi:

entropy = H(x) = SUM over i of [ -P(xi) * log2( P(xi) ) ]

This formula calculates the expected number of binary digits required to enumerate the outcomes.
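Here is a minimal sketch of that sum in Python. The name entropy_bits and the example distributions are mine. Outcomes with probability 0 are skipped, following the usual convention that they contribute nothing.

```python
import math

def entropy_bits(probs):
    """H = SUM of -P(xi) * log2(P(xi)) over all outcomes, in bits."""
    # Zero-probability outcomes are skipped: log2(0) is undefined,
    # and by convention such outcomes add 0 to the sum.
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Eight equally likely outcomes: matches log2(8) = 3.
uniform = [1 / 8] * 8
print(entropy_bits(uniform))

# Four outcomes where some are much more common than others.
skewed = [0.5, 0.25, 0.125, 0.125]
print(entropy_bits(skewed))
```

The uniform case gives back log2(N), so the room formula is just this one with every type equally likely. The skewed case comes out below log2(4) = 2, which is the payoff from giving common outcomes short labels.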

Now let us apply this to a realistic situation, the good old-fashioned coin flip, with

• P(heads) = 1/2
• P(tails) = 1/2

Entropy = -(1/2) * log2(1/2) - (1/2) * log2(1/2) = -(1/2) * (-1) - (1/2) * (-1) = 1 bit

Thus, in order to enumerate all the states of a coin, you require 1 bit.  You just call heads bit = 1 and tails bit = 0 …

… and that one bit is all you need.
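The fair-coin case can be checked in a few lines. This is only a sketch: entropy_bits, BIT_FOR, and the sample sequence of flips are names and data I made up for the example.

```python
import math

def entropy_bits(probs):
    """H = SUM of -P(xi) * log2(P(xi)), in bits."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

fair_coin = {"heads": 0.5, "tails": 0.5}
coin_entropy = entropy_bits(fair_coin.values())
print(coin_entropy, "bit per flip")

# Label the two states with one binary digit each.
BIT_FOR = {"heads": "1", "tails": "0"}
flips = ["heads", "tails", "tails", "heads"]
encoded = "".join(BIT_FOR[flip] for flip in flips)
print(encoded)  # one digit written down per flip
```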

If you have a strange coin that always comes up heads, that is to say P(heads) = 1, then:

Entropy = -(1) * log2(1) = -(1) * (0) = 0

(The tails term, with P(tails) = 0, contributes nothing.)

Thus, for your weirdo coin that only flips out heads, you need no bits to enumerate its states.  There is no entropy.  You always get heads, sucker!  Or: heads I win, tails you lose!
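To see what happens between the fair coin and the always-heads coin, here is a small sketch that applies the same entropy sum to coins with different biases. The function name coin_entropy_bits and the in-between probabilities 0.9 and 0.99 are my own picks for illustration.

```python
import math

def coin_entropy_bits(p_heads):
    """Entropy in bits of a coin that lands heads with probability p_heads."""
    probs = (p_heads, 1.0 - p_heads)
    # An outcome that never happens (probability 0) adds nothing to the sum.
    return sum(-p * math.log2(p) for p in probs if p > 0)

for p_heads in (0.5, 0.9, 0.99, 1.0):
    print(f"P(heads) = {p_heads}: {coin_entropy_bits(p_heads):.3f} bits")
```

The more lopsided the coin, the less there is to say about each flip, and at P(heads) = 1 there is nothing left to say at all.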
