LECTURE 1Introduction2 HandoutsLecture outlineGoals and mechanics of the class • notation • entropy: definitions and properties • mutual information: definitions and prop₭ erties Reading: Ch. 1, Scts. 2.1-2.5. GoalsOur goals in this class are to establish an understanding of the intrinsic properties of transmission of information and the relatiio between coding and the fundamental limits of information transmission in the context of communications Our class is not a comprehensive introductiio to the field of information theory and will not touch in a significant manner on such important topics as data compression and complexity, which belong in a source-coding class Notation– random variable (r.v.) : X – sample value of a random variable : x– set of possible sample values x of ther.v. X : X – Probability mass function (PMF) of a discrete r.v. X : PX(x) – Probability density function (pdf) of a continuous r.v. : pX(x) EntropyEntropy is a measure of the average un₭ certainty associated with a random variabbl The entropy of a discrete r.v. X is H(X)=• PX(x)log2(PX(x))− �x∈Xentropy is always non-negative • Joint entropy: the entropy of two dis₭ crete r.v.s X, Y with joint PMF PX,Y (x, y) is: H(X, Y )= − �x∈X ,y∈Y PX,Y (x, y)log2 �PX,Y (x, y)� Conditional entropy: expected value of • entropies calculated according to conditioona distributions H(YX)= EZ[H(YX =||Z)] for r.v. Z independent of X andidentically distributed with X. Intuitively,this is the average of the entropy of Ygiven X over all possible values of X.Conditional entropy: chain rule H(YX)= EZ[H(YX = Z)] = − �|PX(x) � PY ||X(y|x)log2[PY |X(y|x)] x∈X y∈Y = − � PX,Y (x, y) log2[PY |X(y|x)] x∈X ,y∈Y Compare with joint entropy: H(X, Y ) = − � PX,Y (x, y) log2[PX,Y (x, y)] x∈X ,y∈Y = − � PX,Y (x, y) log2[PY |X(y|x)PX(x)] x∈X ,y∈Y= − � PX,Y (x, y) log2[PY |X(y|x)]x∈X ,y∈Y− � PX,Y (x, y) log2[PX(x)]x∈X ,y∈Y= H(YX)+ H(X)|This is the Chain Rule for entropy: H(X1, . . . , Xn)= �ni=1 H(Xi|X1 . . . Xi−1). Questiion H(YX)= H(XY )?||Relative entropyRelative entropy is a measure of the distaanc between two distributions, also known as the Kullback Leibler distance between PMFs PX(x) and PY (y). Definition: D(PX||PY )= �x∈X PX(x) log �PX (x)� PY (x) in effect we are considering the log to be ar.v. of which we take the mean (note that we assume 0log(0) = 0 and p log(p )= ∞p 0Mutual informationMutual Information: let X, Y be r.v.s with joint PMF PX,Y (x, y) and marginal PMFs PX(x) and PY (y) Definition: I(X; Y ) � PX,Y (x, y) � = � PX,Y (x, y) log PX(x)PY (y)x∈X ,y∈Y= D �PX,Y (x, y)||PX(x)PY (y)�intuitively: measure of how dependent ther.v.s are Useful expression for mutual information:I(X; Y )= H(X)+ H(Y ) − H(X, Y ) = X)H(Y ) − H(Y |= Y )H(X) − H(X|= I(Y ; X) Question: what is I(X; X)?� Mutual information chain ruleConditional mutual information: I(X; YZ)=|H(X|Z) − H(X|Y, Z) I(X1, . . . , Xn; Y ) = H(X1, . . . , Xn) − H(X1, . . . , Xn|Y ) = Hn(X1, . . . , Xn) − H(X1, . . . , Xn|Y ) =H(Xi|X1 . . . Xi−1)i=1n�− n�H(Xi|X1 . . . Xi−1,Y )i=1=I(Xi; Y |X1 . . . Xi−1)i=1 Look at 3 r.v.s: I(X1,X2; Y )= I(X1; Y )+I(X2; YX1) where I(X2; YX1) is the extra ||information about Y given by X2, but not given by X1 MIT OpenCourseWarehttp://ocw.mit.edu 6.441 Information Theory Spring 2010 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Description
The main objectieve of this lecture notes are to establish an understanding of the intrinsic properties of transmission of information and the relation between coding and the fundamental limits of information transmission in the context of communications and Entropy, it is a measure of the average uncertainty associated with a random variable and diffrent types of Entropy 1.Conditional entropy and 2.Relative entropy.
1.Conditional entropy: expected value of entropies calculated according to conditional distributions H(YX)= EZ[H(YX =||Z)] for r.v. Z independent of X and identically distributed with X. Intuitively, this is the average of the entropy of Y given X over all possible values of X.
2. Relative entropy : It is a measure of the distance between two distributions, also known as the Kullback Leibler distance between PMFs PX(x) and PY (y).
Instructors: Prof. Muriel Médard , MIT Course Number:6.441 Level: Graduate , 6.441-1 Introduction to Information Theory, Entropy, Information Theory, Electrical Engineering and Computer Science, Engineering, Massachusetts Institute of Technology: MIT Open Course Ware, http://ocw.mit.edu (11-09-2011).License: Creative Commons BY-NC-SA: http://ocw.mit.edu/terms/#cc".