6.041 / 6.431 23. Classical Statistical Inference - I
LECTURE 23
• Readings: Section 9.1 (not responsible for t-based confidence intervals, in pp. 471-473)
• Outline
  – Classical statistics
  – Maximum likelihood (ML) estimation
  – Estimating a sample mean
  – Confidence intervals (CIs)
  – CIs using an estimated variance

Classical statistics
[Diagram labels: $\theta$, $p_X(x;\theta)$, $N$, $X$, Estimator, $\hat\Theta$]
• also for vectors $X$ and $\theta$: $p_{X_1,\ldots,X_n}(x_1,\ldots,x_n;\theta_1,\ldots,\theta_m)$
• These are NOT conditional probabilities; $\theta$ is NOT random
  – mathematically: many models, one for each possible value of $\theta$
• Problem types:
  – Hypothesis testing: $H_0: \theta = 1/2$ versus $H_1: \theta = 3/4$
  – Composite hypotheses: $H_0: \theta = 1/2$ versus $H_1: \theta \neq 1/2$
  – Estimation: design an estimator $\hat\Theta$, to keep estimation error $\hat\Theta - \theta$ small

Maximum Likelihood Estimation
• Model, with unknown parameter(s): $X \sim p_X(x;\theta)$
• Pick $\theta$ that "makes data most likely"
    $\hat\theta_{ML} = \arg\max_\theta \, p_X(x;\theta)$
• Compare to Bayesian MAP estimation:
    $\hat\theta_{MAP} = \arg\max_\theta \, \dfrac{p_{X|\Theta}(x \mid \theta)\, p_\Theta(\theta)}{p_X(x)}$
• Example: $X_1,\ldots,X_n$: i.i.d., exponential($\theta$)
    $\max_\theta \prod_{i=1}^{n} \theta e^{-\theta x_i}$
    $\max_\theta \left( n\log\theta - \theta \sum_{i=1}^{n} x_i \right)$
    $\hat\theta_{ML} = \dfrac{n}{x_1 + \cdots + x_n}$
    $\hat\Theta_n = \dfrac{n}{X_1 + \cdots + X_n}$

Desirable properties of estimators (should hold FOR ALL $\theta$!!!)
• Unbiased: $E[\hat\Theta_n] = \theta$
  – exponential example, with $n = 1$: $E[1/X_1] = \infty \neq \theta$ (biased)
• Consistent: $\hat\Theta_n \to \theta$ (in probability)
  – exponential example: $(X_1 + \cdots + X_n)/n \to E[X] = 1/\theta$
  – can conclude that: $\hat\Theta_n = n/(X_1 + \cdots + X_n) \to 1/E[X] = \theta$
• "Small" mean squared error (MSE)
    $E[(\hat\Theta - \theta)^2] = \mathrm{var}(\hat\Theta - \theta) + (E[\hat\Theta - \theta])^2 = \mathrm{var}(\hat\Theta) + (\text{bias})^2$

Estimate a mean
• $X_1,\ldots,X_n$: i.i.d., mean $\theta$, variance $\sigma^2$
    $X_i = \theta + W_i$
    $W_i$: i.i.d., mean 0, variance $\sigma^2$
    $\hat\Theta_n = \text{sample mean} = M_n = \dfrac{X_1 + \cdots + X_n}{n}$
Properties:
• $E[\hat\Theta_n] = \theta$ (unbiased)
• WLLN: $\hat\Theta_n \to \theta$ (consistency)
• MSE: $\sigma^2/n$
• Sample mean often turns out to also be the ML estimate. E.g., if $X_i \sim N(\theta, \sigma^2)$, i.i.d.

Confidence intervals (CIs)
• An estimate $\hat\Theta_n$ may not be informative enough
• A $1-\alpha$ confidence interval is a (random) interval $[\hat\Theta_n^-, \hat\Theta_n^+]$, s.t.
    $P(\hat\Theta_n^- \le \theta \le \hat\Theta_n^+) \ge 1-\alpha, \quad \forall\, \theta$
  – often $\alpha = 0.05$, or 0.25, or 0.01
  – interpretation is subtle
• CI in estimation of the mean $\hat\Theta_n = (X_1 + \cdots + X_n)/n$
  – normal tables: $\Phi(1.96) = 1 - 0.05/2$
    $P\left( \dfrac{|\hat\Theta_n - \theta|}{\sigma/\sqrt{n}} \le 1.96 \right) \approx 0.95$ (CLT)
    $P\left( \hat\Theta_n - \dfrac{1.96\,\sigma}{\sqrt{n}} \le \theta \le \hat\Theta_n + \dfrac{1.96\,\sigma}{\sqrt{n}} \right) \approx 0.95$
  More generally: let $z$ be s.t. $\Phi(z) = 1 - \alpha/2$
    $P\left( \hat\Theta_n - \dfrac{z\sigma}{\sqrt{n}} \le \theta \le \hat\Theta_n + \dfrac{z\sigma}{\sqrt{n}} \right) \approx 1-\alpha$

The case of unknown $\sigma$
• Option 1: use upper bound on $\sigma$
  – if $X_i$ Bernoulli: $\sigma \le 1/2$
• Option 2: use ad hoc estimate of $\sigma$
  – if $X_i$ Bernoulli($\theta$): $\hat\sigma = \sqrt{\hat\Theta(1-\hat\Theta)}$
• Option 3: Use generic estimate of the variance
  – Start from $\sigma^2 = E[(Y-\theta)^2]$
    $\hat\sigma_n^2 = \dfrac{1}{n}\sum_{i=1}^{n} (Y_i - \theta)^2 \to \sigma^2$ (but do not know $\theta$)
    $\hat S_n^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (Y_i - \hat\Theta_n)^2 \to \sigma^2$ (unbiased: $E[\hat S_n^2] = \sigma^2$)

An example of an exact CI
• $X$: exponential with parameter $\theta$
• Analyze $[a/X,\; b/X]$ as a confidence interval for $\theta$:
    $P\left( \dfrac{a}{X} \le \theta \le \dfrac{b}{X} \right) = P\left( \dfrac{a}{\theta} \le X \le \dfrac{b}{\theta} \right) = \int_{a/\theta}^{b/\theta} \theta e^{-\theta x}\,dx = \left. -e^{-\theta x} \right|_{x=a/\theta}^{x=b/\theta} = e^{-a} - e^{-b}$
  No dependence on $\theta$, so have a confidence interval
  – Example: $\left[ \dfrac{1}{4X},\; \dfrac{4}{X} \right]$ is a 0.76 confidence interval ("76% CI")

MIT OpenCourseWare http://ocw.mit.edu
6.041 / 6.431 Probabilistic Systems Analysis and Applied Probability, Fall 2010
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
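The formulas in the slides can be checked numerically. The following is a minimal Python sketch, not part of the original lecture, covering three of them: the exponential ML estimate n/(x1 + ... + xn), the CLT-based interval for a mean with σ replaced by the unbiased estimate that uses 1/(n−1), and the coverage e^(−a) − e^(−b) of the exact interval [a/X, b/X]. The function names, the sample size, and the rate 2.0 used in the demo are all illustrative assumptions.

```python
import math
import random
import statistics
from statistics import NormalDist


def exponential_mle(samples):
    """ML estimate of the exponential rate theta: n / (x1 + ... + xn)."""
    return len(samples) / sum(samples)


def mean_confidence_interval(samples, alpha=0.05):
    """Approximate (1 - alpha) CI for the mean, based on the CLT.

    The unknown sigma is replaced by the square root of the unbiased
    variance estimate (the one with 1/(n-1)), so this is only an
    approximation that is reasonable for large n.
    """
    n = len(samples)
    theta_hat = statistics.fmean(samples)       # sample mean
    s_n = statistics.stdev(samples)             # uses the 1/(n-1) normalization
    z = NormalDist().inv_cdf(1 - alpha / 2)     # z such that Phi(z) = 1 - alpha/2
    half_width = z * s_n / math.sqrt(n)
    return theta_hat - half_width, theta_hat + half_width


def exact_ci_coverage(a, b):
    """Coverage of [a/X, b/X] for theta when X is exponential(theta)."""
    return math.exp(-a) - math.exp(-b)


if __name__ == "__main__":
    rng = random.Random(0)                      # fixed seed, illustrative only
    true_theta = 2.0                            # hypothetical true rate
    data = [rng.expovariate(true_theta) for _ in range(10_000)]

    print("ML estimate of theta:", exponential_mle(data))
    print("95% CI for the mean 1/theta:", mean_confidence_interval(data))
    print("Coverage of [1/(4X), 4/X]:", exact_ci_coverage(0.25, 4.0))
```

Note that the interval computed by `mean_confidence_interval` is for the mean E[X] = 1/θ, not for θ itself, and that the t-based correction for small n is deliberately left out, in line with the reading note in the slides.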
Description
These lecture notes introduce classical statistical inference. The topics covered are:
1. Classical statistics
2. Maximum likelihood (ML) estimation
3. Estimating a sample mean
4. Confidence intervals (CIs)
5. CIs using an estimated variance
Instructors: Prof. Dimitri Bertsekas, Prof. John Tsitsiklis. MIT Course Number: 6.041 / 6.431. Level: Undergraduate / Graduate.
6.041 / 6.431 23. Classical Statistical Inference - I, Probabilistic Systems Analysis and Applied Probability, Electrical Engineering and Computer Science, Engineering, Massachusetts Institute of Technology: MIT OpenCourseWare, http://ocw.mit.edu (11-11-2011). License: Creative Commons BY-NC-SA: http://ocw.mit.edu/terms/#cc.