Let $X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} \text{Exponential}(\lambda)$. The pdf is $f(x) = \lambda e^{-\lambda x}$ for $x>0$. Find the MLE for $\lambda$.
Step 1: Likelihood $$L(\lambda) = \prod_{i=1}^n \lambda e^{-\lambda x_i} = \lambda^n \exp\left(-\lambda \sum_{i=1}^n x_i\right)$$
Step 2: Log-Likelihood $$l(\lambda) = n \ln(\lambda) - \lambda \sum_{i=1}^n x_i$$
Step 3: Differentiate $$\frac{d}{d\lambda} l(\lambda) = \frac{n}{\lambda} - \sum_{i=1}^n x_i$$
Step 4: Set to 0 and Solve $$\frac{n}{\lambda} = \sum_{i=1}^n x_i \implies \lambda = \frac{n}{\sum_{i=1}^n x_i}$$ $$\hat\lambda_{\text{MLE}} = \frac{1}{\bar{X}}$$ Since $\frac{d^2}{d\lambda^2} l(\lambda) = -\frac{n}{\lambda^2} < 0$, this stationary point is a maximum. (This makes sense; the rate parameter $\lambda$ is the inverse of the average time.)
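The closed form can be sanity-checked numerically. The sketch below is only an illustration under assumed settings: the true rate of 2.0, the sample size, the seed, and the helper names `exp_log_likelihood`, `exp_mle`, and `grid_argmax` are all hypothetical choices, not part of the lecture. It simulates exponential data, computes $n / \sum x_i$, and compares it with a brute-force maximiser of the log-likelihood from Step 2.

```python
import math
import random


def exp_log_likelihood(lam, n, total):
    """Log-likelihood from Step 2: n ln(lambda) - lambda * sum(x_i)."""
    return n * math.log(lam) - lam * total


def exp_mle(xs):
    """Closed-form MLE from Step 4: n / sum(x_i), i.e. 1 / sample mean."""
    return len(xs) / sum(xs)


def grid_argmax(xs, lo=0.01, hi=10.0, steps=100_000):
    """Brute-force maximiser of the log-likelihood over an evenly spaced grid."""
    n, total = len(xs), sum(xs)          # sufficient statistics, computed once
    grid = (lo + (hi - lo) * k / steps for k in range(steps + 1))
    return max(grid, key=lambda lam: exp_log_likelihood(lam, n, total))


random.seed(0)                           # fixed seed so the run is reproducible
true_rate = 2.0                          # assumed "true" lambda for the simulation
sample = [random.expovariate(true_rate) for _ in range(5_000)]

lam_closed = exp_mle(sample)
lam_grid = grid_argmax(sample)
print(f"closed form: {lam_closed:.4f}, grid search: {lam_grid:.4f}")
```

Because the log-likelihood is concave in $\lambda$, the two answers should agree up to the grid spacing, and both should land near the assumed true rate.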
As the lecture ends, the professor returns to the opening question: How do we learn from random data? The answer, now visible through the mathematical scaffolding, is this: We learn by constructing estimators and tests whose long-run frequency properties we can prove, whose information bounds we can derive, and whose optimality we can characterize. The randomness never disappears, but mathematical statistics gives us a language to quantify, bound, and even embrace that randomness.
The students pack their notebooks, the blackboard is erased, and the likelihood functions vanish into chalk dust. But the architecture remains—an enduring, rigorous, and beautiful framework for making sense of a world we can never fully observe.
This concludes the deep write-up. The mathematical statistics lecture, at its best, is not a collection of formulas but a narrative about certainty, uncertainty, and the extraordinary power of optimal inference.
Professors erase boards quickly. Use your phone. Take a photo of the completed proof before they erase it. Use an app like Notability or OneNote to import that photo and annotate it later.
For students, listening to a derivation of the Cramér–Rao bound can feel like watching a magic trick from the third row. Here is how to move to the front row.
In this lecture, we established that:
Next Lecture: We will evaluate the lower bound of variance for unbiased estimators (Cramér-Rao Lower Bound) and introduce Interval Estimation (Confidence Intervals).
If you are looking for a definitive resource that bridges the gap between lecture concepts and high-level theory, the Institute of Mathematical Statistics (IMS) Lecture Notes – Monograph Series is the premier collection.
For a specific article that provides a comprehensive look at fundamental concepts used in mathematical statistics, I recommend:
"Matching Methods for Causal Inference: A Review and a Look Forward". Source: Statistical Science (via Project Euclid).
Why it’s a good choice: While "Mathematical Statistics" covers the math behind data, this article focuses on causal inference, one of the most practical applications of the field. It provides a structured way to think about matching methods, which aim to reduce bias and to approximate randomized experiments with observational data. Both are core topics in graduate-level statistics.
Other Noteworthy Resources
If you are looking for specific lecture-style materials or deeper dives into particular theories:
For Core Foundations: "Robust Estimation of a Location Parameter" is a classic paper that explains how to define estimators when your data doesn't perfectly follow a standard distribution.
For Testing Hypotheses: "The $\chi^2$ Test of Goodness of Fit" is an expository discussion written specifically for students and users of statistical theory rather than just experts. It covers historical development and practical applications of the chi-square test.
For Advanced Nonparametrics: The IMS Lecture Notes series contains volumes like "Recent Developments in Nonparametric Inference and Probability", which provides a rigorous look at signal detection and modern estimation problems.
For Lecture Notes (Introductory): If you need actual structured notes for study, BYJU's Mathematical Statistics Overview provides a clear starting point for the collection, analysis, and organization of data.
The lecture moves to estimation. The Method of Moments is introduced first: intuitive and long-established, but often statistically inefficient. Then, the crown jewel: Maximum Likelihood Estimation (MLE). The professor writes:
$$\hat\theta_{\text{MLE}} = \arg\max_{\theta \in \Theta} L(\theta; x)$$
The MLE is not just a recipe; it is a theorem waiting to happen. Under regularity conditions, the lecture will sketch the proof of its consistency (as sample size grows, the estimator converges to the true value) and asymptotic normality:
$$\sqrt{n}(\hat\theta - \theta) \xrightarrow{d} N(0, I(\theta)^{-1})$$
Here, $I(\theta)$ is the Fisher information of a single observation, a measure of how much information the data carry about $\theta$. The Cramér-Rao lower bound, derived earlier, now reveals its teeth: under regularity conditions, no unbiased estimator based on $n$ i.i.d. observations can have variance lower than $1/(n I(\theta))$. The MLE asymptotically achieves this bound. It is, in the limit, the best possible.
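To make the asymptotic statement concrete, here is a small simulation sketch for the exponential model, where the per-observation Fisher information is $I(\lambda) = 1/\lambda^2$ (minus the expected second derivative of $\ln\lambda - \lambda x$). The true rate, sample size, replication count, seed, and the helper name `scaled_errors` are illustrative assumptions. The code draws many samples, forms $\sqrt{n}(\hat\lambda - \lambda)$ for each, and compares the empirical variance with the limiting value $I(\lambda)^{-1} = \lambda^2$.

```python
import math
import random
import statistics


def scaled_errors(true_rate, n, reps, seed=1):
    """Simulate sqrt(n) * (lambda_hat - lambda) over `reps` independent samples."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        total = sum(rng.expovariate(true_rate) for _ in range(n))
        lam_hat = n / total                      # MLE of the exponential rate
        out.append(math.sqrt(n) * (lam_hat - true_rate))
    return out


true_rate = 2.0                                  # assumed true lambda
fisher_info = 1.0 / true_rate**2                 # per-observation I(lambda) = 1 / lambda^2
errors = scaled_errors(true_rate, n=500, reps=2_000)

empirical_var = statistics.variance(errors)
limit_var = 1.0 / fisher_info                    # asymptotic variance I(lambda)^{-1}
print(f"empirical variance: {empirical_var:.3f}, limit: {limit_var:.3f}")
```

With a finite sample the empirical variance will not match the limit exactly; the point is only that it settles close to $\lambda^2$ as $n$ grows.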
This is the heart of the mathematical statistics lecture. It moves in a cycle:
The distribution of a statistic (over repeated sampling) is its sampling distribution. This is the key to inference.
Example: If $X_i \stackrel{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$, then: $$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
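The phrase "over repeated sampling" can be illustrated by simulation. The sketch below uses assumed values ($\mu = 5$, $\sigma = 2$, $n = 25$, chosen only for illustration): it draws many normal samples, records each sample mean, and checks that the means centre on $\mu$ with variance close to $\sigma^2/n$.

```python
import random
import statistics

rng = random.Random(42)                 # fixed seed for reproducibility
mu, sigma, n = 5.0, 2.0, 25             # illustrative values, not from the lecture
reps = 20_000                           # number of repeated samples

# Each entry is the mean of one fresh sample of size n.
sample_means = [
    statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.variance(sample_means)
print(f"mean of X-bar: {mean_of_means:.3f} (theory {mu})")
print(f"var of X-bar:  {var_of_means:.4f} (theory {sigma**2 / n:.4f})")
```

A histogram of `sample_means` would approximate the $N(\mu, \sigma^2/n)$ density, which is the sampling distribution of $\bar{X}$ in this example.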