In all likelihood I’ll post here a random selection of topics I’m interested in (machine learning, statistics and possibly some simple mathematics), partly to force myself to work out “clean” arguments. Even better if some posts are of use to other people.

Measurable Estimator Selection

In statistics and machine learning we estimate parameters or functions from data and then analyze the estimate, for instance by providing confidence regions around it. To do so we need expressions of the form \Pr(\hat \theta \in B), where B is some region in parameter space, to be well defined. Generally we have some inference method that takes the outcome of an experiment and assigns a parameter to it. Calling this function S, we have \Pr(\hat \theta \in B) = \Pr(S^{-1}[B]), so S^{-1}[B] has to be an event. That is, we need to know that S is measurable, or in other words that the estimator selection is measurable.

Let (\Omega,\mathcal{A},P) be a probability space on which i.i.d. random variables X_1,\ldots, X_n are defined, and let \Theta be a parameter space from which we want to choose the element \hat \theta that maximizes the criterion function M(\theta, X_1,\ldots, X_n), where M takes values in \mathbb{R}. Such a \hat \theta is called an M-estimator, and one might ask when the selection S(\omega) = \arg \max_{\theta \in \Theta} M'(\theta, \omega) is measurable, where M'(\theta,\omega):= M(\theta,X_1(\omega), \ldots,X_n(\omega)).
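To make the objects concrete, here is a minimal numerical sketch of such a selection. Everything specific in it is my own assumption for illustration: the parameter space \Theta = [0,1], the least-squares criterion M(\theta, x_1,\ldots,x_n) = -\sum_i (x_i - \theta)^2, the uniform data, and the finite grid standing in for \Theta. The names `criterion` and `select` are hypothetical helpers, not part of any library.

```python
import numpy as np

def criterion(theta, x):
    """M(theta, x_1, ..., x_n): negative sum of squared deviations (illustrative choice)."""
    return -np.sum((x - theta) ** 2)

def select(x, grid):
    """S: return the grid point that maximizes the criterion for the observed sample x."""
    values = np.array([criterion(t, x) for t in grid])
    return grid[np.argmax(values)]

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=50)   # one realization X_1(omega), ..., X_n(omega)
grid = np.linspace(0.0, 1.0, 1001)   # finite stand-in for Theta = [0, 1]
theta_hat = select(x, grid)          # S(omega) for this particular omega
```

For this criterion the maximizer over [0,1] is the sample mean, so `theta_hat` should agree with `x.mean()` up to the grid resolution.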

The most natural setting here seems to be the following: \Theta is a compact metrizable space equipped with its Borel \sigma-algebra (metrizability guarantees the countable dense subset used below), the map \theta \mapsto M'(\theta,\omega) is continuous with a unique maximizer for every \omega \in \Omega, and the map \omega \mapsto M'(\theta,\omega) is measurable for every \theta \in \Theta. The maximum is then well defined, and \omega \mapsto \max_{\theta \in \Theta} M'(\theta,\omega) is measurable: by continuity the maximum equals the supremum over a countable dense subset, and a countable supremum of measurable functions is measurable. The selection is measurable as well. Let \mathcal{O} be an open set in \Theta and \tilde \Theta a countable dense subset of \Theta. Then, using the continuity of M',

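\displaystyle \{\omega: S(\omega) \in \overline{\mathcal{O}}\} = \bigcup_{\theta \in \overline{\mathcal{O}} } \{\omega: M'(\theta,\omega) \geq \max_{\theta' \in \Theta} M'(\theta',\omega) \}

\displaystyle = \bigcap_{k \in \mathbb{N}} \bigcup_{\theta \in \mathcal{O} \cap \tilde \Theta}  \{\omega: M'(\theta,\omega) \geq \max_{\theta' \in \Theta} M'(\theta',\omega) - 1/k \}  \in \mathcal{A}

The closure \overline{\mathcal{O}} appears on the left because near-maximizers \theta_k \in \mathcal{O} \cap \tilde \Theta accumulate, by compactness of \Theta and uniqueness of the maximizer, at S(\omega), which may lie on the boundary of \mathcal{O}. This extends to arbitrary Borel-measurable subsets of \Theta, provided \Theta is metrizable with some metric d: every closed set F is the intersection over j \in \mathbb{N} of the closures of the open sets \{\theta: d(\theta,F) < 1/j\}, so \{\omega: S(\omega) \in F\} \in \mathcal{A}, and the closed sets generate the Borel \sigma-algebra.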
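The step that reduces the maximum over \Theta to a countable dense subset can be checked numerically. The sketch below is only an illustration under my own assumptions: \Theta = [0,1], the dyadic rationals as the dense subset \tilde \Theta, and a least-squares criterion whose maximizer is the sample mean. The helpers `criterion` and `dyadic_grid` are hypothetical names. It records the gap between the true maximum and the maximum over finer and finer dyadic grids.

```python
import numpy as np

def criterion(thetas, x):
    """Evaluate M'(theta, omega) = -sum_i (x_i - theta)^2 for an array of candidates."""
    return -np.sum((x[None, :] - thetas[:, None]) ** 2, axis=1)

def dyadic_grid(level):
    """All points k / 2^level in [0, 1]; their union over all levels is dense in [0, 1]."""
    return np.arange(2 ** level + 1) / 2 ** level

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=20)   # one realization of the sample

# For this criterion the maximizer over [0, 1] is the sample mean (closed form).
true_max = criterion(np.array([x.mean()]), x)[0]

# Gap between the true maximum and the maximum over each dyadic grid.
gaps = [true_max - criterion(dyadic_grid(level), x).max() for level in range(1, 13)]
```

Since the grids are nested, the gaps cannot increase with the level, and by continuity of the criterion they shrink towards zero, which is the content of the reduction to \tilde \Theta.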