Function estimation

Ok, here's the idea. Suppose I had a function $$f:X \to Y$$ and distance metrics $d_X:X \times X \to \mathbb{R}$ and $d_Y:Y \times Y \to \mathbb{R}$. If I assumed this function was Lipschitz continuous with Lipschitz constant $L$, I could say $$d_Y(f(x),f(z)) \leq L\, d_X(x,z).$$ This isn't very controversial. Suppose I didn't know much about $f$ and thus sampled it at $n$ points, $x_1$ through $x_n$. If I wanted to know about the value of $f$ at a point $z$, I could say that $d_Y(f(x_i),f(z)) \leq L\, d_X(x_i,z)$ for all $i = 1,\dots,n$. Still not very controversial, but hey, we have bounds.
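To make that concrete, here's a tiny numerical sanity check (a sketch, assuming Python/NumPy, and standing in $f(x)=\sin(x)$ for the unknown function, since $|\sin'| \leq 1$ gives it $L = 1$):

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.sin   # stand-in for the unknown f; |sin'(x)| <= 1, so L = 1 works
L = 1.0

# Check d_Y(f(x), f(z)) <= L * d_X(x, z) on random pairs,
# with both metrics just absolute difference on the real line.
x = rng.uniform(-5, 5, size=1000)
z = rng.uniform(-5, 5, size=1000)
assert np.all(np.abs(f(x) - f(z)) <= L * np.abs(x - z) + 1e-12)
```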

Let's think now: what happens if $Y$ is standard one-dimensional Euclidean space? Well, then we have $$|f(x_i)-f(z)| \leq L\, d_X(x_i,z) \quad \forall i.$$ Unpacking the absolute value, we can say $-L\, d_X(x_i,z) \leq f(z)-f(x_i) \leq L\, d_X(x_i,z)$. If we add $f(x_i)$ to all three parts we then get: $$f(x_i)-L\, d_X(x_i,z) \leq f(z) \leq f(x_i) + L\, d_X(x_i,z).$$ Now, since we have these inequalities for all $i$, we can take the tightest one on each side: $$\max_i \left(f(x_i)-L\, d_X(x_i,z)\right) \leq f(z) \leq \min_i \left(f(x_i) + L\, d_X(x_i,z)\right).$$ Since every part of this inequality is computable from the samples, this presents us with a clean way to bound our estimate.
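Here's a minimal sketch of these bounds in code (my assumptions: Python/NumPy, 1-D inputs with $d_X(x,z) = |x-z|$, a known $L$; the name `lipschitz_bounds` is just my label):

```python
import numpy as np

def lipschitz_bounds(xs, fs, z, L):
    """Bounds on f(z) given samples fs = f(xs) and Lipschitz constant L.

    Uses d_X(x_i, z) = |x_i - z|, so:
      max_i f(x_i) - L|x_i - z|  <=  f(z)  <=  min_i f(x_i) + L|x_i - z|
    """
    d = np.abs(np.asarray(xs) - z)
    lower = np.max(np.asarray(fs) - L * d)
    upper = np.min(np.asarray(fs) + L * d)
    return lower, upper

# Example: bound sin(0.5) from five samples, using L = 1.
xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
lo, hi = lipschitz_bounds(xs, np.sin(xs), z=0.5, L=1.0)
# gives lo = 0.341..., hi = 0.5, and indeed sin(0.5) ~= 0.479 sits between.
```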

Now we can start having a little fun. Supposing we know nothing about the underlying function, if we assume our samples are not 'special' relative to the underlying function (and we assume a bunch of other very general things which I don't want to spell out now because I need to sleep), it's reasonable to treat the sampled values as observations of a Brownian motion with variance $L$. From this we can, at a minimum, identify an MLE for $L$ and use that within the above bounds calculation.
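A sketch of that estimation step (again my assumptions: 1-D inputs with distinct sample points, and reading "variance $L$" as a variance rate, so sorted increments satisfy $f(x_{i+1}) - f(x_i) \sim \mathcal{N}(0,\, \sigma^2\,(x_{i+1} - x_i))$; maximizing the Gaussian likelihood then gives the mean of the spacing-scaled squared increments):

```python
import numpy as np

def variance_rate_mle(xs, fs):
    """MLE of sigma^2, treating fs as Brownian-motion observations at xs.

    For sorted, distinct xs the increments satisfy
        f(x_{i+1}) - f(x_i) ~ N(0, sigma^2 * (x_{i+1} - x_i)),
    and setting the likelihood's derivative to zero yields
        sigma_hat^2 = mean(df_i^2 / dx_i).
    """
    order = np.argsort(xs)
    dx = np.diff(np.asarray(xs, dtype=float)[order])
    df = np.diff(np.asarray(fs, dtype=float)[order])
    return np.mean(df**2 / dx)

# Hypothetical usage: estimate the rate from the samples, then feed the
# estimate back in as the L in the bounds above. How exactly sigma_hat
# should map to a usable L is the open part of the idea, not settled here.
xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
L_hat = variance_rate_mle(xs, np.sin(xs))
```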
