Distribution recovery (pt.1)
This series is going to be about a method for identifying the family of distributions that a set of random samples has been drawn from. It has to be multi-part because too many different things are involved, but here's a general table of contents (with a small sketch of the first couple of steps after the list).
- Sort the data
- Note that the distance between sequential data-points is inversely related to the density function
- Note then that the changes in distance between successive points provide information on the derivative of the underlying density function
- Examine the data in the context of its CDF, allowing us to derive the probability of particular spacings given a particular PDF
- Compose a finite difference scheme and use it to construct a matrix $F$ that provably has expected values of $\left[p(x),p'(x),p''(x), \cdots \right]$
- Use the knowledge that particular probability distributions have specific families of ODEs that govern them, such that $N\left[p\right] = 0$
- Use the smallest singular vector of $F$ to identify an estimate of the specific differential equation governing the sample distribution
- Match this singular vector to one of the manifolds that contain a family of distributions
- Write angry letters to the guys at fitter
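To make the first two items concrete, here's the small sketch mentioned above (the function name and the normal-sample example are mine): the spacing between consecutive sorted samples near $x$ is roughly $1/(n\,p(x))$, so its reciprocal gives a very noisy pointwise density estimate.

```python
import numpy as np

# Sketch of the first two steps: sort the samples, then use the fact that the
# spacing between consecutive order statistics is roughly 1 / (n * p(x)).
def spacing_density_estimate(samples):
    x = np.sort(samples)                  # step 1: sort the data
    spacing = np.diff(x)                  # distance between sequential data points
    midpoints = 0.5 * (x[:-1] + x[1:])
    density = 1.0 / (len(x) * spacing)    # step 2: density is inversely related to spacing
    return midpoints, density

# e.g. for standard normal samples the (noisy) estimate should peak near 0
mids, dens = spacing_density_estimate(np.random.default_rng(0).standard_normal(10_000))
```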
Arbitrarily reshaped QR code
I didn't post it earlier, but here's the code that creates arbitrarily shaped QR-type decompositions. The code is in a GitHub gist here, but I'm putting the full code here as well because there's always a risk GitHub removes the gist functionality or the gist is in some way damaged.
You can read the comments to try to understand exactly what this function does, but I'd say it's easier just to run it and look at the output matrices (there's a small usage example after the code). Note that if 'flipped' is set to True, the function will still return the decomposed components in the order Q, R, but it will be the case that A = RQ.
"""
Arbitrary shape QR style decompositions
Parameters
----------
A : array_like
Input matrix to be decompoosed.
corner : String
The 'dense' corner of the triangular matrix. E.g. top-right in a standard
upper triangular matrix. One of 'bl','tr','br','tl' meaning bottom left,
top right, etc.
flipped : Boolean
Determines if the decomposition should be Orthogonal Triangular, or, if
True, Triangualr Orthogonal.
Returns
-------
Q : array_like
Orthogonal Matrix.
R : array_like
Triangular Matrix.
"""
if flipped:
A = A.T
if 't' in corner:
A = np.flip(A,axis=1)
Q,R = np.linalg.qr(A)
Q = Q.T
R = R.T
if 'r' in corner:
Q = np.flip(Q,axis=0)
R = np.flip(R,axis=1)
if 't' in corner:
R = np.flip(R,axis=0)
else:
if 'l' in corner:
Q,R = np.linalg.qr(np.flip(A,axis=1))
R = np.flip(R,axis=1)
else:
Q,R = np.linalg.qr(A)
if 'b' in corner:
Q = np.flip(Q,axis=1)
R = np.flip(R,axis=0)
return Q,R
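As a quick usage example (assuming the function above, with the name I gave it, has been defined in the same session):

```python
import numpy as np

A = np.random.default_rng(0).standard_normal((5, 5))

Q, R = qr_arbitrary(A, corner='bl')            # triangular factor dense in the bottom-left corner
print(np.allclose(Q @ R, A))                   # still reconstructs A
print(np.round(R, 2))                          # look at where the zeros ended up

Q, R = qr_arbitrary(A, corner='tr', flipped=True)
print(np.allclose(R @ Q, A))                   # with flipped=True we get A = R Q
```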
Function estimation
Ok, here's the idea. Suppose I had a function $$f:X \to Y$$ and distance metrics $d_X:X \times X \to \mathbb{R}$ and $d_Y:Y \times Y \to \mathbb{R}$. If I assumed this function was Lipschitz continuous with Lipschitz constant $L$ I could say $$d_Y(f(x),f(z)) \leq L\, d_X(x,z)$$ This isn't very controversial. Suppose I didn't know much about $f$ and thus sampled it at $n$ points, $x_1$ through $x_n$. If I wanted to know about the value of $f$ at a point $z$, I could say that $d_Y(f(x_i),f(z)) \leq L\, d_X(x_i,z)\ \forall i$. Still not very controversial, but hey, we have bounds.
Let's think now, what happens if $Y$ is standard one-dimensional Euclidean space? Then we have $$\left|f(x_i)-f(z)\right| \leq L\, d_X(x_i,z)\ \forall i$$ and unpacking the absolute value gives $-L\, d_X(x_i,z) \leq f(z)-f(x_i) \leq L\, d_X(x_i,z)$. If we add that $f(x_i)$ term to all parts we then get $$f(x_i)-L\, d_X(x_i,z) \leq f(z) \leq f(x_i) + L\, d_X(x_i,z)$$ Now, since we have these inequalities for all $i$ we can say $$\max_i \left(f(x_i)-L\, d_X(x_i,z)\right) \leq f(z) \leq \min_i \left(f(x_i) + L\, d_X(x_i,z)\right)$$ Since every term in this inequality is computable, this gives a clean way to bound the value of $f$ at $z$.
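A minimal sketch of that bound, assuming $X$ is also one-dimensional Euclidean, $L$ is known, and with a made-up function name:

```python
import numpy as np

def lipschitz_bounds(x_samples, f_samples, z, L):
    """Pointwise bounds on f(z) implied by Lipschitz continuity (1-D, known L)."""
    d = np.abs(np.asarray(x_samples) - z)           # d_X(x_i, z)
    lower = np.max(np.asarray(f_samples) - L * d)   # max_i f(x_i) - L d_X(x_i, z)
    upper = np.min(np.asarray(f_samples) + L * d)   # min_i f(x_i) + L d_X(x_i, z)
    return lower, upper

# e.g. bounding sin(0.5) from a handful of samples, using L = 1
x = np.array([0.0, 0.3, 0.7, 1.0])
lo, hi = lipschitz_bounds(x, np.sin(x), 0.5, L=1.0)
```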
Now we can start having a little fun. Supposing we know nothing about the underlying function, if we assume our samples are not 'special' relative to it (and we assume a bunch of other very general things which I don't want to spell out now because I need to sleep), it's valid to treat our samples as observations of a Brownian motion with variance parameter $L$. From this we can at a minimum identify an MLE for $L$ and use that within the above bounds calculation.
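As a sketch of what that MLE could look like under the simplest version of those assumptions (scalar inputs, independent Gaussian increments with variance $L\,\Delta x$; the function name is made up):

```python
import numpy as np

def mle_brownian_variance(x_samples, f_samples):
    """MLE of L assuming f(x_{i+1}) - f(x_i) ~ N(0, L * (x_{i+1} - x_i))."""
    order = np.argsort(x_samples)
    x = np.asarray(x_samples)[order]
    f = np.asarray(f_samples)[order]
    df, dx = np.diff(f), np.diff(x)
    return np.mean(df**2 / dx)   # maximiser of the Gaussian-increment likelihood
```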
QL decomposition
A couple days ago I encountered a really neat solution to creating a basis using a Cholesky decomposition. Particularly, when we have a set of basis functions $Y = \{y_0,y_1,\cdots , y_n\}$, we will want a set of coefficient vectors $C = \{\vec{c}_0,\vec{c}_1,\cdots , \vec{c}_n\}$ such that $\langle \sum_j^n c_{k,j} y_j, \sum_i^n c_{l,i} y_i\rangle = \delta_{k,l}$, and ideally we also wish to impose the constraint $c_{k,j} = 0\ \forall j>k$. More succinctly, an orthonormal basis where the $k$-th basis element contains only order $k$ or lower terms. Putting that into math terms (and using some abusive notation for the matrix of inner products) we want to find an upper triangular $R$ such that $R^TY^TYR = I$.
The thinking man would use a Cholesky decomposition and an inverse to say that if $Y^TY = LL^T$ then $L^{-1}Y^TYL^{-T} = I$, and as $L^{-T}$ is upper triangular, we're finished.
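In NumPy terms that route looks like this, where I'm treating $Y$ as a matrix whose columns are the basis functions sampled on some grid (the random stand-in is mine):

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.standard_normal((200, 5))   # stand-in for sampled basis functions

G = Y.T @ Y                         # matrix of inner products
L = np.linalg.cholesky(G)           # G = L L^T, with L lower triangular
R = np.linalg.inv(L).T              # R = L^{-T}, upper triangular
print(np.allclose(R.T @ G @ R, np.eye(G.shape[1])))   # R^T Y^T Y R = I
```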
I however am not a thinking man, just a mindless automaton, driven by a basic response to stimuli and only distantly related to chordates as a whole. As such I only have two tools at my disposal: an SVD and a QR decomposition. How then may I solve this?
Given symmetry we know that if $Y^TY = U \Sigma V^T$ then $U = V$ and thus $Y^TY = U \Sigma U^T$. Multiplying by $\Sigma^{-1/2}U^T$ on the left and $U\Sigma^{-1/2}$ on the right we get $\Sigma^{-1/2}U^TY^TYU \Sigma^{-1/2}= I$. This is nice, but $U\Sigma^{-1/2}$ is not upper triangular. We note here that if we found an orthogonal $Q$ and upper triangular $R$ such that $U \Sigma^{-1/2}=RQ$, we could then multiply $Q^TR^TY^TYRQ = I$ by $Q$ on the left and $Q^T$ on the right to get $R^TY^TYR = QQ^TR^TY^TYRQQ^T = QQ^T = I$.
How then do we find $R$ and $Q$? Well, we see this is equivalent to finding $Q$ and $R$ such that $\Sigma^{-1/2}U^T = Q^T R^T$. This is a QR-style decomposition, but with a lower rather than an upper triangular factor.
Because common sense cannot and will not prevail we do the obvious thing, which is to look at what happens when we 'flip' a matrix. Formally, $\mathrm{flip}(A^{n \times m},0)_{i,j} = A_{n+1-i,j}$ and $\mathrm{flip}(A^{n \times m},1)_{i,j} = A_{i,m+1-j}$, that is, flipping the matrix up and down, and left to right, respectively.
When we apply this flip to a matrix product, we see $\mathrm{flip}(AB,0) = \mathrm{flip}(A,0)\,B$ and $\mathrm{flip}(AB,1) = A\,\mathrm{flip}(B,1)$.
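These identities are easy to sanity-check numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.standard_normal((3, 4)), rng.standard_normal((4, 2))
print(np.allclose(np.flip(A @ B, axis=0), np.flip(A, axis=0) @ B))   # flip(AB,0) = flip(A,0) B
print(np.allclose(np.flip(A @ B, axis=1), A @ np.flip(B, axis=1)))   # flip(AB,1) = A flip(B,1)
```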
Suppose then that we took a QR decomposition of $\mathrm{flip}(\Sigma^{-1/2}U^T,1) = \tilde{Q} \tilde{R}$. We see then that $\tilde{Q}^T \mathrm{flip}(\Sigma^{-1/2}U^T,1) = \tilde{R}$, then that $\mathrm{flip}(\tilde{Q}^T \mathrm{flip}(\Sigma^{-1/2}U^T,1),0) = \mathrm{flip}(\tilde{Q}^T,0)\, \mathrm{flip}(\Sigma^{-1/2}U^T,1) = \mathrm{flip}(\tilde{R},0)$, and finally that $\mathrm{flip}(\mathrm{flip}(\tilde{Q}^T,0)\, \mathrm{flip}(\Sigma^{-1/2}U^T,1),1) = \mathrm{flip}(\tilde{Q}^T,0)\, \Sigma^{-1/2}U^T = \mathrm{flip}(\mathrm{flip}(\tilde{R},0),1)$. Thus, as $\mathrm{flip}(\mathrm{flip}(\tilde{R},0),1)$ is lower triangular, we get a solution of $\Sigma^{-1/2}U^T = Q^T R^T$ with $Q = \mathrm{flip}(\tilde{Q}^T,0)$ and $R = \mathrm{flip}(\mathrm{flip}(\tilde{R},0),1)^T$. Here's some code you can run to verify it for yourself.
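Something like the following, with a random test matrix standing in for $Y^TY$ and placeholder variable names:

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.standard_normal((50, 6))
G = Y.T @ Y                                      # symmetric positive definite

U, s, _ = np.linalg.svd(G)                       # G = U Sigma U^T
M = np.diag(s**-0.5) @ U.T                       # Sigma^{-1/2} U^T

Qt, Rt = np.linalg.qr(np.flip(M, axis=1))        # QR of flip(Sigma^{-1/2} U^T, 1)
R_lower = np.flip(np.flip(Rt, axis=0), axis=1)   # flip(flip(R~,0),1)
Q = np.flip(Qt.T, axis=0)                        # flip(Q~^T,0), orthogonal

print(np.allclose(R_lower, np.tril(R_lower)))    # lower triangular, as claimed
print(np.allclose(Q @ M, R_lower))               # flip(Q~^T,0) Sigma^{-1/2} U^T = flip(flip(R~,0),1)

R = R_lower.T                                    # upper triangular
print(np.allclose(R.T @ G @ R, np.eye(6)))       # R^T Y^T Y R = I
```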