
Decomposing the Camera Matrix

Suppose we are given a \(3 \times 4\) camera projection matrix \(M\), and wish to recover the camera pose \({}^wT_c = ({}^wR_c, {}^wt_c)\) and the camera intrinsic matrix \(K\). This is possible. By definition, \(M\) has 11 degrees of freedom (12 entries, defined only up to scale) and combines all intrinsics and extrinsics:

\[\begin{aligned} M &= \begin{bmatrix} m_1 & m_2 & m_3 & m_4 \end{bmatrix} \\ M &= K [{}^cR_{w} \mid {}^{c}t_{w}] \\ M &= [K \cdot {}^cR_{w} \mid K \cdot {}^ct_{w}] \\ Q &= K \cdot {}^cR_{w} \\ Q^{-1} &= ({}^cR_{w})^{-1}K^{-1} \\ Q^{-1} (K) &= ({}^cR_{w})^{-1} K^{-1} (K) \\ Q^{-1} K &= ({}^cR_{w})^{T} \\ \end{aligned}\]

Extracting Position: To find the camera’s position in the world frame, the quantity we want is \({}^wt_c\), the location of the camera origin expressed in the world frame, i.e. the camera center. The identity \(Q^{-1} K = ({}^cR_{w})^{T}\) lets us cancel the complicating \(K\) factors, so the camera center can be written using only the columns of \(M\).

Recall that for camera extrinsics \({}^cT_w = \begin{bmatrix} {}^cR_w & {}^ct_w \\ 0 & 1\end{bmatrix}\), the camera pose is the inverse of the camera extrinsics: \({}^wT_c = \begin{bmatrix} {}^wR_c & {}^wt_c \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} ({}^cR_w)^T & - ({}^cR_w)^T \cdot {}^ct_w \\ 0 & 1 \end{bmatrix}\). Using \(m_4 = K \cdot {}^ct_w\), it follows that:

\[\begin{aligned} {}^wt_c &= - ({}^cR_w)^T \cdot {}^ct_w \\ {}^wt_c &= -(Q^{-1} K) \cdot {}^ct_{w} \\ {}^wt_c &= -Q^{-1} (K \cdot {}^ct_{w}) \\ {}^wt_c &= -Q^{-1} m_4 \end{aligned}\]
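The result \({}^wt_c = -Q^{-1} m_4\) can be checked numerically. The following is a minimal sketch using NumPy; the function name `camera_center` and all numeric values (intrinsics, rotation angle, camera position) are made up for illustration and do not come from the text above.

```python
import numpy as np

def camera_center(M):
    """Return the camera center in the world frame, -Q^{-1} m_4."""
    M = np.asarray(M, dtype=float)
    Q = M[:, :3]   # first three columns, Q = K * cRw
    m4 = M[:, 3]   # last column, m_4 = K * ctw
    # Solve Q x = m4 instead of forming the inverse explicitly.
    return -np.linalg.solve(Q, m4)

# Hypothetical ground truth used to build a synthetic M.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
theta = 0.3
cRw = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta), np.cos(theta), 0.0],
                [0.0, 0.0, 1.0]])
wtc_true = np.array([1.0, -2.0, 5.0])
ctw = -cRw @ wtc_true                      # ctw = -cRw * wtc
M = K @ np.hstack([cRw, ctw[:, None]])     # M = K [cRw | ctw]

wtc = camera_center(M)
wtc_scaled = camera_center(3.0 * M)        # M is only defined up to scale
```

Because \(Q\) and \(m_4\) scale together, the recovered center does not depend on the overall scale of \(M\).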

Extracting Rotation: While we’ve been able to recover the camera position, we still haven’t separated \(K\) and \({}^cR_{w}\). An operation known as RQ decomposition factors the first 3 columns of \(M\) into an upper triangular matrix \(R\) and an orthonormal matrix \(Q\) such that \(\begin{bmatrix} m_1 & m_2 & m_3 \end{bmatrix}=RQ\), where the upper triangular matrix corresponds to \(K\) and the orthonormal matrix to \({}^cR_w\). Note the clash in notation: in the name "RQ decomposition", \(R\) is the triangular factor (not a rotation) and \(Q\) is the orthonormal factor (not the product \(K \cdot {}^cR_w\) used above). We can perform the RQ decomposition using scipy.linalg.rq().

\[\begin{bmatrix} m_1 & m_2 & m_3 \end{bmatrix} = K \cdot {}^cR_{w}\]

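Special care must be taken to make sure \(K\)’s diagonal is positive: multiply each offending column of \(K\) by -1, and multiply the corresponding row of \({}^cR_{w}\) by -1 as well, so that the product \(K \cdot {}^cR_{w}\) is unchanged.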
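The decomposition and sign correction can be sketched as follows. This is an illustrative sketch, not a definitive implementation: the function name `decompose_projection` and the synthetic values are hypothetical, and it assumes the left \(3 \times 3\) block of \(M\) is invertible. Two steps go beyond the text above and are labeled in the comments: flipping the overall sign of \(M\) when needed so that the recovered rotation has determinant +1, and normalizing \(K\) so that its bottom-right entry is 1.

```python
import numpy as np
from scipy.linalg import rq

def decompose_projection(M):
    """Split M ~ K [cRw | ctw] into K, cRw and the camera center wtc."""
    M = np.asarray(M, dtype=float)
    # Assumption beyond the text: M is only defined up to scale, so choose
    # the sign that makes det of the left 3x3 block positive. Combined with
    # a positive diagonal of K this yields det(cRw) = +1.
    if np.linalg.det(M[:, :3]) < 0:
        M = -M
    A = M[:, :3]
    K, R = rq(A)                       # A = K @ R, K upper triangular
    # Make K's diagonal positive: flip offending columns of K and the
    # matching rows of R. D @ D = I, so K @ R is unchanged.
    D = np.diag(np.sign(np.diag(K)))
    K = K @ D
    R = D @ R
    wtc = -np.linalg.solve(A, M[:, 3])  # camera center, -Q^{-1} m_4
    # Assumption beyond the text: normalize so that K[2, 2] == 1.
    return K / K[2, 2], R, wtc

# Hypothetical ground truth for a round-trip check.
K_true = np.array([[800.0, 2.0, 320.0],
                   [0.0, 820.0, 240.0],
                   [0.0, 0.0, 1.0]])
a, b = 0.4, -0.2
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a), np.cos(a), 0.0],
               [0.0, 0.0, 1.0]])
Rx = np.array([[1.0, 0.0, 0.0],
               [0.0, np.cos(b), -np.sin(b)],
               [0.0, np.sin(b), np.cos(b)]])
R_true = Rz @ Rx
wtc_true = np.array([0.5, 1.5, -3.0])
ctw = -R_true @ wtc_true
# Arbitrary negative scale to exercise the sign handling.
M = -2.5 * K_true @ np.hstack([R_true, ctw[:, None]])

K_est, R_est, wtc_est = decompose_projection(M)
```

For an invertible matrix, the RQ factorization with a positive diagonal is unique, which is why fixing the signs recovers the same \(K\) and \({}^cR_w\) that were used to build the synthetic \(M\).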

Stereo Image Rectification

  • Reproject image planes onto a common plane parallel to the line between camera centers

  • Pixel motion is horizontal after this transformation

  • Two homographies (3x3 transforms), one for each input image reprojection

  • See Loop and Zhang [1] or Kris Kitani’s stereo rectification slides.

Panorama Image Rectification

Essential Matrix

\[E_{ij} = R_i^T (T_i - T_j) R_j\]

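\(F_{ij} = K_i^{-T} E_{ij} K_j^{-1} = V_i (T_i - T_j) V_j^T\), where \(T_i = [t_i]_{\times}\) is the skew-symmetric cross-product matrix of \(t_i\) and, matching the two expressions term by term, \(V_i = K_i^{-T} R_i^T\).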

See Kasten et al. [2].
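The formulas above can be exercised numerically. The sketch below makes an assumption that the text does not state: \(R_i\) is taken to be the camera-to-world rotation and \(t_i\) the camera center in world coordinates, so that a world point \(X\) projects to \(x_i \sim K_i R_i^T (X - t_i)\). Under that assumption the epipolar constraint \(x_i^T F_{ij} x_j = 0\) holds. All function names and numeric values are hypothetical.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential(R_i, t_i, R_j, t_j):
    """E_ij = R_i^T (T_i - T_j) R_j with T_i = [t_i]_x."""
    return R_i.T @ (skew(t_i) - skew(t_j)) @ R_j

def fundamental(K_i, K_j, E_ij):
    """F_ij = K_i^{-T} E_ij K_j^{-1}."""
    return np.linalg.inv(K_i).T @ E_ij @ np.linalg.inv(K_j)

def rot_y(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def project(K, R, t, X):
    """Assumed convention: x ~ K R^T (X - t), returned with x[2] == 1."""
    x = K @ R.T @ (X - t)
    return x / x[2]

# Two hypothetical cameras and one world point in front of both.
K_i = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
K_j = np.array([[650.0, 0.0, 300.0], [0.0, 660.0, 250.0], [0.0, 0.0, 1.0]])
R_i, t_i = rot_y(0.1), np.array([0.0, 0.0, 0.0])
R_j, t_j = rot_y(-0.2), np.array([1.0, 0.2, 0.1])
X = np.array([0.3, -0.4, 6.0])

E_ij = essential(R_i, t_i, R_j, t_j)
F_ij = fundamental(K_i, K_j, E_ij)
x_i = project(K_i, R_i, t_i, X)
x_j = project(K_j, R_j, t_j, X)
residual = x_i @ F_ij @ x_j   # epipolar constraint, should be close to zero
```

The same `F_ij` can also be formed as `V_i @ (skew(t_i) - skew(t_j)) @ V_j.T` with `V_i = np.linalg.inv(K_i).T @ R_i.T`, which is the second form of the equation above.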

References

[1] C. Loop and Z. Zhang. Computing Rectifying Homographies for Stereo Vision. CVPR, 1999.

[2] Y. Kasten, A. Geifman, M. Galun, and R. Basri. Algebraic Characterization of Essential Matrices and Their Averaging in Multiview Settings. ICCV, 2019.