<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Least squares | Nicholas Hu</title>
    <link>https://www.math.ucla.edu/~njhu/notes/nla/lsq/</link>
      <atom:link href="https://www.math.ucla.edu/~njhu/notes/nla/lsq/index.xml" rel="self" type="application/rss+xml" />
    <description>Least squares</description>
    <generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-ca</language><lastBuildDate>Mon, 16 Jun 2025 00:00:00 +0000</lastBuildDate>
    <image>
      <url>https://www.math.ucla.edu/~njhu/media/icon_hu_d46824b1c45312fd.png</url>
      <title>Least squares</title>
      <link>https://www.math.ucla.edu/~njhu/notes/nla/lsq/</link>
    </image>
    
    <item>
      <title>Projections and least squares problems</title>
      <link>https://www.math.ucla.edu/~njhu/notes/nla/lsq/leastsquares/</link>
      <pubDate>Sat, 22 Feb 2025 00:00:00 +0000</pubDate>
      <guid>https://www.math.ucla.edu/~njhu/notes/nla/lsq/leastsquares/</guid>
      <description>&lt;div class=&#34;btn-links mb-3&#34;&gt;
&lt;a class=&#34;btn btn-outline-primary btn-page-header btn-sm&#34; href=&#34;../leastsquares.pdf&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;
  PDF
&lt;/a&gt;
&lt;/div&gt;
&lt;!--
No newlines allowed between $$&#39;s below!
--&gt;
&lt;div style=&#34;display: none;&#34;&gt;
$$
\newcommand{\set}[1]{\{ #1 \}}
\newcommand{\Set}[1]{\left \{ #1 \right\}}
\renewcommand{\emptyset}{\varnothing}
\newcommand{\N}{\mathbb{N}}
\newcommand{\Z}{\mathbb{Z}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\Rn}{\mathbb{R}^n}
\newcommand{\Rm}{\mathbb{R}^m}
\newcommand{\C}{\mathbb{C}}
\newcommand{\F}{\mathbb{F}}
\newcommand{\abs}[1]{\lvert #1 \rvert}
\newcommand{\Abs}[1]{\left\lvert #1 \right\rvert}
\newcommand{\inner}[2]{\langle #1, #2 \rangle}
\newcommand{\Inner}[2]{\left\langle #1, #2 \right\rangle}
\newcommand{\norm}[1]{\lVert #1 \rVert}
\newcommand{\Norm}[1]{\left\lVert #1 \right\rVert}
\newcommand{\tp}{{\top}}
\newcommand{\trans}{{\top}}
\newcommand{\span}{\operatorname{span}}
\newcommand{\im}{\operatorname{im}}
\newcommand{\ker}{\operatorname{ker}}
\newcommand{\rank}{\operatorname{rank}}
\newcommand{\proj}[1]{\mathop{\mathrm{proj}_{#1}}}
\newcommand{\K}{\mathcal{K}}
\newcommand{\L}{\mathcal{L}}
\renewcommand{\epsilon}{\varepsilon}
\definecolor{cblue}{RGB}{31, 119, 180}
\definecolor{corange}{RGB}{255, 127, 14}
\definecolor{cgreen}{RGB}{44, 160, 44}
\definecolor{cred}{RGB}{214, 39, 40}
\definecolor{cpurple}{RGB}{148, 103, 189}
\definecolor{cbrown}{RGB}{140, 86, 75}
\definecolor{cpink}{RGB}{227, 119, 194}
\definecolor{cgrey}{RGB}{127, 127, 127}
\definecolor{cyellow}{RGB}{188, 189, 34}
\definecolor{cteal}{RGB}{23, 190, 207}
$$
&lt;/div&gt;
&lt;h2 id=&#34;projections&#34;&gt;Projections&lt;/h2&gt;
&lt;p&gt;Let 

$H$ be a Hilbert space and 

$Y \subseteq H$. The &lt;strong&gt;(orthogonal) projection operator&lt;/strong&gt; onto 

$Y$ is defined for 

$x \in H$ by


$$
\proj{Y}(x) := \underset{y \in Y}{\operatorname{argmin}} \frac{1}{2} \norm{y - x}^2.
$$&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Hilbert projection theorem (first projection theorem)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If 

$Y$ is nonempty, closed, and convex, then 

$\proj{Y}(x)$ is a singleton (so 

$\proj{Y} : H \to Y$ is well-defined).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;Proof.&lt;/em&gt; Let 

$(y_n)_{n=1}^\infty \subseteq Y$ be such that 

$d_n := \frac{1}{2} \norm{y_n - x}^2 \to d := \inf_{y \in Y} \frac{1}{2} \norm{y-x}^2$. By the parallelogram identity,


$$
\Norm{\frac{y_m + y_n}{2} - x}^2 + \Norm{\frac{y_m - y_n}{2}}^2 = 2\Norm{\frac{y_m - x}{2}}^2 + 2\Norm{\frac{y_n - x}{2}}^2 = d_m + d_n,
$$
where 

$\norm{\frac{y_m + y_n}{2} - x}^2 \geq 2d$ by convexity. Taking 

$m, n \to \infty$ shows that 

$(y_n)$ is Cauchy and therefore convergent to some 

$y \in Y$ with 

$\frac{1}{2} \norm{y - x}^2 = d$. Moreover, if 

$y&#39; \in Y$ is another minimizer, replacing 

$y_m, y_n$ by 

$y, y&#39;$ above shows that 

$y = y&#39;$. ∎&lt;/p&gt;
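&lt;p&gt;As a concrete finite-dimensional sketch (using NumPy; the set and data below are illustrative): the projection onto the nonnegative orthant, a closed convex set, is coordinatewise clipping, and the variational inequality of the characterization below can be checked directly.&lt;/p&gt;

```python
import numpy as np

# Projection onto the closed convex set Y = nonnegative orthant of R^n:
# minimizing (1/2)||y - x||^2 over y with nonnegative entries decouples
# by coordinate, giving proj_Y(x) = max(x, 0).
x = np.array([1.5, -2.0, 0.0, -0.5, 3.0])
y = np.maximum(x, 0.0)

# Variational characterization: Re(inner(x - y, z - y)) is nonpositive
# for every z in Y (real inner product here; z sampled from Y at random).
rng = np.random.default_rng(0)
Z = rng.uniform(0.0, 10.0, size=(100, 5))   # random points of Y
assert np.all(0 >= (Z - y) @ (x - y) - 1e-12)
```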
&lt;p&gt;Recall that the &lt;strong&gt;polar cone&lt;/strong&gt; of 

$Y$ is 

$Y^\circ := \set{x \in H : \forall y \in Y \, (\Re(\inner{x}{y}) \leq 0)}$ and that the &lt;strong&gt;orthogonal complement&lt;/strong&gt; of 

$Y$ is 

$Y^\perp := \set{x \in H : \forall y \in Y \, (\inner{x}{y} = 0)}$; clearly, if 

$Y$ is a &lt;em&gt;subspace&lt;/em&gt; of 

$H$, then 

$Y^\circ = Y^\perp$.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Characterization of projections (second projection theorem)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If 

$Y$ is nonempty, closed, and convex, then 

$y = \proj{Y}(x)$ if and only if 

$y \in Y$ and 

$x-y \in (Y-y)^\circ$.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;Proof.&lt;/em&gt; If 

$y = \proj{Y}(x)$ and 

$y&#39; \in Y$, then for all 

$\lambda \in [0, 1]$, we have


$$
\norm{y-x}^2 \leq \norm{(1-\lambda)y + \lambda y&#39; - x}^2 = \norm{y-x}^2 + 2\lambda \Re(\inner{y-x}{y&#39;-y}) + \lambda^2 \norm{y&#39; - y}^2,
$$
so 

$\Re(\inner{y-x}{y&#39;-y}) \geq 0$. Conversely, if 

$y, y&#39; \in Y$ and 

$x-y \in (Y-y)^\circ$, then setting 

$\lambda = 1$ in the inequality above shows that 

$y = \proj{Y}(x)$. ∎&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Firm nonexpansiveness of the projection operator&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If 

$Y$ is nonempty, closed, and convex, then
$$
\norm{\proj{Y}(x) - \proj{Y}(x&#39;)}^2 + \norm{(I-\proj{Y})(x) - (I-\proj{Y})(x&#39;)}^2 \leq \norm{x-x&#39;}^2.
$$&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;Proof.&lt;/em&gt; Let 

$y = \proj{Y}(x)$ and 

$y&#39; = \proj{Y}(x&#39;)$, and add the inequalities 

$\Re(\inner{x&#39;-y&#39;}{y-y&#39;}) \leq 0$ and 

$\Re(\inner{x-y}{y&#39;-y}) \leq 0$ given by the second projection theorem; expanding 

$\norm{x-x&#39;}^2 = \norm{(y-y&#39;) + ((x-y) - (x&#39;-y&#39;))}^2$ then yields the result. ∎&lt;/p&gt;
&lt;p&gt;In particular, this implies that the projection operator is nonexpansive: 

$\norm{\proj{Y}(x) - \proj{Y}(x&#39;)} \leq \norm{x-x&#39;}$.&lt;/p&gt;
&lt;p&gt;If 

$Y$ is a &lt;em&gt;closed subspace&lt;/em&gt; of 

$H$, it follows from the above that 

$y = \proj{Y}(x)$ if and only if 

$y \in Y$ and 

$x-y \in Y^\perp$, and that 

$\proj{Y} : H \to Y$ is a &lt;em&gt;linear&lt;/em&gt; operator with 

$\norm{\proj{Y}} \leq 1$, 

$\im(\proj{Y}) = Y$, and 

$\ker(\proj{Y}) = Y^\perp$. In addition, 

$\proj{Y^\perp} = I-\proj{Y}$.&lt;/p&gt;
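&lt;p&gt;For a finite-dimensional subspace $Y = \im(Q)$ with $Q$ having orthonormal columns, the projection operator is the matrix $QQ^*$. A brief NumPy sketch of the properties above (the matrices are illustrative test data):&lt;/p&gt;

```python
import numpy as np

# For a subspace Y spanned by the orthonormal columns of Q, the
# orthogonal projection onto Y is the matrix P = Q Q*.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))
Q, _ = np.linalg.qr(A)            # orthonormal basis for im(A)
P = Q @ Q.conj().T                # projection onto Y = im(A)

x = rng.standard_normal(5)
y = P @ x

assert np.allclose(P @ P, P)                 # idempotent
assert np.allclose(Q.conj().T @ (x - y), 0)  # x - y lies in the orthogonal complement
assert np.allclose((np.eye(5) - P) @ (np.eye(5) - P), np.eye(5) - P)  # I - P projects onto Y-perp
```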
&lt;h2 id=&#34;least-squares-problems&#34;&gt;Least squares problems&lt;/h2&gt;
&lt;p&gt;Let 

$H_1$ and 

$H_2$ be Hilbert spaces and suppose that 

$A : H_1 \to H_2$ is a continuous linear operator with closed image.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; The &lt;strong&gt;(linear) least squares problem&lt;/strong&gt; is that of finding an 

$x \in H_1$ that minimizes 

$\frac{1}{2} \norm{b - Ax}^2$ for a given 

$b \in H_2$, or equivalently, that satisfies 

$Ax = \proj{\im(A)} b$. Using the fact that 

$\im(A)^\perp = \ker(A^*)$, we can also write this as the &lt;strong&gt;normal equation&lt;/strong&gt; 

$A^*Ax = A^*b$.&lt;/p&gt;
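&lt;p&gt;A minimal NumPy sketch of the normal equation in the full-column-rank case (illustrative data; in practice one prefers QR- or SVD-based solvers, since forming $A^*A$ squares the condition number):&lt;/p&gt;

```python
import numpy as np

# Least squares via the normal equation A*Ax = A*b (full column rank).
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)

x = np.linalg.solve(A.conj().T @ A, A.conj().T @ b)

# x agrees with the library least squares solver, and the residual
# b - Ax is orthogonal to im(A), i.e. lies in ker(A*).
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
assert np.allclose(x, x_ref)
assert np.allclose(A.conj().T @ (b - A @ x), 0)
```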
&lt;h3 id=&#34;the-pseudoinverse&#34;&gt;The pseudoinverse&lt;/h3&gt;
&lt;p&gt;To solve the least squares problem, we observe that 

$A\restriction_{\ker(A)^\perp} : \ker(A)^\perp \to \im(A)$ is bijective since 

$Ax = Ax&#39;$ implies that 

$x-x&#39; \in \ker(A)$ and 

$y = Ax$ implies that 

$y = A (x - \proj{\ker(A)} x)$. Thus, the &lt;strong&gt;pseudoinverse&lt;/strong&gt; 

$A^+ : H_2 \to H_1$ of 

$A$, defined as


$$
A^+ := A\restriction_{\ker(A)^\perp}^{-1} \circ \proj{\im(A)},
$$
is a well-defined continuous linear operator, and by construction 

$x^* := A^+ b$ is a solution to the least squares problem.&lt;/p&gt;
&lt;p&gt;This solution need not be unique; however, it is the unique solution of &lt;em&gt;minimal norm&lt;/em&gt; because 

$x - x^* \in \ker(A)$ for any solution 

$x$, so 

$\norm{x}^2 = \norm{x-x^*}^2 + \norm{x^*}^2 \geq \norm{x^*}^2$ with equality if and only if 

$x = x^*$.&lt;/p&gt;
&lt;p&gt;It is straightforward to verify that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;

$A^+ = A^{-1}$ if 

$A$ is bijective&lt;/li&gt;
&lt;li&gt;

$\im(A^+) = \ker(A)^\perp$, 

$\ker(A^+) = \im(A)^\perp$&lt;/li&gt;
&lt;li&gt;

$AA^+ = \proj{\im(A)}$, 

$A^+A = \proj{\im(A^+)}$ (and in fact, these characterize the pseudoinverse)&lt;/li&gt;
&lt;li&gt;

$(A^+)^+ = A$&lt;/li&gt;
&lt;li&gt;

$(A^*)^+ = (A^+)^*$&lt;/li&gt;
&lt;li&gt;

$A^+ = (A^* A)^+ A^* = A^* (AA^*)^+$&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the finite-dimensional case, if 

$A \in \C^{m \times n}$ has full column rank, then 

$A^+ = (A^* A)^{-1} A^*$ by the identities above; similarly, if it has full row rank, then 

$A^+ = A^* (AA^*)^{-1}$. More generally, if 

$\hat{U} \hat{\Sigma} \hat{V}^*$ is a compact SVD of 

$A$ (that is, 

$\hat{\Sigma}$ is 

$r \times r$, where 

$r = \rank(A)$), then 

$A^+ = \hat{V} \hat{\Sigma}^{-1} \hat{U}^*$.&lt;/p&gt;
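&lt;p&gt;A NumPy sketch of the pseudoinverse via a compact SVD (the rank-deficient test matrix is illustrative):&lt;/p&gt;

```python
import numpy as np

# Pseudoinverse via a compact SVD: A+ = V Sigma^{-1} U*, keeping only the
# r = rank(A) nonzero singular values. Here A is 4x3 with rank 2.
rng = np.random.default_rng(2)
B = rng.standard_normal((4, 2))
C = rng.standard_normal((2, 3))
A = B @ C                        # rank 2 by construction

U, s, Vh = np.linalg.svd(A, full_matrices=False)
r = np.sum(s > 1e-12 * s[0])     # numerical rank
A_pinv = Vh[:r].conj().T @ np.diag(1.0 / s[:r]) @ U[:, :r].conj().T

assert np.allclose(A_pinv, np.linalg.pinv(A))
# AA+ and A+A are the projections onto im(A) and im(A+) = ker(A)-perp:
assert np.allclose(A @ A_pinv @ A, A)
assert np.allclose(A_pinv @ A @ A_pinv, A_pinv)
```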
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Note that this implies that 

$A^*$ also has closed image, so 

$\im(A)^\perp = \ker(A^*)$ and 

$\ker(A)^\perp = \overline{\im(A^*)} = \im(A^*)$.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>The QR factorization</title>
      <link>https://www.math.ucla.edu/~njhu/notes/nla/lsq/qr/</link>
      <pubDate>Tue, 04 Mar 2025 00:00:00 +0000</pubDate>
      <guid>https://www.math.ucla.edu/~njhu/notes/nla/lsq/qr/</guid>
      <description>&lt;div class=&#34;btn-links mb-3&#34;&gt;
&lt;a class=&#34;btn btn-outline-primary btn-page-header btn-sm&#34; href=&#34;../qr.pdf&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;
  PDF
&lt;/a&gt;
&lt;/div&gt;
&lt;!--
No newlines allowed between $$&#39;s below!
--&gt;
&lt;div style=&#34;display: none;&#34;&gt;
$$
\newcommand{\set}[1]{\{ #1 \}}
\newcommand{\Set}[1]{\left \{ #1 \right\}}
\renewcommand{\emptyset}{\varnothing}
\newcommand{\N}{\mathbb{N}}
\newcommand{\Z}{\mathbb{Z}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\Rn}{\mathbb{R}^n}
\newcommand{\Rm}{\mathbb{R}^m}
\newcommand{\C}{\mathbb{C}}
\newcommand{\F}{\mathbb{F}}
\newcommand{\abs}[1]{\lvert #1 \rvert}
\newcommand{\Abs}[1]{\left\lvert #1 \right\rvert}
\newcommand{\inner}[2]{\langle #1, #2 \rangle}
\newcommand{\Inner}[2]{\left\langle #1, #2 \right\rangle}
\newcommand{\norm}[1]{\lVert #1 \rVert}
\newcommand{\Norm}[1]{\left\lVert #1 \right\rVert}
\newcommand{\tp}{{\top}}
\newcommand{\trans}{{\top}}
\newcommand{\span}{\operatorname{span}}
\newcommand{\im}{\operatorname{im}}
\newcommand{\ker}{\operatorname{ker}}
\newcommand{\rank}{\operatorname{rank}}
\newcommand{\proj}[1]{\mathop{\mathrm{proj}_{#1}}}
\newcommand{\refl}[1]{\mathop{\mathrm{refl}_{#1}}}
\newcommand{\K}{\mathcal{K}}
\newcommand{\L}{\mathcal{L}}
\renewcommand{\epsilon}{\varepsilon}
\newcommand{\conj}{\overline}
\newcommand{\sign}{\operatorname{sign}}
\definecolor{cblue}{RGB}{31, 119, 180}
\definecolor{corange}{RGB}{255, 127, 14}
\definecolor{cgreen}{RGB}{44, 160, 44}
\definecolor{cred}{RGB}{214, 39, 40}
\definecolor{cpurple}{RGB}{148, 103, 189}
\definecolor{cbrown}{RGB}{140, 86, 75}
\definecolor{cpink}{RGB}{227, 119, 194}
\definecolor{cgrey}{RGB}{127, 127, 127}
\definecolor{cyellow}{RGB}{188, 189, 34}
\definecolor{cteal}{RGB}{23, 190, 207}
$$
&lt;/div&gt;
&lt;p&gt;Let 

$A \in \C^{m \times n}$. The &lt;strong&gt;QR factorization&lt;/strong&gt; is a factorization of 

$A$ as 

$QR$, where 

$Q \in \C^{m \times m}$ is unitary and 

$R \in \C^{m \times n}$ is (rectangular) upper triangular.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; We will show below that such a factorization always exists by describing three different methods to compute it.&lt;/p&gt;
&lt;p&gt;When 

$A$ has full column rank, we have 

$a_j = \sum_{i \leq j} r_{ij} q_i$ for each 

$j$, so 

$\span \, \set{a_j}_{j \leq k} \subseteq \span \, \set{q_j}_{j \leq k}$ for each 

$k$. As these subspaces are both 

$k$-dimensional, they must be equal, which also implies that the diagonal entries of 

$R$ are nonzero. Moreover, if 

$\hat{Q}$ denotes the left 

$m \times n$ submatrix of 

$Q$ and 

$\hat{R}$ denotes the upper 

$n \times n$ submatrix of 

$R$, we have the &lt;strong&gt;thin/reduced QR factorization&lt;/strong&gt; 

$A = \hat{Q} \hat{R}$.&lt;/p&gt;
&lt;p&gt;The thin QR factorization of a full column rank matrix is nearly unique in the sense that if 

$A = \tilde{Q} \tilde{R}$ for some 

$\tilde{Q} \in \C^{m \times n}$ with orthonormal columns and some upper triangular 

$\tilde{R} \in \C^{n \times n}$, then 

$\tilde{Q} = \hat{Q}D$ and 

$\hat{R} = D\tilde{R}$ for some diagonal matrix 

$D$ whose diagonal entries have unit modulus. This follows from the observation that 

$D := \hat{Q}^* \tilde{Q} = \hat{R} \tilde{R}^{-1} = \hat{R}^{-*} \tilde{R}^*$ must be both upper and lower triangular. Thus, if we specify a (complex) sign for each diagonal entry of 

$\hat{R}$, the factorization is unique.&lt;/p&gt;
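&lt;p&gt;A small NumPy check of this near-uniqueness (the sign flips below are an arbitrary choice of unit-modulus factors):&lt;/p&gt;

```python
import numpy as np

# Two thin QR factorizations of the same full-column-rank A differ by a
# diagonal matrix D of unit-modulus factors. We build a second
# factorization by flipping signs by hand.
rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))
Q_hat, R_hat = np.linalg.qr(A)   # reduced (thin) QR

signs = np.diag([1.0, -1.0, -1.0])
Q_tilde = Q_hat @ signs
R_tilde = signs @ R_hat          # A = Q_tilde R_tilde as well

D = Q_hat.conj().T @ Q_tilde
assert np.allclose(A, Q_tilde @ R_tilde)
assert np.allclose(D, signs)                 # D is diagonal...
assert np.allclose(np.abs(np.diag(D)), 1.0)  # ...with unit-modulus entries
```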
&lt;h2 id=&#34;gramschmidt-orthogonalization&#34;&gt;Gram–Schmidt orthogonalization&lt;/h2&gt;
&lt;p&gt;Suppose that 

$(a_j)_{j \geq 1}$ is a sequence of vectors in a Hilbert space 

$V$. &lt;strong&gt;Gram–Schmidt orthogonalization&lt;/strong&gt; defines an &lt;em&gt;orthogonal&lt;/em&gt; sequence of vectors 

$(b_j)_{j \geq 1}$ in 

$V$ such that 

$\mathcal{A}_k := \span \, \set{a_j}_{j \leq k} = \mathcal{B}_k := \span \, \set{b_j}_{j \leq k}$ for each 

$k$. To wit, let 

$\proj{b} := \proj{\span{\set{b}}}$ for 

$b \in V$; that is,


$$
\proj{b} a =
\begin{cases}
\frac{\inner{a}{b}}{\inner{b}{b}} b &amp; \text{if 

$b \neq 0$}, \\
0 &amp; \text{if 

$b = 0$}.
\end{cases}
$$
We then inductively define


$$
b_j := a_j - \sum_{i &lt; j} \proj{b_i} a_j.
$$
Assuming that 

$\set{b_j}_{j &lt; k}$ is orthogonal for a given 

$k$, we then have 

$\inner{b_k}{b_j} = \inner{a_k - \sum_{i &lt; k} \proj{b_i} a_k}{b_j} = \inner{a_k - \proj{b_j} a_k}{b_j} = 0$ for all 

$j &lt; k$, which shows that 

$\set{b_j}_{j \leq k}$ is orthogonal. Moreover, if 

$\mathcal{A}_{k-1} = \mathcal{B}_{k-1}$, then 

$b_k \in a_k - \mathcal{B}_{k-1} = a_k - \mathcal{A}_{k-1} \subseteq \mathcal{A}_k$ and 

$a_k \in b_k + \mathcal{B}_{k-1} \subseteq \mathcal{B}_k$, so 

$\mathcal{A}_k = \mathcal{B}_k$.&lt;/p&gt;
&lt;p&gt;To compute a QR factorization of 

$A$, we can apply Gram–Schmidt orthogonalization to the columns of 

$A =: \begin{bmatrix} a_1 &amp; \cdots &amp; a_n \end{bmatrix}$ as follows. For each 

$j \leq m$, we inductively define 

$b_j := a_j - \sum_{i &lt; j} \proj{q_i} a_j$ if 

$j \leq n$ &lt;em&gt;and&lt;/em&gt; the right-hand expression is &lt;em&gt;nonzero&lt;/em&gt;; otherwise, we select an &lt;em&gt;arbitrary nonzero&lt;/em&gt; 

$b_j \in \mathcal{B}_{j-1}^\perp$. In either case, we then define 

$q_j := \frac{b_j}{\norm{b_j}}$. We thereby obtain an orthonormal basis 

$\set{q_j}_{j \leq m}$ of 

$\C^m$ such that 

$a_j = \sum_{i \leq \min \set{j,\,m}} r_{ij} q_i$ for some 

$r_{ij} \in \C$, as required.&lt;/p&gt;
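&lt;p&gt;A sketch of this procedure in NumPy for the full-column-rank case (the function name is mine; the arbitrary-completion branch for the remaining columns of $Q$ is omitted, so only the thin factorization is produced):&lt;/p&gt;

```python
import numpy as np

# Classical Gram-Schmidt for the thin QR factorization of a matrix with
# full column rank: b_j is a_j minus its projections onto the previous q_i.
def gram_schmidt_qr(A):
    m, n = A.shape
    Q = np.zeros((m, n), dtype=complex)
    R = np.zeros((n, n), dtype=complex)
    for j in range(n):
        v = A[:, j].astype(complex)
        for i in range(j):
            R[i, j] = Q[:, i].conj() @ A[:, j]   # r_ij = inner(a_j, q_i)
            v = v - R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)              # nonzero by full column rank
        Q[:, j] = v / R[j, j]
    return Q, R

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 4))
Q, R = gram_schmidt_qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.conj().T @ Q, np.eye(4))
assert np.allclose(R, np.triu(R))
```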
&lt;h3 id=&#34;modified-gramschmidt-orthogonalization&#34;&gt;Modified Gram–Schmidt orthogonalization&lt;/h3&gt;
&lt;p&gt;In Gram–Schmidt orthogonalization, we define 

$b_j = (I - \sum_{i &lt; j} \proj{b_i}) a_j$. Since the 

$b_i$ are orthogonal, this can equivalently be written as 

$b_j = (I - \proj{b_{j-1}}) \cdots (I - \proj{b_2}) (I - \proj{b_1}) a_j$, so computationally speaking, the projection operator 

$I - \proj{b_i}$ can be applied to all 

$a_j$ with 

$i &lt; j$ (assuming there are finitely many of them) as soon as 

$b_i$ is generated. The resulting algorithm is known as &lt;strong&gt;modified Gram–Schmidt orthogonalization&lt;/strong&gt; and exhibits greater numerical stability than “classical” Gram–Schmidt orthogonalization.&lt;/p&gt;
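&lt;p&gt;A NumPy sketch of modified Gram–Schmidt (the function name is mine), tried on a nearly rank-deficient matrix of Läuchli type; on this example the computed $Q$ remains numerically orthogonal, whereas classical Gram–Schmidt would lose orthogonality badly:&lt;/p&gt;

```python
import numpy as np

# Modified Gram-Schmidt: apply I - proj_{q_i} to all remaining columns as
# soon as q_i is available. Mathematically equivalent to classical
# Gram-Schmidt, but numerically more stable.
def mgs_qr(A):
    m, n = A.shape
    V = A.astype(float)
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for i in range(n):
        R[i, i] = np.linalg.norm(V[:, i])
        Q[:, i] = V[:, i] / R[i, i]
        for j in range(i + 1, n):
            R[i, j] = Q[:, i] @ V[:, j]
            V[:, j] = V[:, j] - R[i, j] * Q[:, i]   # (I - proj_{q_i}) applied now
    return Q, R

# Nearly linearly dependent columns (Lauchli-type test matrix):
eps = 1e-8
A = np.array([[1.0, 1.0, 1.0],
              [eps, 0.0, 0.0],
              [0.0, eps, 0.0],
              [0.0, 0.0, eps]])
Q, R = mgs_qr(A)
assert np.allclose(Q @ R, A)
ortho_err = np.linalg.norm(Q.T @ Q - np.eye(3))
assert 1e-6 > ortho_err          # orthogonality survives at roughly kappa(A)*u
```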
&lt;h2 id=&#34;householder-reflections&#34;&gt;Householder reflections&lt;/h2&gt;
&lt;p&gt;Suppose that 

$v$ is a nonzero vector in a Hilbert space 

$V$. The &lt;strong&gt;reflection operator&lt;/strong&gt; across the hyperplane 

$\set{v}^\perp$ is defined for 

$x \in V$ by


$$
\refl{v} x := (I - 2\proj{v}) \, x = x - \frac{2\inner{x}{v}}{\inner{v}{v}} v.
$$
Since 

$\proj{v}$ is idempotent and self-adjoint, 

$\refl{v}$ is involutory and self-adjoint and therefore unitary.&lt;/p&gt;
&lt;p&gt;A &lt;strong&gt;Householder reflection&lt;/strong&gt; is a reflection operator 

$H : \C^d \to \C^d$ that zeroes out all components of some vector 

$x$ except for its first component 

$x_1$; we assume that the other components are not already all zeroes. In other words, 

$Hx = \alpha e_1$ for some 

$\alpha \in \C$, where 

$e_1 := \begin{bmatrix} 1 &amp; 0 &amp; \cdots &amp; 0 \end{bmatrix}^\tp$ and 

$x \notin \span \, \set{e_1}$.&lt;/p&gt;
&lt;p&gt;As 

$H$ is unitary and self-adjoint, we must have 

$\abs{\alpha} = \norm{x}$ and 

$\inner{Hx}{x} = \alpha \conj{x_1} \in \R$, which implies that 

$\alpha = \pm \sign(x_1) \norm{x}$ (unless 

$x_1 = 0$, in which case the only constraint is 

$\abs{\alpha} = \norm{x}$). Since 

$\refl{w} x = \alpha e_1$ if and only if 

$\frac{2\inner{x}{w}}{\inner{w}{w}} w = x - \alpha e_1$, using the &lt;strong&gt;Householder vector&lt;/strong&gt; 

$v := x - \alpha e_1$ guarantees that 

$H := \refl{v}$ satisfies 

$Hx = \alpha e_1$. A conventional choice of 

$\alpha$ in this context is 

$\alpha = -\sign(x_1) \norm{x}$ so as to maximize 

$\norm{v}^2 = 2(\norm{x}^2 \mp \abs{x_1} \norm{x})$ for the sake of numerical stability.&lt;/p&gt;
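&lt;p&gt;A NumPy sketch of the Householder vector and the resulting reflection (function names are mine; real data for simplicity):&lt;/p&gt;

```python
import numpy as np

# Householder reflection sending x to alpha e_1, with the conventional
# choice alpha = -sign(x_1) ||x|| so that v = x - alpha e_1 is large.
def householder_vector(x):
    s = np.sign(x[0]) if x[0] != 0 else 1.0   # any sign works when x_1 = 0
    alpha = -s * np.linalg.norm(x)
    v = x.astype(float)
    v[0] = v[0] - alpha
    return v, alpha

def apply_reflection(v, x):
    # refl_v(x) = x - 2 inner(x, v) / inner(v, v) * v
    return x - 2 * (x @ v) / (v @ v) * v

x = np.array([3.0, 4.0, 0.0, 12.0])
v, alpha = householder_vector(x)
y = apply_reflection(v, x)
assert np.isclose(abs(alpha), np.linalg.norm(x))   # |alpha| = ||x|| = 13
assert np.allclose(y, [alpha, 0.0, 0.0, 0.0])      # all but the first entry zeroed
```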
&lt;p&gt;To compute a QR factorization of 

$A$, we can apply Householder reflections successively to introduce zeroes below the diagonal in each column of 

$A$. More precisely, we can find a Householder reflection 

$H \in \C^{m \times m}$ such that


$$
HA = \begin{bmatrix} \alpha &amp; b^\tp \\ &amp; A&#39; \end{bmatrix},
$$
where 

$\alpha \in \C$, 

$b \in \C^{n-1}$, and 

$A&#39; \in \C^{(m-1) \times (n-1)}$ (allowing 

$H = I$ if the subdiagonal entries in the first column of 

$A$ are already zero). Now supposing inductively that 

$A&#39;$ has a QR factorization 

$Q’ R’$, we obtain the factorization


$$
A = 
\underbrace{H^* \begin{bmatrix} 1 &amp; \\ &amp; Q&#39; \end{bmatrix}}_{Q} 
\underbrace{\begin{bmatrix} \alpha &amp; b^\tp \\ &amp; R&#39; \end{bmatrix}}_{R}.
$$&lt;/p&gt;
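&lt;p&gt;The recursion above can be carried out iteratively; a NumPy sketch for real matrices (the function name is mine):&lt;/p&gt;

```python
import numpy as np

# Householder QR: successively reflect to zero out the subdiagonal of each
# column, accumulating Q as a product of reflections (real case, so H* = H).
def householder_qr(A):
    m, n = A.shape
    R = A.astype(float)
    Q = np.eye(m)
    for k in range(min(m, n)):
        x = R[k:, k]
        if np.allclose(x[1:], 0):
            continue                    # subdiagonal already zero: take H = I
        s = np.sign(x[0]) if x[0] != 0 else 1.0
        alpha = -s * np.linalg.norm(x)
        v = x.copy()
        v[0] = v[0] - alpha
        v = v / np.linalg.norm(v)       # unit Householder vector
        R[k:, k:] = R[k:, k:] - 2 * np.outer(v, v @ R[k:, k:])
        Q[:, k:] = Q[:, k:] - 2 * np.outer(Q[:, k:] @ v, v)
    return Q, R

rng = np.random.default_rng(5)
A = rng.standard_normal((5, 3))
Q, R = householder_qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(5))
assert np.allclose(np.tril(R, -1), 0)
```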
&lt;h2 id=&#34;givens-rotations&#34;&gt;Givens rotations&lt;/h2&gt;
&lt;p&gt;Given 

$a, b \in \C$, consider the problem of finding a 

$U \in \mathrm{SU}(2)$ and an 

$r \in \C$ such that 

$U \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} r \\ 0 \end{bmatrix}$. We have


$$
U =
\begin{bmatrix} 
c &amp; s \\
-\conj{s} &amp; \conj{c}
\end{bmatrix},
\quad
\text{where 

$\abs{c}^2 + \abs{s}^2 = 1$}
$$
and 

$ac + bs = r$, 

$b\conj{c} - a\conj{s} = 0$. Since 

$U$ is unitary, we must have 

$r = \omega \sqrt{\abs{a}^2 + \abs{b}^2}$ for some 

$\omega \in \C$ with 

$\abs{\omega} = 1$, and assuming that 

$r \neq 0$ (which is to say that 

$a$ and 

$b$ are not both zero), we obtain


$$
c = \frac{\conj{a}}{\conj{r} \vphantom{\sqrt{\abs{a}^2 + \abs{b}^2}}} = \frac{\omega \conj{a}}{\sqrt{\abs{a}^2 + \abs{b}^2}}, \quad
s = \frac{\conj{b}}{\conj{r} \vphantom{\sqrt{\abs{a}^2 + \abs{b}^2}}} = \frac{\omega \conj{b}}{\sqrt{\abs{a}^2 + \abs{b}^2}}.
$$
A conventional choice in this context is 

$\omega = \sign(a)$, along with 

$U = I$ (and 

$r = 0$) in the case 

$a = b = 0$.&lt;/p&gt;
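&lt;p&gt;A sketch of these formulas in plain Python (standard library only; the helper names are mine):&lt;/p&gt;

```python
import cmath
import math

# Givens coefficients (c, s, r) for complex a, b, with omega = sign(a)
# and the convention U = I, r = 0 in the case a = b = 0.
def csign(z):
    return z / abs(z) if z != 0 else 1.0   # complex sign, sign(0) := 1

def givens(a, b):
    if a == 0 and b == 0:
        return 1.0, 0.0, 0.0
    h = math.hypot(abs(a), abs(b))         # sqrt(|a|^2 + |b|^2)
    omega = csign(a)
    c = omega * a.conjugate() / h
    s = omega * b.conjugate() / h
    return c, s, omega * h

a, b = 3 + 4j, 5j
c, s, r = givens(a, b)
assert cmath.isclose(c * a + s * b, r)                 # first row maps (a, b) to r
assert math.isclose(abs(b * c.conjugate() - a * s.conjugate()), 0.0, abs_tol=1e-12)
assert math.isclose(abs(c) ** 2 + abs(s) ** 2, 1.0)    # unitarity
```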
&lt;p&gt;Thus, if 

$a$ and 

$b$ are the 

$i$&lt;sup&gt;th&lt;/sup&gt; and 

$j$&lt;sup&gt;th&lt;/sup&gt; components of some 

$x \in \C^m$, where 

$i &lt; j$, the &lt;strong&gt;Givens rotation&lt;/strong&gt;


$$
G :=
\begin{bmatrix}
I_{i-1} \\
&amp; c &amp; &amp; s \\
&amp; &amp; I_{(j-1)-i} &amp; \\
&amp; -\conj{s} &amp; &amp; \conj{c} \\
&amp; &amp; &amp; &amp; I_{m-j}
\end{bmatrix}
$$
is a unitary matrix such that the 

$j$&lt;sup&gt;th&lt;/sup&gt; component of 

$Gx$ is zero. (In the real-valued setting, 

$G$ is indeed a rotation in the 

$x_i$-

$x_j$ plane.) Such rotations can evidently be applied to compute a QR factorization of 

$A$ by introducing zeroes below the diagonal of 

$A$ one at a time.&lt;/p&gt;
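&lt;p&gt;A NumPy sketch of a Givens-based QR factorization for real matrices (the function name is mine; each rotation touches only two rows):&lt;/p&gt;

```python
import numpy as np

# QR by Givens rotations (real case): zero out each subdiagonal entry by a
# rotation in the plane of rows j and i; Q accumulates the transposes.
def givens_qr(A):
    m, n = A.shape
    R = A.astype(float)
    Q = np.eye(m)
    for j in range(n):                      # column index
        for i in range(m - 1, j, -1):       # zero R[i, j] against pivot R[j, j]
            a, b = R[j, j], R[i, j]
            if b == 0:
                continue
            h = np.hypot(a, b)
            c, s = a / h, b / h
            G = np.array([[c, s], [-s, c]])
            R[[j, i], :] = G @ R[[j, i], :]
            Q[:, [j, i]] = Q[:, [j, i]] @ G.T
    return Q, R

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 3))
Q, R = givens_qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(4))
assert np.allclose(np.tril(R, -1), 0)
```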
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;If 

$A \in \R^{m \times n}$, a QR factorization is defined analogously; i.e., with 

$Q$ orthogonal.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
  </channel>
</rss>
