Thom's gradient conjecture, proved in this paper, asserts that convergent gradient flows of analytic functions on $\mathbb{R}^n$ cannot spiral forever. More precisely, the projection of the flow onto the unit sphere must converge.
In my paper linked below, I show that this result also holds for gradient flows of analytic functions on infinite dimensional Hilbert spaces, provided that the second derivative is a Fredholm operator. This is similar in spirit to the extension by L. Simon of the Lojasiewicz inequality to the same setting. I also show that the result holds for geometric flows with a gauge symmetry, such as the Yang-Mills flow.
Thom's Gradient Conjecture for Parabolic Systems and the Yang-Mills Flow
My name is Lorenz Schabrun and I'm a mathematician and classical pianist living in Sydney.
Saturday, 20 February 2021
An infinite dimensional curve selection lemma
Let $X \subset V$ be a semianalytic set with $0 \in \overline X$, i.e. there exists a sequence $x_n \in X$ with $x_n \to 0$. If we allow $V$ to be a finite dimensional Hilbert space for a moment, the curve selection lemma tells us that there exists an analytic curve $\gamma:[0,\varepsilon) \to V$ with $\gamma(0)=0$ and $\gamma((0,\varepsilon)) \subseteq X$.
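As a small worked example (the specific set is an assumption chosen for illustration, not taken from the text above), take $V=\mathbb{R}^2$ and the cusp \[ X = \{(x,y) : y^2 = x^3,\ x > 0\}. \] Then $0 \in \overline X$, and the analytic curve $\gamma(t) = (t^2,t^3)$ satisfies $\gamma(0)=0$ and $\gamma(t) \in X$ for $t>0$. Note that no parametrisation of the form $(t,y(t))$ with $y$ analytic at $0$ exists here, since on $X$ we have $y = \pm t^{3/2}$; the lemma only guarantees that some analytic parametrisation exists.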
Since the curve selection lemma is central to many proofs concerning semianalytic sets on finite dimensional spaces, it's interesting to consider when a similar result might hold for sets defined through inequalities involving analytic functions on infinite dimensional spaces.
The curve selection lemma often functions as a kind of compactness result that allows us to restrict attention to a one-dimensional curve. Like the Lojasiewicz inequality, it won't hold in general in infinite dimensions, and this failure can be linked to the non-compactness of the unit sphere. For example, suppose we have $\mathcal{E}(u) = \|u\|^3 - c(u)\|u\|^2$. For an orthonormal basis $\{e_i\}$, we can arrange that the coefficient $c(e_i) \to 0$ as $i \to \infty$, as we cycle through the infinite number of dimensions available. Thus, the set $\{ \mathcal{E}(u) > 0 \}$ contains a sequence approaching the origin but contains no analytic curve emanating from the origin.
First, note that the desired curve exists if and only if there exists at least one sequence $x_n \to 0$ with $x_n \in N \subset X$, where $N$ is a finite dimensional analytic manifold, since the ordinary curve selection lemma can then be applied.
We consider now the special case of a function with a Hessian that is elliptic.
Let $V$ be a Hilbert space and let $U \subseteq V$ be an open subset. Let $\mathcal{E} \in C^2(U)$ be an analytic function and assume that $0 \in U$ is a critical point, i.e. $\mathcal{E}'(0) = 0$. We suppose that $\mathcal{E}''(0)$ is a Fredholm operator, that is, it has finite-dimensional kernel and cokernel, and closed range. We also assume for convenience that $\mathcal{E}(0) = 0$. We define the set $W^\varepsilon = \{u:\mathcal{E}(u)\neq 0, \varepsilon\|\mathcal{E}_\theta\| \leq |\mathcal{E}_r| \}$, where $\mathcal{E}_r$ and $\mathcal{E}_\theta$ denote the radial and angular components of the derivative $\mathcal{E}'$.
Let $P$ be the orthogonal projection onto $\ker \mathcal{E}''(0)$ and $P'$ the adjoint projection. We define the finite dimensional analytic manifold \[ S = \{u \in U \mid (I-P')\mathcal{E}'(u)=0 \}, \] and denote by $Q$ the nonlinear projection onto $S$ (see [1] for details). We have the following Taylor expansion: \[ \mathcal{E}(u) = \mathcal{E}(Qu) + \frac{1}{2}\langle \mathcal{E}''(Qu)(u-Qu),u-Qu \rangle + O(\|u-Qu\|^3). \]
$\bf{Lemma}$ Define the set $K \subseteq U$ by $K = \{u \in U \mid \mathcal{E}(u) + \mathcal{H}(u) \ \sigma \ 0\}$, where $\sigma \in \{<,\leq,>,\geq\}$ and $\mathcal{H}$ is an analytic function consisting only of terms of order 3 and higher. Suppose $0 \in \overline{K \cap W^\varepsilon}$. Then there exist $\delta > 0$ and an analytic curve $\gamma:[0,\delta) \to V$ with $\gamma(0)=0$ and $\gamma((0,\delta)) \subseteq K \cap W^\varepsilon$.
$\bf{Proof}$ Since $\mathcal{H}$ consists only of higher order terms which can be incorporated into the higher order terms of $\mathcal{E}$, we assume that $\mathcal{H}=0$. We also assume for readability that $\sigma$ is $>$, since the other cases are analogous. We have \begin{align*} K & = \{u \in U \mid \mathcal{E}(Qu) + (\mathcal{E}(u) - \mathcal{E}(Qu)) > 0\} \\ & = \{u \in U \mid \mathcal{E}(Qu) + \frac{1}{2}\langle \mathcal{E}''(Qu)(u-Qu),u-Qu \rangle + O(\|u-Qu\|^3) > 0\}. \end{align*} From 12.15 of [1], we know that $\|(I-P')\mathcal{E}'\| \geq c\|u-Qu\|$. Inside $W^\varepsilon$ the triangle inequality gives $\|\mathcal{E}'\| \leq |\mathcal{E}_r| + \|\mathcal{E}_\theta\| \leq (1+\varepsilon^{-1})|\mathcal{E}_r|$, so, allowing the constant $c>0$ to change from one inequality to the next, \[ |\mathcal{E}_r| \geq c\|\mathcal{E}'\| \geq c\|u-Qu\| \;\; (*). \]
We can assume that $\mathcal{E}(Qu) \leq 0$ in a neighbourhood of $0$, since otherwise we can apply the usual curve selection lemma to the finite dimensional manifold $S$. We can write the quadratic term as \[ \frac{1}{2}\langle \mathcal{E}''(Qu)\hat{u},\hat{u} \rangle \|u-Qu\|^2, \] where $\hat{u} = (u - Qu) / \| u - Qu \|$. If there exists $u_0$ such that the quadratic term is positive, then it is trivial to find the required curve. Thus, we may assume that \[ \frac{1}{2}\langle \mathcal{E}''(Qu)\hat{u},\hat{u} \rangle \leq 0 \] in $W^\varepsilon$.
By assumption there exists a sequence $u_n \in K \cap W^\varepsilon$ with $u_n \to 0$. Since $\mathcal{E}(u_n) > 0$, the only remaining case is \[ \frac{1}{2}\langle \mathcal{E}''(Qu_n)\hat{u}_n,\hat{u}_n \rangle \to 0. \] Since the derivative must grow linearly along $V_1$, this can only happen if the radial component of $\mathcal{E}''(Qu_n)(u_n-Qu_n)$ is going to zero. This however violates $(*)$, since we are inside $W^\varepsilon$.
We remark that, unlike in the finite dimensional case, a curve selection lemma will not hold for the set $S$ outside of $W^\varepsilon$, as the Hessian cannot control the behaviour of the higher order terms where the linear growth in the derivative has no radial component. However, a curve selection lemma may hold for other expressions, such as those involving the derivative $\mathcal{E}'$.
[1] Chill, R., Fasangova, E., Gradient Systems
Saturday, 6 February 2021
The Lojasiewicz inequality for non-analytic functions
A function $f:\mathbb{R}^n \to \mathbb{R}$ satisfies a Lojasiewicz inequality at $0$ if in a neighbourhood of $0$ we have
\[
|\nabla f| \geq c|f|^\rho,
\]
for some $c>0$ and $\rho \in [\frac{1}{2},1)$.
It is well known that the Lojasiewicz inequality holds for analytic functions. While analyticity is sufficient for the Lojasiewicz inequality to hold, it is not necessary. Simple examples like $f(x) = x^2 + e^{-1/x^2}$ (with the exponential term set to $0$ at $x=0$) demonstrate this. What then is an appropriate weaker condition?
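As a quick numerical sanity check of the claim that analyticity is not necessary, the sketch below takes the non-analytic term to be $e^{-1/x^2}$, extended by $0$ at the origin. This specific choice, the sample grid and the helper names are assumptions made for illustration. It evaluates the ratio $|f'(x)|/|f(x)|^{1/2}$ at points approaching $0$; a ratio bounded away from zero is what a Lojasiewicz inequality with $\rho = \frac{1}{2}$ requires.

```python
import math


def f(x):
    """f(x) = x^2 + exp(-1/x^2), with the flat term set to 0 at x = 0."""
    flat = math.exp(-1.0 / (x * x)) if x != 0.0 else 0.0
    return x * x + flat


def df(x):
    """Derivative of f: 2x + (2/x^3) exp(-1/x^2) for x != 0, and 0 at x = 0."""
    if x == 0.0:
        return 0.0
    return 2.0 * x + (2.0 / x**3) * math.exp(-1.0 / (x * x))


def lojasiewicz_ratio(x, rho=0.5):
    """Ratio |f'(x)| / |f(x)|^rho; only meaningful for x != 0."""
    return abs(df(x)) / abs(f(x)) ** rho


# Geometric grid of sample points shrinking towards the origin.
xs = [0.5 * 0.8**k for k in range(60)]
ratios = [lojasiewicz_ratio(x) for x in xs]
print("smallest ratio on the grid:", min(ratios))
```

A finite grid proves nothing, but here a short calculation shows the ratio is at least $2$ whenever $0 < |x| \le \sqrt{2}$, which is what the printed minimum reflects.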
A function $f:\mathbb{R}^n \to \mathbb{R}$ is analytic at $0$ if it is locally equal to its Taylor series $T(x)$, i.e., $f(x)=T(x)$. For a non-analytic function let's write \[ f(x) = T(x) + \omega(x), \] where $\omega$ has a Taylor series which is identically zero at the origin. In other words, $\omega$ is the "non-analytic" part of the function. For the Lojasiewicz inequality to hold, $\omega$ need not be zero; it is in fact enough that $\omega$ is dominated by the function's Taylor series in a certain sense.
To see this, observe that if the Lojasiewicz inequality does not hold, then for any sequences $c_n \to 0$ and $\rho_n \to 1$, we can find a sequence $x_n \to 0$ such that \[ |\nabla f(x_n)| < c_n|f(x_n)|^{\rho_n}. \] We can choose the sequence $x_n$ to converge to $0$ as fast as we like.
Let $\mathcal{C}$ be the set of smooth curves emanating from $0$, parameterised by arc length. Consider the sets \[ \mathcal{C}_{a,k}^\varepsilon = \{\gamma \in \mathcal{C}; |\nabla f(\gamma(t))| \geq at^k \; \forall \; t \in [0,\varepsilon) \}, \] \[ X_{a,k}^\varepsilon = \cup_{\gamma \in \mathcal{C}_{a,k}^\varepsilon} \gamma([0,\varepsilon)). \] Clearly the Lojasiewicz inequality holds inside any such set $X_{a,k}^\varepsilon$, even for a function which is not analytic. Thus the sequence $x_n$ is eventually outside $X_{a,k}^\varepsilon$ for any $k \in \mathbb{N}$ arbitrarily large and any $a,\varepsilon$ arbitrarily small. Intuitively, we might guess that the sequence $x_n$ is (in some appropriate sense) asymptoting to the analytic variety \[ Z(\nabla T) = \{x: \nabla T = 0\}. \]
If the sequence $x_n$ lies on an analytic curve through the origin, then on that curve we must have $\nabla T = 0$. The analytic variety $Z(\nabla T)$ admits a Whitney stratification into a finite number of analytic manifolds at $0$. We hope that we can arrange that the sequence $x_n$ is asymptoting to $Z(\nabla T)$ faster than any given polynomial in $r$. From a previous post, we know this is not true for an arbitrary sequence. Since $T$ and $\nabla T$ are analytic, they satisfy Lojasiewicz inequalities, one form of which is \[ \|\nabla T(x)\| \ge C\, \mathrm{dist}(x,Z(\nabla T))^\alpha, \] \[ |T(x)| \ge C\, \mathrm{dist}(x,Z(T))^\alpha. \] Note that near the origin $Z(T)$ contains $Z(\nabla T)$, so we can use $Z(T)$ for both. We also use suboptimal constants in exchange for the simplicity of having the same constants in both inequalities.
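To make the two distance inequalities concrete, here is a one-dimensional illustration (the choice $T(x) = x^k$ is an assumption made for the example). For $T(x) = x^k$ with $k \ge 2$ we have $Z(T) = Z(\nabla T) = \{0\}$, so $\mathrm{dist}(x,Z(T)) = |x|$ and \[ |T(x)| = |x|^k, \qquad |\nabla T(x)| = k|x|^{k-1} \ge |x|^k \quad \text{for } |x| \le 1. \] Thus both inequalities hold on $|x| \le 1$ with the common constants $C = 1$ and $\alpha = k$, which are suboptimal for the gradient but shared by both.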
$\bf{Theorem:}$ A non-analytic function $f = T + \omega:\mathbb{R}^n \to \mathbb{R}$ satisfies the Lojasiewicz inequality if the flat or non-analytic part $\omega$ satisfies
\[ \lim_{x \to 0} \frac{\,|\omega(x)| + \|\nabla \omega(x)\|\,}{\operatorname{dist}(x,Z(T))^N} \;=\; 0 \] for all positive integers $N$. The same Lojasiewicz exponent as for $T$ may be used.
It's important to realise that this condition is not satisfied by just any function with zero Taylor series at the origin.
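A concrete pair of examples may help here (both are illustrative choices, not taken from the argument above). On $\mathbb{R}^2$ take $T(x,y) = x^2$, so that $Z(T) = \{x = 0\}$ and $\operatorname{dist}((x,y),Z(T)) = |x|$. The function $\omega_1(x,y) = e^{-1/x^2}$, set to $0$ on $x=0$, satisfies \[ \frac{|\omega_1| + \|\nabla\omega_1\|}{|x|^N} = \frac{e^{-1/x^2}\,(1 + 2|x|^{-3})}{|x|^N} \to 0 \quad \text{as } x \to 0 \] for every $N$, so the condition holds. By contrast, $\omega_2(x,y) = e^{-1/(x^2+y^2)}$ also has zero Taylor series at the origin, but it is strictly positive at the points $(0,y)$ with $y \neq 0$, where the distance to $Z(T)$ vanishes, so the condition fails. Indeed, along the $y$-axis the function $f = T + \omega_2$ satisfies \[ f(0,y) = e^{-1/y^2}, \qquad \|\nabla f(0,y)\| = \frac{2}{|y|^3}\,e^{-1/y^2}, \] so that $\|\nabla f\|/|f|^\rho = 2|y|^{-3}e^{-(1-\rho)/y^2} \to 0$ as $y \to 0$ for every $\rho < 1$, and no Lojasiewicz inequality can hold for this $f$.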
To derive the gradient inequality for $f=T+\omega$ under this assumption, fix a small parameter $\eta>0$. By our earlier argument, we can restrict attention to a polynomial neighbourhood $\mathcal{H}$ of $Z(T)$ (so that $\operatorname{dist}(x,Z(T))\le r^k$ for some $k$, where $r = \|x\|$). Then using our growth condition on $\omega$ and $\nabla \omega$ and the Lojasiewicz inequalities for $T$ and $\nabla T$, we can achieve that for all $x\in \mathcal{H}$, \[ |\omega(x)| \le \eta\,|T(x)|, \qquad \|\nabla\omega(x)\| \le \eta\,\|\nabla T(x)\|. \] On $\mathcal{H}$ the triangle inequality for $f$ implies \[ |f(x)| = |T(x)+\omega(x)| \le |T(x)| + |\omega(x)| \le (1+\eta)\,|T(x)|, \] and for the gradients one has \[ \|\nabla f(x)\| = \|\nabla T(x) + \nabla\omega(x)\| \ge \|\nabla T(x)\| - \|\nabla\omega(x)\| \ge (1-\eta)\,\|\nabla T(x)\|. \] Using the other form of the Lojasiewicz inequality for the analytic function $T$, there exist constants $c>0$ and $\rho \in \bigl[\tfrac{1}{2},1\bigr)$ such that \[ \|\nabla T(x)\| \ge c\,|T(x)|^{\rho} \] for $x$ sufficiently close to the origin. Combining this estimate with the inequalities on $\mathcal{H}$ yields \[ \|\nabla f(x)\| \ge (1-\eta)\,c\,|T(x)|^{\rho} \ge (1-\eta)\,c\,(1+\eta)^{-\rho}\,|f(x)|^{\rho}. \] Hence $f$ satisfies a Lojasiewicz inequality on $\mathcal{H}$: \[ \|\nabla f(x)\| \ge c'\,|f(x)|^{\rho},\quad x \in \mathcal{H}, \] where the modified constant is \[ c' = \frac{(1-\eta)\,c}{(1+\eta)^{\rho}}. \] Since $\eta$ may be chosen arbitrarily small, the constant $c'$ can be made as close to $c$ as desired.
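The chain of inequalities above can be checked numerically on a toy case. The sketch below is only an illustration under stated assumptions: it takes $T(x) = x^2$ on $\mathbb{R}$ (so $c = 2$ and $\rho = \frac{1}{2}$, since $|T'| = 2|T|^{1/2}$), the flat part $\omega(x) = e^{-1/x^2}$, the value $\eta = 0.1$, and a hand-picked radius $0.4$. None of these choices or helper names come from the argument itself.

```python
import math


def T(x):
    """Assumed analytic part: T(x) = x^2."""
    return x * x


def dT(x):
    return 2.0 * x


def omega(x):
    """Assumed flat part: exp(-1/x^2), extended by 0 at the origin."""
    return math.exp(-1.0 / (x * x)) if x != 0.0 else 0.0


def domega(x):
    return (2.0 / x**3) * omega(x) if x != 0.0 else 0.0


def modified_constant(c, rho, eta):
    """c' = (1 - eta) c / (1 + eta)^rho from the derivation."""
    return (1.0 - eta) * c / (1.0 + eta) ** rho


eta, c, rho = 0.1, 2.0, 0.5
c_prime = modified_constant(c, rho, eta)

# Sample points in (0, 0.4], shrinking geometrically towards the origin.
xs = [0.4 * 0.9**k for k in range(80)]

# Domination of the flat part by the Taylor part, for both value and gradient.
domination_ok = all(
    abs(omega(x)) <= eta * abs(T(x)) and abs(domega(x)) <= eta * abs(dT(x))
    for x in xs
)

# Resulting gradient inequality for f = T + omega with the modified constant.
gradient_ok = all(
    abs(dT(x) + domega(x)) >= c_prime * abs(T(x) + omega(x)) ** rho
    for x in xs
)

print("c' =", c_prime, "domination:", domination_ok, "gradient:", gradient_ok)
```

Shrinking `eta` (and the radius along with it) moves `c_prime` towards `c`, which matches the closing remark of the derivation.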