Affine Restriction estimates imply Affine Isoperimetric inequalities

One thing I absolutely love about harmonic analysis is that it really has something interesting to say about nearly every other field of Analysis. Today’s example is exactly of this kind: I will show how a Fourier Restriction estimate can say something about Affine Geometry. This was first noted by Carbery and Ziesler (see below for references).

1. Affine Isoperimetric Inequality

Recall the Affine Invariant Surface Measure that we have defined in a previous post. Given a hypersurface \Sigma \subset \mathbb{R}^d sufficiently smooth to have a well-defined Gaussian curvature \kappa_{\Sigma}(\xi) (where \xi ranges over \Sigma ) and with surface measure denoted by d\sigma_{\Sigma} , we can define the Affine Invariant Surface measure as the weighted surface measure

\displaystyle d\Omega_{\Sigma}(\xi) := |\kappa_{\Sigma}(\xi)|^{1/(d+1)} \, d\sigma_{\Sigma}(\xi);

this measure has the property of being invariant under the action of SL(\mathbb{R}^d) – hence the name. Here invariant means that if \varphi is an equi-affine map (thus volume preserving) then

\displaystyle \Omega_{\varphi(\Sigma)}(\varphi(E)) = \Omega_{\Sigma}(E)

for any measurable E \subseteq \Sigma .
The Affine Invariant Surface measure can be used to formulate a very interesting result in Affine Differential Geometry – an inequality of isoperimetric type. Let K \subset \mathbb{R}^d be a convex body – say, centred at the origin and symmetric with respect to it, i.e. K = - K . We denote by \partial K the boundary of the convex body K and we can assume for the sake of the argument that \partial K is sufficiently smooth – for example, piecewise C^2-regular, so that the Gaussian curvature is defined at every point except possibly on an \mathcal{H}^{d-1} -null set. Then the Affine Isoperimetric Inequality says that (with \Omega = \Omega_{\partial K} )

\displaystyle \boxed{ \Omega(\partial K)^{d+1} \lesssim |K|^{d-1}.  } \ \ \ \ \ \ \ (\dagger)


Notice that the inequality is indeed invariant with respect to the action of SL(\mathbb{R}^d) – thanks to the fact that d\Omega is. Observe also the curious fact that this inequality goes in the opposite direction with respect to the better known Isoperimetric Inequality of Geometric Measure Theory! Indeed, the latter says (let’s say in the usual \mathbb{R}^d ) that (a power of) the volume of a measurable set is controlled by (a power of) the perimeter of the set; more precisely, for any measurable E \subset \mathbb{R}^d

\displaystyle |E|^{d-1} \lesssim P(E)^d,

where P(E) denotes the perimeter of E – in case E = K a symmetric convex body as above we would have P(K) = \sigma(\partial K) . But in the affine context the “affine perimeter” is \Omega(\partial K) and is controlled by the volume instead of vice versa. This makes perfect sense: if K is taken to be a cube Q then \kappa_{\partial Q} = 0 almost everywhere on \partial Q , so the “affine perimeter” vanishes and cannot control anything. Notice also that the power of the perimeter is d for the standard isoperimetric inequality and it is instead d+1 for the affine isoperimetric inequality. Informally speaking, this is related to the fact that the affine perimeter is measuring curvature too instead of just area.
So, the inequality should actually be called something like “Affine anti-Isoperimetric inequality” to better reflect this, but I don’t get to choose the names.
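
As a quick sanity check of the invariance and of (\dagger ) in dimension d=2, here is a minimal numerical sketch for ellipses. Everything in it is my own illustration: the function names are hypothetical, and it assumes the standard parametrisation \gamma(t) = (a\cos t, b \sin t) , for which the curvature is ab/|\gamma'(t)|^3 . The linear map (x,y) \mapsto (5x, y/5) has determinant 1, so the ellipse with semi-axes (5, 1/5) should have the same affine perimeter as the unit circle, even though its Euclidean perimeter is much larger.

```python
import math

def affine_perimeter_ellipse(a, b, n=20000):
    """Midpoint-rule approximation of Omega = int kappa^{1/3} d sigma
    over the ellipse x^2/a^2 + y^2/b^2 = 1 (so d = 2, exponent 1/(d+1) = 1/3)."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        speed = math.hypot(a * math.sin(t), b * math.cos(t))  # |gamma'(t)|
        kappa = a * b / speed ** 3                             # curvature at gamma(t)
        total += kappa ** (1.0 / 3.0) * speed * dt             # kappa^{1/3} d sigma
    return total

def euclidean_perimeter_ellipse(a, b, n=20000):
    """Midpoint-rule approximation of the usual arclength sigma(partial K)."""
    dt = 2 * math.pi / n
    return sum(math.hypot(a * math.sin((k + 0.5) * dt),
                          b * math.cos((k + 0.5) * dt)) * dt for k in range(n))

def area_ellipse(a, b):
    return math.pi * a * b

for a, b in [(1.0, 1.0), (5.0, 0.2), (2.0, 3.0)]:
    omega = affine_perimeter_ellipse(a, b)
    print(a, b, omega, euclidean_perimeter_ellipse(a, b), omega ** 3 / area_ellipse(a, b))
```

For ellipses the integrand \kappa^{1/3} |\gamma'| is identically (ab)^{1/3} , so \Omega = 2\pi (ab)^{1/3} and the ratio \Omega^3 / |K| equals 8\pi^2 for every ellipse, in agreement with the scaling in (\dagger ).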

The inequality above is formulated for convex bodies since those are the most relevant objects for Affine Geometry. However, below we will see that Harmonic Analysis provides a sweeping generalisation of the inequality to arbitrary hypersurfaces that are not necessarily boundaries of convex bodies. Before showing this generalisation, we need to introduce Affine Fourier restriction estimates, which we do in the next section.


The Chang-Wilson-Wolff inequality using a lemma of Tao-Wright

Today I would like to introduce an important inequality from the theory of martingales that will be the subject of a few more posts. This inequality will further provide the opportunity to introduce a very interesting and powerful result of Tao and Wright – a sort of square-function characterisation for the Orlicz space L(\log L)^{1/2} .

1. The Chang-Wilson-Wolff inequality

Consider the collection \mathcal{D} of standard dyadic intervals that are contained in [0,1] . We let \mathcal{D}_j for each j \in \mathbb{N} denote the subcollection of intervals I \in \mathcal{D} such that |I|= 2^{-j} . Notice that these subcollections generate a filtration of \mathcal{D}, that is (\sigma(\mathcal{D}_j))_{j \in \mathbb{N}}, where \sigma(\mathcal{D}_j) denotes the sigma-algebra generated by the collection \mathcal{D}_j . We can associate to this filtration the conditional expectation operators

\displaystyle  \mathbf{E}_j f := \mathbf{E}[f \,|\, \sigma(\mathcal{D}_j)],

and therefore define the martingale differences

\displaystyle  \mathbf{D}_j f:= \mathbf{E}_{j+1} f - \mathbf{E}_{j}f.

With this notation, we have the formal telescopic identity

\displaystyle  f = \mathbf{E}_0 f + \sum_{j \in \mathbb{N}} \mathbf{D}_j f.

Demystification: the expectation \mathbf{E}_j f(x) is simply \frac{1}{|I|} \int_I f(y) \,dy, where I is the unique dyadic interval in \mathcal{D}_j such that x \in I .

Letting f_j := \mathbf{E}_j f for brevity, the sequence of functions (f_j)_{j \in \mathbb{N}} is called a martingale (hence the name “martingale differences” above) because it satisfies the martingale property that the conditional expectation of “future values” at the present time is the present value, that is

\displaystyle  \mathbf{E}_{j} f_{j+1} = f_j.
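
To make the definitions concrete, here is a small sketch on a discretised function. The setup is an assumption of mine: f is taken to be constant on the 2^n dyadic intervals of length 2^{-n} and is stored as a list of its 2^n values, so that \mathbf{E}_n f = f and the telescopic sum is finite. The helper names are hypothetical, and exact rational arithmetic is used so that the identities can be checked with equality.

```python
from fractions import Fraction

def cond_exp(f, j):
    """E_j f: replace f by its average on each dyadic interval of length 2^-j.
    f is a list of 2^n values (one per dyadic interval of length 2^-n), j <= n."""
    block = len(f) >> j                      # number of cells per interval of D_j
    out = []
    for start in range(0, len(f), block):
        avg = sum(f[start:start + block]) / block
        out.extend([avg] * block)
    return out

def mart_diff(f, j):
    """Martingale difference D_j f = E_{j+1} f - E_j f."""
    return [a - b for a, b in zip(cond_exp(f, j + 1), cond_exp(f, j))]

def telescoped(f, n):
    """E_0 f + sum_{j < n} D_j f, which should reproduce f."""
    total = cond_exp(f, 0)
    for j in range(n):
        total = [t + d for t, d in zip(total, mart_diff(f, j))]
    return total

f = [Fraction(v) for v in (3, -1, 4, 1, -5, 9, 2, -6)]   # n = 3
print(cond_exp(f, 1))            # averages over [0,1/2) and [1/2,1)
print(telescoped(f, 3) == f)     # telescopic identity
print(all(cond_exp(cond_exp(f, j + 1), j) == cond_exp(f, j) for j in range(3)))
```

The last line checks the martingale property \mathbf{E}_j f_{j+1} = f_j , which here is just the statement that an average of averages over equal-sized sub-blocks is the average over the whole block.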

In the following we will only be interested in functions with zero average, that is functions such that \mathbf{E}_0 f = 0. Given such a function f : [0,1] \to \mathbb{R} then, we can define its martingale square function S_{\mathcal{D}}f to be

\displaystyle  S_{\mathcal{D}} f := \Big(\sum_{j \in \mathbb{N}} |\mathbf{D}_j f|^2 \Big)^{1/2}.

With these definitions in place we can state the Chang-Wilson-Wolff inequality as follows.

C-W-W inequality: Let {f : [0,1] \to \mathbb{R}} be such that \mathbf{E}_0 f = 0. For any {2\leq p < \infty} it holds that

\displaystyle  \boxed{\|f\|_{L^p([0,1])} \lesssim p^{1/2}\, \|S_{\mathcal{D}}f\|_{L^p([0,1])}.} \ \ \ \ \ \ (\text{CWW}_1)

An important point about the above inequality is the dependence of the constant on the Lebesgue exponent {p} : the growth {p^{1/2}} is sharp. This can be seen by taking a “lacunary” function {f} (essentially one where \mathbf{D}_jf = a_j \in \mathbb{C} , a constant) and randomising the signs using Khintchine’s inequality (indeed, {p^{1/2}} is precisely the asymptotic behaviour of the constant in Khintchine’s inequality; see Exercise 5 in the 2nd post on Littlewood-Paley theory).
It should be remarked that the inequality extends very naturally and with no additional effort to higher dimensions, in which [0,1] is replaced by the unit cube [0,1]^d and the dyadic intervals are replaced by the dyadic cubes. We will only be interested in the one-dimensional case here though.
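
The p^{1/2} growth can be observed numerically. The sketch below is an illustration under assumptions of my own choosing: I take f = n^{-1/2} \sum_{k=1}^{n} r_k with r_k the Rademacher functions (r_k is \pm 1 , constant on the intervals of \mathcal{D}_k with alternating signs), so that |\mathbf{D}_j f| = n^{-1/2} for 0 \leq j < n and S_{\mathcal{D}} f is identically 1. Since the r_k behave like independent random signs, \|f\|_{L^p} can be computed exactly from the binomial distribution. The function name is hypothetical.

```python
import math

def rademacher_sum_norm(n, p):
    """L^p([0,1]) norm of f = n^{-1/2} (r_1 + ... + r_n).
    The sum r_1 + ... + r_n takes the value 2m - n on a set of measure C(n, m) / 2^n."""
    total = 0.0
    for m in range(n + 1):
        measure = math.comb(n, m) / 2 ** n
        total += measure * abs(2 * m - n) ** p
    return total ** (1.0 / p) / math.sqrt(n)

n = 400
for p in (2, 4, 8, 16, 32):
    norm_f = rademacher_sum_norm(n, p)
    # ||S f||_p = 1 here, so norm_f is exactly the ratio ||f||_p / ||S f||_p
    print(p, norm_f, norm_f / math.sqrt(p))
```

The last column stays bounded above and below as p grows, which is what one expects if the ratio \|f\|_{L^p} / \|S_{\mathcal{D}} f\|_{L^p} is comparable to p^{1/2} for this family of examples.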


Oscillatory integrals II: several-variables phases

This is the second part of a two-part post on the theory of oscillatory integrals. In the first part we studied the theory of oscillatory integrals whose phases are functions of a single variable. In this post we will instead study the case in which the phase is a function of several variables (and we integrate in all of them). Here the theory becomes weaker because these objects can indeed have a worse behaviour. We will proceed by analogy following the same footsteps as in the single-variable case.
Part I

3. Oscillatory integrals in several variables

In the previous section we have analysed the situation for single variable phases, that is for integrals over (intervals of) {\mathbb{R}}. In this section, we will instead study the higher dimensional situation, when the phase is a function of several variables. Things are unfortunately generally not as nice as in the single variable case, as you will see.

In order to avoid having to worry about connected open sets of {\mathbb{R}^d} (see Exercise 18 for the sort of issues that arise in trying to deal with general open sets of {\mathbb{R}^d}), in this section we will study mainly objects of the form

\displaystyle  \mathcal{I}_{\psi}(\lambda) := \int_{\mathbb{R}^d} e^{i \lambda u(x)} \psi(x) dx,

where {\psi} has compact support. We have switched to {u} for the phase to remind the reader of the fact that it is a function of several variables now.

3.1. Principle of non-stationary phase – several variables

The principle of non-stationary phase we saw in Section 2 of part I continues to hold in the several variables case.
Given a phase {u}, we say that {x_0} is a critical point of {u} if

\displaystyle  \nabla u(x_0) = (0,\ldots,0).

Proposition 8 (Principle of non-stationary phase – several variables) Let {\psi \in C^\infty_c(\mathbb{R}^d)} (that is, smooth and compactly supported) and let the phase {u\in C^\infty} be such that {u} does not have critical points in the support of {\psi}. Then for any {N >0} we have

\displaystyle  |\mathcal{I}_\psi(\lambda)|\lesssim_{N,\psi,u} |\lambda|^{-N}.
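
A rough numerical illustration of Proposition 8 in d=2 follows. All choices are assumptions made for the sake of the example: \psi(x,y) = b(x) b(y) with b(t) = e^{-1/(1-t^2)} on (-1,1) , the phase u(x,y) = x + y^2 (whose gradient (1, 2y) never vanishes), and for comparison the phase x^2 + y^2 , which has a critical point at the origin. The integral is approximated by a plain midpoint rule, which is adequate here only because \lambda is small compared with the grid resolution.

```python
import cmath
import math

def bump(t):
    """Smooth bump supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def osc_integral(phase, lam, n=200):
    """Midpoint-rule approximation of int e^{i lam u(x,y)} b(x) b(y) dx dy over [-1,1]^2."""
    h = 2.0 / n
    pts = [-1.0 + (k + 0.5) * h for k in range(n)]
    weights = [bump(t) for t in pts]
    total = 0j
    for x, wx in zip(pts, weights):
        for y, wy in zip(pts, weights):
            total += cmath.exp(1j * lam * phase(x, y)) * wx * wy
    return total * h * h

def non_stationary(x, y):
    return x + y * y        # gradient (1, 2y) is never zero

def stationary(x, y):
    return x * x + y * y    # critical point at the origin

for lam in (10.0, 20.0, 40.0):
    print(lam, abs(osc_integral(non_stationary, lam)), abs(osc_integral(stationary, lam)))
```

For the phase with a critical point one expects decay of order \lambda^{-1} only, whereas the first column should fall off much faster, consistently with the bound |\lambda|^{-N} for every N .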


Marcinkiewicz-type multiplier theorem for q-variation (q > 1)

Not long ago we discussed one of the main direct applications of the Littlewood-Paley theory, namely the Marcinkiewicz multiplier theorem. Recall that the single-variable version of this theorem can be formulated as follows:

Theorem 1 [Marcinkiewicz multiplier theorem]: Let {m} be a function on \mathbb{R} such that

  1. m \in L^\infty
  2. for every Littlewood-Paley dyadic interval L := [2^k, 2^{k+1}] \cup [-2^{k+1},-2^k] with k \in \mathbb{Z}

    \displaystyle \|m\|_{V(L)} \leq C,

    where \|m\|_{V(L)} denotes the total variation of {m} over the interval L .

Then for any {1 < p < \infty} the multiplier {T_m} defined by \widehat{T_m f} = m \widehat{f} for functions f \in L^2(\mathbb{R}) extends to an L^p \to L^p bounded operator,

\displaystyle \|T_m f\|_{L^p} \lesssim_p (\|m\|_{L^\infty} + C) \|f\|_{L^p}.

You should also recall that the total variation \|m\|_{V(I)} of {m} over an interval I is defined as

\displaystyle \sup_{N}\sup_{\substack{t_0, \ldots, t_N \in I : \\ t_0 < \ldots < t_N}} \sum_{j=1}^{N} |m(t_j) - m(t_{j-1})|,

and if {m} is absolutely continuous then {m'} exists as a measurable function and the total variation over the interval I is given equivalently by \int_{I} |m'(\xi)|d\xi . We have seen that the “dyadic total variation condition” 2.) above is to be seen as a generalisation of the pointwise condition |m'(\xi)|\lesssim |\xi|^{-1} , which in dimension 1 happens to coincide with the classical differential Hörmander condition (in higher dimensions the pointwise Marcinkiewicz conditions are of product type, while the pointwise Hörmander(-Mikhlin) conditions are of radial type; see the relevant post). Thus the Marcinkiewicz multiplier theorem in dimension 1 can deal with multipliers whose symbol is somewhat rougher than being differentiable. It is interesting to ask how much rougher the symbols can get while still preserving their L^p mapping properties (or maybe giving up some range – recall though that the range of boundedness for multipliers must be symmetric around 2, that is invariant under p \mapsto p' , because the adjoint of T_m is the multiplier with symbol \overline{m} ).

Coifman, Rubio de Francia and Semmes came up with a very interesting answer to this question. They generalise the Marcinkiewicz multiplier theorem (in dimension 1) to multipliers that have bounded {q} -variation with {q > 1}. Let us define this quantity rigorously.

Definition: Let q \geq 1 and let I be an interval. Given a function f : \mathbb{R} \to \mathbb{R}, its {q} -variation over the interval {I} is

\displaystyle \|f\|_{V_q(I)} := \sup_{N} \sup_{\substack{t_0, \ldots, t_N \in I : \\ t_0 < \ldots < t_N}} \Big(\sum_{j=1}^{N} |f(t_j) - f(t_{j-1})|^q\Big)^{1/q}.

Notice that, with respect to the notation above, we have \|m\|_{V(I)} = \|m\|_{V_1(I)} . From the fact that \|\cdot\|_{\ell^q} \leq \|\cdot \|_{\ell^p} when p \leq q we see that we always have \|f\|_{V_q (I)} \leq \|f\|_{V_p(I)} , and therefore the higher the {q} the less stringent the condition of having bounded {q} -variation becomes (this is linked to the Hölder regularity of the function getting worse). In particular, if we wanted to weaken hypothesis 2.) in the Marcinkiewicz multiplier theorem above, we could simply replace it with the condition that for any Littlewood-Paley dyadic interval L we have instead \|m\|_{V_q(L)} \leq C . This is indeed what Coifman, Rubio de Francia and Semmes do, and they were able to show the following:
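
For a function known only through finitely many samples, the supremum in the definition runs over increasing subsequences of the samples and can be computed by dynamic programming. The sketch below does exactly that; it is a discrete stand-in for \|f\|_{V_q(I)} (an assumption of the example, since the true supremum is over all finite subsets of I ), and the function name is hypothetical.

```python
def q_variation(values, q):
    """q-variation of the finite sequence `values`:
    sup over indices i_0 < ... < i_N of (sum_j |v[i_j] - v[i_{j-1}]|^q)^{1/q}."""
    # best[j] = largest q-th power sum over subsequences ending at index j
    best = [0.0] * len(values)
    for j in range(len(values)):
        for i in range(j):
            candidate = best[i] + abs(values[j] - values[i]) ** q
            if candidate > best[j]:
                best[j] = candidate
    return max(best, default=0.0) ** (1.0 / q)

oscillating = [0, 1, 0, 1]
monotone = [0, 1, 2, 3]
for q in (1, 2, 4):
    print(q, q_variation(oscillating, q), q_variation(monotone, q))
```

For the monotone sequence the supremum for q > 1 is attained by the single jump from the first to the last sample, while for the oscillating one every jump should be kept; in both cases the value does not increase as q grows, as predicted by the nesting of the \ell^q norms.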

Theorem 2 [Coifman-Rubio de Francia-Semmes, ’88]: Let q\geq 1 and let {m} be a function on \mathbb{R} such that

  1. m \in L^\infty
  2. for every Littlewood-Paley dyadic interval L := [2^k, 2^{k+1}] \cup [-2^{k+1},-2^k] with k \in \mathbb{Z}

    \displaystyle \|m\|_{V_q(L)} \leq C.

Then for any {1 < p < \infty} such that {\Big|\frac{1}{2} - \frac{1}{p}\Big| < \frac{1}{q} } the multiplier {T_m} defined by \widehat{T_m f} = m \widehat{f} extends to an L^p \to L^p bounded operator,

\displaystyle \|T_m f\|_{L^p} \lesssim_p (\|m\|_{L^\infty} + C) \|f\|_{L^p}.

The statement is essentially the same as before, except that now we are imposing control of the {q} -variation instead and as a consequence we have the restriction that our Lebesgue exponent {p} satisfy {\Big|\frac{1}{2} - \frac{1}{p}\Big| < \frac{1}{q} }. Taking a closer look at this condition, we see that when the variation parameter is 1 \leq q \leq 2 the condition is empty, that is there is no restriction on the range of boundedness of T_m : it is still the full range {1 < p < \infty} , and as {q} grows larger and larger the range of boundedness shrinks towards the exponent p=2 (for which the multiplier is always necessarily bounded, by Plancherel). This is a very interesting behaviour, which points to the fact that there is a certain dichotomy between variation in the range below 2 and the range above 2, with 2 -variation being the critical case. This is not an isolated case: for example, the Variation Norm Carleson theorem is false for {q} -variation with {q \leq 2} ; similarly, the Lépingle inequality is false for 2-variation and below (and this is related to the properties of Brownian motion).
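
The arithmetic behind the range of exponents is easy to tabulate. The helper below (a hypothetical name, written only for illustration) solves \big|\frac{1}{2} - \frac{1}{p}\big| < \frac{1}{q} for p , intersected with 1 < p < \infty , using exact fractions; the upper endpoint is reported as None when it is \infty .

```python
from fractions import Fraction

def crs_exponent_range(q):
    """Open interval (p_lo, p_hi) of exponents with |1/2 - 1/p| < 1/q and 1 < p < infinity.
    p_hi is None when the upper endpoint is infinity."""
    q = Fraction(q)
    half = Fraction(1, 2)
    inv_p_max = min(half + 1 / q, Fraction(1))   # 1/p must stay below 1
    inv_p_min = max(half - 1 / q, Fraction(0))   # 1/p must stay above 0
    p_lo = 1 / inv_p_max
    p_hi = None if inv_p_min == 0 else 1 / inv_p_min
    return p_lo, p_hi

for q in (1, 2, 3, 4, 10):
    print(q, crs_exponent_range(q))
```

Whenever the upper endpoint is finite the two endpoints are dual exponents of each other, in accordance with the symmetry of the range around 2; for q = 4 , for instance, the range is 4/3 < p < 4 .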


Kovač’s solution of the maximal Fourier restriction problem

About 2 years ago, Müller, Ricci and Wright published a paper that opened a new line of investigation in the field of Fourier restriction: that is, the study of the pointwise meaning of the Fourier restriction operators. Here is an account of a recent contribution to this problem that largely sorts it out.

1. Maximal Fourier Restriction
Recall that, given a smooth submanifold \Sigma of \mathbb{R}^d with surface measure d\sigma , the restriction operator {R} is defined (initially) for Schwartz functions as

\displaystyle f \mapsto Rf:= \widehat{f}\Big|_{\Sigma};

it is only after having proven an a-priori estimate such as \|Rf\|_{L^q(\Sigma,d\sigma)} \lesssim \|f\|_{L^p(\mathbb{R}^d)} that we can extend {R} to an operator over the whole of L^p(\mathbb{R}^d), by density of the Schwartz functions. However, it is no longer clear what the relationship is between this new operator that has been operator-theoretically extended and the original operator that had a clear pointwise definition. In particular, a non-trivial question to ask is whether for d\sigma  -a.e. point \xi \in \Sigma we have

\displaystyle \lim_{r \to 0} \frac{1}{|B(0,r)|} \int_{\eta \in B(0,r)} \widehat{f}(\xi - \eta) d\eta = \widehat{f}(\xi), \ \ \ \ \ (1)


where B(0,r) is the ball of radius {r} and center {0} . Observe that the Lebesgue differentiation theorem already tells us that for a.e. element of \mathbb{R}^d in the Lebesgue sense the above holds; but the submanifold \Sigma has Lebesgue measure zero, and therefore the differentiation theorem cannot give us any information. In this sense, the question above is about the structure of the set of the Lebesgue points of \widehat{f} and can be reformulated as:

Q: can the complement of the set of Lebesgue points of \widehat{f} contain a copy of the manifold \Sigma ?


Basic Littlewood-Paley theory II: square functions

This is the second part of the series on basic Littlewood-Paley theory, which has been extracted from some lecture notes I wrote for a masterclass. In this part we will prove the Littlewood-Paley inequalities, namely that for any {1 < p < \infty} it holds that

\displaystyle \|f\|_{L^p (\mathbb{R})} \sim_p \Big\|\Big(\sum_{j \in \mathbb{Z}} |\Delta_j f|^2 \Big)^{1/2}\Big\|_{L^p (\mathbb{R})}. \ \ \ \ \ (\dagger)


This time there are also plenty more exercises, some of which I think are fairly interesting (one of them is a theorem of Rudin in disguise).
Part I: frequency projections.

4. Smooth square function

In this subsection we will consider a variant of the square function appearing at the right-hand side of (\dagger ) where we replace the frequency projections {\Delta_j} by better behaved ones.

Let {\psi} denote a smooth function with the properties that {\psi} is compactly supported in the intervals {[-4,-1/2] \cup [1/2, 4]} and is identically equal to {1} on the intervals {[-2,-1] \cup [1,2]}. We define the smooth frequency projections {\widetilde{\Delta}_j} by stipulating

\displaystyle  \widehat{\widetilde{\Delta}_j f}(\xi) := \psi(2^{-j} \xi) \widehat{f}(\xi);

notice that the function {\psi(2^{-j} \xi)} is supported in {[-2^{j+2},-2^{j-1}] \cup [2^{j-1}, 2^{j+2}]} and identically {1} in {[-2^{j+1},-2^{j}] \cup [2^{j}, 2^{j+1}]}. The reason why such projections are better behaved resides in the fact that the functions {\psi(2^{-j}\xi)} are now smooth, unlike the characteristic functions {\mathbf{1}_{[2^j,2^{j+1}]}}. Indeed, they are actually Schwartz functions and you can see by Fourier inversion formula that {\widetilde{\Delta}_j f = f \ast (2^{j} \widehat{\psi}(2^{j}\cdot))}; the convolution kernel {2^{j} \widehat{\psi}(2^{j}\cdot)} is uniformly in {L^1} and therefore the operator is trivially {L^p \rightarrow L^p} bounded for any {1 \leq p \leq \infty} by Young’s inequality, without having to resort to the boundedness of the Hilbert transform.
We will show that the following smooth analogue of (one half of) (\dagger ) is true (you can study the other half in Exercise 6).

Proposition 3 Let {\widetilde{S}} denote the square function

\displaystyle  \widetilde{S}f := \Big(\sum_{j \in \mathbb{Z}} \big|\widetilde{\Delta}_j f \big|^2\Big)^{1/2}.

Then for any {1 < p < \infty} we have that the inequality

\displaystyle  \big\|\widetilde{S}f\big\|_{L^p(\mathbb{R})} \lesssim_p \|f\|_{L^p(\mathbb{R})} \ \ \ \ \ (1)

holds for any {f \in L^p(\mathbb{R})}.

We will give two proofs of this fact, to illustrate different techniques. We remark that the boundedness will depend on the smoothness and the support properties of {\psi} only, and as such extends to a larger class of square functions.

Basic Littlewood-Paley theory I: frequency projections

I have written some notes on Littlewood-Paley theory for a masterclass, which I thought I would share here as well. This is the first part, covering some motivation, the case of a single frequency projection and its vector-valued generalisation. References I have used in preparing these notes include Stein’s “Singular integrals and differentiability properties of functions“, Duoandikoetxea’s “Fourier Analysis“, Grafakos’ “Classical Fourier Analysis” and as usual some material by Tao, both from his blog and the notes for his courses. Prerequisites are some basic Fourier transform theory, Calderón-Zygmund theory of euclidean singular integrals and its vector-valued generalisation (to Hilbert spaces, we won’t need Banach spaces).

0. Introduction
Harmonic analysis makes fundamental use of divide-et-impera approaches. A particularly fruitful one is the decomposition of a function in terms of the frequencies that compose it, which is prominently incarnated in the theory of the Fourier transform and Fourier series. In many applications however it is not necessary or even useful to resolve the function {f} at the level of single frequencies and it suffices instead to consider how wildly different frequency components behave. One example of this is the (formal) decomposition of functions of {\mathbb{R}} given by

\displaystyle f = \sum_{j \in \mathbb{Z}} \Delta_j f,

where {\Delta_j f} denotes the operator

\displaystyle \Delta_j f (x) := \int_{\{\xi \in \mathbb{R} : 2^j \leq |\xi| < 2^{j+1}\}} \widehat{f}(\xi) e^{2\pi i \xi \cdot x} d\xi,

commonly referred to as a (dyadic) frequency projection. Thus {\Delta_j f} represents the portion of {f} with frequencies of magnitude {\sim 2^j}. The Fourier inversion formula can be used to justify the above decomposition if, for example, {f \in L^2(\mathbb{R})}. Heuristically, since any two {\Delta_j f, \Delta_{k} f} oscillate at significantly different frequencies when {|j-k|} is large, we would expect that for most {x}’s the different contributions to the sum cancel out more or less randomly; a probabilistic argument typical of random walks (see Exercise 1) leads to the conjecture that {|f|} should behave “most of the time” like {\Big(\sum_{j \in \mathbb{Z}} |\Delta_j f|^2 \Big)^{1/2}} (the last expression is an example of a square function). While this is not true in a pointwise sense, we will see in these notes that the two are indeed interchangeable from the point of view of {L^p}-norms: more precisely, we will show that for any {1 < p < \infty} it holds that

\displaystyle  \boxed{ \|f\|_{L^p (\mathbb{R})} \sim_p \Big\|\Big(\sum_{j \in \mathbb{Z}} |\Delta_j f|^2 \Big)^{1/2}\Big\|_{L^p (\mathbb{R})}. }\ \ \ \ \ (\dagger)

This is a result historically due to Littlewood and Paley, which explains the name given to the related theory. The {p=2} case follows immediately from Plancherel’s theorem, to which the statement is essentially equivalent. Therefore one could interpret the above as a substitute for Plancherel’s theorem in generic {L^p} spaces when {p\neq 2}.
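
The {p=2} case can be checked by hand on a toy example. The sketch below works in a periodic analogue of the setting above, which is an assumption made purely for convenience: f is a trigonometric polynomial \sum_k c_k e^{2\pi i k x} on [0,1] with c_0 = 0 , and \Delta_j f collects the frequencies with 2^j \leq |k| < 2^{j+1} . All names are hypothetical.

```python
import cmath
import math

def dyadic_block(k):
    """The index j with 2^j <= |k| < 2^{j+1} (k is a nonzero integer)."""
    return abs(k).bit_length() - 1

def evaluate(coeffs, x):
    """Value at x of the trigonometric polynomial with coefficients {k: c_k}."""
    return sum(c * cmath.exp(2j * math.pi * k * x) for k, c in coeffs.items())

def l2_norms_squared(coeffs, n_grid=256):
    """Return (||f||_2^2, ||S f||_2^2) on [0,1], with S f = (sum_j |Delta_j f|^2)^{1/2}.
    The equispaced Riemann sum is exact as long as n_grid exceeds twice the top frequency."""
    blocks = {}
    for k, c in coeffs.items():
        blocks.setdefault(dyadic_block(k), {})[k] = c
    norm_f = norm_sf = 0.0
    for m in range(n_grid):
        x = m / n_grid
        norm_f += abs(evaluate(coeffs, x)) ** 2 / n_grid
        norm_sf += sum(abs(evaluate(b, x)) ** 2 for b in blocks.values()) / n_grid
    return norm_f, norm_sf

coeffs = {1: 1, -1: 0.5j, 2: -2, 3: 1 + 1j, 5: 0.25, -6: 1, 12: -1j, 40: 2}
norm_f, norm_sf = l2_norms_squared(coeffs)
print(norm_f, norm_sf, sum(abs(c) ** 2 for c in coeffs.values()))
```

All three printed numbers should agree up to rounding: the cross terms between different dyadic blocks integrate to zero by orthogonality of the exponentials, which is the whole content of the {p=2} case.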

In developing a framework that allows us to prove (\dagger ) we will encounter some variants of the square function above, including ones with smoother frequency projections that are useful in a variety of contexts. We will moreover show some applications of the above fact and its variants. One of these applications will be a proof of the boundedness of the spherical maximal function {\mathscr{M}_{\mathbb{S}^{d-1}}} (almost verbatim the one on Tao’s blog).

Notation: We will use {A \lesssim B} to denote the estimate {A \leq C B} where {C>0} is some absolute constant, and {A\sim B} to denote the fact that {A \lesssim B \lesssim A}. If the constant {C} depends on a list of parameters {L} we will write {A \lesssim_L B}.


Christ’s result on near-equality in Riesz-Sobolev inequality

Pdf: link.

It’s finally time to address one of Christ’s papers I talked about in the previous two blogposts. As mentioned there, I’ve chosen to read the one about the near-equality in the Riesz-Sobolev inequality because it seems the more approachable, while still containing one very interesting idea: exploiting the additive structure lurking behind the inequality via Freiman’s theorem.

1. Elaborate an attack strategy

Everything is in dimension {d=1} and some details of the proof are specific to this dimension and don’t extend to higher dimensions. I’ll stick to Christ’s notation.

Recall that the Riesz-Sobolev inequality is

\displaystyle \boxed{\left\langle \chi_{A} \ast \chi_{B}, \chi_{C}\right\rangle \leq \left\langle \chi_{A^\ast} \ast \chi_{B^\ast}, \chi_{C^\ast}\right\rangle} \ \ \ \ \ (1)

and its extremizers – which exist under the hypothesis that the sizes are all comparable – are intervals, i.e. the intervals are the only sets that realize equality in (1). See previous post for further details. The aim of paper [ChRS] is to prove that whenever {\left\langle \chi_{A} \ast \chi_{B}, \chi_{C}\right\rangle} is suitably close to {\left\langle \chi_{A^\ast} \ast \chi_{B^\ast}, \chi_{C^\ast}\right\rangle} (i.e. we nearly have equality) then the sets {A,B,C} are nearly intervals themselves.


Freiman’s theorem and compact subsets of the real line with additive structure

Here is the pdf version: link.

In the following, I shall use {|A|} to denote either the Lebesgue measure of {A}, when {A} is a subset of {\mathbb{R}}, or the cardinality of {A}, when {A} is a finite set. This shouldn’t cause any confusion, and it helps highlight the parallel with the continuous case.

For the sake of completeness, we remind the reader that the Minkowski sum of two sets {A,B} is defined as

\displaystyle A+B:=\{a+b \,:\, a\in A, b\in B\}.
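
For finite sets of integers the Minkowski sum is a one-liner, and it is instructive to compare the size of A+A for a very structured set and for a very unstructured one. The sets below are my own choices for illustration, and the helper names are hypothetical; the interval helper mirrors the continuous case, where the sum of two intervals is again an interval.

```python
def sumset(A, B):
    """Minkowski sum A + B of two finite sets of integers."""
    return {a + b for a in A for b in B}

def interval_sum(I, J):
    """Minkowski sum of two closed intervals given as (left, right) pairs."""
    return (I[0] + J[0], I[1] + J[1])

progression = set(range(0, 50, 5))           # arithmetic progression with 10 terms
scattered = {2 ** i - 1 for i in range(6)}   # 0, 1, 3, 7, 15, 31

print(len(progression), len(sumset(progression, progression)))  # 2|A| - 1 sums
print(len(scattered), len(sumset(scattered, scattered)))        # |A|(|A|+1)/2 sums
print(interval_sum((0.0, 1.0), (2.0, 2.5)))                     # lengths add up
```

An arithmetic progression has the smallest possible sumset, of size 2|A|-1 , whereas for the second set all the sums a+b with a \leq b are distinct.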

 

I’ve been shamefully sketchy in the previous post about Christ’s work on near extremizers, and in particular I haven’t addressed properly one of the most important ideas in his work: exploiting the hidden additive structure of the inequalities. I plan to do that in this post and a following one, in which I’ll sketch his proof of the sharpened Riesz-Sobolev inequality.

In that paper, one is interested in proving that triplets of sets {A,B,C \subset \mathbb{R}^d} that nearly realize equality in Riesz-Sobolev inequality

\displaystyle \left\langle \chi_{A} \ast \chi_{B}, \chi_{C}\right\rangle \leq \left\langle \chi_{A^\ast} \ast \chi_{B^\ast}, \chi_{C^\ast}\right\rangle

must be close to the extremizers of the inequality, which are ellipsoids (check this previous post for details and notation). In case {d=1}, ellipsoids are just intervals, and one wants to prove there exist intervals {I,J,K} s.t. {A \Delta I, B\Delta J, C\Delta K} are very small.

Christ devised a tool that can be used to prove that a set on the line must nearly coincide with an interval. It’s the following

Proposition 1 (Christ, [ChRS2]; continuum Freiman’s theorem) Let {A\subset \mathbb{R}} be a measurable set with finite measure {>0}. If

\displaystyle |A+A|< 3|A|,

then there exists an interval {I} s.t. {A\subset I} [1] and

\displaystyle |I| \leq |A+A|-|A|.

Thus if one can exploit the near equality to spot some additive structure, one has a chance to prove the sets must nearly coincide with intervals. It turns out that there actually is additive structure concealed in the Riesz-Sobolev inequality: consider the superlevel sets

\displaystyle S_{A,B}(t):=\{x \in \mathbb{R} \,:\, \chi_A \ast \chi_B (x) > t\};

then one can prove that

\displaystyle S_{A,B} (t) - S_{A,B} (t') \subset S_{A,-A}(t+t' - |B|).

If one can control the measure of the set on the right by {|S_{A,B} (t)|} for some specific value of {t=t'}, then Proposition 1 can be applied, and {S_{A,B}(t)} will nearly coincide with an interval. Then one has to prove this fact extends to {A,B,C}, but that’s what the proof in [ChRS] is about and I will address it in the following post, as said.
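
The inclusion between superlevel sets can be tested in a discrete model. The sketch below is an analogue of my own and not the setting of [ChRS]: A, B are finite subsets of \mathbb{Z} , the measure is the counting measure, so that \chi_A \ast \chi_B(x) counts the pairs (a,b) \in A \times B with a+b = x , and the thresholds t, t' are taken non-negative. In this model the inclusion follows from the same pigeonhole argument on B . The function names are hypothetical.

```python
from collections import Counter

def conv_counts(A, B):
    """chi_A * chi_B on Z: maps x to the number of pairs (a, b) in A x B with a + b = x."""
    return Counter(a + b for a in A for b in B)

def superlevel(A, B, t):
    """S_{A,B}(t) = {x : chi_A * chi_B(x) > t}, for a threshold t >= 0."""
    return {x for x, count in conv_counts(A, B).items() if count > t}

def inclusion_holds(A, B, t, t2):
    """Check S_{A,B}(t) - S_{A,B}(t2) is contained in S_{A,-A}(t + t2 - |B|)."""
    minus_A = {-a for a in A}
    counts = conv_counts(A, minus_A)         # a missing key counts as 0
    threshold = t + t2 - len(B)
    return all(counts[x - y] > threshold
               for x in superlevel(A, B, t)
               for y in superlevel(A, B, t2))

A = {0, 1, 2, 3, 5, 8}
B = {0, 1, 2, 4, 7}
print(all(inclusion_holds(A, B, t, t2) for t in range(5) for t2 in range(5)))
```

The reason is that if x and y both have more than t , respectively t' , representations, then the two sets of admissible b \in B overlap in more than t + t' - |B| elements, and each common b produces a representation x - y = (x - b) - (y - b) with both terms in A .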

Anyway, the result in Prop. 1, despite being stated in a continuum setting, is purely combinatorial. It follows – by a limiting argument – from a big result in additive combinatorics: Freiman’s theorem.

The aim of this post is to show how Prop. 1 follows from Freiman’s theorem, and to prove Freiman’s theorem with additive combinatorial techniques. It isn’t necessary at all in order to appreciate the results in [ChRS], but I thought it was nice anyway. I haven’t stated the theorem yet though, so here it is:

Theorem 2 (Freiman’s {3k-3} theorem) Let {A\subset \mathbb{Z}} be finite and such that

\displaystyle |A+A| < 3|A|-3.

Then there exists an arithmetic progression {P} s.t. {A\subseteq P}, whose length is {|P|\leq |A+A|-|A|+1}.
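
Before going through the proof, one can at least test the statement by brute force on small sets. The sketch below (hypothetical helper names, and in no way a substitute for the proof) enumerates all non-empty subsets of \{0, \ldots, n-1\} , and for those satisfying |A+A| < 3|A|-3 compares the length of the shortest arithmetic progression containing A with the bound |A+A|-|A|+1 .

```python
from functools import reduce
from itertools import combinations
from math import gcd

def sumset_size(A):
    """|A + A| for a finite set of integers A."""
    return len({a + b for a in A for b in A})

def shortest_ap_length(A):
    """Number of terms of the shortest arithmetic progression containing A.
    Its common difference is the gcd of the differences a - min(A)."""
    lo = min(A)
    step = reduce(gcd, (a - lo for a in A), 0)
    return 1 if step == 0 else (max(A) - lo) // step + 1

def freiman_check(A):
    """True if A violates the hypothesis |A+A| < 3|A| - 3 or satisfies the conclusion."""
    k, s = len(A), sumset_size(A)
    if not s < 3 * k - 3:
        return True
    return shortest_ap_length(A) <= s - k + 1

def verify_all(n):
    """Check the theorem on every non-empty subset of {0, ..., n-1}."""
    return all(freiman_check(set(A))
               for k in range(1, n + 1)
               for A in combinations(range(n), k))

print(verify_all(10))
A = {0, 1, 2, 4}
print(sumset_size(A), shortest_ap_length(A))   # hypothesis holds, and the bound is attained
```

The set \{0,1,2,4\} shows that the length bound cannot be improved in general: there |A+A| = 8 < 9 and the shortest progression containing it has exactly |A+A|-|A|+1 = 5 terms.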

The proof isn’t extremely hard, but it isn’t trivial either. It relies on a few lemmas, and it is fully contained in section 2. Section 1 contains instead the limiting procedure mentioned above that allows one to deduce Proposition 1 from Freiman’s theorem.

Remark 1 Notice that Proposition 1 is essentially a result for the near-extremizers of Brunn-Minkowski’s inequality in {\mathbb{R}^1}, which states that {|A+A|\geq |A|+|A|}. Indeed the extremizers for B-M are convex sets, which in dimension 1 means the intervals. Thus Prop 1 is saying that if {|A+A|} isn’t much larger than {2|A|}, then {A} is close to being an extremizer, i.e. an interval. One can actually prove that for two sets {A,B}, if one has

\displaystyle |A+B| \leq |A|+|B|+\min(|A|,|B|)

then {\mathrm{diam}(A) \leq |A+B|-|B|}. A proof can be found in [ChRS]. It is in this sense that the result in [ChBM] for Brunn-Minkowski was used to prove the result in [ChRS] for Riesz-Sobolev, which was then used for Young’s and thus for Hausdorff-Young, as mentioned in the previous post.


Fine structure of some classical affine-invariant inequalities and near-extremizers (account of a talk by Michael Christ)

Pdf version here: link.

I’m currently in Bonn, as mentioned in the previous post, participating in the Trimester Program organized by the Hausdorff Institute of Mathematics – although my time here is almost over. It has been a very pleasant experience: Bonn is lovely, the studio flat they got me is incredibly nice, Germany won the World Cup (nice game btw) and the talks were interesting. The 2nd week has been pretty busy since there were all the main talks and some more unexpected talks in number theory which I attended. The week before that had been more relaxed instead, but I followed a couple of talks then as well. Here I want to report on Christ’s talk on his work in the last few years, because I found it very interesting and because I had the opportunity to follow a second talk, which was more specific to the Hausdorff-Young inequality and helped me clarify some details I was confused about. If you get a chance, go to his talks, they’re really good.

What follows is an account of Christ’s talks – there are probably countless out there, but here’s another one. This is by no means original work, it’s very close to the talks themselves and I’m doing it only as a way to understand the material better. I’ll stick to Christ’s notation too. Also, I’m afraid the bibliography won’t be very complete, but I have included his papers; you can make your way to the other ones from there.

1. Four classical inequalities and their extremizers

Prof. Christ introduced four famous, apparently unrelated inequalities. These are

  • the Hausdorff-Young inequality: for all functions {f \in L^p (\mathbb{R}^d)}, with {1\leq p \leq 2},

    \displaystyle \boxed{\|\widehat{f}\|_{L^{p'}}\leq \|f\|_{L^p};} \ \ \ \ \ \ \ \ \ \ \text{(H-Y)}

  • the Young inequality for convolution: if {1+\frac{1}{q_3}=\frac{1}{q_1}+\frac{1}{q_2}} then

    \displaystyle \|f \ast g\|_{L^{q_3}} \leq \|f\|_{L^{q_1}}\|g\|_{L^{q_2}};

    for convenience, he put it in trilinear form

    \displaystyle \boxed{ |\left\langle f\ast g, h \right\rangle|\leq \|f\|_{L^{p_1}}\|g\|_{L^{p_2}}\|h\|_{L^{p_3}}; } \ \ \ \ \ \ \ \ \ \ \text{(Y)}

    notice the exponents satisfy {\frac{1}{p_1}+\frac{1}{p_2}+\frac{1}{p_3}=2} (indeed {q_1=p_1} and same for index 2, but {p_3 = q'_3});

  • the Brunn-Minkowski inequality: for any two measurable sets {A,B \subset \mathbb{R}^d} of finite measure it is

    \displaystyle \boxed{ |A+B|^{1/d} \geq |A|^{1/d} + |B|^{1/d}; } \ \ \ \ \ \ \ \ \ \ \text{(B-M)}

  • the Riesz-Sobolev inequality: this is a rearrangement inequality, of the form

    \displaystyle \boxed{ \left\langle \chi_A \ast \chi_B, \chi_C \right\rangle \leq\left\langle \chi_{A^\ast} \ast \chi_{B^\ast}, \chi_{C^\ast} \right\rangle,} \ \ \ \ \ \ \ \ \ \ \text{(R-S)}

    where {A,B,C} are measurable sets and, given a set {E}, the notation {E^\ast} stands for the symmetrized set given by the ball {B(0, c_d |E|^{1/d})}, where {c_d} is a constant s.t. {|E|=|E^\ast|}: it’s a ball with the same volume as {E}.
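
Two of the inequalities in the list are easy to test numerically in simplified models. The sketch below makes assumptions of its own: Young’s inequality is checked for finitely supported sequences on \mathbb{Z} with counting measure (where it also holds with constant 1), and Brunn-Minkowski is checked only for axis-parallel boxes \prod_i [0,a_i] , whose Minkowski sum is the box with side lengths a_i + b_i . All helper names are hypothetical.

```python
import math

def lp_norm(seq, p):
    """l^p norm of a finite sequence."""
    return sum(abs(v) ** p for v in seq) ** (1.0 / p)

def convolve(f, g):
    """Convolution of two finitely supported sequences on Z."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def young_sides(f, g, q1, q2):
    """Both sides of ||f * g||_{q3} <= ||f||_{q1} ||g||_{q2}, with 1 + 1/q3 = 1/q1 + 1/q2.
    Requires 1/q1 + 1/q2 > 1 so that q3 is finite."""
    q3 = 1.0 / (1.0 / q1 + 1.0 / q2 - 1.0)
    return lp_norm(convolve(f, g), q3), lp_norm(f, q1) * lp_norm(g, q2)

def brunn_minkowski_boxes(a, b):
    """Both sides of |A+B|^{1/d} >= |A|^{1/d} + |B|^{1/d} for boxes with side lengths a, b."""
    d = len(a)
    vol_sum = math.prod(x + y for x, y in zip(a, b))
    return vol_sum ** (1.0 / d), math.prod(a) ** (1.0 / d) + math.prod(b) ** (1.0 / d)

print(young_sides([1.0, -2.0, 0.5, 3.0], [0.5, 0.5, -1.0], 4 / 3, 4 / 3))
print(brunn_minkowski_boxes([1.0, 4.0, 2.0], [3.0, 1.0, 0.5]))
print(brunn_minkowski_boxes([1.0, 4.0, 2.0], [2.0, 8.0, 4.0]))   # homothetic boxes
```

In the last line the two boxes are dilates of each other, and the two sides of (B-M) coincide up to rounding; this is consistent with the extremizers being convex sets that are homothetic to one another.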

These inequalities share a large group of symmetries, indeed they are all invariant w.r.t. the group of affine invertible transformations (which includes dilations and translations) – an uncommon feature. Moreover, for all of them the extremizers exist and have been characterized in the past. A natural question then arises

Is it true that if {f} (or {E}, or {\chi_E} where appropriate) is close to realizing the equality, then {f} must also be close (in an appropriate sense) to an extremizer of the inequality?

Another way to put it is to think of these questions as relative to the stability of the extremizers, and that’s why they are referred to as fine structure of the inequalities. If proving the inequality is the first level of understanding it, answering the above question is the second level. As an example, answering the above question for (H-Y) led to a sharpened inequality. Christ’s work was motivated by the fact that nobody seemed to have addressed the question before in the literature, despite being a very natural one to ask.
