Pdf version here: link.

I’m currently in Bonn, as mentioned in the previous post, participating in the Trimester Program organized by the Hausdorff Institute of Mathematics – although my time here is almost over. It has been a very pleasant experience: Bonn is lovely, the studio flat they got me is incredibly nice, Germany won the World Cup (nice game btw) and the talks were interesting. The second week has been pretty busy, since it contained all the main talks plus some unexpected talks in number theory which I attended. The week before that had been more relaxed, but I followed a couple of talks then as well. Here I want to report on Christ’s talk about his work of the last few years, because I found it very interesting and because I had the opportunity to follow a second talk, which focused more specifically on the Hausdorff-Young inequality and helped me clarify some details I was confused about. If you get a chance, go to his talks, they’re really good.

What follows is an account of Christ’s talks – there are probably countless such accounts out there, but here’s another one. This is by no means original work: it stays very close to the talks themselves, and I’m writing it only as a way to understand the material better. I’ll stick to Christ’s notation too. Also, I’m afraid the bibliography won’t be very complete, but I have included his papers, and you can make your way to the other references from there.

**1. Four classical inequalities and their extremizers **

Prof. Christ introduced four famous, apparently unrelated inequalities. These are

- the *Hausdorff-Young inequality*: for all functions $f \in L^p(\mathbb{R}^d)$, with $1 \le p \le 2$,

$$\|\widehat{f}\|_{L^{p'}} \le \|f\|_{L^p}; \tag{H-Y}$$

- the *Young inequality* for convolution: if $\frac{1}{p} + \frac{1}{q} = 1 + \frac{1}{r}$ then

$$\|f \ast g\|_{L^r} \le \|f\|_{L^p}\, \|g\|_{L^q}; \tag{Y}$$

for convenience, he put it in trilinear form

$$\Big| \int_{\mathbb{R}^d} (f_1 \ast f_2)\, f_3\, dx \Big| \le \|f_1\|_{L^{p_1}} \|f_2\|_{L^{p_2}} \|f_3\|_{L^{p_3}};$$

notice the exponents satisfy $\frac{1}{p_1} + \frac{1}{p_2} + \frac{1}{p_3} = 2$ (indeed $p_1 = p$ and same for index 2, i.e. $p_2 = q$, but $p_3 = r'$);

- the *Brunn-Minkowski inequality*: for any two measurable sets $A, B \subset \mathbb{R}^d$ of finite measure it is

$$|A + B|^{1/d} \ge |A|^{1/d} + |B|^{1/d}; \tag{B-M}$$

- the *Riesz-Sobolev inequality*: this is a rearrangement inequality, of the form

$$\iint \chi_A(x)\, \chi_B(y)\, \chi_C(x+y)\, dx\, dy \le \iint \chi_{A^\ast}(x)\, \chi_{B^\ast}(y)\, \chi_{C^\ast}(x+y)\, dx\, dy, \tag{R-S}$$

where $A, B, C \subset \mathbb{R}^d$ are measurable sets and, given a set $E$, the notation $E^\ast$ stands for the symmetrized set given by the ball $B(0, c_d |E|^{1/d})$, where $c_d$ is a constant s.t. $|E^\ast| = |E|$: it’s a ball centered at the origin with the same volume as $E$.
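Since all four inequalities are concrete, one can test them numerically. Here is a quick sanity check of (Y) in $d = 1$ on a discretized line; the grid, the gaussian test functions and the exponents $p = q = 3/2$, $r = 3$ are my own illustrative choices, not from the talk.

```python
import math

# Discretized sanity check of Young's inequality ||f*g||_r <= ||f||_p ||g||_q
# in d = 1, for exponents with 1/p + 1/q = 1 + 1/r (here p = q = 3/2, r = 3).
h = 0.05
N = 401
xs = [h * (i - 200) for i in range(N)]          # grid on [-10, 10]
f = [math.exp(-x * x) for x in xs]              # a gaussian
g = [math.exp(-2 * x * x) for x in xs]          # another gaussian

def lp_norm(u, p):
    # discrete approximation of (integral |u|^p dx)^(1/p)
    return (h * sum(abs(v) ** p for v in u)) ** (1.0 / p)

def convolve(u, v):
    # discrete approximation of (u*v)(x_k) = h * sum_j u(x_j) v(x_k - x_j)
    out = [0.0] * N
    for k in range(N):
        s = 0.0
        for j in range(N):
            m = k - j + 200                     # grid index of x_k - x_j
            if 0 <= m < N:
                s += u[j] * v[m]
        out[k] = h * s
    return out

p, q = 1.5, 1.5
r = 1.0 / (1.0 / p + 1.0 / q - 1.0)             # Young relation gives r = 3
lhs = lp_norm(convolve(f, g), r)
rhs = lp_norm(f, p) * lp_norm(g, q)
print(lhs <= rhs)   # → True (with room to spare: the sharp constant is < 1)
```

Note that the inequality holds with some slack here: as discussed below, the optimal constant is strictly less than $1$, and it is attained only by particular gaussian triplets.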

These inequalities share a large group of symmetries: indeed, they are all invariant w.r.t. the group of affine invertible transformations of $\mathbb{R}^d$ (which includes dilations and translations) – an uncommon feature. Moreover, for all of them the extremizers exist and have been characterized in the past. A natural question then arises:

Is it true that if $f$ (or the triplet $(f_1, f_2, f_3)$, or the sets $(A, B)$ or $(A, B, C)$, where appropriate) is close to realizing the equality, then it must also be close (in an appropriate sense) to an extremizer of the inequality?

Another way to put it is to think of these questions as relative to the stability of the extremizers, and that’s why they are referred to as *fine structure* of the inequalities. If proving the inequality is the first level of understanding it, answering the above question is the second level. As an example, answering the above question for (H-Y) led to a sharpened inequality. Christ’s work was motivated by the fact that nobody seemed to have addressed the question before in the literature, despite being a very natural one to ask.

Now, I’ll get to the heart of the matter soon, but first I have to introduce the extremizers and make the above more rigorous. Let’s look into each inequality separately.

** 1.1. Hausdorff-Young inequality **

First of all, notice that the exponents in the inequality make it invariant w.r.t. the group of invertible linear transformations $GL(d, \mathbb{R})$: indeed if $T \in GL(d, \mathbb{R})$ then

$$\widehat{f \circ T}(\xi) = |\det T|^{-1}\, \widehat{f}\,(T^{-\top} \xi),$$

so that $\|\widehat{f \circ T}\|_{L^{p'}} = |\det T|^{-1 + 1/p'} \|\widehat{f}\|_{L^{p'}} = |\det T|^{-1/p} \|\widehat{f}\|_{L^{p'}}$, while $\|f \circ T\|_{L^p} = |\det T|^{-1/p} \|f\|_{L^p}$, and therefore the ratio $\|\widehat{f}\|_{L^{p'}} / \|f\|_{L^p}$ is unchanged. Moreover, since translations in physical space are modulations in the frequency space, they don’t change the magnitude of $\widehat{f}$, and thus the inequality is invariant w.r.t. translations as well. It follows that the (H-Y) inequality is invariant with respect to the full group of affine invertible transformations. Dually, it’s invariant under modulations as well, and finally under scalar multiplication.

It’s probably well known to anyone that $1$ isn’t the best constant in the H-Y inequality for $\mathbb{R}^d$ [1]: it was proved by Beckner [Be] that $\|\widehat{f}\|_{L^{p'}} \le B_p^d\, \|f\|_{L^p}$ with

$$B_p = \left( \frac{p^{1/p}}{p'^{\,1/p'}} \right)^{1/2}.$$

What Beckner proved is that gaussians are extremizers of H-Y (i.e. functions that realize the equality): a gaussian function is of the form

$$f(x) = C\, e^{-\langle A x, x \rangle + \langle b, x \rangle},$$

with $A$ a positive definite matrix. Later Lieb proved that actually *all* the extremizers of H-Y are gaussians (he did so by exploiting the symmetries pointed out above and the tensorial structure of the Fourier transform in multiple dimensions).
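One can check Beckner’s constant numerically in $d = 1$: the self-dual gaussian $f(x) = e^{-\pi x^2}$ has $\widehat{f} = f$, so the H-Y ratio is directly computable. The discretization choices below are mine.

```python
import math

# The self-dual gaussian f(x) = exp(-pi x^2) attains Beckner's constant
# B_p = (p^{1/p} / p'^{1/p'})^{1/2} in d = 1: since f-hat = f, the ratio
# ||f-hat||_{p'} / ||f||_p is just a ratio of two L^s norms of f.
def lp_norm_gauss(s):
    # ||exp(-pi x^2)||_s via a Riemann sum; exact value is s^{-1/(2s)}
    h = 1e-3
    total = sum(math.exp(-s * math.pi * (h * i) ** 2) for i in range(-6000, 6001))
    return (h * total) ** (1.0 / s)

p = 1.5
pp = p / (p - 1)                                   # dual exponent p' = 3
ratio = lp_norm_gauss(pp) / lp_norm_gauss(p)       # ||f-hat||_{p'} / ||f||_p
beckner = (p ** (1 / p) / pp ** (1 / pp)) ** 0.5   # B_p for p = 3/2
print(abs(ratio - beckner) < 1e-6)                 # → True
```

For $p = 3/2$ this gives $B_p \approx 0.953 < 1$, visibly better than the trivial constant.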

An interesting observation is that one can prove Young’s inequality (Y) for a restricted range of exponents from (H-Y) and Hölder: since $\widehat{f \ast g} = \widehat{f}\, \widehat{g}$, applying (H-Y) in the form $\|h\|_{L^r} \le \|\widehat{h}\|_{L^{r'}}$ (which requires $r \ge 2$), then Hölder with $\frac{1}{r'} = \frac{1}{p'} + \frac{1}{q'}$, and then (H-Y) twice more (which requires $p, q \le 2$), one gets

$$\|f \ast g\|_{L^r} \le \|\widehat{f}\, \widehat{g}\|_{L^{r'}} \le \|\widehat{f}\|_{L^{p'}}\, \|\widehat{g}\|_{L^{q'}} \le \|f\|_{L^p}\, \|g\|_{L^q};$$

one can indeed verify that $\frac{1}{p'} + \frac{1}{q'} = \frac{1}{r'}$ is equivalent to $\frac{1}{p} + \frac{1}{q} = 1 + \frac{1}{r}$.
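The exponent bookkeeping in this argument can be double-checked mechanically with exact rational arithmetic; the sample exponent pairs below are my own.

```python
from fractions import Fraction as F

# Exact-arithmetic check: the Hoelder condition 1/p' + 1/q' = 1/r' on the
# dual side is equivalent to the Young condition 1/p + 1/q = 1 + 1/r.
def dual_recip(s):
    # returns 1/s' = 1 - 1/s
    return 1 - 1 / s

for p, q in [(F(4, 3), F(4, 3)), (F(3, 2), F(6, 5)), (F(8, 5), F(4, 3))]:
    r = 1 / (1 / p + 1 / q - 1)                 # Young relation defines r
    assert dual_recip(p) + dual_recip(q) == dual_recip(r)
print("exponent relations agree")
```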

Finally, Christ remarked (in the second talk) that (H-Y) enjoys a form of strong non-locality: if you take a fixed profile and form the function associated to a union of intervals of fixed length centered at well-separated points, then the ratio $\|\widehat{f}\|_{L^{p'}} / \|f\|_{L^p}$ stays bounded below, uniformly in the separation of the intervals. This is to be interpreted as the fact that the ratio is quite independent of the relative separation of the intervals, and thus one can’t exclude a priori near-extremizing functions that concentrate in more than one point (and thus aren’t close to gaussians, which are concentrated at only one point). This case actually presents itself explicitly in the proof of Theorem 3 below.
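To see this non-locality concretely, here is a numerical sketch (my own, not from the talk) with $f_L = \chi_{[0,1]} + \chi_{[L, L+1]}$ and $p = 4/3$, $p' = 4$: the Fourier side factors as $|\widehat{f_L}(\xi)|^2 = 4\, |\widehat{\chi_{[0,1]}}(\xi)|^2 \cos^2(\pi L \xi)$, and the H-Y ratio essentially does not see the gap $L$.

```python
import math

def ratio(L):
    # H-Y ratio ||f_L-hat||_4 / ||f_L||_{4/3} for f_L = chi_[0,1] + chi_[L,L+1]:
    # |f_L-hat(xi)|^2 = 4 (sin(pi xi)/(pi xi))^2 cos^2(pi L xi),
    # and ||f_L||_{4/3} = 2^{3/4} exactly.
    dxi, K = 5e-4, 40.0
    n = int(2 * K / dxi)
    s = 0.0
    for i in range(n):
        xi = -K + i * dxi
        if xi == 0.0:
            sinc2 = 1.0
        else:
            sinc2 = (math.sin(math.pi * xi) / (math.pi * xi)) ** 2
        s += (4.0 * sinc2 * math.cos(math.pi * L * xi) ** 2) ** 2   # |f-hat|^4
    return (s * dxi) ** 0.25 / 2.0 ** 0.75

r5, r50 = ratio(5), ratio(50)
print(abs(r5 - r50) < 1e-3)   # → True: the ratio barely sees the separation
```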

** 1.2. Young’s convolution inequality **

The optimal constant for Young’s inequality is strictly less than $1$ when the exponents are in the range $1 < p, q, r < \infty$, and this again was proved by Beckner, with some more general results by Brascamp-Lieb. For the curious reader, the optimal constant is

$$\|f \ast g\|_{L^r} \le \big( A_p A_q A_{r'} \big)^d\, \|f\|_{L^p}\, \|g\|_{L^q}, \qquad A_s := \left( \frac{s^{1/s}}{s'^{\,1/s'}} \right)^{1/2}.$$

What’s more important here is that again *all* the extremizers are (particular) triplets of gaussian functions $(G_1, G_2, G_3)$, and this is due to Lieb, and there’s a nice proof via the heat flow monotonicity method by Carlen, Lieb and Loss (2004). The proof can be sketched as follows: let $\Lambda$ be the quantity

$$\Lambda(f_1, f_2, f_3) := \int_{\mathbb{R}^d} (f_1 \ast f_2)\, f_3\, dx,$$

which is essentially the trilinear form above. We assume the functions are positive. The idea is to use the $f_j^{p_j}$’s as initial data and let them evolve indefinitely under the heat flow (i.e. treat them as temperatures). It’s known that they will then approach gaussians under appropriate rescaling, but we have to make sure they’ll approach equality in (Y) too. This will be allowed by the fact that the heat equation flow preserves the mass. Introducing the time variable $t$, let the initial data evolve under the heat flow, i.e. set

$$u_j(t, \cdot) := e^{t \Delta} \big( f_j^{p_j} \big),$$

so that $u_j$ satisfies

$$\partial_t u_j = \Delta u_j, \qquad u_j(0, \cdot) = f_j^{p_j};$$

then consider the quantity

$$\Phi(t) := \Lambda\big( u_1(t)^{1/p_1}, u_2(t)^{1/p_2}, u_3(t)^{1/p_3} \big),$$

thus $\Phi(0) = \Lambda(f_1, f_2, f_3)$. A direct calculation shows that $\Phi'(t) \ge 0$, thus $\Phi$ is non-decreasing. But by the change of variables $x \mapsto t^{1/2} x$ and the relationship amongst the exponents ($\sum_j 1/p_j = 2$), $\Phi(t)$ is also equal to the quantity

$$\Lambda\big( v_1(t)^{1/p_1}, v_2(t)^{1/p_2}, v_3(t)^{1/p_3} \big), \qquad \text{where } v_j(t, x) := t^{d/2}\, u_j(t, t^{1/2} x).$$

Now, as anticipated, the rescaled solutions converge pointwise to gaussians as $t \to +\infty$,

$$v_j(t, \cdot)^{1/p_j} \longrightarrow g_j := \Big( m_j\, (4\pi)^{-d/2}\, e^{-|x|^2/4} \Big)^{1/p_j}, \qquad m_j := \int f_j^{p_j}\, dx,$$

and thus by Fatou’s lemma [2]

$$\lim_{t \to +\infty} \Phi(t) \ge \Lambda(g_1, g_2, g_3)$$

(the limit exists by monotonicity, and one can in fact show it equals $\Lambda(g_1, g_2, g_3)$). Finally, the gimmick is that since mass is preserved by the flow, then $\| u_j(t)^{1/p_j} \|_{L^{p_j}} = \| f_j \|_{L^{p_j}} = \| g_j \|_{L^{p_j}}$ for every $t$, and thus in the end

$$\frac{\Lambda(f_1, f_2, f_3)}{\prod_j \|f_j\|_{L^{p_j}}} = \frac{\Phi(0)}{\prod_j \|f_j\|_{L^{p_j}}} \le \frac{\Lambda(g_1, g_2, g_3)}{\prod_j \|g_j\|_{L^{p_j}}},$$

i.e. the normalized trilinear functional is maximized by gaussians.
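The heat-flow monotonicity can be watched numerically. The following $d = 1$ sketch (all discretization and data choices are mine) evolves rough positive data under the heat flow and checks that the quantity $\Phi(t) = \int (u_1(t)^{1/p_1} \ast u_2(t)^{1/p_2})\, u_3(t)^{1/p_3}\, dx$ increases, for $p_1 = p_2 = p_3 = 3/2$ (so that $\sum_j 1/p_j = 2$).

```python
import math

# 1d numerical illustration of the Carlen-Lieb-Loss monotonicity: Phi(t)
# is non-decreasing when positive data evolve under the heat flow.
h = 0.1
N = 241
c = (N - 1) // 2
xs = [h * (i - c) for i in range(N)]                # grid on [-12, 12]

def conv(u, v):
    # h-weighted discrete convolution on the grid (zero-padded)
    out = [0.0] * N
    for k in range(N):
        s = 0.0
        for j in range(N):
            m = k - j + c
            if 0 <= m < N:
                s += u[j] * v[m]
        out[k] = h * s
    return out

def heat(u, t):
    # evolve u for time t under du/dt = u_xx (convolve with the heat kernel)
    ker = [math.exp(-x * x / (4 * t)) / math.sqrt(4 * math.pi * t) for x in xs]
    return conv(u, ker)

# rough (very non-gaussian) positive initial data: boxes and a tent
u1 = [1.0 if -2 < x < -1 or 1 < x < 2 else 0.0 for x in xs]
u2 = [1.0 if abs(x) < 1 else 0.0 for x in xs]
u3 = [max(0.0, 1 - abs(x)) for x in xs]

p = 1.5   # p1 = p2 = p3 = 3/2, so the exponent relation sum 1/p_j = 2 holds

def phi(t):
    a, b, c3 = (heat(u, t) for u in (u1, u2, u3))
    fa = [v ** (1 / p) for v in a]
    fb = [v ** (1 / p) for v in b]
    fc = [v ** (1 / p) for v in c3]
    return h * sum(w * z for w, z in zip(conv(fa, fb), fc))

values = [phi(t) for t in (0.2, 0.5, 1.0, 2.0)]
print(all(values[i] <= values[i + 1] for i in range(3)))  # monotone in t
```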

We remark here that the (Y) inequality is invariant w.r.t. the affine invertible transformations as well.

An interesting observation is that (Y) implies (H-Y) for even dual exponents, again thanks to the identity $\widehat{f \ast g} = \widehat{f}\, \widehat{g}$: indeed, let $p' = 2k$ be an even integer, then by Plancherel and (Y) applied repeatedly

$$\|\widehat{f}\|_{L^{2k}}^{k} = \big\| \widehat{f}^{\,k} \big\|_{L^2} = \big\| \widehat{f \ast \cdots \ast f} \big\|_{L^2} = \| f \ast \cdots \ast f \|_{L^2} \le \|f\|_{L^p}^{k},$$

where the convolution product has $k$ factors and $p = \frac{2k}{2k-1}$ is the dual exponent to $p' = 2k$. This is indeed how Young himself proved this particular case of (H-Y), before Hausdorff proved the result for all exponents $1 \le p \le 2$.

** 1.3. Brunn-Minkowski inequality **

Observe there are no upper bounds on $|A + B|$ in terms of $|A|$ and $|B|$, since it can well be $|A| = |B| = 0$ but $|A + B| > 0$ (e.g. take $A = [0,1] \times \{0\}$ and $B = \{0\} \times [0,1]$ in $\mathbb{R}^2$, so that $A + B = [0,1]^2$ has measure $1$).

In this case the extremizers are the *convex sets*: $A$ must be convex and $B$ must be homothetic to it (i.e. there’s a translation + homothety that is a bijection between $A$ and $B$). This was proved by Minkowski.

In this case we have again invariance w.r.t. the affine invertible transformations, because $|TE| = |\det T|\, |E|$ and the Minkowski sum behaves well under linear maps, so $T(A + B) = TA + TB$; moreover, the Minkowski sum commutes with translations. Notice that convexity is preserved by affine invertible maps, thus convex sets are transformed into convex sets.
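For axis-parallel boxes everything in (B-M) is computable in closed form, which makes for a quick check (a sketch of mine): for $A = [0, a_1] \times [0, a_2]$ and $B = [0, b_1] \times [0, b_2]$ one has $A + B = [0, a_1 + b_1] \times [0, a_2 + b_2]$, so the inequality reduces to an inequality between products.

```python
import math, random

# Brunn-Minkowski for axis-parallel boxes in R^2:
# |A+B|^(1/2) >= |A|^(1/2) + |B|^(1/2), with equality iff the boxes are
# homothetic (a1/b1 == a2/b2), matching the characterization of extremizers.
random.seed(0)
for _ in range(1000):
    a1, a2, b1, b2 = (random.uniform(0.01, 10) for _ in range(4))
    lhs = math.sqrt((a1 + b1) * (a2 + b2))
    rhs = math.sqrt(a1 * a2) + math.sqrt(b1 * b2)
    assert lhs >= rhs - 1e-12

# equality for a homothetic pair: (1,2) and (2,4) differ by a dilation
print(math.isclose(math.sqrt((1 + 2) * (2 + 4)),
                   math.sqrt(1 * 2) + math.sqrt(2 * 4)))   # → True
```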

** 1.4. Riesz-Sobolev inequality **

Notice the inequality actually holds for general positive functions if you use their symmetric decreasing rearrangements (and that formulation is equivalent to this one). In the case of characteristic functions, though, the RHS has a particularly simple expression, since for balls centered at the origin the convolution $\chi_{A^\ast} \ast \chi_{B^\ast}$ can be computed explicitly.

In the case of the Riesz-Sobolev inequality for characteristic functions, the extremizers are triplets $(A, B, C)$ that are homothetic to the same *ellipsoid* – on the condition that

$$r_{A^\ast} < r_{B^\ast} + r_{C^\ast}$$

holds for all the permutations of $(A, B, C)$ (i.e. their “radii” are comparable), where $r_{E^\ast}$ denotes the radius of the ball $E^\ast$. This was proved by Burchard in ’98. The reason why it has to be ellipsoids is the affine invariance that holds for this inequality as well: indeed, consider an invertible transformation $T$, then $|TE| = |\det T|\, |E|$ and $(TE)^\ast$ is a ball of volume $|\det T|\, |E|$, thus $(TE)^\ast$ is a dilate of $E^\ast$. Thus, since the inequality is certainly an equality for balls centered at the origin, it is so for balls transformed under $T$ too, which are in fact ellipsoids.
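Here is a crude numerical check of (R-S) in $d = 1$, with the sets represented as 0/1 arrays on a grid (the sets and the grid are my own illustrative choices); the symmetrization replaces each set by the centered interval of the same measure.

```python
# Discretized Riesz-Sobolev check in d = 1: the trilinear form
#   T(A,B,C) = integral integral chi_A(x) chi_B(y) chi_C(x+y) dx dy
# does not decrease when A, B, C are replaced by centered intervals
# of the same length.
h = 0.01
N = 2001
c = (N - 1) // 2                          # grid on [-10, 10]

def indic(intervals):
    return [1.0 if any(a <= h * (i - c) < b for a, b in intervals) else 0.0
            for i in range(N)]

def trilinear(A, B, C):
    # h^2 * sum over i, j of A_i B_j C_{i+j} (index shifted so that the
    # entry k represents the point x_k = h (k - c))
    s = 0.0
    for i in range(N):
        if A[i]:
            for j in range(N):
                if B[j] and 0 <= i + j - c < N:
                    s += C[i + j - c]
    return s * h * h

def measure(S):
    return h * sum(S)

def star(S):
    # centered interval with the same measure as S
    m = measure(S)
    return indic([(-m / 2, m / 2)])

A = indic([(-3, -1), (2, 4)])             # measure 4, two pieces
B = indic([(0, 3)])                       # measure 3
C = indic([(-5, -4), (1, 2)])             # measure 2, two pieces
lhs = trilinear(A, B, C)
rhs = trilinear(star(A), star(B), star(C))
print(lhs <= rhs + 1e-9)                  # → True
```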

Here we can spot another relationship amongst the inequalities: using (R-S) we can prove (B-M). Indeed, consider measurable sets $A, B$ and take $C = A + B$; notice that since $A^\ast, B^\ast$ are balls, $A^\ast + B^\ast$ is the ball of radius $r_{A^\ast} + r_{B^\ast}$. Now it suffices to prove that $(A+B)^\ast \supseteq A^\ast + B^\ast$, since then $|A+B| = |(A+B)^\ast| \ge |A^\ast + B^\ast|$, which is exactly (B-M). Since $x + y \in A + B$ whenever $x \in A$, $y \in B$ (except maybe for a set of null measure [3]), by (R-S)

$$|A|\, |B| = \iint \chi_A(x)\, \chi_B(y)\, \chi_{A+B}(x+y)\, dx\, dy \le \iint \chi_{A^\ast}(x)\, \chi_{B^\ast}(y)\, \chi_{(A+B)^\ast}(x+y)\, dx\, dy,$$

but the RHS is at most $|A^\ast|\, |B^\ast| = |A|\, |B|$, with equality only if $\chi_{(A+B)^\ast} = 1$ a.e. on the support of $\chi_{A^\ast} \ast \chi_{B^\ast}$, and since this support is $A^\ast + B^\ast$ we have that $(A+B)^\ast$ contains at least this support.

I will have more to say about this in the near future because I’m planning to read the paper relative to the near-extremizers of this inequality [ChRS] (because it seems more approachable than the others, starting from zero as I do).

**2. Christ’s results **

Let me sum up in a table what has been said so far:

| Invariance | H-Y | Y | B-M | R-S |
|---|---|---|---|---|
| affine invertible maps | ✓ | ✓ | ✓ | ✓ |
| translation | ✓ | ✓ | ✓ | ✓ |
| modulation | ✓ | | | |
| extremizers | gaussians | triplets of gaussians | convex sets | ellipsoids |

Remark 1 There are connections amongst all the extremizers seen so far: besides the trivial fact that an ellipsoid is a convex body, we could notice that an ellipsoid centered at the origin can be specified as

$$E = \{ x \in \mathbb{R}^d \;:\; \langle A x, x \rangle \le 1 \}$$

for a positive definite matrix $A$; in particular, if $\lambda_1, \dots, \lambda_d$ are the eigenvalues of $A$, the principal axes of $E$ have lengths $\lambda_1^{-1/2}, \dots, \lambda_d^{-1/2}$. Thus we can associate gaussians $e^{-\langle A x, x \rangle}$ and ellipsoids in a natural way.

One thing that Christ noticed is that all the above inequalities have some additive combinatorial structure. What does he mean by this? Well, first of all, consider the following objects:

Definition 1 A *discrete multi-progression* in $\mathbb{R}^d$ is a set of the form

$$P = \Big\{ a + \sum_{i=1}^{r} n_i v_i \;:\; n_i \in \mathbb{Z},\ 0 \le n_i \le N_i \Big\},$$

where $a, v_1, \dots, v_r \in \mathbb{R}^d$ (the $v_i$’s are not required to be linearly independent) and the $N_i$ are arbitrary non-negative integers. One defines the rank of $P$ as $r$. Notice it can be bigger than $d$.

Discrete multi-progressions are thought of as a generalization of arithmetic progressions, and they are affine-invariant objects: rank is preserved by an invertible linear transformation. Combining this with the previous observations one is led to

Definition 2 A *continuum multi-progression* in $\mathbb{R}^d$ is a set of the form

$$Q = P + K = \{ p + k \;:\; p \in P,\ k \in K \},$$

where $P$ is a discrete multi-progression and $K$ is a convex compact set. The rank of $Q$ is defined as the rank of $P$.

Thus affine invertible transformations preserve the continuum multi-progressions (from now on referred to simply as multi-progressions). These objects proved to be fundamental in the theory for (Y), (B-M) and (R-S), and finally for (H-Y) by partially reducing to (Y). They encode enough additive structure to be treated with combinatorial results. See below.

Another thing to notice is that the R-S inequality on (say) $\mathbb{Z}$ is counting the number of pairs $(x, y) \in A \times B$ s.t. $x + y \in C$. Analogously, one could see (Y) as a weighted estimate on sumsets.
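The additive-structure heuristic can be made concrete: a rank-2 multi-progression inside $\mathbb{Z}$ (so the rank exceeds the ambient dimension) has a sumset only a bounded factor larger than itself, while a generic random set of the same size has a sumset of roughly quadratic size. The parameters below are my own.

```python
import random

# P = {n1*v1 + n2*v2 : 0 <= ni <= N} is a rank-2 discrete multi-progression
# in Z; v2 = 1000 > N*v1 guarantees there are no collisions, so |P| = (N+1)^2.
N = 30
v1, v2 = 1, 1000
P = {n1 * v1 + n2 * v2 for n1 in range(N + 1) for n2 in range(N + 1)}

def sumset(S):
    return {a + b for a in S for b in S}

random.seed(1)
S = set(random.sample(range(10 ** 9), len(P)))   # random set, same cardinality

print(len(sumset(P)) / len(P))         # ≈ 3.87, bounded by 2^rank = 4
print(len(sumset(S)) / len(S) > 100)   # → True: random sets grow quadratically
```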

In the rest of this section I will collect the results of Christ for the above inequalities, in the order given above.

Theorem 3 (Sharpened Hausdorff-Young inequality, [ChHY])

There exists a constant $c > 0$ s.t. for every non-null $f \in L^p(\mathbb{R}^d)$, for $1 < p < 2$, it holds

$$\|\widehat{f}\|_{L^{p'}} \le \left( 1 - c\, \frac{\operatorname{dist}_{L^p}(f, \mathfrak{G})^2}{\|f\|_{L^p}^2} \right) B_p^d\, \|f\|_{L^p},$$

where $\mathfrak{G}$ is the set of all gaussian functions.

Prof. Christ also pointed out that the constant obtained is not sharp.

Theorem 4 (Near-extremizers for (Y), [ChY])

If $f_1, f_2, f_3$ are such that for some $\delta > 0$

$$\big| \Lambda(f_1, f_2, f_3) \big| \ge (1 - \delta) \big( A_{p_1} A_{p_2} A_{p_3} \big)^d \prod_j \|f_j\|_{L^{p_j}},$$

then there exists $\varepsilon = \varepsilon(\delta)$ and a triplet of gaussians $(G_1, G_2, G_3)$ s.t.

$$\| f_j - G_j \|_{L^{p_j}} \le \varepsilon\, \|f_j\|_{L^{p_j}} \qquad \text{for } j = 1, 2, 3.$$

Moreover, $\varepsilon(\delta) \to 0$ as $\delta \to 0$.

Theorem 5 (Near-extremizers for (B-M), [ChBM])

Let $A, B \subset \mathbb{R}^d$ be Borel measurable sets with $|A|, |B| > 0$ and with $|A|/|B|$ bounded above and below. If, for some $\delta > 0$,

$$|A + B|^{1/d} \le (1 + \delta) \big( |A|^{1/d} + |B|^{1/d} \big),$$

then there exists a convex set $K \supseteq A$ and an $\varepsilon = \varepsilon(\delta)$ s.t.

$$|K \setminus A| \le \varepsilon\, |A|$$

(and similarly for $B$, with a homothetic copy of $K$). Moreover, $\varepsilon(\delta) \to 0$ as $\delta \to 0$.

Theorem 6 (Sharpened Riesz-Sobolev inequality, [ChRS])

There exists a constant $c > 0$ s.t. if $A, B, C \subset \mathbb{R}^d$ are measurable sets of finite measure for which (for a certain $\eta > 0$) it holds

$$r_{A^\ast} \le (1 - \eta) \big( r_{B^\ast} + r_{C^\ast} \big),$$

and the same holds for all permutations of $(A, B, C)$ (i.e. their sizes are all comparable), then the right-hand side of (R-S) can be lowered by $c$ times the square of a suitably normalized distance of the triplet $(A, B, C)$ from the family of extremizing triplets of ellipsoids.

**3. Comments on the proofs **

These comments will regard mainly the proof of Theorem 3, but because of the common structure of the inequalities there are several aspects of this proof that are shared by all of the proofs. In particular, I want to point out that the result for (B-M) is needed in the proof of the one for (R-S), which is in turn needed for the proof of the one for (Y), which is in turn needed to prove the result for (H-Y), as I will highlight below. All in all the proof for (H-Y) is a complicated one and I’m just trying to illustrate (and understand myself) its structure – I will be very sketchy. I’m not able to offer any insight on it at the moment.

Assume $d = 1$ in the following.

A common method to prove existence of extremizers [4] is that of a *pre-compactness argument*: for a linear operator $T$ satisfying $\|Tf\| \le C \|f\|$ with best constant $C$, one seeks to prove that if a sequence of functions $(f_n)_n$ is s.t. $\|f_n\| = 1$ and $\|T f_n\| \to C$, then there must be a subsequence that converges in norm (in other words, the sequence is pre-compact) to a function that realizes equality. Part of the proof of Theorem 3 relies on a qualitative result of this kind,

Theorem 7 For every $\varepsilon > 0$, there exists $\delta > 0$ s.t. if

$$\|\widehat{f}\|_{L^{p'}} \ge (1 - \delta)\, B_p^d\, \|f\|_{L^p},$$

then

$$\operatorname{dist}_{L^p}(f, \mathfrak{G}) \le \varepsilon\, \|f\|_{L^p}.$$

One can verify this is indeed equivalent to the above, with the exception that one concludes pre-compactness of a suitably renormalized sequence, i.e. one obtained by acting on each term with some element of the symmetry group of (H-Y). This is then combined with formula (1′) to deduce Theorem 3. Formula (1′) looks like a Taylor asymptotic expansion, and indeed it is! It’s the asymptotic expansion of the functional around a gaussian. Quite surprising, if you ask me. Of course it isn’t trivial to arrive at (1′) at all, and indeed it required proving a general lemma for the second variation of this and similar functionals, but I don’t want to comment on it further as it’s not as interesting to me as the other ideas in the proof.

This said, the rest of the proof boils down to the proof of Theorem 7.

Definition 8 A *$\delta$-quasi-extremizer* for an inequality like $\|Tf\| \le C \|f\|$ is a function s.t. $\|Tf\| \ge \delta\, C\, \|f\|$. A *$\delta$-near-extremizer* is a function s.t. $\|Tf\| \ge (1 - \delta)\, C\, \|f\|$.

It is an observation that a $\delta$-quasi-extremizer of (H-Y) then satisfies the analogous lower bound for the trilinear Young functional, for some particular values of the exponents, i.e. it is a quasi-extremizer for (Y) as well, thus reducing the theory for (H-Y) to that for (Y) (at least for quasi-extremizers; but one can iterate to tackle the near-extremizers as well). The relationship between the two inequalities is really important, as you see. *Proof:* Indeed, assume $\|\widehat{f}\|_{L^{p'}} \ge \delta\, \|f\|_{L^p}$, and consider the family

which is analytic in the interpolation parameter; on one edge of the strip the relevant quantity is controlled by Plancherel, and we want to use the Three Lines Lemma to interpolate between this endpoint and the quasi-extremality hypothesis on the other edge. It follows that $f$ satisfies an intermediate lower bound of Young type, with exponents determined by the interpolation. Now, there exists a function realizing at least half of the relevant supremum, and testing the bound against it yields the quasi-extremality condition for (Y), with constants depending on $\delta$. One can verify that the interpolation parameter can be taken sufficiently close to the endpoint in order for the exponents to make sense. $\Box$

Now one has to resort to the theory for (Y), a theory that relies heavily on the additive structure. In particular, a quasi-extremizer for (Y) must concentrate its mass on a multiprogression of controlled rank and measure. Rigorously:

Lemma 9 (Quasi-extremizers for (Y)) Suppose $f$ is a $\delta$-quasi-extremizer of (Y). Then there exists a disjoint decomposition of $f$ as

$$f = g + h,$$

with

$h$ *small*, in the sense that $\|h\|_{L^p} \le (1 - c(\delta))\, \|f\|_{L^p}$, and $g$ *structured*, in the sense that there exists a multiprogression $P$ s.t.

- $\operatorname{supp}(g) \subseteq P$;
- $\operatorname{rank}(P) \le C(\delta)$;
- $\|g\|_{L^\infty}\, |P|^{1/p} \le C(\delta)\, \|f\|_{L^p}$.

One can adapt this lemma to the case of quasi-extremizers of (H-Y) by the connection pointed out above. And then, one can address the case of *near-extremizers* of (H-Y) instead, by choosing a small parameter and applying the lemma for quasi-extremizers iteratively to the successive remainders of a near-extremizer, thus obtaining a decomposition of $f$ into finitely many pieces, each supported on some multiprogression of controlled rank and size, plus a remainder of small norm; the parameter is chosen small enough for the iteration to close.

One can further refine this decomposition in such a way that all the multiprogressions are contained in a single multiprogression of controlled rank and size (comparable to the biggest multiprogression in the decomposition); the decomposition then consists of a single structured piece supported on that multiprogression plus a small remainder, thus eliminating the problem of having to deal with multiple multiprogressions at once.

Now by scaling one can reduce to the case where the multiprogression has size comparable to $1$. If it is then made of a single piece, one can deal with it directly because it’s a bounded interval (remember $d = 1$); the opposite case is that of a proper multiprogression with several well-separated pieces. This case resembles the bad example of section 1.1: it’s an effect of the non-locality that’s featured in (H-Y). Part of the proof is then devoted to proving that this case is actually incompatible with the assumption of near-extremizing.

Assume then that $f$ is supported in a neighbourhood of a multiprogression. Since near-extremizers for the (H-Y) inequality in the discrete setting are indeed all concentrated in one point, the idea is to prove that the same happens for $f$ by showing it is essentially a function of the discrete variable. In general, one lifts $f$ to a function of two variables, one discrete and one continuous, by letting the discrete variable index the pieces of the multiprogression and the continuous variable move within each piece. It can be proved that if such an $f$ is a near-extremizer for (H-Y) in $\mathbb{R}$ then its lift is a near-extremizer for (H-Y) in the product group. This in turn implies that for most values of the continuous variable the corresponding discrete slice is a near-extremizer of the discrete (H-Y) and therefore has mass concentrated in one point. This is translated back to $f$, yielding that $f$ has mass nearly all concentrated on an interval $I$.

Plot-twist: if $f$ is a near-extremizer of (H-Y), then (morally) so is $\widehat{f}$! And then everything that’s been said of $f$ so far applies on the frequency side as well, and in particular $\widehat{f}$ is concentrated on an interval $J$. Indeed, pick $g$ with $\|g\|_{L^p} = 1$ realizing $\int \widehat{f}\, g = \|\widehat{f}\|_{L^{p'}}$ (a norming function for $\widehat{f}$, thus concentrated where $\widehat{f}$ is); then by the multiplication formula and Hölder

$$\|\widehat{f}\|_{L^{p'}} = \int \widehat{f}\, g = \int f\, \widehat{g} \le \|f\|_{L^p}\, \|\widehat{g}\|_{L^{p'}},$$

and therefore

$$\|\widehat{g}\|_{L^{p'}} \ge \frac{\|\widehat{f}\|_{L^{p'}}}{\|f\|_{L^p}} \ge (1 - \delta)\, B_p;$$

but $\|g\|_{L^p} = 1$, therefore $\|\widehat{g}\|_{L^{p'}} \ge (1 - \delta)\, B_p\, \|g\|_{L^p}$, which is the exact definition of $\delta$-near-extremizer for $g$.

So, we’ve said so far that if $f$ is a near-extremizer then it’s concentrated in a time-frequency tile $I \times J$ (an interval $I$ in physical space, an interval $J$ in frequency). The uncertainty principle tells us that $|I|\, |J| \gtrsim 1$, but Christ proves that by construction one has a reverse Heisenberg inequality too, $|I|\, |J| \lesssim 1$, and then $|I|\, |J| \sim 1$; by the invariances one can moreover normalize so that both $|I| \sim 1$ and $|J| \sim 1$. At this point, it should be heuristically evident that our function will be close to being a gaussian.
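The scale-invariance behind this time-frequency normalization can be seen on dilated gaussians $f_a(x) = e^{-a x^2}$, whose spatial and frequency widths both vary with $a$ while their product does not. A numerical sketch of mine (widths measured as standard deviations of $|f_a|^2$ and $|\widehat{f_a}|^2$):

```python
import math

# For f_a(x) = exp(-a x^2) the Fourier transform is again a gaussian,
# f_a-hat(xi) = sqrt(pi/a) exp(-pi^2 xi^2 / a), so both widths are explicit;
# their product is independent of a (it equals 1/(4 pi) with this convention).
def widths(a):
    h = 1e-3
    xs = [h * i for i in range(-8000, 8001)]              # grid on [-8, 8]
    f2 = [math.exp(-2 * a * x * x) for x in xs]           # |f_a|^2
    g2 = [math.exp(-2 * math.pi ** 2 * x * x / a) for x in xs]  # |f_a-hat|^2
    def sd(w):
        m = sum(w)
        return math.sqrt(sum(v * x * x for v, x in zip(w, xs)) / m)
    return sd(f2), sd(g2)

prods = []
for a in (0.5, 1.0, 4.0):
    sx, sxi = widths(a)
    prods.append(sx * sxi)
print(all(abs(p - prods[0]) < 1e-6 for p in prods))   # → True: product fixed
```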

Nevertheless, remember we were interested in the precompactness result of Theorem 7. The last result on the time-frequency support can be used to conclude that for a sequence $(f_n)_n$ s.t. $\|f_n\|_{L^p} = 1$ and $\|\widehat{f_n}\|_{L^{p'}} \to B_p$, there must exist a subsequence of renormalized elements $\tilde{f}_n$ (i.e. $\tilde{f}_n$ is obtained from $f_n$ by dilation, translation and modulation, each preserving the norms) s.t. the Fourier transforms $\widehat{\tilde{f}_n}$ converge in $L^{p'}$. Sadly, we wanted convergence of the $\tilde{f}_n$’s in $L^p$ instead, but we can deduce it from the one for the Fourier transforms because the sequence is extremizing and the unit ball of $L^p$ is weakly compact. This is all.

You might’ve noticed I’ve become progressively sketchier in these comments, and the reason is the one I pointed out above: it is indeed a very complicated proof, and a great achievement. Also, I’ve followed the outline of the proof in [ChHY].

I think this is enough for this time, but I hope to come back to the subject in the near future.

**footnotes:**

[1] although it is 1 for other locally compact abelian groups, like $\mathbb{Z}$ and $\mathbb{T}$.

[2] notice the limit function is still a gaussian.

[3] consider the Lebesgue points of the sets…

[4] but it has applications to PDEs as well.

**References:**

[Be] W. Beckner, *Inequalities in Fourier analysis*, Annals of Math., 102, 159-182, 1975.

[ChHY] M. Christ, *A sharpened Hausdorff-Young inequality*, arXiv:1406.1210 [math.CA]

[ChY] M. Christ, *Near extremizers of Young’s inequality for $\mathbb{R}^d$*, arXiv:1112.4875 [math.CA]

[ChBM] M. Christ, *Near equality in the Brunn-Minkowski inequality*, arXiv:1207.5062 [math.CA]

[ChRS] M. Christ, *Near equality in the Riesz-Sobolev inequality*, arXiv:1309.5856 [math.CA]