Basic Littlewood-Paley theory III: applications

This is the last part of a three-part series on the basics of Littlewood-Paley theory. Today we discuss a couple of applications, namely the Marcinkiewicz multiplier theorem and the boundedness of the spherical maximal function (the latter being an application of frequency decompositions in general, and not so much of square functions; one does appear, but only for {L^2} estimates, where one does not need the sophistication of Littlewood-Paley theory).
Part I: frequency projections
Part II: square functions

7. Applications of Littlewood-Paley theory

In this section we will present two applications of the Littlewood-Paley theory developed so far. You can find further applications in the exercises (see particularly Exercise 22 and Exercise 23).

7.1. Marcinkiewicz multipliers

Given an {L^\infty (\mathbb{R}^d)} function {m}, one can define the operator {T_m} given by

\displaystyle  \widehat{T_m f}(\xi) := m(\xi) \widehat{f}(\xi)

for all {f \in L^2(\mathbb{R}^d)}. The operator {T_m} is called a multiplier and the function {m} is called the symbol of the multiplier. Since {m \in L^\infty}, Plancherel’s theorem shows that {T_m} is a linear operator bounded in {L^2}; its definition can then be extended to {L^2 \cap L^p} functions (which are dense in {L^p}). A natural question to ask is: for which values of {p} in {1 \leq p \leq \infty} is the operator {T_m} an {L^p \rightarrow L^p} bounded operator? When {T_m} is bounded in a certain {L^p} space, we say that it is an {L^p} multiplier.

The operator {T_m} introduced in Section 1 of the first post in this series is an example of a multiplier, with symbol {m(\xi,\tau) = \tau / (\tau - 2\pi i |\xi|^2)}. It is the linear operator that satisfies the formal identity {T_m \circ (\partial_t - \Delta) = \partial_t}. We have seen that it cannot be a (euclidean) Calderón-Zygmund operator, and thus in particular it cannot be a Hörmander-Mikhlin multiplier. This can be seen more directly from the fact that any Hörmander-Mikhlin condition of the form {|\partial^{\alpha}m(\xi,\tau)| \lesssim_\alpha |(\xi,\tau)|^{-|\alpha|} = (|\xi|^2 + \tau^2)^{-|\alpha|/2}} is clearly incompatible with the rescaling invariance of the symbol {m}, which satisfies {m(\lambda \xi, \lambda^2 \tau) = m(\xi,\tau)} for any {\lambda \neq 0}. However, the derivatives of {m} actually satisfy some other superficially similar conditions that are of interest to us. Indeed, letting {(\xi,\tau) \in \mathbb{R}^2} for simplicity, we can see for example that {\partial_\xi \partial_\tau m(\xi, \tau) = \lambda^3 \partial_\xi \partial_\tau m(\lambda\xi, \lambda^2\tau)}. When {|\tau|\lesssim |\xi|^2} we can therefore argue (taking {\lambda = |\xi|^{-1}} and using that {m} depends on {\xi} only through {|\xi|}) that {|\partial_\xi \partial_\tau m(\xi, \tau)| = |\xi|^{-3} |\partial_\xi \partial_\tau m(1, \tau |\xi|^{-2})| \lesssim |\xi|^{-1} |\tau|^{-1} \sup_{|\eta|\lesssim 1} |\partial_\xi \partial_\tau m(1, \eta)|}, and similarly when {|\tau|\gtrsim |\xi|^2}; this shows that for any {(\xi, \tau)} with {\xi,\tau \neq 0} one has

\displaystyle  |\partial_\xi \partial_\tau m(\xi, \tau)| \lesssim |\xi|^{-1} |\tau|^{-1}.

This condition is comparable with the corresponding Hörmander-Mikhlin condition only when {|\xi| \sim |\tau|}, and is vastly different otherwise, being of product type (also notice that the inequality above is compatible with the rescaling invariance of {m}, as it should be).
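For this particular symbol the product-type bound can be sanity-checked numerically. The following is only a minimal sketch under stated assumptions: the closed form for the mixed derivative was obtained by differentiating {m(\xi,\tau) = \tau/(\tau - 2\pi i \xi^2)} by hand for this example (it is cross-checked against a finite difference), and the function names and the sample grid are arbitrary illustrative choices.

```python
import math

A = 2j * math.pi  # so that m(xi, tau) = tau / (tau - A * xi**2)


def m(xi, tau):
    """The symbol m(xi, tau) = tau / (tau - 2 pi i xi^2) on R^2."""
    return tau / (tau - A * xi**2)


def mixed_derivative(xi, tau):
    """Hand-computed closed form of d/dxi d/dtau m (an assumption of this sketch)."""
    return -2 * A * xi * (tau + A * xi**2) / (tau - A * xi**2) ** 3


def mixed_derivative_fd(xi, tau, h=1e-4):
    """Central finite-difference approximation of the same mixed derivative."""
    return (m(xi + h, tau + h) - m(xi + h, tau - h)
            - m(xi - h, tau + h) + m(xi - h, tau - h)) / (4 * h * h)


def product_quantity(xi, tau):
    """|xi| |tau| |d_xi d_tau m|: bounded if the product-type condition holds."""
    return abs(xi) * abs(tau) * abs(mixed_derivative(xi, tau))


def isotropic_quantity(xi, tau):
    """|(xi, tau)|^2 |d_xi d_tau m|: bounded if a Hormander-Mikhlin condition held."""
    return (xi**2 + tau**2) * abs(mixed_derivative(xi, tau))


# Sample points spread over many scales, in all four quadrants.
grid = [s * 10.0**e for e in range(-6, 7) for s in (-1.0, 1.0)]
worst_product = max(product_quantity(x, t) for x in grid for t in grid)
worst_isotropic = max(isotropic_quantity(x, t) for x in grid for t in grid)
```

With the closed form above one gets {|\xi||\tau||\partial_\xi \partial_\tau m| = 4\pi \xi^2 |\tau| / (\tau^2 + 4\pi^2 \xi^4) \leq 1}, so `worst_product` should never exceed 1, while `worst_isotropic` is expected to be large on this grid, in line with the failure of the Hörmander-Mikhlin condition.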

Basic Littlewood-Paley theory II: square functions

This is the second part of the series on basic Littlewood-Paley theory, which has been extracted from some lecture notes I wrote for a masterclass. In this part we will prove the Littlewood-Paley inequalities, namely that for any {1 < p < \infty} it holds that

\displaystyle \|f\|_{L^p (\mathbb{R})} \sim_p \Big\|\Big(\sum_{j \in \mathbb{Z}} |\Delta_j f|^2 \Big)^{1/2}\Big\|_{L^p (\mathbb{R})}. \ \ \ \ \ (\dagger)

This time there are also plenty more exercises, some of which I think are fairly interesting (one of them is a theorem of Rudin in disguise).
Part I: frequency projections.

4. Smooth square function

In this subsection we will consider a variant of the square function appearing at the right-hand side of (\dagger ) where we replace the frequency projections {\Delta_j} by better behaved ones.

Let {\psi} denote a smooth function with the properties that {\psi} is compactly supported in the intervals {[-4,-1/2] \cup [1/2, 4]} and is identically equal to {1} on the intervals {[-2,-1] \cup [1,2]}. We define the smooth frequency projections {\widetilde{\Delta}_j} by stipulating

\displaystyle  \widehat{\widetilde{\Delta}_j f}(\xi) := \psi(2^{-j} \xi) \widehat{f}(\xi);

notice that the function {\psi(2^{-j} \xi)} is supported in {[-2^{j+2},-2^{j-1}] \cup [2^{j-1}, 2^{j+2}]} and identically {1} in {[-2^{j+1},-2^{j}] \cup [2^{j}, 2^{j+1}]}. The reason why such projections are better behaved resides in the fact that the functions {\psi(2^{-j}\xi)} are now smooth, unlike the characteristic functions {\mathbf{1}_{[2^j,2^{j+1}]}}. Indeed, they are actually Schwartz functions and you can see by the Fourier inversion formula that {\widetilde{\Delta}_j f = f \ast (2^{j} \widehat{\psi}(2^{j}\cdot))}; the convolution kernels {2^{j} \widehat{\psi}(2^{j}\cdot)} are in {L^1} uniformly in {j}, and therefore the operator is trivially {L^p \rightarrow L^p} bounded for any {1 \leq p \leq \infty} by Young’s inequality, without having to resort to the boundedness of the Hilbert transform.
We will show that the following smooth analogue of (one half of) (\dagger ) is true (you can study the other half in Exercise 6).

Proposition 3 Let {\widetilde{S}} denote the square function

\displaystyle  \widetilde{S}f := \Big(\sum_{j \in \mathbb{Z}} \big|\widetilde{\Delta}_j f \big|^2\Big)^{1/2}.

Then for any {1 < p < \infty} we have that the inequality

\displaystyle  \big\|\widetilde{S}f\big\|_{L^p(\mathbb{R})} \lesssim_p \|f\|_{L^p(\mathbb{R})} \ \ \ \ \ (1)

holds for any {f \in L^p(\mathbb{R})}.

We will give two proofs of this fact, to illustrate different techniques. We remark that the boundedness will depend on the smoothness and the support properties of {\psi} only, and as such extends to a larger class of square functions.
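As a quick sanity check of the case {p=2}, one can look at the multiplier side: by Plancherel, {\|\widetilde{S}f\|_{L^2}^2 = \int \sum_j |\psi(2^{-j}\xi)|^2 |\widehat{f}(\xi)|^2 d\xi}, so it is enough that {\sum_j |\psi(2^{-j}\xi)|^2} is bounded. The sketch below builds one concrete {\psi} with the stated support properties and evaluates this sum. The particular bump construction, the extra assumption {0 \leq \psi \leq 1}, and all function names are illustrative choices, not taken from the notes.

```python
import math


def smooth_step(t):
    """A smooth nondecreasing function equal to 0 for t <= 0 and 1 for t >= 1."""
    def g(s):
        return math.exp(-1.0 / s) if s > 0 else 0.0
    return g(t) / (g(t) + g(1.0 - t))


def psi(xi):
    """Even smooth bump: supported in 1/2 <= |xi| <= 4, equal to 1 on 1 <= |xi| <= 2."""
    r = abs(xi)
    rise = smooth_step(2.0 * r - 1.0)      # goes from 0 to 1 on [1/2, 1]
    fall = smooth_step((4.0 - r) / 2.0)    # goes from 1 to 0 on [2, 4]
    return rise * fall


def symbol_sum(xi, j_range=range(-60, 61)):
    """Truncated version of sum over j of |psi(2^{-j} xi)|^2."""
    return sum(psi(xi / 2.0**j) ** 2 for j in j_range)


# Frequencies spread over ten orders of magnitude, both signs.
frequencies = [s * 10.0 ** (e / 4.0) for e in range(-20, 21) for s in (-1.0, 1.0)]
sums = [symbol_sum(xi) for xi in frequencies]
```

Each nonzero {\xi} lies in the support of at most three of the dilates {\psi(2^{-j}\cdot)} and in the region where exactly one of them equals 1, so for this {\psi} the values in `sums` should all lie between 1 and 3, which gives the {L^2} case of (1) with constant {\sqrt{3}}.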

Basic Littlewood-Paley theory I: frequency projections

I have written some notes on Littlewood-Paley theory for a masterclass, which I thought I would share here as well. This is the first part, covering some motivation, the case of a single frequency projection and its vector-valued generalisation. References I have used in preparing these notes include Stein’s “Singular integrals and differentiability properties of functions”, Duoandikoetxea’s “Fourier Analysis”, Grafakos’ “Classical Fourier Analysis” and as usual some material by Tao, both from his blog and the notes for his courses. Prerequisites are some basic Fourier transform theory, Calderón-Zygmund theory of euclidean singular integrals and its vector-valued generalisation (to Hilbert spaces; we won’t need Banach spaces).

0. Introduction
Harmonic analysis makes a fundamental use of divide-et-impera approaches. A particularly fruitful one is the decomposition of a function in terms of the frequencies that compose it, which is prominently incarnated in the theory of the Fourier transform and Fourier series. In many applications however it is not necessary or even useful to resolve the function {f} at the level of single frequencies, and it suffices instead to consider how wildly different frequency components behave. One example of this is the (formal) decomposition of functions on {\mathbb{R}} given by

\displaystyle f = \sum_{j \in \mathbb{Z}} \Delta_j f,

where {\Delta_j} denotes the operator

\displaystyle \Delta_j f (x) := \int_{\{\xi \in \mathbb{R} : 2^j \leq |\xi| < 2^{j+1}\}} \widehat{f}(\xi) e^{2\pi i \xi \cdot x} d\xi,

commonly referred to as a (dyadic) frequency projection. Thus {\Delta_j f} represents the portion of {f} with frequencies of magnitude {\sim 2^j}. The Fourier inversion formula can be used to justify the above decomposition if, for example, {f \in L^2(\mathbb{R})}. Heuristically, since any two {\Delta_j f, \Delta_{k} f} oscillate at significantly different frequencies when {|j-k|} is large, we would expect that for most values of {x} the different contributions to the sum cancel out more or less randomly; a probabilistic argument typical of random walks (see Exercise 1) leads to the conjecture that {|f|} should behave “most of the time” like {\Big(\sum_{j \in \mathbb{Z}} |\Delta_j f|^2 \Big)^{1/2}} (the last expression is an example of a square function). While this is not true in a pointwise sense, we will see in these notes that the two are indeed interchangeable from the point of view of {L^p}-norms: more precisely, we will show that for any {1 < p < \infty} it holds that

\displaystyle  \boxed{ \|f\|_{L^p (\mathbb{R})} \sim_p \Big\|\Big(\sum_{j \in \mathbb{Z}} |\Delta_j f|^2 \Big)^{1/2}\Big\|_{L^p (\mathbb{R})}. }\ \ \ \ \ (\dagger)

This is a result historically due to Littlewood and Paley, which explains the name given to the related theory. It is easy to see that the {p=2} case is obvious thanks to Plancherel’s theorem, to which the statement is essentially equivalent. Therefore one could interpret the above as a substitute for Plancherel’s theorem in generic {L^p} spaces when {p\neq 2}.
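The {p=2} case can be illustrated in a discrete, periodic model, where Plancherel becomes Parseval's identity for the discrete Fourier transform. The sketch below is only an analogue of the statement on {\mathbb{R}}, under assumptions that are not in the notes: the signal is a finite sequence with zero mean (so that there is no zero-frequency component), frequencies are the signed integer DFT frequencies, and the helper names are hypothetical.

```python
import cmath
import math
import random


def dft(values, sign=-1):
    """Unnormalised discrete Fourier transform (sign=+1 gives the inverse, up to 1/n)."""
    n = len(values)
    return [sum(v * cmath.exp(sign * 2j * math.pi * k * t / n)
                for t, v in enumerate(values))
            for k in range(n)]


def dyadic_pieces(f):
    """Discrete analogue of the projections Delta_j f onto 2^j <= |frequency| < 2^(j+1)."""
    n = len(f)
    f_hat = dft(f)
    # Signed integer frequency attached to each DFT index.
    freq = [k if k <= n // 2 else k - n for k in range(n)]
    pieces = []
    j = 0
    while 2**j <= n // 2:
        block = [c if 2**j <= abs(xi) < 2 ** (j + 1) else 0.0
                 for c, xi in zip(f_hat, freq)]
        pieces.append([z / n for z in dft(block, sign=+1)])
        j += 1
    return pieces


random.seed(0)
n = 64
samples = [random.gauss(0.0, 1.0) for _ in range(n)]
mean = sum(samples) / n
f = [s - mean for s in samples]          # remove the zero frequency

pieces = dyadic_pieces(f)
norm_sq = sum(abs(v) ** 2 for v in f)                                 # ||f||_2^2
square_fn_sq = sum(abs(z) ** 2 for piece in pieces for z in piece)    # ||Sf||_2^2
reconstruction_error = max(abs(sum(piece[t] for piece in pieces) - f[t])
                           for t in range(n))
```

Since the pieces have disjoint frequency supports they are orthogonal, so `norm_sq` and `square_fn_sq` should agree up to rounding, and the pieces should sum back to `f`.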

In developing a framework that allows us to prove (\dagger ) we will encounter some variants of the square function above, including ones with smoother frequency projections that are useful in a variety of contexts. We will moreover show some applications of the above fact and its variants. One of these applications will be a proof of the boundedness of the spherical maximal function {\mathscr{M}_{\mathbb{S}^{d-1}}} (almost verbatim the one on Tao’s blog).

Notation: We will use {A \lesssim B} to denote the estimate {A \leq C B} where {C>0} is some absolute constant, and {A\sim B} to denote the fact that {A \lesssim B \lesssim A}. If the constant {C} depends on a list of parameters {L} we will write {A \lesssim_L B}.


Some thoughts on the smoothing effect of convolution on measures

Pdf version here: link.

A question by Ben Krause, whom I met here at the Hausdorff Institute, made me think back to one of the earliest posts of this blog. The question is essentially how to make sense of the fact that the (perhaps iterated) convolution of a (singular) measure with itself is in general smoother than the measure you started with, in a variety of settings. It’s interesting to me because in this phase of my PhD experience I’m constantly trying to build up a good intuition and learn how to use heuristics effectively.

So, let’s take a measure {\mu} on {\mathbb{R}^d} with compact support (assume inside the unit ball wlog). We ask what we can say about {\mu \ast \mu}, or higher iterates {\mu \ast \mu \ast \mu \ast \ldots} and more often about {\mu \ast \tilde{\mu}}. In particular, we’re interested in the case where {\mu} is singular, i.e. its support has zero Lebesgue measure.

Before starting though, I would like to give a little motivation as to why such convolutions are interesting. Consider the model case where you have an operator defined by {Tf = \sum_{j\in\mathbb{Z}}{T_j f}:= \sum_{j \in \mathbb{Z}}f\ast \mu_j} where {\mu_j} are some singular measures and {f \in L^2}. One asks whether the operator {T} is bounded on {L^2}, and the natural tool to use is the Cotlar-Stein lemma, or almost-orthogonality (from which this blog takes its name). Then we need to verify that

\displaystyle \sup_{j}{\sum_{k}{\|T_j T_k^\ast\|^{1/2}}} < \infty,

and the same for {T^\ast_j T_k}. But what is {T_j T^\ast_k f}? It’s simply

\displaystyle T_j T^\ast_k f = f \ast d\tilde{\mu}_k \ast d\mu_j = f \ast (d\tilde{\mu}_k \ast d\mu_j),

i.e. another convolution operator. Estimates on the convolution {d\tilde{\mu}_k \ast d\mu_j} are then likely to help estimate the norm of {T_j T_k^\ast}. But if this measure is not smooth enough, one can go forward, and since

\displaystyle \|T T^\ast \| \leq \|T\|^{1/2} \|T^\ast T T^\ast\|^{1/2}

one sees that estimates on {d\tilde{\mu}_k \ast d\mu_j \ast d\tilde{\mu}_k } are likely to help, and so on, until a sufficient number of iterations gives a sufficiently smooth measure. This isn’t quite the iteration of a measure with itself, but in many cases one has an operator {Tf = f \ast \mu} which then splits into the above sum by a spatial or frequency cutoff at dyadic scales. Then it becomes a matter of rescaling and the case {d\tilde{\mu}_k \ast d\mu_j} can be reduced to that of {d\tilde{\mu}_0 \ast d\mu_j} and further reduced to that of {d\tilde{\mu}_0 \ast d\mu_0} by exploiting an iterate of the above norm inequality, namely that

\displaystyle \|T_j T_0^\ast\|\leq \|T_j^\ast\|^{1-2^{-\ell}}\|T_j (T_0^\ast T_0)^{2^{\ell-1}}\|^{2^{-\ell}}.

Another possibility is to write {d\nu = d\tilde{\mu}_k \ast d\mu_j} and consider working with {d\nu \ast d\nu} instead, to obtain results in terms of {|j-k|}. I will say more at the end; here I just wanted to show that they arise as natural objects.
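The norm inequality used above comes from the identity {\|A\|^2 = \|A A^\ast\|}: applied to {A = T_j T_k^\ast} it gives {\|T_j T_k^\ast\| \leq \|T_j\|^{1/2} \|T_k^\ast T_k T_j^\ast\|^{1/2}}, which reduces to the displayed inequality for {\|T T^\ast\|} when {T_j = T_k = T}. Here is a small numerical sanity check of this two-operator form in a toy model where the operators are random real {2 \times 2} matrices; the toy model and the helper names are assumptions made purely for illustration (the operators in the post are convolutions with measures, not matrices).

```python
import math
import random


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    """Transpose, i.e. the adjoint of a real matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def op_norm(a):
    """Operator norm of a real 2x2 matrix, from its two singular values."""
    frob_sq = sum(x * x for row in a for x in row)       # sigma_1^2 + sigma_2^2
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]          # +/- sigma_1 * sigma_2
    disc = max(frob_sq * frob_sq - 4.0 * det * det, 0.0)
    return math.sqrt((frob_sq + math.sqrt(disc)) / 2.0)


def both_sides(t_j, t_k):
    """Return (||T_j T_k^*||, ||T_j||^(1/2) ||T_k^* T_k T_j^*||^(1/2))."""
    lhs = op_norm(matmul(t_j, transpose(t_k)))
    inner = matmul(matmul(transpose(t_k), t_k), transpose(t_j))
    rhs = math.sqrt(op_norm(t_j)) * math.sqrt(op_norm(inner))
    return lhs, rhs


def random_matrix():
    return [[random.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]


random.seed(0)
results = [both_sides(random_matrix(), random_matrix()) for _ in range(200)]
```

Every pair in `results` should have its first entry no larger than its second (up to rounding). For a single operator the two sides coincide, since {\|T T^\ast\| = \|T\|^2} and {\|T^\ast T T^\ast\| = \|T\|^3}; the inequality only carries information for two different operators.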


Ptolemaics meetings 4 & 5 & 6 ; pt I

These last ones have been quite interesting meetings, I’m happy about how the whole thing is turning out. Sadly I’m very slow at typing and working out the ideas, so I have to include three different meetings in one. Since the notes are getting incredibly long, I’ll have to split them into at least two parts. I include the pdf version, in case it makes it any easier to read.

ptolemaics meeting 4 & 5 & 6 pt I

Let me get finally into the time frequency of the Walsh phase plane. I won’t include many proofs as they are already well written in Hytönen’s notes (see previous post). My main interest here is the heuristic interpretation of them (disclaimer: you might think I’m bullshitting you at a certain point, but I’m probably not). Ideally, it would be very good to be able to track back the train of thoughts that went in Fefferman’s and Thiele-Lacey’s proofs.

Sorry if the pictures are shit, I haven’t learned how to draw them properly using latex yet.

1. Brush up

Recall we have Walsh series for functions {f \in L^2(0,1)} defined by

\displaystyle W_N f(x) = \sum_{n=0}^{N}{\left\langle f,w_n\right\rangle w_n(x)},

the (Walsh-)Carleson operator here is thus

\displaystyle \mathcal{C}f(x) = \sup_{N\in \mathbb{N}}{|W_N f(x)|},

and in order to prove {W_N f(x) \rightarrow f(x)} a.e. for {N\rightarrow +\infty} one can prove that

\displaystyle \|\mathcal{C}f\|_{L^{2,\infty}(0,1)} \lesssim \|f\|_{L^2(0,1)}.

There’s a general remark that should be made at this point: the last inequality is equivalent to

\displaystyle \left|\left\langle\mathcal{C}f, \chi_E\right\rangle\right| = \left|\int_{E}{\mathcal{C}f}\,dx\right| \lesssim |E|^{1/2}\|f\|_{L^2(0,1)}

holding for every measurable set {E} (of finite measure).
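To make the objects in this brush-up concrete, here is a small sketch that computes the Walsh partial sums {W_N f} and the maximal quantity {\sup_N |W_N f|} for a step function on {[0,1)}. It assumes the Paley ordering of the Walsh functions, {w_n = \prod_k r_k^{n_k}} with {r_k} the Rademacher functions and {n = \sum_k n_k 2^k} (the ordering is not spelled out in this excerpt), and it only treats functions that are constant on dyadic intervals of length {2^{-m}}; the function names are hypothetical.

```python
def walsh(n, i, m):
    """Value of w_n on the dyadic interval [i / 2**m, (i + 1) / 2**m), for n < 2**m."""
    sign = 1
    for k in range(m):
        n_k = (n >> k) & 1                 # k-th binary digit of n
        x_digit = (i >> (m - 1 - k)) & 1   # (k+1)-th binary digit of x
        if n_k and x_digit:
            sign = -sign                   # the Rademacher factor r_k(x) equals -1
    return sign


def walsh_coefficient(f, n, m):
    """Inner product <f, w_n> on (0, 1) for f constant on dyadic intervals."""
    size = 2**m
    return sum(f[i] * walsh(n, i, m) for i in range(size)) / size


def walsh_partial_sum(f, N, m):
    """W_N f = sum over n = 0..N of <f, w_n> w_n, as a list of interval values."""
    size = 2**m
    result = [0.0] * size
    for n in range(N + 1):
        coeff = walsh_coefficient(f, n, m)
        for i in range(size):
            result[i] += coeff * walsh(n, i, m)
    return result


def walsh_carleson(f, m):
    """sup over N of |W_N f|; for such step functions W_N f = f once N >= 2**m - 1."""
    size = 2**m
    partial = [0.0] * size
    best = [0.0] * size
    for n in range(size):
        coeff = walsh_coefficient(f, n, m)
        for i in range(size):
            partial[i] += coeff * walsh(n, i, m)
            best[i] = max(best[i], abs(partial[i]))
    return best


m = 3
f = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]   # values on the 8 dyadic intervals
full_sum = walsh_partial_sum(f, 2**m - 1, m)
maximal = walsh_carleson(f, m)
```

In this finite model the first {2^m} Walsh functions form an orthonormal basis of the step functions, so `full_sum` should reproduce `f` and `maximal` should dominate {|f|} pointwise.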

Switching from an approximation of the identity to another

Sometimes one needs to switch from one approximation of the identity to another that has some additional property which is helpful for the problem at hand. In Stein’s Harmonic Analysis book there’s this nice little result which allows one to make this switch.

Notice that by {f_\delta(x)} we denote the dilated function {\delta^{-n}f\left(\frac{x}{\delta}\right)}.

Proposition 1 Let {\Phi \in \mathcal{S}(\mathbb{R}^n)} be such that {\int{\Phi}\,dx=1}, and let {\Psi} be another Schwartz function. Then there exists a sequence {\{\eta_k\}_{k\in\mathbb{N}} \subset \mathcal{S}(\mathbb{R}^n)} such that we have the following nice decomposition:

  • {\Psi=\sum_{k\geq 0}{\eta_k \ast \Phi_{2^{-k}}}}
  • the size of the {\eta_k} decreases rapidly in {k}; in particular

    \displaystyle \|\eta_k\|_{\alpha,\beta}\in O\left(2^{-k N}\right)

for every {N>0}, where {\|\cdot\|_{\alpha,\beta}} are the seminorms of {\mathcal{S}(\mathbb{R}^n)} defining its topology.
