# Basic Littlewood-Paley theory II: square functions

This is the second part of the series on basic Littlewood-Paley theory, which has been extracted from some lecture notes I wrote for a masterclass. In this part we will prove the Littlewood-Paley inequalities, namely that for any ${1 < p < \infty}$ it holds that $\displaystyle \|f\|_{L^p (\mathbb{R})} \sim_p \Big\|\Big(\sum_{j \in \mathbb{Z}} |\Delta_j f|^2 \Big)^{1/2}\Big\|_{L^p (\mathbb{R})}. \ \ \ \ \ (\dagger)$

This time there are also plenty more exercises, some of which I think are fairly interesting (one of them is a theorem of Rudin in disguise).
Part I: frequency projections.

4. Smooth square function

In this subsection we will consider a variant of the square function appearing on the right-hand side of ($\dagger$), in which we replace the frequency projections ${\Delta_j}$ by better-behaved ones.

Let ${\psi}$ denote a smooth function with the properties that ${\psi}$ is compactly supported in the intervals ${[-4,-1/2] \cup [1/2, 4]}$ and is identically equal to ${1}$ on the intervals ${[-2,-1] \cup [1,2]}$. We define the smooth frequency projections ${\widetilde{\Delta}_j}$ by stipulating $\displaystyle \widehat{\widetilde{\Delta}_j f}(\xi) := \psi(2^{-j} \xi) \widehat{f}(\xi);$

notice that the function ${\psi(2^{-j} \xi)}$ is supported in ${[-2^{j+2},-2^{j-1}] \cup [2^{j-1}, 2^{j+2}]}$ and identically ${1}$ in ${[-2^{j+1},-2^{j}] \cup [2^{j}, 2^{j+1}]}$. The reason why such projections are better behaved is that the functions ${\psi(2^{-j}\xi)}$ are now smooth, unlike the characteristic functions ${\mathbf{1}_{[2^j,2^{j+1}]}}$. Indeed, they are actually Schwartz functions, and you can see by the Fourier inversion formula that ${\widetilde{\Delta}_j f = f \ast (2^{j} \widehat{\psi}(2^{j}\cdot))}$; the convolution kernel ${2^{j} \widehat{\psi}(2^{j}\cdot)}$ has ${L^1}$ norm bounded uniformly in ${j}$, and therefore the operator is trivially ${L^p \rightarrow L^p}$ bounded for any ${1 \leq p \leq \infty}$ by Young’s inequality, without having to resort to the boundedness of the Hilbert transform.
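
To spell out the last claim (a short sketch, using only the change of variables ${y = 2^j x}$): for every ${j \in \mathbb{Z}}$ we have $\displaystyle \big\|2^{j} \widehat{\psi}(2^{j}\cdot)\big\|_{L^1(\mathbb{R})} = \int_{\mathbb{R}} 2^{j} \big|\widehat{\psi}(2^{j} x)\big| \,dx = \int_{\mathbb{R}} \big|\widehat{\psi}(y)\big| \,dy = \|\widehat{\psi}\|_{L^1(\mathbb{R})},$

which is finite because ${\widehat{\psi}}$ is a Schwartz function, and does not depend on ${j}$. Young’s inequality then gives $\displaystyle \big\|\widetilde{\Delta}_j f\big\|_{L^p(\mathbb{R})} \leq \|\widehat{\psi}\|_{L^1(\mathbb{R})} \|f\|_{L^p(\mathbb{R})}$

for every ${1 \leq p \leq \infty}$, with a constant independent of ${j}$.
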
We will show that the following smooth analogue of (one half of) ($\dagger$) is true (you can study the other half in Exercise 6).

Proposition 3 Let ${\widetilde{S}}$ denote the square function $\displaystyle \widetilde{S}f := \Big(\sum_{j \in \mathbb{Z}} \big|\widetilde{\Delta}_j f \big|^2\Big)^{1/2}.$

Then for any ${1 < p < \infty}$ we have that the inequality $\displaystyle \big\|\widetilde{S}f\big\|_{L^p(\mathbb{R})} \lesssim_p \|f\|_{L^p(\mathbb{R})} \ \ \ \ \ (1)$

holds for any ${f \in L^p(\mathbb{R})}$.

We will give two proofs of this fact, to illustrate different techniques. We remark that the boundedness will depend on the smoothness and the support properties of ${\psi}$ only, and as such extends to a larger class of square functions.

# Basic Littlewood-Paley theory I: frequency projections

I have written some notes on Littlewood-Paley theory for a masterclass, which I thought I would share here as well. This is the first part, covering some motivation, the case of a single frequency projection and its vector-valued generalisation. References I have used in preparing these notes include Stein’s “Singular integrals and differentiability properties of functions”, Duoandikoetxea’s “Fourier Analysis”, Grafakos’ “Classical Fourier Analysis” and as usual some material by Tao, both from his blog and the notes for his courses. Prerequisites are some basic Fourier transform theory, Calderón-Zygmund theory of Euclidean singular integrals and its vector-valued generalisation (to Hilbert spaces; we won’t need Banach spaces).

0. Introduction
Harmonic analysis makes fundamental use of divide-et-impera approaches. A particularly fruitful one is the decomposition of a function in terms of the frequencies that compose it, which is prominently incarnated in the theory of the Fourier transform and Fourier series. In many applications, however, it is neither necessary nor even useful to resolve the function ${f}$ at the level of single frequencies; it suffices instead to consider how frequency components at wildly different scales behave. One example of this is the (formal) decomposition of functions on ${\mathbb{R}}$ given by $\displaystyle f = \sum_{j \in \mathbb{Z}} \Delta_j f,$

where ${\Delta_j f}$ denotes the operator $\displaystyle \Delta_j f (x) := \int_{\{\xi \in \mathbb{R} : 2^j \leq |\xi| < 2^{j+1}\}} \widehat{f}(\xi) e^{2\pi i \xi \cdot x} d\xi,$

commonly referred to as a (dyadic) frequency projection. Thus ${\Delta_j f}$ represents the portion of ${f}$ with frequencies of magnitude ${\sim 2^j}$. The Fourier inversion formula can be used to justify the above decomposition if, for example, ${f \in L^2(\mathbb{R})}$. Heuristically, since any two ${\Delta_j f, \Delta_{k} f}$ oscillate at significantly different frequencies when ${|j-k|}$ is large, we would expect that for most ${x}$ the different contributions to the sum cancel out more or less randomly; a probabilistic argument typical of random walks (see Exercise 1) leads to the conjecture that ${|f|}$ should behave “most of the time” like ${\Big(\sum_{j \in \mathbb{Z}} |\Delta_j f|^2 \Big)^{1/2}}$ (the last expression is an example of a square function). While this is not true in a pointwise sense, we will see in these notes that the two are indeed interchangeable from the point of view of ${L^p}$-norms: more precisely, we will show that for any ${1 < p < \infty}$ it holds that $\displaystyle \boxed{ \|f\|_{L^p (\mathbb{R})} \sim_p \Big\|\Big(\sum_{j \in \mathbb{Z}} |\Delta_j f|^2 \Big)^{1/2}\Big\|_{L^p (\mathbb{R})}. }\ \ \ \ \ (\dagger)$
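
For readers who like to experiment, here is a minimal numerical sketch of a discrete analogue of the decomposition above. It is only an illustration under stated assumptions: the signal is a finite periodic sequence rather than a function on ${\mathbb{R}}$, the frequencies are the integer FFT frequencies (so only ${j \geq 0}$ occurs and the zero frequency, i.e. the mean, is left out), and the names `dyadic_projections` and `square_function` are hypothetical helpers, not part of any library.

```python
import numpy as np

def dyadic_projections(f):
    """Discrete dyadic frequency projections of a periodic signal f.

    Piece j keeps the Fourier coefficients with 2**j <= |k| < 2**(j+1);
    the zero frequency (the mean of f) belongs to no piece.
    """
    n = len(f)
    f_hat = np.fft.fft(f)
    # Integer frequency magnitudes |k| attached to each FFT coefficient.
    freqs = np.abs(np.fft.fftfreq(n, d=1.0 / n))
    pieces = []
    j = 0
    while 2 ** j <= freqs.max():
        mask = (freqs >= 2 ** j) & (freqs < 2 ** (j + 1))
        pieces.append(np.fft.ifft(f_hat * mask))
        j += 1
    return pieces

def square_function(pieces):
    """Pointwise square function (sum_j |Delta_j f|^2)^(1/2)."""
    return np.sqrt(sum(np.abs(p) ** 2 for p in pieces))

rng = np.random.default_rng(0)
f = rng.standard_normal(256)
pieces = dyadic_projections(f)
S = square_function(pieces)

f0 = f - f.mean()  # the projections never see the zero frequency
recon_error = np.max(np.abs(sum(pieces) - f0))
norm_f = np.linalg.norm(f0)
norm_S = np.linalg.norm(S)
```

In this toy setting the pieces add up to the mean-free part of the signal, and the ${\ell^2}$ norms of `f0` and `S` agree up to rounding, since the frequency blocks are disjoint. Pointwise, however, `np.abs(f0)` and `S` are generally quite different.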

This is a result historically due to Littlewood and Paley, which explains the name given to the related theory. The ${p=2}$ case follows immediately from Plancherel’s theorem, to which the statement is essentially equivalent. Therefore one could interpret the above as a substitute for Plancherel’s theorem in generic ${L^p}$ spaces when ${p\neq 2}$.
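
For the record, here is a sketch of the computation behind the ${p=2}$ case, assuming ${f \in L^2(\mathbb{R})}$. Since the sets ${\{\xi : 2^j \leq |\xi| < 2^{j+1}\}}$ are pairwise disjoint and cover ${\mathbb{R} \setminus \{0\}}$, Plancherel’s theorem gives $\displaystyle \Big\|\Big(\sum_{j \in \mathbb{Z}} |\Delta_j f|^2 \Big)^{1/2}\Big\|_{L^2(\mathbb{R})}^2 = \sum_{j \in \mathbb{Z}} \|\Delta_j f\|_{L^2(\mathbb{R})}^2 = \sum_{j \in \mathbb{Z}} \int_{2^j \leq |\xi| < 2^{j+1}} |\widehat{f}(\xi)|^2 \,d\xi = \|\widehat{f}\|_{L^2(\mathbb{R})}^2 = \|f\|_{L^2(\mathbb{R})}^2,$

so that for ${p=2}$ the two sides of ($\dagger$) are actually equal.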

In developing a framework that allows us to prove ($\dagger$) we will encounter some variants of the square function above, including ones with smoother frequency projections that are useful in a variety of contexts. We will moreover show some applications of the above fact and its variants. One of these applications will be a proof of the boundedness of the spherical maximal function ${\mathscr{M}_{\mathbb{S}^{d-1}}}$ (almost verbatim the one on Tao’s blog).

Notation: We will use ${A \lesssim B}$ to denote the estimate ${A \leq C B}$ where ${C>0}$ is some absolute constant, and ${A\sim B}$ to denote the fact that ${A \lesssim B \lesssim A}$. If the constant ${C}$ depends on a list of parameters ${L}$ we will write ${A \lesssim_L B}$.

# A cute combinatorial result of Santaló

There is a nice result due to Santaló that says that if a (finite) collection of axis-parallel rectangles is such that any small subcollection is aligned, then the whole collection is aligned. This is kind of surprising at first, because the condition only says that each subcollection admits a line, and this line might be different for each choice of subcollection. The precise statement is as follows:

Theorem. Let $\mathcal{R}$ be a collection of rectangles with sides parallel to the axes (possibly intersecting). If for every choice of 6 rectangles of $\mathcal{R}$ there exists a line intersecting all $6$ of them, then there exists a line intersecting all rectangles of $\mathcal{R}$ at once.

To be precise, I should clarify that by line intersection it is meant intersection with the interior of the rectangle – so a line touching only the boundary is not allowed. The number 6 doesn’t have any special esoteric meaning here, to the best of my understanding – it just makes the argument work.