# Ptolemaics meeting #1

Together with some other PG students in the Harmonic Analysis working group, we’ve decided (it was Kevin’s idea originally) to set up a weekly meeting to learn about topics in harmonic analysis we don’t get to see otherwise (it works quite well as an excuse to drink beer, too). The topic we settled on arose pretty much by itself: it turned out that basically everybody was already interested in time-frequency analysis on their own, either through Carleson’s theorem or some other related stuff. So we decided to learn about time-frequency analysis.

Last Tuesday we had our first meeting: it was mainly aimed at discussing the arrangements to be made and what to read before the next meeting, but we sketched some motivational introduction (it was quite improvised, I’m afraid); see below. Also, it was Odysseas who came up with the name. I think it’s quite brilliant: Ptolemy was the first to introduce the systematic use of epicycles in astronomy, and – as the science historian Giovanni Schiaparelli noticed – epicycles were nothing but the first historical appearance of Fourier series. That’s why they offered such accurate predictions even though the theory was wrong: by adding a suitable number of terms you can describe orbits to any degree of precision. Thus, from Carleson’s result you can go all the way back to Ptolemy: therefore Ptolemaics. Odysseas further added that Ptolemy’s first name was Claudius, like the Roman emperor who first began the effective conquest of Britain; but that’s another story.

I will incorporate below a post I was writing for this blog about convergence of Fourier series, so it will be quite long in the end. Sorry about that; the next posts will probably be much shorter.

1. Fourier series trivia

First, some trivia about Fourier series, as a brush-up.

One wishes to consider approximations of functions (periodic of period 1) by means of trigonometric polynomials

$\displaystyle \sum_{n=0}^{N}{\left(a_n \cos{2\pi n x} + b_n \sin{2\pi n x}\right)},$

or, with a better notation,

$\displaystyle \sum_{n=-N}^{N}{c_ n e^{2\pi i n x}}.$

If one restricts oneself to functions ${f \in L^2(\mathbb{T})}$ for example (where ${\mathbb{T}}$ is the unit circle, ${\mathbb{S}^1}$ or ${\mathbb{U}^1}$ if you prefer), one is immediately led to consider the projections onto the spaces spanned by ${\{e^{2\pi i n x}\}_{|n|\leq N}}$ (the vectors are easily verified to be orthonormal to each other),

$\displaystyle S_N f =\sum_{n=-N}^{N}{\left\langle f, e^{2\pi i n \cdot }\right\rangle e^{2\pi i n x}},$

and it can be noted that

$\displaystyle S_N f(x_0) = \left\langle f , \sum_{n=-N}^{N}{e^{2\pi i n (\cdot -x_0)}} \right\rangle = \Phi_N \ast f (x_0),$

where ${\Phi_N}$ is called the Dirichlet kernel and has expression

$\displaystyle \Phi_N (y) = \sum_{n=-N}^{N}{e^{2\pi i n y}}.$

The study of projections is thus reduced to that of a sequence of convolution operators. Notice how ${\int_{0}^{1}{\Phi_N}\,dx = 1}$, so one can think of it as a particular approximation of unity (with the caveat that its ${L^1}$ norms are not uniformly bounded, see Section 2). As usual, one is then concerned with convergence issues: does ${S_N f}$ converge to ${f}$ for ${N\rightarrow +\infty}$? And in what sense? Well, the most ubiquitous notions are those of pointwise convergence (or a.e. convergence) and ${L^p}$ convergence. The first one is harder to establish and is the content of the famous theorem by Carleson (see also later posts). We concentrate on this convergence for the moment.

First of all, ${\Phi_N}$ can be computed explicitly as a geometric series:

$\displaystyle \sum_{n=-N}^{N}{e^{2\pi i n y}} = e^{-2\pi i N y} \sum_{n=0}^{2N}{e^{2\pi i n y}} = e^{-2\pi i N y}\frac{e^{2\pi i (2N+1)y}-1}{e^{2\pi i y}-1}$

$\displaystyle = \frac{e^{2\pi i (N+1/2)y}-e^{-2\pi i (N+1/2)y}}{e^{\pi i y}-e^{-\pi i y}} = \frac{\sin\left(2\pi \left(N+\frac{1}{2}\right)y\right)}{\sin\left(\pi y\right)}.$
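A quick numerical sanity check of the closed form can be sketched as follows. The function names are made up for illustration, only Python’s standard library is used, and the sample points are arbitrary non-integer values (the closed form has a removable singularity at the integers).

```python
import cmath
import math

def dirichlet_sum(N, y):
    """Dirichlet kernel as the raw exponential sum over n = -N..N."""
    return sum(cmath.exp(2j * math.pi * n * y) for n in range(-N, N + 1))

def dirichlet_closed(N, y):
    """Closed form sin(2*pi*(N+1/2)*y) / sin(pi*y), for non-integer y."""
    return math.sin(2 * math.pi * (N + 0.5) * y) / math.sin(math.pi * y)

# Compare the two expressions at a few arbitrary sample points.
for N in (1, 5, 20):
    for y in (0.013, 0.25, 0.4, -0.3):
        gap = abs(dirichlet_sum(N, y) - dirichlet_closed(N, y))
        print(N, y, gap)  # the gap should be at rounding-error level
```

The imaginary part of the sum cancels because the terms with indices ${n}$ and ${-n}$ are complex conjugates, which is why comparing against a real-valued closed form makes sense.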

Now, it can be proved under some regularity hypothesis on ${f}$ that

$\displaystyle S_N f(x) \rightarrow \frac{f(x^{+})+f(x^{-})}{2}$

(so that if ${f}$ is also continuous at a point, one has convergence at that point). To understand the regularity hypotheses let’s see where they arise from. First, change by periodicity the interval of integration to the symmetric ${[-1/2,1/2]}$, for the sake of convenience. One can exploit the fact that ${\int_{-1/2}^{1/2}{\Phi_N} = 1}$ by writing

$\displaystyle S_N f (x_0) - f(x_0) = \int_{-1/2}^{1/2}{(f(x_0 - y) - f(x_0))\Phi_N(y)}\,dy,$

and since ${\Phi_N(y) = \Phi_N(-y)}$ one has the same identity with ${x_0 + y}$ in place of ${x_0 - y}$, so that

$\displaystyle 2(S_N f (x_0) - f(x_0)) = \int_{-1/2}^{1/2}{(f(x_0 + y) + f(x_0 - y) - 2f(x_0))\Phi_N(y)}\,dy.$

We know that the behaviour of ${\sin \pi x}$ (in the Dirichlet kernel) is essentially ${\pi x}$ when ${x}$ is close to ${0}$, so in order to make that more explicit write

$\displaystyle 2(S_N f (x_0) - f(x_0)) = \int_{-1/2}^{1/2}{\frac{f(x_0 + y) + f(x_0 - y) - 2f(x_0)}{\pi y}\frac{\pi y}{\sin (\pi y)}\sin\left(2\pi \left(N+\frac{1}{2}\right)y\right)}\,dy.$

We don’t really worry about the factor ${{\pi y} /{\sin (\pi y)}}$ as its absolute value is of size ${\sim 1}$ in ${[-1/2,1/2]}$. So, the r.h.s. will tend to zero (and thus we’ll have ${S_N f(x_0) \rightarrow f(x_0)}$) by the Riemann-Lebesgue lemma if

$\displaystyle \frac{f(x_0 + y) + f(x_0 - y) - 2f(x_0)}{y}$

is in ${L^1([-1/2,1/2])}$. This is what is called a Dini condition – automatically satisfied under the (strong) assumption that ${f}$ be Lipschitz continuous. Therefore, a function satisfying the Dini condition at ${x_0}$ will have a convergent Fourier series there,

$\displaystyle S_N f (x_0) \rightarrow f(x_0).$

Notice that without exploiting the symmetry of ${\Phi_N}$ one could ask just for the stronger condition

$\displaystyle \int_{-1/2}^{1/2}{\left|\frac{f(x_0 + y) - f(x_0)}{y}\right|}\,dy <\infty,$

which is sometimes referred to as Dini condition as well. It should also be noted that it is enough to have that the function above is in ${L^1([-\delta,\delta])}$ for some ${0<\delta \leq 1/2}$; integrability on the whole interval is not needed, since away from ${y=0}$ the denominator is harmless.

A different criterion (the one for which we have convergence to the average of left and right limits) is that ${f}$ be of bounded variation in a neighbourhood of ${x_0}$. This is called Jordan’s criterion. Its proof relies on the fact that a function of bounded variation on the real line can be written as the difference of two monotonic functions.
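To see the two criteria at work, here is a small sketch with two hand-picked examples that are not from the meeting: the triangle wave ${f(x) = |x - 1/2|}$ (Lipschitz, so Dini applies everywhere) and the square wave ${g = \chi_{[0,1/2)}}$ (bounded variation, with jumps at ${0}$ and ${1/2}$). The Fourier coefficients are computed by hand and hard-coded, so treat them as assumptions of the sketch.

```python
import math

def triangle_partial_sum(N, x):
    """S_N f(x) for the 1-periodic Lipschitz function f(x) = |x - 1/2| on [0, 1).

    Coefficients: c_0 = 1/4 and c_n = 1/(pi^2 n^2) for odd n (zero for even
    n != 0), so the symmetric partial sum reduces to a cosine series.
    """
    total = 0.25
    for n in range(1, N + 1, 2):  # odd frequencies only
        total += 2.0 * math.cos(2 * math.pi * n * x) / (math.pi ** 2 * n ** 2)
    return total

def square_partial_sum(N, x):
    """S_N g(x) for g = 1 on [0, 1/2), 0 on [1/2, 1), extended periodically.

    Coefficients: c_0 = 1/2 and c_n = 1/(pi i n) for odd n, giving a sine series.
    """
    total = 0.5
    for n in range(1, N + 1, 2):
        total += 2.0 * math.sin(2 * math.pi * n * x) / (math.pi * n)
    return total

# Dini: at x0 = 0.3 the partial sums approach f(0.3) = 0.2.
print(triangle_partial_sum(1001, 0.3))
# Jordan: at the jump x0 = 0 every partial sum equals the average (g(0+) + g(0-))/2.
print(square_partial_sum(1001, 0.0))
# Jordan at a continuity point: the partial sums approach g(1/4) = 1.
print(square_partial_sum(2001, 0.25))
```

At the jump the sine terms vanish identically, so the value ${1/2}$ is hit exactly for every ${N}$; at ${x_0 = 1/4}$ the convergence is only as fast as the alternating series ${1 - 1/3 + 1/5 - \dots}$ allows.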

2. Convergence failures

The above digression settles the matter for functions with enough regularity. On the other hand, it can be proved that there are continuous functions (not just piecewise like in Jordan’s criterion) such that the Fourier series diverges at a chosen point ${y_0 \in \mathbb{T}}$. It can be proved by the contrapositive of the Banach-Steinhaus theorem: if the operator norms of the functionals ${f \mapsto S_N f(y_0)}$ are unbounded, then there exists at least one function for which ${\sup_N |S_N f(y_0)| = \infty}$. This is because ${C(\mathbb{T})}$ is a Banach space with norm ${\|\cdot\|_{\infty}}$, so the hypotheses of Banach-Steinhaus are satisfied. It suffices to calculate the operator norms of ${S_N|_{y_0}\,:\, C(\mathbb{T}) \rightarrow \mathbb{C}}$. Assume ${y_0 = 0}$, then

$\displaystyle |S_N f(0)| = \left|\int_{0}^{1}{f(y)\Phi_N (y)}\,dy\right|\leq \|\Phi_N\|_{L^1(\mathbb{T})} \|f\|_{L^\infty},$

and it’s not that hard to build a continuous function with ${\|f\|_\infty = 1}$ that shows that the norm of ${S_N|_{x=0}}$ is actually ${\|\Phi_N\|_{L^1(\mathbb{T})}}$. This norm grows logarithmically in ${N}$, so we’re done. That the growth is about logarithmic in ${N}$ can be guessed from the following: by rearranging the terms one can see that

$\displaystyle \Phi_{2N}(x) = \sum_{n=-2N}^{2N}{e^{2\pi i n x}}=\sum_{n=-N}^{N}{e^{2\pi i (2n) x}} +\sum_{n=-N}^{N}{e^{2\pi i (2n+1) x}} - e^{2\pi i(2N+1)x}$

$\displaystyle = \left(1 + e^{2\pi i x}\right)\Phi_{N}(2x)- e^{2\pi i(2N+1)x},$

so that with some manipulation

$\displaystyle \|\Phi_{2N}\|_{L^1(\mathbb{T})} \leq \int_{0}^{1}{|\left(1 + e^{2\pi i y}\right)\Phi_{N}(2y)|}\,dy + 1 = 2\int_{0}^{1}{|\cos{\left(\frac{\pi y}{2}\right)}\Phi_{N}(y)|}\,dy + 1;$

this last expression can be bounded by ${\|\Phi_{N}\|_{L^1(\mathbb{T})}}$ plus a constant, as by symmetry

$\displaystyle 2\int_{0}^{1/2}{|\cos{\left(\frac{\pi y}{2}\right)}\Phi_{N}(y)|}\,dy \leq \|\Phi_{N}\|_{L^1(\mathbb{T})}$

and the function ${\cos{\left(\frac{\pi y}{2}\right)}\Phi_{N}(y)}$ is easily seen to be bounded independently of ${N}$ on ${[1/2,1]}$. Thus

$\displaystyle \|\Phi_{2^k}\|_{L^1(\mathbb{T})} \lesssim k.$

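Of course one could compute it directly; a direct computation also gives the matching lower bound ${\|\Phi_{N}\|_{L^1(\mathbb{T})} \gtrsim \log N}$, which is what the Banach-Steinhaus argument above actually needs.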
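The logarithmic growth can also be observed numerically. The following is only a rough sketch: it approximates ${\|\Phi_N\|_{L^1(\mathbb{T})}}$ with a midpoint rule (the number of steps is an arbitrary choice) and prints the increments along ${N = 2^k}$, which should stabilise if the growth is linear in ${k}$.

```python
import math

def dirichlet_l1_norm(N, steps=20000):
    """Midpoint-rule approximation of the L^1 norm of Phi_N over one period.

    Uses the closed form |sin((2N+1) pi y) / sin(pi y)| and the evenness of
    the kernel: integrate over [0, 1/2] and double.
    """
    h = 0.5 / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h  # midpoints avoid the removable singularity at y = 0
        total += abs(math.sin((2 * N + 1) * math.pi * y) / math.sin(math.pi * y))
    return 2.0 * total * h

# Norms along N = 2^k, k = 0..6, together with the increment from one k to the next.
norms = [dirichlet_l1_norm(2 ** k) for k in range(7)]
for k in range(1, 7):
    print(k, round(norms[k], 4), round(norms[k] - norms[k - 1], 4))
```

Roughly constant increments per doubling of ${N}$ are exactly what a bound of the form ${\|\Phi_{2N}\|_{L^1(\mathbb{T})} \leq \|\Phi_{N}\|_{L^1(\mathbb{T})} + C}$ predicts, and they also illustrate that the norms do not stay bounded.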

Now the continuous function whose existence we just proved has Fourier series divergent at 0. How bad can it get for continuous functions? Well, by Carleson’s theorem, functions in ${L^2(\mathbb{T})}$ (and thus also in ${C(\mathbb{T})}$) have Fourier series convergent almost everywhere, so that the worst that can happen is that the set of divergence is of measure zero.

Is it better than that, or is this the best we can say? It turns out to be the latter. Odysseas directed me to the book An Introduction to Harmonic Analysis by Yitzhak Katznelson (first edition in 1968). It is proved in there that a sufficiently regular space of functions ${B \supset C(\mathbb{T})}$ either contains a function ${f}$ s.t. ${S_N f}$ diverges everywhere (see below) or the Fourier series of all functions in ${B}$ converge (only) a.e. Therefore it makes sense to ask for a.e. convergence only and not pointwise (which is trivial for ${L^p(\mathbb{T})}$ spaces because the functions are actually just equivalence classes modulo sets of measure zero, but not just as trivial in general – for ${C(\mathbb{T})}$ for example). More rigorously: a set of divergence for ${B}$ is a set ${E \subset \mathbb{T}}$ such that there exists ${f\in B}$ the Fourier series of which diverges at every ${x\in E}$. Equivalently,

$\displaystyle \sup_{N}{|S_N f(x)|} = \infty \quad\quad\quad \forall x \in E.$

Then the theorem goes

Theorem 1 Let ${B}$ be a homogeneous Banach space [1] of functions on ${\mathbb{T}}$ such that ${B\supset C(\mathbb{T})}$. Then either ${\mathbb{T}}$ is a set of divergence for ${B}$ or the sets of divergence of ${B}$ are precisely the subsets of measure zero.

Informally, if ${E}$ is a set of divergence of positive measure, every translate is again a set of divergence and then by taking the union over all rational translates it can be proved we still obtain a set of divergence, and this set has measure ${1}$. Then one proves separately that every zero measure set is a divergence set for ${C(\mathbb{T})}$ and we’re done. I might have to say more about this next week, we’ll see.

Anyhow, as a quick survey, the result of Carleson (and Hunt’s extension to ${L^p(\mathbb{T})}$ for ${p>1}$) implies that the Fourier series of functions in ${L^p(\mathbb{T})}$ converge a.e., so that we are in the second situation outlined by the theorem. We find ourselves in the first one for ${L^1(\mathbb{T})}$, as Kolmogorov showed with his famous example of a function ${f}$ in ${L^1(\mathbb{T})}$ such that ${S_N f}$ is everywhere divergent. Details on the construction can be found in Katznelson’s book. In the meeting we asked ourselves if there’s anything deep under Kolmogorov’s example but the shared opinion is that there isn’t – it’s just technical (although that’s open to be questioned – nobody remembered the construction at the time).

As a further note, it is still an open problem to determine the “critical” ${L}$-space for a.e. convergence: as just said it’s in between ${L^1(\mathbb{T})}$ and any other ${L^p(\mathbb{T})}$ for ${p>1}$. It has been proved that a.e. convergence holds for functions in ${L \log L \log \log \log L}$, and on the other side there’s an example of a function in ${L(\log L)^{1/2}}$ with everywhere divergent Fourier series. It is conjectured that the largest space in which a.e. convergence holds is ${L \log L}$.

3. Carleson’s theorem

Personally, I got interested in Carleson’s theorem because the proof Lacey and Thiele gave is a strong example of time-frequency analysis. Carleson’s original proof, as I understand it, relied on a subtle and quite complicated decomposition of ${f\in L^2}$; historically, a second proof was given by Fefferman considering linearization of the Carleson operator instead. This proof relied heavily on time-frequency analysis, which was made explicit in Lacey and Thiele’s proof.

One defines Carleson’s maximal operator (which already appeared above) as

$\displaystyle \mathcal{C} f (x) = \sup_{N}\left|S_N f(x)\right| = \sup_{N}{|\Phi_N \ast f (x)|}.$

If one can prove the weak type estimate

$\displaystyle \left|\{x\in \mathbb{T}\,:\,\mathcal{C}f (x) > \lambda \}\right|\lesssim \frac{\|f\|_{L^2(\mathbb{T})}^2}{\lambda^2}$

then convergence a.e. follows by a standard density argument: namely, since convergence holds for functions ${\phi \in C^\infty(\mathbb{T})}$ and they are dense in ${L^2(\mathbb{T})}$, take ${\phi}$ s.t. ${\|f-\phi\|_{L^2(\mathbb{T})}\leq \varepsilon^{3/2}}$ and define the function

$\displaystyle \Sigma_f(x):=\limsup_{N\rightarrow \infty}{|f(x) - S_N f(x)|}.$

Then since ${\Sigma_f(x) \leq |f(x) - \phi(x)| + \mathcal{C}(f-\phi)(x)}$ one has (using Chebyshev’s inequality for the second term)

$\displaystyle |\{\Sigma_f > 2\varepsilon\}| \leq |\{\mathcal{C}(f - \phi) > \varepsilon\}|+ |\{|f-\phi|>\varepsilon\}|\lesssim \varepsilon,$

and by arbitrariness of ${\varepsilon}$ it follows ${\Sigma_f = 0}$ a.e.

As seen above, we can “factor out” ${\frac{\pi y}{\sin(\pi y)}}$ as it is of roughly constant size in ${[-1/2,1/2]}$, and work with the equivalent kernel

$\displaystyle \tilde{\Phi}_N(y) = \frac{\sin(\pi (2N +1) y)}{\pi y}.$
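The claim that the factor ${\pi y / \sin(\pi y)}$ is harmless is easy to check numerically. The sketch below (function names are hypothetical, the grid is an arbitrary choice) samples the factor on ${(0,1/2]}$ and confirms that the two kernels differ exactly by it.

```python
import math

def dirichlet(N, y):
    """Dirichlet kernel in closed form, for 0 < |y| <= 1/2."""
    return math.sin((2 * N + 1) * math.pi * y) / math.sin(math.pi * y)

def sinc_kernel(N, y):
    """The simplified kernel sin(pi (2N+1) y) / (pi y)."""
    return math.sin((2 * N + 1) * math.pi * y) / (math.pi * y)

def ratio(y):
    """The factor pi*y / sin(pi*y) relating the two kernels."""
    return math.pi * y / math.sin(math.pi * y)

grid = [k / 1000 for k in range(1, 501)]  # sample points in (0, 1/2]
lo = min(ratio(y) for y in grid)
hi = max(ratio(y) for y in grid)
print(lo, hi)  # stays between 1 and pi/2 on this grid

# The two kernels differ exactly by this factor.
print(abs(dirichlet(7, 0.123) - ratio(0.123) * sinc_kernel(7, 0.123)))
```

By evenness the same bounds hold on ${[-1/2,0)}$, so the factor is a bounded, smooth multiplier that does not interact with the oscillation in ${N}$.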

Now, write ${N'}$ for ${N+\frac{1}{2}}$ (so that ${2\pi N' y = \pi(2N+1)y}$) and notice that

$\displaystyle \int_{-N'}^{N'}{e^{2\pi i \xi y}}\,d\xi = \frac{e^{2\pi i N' y}-e^{-2\pi i N' y}}{2\pi i y} = \tilde{\Phi}_N(y),$

thus by Fubini

$\displaystyle \tilde{\Phi}_N \ast f(x) = \int_{0}^{1}{\int_{-N'}^{N'}{e^{2\pi i \xi y} f(x-y)}\,d\xi}\,dy = \int_{-N'}^{N'}{\widehat{f}(\xi) e^{2\pi i \xi x}}\,d\xi.$

If one were to replace ${N'}$ by ${\infty}$ this would be nothing but the Fourier inversion formula (interpret ${\widehat{f}}$ as the Fourier transform on ${\mathbb{R}}$ of the function ${f\chi_{[0,1]}}$). Hence we’ll concentrate on this, and in particular we can obviously content ourselves with the one-sided Carleson operator (which we denote by the same letter from now on)

$\displaystyle \mathcal{C}f(x) : = \sup_{N\in\mathbb{R}}{\left|\int_{-\infty}^{N}{\widehat{f}(\xi) e^{2\pi i \xi x}}\,d\xi\right|}.$

We state Carleson’s theorem once again (but this is the last one, I promise)

Theorem 2 (Carleson, 1966) Carleson’s maximal operator ${\mathcal{C}}$ is ${L^2(\mathbb{R}) \rightarrow L^{2,\infty}(\mathbb{R})}$ bounded. As a corollary, the Fourier inversion formula holds pointwise a.e. for functions in ${L^2(\mathbb{R})}$ and the Fourier series of functions in ${L^2(\mathbb{T})}$ converge a.e.

The goal of the student group is to understand the proof of this theorem in depth. We shall start from the mock-case of the Walsh plane next week, where all the ideas are already present. But meanwhile, as a motivating digression, Prof. Wright showed us how a proof of the maximal Hausdorff-Young inequality can be given in just a few lines thanks to a (tricky) idea of Christ and Kiselev (see [2]); this is the subject of the next (and last) section.

4. Maximal Hausdorff-Young inequality

Instead of considering the inversion problem for the Fourier transform, consider the maximal version of it, namely

$\displaystyle \mathcal{F}^\ast f(\xi):= \sup_{N}{\left|\int_{-\infty}^{N}{f(x) e^{-2\pi i x \xi}}\,dx\right|}.$

If one could prove this is of weak type ${(2,2)}$, that would be equivalent to Carleson’s theorem, since ${\mathcal{F}^\ast \widehat{f} = \mathcal{C}f}$ up to a reflection (and we have Plancherel’s identity). Thus Carleson’s theorem can be seen as the endpoint of this maximal Fourier operator. And for the range ${1< p <2}$? That would be the maximal version of the Hausdorff-Young inequality. It turns out it holds, i.e. ${\mathcal{F}^\ast}$ maps ${L^p}$ to ${L^{p^\prime}}$ boundedly for ${1< p<2}$. What’s extremely nice about this is that no complex interpolation between the endpoints is needed (one being Carleson’s theorem) – it can be proved on its own. The result is not particular to the Fourier transform and it can be proved in the following general setting: let ${T}$ be a linear operator on ${L^p(\mathbb{R})}$ with kernel ${K(x,y)}$, i.e.

$\displaystyle Tf(x) = \int_{\mathbb{R}}{K(x,y) f(y)}\,dy,$

and assume ${T}$ is smoothing, that is

$\displaystyle \|Tf\|_{L^q} \lesssim \|f\|_{L^p}$

for some ${q>p}$. Then the maximal operator

$\displaystyle T^\ast f(x) = \sup_{N}{\left|\int_{-\infty}^{N}{K(x,y)f(y)}\,dy\right|}$

is bounded from ${L^p}$ to ${L^q}$ as well. The proof is based on a very clever and surprising idea. Normalize ${\|f\|_p = 1}$, and partition the real line in the following dyadic way: first define ${\omega^0 = \mathbb{R}}$, then choose ${a}$ such that ${\int_{-\infty}^{a}{|f|^p}dx = \int_{a}^{+\infty}{|f|^p}dx = 1/2}$ and define ${\omega^1_1 = ]-\infty,a]}$ and ${\omega^1_2 = [a,+\infty[}$; repeat the splitting in equal parts according to the mass ${|f|^p}$ on both the intervals ${\omega^1_1, \omega^1_2}$, which are the intervals of level 1, and so on. In the end we get a dyadic frame ${\{\omega^j_k\}_{j\in\mathbb{N}, 1\leq k \leq 2^j}}$ such that

$\displaystyle \int_{\omega^j_k}{|f|^p}dx = 2^{-j},$

$\displaystyle \bigcup_{k\leq 2^{j}}{\omega^j_k} =\mathbb{R},$

and we denote the collection of intervals of level ${j}$ by ${\mathcal{I}_j}$. Notice there’s a natural relation amongst the intervals ${\omega \in \mathcal{I}_j}$, that of being a brother: namely the interval ${\omega^\ast \in \mathcal{I}_j}$ is a brother to ${\omega}$ if ${\omega \cup \omega^\ast \in \mathcal{I}_{j-1}}$. Moreover we can distinguish the two brothers as the left and the right one. Call ${\mathcal{L}_j}$ the left brothers in ${\mathcal{I}_j}$.

Now we linearize the supremum, i.e. we take a measurable function ${N(x)}$ such that ${\int_{-\infty}^{N(x)}}$ realizes the supremum (or at least a positive fixed fraction of it) at ${x}$, and we change point of view: now ${N(x)}$ is a fixed function and we have to prove the inequality

$\displaystyle \|T_{N(\cdot)}f\|_{L^q}:=\left\|\int_{-\infty}^{N(x)}{K(x,y)f(y)}\,dy\right\|_{L^q} \lesssim \|f\|_{L^p}$

with constant independent of ${N(\cdot)}$. We split the integration on ${]-\infty, N(x)]}$ in the following way

$\displaystyle \int_{-\infty}^{N(x)} = \sum_{j \geq 1}{\sum_{\omega\in\mathcal{L}_j\,:\, \omega^\ast \ni N(x)}{\int_{\omega}}},$

(notice the second sum from the left has at most one term really) from which it follows

$\displaystyle \left|\int_{-\infty}^{N(x)}{K(x,y)f(y)}\,dy\right|\leq \sum_{j \geq 1}{\sum_{\omega\in\mathcal{L}_j\,:\, \omega^\ast \ni N(x)}{|T (f\chi_\omega)(x)|}}.$

Here each integral over ${\omega}$ is the full (untruncated) operator ${T}$ applied to ${f\chi_\omega}$. Now because the inner sum has at most one term we can bound it by the supremum in ${k}$,

$\displaystyle \left|T_N f(x)\right|\leq \sum_{j \geq 1}{\sup_{1\leq k\leq 2^j}{|T (f\chi_{\omega^j_k})(x)|}},$

and further replace the supremum with the ${\ell^q}$ sum

$\displaystyle \left|T_N f(x)\right|\leq \sum_{j\geq 1}{\left(\sum_{k=1}^{2^j}{|T (f \chi_{\omega^j_k})(x)|^q}\right)^{1/q}}=:\sum_{j \geq 1}{A_j f(x)},$

where the ${A_j}$’s don’t depend on ${N}$ anymore! Then we’re done, since

$\displaystyle \|T_N f\|_{L^q} \leq \sum_{j\geq 1}{\|A_j f\|_{L^q}}$

and

$\displaystyle \|A_j f\|_{L^q}^q = \sum_{k=1}^{2^j}{\int{|T (f \chi_{\omega^j_k})(x)|^q}\,dx}\lesssim \sum_{k=1}^{2^j}{\|f \chi_{\omega^j_k}\|_{L^p}^q}= \sum_{k=1}^{2^j}{2^{-j q/p}} = 2^{-j(q/p -1)},$

so that it is summable in ${j\geq 1}$, because ${q>p}$. This concludes the proof, as the bound is independent of ${N(\cdot)}$ as we wished. Notice the smoothing assumption on ${T}$ is fundamental here.
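The combinatorial heart of the argument, the splitting of ${]-\infty, N(x)]}$ into at most one left brother per level, is just a binary expansion in disguise. Here is a minimal sketch of it in the “mass coordinate” ${t = \int_{-\infty}^{N(x)}|f|^p}$, assuming ${0 \leq t < 1}$, in which ${\omega^j_k}$ becomes the dyadic interval ${[(k-1)2^{-j}, k2^{-j})}$; the function names and the truncation at a finite depth are choices made only for illustration.

```python
from fractions import Fraction

def left_brothers(t, depth):
    """Left brothers whose right brother contains t, one level at a time.

    Intervals are indexed as (j, k) for [(k-1)/2^j, k/2^j), k = 1..2^j, in the
    mass coordinate. Odd k is a left brother, even k a right brother.
    Assumes 0 <= t < 1.
    """
    chosen = []
    for j in range(1, depth + 1):
        k = int(t * 2 ** j) + 1        # index of the level-j interval containing t
        if k % 2 == 0:                 # t sits in a right brother...
            chosen.append((j, k - 1))  # ...so its left brother lies entirely below t
    return chosen

def interval(j, k):
    """Endpoints of the level-j interval with index k, as exact fractions."""
    return Fraction(k - 1, 2 ** j), Fraction(k, 2 ** j)

t = Fraction(11, 16)  # mass to the left of N(x); in binary 0.1011
pieces = left_brothers(t, depth=6)
print(pieces)
print(sum(b - a for a, b in (interval(j, k) for j, k in pieces)))
```

Each level contributes a piece exactly when the corresponding binary digit of ${t}$ is ${1}$, so the pieces are disjoint, consecutive, and their total mass recovers ${t}$ up to the truncation error ${2^{-\text{depth}}}$.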

Now, go back to the case of the Fourier transform. It corresponds to the kernel ${K(x,y)= e^{-2\pi i x y}}$, and it is smoothing on ${L^p}$ for ${1 \leq p<2}$: by the Hausdorff-Young inequality it maps ${L^p}$ to ${L^{p^\prime}}$, and ${p^\prime > 2 > p}$ if ${p<2}$. Thus the above theorem applies and one has the maximal Hausdorff-Young inequality

$\displaystyle \|\mathcal{F}^\ast f\|_{L^{p^\prime}} \lesssim \|f\|_{L^p}.$

The theorem breaks down exactly at ${p=2}$ because for this exponent the Fourier transform is no longer smoothing (${p^\prime = p = 2}$).

That’s all for today.

[1] : it means a Banach subspace of ${L^1(\mathbb{T})}$ (the largest ${L^p}$ space on ${\mathbb{T}}$) with a norm ${\|\cdot\|_B \geq\|\cdot\|_{L^1(\mathbb{T})}}$ which is translation invariant and continuous w.r.t. translations.

[2] : “Maximal functions associated to filtrations”, M. Christ and A. Kiselev, Jour. of Func. Anal., Vol. 179, Issue 2, 1 February 2001, Pages 409–425.