The last few meetings have been quite interesting, and I’m happy with how the whole thing is turning out. Sadly I’m very slow at typing and working out the ideas, so I have to cover three different meetings in one post. Since the notes are getting incredibly long, I’ll split them into at least two parts. I include the pdf version, in case it makes them any easier to read.

ptolemaics meeting 4 & 5 & 6 pt I

Let me finally get into the time-frequency analysis of the Walsh phase plane. I won’t include many proofs as they are already well written in Hytönen’s notes (see previous post). My main interest here is the heuristic interpretation of them (disclaimer: you might think I’m bullshitting you at a certain point, but I’m probably not). Ideally, it would be very good to be able to trace back the train of thought that went into Fefferman’s and Thiele-Lacey’s proofs.

Sorry if the pictures are shit, I haven’t learned how to draw them properly using latex yet.

**1. Brush up**

Recall we have Walsh series for functions defined by

the (Walsh-)Carleson operator here is thus

and in order to prove a.e. for one can prove that

There’s a general remark that should be made at this point: the last inequality is equivalent to

to hold on every measurable (of finite measure).

This is because in general an estimate of the kind

is equivalent to

To see why, suppose the first is true and take arbitrary: then by the triangle inequality for the integral it is equivalent to estimate

and we easily have the second then follows by summing (optimize according to this last inequality). Now if you suppose the second one is true instead, and consider positive/negative parts of real/imaginary parts of , you just have to take of finite measure, and let it approach the whole set. It follows immediately that the first one holds.

So, we can content ourselves with proving

**2. Walsh wave packets**

In the previous post I’ve stated some properties of the Walsh functions, one of which was that if (i.e. their binary digits are complementary when overlapping) then

It’s actually true in general, by what we saw in the previous post, that

Now, the Walsh functions are an orthonormal basis for , thus if you consider functions as linear combinations of Walsh functions, multiplying by is equivalent to “shifting” by the indices of the Walsh functions that compose – or equivalently to shifting the Walsh transform by . Bear in mind the field has order two, so shifting in this setting doesn’t necessarily resemble shifting in . Anyhow, from the group point of view, this is analogous to what happens when you modulate an function by : you’re shifting its Fourier transform by . Modulation in time equals translation in frequency and vice versa. Hence we can follow the analogy and consider the phase-plane point of view on the Walsh transform. This is just .
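The product rule and the orthonormality are easy to test on a computer. Here is a minimal sketch, under the assumption (the formulas are not reproduced here) that the Walsh functions are taken in the usual Paley ordering, i.e. the n-th function is the product of the Rademacher functions selected by the binary digits of n. The helper `walsh` and the grid size are my own choices for illustration.

```python
import numpy as np

def walsh(n, M):
    """Samples of the Paley-ordered Walsh function w_n at the points s/M (M a power of 2)."""
    k = M.bit_length() - 1
    s = np.arange(M)
    exponent = np.zeros(M, dtype=int)
    for j in range(k):
        if (n >> j) & 1:
            # binary digit j of n pairs with binary digit j+1 of x = s/M
            exponent += (s >> (k - 1 - j)) & 1
    return 1 - 2 * (exponent % 2)

N = 16
W = np.array([walsh(n, N) for n in range(N)])

# Product rule: w_n * w_m = w_{n XOR m} (digit-wise addition mod 2 of the indices).
product_rule_ok = all(
    np.array_equal(W[n] * W[m], W[n ^ m]) for n in range(N) for m in range(N)
)

# Orthonormality in L^2([0,1)): the integral becomes an average over the grid.
gram = W @ W.T / N
orthonormal_ok = np.allclose(gram, np.eye(N))

print(product_rule_ok, orthonormal_ok)
```

The XOR on the indices is the "shift" in this setting: multiplying by a Walsh function permutes the Walsh coefficients instead of translating them along the integers.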

We need to introduce wave packets associated to rectangles according to the principle that the time interval contains information about the time localization, and about the frequency. Define thus the packets as

where is the left endpoint of .

Let’s look in a bit more detail at the localization properties of such wavepackets. Notice that if then it is just . Its Walsh-Fourier transform is

and the value of the integral is calculated as follows: it is

where on the integration interval for ; it follows that the integral factor is constant in on intervals of size . If is then will be half of the time on , and the rest of the time, so that the integral is zero. On the contrary, if then the integral is . Summarizing

This is remarkable: the Walsh-Fourier transform is still localized! This means that the frequency projection multiplier

is convolution with

Remark 1. Let’s make clear what a dyadic interval is, given the mapping between reals and Laurent series. Given a dyadic interval , let be the polynomial representation of ; then the interval on contains all the polynomials of the form , where is any Laurent series with .

**2.1. Uncertainty Principle in the Walsh phase-plane**

We are tempted to localize both in time and frequency in the Walsh plane. If we do so, we obtain

Look at the characteristic function inside the integral. For that to be , it must be , where , and . On the other hand is of the form with , where . If we assume something magic happens:

This is because the degree of is too small to affect the sum , relatively to . This means that if then factors out of the integral, and the expression becomes

i.e. if the tile has area 1, then the associated time and frequency localizations **commute with each other!** Let me state that once again:

Remark 2. This NEVER EVER happens in the real case, as has bounded support, while has unbounded support (because its Fourier transform has bounded support).

Why is this relevant to us though? Well, and are projections and in particular they are self-adjoint; commutativity ensures both that the operator is self-adjoint, and that it is idempotent. Therefore is a projection operator. See below for how this is related to the wavepackets.
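The commutation can be seen in a finite model. The sketch below assumes a discretisation of [0,1) into 16 points, Paley-ordered Walsh functions, time projection given by multiplication by the indicator of a dyadic interval, and frequency projection given by the orthogonal projection onto the Walsh functions with index in a dyadic interval. All function names are hypothetical helpers, not anything from the notes.

```python
import numpy as np

def walsh(n, M):
    """Samples of the Paley-ordered Walsh function w_n at the points s/M (M a power of 2)."""
    k = M.bit_length() - 1
    s = np.arange(M)
    exponent = np.zeros(M, dtype=int)
    for j in range(k):
        if (n >> j) & 1:
            exponent += (s >> (k - 1 - j)) & 1
    return 1 - 2 * (exponent % 2)

K = 4
N = 2 ** K
W = np.array([walsh(n, N) for n in range(N)]) / np.sqrt(N)  # orthonormal rows

def time_projection(j, m):
    """Multiplication by the indicator of I = [m 2^-j, (m+1) 2^-j)."""
    block = N >> j
    d = np.zeros(N)
    d[m * block:(m + 1) * block] = 1.0
    return np.diag(d)

def freq_projection(lo, hi):
    """Orthogonal projection onto span{w_n : lo <= n < hi}."""
    return W[lo:hi].T @ W[lo:hi]

def commutator_norm(A, B):
    return np.linalg.norm(A @ B - B @ A)

Pi = freq_projection(4, 8)            # omega = [4, 8), length 4
T_area1 = time_projection(2, 1)       # |I| = 1/4, so |I||omega| = 1
T_area_half = time_projection(3, 2)   # |I| = 1/8, so |I||omega| = 1/2

c_area1 = commutator_norm(T_area1, Pi)
c_area_half = commutator_norm(T_area_half, Pi)

# With area 1 the product is itself an orthogonal projection, of rank 1.
product = T_area1 @ Pi
is_projection = np.allclose(product @ product, product) and np.allclose(product, product.T)
rank = np.linalg.matrix_rank(product)

print(c_area1, c_area_half, is_projection, rank)
```

In this model the commutator vanishes for the rectangle of area 1 and does not vanish for the one of area 1/2, which is the uncertainty principle showing up in a 16 by 16 matrix.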

Anyway, what happened there? That’s the Uncertainty Principle at work. It’s lurking behind the tiles, and it suggests that we restrict our attention to tiles of area . Actually it’s best to restrict ourselves to only consider tiles of area exactly (so that they are “critical”). The reason for that is that if we assume area (or constant, in general) then there is a 1-to-1 correspondence between areas of the phase-plane (made of tiles) and the linear spans of the wavepackets associated to those tiles. This might sound confusing, so I will restate it properly in a moment; I need to fix some definitions first.

**DISCLAIMER:** from now on, by **tile** I mean a dyadic rectangle with area 1. Since it will be useful, I will also introduce **bi-tiles**, which are dyadic rectangles of area 2. In particular, if you split a bi-tile in two halves horizontally, then you get two tiles, the lower half being denoted and the upper half .

Let’s rewrite the expression for the wavepackets: since , there’s an integer such that we can write

then the wavepacket associated to is

For example,

It’s also easy to see that if , are disjoint tiles, then

The operators deserve their definition as well, and we write

for the phase projection.
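Orthogonality of wave packets with disjoint tiles can be checked exhaustively in a small model. This is a sketch under my own conventions, not the notes' notation: a tile is a triple `(j, m, n)` standing for the time interval [m 2^-j, (m+1) 2^-j) and the frequency interval [n 2^j, (n+1) 2^j), and its packet is the Walsh function of index n rescaled to the time interval, sampled on a grid of 8 points and normalised to unit length.

```python
import numpy as np
from itertools import combinations

def walsh(n, M):
    """Samples of the Paley-ordered Walsh function w_n at the points s/M (M a power of 2)."""
    k = M.bit_length() - 1
    s = np.arange(M)
    exponent = np.zeros(M, dtype=int)
    for j in range(k):
        if (n >> j) & 1:
            exponent += (s >> (k - 1 - j)) & 1
    return 1 - 2 * (exponent % 2)

K = 3
N = 2 ** K

def wave_packet(j, m, n):
    """Unit vector sampling the packet of the tile (j, m, n)."""
    M = N >> j                      # grid points inside the time interval
    v = np.zeros(N)
    v[m * M:(m + 1) * M] = walsh(n, M) / np.sqrt(M)
    return v

def disjoint(p, q):
    """True if the two tiles do not intersect as rectangles in the phase plane."""
    (j1, m1, n1), (j2, m2, n2) = p, q
    # time intervals in grid units, frequency intervals as integer ranges
    t1 = (m1 * (N >> j1), (m1 + 1) * (N >> j1))
    t2 = (m2 * (N >> j2), (m2 + 1) * (N >> j2))
    f1 = (n1 << j1, (n1 + 1) << j1)
    f2 = (n2 << j2, (n2 + 1) << j2)
    time_overlap = t1[0] < t2[1] and t2[0] < t1[1]
    freq_overlap = f1[0] < f2[1] and f2[0] < f1[1]
    return not (time_overlap and freq_overlap)

# All tiles of area 1 inside [0,1) x [0,8).
tiles = [(j, m, n) for j in range(K + 1) for m in range(2 ** j) for n in range(N >> j)]

max_inner = max(
    abs(wave_packet(*p) @ wave_packet(*q))
    for p, q in combinations(tiles, 2)
    if disjoint(p, q)
)
unit_norms = all(np.isclose(np.linalg.norm(wave_packet(*p)), 1.0) for p in tiles)

print(len(tiles), max_inner, unit_norms)
```

The interesting pairs are the ones whose time intervals are nested but whose frequency intervals are disjoint: the supports overlap, yet the inner product still vanishes.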

The property we just mentioned above is the fact that given two collections of tiles and , the spans of the ‘s and ‘s will coincide. In detail, the following holds:

Proposition 1. If is a collection of tiles and is a tile such that

then

The proof is easy and works out thanks to the combinatorics of the dyadic intervals, and the following elementary fact: the tiles in this picture have the same span as the tiles in this picture provided they cover the same portion of the phase-plane.

*Proof:* Suppose is not contained in , otherwise it’s all trivial. First of all discard all the unnecessary tiles in the collection. We know that wavepackets with disjoint tiles are orthogonal to each other, so that we can forget about the tiles in that don’t intersect – i.e. assume all of them do intersect . We can then discard some more tiles actually, and be left with a minimal collection that covers but is made only of vertical tiles (i.e. with and then because the area is constant) or horizontal ones (the opposite of vertical). This is because a vertical tile necessarily covers an entire strip (here we’re using the constant area hypothesis!), since : then if a point is not covered by vertical tiles, all of is not covered. Thus the strip must be covered by horizontal ones, and those have a time interval larger than ! Therefore they cover all of and you can forget about the vertical ones (the same thing works the other way round).

At this point, assuming only a minimal cover of horizontal tiles, we’re halfway done: call the widest one and , where is the sibling of ; then as well (by nestedness of the dyadic intervals), and must belong to the collection too, since if the area were covered by a tile shorter than , this tile would cover as well, which contradicts minimality. Thus we have a couple of congruent tiles , both contained in the collection. By the remark stated before this proof, we can substitute with the tiles as in figure, without altering the span; but if is wider than then only one of these tiles will intersect (tile in the figure), so we can discard the other one.

Now one applies this step inductively and gets an algorithm that stops once it gets to the full tile , which then belongs to the span of .
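The statement can be tested on small examples by comparing ranks. The following is only a sketch under my own conventions: tiles are triples `(j, m, n)` for the rectangle [m 2^-j, (m+1) 2^-j) x [n 2^j, (n+1) 2^j), packets are sampled on 8 grid points, and the specific tilings are hand-picked for illustration.

```python
import numpy as np

def walsh(n, M):
    """Samples of the Paley-ordered Walsh function w_n at the points s/M (M a power of 2)."""
    k = M.bit_length() - 1
    s = np.arange(M)
    exponent = np.zeros(M, dtype=int)
    for j in range(k):
        if (n >> j) & 1:
            exponent += (s >> (k - 1 - j)) & 1
    return 1 - 2 * (exponent % 2)

K = 3
N = 2 ** K

def wave_packet(j, m, n):
    """Unit vector sampling the packet of the tile (j, m, n)."""
    M = N >> j
    v = np.zeros(N)
    v[m * M:(m + 1) * M] = walsh(n, M) / np.sqrt(M)
    return v

def rank(tiles):
    return np.linalg.matrix_rank(np.array([wave_packet(*p) for p in tiles]))

def same_span(A, B):
    """Two families span the same space iff stacking them does not raise the rank."""
    return rank(A) == rank(B) == rank(A + B)

# The elementary fact: one bitile [0,1/2) x [4,8), split in the two possible ways.
horizontal = [(1, 0, 2), (1, 0, 3)]   # same time interval, frequency halves
vertical = [(2, 0, 1), (2, 1, 1)]     # same frequency interval, time halves
bitile_ok = same_span(horizontal, vertical)

# Three different tilings of the region [0,1) x [0,4).
squares = [(0, 0, n) for n in range(4)]
halves = [(1, m, n) for m in range(2) for n in range(2)]
columns = [(2, m, 0) for m in range(4)]
tilings_ok = same_span(squares, halves) and same_span(halves, columns)

# A tile covered by the union of a tiling has its packet in the span.
P = (1, 0, 1)                          # [0,1/2) x [2,4), inside the region
in_span = rank(squares + [P]) == rank(squares)

# Control: a tile outside the region adds a new direction.
Q = (0, 0, 5)                          # [0,1) x [5,6)
outside = rank(squares + [Q]) == rank(squares) + 1

print(bitile_ok, tilings_ok, in_span, outside)
```

For the bitile the change of basis is explicit: the two horizontal packets are the sum and the difference of the two vertical ones, up to a factor of the square root of 2.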

I think the above arguments should provide sufficient motivation for the choice of wavepackets associated to tiles of area 1. As a further remark, I would like to point out that the choice of the particular expression for the wavepackets is absolutely natural once the above phase projections have been taken into consideration. As a matter of fact, as one can imagine, all the projections can be obtained one from the other by translations, modulations and scaling. In particular one can concentrate on the simplest one, namely for , and see that

and given the first factor in the RHS we see that this is equal to

i.e. the ‘s all have rank 1. This is remarkable, because it tells us there’s a function such that for

in the above case it is

What is ? Well, we can obtain it by translation modulation and scaling of the above , and what we get is

which is exactly the wavepacket !

**3. Rewriting in a useful form**

How do we exploit the phase-plane structure efficiently? Well, first of all we need to write in terms of wavepackets. By definition the Walsh function of index restricted to the interval is the wavepacket of tile

and therefore

Thus it is a projection onto a subspace of wavepackets. This collection of tiles is a very lame and uninteresting one though – just some squares stacked on top of each other. What will happen if we change the set of tiles? Well, as the proposition above tells us, as long as the tiles cover the same area in the phase-plane, the linear spans of the associated wavepackets are exactly the same. And since the operator is a projection and the wavepackets are orthogonal (if the corresponding tiles are disjoint), it follows that the operator will stay the same if the tilings are collections of disjoint tiles!

Therefore, we look for another cover of the rectangle . To find one, we use an idea that already appeared in the first post of this series. Let me recall it briefly: we had an integral on an interval and a dyadic system on (doesn’t matter which particular one); we wanted to decompose it into a sum of integrals where was a generic dyadic interval in the system. What we did was to write

i.e. we sum on all left halves of dyadic intervals such that *the right half* contains .

We now do the analogue in this case: we are interested in bitiles (which correspond to the ‘s) such that the *upper half* contains the frequency – or, to be precise, . The further condition is that the tiles have time support in obviously. And then we sum over the *lower* tiles . In formulas

(notice on the left is a tile, while on the right is a bitile). The partial Walsh-series is now

Pretty neat! Now the time support of the tiles is allowed to be smaller than , there’s more freedom to generalize. It takes a while to realize what kind of partition we get, so here’s a picture to help your intuition:
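The identity between the two ways of writing the partial sums can be tested numerically. The sketch below makes several assumptions of mine: the partial sum of order L collects the Walsh coefficients of index strictly less than L, a tile `(j, m, n)` stands for [m 2^-j, (m+1) 2^-j) x [n 2^j, (n+1) 2^j), and everything is sampled on 16 grid points. The function `lower_tiles` lists the lower halves of the bitiles with time support in [0,1) whose upper half contains L in frequency.

```python
import numpy as np

def walsh(n, M):
    """Samples of the Paley-ordered Walsh function w_n at the points s/M (M a power of 2)."""
    k = M.bit_length() - 1
    s = np.arange(M)
    exponent = np.zeros(M, dtype=int)
    for j in range(k):
        if (n >> j) & 1:
            exponent += (s >> (k - 1 - j)) & 1
    return 1 - 2 * (exponent % 2)

K = 4
N = 2 ** K

def wave_packet(j, m, n):
    """Unit vector sampling the packet of the tile (j, m, n)."""
    M = N >> j
    v = np.zeros(N)
    v[m * M:(m + 1) * M] = walsh(n, M) / np.sqrt(M)
    return v

def partial_sum_direct(f, L):
    """Projection of f onto the Walsh functions of index n < L (stacked unit squares)."""
    out = np.zeros(N)
    for n in range(L):
        w = walsh(n, N) / np.sqrt(N)
        out += (f @ w) * w
    return out

def lower_tiles(L):
    """Lower halves P_l of the bitiles P with I_P inside [0,1) and L in omega_{P_u}."""
    tiles = []
    for j in range(K):
        q = L >> j
        if q & 1:  # L sits in the upper half of a frequency interval of length 2^(j+1)
            tiles += [(j, m, q - 1) for m in range(2 ** j)]
    return tiles

def partial_sum_tiles(f, L):
    out = np.zeros(N)
    for tile in lower_tiles(L):
        wp = wave_packet(*tile)
        out += (f @ wp) * wp
    return out

rng = np.random.default_rng(0)
f = rng.standard_normal(N)
agree = all(
    np.allclose(partial_sum_direct(f, L), partial_sum_tiles(f, L)) for L in range(N)
)

print(agree)
print(lower_tiles(5))
```

For L = 5 the list contains one unit square at height [4,5) and four thin tiles of width 1/4 covering [0,4) in frequency: one scale for each nonzero binary digit of L.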

We are ultimately interested in estimating the size of ; recall the other important idea introduced in the first post of this series: **linearization of the supremum!** It means we rather want to prove an estimate

uniformly in the measurable function . That means we want to estimate

we introduce sets

so that we can write the last integral as

The sets contain exactly those points in such that as required. Now, the collection of bitiles with time support has nothing special about it; therefore we will work in general with the sums

where is *any* collection of bitiles in the Walsh phase-plane. Our estimates will need to be uniform in of course, but the freedom this gives is worth the trouble.
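To see the linearization at work, one can pick the maximising index pointwise and rebuild the maximal function from the bitile sum. This is a sketch under my own assumptions: partial sums use indices strictly below L, the selector is simply the argmax over L on a 16-point grid, and a bitile `(j, m, a)` has time interval [m 2^-j, (m+1) 2^-j) and frequency interval [2a 2^j, (2a+2) 2^j).

```python
import numpy as np

def walsh(n, M):
    """Samples of the Paley-ordered Walsh function w_n at the points s/M (M a power of 2)."""
    k = M.bit_length() - 1
    s = np.arange(M)
    exponent = np.zeros(M, dtype=int)
    for j in range(k):
        if (n >> j) & 1:
            exponent += (s >> (k - 1 - j)) & 1
    return 1 - 2 * (exponent % 2)

K = 4
N = 2 ** K

def wave_packet(j, m, n):
    """Unit vector sampling the packet of the tile with time index m, frequency index n, scale j."""
    M = N >> j
    v = np.zeros(N)
    v[m * M:(m + 1) * M] = walsh(n, M) / np.sqrt(M)
    return v

rng = np.random.default_rng(1)
f = rng.standard_normal(N)

# Row L of `partial` holds the partial sum with indices n < L, for L = 0, ..., N-1.
Wn = np.array([walsh(n, N) / np.sqrt(N) for n in range(N)])
terms = (Wn @ f)[:, None] * Wn
partial = np.vstack([np.zeros(N), np.cumsum(terms, axis=0)[:-1]])

carleson = np.abs(partial).max(axis=0)   # sup over L of |W_L f(x)|
Nx = np.abs(partial).argmax(axis=0)      # a choice of x -> N(x) attaining the sup

# Linearized operator: sum over bitiles P of <f, w_{P_l}> w_{P_l} 1_{E_P},
# where E_P = {x in I_P : N(x) in omega_{P_u}}.
x = np.arange(N)
linearized = np.zeros(N)
for j in range(K):
    M = N >> j
    for m in range(2 ** j):
        in_I = (x // M) == m
        for a in range(M // 2):
            wp = wave_packet(j, m, 2 * a)             # lower half P_l
            E_P = in_I & ((Nx >> j) == 2 * a + 1)     # N(x) in the upper half
            linearized += (f @ wp) * wp * E_P

matches = np.allclose(np.abs(linearized), carleson)
print(matches)
```

The point of the exercise is that the sum now runs over all bitiles with no reference to a fixed L; the dependence on the selector sits entirely in the sets E_P.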

**4. Trees of tiles**

In his seminal proof, Fefferman introduced a particular structure on tiles: the **Tree**. The definition requires a partial ordering on the bi-tiles (or tiles), which is as follows: given tiles , it is

if and only if

(thus they intersect, and the one with smaller time support is the “smaller” one).

Then we say that a finite collection of bi-tiles is a **tree** with top bi-tile if and only if

One has the notions of **up-tree** and **down-tree** requiring respectively that and (so, the upper half of the collection is a tree, or the lower half). It is possible to decompose every tree in its up-tree and down-tree part (they share the top tile, but only that); for example, for the tree in figure one has the following decomposition:

I think of trees as clumps of narrow elongated tiles that intersect a single larger tile – the top tile, or root. Notice an up-tree is a more concentrated clump, as the upper half of any tile in the tree still has to intersect (the upper half of) the top tile, so that the center of the frequency support of any tile in the tree cannot be too far from that of the top tile . This is not necessarily true of a down-tree, as it can contain very narrow and tall tiles, with the frequency support potentially unbounded. The figure I’ve included should make it clearer.
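The order relation and the up/down splitting are simple enough to code directly. The sketch below assumes the standard definitions (the formulas are missing above): one rectangle is below another when its time interval is contained in the other's and its frequency interval contains the other's; a tree is a collection lying below a top bitile; in an up-tree the upper halves lie below the upper half of the top, and similarly for a down-tree. The class `Rect` and the example bitiles are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class Rect:
    I: tuple       # time interval (left, right)
    omega: tuple   # frequency interval (left, right)

def contains(outer, inner):
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def leq(P, Q):
    """P <= Q: smaller time support, larger frequency support."""
    return contains(Q.I, P.I) and contains(P.omega, Q.omega)

def bitile(j, m, a):
    """Bitile [m 2^-j, (m+1) 2^-j) x [2a 2^j, (2a+2) 2^j), for j >= 0."""
    I = (Fraction(m, 2 ** j), Fraction(m + 1, 2 ** j))
    return Rect(I, (2 * a * 2 ** j, (2 * a + 2) * 2 ** j))

def upper(P):
    mid = (P.omega[0] + P.omega[1]) // 2   # exact: the endpoints sum to an even integer
    return Rect(P.I, (mid, P.omega[1]))

def lower(P):
    mid = (P.omega[0] + P.omega[1]) // 2
    return Rect(P.I, (P.omega[0], mid))

def is_tree(coll, top):
    return all(leq(P, top) for P in coll)

def split_tree(coll, top):
    up = {P for P in coll if leq(upper(P), upper(top))}
    down = {P for P in coll if leq(lower(P), lower(top))}
    return up, down

top = bitile(0, 0, 1)                       # [0,1) x [2,4)
tree = {
    top,
    bitile(1, 0, 0), bitile(1, 1, 0),       # frequency [0,4): upper halves contain [3,4)
    bitile(2, 1, 0), bitile(2, 3, 0),       # frequency [0,8): lower halves contain [2,3)
}
up, down = split_tree(tree, top)

print(is_tree(tree, top), len(up), len(down), up & down == {top})
```

In this example each bitile other than the top lands in exactly one of the two parts, depending on which half of its frequency interval contains the frequency interval of the top.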

The useful lemma that follows highlights the pros of the up-tree structure and takes to a conclusion the idea I’ve just sketched:

Lemma 2. Let be an up-tree with top tile . Then for every (up-)leaf we can write in terms of as follows:

with ‘s some signs that depend on the tiles and the Haar function supported on the interval ( normalized as is customary).

The proof can be obtained just by unwinding the definitions, but that doesn’t say much about it. Here’s why we should expect something like the above to hold. being a leaf means just that , and in particular that they intersect (and is taller than , and away from the zero frequency as well – this is important) both in time and frequency. Thus it should be

(notation: is the center of the interval ; we also ignore the subscripts to ease reading), which means that the ratios should be comparable, namely

(for a down-tree you could only say .)

You can think of as the “number of times oscillates” (a frequency indeed). By the above we expect to oscillate about times. We also know it is supported in . Think of it in a purely probabilistic perspective: oscillates about times within an interval of length , is a tile roughly in the same location on the phase plane but with smaller support, of size . We can naively expect is almost a restriction of on support , so that it will capture only a fraction of about of the oscillations of . The lemma is saying that this perspective is not completely wrong indeed.

By definition of wave packet it is

so we expect could be expressed by (ignoring normalization factors)

In general it is , and we can pretend that “morally” this holds for negative as well (it does if the resulting index is integer), so that since the lengths are powers of 2

This is exactly what I said above: naively one expects to be just ! By the above lemma it is almost so, the main difference being that we have instead of , which has the effect of changing the sign on the right half of the support, and in place of . As a last remark on the above equality, the lemma says that the factor accounts for all oscillation in an up-tree.
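Here is a numerical check of one concrete form of the identity, which I believe is consistent with the discussion but is my own reading since the formula is not reproduced here: for a leaf of an up-tree, the packet of the lower half equals, up to a sign, the packet of the upper half of the top times the Haar function of the leaf's time interval (times the square root of the length of the top's time interval, which is 1 below). The top bitile [0,1) x [4,6) and the grid of 16 points are arbitrary choices.

```python
import numpy as np

def walsh(n, M):
    """Samples of the Paley-ordered Walsh function w_n at the points s/M (M a power of 2)."""
    k = M.bit_length() - 1
    s = np.arange(M)
    exponent = np.zeros(M, dtype=int)
    for j in range(k):
        if (n >> j) & 1:
            exponent += (s >> (k - 1 - j)) & 1
    return 1 - 2 * (exponent % 2)

K = 4
N = 2 ** K

def packet_values(j, m, n):
    """Grid values of the L^2([0,1))-normalised packet of [m 2^-j,(m+1) 2^-j) x [n 2^j,(n+1) 2^j)."""
    M = N >> j
    v = np.zeros(N)
    v[m * M:(m + 1) * M] = 2 ** (j / 2) * walsh(n, M)
    return v

def haar_values(j, m):
    """Grid values of the L^2-normalised Haar function on [m 2^-j, (m+1) 2^-j)."""
    M = N >> j
    h = np.zeros(N)
    h[m * M:m * M + M // 2] = 2 ** (j / 2)
    h[m * M + M // 2:(m + 1) * M] = -(2 ** (j / 2))
    return h

n_top = 5                                   # upper half of the top: [0,1) x [5,6)
top_upper = packet_values(0, 0, n_top)

signs = {}
for j in range(K):
    q = n_top >> j
    if not q & 1:
        continue                            # no bitile at this scale has n_top in its upper half
    for m in range(2 ** j):
        leaf_lower = packet_values(j, m, q - 1)
        candidate = top_upper * haar_values(j, m)
        if np.allclose(leaf_lower, candidate):
            signs[(j, m)] = 1
        elif np.allclose(leaf_lower, -candidate):
            signs[(j, m)] = -1

print(signs)
```

At scale j = 2 the lower packets are just normalised indicators of the quarter intervals, so all the oscillation of the top packet inside each quarter is absorbed by the Haar function, and what is left is a constant sign.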

So, the notion of tree has some advantages, namely that the wave packets are all essentially restrictions of the top tile’s one, and calculations seem to be more easily carried out on trees (judging by the notes). But I don’t find this answer satisfactory enough: *why should we care about trees in the first place?*

A better answer is that trees are associated to operators that are Calderón-Zygmund-like. We’ve considered orthogonal projections of the form

and it is therefore natural to consider weighted projections of the form

for some weights . These operators are bounded on for weights, as it can be easily verified that

Now, as we did for the euclidean Fourier transform, this can be rewritten as

where is the convolution w.r.t. . Now, suppose is an up-tree, so that the above lemma applies, and all the wave packets simplify to

Forget about the signs ‘s since we can incorporate them into the weights; we now have as a kernel of the product

where the only phase factor is and the terms in the sum show very little oscillation: it is just a linear combination of Haar functions with support contained in . They correspond to the tiles of the form , which are sometimes called **lacunary tiles**. Moreover, the function has zero mean on and all the terms are supported in by definition of up-tree. So, the kernel is

This operator is thus the analogue in the Walsh setting of something like

which falls under the scope of Calderón-Zygmund theory. It is a particular case of the situation treated in

Ricci, Stein, “Harmonic analysis on nilpotent groups and singular integrals I: oscillatory integrals”, Jour. of Func. Analysis, 73, pg 179-194 (1987)

that is, kernels of the kind

with a polynomial and a homogeneous kernel of critical degree (the dimension) and zero average on the sphere. I was planning to write about that paper anyway (half of the blog entry is already there), so stay tuned if you are interested.

Summing up, the operator of weighted projection associated to an up-tree is a Calderón-Zygmund object, and as such we expect that standard techniques will apply to bound it properly (it is indeed so). For a down-tree we don’t have such a nice formula as for the up case, and indeed one can figure out that once the tiles in the down-tree get very narrow, their frequency must be ; we correspondingly expect that there’s no cancellation to exploit in these operators, and as a matter of fact this is indeed the case, as we will see in the following part of these notes.

Enough for today.