Home Math On a conjecture of Marton

### On a conjecture of Marton

Tim Gowers, Ben Inexperienced, Freddie Manners, and I’ve simply uploaded to the arXiv our paper “On a conjecture of Marton“. This paper establishes a model of the infamous Polynomial Freiman–Ruzsa conjecture (first proposed by Katalin Marton):

Theorem 1 (Polynomial Freiman–Ruzsa conjecture) Let ${A subset {bf F}_2^n}$ be such that $A+A$. Then ${A}$ will be lined by at most ${2K^{12}}$ interprets of a subspace ${H}$ of ${{bf F}_2^n}$ of cardinality at most ${A}$.

The earlier greatest recognized end result in direction of this conjecture was by Konyagin (as communicated on this paper of Sanders), who obtained an identical end result however with ${2K^{12}}$ changed by ${exp(O_varepsilon(log^{3+varepsilon} K))}$ for any ${varepsilon>0}$ (assuming that say ${K geq 3/2}$ to keep away from some degeneracies as ${K}$ approaches ${1}$, which isn’t the troublesome case of the conjecture). The conjecture (with ${12}$ changed by an unspecified fixed ${C}$) has plenty of equal varieties; see this survey of Inexperienced, and these papers of Lovett and of Inexperienced and myself for some examples; specifically, as mentioned within the latter two references, the constants within the inverse ${U^3({bf F}_2^n)}$ theorem are actually polynomial in nature (though we didn’t attempt to optimize the fixed).

The exponent ${12}$ right here was the product of numerous optimizations to the argument (our unique exponent right here was nearer to ${1000}$), however will be improved even additional with further effort (our present argument, as an example, permits one to switch it with ${7+sqrt{17} = 11.123dots}$, however we determined to state our end result utilizing integer exponents as an alternative).

On this paper we’ll focus solely on the attribute ${2}$ case (so we might be cavalier in figuring out addition and subtraction), however in a followup paper we’ll set up comparable ends in different finite traits.

A lot of the prior progress on this form of end result has proceeded through Fourier evaluation. Maybe surprisingly, our strategy makes use of no Fourier evaluation in any respect, being carried out as an alternative totally in “bodily house”. Broadly talking, it follows a pure technique, which is to induct on the doubling fixed $A$. Certainly, suppose as an example that one may present that each set ${A}$ of doubling fixed ${K}$ was “commensurate” in some sense to a set ${A'}$ of doubling fixed at most ${K^{0.99}}$. One measure of commensurability, as an example, is likely to be the Ruzsa distance ${log frac{|A|^{1/2} |A'|^{1/2}}}$, which one would possibly hope to regulate by ${O(log K)}$. Then one may iterate this process till doubling fixed dropped beneath say ${3/2}$, at which level the conjecture is thought to carry (there’s an elementary argument that if ${A}$ has doubling fixed lower than ${3/2}$, then ${A+A}$ is in truth a subspace of ${{bf F}_2^n}$). One can then use a number of purposes of the Ruzsa triangle inequality

$displaystyle log frac{|A|^{1/2} |C|^{1/2}} leq log fracA+B{|A|^{1/2} |B|^{1/2}} + log frac{|B|^{1/2} |C|^{1/2}}$

to conclude (the truth that we scale back ${K}$ to ${K^{0.99}}$ implies that the varied Ruzsa distances that should be summed are managed by a convergent geometric collection).

There are a selection of doable methods to attempt to “enhance” a set ${A}$ of not too massive doubling by changing it with a commensurate set of higher doubling. We notice two specific potential enhancements:

Sadly, there are units ${A}$ the place neither of the above two operations (i), (ii) considerably improves the doubling fixed. For example, if ${A}$ is a random density ${1/sqrt{K}}$ subset of ${sqrt{K}}$ random interprets of a medium-sized subspace ${H}$, one can examine that the doubling fixed stays near ${K}$ if one applies both operation (i) or operation (ii). However on this case these operations don’t really worsen the doubling fixed a lot both, and by making use of some mixture of (i) and (ii) (both intersecting ${A+A}$ with a translate, or taking a sumset of ${A cap (A+h)}$ with itself) one can begin reducing the doubling fixed once more.

This begins to recommend a possible technique: present that at the least one of many operations (i) or (ii) will enhance the doubling fixed, or at the least not worsen it an excessive amount of; and within the latter case, carry out some extra sophisticated operation to find the specified doubling fixed enchancment.

An indication that this technique might need an opportunity of working is offered by the next heuristic argument. If ${A}$ has doubling fixed ${K}$, then the Cartesian product ${A times A}$ has doubling fixed ${K^2}$. Then again, through the use of the projection map ${pi: {bf F}_2^n times {bf F}_2^n rightarrow {bf F}_2^n}$ outlined by ${pi(x,y) := x+y}$, we see that ${A times A}$ initiatives to ${pi(A times A) = A+A}$, with fibres ${pi^{-1}({h})}$ being primarily a duplicate of ${A cap (A+h)}$. So, morally, ${A times A}$ additionally behaves like a “skew product” of ${A+A}$ and the fibres ${A cap (A+h)}$, which suggests (non-rigorously) that the doubling fixed ${K^2}$ of ${A times A}$ can also be one thing just like the doubling fixed of ${A + A}$, occasions the doubling fixed of a typical fibre ${A cap (A+h)}$. This could indicate that at the least one in all ${A +A}$ and ${A cap (A+h)}$ would have doubling fixed at most ${K}$, and thus that at the least one in all operations (i), (ii) wouldn’t worsen the doubling fixed.

Sadly, this argument doesn’t appear to be simply made rigorous utilizing the normal doubling fixed; even simply exhibiting the considerably weaker assertion that ${A+A}$ has doubling fixed at most ${K^2}$ is unclear (and maybe even false). Nonetheless, it seems (as mentioned in this latest paper of myself with Inexperienced and Manners) that issues are a lot better. Right here, the analogue of a subset ${A}$ in ${{bf F}_2^n}$ is a random variable ${X}$ taking values in ${{bf F}_2^n}$, and the analogue of the (logarithmic) doubling fixed ${log fracA+AA}$ is the entropic doubling fixed ${d(X;X) := {bf H}(X_1+X_2)-{bf H}(X)}$, the place ${X_1,X_2}$ are unbiased copies of ${X}$. If ${X}$ is a random variable in some additive group ${G}$ and ${pi: G rightarrow H}$ is a homomorphism, one then has what we name the fibring inequality

$displaystyle d(X;X) geq d(pi(X);pi(X)) + d(X|pi(X); X|pi(X)),$

the place the conditional doubling fixed $pi(X))$ is outlined as

$displaystyle d(X|pi(X); X|pi(X)) = {bf H}(X_1 + X_2 | pi(X_1), pi(X_2)) - {bf H}( X | pi(X) ).$

Certainly, from the chain rule for Shannon entropy one has

$displaystyle {bf H}(X) = {bf H}(pi(X)) + {bf H}(X|pi(X))$

and

$displaystyle {bf H}(X_1+X_2) = {bf H}(pi(X_1) + pi(X_2)) + {bf H}(X_1 + X_2|pi(X_1) + pi(X_2))$

whereas from the non-negativity of conditional mutual data one has

$displaystyle {bf H}(X_1 + X_2|pi(X_1) + pi(X_2)) geq {bf H}(X_1 + X_2|pi(X_1), pi(X_2))$

and it’s a simple matter to mix all these identities and inequalities to acquire the declare.

Making use of this inequality with ${X}$ changed by two unbiased copies ${(X_1,X_2)}$ of itself, and utilizing the addition map ${(x,y) mapsto x+y}$ for ${pi}$, we acquire specifically that

$displaystyle 2 d(X;X) geq d(X_1+X_2; X_1+X_2) + d(X_1,X_2|X_1+X_2; X_1,X_2|X_1+X_2)$

or (since ${X_2}$ is set by ${X_1}$ as soon as one fixes ${X_1+X_2}$)

$displaystyle 2 d(X;X) geq d(X_1+X_2; X_1+X_2) + d(X_1|X_1+X_2; X_1|X_1+X_2).$

So if ${d(X;X) = log K}$, then at the least one in all ${d(X_1+X_2; X_1+X_2)}$ or $X_1+X_2; X_1$ might be lower than or equal to ${log K}$. That is the entropy analogue of at the least one in all (i) or (ii) bettering, or at the least not degrading the doubling fixed, though there are some minor technicalities involving how one offers with the conditioning to ${X_1+X_2}$ within the second time period $X_1+X_2; X_1$ that we are going to gloss over right here (one can pigeonhole the cases of ${X_1}$ to totally different occasions ${X_1+X_2=x}$, ${X_1+X_2=x'}$, and “depolarise” the induction speculation to take care of distances ${d(X;Y)}$ between pairs of random variables ${X,Y}$ that don’t essentially have the identical distribution). Moreover, we will even calculate the defect within the above inequality: a cautious inspection of the above argument ultimately reveals that

$displaystyle 2 d(X;X) = d(X_1+X_2; X_1+X_2) + d(X_1|X_1+X_2; X_1|X_1+X_2)$

$displaystyle + {bf I}( X_1 + X_2 : X_1 + X_3 | X_1 + X_2 + X_3 + X_4)$

the place we now take 4 unbiased copies ${X_1,X_2,X_3,X_4}$. This leads (modulo some technicalities) to the next attention-grabbing conclusion: if neither (i) nor (ii) results in an enchancment within the entropic doubling fixed, then ${X_1+X_2}$ and ${X_2+X_3}$ are conditionally unbiased relative to ${X_1+X_2+X_3+X_4}$. This case (or an approximation to this example) is what we seek advice from within the paper because the “endgame”.

A model of this endgame conclusion is in truth legitimate in any attribute. However in attribute ${2}$, we will benefit from the identification

$displaystyle (X_1+X_2) + (X_2+X_3) = X_1 + X_3.$

Conditioning on ${X_1+X_2+X_3+X_4}$, and utilizing symmetry we now conclude that if we’re within the endgame precisely (in order that the mutual data is zero), then the unbiased sum of two copies of $X_1+X_2+X_3+X_4)$ has precisely the identical distribution; specifically, the entropic doubling fixed right here is zero, which is definitely a discount within the doubling fixed.

To take care of the state of affairs the place the conditional mutual data is small however not utterly zero, we’ve to make use of an entropic model of the Balog-Szemeredi-Gowers lemma, however fortuitously this was already labored out in an outdated paper of mine (though with a view to optimise the ultimate fixed, we ended up utilizing a slight variant of that lemma).

I’m planning to formalize this paper within the proof assistant language Lean4; extra particulars to observe.