
### Bounding sums or integrals of non-negative quantities

A common task in analysis is to obtain bounds on sums

$\displaystyle \sum_{n \in A} f(n)$

or integrals

$\displaystyle \int_A f(x)\ dx$

where ${A}$ is some simple region (such as an interval) in one or more dimensions, and ${f}$ is an explicit (and elementary) non-negative expression involving one or more variables (such as ${n}$ or ${x}$), and possibly also some additional parameters. Often, one would be content with an order of magnitude upper bound such as

$\displaystyle \sum_{n \in A} f(n) \ll X$

or

$\displaystyle \int_A f(x)\ dx \ll X$

where we use ${X \ll Y}$ (or ${Y \gg X}$ or ${X = O(Y)}$) to denote the bound ${|X| \leq CY}$ for some constant ${C}$; sometimes one wishes to also obtain the matching lower bound, thus obtaining

$\displaystyle \sum_{n \in A} f(n) \asymp X$

or

$\displaystyle \int_A f(x)\ dx \asymp X$

where ${X \asymp Y}$ is synonymous with ${X \ll Y \ll X}$. Finally, one may wish to obtain a more precise bound, such as

$\displaystyle \sum_{n \in A} f(n) = (1+o(1)) X$

where ${o(1)}$ is a quantity that goes to zero as the parameters of the problem go to infinity (or some other limit). (For a deeper dive into asymptotic notation in general, see this previous blog post.)

Here are some typical examples of such estimation problems, drawn from recent questions on MathOverflow:

Compared to other estimation tasks, such as that of controlling oscillatory integrals, exponential sums, singular integrals, or expressions involving one or more unknown functions (that are only known to lie in some function spaces, such as an ${L^p}$ space), high-dimensional geometry (or alternatively, large numbers of random variables), or number-theoretic structures (such as the primes), estimation of sums or integrals of non-negative elementary expressions is a relatively straightforward task, and can be accomplished by a variety of methods. The art of obtaining such estimates is typically not explicitly taught in textbooks, other than through some examples and exercises; it is usually picked up by analysts (or those working in adjacent areas, such as PDE, combinatorics, or theoretical computer science) as graduate students, while they work through their thesis or their first few papers in the subject.

Somewhat in the spirit of this previous post on analysis problem solving strategies, I am going to try here to collect some general principles and techniques that I have found useful for these sorts of problems. As with the previous post, I hope this will be something of a living document, and encourage others to add their own tips or suggestions in the comments.

— 1. Asymptotic arithmetic —

Asymptotic notation is designed so that many of the usual rules of algebra and inequality manipulation continue to hold, with the caveat that one has to be careful if subtraction or division is involved. For instance, if one knows that ${A \ll X}$ and ${B \ll Y}$, then one can immediately conclude that ${A + B \ll X+Y}$ and ${AB \ll XY}$, even if ${A,B}$ are negative (note that the notation ${A \ll X}$ or ${B \ll Y}$ automatically forces ${X,Y}$ to be non-negative). Equivalently, we have the rules

$\displaystyle O(X) + O(Y) = O(X+Y); \quad O(X) \cdot O(Y) = O(XY)$

and more generally we have the triangle inequality

$\displaystyle \sum_\alpha O(X_\alpha) = O( \sum_\alpha X_\alpha ).$

(Again, we stress that this sort of rule implicitly requires the ${X_\alpha}$ to be non-negative. As a rule of thumb, if your calculations have arrived at a situation where a signed or oscillating sum or integral appears inside the big-O notation, or on the right-hand side of an estimate, without being “protected” by absolute value signs, then you have probably made a serious error in your calculations.)

Another rule of inequalities that is inherited by asymptotic notation is that if one has two bounds

$\displaystyle A \ll X; \quad A \ll Y \ \ \ \ \ (1)$

for the same quantity ${A}$, then one can combine them into the unified asymptotic bound

$\displaystyle A \ll \min(X, Y). \ \ \ \ \ (2)$

This is an example of a “free move”: a replacement of bounds that does not lose any of the strength of the original bounds, since of course (2) implies (1). In contrast, other ways to combine the two bounds (1), such as taking the geometric mean

$\displaystyle A \ll X^{1/2} Y^{1/2}, \ \ \ \ \ (3)$

while often convenient, are not “free”: the bounds (1) imply the averaged bound (3), but the bound (3) does not imply (1). On the other hand, the inequality (2), while it does not concede any logical strength, can require more calculation to work with, often because one ends up splitting up cases such as ${X \ll Y}$ and ${X \gg Y}$ in order to simplify the minimum. So in practice, when trying to establish an estimate, one often starts with conservative bounds such as (2) in order to maximize one's chances of getting any proof (no matter how messy) of the desired estimate, and only after such a proof is found does one look for more elegant approaches using less efficient bounds such as (3).

For instance, suppose one wanted to show that the sum

$\displaystyle \sum_{n=-\infty}^\infty \frac{2^n}{(1+n^2) (1+2^{2n})}$

was convergent. Lower bounding the denominator term ${1+2^{2n}}$ by ${1}$ or by ${2^{2n}}$, one obtains the bounds

$\displaystyle \frac{2^n}{(1+n^2) (1+2^{2n})} \ll \frac{2^n}{1+n^2} \ \ \ \ \ (4)$

and also

$\displaystyle \frac{2^n}{(1+n^2) (1+2^{2n})} \ll \frac{2^n}{(1+n^2) 2^{2n}} = \frac{2^{-n}}{1+n^2} \ \ \ \ \ (5)$

so by applying (2) we obtain the unified bound

$\displaystyle \frac{2^n}{(1+n^2) (1+2^{2n})} \ll \frac{\min(2^n,2^{-n})}{1+n^2}.$

To deal with this bound, we can split into the two contributions ${n \geq 0}$, where the minimum is ${2^{-n}}$, and ${n < 0}$, where the minimum is ${2^n}$. In the former case we see (from the ratio test, for instance) that the sum

$\displaystyle \sum_{n=0}^\infty \frac{2^{-n}}{1+n^2}$

is absolutely convergent, and in the latter case we see that the sum

$\displaystyle \sum_{n=-\infty}^{-1} \frac{2^{n}}{1+n^2}$

is also absolutely convergent, so the entire sum is absolutely convergent. But once one has this argument, one can try to streamline it, for instance by taking the geometric mean of (4), (5) rather than the minimum to obtain the weaker bound

$\displaystyle \frac{2^n}{(1+n^2) (1+2^{2n})} \ll \frac{1}{1+n^2} \ \ \ \ \ (6)$

and now one can conclude without decomposition just by observing the absolute convergence of the doubly infinite sum ${\sum_{n=-\infty}^\infty \frac{1}{1+n^2}}$. This is a less “efficient” estimate, because one has conceded a lot of the decay in the summand by using (6) (the summand was exponentially decaying in ${n}$, but is now only polynomially decaying), but it is still sufficient for the purpose of establishing absolute convergence.
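As a numerical sanity check of the bounds (4), (5) and (6), here is a minimal Python sketch. The helper names `summand` and `partial_sum` are illustrative choices, not part of the original discussion. It confirms on a range of ${n}$ that the summand sits below the minimum bound, which in turn sits below the geometric mean bound, and that the symmetric partial sums settle down quickly.

```python
def summand(n: int) -> float:
    """The term 2^n / ((1 + n^2)(1 + 2^(2n))), rewritten to avoid overflow."""
    # 2^n / (1 + 2^(2n)) = 1 / (2^n + 2^(-n)), which is symmetric in n.
    return 1.0 / ((1 + n * n) * (2.0 ** n + 2.0 ** (-n)))

def partial_sum(N: int) -> float:
    """Sum of the summand over -N <= n <= N."""
    return sum(summand(n) for n in range(-N, N + 1))

for n in range(-30, 31):
    sharp = min(2.0 ** n, 2.0 ** (-n)) / (1 + n * n)   # (4) and (5) combined via (2)
    crude = 1.0 / (1 + n * n)                           # the weaker bound (6)
    assert summand(n) <= sharp <= crude

print(partial_sum(10), partial_sum(20), partial_sum(40))
```

The check does not prove convergence, of course; it only illustrates that the weaker bound (6) gives up decay without invalidating the comparison.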

One of the key advantages of dealing with order of magnitude estimates, as opposed to sharp inequalities, is that the arithmetic becomes tropical. More explicitly, we have the important rule

$\displaystyle X + Y \asymp \max(X,Y)$

whenever ${X,Y}$ are non-negative, since we clearly have

$\displaystyle \max(X,Y) \leq X+Y \leq 2 \max(X,Y).$

In particular, if ${Y \leq X}$, then ${O(X) + O(Y) = O(X)}$. That is to say, given two orders of magnitude, any term ${O(Y)}$ of equal or lower order to a “main term” ${O(X)}$ can be discarded. This is a very useful rule to keep in mind when trying to estimate sums or integrals, as it allows one to discard many terms that are not contributing to the final answer. It also sets up the fundamental divide and conquer strategy for estimation: if one wants to prove a bound such as ${A \ll X}$, it will suffice to obtain a decomposition

$\displaystyle A = A_1 + \dots + A_k$

or at least an upper bound

$\displaystyle A \ll A_1 + \dots + A_k$

of ${A}$ by some bounded number of components ${A_1,\dots,A_k}$, and establish the bounds ${A_1 \ll X, \dots, A_k \ll X}$ separately. Typically the ${A_1,\dots,A_k}$ will be (morally at least) smaller than the original quantity ${A}$ – for instance, if ${A}$ is a sum of non-negative quantities, each of the ${A_i}$ might be a subsum of those same quantities – which means that such a decomposition is a “free move”, in the sense that it does not risk making the problem harder. (This is because, if the original bound ${A \ll X}$ is to be true, each of the new objectives ${A_1 \ll X, \dots, A_k \ll X}$ must also be true, and so the decomposition can only make the problem logically easier, not harder.) The only costs to such a decomposition are that your proofs might be ${k}$ times longer, as you may be repeating the same arguments ${k}$ times, and that the implied constants in the ${A_1 \ll X, \dots, A_k \ll X}$ bounds may be worse than the implied constant in the original ${A \ll X}$ bound. However, in many cases these costs are well worth the benefits of being able to simplify the problem into smaller pieces. As mentioned above, once one successfully executes a divide and conquer strategy, one can go back and try to reduce the number of decompositions, for instance by unifying components that are treated by similar methods, or by replacing strong but unwieldy estimates with weaker, but more convenient estimates.

The above divide and conquer strategy does not directly apply when one is decomposing into an unbounded number of pieces ${A_j}$, ${j=1,2,\dots}$. In such cases, one needs an additional gain in the index ${j}$ that is summable in ${j}$ in order to conclude. For instance, if one wants to establish a bound of the form ${A \ll X}$, and one has located a decomposition or upper bound

$\displaystyle A \ll \sum_{j=1}^\infty A_j$

that looks promising for the problem, then it would suffice to obtain exponentially decaying bounds such as

$\displaystyle A_j \ll 2^{-cj} X$

for all ${j \geq 1}$ and some constant ${c>0}$, since this would imply

$\displaystyle A \ll \sum_{j=1}^\infty 2^{-cj} X \ll X \ \ \ \ \ (7)$

thanks to the geometric series formula. (Here it is important that the implied constants in the asymptotic notation are uniform in ${j}$; a ${j}$-dependent bound such as ${A_j \ll_j 2^{-cj} X}$ would be useless for this application, as then the growth of the implied constant in ${j}$ could overwhelm the exponential decay in the ${2^{-cj}}$ factor.) Exponential decay is in fact overkill; polynomial decay such as

$\displaystyle A_j \ll \frac{X}{j^{1+c}}$

would already be sufficient, although harmonic decay such as

$\displaystyle A_j \ll \frac{X}{j} \ \ \ \ \ (8)$

is not quite enough (the sum ${\sum_{j=1}^\infty \frac{1}{j}}$ diverges logarithmically), although in many such situations one could still try to salvage the bound by working a lot harder to squeeze some additional logarithmic factors out of one's estimates. For instance, if one can improve (8) to

$\displaystyle A_j \ll \frac{X}{j \log^{1+c} j}$

for all ${j \geq 2}$ and some constant ${c>0}$, then one can again conclude, since (by the integral test) the sum ${\sum_{j=2}^\infty \frac{1}{j\log^{1+c} j}}$ converges (and one can treat the ${j=1}$ term separately if one already has (8)).

Sometimes, when trying to prove an estimate such as ${A \ll X}$, one has identified a promising decomposition with an unbounded number of terms

$\displaystyle A \ll \sum_{j=1}^J A_j$

(where ${J}$ is finite but unbounded) but is unsure of how to proceed next. Often the next thing to do is to study the extreme terms ${A_1}$ and ${A_J}$ of this decomposition, and first try to establish the (presumably simpler) tasks of showing that ${A_1 \ll X}$ and ${A_J \ll X}$. Often once one does so, it becomes clear how to combine the treatments of the two extreme cases to also treat the intermediate cases, obtaining a bound ${A_j \ll X}$ for each individual term, leading to the inferior bound ${A \ll JX}$; this can then be used as a starting point to hunt for additional gains, such as the exponential or polynomial gains mentioned previously, that could be used to remove this loss of ${J}$. (There are more advanced techniques, such as those based on controlling moments such as the square function ${(\sum_{j=1}^J |A_j|^2)^{1/2}}$, or trying to understand the precise circumstances in which a “large values” scenario ${|A_j| \gg X}$ occurs, and how these scenarios interact with each other for different ${j}$, but these are beyond the scope of this post, as they are rarely needed when dealing with sums or integrals of elementary functions.)

— 1.1. Psychological distinctions between exact and asymptotic arithmetic —

The adoption of the “divide and conquer” strategy requires a certain mental shift from the “simplify, simplify” strategy that one is taught in high school algebra. In the latter strategy, one tries to collect terms in an expression to make them as short as possible, for instance by working with a common denominator, with the idea that unified and elegant-looking expressions are “simpler” than sprawling expressions with many terms. In contrast, the divide and conquer strategy is intentionally extremely willing to greatly increase the total length of the expressions to be estimated, so long as each individual component of the expressions appears easier to estimate than the original one. Both strategies are still trying to reduce the original problem to a simpler problem (or collection of simpler sub-problems), but the metric by which one judges whether the problem has become simpler is rather different.

A related mental shift that one needs to adopt in analysis is to move away from the exact identities that are so prized in algebra (and in undergraduate calculus), as the precision they offer is often unnecessary and distracting for the task at hand, and they often fail to generalize to more complicated contexts in which exact identities are no longer available. As a simple example, consider the task of estimating the expression

$\displaystyle \int_0^a \frac{dx}{1+x^2}$

where ${a > 0}$ is a parameter. With a trigonometric substitution, one can evaluate this expression exactly as ${\mathrm{arctan}(a)}$; however, the presence of the arctangent can be inconvenient if one has to do further estimation tasks (for instance, if ${a}$ depends in a complicated fashion on other parameters, which one then also wants to sum or integrate over). Instead, by observing the trivial bounds

$\displaystyle \int_0^a \frac{dx}{1+x^2} \leq \int_0^a\ dx = a$

and

$\displaystyle \int_0^a \frac{dx}{1+x^2} \leq \int_0^\infty \frac{dx}{1+x^2} = \frac{\pi}{2}$

one can combine them using (2) to obtain the upper bound

$\displaystyle \int_0^a \frac{dx}{1+x^2} \leq \min( a, \frac{\pi}{2} ) \asymp \min(a,1)$

and similar arguments also give the matching lower bound, thus

$\displaystyle \int_0^a \frac{dx}{1+x^2} \asymp \min(a,1). \ \ \ \ \ (9)$

This bound, while cruder than the exact answer of ${\mathrm{arctan}(a)}$, is often good enough for many applications (particularly in situations where one is willing to concede constants in the bounds), and can be more tractable to work with than the exact answer. Furthermore, these arguments can be adapted without difficulty to treat similar expressions such as

$\displaystyle \int_0^a \frac{dx}{1+x^4}$

whose exact closed form (a combination of logarithms and arctangents) is considerably messier to work with than the arctangent.
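To see the estimate (9) in action, the following minimal Python sketch compares the exact value ${\mathrm{arctan}(a)}$ with the proxy ${\min(a,1)}$ over many orders of magnitude of ${a}$. The function names are illustrative only. The ratio should stay between two fixed positive constants, which is exactly what the ${\asymp}$ notation asserts.

```python
import math

def exact_integral(a: float) -> float:
    """Exact value of the integral of 1/(1+x^2) over [0, a]."""
    return math.atan(a)

def crude_bound(a: float) -> float:
    """Order-of-magnitude proxy min(a, 1) from (9)."""
    return min(a, 1.0)

# Ratios exact/crude for a = 10^-8, ..., 10^8.
ratios = [exact_integral(10.0 ** k) / crude_bound(10.0 ** k) for k in range(-8, 9)]
print(min(ratios), max(ratios))
```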

As a general rule, instead of relying exclusively on exact formulae, one should seek approximations that are valid up to the degree of precision that one seeks in the final estimate. For instance, suppose one wishes to establish the bound

$\displaystyle \sec(x) - \cos(x) = x^2 + O(x^3)$

for all sufficiently small ${x}$. If one was clinging to the exact identity mindset, one could try to look for some trigonometric identity to simplify the left-hand side exactly, but the quicker (and more robust) way to proceed is just to use Taylor expansion up to the specified accuracy ${O(x^3)}$ to obtain

$\displaystyle \cos(x) = 1 - \frac{x^2}{2} + O(x^3)$

which one can invert using the geometric series formula ${(1-y)^{-1} = 1 + y + y^2 + \dots}$ to obtain

$\displaystyle \sec(x) = 1 + \frac{x^2}{2} + O(x^3)$

from which the claim follows. (One could also have computed the Taylor expansion of ${\sec(x)}$ directly, but as this is a series that is usually not memorized, this can take a little more time than just computing it to the required accuracy as above.) Note that the notion of “specified accuracy” may have to be interpreted in a relative sense if one is planning to multiply or divide several estimates together. For instance, if one wishes to establish the bound

$\displaystyle \sin(x) \cos(x) = x + O(x^3)$

for small ${x}$, one needs an approximation

$\displaystyle \sin(x) = x + O(x^3)$

to the sine function that is accurate to order ${O(x^3)}$, but one only needs an approximation

$\displaystyle \cos(x) = 1 + O(x^2)$

to the cosine function that is accurate to order ${O(x^2)}$, because the cosine is to be multiplied by ${\sin(x)= O(x)}$. Here the key is to obtain estimates that have a relative error of ${O(x^2)}$, compared to the main term (which is ${1}$ for cosine, and ${x}$ for sine).
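The two Taylor-type claims above can be spot-checked numerically. The sketch below (with illustrative function names, and a hand-picked list of sample points) divides each error by ${x^3}$ and prints the result; if the ${O(x^3)}$ claims hold, these normalized errors should remain bounded as ${x}$ shrinks.

```python
import math

def sec_minus_cos_error(x: float) -> float:
    """(sec(x) - cos(x) - x^2) / x^3, which should stay bounded as x -> 0."""
    return (1.0 / math.cos(x) - math.cos(x) - x * x) / x ** 3

def sin_cos_error(x: float) -> float:
    """(sin(x) cos(x) - x) / x^3, which should stay bounded as x -> 0."""
    return (math.sin(x) * math.cos(x) - x) / x ** 3

xs = [0.5, 0.2, 0.1, 0.05, 0.02, 0.01]
for x in xs:
    print(x, sec_minus_cos_error(x), sin_cos_error(x))
```

Much smaller values of ${x}$ are deliberately avoided here, since floating point cancellation in the numerators would then dominate the quantity being measured.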

On the other hand, some exact formulae are still very useful, particularly if the end result of that formula is clean and tractable to work with (as opposed to involving somewhat exotic functions such as the arctangent). The geometric series formula, for instance, is an extremely handy exact formula, so much so that it is often desirable to control summands by a geometric series purely to use this formula (we already saw an example of this in (7)). Exact integral identities, such as

$\displaystyle \frac{1}{a} = \int_0^\infty e^{-at}\ dt$

or more generally

$\displaystyle \frac{\Gamma(s)}{a^s} = \int_0^\infty e^{-at} t^{s-1}\ dt$

for ${a,s>0}$ (where ${\Gamma}$ is the Gamma function) are also quite commonly used, and fundamental exact integration rules such as the change of variables formula, the Fubini-Tonelli theorem or integration by parts are all essential tools for an analyst trying to prove estimates. Because of this, it is often desirable to estimate a sum by an integral. The integral test is a classic example of this principle in action: a more quantitative version of this test is the bound

$\displaystyle \int_{a}^{b+1} f(t)\ dt \leq \sum_{n=a}^b f(n) \leq \int_{a-1}^b f(t)\ dt \ \ \ \ \ (10)$

whenever ${a \leq b}$ are integers and ${f: [a-1,b+1] \rightarrow {\bf R}}$ is monotone decreasing, or the closely related bound

$\displaystyle \sum_{a \leq n \leq b} f(n) = \int_a^b f(t)\ dt + O( |f(a)| + |f(b)| ) \ \ \ \ \ (11)$

whenever ${a \leq b}$ are reals and ${f: [a,b] \rightarrow {\bf R}}$ is monotone (either increasing or decreasing); see Lemma 2 of this previous post. Such bounds allow one to switch back and forth quite easily between sums and integrals as long as the summand or integrand behaves in a mostly monotone fashion (for instance, if it is monotone increasing on one portion of the domain and monotone decreasing on the other). For more precision, one could turn to more advanced relationships between sums and integrals, such as the Euler-Maclaurin formula or the Poisson summation formula, but these are beyond the scope of this post.
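The sandwich (10) is easy to test on a concrete monotone decreasing function. The sketch below uses ${f(t) = 1/t^2}$, chosen purely for illustration because its integral has a simple closed form; the helper names are likewise hypothetical.

```python
def f(t: float) -> float:
    """A monotone decreasing test function on (0, infinity)."""
    return 1.0 / (t * t)

def integral_f(lo: float, hi: float) -> float:
    """Exact integral of 1/t^2 over [lo, hi], for 0 < lo <= hi."""
    return 1.0 / lo - 1.0 / hi

def sandwich(a: int, b: int) -> tuple[float, float, float]:
    """Return (lower integral, sum, upper integral) from the bound (10).

    Requires 2 <= a <= b so that the interval [a-1, b+1] avoids t = 0.
    """
    total = sum(f(n) for n in range(a, b + 1))
    return integral_f(a, b + 1), total, integral_f(a - 1, b)

print(sandwich(2, 100))
```

The three printed numbers should appear in increasing order, with the sum trapped between the two integrals.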

Exercise 1 Suppose ${f: {\bf R} \rightarrow {\bf R}^+}$ obeys the quasi-monotonicity property ${f(x) \ll f(y)}$ whenever ${y-1 \leq x \leq y}$. Show that ${\int_a^{b-1} f(t)\ dt \ll \sum_{n=a}^b f(n) \ll \int_a^{b+1} f(t)\ dt}$ for any integers ${a < b}$.

Exercise 2 Use (11) to obtain the “cheap Stirling approximation”

$\displaystyle n! = \exp( n \log n - n + O(\log n) )$

for any natural number ${n \geq 2}$. (Hint: take logarithms to convert the product ${n! = 1 \times 2 \times \dots \times n}$ into a sum.)
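Without giving away the proof, one can at least check the cheap Stirling approximation numerically. The sketch below (an illustration only, using the standard library function `math.lgamma` to evaluate ${\log n!}$ without overflow) prints the ratio of the error term to ${\log n}$, which should remain bounded.

```python
import math

def stirling_error_ratio(n: int) -> float:
    """(log n! - (n log n - n)) / log n, which Exercise 2 predicts is O(1)."""
    log_factorial = math.lgamma(n + 1)      # log(n!) without overflow
    main_term = n * math.log(n) - n
    return (log_factorial - main_term) / math.log(n)

for n in [2, 10, 100, 10**4, 10**6]:
    print(n, stirling_error_ratio(n))
```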

With practice, you will be able to identify any term in a computation which is already “negligible” or “acceptable” in the sense that its contribution is always going to lead to an error that is smaller than the desired accuracy of the final estimate. One can then work “modulo” these negligible terms and discard them as soon as they appear. This can help remove a lot of clutter in one's arguments. For instance, if one wishes to establish an asymptotic of the form

$\displaystyle A = X + O(Y)$

for some main term ${X}$ and lower order error ${O(Y)}$, any component of ${A}$ that one can already identify to be of size ${O(Y)}$ is negligible and can be removed from ${A}$ “for free”. Conversely, it can be useful to add negligible terms to an expression, if it makes the expression easier to work with. For instance, suppose one wants to estimate the expression

$\displaystyle \sum_{n=1}^N \frac{1}{n^2}. \ \ \ \ \ (12)$

This is a partial sum for the zeta function

$\displaystyle \sum_{n=1}^\infty \frac{1}{n^2} = \zeta(2) = \frac{\pi^2}{6}$

so it can make sense to add and subtract the tail ${\sum_{n=N+1}^\infty \frac{1}{n^2}}$ to the expression (12) to rewrite it as

$\displaystyle \frac{\pi^2}{6} - \sum_{n=N+1}^\infty \frac{1}{n^2}.$

To deal with the tail, we switch from a sum to an integral using (10) to bound

$\displaystyle \sum_{n=N+1}^\infty \frac{1}{n^2} \ll \int_N^\infty \frac{1}{t^2}\ dt = \frac{1}{N}$

giving us the reasonably accurate bound

$\displaystyle \sum_{n=1}^N \frac{1}{n^2} = \frac{\pi^2}{6} - O(\frac{1}{N}).$

One can sharpen this approximation somewhat using (11) or the Euler–Maclaurin formula; we leave this to the reader.
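A short numerical experiment makes the ${O(1/N)}$ error term concrete. In the sketch below (helper names are illustrative), the gap between ${\pi^2/6}$ and the partial sum is multiplied by ${N}$; if the asymptotic above is correct, this rescaled gap should stay bounded.

```python
import math

def partial_zeta2(N: int) -> float:
    """Partial sum of 1/n^2 for 1 <= n <= N (summed smallest terms first)."""
    return sum(1.0 / (n * n) for n in range(N, 0, -1))

def scaled_tail(N: int) -> float:
    """N * (pi^2/6 - partial sum), bounded if the error is O(1/N)."""
    return N * (math.pi ** 2 / 6 - partial_zeta2(N))

for N in [10, 100, 1000, 10000]:
    print(N, scaled_tail(N))
```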

Another mental shift when switching from algebraic simplification problems to estimation problems is that one has to be prepared to let go of constraints in an expression that complicate the analysis. Suppose for instance we now wish to estimate the variant

$\displaystyle \sum_{1 \leq n \leq N, \hbox{ square-free}} \frac{1}{n^2}$

of (12), where we are now restricting ${n}$ to be square-free. An identity from analytic number theory (the Euler product identity) lets us calculate the exact sum

$\displaystyle \sum_{n \geq 1, \hbox{ square-free}} \frac{1}{n^2} = \frac{\zeta(2)}{\zeta(4)} = \frac{15}{\pi^2}$

so as before we can write the desired expression as

$\displaystyle \frac{15}{\pi^2} - \sum_{n > N, \hbox{ square-free}} \frac{1}{n^2}.$

Previously, we applied the integral test (10), but this time we cannot do so, because the restriction to square-free integers destroys the monotonicity. But we can simply remove this restriction:

$\displaystyle \sum_{n > N, \hbox{ square-free}} \frac{1}{n^2} \leq \sum_{n > N} \frac{1}{n^2}.$

Heuristically at least, this move only “costs us a constant”, since a positive fraction (${1/\zeta(2)= 6/\pi^2}$, in fact) of all integers are square-free. Now that this constraint has been removed, we can use the integral test as before and obtain the reasonably accurate asymptotic

$\displaystyle \sum_{1 \leq n \leq N, \hbox{ square-free}} \frac{1}{n^2} = \frac{15}{\pi^2} + O(\frac{1}{N}).$
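The same kind of experiment works for the square-free variant. The sketch below uses a naive trial-division test for square-freeness (an illustrative helper, adequate only for small ${N}$) and prints the gap to ${15/\pi^2}$ together with the gap rescaled by ${N}$, which should stay bounded.

```python
import math

def is_square_free(n: int) -> bool:
    """True if no square d^2 with d >= 2 divides n (trial division)."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True

def square_free_partial_sum(N: int) -> float:
    """Sum of 1/n^2 over square-free n with 1 <= n <= N."""
    return sum(1.0 / (n * n) for n in range(N, 0, -1) if is_square_free(n))

LIMIT = 15 / math.pi ** 2   # zeta(2)/zeta(4), the full square-free sum

for N in [10, 100, 1000]:
    gap = LIMIT - square_free_partial_sum(N)
    print(N, gap, N * gap)
```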

— 2. More on decomposition —

The way in which one decomposes a sum or integral such as ${\sum_{n \in A} f(n)}$ or ${\int_A f(x)\ dx}$ is often guided by the “geometry” of ${f}$, and in particular where ${f}$ is large or small (or whether various component terms in ${f}$ are large or small relative to each other). For instance, if ${f(x)}$ comes close to a maximum at some point ${x=x_0}$, then it may make sense to decompose based on the distance ${|x-x_0|}$ to ${x_0}$, or perhaps to treat the cases ${x \leq x_0}$ and ${x>x_0}$ separately. (Note that ${x_0}$ does not literally have to be the maximum in order for this to be a reasonable decomposition; if it is “within reasonable distance” of the maximum, this could still be a good move. As such, it is often not worthwhile to try to compute the maximum of ${f}$ exactly, especially if this exact formula ends up being too complicated to be useful.)

If an expression involves a distance ${|X-Y|}$ between two quantities ${X,Y}$, it is sometimes useful to split into the case ${|X| \leq |Y|/2}$ where ${X}$ is much smaller than ${Y}$ (so that ${|X-Y| \asymp |Y|}$), the case ${|Y| \leq |X|/2}$ where ${Y}$ is much smaller than ${X}$ (so that ${|X-Y| \asymp |X|}$), or the case when neither of the two previous cases apply (so that ${|X| \asymp |Y|}$). The factors of ${2}$ here are not of critical importance; the point is that in each of these three cases, one has some hope of simplifying the expression into something more tractable. For instance, suppose one wants to estimate the expression

$\displaystyle \int_{-\infty}^\infty \frac{dx}{(1+(x-a)^2) (1+(x-b)^2)} \ \ \ \ \ (13)$

in terms of the two real parameters ${a, b}$, which we will take to be distinct for sake of this discussion. This particular integral is simple enough that it can be evaluated exactly (for instance using contour integration techniques), but in the spirit of Principle 1, let us avoid doing so and instead try to decompose this expression into simpler pieces. A graph of the integrand reveals that it peaks when ${x}$ is near ${a}$ or near ${b}$. Inspired by this, one can decompose the region of integration into three pieces:

• (i) The region where ${|x-a| \leq \frac{|a-b|}{2}}$.
• (ii) The region where ${|x-b| \leq \frac{|a-b|}{2}}$.
• (iii) The region where ${|x-a|, |x-b| > \frac{|a-b|}{2}}$.

(This is not the only way to cut up the integral, but it will suffice. Often there is no “canonical” or “elegant” way to perform the decomposition; one should just try to find a decomposition that is convenient for the problem at hand.)

The reason why we want to perform such a decomposition is that in each of the three cases, one can simplify how the integrand depends on ${x}$. For instance, in region (i), we see from the triangle inequality that ${|x-b|}$ is now comparable to ${|a-b|}$, so that this contribution to (13) is comparable to

$\displaystyle \asymp \int_{|x-a| \leq |a-b|/2} \frac{dx}{(1+(x-a)^2) (1+(a-b)^2)}.$

Using a variant of (9), this expression is comparable to

$\displaystyle \asymp \min( 1, |a-b|/2) \frac{1}{1+(a-b)^2} \asymp \frac{\min(1, |a-b|)}{1+(a-b)^2}. \ \ \ \ \ (14)$

The contribution of region (ii) can be handled similarly, and is also comparable to (14). Finally, in region (iii), we see from the triangle inequality that ${|x-a|, |x-b|}$ are now comparable to each other, and so the contribution of this region is comparable to

$\displaystyle \asymp \int_{|x-a|, |x-b| > |a-b|/2} \frac{dx}{(1+(x-a)^2)^2}.$

Now that we have centered the integral around ${x=a}$, we will discard the ${|x-b| > |a-b|/2}$ constraint, upper bounding this integral by

$\displaystyle \int_{|x-a| > |a-b|/2} \frac{dx}{(1+(x-a)^2)^2}.$

On the one hand this integral is bounded by

$\displaystyle \int_{-\infty}^\infty \frac{dx}{(1+(x-a)^2)^2} = \int_{-\infty}^\infty \frac{dx}{(1+x^2)^2} \asymp 1$

and on the other hand we can bound

$\displaystyle \int_{|x-a| > |a-b|/2} \frac{dx}{(1+(x-a)^2)^2} \leq \int_{|x-a| > |a-b|/2} \frac{dx}{(x-a)^4} \asymp |a-b|^{-3}$

and so we can bound the contribution of (iii) by ${O( \min( 1, |a-b|^{-3} ))}$. Putting all this together, and dividing into the cases ${|a-b| \leq 1}$ and ${|a-b| > 1}$, one can soon obtain a total bound of ${O(\min( 1, |a-b|^{-2}))}$ for the entire integral (when ${|a-b| > 1}$, the contribution (14) of the two peaks, of size ${|a-b|^{-2}}$, dominates the ${|a-b|^{-3}}$ contribution of region (iii)). One can also adapt this argument to show that this bound is sharp up to constants, thus

$\displaystyle \int_{-\infty}^\infty \frac{dx}{(1+(x-a)^2) (1+(x-b)^2)} \asymp \min( 1, |a-b|^{-2}) \asymp \frac{1}{1+|a-b|^2}.$
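The final comparison for (13) can be checked numerically without evaluating the integral exactly. The following sketch approximates (13) by a midpoint rule on a long window (the window half-width and step size are ad hoc choices made for this illustration, and the function names are hypothetical) and divides by the model size ${1/(1+|a-b|^2)}$. The resulting ratio should stay between two fixed positive constants as the gap ${|a-b|}$ varies.

```python
def two_peak_integral(a: float, b: float, half_width: float = 200.0,
                      step: float = 0.02) -> float:
    """Midpoint-rule approximation to the integral (13) over a long window."""
    centre = 0.5 * (a + b)
    span = half_width + abs(a - b)          # window covers both peaks generously
    steps = int(2 * span / step)
    total = 0.0
    for i in range(steps):
        x = centre - span + (i + 0.5) * step
        total += step / ((1 + (x - a) ** 2) * (1 + (x - b) ** 2))
    return total

def ratio_to_model(a: float, b: float) -> float:
    """Integral (13) divided by the model size 1 / (1 + |a-b|^2)."""
    return two_peak_integral(a, b) * (1 + (a - b) ** 2)

for gap in [0.0, 0.5, 1.0, 5.0, 50.0]:
    print(gap, ratio_to_model(0.0, gap))
```

The truncated tails contribute very little here because the integrand decays like the inverse fourth power of ${x}$ far from both peaks.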

A powerful and common type of decomposition is dyadic decomposition. If the summand or integrand involves some quantity ${Q}$ in a key way, it is often useful to break up into dyadic regions such as ${2^{j-1} \leq Q < 2^{j}}$, so that ${Q \sim 2^j}$, and then sum over ${j}$. (One can tweak the dyadic range ${2^{j-1} \leq Q < 2^{j}}$ here with minor variants such as ${2^{j} < Q \leq 2^{j+1}}$, or replace the base ${2}$ by some other base, but these modifications mostly have a minor aesthetic impact on the arguments at best.) For instance, one could break up a sum

$\displaystyle \sum_{n=1}^{\infty} f(n) \ \ \ \ \ (15)$

into dyadic pieces

$\displaystyle \sum_{j=1}^\infty \sum_{2^{j-1} \leq n < 2^{j}} f(n)$

and then seek to estimate each dyadic block ${\sum_{2^{j-1} \leq n < 2^{j}} f(n)}$ separately (hoping to get some exponential or polynomial decay in ${j}$). The classical technique of Cauchy condensation is a basic example of this strategy. But one can also dyadically decompose quantities other than ${n}$. For instance one can perform a “vertical” dyadic decomposition (in contrast to the “horizontal” one just performed) by rewriting (15) as

$\displaystyle \sum_{k \in {\bf Z}} \sum_{n \geq 1: 2^{k-1} \leq f(n) < 2^k} f(n);$

since the summand ${f(n)}$ is ${\asymp 2^k}$, we may simplify this to

$\displaystyle \asymp \sum_{k \in {\bf Z}} 2^k \# \{ n \geq 1: 2^{k-1} \leq f(n) < 2^k\}.$

This now converts the problem of estimating the sum (15) to the more combinatorial problem of estimating the size of the dyadic level sets ${\{ n \geq 1: 2^{k-1} \leq f(n) < 2^k\}}$ for various ${k}$. In a similar spirit, we have

$\displaystyle \int_A f(x)\ dx \asymp \sum_{k \in {\bf Z}} 2^k | \{ x \in A: 2^{k-1} \leq f(x) < 2^k \}|$

where ${|E|}$ denotes the Lebesgue measure of a set ${E}$, and now we are faced with a geometric problem of estimating the measure of some explicit set. This allows one to use geometric intuition to solve the problem, instead of multivariable calculus:

Exercise 3 Let ${S}$ be a smooth compact submanifold of ${{\bf R}^d}$. Establish the bound

$\displaystyle \int_{B(0,C)} \frac{dx}{\varepsilon^2 + \mathrm{dist}(x,S)^2} \ll \varepsilon^{-1}$

for all ${0 < \varepsilon < C}$, where the implied constants are allowed to depend on ${C, d, S}$. (This can be accomplished either by a vertical dyadic decomposition, or a dyadic decomposition of the quantity ${\mathrm{dist}(x,S)}$.)

Exercise 4 Solve problem (ii) from the introduction to this post by dyadically decomposing in the ${d}$ variable.
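The “horizontal” dyadic decomposition of a sum such as (15) can be sketched in a few lines of Python. The summand ${f(n) = 1/n^2}$ and the helper name `dyadic_blocks` are assumptions made for illustration. Each dyadic block ${2^{j-1} \leq n < 2^j}$ contains ${2^{j-1}}$ terms of size about ${2^{-2j}}$, so after multiplying block ${j}$ by ${2^j}$ one expects values of bounded size, which is the exponential decay in ${j}$ that makes the blocks summable.

```python
def dyadic_blocks(f, J: int) -> list[float]:
    """Block sums of f(n) over 2^(j-1) <= n < 2^j, for j = 1, ..., J."""
    return [sum(f(n) for n in range(2 ** (j - 1), 2 ** j)) for j in range(1, J + 1)]

def f(n: int) -> float:
    """Illustrative summand; any non-negative f could be used instead."""
    return 1.0 / (n * n)

blocks = dyadic_blocks(f, 15)
# Rescale block j by 2^j to expose the geometric decay of the blocks.
normalised = [block * 2 ** j for j, block in enumerate(blocks, start=1)]
print(normalised)
# The blocks partition 1 <= n < 2^15, so they add up to the direct sum.
print(sum(blocks), sum(f(n) for n in range(1, 2 ** 15)))
```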

Remark 5 By such tools as (10), (11), or Exercise 1, one could convert the dyadic sums one obtains from dyadic decomposition into integral variants. However, if one wished, one could “cut out the middle-man” and work with continuous dyadic decompositions rather than discrete ones. Indeed, from the integral identity

$\displaystyle \int_0^\infty 1_{\lambda < Q \leq 2\lambda} \frac{d\lambda}{\lambda} = \log 2$

for any ${Q>0}$, together with the Fubini–Tonelli theorem, we obtain the continuous dyadic decomposition

$\displaystyle \sum_{n \in A} f(n) = \frac{1}{\log 2} \int_0^\infty \sum_{n \in A: \lambda < Q(n) \leq 2\lambda} f(n) \frac{d\lambda}{\lambda}$

for any quantity ${Q(n)}$ that is positive whenever ${f(n)}$ is positive. Similarly if we work with integrals ${\int_A f(x)\ dx}$ rather than sums. This version of dyadic decomposition is occasionally a little more convenient to work with, particularly if one then wants to perform various changes of variables in the ${\lambda}$ parameter which would be tricky to execute if this were a discrete variable.

— 3. Exponential weights —

Many sums involve expressions that are “exponentially large” or “exponentially small” in some parameter. A basic rule of thumb is that any quantity that is “exponentially small” will likely give a negligible contribution when compared against quantities that are not exponentially small. For instance, if an expression involves a term of the form ${e^{-Q}}$ for some non-negative quantity ${Q}$, which can be bounded on at least one portion of the domain of summation or integration, then one expects the region where ${Q}$ is bounded to provide the dominant contribution. For example, if one wishes to estimate the integral

$\displaystyle \int_0^\infty e^{-\varepsilon x} \frac{dx}{1+x}$

for some ${0 < \varepsilon < 1/2}$, this heuristic suggests that the dominant contribution should come from the region ${x = O(1/\varepsilon)}$, in which one can bound ${e^{-\varepsilon x}}$ simply by ${1}$ and obtain an upper bound of

$\displaystyle \ll \int_{x = O(1/\varepsilon)} \frac{dx}{1+x} \ll \log \frac{1}{\varepsilon}.$

To make such a heuristic precise, one can perform a dyadic decomposition in the exponential weight ${e^{-\varepsilon x}}$, or equivalently perform an additive decomposition in the exponent ${\varepsilon x}$, for instance writing

$\displaystyle \int_0^\infty e^{-\varepsilon x} \frac{dx}{1+x} = \sum_{j=1}^\infty \int_{j-1 \leq \varepsilon x < j} e^{-\varepsilon x} \frac{dx}{1+x}.$

Exercise 6 Use this decomposition to rigorously establish the bound

$\displaystyle \int_0^\infty e^{-\varepsilon x} \frac{dx}{1+x} \ll \log \frac{1}{\varepsilon}$

for any ${0 < \varepsilon < 1/2}$.
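As a numerical sanity check of the additive decomposition (not a substitute for the proof), here is a minimal Python sketch. On the piece $j-1 \leq \varepsilon x < j$ the weight $e^{-\varepsilon x}$ lies between $e^{-j}$ and $e^{-(j-1)}$, while the integral of $dx/(1+x)$ over that piece has a closed form. The function name `piece_bounds`, the truncation at 200 terms, and the sample values of $\varepsilon$ are assumptions made for illustration.

```python
import math

def piece_bounds(eps, terms=200):
    """Bracket the integral of exp(-eps*x)/(1+x) over [0, inf) piece by piece.

    The sums are truncated after `terms` pieces; the omitted tail is
    smaller than exp(-terms) and is ignored in this sketch.
    """
    lower = 0.0
    upper = 0.0
    for j in range(1, terms + 1):
        # exact integral of dx/(1+x) over (j-1)/eps <= x < j/eps
        log_piece = math.log((eps + j) / (eps + j - 1))
        upper += math.exp(-(j - 1)) * log_piece  # uses exp(-eps*x) <= exp(-(j-1))
        lower += math.exp(-j) * log_piece        # uses exp(-eps*x) >= exp(-j)
    return lower, upper

for eps in (0.4, 0.1, 1e-3, 1e-6):
    lo, up = piece_bounds(eps)
    print(eps, lo, up, up / math.log(1 / eps))
```

In these samples the ratio of the upper bracket to $\log \frac{1}{\varepsilon}$ stays bounded, and it is dominated by the first piece $j=1$, in line with the heuristic that the region $x = O(1/\varepsilon)$ gives the main contribution.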

Exercise 7 Solve problem (i) from the introduction to this post.

More generally, if one is working with a sum or integral such as

$\displaystyle \sum_{n \in A} e^{\phi(n)} \psi(n)$

or

$\displaystyle \int_A e^{\phi(x)} \psi(x)\ dx$

with some exponential weight ${e^\phi}$ and a lower order amplitude ${\psi}$, then one typically expects the dominant contribution to come from the region where ${\phi}$ comes close to attaining its maximal value. If this maximum is attained on the boundary, then one typically has geometric series behavior away from the boundary, and one can often get a good estimate by exploiting this geometric series type behavior. For instance, suppose one wants to estimate the error function

$\displaystyle \mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\ dt$

for ${z \geq 1}$. In view of the complete integral

$\displaystyle \int_0^\infty e^{-t^2}\ dt = \frac{\sqrt{\pi}}{2}$

we can rewrite this as

$\displaystyle \mathrm{erf}(z) = 1 - \frac{2}{\sqrt{\pi}} \int_z^\infty e^{-t^2}\ dt.$

The exponential weight ${e^{-t^2}}$ attains its maximum at the left endpoint ${t=z}$ and decays quickly away from that endpoint. One could estimate this by dyadic decomposition of ${e^{-t^2}}$ as discussed previously, but a slicker way to proceed here is to use the convexity of ${t^2}$ to obtain a geometric series upper bound

$\displaystyle e^{-t^2} \leq e^{-z^2 - 2 z (t-z)}$

for ${t \geq z}$, which on integration gives

$\displaystyle \int_z^\infty e^{-t^2}\ dt \leq \int_z^\infty e^{-z^2 - 2 z (t-z)}\ dt = \frac{e^{-z^2}}{2z}$

giving the asymptotic

$\displaystyle \mathrm{erf}(z) = 1 - O\left( \frac{e^{-z^2}}{z} \right)$

for ${z \geq 1}$.
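The tail bound just derived is easy to test against a library implementation. The sketch below uses Python's standard `math.erfc`, which returns $1 - \mathrm{erf}(z)$; the helper name `erf_tail_bound` and the sample points are assumptions for illustration. Multiplying the bound $\frac{e^{-z^2}}{2z}$ by the normalising factor $\frac{2}{\sqrt{\pi}}$ gives $\frac{e^{-z^2}}{z\sqrt{\pi}}$.

```python
import math

def erf_tail_bound(z):
    """Upper bound (2/sqrt(pi)) * exp(-z^2) / (2z) for 1 - erf(z), for z > 0."""
    return math.exp(-z * z) / (z * math.sqrt(math.pi))

for z in (1.0, 1.5, 2.0, 3.0, 5.0):
    tail = math.erfc(z)  # equals 1 - erf(z)
    bound = erf_tail_bound(z)
    print(z, tail, bound, tail / bound)
```

At these sample points the ratio of the true tail to the bound stays below 1 and drifts toward 1 as ${z}$ grows, which suggests the convexity argument loses only a bounded factor.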

Exercise 8 In the converse direction, establish the upper bound

$\displaystyle \mathrm{erf}(z) \leq 1 - c \frac{e^{-z^2}}{z}$

for some absolute constant ${c>0}$ and all ${z \geq 1}$.

Exercise 9 If ${\theta n \leq m \leq n}$ for some ${1/2 < \theta < 1}$, show that

$\displaystyle \sum_{k=m}^n \binom{n}{k} \ll \frac{1}{2\theta-1} \binom{n}{m}.$

(Hint: estimate the ratio between consecutive binomial coefficients ${\binom{n}{k}}$ and then control the sum by a geometric series.)
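Before attempting the proof, one can test the claimed binomial tail bound in exact integer arithmetic. This is a small Python sketch under stated assumptions: it takes $\theta = m/n$ (the largest admissible value for a given pair), uses implied constant 1, and the function names are hypothetical. With $\theta = m/n$ the bound reads $\sum_{k \geq m} \binom{n}{k} \leq \frac{n}{2m-n} \binom{n}{m}$, which is checked after clearing denominators.

```python
from math import comb

def binomial_tail(n, m):
    """Exact value of the sum of C(n, k) over m <= k <= n."""
    return sum(comb(n, k) for k in range(m, n + 1))

def geometric_bound_holds(n, m):
    """Check tail <= C(n, m) / (2*theta - 1) with theta = m/n, using integers only."""
    if 2 * m <= n:
        raise ValueError("need m > n/2 so that theta > 1/2")
    return binomial_tail(n, m) * (2 * m - n) <= n * comb(n, m)

print(all(geometric_bound_holds(n, m)
          for n in (10, 50, 200)
          for m in range(n // 2 + 1, n + 1)))
```

The check passing on a range of cases is only evidence, not a proof; the ratio estimate in the hint is what explains why the constant does not degrade as ${n}$ grows.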

When the maximum of the exponent ${\phi}$ occurs in the interior of the region of summation or integration, then one can get good results by some version of Laplace’s method. Consider for instance an integral of the form

$\displaystyle \int_a^b e^{\phi(x)} \psi(x)\ dx$

where ${\phi}$ attains a non-degenerate global maximum at some interior point ${x = x_0}$. The rule of thumb here is that

$\displaystyle \int_a^b e^{\phi(x)} \psi(x)\ dx \approx \sqrt{\frac{2\pi}{|\phi''(x_0)|}} e^{\phi(x_0)} \psi(x_0).$

The heuristic justification is as follows. The main contribution should be when ${x}$ is close to ${x_0}$. Here we can perform a Taylor expansion

$\displaystyle \phi(x) \approx \phi(x_0) - \frac{1}{2} |\phi''(x_0)| (x-x_0)^2$

since at a non-degenerate maximum we have ${\phi'(x_0)=0}$ and ${\phi''(x_0) < 0}$. Also, if ${\psi}$ is continuous, then ${\psi(x) \approx \psi(x_0)}$ when ${x}$ is close to ${x_0}$. Thus we should be able to estimate the above integral by the gaussian integral

$\displaystyle \int_{\bf R} e^{\phi(x_0) - \frac{1}{2} |\phi''(x_0)| (x-x_0)^2} \psi(x_0)\ dx$

which can be computed to equal ${\sqrt{\frac{2\pi}{|\phi''(x_0)|}} e^{\phi(x_0)} \psi(x_0)}$ as desired.
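To see the rule of thumb in action, here is a minimal Python sketch comparing it with a direct numerical integration. Everything specific in it is an assumption chosen for illustration: the exponent $\phi(x) = -\lambda \cosh x$ with $\lambda = 50$ (so $x_0 = 0$, $\phi(0) = -\lambda$, $\phi''(0) = -\lambda$), the amplitude $\psi \equiv 1$, the interval $[-3,3]$, and the plain midpoint rule.

```python
import math

def laplace_rule(phi_max, phi_second, psi_at_max):
    """Rule-of-thumb value sqrt(2*pi/|phi''(x0)|) * exp(phi(x0)) * psi(x0)."""
    return math.sqrt(2 * math.pi / abs(phi_second)) * math.exp(phi_max) * psi_at_max

def midpoint_integral(g, a, b, steps=20_000):
    """Plain midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

LAM = 50.0  # hypothetical large parameter

def phi(x):
    # maximum at x0 = 0, with phi(0) = -LAM and phi''(0) = -LAM
    return -LAM * math.cosh(x)

def psi(x):
    return 1.0

numeric = midpoint_integral(lambda x: math.exp(phi(x)) * psi(x), -3.0, 3.0)
heuristic = laplace_rule(phi(0.0), -LAM, psi(0.0))
ratio = numeric / heuristic
print(ratio)
```

The ratio comes out close to 1, with a small relative discrepancy that one would expect to shrink further as the parameter `LAM` increases.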

Let us illustrate how this argument can be made rigorous by considering the task of estimating the factorial ${n!}$ of a large number. In contrast to the approach taken in an earlier exercise, we will proceed using a version of Laplace’s method, relying on the integral representation

$\displaystyle n! = \Gamma(n+1) = \int_0^\infty x^n e^{-x}\ dx.$

As ${n}$ is large, we will consider ${x^n}$ to be part of the exponential weight rather than the amplitude, writing this expression as

$\displaystyle \int_0^\infty e^{-\phi(x)}\ dx$

where

$\displaystyle \phi(x) = x - n \log x.$

The function ${\phi}$ attains a global minimum at ${x_0 = n}$ (so that the weight ${e^{-\phi}}$ attains its maximum there), with ${\phi'(n) = 0}$ and ${\phi''(n) = 1/n}$. We will therefore decompose this integral into three pieces

$\displaystyle \int_0^{n-R} e^{-\phi(x)}\ dx + \int_{n-R}^{n+R} e^{-\phi(x)}\ dx + \int_{n+R}^\infty e^{-\phi(x)}\ dx \ \ \ \ \ (16)$

where ${0 < R < n}$ is a radius parameter which we will choose later, as it is not immediately obvious for now how to select it.

The main term is expected to be the middle term, so we shall use crude methods to bound the other two terms. For the first part where ${0 < x \leq n-R}$, ${\phi}$ is decreasing so we can crudely bound ${e^{-\phi(x)} \leq e^{-\phi(n-R)}}$ and thus

$\displaystyle \int_0^{n-R} e^{-\phi(x)}\ dx \leq (n-R) e^{-\phi(n-R)} \leq n e^{-\phi(n-R)}.$

(We expect ${R}$ to be much smaller than ${n}$, so there is not much point to saving the tiny ${-R}$ term in the ${n-R}$ factor.) For the third part where ${x \geq n+R}$, ${\phi}$ is increasing, but bounding ${e^{-\phi(x)}}$ by ${e^{-\phi(n+R)}}$ would not work because of the unbounded nature of ${x}$; some additional decay is needed. Fortunately, we have a strict increase

$\displaystyle \phi'(x) = 1 - \frac{n}{x} \geq 1 - \frac{n}{n+R} = \frac{R}{n+R}$

for ${x \geq n+R}$, so by the mean value theorem we have

$\displaystyle \phi(x) \geq \phi(n+R) + \frac{R}{n+R} (x-n-R)$

and after a short calculation this gives

$\displaystyle \int_{n+R}^\infty e^{-\phi(x)}\ dx \leq \frac{n+R}{R} e^{-\phi(n+R)} \ll \frac{n}{R} e^{-\phi(n+R)}.$

Now we turn to the important middle term. If we assume ${R \leq n/2}$, then we will have ${\phi'''(x) = O( 1/n^2 )}$ in the region ${n-R \leq x \leq n+R}$, so by Taylor’s theorem with remainder

$\displaystyle \phi(x) = \phi(n) + \phi'(n) (x-n) + \frac{1}{2} \phi''(n) (x-n)^2 + O\left( \frac{|x-n|^3}{n^2} \right)$

$\displaystyle = \phi(n) + \frac{(x-n)^2}{2n} + O\left( \frac{R^3}{n^2} \right).$

If we assume that ${R = O(n^{2/3})}$, then the error term is bounded and we can exponentiate to obtain

$\displaystyle e^{-\phi(x)} = \left(1 + O\left(\frac{R^3}{n^2}\right)\right) e^{-\phi(n) - \frac{(x-n)^2}{2n}} \ \ \ \ \ (17)$

for ${n-R \leq x \leq n+R}$ and hence

$\displaystyle \int_{n-R}^{n+R} e^{-\phi(x)}\ dx = \left(1 + O\left(\frac{R^3}{n^2}\right)\right) e^{-\phi(n)} \int_{n-R}^{n+R} e^{-(x-n)^2/2n}\ dx.$

If we also assume that ${R \gg \sqrt{n}}$, we can use the error function type estimates from before to estimate

$\displaystyle \int_{n-R}^{n+R} e^{-(x-n)^2/2n}\ dx = \sqrt{2\pi n} + O\left( \frac{n}{R} e^{-R^2/2n} \right).$

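Putting all this together, and using (17) to bound the boundary factors ${e^{-\phi(n \pm R)}}$ in the two outer pieces of (16), we obtain Stirling’s formula ${n! = (1+o(1)) \sqrt{2\pi n}\, n^n e^{-n}}$ for a suitable choice of ${R}$ between ${\sqrt{n}}$ and ${n^{2/3}}$ (for instance ${R = n^{3/5}}$).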
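The estimates above point toward the classical Stirling approximation $n! \approx \sqrt{2\pi n}\, n^n e^{-n}$, where $n^n e^{-n}$ is the value of $e^{-\phi(n)}$ for $\phi(x) = x - n \log x$. As a quick numerical sanity check, the Python sketch below compares the two sides through logarithms to avoid overflow; the helper name `stirling_ratio` and the sample values of ${n}$ are assumptions for illustration.

```python
import math

def stirling_ratio(n):
    """Return n! / (sqrt(2*pi*n) * n^n * e^(-n)), computed via logarithms."""
    log_factorial = math.lgamma(n + 1)  # log(n!) = log Gamma(n+1)
    log_stirling = 0.5 * math.log(2 * math.pi * n) + n * math.log(n) - n
    return math.exp(log_factorial - log_stirling)

for n in (10, 100, 10_000):
    print(n, stirling_ratio(n))
```

The printed ratios sit slightly above 1 and approach 1 as ${n}$ grows, consistent with a relative error that tends to zero.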

Exercise 10 Solve problem (iii) from the introduction. (Hint: extract out the term ${\frac{k^{2n-4k}}{(n-k)^{2n-4k}}}$ to write as the exponential factor ${e^{\phi(k)}}$, placing all the other terms (which are of polynomial size) in the amplitude function ${\psi(k)}$. The function ${\phi}$ will then attain a maximum at ${k=n/2}$; perform a Taylor expansion and mimic the arguments above.)