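Given a threshold $y$, a $y$-smooth number (or $y$-friable number) is a natural number $n$ whose prime factors are all at most $y$. We use $\Psi(x,y)$ to denote the number of $y$-smooth numbers up to $x$. In studying the asymptotic behavior of $\Psi(x,y)$, it is customary to write $y$ as $x^{1/u}$ (or $x$ as $y^u$) for some $u > 0$. For small values of $u$, the behavior is straightforward: for instance if $0 < u < 1$, then all numbers up to $x$ are automatically $y$-smooth, so

$$\Psi(x,y) \sim x$$

in this case. If $1 < u < 2$, the only numbers up to $x$ that are not $y$-smooth are the multiples of primes $p$ between $y$ and $x$, so

$$\Psi(x,y) \sim x - \sum_{y < p \leq x} \left\lfloor \frac{x}{p} \right\rfloor \sim x - x (\log\log x - \log\log y) = x (1 - \log u)$$

where we have employed Mertens’ second theorem. For $2 < u < 3$, there is an additional correction coming from multiples of two primes between $y$ and $x$; a straightforward inclusion-exclusion argument (which we omit here) eventually gives

$$\Psi(x,y) \sim x \left( 1 - \log u + \int_2^u \frac{\log(t-1)}{t}\, dt \right)$$

in this case.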
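As a rough numerical sanity check of the regime $1 < u < 2$, here is a minimal Python sketch. The helper names `primes_up_to`, `count_smooth` and `count_smooth_by_exclusion`, and the sample values of $x$ and $y$, are illustrative choices and not part of the original discussion. It counts smooth numbers by brute force, compares with the "remove multiples of large primes" count (which is exact once $x \leq y^2$), and prints $1 - \log u$ alongside.

```python
import math

def primes_up_to(n):
    """Return all primes <= n via the sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [p for p in range(2, n + 1) if sieve[p]]

def count_smooth(x, y):
    """Brute-force Psi(x, y): count n in [1, x] whose prime factors are all <= y."""
    rem = list(range(x + 1))            # rem[n] = part of n not yet factored
    for p in primes_up_to(y):
        for m in range(p, x + 1, p):
            while rem[m] % p == 0:      # strip every power of p from m
                rem[m] //= p
    return sum(1 for n in range(1, x + 1) if rem[n] == 1)

def count_smooth_by_exclusion(x, y):
    """x minus the multiples of primes in (y, x]; exact when x <= y * y."""
    return x - sum(x // p for p in primes_up_to(x) if p > y)

x, y = 10**5, 10**3                     # sample values, so u = 5/3
u = math.log(x) / math.log(y)
print("Psi(x,y)/x        =", count_smooth(x, y) / x)
print("exclusion count/x =", count_smooth_by_exclusion(x, y) / x)
print("1 - log u         =", 1 - math.log(u))
```

At sizes this small one should expect only rough agreement with $1 - \log u$, since the error in the asymptotic decays only logarithmically in $x$.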
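More generally, for any fixed $u > 0$, de Bruijn showed that

$$\Psi(x,y) \sim \rho(u) x$$

where $\rho(u)$ is the Dickman function. This function is a piecewise smooth, decreasing function of $u$, defined by the delay differential equation

$$u \rho'(u) + \rho(u-1) = 0$$

with initial condition $\rho(u) = 1$ for $0 \leq u \leq 1$.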
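The delay differential equation is easy to integrate numerically, because the right-hand side only involves values of $\rho$ one unit to the left. The following is a minimal sketch using the trapezoid rule on a uniform grid; the function name `dickman_table` and the step size are assumptions made for illustration, not a recommended production method.

```python
import math

def dickman_table(u_max, steps_per_unit=1000):
    """Tabulate rho(k / steps_per_unit) for 0 <= k <= u_max * steps_per_unit."""
    n = steps_per_unit
    h = 1.0 / n
    total = int(round(u_max * n))
    rho = [1.0] * (total + 1)           # initial condition: rho = 1 on [0, 1]
    for k in range(n + 1, total + 1):
        u_prev, u_curr = (k - 1) * h, k * h
        # Integrate rho'(t) = -rho(t - 1) / t over [u_prev, u_curr] with the
        # trapezoid rule; the delayed values sit n grid points further back.
        rho[k] = rho[k - 1] - 0.5 * h * (
            rho[k - 1 - n] / u_prev + rho[k - n] / u_curr
        )
    return rho

N = 1000
rho = dickman_table(5.0, N)
for u in (1, 2, 3, 4, 5):
    print(u, rho[u * N])
print("closed form 1 - log 2 at u = 2:", 1 - math.log(2))
```

On $[1,2]$ the equation integrates to $\rho(u) = 1 - \log u$, which gives a convenient check of the table at $u = 2$.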
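The asymptotic behavior of $\rho(u)$ as $u \to \infty$ is rather complicated. Very roughly speaking, it has inverse factorial behavior; there is a general upper bound $\rho(u) \leq 1/\Gamma(u+1)$, and a crude asymptotic

$$\rho(u) = u^{-u+o(u)} = \exp( - u \log u + o(u \log u) ).$$

With a more careful analysis one can refine this to

$$\rho(u) = \exp( - u \log u - u \log\log u + u + o(u) ); \qquad (1)$$

and with a very careful application of the Laplace inversion formula one can in fact show that

$$\rho(u) \sim \sqrt{\frac{\xi'(u)}{2\pi}} \exp\left( \gamma - u \xi(u) + \int_0^{\xi(u)} \frac{e^s - 1}{s}\, ds \right) \qquad (2)$$

where $\gamma$ is the Euler-Mascheroni constant and $\xi(u)$ is defined implicitly by the equation

$$e^{\xi(u)} - 1 = u \xi(u). \qquad (3)$$

One cannot write $\xi(u)$ in closed form using elementary functions, but one can express it in terms of the Lambert $W$ function as $\xi(u) = -W\left(-\frac{1}{u} e^{-1/u}\right) - \frac{1}{u}$ (taking the non-principal real branch $W_{-1}$, since the principal branch only returns the trivial root $\xi = 0$). This is not a particularly enlightening expression, though. A more productive approach is to work with approximations. It is not hard to get the initial approximation $\xi(u) \approx \log u$ for large $u$, which can then be re-inserted back into (3) to obtain the more accurate approximation

$$\xi(u) = \log u + \log\log u + O(1)$$

and inserted once again to obtain the refinement

$$\xi(u) = \log u + \log\log u + O\left( \frac{\log\log u}{\log u} \right).$$

We can now see that (2) is consistent with previous asymptotics such as (1), after comparing the integral $\int_0^{\xi(u)} \frac{e^s-1}{s}\, ds$ to

$$\int_0^{\xi(u)} \frac{e^s - 1}{\xi(u)}\, ds = u - 1.$$

For more details of these results, one can see for instance this survey by Granville.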
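To get a feel for how quickly the approximations to $\xi(u)$ become accurate, one can solve (3) numerically. The sketch below uses Newton's method started to the right of the positive root (where the convex function $e^s - 1 - us$ is positive, so the iterates decrease monotonically); the function name `xi` and the starting point $2 \log u + 1$ are choices made here for illustration.

```python
import math

def xi(u, tol=1e-13, max_iter=200):
    """Positive root of exp(s) - 1 = u * s, for u > 1, by Newton's method."""
    if u <= 1:
        raise ValueError("need u > 1 for a positive root")
    s = 2.0 * math.log(u) + 1.0         # lies to the right of the root
    for _ in range(max_iter):
        step = (math.exp(s) - 1.0 - u * s) / (math.exp(s) - u)
        s -= step
        if abs(step) <= tol * s:
            break
    return s

for u in (2.0, 10.0, 1e3, 1e6):
    s = xi(u)
    first = math.log(u)
    second = math.log(u) + math.log(math.log(u))
    print(f"u={u:g}  xi={s:.6f}  log u={first:.6f}  "
          f"log u + loglog u={second:.6f}")
```

The printed columns let one compare $\xi(u)$ against $\log u$ and $\log u + \log\log u$ directly; the second approximation is only meaningful once $u$ is reasonably large.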
This asymptotic (2) is quite complicated, and so one does not expect there to be any simple argument that could recover it without extensive computation. However, it turns out that one can use a “maximum entropy” analysis to get a reasonably good heuristic approximation to (2), that at least reveals the role of the mysterious function $\xi(u)$. The purpose of this blog post is to give this heuristic.
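Viewing $x = y^u$, the task is to try to count the number of $y$-smooth numbers of magnitude $x$. We will propose a probabilistic model to generate $y$-smooth numbers as follows: for each prime $p \leq y$, select the prime $p$ with an independent probability $c_p/p$ for some coefficient $c_p$, and then multiply all the selected primes together. This will clearly generate a random $y$-smooth number $n$, and by the law of large numbers, the (log-)magnitude of this number should be approximately

$$\log n \approx \sum_{p \leq y} \frac{c_p}{p} \log p \qquad (4)$$

(where we will be vague about what “$\approx$” means here), so to obtain a number of magnitude about $x$, we should impose the constraint

$$\sum_{p \leq y} \frac{c_p}{p} \log p = \log x = u \log y. \qquad (5)$$

The indicator $1_{p|n}$ of the event that $p$ divides this number is a Bernoulli random variable with mean $c_p/p$, so the Shannon entropy of this random variable is

$$\mathbf{H}(1_{p|n}) = - \frac{c_p}{p} \log \frac{c_p}{p} - \left(1 - \frac{c_p}{p}\right) \log\left(1 - \frac{c_p}{p}\right).$$

If $c_p$ is not too large, then Taylor expansion gives the approximation

$$\mathbf{H}(1_{p|n}) \approx \frac{c_p}{p} \log p - \frac{c_p}{p} \log c_p + \frac{c_p}{p}.$$

Because of independence, the total entropy of the random variable $n$ is

$$\mathbf{H}(n) = \sum_{p \leq y} \mathbf{H}(1_{p|n});$$

inserting the previous approximation as well as (5), we obtain the heuristic approximation

$$\mathbf{H}(n) \approx u \log y - \sum_{p \leq y} \frac{c_p \log c_p - c_p}{p}.$$

The asymptotic equipartition property of entropy, relating entropy to microstates, then suggests that the set of numbers $n$ that are typically generated by this random process should have cardinality approximately

$$\exp(\mathbf{H}(n)) \approx \exp\left( u \log y - \sum_{p \leq y} \frac{c_p \log c_p - c_p}{p} \right) = x \exp\left( - \sum_{p \leq y} \frac{c_p \log c_p - c_p}{p} \right).$$

Using the principle of maximum entropy, one is now led to the approximation

$$\rho(u) \approx \exp\left( - \sum_{p \leq y} \frac{c_p \log c_p - c_p}{p} \right) \qquad (6)$$

where the weights $c_p$ are chosen to maximize the right-hand side subject to the constraint (5).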
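The Taylor step for the entropy of a single Bernoulli indicator with mean $c_p/p$ can be checked numerically. In the sketch below, the function names and the sample values of $p$ and $c_p$ are assumptions chosen for illustration (the approximation itself does not use the primality of $p$). It compares the exact entropy with $\frac{c_p}{p}(\log p - \log c_p + 1)$; the discrepancy should be of size about $(c_p/p)^2/2$.

```python
import math

def bernoulli_entropy(q):
    """Shannon entropy (in nats) of a Bernoulli(q) variable, 0 < q < 1."""
    return -q * math.log(q) - (1.0 - q) * math.log1p(-q)

def entropy_taylor(c, p):
    """First-order approximation (c/p) * (log p - log c + 1)."""
    return (c / p) * (math.log(p) - math.log(c) + 1.0)

# Sample values of p and of the weight c (illustrative only).
for p in (101, 10007, 1000003):
    for c in (0.5, 1.0, 3.0):
        exact = bernoulli_entropy(c / p)
        approx = entropy_taylor(c, p)
        print(f"p={p:>8} c={c:.1f} exact={exact:.6e} approx={approx:.6e} "
              f"diff={approx - exact:.2e}")
```

The approximation degrades when $c_p/p$ is not small, which is one reason the text asks that $c_p$ not be too large.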
One could solve this constrained optimization problem directly using Lagrange multipliers, but we simplify things a bit by passing to a continuous limit. We take a continuous ansatz $c_p = f(\log p / \log y)$, where $f: [0,1] \to \mathbf{R}^+$ is a smooth function. Using Mertens’ theorem, the constraint (5) then heuristically becomes

$$\int_0^1 f(t)\, dt = u \qquad (7)$$

and the expression (6) simplifies to

$$\rho(u) \approx \exp\left( - \int_0^1 \frac{f(t) \log f(t) - f(t)}{t}\, dt \right). \qquad (8)$$

So the entropy maximization problem has now been reduced to the problem of minimizing the functional $\int_0^1 \frac{f(t) \log f(t) - f(t)}{t}\, dt$ subject to the constraint (7). The astute reader may notice that the integral in (8) might diverge at $t = 0$, but we shall ignore this technicality for the sake of the heuristic arguments.
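This is a standard calculus of variations problem. The Euler-Lagrange equation for this problem can be easily worked out to be

$$\frac{\log f(t)}{t} = \lambda$$

for some Lagrange multiplier $\lambda$; in other words, the optimal $f$ should have an exponential form $f(t) = e^{\lambda t}$. The constraint (7) then becomes

$$\frac{e^\lambda - 1}{\lambda} = u$$

and so the Lagrange multiplier $\lambda$ is precisely the mysterious quantity $\xi(u)$ appearing in (2)! The formula (8) can now be evaluated as

$$\rho(u) \approx \exp\left( - u \xi(u) + \int_0^{\xi(u)} \frac{e^s - 1}{s}\, ds + C \right)$$

where $C$ is the divergent constant

$$C = \int_0^1 \frac{dt}{t}.$$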
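For readers who want the intermediate steps, here is one way to carry out this evaluation. It assumes the exponential profile $f(t) = e^{\xi t}$ with $\xi = \xi(u)$ satisfying $e^{\xi} - 1 = u \xi$, and manipulates the divergent integral purely formally:

$$\begin{aligned}
\int_0^1 \frac{f(t) \log f(t) - f(t)}{t}\, dt
&= \int_0^1 \xi e^{\xi t}\, dt - \int_0^1 \frac{e^{\xi t}}{t}\, dt \\
&= (e^{\xi} - 1) - \int_0^1 \frac{e^{\xi t} - 1}{t}\, dt - \int_0^1 \frac{dt}{t} \\
&= u \xi - \int_0^{\xi} \frac{e^s - 1}{s}\, ds - \int_0^1 \frac{dt}{t}.
\end{aligned}$$

The last line uses $e^{\xi} - 1 = u\xi$ and the substitution $s = \xi t$. Negating and exponentiating gives the terms $-u\xi(u) + \int_0^{\xi(u)} \frac{e^s-1}{s}\, ds$ from (2), together with the divergent contribution $\int_0^1 \frac{dt}{t}$ coming from the singularity at $t = 0$.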
This recovers a large fraction of (2)! It is not completely accurate for multiple reasons. One is that the hypothesis of joint independence on the events $p | n$ is unrealistic when trying to confine $n$ to a single scale $x$; this comes down ultimately to the subtle differences between the Poisson and Poisson-Dirichlet processes, as discussed in this previous blog post, and is also responsible for the otherwise mysterious $e^\gamma$ factor in Mertens’ third theorem; it also morally explains the presence of the same $e^\gamma$ factor in (2). A related issue is that the law of large numbers (4) is not exact, but admits gaussian fluctuations as per the central limit theorem; morally, this is the main cause of the $\sqrt{\frac{\xi'(u)}{2\pi}}$ prefactor in (2).
Nevertheless, this demonstrates that the maximum entropy method can achieve a reasonably good heuristic understanding of smooth numbers. In fact we also gain some insight into the “anatomy of integers” of such numbers: the above analysis suggests that a typical $y$-smooth number $n$ will be divisible by a given prime $p \leq y$ with probability about $e^{\xi(u) \log p / \log y}/p$. Thus, for $p = y^{1 - O(1/\log u)}$, the probability of being divisible by $p$ is elevated by a factor of about $e^{\xi(u)} \asymp u \log u$ over the baseline probability $1/p$ of an arbitrary (non-smooth) number being divisible by $p$; so (by Mertens’ theorem) a typical $y$-smooth number is actually largely comprised of something like $u$ prime factors all of size about $y^{1 - O(1/\log u)}$, with the smaller primes contributing a lower order factor. This is in marked contrast with the anatomy of a typical (non-smooth) number $n$, which typically has $O(1)$ prime factors in each hyperdyadic scale $[\exp\exp(k), \exp\exp(k+1)]$ in $[1,n]$, as per Mertens’ theorem.
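The tilt towards large primes can be illustrated at a finite scale. The sketch below works with the discrete model directly, taking weights $c_p = p^{\lambda}$ (the discrete counterpart of the exponential profile, with $\lambda$ playing the role of $\xi(u)/\log y$) and solving the constraint (5) for $\lambda$ by bisection. The helper names, the sample values $y = 10^4$ and $u = 3$, and the choice to measure the share of $\log n$ carried by primes above $\sqrt{y}$ are all assumptions made for illustration, and at this size only qualitative agreement should be expected.

```python
import math

def primes_up_to(n):
    """Return all primes <= n via the sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [p for p in range(2, n + 1) if sieve[p]]

def xi(u, tol=1e-13, max_iter=200):
    """Positive root of exp(s) - 1 = u * s (u > 1) by Newton's method."""
    s = 2.0 * math.log(u) + 1.0
    for _ in range(max_iter):
        step = (math.exp(s) - 1.0 - u * s) / (math.exp(s) - u)
        s -= step
        if abs(step) <= tol * s:
            break
    return s

def tilted_log_mass(primes, lam):
    """Sum of p^(lam - 1) * log p: the expected log n when c_p = p^lam."""
    return sum(p ** (lam - 1.0) * math.log(p) for p in primes)

def solve_tilt(primes, target, iters=60):
    """Bisection for lam in [0, 1] with tilted_log_mass(primes, lam) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if tilted_log_mass(primes, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

y, u = 10**4, 3.0                       # sample values
primes = primes_up_to(y)
lam = solve_tilt(primes, u * math.log(y))
large = [p for p in primes if p * p > y]

# Share of the expected log n carried by primes above sqrt(y).
tilted_share = tilted_log_mass(large, lam) / tilted_log_mass(primes, lam)
baseline_share = tilted_log_mass(large, 0.0) / tilted_log_mass(primes, 0.0)

print("lam * log y    =", lam * math.log(y), " vs xi(u) =", xi(u))
print("tilted share   =", tilted_share)
print("baseline share =", baseline_share)
```

Here the baseline takes $\lambda = 0$, i.e. $c_p = 1$, which corresponds to the divisibility probabilities $1/p$ of an arbitrary number.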