http://people.csail.mit.edu/jaffer/ZxZtoN

f(Z×Z) = N?

At the bottom of http://www-math.mit.edu/~poonen, Professor Poonen poses an
Open Question: Let Z be the set of integers, and let N be the set of nonnegative integers. Is there a polynomial f(x,y) such that f(Z×Z) = N?

Related Mathematics

Polygonal Numbers
One way to view a polynomial in two variables is as an extension of the Fermat polygonal number theorem (which covers sum-of-four-squares and sum-of-three-triangular-numbers) to two-sided polygons.

M. B. Nathanson's A Short Proof of Cauchy's Polygonal Number Theorem defines polygonal numbers as
 um(k) = m (k2 - k) + k 2
for positive m and nonnegative k. The theorem is

Every nonnegative integer is the sum of m+2 polygonal numbers of order m+2.
When m=0, u0(k)=k. And with nonnegative k we trivially have f(NxN)=N when f(x,y)=x+y. Even simpler is f(N)=N when f(x)=x.

But the open question is for inputs both positive and negative. With positive m, um(k) is always nonnegative; so f(Z×Z×Z)=N and higher degrees follow. But with u0 being the identity function, f produces negative integers.

Because the proof is all about quadratics (as opposed to the two-sided polygonal numbers being linear), and because the number of polygonal-numbers required jumps from 3 to 1 when m goes from 1 to 0, the proof of Cauchy's Polygonal Number Theorem is unlikely to shed any light on the proposition in question.

J. H. Conway, in Universal Quadratic Forms and the Fifteen Theorem writes:
The 15-theorem closes the universality problem for integer-matrix forms by providing an extremely simple criterion. We no longer need a list of universal quaternaries, because a form is universal provided only that it represent the numbers up to 15. Moreover, this criterion works for larger numbers of variables, where the number of universal forms is no longer finite. (It is known that no form in three or fewer variables can be universal.)
The parenthetical remark at the end seems to imply that the degrees of x and y in f(x,y) must be greater than 2. But Professor Poonen points out that quadratic form means "homogeneous polynomial of degree 2"; and his question is not restricted to homogeneous polynomials.

Fermat's Last Theorem
Fermat's Last Theorem states:
If an integer n is greater than 2, then the equation an+bn=cn has no solutions in nonzero integers a, b, and c.
If f is the simple sum of nth (4 or larger) powers of its two arguments, then Fermat's Last Theorem directs that f is not surjective because there are no input pairs generating cn. It will also rule out the sum of two powers (> 2) of polynomials with no "net" constant term. For example: (x2+y2+1)3 + (2 x y-1)3.

Polynomial Degrees

If f exists, then there must be an argument pair x0,y0 such that f(x0,y0)=0. Consider the polynomial g(x,y)=f(x+x0,y+y0). Because g(0,0)=0, the constant term of g(x,y) must be 0. Without loss of generality for the rest of this treatment, we will assume that f(0,0)=0.

The requirement that only nonnegative numbers are produced by f is stringent. Suppose the degree of x in f is odd. If we let y take a value which does not extinguish the leading x term of f, then sufficiently large magnitude of x will dominate any other terms. For sufficiently large magnitudes, that leading term will change sign depending on the sign of x. Thus the degree of x in f must be even. Similarly, the degree of y in f must also be even.

Consider f(x,0). All terms in this polynomial involve x only. By the previous reasoning, the degree of x in f(x,0) must be even. Similarly, the degree of y in f(0,y) must also be even.

What if the degree of x or y in f(x,y) is zero? Then we simply have a polynomial in one variable (the variable whose degree is not zero).

Lemma 1: There is no polynomial f(x) which is surjective from the integers (Z) to the nonnegative integers (N).

The degree of x in f must be even. For degree 0, f would be a constant, having only a single value which is not surjective onto N. For even degree k>1 and sufficiently large value of x, f(x) is greater than every value f(0), ..., f(x-1) and f is monotonically increasing and the difference between f(x+1) and f(x) is O(xk-1). Because f is monotonically increasing, it never returns to the values between f(x) and f(x+1). The values of f(-x) similarly become monotonically increasing and spread further and further apart. Thus f is not surjective onto N.

Thus the degrees of x and y in f(x,y) must be even and not less than 2.

Corollary 1: If a polynomial f(x,y) can be rewritten as the composition of an even degree polynomial p with an integer-valued polynomial g(x,y), then f is not surjective from the integers to the nonnegative integers.

The set of all possible values returned by g is a subset of the integers. By Lemma 1, p cannot be surjective onto N; thus f(x,y)=p(g(x,y)) cannot be surjective onto N.

What about the total-degree of f? If f exists with an odd total-degree of D, there may be more than one term with total-degree of D (eg. 3x2y+y2x). Consider h(x)=f(x,cx), where c is a nonzero constant chosen (eg. 1 in the example) so that the degree of x in h(x) is D, the total-degree of f. Then h(x) is a polynomial of odd degree and takes on negative values for some x. Thus, the total-degree of f must be even.

This establishes that the total-degree of f must be even and no smaller than 2.

x4+y4

f(x,y)=x4+y4 grows very quickly with x and y. Its shape is that of a rectangular vase. This makes it easy to find the bands of equal height.

Let b be a positive integer. There are b2 x,y pairs in the region 0≤x<b, 0≤y<b. Because f(x,y) is symmetrical, f produces no more than 1/2(b2+b) values when applied to these x,y pairs.

The polynomial f applied to these x,y pairs along with pairs -x,y, x,-y, and -x,-y produces the only values of f(x,y) less than b4. But for b≥2, 1/2(b2+b) is less than b4. Thus, f(x,y)=x4+y4 does not produce integers densely enough to cover the natural numbers.

x2+y2

This sort of counting also works for f(x,y)=x2+y2. The polynomial f applied to the 1/2(b2+b) unique x,y combinations produces the only values of f(x,y) falling in the range 0 to b2. For b≥2, 1/2(b2+b) is less than b2. Thus, f(x,y)=x2+y2 does not produce integers densely enough to cover the natural numbers.
In both these cases the 1/2(b2+b) unique x,y combinations included pairs whose projection through f was larger than b4 and b2, respectively. But the source count was less than output range even with these extra pairs.

Asymptotics

There is an important difference between the x4+y4 and x2+y2 cases. O(x4+y4)>O(xy); but the same cannot be said for O(x2+y2). Quadratic polynomials must be addressed separately from polynomials of higher degree. The proof for higher degrees centers on counting arguments dubbed asymptotically-sparse.
Definition: Given a positive number H and a polynomial f(x,y), let C(H,f) be the count of integer x,y pairs satisfying 0<f(x,y)<H.

A polynomial f(x,y) is asymptotically-sparse if all values of f(x,y) are nonnegative and O(C(H,f))<O(H).

From the definition it follows that asymptotically-sparse polynomial functions do not produce integers densely enough to cover the natural numbers. The converse is not necessarily true; even though both do not produce integers densely enough to cover the natural numbers, x4+y4 is asymptotically-sparse, while x2+y2 is not.

Polynomial functions can also fail to be asymptotically-sparse because they return negative values, or because they are not bivariate. For instance, f(x,y)=x4 is not asymptotically-sparse because there are an unbounded number of y values for which f(x,y)<H.

Note that the definition of asymptotically-sparse polynomial functions does not restrict them to be integer-valued.

Lemma 2: The product of an asymptotically-sparse polynomial function with any positive number is asymptotically-sparse.

Let H be a large positive number and f(x,y) be an asymptotically-sparse polynomial function. Let g(x,y)=rf(x,y) where r is a positive number. Any x,y pair satisfying 0<g(x,y)<H, also satisfies 0<f(x,y)<H/r. Because O(H)=O(H/r), C(H,rf)<O(H); and rf(x,y) is asymptotically-sparse.

u(x)+u(y)

The reasoning about x4+y4 can be generalized to the case of identical polynomial functions of x and y. Let f(x,y)=u(x)+u(y) where u is an nonnegative integer polynomial function of even degree k≥4.

Let b be a positive integer large enough so that u(xb) and u(x≤-b) are monotonically increasing and the difference between u(x+1) and u(x) and the difference between u(-x) and u(-x-1) are O(xk-1) for |x|≥b.

There are O(b2) x,y pairs in the region 0≤x<b, 0≤y<b and its negative mirrors. The polynomial f applied to these x,y pairs produces the only values of f(x,y) which fall in the range 0 to O(xk). But for large x and k≥4, O(x2) is smaller than O(xk). Thus, f(x,y)=u(x)+u(y) of degree 4 or more is asymptotically-sparse.

u(x)+v(y)

This can be further generalized to the case of independent polynomial functions of x and y. Let f(x,y)=u(x)+v(y) where u and v are nonnegative integer polynomial functions of even degrees jk, respectively, and j≥4 and k≥2. The approach is to pick input ranges [0,b] and [0,c] which have the same output range when projected through u and v, respectively.

Let b and c be positive integers large enough so that:

• u(xb) and v(yc) are monotonically increasing;
• c is the largest value of y for which v(y) ≤ u(b);
• the difference between u(x+1) and u(x) is O(xj-1) for xb; and
• the difference between v(y+1) and v(y) is O(yk-1) for yc.

There are O(x1+j/k) x,y pairs in the regions 0≤x<b, 0≤y<c and their negative images. The polynomial f applied to these x,y pairs produces the only values of f(x,y) which fall in the range 0 to O(xj). With jk, j≥4 and k≥2, O(x1+j/k) is smaller than O(xj). In particular:

• When j=4 and k=2, O(x3) is smaller than O(x4).
• For large j and k=2, O(xj/2) is less than O(xj).
• When j=k≥4, O(x2) is smaller than O(xj).
Thus, f(x,y)=u(x)+v(y) with total-degree of 4 or more is asymptotically-sparse.
Notice that last this case tightened the degree bounds. With the minimum total-degree of 4, one polynomial, v(y), can have degree 2. This leaves only the case where both u(x) and v(y) are quadratic. The counting arguments used previously won't work here because half of the (b2+b) x,y pairs can't be dismissed. Instead, count the integers generated over the range 0 to 11.

Both u(x) and v(y) must be positive semi-definite. f(x,y) must be 1 for some x,y; so either there is x1 such that f(x1,0)=u(x1)=1 or there is y1 such that f(0,y1)=v(y1)=1. Without loss of generality, assume that f(x1,0)=u(x1)=1.

Because u(0)=0, we have u(x)=(ax2+bx)/2 for some integers a and b. ax2+bx≥0 for all x. Thus 0<|b|≤a. So that (ax2+bx)/2 returns only integers, b must be even if and only if a is even. So that (ax2+bx)/2 returns 1, b=±(a-2). Thus x1 is 1 or -1. Without loss of generality, let b=a-2 and x1=-1.

Because v(0)=0, we have v(y)=(cx2+dx)/2 for some integers c and d. cx2+dx≥0 for all y. Thus 0<|d|≤c. So that (cx2+dx)/2 returns only integers, d must be even if and only if c is even.

If a>24, then (ax2+bx)/2 has only two values less than 12: u(0)=0 and u(-1)=1.

If c>24, then (cy2+dy)/2 has at most two values less than 12: v(0)=0 and v(1) or v(-1).

If both a>24 and c>24,
then f(x,y) generates at most four values between 0 and 11 (0, 1, v(±1), and v(±1)+1). Thus f is not surjective onto N.

If a>24 and c≤24 or a≤24 and c>24,
then the maximum number of values generated by f(x,y) between 0 and 11 is twice the number of values generated by v(y) or u(x), respectively. v(y) is the more general case. Of the 168 possible v(y) polynomials with c≤24, four generate the most integers, five [5]:
```( 1 y^2  +1 y)/2: [5] #(0 1 ! 3 ! ! 6 ! ! ! 10 !)
( 2 y^2  +0 y)/2: [4] #(0 1 ! ! 4 ! ! ! ! 9 ! !)
( 2 y^2  +2 y)/2: [3] #(0 ! 2 ! ! ! 6 ! ! ! ! !)
( 3 y^2  +1 y)/2: [5] #(0 1 2 ! ! 5 ! 7 ! ! ! !)
( 3 y^2  +3 y)/2: [3] #(0 ! ! 3 ! ! ! ! ! 9 ! !)
( 4 y^2  +0 y)/2: [3] #(0 ! 2 ! ! ! ! ! 8 ! ! !)
( 4 y^2  +2 y)/2: [5] #(0 1 ! 3 ! ! 6 ! ! ! 10 !)
( 4 y^2  +4 y)/2: [2] #(0 ! ! ! 4 ! ! ! ! ! ! !)
( 5 y^2  +1 y)/2: [5] #(0 ! 2 3 ! ! ! ! ! 9 ! 11)
( 5 y^2  +3 y)/2: [4] #(0 1 ! ! 4 ! ! 7 ! ! ! !)
( 5 y^2  +5 y)/2: [2] #(0 ! ! ! ! 5 ! ! ! ! ! !)
...
```
Four of these polynomials generate five integers in the range 0 to 11. But twice 5 is less than 12, so f is not surjective onto N.

The last case is where a≤24 and c≤24.
Computer enumeration of these cases finds no surjection onto N. The longest run from 0 is 0 through 20, generated by: (3x2+x+7y2+y)/2.

Prof. Poonen has suggested a simpler proof using quadratic reciprocity which rules out all quadratic polynomials.

Therefore there is no f(x,y) comprised of the sum of univariate polynomials u(x) and v(y) over the integers which is surjective onto the nonnegative integers.

u(x)+v(y)+w(x,y)

Let f(x,y)=u(x)+v(y)+w(x,y) where u, v, and w are polynomial functions of even total-degrees jk>l, respectively, and jk≥4, and l≥2; and u and v return only nonnegative integers, w returns only integers, and u(x)+v(y)+w(x,y) is nonnegative.

Because jk>l, u and v will dominate w for sufficiently large |x| and |y| along any trajectory ax+by=0. Hence the previous asymptotic argument holds that f is asymptotically-sparse.

x2y2

The simplist bivariate polynomial with even degrees of x and y and total-degree of 4 is f(x,y)=x2y2. It consists of two perpendicular troughs intersecting at the origin. Bands of equal height map from hyperbolas resulting from equating the height to x2y2.

The contour of height H is

• H=x2y2
• y2=H/x2
• y(H,x)=H1/2/x
The number of nonzero integer x,y coordinates inside of this contour is asymptotically 4 times its area above y=1, A(H), which is 4 times the integral of y(H,x)-1dx from x=1 to H1/2, the largest x for which |y(H,x)|≥1.

A(H) = 4H1/2(ln(H1/2)-ln(1)) = 2H1/2ln(H)

A(H) does not grow as fast as H; so f(x,y)=x2y2 is asymptotically-sparse.

This proof extends to all polynomials in which x2y2 is the only term having the highest total-degree.

xjyk

Let f(x,y)=xjyk where j and k are even and jk≥2. The contour of height H is
• H=xjyk
• yk=Hx-j
• y(H,x)=H1/kx-j/k
The number of nonzero integer x,y coordinates inside this contour is asymptotically its area, A(H), which is 4 times the integral of y(H,x)-1dx from x=1 to H1/j.

If j=k
A(H) = 4H1/j(ln(H1/j)-ln(1)) = 4H1/jln(H)/j
If j>k
Because j>k, 1-j/k is negative and the indefinite integral of the hyperbolic part is hyperbolic. Therefore O(A(H))<O(H).
In both cases A(H) does not grow as fast as H; so f(x,y)=xjyk is asymptotically-sparse.

(x-y)2(x+y)2

Rotating the troughs by 45° should not effect their asymptotic behavior. The contour of height H is
• H=(x-y)2(x+y)2
• H=(y2-x2)2
• H1/2=y2+x2
• y2=H1/2+x2
• y(H,x)=(H1/2+x2)1/2
Points along the x=y and x=-y diagonals have heights of 0. The number of non-diagonal integer x,y coordinates inside of this contour is asymptotically proportional to the area between y(1,x) and y(H,x), a complicated expression which grows more slowly than H.

Sliding Blocks

Visualize f(x,y) as an infinite array of square tiles, each labeled with the value of f(x,y) at that x,y location. [Shown is the center of (7x2+x+3y2+y)/2.] If we slide a row of tiles an integral number of positions left or right, it doesn't change the range of f. In fact, we can slide all rows by different amounts simultaneously by substituting x+q(y) for x. After doing so, we can slide all columns by different amounts; then rows again...
Theorem 1: Let f(x,y) be a polynomial function of two integer variables and q(x) be an integer-valued polynomial function of one integer variable. The set of values returned by f(x,y) for all integer pairs x,y is the same as the set of values returned by g(x,y)=f(x,y+q(x)) for all integer pairs x,y. The same is true for g(x,y)=f(x+q(y),y).

For any integer-valued polynomial function q(x), because x and y are independent, Q(x,y)=[x,y+q(x)] is a bijection from Z×Z to Z×Z; and f(x,y)=g(x,y-q(x)).

If f(x,y)=x2y2 and q(y)=y, then f(x+q(y),y)=x2y2+2xy3+y4, which is the same as x2y2 with its x rows shifted progressively by y offsets of -1. As with this example, linear q(y) polynomials take homogeneous polynomials h(x,y) to homogeneous polynomials h(x+q(y),y) of the same total degrees.

When q is not linear, the degree and homogeneity of polynomials are not preserved. For example, f(x,y)=x4+y4 and g(x,y)=f(x,y+x2)=x4+x8+4x6y+6x4y2+4x2y3+y4. Notice that, although f(x,y)=x4+y4 has been analyzed, g(x,y) is not covered by any of the cases analyzed above.

Can we distinguish nonlinearly-shifted polynomials from those which aren't? When f is a nonnegative polynomial of degree 2 or more in both x and y, and q(x) is a polynomial of degree 2 or more, the highest total-degree term of g(x,y)=f(x,y+q(x)) must be a power of x and not y; and the highest total-degree term involving y is linear in y and has a degree in x greater than 3 and a coefficient divisible by the exponent of the highest degree pure y term (by the Binomial Theorem).

This suggests a procedure for reducing nonlinearly-shifted polynomials:

Given g(x,y) with degree in x twice or more its degree in y, and whose highest total-degree term involving y is cnxmy where m≥4 and c is a positive integer and n is the degree of y in g, replace g(x,y) by g(x,y-cxn). This reduction step removes the leading term of q(x). Repeating this process will remove lower degree (but higher degree than 1) terms of q(x). As the result of a reduction, a pure x term may no longer be the highest degree term. The same process may then be carried out with the roles of x and y reversed, unwrapping layer after layer of nonlinear-shift until a minimal degree polynomial is achieved.
The coefficient of the xmy term needing to be a multiple of n leaves the great majority of high degree polynomials for which the nonlinear-shifting reduction procedure does not apply. But for the purposes of gaging asymptotic-sparseness, the coefficient can be non-integer. The reduction procedure is then:
Given g(x,y) with degree in x twice or more its degree in y, and whose highest total-degree term involving y is cxmy where m≥4 and c is a real, replace g(x,y) by g(x,y-(c/n)xn) where n is the degree of y in g. This reduction step removes the leading term of q(x). Repeating this process will remove lower degree (but higher degree than 1) terms of q(x). As the result of a reduction, a pure x term may no longer be the highest degree term. The same process may then be carried out with the roles of x and y reversed, unwrapping layer after layer of nonlinear-shift until a minimal degree polynomial is achieved.

Structure Theorems

Theorem 2: The sum of asymptotically-sparse polynomial functions is asymptotically-sparse.

Let H be a large positive number and f(x,y) and g(x,y) be asymptotically-sparse polynomial functions. Because f and g return only nonnegative numbers, any x,y pair satisfying 0<f(x,y)+g(x,y)<H, also satisfies either 0<f(x,y)<H or 0<g(x,y)<H. Thus C(H,f+g)≤C(H,f)+C(H,g)<O(H); and f(x,y)+g(x,y) is asymptotically-sparse.

Theorem 3: The product of an asymptotically-sparse polynomial function and any nonnegative-valued polynomial function is asymptotically-sparse.

Let H be a large positive number and f(x,y) be an asymptotically-sparse polynomial function and g(x,y) be a polynomial function whose values are all nonnegative. Let L be the smallest nonzero value returned by g(x,y) for any integer values of x and y. Then g(x,y)/L will return 0 or values greater or equal to 1.

Any integer x,y pair satisfying 0<f(x,y)g(x,y)/L<H, also satisfies 0<f(x,y)<H. Thus C(H,fg/L)≤C(H,f)<O(H); hence f(x,y)g(x,y)/L is asymptotically-sparse; and by Lemma 2, Lf(x,y)g(x,y)/L=f(x,y)g(x,y) is asymptotically-sparse.