## Info

This tropical line is the union of three half-rays:

T(If) = {(p, 0) : p € R>o} U {(0,p): p € R>o} U {(-p, -p) : p € R>o}.

Let g : R1 ^ R2 be the tropicalization of the linear map f : R1 ^ R2, and let g' be the tropicalization of f'. These piecewise-linear maps are given by g(w) = (trop(/1)(w), trop(/2)(w)) = (w ® 0, w ® 0) = (min(w, 0), min(w, 0))

and g'(w) = (trop(/1)(w), trop(/2)(w)) = (w ® 0, w) = (min(w, 0), w). These map R1 onto one or two of the three halfrays of the tropical line T(If): image(g) = {(-p, -p) : p € R>o}, image(g') = {(0,p): p € R>o} U {(-p, -p) : p € R>o} = T+ (If).

The tropicalization g' of the surjectively positive map f' maps onto the positive tropical variety. This is an example for the result in Theorem 3.42 below. □

Returning to our general discussion, consider an arbitrary polynomial map f : Cd ^ Cm. The tropicalization of f is the piecewise-linear map g : Rd ^ Rm , p ^ (g1(p),g2(p),... ,gm(p)), (3.37)

where g» = trop(/j) is the tropicalization of the ith coordinate polynomial / of f. To describe the geometry of the tropical polynomial map g, we consider the Newton polytopes NP(/1),NP(/2),...,NP(/m) of the coordinates of f. Recall from Section 2.3 that the Newton polytope NP(/j) of the polynomial / is the convex hull of the vectors (u1, u2,..., ud) such that ^U1 ^22 ••• ^ appears with non-zero coefficient in /». The cones in the normal fan of the Newton polytope NP(/j) are the domains of linearity of the piecewise-linear map g» : Rd ^ Rm. The Newton polytope of the map f is defined as the Newton polytope of the product of the coordinate polynomials:

NP(f) := NP(/1 • /2 /m) = NP(/1) 0 NP(/2) ©•••© NP(/m). (3.38)

Here the operation © is the Minkowski sum of polytopes (Theorem 2.26). The following theorem describes the geometry of tropicalizing polynomial maps.

Theorem 3.42 The tropical polynomial map g : Rd ^ Rm is linear on each cone in the normal fan of the Newton polytope NP(f). Its image lies inside the tropical variety T(If). If f is positive then image(g) lies in the positive tropical variety T +(If). If f is surjectively positive then image(g) = T+(If).

Proof See [Pachter and Sturmfels, 2004b] and [Speyer and Williams, 2004].

This theorem is fundamental for the study of inference functions of statistical models in Chapter 9. Imagine that w € Rd is a vector of weights for a dynamic programming problem that is derived from a statistical model. The relationship between the weights and the model parameters is w» ~ log(0j).

Evaluating the tropical map g at w means solving the dynamic programs for all possible observations. As we vary the weights w, the vector of outcomes is piecewise constant. Whenever w crosses a boundary, the system undergoes a "phase transition", meaning that the outcome changes for some observation. Theorem 3.42 offers a geometric characterization of these phase transitions, taking into account all possible weights and all possible observations.

If the model has only one parameter, then the Newton polytope NP(f) is a line segment. There are only two "phases", corresponding to the two vertices of that segment. In Example 3.41 the "phase transition" occurs at w = 0.

One important application of this circle of ideas is sequence alignment (Section 2.2). The statistical model for alignment is the pair HMM f in (2.15). The tropicalization g of the polynomial map f is the tropicalized pair HMM, whose coordinates g are featured in (2.16). Parametric alignment is discussed in Chapters 5, 7, 8 and 9. We conclude this section with two other examples.

Example 3.43 DiaNA's model in Example 3.40 has d =16 parameters

0 = (^l,Ai,Ai,\$, A2,&2,^t2, Ya , Yc , Yg , Yi, Ya , Yc , Yg , Yt ), and is specified by the homogeneous polynomial map f : Ci6 ^ c4x4 with Pij = + Yi Yj2, where i, j € {A, C, G, T}.

We know from the linear algebra literature [Cohen and Rothblum, 1993] that every positive 4 x 4-matrix of rank < 2 is the sum of two positive 4 x 4-matrices of rank < 1. This means that DiaNA's model f is a surjectively positive map. The tropicalization of f is the piecewise-linear map g : Ri6 ^ R4x4 given by

= ^ © \$ ® Yi 0 Y2 = min(# + j Yi + Yj2) for i, j € {A, C, G, T}.

Theorem 3.42 says that the image of g equals the positive tropical variety T + (Ig). The space T +(1g) consists of all 4 x 4-matrices of Barvinok rank < 2 and was studied in [Develin et al., 2003] and [Develin and Sturmfels, 2004]. The Newton polytope NP(f) of the map f is a zonotope, i.e., it is a Minkowski sum of line segments. The map g is piecewise linear with respect to the hyperplane arrangement dual to that zonotope. For a detailed combinatorial study of the map g and the associated hyperplane arrangement see [Ardila, 2005]. □

Example 3.44 We consider the hidden Markov model of length n = 3 with binary states (l = l' = 2) but, in contrast to Example 3.19, we suppose that all eight parameters 000, 00i, 0i0, 0ii, 000,00i, 0i0,0ii are independent unknowns. Thus our model is the homogeneous map f : C8 ^ C8 with coordinates

/CT1CT2CT3 = 000°0000(71 C2 00<r3 + °0000i°0CTl C2 °i<73 + 00i0i000CT1 °i(72 o0.3 + 00i0ii00CTi 0i^2 0iff3 + Oi0°00°i0-1 00CT2 00ct3 + 0i000i0iCTi 00CT2 01CT3

The implicitization techniques of Section 3.2 reveal that If is generated by

P011 P2oo - PoolPno + PoooPollPi0l - PoooPi0lPllo + PoooPollPno

2 2 2 2 2 -PoolP5ioPiii + PcolPlooP111 + poioPlooP111 - PoolPlooPlll - PoooPSllPllo

-PoolPollPlooPlol - PoloPollPlooPlol + PoolPoloPollPllo - PoloPollPlooPllo

+PoolPoloPlolPllo + PoolPlooPlolPllo + PoooPoloPollPlll - PoooPollPlooPlll

-PoooPoolPlolPlll + PoooPlooPlolPlll + PoooPoolPlloPlll - PoooPoloPlloPlll.

Thus T(If) is the tropical hypersurface defined by this degree-four polynomial. The tropicalized HMM is the map g : R8 — R8, (w, w') — q with coordinates q^i^s = min{ whlh2 +wh2h3 +whiai +wh2^2 +whsa3 : (h1,h2,h3) e {0,1}3 }.

This minimum is attained by the most likely explanation (h1, h2,h3) of the observation (ct1,ct2,ct3). Hence, evaluating the tropical polynomial qaia2a3 solves the inference task in MAP estimation. See Section 4.4 for the general definition of MAP inference.

Let us see an example. Consider the parameters w = 68 l5 and w' = o8 88 .

The observation ^1^3 = 000 001 010 011 100 101 110 111 has the explanation hl/^ = 000 001 000 011 000 111 110 111

We call {0,1}3 — {0,1}3, — h1h2h3 the inference function for the parameters (w, w'). There are 88 = 16, 777,216 functions from {0,1}3 to itself, but only 398 of them are inference functions. In Chapter 9 it is shown that the number of inference functions is polynomial in the sequence length n, while the number of functions from {0,1}n to itself is doubly-exponential in n.

The inference functions are indexed by the vertices of the Newton polytope NP(f). In our example, polymake reveals that NP(f) is 5-dimensional and has 398 vertices, 1136 edges, 1150 two-faces, 478 ridges and 68 facets. Thus there are 398 inference functions, and we understand their phase transitions.

However, there are still many questions we do not yet know how to answer. What is the most practical method for listing all maximal cones in the image of g ? How does the number of these cones compare to the number of vertices of NP(f)? Is the hidden Markov model f surjectively positive? Which points of the positive tropical variety T+(/f) he in image(g)? □

### 3.5 The tree of life and other tropical varieties

At the 1998 International Congress of Mathematicians in Berlin, Andreas Dress presented an invited lecture titled "The tree of life and other affine buildings" [Dress and Terhalle, 1998]. Our section title is meant as a reference to that paper, which highlighted the importance and utility of understanding the geometry and structure of phylogenetic trees and networks. In Section 4.5 we return to this topic by explaining how the space of trees and its generalizations are relevant for modeling and reconstructing the tree of life.

Here we begin with a reminder of the definition of metrics and tree metrics, which were introduced in Section 2.4. A metric on [n] = {1,2,...,n} is a dissimilarity map for which the triangle inequality holds. Most metrics D are not tree metrics. The set of tree metrics is the space of trees T^. This is a (2n — 3)-dimensional polyhedral fan inside -dimensional cone of all metrics. Membership in Tn is characterized by the four-point condition (Theorem 2.36). Our goal here is to derive an interpretation of Tn in tropical geometry.

Let Q = (qj) be a symmetric matrix with zeros on the diagonal whose (n) distinct off-diagonal entries are unknowns. For each quadruple {i,j, k,1} C {1,2,..., n} we consider the quadratic tropical polynomial gjfcKQ) = qj 0 0 9ik 0 qjl 0 qa 0 qjk. (3.39)

This tropical polynomial is the tropicalization of the Pliicker relation (3.14) gjfci(Q) = trop(pifcpj7 — pjp« — PilPjfc)

which defines a tropical hypersurface T(g^z) in the space rW. In fact, the Pliicker relations (3.14) form a tropical basis for the Pliicker ideal /2,n [Speyer and Sturmfels, 2004]. This implies that the tropical Grassmannian T(I2,n) equals the intersection of these (n) tropical hypersurfaces, i.e.,