
We consider the numerical taxonomy problem of fitting an SxS distance matrix D with a tree metric T. Here T is a weighted tree spanning S where the path lengths in T induce a metric on S. If there is a tree metric matching D exactly, then it is easily found. If there is no exact match, then for some k, we want to minimize the L_{k} norm of the errors, that is, pick T so as to minimize DT_{k} = (Σ_{i,jϵS} D(i,j)T(i,j)^{k}) ^{1/k}.
An evolutionary tree induces a hierarchical classification of species and this is not just tied to biology. Medicine, ecology and linguistics are just some of the fields where this concept appears, and it is even an integral part of machine learning and data science. Fundamentally, if we can approximate distances with a tree, then they are much easier to reason about: many questions that are NPhard for general metrics can be answered in linear time on tree metrics. In fact, humans have appreciated hierarchical classifications at least since Plato and Aristotle (350 BC).
The numerical taxonomy problem is important in practice and many heuristics have been proposed. In this talk we will review the basic algorithmic theory, results and techniques, for the problem, including the most recent result from FOCS'21 [Vincent CohenAddad et al., 2021]. They paint a varied landscape with big differences between different moments, and with some very nice open problems remaining.
 At STOC'93, Farach, Kannan, and Warnow [Martin Farach et al., 1995] proved that under L_{∞}, we can find the optimal ultrametric. Almost all other variants of the problem are APXhard
 At SODA'96, Agarwala, Bafna, Farach, Paterson, and Thorup [Richa Agarwala et al., 1999] showed that for any norm L_{k}, k≥1, if the best ultrametric can be αapproximated, then the best tree metric can be 3αapproximated. In particular, this implied a 3approximation for tree metrics under L_{∞.}
 At FOCS'05, Ailon and Charikar [Nir Ailon and Moses Charikar, 2011] showed that for any L_{k}, k≥1, we can get an approximation factor of O(((log n)(log log n))^{1/k}) for both tree and ultrametrics. Their paper was focused on the L_{1} norm, and they wrote "Determining whether an O(1) approximation can be obtained is a fascinating question".
 At FOCS'21, CohenAddad, Das, Kipouridis, Parotsidis, and Thorup [Vincent CohenAddad et al., 2021] showed that indeed a constant factor is possible for L_{1} for both tree and ultrametrics. This uses the special structure of L_{1} in relation to hierarchies.
 The status of L_{k }is wide open for 1<k<∞. All we know is that the approximation factor is between Ω(1) and O((log n)(log log n)). 