A Heuristic Approach to Calculating Geometric Medians

August 22, 2017

I was recently given an interesting problem:

Given a set of points in on a common plane , find a point on that minimizes the net cost of traveling from to each other point.

Ignoring the problem of calculating costThe differences between traveling by car, bike, and foot include important time and cost differences. Here, we’ll employ Euclidean distance as an uncomplicated, normalizing cost function. , this problem may seem fairly straightforward. Let’s look at some approaches.

0: as the Median

As it happens, the median of a set of points is guaranteed to be the cost-minimum center of those points only in space. We can show that the median does not hold in via a counter-example: create a triangle with vertices , the median of which is at :

A triangle with a median at . The net cost of traveling from the vertices to the median is .

The net cost of traveling from the vertices to the center point can be improved by if we make that center point .

A triangle with a median at . The net cost of traveling from the vertices to the center is .

For most, isn’t all that much, and the median is a pretty good approximation for the latter point - the geometric median More info. , or center as I will refer to it from here. But what if the margin of improvement grows higher, say to or or ? How do we derive that obscure value?

1: Function Minimization

What if we attempt to minimize the net cost function? Say that for some point , the net cost of traveling from a set of points with coordinate form to is

Note that this function does _not* have the same minimum as - that minimum is the median.

Hey, pretty good! Let’s do , , throw in a second derivative test, and call it a day.

Oh man. Maybe not. That looks like a pain in the ass to solve for , let alone simultaneously with I have no idea how to do this efficiently, but if you do, please let me know. Looks like it could be worth a lot of money :wink:. . Don’t forget this process would have to be automated, too - wow that looks fun!

2: Heuristic Approximation

You know the old saying:

I may be missing a couple of words, but you get the gist.

If you can’t derive ‘em, approximate ‘em!

…and so heuristics come to the rescue! How? Well, a good place to start is WolframAlpha a contoured version of the function whose extrema are in question.

Contour plots are an excellent way of determining the terrain of a function and what algorithms may be effective traversing it.

As expected, the contour slopes in around the previously-shown minimum . Interestingly, there seems to only be one minimum for this function - and in general, there is only one centerSee this paper, which shows that the geometric center is unique and convergent for any set of non-co-linear points. for any set of points! This means we can employ some descent-type optimization methodNo Zuck, this isn’t machine learning :triumph:. .

Okay, but where do we descend from?

A convenient place would be from the median. It’s cheap to compute and fairly close to the center.

How do we descend?

This is a bit trickier. Of course, we can only travel in , so let’s just say we can go up, down, left, or rightFormally, +y, -y, -x, or +x. A more thorough traversal could also employ diagonal directions. . To find the best direction of descent, we travel to a candidate point in each direction and determine the one that most improves the cost.

What if no direction lowers the cost?

Then we’ve found our center! Except we could probably do better, by decreasing our search radius and repeating the descent mechanism again. And again, and again, until the approximation of the geometric center is within some acceptable margin of error .

Alright, let’s code it!These examples are in Rust, but it should be easy to implement the logic in any language of your choice. The full code can be found here, with a C version here. First, the boilerplate - a euclidean cost function.

And now, we can use the previously-defined specification to write a complete algorithm.

Let’s plug in our three points from before, and…

[0.21126302083333331, 0.21126302083333331]


n: Some Details

This function runs in timeA method for exact computation of the geometric center in near-linear is proposed here. , and its precision can be improved by reducing (epsilon) to an arbitrary minimum.

<< All Posts :snake: