Generating a simple geospatial index for MySQL

Searching for latitude and longitude values presents special problems for relational databases. By default databases like MySQL don’t handle queries like this very well:

SELECT * FROM table WHERE latitude BETWEEN w AND x AND longitude BETWEEN y AND z;

This is because it is not possible to index the latitude and longitude fields efficiently, so the query will almost always end up touching significantly more records than it needs to. It gets even worse if you try to implement a great circle lookup in SQL.

There are many ways to solve this problem – for example spatial extensions, or using a different database or search provider that supports spatial lookups.  However sometimes the requirement is so simple that you don’t want to overcomplicate your stack.

One answer is to provide your own spatial index to MySQL. A spatial index turns the world into a grid, and assigns each segment in the grid a unique identifier. There are many types of spatial grids – such as UTM or Marsden Squares. Often they were designed for a different era – many are alphanumeric, and many are complicated to calculate.

When approaching the problem at Chicmi, we decided that it was time for a simpler, quicker and numerical spatial index, that provided a numerical integer for every region of the world, and that could be used effectively in a relational database like MySQL. We came up with the PHP GridLatLong module.

With GridLatLong simply add one extra field wherever you have latitude and longitude fields – we call it gridref – which can be an INT. Make sure there’s an index on this field. Then when inserting or updating the latitude and longitude calculate the gridref value as follows:

$gridlatlong = new JamieMBrown\GridLatLong\GridLatLong();
$gridref = $gridlatlong->getGridReferences($latitude, $longitude)[0];

Insert that value in the gridref field alongside the latlong values.

Then when you want to search around a latlong, get all gridrefs within your desired search radius (in KM):

$gridlatlong = new JamieMBrown\GridLatLong\GridLatLong();
$gridrefs = $gridlatlong->getGridReferences($latitude, $longitude, $radius);

This returns an array of all of the matching gridrefs so that you can execute a query like this:

SELECT * FROM table WHERE gridref IN (gridrefs) AND latitude BETWEEN w AND x AND longitude BETWEEN y AND z;

This helps MySQL significantly narrow down the results to a much smaller area using its indexes, before doing the actual query.

There are a few extra options within GridLatLong – for example to make the grid more or less granular, or to change the units used for the search radius. Check out more over at the library page in GitHub. It’s freely available under an MIT license.


jamiembrown

http://www.jamiembrown.com/

Co-founder of Chicmi.com. Consultant for some of London's most exciting tech startups.