Gaze web service

Thursday, September 15th, 2005 by Chris Lightfoot

A very quick post to announce the launch of a public interface to our Gaze web gazetteer service. The motivation behind Gaze is collecting location information from users without using maps (a clunky approach with poor accessibility and licensing problems) or postcodes (which do not have universal coverage and have privacy issues as well as licensing problems). Instead the idea is to use place names to identify locations, even in the presence of ambiguity, alternate names, etc. We do this by providing a search service over a large gazetteer (2.2 million places and 3 million names), and supplying additional contextual information to disambiguate common place names. The API is very simple, with one major function and two other supporting ones.

Anyway, without further ado, here is the API. Internally we use one based on RABX, but we’ve done a special “RESTful” API for everyone else. All requests should be HTTP GETs; all parameters must be in UTF-8; and all responses are in UTF-8 plain text or comma-separated values. All calls should be passed to the URL,

http://gaze.mysociety.org/gaze-rest

selecting a particular function by specifying the HTTP parameter f, for instance

http://gaze.mysociety.org/gaze-rest?f=get_find_places_countries
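
As a minimal sketch of this calling convention, the Python below builds a request URL from a function name and its parameters. The helper name gaze_url is made up for illustration; only the base URL and the f parameter come from the description above. It only constructs the URL and does not contact the service.

```python
from urllib.parse import urlencode

GAZE_BASE = "http://gaze.mysociety.org/gaze-rest"  # base URL given above


def gaze_url(function, **params):
    """Build a GET URL for a Gaze call (hypothetical helper, not part of Gaze)."""
    query = {"f": function}
    query.update(params)
    # urlencode percent-encodes parameter values as UTF-8 by default
    return GAZE_BASE + "?" + urlencode(query)


print(gaze_url("get_find_places_countries"))
```

The resulting string can be passed to any HTTP client that issues a plain GET.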

Available functions are:

get_country_from_ip

Parameters:
ip
IPv4 address of a host, in dotted-quad format

Guess the country of location of a host from its IP address. The result of this call will be an ISO country code, followed by a line feed; or, if it was not possible to determine a country, a line feed on its own.
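
A rough sketch of how a client might handle this call, assuming the response body has already been fetched and decoded as UTF-8 text. Both function names are invented for illustration. The dotted-quad check uses Python's standard ipaddress module and is a client-side convenience, not something the service is known to require.

```python
import ipaddress
from urllib.parse import urlencode

GAZE_BASE = "http://gaze.mysociety.org/gaze-rest"


def country_lookup_url(ip):
    """Build the get_country_from_ip URL, rejecting anything that is not IPv4."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    return GAZE_BASE + "?" + urlencode({"f": "get_country_from_ip", "ip": str(addr)})


def parse_country_response(body):
    """Return the ISO country code, or None if the body is a bare line feed."""
    code = body.strip()
    return code or None
```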

get_find_places_countries

No parameters.

Return the list of countries for which the find_places call has a gazetteer available. The list is returned as a list of ISO country codes followed by line feeds.
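
Since the response is one country code per line, splitting it takes very little code. The sketch below assumes the body is already decoded text; the function name and the sample body are made up for illustration.

```python
def parse_country_list(body):
    """Split a get_find_places_countries response into a list of ISO codes."""
    # Drop empty lines so a trailing line feed does not produce a blank entry
    return [line.strip() for line in body.splitlines() if line.strip()]


# Made-up sample body, not real service output
print(parse_country_list("GB\nUS\nFR\n"))
```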

find_places

Parameters:
country
ISO country code of country in which to search for places
state
state in which to search for places; presently this is only meaningful for country=US (United States), in which case it should be a conventional two-letter state code (AZ, CA, NY etc.); optional
query
query term input by the user; must be at least two characters long
maxresults
largest number of results to return, from 1 to 100 inclusive; optional; default 10
minscore
minimum match score of returned results, from 1 to 100 inclusive; optional; default 0
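
The constraints above can be checked on the client before a request is sent. This is only a sketch: the function name is hypothetical, and because the text gives the minscore range as 1 to 100 but its default as 0, the check below assumes 0 is also acceptable.

```python
from urllib.parse import urlencode


def find_places_params(country, query, state=None, maxresults=10, minscore=0):
    """Validate find_places arguments and return them as a parameter dict."""
    if len(query) < 2:
        raise ValueError("query must be at least two characters long")
    if not 1 <= maxresults <= 100:
        raise ValueError("maxresults must be between 1 and 100")
    # Assumption: 0 is allowed because it is the documented default
    if not 0 <= minscore <= 100:
        raise ValueError("minscore must be between 0 and 100")
    params = {
        "f": "find_places",
        "country": country,
        "query": query,
        "maxresults": maxresults,
        "minscore": minscore,
    }
    if state is not None:
        # Only meaningful for country=US, per the description above
        params["state"] = state
    return params


print(urlencode(find_places_params("US", "Springfield", state="IL")))
```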

Returns results in CSV format (as defined by this internet draft): a one-line header, followed by one row per matching place with the following fields:

name
name of the place described by this row
in-qualifier
blank, or the name of an administrative region in which this place lies (for instance, a county)
near-qualifier
blank, or a list of nearby places, separated by commas
latitude
WGS-84 latitude of place in decimal degrees, north-positive
longitude
WGS-84 longitude of place in decimal degrees, east-positive
state
blank, or containing state code for US
score
match score for this place, from 0 to 100 inclusive
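
Because the near-qualifier is itself a comma-separated list, that field will be quoted in the CSV, so a real CSV parser is safer than splitting on commas. The sketch below uses Python's standard csv module and reads fields by position, in the order listed above. The header text and both sample rows are invented for illustration, and treating the score as an integer is an assumption.

```python
import csv
import io

# Field order as listed above; these key names are our own choice
FIELDS = ("name", "in_qualifier", "near_qualifier",
          "latitude", "longitude", "state", "score")

# Made-up sample response, not real service output
SAMPLE = (
    "Name,In,Near,Latitude,Longitude,State,Score\n"
    'Cambridge,Cambridgeshire,"Girton, Histon",52.2,0.1167,,100\n'
    "Cambridge,,,42.3667,-71.1,MA,100\n"
)


def parse_find_places(body):
    """Parse a find_places CSV response into a list of dicts."""
    rows = csv.reader(io.StringIO(body, newline=""))
    next(rows, None)  # skip the one-line header
    places = []
    for row in rows:
        if len(row) != len(FIELDS):
            continue  # ignore blank or malformed rows
        place = dict(zip(FIELDS, row))
        place["latitude"] = float(place["latitude"])
        place["longitude"] = float(place["longitude"])
        place["score"] = int(place["score"])  # assumed to be an integer
        # Split the near-qualifier into individual place names
        place["near_qualifier"] = [
            p.strip() for p in place["near_qualifier"].split(",") if p.strip()
        ]
        places.append(place)
    return places


for place in parse_find_places(SAMPLE):
    print(place["name"], place["in_qualifier"] or place["state"], place["score"])
```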

Enjoy! Questions and comments to chris@mysociety.org, please.

Update: we’ve now added the facilities for discovering population densities and “customary proximity” (as discussed in this post) to Gaze. The additional APIs are documented here.

17 Responses to “Gaze web service”

  1. Kevin Marks Says:

    This is interesting, but CSV is not a nice format to parse easily – would you consider using a structured HTML response, like http://microformats.org/wiki/xoxo ?

    Also, a missing, but useful bit of information with this result is a radius of interest for the named place, as if you are going to present results to users on a map-based UI at any point, knowing how far to zoom in is important.

  2. Francis Irving Says:

    Out of interest Kevin, what are you using to parse with? I’d have thought CSV would be pretty easy – there are standard Perl and Python modules for it, for example. Parsing a tag soup strikes me as much harder.

  3. Chris Lightfoot Says:

    What he said. We chose CSV because it’s standard and easy-to-parse; structured HTML seems to have neither advantage.

  4. Andrew Turner Says:

    I wanted to make the Gaze lookup service an asynchronous call (AJAX without the X), but the problem lies in not typically being allowed to make async calls to a non-local host. Therefore, I wrapped up the Gaze service in a little PHP code (gaze-rest.php) that has the same API as the actual Gaze service, but acts like a ‘local service’.

    http://highearthorbit.com/projects/geocode/geocode.html

    The page then makes a JavaScript call, which gets the values from the form and makes the async call. The returned value is put in the textarea.

    Right now I have hardcoded US and GB, but I plan on extending it to fill the options dynamically via a get_find_places_countries call to Gaze.

    The source is available as a link at the bottom of the page.

  5. Sam Says:

    CSV is easy to parse until you get commas within the fields.

    There’s nothing in what’s been published that says that this won’t happen.

  6. Chris Lightfoot Says:

    CSV is at least a regular language (no recursion), and so it can be parsed with a regular expression. By comparison, HTML needs a special (and very complicated) parser to handle. Commas within the fields aren’t exactly troublesome; the relevant RE is just something like,

    /^((^|,)("([^"]|"")+"|[^,]+))*$/

  7. Carsten Says:

    I was enjoying this, but it seems to be offline now. Will it be back up?

    And, assuming that you’re using data from the GEONet Names Server, may I ask where you found public data for the US?

  8. RJ Says:

    The US government publish quite a lot of GIS data.
    Try the geonames stuff (http://geonames.usgs.gov/)

    Here’s a direct link to the data by state:
    http://geonames.usgs.gov/stategaz/index.html

    RJ

  9. Chris Lightfoot Says:

    Carsten — sorry about the delay answering this. As RJ says, the data for the US are from USGS; the dump for the whole country is at,
    http://geonames.usgs.gov/stategaz/POP_PLACES_DECI.zip
    and the program to parse it is here:
    usgs-geonames-parse
    (you can get that from our public CVS too).

  10. Blaine Price Says:

    Sorry if I’m being daft here, but what is the format of the query parameter? If I want to know the place names near a given lat/long, how do I do it? For example:
    http://gaze.mysociety.org/gaze-rest?f=find_places;country=GB;lat=51.53;lon=-0.1020

    wants the query parameter, but I can’t find it documented…

    Blaine

  11. Francis Irving Says:

    There isn’t an API which returns the places near a given longitude and latitude. I don’t think our database is indexed to make that easy to do – it is the other way round, to find a latitude and longitude given a place.

  12. Chris Lightfoot Says:

    As Francis says — I don’t think we have the appropriate geographic indexes. We could probably add them (I’d have to check whether this has any unpleasant resource requirements, but it oughtn’t to) and an API to do a find-places-near-location query; best of all would be if you could offer a patch — the relevant code is here: Gaze.pm, the web interface here: gaze-rest.cgi, and the database schema here: schema.sql. Access to our CVS repository is described here; you want module mysociety/services/Gaze, though it has some dependencies on mysociety/perllib too. There’s appropriate SQL you can copy in pb/db/schema.sql, I think.

    I’m afraid that making a local installation is a little bit involved, but drop a mail to chris@mysociety.org or leave questions as comments here if you get stuck. (The latter might be better, since then the results are available to others too.) You could also join the (fairly low-traffic) mysociety-devchat mailing list if you like.

  13. Paul Clarke Says:

    Chris,

    This is exactly what a great web service should be: useful and easy to use.

    Nice work!

  14. Daniel Says:

    The link to the CSV internet draft doesn’t seem to work any more. Maybe:

    http://www.rfc-editor.org/rfc/rfc4180.txt

  15. smithvp Says:

    How do I get a geocode using a regular expression in PHP? Please send me an example.

  16. Andy Mabbett Says:

    You might be interested in the “geo” microformat:

    http://microformats.org/wiki/geo

  17. Павло Says:

    Without exaggeration, it can be said for certain that the post covered the topic one hundred percent. :)


mySociety is a project of UK Citizens Online Democracy (UKCOD). UKCOD is a registered charity in England and Wales, no. 1076346.