A very quick post to announce the launch of a public interface to our Gaze web gazetteer service. The motivation behind Gaze is collecting location information from users without using maps (a clunky approach with poor accessibility and licensing problems) or postcodes (which do not have universal coverage and have privacy issues as well as licensing problems). Instead the idea is to use place names to identify locations, even in the presence of ambiguity, alternate names, etc. We do this by providing a search service over a large gazetteer (2.2 million places and 3 million names), and supplying additional contextual information to disambiguate common place names. The API is very simple, with one major function and two other supporting ones.
Anyway, without further ado, here is the API. Internally we use one based on RABX, but we’ve done a special “RESTful” API for everyone else. All requests should be HTTP GETs; all parameters must be in UTF-8; and all responses are in UTF-8 plain text or comma-separated values. All calls should be passed to the URL,
http://gaze.mysociety.org/gaze-rest
selecting a particular function by specifying the HTTP parameter f, for instance
http://gaze.mysociety.org/gaze-rest?f=get_find_places_countries
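As a minimal sketch of how such a request URL can be assembled, the following Python builds the GET URL from the base address and the f parameter. The helper name gaze_url is hypothetical (it is not part of Gaze), and the sketch only constructs the URL; it does not contact the service.

```python
from urllib.parse import urlencode

# Base URL of the REST interface, as given in the post.
GAZE_URL = "http://gaze.mysociety.org/gaze-rest"


def gaze_url(function, **params):
    """Build the GET URL for one Gaze call (hypothetical helper)."""
    query = {"f": function}
    query.update(params)
    # urlencode percent-encodes values as UTF-8 by default,
    # which matches the requirement that parameters be in UTF-8.
    return GAZE_URL + "?" + urlencode(query)


print(gaze_url("get_find_places_countries"))
```

Further parameters are passed as keyword arguments, for example gaze_url("find_places", country="GB", query="Newport").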
Available functions are:
- get_country_from_ip
  - Parameters:
    - ip: IPv4 address of a host, in dotted-quad format
  Guess the country in which a host is located from its IP address. The result of this call will be an ISO country code, followed by a line feed; or, if it was not possible to determine a country, a line feed on its own.
- get_find_places_countries
  - No parameters.
  Return the list of countries for which the find_places call has a gazetteer available. The list is returned as ISO country codes, each followed by a line feed.
- find_places
  - Parameters:
    - country: ISO country code of the country in which to search for places
    - state: state in which to search for places; presently this is only meaningful for country=US (United States), in which case it should be a conventional two-letter state code (AZ, CA, NY etc.); optional
    - query: query term input by the user; must be at least two characters long
    - maxresults: largest number of results to return, from 1 to 100 inclusive; optional; default 10
    - minscore: minimum match score of returned results, from 1 to 100 inclusive; optional; default 0
  Returns a list of matching places in CSV format (as defined by this internet draft), with a one-line header. Each row has the following fields:
    - name: name of the place described by this row
    - in-qualifier: blank, or the name of an administrative region in which this place lies (for instance, a county)
    - near-qualifier: blank, or a list of nearby places, separated by commas
    - latitude: WGS-84 latitude of place in decimal degrees, north-positive
    - longitude: WGS-84 longitude of place in decimal degrees, east-positive
    - state: blank, or containing state code for US
    - score: match score for this place, from 0 to 100 inclusive
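The find_places response described above can be read with a standard CSV parser. The sketch below uses Python's csv module and reads the seven fields by position, skipping the one-line header. The sample text, including its header wording, place names and coordinates, is invented for illustration and is not real Gaze output; the Place type is hypothetical, and treating the score as an integer is an assumption.

```python
import csv
import io
from typing import NamedTuple


class Place(NamedTuple):
    """One result row; field order follows the list in the post."""
    name: str
    in_qualifier: str
    near_qualifier: str
    latitude: float
    longitude: float
    state: str
    score: int  # assumed to be an integer from 0 to 100


def parse_find_places(text):
    """Parse a find_places CSV response into a list of Place rows."""
    rows = csv.reader(io.StringIO(text, newline=""))
    next(rows, None)  # skip the one-line header
    places = []
    for row in rows:
        if not row:  # ignore blank lines
            continue
        name, in_q, near_q, lat, lon, state, score = row
        places.append(
            Place(name, in_q, near_q, float(lat), float(lon), state, int(score))
        )
    return places


# Invented sample data, NOT real output from the service.
SAMPLE = (
    "name,in,near,latitude,longitude,state,score\r\n"
    '"Exampleton","Exampleshire","Alpha, Beta",51.5,-3.0,,100\r\n'
    '"Exampleton","","",50.75,-1.25,,95\r\n'
)

for place in parse_find_places(SAMPLE):
    print(place.name, place.in_qualifier, place.latitude, place.longitude)
```

Reading fields by position rather than by header name avoids depending on the exact header text, which the post does not specify. The near-qualifier can itself contain commas, which is why a real CSV parser is used here rather than splitting each line on commas.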
Enjoy! Questions and comments to hello@mysociety.org, please.
Update: we’ve now added the facilities for discovering population densities and “customary proximity” (as discussed in this post) to Gaze. The additional APIs are documented here.
This is interesting, but CSV is not a nice format to parse easily – would you consider using a structured HTML response, like http://microformats.org/wiki/xoxo ?
Also, a useful bit of information missing from this result is a radius of interest for the named place: if you are going to present results to users on a map-based UI at any point, knowing how far to zoom in is important.
Out of interest Kevin, what are you using to parse with? I’d have thought CSV would be pretty easy – there are standard Perl and Python modules for it, for example. Parsing a tag soup strikes me as much harder.
What he said. We chose CSV because it’s standard and easy-to-parse; structured HTML seems to have neither advantage.
I wanted to make the Gaze lookup service an Asynchronous call (AJAX
without the X), but the problem lies in not typically being allowed to
make async calls to a non-local host. Therefore, I wrapped up the Gaze
service in a little PHP code (gaze-rest.php) that has the same API as
the actual gaze service, but acts like a ‘local service’.
http://highearthorbit.com/projects/geocode/geocode.html
The page then makes a Javascript call, which gets the values from the
form and makes the async call. The returned value is put in the
textarea.
Right now I have hardcoded the US and GB, but I plan on extending it to
fill the options dynamically via a get_find_places_countries
call to Gaze.
The source is available as a link at the bottom of the page.
CSV is easy to parse until you get commas within the fields.
There’s nothing in what’s been published that says that this won’t happen.
CSV is at least a regular language (no recursion), and so it can be parsed with a regular expression. By comparison, HTML needs a special (and very complicated) parser to handle. Commas within the fields aren’t exactly troublesome; the relevant RE is just something like,
/^((^|,)("([^"]|"")+"|[^,]+))*$/
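To make the point concrete, here is a small Python sketch that applies the pattern above (written with plain ASCII quotes) to a line containing a quoted comma. One caveat shows up when it is run: as written, the pattern rejects empty fields, because both alternatives require at least one character. The FIELD pattern and the split_line helper are illustrative additions, not part of Gaze, and relax that so blank fields are accepted.

```python
import re

# The pattern from the comment, with plain ASCII quotes.
CSV_LINE = re.compile(r'^((^|,)("([^"]|"")+"|[^,]+))*$')

# Illustrative variant matching one field at a time. A field is either
# quoted (with "" standing for a literal quote) or a run of non-commas;
# '*' instead of '+' lets empty fields through.
FIELD = re.compile(r'(?:^|,)("(?:[^"]|"")*"|[^,]*)')


def split_line(line):
    """Split one CSV line into unquoted field values (hypothetical helper)."""
    fields = []
    for raw in FIELD.findall(line):
        if raw.startswith('"') and raw.endswith('"') and len(raw) >= 2:
            # Strip the surrounding quotes and collapse doubled quotes.
            raw = raw[1:-1].replace('""', '"')
        fields.append(raw)
    return fields


line = 'Exampleton,"Alpha, Beta",95'
print(bool(CSV_LINE.match(line)))     # the quoted comma is accepted
print(split_line(line))
print(bool(CSV_LINE.match("a,,b")))   # an empty field is rejected
print(split_line("a,,b"))
```

For anything beyond a quick check, a standard CSV module is probably the safer choice, since it also handles line breaks inside quoted fields.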
I was enjoying this, but it seems to be offline now. Will it be back up?
And, assuming that you’re using data from the GEONet Names Server, may I ask where you found public data for the US?
The US government publish quite a lot of GIS data.
Try the geonames stuff (http://geonames.usgs.gov/)
RJ
Carsten — sorry about the delay answering this. As RJ says, the data for the US are from USGS; the dump for the whole country is at,
http://geonames.usgs.gov/stategaz/POP_PLACES_DECI.zip
and the program to parse it is here:
usgs-geonames-parse
(you can get that from our public CVS too).
Sorry if I’m being daft here, but what is the format of the query parameter? If I want to know the place names near a given lat/long, how do I do it? For example:
http://gaze.mysociety.org/gaze-rest?f=find_places;country=GB;lat=51.53;lon=-0.1020
wants the query parameter, but I can’t find it documented…
Blaine
There isn’t an API which returns the places near a given longitude and latitude. I don’t think our database is indexed to make that easy to do – it is the other way round, to find a latitude and longitude given a place.
As Francis says — I don’t think we have the appropriate geographic indexes. We could probably add them (I’d have to check whether this has any unpleasant resource requirements, but it oughtn’t to) and an API to do a find-places-near-location query; best of all would be if you could offer a patch — the relevant code is here: Gaze.pm, the web interface here: gaze-rest.cgi, and the database schema here: schema.sql. Access to our CVS repository is described here; you want module mysociety/services/Gaze, though it has some dependencies on mysociety/perllib too. There’s appropriate SQL you can copy in pb/db/schema.sql, I think.
I’m afraid that making a local installation is a little bit involved, but drop a mail to chris@mysociety.org or leave questions as comments here if you get stuck. (The latter might be better, since then the results are available to others too.) You could also join the (fairly low-traffic) mysociety-devchat mailing list if you like.
Chris,
This is exactly what a great web service should be: useful and easy to use.
Nice work!
The link to the CSV internet draft doesn’t seem to work any more. Maybe:
http://www.rfc-editor.org/rfc/rfc4180.txt
How can I get a geocode using a regular expression in PHP? Please send me an example.
You might be interested in the “geo” microformat:
http://microformats.org/wiki/geo