Thursday, March 17, 2016

Acquiring Geonames Data


Primary Data

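These files can be downloaded from the Geonames data dump (http://download.geonames.org/export/dump/). After downloading, the directory looks like this: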
~/data/Data/geonames $ cd ../geonames-2/
~/data/Data/geonames-2 $ ls -lah
total 3568232
drwxr-xr-x  14 craigtrim  staff   476B Mar 17 10:38 .
drwxr-xr-x  17 craigtrim  staff   578B Mar 17 10:38 ..
-rw-r--r--@  1 craigtrim  staff   306M Mar 17 10:31 allCountries.zip
-rw-r--r--@  1 craigtrim  staff   111M Mar 17 10:14 alternateNames.zip
-rw-r--r--@  1 craigtrim  staff   7.6M Mar 17 10:11 cities1000.zip
-rw-r--r--@  1 craigtrim  staff   2.1M Mar 17 10:11 cities15000.zip
-rw-r--r--@  1 craigtrim  staff   3.5M Mar 17 10:11 cities5000.zip
-rw-r--r--@  1 craigtrim  staff   1.3M Mar 17 10:11 hierarchy.zip
-rw-r--r--@  1 craigtrim  staff   184K Mar 17 10:11 no-country.zip
-rw-r--r--@  1 craigtrim  staff   825K Mar 17 10:11 shapes_simplified_low.json.zip
-rw-r--r--@  1 craigtrim  staff   824K Mar 17 10:11 shapes_simplified_low.zip
-rw-r--r--@  1 craigtrim  staff   169K Mar 17 10:11 userTags.zip

"allCountries.zip" contains all countries combined in one file, see 'geoname' table for columns



Geonames Table


Field              Description
geonameid          integer id of record in geonames database
name               name of geographical point (utf8), varchar(200)
asciiname          name of geographical point in plain ascii characters, varchar(200)
alternatenames     alternatenames, comma separated, ascii names automatically transliterated, convenience attribute from alternatename table, varchar(10000)
latitude           latitude in decimal degrees (wgs84)
longitude          longitude in decimal degrees (wgs84)
feature class      see http
feature code       see http
country code       ISO-3166 2-letter country code, 2 characters
cc2                alternate country codes, comma separated, ISO-3166 2-letter country code, 200 characters
admin1 code        fipscode (subject to change to iso code), see exceptions below, see file admin1Codes.txt for display names of this code; varchar(20)
admin2 code        code for the second administrative division, a county in the US, see file admin2Codes.txt; varchar(80)
admin3 code        code for third level administrative division, varchar(20)
admin4 code        code for fourth level administrative division, varchar(20)
population         bigint (8 byte int)
elevation          in meters, integer
dem                digital elevation model, srtm3 or gtopo30, average elevation of 3''x3'' (ca 90mx90m) or 30''x30'' (ca 900mx900m) area in meters, integer. srtm processed by cgiar/ciat.
timezone           the timezone id (see file timeZone.txt), varchar(40)
modification date  date of last modification in yyyy-MM-dd format
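
To make the column order above concrete, here is a minimal sketch that pulls a few fields out of a row by position. It assumes the data files are tab-delimited with the 19 columns in the order listed. The helper names (geoname_summary, sample_row) and the sample row are made up for illustration and are not part of the Geonames dump.

```shell
#!/bin/bash

# Hypothetical helper: read tab-separated geoname rows on stdin and print
# geonameid, name, country code, and population (columns 1, 2, 9, and 15).
geoname_summary () {
  awk -F '\t' 'BEGIN { OFS = "\t" } { print $1, $2, $9, $15 }'
}

# Made-up sample row with all 19 columns, for illustration only.
# The first printf emits 18 fields, each followed by a tab;
# the second emits the final field (modification date) and a newline.
sample_row () {
  printf '%s\t' 1 "Example Town" "Example Town" "" 42.5 1.5 P PPL AD "" 07 "" "" "" 1234 "" 1000 "Europe/Andorra"
  printf '%s\n' 2016-03-17
}

sample_row | geoname_summary
```

Empty columns (such as cc2 or elevation in the sample) still count as fields, which is why the separator is set to a tab explicitly rather than left as awk's default whitespace splitting.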



 

Acquiring Files in Parts

As of 17-March-2016, all of the individual per-country Geonames ZIP files could be downloaded with the script below. If you are on OS X and don't have wget installed, install Homebrew first, then run brew install wget.

#!/bin/bash

# Download one country's ZIP file, skipping it if it is already present.
download () {
 if [ -f "$1.zip" ]
 then
  echo "$1.zip exists"
 else
  wget "http://download.geonames.org/export/dump/$1.zip"
 fi
}

download "AD"
download "AE"
download "AF"
download "AG"
download "AI"
download "AL"
download "AM"
download "AN"
download "AO"
download "AQ"
download "AR"
download "AS"
download "AT"
download "AU"
download "AW"
download "AX"
download "AZ"
download "BA"
download "BB"
download "BD"
download "BE"
download "BF"
download "BG"
download "BH"
download "BI"
download "BJ"
download "BL"
download "BM"
download "BN"
download "BO"
download "BQ"
download "BR"
download "BS"
download "BT"
download "BV"
download "BW"
download "BY"
download "BZ"
download "CA"
download "CC"
download "CD"
download "CF"
download "CG"
download "CH"
download "CI"
download "CK"
download "CL"
download "CM"
download "CN"
download "CO"
download "CR"
download "CS"
download "CU"
download "CV"
download "CW"
download "CX"
download "CY"
download "CZ"
download "DE"
download "DJ"
download "DK"
download "DM"
download "DO"
download "DZ"
download "EC"
download "EE"
download "EG"
download "EH"
download "ER"
download "ES"
download "ET"
download "FI"
download "FJ"
download "FK"
download "FM"
download "FO"
download "FR"
download "GA"
download "GB"
download "GD"
download "GE"
download "GF"
download "GG"
download "GH"
download "GI"
download "GL"
download "GM"
download "GN"
download "GP"
download "GQ"
download "GR"
download "GS"
download "GT"
download "GU"
download "GW"
download "GY"
download "HK"
download "HM"
download "HN"
download "HR"
download "HT"
download "HU"
download "ID"
download "IE"
download "IL"
download "IM"
download "IN"
download "IO"
download "IQ"
download "IR"
download "IS"
download "IT"
download "JE"
download "JM"
download "JO"
download "JP"
download "KE"
download "KG"
download "KH"
download "KI"
download "KM"
download "KN"
download "KP"
download "KR"
download "KW"
download "KY"
download "KZ"
download "LA"
download "LB"
download "LC"
download "LI"
download "LK"
download "LR"
download "LS"
download "LT"
download "LU"
download "LV"
download "LY"
download "MA"
download "MC"
download "MD"
download "ME"
download "MF"
download "MG"
download "MH"
download "MK"
download "ML"
download "MM"
download "MN"
download "MO"
download "MP"
download "MQ"
download "MR"
download "MS"
download "MT"
download "MU"
download "MV"
download "MW"
download "MX"
download "MY"
download "MZ"
download "NA"
download "NC"
download "NE"
download "NF"
download "NG"
download "NI"
download "NL"
download "NO"
download "NP"
download "NR"
download "NU"
download "NZ"
download "OM"
download "PA"
download "PE"
download "PF"
download "PG"
download "PH"
download "PK"
download "PL"
download "PM"
download "PN"
download "PR"
download "PS"
download "PT"
download "PW"
download "PY"
download "QA"
download "RE"
download "RO"
download "RS"
download "RU"
download "RW"
download "SA"
download "SB"
download "SC"
download "SD"
download "SE"
download "SG"
download "SH"
download "SI"
download "SJ"
download "SK"
download "SL"
download "SM"
download "SN"
download "SO"
download "SR"
download "SS"
download "ST"
download "SV"
download "SX"
download "SY"
download "SZ"
download "TC"
download "TD"
download "TF"
download "TG"
download "TH"
download "TJ"
download "TK"
download "TL"
download "TM"
download "TN"
download "TO"
download "TR"
download "TT"
download "TV"
download "TW"
download "TZ"
download "UA"
download "UG"
download "UM"
download "US"
download "UY"
download "UZ"
download "VA"
download "VC"
download "VE"
download "VG"
download "VI"
download "VN"
download "VU"
download "WF"
download "WS"
download "XK"
download "YE"
download "YT"
download "YU"
download "ZA"
download "ZM"
download "ZW"

It can be convenient to work with one of these per-country files when testing a parser or data reader, rather than reading the entire allCountries.txt file (1.3 GB) each time.
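
As one example of such a test, the sketch below checks that every row in a small file has the 19 tab-separated columns listed in the Geonames table. The function name check_geoname_columns is hypothetical, and the usage at the end assumes that AD.zip contains a data file named AD.txt, so adjust it to whatever your download actually holds.

```shell
#!/bin/bash

# Hypothetical check: read rows on stdin, report any row that does not have
# exactly 19 tab-separated fields, and exit non-zero if one was found.
check_geoname_columns () {
  awk -F '\t' '
    NF != 19 { printf "line %d: %d fields\n", NR, NF; bad = 1 }
    END { exit bad + 0 }
  '
}

# Assumed usage: stream the data file out of a small per-country archive.
# The AD.txt name inside AD.zip is an assumption, not taken from the post.
if [ -f AD.zip ]
then
  unzip -p AD.zip AD.txt | check_geoname_columns && echo "AD.txt looks well-formed"
fi
```

Running the same check against allCountries.txt works too, but a small country file gives an answer in well under a second.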

References

  1. [Official] Geonames Data Dump
