Download all the cities, towns, villages
Our dataset is a comprehensive collection of geographic information encompassing cities, towns, villages, and hamlets from around the world. This rich repository of data has been meticulously sourced from OpenStreetMap, a collaborative project renowned for its accuracy and global coverage.
It includes names, translations, address details, population figures, precise geographical coordinates, and bounding box coordinates. With the inclusion of unique identifiers like osm_id and osm_type, you can easily link back to the original OSM objects using the Geoapify Place Details API.
About This Dataset
Our dataset is compiled from geographical data exported directly from the OpenStreetMap (OSM) database. We exported every object that carries one of the following place tag values: place=city, place=town, place=village, or place=hamlet. As a result, the dataset covers human settlements of all sizes, from major cities to small hamlets, providing a holistic view of global locations.
Within this dataset, you'll find a wealth of information, meticulously structured to meet various research, development, and analytical needs. Each data entry includes essential details such as names (including translations), address information (such as county, state, etc.), population figures, precise geographical coordinates, and bounding box coordinates.
Here's a table describing the fields of each record:
Field | Description |
---|---|
name | The name of the place, e.g., "North Vancouver," "West Kelowna," "Edmonton." |
other_names | A dictionary of alternate names in various languages and official names where applicable. |
display_name | The full display name, including location details such as regional districts, states, and countries. |
address | Information about the location, including city name, county, state, country, country code, etc. |
population | The population of the place. |
osm_type | The data type in the OpenStreetMap database (e.g., "relation", "node" or "way"). |
osm_id | The unique identifier in the OpenStreetMap database for the place. |
location | The geographical coordinates of the place, given as [longitude, latitude]. |
bbox | The bounding box that defines the geographic extent of the place, given as [min longitude, min latitude, max longitude, max latitude]. |
Here is a data sample:
{"name":"North Vancouver","other_names":{"name:be":"Норт-Ванкувер","name:ru":"Норт-Ванкувер","name:ckb":"باکووری ڤانکوڤەر","name:zh-Hans":"北温哥华","name:zh-Hant":"北溫哥華","official_name":"City of North Vancouver"},"display_name":"North Vancouver, Metro Vancouver Regional District, British Columbia, Canada","address":{"city":"North Vancouver","county":"Metro Vancouver Regional District","state":"British Columbia","ISO3166-2-lvl4":"CA-BC","country":"Canada","country_code":"ca"},"population":48200,"osm_type":"relation","osm_id":1524228,"type":"administrative","location":[-123.0737925,49.3206294],"bbox":[-123.1026766,49.295636,-123.038138,49.3357132]}
{"name":"West Kelowna","display_name":"West Kelowna, Regional District of Central Okanagan, British Columbia, Canada","address":{"city":"West Kelowna","county":"Regional District of Central Okanagan","state":"British Columbia","ISO3166-2-lvl4":"CA-BC","country":"Canada","country_code":"ca"},"population":36078,"osm_type":"relation","osm_id":2221799,"type":"administrative","location":[-119.582876,49.8585784],"bbox":[-119.7266019,49.7881267,-119.5112063,49.9452394]}
{"name":"Edmonton","other_names":{"name:de":"Edmonton","name:en":"Edmonton","name:es":"Edmonton","name:fr":"Edmonton","name:it":"Edmonton","name:ar":"إدمونتون","name:cr":"ᐊᒥᐢᑿᒌᐚᐢᑲᐦᐃᑲᐣ","name:eu":"Edmonton","name:ja":"エドモントン市","name:ko":"에드먼턴","name:ru":"Эдмонтон","name:uk":"Едмонтон","name:bla":"Omahkoyisi","name:chp":"Kuę́ Nedhé","name:srs":"Nasagachoo","name:sto":"Tiʔoda","name:yue":"愛民頓","name:cr-Latn":"Amiskwacîwâskahikan","name:ja-Latn":"Edomonton-shi","name:ko-Latn":"Edeumeonteon","name:zh-Hant":"埃德蒙頓市","name:zh-Latn-pinyin":"Āidéméngdùn-shì"},"display_name":"Edmonton, Division No. 11, Alberta, Canada","address":{"city":"Edmonton","county":"Division No. 11","state":"Alberta","ISO3166-2-lvl4":"CA-AB","country":"Canada","country_code":"ca"},"population":1,"osm_type":"relation","osm_id":2564500,"type":"city","location":[-113.491241,53.5462055],"bbox":[-113.7138411,53.3372558,-113.2714783,53.7162646]}
We use ndjson (newline-delimited JSON): each line of a file is one complete JSON object, so you can read the files line by line. This makes it practical to process large datasets without loading an entire file into memory.
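As a minimal sketch of working with a single record, the snippet below parses one ndjson line and unpacks its coordinates. It assumes, based on the sample rows above, that location is ordered [longitude, latitude] and bbox is ordered [min longitude, min latitude, max longitude, max latitude]. The helper name parse_place is hypothetical and not part of the dataset or any library.

```python
import json


def parse_place(line: str) -> dict:
    """Parse one ndjson line and pull out commonly used fields."""
    record = json.loads(line)
    # Assumed order, inferred from the sample rows: [longitude, latitude]
    lon, lat = record["location"]
    # Assumed order: [min_lon, min_lat, max_lon, max_lat]
    min_lon, min_lat, max_lon, max_lat = record["bbox"]
    return {
        "name": record["name"],
        "lat": lat,
        "lon": lon,
        "population": record.get("population"),  # treated as optional here
        "inside_bbox": min_lon <= lon <= max_lon and min_lat <= lat <= max_lat,
    }


# Shortened version of the West Kelowna row from the data sample
sample_line = (
    '{"name":"West Kelowna","population":36078,"osm_type":"relation",'
    '"osm_id":2221799,"location":[-119.582876,49.8585784],'
    '"bbox":[-119.7266019,49.7881267,-119.5112063,49.9452394]}'
)
place = parse_place(sample_line)
print(place["name"], place["lat"], place["lon"])
```

The inside_bbox flag is only a sanity check on the assumed coordinate order: if the point falls outside its own bounding box, the order is probably being read the wrong way round.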
Instructions
To download the dataset of cities, towns, villages, and hamlets worldwide in JSON format, click the link below:
- After clicking the download link, you will be directed to a directory with zip archives. Each archive is dedicated to a specific country, containing up to four files: place-city.ndjson, place-town.ndjson, place-village.ndjson and place-hamlet.ndjson. This structure allows you to easily decide at which geographical level you want to access the data. Whether you're interested in comprehensive city data or need detailed information on smaller hamlets, you can choose the appropriate file for your project.
- After downloading, unzip the archive and select the specific JSON dataset file that suits your needs. You can then use ndjson to read the data line by line in your applications, databases, or analytical tools for a wide range of purposes, from urban planning and research to geographic analysis and software development.
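If you prefer not to unzip an archive first, the steps above can be sketched with Python's standard zipfile module, which reads a member file directly from the archive. This is a sketch under assumptions: the helper name iter_places is hypothetical, and it assumes the ndjson files sit at the top level of the archive under the file names listed above.

```python
import io
import json
import zipfile


def iter_places(zip_path, member="place-city.ndjson"):
    """Yield one dict per ndjson line from a file inside a country archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumes the member is stored at the archive root under this name
        with archive.open(member) as raw:
            # Decode bytes as UTF-8 text and iterate line by line
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                if line.strip():  # skip blank lines
                    yield json.loads(line)
```

Calling iter_places with the path of a downloaded archive yields records one at a time, so the file never has to be held in memory as a whole. If the files turn out to be stored in a subfolder, archive.namelist() shows the exact member names.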
Copyright and License
The data contained in this dataset is compiled from the OpenStreetMap (OSM) database and is subject to the copyright terms and conditions of OSM contributors.
OpenStreetMap® is open data, licensed under the Open Data Commons Open Database License (ODbL) by the OpenStreetMap Foundation (OSMF).
You are free to copy, distribute, transmit, and adapt the data, as long as you credit OpenStreetMap and its contributors. If you alter or build upon our data, you may distribute the result only under the same license.
Related Tutorials
Cities, Towns, Villages and Hamlets dataset: How to Use for Data Analysis
JavaScript Heatmap Example: A Step-by-Step Tutorial to Build Heatmaps with MapLibre GL
Frequently Asked Questions
How to get the original OSM object and its geometry?
To retrieve the original OSM object and its geometry, use the Geoapify Place Details API. Pass the 'osm_id' and 'osm_type' identifiers from our dataset in the request; the API returns detailed information about the place, including its boundaries, polygons, and additional attributes from the OpenStreetMap database.
Here is a URL example:
https://api.geoapify.com/v2/place-details?osm_id=62422&osm_type=r&apiKey=YOUR_API_KEY
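A request URL like the one above can be built from a dataset row. The sketch below only constructs the URL and does not send a request. The mapping of "relation" to "r" follows the URL example; the codes "n" and "w" for nodes and ways are assumptions made by analogy and should be checked against the Place Details API documentation. The helper name place_details_url is hypothetical.

```python
from urllib.parse import urlencode

# "relation" -> "r" matches the URL example; "n" and "w" are assumed by analogy
OSM_TYPE_CODES = {"relation": "r", "node": "n", "way": "w"}


def place_details_url(record, api_key):
    """Build a Place Details request URL from a dataset record."""
    params = {
        "osm_id": record["osm_id"],
        "osm_type": OSM_TYPE_CODES[record["osm_type"]],
        "apiKey": api_key,
    }
    return "https://api.geoapify.com/v2/place-details?" + urlencode(params)


url = place_details_url({"osm_type": "relation", "osm_id": 62422}, "YOUR_API_KEY")
print(url)
```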
I noticed duplicate rows in the dataset. What could be the reason for this?
The presence of duplicate rows in the dataset can be attributed to our comprehensive data import process. We have imported all rows from the OpenStreetMap database with specific tags, namely place=city, place=town, place=village, and place=hamlet. Sometimes, duplicates may arise when multiple objects reference the same geographical location. For example, one entry may represent the city boundary, while another describes the city center. Here is an example:
{"name":"Berlin",… ,"population":3769962,"osm_type":"relation","osm_id":62422,"type":"administrative","location":[13.3888599,52.5170365],"bbox":[13.088345,52.3382448,13.7611609,52.6755087]}
{"name":"Berlin",… ,"population":3769962,"osm_type":"node","osm_id":240109189,"type":"city","location":[13.3888599,52.5170365],"bbox":[13.2288599,52.3570365,13.5488599,52.6770365]}
The first row corresponds to OSM relation 62422, and the second row corresponds to OSM node 240109189.
This duplication results from how geographic data is organized in OpenStreetMap, with distinct objects representing various aspects of a place. We've included both entries to provide a more comprehensive view of each location within the dataset.
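If you need one row per place, one possible approach is to group rows that share a name and location and keep a single row per group. The sketch below is not how the dataset itself is built. It assumes that duplicates share exactly the same name and coordinates, as in the Berlin example, and the preference for relations over ways and nodes is an arbitrary choice made for illustration. The names drop_duplicates and TYPE_RANK are hypothetical.

```python
from collections import defaultdict

# Assumed preference when several OSM objects describe the same place:
# relation (boundary) first, then way, then node
TYPE_RANK = {"relation": 0, "way": 1, "node": 2}


def drop_duplicates(records):
    """Keep one record per (name, location) pair, preferring relations."""
    groups = defaultdict(list)
    for record in records:
        key = (record["name"], tuple(record["location"]))
        groups[key].append(record)
    return [
        min(group, key=lambda r: TYPE_RANK.get(r["osm_type"], 99))
        for group in groups.values()
    ]


# Shortened versions of the two Berlin rows
rows = [
    {"name": "Berlin", "osm_type": "relation", "osm_id": 62422,
     "location": [13.3888599, 52.5170365]},
    {"name": "Berlin", "osm_type": "node", "osm_id": 240109189,
     "location": [13.3888599, 52.5170365]},
]
unique = drop_duplicates(rows)
print(unique[0]["osm_type"], unique[0]["osm_id"])
```

Rows that describe the same place with slightly different coordinates would not be merged by this key, so treat it as a starting point rather than a complete solution.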
I noticed some cities are in "no_country.zip." Why is this the case?
"No country" cities may indicate that the data in OpenStreetMap lacks complete country information. This could be due to the nature of crowd-sourced data. You can cross-reference with additional sources or use other attributes provided to assign the city to the correct country.
How can I get all the cities in a specific U.S. state?
Use the address fields to find all places within a particular administrative area. Here is a Python code sample that finds all cities in the state of California:
# `data` is a list of records (dicts) read from an ndjson file
cities_in_california = [city for city in data if city.get('address', {}).get('state') == 'California']
Can you provide an example of how to read an ndjson file in Python?
Here's a Python example using pandas, followed by an alternative method using the json module, for reading ndjson files:
Using pandas:
import pandas as pd
# Read the ndjson file; lines=True parses one JSON object per line
data = pd.read_json('place-city.ndjson', lines=True)
Using json (alternative method):
import json
data = []
# Read the ndjson file line by line; each non-empty line is one JSON object
with open("place-city.ndjson", encoding="utf-8") as f:
    for line in f:
        if line.strip():
            data.append(json.loads(line))
Both methods read the whole ndjson file into memory for further processing: pandas returns a DataFrame, while the json method builds a list of dictionaries. You can choose the one that best fits your workflow.
Can you provide an example of how to read an ndjson file in JavaScript?
You can use the readline module to read ndjson files line by line in Node.js. Here's an example in JavaScript:
const readline = require('readline');
const fs = require('fs');
const data = [];
// Stream the file so that it is read one line at a time
const readStream = readline.createInterface({
  input: fs.createReadStream('place-village.ndjson')
});
// Parse each non-empty line as one JSON record
readStream.on('line', (line) => {
  if (line.trim()) {
    data.push(JSON.parse(line));
  }
});
readStream.on('close', () => {
  console.log(data); // Process the data as needed
});
What is the source of the population data in this dataset?
The population data in this dataset is sourced directly from OpenStreetMap, where contributors may provide population information for various locations. Keep in mind that this data is crowd-sourced and may not always be up-to-date or comprehensive.
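Because the values are crowd-sourced, it can help to validate them before analysis. The data sample, for instance, contains a row for Edmonton with a population of 1. The sketch below treats missing or very small values as unusable. The helper name usable_population and the default threshold of 2 are assumptions made for illustration, so adjust the threshold to your use case (a small hamlet could legitimately report a very low figure).

```python
def usable_population(record, minimum=2):
    """Return the population as an int, or None if missing or below `minimum`."""
    value = record.get("population")
    # Reject missing values, booleans, and non-numeric types
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    # `minimum` is an arbitrary, illustrative cut-off for placeholder values
    return int(value) if value >= minimum else None


print(usable_population({"name": "North Vancouver", "population": 48200}))
print(usable_population({"name": "Edmonton", "population": 1}))
```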
How frequently is this dataset updated?
Data about cities, towns, villages, and hamlets does not change frequently in OpenStreetMap (OSM); this kind of geographical data tends to be relatively stable over time. As a result, there is generally no immediate need for regular updates.
However, we are committed to maintaining the accuracy and reliability of this dataset. If you encounter problems or identify inaccuracies in the data, please don't hesitate to contact us. We will promptly address any issues and provide updates as needed to ensure the dataset remains a valuable and dependable resource for your use.