Using the cartopy shapereader
Cartopy provides an object-oriented shapefile reader, built on top of the pyshp module, that gives easy, programmatic access to standard vector datasets.
Cartopy’s wrapping of pyshp has the benefit of being pure Python, and is therefore easy to install and extremely portable. However, for heavy-duty shapefile I/O, Fiona and GeoPandas are highly recommended.
The detailed API for the shapereader functionality can be found in the documentation.
Helper functions for shapefile acquisition
Cartopy provides an interface for accessing frequently used data, such as the
GSHHS dataset and the datasets from the NaturalEarthData website.
These interfaces allow the user to specify the data programmatically; if the data does not exist
on disk, it is retrieved from the appropriate source (normally by
downloading it from the internet). The interfaces currently available are
natural_earth() and gshhs().
Using the shapereader
We can acquire the countries dataset from Natural Earth found at
https://www.naturalearthdata.com/downloads/110m-cultural-vectors/110m-admin-0-countries/
by using the natural_earth() function:
import cartopy.io.shapereader as shpreader

shpfilename = shpreader.natural_earth(resolution='110m',
                                      category='cultural',
                                      name='admin_0_countries')
From here, we can make use of the Reader to get the first country in the shapefile:
reader = shpreader.Reader(shpfilename)
countries = reader.records()
country = next(countries)
We can get the country’s attributes dictionary with the Record.attributes attribute:
>>> print(type(country.attributes))
<class 'dict'>
>>> print(sorted(country.attributes.keys()))
['abbrev', ..., 'name_long', ... 'pop_est', ...]
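The attribute keys come from the field names stored in the shapefile, so their case may differ between Natural Earth releases (for example 'POP_EST' rather than 'pop_est'). The following is a minimal sketch of a lookup that ignores key case. The helper name get_attribute and the two dictionaries are hypothetical and use made-up values; neither is part of cartopy.

```python
def get_attribute(attributes, name, default=None):
    """Return the value stored under ``name``, ignoring the case of the keys."""
    wanted = name.lower()
    for key, value in attributes.items():
        if key.lower() == wanted:
            return value
    return default


# Stand-in dictionaries (made-up values) imitating the two key styles.
lowercase_style = {'name_long': 'Exampleland', 'pop_est': 1000}
uppercase_style = {'NAME_LONG': 'Exampleland', 'POP_EST': 1000}

print(get_attribute(lowercase_style, 'pop_est'))
print(get_attribute(uppercase_style, 'pop_est'))
```

With a real record, the same call would be get_attribute(country.attributes, 'pop_est').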
We could now find the 5 least populated countries with:
reader = shpreader.Reader(shpfilename)
# define a function which returns the population given the country
population = lambda country: country.attributes['pop_est']
# sort the countries by population and get the first 5
countries_by_pop = sorted(reader.records(), key=population)[:5]
We can print these with:
>>> ', '.join([country.attributes['name_long']
...            for country in countries_by_pop])
'Western Sahara, French Southern and Antarctic Lands, Falkland Islands, Antarctica, Greenland'
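Sorting every record and then slicing works well for a small dataset. As an alternative sketch, heapq.nsmallest from the standard library selects the n smallest items directly and returns them in ascending order. StandInRecord below is a hypothetical stand-in that models only the attributes dictionary of a record, and all names and populations are made up. The function least_populated is illustrative, not part of cartopy.

```python
import heapq
from collections import namedtuple

# Hypothetical stand-in for a shapereader record: only ``attributes`` is modelled.
StandInRecord = namedtuple('StandInRecord', ['attributes'])


def least_populated(records, n=5, key='pop_est'):
    """Return the ``n`` records with the smallest ``key`` attribute, ascending."""
    return heapq.nsmallest(n, records, key=lambda rec: rec.attributes[key])


# Made-up data, used only to demonstrate the selection.
records = [
    StandInRecord({'name_long': 'Northland', 'pop_est': 5200}),
    StandInRecord({'name_long': 'Eastland', 'pop_est': 140}),
    StandInRecord({'name_long': 'Southland', 'pop_est': 87000}),
    StandInRecord({'name_long': 'Westland', 'pop_est': 910}),
]

names = [rec.attributes['name_long'] for rec in least_populated(records, n=2)]
print(', '.join(names))
```

Assuming the records expose the same attribute key, passing reader.records() in place of the stand-in list should give the equivalent result for the countries shapefile.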
Exercises:
SHP.1: Repeat the last example to show the 4 most populated African countries in the shapefile.
Hint: Look at the possible attributes to find out which continent a country belongs to.
Answer:
Democratic Republic of the Congo, Egypt, Ethiopia, Nigeria
SHP.2: Using the countries shapefile, find the most populated country for each first letter of the “name_long” attribute.
Hint: itertools.groupby() can help with the grouping.
Answer:
A Argentina
B Brazil
C China
D Democratic Republic of the Congo
E Ethiopia
F France
G Germany
H Hungary
I India
J Japan
K Kenya
L Lao PDR
M Mexico
N Nigeria
O Oman
P Pakistan
Q Qatar
R Russian Federation
S South Africa
T Turkey
U United States
V Vietnam
W Western Sahara
Y Yemen
Z Zimbabwe
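One detail of itertools.groupby() is easy to miss: it only merges adjacent items, so the input has to be sorted by the grouping key first. The sketch below shows the pattern on made-up (name, population) pairs rather than on the shapefile. The names places, first_letter and largest_per_letter are hypothetical.

```python
from itertools import groupby

# Made-up (name, population) pairs, not real data.
places = [('Alpha', 10), ('Beta', 7), ('Avalon', 25), ('Borea', 3), ('Cyan', 1)]


def first_letter(item):
    """Grouping key: the first letter of the name."""
    return item[0][0]


def largest_per_letter(items):
    """Map each first letter to the (name, population) pair with the largest population."""
    # groupby only merges adjacent items, so sort by the grouping key first.
    ordered = sorted(items, key=first_letter)
    return {letter: max(group, key=lambda item: item[1])
            for letter, group in groupby(ordered, key=first_letter)}


result = largest_per_letter(places)
for letter, (name, population) in result.items():
    print(letter, name)
```

Skipping the sort would yield a separate group for each run of equal keys, and later groups would overwrite earlier ones in the dictionary.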