{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Spatial relationships and operations"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"\n",
"import pandas as pd\n",
"import geopandas\n",
"\n",
"pd.options.display.max_rows = 10"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# o13 filepaths\n",
"countries = geopandas.read_file(\"/home/user/data/natural_earth2/ne_10m_admin_0_countries.shp\")\n",
"cities = geopandas.read_file(\"/home/user/data/natural_earth2/ne_10m_populated_places.shp\")\n",
"rivers = geopandas.read_file(\"/home/user/data/natural_earth2/ne_10m_rivers_lake_centerlines.shp\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Spatial relationships\n",
"\n",
"An important aspect of geospatial data is that we can look at *spatial relationships*: how two spatial objects relate to each other (whether they overlap, intersect, contain, .. one another).\n",
"\n",
"The topological, set-theoretic relationships in GIS are typically based on the DE-9IM model. See https://en.wikipedia.org/wiki/Spatial_relation for more information.\n",
"\n",
"![](img/TopologicSpatialRelarions2.png)\n",
"(Image by [Krauss, CC BY-SA 3.0](https://en.wikipedia.org/wiki/Spatial_relation#/media/File:TopologicSpatialRelarions2.png))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Relationships between individual objects"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's first create some small toy spatial objects:\n",
"\n",
"A polygon <small>(note: we use `.squeeze()` here to to extract the scalar geometry object from the GeoSeries of length 1)</small>:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"belgium = countries.loc[countries['name'] == 'Belgium', 'geometry'].squeeze()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cities.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Two points:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"paris = cities.loc[cities['NAME'] == 'Paris', 'geometry'].squeeze()\n",
"brussels = cities.loc[cities['NAME'] == 'Brussels', 'geometry'].squeeze()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And a linestring:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from shapely.geometry import LineString\n",
"line = LineString([paris, brussels])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's visualize those 4 geometry objects together (I only put them in a GeoSeries to easily display them together with the geopandas `.plot()` method):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"geopandas.GeoSeries([belgium, paris, brussels, line]).plot(cmap='tab10')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can recognize the abstract shape of Belgium.\n",
"\n",
"Brussels, the capital of Belgium, is thus located within Belgium. This is a spatial relationship, and we can test this using the individual shapely geometry objects as follow:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"brussels.within(belgium)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And using the reverse, Belgium contains Brussels:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"belgium.contains(brussels)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"On the other hand, Paris is not located in Belgium:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"belgium.contains(paris)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"paris.within(belgium)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The straight line we draw from Paris to Brussels is not fully located within Belgium, but it does intersect with it:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"belgium.contains(line)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"line.intersects(belgium)"
]
},
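{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same predicates can be tried on hand-made geometries, without loading any data. The cell below is a minimal sketch: `square`, `inside`, `outside` and `segment` are made-up toy objects (not part of the tutorial data), chosen so that the results are easy to reason about."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from shapely.geometry import LineString, Point, Polygon\n",
"\n",
"# Hypothetical toy geometries (not taken from the Natural Earth data)\n",
"square = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])\n",
"inside = Point(1, 1)\n",
"outside = Point(3, 1)\n",
"segment = LineString([(1, 1), (3, 1)])  # runs from `inside` to `outside`\n",
"\n",
"print(inside.within(square))       # the point lies in the interior of the square\n",
"print(square.contains(outside))    # this point lies outside the square\n",
"print(square.contains(segment))    # the segment leaves the square\n",
"print(segment.intersects(square))  # but it shares points with the square"
]
},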
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Spatial relationships with GeoDataFrames\n",
"\n",
"The same methods that are available on individual `shapely` geometries as we have seen above, are also available as methods on `GeoSeries` / `GeoDataFrame` objects.\n",
"\n",
"For example, if we call the `contains` method on the world dataset with the `paris` point, it will do this spatial check for each country in the `world` dataframe:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"countries.contains(paris)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because the above gives us a boolean result, we can use that to filter the dataframe:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"countries[countries.contains(paris)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And indeed, France is the only country in the world in which Paris is located."
]
},
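{
"cell_type": "markdown",
"metadata": {},
"source": [
"The boolean filtering shown above can be sketched on a tiny, hand-made GeoDataFrame as well. The `regions` table, its `label` column and the `location` point are hypothetical stand-ins for the countries and the city used in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import geopandas\n",
"from shapely.geometry import Point, box\n",
"\n",
"# Hypothetical toy regions standing in for countries\n",
"regions = geopandas.GeoDataFrame(\n",
"    {'label': ['west', 'east']},\n",
"    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],\n",
")\n",
"location = Point(1.5, 0.5)\n",
"\n",
"# One boolean per row: does this region contain the point?\n",
"mask = regions.contains(location)\n",
"print(mask)\n",
"\n",
"# Use the boolean Series to keep only the matching rows\n",
"print(regions[mask])"
]
},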
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Another example, extracting the linestring of the Amazon river in South America, we can query through which countries the river flows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"amazon = rivers[rivers['name'] == 'Amazonas'].geometry.squeeze()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#countries[countries.intersects(amazon)] # or .intersects\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"alert alert-info\" style=\"font-size:120%\">\n",
"<b>REFERENCE</b>: <br><br>\n",
"\n",
"Overview of the different functions to check spatial relationships (*spatial predicate functions*):\n",
"\n",
"<ul>\n",
" <li><code>equals</code></li>\n",
" <li><code>contains</code></li>\n",
" <li><code>crosses</code></li>\n",
" <li><code>disjoint</code></li>\n",
" <li><code>intersects</code></li>\n",
" <li><code>overlaps</code></li>\n",
" <li><code>touches</code></li>\n",
" <li><code>within</code></li>\n",
" <li><code>covers</code></li>\n",
"</ul>\n",
"\n",
"<p>\n",
"See https://shapely.readthedocs.io/en/stable/manual.html#predicates-and-relationships for an overview of those methods.\n",
"<p></p>\n",
"See https://en.wikipedia.org/wiki/DE-9IM for all details on the semantics of those operations.\n",
"</p>\n",
"</div>"
]
},
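{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see how a predicate relates to the DE-9IM model, Shapely geometries also have a `relate` method that returns the intersection matrix as a 9-character string. The cell below is a small sketch on made-up toy geometries (`square` and `inside` are not part of the tutorial data). It checks by hand the pattern `T*F**F***`, which is the DE-9IM pattern usually given for *within*."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from shapely.geometry import Point, Polygon\n",
"\n",
"# Hypothetical toy geometries (not taken from the Natural Earth data)\n",
"square = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])\n",
"inside = Point(1, 1)\n",
"\n",
"# DE-9IM intersection matrix, flattened row by row into 9 characters\n",
"matrix = inside.relate(square)\n",
"print(matrix)\n",
"\n",
"# Pattern T*F**F***: position 0 must not be 'F', positions 2 and 5 must be 'F'\n",
"matches_within = matrix[0] != 'F' and matrix[2] == 'F' and matrix[5] == 'F'\n",
"print(matches_within, inside.within(square))"
]
},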
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Spatial operations\n",
"\n",
"Next to the spatial predicates that return boolean values, Shapely and GeoPandas aslo provide analysis methods that return new geometric objects.\n",
"\n",
"<table>\n",
"<tr>\n",
" <td><img src=\"img/spatial-operations-base.png\"/> </td>\n",
" <td><img src=\"img/spatial-operations-intersection.png\"/> </td>\n",
"</tr>\n",
"<tr>\n",
" <td><img src=\"img/spatial-operations-union.png\"/> </td>\n",
" <td><img src=\"img/spatial-operations-difference.png\"/> </td>\n",
"</tr>\n",
"<tr>\n",
" <td><img src=\"img/spatial-operations-buffer-line.png\"/> </td>\n",
" <td><img src=\"img/spatial-operations-buffer-polygon.png\"/> </td>\n",
"</tr>\n",
"\n",
"</table>\n",
"\n",
"See https://shapely.readthedocs.io/en/stable/manual.html#spatial-analysis-methods for more details."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, using the toy data from above, let's construct a buffer around Brussels (which returns a Polygon):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"geopandas.GeoSeries([belgium, brussels.buffer(1)]).plot(alpha=0.5, cmap='tab10')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"and now take the intersection, union or difference of those two polygons:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"brussels.buffer(1).intersection(belgium)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"brussels.buffer(1).union(belgium)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"brussels.buffer(1).difference(belgium)"
]
},
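{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to check your understanding of these three operations is to compare areas on toy geometries. In this sketch, `square` and `circle` are made-up objects (not the Belgium data): the part of the circle inside the square plus the part outside it should add up to the whole circle."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from shapely.geometry import Point, box\n",
"\n",
"# Hypothetical toy geometries: a 2x2 square and a disc centred on its right edge\n",
"square = box(0, 0, 2, 2)\n",
"circle = Point(2, 1).buffer(1)  # polygon approximating a disc of radius 1\n",
"\n",
"common = circle.intersection(square)  # part of the disc inside the square\n",
"merged = circle.union(square)         # everything covered by either shape\n",
"outside = circle.difference(square)   # part of the disc outside the square\n",
"\n",
"# The two parts of the disc add up to the whole disc (up to rounding)\n",
"print(common.area + outside.area, circle.area)\n",
"\n",
"# The union is the square plus the part of the disc sticking out of it\n",
"print(merged.area, square.area + outside.area)"
]
},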
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Another useful method is the `unary_union` attribute, which converts the set of geometry objects in a GeoDataFrame into a single geometry object by taking the union of all those geometries.\n",
"\n",
"For example, we can construct a single object for the Africa continent:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"africa_countries = countries[countries['continent'] == 'Africa']"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"africa = africa_countries.unary_union"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"africa"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"print(str(africa)[:1000])"
]
},
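{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a smaller illustration of the same idea, the sketch below merges two made-up adjacent squares (the `tiles` GeoSeries is hypothetical) into one geometry. Note that recent GeoPandas releases also provide a `union_all()` method for this; the attribute used in this notebook is kept here."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import geopandas\n",
"from shapely.geometry import box\n",
"\n",
"# Two hypothetical unit squares that share an edge\n",
"tiles = geopandas.GeoSeries([box(0, 0, 1, 1), box(1, 0, 2, 1)])\n",
"\n",
"# Dissolve them into a single geometry\n",
"merged = tiles.unary_union\n",
"print(merged.geom_type, merged.area)"
]
},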
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"alert alert-info\" style=\"font-size:120%\">\n",
"<b>REMEMBER</b>: <br><br>\n",
"\n",
"GeoPandas (and Shapely for the individual objects) provides a whole lot of basic methods to analyse the geospatial data (distance, length, centroid, boundary, convex_hull, simplify, transform, ....), much more than the few that we can touch in this tutorial.\n",
"\n",
"\n",
"<ul>\n",
" <li>An overview of all methods provided by GeoPandas can be found here: http://geopandas.readthedocs.io/en/latest/reference.html</li>\n",
"</ul>\n",
"\n",
"</div>\n",
"\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}