Full Package Reference
This page contains the full reference for the maup package, and
is included primarily for the sake of developers trying to hunt down errors
or understand the codebase.
Adjacencies
- exception maup.adjacencies.IslandWarning[source]
Bases:
UserWarning
- exception maup.adjacencies.OverlapWarning[source]
Bases:
UserWarning
- maup.adjacencies.adjacencies(geometries, adjacency_type='rook', output_type='geoseries', *, warn_for_overlaps=True, warn_for_islands=True)[source]
Returns adjacencies between geometries. The default return type is a GeoSeries with a MultiIndex, whose (i, j)th entry is the pairwise intersection between geometry i and geometry j. We ensure that i < j always holds, so that any adjacency is represented just once in the series. If output_type == “geodataframe”, the return type is a range-indexed GeoDataFrame with a “neighbors” column containing the pair (i,j) for the geometry consisting of the intersection between geometry i and geometry j.
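As a rough illustration of the rook/queen distinction and the i < j convention described above, the sketch below works on axis-aligned boxes given as (xmin, ymin, xmax, ymax) tuples instead of real geometries. The helpers touch_dims and box_adjacencies are hypothetical and are not part of maup; the real function returns intersection geometries, not just index pairs. The sketch assumes the usual definitions (rook neighbors share a boundary segment of positive length, queen neighbors share at least one boundary point) and assumes the boxes do not overlap.

```python
from itertools import combinations


def touch_dims(a, b):
    """Width and height of the region shared by two boxes (negative if apart)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width, height


def box_adjacencies(boxes, adjacency_type="rook"):
    """Return index pairs (i, j) with i < j for adjacent, non-overlapping boxes."""
    pairs = []
    for i, j in combinations(range(len(boxes)), 2):  # guarantees i < j
        width, height = touch_dims(boxes[i], boxes[j])
        if width < 0 or height < 0:
            continue  # the boxes do not touch at all
        shares_edge = (width == 0 and height > 0) or (height == 0 and width > 0)
        shares_point = width == 0 or height == 0
        if adjacency_type == "rook" and shares_edge:
            pairs.append((i, j))
        elif adjacency_type == "queen" and shares_point:
            pairs.append((i, j))
    return pairs


# A 2x2 grid of unit squares: 0 and 3 (and 1 and 2) only meet at a corner.
grid = [(0, 0, 1, 1), (1, 0, 2, 1), (0, 1, 1, 2), (1, 1, 2, 2)]
rook_pairs = box_adjacencies(grid, "rook")
queen_pairs = box_adjacencies(grid, "queen")
```

Each adjacency appears once because only pairs with i < j are generated; the corner-only contacts show up under queen adjacency but not under rook adjacency.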
- maup.adjacencies.iter_adjacencies(geometries)[source]
Assign
- exception maup.assign.AssigmentWarning[source]
Bases:
UserWarning
Warning raised when some source geometries are not assigned to any target.
- maup.assign.assign(sources, targets)[source]
Assign source geometries to targets. A source is assigned to the target that covers it, or, if no target covers the entire source, the target that covers the most of its area.
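The assignment rule above can be sketched on axis-aligned boxes, assuming (xmin, ymin, xmax, ymax) tuples stand in for real geometries. The names overlap_area and assign_boxes are hypothetical and do not reflect how maup implements assign. Because a covering target always has the largest possible overlap (the whole source), the sketch folds both cases into a single "largest overlap wins" rule.

```python
def overlap_area(a, b):
    """Area of the overlap of two boxes, or 0.0 if they do not overlap."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0.0


def assign_boxes(sources, targets):
    """Map each source index to the target index that covers most of its area."""
    assignment = {}
    for i, source in enumerate(sources):
        areas = {j: overlap_area(source, target) for j, target in enumerate(targets)}
        best = max(areas, key=areas.get)
        if areas[best] > 0:  # sources that meet no target stay unassigned
            assignment[i] = best
    return assignment


targets = [(0, 0, 2, 2), (2, 0, 4, 2)]
sources = [
    (0, 0, 1, 1),        # fully covered by target 0
    (1.5, 0, 3, 1),      # straddles both targets, mostly inside target 1
    (10, 10, 11, 11),    # touches no target
]
assignment = assign_boxes(sources, targets)
```

The third source is left out of the result, which is the situation that AssigmentWarning is documented to flag.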
- maup.assign.assign_by_area(sources, targets)[source]
- maup.assign.assign_by_covering(sources, targets)[source]
- maup.assign.assign_to_max(weights)[source]
- maup.assign.drop_source_label(index)[source]
CRS
- maup.crs.require_same_crs(f)[source]
Indexed Geometries
- class maup.indexed_geometries.IndexedGeometries(geometries)[source]
Bases:
object
- assign(targets)[source]
- covered_by(container)[source]
- enumerate_intersections(targets)[source]
- intersections(geometry)[source]
- query(geometry)[source]
- maup.indexed_geometries.get_geometries(geometries)[source]
Indices
- maup.indices.get_geometries_with_range_index(geometries)[source]
Intersections
- maup.intersections.intersections(sources, targets, output_type='geoseries', area_cutoff=None)[source]
Computes all of the nonempty intersections between two sets of geometries. By default, the returned geopandas.GeoSeries will have a MultiIndex, where the geometry at index (i, j) is the intersection of sources[i] and targets[j] (if it is not empty). If output_type == “geodataframe”, the return type is a range-indexed GeoDataFrame with “source” and “target” columns containing the indices i and j, respectively, for the intersection of sources[i] and targets[j].
- Parameters:
sources (GeoSeries or GeoDataFrame) – geometries
targets (GeoSeries or GeoDataFrame) – geometries
area_cutoff – (optional) if provided, only return intersections with area greater than area_cutoff
- Return type:
GeoSeries
- maup.intersections.prorate(relationship, data, weights, aggregate_by='sum')[source]
Prorate data from one set of geometries to another, using their maup.intersections or an assignment.
- Parameters:
relationship – the intersections() of the geometries you are getting data from (sources) and the geometries you are moving the data to; or, a series assigning sources to targets
data (pandas.Series or pandas.DataFrame) – the data you want to move (must be indexed the same as the source geometries)
weights (pandas.Series) – the weights to use when prorating from sources to inters
aggregate_by (function) – (optional) the function to use for aggregating from inters to targets. The default is "sum".
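To make the data flow concrete, here is a minimal sketch of prorating with the default "sum" aggregation, using plain dictionaries in place of pandas objects. The function prorate_sketch is hypothetical and is not maup's implementation; it assumes the weights are keyed by (source, target) pairs, one per intersection piece, and that each source's weights already sum to 1.

```python
from collections import defaultdict


def prorate_sketch(data, weights):
    """Split each source value across its pieces, then sum the pieces per target.

    data:    {source: value}
    weights: {(source, target): share of the source's value given to that piece}
    """
    totals = defaultdict(float)
    for (source, target), weight in weights.items():
        totals[target] += data[source] * weight  # sources -> inters -> targets
    return dict(totals)


votes = {"a": 100, "b": 40}
shares = {("a", "X"): 0.25, ("a", "Y"): 0.75, ("b", "Y"): 1.0}
prorated = prorate_sketch(votes, shares)
```

Source "a" is split between targets "X" and "Y", while source "b" goes entirely to "Y"; the grand total is preserved because each source's weights sum to 1.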
Normalize
- maup.normalize.normalize(weights, level=0)[source]
Takes a series of MultiIndexed weights and normalizes them with respect to one level (level 0 by default).
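The normalization can be sketched with a dictionary keyed by tuples standing in for a MultiIndexed series. The name normalize_sketch is hypothetical, and the sketch assumes every group has a nonzero total.

```python
from collections import defaultdict


def normalize_sketch(weights, level=0):
    """Divide each weight by the total of all weights sharing its key at `level`."""
    totals = defaultdict(float)
    for key, weight in weights.items():
        totals[key[level]] += weight
    return {key: weight / totals[key[level]] for key, weight in weights.items()}


raw = {("a", "X"): 1.0, ("a", "Y"): 3.0, ("b", "Y"): 2.0}
by_source = normalize_sketch(raw)           # groups: "a" and "b"
by_target = normalize_sketch(raw, level=1)  # groups: "X" and "Y"
```

With level=0 the weights sum to 1 within each source; with level=1 they sum to 1 within each target.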
Progress Bar
- class maup.progress_bar.ProgressBar[source]
Bases:
object
Repair
- exception maup.repair.AreaCroppingWarning[source]
Bases:
UserWarning
- maup.repair.absorb_by_shared_perimeter(sources, targets, relative_threshold=None, force_polygons=False)[source]
- maup.repair.apply_func_to_polygon_parts(shape, func)[source]
- maup.repair.autorepair(geometries, relative_threshold=0.1, force_polygons=False)[source]
Uses simplistic algorithms to repair most gaps and overlaps. The optional force_polygons argument can drop any fragmented line segments that may occur while using shapely’s make_valid function. It should work with the default settings.
The default relative_threshold is 0.1. This default is chosen to include tiny overlaps that can be safely auto-fixed while preserving major overlaps that might indicate deeper issues and should be handled on a case-by-case basis. Set relative_threshold=None to attempt to resolve all overlaps. See resolve_overlaps() and close_gaps() for more.
For a more careful repair that takes adjacencies and higher-order overlaps between geometries into account, consider using smart_repair instead.
- maup.repair.close_gaps(geometries, relative_threshold=0.1, force_polygons=False)[source]
Closes gaps between geometries by assigning the hole to the polygon that shares the greatest perimeter with the hole.
If the area of the gap is greater than relative_threshold times the area of the polygon, then the gap is left alone. The default value of relative_threshold is 0.1. This is intended to preserve intentional gaps while closing the tiny gaps that can occur as artifacts of geospatial operations. Set relative_threshold=None to attempt to close all gaps. Due to floating point precision issues, not all gaps may be closed.
The optional “force_polygons” argument automatically filters out non-polygonal fragments during operation.
- maup.repair.count_holes(shp)[source]
Counts gaps between geometries.
- maup.repair.count_overlaps(shp)[source]
Counts overlaps between geometries. Code is taken directly from the resolve_overlaps function in maup.
- maup.repair.crop_to(source, target)[source]
Crops the source geometries to the target geometries.
- maup.repair.dedup_vertices(polygon)[source]
- maup.repair.doctor(source, target=None, silent=False, accept_holes=False)[source]
Detects quality issues in a given set of source and target geometries. Quality issues include overlaps, gaps, invalid geometries, non-perfect tiling, and not entirely overlapping source and targets. If maup.doctor() returns True, votes should not be lost when prorating or assigning (beyond a few due to rounding, etc.). Passing a target to doctor is optional.
If silent is True, then print outputs are suppressed. (Default is silent = False.)
If accept_holes is True, then holes alone do not cause doctor to return a value of False. (Default is accept_holes = False.)
- maup.repair.expand_to(source, target, force_polygons=False)[source]
Expands the source geometries to the target geometries.
- maup.repair.holes(geometry)[source]
Returns any holes in a Polygon or MultiPolygon.
- maup.repair.holes_of_union(geometries)[source]
Returns any holes in the union of the given geometries.
- maup.repair.make_valid_polygons(geometries, force_polygons=True)[source]
Extended make_valid function with an optional filter that removes non-polygonal data from the output.
- maup.repair.quick_repair(geometries, relative_threshold=0.1, force_polygons=False)[source]
New name for autorepair function from Maup 1.x. Uses simplistic algorithms to repair most gaps and overlaps.
The default relative_threshold is 0.1. This default is chosen to include tiny overlaps that can be safely auto-fixed while preserving major overlaps that might indicate deeper issues and should be handled on a case-by-case basis. Set relative_threshold=None to attempt to resolve all overlaps. See resolve_overlaps() and close_gaps() for more.
For a more careful repair that takes adjacencies and higher-order overlaps between geometries into account, consider using smart_repair instead.
- maup.repair.remove_repeated_vertices(geometries)[source]
Removes repeated vertices. Vertices are considered to be repeated if they appear consecutively, excluding the start and end points.
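One reading of this rule, sketched on a plain list of coordinates for a closed ring: drop any vertex equal to the one immediately before it. The helper drop_consecutive_duplicates is hypothetical and operates on tuples, not on shapely geometries as the real function does.

```python
def drop_consecutive_duplicates(coords):
    """Remove vertices that repeat the vertex immediately before them."""
    cleaned = []
    for point in coords:
        if not cleaned or point != cleaned[-1]:
            cleaned.append(point)
    return cleaned


# A closed unit square with some vertices accidentally repeated.
ring = [(0, 0), (1, 0), (1, 0), (1, 1), (1, 1), (1, 1), (0, 1), (0, 0)]
clean_ring = drop_consecutive_duplicates(ring)
```

The first and last points are equal because the ring is closed, but they are not consecutive, so both are kept.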
- maup.repair.resolve_overlaps(geometries, relative_threshold=0.1, force_polygons=False)[source]
For any pair of overlapping geometries, assigns the overlapping area to the geometry that shares the greatest perimeter with the overlap. Returns the GeoSeries of geometries, which will have no overlaps.
If the ratio of the overlap’s area to either of the overlapping geometries’ areas is greater than relative_threshold, then the overlap is ignored. The default relative_threshold is 0.1. This default is chosen to include tiny overlaps that can be safely auto-fixed while preserving major overlaps that might indicate deeper issues and should be handled on a case-by-case basis. Set relative_threshold=None to attempt to resolve all overlaps. Due to floating point precision issues, not all overlaps may be resolved.
The optional “force_polygons” argument automatically filters out non-polygonal fragments during operation.
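The threshold rule for overlaps can be written as a small predicate. This is a sketch of the documented behavior only; the name overlap_would_be_resolved is hypothetical, and the areas are assumed to be positive numbers in the same units.

```python
def overlap_would_be_resolved(overlap_area, area_a, area_b, relative_threshold=0.1):
    """True if the overlap is small relative to BOTH geometries (or no threshold is set)."""
    if relative_threshold is None:
        return True  # attempt to resolve every overlap
    return (
        overlap_area / area_a <= relative_threshold
        and overlap_area / area_b <= relative_threshold
    )


small = overlap_would_be_resolved(0.5, 10.0, 100.0)    # ratios 0.05 and 0.005
large = overlap_would_be_resolved(2.0, 10.0, 100.0)    # ratio 0.2 against the first geometry
forced = overlap_would_be_resolved(2.0, 10.0, 100.0, relative_threshold=None)
```

An overlap is skipped as soon as it is large relative to either geometry, which is why the second case is ignored even though it is small relative to the larger geometry.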
- maup.repair.snap_multilinestring_to_grid(multilinestring, n=-7)[source]
- maup.repair.snap_polygon_to_grid(polygon, n=-7)[source]
- maup.repair.snap_to_grid(geometries, n=-7)[source]
Snap the geometries to a grid by rounding to the nearest 10^n. Helps to resolve floating point precision issues in shapefiles.
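Rounding to the nearest 10^n corresponds to Python's round(value, -n), so the default n=-7 keeps seven decimal places. The sketch below applies this to a list of coordinate tuples; snap_coords is a hypothetical helper, and the real function works on geometries.

```python
def snap_coords(coords, n=-7):
    """Round every coordinate to the nearest multiple of 10**n."""
    return [(round(x, -n), round(y, -n)) for x, y in coords]


fine = snap_coords([(0.12345678912, 1.00000004)])   # default grid of 1e-7
coarse = snap_coords([(0.26, 1.74)], n=-1)          # grid of 0.1
```

Two vertices that differ only by floating point noise below the grid size end up at the same snapped location.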
- maup.repair.split_by_level(series, multiindex)[source]
- maup.repair.trim_valid(value)[source]
Ensures that the results from the “make_valid_polygons” method won’t be geometry collections with mixed types by filtering out non-polygons.
Smart Repair
- maup.smart_repair.building_blocks(geometries_df, snap_magnitude=None, nest_within_regions=None)[source]
Partitions the extent of the input via all boundaries of all geometries (and regions, if nest_within_regions is a GeoDataFrame/GeoSeries of region boundaries); associates to each polygon in the partition the set of polygons in the original shapefile whose intersection created it, and organizes this data according to order of the overlaps. (Order zero = hole)
- maup.smart_repair.construct_hole_boundaries(geometries_df, holes_df)[source]
Construct a GeoDataFrame with all positive-length intersections between hole and geometry boundaries, including intersections between hole boundaries and exterior boundaries, if applicable.
- maup.smart_repair.contain_each_other(poly1, poly2)[source]
- maup.smart_repair.convexify_hole_boundaries(geometries_df, holes_df)[source]
Partially fill gaps as follows:
(1) Assign any gap that only adjoins 1 geometry to that geometry.
(2) For each gap that adjoins at least 2 geometries, “convexify” the geometries surrounding the gap by replacing the gap’s boundary with each geometry by the shortest path within the gap between its endpoints and “filling in” the geometry up to the new boundary. (Exterior boundaries, if any, are left alone.)
If there are only 2 non-exterior (and no exterior) geometries intersecting the gap, this will fill the gap completely; otherwise it will usually leave one or more smaller gaps remaining. The convexity of the geometry boundaries will simplify the process of filling the remaining gap(s).
- maup.smart_repair.drop_bad_holes(reconstructed_df, holes_df, fill_gaps_threshold)[source]
Identify holes that won’t be filled and drop them from holes_df.
- maup.smart_repair.incenter(triangle)[source]
Find the incenter (intersection point of the angle bisectors) of a triangle.
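The incenter can be computed as the average of the three vertices weighted by the lengths of the opposite sides. The sketch below takes three (x, y) tuples; incenter_sketch is a hypothetical name, and the real function takes a triangle geometry.

```python
import math


def incenter_sketch(a, b, c):
    """Incenter of triangle abc, weighting each vertex by the opposite side length."""
    side_a = math.dist(b, c)  # side opposite vertex a
    side_b = math.dist(a, c)  # side opposite vertex b
    side_c = math.dist(a, b)  # side opposite vertex c
    perimeter = side_a + side_b + side_c
    x = (side_a * a[0] + side_b * b[0] + side_c * c[0]) / perimeter
    y = (side_a * a[1] + side_b * b[1] + side_c * c[1]) / perimeter
    return x, y


# A 3-4-5 right triangle has inradius 1, so its incenter sits at (1, 1).
center = incenter_sketch((0, 0), (4, 0), (0, 3))
```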
- maup.smart_repair.num_components(geom)[source]
Counts the number of connected components of a shapely object.
- maup.smart_repair.reconstruct_from_overlap_tower(geometries_df, overlap_tower, nested=False)[source]
Rebuild the polygons in geometries_df with overlaps removed.
- maup.smart_repair.segments(curve)[source]
Extracts a list of the individual line segments from a LineString.
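On a plain list of coordinates, the same idea amounts to pairing each vertex with its successor. The helper segments_sketch is hypothetical and returns coordinate pairs, not geometry objects.

```python
def segments_sketch(coords):
    """Pair each vertex with the next one: n vertices give n - 1 segments."""
    return list(zip(coords, coords[1:]))


path = [(0, 0), (1, 0), (1, 1)]
path_segments = segments_sketch(path)
```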
- maup.smart_repair.shortest_path_in_polygon(polygon, start, end, full_triangulation=None)[source]
Finds the shortest path between any two vertices in a not-necessarily-convex simple polygon. The polygon must be valid and simply connected, and “start” and “end” must be vertices of the polygon.
Optional input full_triangulation allows triangulation to be computed in advance to avoid repetition when multiple paths need to be computed within the same polygon.
- maup.smart_repair.small_rook_to_queen(geometries_df, min_rook_length)[source]
Convert all rook adjacencies between geometries with total adjacency length less than min_rook_length to queen adjacencies.
- maup.smart_repair.smart_close_gaps(geometries_df, holes_df)[source]
Fill simply connected gaps; general procedure is roughly as follows:
(1) Fill in gaps that only intersect one non-exterior geometry in the obvious way.
(2) For remaining gaps, partially fill by “convexifying” boundaries with each non-exterior geometry. This will have the effect of completely filling gaps that only intersect 2 geometries and no exterior boundaries.
(3) For any gap that intersects 4 or more geometries nontrivially (including exterior boundaries), find the non-adjacent pair with the shortest distance between them and try to connect the pair by adding a “triangle” to each of the non-exterior geometries in the pair. (Keep trying until this succeeds for some pair.) This reduces the gap to 1 or 2 smaller gaps, each intersecting strictly fewer geometries than the original. Put the smaller gaps back in the queue for the next round.
(4) For any gap that intersects exactly 3 geometries (including exterior boundaries) nontrivially, fill by a process that gives a portion of the gap to each of the non-exterior geometries that it intersects.
- maup.smart_repair.smart_repair(geometries_df, snapped=True, snap_precision=9, fill_gaps=True, fill_gaps_threshold=0.1, disconnection_threshold=0.0001, nest_within_regions=None, min_rook_length=None)[source]
Repairs topology issues (overlaps, gaps, invalid polygons) in a geopandas GeoDataFrame or GeoSeries, with an emphasis on preserving intended adjacency relations between geometries as closely as possible.
Specifically, the algorithm
(1) Applies shapely.make_valid to all polygon geometries.
(2) If snapped = True (default), snaps all polygon vertices to a grid of size no more than 10^(-snap_precision) times the max of width/height of the entire extent of the input. HIGHLY RECOMMENDED to avoid topological exceptions due to rounding errors. Default value for snap_precision is 9; if topological exceptions still occur, try reducing snap_precision (which must be integer-valued) to 8 or 7.
(3) Resolves all overlaps.
(4) If fill_gaps = True (default), closes all simply connected gaps with area less than fill_gaps_threshold times the largest area of all geometries adjoining the gap. Default threshold is 10%; if fill_gaps_threshold = None then all simply connected gaps will be filled.
(5) If nest_within_regions is a secondary GeoDataFrame/GeoSeries of region boundaries (e.g., counties in a state) then all of the above will be performed so that repaired geometries nest cleanly into the region boundaries; each repaired geometry will be contained in the region with which the original geometry has the largest area of intersection. Default value is None.
(6) If min_rook_length is given a numerical value, replaces all rook adjacencies with length below this value with queen adjacencies. Note that this is an absolute value and not a relative value, so make sure that the value provided is in the correct units with respect to the input’s CRS. Default value is None.
(7) Sometimes the repair process creates tiny fragments that are disconnected from the district that they are assigned to. A final cleanup step assigns any such fragments to a neighboring geometry if their area is less than disconnection_threshold times the area of the largest connected component of their assigned geometry. Default threshold is 0.01%, and this seems to work well in practice.
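The numeric rules described above (the snapping grid bound, the gap-filling threshold, and the fragment cleanup threshold) can be sketched as three small functions. All three names are hypothetical and only restate the documented arithmetic; they do not reproduce smart_repair itself, and bounds is assumed to be an (xmin, ymin, xmax, ymax) tuple.

```python
def snap_grid_bound(bounds, snap_precision=9):
    """Upper bound on the snapping grid size for the given extent."""
    xmin, ymin, xmax, ymax = bounds
    return 10 ** (-snap_precision) * max(xmax - xmin, ymax - ymin)


def gap_would_be_filled(gap_area, adjoining_areas, fill_gaps_threshold=0.1):
    """A gap is filled if it is small relative to the largest adjoining geometry."""
    if fill_gaps_threshold is None:
        return True  # fill every simply connected gap
    return gap_area < fill_gaps_threshold * max(adjoining_areas)


def fragment_would_be_reassigned(fragment_area, largest_component_area,
                                 disconnection_threshold=0.0001):
    """A disconnected fragment is reassigned if it is tiny relative to its geometry."""
    return fragment_area < disconnection_threshold * largest_component_area


grid_bound = snap_grid_bound((0, 0, 200, 100))        # extent 200 wide, 100 tall
fills_small_gap = gap_would_be_filled(0.5, [10.0, 3.0])
fills_large_gap = gap_would_be_filled(2.0, [10.0, 3.0])
reassigns = fragment_would_be_reassigned(0.0005, 10.0)
```

Lowering snap_precision from 9 to 8 makes the grid bound ten times coarser, which is why reducing it can help when rounding errors still cause topological exceptions.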
- maup.smart_repair.triangulate_polygon(polygon)[source]
Triangulate a not-necessarily-convex simple polygon, based on the ear clipping method.
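For readers unfamiliar with ear clipping, the following is a minimal textbook-style sketch on a list of (x, y) vertices. It is not maup's implementation: the function names are hypothetical, the polygon is assumed to be simple with no repeated vertices and no closing point, and no attempt is made at efficiency.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o-a-b (positive if counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def signed_area(vertices):
    """Shoelace formula; positive for counterclockwise vertex order."""
    shifted = vertices[1:] + vertices[:1]
    return 0.5 * sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(vertices, shifted))


def point_in_triangle(p, a, b, c):
    """True if p lies inside or on the boundary of triangle a-b-c."""
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_negative = d1 < 0 or d2 < 0 or d3 < 0
    has_positive = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_negative and has_positive)


def ear_clip(vertices):
    """Triangulate a simple polygon by repeatedly cutting off an 'ear'."""
    verts = list(vertices)
    if signed_area(verts) < 0:
        verts.reverse()  # work in counterclockwise order
    triangles = []
    while len(verts) > 3:
        count = len(verts)
        for i in range(count):
            prev, cur, nxt = verts[i - 1], verts[i], verts[(i + 1) % count]
            if cross(prev, cur, nxt) <= 0:
                continue  # reflex or collinear corner: cannot be an ear
            others = (p for p in verts if p not in (prev, cur, nxt))
            if any(point_in_triangle(p, prev, cur, nxt) for p in others):
                continue  # another vertex blocks this candidate ear
            triangles.append((prev, cur, nxt))
            del verts[i]  # clip the ear and restart the scan
            break
        else:
            raise ValueError("no ear found; the polygon may not be simple")
    triangles.append(tuple(verts))
    return triangles


# An L-shaped (non-convex) hexagon with area 3.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
l_triangles = ear_clip(l_shape)
```

A simple polygon with n vertices always yields n - 2 triangles, and their areas add up to the area of the polygon.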