.. _validity:

Section 20: Validity
====================

In 90% of the cases the answer to the question, "why is my query giving me a 'TopologyException' error" is "one or more of the inputs are invalid". Which raises the question: what does it mean to be invalid, and why should we care?

What is Validity
----------------

Validity is most important for polygons, which define bounded areas and require a good deal of structure. Lines are very simple and cannot be invalid, nor can points.

Some of the rules of polygon validity feel obvious, and others feel arbitrary (and in fact, are arbitrary).

 * Polygon rings must close.
 * Rings that define holes should be inside rings that define exterior boundaries.
 * Rings may not self-intersect (they may neither touch nor cross themselves).
 * Rings may not touch other rings, except at a point.

The last two rules are in the arbitrary category. There are other ways to define polygons that are equally self-consistent but the rules above are the ones used by the :term:`OGC` :term:`SFSQL` standard that PostGIS conforms to.

The rules are important because algorithms for geometry calculations depend on consistent structure in the inputs. It is possible to build algorithms that have no structural assumptions, but those routines tend to be very slow, because the first step in any structure-free routine is to *analyze the inputs and build structure into them*.

Here's an example of why structure matters. This polygon is invalid:

::

  POLYGON((0 0, 0 1, 2 1, 2 2, 1 2, 1 0, 0 0))

You can see the invalidity a little more clearly in this diagram:

.. image:: ./validity/figure_eight.png

The outer ring is actually a figure-eight, with a self-intersection in the middle. Note that the graphic routines successfully render the polygon fill, so that visually it appears to be an "area": two one-unit squares, for a total area of two square units.

Let's see what the database thinks the area of our polygon is:

.. code-block:: sql

  SELECT ST_Area(ST_GeometryFromText('POLYGON((0 0, 0 1, 1 1, 2 1, 2 2, 1 2, 1 1, 1 0, 0 0))'));

::

    st_area
   ---------
          0

What's going on here? The algorithm that calculates area assumes that rings do not self-intersect. A well-behaved ring will always have the area that is bounded (the interior) on one side of the bounding line (it doesn't matter which side, just that it is on *one* side). However, in our (poorly behaved) figure-eight, the bounded area is to the right of the line for one lobe and to the left for the other. This causes the areas calculated for each lobe to cancel out (one comes out as 1, the other as -1), hence the "zero area" result.
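
As a rough check of this explanation, the two lobes can be measured as separate polygons. This is only a sketch: the lobe coordinates are split out of the figure-eight by hand, and :command:`ST_IsPolygonCCW` is assumed to be available (PostGIS 2.4 or later).

.. code-block:: sql

  -- Each lobe on its own is a plain unit square. The two rings wind in
  -- opposite directions, which is why their signed areas cancel when
  -- they are chained together into a single ring.
  SELECT lobe,
         ST_IsValid(geom)      AS is_valid,
         ST_Area(geom)         AS area,
         ST_IsPolygonCCW(geom) AS counter_clockwise
  FROM (VALUES
    ('lower-left',  ST_GeometryFromText('POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))')),
    ('upper-right', ST_GeometryFromText('POLYGON((1 1, 2 1, 2 2, 1 2, 1 1))'))
  ) AS lobes(lobe, geom);

Each lobe should report an area of 1, with ``counter_clockwise`` differing between the two rows.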


Detecting Validity
------------------

In the previous example we had one polygon that we **knew** was invalid. How do we detect invalidity in a table with millions of geometries? With the :command:`ST_IsValid(geometry)` function. Used against our figure-eight, we get a quick answer:

.. code-block:: sql

  SELECT ST_IsValid(ST_GeometryFromText('POLYGON((0 0, 0 1, 1 1, 2 1, 2 2, 1 2, 1 1, 1 0, 0 0))'));

::

  f

Now we know that the feature is invalid, but we don't know why. We can use the :command:`ST_IsValidReason(geometry)` function to find out the source of the invalidity:

.. code-block:: sql

  SELECT ST_IsValidReason(ST_GeometryFromText('POLYGON((0 0, 0 1, 1 1, 2 1, 2 2, 1 2, 1 1, 1 0, 0 0))'));

::

  Self-intersection[1 1]

Note that in addition to the reason (self-intersection) the location of the invalidity (coordinate (1 1)) is also returned.
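
The same two functions can be tried against the other validity rules. The polygons below are hand-made illustrations, not data from the workshop, and the exact wording of each reason depends on the underlying library version, so no output is shown.

.. code-block:: sql

  -- One test polygon per rule: a 4x4 shell with a differently placed hole.
  SELECT label,
         ST_IsValid(geom)       AS valid,
         ST_IsValidReason(geom) AS reason
  FROM (VALUES
    ('hole outside the shell',
     ST_GeometryFromText('POLYGON((0 0, 0 4, 4 4, 4 0, 0 0), (5 5, 5 6, 6 6, 6 5, 5 5))')),
    ('hole sharing an edge with the shell',
     ST_GeometryFromText('POLYGON((0 0, 0 4, 4 4, 4 0, 0 0), (0 1, 0 2, 1 2, 1 1, 0 1))')),
    ('hole touching the shell at one point',
     ST_GeometryFromText('POLYGON((0 0, 0 4, 4 4, 4 0, 0 0), (0 2, 1 3, 2 2, 1 1, 0 2))'))
  ) AS examples(label, geom);

Only the last polygon is expected to be valid, since its rings touch at a single point.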

We can use the :command:`ST_IsValid(geometry)` function to test our tables too:

.. code-block:: sql

  -- Find all the invalid polygons and what their problem is
  SELECT name, boroname, ST_IsValidReason(the_geom)
  FROM nyc_neighborhoods
  WHERE NOT ST_IsValid(the_geom);

::

           name           |   boroname    |                     st_isvalidreason                     
 -------------------------+---------------+-----------------------------------------------------------
  Howard Beach            | Queens        | Self-intersection[597264.083368305 4499924.54228856]
  Corona                  | Queens        | Self-intersection[595483.058764138 4513817.95350787]
  Steinway                | Queens        | Self-intersection[593545.572199759 4514735.20870587]
  Red Hook                | Brooklyn      | Self-intersection[584306.820375986 4502360.51774956]
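
For a table with many invalid rows, a summary can be easier to read than the full list. A minimal sketch, reusing the ``nyc_neighborhoods`` table and columns from the query above; the ``invalid_count`` alias is just an illustrative name.

.. code-block:: sql

  -- Count the invalid neighborhoods in each borough
  SELECT boroname, count(*) AS invalid_count
  FROM nyc_neighborhoods
  WHERE NOT ST_IsValid(the_geom)
  GROUP BY boroname
  ORDER BY invalid_count DESC;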



Repairing Invalidity
--------------------

First the bad news: there is no guaranteed way to fix invalid geometries. The worst case scenario is identifying them with the :command:`ST_IsValid(geometry)` function, moving them to a side table, exporting that table, and repairing them externally.

Here's an example of SQL to move invalid geometries out of the main table into a side table suitable for dumping to an external cleaning process.

.. code-block:: sql

  -- Side table of invalids
  CREATE TABLE nyc_neighborhoods_invalid AS
  SELECT * FROM nyc_neighborhoods
  WHERE NOT ST_IsValid(the_geom);

  -- Remove them from the main table
  DELETE FROM nyc_neighborhoods
  WHERE NOT ST_IsValid(the_geom);
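
One possible refinement, not part of the original workshop, is to wrap both statements in a single transaction so that a failure between the copy and the delete does not leave the tables half-processed. A sketch using the same table names as above:

.. code-block:: sql

  BEGIN;

  -- Copy the invalid rows to the side table
  CREATE TABLE nyc_neighborhoods_invalid AS
  SELECT * FROM nyc_neighborhoods
  WHERE NOT ST_IsValid(the_geom);

  -- Remove them from the main table
  DELETE FROM nyc_neighborhoods
  WHERE NOT ST_IsValid(the_geom);

  -- Sanity check before committing: this should return zero
  SELECT count(*) FROM nyc_neighborhoods
  WHERE NOT ST_IsValid(the_geom);

  COMMIT;

If the sanity check returns anything unexpected, issuing ``ROLLBACK`` instead of ``COMMIT`` restores the original table.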

A good tool for visually repairing invalid geometry is OpenJump (http://openjump.org) which includes a validation routine under **Tools->QA->Validate Selected Layers**.

Now the good news: a large proportion of invalidities **can be fixed inside the database** using the :command:`ST_Buffer` function.

The buffer trick takes advantage of the way buffers are built: a buffered geometry is a brand new geometry, constructed by offsetting lines from the original geometry. If you offset the original lines by **nothing** (zero) then the new geometry will be structurally identical to the original one, but because it is built using the :term:`OGC` topology rules, it will be valid.

For example, here's a classic invalidity -- the "banana polygon" -- a single ring that encloses an area but bends around to touch itself, leaving a "hole" which is not actually a hole.

::

  POLYGON((0 0, 2 0, 1 1, 2 2, 3 1, 2 0, 4 0, 4 4, 0 4, 0 0))

.. image:: ./validity/banana.png

Running the zero-offset buffer on the polygon returns a valid :term:`OGC` polygon, consisting of an outer and inner ring that touch at one point.

.. code-block:: sql

  SELECT ST_AsText(
           ST_Buffer(
             ST_GeometryFromText('POLYGON((0 0, 2 0, 1 1, 2 2, 3 1, 2 0, 4 0, 4 4, 0 4, 0 0))'),
             0.0
           )
         );

::

  POLYGON((0 0,0 4,4 4,4 0,2 0,0 0),(2 0,3 1,2 2,1 1,2 0))
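
To confirm that the repair worked, the buffered result can be passed back through :command:`ST_IsValid`. A short sketch, reusing the banana polygon from above; the area comparison is an extra check added here for illustration.

.. code-block:: sql

  WITH banana AS (
    SELECT ST_GeometryFromText(
             'POLYGON((0 0, 2 0, 1 1, 2 2, 3 1, 2 0, 4 0, 4 4, 0 4, 0 0))'
           ) AS geom
  )
  SELECT ST_IsValid(geom)                 AS valid_before,
         ST_IsValid(ST_Buffer(geom, 0.0)) AS valid_after,
         ST_Area(ST_Buffer(geom, 0.0))    AS area_after
  FROM banana;

The repaired polygon should be valid, with an area of 14: the 4 by 4 square minus the diamond-shaped hole of area 2.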

.. note::

  The "banana polygon" (or "inverted shell") is a case where the :term:`OGC` topology model for valid geometry and the model used internally by ESRI differ. The ESRI model considers rings that touch to be invalid, and prefers the banana form for this kind of shape. The OGC model is the reverse.