joins.rst [1:62]

trunk/workshop-foss4g/joins.rst (modified) (10 diffs)

trunk/workshop-foss4g/joins.rst

-                      r1
+                      r62
 .. _joins:
 Section 12: Spatial Joins
 =========================
 Spatial joins are the bread-and-butter of spatial databases.  They allow you to combine information from different tables by using spatial relationships as the join key.  Much of what we think of as "standard GIS analysis" can be expressed as spatial joins.
 In the previous section, we explored spatial relationships using a two-step process: first we extracted a subway station point for 'Broad St'; then, we used that point to ask further questions such as "what neighborhood is the 'Broad St' station in?"
 Using a spatial join, we can answer the question in one step, retrieving information about the subway station and the neighborhood that contains it:
 .. code-block:: sql
   SELECT
     subways.name AS subway_name,
     neighborhoods.name AS neighborhood_name,
+Section 12: Spatial Joins
+=========================
+Spatial joins are the icing on the cake of spatial databases. They let you combine information from several tables by using a spatial relationship as the join condition. Most "standard GIS analyses" can be expressed as spatial joins.
+In the previous section, we worked with spatial relationships in two steps: first we extracted the 'Broad St' subway station, then we used that result in further queries to answer questions such as "which neighborhood is the 'Broad St' station in?"
+With a spatial join, we can answer such questions in a single step, retrieving information about both the subway station and the neighborhood that contains it:
+.. code-block:: sql
+  SELECT
+    subways.name AS subway_name,
+    neighborhoods.name AS neighborhood_name,
     neighborhoods.boroname AS borough
   FROM nyc_neighborhoods AS neighborhoods
 …
   WHERE subways.name = 'Broad St';
 ::
    subway_name | neighborhood_name  |  borough
+::
+   subway_name | neighborhood_name  |  borough
   -------------+--------------------+-----------
    Broad St    | Financial District | Manhattan
 We could have joined every subway station to its containing neighborhood, but in this case we wanted information about just one.  Any function that provides a true/false relationship between two tables can be used to drive a spatial join, but the most commonly used ones are: :command:`ST_Intersects`, :command:`ST_Contains`, and :command:`ST_DWithin`.
 Join and Summarize
 ------------------
 The combination of a ``JOIN`` with a ``GROUP BY`` provides the kind of analysis that is usually done in a GIS system.
 For example: **"What is the population and racial make-up of the neighborhoods of Manhattan?"** Here we have a question that combines information about population from the census with the boundaries of neighborhoods, with a restriction to just one borough, Manhattan.
 .. code-block:: sql
   SELECT
     neighborhoods.name AS neighborhood_name,
+We could have joined every subway station to the neighborhood that contains it, but in this case we wanted just one. Any function that returns a true/false result can be used to spatially join two tables, but the most commonly used ones are: :command:`ST_Intersects`, :command:`ST_Contains`, and :command:`ST_DWithin`.
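
 As a sketch of the "join every station" variant mentioned above, the query below simply drops the ``WHERE`` filter. It assumes both tables store their geometry in a column named ``geom``; that column name is an assumption (the join clause is elided in the example above), so substitute the name used in your own tables.

 .. code-block:: sql

   -- Assumption: the geometry column is called "geom" in both tables.
   -- Each station is paired with the neighborhood polygon that contains it.
   SELECT
     subways.name AS subway_name,
     neighborhoods.name AS neighborhood_name,
     neighborhoods.boroname AS borough
   FROM nyc_neighborhoods AS neighborhoods
   JOIN nyc_subway_stations AS subways
     ON ST_Contains(neighborhoods.geom, subways.geom);

 Because this is an inner join, a station that falls outside every neighborhood polygon would not appear in the result.
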
+Join and Summarize
+------------------
+Combining a ``JOIN`` with a ``GROUP BY`` provides the kind of analysis commonly performed in a GIS.
+For example: **"What is the population and racial make-up of the neighborhoods of Manhattan?"** This question combines census population information with neighborhood boundaries, restricted to a single borough, Manhattan.
+.. code-block:: sql
+  SELECT
+    neighborhoods.name AS neighborhood_name,
     Sum(census.popn_total) AS population,
     Round(100.0 * Sum(census.popn_white) / Sum(census.popn_total),1) AS white_pct,
 …
 ::
    neighborhood_name  | population | white_pct | black_pct
+   neighborhood_name  | population | white_pct | black_pct
  ---------------------+------------+-----------+-----------
   Carnegie Hill       |      19909 |      91.6 |       1.5
 …
 What's going on here? Notionally (the actual evaluation order is optimized under the covers by the database) this is what happens:
 #. The ``JOIN`` clause creates a virtual table that includes columns from both the neighborhoods and census tables.
 #. The ``WHERE`` clause filters our virtual table to just rows in Manhattan.
 #. The remaining rows are grouped by the neighborhood name and fed through the aggregation function to :command:`Sum()` the population values.
 #. After a little arithmetic and formatting (e.g., ``GROUP BY``, ``ORDER BY``) on the final numbers, our query spits out the percentages.
 .. note::
    The ``JOIN`` clause combines two ``FROM`` items.  By default, we are using an ``INNER JOIN``, but there are four other types of joins. For further information see the `join_type <http://www.postgresql.org/docs/8.1/interactive/sql-select.html>`_ definition in the PostgreSQL documentation.
 We can also use distance tests as a join key, to create summarized "all items within a radius" queries. Let's explore the racial geography of New York using distance queries.
 First, let's get the baseline racial make-up of the city.
 .. code-block:: sql
   SELECT
     100.0 * Sum(popn_white) / Sum(popn_total) AS white_pct,
     100.0 * Sum(popn_black) / Sum(popn_total) AS black_pct,
+What's going on here? Notionally (the actual evaluation order is optimized by the database), this is what happens:
+#. The ``JOIN`` clause creates a virtual table that contains columns from both the neighborhoods and census tables.
+#. The ``WHERE`` clause filters the virtual table to just the rows in Manhattan.
+#. The remaining rows are grouped by neighborhood name and fed through the aggregate function :command:`Sum()` to total the population values.
+#. After a little arithmetic and formatting (e.g., ``GROUP BY``, ``ORDER BY``) on the final numbers, our query returns the percentages.
+.. note::
+   The ``JOIN`` clause combines two ``FROM`` items. By default, we are using an ``INNER JOIN``, but there are four other types of joins. For further information, see the `join_type <http://docs.postgresql.fr/9.1/sql-select.html>`_ definition in the official PostgreSQL documentation.
+We can also use a distance test as the join key, to build summarized "all items within a given radius" queries. Let's explore the racial geography of New York using distance queries.
+First, let's get the baseline racial make-up of the city.
+.. code-block:: sql
+  SELECT
+    100.0 * Sum(popn_white) / Sum(popn_total) AS white_pct,
+    100.0 * Sum(popn_black) / Sum(popn_total) AS black_pct,
     Sum(popn_total) AS popn_total
   FROM nyc_census_blocks;
 ::
         white_pct      |      black_pct      | popn_total
+::
+        white_pct      |      black_pct      | popn_total
   ---------------------+---------------------+------------
    44.6586020115685295 | 26.5945063345703034 |    8008278
+So, of the 8M people in New York, about 44% are "white" and 26% are "black".
 Duke Ellington once sang that "You / must take the A-train / To / go to Sugar Hill way up in Harlem." As we saw earlier, Harlem has far and away the highest African-American population in Manhattan (80.5%). Is the same true of Duke's A-train?
 First, note that the contents of the ``nyc_subway_stations`` table ``routes`` field is what we are interested in to find the A-train. The values in there are a little complex.
+So, of the 8M people in New York, about 44% are "white" and 26% are "black".
+Duke Ellington once sang that "You / must take the A-train / To / go to Sugar Hill way up in Harlem." As we saw earlier, Harlem has by far the highest concentration of African-Americans in Manhattan (80.5%). Is the same true of the A-train Duke sang about?
+First, note that the ``routes`` field of the ``nyc_subway_stations`` table is what we will use to find the A-train. The values in that field are a little complex.
 .. code-block:: sql
   SELECT DISTINCT routes FROM nyc_subway_stations;
 ::
+::
  A,C,G
 …
 .. note::
    The ``DISTINCT`` keyword eliminates duplicate rows from the result.  Without the ``DISTINCT`` keyword, the query above identifies 491 results instead of 73.
 So to find the A-train, we will want any row in ``routes`` that has an 'A' in it. We can do this a number of ways, but today we will use the fact that :command:`strpos(routes,'A')` will return a non-zero number if 'A' is in the routes field.
 .. code-block:: sql
    SELECT DISTINCT routes
    FROM nyc_subway_stations AS subways
+   The ``DISTINCT`` keyword eliminates duplicate rows from the result. Without this keyword, the query above would return 491 results instead of 73.
+So to find the A-train, we want every row whose ``routes`` value contains an 'A'. There are several ways to do this, but today we will use the fact that :command:`strpos(routes,'A')` returns a non-zero integer if the letter 'A' appears in the routes field.
+.. code-block:: sql
+   SELECT DISTINCT routes
+   FROM nyc_subway_stations AS subways
    WHERE strpos(subways.routes,'A') > 0;
 ::
 …
   A,B,C,D
   A,C,E
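
 To see why the ``> 0`` test works, it may help to run :command:`strpos` on literal strings. This is a minimal sketch that needs no tables; the second route string is made up for illustration.

 .. code-block:: sql

   -- strpos returns the 1-based position of the substring, or 0 if it is absent.
   SELECT
     strpos('A,C,E', 'A') AS has_a,   -- 1: 'A' is the first character
     strpos('B,D,F', 'A') AS no_a;    -- 0: 'A' does not occur
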
 Let's summarize the racial make-up within 200 meters of the A-train line.
 .. code-block:: sql
   SELECT
     100.0 * Sum(popn_white) / Sum(popn_total) AS white_pct,
     100.0 * Sum(popn_black) / Sum(popn_total) AS black_pct,
+Let's summarize the racial make-up within 200 meters of the A-train line.
+.. code-block:: sql
+  SELECT
+    100.0 * Sum(popn_white) / Sum(popn_total) AS white_pct,
+    100.0 * Sum(popn_black) / Sum(popn_total) AS black_pct,
     Sum(popn_total) AS popn_total
   FROM nyc_census_blocks AS census
 …
 ::
         white_pct      |      black_pct      | popn_total
+        white_pct      |      black_pct      | popn_total
   ---------------------+---------------------+------------
 .0805466940877366 | 23.0936148851067964 |     185259
+So the racial make-up along the A-train isn't radically different from the make-up of New York City as a whole.
+Advanced Join
 -------------
 In the last section we saw that the A-train didn't serve a population that differed much from the racial make-up of the rest of the city. Are there any trains that have a non-average racial make-up?
 To answer that question, we'll add another join to our query, so that we can calculate the make-up of many subway lines at once. To do that, we'll need to create a new table that enumerates all the lines we want to summarize.
+The racial make-up along the A-train line isn't radically different from the make-up of New York City as a whole.
+Advanced Join
+-------------
+In the last section we saw that the A-train doesn't serve a population that differs much from the make-up of the rest of the city. Are there any trains that run through parts of the city with a non-average racial make-up?
+To answer that question, we'll add another join to our query so that we can calculate the make-up of many subway lines at once. To do that, we'll create a new table that enumerates all the lines we want to summarize.
 .. code-block:: sql
     CREATE TABLE subway_lines ( route char(1) );
     INSERT INTO subway_lines (route) VALUES
+    INSERT INTO subway_lines (route) VALUES
       ('A'),('B'),('C'),('D'),('E'),('F'),('G'),
       ('J'),('L'),('M'),('N'),('Q'),('R'),('S'),
 …
       ('7');
 Now we can join the table of subway lines onto our original query.
 .. code-block:: sql
     SELECT
+Now we can join the table of subway lines onto our previous query.
+.. code-block:: sql
+    SELECT
       lines.route,
       Round(100.0 * Sum(popn_white) / Sum(popn_total), 1) AS white_pct,
       Round(100.0 * Sum(popn_black) / Sum(popn_total), 1) AS black_pct,
+      Round(100.0 * Sum(popn_white) / Sum(popn_total), 1) AS white_pct,
+      Round(100.0 * Sum(popn_black) / Sum(popn_total), 1) AS black_pct,
       Sum(popn_total) AS popn_total
     FROM nyc_census_blocks AS census
 …
 ::
      route | white_pct | black_pct | popn_total
+     route | white_pct | black_pct | popn_total
     -------+-----------+-----------+------------
      S     |      30.1 |      59.5 |      32730
 …
 As before, the joins create a virtual table of all the possible combinations available within the constraints of the ``JOIN ON`` restrictions, and those rows are then fed into a ``GROUP`` summary. The spatial magic is in the ``ST_DWithin`` function, which ensures that only census blocks close to the appropriate subway stations are included in the calculation.
+Function List
 -------------
 `ST_Contains(geometry A, geometry B) <http://postgis.org/docs/ST_Contains.html>`_: Returns true if and only if no points of B lie in the exterior of A, and at least one point of the interior of B lies in the interior of A.
 `ST_DWithin(geometry A, geometry B, radius) <http://postgis.org/docs/ST_DWithin.html>`_: Returns true if the geometries are within the specified distance of one another.
 `ST_Intersects(geometry A, geometry B) <http://postgis.org/docs/ST_Intersects.html>`_: Returns TRUE if the Geometries/Geography "spatially intersect" - (share any portion of space) and FALSE if they don't (they are Disjoint).
 `round(v numeric, s integer) <http://www.postgresql.org/docs/7.4/interactive/functions-math.html>`_: PostgreSQL math function that rounds to s decimal places
 `strpos(string, substring) <http://www.postgresql.org/docs/current/static/functions-string.html>`_: PostgreSQL string function that returns an integer location of a specified substring.
 `sum(expression) <http://www.postgresql.org/docs/8.2/static/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE>`_: PostgreSQL aggregate function that returns the sum of the values of an expression across a set of records.
 .. rubric:: Footnotes
+As before, the joins create a virtual table of all the possible combinations allowed by the ``JOIN ON`` restrictions, and those rows are then fed into a ``GROUP`` summary. The spatial magic is in the ``ST_DWithin`` function, which ensures that only census blocks close to the appropriate subway stations are included in the calculation.
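
 The effect of the ``ST_DWithin`` test can be checked on literal geometries. This is a minimal sketch, assuming a planar coordinate system whose units are meters; the points are invented for illustration and are not taken from the workshop data.

 .. code-block:: sql

   -- Two points 150 units apart pass a 200-unit test; two points 250 apart do not.
   SELECT
     ST_DWithin('POINT(0 0)'::geometry, 'POINT(0 150)'::geometry, 200) AS near_enough,
     ST_DWithin('POINT(0 0)'::geometry, 'POINT(0 250)'::geometry, 200) AS too_far;

 Used as a join condition, the same test keeps only the pairs of rows whose geometries lie within the given radius of each other.
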
+Function List
+-------------
+`ST_Contains(geometry A, geometry B) <http://postgis.org/docs/ST_Contains.html>`_: Returns TRUE if and only if no points of B lie in the exterior of A, and at least one point of the interior of B lies in the interior of A.
+`ST_DWithin(geometry A, geometry B, radius) <http://postgis.org/docs/ST_DWithin.html>`_: Returns TRUE if the geometries are within the specified distance of one another.
+`ST_Intersects(geometry A, geometry B) <http://postgis.org/docs/ST_Intersects.html>`_: Returns TRUE if the geometries/geographies "spatially intersect" (share any portion of space) and FALSE if they don't (they are disjoint).
+`round(v numeric, s integer) <http://www.postgresql.org/docs/7.4/interactive/functions-math.html>`_: PostgreSQL math function that rounds to s decimal places.
+`strpos(string, substring) <http://www.postgresql.org/docs/current/static/functions-string.html>`_: PostgreSQL string function that returns the integer position of the substring.
+`sum(expression) <http://www.postgresql.org/docs/8.2/static/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE>`_: PostgreSQL aggregate function that returns the sum of a set of values.
+.. rubric:: Footnotes
 .. [#PostGIS_Doco] http://postgis.org/documentation/manual-1.5/
