Chapter 3. Frequently Asked Questions

3.1. What kind of geometric objects can I store?
3.2. How do I insert a GIS object into the database?
3.3. How do I construct a spatial query?
3.4. How do I speed up spatial queries on large tables?
3.5. Why aren't PostgreSQL R-Tree indexes supported?
3.6. Why should I use the AddGeometryColumn() function and all the other OpenGIS stuff?
3.7. What is the best way to find all objects within a radius of another object?
3.8. How do I perform a coordinate reprojection as part of a query?

3.1.

What kind of geometric objects can I store?

You can store point, line, polygon, multipoint, multiline, multipolygon, and geometrycollections. These are specified in the Open GIS Well Known Text Format (with XYZ,XYM,XYZM extentions).

3.2.

How do I insert a GIS object into the database?

First, you need to create a table with a column of type "geometry" to hold your GIS data. Connect to your database with psql and try the following SQL:

CREATE TABLE gtest ( ID int4, NAME varchar(20) );
SELECT AddGeometryColumn('', 'gtest','geom',-1,'LINESTRING',2);

If the geometry column addition fails, you probably have not loaded the PostGIS functions and objects into this database. See the installation instructions.

Then, you can insert a geometry into the table using a SQL insert statement. The GIS object itself is formatted using the OpenGIS Consortium "well-known text" format:

INSERT INTO gtest (ID, NAME, GEOM) 
VALUES (
  1, 
  'First Geometry', 
  GeomFromText('LINESTRING(2 3,4 5,6 5,7 8)', -1)
);

For more information about other GIS objects, see the object reference.

To view your GIS data in the table:

SELECT id, name, AsText(geom) AS geom FROM gtest;

The return value should look something like this:

 id | name           | geom
----+----------------+-----------------------------
  1 | First Geometry | LINESTRING(2 3,4 5,6 5,7 8) 
(1 row)

3.3.

How do I construct a spatial query?

The same way you construct any other database query, as an SQL combination of return values, functions, and boolean tests.

For spatial queries, there are two issues that are important to keep in mind while constructing your query: is there a spatial index you can make use of; and, are you doing expensive calculations on a large number of geometries.

In general, you will want to use the "intersects operator" (&&) which tests whether the bounding boxes of features intersect. The reason the && operator is useful is because if a spatial index is available to speed up the test, the && operator will make use of this. This can make queries much much faster.

You will also make use of spatial functions, such as Distance(), ST_Intersects(), ST_Contains() and ST_Within(), among others, to narrow down the results of your search. Most spatial queries include both an indexed test and a spatial function test. The index test serves to limit the number of return tuples to only tuples that might meet the condition of interest. The spatial functions are then use to test the condition exactly.

SELECT id, the_geom 
FROM thetable 
WHERE 
  the_geom && 'POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))' 
AND
  _ST_Contains(the_geom,'POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))');

3.4.

How do I speed up spatial queries on large tables?

Fast queries on large tables is the raison d'etre of spatial databases (along with transaction support) so having a good index is important.

To build a spatial index on a table with a geometry column, use the "CREATE INDEX" function as follows:

CREATE INDEX [indexname] ON [tablename] USING GIST ( [geometrycolumn] );

The "USING GIST" option tells the server to use a GiST (Generalized Search Tree) index.

Note

GiST indexes are assumed to be lossy. Lossy indexes uses a proxy object (in the spatial case, a bounding box) for building the index.

You should also ensure that the PostgreSQL query planner has enough information about your index to make rational decisions about when to use it. To do this, you have to "gather statistics" on your geometry tables.

For PostgreSQL 8.0.x and greater, just run the VACUUM ANALYZE command.

For PostgreSQL 7.4.x and below, run the SELECT UPDATE_GEOMETRY_STATS() command.

3.5.

Why aren't PostgreSQL R-Tree indexes supported?

Early versions of PostGIS used the PostgreSQL R-Tree indexes. However, PostgreSQL R-Trees have been completely discarded since version 0.6, and spatial indexing is provided with an R-Tree-over-GiST scheme.

Our tests have shown search speed for native R-Tree and GiST to be comparable. Native PostgreSQL R-Trees have two limitations which make them undesirable for use with GIS features (note that these limitations are due to the current PostgreSQL native R-Tree implementation, not the R-Tree concept in general):

  • R-Tree indexes in PostgreSQL cannot handle features which are larger than 8K in size. GiST indexes can, using the "lossy" trick of substituting the bounding box for the feature itself.

  • R-Tree indexes in PostgreSQL are not "null safe", so building an index on a geometry column which contains null geometries will fail.

3.6.

Why should I use the AddGeometryColumn() function and all the other OpenGIS stuff?

If you do not want to use the OpenGIS support functions, you do not have to. Simply create tables as in older versions, defining your geometry columns in the CREATE statement. All your geometries will have SRIDs of -1, and the OpenGIS meta-data tables will not be filled in properly. However, this will cause most applications based on PostGIS to fail, and it is generally suggested that you do use AddGeometryColumn() to create geometry tables.

Mapserver is one application which makes use of the geometry_columns meta-data. Specifically, Mapserver can use the SRID of the geometry column to do on-the-fly reprojection of features into the correct map projection.

3.7.

What is the best way to find all objects within a radius of another object?

To use the database most efficiently, it is best to do radius queries which combine the radius test with a bounding box test: the bounding box test uses the spatial index, giving fast access to a subset of data which the radius test is then applied to.

The ST_DWithin(geometry, geometry, distance) function is a handy way of performing an indexed distance search. It works by creating a search rectangle large enough to enclose the distance radius, then performing an exact distance search on the indexed subset of results.

For example, to find all objects with 100 meters of POINT(1000 1000) the following query would work well:

SELECT * FROM geotable 
  WHERE ST_DWithin(geocolumn, 'POINT(1000 1000)', 100.0);

3.8.

How do I perform a coordinate reprojection as part of a query?

To perform a reprojection, both the source and destination coordinate systems must be defined in the SPATIAL_REF_SYS table, and the geometries being reprojected must already have an SRID set on them. Once that is done, a reprojection is as simple as referring to the desired destination SRID.

SELECT ST_Transform(the_geom,4269) FROM geotable;