SSP 1995 project summary
Parallel merge-sort for GIS vector data

Geographical Information Systems (GIS) permit the storage, display and manipulation of spatially referenced data. Continued rapid growth in the availability of digital cartographic data and satellite images is creating a demand for intensive computing to integrate and process large datasets. Such datasets can be collected and integrated from a wide variety of sources, including geological, political and census databases. This information can then be used to solve problems in areas such as environmental assessment, transportation, marketing, distribution, telecommunications, planning and resource management.

GIS processing is both computationally and I/O intensive. This makes GIS an ideal application for parallel processing, where computing performance and I/O capacity can be increased in a modular and scalable way. Consequently, the Edinburgh Parallel Computing Centre, in partnership with the Department of Geography, has developed GIS software that exploits parallel processing technology.

A scalable, extensible, parallel library of core GIS operations has been designed and partly implemented to interface to some of the major commercial GIS packages. The core operations are polygon overlay and conversion between raster and vector data formats.

These core operations have been designed in a modular fashion, enabling substantial reuse of code. Full parallel designs have been produced for all components of all operations.

The proposed Summer Student projects involve redesign and implementation of the vector data input module to use MPI instead of CHIMP. The projects also involve implementation of the outstanding components in this module.

The vector data input module performs the pre-processing and distribution of vector data necessary for both the polygon overlay and vector-to-raster conversion operations. Indeed, similar processing and distribution of vector data is required by most vector GIS operations. This module is therefore an important building block for parallel GIS operations beyond polygon overlay and vector-to-raster conversion. It is also necessary for operations such as vector topology creation, buffer generation and generalisation.
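As a rough illustration of the merge-sort idea named in the project title, the following C sketch sorts two blocks of vector records independently and then merges them. It is not the module's actual code: the `Segment` record, the choice of the lower y-coordinate as sort key, and the function names `seg_key`, `seg_cmp`, `sort_run`, `merge_runs` and `is_sorted` are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: one line segment of a vector dataset. */
typedef struct {
    double x0, y0, x1, y1;
} Segment;

/* Assumed sort key: the lower of the segment's two y-coordinates. */
double seg_key(const Segment *s)
{
    return s->y0 < s->y1 ? s->y0 : s->y1;
}

/* qsort-compatible comparison on the assumed key. */
int seg_cmp(const void *a, const void *b)
{
    double ka = seg_key((const Segment *)a);
    double kb = seg_key((const Segment *)b);
    return (ka > kb) - (ka < kb);
}

/* Local phase: each worker sorts the block of records it holds. */
void sort_run(Segment *run, size_t n)
{
    qsort(run, n, sizeof *run, seg_cmp);
}

/* Merge phase: combine two sorted runs into out, which must have
   room for na + nb records. Ties are taken from run a first. */
void merge_runs(const Segment *a, size_t na,
                const Segment *b, size_t nb, Segment *out)
{
    size_t i = 0, j = 0, k = 0;

    assert(out != NULL);
    while (i < na && j < nb) {
        if (seg_cmp(&b[j], &a[i]) < 0)
            out[k++] = b[j++];
        else
            out[k++] = a[i++];
    }
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}

/* Returns 1 if the records are in non-decreasing key order. */
int is_sorted(const Segment *s, size_t n)
{
    size_t i;

    for (i = 1; i < n; i++)
        if (seg_cmp(&s[i - 1], &s[i]) > 0)
            return 0;
    return 1;
}
```

In a parallel setting, each process would call something like `sort_run` on its own block, and the second argument to `merge_runs` would typically be a run received from another process. The sketch deliberately leaves out the message passing and I/O.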

The vector data input module comprises three sub-phases:

All three sub-phases involve parallel processing of GIS data and significant use of PUL-PF and PUL-active buffers.

These sub-phases have been designed and partially implemented using CHIMP. Detailed design documents and pseudo-code also exist. All code is written in C. Two summer student projects would assist greatly in completing the implementation of the vector data input module. Once complete, the vector data input module could be integrated with the Topology, Stitching and Output module, currently being implemented by the Department of Geography, to produce a vector topology creation operation. Integrating the prototype polygon overlay module as well would produce a complete polygon overlay operation. This would be amongst the first, if not the first, modular parallel GIS operations in the world and would represent a significant step towards a library of parallel GIS operations. As such, the results of the projects would be publishable in academic journals.

The project involves the following stages:

Robert Jackson worked on this project.

Compressed PostScript of the project's final report is available here (309367 bytes).

Webpage maintained by mario@epcc.ed.ac.uk