SSP Project Summary:

Parallel Vector-topological data input for Geographical Information Systems

Geographical Information Systems (GIS) permit the storage, display and manipulation of spatially referenced data. Continued rapid growth in the availability of digital cartographic data and satellite images is creating a demand for intensive computing to integrate and process large datasets. Such datasets can be collected and integrated from a wide variety of sources, including geological, political and census databases. This information can then be used to solve problems in areas such as environmental assessment, transportation, marketing, distribution, telecommunications, planning and resource management.

GIS processing is both computationally and I/O intensive. This makes GIS well suited to parallel processing, where compute performance and I/O capacity can be increased in a modular and scalable way. Consequently the Edinburgh Parallel Computing Centre, in partnership with the Department of Geography, has developed GIS software that exploits parallel processing technology.

A scalable, extensible, parallel library of core GIS operations has been designed and partly implemented to interface to some of the major commercial GIS packages. These core operations are polygon overlay, and conversion between raster and vector data formats.

These core operations have been designed in a modular fashion enabling the substantial reuse of code segments. Full parallel designs have been produced for all components of all operations.

The vector data input module performs the pre-processing and distribution of vector data necessary for both the polygon overlay and vector-to-raster conversion operations. Indeed, similar processing and distribution of vector data is required by most vector GIS operations. This module is therefore an important building block for parallel GIS operations beyond polygon overlay and vector-to-raster conversion. It is also necessary for operations such as vector topology creation, buffer generation and generalisation.
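As an illustration of what "distribution of vector data" can mean in practice, the following is a minimal sketch in C. It assumes a simple decomposition of the map into equal-width vertical strips, one per process; this scheme, and the names `Segment`, `strip_of` and `segment_strips`, are hypothetical and are not taken from the project's design documents.

```c
#include <assert.h>

/* Hypothetical line-segment type, for illustration only. */
typedef struct { double x0, y0, x1, y1; } Segment;

/* Map an x coordinate to one of nstrips equal-width vertical strips
 * covering [xmin, xmax]. Coordinates outside the range are clamped
 * to the first or last strip. */
int strip_of(double x, double xmin, double xmax, int nstrips)
{
    assert(nstrips > 0 && xmax > xmin);
    double t = (x - xmin) / (xmax - xmin) * nstrips;
    if (t < 0.0) return 0;
    if (t >= (double)nstrips) return nstrips - 1;
    return (int)t;
}

/* Compute the inclusive range of strips [*first, *last] that a segment
 * overlaps. A segment crossing a strip boundary belongs to every strip
 * it touches, so it would be sent to each of those processes. */
void segment_strips(const Segment *seg, double xmin, double xmax,
                    int nstrips, int *first, int *last)
{
    double lo = seg->x0 < seg->x1 ? seg->x0 : seg->x1;
    double hi = seg->x0 < seg->x1 ? seg->x1 : seg->x0;
    *first = strip_of(lo, xmin, xmax, nstrips);
    *last  = strip_of(hi, xmin, xmax, nstrips);
}
```

In a message-passing setting, each strip index would correspond to a process rank, and segments spanning several strips would be replicated or split. Which of these the actual module does is not stated here.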

The vector data input module comprises three sub-phases: Sort, Join and GAD.

All three sub-phases involve parallel processing of GIS data. The Sort Phase has been designed and implemented using MPI and PUL-GF. The Join Phase has been implemented using CHIMP. Both these modules have been tested. Detailed design documents and pseudo-code also exist. All code is written in C. A summer student project would assist greatly in completing the implementation of the vector data input module.

The designs for this work are about to be published in a Taylor and Francis book, so this project would be a useful test of those designs and its results would be publishable. Last year's SSP is referenced, and its results quoted, in that text.

Last year, an SSP student redesigned the Sort phase of the Vector Data input module to use MPI instead of CHIMP. This was implemented and tested successfully.

The proposal for this year's project is to redesign and implement the Join and GAD phases for MPI instead of CHIMP. An implementation of the Join phase with CHIMP currently exists and has been tested successfully. The first task is therefore a useful introduction to the GIS project and to MPI, since essentially all that is required is the reimplementation of the message-passing code in Join. The second task requires much more effort, since only a design (with pseudo-code) of the GAD phase using CHIMP exists. However, this task uses much of the functionality previously implemented in the Sort and Join phases and should therefore be straightforward.

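The project therefore consists of two units: reimplementing the message-passing code of the Join phase in MPI, and implementing the GAD phase in MPI from the existing design.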

Should the student complete the above, there is further work in converting the Sort phase from PUL-GF and in implementing a parallel line intersection operation which utilises the complete Vector Input module.
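For orientation, a line intersection operation ultimately rests on a serial test of whether two segments cross. The following is a minimal sketch of the standard orientation-based test in C; the names `Point`, `orient`, `on_segment` and `segments_intersect` are hypothetical and are not taken from the project's code, and the sketch ignores floating-point robustness issues.

```c
/* Hypothetical 2D point type, for illustration only. */
typedef struct { double x, y; } Point;

/* Sign of the cross product (b - a) x (c - a):
 * +1 if c lies to the left of a->b, -1 if to the right, 0 if collinear. */
int orient(Point a, Point b, Point c)
{
    double d = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
    return (d > 0.0) - (d < 0.0);
}

/* Given that c is collinear with a and b, report whether c lies
 * within the bounding box of segment a-b (and hence on the segment). */
int on_segment(Point a, Point b, Point c)
{
    double xlo = a.x < b.x ? a.x : b.x, xhi = a.x < b.x ? b.x : a.x;
    double ylo = a.y < b.y ? a.y : b.y, yhi = a.y < b.y ? b.y : a.y;
    return c.x >= xlo && c.x <= xhi && c.y >= ylo && c.y <= yhi;
}

/* Return 1 if segments p1-p2 and q1-q2 share at least one point,
 * 0 otherwise. Touching endpoints and collinear overlap count. */
int segments_intersect(Point p1, Point p2, Point q1, Point q2)
{
    int o1 = orient(p1, p2, q1);
    int o2 = orient(p1, p2, q2);
    int o3 = orient(q1, q2, p1);
    int o4 = orient(q1, q2, p2);

    /* General case: each segment's endpoints lie on opposite
     * sides of the other segment. */
    if (o1 != o2 && o3 != o4) return 1;

    /* Collinear special cases. */
    if (o1 == 0 && on_segment(p1, p2, q1)) return 1;
    if (o2 == 0 && on_segment(p1, p2, q2)) return 1;
    if (o3 == 0 && on_segment(q1, q2, p1)) return 1;
    if (o4 == 0 && on_segment(q1, q2, p2)) return 1;
    return 0;
}
```

A parallel version would presumably apply a test like this to the segments each process holds after the Vector Input module has distributed them. How candidate pairs are selected is a design decision not described here.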

Expertise Required

Strong computing skills, preferably with C.

Resources Required

MPI; PUL active buffers are used in the Join phase; the project no longer requires PUL-PF; any necessary datasets can be obtained from the Department of Geography.

Resources Supplied

Design documents plus existing code; datasets from Dept of Geography

Michal Rewienski worked on this project.

Compressed PostScript of the project's final report is available here (44 kbytes).
