Summer Scholarship Programme

	SSP 1998 Project Summary:


WebDB: Bringing Structure to the WWW

Student

John Lin, University of California, San Diego

Supervisors

Martin Westhead, EPCC

Rob Baxter, EPCC

The World Wide Web owes its enormous success to its simplicity and attractive user interface. From the user's perspective it is becoming increasingly sophisticated. From the point of view of Web authors, however, writing and maintaining a Web site is an arduous undertaking. To use an analogy, the current Web is at the level of assembly code: all the links are hard-coded and absolute, and this is at the heart of why the Web is so difficult to maintain.

Several tools exist to help write and maintain pages, but we suggest that the problem, fundamentally, is that the wrong model is being used. What is needed is the Web equivalent of compilers or interpreters to allow the construction of Web pages to take place at a higher level. To this end, we propose that the Web should be seen not as a collection of files but as a structured storage and retrieval mechanism - a database.

This proposal is part of a new project being started at EPCC that ultimately aims to develop a standard for the representation of Web pages in a database.

There are already many initiatives under way to interface Web servers with databases. This is not the same thing. We are not just trying to define an interface; we are trying to redefine the Web. In a sense the intention is to bring the WWW closer to its better-designed, but less popular, cousin Hyper-G. Hyper-G is a much more sophisticated distributed hypertext system. It allows the same set of documents to have different routes through them depending on the context in which they are presented, without the document source having to be modified. In a chain of documents with forward and backward links, insertions and deletions can be made in a single operation. The database underlying the system understands these structural aspects of the pages, so, once it is set up, the author can let these changes happen automatically.
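The single-operation insertion and deletion described above can be illustrated with a minimal sketch. It assumes the order of the chain is held in one place, outside the documents, and that the forward and backward links are derived from it on request. The class and method names (DocumentChain, insert_after, remove, links) are hypothetical, and this is not a description of how Hyper-G itself is implemented.

```python
class DocumentChain:
    """Ordered chain of page names; links are derived, never stored in a page."""

    def __init__(self, pages=()):
        # The chain order is the only structural state that is kept.
        self._order = list(pages)

    def insert_after(self, existing, new):
        """Add a page to the chain in one step; no other page is edited."""
        self._order.insert(self._order.index(existing) + 1, new)

    def remove(self, page):
        """Drop a page from the chain; its neighbours re-link automatically."""
        self._order.remove(page)

    def links(self, page):
        """Return the (previous, next) page names, or None at either end."""
        i = self._order.index(page)
        prev_page = self._order[i - 1] if i > 0 else None
        next_page = self._order[i + 1] if i + 1 < len(self._order) else None
        return prev_page, next_page


# Hypothetical page names, for illustration only.
chain = DocumentChain(["intro", "methods", "results"])
chain.insert_after("intro", "background")
```

Because no page stores its own links, inserting "background" changes what links("methods") reports without touching the "methods" document at all.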

To understand better what such a database standard would require, this project proposes to build a database from part of the EPCC Web site. It will contain staff bio pages, project pages and EPCC library book pages, linking all three together with cross-references. Staff will be linked to the projects they are managing or working on and to the pages for EPCC library books they have in their office. Likewise, those pages will in turn be linked back to members of staff. The system will be designed both to construct pages on demand from a Web server (interpret the source) and to produce a static Web of the pages (compile the source). From the author's point of view it should have an easy-to-use Web-based interface, either through HTML and CGI or through Java.
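As a rough sketch of the idea, the fragment below stores staff, projects and library books in a small relational database and generates a staff page from it. Everything here is an assumption made for illustration: the table and column names, the sample rows, the file naming scheme and the use of Python with SQLite are not taken from the project. Rendering one page on request corresponds to interpreting the source; rendering every page in one pass corresponds to compiling it.

```python
import html
import sqlite3

# Hypothetical schema: three kinds of page plus their cross-references.
SCHEMA = """
CREATE TABLE staff    (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project  (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE book     (id INTEGER PRIMARY KEY, title TEXT,
                       holder_id INTEGER REFERENCES staff(id));
CREATE TABLE works_on (staff_id INTEGER REFERENCES staff(id),
                       project_id INTEGER REFERENCES project(id));
"""


def build_db():
    """Create an in-memory database holding a few made-up sample rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.execute("INSERT INTO staff VALUES (1, 'A. Example')")
    db.execute("INSERT INTO project VALUES (1, 'WebDB')")
    db.execute("INSERT INTO book VALUES (1, 'Sample Book', 1)")
    db.execute("INSERT INTO works_on VALUES (1, 1)")
    return db


def render_staff_page(db, staff_id):
    """Interpret: build one staff page on demand from the current data."""
    (name,) = db.execute(
        "SELECT name FROM staff WHERE id = ?", (staff_id,)).fetchone()
    projects = db.execute(
        "SELECT p.id, p.title FROM project p "
        "JOIN works_on w ON w.project_id = p.id "
        "WHERE w.staff_id = ? ORDER BY p.id", (staff_id,)).fetchall()
    books = db.execute(
        "SELECT id, title FROM book WHERE holder_id = ? ORDER BY id",
        (staff_id,)).fetchall()
    lines = [f"<h1>{html.escape(name)}</h1>", "<ul>"]
    # Links are generated from the cross-reference data, not typed by hand.
    lines += [f'<li><a href="project_{pid}.html">{html.escape(title)}</a></li>'
              for pid, title in projects]
    lines += [f'<li><a href="book_{bid}.html">{html.escape(title)}</a></li>'
              for bid, title in books]
    lines.append("</ul>")
    return "\n".join(lines)


def compile_staff_pages(db):
    """Compile: render every staff page once, keyed by output file name."""
    ids = [row[0] for row in
           db.execute("SELECT id FROM staff ORDER BY id").fetchall()]
    return {f"staff_{sid}.html": render_staff_page(db, sid) for sid in ids}


db = build_db()
site = compile_staff_pages(db)
```

Both modes share the same rendering function, so a change to the cross-reference tables shows up in the on-demand pages immediately and in the static pages at the next compile.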

The final report for this project is available here.

Webpage maintained by mario@epcc.ed.ac.uk