SSP 1998 Project Summary:

Benchmarking and related work on the Hitachi SR2201

Student

Krzysztof Zelazowski, Warsaw University of Technology

Supervisor

Connor Mulholland, EPCC


As part of the project titled High Performance Java for the Hitachi SR2201, Hitachi have donated a distributed-memory parallel system with 8 processors, each of which has 256 MB of memory. So far, Connor Mulholland and Lorna Smith have carried out a set of benchmarking exercises using existing Cray T3D codes written in FORTRAN with the MPI message-passing library. The results were compared with those from the Cray machines, and a set of useful and interesting conclusions was drawn.

The aim of this project is to extend that initial benchmarking and to investigate further how useful and efficient the SR2201 actually is in comparison with the Cray machines. The project is divided into three stages, with an additional fourth stage if time permits.

The first stage involves extending the initial benchmarking exercise to make use of 8 and possibly 16 processors (from Maidenhead), which were not available before. The timings would be carried out on the Hitachi SR2201 as well as on both Cray machines. In addition, if tuned BLAS routines for the SR2201 are available, they could be used and compared with the earlier results, which relied on BLAS routines imported from Netlib.
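One common way to compare timings across processor counts is to convert them into speedup and parallel efficiency. The summary does not say which metrics the project reports, so the following is only a minimal sketch under that assumption; the function name `scaling_table` and the timings in the example are hypothetical placeholders, not measured results.

```python
def scaling_table(timings):
    """Return (procs, speedup, efficiency) rows relative to the smallest run.

    timings maps processor count -> wall-clock seconds (hypothetical input).
    """
    base_procs = min(timings)
    base_time = timings[base_procs]
    rows = []
    for procs in sorted(timings):
        speedup = base_time / timings[procs]
        # Efficiency is speedup divided by the relative growth in processors.
        efficiency = speedup / (procs / base_procs)
        rows.append((procs, speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Placeholder timings for illustration only, not measured results.
    example = {1: 100.0, 8: 20.0, 16: 12.5}
    for procs, speedup, efficiency in scaling_table(example):
        print(f"{procs:3d} procs  speedup {speedup:5.2f}  efficiency {efficiency:5.2f}")
```

Using the smallest available processor count as the baseline keeps the calculation usable when a code cannot be run on a single processor.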

The second stage would involve a further benchmarking exercise, this time aimed at measuring the sustainable memory bandwidth of the Hitachi SR2201. It would use a simple synthetic benchmark program called STREAMS. The objective would be to run the tests on the machine and have the results published on the STREAMS WWW pages.
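Benchmarks of this kind typically time simple vector kernels (copy, scale, add, triad) and divide the bytes moved by the elapsed time. The sketch below illustrates only the triad kernel and the bandwidth arithmetic; it is not the benchmark itself. The function name `triad_bandwidth` and its parameters are hypothetical, and because the loop runs in the Python interpreter, the figure it reports reflects interpreter overhead rather than hardware bandwidth. A real measurement would use the compiled FORTRAN or C benchmark.

```python
import time
from array import array


def triad_bandwidth(n=200_000, scalar=3.0, repeats=3):
    """Time a[i] = b[i] + scalar * c[i] and return (best MB/s, a)."""
    a = array("d", [0.0]) * n
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    # Each element costs two loads (b, c) and one store (a).
    bytes_moved = 3 * n * a.itemsize
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for i in range(n):
            a[i] = b[i] + scalar * c[i]
        elapsed = time.perf_counter() - start
        # Keep the fastest repetition, which is least disturbed by noise.
        best = min(best, elapsed)
    return bytes_moved / best / 1e6, a


if __name__ == "__main__":
    rate, _ = triad_bandwidth()
    print(f"Triad: {rate:.1f} MB/s (interpreter-bound, illustrative only)")
```

The arrays should be much larger than any cache for the result to say something about main memory; the default size here is chosen only to keep the example quick to run.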

The purpose of the third stage would be to investigate the Parallel FORTRAN Translator, pf90. High Performance FORTRAN (HPF) is designed to be portable across parallel machines, and the Hitachi SR2201 is no exception: the machine does support the language, but so far very little work has been carried out with it. This stage would provide a useful opportunity to learn how to program, compile and execute HPF codes on the machine, as well as to determine how useful and efficient the translator is.

A fourth exercise could be tackled if the previous three stages have been completed within the eight-week time frame. This would involve further investigation into the use of faster communications on the SR2201. Since SHMEM versions of ANGUS, CETEP and DL_POLY (i.e. the codes used in the initial benchmarking exercise) already exist, their SHMEM calls could be replaced with remote DMA (Direct Memory Access) transfers. Remote DMA is sender-operated and transfers memory directly, which provides an opportunity to compare its use with that of SHMEM calls.


The final report for this project is available here.
Webpage maintained by mario@epcc.ed.ac.uk