Approaches To Indexing In The UK Higher Education Community

Brian Kelly, UKOLN, University of Bath, BATH, BA2 7AY
{b.kelly@ukoln.ac.uk}

Abstract

This paper reports on the approaches to indexing taken in the UK Higher Education community. It reviews the indexing software used on over 150 University web sites. It also describes other approaches taken in the community, including initiatives to provide regional and national indexing, indexing of non-web resources, and cross-searching.

Approaches Within Institutions

A survey of the software used to index UK University web sites was carried out during the summer of 1999 and updated in January 2000. Reports on the surveys have been published [1] [2]. The surveys indicated a preference for freely available software, such as ht://Dig. A small number of institutions used third-party services.

Other Initiatives

Volunteer Initiatives

ACDC [3] aims to provide an index of the UK Higher Education community's web resources based on a distributed approach using Harvest [4]. Although the ACDC service is still available, it does not provide a complete index of the community, and it is becoming less useful as new or updated file formats (e.g. HTML pages containing scripts) become more widely deployed. In addition, because ACDC relied on unfunded volunteer effort, the service deteriorated as resources became unavailable.

A regional index is provided by the Scottish Search Maestro initiative [5]. This indexes higher education resources in Scotland together with a number of relevant national resources (e.g. national research council web sites).

Nationally-Funded Initiatives

eLib

Since 1996 the eLib Programme [6] has funded over 70 projects aimed at developing various aspects of a digital library for the UK Higher Education community. eLib's budget of over £20 million across its three phases has enabled significant developments, which have made an impact internationally. eLib Phase 3 consists of three main areas: "Hybrid Libraries", "Clumps" and "Digital Preservation". The Hybrid Library projects aim to develop services which provide access to resources that may be located on a range of digital services (Web, CD-ROM, etc.) as well as in print. Agora [7], for example, provides access to a variety of library catalogues, resource gateways, etc. The Clumps projects aim to provide large-scale resource discovery services, and include both regional and subject-based services. For example, CAIRNS [8] provides access to Scottish library catalogues.

The DNER

JISC (Joint Information Systems Committee) has announced plans to develop a "Distributed National Electronic Resource" (the DNER) [9]. The DNER aims to provide seamless access to electronic resources provided by JISC service providers.

The Resource Discovery Network (the RDN) [10] is a relatively new JISC service. The RDN is coordinating the activities of subject gateways which provide access to quality resources in areas such as Social Sciences, Engineering and Mathematics (see [11]). The RDN is piloting a cross-searching service which enables users to search across the subject gateways (now known as "hubs") [12].
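The idea behind cross-searching can be illustrated with a minimal sketch: one query is sent to every hub, and the result lists are merged with duplicates removed. This is not the RDN's implementation. The hub names, records and URLs below are invented for illustration, and the in-memory hubs stand in for what would in practice be remote search services.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Record:
    title: str
    url: str
    hub: str  # name of the hub that returned the record


def make_hub(name: str, entries: Iterable[Tuple[str, str]]) -> Callable[[str], List[Record]]:
    """Build a toy in-memory hub (hypothetical stand-in for a remote gateway)."""
    catalogue = [Record(title, url, name) for title, url in entries]

    def search(query: str) -> List[Record]:
        # Case-insensitive substring match on the title only.
        q = query.lower()
        return [r for r in catalogue if q in r.title.lower()]

    return search


def cross_search(query: str, hubs: Dict[str, Callable[[str], List[Record]]]) -> List[Record]:
    """Send one query to every hub and merge the results.

    Hubs are visited in alphabetical order; when the same URL is returned
    by more than one hub, only the first record seen is kept.
    """
    merged: List[Record] = []
    seen = set()
    for name in sorted(hubs):
        for record in hubs[name](query):
            if record.url not in seen:
                seen.add(record.url)
                merged.append(record)
    return merged


# Invented sample data: two hubs that share one resource.
HUBS = {
    "engineering": make_hub("engineering", [
        ("Bridge engineering tutorials", "http://example.org/bridges"),
        ("Statistics for engineers", "http://example.org/stats"),
    ]),
    "social-sciences": make_hub("social-sciences", [
        ("Statistics for engineers", "http://example.org/stats"),
        ("Survey statistics primer", "http://example.org/surveys"),
    ]),
}

if __name__ == "__main__":
    for record in cross_search("statistics", HUBS):
        print(record.hub, record.title, record.url)
```

A production service would also need to deal with hubs that are slow or unavailable and with ranking results from different sources, which this sketch ignores.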

Discussion

The summary of developments within the UK Higher Education community shows that institutions have taken a number of approaches to indexing their web services, such as running indexing software locally and using third-party indexing services. We have also considered a number of alternative approaches, ranging from unfunded, often volunteer, efforts to centrally-funded initiatives. What lessons can be learnt, and how may we expect things to develop in the future?

Seamless access to resources which are provided on a variety of platforms (including a variety of hardware platforms, operating systems, databases, authentication schemes, file formats, etc.) requires support for standards. Work towards the development of the DNER has identified a number of areas in which standards are unavailable, immature or cumbersome. As a result, standardisation work has been initiated in a number of areas, including metadata for defining collections [13] and attribute sets for Z39.50 [14].

The research and academic communities have been instrumental in the development of the Internet and the Web. However, as the Web becomes more complex, individual volunteer effort becomes increasingly unlikely to result in significant changes. Developments within the research and academic communities are likely to be based on funded initiatives which are resourced to provide software developments and involvement in standardisation work. In addition relationships with the (international) user community will have to be fostered in order to ensure uptake of new systems.

The developments funded by eLib and JISC have considerable momentum behind them. Many of the projects are addressing not only software development but also the use of standards (such as Dublin Core, RDF and Z39.50) and the development of such standards. In addition, links with the user communities are being forged to ensure the uptake of the services once the project funding finishes.
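As an illustration of how a standard such as Dublin Core can help indexing software, the following minimal sketch renders a simple record as HTML meta tags. It assumes the common convention of embedding Dublin Core elements in a page using a "DC." prefix. The function name and the sample record are invented for illustration and are not taken from any of the projects described here.

```python
from html import escape
from typing import Dict, List, Union


def dublin_core_meta(record: Dict[str, Union[str, List[str]]]) -> str:
    """Render a record as one HTML <meta> tag per Dublin Core value.

    Keys are Dublin Core element names (e.g. "title", "creator");
    a value may be a single string or a list of strings for
    repeatable elements.
    """
    lines = []
    for element, values in record.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            # Escape quotes and ampersands so the attribute values stay valid HTML.
            lines.append('<meta name="DC.%s" content="%s">'
                         % (escape(element, quote=True), escape(value, quote=True)))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample record.
    print(dublin_core_meta({
        "title": "Library Catalogues & Gateways",
        "creator": ["A. Author", "B. Author"],
        "subject": "resource discovery",
    }))
```

An indexer that recognises these tags can offer fielded searches (for example by creator or subject) instead of relying on full-text matching alone.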

JISC's recent Circular on "Developing the DNER for Learning and Teaching" [15] provides a mechanism for integrated institutional developments within a nationally-defined strategy. The call provides £10 million to improve the applicability of C&IT for learning and teaching through enhancements to the DNER infrastructure (based on standards such as Z39.50 and XML) and the provision of content to be used in a teaching and learning context. The outcomes of this call will be of interest to the resource discovery research community.

References

  1. WebWatch: UK University Search Engines, Ariadne issue 21, Sept 1999
    <http://www.ariadne.ac.uk/issue21/webwatch/>
  2. Survey Of UK HE Institutional Search Engines - January 2000
    <http://www.ukoln.ac.uk/web-focus/surveys/uk-he-search-engines/survey-jan2000.html>
  3. ACDC
    <http://acdc.hensa.ac.uk/>
  4. Harvest
    <http://www.tardis.ed.ac.uk/harvest/>
  5. Scottish Search Maestro, University of Dundee
    <http://somis2.ais.dundee.ac.uk/scotland/www-gtw>
  6. eLib
    <http://www.ukoln.ac.uk/services/elib/>
  7. Agora
    <http://www.ukoln.ac.uk/services/elib/projects/agora/>
  8. The CAIRNS Project - Cooperative Academic Information Retrieval Network For Scotland
    <http://www.ukoln.ac.uk/services/elib/projects/cairns/>
  9. DNER, JISC
    <http://www.jisc.ac.uk/pub99/dner_desc.html>
  10. RDN
    <http://www.rdn.ac.uk/>
  11. Subject-Based Information Gateways In The UK
    <http://www.ukoln.ac.uk/web-focus/papers/www8/>
  12. ResourceFinder, RDN
    <http://www.rdn.ac.uk/xsearch.html>
  13. Collection Description, UKOLN
    <http://www.ukoln.ac.uk/metadata/cld/>
  14. Z39.50 Interoperability Profile
    <http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/draft/>
  15. JISC Circular 5/99: Developing the DNER for Learning and Teaching, JISC
    <http://www.jisc.ac.uk/pub99/c05_99.html>