Metadata and Electronic Document Management

Introduction

The Digital Library allows access to electronic documents while respecting the intellectual property rights of the author. Before the web, the distinction between an organisation's internal documents and external publishing was clear. With the advent of the web this distinction is disappearing, and there is a tendency to use the same technology for creating and indexing internal documents and for publishing documents externally. However, the legal distinctions remain, and business practice has not caught up with technological developments. Therefore "publishing" for the electronic library remains a separate and distinct activity.

A good overview of e-publishing issues was provided in the Guidelines for Commonwealth information published in electronic formats:

The introduction of digital technology allows information to be stored in open formats from which a range of end products can be generated. ... For example, a file meeting the latest specifications for Internet text can be viewed on a screen as text, displayed as Braille or run through a speech synthesiser and read aloud. These developments have profound importance in enabling the creation of documents that are accessible to a wide range of people. With a small amount of care at the outset, one document prepared in a standard format can meet a variety of needs, with the end-user taking responsibility for how the document is accessed.

From: "Guidelines for Commonwealth information published in electronic formats", AusInfo, Commonwealth of Australia 1999, Revised Edition, January 2000, URL: http://www.agimo.gov.au/information/publishing/formats

The guidelines recommend the use of AGLS metadata, as previously discussed. Responsibility for the guidelines was transferred to the Department of Finance, then to the National Office for the Information Economy and then to the Australian Government Information Management Office. Unfortunately they have not been properly maintained since these transitions and are now available only in PDF format.

Publishing Mistakes Are Dangerous

Publishing, even academic publishing, is a significant economic activity and can also have significant effects on the lives of the public. As an example, while looking for articles on "electronic publishing" I came across:

Sirs: Recently we found out that our abstract "Severe Tardive Dystonia: Treatment with Continuous Intrathecal Baclofen Administration" (J Neurol 243 Suppl 2: S75) contains a severe and potentially dangerous mistake.

The dose of intrathecal baclofen in the patient presented was 100 µg/day rather than 100 g/day. The abstract submitted as well as the computer disk (Microsoft Word for Windows Version 2.0b) additionally handed in for electronic publication contained the correct figure spelled with the Greek character "µ".

Investigations into this subject revealed that occasionally special characters may be misinterpreted by different versions of the same wordprocessing programme ...

From: "Risks of electronic publishing", D. Dressler, page 61, Letters to the Editors, Journal of Neurology, Steinkopff Verlag, Volume 244, Number 1, November 28, 1996, URL: http://www.springerlink.com/openurl.asp?genre=article&eissn=1432-1459&volume=244&issue=1&spage=61
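The letter does not say exactly how the character was lost, so the following is only a sketch of one way such an error can arise. It assumes, purely for illustration, that the text was saved in the Windows-1252 encoding and then read by software expecting a different encoding. The variable names are hypothetical.

```python
# Illustrative only: the encodings here are assumptions, not taken from the letter.
dose = "100 \u00b5g/day"  # \u00b5 is the micro sign

# Save the text as Windows-1252 bytes; the micro sign becomes the single byte 0xB5.
raw = dose.encode("cp1252")

# Software expecting a different code page shows another character in its place.
misread = raw.decode("cp437")

# Software limited to ASCII that skips unknown bytes drops the prefix altogether.
dropped = raw.decode("ascii", errors="ignore")

print(repr(misread))
print(repr(dropped))
```

In the second case the unit silently changes from micrograms to grams, which is the kind of mistake the letter describes.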

Accessibility

The Commonwealth guidelines also recommend use of the Human Rights and Equal Opportunity Commission advisory notes on World Wide Web access, issued under the Disability Discrimination Act 1992 for the purpose of avoiding discrimination:

Availability of information and services in electronic form via the web has the potential to provide equal access for people with a disability; and to provide access more broadly, more cheaply and more quickly than is otherwise possible using other formats. Examples of access are:

From: "World Wide Web Access: Disability Discrimination Act Advisory Notes", Version 3.2, August 2002, Human Rights and Equal Opportunity Commission, URL: http://www.hreoc.gov.au/disability_rights/standards/www_3/www_3.html

In August 2000 the Sydney Organising Committee for the Olympic Games was found to have engaged in unlawful conduct by providing a web site which was, to a significant extent, inaccessible to the blind. This is discussed in detail in the ANU course Internet, Intranet, and Document Systems (COMP3400).

Library Metadata

Libraries, such as the ANU Library, now provide web based search facilities which look similar to web search engines. They look like web search engines partly because web search engines evolved from concepts of libraries and partly because on-line library users are now used to web search interfaces.

It should be appreciated that libraries have been in the information business for longer than IT professionals. As an example, the Library of Alexandria was founded around 290 BC, destroyed by fire around 48 BC and opened again for business in 2002 AD, with an on-line catalogue:

The new Bibliotheca Alexandrina will be officially opened by Egyptian President Hosni Mubarak at a ceremony attended by other heads of state and top officials.

Based on the old Library of Alexandria, the most famous library of Ancient Times, this modern public study centre will be open to students, researchers and the general public. ...

From: "Inauguration of the Alexandria Library", UNESCO, 2002

On-line Public Access Catalog (OPAC)

Libraries are progressively changing from paper based to electronic systems, first for metadata and then for the information resources themselves.

Author Aristotle, 384-322 B.C.
Title Athenaion Politeia / Aristoteles; Edidit Mortimer Chambers.
Publisher Stuttgart : B.G. Teubner, 1994.
Call Number 089.81
Description xx, 84p., [4]p. of Plates : Plates ; 20cm.
Series Stmt Bibliotheca Scriptorum Graecorum et Romanorum Teubneriana ; No. 1113

From: "On-line Public Access Catalog (OPAC)", Bibliotheca Alexandrina, URL: http://www.bibalex.org/English/

Pneumatic tube to online

It was only in 2001 that the National Library of Australia stopped using pieces of paper propelled by air pressure to send book requests around the building:

It is the end of an era in the life of the National Library of Australia. The paper call slips, long used by readers at the National Library have been replaced with electronic call slips. The pneumatic tube system, a source of fascination to users and visitors over the last 33 years has now assumed museum status. ...

The procedure for requesting material using e-CallSlips is quite easy. After selecting an available item from the catalogue, clients swipe their cards through swipe readers located at each computer terminal. ... Requests are sorted electronically and transmitted electronically to a printer at the appropriate stack location.

From: "Call Slips ONLINE - Save Time!", Gateways, National Library of Australia, no. 51| June 2001, URL: http://www.nla.gov.au/ntwkpubs/gw/51/p01a01.html

Library metadata

Libraries use specialised terms for metadata items:

Search the ANU Library Catalogue (ANU material only)

Search By: Search Call Numbers: Title/Series

From: Australian National University (ANU) Scholarly Information Services / Library Catalogue - ANU Material Only, ANU, 2003, URL: http://library.anu.edu.au/search~S1/

Some conventions suit librarians not customers

Some conventions designed to suit librarians, rather than their customers, persist in library systems, such as the need to type in author names backwards:

Search here for names of authors, artists, composers and corporate authors such as government bodies, organisations and conferences. For persons with family names, type the family name first.

From: "ANU Library Catalogue - Author Search", ANU, 2003, URL: http://library.anu.edu.au/search/a

Catalogues adapted to paper and e-documents

As with corporate records management systems, library catalogues have been adapted to record both paper and electronic documents. The ANU library catalogue includes links to on-line versions of documents, where available:

Author Bourk, Michael J

Title Universal service? : telecommunications policy in Australia and people with disabilities / Michael J Bourk ; edited by Tom Worthington

Published Belconnen, A.C.T. : TomW Communications, 2000

Click on the following to:

View electronic text

LOC'N     CALL #               STATUS
CHIFLEY   HV1559.A8B682 2000   AVAILABLE ...

From: "ANU Full Database", ANU, 2003

MAchine-Readable Cataloging (MARC) Format

The same catalogue information can also be displayed in the MARC format, developed by libraries in the 1960s for "MAchine-Readable Cataloging". This format uses numeric codes to identify each metadata item:

050    HV1559.A8B682 2000
100 1  Bourk, Michael J
245 10 Universal service? :|btelecommunications policy in Australia and people with disabilities /|cMichael J Bourk ; edited by Tom Worthington
246 3  Telecommunications policy in Australia and people with disabilities
260    Belconnen, A.C.T. :|bTomW Communications,|c2000
300    xiv, 273 p. ;|c21 cm

From: "ANU Full Database", ANU, 2003
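To show how the numeric codes can be used by software, here is a minimal sketch that splits one line of a catalogue display like the one above into its tag, indicators and subfields. The function name is hypothetical. It assumes that a one or two digit token after the tag is the indicators, and that the first, unlabelled subfield is "a". Both are heuristics for this particular display layout and are not part of the MARC standard itself.

```python
import re

def parse_marc_display_line(line):
    """Split one catalogue display line into tag, indicators and subfields."""
    # Three-digit tag, then an optional run of one or two indicator digits (heuristic).
    match = re.match(r"^(\d{3})\s+(?:(\d{1,2})\s+)?(.*)$", line)
    if match is None:
        raise ValueError(f"not a MARC display line: {line!r}")
    tag, indicators, data = match.groups()
    parts = data.split("|")
    # Assumption: the display omits the code of the first subfield, taken here as "a".
    subfields = [("a", parts[0].strip())]
    for part in parts[1:]:
        if part:
            # The character after "|" is the subfield code, the rest is its value.
            subfields.append((part[0], part[1:].strip()))
    return {"tag": tag, "indicators": indicators or "", "subfields": subfields}

field = parse_marc_display_line("260 Belconnen, A.C.T. :|bTomW Communications,|c2000")
print(field["tag"], field["subfields"])
```

Here tag 260 carries the publication details, with the place, publisher and date in subfields a, b and c. A display line whose data happens to begin with a short number would confuse the indicator heuristic, so a real system would read the underlying MARC record and not its screen display.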

MARC adapted to XML

As with other metadata formats, MARC is being adapted to XML formats:

<?xml version="1.0" encoding="UTF-8" ?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    ...
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Arithmetic /</subfield>
      <subfield code="c">Carl Sandburg ; illustrated as an anamorphic adventure by Ted Rand.</subfield>
    </datafield>
    ...
  </record>
</collection>

From: URL: http://www.loc.gov/standards/marcxml//Sandburg/sandburg.xml
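One benefit of the XML form is that general-purpose XML tools can read it. As a rough sketch, the following uses Python's standard xml.etree.ElementTree module to pull the subfields of field 245 out of a cut-down copy of the record, with the elided parts simply left out. The helper name subfields_for is hypothetical.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Cut-down copy of the excerpt; the elided fields are omitted, not reconstructed.
SAMPLE = """<?xml version="1.0" encoding="UTF-8" ?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Arithmetic /</subfield>
      <subfield code="c">Carl Sandburg ; illustrated as an anamorphic adventure by Ted Rand.</subfield>
    </datafield>
  </record>
</collection>"""

def subfields_for(record, tag):
    """Return (code, text) pairs for every datafield with the given tag."""
    pairs = []
    for field in record.findall(f"marc:datafield[@tag='{tag}']", MARC_NS):
        for sub in field.findall("marc:subfield", MARC_NS):
            pairs.append((sub.get("code"), sub.text))
    return pairs

root = ET.fromstring(SAMPLE.encode("utf-8"))
record = root.find("marc:record", MARC_NS)
title_parts = subfields_for(record, "245")
print(title_parts)
```

Note that the numeric tag and the subfield codes survive as attribute values, so the XML form still requires knowledge of MARC to interpret.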

MARC to Dublin Core

However, it is more likely that such a record would be converted to Dublin Core format for use in non-library systems:

<?xml version="1.0" ?>
<dc xmlns="http://purl.org/dc/elements/1.1/">
  <title>Arithmetic /</title>
  <creator>Sandburg, Carl, 1878-1967.</creator>
  <creator>Rand, Ted, ill.</creator>
  <type />
  <publisher>San Diego :Harcourt Brace Jovanovich,</publisher>
  <date>c1993.</date>
  <language>eng</language>
  ...
</dc>

From: URL: http://www.loc.gov/standards/marcxml//Sandburg/sandburgdc.xml
see: MARC 21 XML Schema, The Library of Congress, 2003, URL: http://www.loc.gov/standards/marcxml//
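The conversion amounts to a crosswalk from MARC tags to Dublin Core element names. The following is a simplified sketch of that idea, not the Library of Congress stylesheet. The CROSSWALK table and the function name are assumptions made for illustration, and the input fields are hypothetical values back-filled from the Dublin Core excerpt above, not copied from the real MARC record.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Assumed, simplified crosswalk: MARC tag -> (Dublin Core element, subfield codes kept).
CROSSWALK = {
    "245": ("title", "a"),
    "100": ("creator", "ade"),
    "700": ("creator", "ade"),
    "260": ("publisher", "ab"),
}

def marc_to_dc(fields):
    """Build a <dc> element from a list of (tag, [(code, text), ...]) fields."""
    ET.register_namespace("", DC_NS)
    dc = ET.Element(f"{{{DC_NS}}}dc")
    for tag, subfields in fields:
        if tag in CROSSWALK:
            element, codes = CROSSWALK[tag]
            text = " ".join(value for code, value in subfields if code in codes)
            ET.SubElement(dc, f"{{{DC_NS}}}{element}").text = text
        if tag == "260":
            # The publication date is held in subfield c of field 260.
            for code, value in subfields:
                if code == "c":
                    ET.SubElement(dc, f"{{{DC_NS}}}date").text = value
    return dc

# Hypothetical input, reconstructed from the Dublin Core output for illustration.
FIELDS = [
    ("100", [("a", "Sandburg, Carl,"), ("d", "1878-1967.")]),
    ("245", [("a", "Arithmetic /"),
             ("c", "Carl Sandburg ; illustrated as an anamorphic adventure by Ted Rand.")]),
    ("260", [("a", "San Diego :"), ("b", "Harcourt Brace Jovanovich,"), ("c", "c1993.")]),
    ("700", [("a", "Rand, Ted,"), ("e", "ill.")]),
]

dc = marc_to_dc(FIELDS)
print(ET.tostring(dc, encoding="unicode"))
```

The sketch joins subfields with a space, so its publisher text differs slightly from the excerpt. It also shows what the conversion discards: the statement of responsibility in subfield c of field 245 has no place in the simple Dublin Core record, and the distinction between the main author (100) and an added entry (700) is flattened into two creator elements.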