To all interested in seamless access to ETDs connected with NDLTD:
Fortunately, there is a growing number of institutions connected
with NDLTD that are helping make scholarly research available,
thanks to the activities of students and universities around the globe.
One important partner is VTLS, Inc., a member of NDLTD, with a
seat on the NDLTD Steering Committee.
VTLS provides a variety of services in the library automation and
digital library field. Their high-end service is "Virtua", which has
unique capabilities to handle multilingual collections of metadata
using UNICODE, MARC, XML, and other technologies. VTLS has
generously agreed to support NDLTD through a union database
of ETDs for all members of NDLTD willing to share their metadata.
VTLS will provide a computer, loaded with Virtua software, and
will use that to provide access to the union database on an ongoing
basis. As long as NDLTD members provide metadata free of charge
for this effort, VTLS will provide access free of charge as well.
Please see the announcement of this from last summer:
``VTLS Partners with Virginia Tech on NDLTD.''
Press release, July 2000, online in PDF
(http://www.ndltd.org/news/prVTLS.pdf) and Word
(http://www.ndltd.org/news/prVTLS.doc).
We have been continuing this effort and have a three-step
plan to have a service fully functioning in summer 2001.
The first phase, starting now, is with a small set of
beta sites, to extend the successful testing that has been
underway with Virginia Tech data. The second phase
is a larger testing effort, for which we invite volunteers
so that we cover a complete set of cases and languages. The
third phase will be the production effort in the summer.
In all cases the emphasis will be on metadata records
for which there is a link to the corresponding ETD, typically
in PDF, SGML, or XML, and in some cases with multimedia
elements. See the bottom of this message for tentative details.
We are hoping that interested sites will run the
software being developed through the Open Archives
Initiative (OAI, www.openarchives.org). This will allow
us to harvest data from all sites involved in a fully
automatic way. We have prototyped a harvester for this
purpose. The OAI standards will be announced in
Washington, D.C. on Jan. 23 and in Berlin on Feb. 26.
I am not sure if anyone in Asia will be having a similar
unveiling and briefing, but we hope that will happen
eventually.
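
As a rough illustration of the automatic harvesting described above,
here is a minimal sketch. It assumes the request and response shape of
a later version of the OAI protocol (OAI-PMH 2.0), which may differ
from what is announced in early 2001. The base URL, the sample
identifier, and the function names are hypothetical, and the response
is a canned sample, so nothing is fetched over the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses (an assumption for this sketch).
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL for one repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Only ask for records added or changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


def parse_headers(response_xml):
    """Return (identifier, datestamp) pairs found in a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for header in root.iter(OAI_NS + "header"):
        identifier = header.findtext(OAI_NS + "identifier")
        datestamp = header.findtext(OAI_NS + "datestamp")
        pairs.append((identifier, datestamp))
    return pairs


# Canned response standing in for what a member site might return.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.edu:etd-0001</identifier>
        <datestamp>2001-01-15</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

url = list_records_url("http://example.edu/oai")
headers = parse_headers(SAMPLE_RESPONSE)
```

A real harvester would loop over every participating site, fetch each
URL, and hand the returned records to the union database loader.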
On Jan. 9-10 there was an open meeting at OCLC in Dublin,
Ohio, to finalize the standard encoding for metadata about
ETDs according to Dublin Core in XML. VTLS will be working
with metadata in that form or in MARC. They will support
UNICODE for all languages.
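
To give a feel for an ETD metadata record in Dublin Core XML, here is a
small sketch. It is not the encoding agreed at the OCLC meeting, which
is not reproduced here; it only assumes the standard Dublin Core
element namespace and four of its elements. The wrapper element name
"record", the function name, and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard namespace of the Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def make_etd_record(title, creator, language, etd_url):
    """Build a minimal Dublin Core record that links to the ETD itself."""
    record = ET.Element("record")  # hypothetical wrapper element
    fields = (
        ("title", title),
        ("creator", creator),
        ("language", language),
        ("identifier", etd_url),  # the link to the full-text ETD
    )
    for name, value in fields:
        child = ET.SubElement(record, "{%s}%s" % (DC_NS, name))
        child.text = value
    return record


# A non-Latin title shows that the record is serialized as Unicode (UTF-8).
record = make_etd_record(
    "論文の題名", "Example Author", "ja", "http://example.edu/etd/0001.pdf"
)
xml_bytes = ET.tostring(record, encoding="utf-8")
print(xml_bytes.decode("utf-8"))
```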
While all this is going on, we will continue to maintain our
set of pointers to various ETD collections and to run
a simple federated search system at http://www.theses.org/.
We hope that every NDLTD site with a searchable collection is
listed there; sites that are not should send me their details
so they can be added.
We look forward to comments from all interested!
Regards, Ed
----- details follow -----
1. Steps in the process
a) The first, beta, phase is underway now and involves a small
number of NDLTD members that represent a wide variety of
languages, such as Chinese, Greek, German, and Spanish.
Testing will proceed through roughly the end of February.
b) The second, test, phase will proceed through the end of
May 2001. It will involve roughly 15 members of NDLTD that
represent a variety of types of data and languages. There will
be several from the Chinese-Japanese-Korean language group,
several with right to left character sets, and several from non-
English European countries. There will be a range of types of
MARC records (MARC-21, UNIMARC, SWEMARC, UKMARC, etc.).
c) The third, open, phase will include all interested members of
NDLTD that wish to participate in final open testing. This will
allow a much larger testing and tailoring effort before the service
goes fully public on July 1, 2001.
2. Formats
VTLS will accept 2 formats for metadata. One is MARC
communication format. The other is XML, according to the
specifications that will result from the Jan. 9-10 mtg at OCLC.
Members of NDLTD willing to supply metadata for the union
database testing efforts must indicate:
a) if data is in XML, what DTD (e.g., the above-mentioned
standard) is followed
b) if data is in MARC, which version of MARC is involved
c) what character set is used
d) number of records provided
e) which of the records have corresponding ETDs, and
how those ETDs are pointed to (e.g., by URL)
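
The items (a) through (e) above could be captured in a small
submission manifest, sketched here. This is only an illustration: the
class name, field names, and checks are hypothetical and are not a
format that VTLS has specified.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetadataSubmission:
    """What a member site reports alongside its metadata (hypothetical)."""
    institution: str
    metadata_format: str        # "XML" or "MARC"
    dtd_or_marc_version: str    # (a) DTD followed, or (b) MARC version
    character_set: str          # (c)
    record_count: int           # (d)
    # (e) record ID -> URL of the corresponding ETD
    etd_links: Dict[str, str] = field(default_factory=dict)

    def problems(self) -> List[str]:
        """Return a list of human-readable issues; empty means complete."""
        issues = []
        if self.metadata_format not in ("XML", "MARC"):
            issues.append("format must be XML or MARC")
        if not self.dtd_or_marc_version:
            issues.append("DTD or MARC version is missing")
        if not self.character_set:
            issues.append("character set is missing")
        if self.record_count < len(self.etd_links):
            issues.append("more ETD links than records")
        return issues


submission = MetadataSubmission(
    institution="Example University",
    metadata_format="MARC",
    dtd_or_marc_version="MARC-21",
    character_set="UTF-8",
    record_count=120,
    etd_links={"0001": "http://example.edu/etd/0001.pdf"},
)
print(submission.problems())
```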
3. Identifiers/Updates
All data provided must follow a clear scheme by which each
record is assigned a unique control number. These numbers must
clearly reflect the institution providing the data as well as a
unique ID for the record. Such a scheme is built in if
OAI is used. Note that an OCLC number can be used if
desired. This type of mechanism allows records to be
updated when necessary, with the control number ensuring
that the old and new versions are logically connected.
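
One way such control numbers could support updates is sketched below.
The "institution:local-id" layout, the class, and the method names are
assumptions made for illustration; the actual scheme is whatever each
site and VTLS agree on.

```python
def control_number(institution, local_id):
    """Combine an institution code and a local record ID into one key."""
    if not institution or not local_id:
        raise ValueError("institution and local_id are both required")
    if ":" in institution:
        raise ValueError("institution code must not contain ':'")
    return "%s:%s" % (institution, local_id)


class UnionDatabase:
    """Toy in-memory stand-in for the union database (hypothetical)."""

    def __init__(self):
        self._records = {}

    def upsert(self, institution, local_id, record):
        """Store a record; return True if it replaced an older version."""
        key = control_number(institution, local_id)
        replaced = key in self._records
        self._records[key] = record
        return replaced

    def get(self, key):
        return self._records.get(key)


db = UnionDatabase()
first = db.upsert("vt", "etd-0001", {"title": "Draft title"})
second = db.upsert("vt", "etd-0001", {"title": "Final title"})
```

Because both calls produce the same control number, the second
submission replaces the first instead of creating a duplicate record.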
4. Contacts
Those interested in participating in the testing efforts should
contact Ed Fox. The contact at VTLS for the effort is
Deveron Milne. The contact at Virginia Tech helping with OAI
issues is Hussein Suleman. The contact at Virginia Tech helping
with MARC-related questions and user-interface questions is
Robert France.
Professor Edward A. Fox, Ph.D.
Director, NDLTD
c/o Department of Computer Science
660 McBryde Hall, M/C 0106
Virginia Tech
Blacksburg, VA 24061 USA
ph +1-540-231-5113, FAX +1-540-231-6075
cellular phone/pager +1-540-230-6266
WWW http://fox.cs.vt.edu