This entry outlines the purpose and benefits of the University
Repository project.
The key systems
TULIP
Historically, one of TULIP’s functions has been to act as
the University’s publications database, holding bibliographic details of all
research publications and other outputs by University staff.
Bibliographic data held in TULIP are used for purposes such
as REF submissions and providing details of publications for personal web pages
and the PDR system.
The TULIP publications database was designed as an
internal-facing system, rather than one that provides bibliographic
details to users outside the University, and it does not enforce any specific
format for storing bibliographic details.
The Repository
Institutional repositories are online databases of
publications, and can include metadata-only records (i.e. just the information
about research publications) and metadata-and-full-text records.
The University Repository runs on the EPrints software,
developed at the University of Southampton. It is the most widely used such
software in the UK.
The Repository is specifically designed to host and make publicly
available bibliographic data in a properly structured fashion. It can also help
make the outputs of university research more widely available by hosting the
full text of outputs.
Interplay between the two systems
The University requires a publications database, a single
source of data on its research outputs (currently provided by TULIP). It also
needs to be able to comply with the open access requirements of the next REF
(for which it will need an institutional repository).
It is inefficient to require staff to enter and maintain
publications details in two separate systems. As such, the repository project
aims to integrate the two systems.
The purpose of the project
The project will see the research publications data
currently held in TULIP transferred into the Repository, so that the Repository
will have all the publications details of the University’s outputs. By the
project’s conclusion, there will therefore be only one system, holding both
the details of publication outputs and, where desired or necessary, the full
text of those outputs.
Bibliographic data in the Repository will still be available
to be called by TULIP for re-use in other systems, such as the PDR system,
personal web pages, and so on.
This has the following advantages:
- staff need only enter publication details into one system, not two;
- the system they’ll be using is designed to meet the new requirements of an externally facing repository;
- the new system will continue to feed those other systems (PDR pages, personal web pages, etc.) already in use;
- the new system will allow for full text of outputs to be stored on a central University system and, where necessary, made available on an open access basis.
How things will be after switchover
Once publications details from TULIP have been imported into
the Repository, all those details will be held in what is called the Review
Buffer within the Repository. This means that the data are available for
internal viewing by logged-in staff, but will not be available on the Repository’s
public interface. The data will still be available to other internal systems
for PDR, web pages and the other purposes outlined above. Staff can then choose
whether to make all of their records, or only some of them, publicly
available through the public repository interface.
A word on data quality
As already noted, TULIP was not designed to impose full
bibliographic standards on data collection.
One example of this is the Author field, which was collected
as a single free-format text string, rather than as a list of individual names (as
in the new Repository). Although most records have been split into individual
names for the data transfer process, some will not have migrated correctly:
names with particles, such as “de Bolins” or “van Houten”, may have been split
wrongly; “et al” may be given as an actual author name; and so on.
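To illustrate why a free-text Author field is hard to split reliably, here is a minimal Python sketch. It is not the project's actual migration code: the function names, the splitting rules and the sample string are all assumptions made for illustration.

```python
import re

# Hypothetical helpers for illustration only; not the project's migration code.

def naive_split_authors(raw):
    """Split a free-text author string on commas, semicolons and the word 'and'."""
    parts = re.split(r"\s*(?:,|;|\band\b)\s*", raw)
    return [p.strip() for p in parts if p.strip()]

def naive_surname(name):
    """Assume the family name is the last word (wrong for 'van Houten')."""
    return name.split()[-1]

def is_placeholder(name):
    """Detect 'et al' style entries that are not real author names."""
    return name.lower().replace(".", "").strip() == "et al"

# Made-up sample string in the style of a free-format Author field.
sample = "A. Smith, J. van Houten; R. de Bolins and B. Jones, et al"

for author in naive_split_authors(sample):
    if is_placeholder(author):
        print(f"{author!r}: placeholder, not an author")
    else:
        print(f"{author!r}: surname guessed as {naive_surname(author)!r}")
```

In this sketch the split itself works, but the surname rule drops the “van” and “de” particles, and “et al” comes through as if it were a name unless it is checked for explicitly. Records with problems of this kind are the ones most likely to need a manual check after migration.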
Whilst starting with fully cleaned data would be ideal, cleaning
the data would require significant manual intervention and add a
considerable delay to the project, so we have chosen to prioritise the release
of the new system.
Into the future
After switchover, the Repository will become the University’s
publications database, and bibliographic data on staff outputs will need to be
entered into it. This can be done manually or by simple import from other
sources of bibliographic data, for example using lists of DOIs or PubMed IDs.
As outlined earlier, other systems that use these
data (e.g. profile pages) will continue to work as before.