Thursday, 19 June 2008
Marching to the Byte of the Drum
Hi - I'm Scott, an assistant on the Institutional Repository Project.
One of my main tasks on the project is to carry out usability studies that will inform and guide some of the technical developments of the project. As a team we are mindful that the repository services that are established must fit into the average academic's natural working practices as seamlessly as possible. This involves us going out talking to academics, finding out as best we can what their 'natural working practices' are; and so I've spent a lot of time lately asking academics questions about the self-archiving of their scholarly work and how much they know about the issues surrounding Institutional Repositories.
Scurrying around the campus with my paper questionnaires I've started to collect a small but growing body of data that I will hopefully be able to collate into nice, neat pie-charts and graphs. We have a core of 'early adopters' who are helping us with these studies but, as invaluable as they are, they cannot provide us with usability data for every possible research output that the University creates.
It was for this reason that I contacted our Music Librarian here at John Rylands University Library for a list of friendly music types that I could go and have a chat with; questionnaire at the ready. None of our 'early adopters' produce the kinds of research outputs that the Music School create and so I was pleased when three music lecturers agreed to meet with me.
After I had met with the three lecturers I sat down with my questionnaire results and tried to pull together some recurring themes. Three respondents is clearly not enough to draw sound scientific conclusions but it was enough to get a flavour of musicians' attitudes to open access and self-archiving.
Early Conclusions
- Music has fundamentally different concerns to published literature where open access is concerned.
- It is difficult to describe music.
- There is an enthusiasm for a central repository which can create lists of research outputs.
- Technology is fast transforming modern music research outputs.
The first theme is about royalties. An academic who makes his published articles open access is not jeopardising his revenue stream. A musician who makes his music open access most certainly is. It is for this reason that the vast majority of music research outputs will most likely be either metadata only or have the attached sound file embargoed indefinitely. The key concern then, as far as the repository is concerned, is to provide the researcher with the ability to describe the music as best they can (and perhaps to link to where the music is available to purchase).
This question of description is in itself a complicated business. I took a list of metadata fields with me to the lecturers and asked them which should be absolutely mandatory, which were not necessary, and which required amending. It became very clear, if it wasn't already, that providing the fields needed to describe all of the varying types of potential musical outputs is a wickedly complex task. Also, there are myriad ways a person can be related to a piece of music. Should the date relate always to the premiere or the publication or both? Is a new work of Beethoven's Fifth really a new work or a new manifestation of an old work? There are many tricky issues to tackle when creating the submission form for a musical research output and the solutions to these issues are as yet unclear.
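To make those questions a little more concrete, here is a rough sketch of how such a record might be modelled. It is only an illustration, not the Project's actual schema: the class names, field names and the sample record are all invented for this example. It shows one way to let a person hold several roles on the same piece, to keep the premiere and publication dates separate, and to point a new manifestation back at an earlier work.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Contributor:
    # One person in one role; the same person may appear several times.
    name: str
    role: str  # e.g. "composer", "performer", "conductor" (illustrative values)


@dataclass
class MusicOutput:
    # Hypothetical metadata record for a musical research output.
    title: str
    output_type: str  # e.g. "composition", "performance", "recording"
    contributors: List[Contributor] = field(default_factory=list)
    premiere_date: Optional[str] = None     # kept separate from publication
    publication_date: Optional[str] = None
    based_on: Optional[str] = None          # earlier work this is a manifestation of
    purchase_url: Optional[str] = None      # where the music can be bought, if anywhere

    def roles_of(self, name: str) -> List[str]:
        """Return every role a named person holds on this output."""
        return [c.role for c in self.contributors if c.name == name]


# Invented sample record: a new recording of an existing work.
record = MusicOutput(
    title="Symphony No. 5 (new recording)",
    output_type="recording",
    contributors=[
        Contributor("A. Lecturer", "conductor"),
        Contributor("A. Lecturer", "arranger"),
    ],
    publication_date="2008",
    based_on="Beethoven, Symphony No. 5",
)
```

Keeping both dates optional sidesteps the "premiere or publication or both" question for now: the submission form could then decide which of them, if any, is mandatory for each output type.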
What did become clear was that at present there is a strong desire for the kinds of functionality that the repository hopes to provide; key among these is the ability to create dynamic lists of research outputs to display on personal websites, for example. All three musicians agreed that this would also be useful from an administrative point of view as a way to list the entire research output of the school.

Dr. Ricardo Climent was one of the lecturers kind enough to meet with me to discuss Institutional Repositories - he also gave me a guided tour of the new NOVARS research centre building on Bridgeford Street, which is a state-of-the-art facility for electroacoustic composition, performance and sound-art research programmes. He explained to me that they had recently been investigating the possibility of the Research Centre setting up its own subject repository.
The centre creates large multi-channel sound files, in the region of a gigabyte each, which currently have no single location to be stored. Files of this size would fall outside the remit of the Institutional Repository, but the metadata could certainly be stored in the central repository with a link to the source file. As Dr. Climent and I talked it became clear that technology was driving change at a fast rate and that organising and sharing these new kinds of music files was posing challenges not encountered before.
It will be an interesting process exploring the best ways to deal with the unique problems that musical research outputs pose in terms of repository submission, display, and re-use. Not least because it will delay us having to deal with anthropological audio-visual material!
Monday, 16 June 2008
Choosing software for the University of Manchester's institutional repository - Part 2
In Part 1 of this series, I finished with a question, "who are we and what are we trying to achieve?".
This is my answer.
Who are we [The University of Manchester]?

The University of Manchester is a large and complex organisation.
The University employs around 11,000 staff. Of these, 3,500 are academic staff and 2,000 contract research staff. The number of registered postgraduate students is around 8,400 (3,600 research, 4,800 taught), with 4,000 students graduating annually (900 research, 3,100 taught).
Our research encompasses a wide range of disciplines, including biomedical and life sciences, engineering, physical and theoretical sciences, the arts, social sciences and business studies. See our Research activities for a comprehensive list.
The University is structured into 4 major academic faculties. Faculties are organised into 22 academic schools. These academic departments are complemented by 11 research institutes, which cross academic boundaries and incorporate strengths into core research priorities.
The University further supports a range of cultural assets, including Jodrell Bank Science Centre, Manchester Museum and Whitworth Art Gallery.
Research, teaching and learning activities are supported by a range of administrative services. Of particular relevance are the John Rylands University Library and Information Technology Services.
What are we [the Institutional Repository Project] trying to achieve?
Our Institutional Repository (IR) Project is well defined. It has agreed aims, scope and outline plan (see Project scope and deliverables and Project outline plan).
The Project aims to sustain and enhance research reputations by establishing institutional repository services. We consider a repository as a place where individuals can store, manage, preserve and disseminate their scholarly work.
Currently, we can only estimate the types and numbers of scholarly works University authors create annually. The Project aims to support research works in four main categories. Our current estimates for these are:
- academic publications: ~7,000 - 11,000
- theses and dissertations: 900 PhD theses, 3,100 Masters dissertations
- grey literature: probably <5,000
- audio/visual materials: probably <2,000
Hence our best guess is that University authors create around 15,000 to 20,000 pieces of scholarly (research) work annually.
Project deliverables are:
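As a quick sanity check on that guess, the category estimates can be added up. This is only a sketch: treating "probably <5,000" and "probably <2,000" as ranges starting from zero is my assumption, and the variable names are made up for the example.

```python
# (low, high) annual estimates taken from the four categories listed above.
# Assumption: a "probably <N" estimate is treated as the range 0 to N.
estimates = {
    "academic publications": (7_000, 11_000),
    "theses and dissertations": (900 + 3_100, 900 + 3_100),  # PhD + Masters
    "grey literature": (0, 5_000),
    "audio/visual materials": (0, 2_000),
}

low = sum(lo for lo, _ in estimates.values())
high = sum(hi for _, hi in estimates.values())

print(f"Between {low:,} and {high:,} works per year")
```

Under those assumptions the totals come out at 11,000 and 22,000, so a best guess of 15,000 to 20,000 sits comfortably inside the possible range.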
- D1. Stakeholder engagement and awareness
- D2. An Institutional Repository Services Support Network
- D3. A set of repository technologies
- D4. A governance and sustainability plan
- D5. A functional institutional repository
The Project is well supported (three staff for 2 years) and, critically, benefits from senior management buy-in. The Project has an engaged community of stakeholders represented by an Academic Sponsor, a Steering Group, an ETD Working Group, a Technical Advisory Group and 9 early adopters (see Project structure).
Conclusion ... we have the technology

In summary, the University is large, complex and well-resourced. Our Institutional Repository Project is well-supported and clearly defined.
What does this all mean in terms of choosing repository software?
We need sustainable software. With senior management buy-in, the IR Project is a window of opportunity. It represents a long-term commitment both in terms of supporting the University's research community and preserving digital scholarly works.
We need scalable software. Software will need to support a significant number and wide diversity of digital scholarly work. We have a large user base, with a wide range of expectations and knowledge.
Resource availability and technical expertise are NOT obstacles, within reason. The University has extensive in-house support services that are expert in a range of subjects and technologies. We have a well-resourced IR Project with a defined scope. Saying that, our choice needs to be sensitive to resource usage and existing technical preferences.
We need software that is sensitive to future requirements. Our focus is supporting scholarly communication and specifically research outputs. We have excluded experimental data and teaching and learning materials, both of which are forms of scholarly work. It is sensible that, if possible, we choose software that could accommodate these in some future form.
In the words of Oscar Goldman from the 1970s TV series, The Six Million Dollar Man, "... we have the technology".
Thursday, 12 June 2008
Choosing software for the University of Manchester's institutional repository - Part 1
As part of the University of Manchester's Institutional Repository Project, I and other members of the Project Implementation Team are faced with the daunting task of recommending and implementing a suitable software solution. This article and subsequent articles outline my thoughts and our travels towards this.
The only certainty is uncertainty!

Some might say choosing the right software for an institutional repository is like backing the right racing horse. As with horse racing we have to choose from a set of available candidates. Each candidate has good features to varying degrees. To choose a winner you assess each feature for each horse. This may involve examining a horse's past performance based on recorded race results. You add up the good and bad points and hey presto, the winner is chosen!
A standard way to choose software is to undertake a requirements analysis. As with horse racing, this involves writing down a (normally very long) list of requirements. Ideally these should be based on what users want. Then you weight these in some form to indicate how important each is and score each software candidate against each requirement. You add up the scores (weighted if necessary) and the software candidate that scores the highest is your choice.
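For anyone who hasn't seen this done, the weighted scoring described above boils down to something like the sketch below. The requirement names, weights, candidates and scores are all invented for illustration; they are not our actual requirements or an assessment of any real product.

```python
# Hypothetical requirements with weights (1 = nice to have, 3 = essential).
weights = {
    "custom_metadata": 3,
    "oai_pmh_support": 3,
    "ease_of_install": 1,
}

# Hypothetical scores (0 to 5) for two made-up candidates.
scores = {
    "Candidate A": {"custom_metadata": 4, "oai_pmh_support": 3, "ease_of_install": 5},
    "Candidate B": {"custom_metadata": 5, "oai_pmh_support": 4, "ease_of_install": 2},
}


def weighted_total(candidate_scores, weights):
    """Sum each requirement's score multiplied by that requirement's weight."""
    return sum(weights[req] * candidate_scores[req] for req in weights)


totals = {name: weighted_total(s, weights) for name, s in scores.items()}
best = max(totals, key=totals.get)

print(totals)  # {'Candidate A': 26, 'Candidate B': 29}
print(best)    # Candidate B
```

Note how sensitive the outcome is to the weights: raise the weight of "ease_of_install" to 3 and Candidate A wins instead (36 against 33), even though none of the scores has changed.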
For institutional repository software, I see a number of problems with this approach to choosing the software (none of which are new to software engineers).
- it's apples and oranges - repository software comes in a number of very different shapes, sizes and colours; as a consequence, comparing candidates can be difficult because, effectively, they do the same thing but in different ways
- it can be difficult to know exactly what is around the corner - repository software is evolving rapidly with developer communities continuously launching new and better features
- it's all still very new - institutional repositories are still a relatively new subject to your average academic user, making it difficult to identify requirements
- you need to see it working - determining if a particular product does what you want is difficult from documentation alone, because documentation is often a work in progress; the only real way to know accurately what a piece of software does is to install, configure and test it
- it's all very time consuming - it can take a lot of time and effort to do requirements analysis, so much so that by the time you have finished, your users' knowledge and expectations have changed and/or a new release of the software has launched, such that you need to restart the process
- we need it now! - it's common that delivering a product like an institutional repository has to be done with limited resources and within a certain timeframe; the more time you take selecting your software the less time you have to implement and test it
Others have concluded the same to varying degrees. Useful articles in this respect include:
- in April 2004, Jody DeRidder published the article "Choosing Software for an Institutional Repository" in which she argued for consideration of scope and future interoperability
- in August 2004, The Open Society Institute published the "System Feature and Functionality Table" which attempts to explain the relevance of system technical features in the context of a repository's broader planning, design, and policy framework
- in November 2004, Chris Taylor compared a number of OAI-PMH 2.0 compliant software solutions in his paper, "Criteria for choosing repository software"
- in December 2005, Andy Powell published "Notes about technical criteria for evaluating institutional repository (IR) software" in which he outlined a number of issues to be addressed when making a choice
- in August 2006, as part of the Open Access Repositories in New Zealand Project, Richard Wyles published a Technical Evaluation of Research Repositories - probably the most comprehensive comparison of the three main open source repository solutions to date
In summary, I believe it's fair to say, the only certainty about selecting repository software is that there is considerable uncertainty!
What are we to do?

First, let's throw away the racing horse analogy. Choosing repository software is not about winning and losing. The end result is not the best, fastest or brightest product; it is the most satisfactory solution to our situation.
To make an informed choice we need to manage uncertainty, uncertainty in our knowledge of the software, both now and in the future. This is like a balancing act. We can only improve our knowledge incrementally and only when we feel confident enough can we make an informed choice.
Of course we can't spend forever choosing the software. So we need to make our choice with the minimum time and effort, leaving more time to implement our preferred solution.
In my next article I'll focus on "our situation" and answer the questions, who are we and what are we trying to achieve?
Go to Part 2.
Wednesday, 4 June 2008
Welcome to this blog

This is me and my first blog - ironic really, I've only been building websites since 1994. I guess "if you can't beat them join them"!
What's the purpose of this blog?
Currently, I'm managing a 2 year Project within the University of Manchester. The Project aims to sustain and enhance the reputations of University researchers by implementing institutional repository services.
The Project has been running for just over six months now. We've made quite a lot of progress but there's still a lot to be decided and done. What has become clear to me is we need somewhere to share opinions whether they be rambles or rants.
We have a Project website where we post all our formal stuff, but we need something less formal as well, hence this blog.
So what might you see on this blog in the coming months?
First I promise (to myself) to post at least one article each month. If I can encourage my colleagues, Scott/Nilani to share their thoughts by blogging here, then you may see other articles as well.
My articles will focus on a range of topics in 'my work world' (no pet or holiday piccies here). These will include things that inspire me, things that bug me and just what's new in the areas of information architecture, institutional repositories, scholarly communication, usability and web technologies.
If anybody reads this stuff, great. If not, well I won't be offended or disappointed, after all it's just a blog.
Cheers, Phil
PS. If you want to contact me try p.butler@manchester.ac.uk