Dr. Christy Henshaw is the Digital Production Manager at The Wellcome Collection, London. Dr Henshaw spoke to us about her role in digitising the Collection’s materials and how they can be accessed and enjoyed
The Wellcome Collection holds a wonderful trove of images for artists and creators to use as an inspirational resource. The Wellcome team are in the process of digitising a significant portion of their holdings and making the content freely available under CC-BY license standards.
We spoke to Dr. Christy Henshaw about her work and how the resource can be used by our community
Hi Dr. Henshaw,
Thank you very much for taking the time to answer our questions, we’re looking forward to sharing your interview and the fantastic resources available at The Wellcome Collection!
Could you tell us about the Wellcome Collection and the background to the digitisation project?
Wellcome Collection is a museum and library focused on the history of medicine and health.
Wellcome Collection is a department within the larger organisation (Wellcome) that is one of the world’s largest charities, spending about £1bn per year on medical-related research and initiatives, both in science and the humanities.
We have two ways to access images online: one is is currently available on our Wellcome Collection image search. This is based on a collection of scanned prints and negatives, born digital images, and biomedical and clinical images given to us by individuals and organisations.
We currently make high-resolution images available under a CC-BY licence. However we are planning to stop using a CC licence for anything that is known to be out of copyright (using Public Domain Mark instead), and where items are old but of dubious rights status (e.g. we are not sure if the item is in or out of copyright, or who the rights holder is, it might be anonymous or orphaned), then we will not assign a licence, but will instead make the end user aware that the item may be in copyright, and to use the image in accordance with copyright law (e.g. under an exception). We haven’t yet been able to make these changes, as there are over 100k images to try to deal with, so for now the CC-BY licence stands.
In many cases we have specific permission to use a creative commons licence from a rights holder, and this could be CC-BY, or one of the more restrictive CC licences (such as CC-BY-NC or CC-BY-NC-ND), so it’s always a good idea to look at the licence information before use.
In 2007 Wellcome came up with a strategy for larger-scale digitisation of its physical collections. That’s when I came in. The picture library had a lot of visually-interesting images, but now the plan was to do what we call cover-to-cover digitisation, a facsimile of our library collections to provide a real alternative to visiting in person. Entire books, manuscripts, archive collections would be digitised. We also digitised a large proportion of our video archive. These are available via the Wellcome Library catalogue on the Library website.
We’re curious to find out how you started your career, and what path led you to working at the Wellcome Collection?
I originally studied to be an artist, then a historian and archaeologist, but life had other plans!
My first exposure to digitisation was in a temporary job in 1999 as an undergraduate student at St Andrews University in Scotland. One of the very first publicly funded digitisation projects in the UK, the images were made available online by SCRAN. I spent months aligning photos of Scottish churches on a document scanner, and believe it or not, they are all still online. A few years later, when I finished my MA in archaeology at UCL, I got a position at Lambeth Palace Library digitising 14,000 architectural plans for the Church Plans Online project. It was in this role that I really started to learn about cultural heritage digitisation technologies and standards. I also came to love working with historic archive materials.
When that work was finished, I started my PhD at UCL part-time and continued working at Lambeth Palace Library in a freelance capacity. I set up their digital reprographics service, fulfilling orders for new photography and working on a few projects for places like the School of Oriental and African Studies, St. Paul’s Cathedral and the Society of Antiquaries.
I enjoyed working in the archives sector immensely. Looking back, perhaps my enthusiasm was partly due to the contrast between learning on the job as a natural progression of skills, and the pressure of formal university studies. Everything I learned about digitisation had logical, practical applications to my work, with tangible benefits. I was working to create a greater impact for others, not to satisfy examiners. I found the world of archivists and librarians to be a more collaborative environment than academia, and I wanted more of that.
In 2007 I applied for a Digitisation Project Manager job at Wellcome. I hadn’t yet finished my PhD, so taking on a full-time job while writing-up was a bit daunting, but I couldn’t pass up the opportunity to further my career in digitisation. My job title has changed a few times since then, landing on Digital Production Manager. The programme has grown considerably from those early years. I now have a team of nine fantastic people. Together we manage the end-to-end workflows for digitisation from retrieval and preparation of physical items, to ingest of the digital assets into our digital library system. The role keeps bringing me new challenges and learning opportunities 11 years later, and I am incredibly fortunate to be able to work here.
The project to digitise the Library of the Wellcome Collection was originally started in the 1990’s which seems like an enormous undertaking considering the technology that was available at the time, and of course, the breadth of the material. What were the original motivations for embarking upon the digitisation project, and have they evolved over time?
For decades, the Wellcome had a “Medical Photographic Library”, a resource of historical and clinical pictures for researchers, media, and the public. With the advent of digital reproduction methods in the late 1990s, this needed to evolve to provide greater ease of access to our huge picture collection. Between 1999 and 2002 tens of thousands of our photographs and transparencies were scanned in a massive digitisation project. We started collecting digital biomedical images. The photography team started using digital cameras for new photography.
By the time I arrived on the scene in 2007 all those images were stored on servers and a public-facing online search tool was in place. Although in the early days the quality was far lower than today, this was a significant improvement in access for staff and researchers using our images.
In 2007 Wellcome decided to digitise the library collections to provide an alternative to visiting us on-site, and to facilitate new uses of our collections. To do this we needed to create complete facsimiles of items on a large scale. When I started, mass digitisation meant 100,000 images or so per year. In 2010 Wellcome’s Board of Governors approved a multi-million-pound budget to scale up digitisation significantly. We are now digitising four to five million images per year, with 37 million images now stored in our repository. The digital library holds tens of thousands of books and other printed materials, around 1,000 manuscripts, thousands of visual materials including paintings and drawings and ephemera, dozens of archive collections and 1,000 titles from our time-based media archive.
We still have a lot to do, and I expect many more years of digitisation before we can truly say we’ve completed our goal of “free and unrestricted access” to our collections - a concept described by Tom Scott, head of Wellcome’s Digital Engagement department, in this Medium post.
The range of topics available from September 2018 vary from Medieval and Early Modern Manuscripts to Nineteenth Century Books and Medical Officer of Health reports. How do you decide what material in which order will be digitised?
This has varied over the years in practice, but selection is based first and foremost on benefits to our users, then alignment with organisational priorities, our collection strengths and feasibility. The bulk of our digitisation is format-based. There is no need to select specific books if we can work our way through all of them in a few years. We are digitising all our medieval manuscripts and recipe manuscripts, but we can be reactive and prioritise specific items in the workflow if required. My team devises strategies to order the material in the most efficient way to get the items through as quickly as possible. This saves money and increases how much we can digitise with the budget that we have.
Archives and visual materials (such as drawings and paintings) require more careful selection by our collection specialists who know the materials, topics, and audiences well. It will take us a very long time indeed to digitise all our archives and visual materials. They are time-consuming and expensive to move around, assess, conserve and photograph. We cannot afford to digitise these indiscriminately.
We are currently reviewing our selection process to make it easier for staff to understand the criteria and reasoning, and to allow them to more easily propose items or collections for prioritisation.
The Wellcome Collection has made available their API to allow external developers to create apps that search the museum and library collections using the same API that staff use internally. In addition, anyone can access The Wellcome Collection's digital images using the standard International Image Interoperability Framework (IIIF) APIs with support provided for the Image API.
How do you envisage this resource to be leveraged by developers? Can you share any examples of how this has been used?
These APIs are still very new, but we have had some use. I have described some of these in a recent Medium post. Sourcera is a plugin for Google docs providing search, embed and auto-captioning functionality for images available through IIIF APIs. Content aggregators Europeana and the Digital Transgender Archive use our APIs to easily pull metadata and thumbnails into their sites from our catalogue and image repository. These two aggregators are at opposite ends of the scale, with Europeana serving as a pan-European collection representing all types of materials, and the Digital Transgender Archive curating a very specific set of content on a theme.
We are also able to provide access to our data in bulk. Researchers at Birkbeck downloaded a huge number of full-text documents to support development of their research tool Samtla.
Please can you elaborate on your open access policy and what is expected of anyone using the content extracted from your collections?
Wellcome is a strong supporter of open access. Our organisation, of which Wellcome Collection is a part, is a major funder of medical research and related initiatives in the UK and abroad. Wellcome’s charitable mission is to improve health by helping great ideas to thrive. We believe unrestricted access to research is key. It is a condition of funding that research outputs are made freely available, and not exclusively locked up behind publisher paywalls.
This is reflected in our policy at Wellcome Collection to provide free and unrestricted access to our own digital content. Unrestricted does not mean we will allow everyone to do anything they want with everything we have; it means we will remove what barriers we can. Licensing arrangements, impediments in search functionality, inefficiencies in digital asset transfer: we’re tackling such barriers on multiple fronts.
I’m responsible for ensuring we assign correct access and licensing information to our digital assets, and that we make as much content available openly as possible.
We are making images of out-of-copyright works available under a Public Domain Mark. That means that there are no known rights restrictions. When we digitise a book or a painting that is out-of-copyright, we don’t assert any rights over the images that we create, and we don’t charge for access. Anyone can use these images for any purpose. In some cases we assign terms and conditions in the form of standardised Creative Commons licences. This applies when we own the rights to the original item (such as a Wellcome publication), or when an author or publisher has assigned a licence to their work. We will impose access restrictions on unpublished material containing sensitive or private data.
One of my tasks for 2019 is to audit our digital assets’ licensing status and correct some legacy practices that no longer comply with our current policy.
What are the goals for the project within the next year?
I’m excited about next year because we are going to see some major changes in our discovery environment. This will make it much easier for people to find the digitised content that we’ve spent years creating. For example, we’ll be making better connections between related items and creating a more intuitive search and browse interface.
We are also moving all our systems and process to the Cloud. High throughput digitisation requires an efficient and robust environment for digital asset management. Moving to the Cloud will improve performance and scalability, making life a lot easier for my team. It will also open up new opportunities for analysis of data. An example of this would be programmatically identifying and classifying illustrations contained in all those books we’ve digitised. This will allow users to search imagery we didn’t even know we had.
We will also continue to add more digital assets to our online resource from across the library’s collections. In addition to our existing workstreams, in 2019 we will start digitising our collection of 1,200 oil paintings. Some of these have never been photographed before, but for most we will be replacing old low-quality images with up-to-date high-resolution files.
Finally, what is your favourite subject or piece in the Wellcome Collection Library?
We have many amazing things here at Wellcome, but that’s an easy question for me to answer because I have a long-standing fascination with our recipe manuscripts. This collection, a unique window into domestic life and culture in the UK and Europe, is a real treasure. Some of these manuscripts have been handed down through families over generations with each generation adding their own cookery recipes or medicinal cures, sometimes crossing out previous ones that didn’t quite satisfy. It is so easy to get lost in these recipes, both weird and wonderful.
In my very first year at Wellcome I project managed digitisation of our 17th century “receipt books” as they are commonly known. We are now in the middle of a push to get the entire collection, ranging from the 16th – 19th century, online. Several hundred are now available, and we have about 100 left to go. I have a dream to create a huge digital text-based resource of household recipes from a variety of repositories around the world – if only we could digitise and transcribe them all!
Here’s a prime example of a 17th century recipe book. Do you dare try any of these recipes? I haven’t - yet!
Thank you to Dr. Christy Henshaw and the Wellcome team for answering our questions. We’re very grateful for the time you have given to us, and the valuable resource you are creating.