Keep Asking Questions: Catalogue searching – the three prongs of optimisation
In a previous blog post Abby Wharne, the Asa Briggs intern at The Keep, explained how searching our archive catalogue can be different to searching on Google or other popular search engines. Abby mentioned some of the ways that we have tried to improve our catalogue to make it more accessible, and in this post I want to elaborate on them.
The process of optimising our catalogues is three-fold:
- Optimise descriptions in catalogue records
- Improve the mechanics of the search engine
- Create curated content on the website.
The first is really an extension of one of the archivists’ core roles – to describe archival documents in an easy-to-understand fashion so that people can find them. Most of our records were catalogued with the aim of publishing them in a printed hand list, but this is problematic now that the records are online.
As an example, let’s take a look at baptism registers for the Parish of Glynde. If you wanted to find these using a hand list, you would pick up the Parish of Glynde hand list and find the series called “Incumbent Registers (1558-1994)”. Within this series there would be a sub-series called “Baptism registers (1813-1899)”, and within that you would see the record “Baptism register (Jan 1813-Jun 1899)” and note the reference number to order it. When this hand list is transferred to an online environment, it is represented by separate records in a hierarchy:
Parish of Glynde (record representing the hand list)
Incumbent Registers (1558-1994) (record representing the series)
Baptism registers (1813-1899) (record representing the sub-series)
Baptism register (Jan 1813-Jun 1899) (record representing the physical register)
See our FAQ “What is the hierarchy?” for a more detailed explanation of the hierarchy.
If a researcher finds this record by browsing the Parish of Glynde hierarchy, they know the register belongs to this parish. But if they search for “Glynde register”, they won’t find it, because no single record in the hierarchy contains both the words “Glynde” and “register”. The researcher must go to the top level of the hierarchy and then browse down to the register, and browsing depends on the researcher being familiar with the hierarchy in the first place. Not so difficult for the seasoned researcher, but certainly very different from the immediacy of a Google search that an archive novice might be used to.
To improve this, we need to change the lowest level catalogue record to read “Parish of Glynde Baptism register (Jan 1813-Jun 1899)” so that the record shows up in a search for “Glynde register”.
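To make the idea concrete, here is a minimal Python sketch of the Glynde example. It is an illustration only, not how our catalogue software actually works: the `Record` class, the `searchable_title` helper and the simple "every word must appear" `matches` function are all hypothetical names and assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """One catalogue record; `parent` points to the level above it (hypothetical model)."""
    title: str
    parent: Optional["Record"] = None


def top_level(record: Record) -> Record:
    """Walk up the hierarchy to the record representing the hand list."""
    while record.parent is not None:
        record = record.parent
    return record


def searchable_title(record: Record) -> str:
    """Prefix a record's title with the title of its top-level record."""
    root = top_level(record)
    if record is root:
        return record.title
    return f"{root.title} {record.title}"


def matches(query: str, text: str) -> bool:
    """Naive search (an assumption): every query word must appear in the text."""
    lowered = text.lower()
    return all(word in lowered for word in query.lower().split())


# The hierarchy from the example above.
parish = Record("Parish of Glynde")
series = Record("Incumbent Registers (1558-1994)", parent=parish)
sub_series = Record("Baptism registers (1813-1899)", parent=series)
register = Record("Baptism register (Jan 1813-Jun 1899)", parent=sub_series)

print(matches("Glynde register", register.title))              # original description
print(matches("Glynde register", searchable_title(register)))  # improved description
```

With the original description the search fails, because “Glynde” only appears at the top of the hierarchy. Once the parish name is carried down into the lowest level record, the same search succeeds.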
The second job is to optimise the mechanics of the search engine. As an example, we have improved its ability to recognise dates and date ranges.
Before the optimisation, a search for records with a date range of “1850-1870” would only bring up records that specifically had “1850-1870” in them. After the optimisation, the search engine delivers records with any date in that range, e.g. 1859 or 1869, and also records from 1874, as there is a 5 year buffer on either side of the date range. Similarly, a search for “Early 17th cent” would previously only find records with the specific text “Early 17th cent”. After the optimisation, this search also finds records with specific dates such as 21st March 1603 or 21/03/1603.
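The date matching described above can be sketched roughly as follows. This is a simplified illustration rather than the search engine's real code: the function names are hypothetical, the year parsing is deliberately naive, and treating “early” as the first third of a century (and the 17th century as 1600-1699) is an assumption made for this example. Only the 5 year buffer comes from the description above.

```python
import re

BUFFER_YEARS = 5  # the 5 year buffer on either side of the searched range

# Assumed meaning of century qualifiers, as year offsets within the century.
CENTURY_PARTS = {"early": (0, 33), "mid": (34, 66), "late": (67, 99), "": (0, 99)}

CENTURY_PATTERN = re.compile(
    r"(early|mid|late)?\s*(\d{1,2})(?:st|nd|rd|th) cent", re.IGNORECASE
)


def parse_year_range(text: str) -> tuple[int, int]:
    """Turn '1850-1870', '21/03/1603' or 'Early 17th cent' into (start, end) years."""
    century = CENTURY_PATTERN.search(text)
    if century:
        first_year = (int(century.group(2)) - 1) * 100  # 17th century -> 1600
        low, high = CENTURY_PARTS[(century.group(1) or "").lower()]
        return first_year + low, first_year + high
    years = [int(y) for y in re.findall(r"\b\d{4}\b", text)]
    if not years:
        raise ValueError(f"no recognisable date in {text!r}")
    return min(years), max(years)


def date_matches(query: str, record_date: str, buffer: int = BUFFER_YEARS) -> bool:
    """True if the record's dates overlap the searched range, widened by the buffer."""
    query_start, query_end = parse_year_range(query)
    record_start, record_end = parse_year_range(record_date)
    return record_start <= query_end + buffer and record_end >= query_start - buffer


print(date_matches("1850-1870", "1859"))                   # inside the range
print(date_matches("1850-1870", "1874"))                   # inside the 5 year buffer
print(date_matches("1850-1870", "1876"))                   # outside the buffer
print(date_matches("Early 17th cent", "21st March 1603"))  # specific date
print(date_matches("Early 17th cent", "21/03/1603"))       # same date, numeric form
```

The key design choice is that both the search and the record are converted to year ranges before they are compared, so the match no longer depends on the two pieces of text being written in exactly the same way.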
The third job of optimisation is related to curated content.
Curated content is new content that we have added to our web pages that sums up a particular subject, person or place and provides links to related catalogue records. As Abby mentioned in her post, we are in the process of creating Places pages for every parish in East Sussex. So rather than doing a search for a place, you can look it up on its Places page, which can lead you into a particular area of research relating to that place. We also have other pages, such as Leonard Woolf Papers and German Jewish Collections, that explain more about the person or subject area and can lead you into related catalogue records. We aim to create much more curated content like this in the future to offer researchers a more targeted way to conduct their research.
These three methods of catalogue optimisation will over time make our catalogues easier to use and bring more useful search results to the top of the results page. Having just over a million records to deal with means that this is a large job, and as we learn more about how researchers search our catalogues we can make changes to the search engine to improve their experience. Google has been going since 1998 and is a multi-billion dollar company; our resources don’t stretch that far so it might take us a little while to catch up!