Subject cataloguing of scientific publications by machine learning

Project Description

The aim of the project is to develop machine learning models based on the extensive metadata holdings of the German National Library (DNB). These models will represent the content of scientific publications mathematically, so that publications can be compared at an abstract level and relationships between them can be established. This will facilitate content-based searches and other functions. Freely available, pre-trained language models serve as the foundation; however, the project differs from the large language models used by technology companies. Using a training dataset specially created from DNB holdings, the project is working on a model that is as streamlined as possible yet optimised for the application. The researchers are also developing a web application that will allow users to visualise how close or distant publications are in terms of content and to search for entries with similar content.
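
The idea of measuring proximity between publications and searching for similar entries can be illustrated with a minimal sketch. It is not the project's implementation: the `embed` function below is a toy word-count vector standing in for a language-model embedding, and the catalogue entries, titles and function names are hypothetical. Only the general principle is shown, namely that each publication becomes a vector and similar content yields a high cosine similarity.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a language-model embedding: a sparse word-count vector.

    Assumption for illustration only; a real system would use a trained model.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (1.0 = same direction)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, catalogue, top_k=3):
    """Return up to top_k (score, title) pairs, best match first."""
    query_vec = embed(query)
    scored = [
        (cosine_similarity(query_vec, embed(text)), title)
        for title, text in catalogue.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical catalogue: title -> short description of the content.
catalogue = {
    "A": "neural networks for text classification",
    "B": "medieval book printing in Leipzig",
    "C": "deep neural networks for image classification",
}

for score, title in most_similar("neural networks classification", catalogue):
    print(f"{title}: {score:.3f}")
```

In this sketch the query shares words with entries A and C but none with B, so B receives a similarity of zero. A learned embedding would instead place publications close together when their meaning is related, even if they share no words.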

The project was proposed by Linus Herterich and Max Schaible, who are also carrying it out.

Duration

October 2024 – March 2025

Contact

DH-Stipendien@dnb.de

Last changes: 04.11.2024