Artificial intelligence helps provide access to manuscript heritage

Summary: The study examines the scientific and methodological context of READ, a European basic research project, and the application of its results in Slovakia and the Czech Republic. It forms part of the ongoing applications of the READ project. It traces the progress of research, applications and experiments undertaken by the international digital humanities community involved in the READ-COOP association since 2019. These activities include SKRIPTOR, a Slovak applied research project planned for 2020-2024. Drawing on a survey and selection of the latest information sources, the study notes progress in research and applications in the field of OCR. The core of the study takes a user-centred rather than IT-based approach to using the Transkribus platform for automatic text recognition of historical documents. It describes the experience and knowledge gained in adopting Transkribus, which applies artificial intelligence through its OCR engine and the HTR+ method. The study explains and illustrates the main steps of the experiments, the process of training the machine, the creation of new transcription models, and the results of automatic transcription of printed Fraktur texts and of manuscripts by Andrej Kmeť. It also presents the first new efficient transcription model in the Transkribus platform for the printed historical Slovak Fraktur (Gothic) script. First, it explains a unique experiment with the transcription of printed Slovak and Czech Fraktur texts. This is followed by a description of the advanced experimental transcription of Andrej Kmeť’s handwritten letters. Finally, it presents options for making transcribed collections and documents available on local networks and on the Internet.

Keywords: digital humanities, OCR, READ‑COOP, artificial intelligence, Transkribus platform, HTR+, SKRIPTOR project, Andrej Kmeť, schwabacher, fraktur, antiqua, read & search

Dec 28, 2022