Thorvaldsen G. - Automating Historical Source Transcription with Record Linkage Techniques. Work in progress on the 1950 census for Norway


Abstract: The article addresses the transcription of handwritten material from the 1950 Norwegian Population Census, which consists of 801,000 scanned double-sided questionnaires. Optical character recognition programs have been improving for over four decades, and researchers now aim to extend similar techniques to handwritten historical source material. The article analyzes studies carried out by the Center of Historical Documents at the University of Tromsø on handwritten text recognition and considers the use of various text recognition techniques for nominative sources. Since it is difficult to distinguish and separate individual handwritten characters, whole words are either clustered mathematically according to image similarity or searched for within sources that have been transcribed earlier. After quality control of the recognition results, the software uses the line numbers to place the information taken from the transcribed cells, which then becomes part of the census database. Special software has also been developed to process handwritten numerical codes, data on occupations and education, etc. The methods presented in the article improve the quality of handwritten text transcription and can be used to recognize nominative source records in Russia, for instance parish registers and vital records. The main remaining goals are to find methods and algorithms that optimally link different variables and to streamline interactive proofreading.
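The idea of grouping word images by similarity can be illustrated with a minimal sketch. It is not the software described in the article: the binary pixel grids, the `pixel_similarity` measure, the greedy `cluster_images` procedure and the 0.8 threshold are all assumptions chosen for illustration only.

```python
from typing import List

# Hypothetical representation: a word image is a small grid of 0/1 pixels.
Image = List[List[int]]


def pixel_similarity(a: Image, b: Image) -> float:
    """Return the fraction of pixels that agree between two equally sized images."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    if not flat_a or len(flat_a) != len(flat_b):
        raise ValueError("images must be non-empty and of equal size")
    same = sum(1 for x, y in zip(flat_a, flat_b) if x == y)
    return same / len(flat_a)


def cluster_images(images: List[Image], threshold: float = 0.8) -> List[List[int]]:
    """Greedy clustering: put each image in the first cluster whose
    representative (its first member) is similar enough, else open a new one.
    Returns clusters as lists of image indices."""
    clusters: List[List[int]] = []
    for idx, img in enumerate(images):
        for cluster in clusters:
            if pixel_similarity(images[cluster[0]], img) >= threshold:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters


# Toy data: two near-identical "words" and one very different one.
word_a = [[1, 1, 0], [1, 0, 0], [1, 1, 0]]
word_b = [[1, 1, 0], [1, 0, 0], [1, 1, 1]]  # one pixel differs from word_a
word_c = [[0, 0, 1], [0, 1, 1], [0, 0, 1]]  # inverse of word_a
groups = cluster_images([word_a, word_b, word_c])
```

With such groups, a proofreader would in principle need to transcribe only one representative per cluster; real word images would first require segmentation and size normalization, which this sketch leaves out.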
Lyakhovitskii E.A., Tsypkin D.O. - Infrared Text Visualization to Study Old Russian Scripts


Abstract: The article studies script as a material object, that is, as a system of traces left by a writing medium on a writing material (paper or vellum). Traces of the writing medium combine a relief and a dye (for instance, ink). A text understood as a combination of such traces shows differences in dye thickness and chemical composition at different levels of its structure. These differences are determined by various aspects of writing skill and can be used to characterize it. The article aims to present the advantages of a new electro-optical spectrozonal examination of historical inks for the study of handwritten scripts. It discusses the technology of digital visualization of documents in the near-infrared region followed by computer processing of the image. The work results in an outline of the main research directions for studying the information potential of a text as a physical object (a system of traces) by means of spectrozonal visualization: studying traces of the writing medium to reconstruct the system of movements and the writing technique, identifying zones written at different times, and searching for corrections.
Thorvaldsen G. - Record Linkage in the Historical Population Register of Norway


Abstract: The historical population register of Norway contains data on the country's population from 1800 to 1964. Information on the population from 1964 to the present is collected in the Central Population Register. The historical register draws on parish registers and civil records, which fill the gaps between the population censuses conducted every ten years. The census of 1801 and those from 1865 onward were nominative, that is, they recorded people's names. This article is devoted to the problems of linking census records and parish registers (record linkage) from 1800 to 1920. Special attention is paid to the identification of individuals and the difficulties of linking records. The main problem is to identify a person across records from different years, given the large number of namesakes and the variation in how names and ages were recorded. Creating stable identifiers for individuals and linking records from various sources required the development of new software combining automatic and manual methods. Analysis of local databases suggests that between two thirds and 90% of records can be linked successfully, depending on the period and region. The historical register of Norway is unique in its territorial coverage and in the variety of historical sources it draws on.
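The linkage problem described above (namesakes, variant spellings, imprecise ages) can be sketched in a few lines. This is an illustrative assumption, not the register's actual software: the record fields, the `name_similarity` measure based on Python's standard `difflib`, and the thresholds of 0.8 and two years are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, Union

# Hypothetical record layout: {"name": str, "birth_year": int}
Record = Dict[str, Union[str, int]]


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_candidate_match(
    census: Record,
    parish: Record,
    name_threshold: float = 0.8,  # assumed cut-off for spelling variants
    year_tolerance: int = 2,      # assumed tolerance for misreported ages
) -> bool:
    """Accept a pair when the names are similar enough and the
    birth years differ by no more than the tolerance."""
    names_close = (
        name_similarity(str(census["name"]), str(parish["name"])) >= name_threshold
    )
    years_close = (
        abs(int(census["birth_year"]) - int(parish["birth_year"])) <= year_tolerance
    )
    return names_close and years_close


census_entry = {"name": "Ole Olsen", "birth_year": 1850}
variant_spelling = {"name": "Ola Olsen", "birth_year": 1851}
wrong_generation = {"name": "Ole Olsen", "birth_year": 1875}
other_person = {"name": "Hans Larsen", "birth_year": 1850}

match_variant = is_candidate_match(census_entry, variant_spelling)
match_generation = is_candidate_match(census_entry, wrong_generation)
match_other = is_candidate_match(census_entry, other_person)
```

A rule of this kind only proposes candidate pairs. Where several namesakes pass the test, further variables (birthplace, family members) or manual review would be needed, in line with the combination of automatic and manual methods mentioned in the abstract.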
Bryukhanova E.A., Eremin A.A. - 1897 Census Primary Data Representativeness: Cartographic Approach


Abstract: The authors assess how the 1897 Census papers stored in Russian and foreign archives are represented and preserved. The study of collections of primary census documents leads to the conclusion that the term "census papers" covers a heterogeneous set of documents: several different forms, used depending on the type of household and the region, as well as first, second and third copies of census forms. A distinctive feature of the article is that its conclusions are presented as cartograms based on modern and historical maps. The study uses source studies analysis and spatial analysis, together with a comprehensive approach that treats census papers as a unified historical source irrespective of where they are stored. The novelty of the research lies in identifying and introducing a body of nominative 1897 Census data. In addition, the authors propose an original approach that takes into account both the number of populated areas and the number of census papers preserved for them, which allowed them to assess the degree of preservation of census materials in the uezds of the Russian Empire. The article concludes that census papers in varying states of preservation have been identified for 47% of guberniyas and 25.5% of uezds. Census paper collections cover regions of European Russia and Siberia and, in part, the Caucasus and Central Asia. The volume of the preserved census paper data and their "territorial spread" allow one to consider them a comprehensive source on the history of the population of the Russian Empire at the turn of the 19th and 20th centuries.
