Digihist24 Basel

On solid ground. Building software for a 120-year-old research project by applying modern engineering practices

Bastian Politycki/Christian Sonder

There is no doubt that the «increasing use of digital tools and computer-based methods in the humanities and cultural studies»[1] opens up a wealth of new possibilities. At the same time, it is becoming increasingly clear that it also creates new problems for the humanities. Many software solutions are 'quick hacks': changes to them are time-consuming and error-prone, and the long-term sustainability of the solution is questionable. Digital editing projects, most of which are based on TEI-XML, face this challenge from the outset: the 'TEI standard' is a loose collection of recommendations rather than a strict specification, so each project must develop a customized schema (a TEI subset) for its own data in order to enforce its edition or encoding guidelines and check compliance with them. These machine-readable rules must be supplemented by human-readable guidelines that document the fundamental philological decisions and serve as a manual for the editors.

Developing such a schema, together with the associated workflows, becomes particularly relevant in long-term projects such as the Collection of Swiss Legal Sources (SLS). Every change to the schema requires the existing datasets to be converted accordingly. It must be ensured that:

1. no unintended side effects occur.

2. it is clear which file must be validated with which version of the schema and how it should be processed.

3. human-readable and machine-readable documentation are always in sync.
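
The second requirement can be illustrated with a minimal Python sketch. It assumes a hypothetical convention in which every TEI file pins its schema through an `xml-model` processing instruction whose URL contains a semantic version; the URL layout, the function names and the version scheme are illustrative assumptions, not the SLS project's actual setup.

```python
from __future__ import annotations

import re

# Assumed convention (hypothetical): the schema URL in the xml-model
# processing instruction contains a version segment such as /v1.2.0/.
XML_MODEL = re.compile(r'<\?xml-model\s+[^?]*?href="([^"]+)"')
VERSION = re.compile(r"/v?(\d+)\.(\d+)\.(\d+)/")


def pinned_schema(xml_text: str) -> tuple[str, tuple[int, ...]]:
    """Return the schema URL and version a document declares for itself."""
    model = XML_MODEL.search(xml_text)
    if model is None:
        raise ValueError("no xml-model processing instruction found")
    href = model.group(1)
    version = VERSION.search(href)
    if version is None:
        raise ValueError(f"schema URL carries no version: {href}")
    return href, tuple(int(part) for part in version.groups())


def needs_migration(file_version: tuple[int, ...],
                    current_version: tuple[int, ...]) -> bool:
    """A file must be converted if it was written against an older schema."""
    return file_version < current_version


# Illustrative document header; example.org is a placeholder host.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<?xml-model href="https://example.org/schema/v1.2.0/tei.rng" '
    'type="application/xml"?>\n'
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"/>'
)
```

With such a convention, a conversion script could call `pinned_schema` on each file, validate it against exactly the schema it declares, and use `needs_migration` to decide whether the file has to be converted before it is processed further.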

This contribution shows how practices of modern software development, such as versioning and test-driven development (TDD), can be used profitably in humanities projects. It presents the entire workflow, from the creation of a modularized schema for a complex text corpus, which comprises texts in German, French, Latin, Italian and Romansh from the 6th to the 18th century, to the automated verification and publication of the documentation and the schema.
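
To give an impression of what test-driven development could look like for encoding rules, the following Python sketch writes one valid and one invalid fixture for a single rule and checks both. The rule itself (every `date` element needs an ISO-formatted `when` attribute), the function name and the fixtures are hypothetical stand-ins; in a real project the check would be performed by the schema rather than by hand-written Python.

```python
from __future__ import annotations

import re
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"
# Accepts YYYY, YYYY-MM or YYYY-MM-DD (a simplified, assumed date format).
ISO_DATE = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")


def dates_missing_iso_when(xml_text: str) -> list[str]:
    """Stand-in for one schema rule: report <date> elements without a valid @when."""
    root = ET.fromstring(xml_text)
    return [
        (date.text or "").strip()
        for date in root.iter(f"{TEI_NS}date")
        if not ISO_DATE.match(date.get("when", ""))
    ]


# Fixtures are written first; the rule is then implemented until both tests pass.
VALID = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
    '<date when="1524-03-01">1 March 1524</date></TEI>'
)
INVALID = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
    "<date>anno 1524</date></TEI>"
)


def test_valid_fixture_passes():
    assert dates_missing_iso_when(VALID) == []


def test_invalid_fixture_is_reported():
    assert dates_missing_iso_when(INVALID) == ["anno 1524"]
```

Test functions named this way can be collected by a runner such as pytest. Each later schema change then has to keep the existing fixtures passing, which is one way to detect unintended side effects early.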

[1] Manuel Burghardt, Claudia Müller-Birn: «Software Engineering in den Digital Humanities». In: 50 Jahre Gesellschaft für Informatik – Informatik für Gesellschaft, Workshopbeiträge der 49. Jahrestagung der Gesellschaft für Informatik. Kassel 2019, p. 75.