Abstracts

Participants and titles

Bentein, Klaas: Studying ancient interaction ritual: a comparison of two Greek corpora
Brown, Joshua: Formulae as a source of linguistic innovation in Renaissance Italian: the salutatio and conclusio as loci of language contact
Chioni, Irene: Request Formula Variations in Greek Petitions from Roman Egypt: Typological Classification and Sociolinguistic Interpretation
Cichosz, Anna & Pęzik, Piotr: Towards a combinatorial dictionary of historical English
Cook, Samuel Peter: Cross-linguistic formulae and the study of contact-induced language change: the case of Greek and Coptic legal formulae from 6th–8th century Egypt
di Bartolo, Giuseppina & Marchesi, Beatrice: Negative concord in Postclassical Greek: Impact and functions of formulaic expressions in documentary papyri
Di Pasquale, Daniele: Shifts and Standardization of Language Patterns in Korean Old Vernacular Epistles (언간, 諺簡, Ŏn'gan)
Drigo, Jasmim: Calques or Formulaic Expressions? An analysis of calques in Early Irish religious language
Elalfy, Doaa: The Canonization and Transmission of Collyria (اشياف) in Greek and Arabic Medical Traditions: Lexical Shifts and Formulaic Patterns
Elder, Claire: The Fermesse, Fidelity and Faith: A Pragmatic Analysis of the Formulaic Symbols in an Early Modern Scottish Community of Practice
Fantoli, Margherita & Korkiakangas, Timo: Exploring formulaic language in dependency treebanks using network analysis
Fascione, Sara: Formularity and idiosyncrasy in Fronto’s letter headings (panel B: Formulae in Latin Epistolography)
Fezer, Katharina: Tracing and comparing formulae in printed and handwritten texts: Methods, issues, challenges
Frog: Translating Formula: Formula as a Universal Concept or a Concept on the Move?
Giannikou, Kyriaki: Assessing and Reassessing Formulaicity: are editorial practices a blessing or a curse?
Ginevra, Riccardo, Biagetti, Erica, Brigada Villa, Luca & Zanchi, Chiara: Comparing Indo-European Poetic Languages: How to Combine Construction Grammar and Digital Resources for the Analysis of Formulaic Phraseology in Vedic Sanskrit and Homeric Greek (30 min)
Groot, Hester: Identity construction and genre shift through formulaic language in Scottish pauper letters, 1750-1900
Große, Sybille: Formulaicity in French letters: function and acquisition in theory and empiricism
Honkanen, Saara: Formulaicity in Medieval Latin Historical Prose: the Case of Freculf of Lisieux
Jauhiainen, Tommi
Kaislaniemi, Samuli: Address formulas and material practices in seventeenth-century English letters (30 min)
Kayachev, Boris: ‘Roses are red and violets are blue’: poetic language between formulaicity and intertextuality (the case of purpureus)
Kootstra-Ford, Fokelien: Formulaic variation: Leveraging formulaic language to understand linguistic variation in Dadanitic inscriptions (6th–1st c. BCE) (panel A: Shared formulae, continuity, and change in the epigraphy of Northern Arabia)
Kopaczyk, Joanna: Studying formulaic language in historical linguistics
Korkiakangas, Timo: Mitigating formulaicity bias in historical corpus linguistics
Koroli, Aikaterini: Stereotypicality and variation in Greek private papyrus letters: a focus on stereotypical directive speech-acts
Longrée, Dominique & Vanni, Laurent: New Ways to identify Formulaic Expressions in Latin Epistolography: Between Statistics and AI (panel B: Formulae in Latin Epistolography)
Maczuga, Julia: The religious formulae attested in the Arabic graffiti from North-West Arabia during the Late pre-Islamic and Early Islamic periods: A study in continuity and change (panel A: Shared formulae, continuity, and change in the epigraphy of Northern Arabia)
Majdak, Magdalena: Evolution of the Formulaic Expressions Referring to God in Polish Language History: Analysis of the Correspondence of the Czapski Family
Marszałek, Jagoda & Wieczorek, Aleksandra: Polish and Latin date formulas used in Polish texts from 17th to 18th centuries
Martín González, Elena & Konstantopoulou, Stavroula: Formulaic Language in the Oracular Inscriptions of Dodona: Integrating Traditional Epigraphic Analysis and Deep Neural Networks
Meeder, Sven & Schmidt, Gleb: Formulae of Authority: Formulaic Aspects of Referencing the Bible in Early Medieval Canon Law
Mika, Tomasz: Division of Old Polish Apocrypha: Title Formulas in the Middle Ages (30 min)
Murgia, Giulia & Puddu, Nicoletta: Notarial Formularies in Early Modern Sardinia
Mäkinen, Martti: Exploring formulae through stylometric analysis of Middle English documents
Norris, Jérôme: The highly formulaic nature of epigraphic habits in North-West Arabia before Islam (30 min; panel A: Shared formulae, continuity, and change in the epigraphy of Northern Arabia)
PANEL A: Norris, Jérôme, Maczuga, Julia & Kootstra-Ford, Fokelien: Shared formulae, continuity, and change in the epigraphy of Northern Arabia
PANEL B: Longrée, Dominique, Vanni, Laurent, Fascione, Sara, Rosa, Arianna & Thon, Valérie: Formulae in Latin Epistolography
Rodek, Ewa: The Role of Keywords in Building Sender-Receiver Relationships: A Case Study of Polish-Language Texts from 1600-1750
Roldão, Filipa & Serafim, Joana: Formulaic Language in Portuguese Municipal Charters of the Middle Ages: A Historical and Linguistic Analysis
Rosa, Arianna & Thon, Valérie: Letters Across Time: a Diachronic Study of the Epistolary Formulas in Cicero, Jerome and Peter Damian (panel B: Formulae in Latin Epistolography)
Salemenou, Maroula: Diplomatic correspondence in the corpus Demosthenicum: an evaluation of authenticity
Scapini, Elia & Iezzi, Federico: Θεὸν εκ θεοῦ: a case study for semantic retrieval in Ancient Greek
Schironi, Francesca: Formulae and Formulaic Language in Hellenistic Greek Astronomy
Soffiantini, Laura: Pliny’s formulaic language in geographical books
Stenroos, Merja: Formulaicity and the individual voice in late medieval English legal statements (30 min)
Vatri, Alessandro: Aristotle’s ‘diagrammar’: Formulaicity and multimodality in the Organon
Vezzosi, Letizia & Rosselli Del Turco, Roberto: Poetic formulas in the Germanic literatures of the Middle Ages: semantic annotation and analysis
Vierros, Marja
Wong, Catherine, Fitzmaurice, Susan & Lam, Benson: Tracing Formulaic Patterns and Language Change in Early Modern English: A Quantitative and Computational Approach
Wong, Jorge: The Formulaic Template and Linguistic Innovation in Homer
Yiftach, Uri: Teaching AI Greek Syntax: The Taxonomy of the Legal Document as an Experimental Platform
Zilio, Leonardo & Arblaster, Paul: Exploring formulaic language in 17th-century Dutch-language newspaper articles

Abstracts

Bentein, Klaas

Studying ancient interaction ritual: a comparison of two Greek corpora

Formulaic language has received a fair amount of attention in fields such as papyrology and epigraphy, disciplines which work with corpora that typically consist of shorter, relatively repetitive texts; scholarship has catalogued the formulaic phrases that can be found, focusing in particular on specific genres, such as letters and petitions in papyrology (e.g. Mascellari 2012; Nachtergaele 2023 for Ancient Greek texts). To a limited extent, these inventories have been incorporated within a broader ‘ecology of writing’ framework (for which, see e.g. Basso 1989). One approach has been to study the texts as instances of ‘technical’ text types (Fachtexte), which can be characterized in terms of an isomorphy of features, whether structural, formal, contextual, visual, or language-related (see Kruschwitz and Halla-aho 2007 for Pompeian wall inscriptions; and for technical literature, see Fögen 2010). A related perspective, rooted in the Anglo-American tradition, examines ‘formulaic’ genres (Kuiper 2000; 2009), which are characterized by fixed discourse components as well as formulaic phrases cueing each of these components (for an outline of the ‘discourse grammar’ of ancient Greek letters, see Bentein 2023a, 433–46).

In this contribution, I intend to develop a third perspective that has received limited attention so far, known as ‘interaction ritual,’ a concept first introduced by Erving Goffman to describe the structured, ritualistic behaviors people use in everyday interactions to uphold social order and manage social identities (Goffman 1967), and more recently expanded by Dániel Kádár (e.g., Kádár 2013; Kádár 2024). Kádár’s typology of ritual interaction offers a set of analytical tools that I believe can be applied to compare different corpora (and the genres within these corpora), providing a deeper understanding of formulaic phraseology. I will apply four features identified by Kádár to examine two corpora that I have been studying for some time: Greek documentary texts preserved on papyrus (Palme 2009) and Byzantine book epigrams (Bernard and Demoen 2019). Despite their chronological and thematic differences, these corpora share a strong formulaic element that can be more deeply explored using Kádár’s interaction ritual framework. The four features of focus are as follows:

Social Extension of the Ritual: Kádár’s framework allows us to distinguish between ‘in-group’ rituals—practices confined to a specific social group—and broader social rituals, which are widely recognized and accessible across society.
Pragmatic Complexity: Kádár distinguishes between ‘interactional complexity’—the range of communicative acts involved in a ritual—and ‘relational complexity,’ which concerns the depth and intricacy of relationships between individuals engaged in ritualistic practices (compare Bentein 2023b).
Ritual Frame Indicating Expressions: Kádár defines formulaic phrases as ‘ritual frame indicating expressions,’ which signal a ritual context in communication and guide participants to follow its norms and expectations. Often linked to specific speech acts such as greetings, apologies, or farewells, these expressions have a recognizable structure and may include non-verbal or visual cues to help manage the flow of ritualized interactions (compare Bentein and Capano 2025).
Mimesis and Self-Display: Kádár sees rituals as naturally imitative but open to enhanced behaviors, allowing participants to either intensify mimicry or add elaborate displays to reflect social values or showcase relational skill (compare Bentein 2023a).

My discussion will highlight similarities and differences between the two corpora and review how my research team and I have previously studied aspects of these four features (see Ricceri et al. 2023; Bentein 2024 for the relevant digital environments). Along the way, I will introduce a newly launched project, ANNOPHIS (www.annophis.ugent.be), which aims to develop a machine-learning-based annotation platform for a broad range of historical formulaic text corpora, including papyri, book epigrams, and inscriptions.

Basso, K. H. 1989. “The Ethnography of Writing.” In Explorations in the Ethnography of Speaking, edited by J. Sherzer and R. Bauman, 2nd ed., 425–32. Cambridge: Cambridge University Press.

Bentein, Klaas. 2023a. “A Typology of Variations in the Ancient Greek Epistolary Frame (I–III AD).” In Historical Linguistics and Classical Philology, edited by G. Giannakis, E. Crespo, J. de La Villa, and P. Filos, 429–72. Berlin: De Gruyter.

———. 2023b. “Why Say Goodbye Twice? Repetition and Involvement in the Greek Epistolary Frame (I-IV AD).” In La Correspondance Privée Dans La Méditerranée Antique, edited by M. Dana, 173–206. Bordeaux: éditions Ausonius.

———. 2024. “Socio-Semiotic, Multimodal Annotation of Documentary Sources: Digital Infrastructure in the Everyday Writing Project.” In Digital Papyrology III, edited by N. Reggiani. Berlin: De Gruyter.

Bentein, K., and M. Capano. 2025. “Spacing out Speech Acts. Textual Units and Their Visual Organization in Greek Letters on Papyrus.” In Everyday Communication in Antiquity. Frames and Framings, edited by K. Bentein. Venice: Edizioni Ca’ Foscari.

Bernard, F., and K. Demoen. 2019. “Byzantine Book Epigrams.” In A Companion to Byzantine Poetry, edited by W. Hörandner, A. Rhoby, and N. Zagklas, 404–29. Leiden: Brill.

Fögen, T. 2010. “Technical Literature.” In A Companion to Greek Literature, edited by E. Bakker, 266–79. Chichester, UK: John Wiley & Sons, Ltd.

Goffman, E. 1967. Interaction Ritual; Essays in Face-to-Face Behavior. Chicago: Aldine PubCo.

Kádár, D.Z. 2013. Relational Rituals and Communication: Ritual Interaction in Groups. London: Palgrave Macmillan.

———. 2024. Ritual and Language. Cambridge, UK; New York: Cambridge University Press.

Kruschwitz, P., and H. Halla-aho. 2007. “The Pompeian Wall Inscriptions and the Latin Language: A Critical Reappraisal.” Arctos: Acta Philologica Fennica 41:31–49.

Kuiper, K. 2000. “On the Linguistic Properties of Formulaic Speech.” Oral Tradition 15 (2): 279–305.

———. 2009. Formulaic Genres. Basingstoke [England]; New York: Palgrave Macmillan.

Mascellari, R. 2012. “Le Petizioni Nell’Egitto Romano. Evoluzione Di Formulario, Procedure e Organizzazione Della Giustizia. Documentazione Su Papiro Dal 30 a.C. al 300 d.C.” Firenze: Università degli Studi di Firenze.

Nachtergaele, D. 2023. The Formulaic Language of the Greek Private Papyrus Letters. Leuven: Trismegistos Online Publications.

Palme, B. 2009. “The Range of Documentary Texts: Types and Categories.” In The Oxford Handbook of Papyrology, edited by R.S. Bagnall, 358–94. New York: Oxford University Press.

Ricceri, R., K. Bentein, F. Bernard, A. Bronselaer, E. De Paermentier, P. De Potter, G. De Tré, et al. 2023. “The Database of Byzantine Book Epigrams Project: Principles, Challenges, Opportunities.” Journal of Data Mining and Digital Humanities.

Brown, Joshua

Formulae as a source of linguistic innovation in Renaissance Italian: the salutatio and conclusio as loci of language contact

Formulaicity and formulaic strings provide forms of language that aid letter writers through prefabricated units, either copied or retrieved whole from memory (Rutten & van der Wal 2012; Serra 2023). In historical corpora, documentary formulae have been characterized as ‘text reuse templates’, allowing the investigation of historical drift of documentary production and cultural change (Korkiakangas 2023). Formulae may also contribute to aspects of in-group membership and identity formation (Laitinen & Norlund 2012). Less focus has been placed on formulae as loci of linguistic innovation (cf. Bybee & Torres Cacoullos 2009), despite the importance of imitation in both language acquisition and letter-writing literacy.

This paper identifies a series of formulae in merchant letters sent from Milan to various locations around the Mediterranean between 1396 and 1402, and currently housed at the Datini Archive, Archivio di Stato di Prato in Tuscany. Defining a corpus of 82 letters, written in the vernacular, I show how the written correspondence of two merchants, Francesco Tanso and Giovanni da Pessano, represents a ‘hybrid’ linguistic variety, with forms of Tuscan, Milanese, and Latin clearly identifiable (Brown 2024). Tracing the infiltration of 1pl. verb forms in the letters of these merchants reveals that formulae represent a clear focus of linguistic innovation, particularly in the salutatio and conclusio of letters. Part of this case study is presented as a proof of concept for a digital project scaling up to all 810 letters sent from Milan in the Datini Archive.

As with many merchants of late medieval Europe, both Francesco Tanso and Giovanni da Pessano sent large numbers of letters, sometimes in rapid succession. Merchants required quick access to information. In creating such an enormous written correspondence, both writers made use of formulae in their letters, especially in the salutatio and conclusio. Repetition of formulae was one way in which particular forms of language spread quickly and made writing easier. Were specific linguistic items transferred from one variety to another through formulae? If so, what were they? The paper concludes by returning to the question of methodology, and how preprocessing of linguistic data can best be achieved to ascertain the presence of formulae in a ‘big data’ corpus, making some comparison to research in similar domains (Granger 2018; Koolen & Hoekstra 2022).

Brown, Joshua. 2024. Dialect levelling and merchant writing in Renaissance Italy. Special issue of “Journal of Historical Sociolinguistics” ed. by Anita Auer & Joshua Brown. 10:(2) 197-223.

Bybee, Joan L & Rena Torres Cacoullos. 2009. The role of prefabs in grammaticization: How the particular and the general interact in language change. In: Corrigan, Roberta, Edith A Moravcsik, Hamid Ouali & Kathleen Wheatley (eds.) Formulaic Language, volume 1: Distribution and historical change. Amsterdam: John Benjamins, pp.187-218.

Granger, Sylviane. 2018. Formulaic sequences in learner corpora: Collections and lexical bundles. In: Siyanova-Chanturia, A & A Pellicer-Sanchez (eds.) Understanding Formulaic Language: A second language acquisition perspective. London: Routledge, pp.228-247.

Koolen, Marijn & Rik Hoekstra. 2022. Detecting formulaic language use in historical administrative corpora. Proceedings of the Computational Humanities Research Conference 2022, Antwerp, Belgium, December 12-14. 127-151.

Korkiakangas, Timo. 2023. Documentary formulae as text reuse templates: Constat and Manifestus clauses in early medieval Latin charters. Digital Medievalist. 16:1-44.

Rutten, Gijsbert & Marijke J van der Wal. 2012. Functions of epistolary formulae in Dutch letters from the seventeenth and eighteenth centuries. Journal of Historical Pragmatics. 13:(2) 173-201.

Serra, Eleonora. 2023. Learning to write letters in sixteenth-century Florence: Epistolary formulae in the correspondence of Lucrezia Albizzi Ricasoli. Linguistica. 63:(1-2) 273-300.

Chioni, Irene

Request Formula Variations in Greek Petitions from Roman Egypt: Typological Classification and Sociolinguistic Interpretation

A significant number of highly formulaic Greek texts can be identified in the papyrological corpus. Among these, petitions stand out for using similar or identical expressions over an extended period. This linguistic consistency underscores the nature of petitions as a formulaic genre (Kuiper 2009).

While the formulaic language of petitions has been thoroughly examined, particularly for the Ptolemaic and Roman periods (Di Bitonto 1967, 1968, 1976; Mascellari 2021), the variations within these formulas remain largely unexplored, despite their linguistic potential (Kuiper 2000). For example, the study of formulaic variation in letters has resulted in a detailed typology (Bentein 2023), revealing new research opportunities. Similarly, developing a typology of variation in petitions could provide insights into the social contexts and shifts in the power dynamics between petitioners and officials, offering a richer understanding of how formulaic language reflects interactions.

One particularly significant formula within a petition is the request section, as it represents the core communicative purpose of the document: in pragmatics, the “head act” of a speech act (House & Kádár 2021: 105–133). Given its centrality, an examination of this section offers a lens through which the functionality of the text can be better understood.

Previous research has identified several fundamental constituents of the request formula in petitions (e.g., Mullins 1962; White 1972; Mascellari 2021). Building on this foundation, I propose a more refined structure of petitions’ request formula, based on the direct analysis of a substantial corpus of Greek petitions from Egypt, dating from the 1st to the 3rd centuries AD, all containing relatively intact request sections.

The essential constituents of the request formula, as they emerge from my analysis, typically include a performative request verb (most often ἀξιόω) and an infinitive specifying the requested action. Inferential expressions frequently introduce the request formula, while courtesy phrases or honorific titles support the request verb. Occasionally, a final plea is added, anticipating the expected benefit or relief resulting from the fulfillment of the request. Other minor elements within the formula, while not extensively discussed in previous scholarship, play a significant role in contributing to its variation.

By analyzing the frequency and the recurrent structure of these unexplored constituents in the formula, I aim to explore the relationship between normative formulaic structures and their variations. To investigate this, I propose a typology of variations in the request formulae, drawing on the typology for formulaic variation in letter openings and closings proposed in Bentein (2023). I suggest that some patterns of variation can be identified: reformulation, addition, repetition, combination, and displacement, while the presence of others remains uncertain. The focus will then shift to the reformulation pattern, specifically to the use of lexical variants. I will demonstrate that changes in the performative request verb offer a sociolinguistic lens through which these variations can be interpreted (Dickey 2009, 2016). Specifically, I will show how such changes reflect the petitioner’s construction of identity and strategic use of language in relation to the social status of the recipient, as explored in Bentein (2016) about inferential expressions.

Bentein, K. (2016). “Διό, διὰ τοῦτο, ὅθεν, τοίνυν, οὖν, or rather asyndeton? Inferential expressions and their social value in Greek official petitions (I–IV AD)”. Acta Classica, 59, 23–51.

Bentein, K. (2023). “A Typology of Variations in the Ancient Greek Epistolary Frame (I–III AD)”. In G. K. Giannakis, P. Filos, E. Crespo, & J. De La Villa (Eds.), Classical Philology and Linguistics (429-472). Berlin-Boston: De Gruyter.

Di Bitonto, A. 1967. “Le petizioni al re. Studio sul formulario”. Aegyptus, 47, 5-57.

Di Bitonto, A. 1968. “Le petizioni ai funzionari nel periodo tolemaico. Studio sul formulario”. Aegyptus 48, 53-107.

Di Bitonto, A. 1976. “Frammenti di petizioni del periodo tolemaico. Studio sul formulario”. Aegyptus 56, 109-143.

Dickey, E. (2009). Latin Influence and Greek Request Formulae. In T. V. Evans & D. D. Obbink (Eds.), The Language of the Papyri (208–220). Oxford: Oxford University Press.

Dickey, E. (2016). “Emotional language and formulae of persuasion in Greek papyrus letters”. In E. Sanders & M. Johncock (Eds.), Emotion and Persuasion in Classical Antiquity (237-262). Stuttgart: Franz Steiner Verlag.

House, J. & Kádár, D. Z. (2021). Cross-Cultural Pragmatics. Cambridge: Cambridge University Press.

Kuiper, K. (2000). “On the Linguistic Properties of Formulaic Speech”. Oral Tradition, 15(2), 279-305.

Kuiper, K. (2009). Formulaic genres. Basingstoke: Palgrave Macmillan.

Mascellari, R. (2021). La lingua delle petizioni nell’Egitto romano: Evoluzione di lessico, formule e procedure dal 30 a.C. al 300 d.C. Florence: Firenze University Press.

Mullins, T. Y. (1962). “Petition as a Literary Form”. Novum Testamentum, 5(1), 46-54.

Cichosz, Anna & Pęzik, Piotr

Towards a combinatorial dictionary of historical English

Automatic Combinatorial Dictionaries (ACDs) are databases of recurrent word combinations whose status as phraseological units is estimated from distributional criteria such as frequency, degree of binding or evenness of distribution (Kilgarriff & Rychlý 2010, Pęzik 2018). Although a number of such resources have been derived from corpora of Present Day English and many other languages (e.g. Pęzik 2014), the development of combinatorial dictionaries for historical languages is a more complex task. In the context of English, the greatest challenge is the processing of Old and Middle English data. First and foremost, the corpus of OE texts is finite and rather limited in size by the standards of modern reference corpora. As a consequence, there is a large number of multiword hapax legomena whose phraseological status cannot be confirmed distributionally. Additional complications are the inflectional morphology of OE, its flexible word order and its spelling variation. In the case of ME, the last factor becomes the main challenge: while the number of available texts is larger than for OE, the number of spelling variants is unprecedented. Such factors, inherent to historical texts, present additional difficulties with the normalization and lemmatization of related forms making up multiword units.

In this paper we aim to address these methodological challenges in order to compile an experimental ACD of historical English. The problem of spelling and morphological variation is solved through the lemmatization of the York-Toronto-Helsinki Parsed Corpus of Old English Prose (YCOE, Taylor et al. 2003), the Penn-Helsinki Parsed Corpus of Middle English (PPCME2, c. 1.2 million words; Kroch & Taylor 2000), the Penn–Helsinki Parsed Corpus of Early Modern English (PPCEME, c. 1.7 million words; Kroch et al. 2004) and the Penn Parsed Corpus of Modern British English (PPCMBE2, c. 2.8 million words; Kroch et al. 2016). We plan to use the resulting morphological dictionaries (following the format of Cichosz et al. 2021, http://varioe.pelcra.pl/), to lemmatize and normalize word forms used in historical records. We also follow a hybrid positional-relational approach utilizing the syntactic annotation of the Penn corpora to generate a database of potential open and restricted collocations, collocational chains, idioms and other types of phraseological units. This method improves the recall of strictly relational collocation extraction techniques by relaxing the relative order of constituents, allowing for the extraction of specific collocations in diverse word order configurations.
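
The effect of relaxing constituent order can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `extract_combinations`, the simplified tags `V` and `N`, the clause representation and the toy Old English lemmas are all assumptions made for illustration. The sketch pairs lemmas of two categories within the same clause regardless of which comes first, then reports frequency and text dispersion for combinations that recur.

```python
from collections import Counter, defaultdict
from itertools import product


def extract_combinations(clauses, type_a, type_b, min_freq=2):
    """Count lemma pairs of two categories that share a clause, in any order.

    clauses: iterable of (text_id, [(lemma, tag), ...]) -- a hypothetical,
    simplified stand-in for a parsed and lemmatized corpus.
    Returns {(lemma_a, lemma_b): (frequency, number_of_texts)}.
    """
    freq = Counter()
    texts = defaultdict(set)
    for text_id, tokens in clauses:
        a_lemmas = [lemma for lemma, tag in tokens if tag == type_a]
        b_lemmas = [lemma for lemma, tag in tokens if tag == type_b]
        # Pair every type_a lemma with every type_b lemma in the clause;
        # the key is always (a, b), so linear order in the text is ignored.
        for pair in product(a_lemmas, b_lemmas):
            freq[pair] += 1
            texts[pair].add(text_id)
    # Keep only recurrent combinations and attach a simple dispersion count.
    return {
        pair: (n, len(texts[pair]))
        for pair, n in freq.items()
        if n >= min_freq
    }


# Invented toy data: the same verb-noun pair in both orders, in two texts.
clauses = [
    ("text1", [("cyning", "N"), ("healdan", "V")]),
    ("text2", [("healdan", "V"), ("cyning", "N")]),
    ("text2", [("word", "N"), ("sprecan", "V")]),
]
result = extract_combinations(clauses, "V", "N")
print(result)
```

In this toy run, both orders of the verb-noun pair are aggregated under one entry, while the pair that occurs only once falls below the frequency threshold, mirroring the hapax problem mentioned above.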

An early version of the resulting ACD for OE prose is available in the form of a web application at http://varioe.pelcra.pl/collocations, providing access to over 191,000 recurrent combinations of 18 different syntactic types. The entries are aggregated on lemmatized word constituents and enriched with frequency and text dispersion statistics as well as concordances. As an extension of this project, we are preparing a combinatorial dictionary covering the subsequent periods of historical English. The result of this effort will involve linking the lemmas diachronically to allow users to trace the development of selected phraseological units in time. We believe that such tools will serve as valuable resources in studies of historical English phraseology.

Cichosz, Anna, Piotr Pęzik, Maciej Grabski, Michał Adamczyk, Paulina Rybińska & Aneta Ostrowska. 2021. The VARIOE online morphological dictionary for YCOE. University of Łódź. (Available online at varioe.pelcra.pl/morph)

Kilgarriff, Adam, and Pavel Rychlý. “Semi-Automatic Dictionary Drafting.” In A Way with Words : Recent Advances in Lexical Theory and Analysis : A Festschrift for Patrick Hanks, edited by Gilles-Maurice De Schryver, 299–312. Kampala: Menha Publishers, 2010.

Kroch, Anthony, Beatrice Santorini & Lauren Delfs. 2004. The Penn–Helsinki Parsed Corpus of Early Modern English (PPCEME). Department of Linguistics, University of Pennsylvania.

Kroch, Anthony, Beatrice Santorini & Ariel Diertani. 2016. The Penn Parsed Corpus of Modern British English (PPCMBE2). Department of Linguistics, University of Pennsylvania.

Kroch, Anthony & Ann Taylor. 2000. The Penn–Helsinki Parsed Corpus of Middle English (PPCME2). Department of Linguistics, University of Pennsylvania.

Pęzik, Piotr. 2014. “Graph-Based Analysis of Collocational Profiles.” In Phraseologie Im Wörterbuch Und Korpus (Phraseology in Dictionaries and Corpora), edited by Vida Jesenšek and Peter Grzybek, 227–43. ZORA 97. Maribor, Bielsko‑Biała, Budapest, Kansas, Praha: Filozofska fakulteta, 2014.

Taylor, Ann, Anthony Warner, Susan Pintzuk & Frank Beths. 2003. The York–Toronto–Helsinki Parsed Corpus of Old English Prose (YCOE). Department of Linguistics, University of York. Oxford Text Archive.

Cook, Samuel Peter

Cross-linguistic formulae and the study of contact-induced language change: the case of Greek and Coptic legal formulae from 6th–8th century Egypt

The legal landscape of Late Antique and Early Islamic Egypt is characterised by its multilingual nature. Prior to the Islamic conquest of 641, Greek was the primary language of law, with legal formulae in part building on pre-existing models in Demotic. Following the conquest, Coptic became more prominent as the use of Greek decreased, with the largest body of Coptic legal contracts attested from the 7th and 8th centuries. Throughout the 6th to 8th centuries, the formulae used in both Greek and Coptic legal texts were intrinsically linked, since the contracts in which they appeared represent a single legal system expressed in two languages. Till even goes so far as to describe Coptic legal formulae as a case of Byzantine formulae in “Coptic translation” (Till 1950, 81). This continuity between Greek and Coptic formulae was noted by 20th century scholars of Coptic legal documents (Lüddeckens 1979; Wenger 1953; Steinwenter 1920; Boulard 1912). However, the only comprehensive discussion of the topic to date is that of Richter (2009), with scholars tending to specialise either in Coptic or in Greek. Drawing on the results of my PhD thesis (Cook 2019), the present paper discusses the use of Egyptian legal contracts of the 6th to 8th centuries as a vehicle to study how formulaic language navigates two linguistic systems in a multilingual society. On one hand, I outline the strategies used to express legal formulae in two languages belonging to different language families. On the other, drawing on theoretical models from the fields of historical linguistics and language contact, I identify possible influences from underlying Greek formulae which have led to a new, domain-specific grammatical form: the so-called “performative ⲉⲓⲥⲱⲧⲙ”.

Boulard, Louis. 1912. La vente dans les actes Coptes. P. Geuthner.

Cook, Samuel Peter. 2019. “Linguistic and Legal Continuity in 6th to 8th Century Coptic Documents: A Comparative Study of Greek and Coptic Legal Formulae in Byzantine and Early Islamic Egypt.” Doctoral thesis, Sydney: Macquarie University.

Lüddeckens, Erich. 1979. “Demotische Und Koptische Urkundenformeln.” Enchoria 2:21–31.

Richter, Tonio Sebastian. 2009. “Greek, Coptic and the ‘Language of the Hijra’: The Rise and Decline of the Coptic Language in Late Antique and Medieval Egypt.” From Hellenism to Islam: Cultural and Linguistic Change in the Roman Near East, 401–46.

Steinwenter, Artur. 1920. Studien zu den koptischen Rechtsurkunden aus Oberägypten. Haessel.

Till, Walter. 1950. “Die Koptische Stipulationsklausel.” Orientalia 19 (1): 81–87.

Wenger, Leopold. 1953. Die Quellen des römischen Rechts. A. Holzhausen.

di Bartolo, Giuseppina & Marchesi, Beatrice

Negative concord in Postclassical Greek: Impact and functions of formulaic expressions in documentary papyri

This paper deals with Ancient Greek and aims to investigate the functions and role of formulaic expressions involving the occurrence of one or more negative indefinites and/or a negative marker. It is part of a broader research project on the diachrony of the Postclassical Greek negative concord system (Gianollo 2024; di Bartolo, Gianollo & Marchesi forthcoming).

The paper combines a quantitative and qualitative analysis and is based on a corpus of documentary papyri of the early Roman Period (i.e., 1st–2nd cent. CE). The analysis includes all the occurrences of the lemma oudeís (‘no one’) which have been extracted using the Trismegistos database, divided by gender and case, and annotated according to the annotation scheme developed by Gianollo (2024) for the syntactic analysis of negation in the New Testament.

First, the paper discusses the methodological difficulties of dealing with a corpus of documentary papyri (e.g., their heterogeneity in terms of register, cf. Palme 2011 and Bentein 2015). Due to the high number of occurrences of oudeís in ‘formulaic expressions’ (cf. Wray 2008: 3–10) encountered in the corpus, our analysis discusses the methodological choices introduced to minimize the impact that such occurrences, given their frequency and their fixed word-order, might have on the diachronic analysis based on quantitative data. Thus, we present the adjustments made to the annotation scheme of data from documentary papyri and the new labels proposed to differentiate between the different realizations of formulaic expressions.
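
The kind of skew at issue can be shown with a minimal Python sketch. It does not reproduce the authors' annotation scheme: the labels `preverbal` and `postverbal`, the boolean formulaic flag, the function name `pattern_shares` and the invented counts are all assumptions made for illustration. The sketch compares the share of each pattern when formulaic occurrences are included and when they are set aside.

```python
from collections import Counter


def pattern_shares(records, include_formulaic=True):
    """Return the relative frequency of each pattern label.

    records: iterable of (pattern, is_formulaic) pairs -- a hypothetical,
    simplified stand-in for annotated corpus occurrences.
    """
    counts = Counter(
        pattern
        for pattern, is_formulaic in records
        if include_formulaic or not is_formulaic
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pattern: n / total for pattern, n in counts.items()}


# Invented toy data: one fixed-order formula accounts for most occurrences.
records = (
    [("preverbal", True)] * 6
    + [("preverbal", False)] * 1
    + [("postverbal", False)] * 3
)
all_shares = pattern_shares(records)
free_shares = pattern_shares(records, include_formulaic=False)
print(all_shares)
print(free_shares)
```

With these invented counts, the pattern carried by the formula dominates the full sample but is the minority pattern once formulaic occurrences are counted separately, which is why labelling them explicitly matters for quantitative diachronic claims.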

Secondly, this work provides a list of the formulaic expressions found in the corpus together with an overview of the different types of discourse in which they occur. Even though this study analyses only the ou- series of ‘objective negation’ (Chatzopoulou 2012; Willmott 2013), formulaic expressions present a greater interplay than non-formulaic occurrences between this series and the mḗ- series of ‘subjective negation’, giving rise to qualitative observations regarding negative patterns that might be favoured in contexts with fixed word-order.

Third, the study shows the impact of these formulaic expressions on the distribution of the different syntactic patterns used to describe the negative system in the corpus.

Finally, it addresses the pragmatic functions of these expressions and their role in terms of discourse segmentation according to some of the research questions of the conference.

di Bartolo, G., Gianollo, C. & Marchesi, B. Forthcoming. The system of negative concord in Postclassical Greek: Evidence from documentary papyri. In: G. di Bartolo, P. Filos, G. Giannakis & D. Kölligan (eds.), tba. Berlin/Boston.

Bentein, K. 2015. The Greek documentary papyri as a linguistically heterogeneous corpus: The case of the katochoi of the Sarapeion-archive. In: Classical World 108 (4): 461–484.

Chatzopoulou, K. 2012. Negation and Nonveridicality in the History of Greek. Journal of Greek Linguistics 13 (1): 149-153.

Gianollo, C. 2024. Negative concord and word order in the Greek Bible and New Testament. In: G. di Bartolo, D. Kölligan (eds.), Postclassical Greek: Problems and Perspectives: 187–223. Berlin/Boston.

Palme, B. 2011. The Range of Documentary Texts: Types and Categories. In: R. S. Bagnall (ed.), The Oxford Handbook of Papyrology: 358–394. Oxford.

Willmott, J. C. 2013. Negation in the history of Greek. In D. Willis, C. Lucas & A. Breitbarth (eds.), The history of negation in the languages of Europe and the Mediterranean I. Case studies: 299–340. Oxford: Oxford University Press.

Wray, A. 2008. Formulaic Language: Pushing the Boundaries. Oxford University Press.

Di Pasquale, Daniele

Shifts and Standardization of Language Patterns in Korean Old Vernacular Epistles (언간, 諺簡, Ŏn'gan)

This study investigates the gradual changes and standardization of formulaic expressions in Ŏn'gan, vernacular (Ŏnmun, 언문, 諺文) Korean letters from the late Chosŏn Dynasty (1392-1897). Initially informal and shaped by oral traditions, these letters exhibit repeated use of certain expressions that, over time, became increasingly fixed and formalized. By the 19th century, this standardization culminated in the publication of instructional letter collections known as Ŏn'gandok (언간독, 諺簡牘), which established more rigid norms for letter writing.

By comparing early Ŏn'gan manuscripts with later standardized Ŏn'gandok materials, the study aims to identify key changes in the formulation and use of recurring expressions. This comparative approach will reveal how these expressions shifted from flexible, context-dependent forms to fixed, standardized elements in formal epistolary conventions. By examining patterns of repetition, ellipsis, and conventionality, the study will assess how these linguistic elements reflected the socio-cultural contexts of the time, including social hierarchy, politeness conventions, and relationships between correspondents.

This study, which employs paleographic expertise to decode the manuscript letters in cursive vernacular, aims to investigate the reasons behind the use of formulaic expressions in premodern Korean vernacular letters and what they reveal about this particular communication practice of the time. Early results suggest that these expressions reflected social statuses, maintained proper etiquette, and preserved oral traditions in written form. The study ultimately examines how recurring patterns in Korean epistolary practices contributed to the development of standardized epistolary norms and formulaic expressions, shedding light on early examples of linguistic standardization in Korean historical linguistics.

Drigo, Jasmim

Calques or Formulaic Expressions? An analysis of calques in Early Irish religious language

Some work has been done on Latin borrowings into Early Irish (e.g. McManus 1982, 1983, 1984), but most of the work has been concentrated on phonetic loanwords and relative chronology of phonological changes. Meanwhile, simplex calques have only been briefly discussed (e.g. McManus 1982), and structural calques have been ignored.

Structural or syntactic calques are one type of borrowing, more specifically, loan translation of complete sentences (Molnár 1985).

Some Latin religious terms have been passed into Early Irish as loan translations, e.g.:

OIr. Tír Tairngiri ‘the Promised Land of the Old Testament’ (Wb. 33b6, 2c21, Ml. 68b4, 78c11, 83d4, etc.)

In this example, Tír ‘land’ is combined with Tairngire ‘prophecy’, a word modelled on Latin praedictio: tair ‘before’ + ngire ‘a calling’

MIr. grásta Dé ‘the grace of God’ from Latin gratia Dei (Dánta Gr. 63.24)

Can some of these Early Irish structural calques also be considered formulaic expressions? Formulaic language consists of fixed expressions used in special contexts, but how do we understand formulaic expressions in bilingual texts and bilingual communities? We know that Medieval Ireland was a place of multilingualism, and this is especially clear in some texts, such as the Old Irish Glosses (Würzburg glosses, Milan glosses, St Gall Priscian glosses) and some Middle Irish texts (e.g. Binchy 1976, Bisagni 2014, Moran 2022, etc.).

In this paper, I will examine the translation of some Latin religious terms into Early Irish, exploring both how they were rendered and how they were used in context. Whether these terms are best classified as calques or formulaic expressions depends on the analytical approach—linguistic or philological—as each may lead to different conclusions based on their respective focuses.

BINCHY, D. A. (1976). “Semantic influence of Latin in Old Irish glosses”. O’MEARA, J. & NAUMANN, B. (Eds). Latin script and letters a.d. 400–900: Festschrift presented to Ludwig Bieler on the occasion of his 70th birthday, 167–73.

BISAGNI, J. (2014). “Prolegomena to the study of code-switching in the Old Irish Glosses”. Peritia 24-25, 1-58.

Mc MANUS, D. (1982). The Latin loanwords in Early Irish. Dissertation: University of Dublin.

Mc MANUS, D. (1983). “A Chronology of the Latin Loan-Words in Early Irish.” Ériu 34, 21-71.

Mc MANUS, D. (1984). “On Final Syllables in the Latin Loan-Words in Early Irish.” Ériu 35, 137-162.

MOLNÁR, N. (1986). The Calques of Greek Origin in the Most Ancient Old Slavic Gospel Texts. Cologne & Vienna: Böhlau.

MORAN, P. (2022). “Latin Grammar Crossing Multilingual Zones: St Gall, Stiftsbibliothek, 904”. In: CLARKE, M. & NÍ MHAONAIGH M. (Eds). Medieval Multilingual Manuscripts: Case Studies from Ireland to Japan, Studies in Manuscript Cultures 24., 35–54.

Elalfy, Doaa

The Canonization and Transmission of Collyria (اشياف) in Greek and Arabic Medical Traditions: Lexical Shifts and Formulaic Patterns

In the vast landscape of medical history, the use of collyria (medicated eye drops) is a prime example of the intersection between pharmacological knowledge and cultural exchange. This paper explores the process of evolution, canonization, and linguistic transmission of collyria, or اشياف (ašyāf), within the ancient Greek and Arabic medical traditions.

By examining both the conventionalized nature of collyria formulations and the shifts in terminology that occurred during the translation of medical texts, this study offers insights into the processes that solidified these remedies within two major cultural and scientific traditions.

Focusing on key figures such as Ḥunayn ibn Isḥāq, Yuḥannā ibn Māsawayh, Qustā ibn Lūqā, and Thābit ibn Qurra, the paper investigates how Greek medical knowledge, including the works of Dioscorides and Galen, was preserved, adapted, absorbed and eventually canonized in the Arabic world. These translators and physicians safeguarded the medical wisdom of antiquity and modified it to fit new cultural and practical contexts, thereby bridging ancient Greek formulations with Islamic medical practices. Their adaptations of collyria reflect broader intercultural and intellectual exchanges that contributed to the enduring legacy of these remedies. The study further analyzes specific collyria and ašyāf formulations found in both Greek and Arabic sources, exploring their therapeutic applications in treating eye conditions such as conjunctivitis and cataracts.

By tracing the formulaic structures in the texts, it highlights how certain conventionalized patterns were transmitted and solidified, thus becoming canonical within medical literature. Additionally, the lexical and semantic shifts that occurred during the translation process reveal how the adaptation of terminology shaped the understanding of these remedies across cultures, contributing to their lasting influence.

Ultimately, this paper situates ašyāf not only as practical therapeutic agents but also as linguistic and cultural markers of canonized medical knowledge. The formulaic nature of collyria, alongside the lexical evolution that occurred through translation, underscores the importance of both language and practice in the canonization of medical traditions. Through this lens, we gain a deeper understanding of the intercultural transmission of pharmacological knowledge and the role that formulaic language played in solidifying ancient medical practices as enduring elements of both Greek and Arabic medical canons.

Elder, Claire M.

The Fermesse, Fidelity and Faith: A Pragmatic Analysis of the Formulaic Symbols in an Early Modern Scottish Community of Practice

This paper examines the evidence of formulaic symbol use within a curated dataset of 183 seventeenth-century letters preserved in the Edinburgh archives. The analysis combines quantitative and qualitative techniques to uncover the pragmatic intentions which underpinned the senders’ decision to inscribe the fermesse symbol in their correspondence.

Studies by Nevalainen and Raumolin-Brunberg (1995), Nevalainen (2001), Nevala (2003, 2004), Dossena (2007, 2013), Wood (2009), Rutten and van der Wal (2013), Pfeiffer and Schiegg (2020) and Bengough-Smith (2023) have previously established the capacity of formulaic language to encode pragmatic function in early modern letters. Similarly, scholars including Gibson (1997), Stewart and Wolfe (2004), Daybell (2012), Meurman-Solin (2013), Wiggins (2017), Evans (2020), and Smith (2020) have revealed the capacity for visual and material features to carry equivalent meaning within such documents. Moreover, the increased accessibility to high-quality, zoomable images afforded by digital scholarly editions of letters has allowed researchers to examine the extratextual pen marks added to such manuscripts (Starza Smith 2013).

As well as the signs of abbreviations and punctuation in regular handwritten use at the time, adding decorative lines, flourishes, and other symbols to superscriptions, subscriptions, and signatures was common practice. Such visual motifs provided writers with an ‘opportunity for self-fashioning’: the process of creating and projecting one's own identity through various means, including visual elements (Williams 2013). This paper will argue that in some instances, these visual features may be categorised as formulae that emblemise the relationship between sender and recipient.

One such feature is the fermesse symbol: an ‘intersected or slashed capital letter S, somewhat like the italic dollar sign $’ (Beal 2008) which had multiple meanings and functions across the early modern era. This symbol, originating in France, was given the name fermesse in the nineteenth century. The term connects its French translation, fermé S, to, arguably, its most salient meaning: firmness, via a pun (Chareyre 2007, 75). Hobson’s valuable 1935 survey identified countless French examples in letters and other mediums, including ceramics, jewellery, armour and fabric (115). Moreover, recent discussions have recognised fermesse use in the English-language correspondence of Queen Anne of Denmark (Somers Cocks 1980; Wolfe 2013) and members of the Sidney circle (Larson 2015; Hannay 2013). However, until now, the fermesse in Scottish letters has been overlooked.

This paper calls for the systematic capture, encoding, and analysis of the formulaic symbols inscribed in early modern letters. By demonstrating the potential for visual formulae such as the fermesse to carry pragmatic significance as do epistolary manuscripts’ linguistic and material features, it seeks to spark new conversations about their importance within historical letters.

Cocks, Anna Somers. 1980. Princely Magnificence: Court Jewels of the Renaissance, 1500–1630. London: Debrett's Peerage Ltd. in association with the Victoria and Albert Museum.

Daybell, James. 2012. The Material Letter in Early Modern England: Manuscript Letters and the Cultures and Practices of Letter-writing, 1512–1635. London: Palgrave Macmillan.

Dossena, Marina. 2013. “Mixing Genres and Reinforcing Community Ties in Nineteenth-Century Scottish Correspondence: Formality, familiarity and religious discourse.” In Communities of Practice in the History of English, edited by Joanna Kopaczyk and Andreas H. Jucker, 47–60. Amsterdam: John Benjamins Publishing Company.

Evans, Mel. 2020. Royal Voices: Language and Power in Tudor England. Cambridge: Cambridge University Press.

Gibson, Jonathan. 1997. “Significant Space in Manuscript Letters.” The Seventeenth Century 12(1): 1–10.

Hannay, Margaret P. 2013. Mary Sidney, Lady Wroth. United Kingdom: Ashgate Publishing Limited.

Hobson, G. D. 1935. Les Reliures à La Fanfare: Le Problème De l'S Fermé. London: The Chiswick Press.

Meurman-Solin, Anneli. 2013. “Features of Layout in Sixteenth– and Seventeenth–Century Scottish Letters.” In Annotating Variation and Change (Studies in Variation, Contacts and Change in English 1), edited by Anneli Meurman-Solin and Arja Nurmi. www.helsinki.fi/varieng/series/volumes/14/Meurman-Solin_b/.

Nevala, Minna. 2003. “Family First: Address and Subscription Formulae in English Family Correspondence From the Fifteenth to Seventeenth Century.” In Diachronic Perspectives on Address Term Systems, edited by Irma Taavitsainen and Andreas H. Jucker, 147–76. Philadelphia, The Netherlands: The John Benjamins Publishing Company.

Nevala, Minna. 2004. “Inside and Out: Forms of Address in Seventeenth- and Eighteenth-Century Letters.” In Journal of Historical Pragmatics (5): 27–296.

Nevalainen, Terttu. 2001. “Continental Conventions in Early English Correspondence.” In Towards a History of English as a History of Genres, edited by Hans-Jürgen Diller and Manfred Görlach, 203–224. Heidelberg: Winter.

Nevalainen, Terttu, and Helena Raumolin-Brunberg. 1995. “Constraints on Politeness: The Pragmatics of Address Formulae in Early English correspondence.” In Historical Pragmatics: Pragmatic Developments in the History of English, edited by Andreas H. Jucker, 541–601. Amsterdam: Benjamins.

Pfeiffer, Christian and Markus Schiegg. 2020. “Religious Formulae in Historical Lower-Class Patient Letters.” In Formulaic Language and New Data, edited by Elisabeth Piirainen, Natalia Filatkina, Sören Stumpf and Christian Pfeiffer, 250–77. Berlin, Boston: De Gruyter.

Rutten, Gijsbert and Marijke van der Wal. 2013. “Epistolary Formulae and Writing Experience in Dutch Letters from the Seventeenth and Eighteenth Centuries.” In Touching the Past: Studies in the Historical Sociolinguistics of Ego-documents, edited by Gijsbert Rutten and Marijke van der Wal, 45–65. Amsterdam: John Benjamins Publishing Company.

Starza Smith, Daniel. 2013. “The Material Features of Early Modern Letters: A Reader’s Guide”, in Bess of Hardwick's Letters: The Complete Correspondence, c.1550-1608. Edited by Alison Wiggins, Alan Bryson, Daniel Starza Smith, Anke Timmermann and Graham Williams, The University of Glasgow. Web development by Katherine Rogers, University of Sheffield Humanities Research Institute. www.bessofhardwick.org/background.jsp?id=143.

Stewart, Alan, and Heather Wolfe. 2004. Letterwriting in Renaissance England. United States: Folger Shakespeare Library.

Wiggins, Alison. 2017. Bess of Hardwick’s Letters: Language, Materiality, and Early Modern Epistolary Culture. Abingdon: Routledge.

Williams, Graham. 2013. “The Language of Early Modern Letters: A Reader's Guide”, in Bess of Hardwick’s Letters: The complete correspondence, c.1550–1608. Edited by Alison Wiggins, Alan Bryson, Daniel Starza Smith, Anke Timmermann and Graham Williams, The University of Glasgow. Web development by Katherine Rogers, University of Sheffield Humanities Research Institute. www.bessofhardwick.org/background.jsp?id=168.

Wolfe, Heather. 2013. “A Letter from Queen Anne to Buckingham Locked with Silk Embroidery Floss.” The Collation. https://www.folger.edu/blogs/collation/a-letter-from-queen-anne-to-buckingham-locked-with-silk-embroidery-floss/.

Wood, Johanna L. 2009. “Structures and Expectations: A Systematic Analysis of Margaret Paston’s Formulaic and Expressive Language.” Journal of Historical Pragmatics 10(2): 187–228.

Fantoli, Margherita & Korkiakangas, Timo

Exploring formulaic language in dependency treebanks using network analysis

Our paper explores ways to detect and quantify formulaic language in a corpus of 521 early medieval charters written in non-standard Latin in Tuscia mostly in the 9th century and available in the Late Latin Charter Treebank (LLCT2, 242,411 tokens; Korkiakangas 2021).

Charters consist of diplomatic sections with different functions. Sections which focus on the legal validity of the document are the most conservative and mainly composed of prefabricated phrases with precise slots for variable information, such as dates and names, whereas the language of sections which convey the case-specific motivations, circumstances, and details of the transaction is freer and necessarily relies more on the scribe’s command of Latin as a second language. Sabatini (1965) noticed that such a dichotomy is relevant to linguistic study: formulaic parts display errors that derive from the scribes’ defective knowledge of centuries-old legal language, such as hypercorrections and other misunderstandings, while, in the less formulaic parts, the scribes lapsed into non-standard constructions triggered by their vernacular, which was already close to Romance. Thus, the formula context helps in drawing historical linguistic conclusions on whether a specific Latin construction was still present and whether a specific Romance construction was already present in the spoken language of the time.

We explore the variation in the syntactic structure of the sentences to guide the detection of formulaic sequences. In fact, formulaic sequences of charters are often non-linear and represented by a few core terms alternating with slots filled by variable elements. Hence, raw counts cannot account for such “embedded” elements. We propose to build a network of LLCT2 capturing all syntactic relations of the corpus, by merging the trees of the individual sentences. The nodes are represented by the lemmas (which are not subject to spelling variation) and the edges by the dependency relation linking two lemmas. The text is slightly preprocessed in order to group some terms that typically vary from one instance of a formula to another (proper names, numbers, and dates). With this experiment, we build on Passarotti (2015), where the treebank was transformed into a network using a similar procedure.
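The merging step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the token tuples, the toy sentences, and the names `PLACEHOLDERS`, `normalise` and `build_network` are assumptions made for illustration, and only proper names and numbers are grouped here (dates would need a rule of their own).

```python
from collections import Counter

# Hypothetical placeholder scheme: variable slot fillers are grouped by POS tag.
PLACEHOLDERS = {"PROPN": "NAME", "NUM": "NUMBER"}

def normalise(lemma, upos):
    """Replace lemmas of variable elements with a shared placeholder."""
    return PLACEHOLDERS.get(upos, lemma)

# Toy sentences (invented, not LLCT2 data).
# Each token: (id, lemma, upos, head_id, deprel); head_id 0 marks the root.
sentence_a = [
    (1, "ego", "PRON", 2, "nsubj"),
    (2, "vendo", "VERB", 0, "root"),
    (3, "terra", "NOUN", 2, "obj"),
    (4, "Petrus", "PROPN", 2, "obl"),
]
sentence_b = [
    (1, "ego", "PRON", 2, "nsubj"),
    (2, "vendo", "VERB", 0, "root"),
    (3, "terra", "NOUN", 2, "obj"),
    (4, "Iohannes", "PROPN", 2, "obl"),
]

def build_network(sentences):
    """Merge per-sentence dependency trees into one weighted lemma network.

    Keys are (head_lemma, deprel, dependent_lemma); values count how often
    that relation occurs across the whole corpus.
    """
    edges = Counter()
    for sentence in sentences:
        lemma_by_id = {tid: normalise(lemma, upos)
                       for tid, lemma, upos, _, _ in sentence}
        for tid, _, _, head, deprel in sentence:
            if head == 0:  # the root has no incoming edge
                continue
            edges[(lemma_by_id[head], deprel, lemma_by_id[tid])] += 1
    return edges

network = build_network([sentence_a, sentence_b])
```

In this toy case the two different proper names collapse into one `NAME` node, so the edge from `vendo` to `NAME` receives weight 2 although neither name is repeated.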

We carry out the analysis of the network by focussing on two aspects:

the nodes linked by the heaviest edges, i.e., the lemmas that are the most frequently linked by dependency relations;
the nodes aggregated in communities based on the Louvain algorithm (Blondel et al. 2008), i.e., the lemmas that appear to share a “syntactic cluster”.
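As a rough illustration of these two proxies, the sketch below ranks edges by weight and computes the modularity of a given partition. Modularity is the quantity that the Louvain algorithm greedily maximises; the code does not implement Louvain itself, and the toy edge weights, the partition, and the function names are assumptions made for illustration.

```python
# Toy undirected network (invented weights): keys are lemma pairs,
# values are the number of dependency relations linking them.
edges = {
    ("vendo", "terra"): 5,
    ("vendo", "pretium"): 4,
    ("signum", "manus"): 6,
    ("manus", "testis"): 3,
    ("terra", "signum"): 1,
}

def heaviest_edges(edges, n):
    """Return the n edges with the highest weight, heaviest first."""
    return sorted(edges.items(), key=lambda item: item[1], reverse=True)[:n]

def modularity(edges, communities):
    """Weighted modularity Q of a partition of the nodes.

    Q = sum over communities of (internal weight / m) - (degree sum / 2m)^2,
    where m is the total edge weight of the network.
    """
    total = sum(edges.values())
    # Every node belongs to exactly one community.
    community_of = {node: index
                    for index, members in enumerate(communities)
                    for node in members}
    internal = [0.0] * len(communities)
    degree = [0.0] * len(communities)
    for (u, v), weight in edges.items():
        degree[community_of[u]] += weight
        degree[community_of[v]] += weight
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += weight
    return sum(internal[i] / total - (degree[i] / (2 * total)) ** 2
               for i in range(len(communities)))

two_clusters = [{"vendo", "terra", "pretium"}, {"signum", "manus", "testis"}]
one_cluster = [{"vendo", "terra", "pretium", "signum", "manus", "testis"}]

top = heaviest_edges(edges, 2)
q_two = modularity(edges, two_clusters)
q_one = modularity(edges, one_cluster)
```

Splitting the toy network into a "sale" cluster and a "subscription" cluster scores higher than keeping all lemmas together, which is the kind of grouping a community detection algorithm would be expected to favour.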

We compare these two proxies of formulaicity to the existing manual binary formulaicity annotation in the corpus (where each sentence is marked as formulaic or non-formulaic; Korkiakangas & Lassila 2013: 66–67) as well as to a sample of more fine-grained manual annotation created for the present investigation.
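One possible way to quantify such a comparison is sketched below, assuming that a proxy has already been turned into a sentence-level yes/no prediction. The abstract does not state which evaluation measure is used; precision and recall, the toy labels, and the function name `precision_recall` are assumptions made for illustration.

```python
def precision_recall(predicted, gold):
    """Compare binary predictions with a binary gold annotation."""
    true_pos = sum(1 for p, g in zip(predicted, gold) if p and g)
    false_pos = sum(1 for p, g in zip(predicted, gold) if p and not g)
    false_neg = sum(1 for p, g in zip(predicted, gold) if not p and g)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Invented labels for five sentences: True = formulaic.
gold = [True, True, False, True, False]        # manual annotation
predicted = [True, False, False, True, True]   # e.g. sentence contains a heavy edge

precision, recall = precision_recall(predicted, gold)
```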

Preliminary results suggest that the heaviest links correspond to an intuitive idea of formulaicity, while communities, despite binding together words that tend to appear in formulaic expressions, suffer from two methodological shortcomings: the fact that each node can only be assigned to one community (i.e., each lemma is potentially assigned to only one formula) and the fact that every node has to be assigned to one community, which results in very low-frequency nodes being included.

Blondel, Vincent D., Guillaume, Jean-Loup, Lambiotte, Renaud & Lefebvre, Etienne. 2008. ‘Fast unfolding of communities in large networks’, in Journal of Statistical Mechanics: Theory and Experiment 10, P10008, doi: 10.1088/1742-5468/2008/10/P10008. ArXiv: http://arxiv.org/abs/0803.0476

Korkiakangas, Timo. 2021. ‘Late Latin Charter Treebank: contents and annotation’, in Corpora 16:2, 191–203. https://tuhat.helsinki.fi/ws/portalfiles/portal/128999342/corpora_korkiakangas_Accepted_Manuscript_AM_deanonymized.pdf

Korkiakangas, Timo & Lassila, Matti. 2013. ‘Abbreviations, fragmentary words, formulaic language: treebanking medieval charter material’, in Mambrini, F., Passarotti, M. & Sporleder, C. (eds.) Proceedings of The Third Workshop on Annotation of Corpora for Research in the Humanities (ACRH-3), 61–72. Bulgarian Academy of Sciences: Sofia. http://bultreebank.org/wp-content/uploads/2017/06/ACRH-3Proceeding.pdf

LLCT2 = Korkiakangas, Timo, Cecchini, Flavio & Passarotti, Marco. 2020. ‘Late Latin Charter Treebank’, in Zeman, D., Nivre, J., Abrams, M. & al., Universal Dependencies 2.6, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University, Czech. https://github.com/UniversalDependencies/UD_Latin-LLCT.

Passarotti, Marco. 2015. ‘What syntax can do for philosophy: a treebank-based network analysis of the verb sum in Thomas Aquinas’, in Rivista di filosofia neo-scolastica 107(1–2), 309–324.

Sabatini, Francesco. 1965. ‘Esigenze di realismo e dislocazione morfologica in testi preromanzi’, in Rivista di Cultura Classica e Medievale 7, 972–998.

Fascione, Sara (part of panel B)

Formularity and idiosyncrasy in Fronto’s letter headings

Ancient epistolography has very few rules regarding the form of both documentary and literary letters, and most of them concern the headings and salutation formulae, which follow codified and extremely repetitive patterns across the centuries. The general trend in Latin epistolography shows how letter writers used distinctive formulae, repeating them with consistency in their correspondences, even when addressing different recipients. In this scenario, Fronto’s epistolary corpus represents a striking exception. He and his addressees use formulaic, traditional expressions in the headings, but at the same time they seem particularly keen to introduce variations. These either outline the evolution of the relationship between the correspondents, or depend on the social status and age of the addressee, or may even be evidence of discrepancies in the editing phases of the letter collection. In fact, due to structural inconsistencies, Fronto’s corpus has been considered as a posthumous edition put together after the death of its author by an anonymous pupil or family member. However, a closer look reveals that the letters are gathered according to precise patterns that connect the various parts of the corpus at different levels. Therefore, the paper aims at analyzing the very peculiar oscillation between formularity and idiosyncratic use of headings emerging from Fronto’s letters, in order to assess whether this element may be seen as evidence of the intervention of the author in the making of his epistolary corpus.

Fezer, Katharina

Tracing and comparing formulae in printed and handwritten texts: Methods, issues, challenges

During the Early Modern Period, letter writing manuals (which usually consisted of a set of theoretical rules on how to write letters plus a collection of model letters) played an essential role in epistolography and formed an important source of formulaic language (cf. Große 2020). The number of reprints and editions indicates that these works were frequently used – even if nobody admitted to using them, letter writing being expected to be an original, creative activity far away from any formulaicity (cf. Haroche-Bouzinac 2010).

However, it has often not been possible to carry out a scientific examination of whether the formulae provided by the letter writing manuals can actually be found in real private letters of the same era: as private texts were often deemed neither worthy of nor suitable for conservation, there were hardly any known authentic private letters from these earlier periods that could have been analysed. Various attempts have been made to find such sources in archives and other places, and for some languages, corresponding corpora and studies already exist (cf. Rutten/Van der Wal 2012, 2013, Serra 2023, among others), but for French in particular this problem is still acute (see, however, an initial study by Nakagawa (2022)).

My study aims to trace formulaic language on the basis of a newly compiled 17th-century French letter corpus which consists of authentic, handwritten letters on the one hand and printed model letters (drawn from letter writing manuals) on the other.

I will describe the different steps that were necessary for the corpus creation: the strategies to find such sources, the digitization of these texts (including the different types of transcription) and their further processing using tools like MaxQDA and Textométrie. Finally, I will present a few results of the quantitative and qualitative analyses made possible by this corpus: besides answering the question to what extent the formulae used in the two different types of letters coincide, particular attention will be paid (1) to the formulae that were promoted in the letter writing manuals but criticized in other meta-linguistic works (grammars etc.) of that period, e.g. due to their pleonastic nature, and (2) to hypercorrections or other deviations from the grammatical norms (morphosyntactic alignment etc.) that allow conclusions about the individual linguistic competence of the writers.

Große, Sybille: Über das Wandern von Worten, Formeln und Traditionen in der west- und mitteleuropäischen Epistolographie des 17. und 18. Jahrhunderts, in: Dominika Bopp / Stefaniya Ptashnyk / Kerstin Roth / Tina Theobald (eds.): Wörter – Zeichen der Veränderung, Berlin / Boston 2020, 319–341.

Haroche–Bouzinac, Geneviève: Dames et cavaliers, doctes, épistoliers ordinaires, in Gérard Ferreyrolles (ed.): L’épistolaire au XVIIe siècle, Paris 2010 (Littératures classiques 71), 67–90.

Nakagawa, Ryo: Les formules épistolaires en français aux XVIIe et XVIIIe siècles dans les lettres des réfugiés protestants (Huguenot Library F/AF et F/CA). Linx. Revue des linguistes de l’université Paris X Nanterre 85 (2022).

Rutten, Gijsbert & Marijke J. van der Wal: Functions of epistolary formulae in Dutch letters from the seventeenth and eighteenth centuries. Journal of Historical Pragmatics 13.2 (2012): 173-201.

Rutten, Gijsbert, and Marijke Van der Wal: Epistolary formulae and writing experience in Dutch letters from the seventeenth and eighteenth centuries, in: Touching the past: Studies in the historical sociolinguistics of ego-documents. Amsterdam & Philadelphia: John Benjamins (2013): 45-65.

Serra, Eleonora: Learning to Write Letters in Sixteenth Century Florence: Epistolary Formulae in the Correspondence of Lucrezia Albizzi Ricasoli. Linguistica 63.1-2 (2023): 273-300.

Frog

Translating Formula: Formula as a Universal Concept or a Concept on the Move?

Giannikou, Kyriaki

Assessing and Reassessing Formulaicity: are editorial practices a blessing or a curse?

Formulaicity is a widely discussed concept in the study of historical Greek, primarily due to the influence of the Homeric epics, where it is traditionally understood to arise from oral contexts in which formulaic sequences reduce processing effort during lengthy recitations. Besides that, formulaic language also appears in entirely written contexts, such as post-classical Greek administrative and legal documents, where high standardisation meets the need for accuracy and efficiency (see e.g. Nachtergaele 2023; Saradi 2019). The corpus I focus on, Byzantine book epigrams (short, metrical texts found in the margins of Byzantine manuscripts), presents a unique case. These paratexts, embedded in the medieval manuscript tradition, blend literary and documentary functions without any oral performance context, oscillating between practical precision and creative expression. This paper explores a methodological challenge in studying formulaic language within historical Greek corpora, focusing specifically on the Database of Byzantine Book Epigrams.

Even recent comprehensive research on Homer’s formulaic language (Bozzone 2024) relies on modern editions of the Homeric epics that attempt to reconstruct an ‘archetype’ based on medieval manuscript ‘witnesses’. In contrast, the DBBE diverges from strict adherence to traditional editorial practices by presenting epigrams preserving all original scribal choices (‘Occurrences’) while also offering ‘normalised’ versions (‘Types’) that group similar instances of the originals (Ricceri et al. 2023). This raises questions: To what extent can we rely on edited texts to analyse formulaicity? How might editorial choices, driven by the desire for a cohesive text, obscure the original variability of formulaic sequences? Does the interaction between formulaicity and editorial practices facilitate research, or does this create the impression of greater fixedness in formulae, potentially skewing certain aspects of the analysis?

This paper explores the potential impact of editorial intervention on formulaicity research, advocating for a more flexible methodology that balances the use of both edited and original sources. Through a case study on supplications for salvation within a subset of the DBBE corpus, I will demonstrate how formulaic expressions function in this hybrid referential-poetic (cf. Jakobson 1960) context, and how editorial practices may shape our understanding of formulaicity. Ultimately, this study seeks to position this material within the broader framework of formulaicity research and to discuss the implications of editorial practices for linguistic research in historical corpora.

Bird, G. D. 2010. Multitextuality in the Homeric Iliad: The witness of the Ptolemaic papyri. Washington-Cambridge.

Bozzone, C. 2024. Homer’s living language: Formularity, dialect, and creativity in oral-traditional poetry. Cambridge.

Jakobson, R. 1960. ‘Closing Statements: Linguistics and Poetics’. In Thomas A. Sebeok (Ed.). Style In Language. Cambridge, 350–377.

Nachtergaele, D. 2023. The formulaic language of the Greek private papyrus letters. Leuven.

Ricceri, R. et al. 2023. ‘The Database of Byzantine Book Epigrams project: Principles, challenges, opportunities.’ Journal of Data Mining and Digital Humanities.

Saradi, H. 2019. ‘Rhetoric and Legal Clauses in the Byzantine Wills of the Athos Archives: Prooimia and Clauses of Warranty.’ In O. Delouis and K. Smyrlis (Eds.). Lire Les Archives de l’Athos, Actes Du Colloque Réuni à Athènes Du 18 Au 20 Novembre 2015 à l’occasion Des 70 Ans de La Collection Refondée Par Paul Lemerle, 23/2, 357–388.

Small, J. P. 1997. Wax tablets of the mind: Cognitive studies of memory and literacy in classical antiquity. London.

Ginevra, Riccardo, Biagetti, Erica, Brigada Villa, Luca & Zanchi, Chiara

Comparing Indo-European Poetic Languages: How to Combine Construction Grammar and Digital Resources for the Analysis of Formulaic Phraseology in Vedic Sanskrit and Homeric Greek

Soon after Parry (1971[1928]) and Lord (1960) first demonstrated the oral-formulaic character of Homeric poetry, scholars like Magoun (1953) and Kiparsky (1976) drew attention to its relevance for other poetic traditions too – the Old English and the Vedic Sanskrit ones, respectively. Such correspondences allowed Indo-Europeanists to develop a historical-comparative methodology for the analysis and reconstruction of Indo-European formulaic phrases, e.g. Campanile (1977), Watkins (1995), and García Ramón (2021).

Kiparsky (1976) already stressed the strong similarity between formulas and idioms from a linguistic perspective. Research on idiomatic expressions eventually led to Construction Grammar (Fillmore et al. 1988; Goldberg 1995), an approach that assumes no strict division between lexicon and syntax, but rather a continuum from “constructions” (i.e. “learned pairings of form with semantic or discourse function”; Goldberg 2006: 5) that are more phonologically fixed (e.g., lexemes, fixed idioms) to constructions that are more schematic (e.g., syntactic constructions, flexible idioms). Construction-based approaches are able to capture both fixed repetitions and more flexible or schematic patterns of poetic language, as first argued by Bozzone (2014) for Homeric formulas and by Frog (2014) for Old Norse kennings, and are thus highly relevant to the historical-comparative analysis and reconstruction of Indo-European formulaic patterns (Ginevra 2021; 2023).

As proposed by Biagetti (2023), in our presentation we will combine a construction-based approach with two types of digital resources, namely TreeBanks (morphosyntactically parsed corpora; Hellwig et al. 2020; Mambrini 2021) and WordNets (lexicosemantic relational databases; Biagetti et al. 2021), allowing for the automatic extraction of formulaic patterns and making research on poetic language replicable and systematic. Building on previous constructionist and computer-based research on Vedic Sanskrit (Brigada Villa et al. 2023) and Homeric Greek (Brigada Villa et al. forthcoming), we will perform a comparative analysis of formulaic patterns including speech verbs in these Indo-European traditions.

For instance, similar patterns are attested in Vedic Sanskrit (1)–(2) and Homeric Greek (3)–(4), involving verbs with call/invoke semantics governing an object X referring to a human or deity and an element Y expressing a purpose. By means of an inductive approach, we will attempt to evaluate if such parallels are best analyzed as reflexes of an inherited Indo-European formulaic construction or rather as independent developments in two related poetic traditions.

(1) indravāyū́_x manojúvā / víprā havanta ūtáye_y

Indra-Vāyu.acc.du mind-swift.acc.du poet.nom.pl call.3pl.mid help(f).dat

“Indra and Vāyu, mind-swift, do the inspired poets call for help” (RV 1.23.3ab)

(2) ugrám_x pūrvī́ṣu pūrvyáṁ / hávante vā́jasātaye_y

strong.acc many.loc.pl foremost.acc call.3pl.mid prize_winning(f).dat

“They call on (you) the strong, foremost among the many (peoples), for the winning of prizes” (RV 5.35.6cd)

(3) […] Aléxandrós se_x kaleî oîkon dè néesthai_y

Alexander.nom 2sg.acc call.3sg home.acc ptc go.inf.mid

“Alexander calls on you to go home” (Il. 3.390)

(4) kiklḗskous’ Aídēn_x kaì epainḕn Persephóneian_x,

call.ptcp.nom.f Hades.acc and dread.acc.f Persephone(f).acc

/ […] / paidì dómen_y thánaton […]

son.dat give.inf.aor death.acc

“Calling on Hades and dread Persephone to give death to her son” (Il. 9.569–571)
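As a rough illustration of how a pattern of this kind might be queried once treebank and WordNet information are combined, the following Python sketch searches dependency-annotated sentences for a verb of calling that governs both an object X and a purpose element Y. It is not the authors' pipeline: the `Token` structure, the `SEMANTIC_CLASS` lookup (a stand-in for a WordNet query), the ASCII lemma identifiers, the choice of dependency labels, and the hand-made annotations of examples (1) and (3) are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int      # 1-based position in the sentence
    form: str     # transliterated word form
    lemma: str    # ASCII placeholder lemma id (assumption, not a treebank lemma)
    head: int     # idx of the syntactic head, 0 for the root
    deprel: str   # dependency relation to the head (UD-style label)
    feats: str    # simplified, dot-separated morphological tags


# Hypothetical stand-in for a WordNet query: lemma id -> semantic class.
SEMANTIC_CLASS = {"hu": "call", "kaleo": "call"}

# Assumed encodings of the purpose element Y: a dative nominal or an infinitive.
PURPOSE_RELATIONS = {"obl", "xcomp", "advcl"}
PURPOSE_TAGS = {"dat", "inf"}


def find_call_patterns(sentence):
    """Return (verb, X, Y) form triples for 'call X for/to Y' patterns."""
    hits = []
    for verb in sentence:
        if SEMANTIC_CLASS.get(verb.lemma) != "call":
            continue
        dependents = [t for t in sentence if t.head == verb.idx]
        objects = [t for t in dependents if t.deprel == "obj"]
        purposes = [
            t for t in dependents
            if t.deprel in PURPOSE_RELATIONS
            and PURPOSE_TAGS & set(t.feats.split("."))
        ]
        hits.extend((verb.form, x.form, y.form) for x in objects for y in purposes)
    return hits


# Hand-made toy annotation of example (1), RV 1.23.3ab (not the treebank's own).
RV_1_23_3 = [
    Token(1, "indravāyū́", "indravayu", 4, "obj", "acc.du"),
    Token(2, "manojúvā", "manoju", 1, "amod", "acc.du"),
    Token(3, "víprā", "vipra", 4, "nsubj", "nom.pl"),
    Token(4, "havanta", "hu", 0, "root", "3pl.mid"),
    Token(5, "ūtáye", "uti", 4, "obl", "dat.sg"),
]

# Hand-made toy annotation of example (3), Il. 3.390 (not the treebank's own).
IL_3_390 = [
    Token(1, "Aléxandrós", "alexandros", 3, "nsubj", "nom.sg"),
    Token(2, "se", "su", 3, "obj", "acc.sg"),
    Token(3, "kaleî", "kaleo", 0, "root", "3sg"),
    Token(4, "oîkon", "oikos", 6, "obl", "acc.sg"),
    Token(5, "dè", "de", 4, "dep", "ptc"),
    Token(6, "néesthai", "neomai", 3, "xcomp", "inf.mid"),
]

for name, sentence in [("RV 1.23.3ab", RV_1_23_3), ("Il. 3.390", IL_3_390)]:
    for verb, x, y in find_call_patterns(sentence):
        print(f"{name}: verb={verb} X={x} Y={y}")
```

In this toy setting the same query matches both the Vedic dative of purpose and the Greek infinitive, because the purpose slot is defined by a set of relations and tags rather than by a single surface form. A real study would replace the hand-made annotations with the treebank data and the lookup table with actual WordNet queries.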

Biagetti, Erica. 2023. Integrare Sanskrit WordNet e Vedic TreeBank: uno studio pilota sulla formularità del Rigveda tra semantica e sintassi. In: I. Bossolino and C. Zanchi (eds.), E pluribus unum. Prospettive sull’Antico Per i Decennalia dei Cantieri d’Autunno: i seminari dell’Università di Pavia dedicati al mondo antico, 45–62. Pavia: PUP.

Biagetti, Erica, Chiara Zanchi, and William M. Short. 2021. Toward the creation of WordNets for ancient Indo-European languages. In: P. Vossen and C. Fellbaum (eds.), Proceedings of the 11th Global Wordnet Conference, 258–266. University of South Africa (UNISA): Global WordNet Association.

Bozzone, Chiara. 2014. Homeric Constructions. PhD thesis, University of California, Los Angeles.

Brigada Villa, Luca, Erica Biagetti, Riccardo Ginevra, and Chiara Zanchi. 2023. Combining WordNets with Treebanks to study idiomatic language: A pilot study on Rigvedic formulas through the lenses of the Sanskrit WordNet and the Vedic Treebank. In: G. Rigau, F. Bond, and A. Rademaker (eds.), Proceedings of the 12th Global WordNet Conference, 133–139. Donostia - San Sebastian: Global Wordnet Association.

Brigada Villa, Luca, Andrea Farina, and Chiara Zanchi. Forthcoming. Formulaic networks as prototypical categories: Combining the Ancient Greek Dependency Treebank with the Ancient Greek WordNet for a pilot study on the Iliad. In: Proceedings of the International Colloquium of Ancient Greek Linguistics, Madrid 2022.

Campanile, Enrico. 1977. Ricerche di cultura poetica indoeuropea. Pisa: Giardini.

Fillmore, Charles J., Paul Kay and Mary Catherine O'Connor. 1988. Regularity and Idiomaticity in Grammatical Constructions: The Case of Let Alone. Language 64.3.501–538.

Frog. 2014. Oral Poetry as Language Practice: A Perspective on Old Norse dróttkvætt Composition. In: P. Huttu-Hiltunen et al. (eds.), Song and Emergent Poetics – Laulu ja runo – Песня и видоизменяющаяся поэтика, 279-307. Kuhmo: Juminkeko.

García Ramón, José Luis. 2021. Poética, léxico, figuras: fraseología y lengua poética indoeuropea. In: L. Galván (ed.), Mímesis, acción, ficción: Contextos y consecuencias de la «Poética» de Aristóteles, 11–57. Kassel: Reichenberger.

Ginevra, Riccardo. 2021. Metaphor, metonymy, and myth: Persephone’s death-like journey in the Homeric Hymn to Demeter in the light of Greek phraseology, Indo-European poetics, and Cognitive Linguistics. In: I. Rizzato, F. Strik Lievers & E. Zurru (eds.), Variations on Metaphor, 181–211. Newcastle upon Tyne: Cambridge Scholars.

Ginevra, Riccardo. 2023. Loki’s Chains, Agni’s Yoke, Prometheus Bound, and the Old English Boethius: Indo-European Myths of the “Binding/Yoking of Fire-Gods” in the Light of Comparative Poetics and Cognitive Linguistics. Indogermanische Forschungen 128.203-252.

Goldberg, Adele E. 1995. Constructions: a Construction Grammar Approach to Argument Structure. Chicago: Chicago University Press.

Goldberg, Adele E. 2006. Constructions at Work: The Nature of Generalization in Language. Oxford: Oxford University Press.

Hellwig, Oliver, Salvatore Scarlata, Elia Ackermann, and Paul Widmer. 2020. The Treebank of Vedic Sanskrit. In: N. Calzolari, F. Bechet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi et al. (eds.), Proceedings of The 12th Language Resources and Evaluation Conference (LREC 2020), 5137-5146. Marseille, France: European Language Resources Association.

Kiparsky, Paul. 1976. Oral Poetry: Some Linguistic and Typological Considerations. In: B. A. Stolz and R. S. Shannon (eds.), Oral Literature and the Formula, 73–106. Ann Arbor: Center for Coordination of Ancient and Modern Studies.

Lord, Albert B. 1960. The Singer of Tales. Cambridge MA: Harvard University Press.

Magoun, Francis P. Jr. 1953. Oral-Formulaic Character of Anglo-Saxon Narrative Poetry. Speculum 28.3.446–467.

Mambrini, Francesco. 2021. Universal Dependencies Conversion of the Ancient Greek Dependency Treebank. https://github.com/francescomambrini/katholou/tree/main/ud_treebanks/agdt/data

Parry, Milman. 1971 [1928]. The Traditional Epithet in Homer. In: Adam Parry (ed.), The Making of Homeric Verse: The Collected Papers of Milman Parry, 1–190. Oxford: Oxford University Press.

Watkins, Calvert. 1995. How to Kill a Dragon: Aspects of Indo-European Poetics. New York and Oxford: Oxford University Press.

Groot, Hester

Identity construction and genre shift through formulaic language in Scottish pauper letters, 1750-1900

Letter-writing formulae represent an important genre feature in cross-regional pauper letter traditions, and serve both practical and stylistic functions (Gardner 2023). In the case of Scotland, the eighteenth- and nineteenth-century poor often had little writing experience, but the poor relief system necessitated the composition of letters detailing their plights and making official requests. To formulate these letters properly and lend them legitimacy, paupers drew on genre conventions, stylistic and textual, as a formulating help (Rutten & van der Wal 2014: 171). Formulaic language therefore serves as a genre marker, situating a letter within a textual tradition. Within pauper letters, Jones & King (2015) have observed a continuum between two genre types: the petition (marked by formal, distant language, which often featured archaisms no longer found elsewhere in the language of the Scottish lower-class population) and the familiar letter (less rigidly formulaic, and with more oral characteristics). The occurrence of various formulae helps us identify where a pauper letter falls on this cline, and consequently, the communicative strategy mastered and deployed by the petitioner.

This paper will investigate the formulaic strategies used by the Scottish poor in pauper letters written between 1750 and 1900. These materials are part of the ScotPP corpus (Leiden University, under construction), and offer an important and innovative insight into the voices of the underrepresented historical Scottish lower classes. I compare the writing practices of Scottish paupers to the patterns Gardner (2023) identifies among their English contemporaries, whose use of genre and formulae differs strikingly despite their shared use of written English. In a diachronic and diatopic investigation of letters from 20 Scottish counties, using systematic corpus linguistic methods, I investigate how paupers follow, adapt, or deviate from genre trends through their choice of various formulae. The analysis illustrates how these choices allow them to construct an identity and position themselves hierarchically relative to the parish boards and overseers to whom they are writing, and how the choices reflect the changing hierarchical relations between petitioner and addressee that Jones & King (2019) observed locally in the Scottish Highlands. These findings show how the writers of these letters were, despite often limited schooling, able to exert agency over their self-representation and positionality through the linguistic means and genre constraints at their disposal.

Gardner, Anne-Christine. 2023. English pauper letters in the eighteenth century and beyond: On the variability and evolution of a new text type. Linguistica 63.1-2, 301–336.

Jones, Peter & Steven King. 2015. From petition to pauper letter: the development of an epistolary form. In Peter Jones & Steven King (eds.), Obligation, Entitlement and Dispute under the English Poor Laws, 53–77. Cambridge: Cambridge Scholars.

Jones, Peter & Steven King. 2019. Voices from the far north: Pauper letters and the provision of welfare in Sutherland, 1845–1900. Journal of British Studies 55(1), 76–98.

Rutten, Gijsbert & Marijke van der Wal. 2014. Letters as Loot. A sociolinguistic approach to seventeenth- and eighteenth-century Dutch. Amsterdam: Benjamins.

Große, Sybille

Formulaicity in French letters: function and acquisition in theory and empiricism

Formulas are described in various contexts, which hinders their establishment as a distinct and widely accepted term. This limitation also applies to the formulaic nature of specific components of letters. Bruneton-Governatori and Moreux (1997: 82) refer to the existence of predetermined models (préécrits), which can also be interpreted as writing routines in line with Gülich’s (1997) definition. Rutten and van der Wal (2014: 82) implement a functional typology of specific epistolary formulas, distinguishing between text-type formulas (e.g., letter openings) and text-structure formulas. While this distinction is conceptually clear, accurately differentiating text-type formulas from other components of letters remains a persistent challenge in the digital annotation of letter corpora.

In various studies, we have analysed the openings and closings of French letters written by predominantly less experienced writers (Grosse et al. 2016; Steuckardt et al. 2020, 2022). Our findings indicate that, alongside various communicative routines, the writers we examined employ certain stereotypical formulas in their correspondence, which they either reproduce faithfully in their traditional form or adapt through micro-variations.

In recent years, the role of formulas in discourse has also been explored in conjunction with cognitive considerations and questions in construction grammar, where they are viewed as conventionalized construction patterns. This topic has been addressed in the field of language acquisition research (Filatkina 2018: 38-45).

It is reasonable to assume that these formulas used in written communication are transmitted in diverse ways across different communication communities, a phenomenon typically referred to as ‘discourse tradition’ in Romance studies.

In this presentation on French letters, a distinction will be made between writing routines that tend to be taught implicitly and epistolary formulas that are acquired as part of explicit norm descriptions in different contexts (‘socialisation écrite’ - Große/Sowada 2020).

On a functional level, there has been limited research on how formulas in texts, including letters, can support text production for writers and facilitate text reception for readers (referred to as the ‘cognitive load factor’ - Meier 2020: 21-22). Consequently, our presentation will also focus on the transition from these formulas to the body of the letter.

Bubenhofer, Noah (2009): Sprachgebrauchsmuster. Korpuslinguistik als Methode der Diskurs- und Kulturanalyse, Berlin: de Gruyter.

Bruneton-Governatori, Ariane/Moreux, Bernard (1997): „Un modèle épistolaire populaire“, in: Fabre, Daniel (ed.): Par écrit. Ethnologie des pratiques d’écriture quotidiennes, Paris: Éditions de la Maison des Sciences de l’Homme, 79-103.

Corrigan, Roberta/Moravcsik, Edith/Ouali, Hamid/Wheatley, Kathleen (2009): “Introduction. Approaches to the study of formulae”, in: Corrigan, Roberta et al. (2009a) (eds.): Formulaic language. Volume 1: Distribution and historical change, Amsterdam/Philadelphia: Benjamins, XI-XXIV.

Coulmas, Florian (1981): Routine im Gespräch. Zur pragmatischen Fundierung der Idiomatik, Wiesbaden: Akademische Verlagsgesellschaft Athenaion.

Coulmas, Florian (1979): “On sociolinguistic relevance of routine formulae”. Journal of Pragmatics 3 (3-4): 239-266. https://doi.org/10.1016/0378-2166(79)90033-X.

Filatkina, Natalia (2018): Historische formelhafte Sprache. Theoretische Grundlagen und methodische Herausforderungen, Berlin/Boston: de Gruyter.

Große, Sybille/Sowada, Lena (2020): "Socialisation écrite et rédaction épistolaire de scripteurs moins expérimentés – lettres des soldats de la Grande Guerre", Romanistisches Jahrbuch 71, 82-129.

Große, Sybille/Steuckardt, Agnès/Sowada, Lena/Dal Bo, Beatrice (2016): "Du rituel à l’individuel dans les correspondances peu lettrées de la Grande Guerre", in: Neveu, Frank et al. (eds.): Actes du 4e Congrès mondial de linguistique française, EDP Sciences, 1-15. DOI 10.1051/shsconf/20162706008.

Gülich, Elisabeth (1997): „Routineformeln und Formulierungsroutinen. Ein Beitrag zur Beschreibung formelhafter Texte“, in: Wimmer, Rainer (eds.): Wortbildung und Phraseologie, Tübingen: Narr, 131-176.

Meier, Kerstin (2020): Semantische und diskurstraditionelle Komplexität. Linguistische Interpretationen zur französischen Kurzprosa, Berlin/Boston: de Gruyter.

Rutten, Gijsbert/van der Wal, Marijke J. (2014): Letters as Loot. A sociolinguistic approach to seventeenth- and eighteenth-century Dutch, Amsterdam / Philadelphia: Benjamins.

Steuckardt, Agnès/Große, Sybille/Dal Bo, Beatrice/Sowada, Lena (2020): "Le rituel et l’individuel dans les pratiques d’écriture : l’exemple de la clôture dans des correspondances peu lettrées de la Grande Guerre" in: Remyssen, Wim/Tailleur, Sandrine (eds.): L’individu et sa langue, Laval: Presses de l’Université de Laval, 103-126.

Steuckardt, Agnès/Große, Sybille/Dal Bo, Beatrice/Sowada, Lena (2022): „La routine et le style. Exploration outillée des formules d’ouverture et de clôture dans des correspondances peu-lettrées de la Première Guerre mondiale“, in: Galleron, Ioanna/Idmhand, Fatiha (eds.): Dix ans de corpus d’auteurs, Paris: Editions des archives contemporaines, 203-220. https://doi.org/10.17184/eac.9782813004352.

Stumpf, Sören/Filatkina, Natalia (2018) (eds.): Formelhafte Sprache in Text und Diskurs, Berlin/Boston: de Gruyter.

Wood, David (2015): Fundamentals of formulaic language. An introduction, London et al. : Bloomsbury.

Wray, Alison (2009): “Identifying formulaic language. Persistent challenges and new opportunities”, in: Corrigan, Roberta et al. (2009a) (eds.): Formulaic language. Volume 1: Distribution and historical change, Amsterdam/Philadelphia: Benjamins, 27-51.

Honkanen, Saara

Formulaicity in Medieval Latin Historical Prose: the Case of Freculf of Lisieux

Scholars interested in formulaic syntax have traditionally focused on non-literary texts, whereas the presence of syntactic patterns, or ‘templates’, in literary genres has so far received less attention as a potentially fruitful research area.

In this presentation I examine the role of formulaic syntax in Medieval Latin historiography by taking a close look at the narrative style of the 9th-century Frankish historian Freculf of Lisieux. Based on a close reading and a detailed syntactic analysis of a series of narrative episodes (mainly battle sequences) selected from Freculf’s chronicle, I define his preferred syntactic template(s) and illustrate these findings with several concrete examples.

Given that in Antiquity and through to the Middle Ages historical prose was one of the genres regarded as representing ‘high style’, it is perhaps surprising to note just how frequently Freculf has recourse to recurring syntactical patterns to build his historical narrative. Freculf’s style – and his continuous balancing between formulaicity and instances of independent narrative creativity – is to be understood in its historical context, the immediate aftermath of the Carolingian Renaissance. I argue that the constant interplay between template-driven phrasings and regular deviations from them reflects the contradictory nature of Freculf’s linguistic capabilities: as a representative of the generation of writers moulded by the Carolingian language reform and as a pupil of some of the reform’s famous educators, Freculf has a sure grasp of Latin syntactical structures and a clear sense of the ideal of expression of his time (perspicuitas), but his attempts at a higher style often lead him into trouble and betray the limits of his linguistic competence. It seems that staying within the safe confines of learned formulaic expression offers Freculf a simple means to move his narrative along in conformity with the perceived ideal narrative style.

Kaislaniemi, Samuli

Address formulas and material practices in seventeenth-century English letters

For historical sociolinguists, the formulaicity of letter-writing provides excellent opportunities to study how social relationships are codified in language (e.g. Nevalainen & Raumolin-Brunberg 2017). The relationship between sender and recipient is particularly explicitly expressed in address terms and phrases (Nevala 2007). Letter-writing being a taught skill, most early modern European letter-writing manuals have a section instructing the reader on how to address persons of different social ranks (e.g. Day 1586: 32–34). The instructions apply to the address phrases within the letter’s text, but also to the superscription: the ‘address’ (as we still call it) on top of the letter packet.

In the early modern period, material and visual aspects of letters were just as important in negotiating and establishing social relationships as parts of the text. For instance, distance between the body text, the closing formula, and the signature, could be used to indicate respect and humility (Gibson 1997). Historically, envelopes were not used, and instead the paper the letter was written on was folded to form its own packet – this is today known as letterlocking (Dambrogio et al. 2021). Different social situations required different letterlocking types, and these were taught as part of overall letter-writing skills – although they were not described and explained in contemporary letter-writing manuals. Previous research has shown that some letterlocking types carried clear social meaning (Wolfe 2012), but the overall ‘grammar’ of letterlocking remains largely uncharted.

In this paper, I look at how superscriptions in seventeenth-century English letters match known social relationships of senders and recipients. I will also consider how the letters were folded and sealed. Given that superscriptions are one of the most rigidly formulaic parts of the letter, I expect to find some correlation between the letterlocking and the superscriptions. That is to say, I expect the superscription and the folded and sealed letter to form a semiotic whole, which reflects the social relationship between the sender and the recipient. To that end, my study will chart if and how variation in the superscription formulas is matched by variation in the letterlockings.

My study corpus consists of letters from the Corpus of Early English Correspondence (CEEC). In addition to the texts of the letters, I have photographs of the source manuscripts, and have surveyed their material features.

Since material aspects of letter-writing are not familiar to most historical linguists, I would like to take advantage of the 10 extra minutes in order to have time to adequately explain epistolary materiality. This will also allow me to fully describe my dataset.

CEEC = Corpus of Early English Correspondence. Compiled by Terttu Nevalainen, Helena Raumolin-Brunberg & the CEEC team at the Department of languages, University of Helsinki. See https://varieng.helsinki.fi/CoRD/corpora/CEEC/index.html.

Dambrogio, Jana, Daniel Starza Smith, Jennifer Pellecchia, Alison Wiggins, Andrea Clarke & Alan Bryson. “The spiral-locked letters of Elizabeth I and Mary, Queen of Scots”. eBLJ 2021. DOI: 10.23636/gyhc-b427.

Day, Angel. 1586. The English Secretorie. London. STC (2nd ed.) / 6401. British Library. EEBO.

Gibson, Jonathan. 1997. “Significant space in manuscript letters”. The Seventeenth Century 12(1): 1-9.

Nevalainen, Terttu & Helena Raumolin-Brunberg. 2017. Historical sociolinguistics: Language change in Tudor and Stuart England. 2nd edn. Abingdon/New York: Routledge.

Wolfe, Heather. 2012. “ ‘Neatly sealed, with silk, and Spanish wax or otherwise’. The practice of letter-locking with silk floss in early modern England”. In S. P. Cerasano & Steven W. May (eds.), In the Prayse of Writing: Early Modern Manuscript Studies Essays in Honour of Peter Beal. London: British Library, pp. 169–189.

Kayachev, Boris

‘Roses are red and violets are blue’: poetic language between formulaicity and intertextuality (the case of purpureus)

The concept of formula plays an important role both in (various strains of) modern linguistics and in the so-called Oral-Formulaic Theory; despite many differences, they share the insight that formulae have a cognitive basis: rather than being constructed ad hoc every time they are used, formulae are retrieved from the mental lexicon as single – prefabricated – units, often with the corollary that they are also semantic units. This perspective makes sense if speech/text production is viewed as a spontaneous process that takes place in the moment, the cognitive/time constraint being a crucial factor. But what is the place of formulae within the framework of a developed literary tradition, such as classical Latin poetry, which allows a lifetime for the author to create, and for the reader to appreciate, a poem, revisiting it again and again? It might be objected that poetic language is profoundly artificial and thus of little interest to the linguist, but arguably there are other kinds of discourse that allow for, and even encourage, premeditation and self-reflexivity.

In this paper I propose to explore what ‘prefabricated’ can mean in the context of Latin poetry, by investigating formulaic noun phrases that include purpureus ‘purple’ as a modifier, in the (partly overlapping) corpora of the PHI5 (classical poetry and prose) and the Musisque deoque (classical and post-classical poetry) databases (some 440 and 344 entries respectively; my approach is very basic, so 20 min. should suffice). Initial soundings suggest that there may be (at least) five different categories (not necessarily mutually exclusive). (1) Formulae borrowed from everyday language, such as uestis purpurea ‘purple fabric’ (= purpura), relatively frequent in both prose and poetry. (2) Formulae adapted from technical discourse, such as flore purpureo in the description of dictamnum at Virgil, Aeneid 12.413–14, evocative of botanical descriptions in Pliny (e.g. 26.95–6). (3) Imitations of Greek poetry, esp. Homer, such as sale purpureo (lit. ‘purple salt’) at Valerius, Argonautica 3.422 mimicking hala porphyreēn ‘heaving sea’ at Iliad 16.391. (4) Formulae cultivated by a particular poet within his oeuvre, such as purpureus pudor ‘purple shame’ in Ovid (Amores 1.3.14, 2.5.34, Tristia 4.3.70). (5) All of the above may, or may not, become (more or less) established as traditional formulae in subsequent poetry.

While this analysis brings out a number of important questions, such as what exactly purpureus means in formulae like sale purpureo or how and to what extent we can distinguish between the different categories (esp. given the overall patchy state of evidence), I propose to conclude by considering a more general question: why are formulae used in literary poetry, where they are not necessitated by cognitive economy? It is often observed that specific formulae may belong to, in the sense of being conditioned by, specific discourses; this I suggest also gives formulae the potential to be intentionally used so as to evoke specific discourses or intertexts, which makes them attractive for literary poets.

R.J. Edgeworth, ‘Does “purpureus” mean “bright”?’, Glotta 57 (1979), 281–91.

J.M. Foley, Immanent Art: From Structure to Meaning in Traditional Oral Epic (Bloomington, 1991).

Frog and W. Lamb, eds., Weathered Words: Formulaic Language and Verbal Art (Cambridge, MA, 2022).

M. Lapidge, ‘Aldhelm’s Latin poetry and Old English verse’, Comparative Literature 31 (1979), 209–31.

E. Minchin, ‘Poet, audience, time, and text: reflections on medium and mode in Homer and Virgil’, in R. Scodel, ed., Between Orality and Literacy: Communication and Adaptation in Antiquity (Leiden, 2014), 267–88.

W. Moskalew, Formular Language and Poetic Design in the Aeneid (Leiden, 1982).

M. Sale, ‘Virgil’s formularity and pius Aeneas’, in A. MacKay, ed., Signs of Orality: The Oral Tradition and Its Influence in the Greek and Roman World (Leiden, 1999), 199–220.

A. Wray, Formulaic Language and the Lexicon (Cambridge, 2002).

Kootstra-Ford, Fokelien (part of panel A)

Formulaic variation: Leveraging formulaic language to understand linguistic variation in Dadanitic inscriptions (6th – 1st c. BCE)

Dadanitic is the name of a script that was used to carve inscriptions in the ancient oasis of Dadan (modern-day AlUla) in Northwest Arabia, between the 6th and late 1st centuries BCE (for the dating of Dadan, see Rohmer and Charloux 2015; for the definition of Dadanitic, see Macdonald 2000). The inscriptions are written in a Semitic language that was linguistically distinct from, but close to, Arabic (Al-Jallad 2018). The corpus of Dadanitic inscriptions is characterized by its highly formulaic language, yet it displays remarkable variation in its orthography, phonology, and morphology (Kootstra 2023).

This contribution will focus on how formulae form the key to understanding linguistic variation in smaller historical corpora like Dadanitic. It will demonstrate how formulae inform the qualitative investigation of linguistic variation, not only by establishing a linguistic baseline, but also by helping to identify influence from other languages and writing cultures. In addition, it will show how a quantitative approach to the distribution of variation across formulaic types can help us understand how different linguistic variants were used in different contexts. This will underline the influence of using ‘someone else’s language’ while highlighting the space for linguistic innovation and personal expression.

Besides illustrating how formulaic language use can be an asset when studying variation, this will show that Dadanitic was written by authors of varying competence and historical awareness. It will also reveal the impact that the rich multilingual environment in which the Dadanitic writing culture developed had on its written record.

Al-Jallad, Ahmad. 2018. “What Is Ancient North Arabian?” In Re-Engaging Comparative Semitic and Arabic Studies, edited by D. Birnstiehl and N. Pat-El, 1–43. Wiesbaden: Harrassowitz.

Kootstra, Fokelien. 2023. The Writing Culture of Ancient Dadan. Studies in Semitic Languages and Linguistics 110. Leiden: Brill.

Macdonald, Michael C.A. 2000. “Reflections on the Linguistic Map of Pre-Islamic Arabia.” Arabian Archaeology and Epigraphy 11:28–79.

Rohmer, J., and G. Charloux. 2015. “From Liḥyān to the Nabataeans: Dating the End of the Iron Age in North-West Arabia.” Proceedings of the Seminar for Arabian Studies 45:297–320.

Kopaczyk, Joanna

Studying formulaic language in historical linguistics

In this talk I address the basic questions one has to ask when delving into formulaic language of the past: what does formulaic language mean for a historical linguist, why it is important to study it, and how it has been approached. Inspired by the Formulaic Language in Historical Research and Data Extraction workshop, organised by the REPUBLIC project team in Amsterdam in 2024 (see link to non-peer-reviewed papers in the references), as well as recent knowledge cross-overs between different linguistic traditions (e.g. Filatkina 2018, 2024), this presentation reflects on the challenges of studying formulaic language and the affordances of new technologies and ways of thinking.

After reviewing the relevance of various definitions of formulaic language for a historical linguist, I move on to the issues of data extraction and interpretation. Essentially, historical linguists work with written texts. These materials are very different in nature from those used to investigate formulaicity in present-day language. I would argue that our approach to formulaic language should be philological, that is, embedded in a deeper understanding of the conditions of text production – material, social, linguistic – as well as of patterns and traditions of text transmission. I will draw attention to the relevance of the Communities of Practice framework adopted from modern sociolinguistics (Kopaczyk forthcoming). It aligns with the idea of horizontal learning (Snijders 2019), permeating text production in monasteries and chanceries of medieval Europe, which are probably the foremost focus of formulaic language investigations, and it accounts for the spread of “ways of doing things” in a stable and recognisable manner, which goes beyond the medieval period.

In terms of challenges for the study of formulaic language of the past, I will look at the multilingual environments in which historical text production took place as well as the paradox of variation in formulaic language. I will give an overview of how various projects and researchers have tackled these issues (Kopaczyk and Tyrkkö eds. 2018; Korkiakangas 2022, 2024; Ostrowski et al. 2024; Zbíral et al. 2024). Against this background, I will highlight the tools and methodologies which have been helpful in extracting and categorising instances of formulaicity in historical texts. The overall purpose of the talk is to take stock of the current work in the field and establish the reference posts for future investigations.

Formulaic Language in Historical Research and Data Extraction. 2024. https://zenodo.org/communities/formulaiclanguage2024/records?q=&l=list&p=1&s=10&sort=newest

Filatkina, Natalia. 2018. Historische Formelhafte Sprache: Theoretische Grundlagen und Methodische Herausforderungen. Walter de Gruyter.

Filatkina, Natalia. 2024. Formulaic language. In Mieke Vandenbroucke, Jana Declercq, Frank Brisard and Sigurd D’hondt (eds.) Handbook of pragmatics, vol. 27. Amsterdam: John Benjamins.

Kopaczyk, Joanna and Jukka Tyrkkö (eds.) 2018. Applications of pattern-driven methods in corpus linguistics. Amsterdam: John Benjamins.

Kopaczyk, Joanna. 2020. Textual standardisation of legal Scots vis a vis Latin. In Laura Wright (ed.) The multilingual origins of standard English. Berlin: Mouton De Gruyter.

Kopaczyk, Joanna. Forthcoming. Third-wave historical sociolinguistics and communities of practice. In Juan M. Hernández-Campoy and Juan Camilo Conde-Silvestre (eds.) The handbook of historical sociolinguistics. (2nd edition). Oxford: Wiley Blackwell.

Korkiakangas, Timo. 2022. “From memory or formulary: how were medieval documentary formulae reproduced?” Mirator 22(1): 4–24.

Korkiakangas, Timo. 2024. A linguist's viewpoint: formulaic language as a challenge for historical linguistics. Formulaic Language in Historical Research and Data Extraction, Amsterdam, Netherlands. https://doi.org/10.5281/zenodo.10461848

Ostrowski, Alina, Thomas Haider, Bengt Büttner, Daniel Berger, Klaus Herbers, and Malte Rehbein. 2024. Annotating Formulaic Language in 12th Century Papal Charters: Observations on Definitions and Challenges. Formulaic Language in Historical Research and Data Extraction, Amsterdam, Netherlands. https://doi.org/10.5281/zenodo.10519355

Snijders, Tjamke. 2019. Communal learning and communal identities in medieval studies: Consensus, conflict, and the community of practice. In Micol Long, Tjamke Snijders and Steven Vanderputten (eds.) Horizontal learning in the High Middle Ages: Peer-to-peer knowledge transfer in religious communities. Amsterdam: Amsterdam University Press. pp. 17-46.

Wood, David. 2015. Fundamentals of formulaic language: An introduction. London: Bloomsbury.

Wray, Alison. 2002. Formulaic language and the lexicon. Cambridge: Cambridge University Press.

Zbíral, D., Kotzé, G., & Shaw, R. L. J. (2024). How formulaic are inquisition records? Measuring lexical richness and text similarity in a corpus of Latin notarial documents. Formulaic Language in Historical Research and Data Extraction, Amsterdam, Netherlands. https://doi.org/10.5281/zenodo.10628665

Korkiakangas, Timo

Mitigating formulaicity bias in historical corpus linguistics

In my presentation, I will explore ways in which formulaicity affects the diachronic linguistic analysis of historical text corpora. I will show that the massive presence of formulaic language in an early medieval Latin charter corpus limits the diachronic conclusions that can be drawn from lexically determined corpus-linguistic measures. Focusing on diachronic productivity metrics, I suggest a way to mitigate the formula bias by means of adjustment coefficients calculated on the overall productivity rate of the entire vocabulary within the same diachronically successive subcorpora across which the productivity change of the linguistic constructions under examination is measured. I will test the method on a set of Latin constructions which are known to be emergent, stable, or recessive, i.e., shrinking over time, in the early Middle Ages. My comparisons suggest that the diachronic profiles based on adjusted values, especially the adjusted neologism-based productivity (Pneo^A), reflect the expected trends of the constructions better than unadjusted profiles.
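As a rough illustration of the adjustment idea, the sketch below computes a hapax-based productivity value in the sense of Baayen (hapax legomena divided by tokens) for a construction and for the whole vocabulary of a subcorpus, and then divides the former by the latter. The ratio form of the adjustment, the function names and the toy data are assumptions made for illustration only; the abstract does not specify how the coefficients or Pneo^A are actually calculated.

```python
from collections import Counter


def potential_productivity(tokens):
    """Hapax legomena divided by total tokens; 0.0 for an empty list."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / len(tokens)


def adjusted_productivity(construction_tokens, all_tokens):
    """Hypothetical adjustment: construction value relative to the
    overall vocabulary value of the same subcorpus."""
    baseline = potential_productivity(all_tokens)
    if baseline == 0.0:
        return 0.0
    return potential_productivity(construction_tokens) / baseline


# Toy subcorpus dominated by repeated (formulaic) material.
subcorpus = ["signum", "manus", "signum", "manus",
             "signum", "testis", "signum", "manus"]
# Toy slot fillers of the construction under examination.
fillers = ["x", "y", "x", "z"]

raw = potential_productivity(fillers)
adjusted = adjusted_productivity(fillers, subcorpus)
```

Under these assumptions, a highly formulaic subcorpus has a low baseline, so the same raw value counts for more after adjustment than it would in a lexically varied subcorpus.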

Baayen, H. 1993. ‘On frequency, transparency, and productivity’, G.E. Booij & J. van Marle (eds.), Yearbook of Morphology 1992. Kluwer Academic Publishers, 181–208.

Baayen, H. 2009. ‘Corpus linguistics in morphology: Morphological productivity’, A. Lüdeling & M. Kytö (eds.), Corpus Linguistics: An International Handbook. De Gruyter, 899–919.

Berg, K. 2020. ‘Changes in the productivity of word formation patterns: Some methodological remarks’, Linguistics 58, 1117–1150.

Fried, M. 2015. ‘Construction Grammar’, A. Alexiadou & T. Kiss (eds.), Syntax – Theory and Analysis: An International Handbook. De Gruyter, 974–1003.

Guardamagna, C. 2018. ‘Type frequency, productivity and schematicity in the evolution of the Latin secundum NP construction’, P. Andersson & al. (eds.), Grammaticalization Meets Construction Grammar. Benjamins, 169–202.

Korkiakangas, T. 2021. ‘Late Latin Charter Treebank: contents and annotation’, Corpora 16, 191–203.

Säily, T. 2016. ‘Sociolinguistic Variation in Morphological Productivity in Eighteenth-Century English’, Corpus Linguistics and Linguistic Theory 12, 129–151.

Valentini, C. 2018. L’evoluzione della codifica del genitivo dal tipo sintetico al tipo analitico nelle carte del Codice diplomatico longobardo. Firenze University Press.

Zeldes, A. 2012. Productivity in Argument Selection: From Morphology to Syntax. De Gruyter.

Koroli, Aikaterini

Stereotypicality and variation in Greek private papyrus letters: a focus on stereotypical directive speech-acts

The presentation will deal with the phenomenon of formulaicity or stereotypicality in the main body of Greek non-literary letters preserved on papyrus and ostraca and dated to the Roman and Byzantine periods of Egypt. The discussion will be divided into two parts.

The first part will focus on the definition and delimitation of the concept of formulaicity/stereotypicality and that of variation in the corpus under study, taking into account its particular characteristics. Variation can be understood either as situation-specificity or as deliberate deviation from fixed forms of expression, e.g. through the enrichment of commonplace structures or expressions and/or literariness. Since these two concepts are in fact the two ends of a continuum, one should speak of gradations of formulaicity/stereotypicality (or variation, respectively), expressed in several ways. This part of the discussion will revolve around issues such as: a classification of all kinds of formulas, commonplace expressions and repeated structural/rhetorical patterns; the functions of these linguistic devices in the main body of the private papyrus letters; the reasons behind the ancient senders’ tendency to construct their texts by resorting to commonplace structures and expressions and, conversely, their tendency to distance themselves from clichés or derivative expressions; and the change in the perception and expression of formulaicity/stereotypicality as a result of the transition from the Roman to the Byzantine period.

Following these introductory remarks, the discussion will then focus on stereotypical directives. This part of the presentation will revolve around the definition and the categories of formulaic requests in the corpus under study, as well as their functions with regard to the structure, register and style of the epistolary text.

Longrée, Dominique & Vanni, Laurent (part of panel B)

New Ways to identify Formulaic Expressions in Latin Epistolography: Between Statistics and AI

In this paper, we will first briefly review the different types of patterns that linguists consider “phraseological” and specify which of them can be considered “formulaic”. We will then set out some of the difficulties that their automatic detection encounters and evaluate some pattern detection techniques (combining data mining and statistics) in order to assess their performance, advantages and disadvantages. Finally, we will explore to what extent the use of HyperDeep, an AI tool, may prove useful, or even indispensable, in this field. The methods will be applied to a Latin epistolary corpus ranging from Classical times to the medieval period.

The automatic detection of “formulaic expressions” runs into various difficulties when the notion is extended to patterns that are not totally fixed (Longrée & Mellet, 2013), as some formulaic expressions may be. Unlike repeated segments (Salem, 1986), verbal tense sequences (Longrée & Luong, 2003) or syntactic motifs (Mellet & Longrée, 2009), some non-totally fixed patterns (Fillmore et al., 1988; Sinclair, 1991: 111-112; Gledhill & Frath, 2007) are multidimensional phraseological patterns made up of items of several types (forms, lemmas, grammatical categories, etc.) and allow some variation. Each of these variations is a new challenge for automated detection based on sequential search.
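To make the notion of a multidimensional pattern concrete, the following minimal sketch matches a sequence of constraints, each stated on a form, a lemma or a grammatical category, against annotated tokens. It is not the algorithm of SDMC, Le lexicoscope, Hyperbase or HyperDeep; the token annotation, the tag names and the function names are hypothetical, and the Latin sentence is toy data.

```python
def token_matches(token, constraint):
    """A token satisfies a constraint if every listed attribute agrees."""
    return all(token.get(key) == value for key, value in constraint.items())


def find_pattern(tokens, pattern):
    """Return start positions where the pattern matches contiguously."""
    size = len(pattern)
    return [
        start
        for start in range(len(tokens) - size + 1)
        if all(token_matches(tokens[start + k], pattern[k]) for k in range(size))
    ]


def tok(form, lemma, pos):
    return {"form": form, "lemma": lemma, "pos": pos}


# Toy annotated text containing two variants of an epistolary greeting.
text = [
    tok("si", "si", "CONJ"), tok("vales", "valeo", "VERB"),
    tok("bene", "bene", "ADV"), tok("est", "sum", "VERB"),
    tok("ego", "ego", "PRON"), tok("valeo", "valeo", "VERB"),
    tok("si", "si", "CONJ"), tok("valetis", "valeo", "VERB"),
    tok("bene", "bene", "ADV"), tok("est", "sum", "VERB"),
]

# A repeated segment fixes every form; a mixed pattern fixes forms in some
# slots and only the lemma in others.
fixed_forms = [{"form": "si"}, {"form": "vales"}, {"form": "bene"}, {"form": "est"}]
mixed = [{"form": "si"}, {"lemma": "valeo"}, {"form": "bene"}, {"lemma": "sum"}]

fixed_hits = find_pattern(text, fixed_forms)
mixed_hits = find_pattern(text, mixed)
```

The form-only search finds just the first occurrence, whereas the mixed pattern also captures the inflected variant. Every additional degree of freedom (optional slots, gaps, reordering) would need its own extension of such a sequential search.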

A tricky parameter is the frequency of the patterns: repetition is necessary to ensure the memorization of patterns (Lavigne et al., 2016; Longrée, Mellet & Lavigne, 2019), but a high frequency does not always mean that a pattern is formulaic: for instance, syntactic patterns are highly frequent but cannot be considered formulaic offhand. In fact, formulaic expressions are identified not only by their repetition but also by their textual function (a structuring or characterizing one) and can therefore be considered a particular category of “textual motifs” (Longrée & Mellet, 2018).

Despite these difficulties, several tools have been proposed since the beginning of the century in order to detect “textual motifs” (see Ganascia, 2001). We will test some of them: SDMC (Sequential Data Mining under Constraints: https://sdmc.greyc.fr/index.php; Cellier et al., 2012; Quiniou et al., 2012), Le lexicoscope (Kraif 2016; 2019) and Hyperbase (https://hyperbase.unice.fr/; Vanni 2024). In order to assess the advantages and disadvantages of each of these methods, we will apply them to a corpus of Latin epistolary texts processed using LASLA methods. We will finally compare the “linguistic patterns” we extracted with those the new software HyperDeep (based on CNN and Transformers; Vanni et al., 2018, 2023; 2024) identifies in the same corpus and show the added value of this method.

Cellier, P., Quiniou, S., Charnois, Th., & Legallois, D. (2012). What About Sequential Data Mining Techniques to Identify Linguistic Patterns for Stylistics? In Lecture Notes in Computer Science, Springer Vol. 7181, 166-177.

Fillmore C.J., Kay P. & O’Connor M.C. (1988), Regularity and Idiomaticity in Grammatical Constructions: the Case of Let Alone, Language 64 (3), 501-538.

Ganascia, J. G. (2001). Extraction automatique de motifs syntaxiques, in Actes de la 8ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN’2001). Tours (France), edited by Jean Véronis, Laurence Danlos, Pierre Zweigenbaum, Nathalie Gasiglia, and Pascal Amsili. Accessed January 28, 2019. http://talnarchives.atala.org/TALN/TALN-2001/taln-2001-long-017.pdf.

Gledhill C. & Frath P. (2007), Collocation, phrasème, dénomination: vers une théorie de la créativité phraséologique, in La Linguistique 43 (1), 63-88.

Kraif, O. (2016). Le lexicoscope: un outil d’extraction des séquences phraséologiques basé sur des corpus arborés, in Cahiers de lexicologie, 108, 91-106.

Kraif, O. (2019). Explorer la combinatoire lexico-syntaxique des mots et expressions avec le Lexicoscope, in Langue française, 203, 67-83.

Lavigne, F., Longrée, D., Mellet, S., & Mayaffre, D. (2016). Semantic Integration by Pattern Priming: Experiment and Cortical Network Model, in Cognitive Neurodynamics, DOI 10.1007/s11571-016-9410-4, 1-21

Longrée, D., & Luong, X. (2003). Temps verbaux et linéarité du texte: recherches sur les distances dans un corpus de textes latins lemmatisés, in Corpus, 2

Longrée, D., & Mellet, S. (2013). Le motif: une unité phraséologique englobante? Étendre le champ de la phraséologie de la langue au discours, in Langages 189, 68-80.

Longrée, D., & Mellet, S. (2018). Towards a topological grammar of genres and styles: a way to combine paradigmatic quantitative analysis with a syntagmatic approach, in The Grammar of Genres and Styles: From Discrete to Non-Discrete Units, edited by Dominique Legallois, Thierry Charnois, and Meri Larjavaara, 140–163. Berlin/Boston: de Gruyter.

Longrée, D., Luong, X., & Mellet, S. (2008). Les motifs: un outil pour la caractérisation topologique des textes, in S. Heiden & B. Pincemin, Actes des JADT 2008, 9èmes Journées internationales d’Analyse statistique des Données Textuelles, Lyon, 12-14 mars 2008 (pp. 733-744). Lyon, France: Presses ENS

Longrée, D., Mellet, S., & Lavigne, F. (2019). Construction cognitive d’un motif: cooccurrences textuelles et associations mémorielles, in CogniTextes. doi:10.4000/cognitextes.1202

Mellet, S., & Longrée, D. (2009). Syntactical Motifs and Textual Structures. Considerations based on the Study of a Latin historical Corpus, in Belgian Journal of Linguistics, 23. doi:10.1075/bjl.23.13mel

Mellet, S., & Longrée, D. (2012). Légitimité d'une unité textométrique: le motif, in A. Dister, D. Longrée, G. Purnelle (Eds.), Actes des Journée d'analyse des données textuelles 2012 (pp. 715-728).

Quiniou, S., Cellier, P., Charnois, Th., & Legallois, D. (2012). Fouille de données pour la stylistique: l’exemple des motifs émergents, in Actes des 11èmes Journées Internationales d'analyse statistique des données textuelles, Liège, 13-15 juin 2012, 821-833.

Salem, A. (1986), Segments répétés et analyse statistique des données textuelles, in Histoire & Mesure 1986, 1-2, 5-28

Sinclair J. (1991), Corpus, Concordance, and Collocation, Oxford: Oxford University Press.

Vanni L., Hyperbase Web. (Hyper)Bases, Corpus, Langage, in Corpus, 2024, 25, ⟨10.4000/corpus.8770⟩. ⟨hal-04523479⟩

Vanni L., Corneli M., Mayaffre D., & Precioso F (2023). From text saliency to linguistic objects: learning linguistic interpretable markers with a multi-channels convolutional architecture, in Corpus 24 https://journals.openedition.org/corpus/7667

Vanni L., Hadi M., Longrée D & Mayaffre D. (2024), Multi-channel Convolutional Transformer and intertextuality: a Latin case study, in Misuraca M. & Giordano G., New Frontiers in Textual Data Analysis, Springer, forthcoming.

Vanni, L., Mayaffre, D., & Longrée, D. (2018). ADT et deep learning, regards croisés. Phrases-clefs, motifs et nouveaux observables, in 14es Journées internationales d’Analyse statistique des Données Textuelles. JADT 2018, Rome, p. 459-466.

Maczuga, Julia (part of panel A)

The religious formulae attested in the Arabic graffiti from North-West Arabia during the Late pre-Islamic and Early Islamic periods: A study in continuity and change

The rich corpus of graffiti discovered in northern Saudi Arabia provides a unique opportunity to study the evolving Arabic epigraphic writing culture, as it contains both pre-Islamic Arabic inscriptions written in the so-called “Paleo-Arabic” script dating from the fifth and sixth centuries AD (NEHMÉ 2020: 128) and Early Arabic Islamic inscriptions dating from the first three centuries of Islam (seventh-ninth centuries AD).

Both the pre-Islamic and Early Islamic Arabic inscriptions are characterized by their high level of formulaicity. Until now, the academic community has commonly accepted that the arrival of Islam brought about a significant shift in religious formulae with the introduction of new types of invocations, and that Islamic religious graffiti thus developed independently from earlier writing cultures. Although the Islamic material does indeed have unique features, this paper aims to demonstrate that there is also some observable continuity between Paleo-Arabic and Arabic writing cultures, not only in terms of script but also in the application of religious formulae.

Certain formulae are worded identically in Paleo-Arabic and Arabic graffiti, such as the introductory phrase bi-smi llāh ‘in the name of God’ (Basmala) (AL-JALLAD 2022). However, there are also phrases that are semantically similar and use the same verbal root, but in which the verbs appear in different grammatical forms. For example, the root ĠFR ‘to forgive’ is used in both Paleo-Arabic and Arabic graffiti, but in Paleo-Arabic it appears as yistiġfar (3. masc. sing. imperf) (AL-JALLAD and SIDKY 2021: 210), while in Arabic iġfir occurs in the imperative mood. Conversely, both writing cultures have an expression that conveys a similar meaning but uses different verbal roots. The phrase ‘whoever invokes/says God’s name’ is expressed in Paleo-Arabic with the root DʿW ‘to invoke’ (ArDA 1, see DiCoNab), while in Islamic Arabic the root QWL ‘to say’ is used. Although some religious formulae were adopted by early Muslims from earlier inscriptions, the difference in grammatical forms provides clear evidence that these religious formulae continued to evolve. A closer look at formulaic usage in Paleo-Arabic and Early Islamic Arabic inscriptions will provide a more nuanced insight into the dynamics of continuity and change in formulaic and linguistic usage in this period.

AL-JALLAD, A. (2022). A pre-Islamic basmala: reflections on its first epigraphic attestation and its original significance. Jerusalem Studies in Arabic and Islam 52: 1-28.

AL-JALLAD, A. and SIDKY, H. (2021): A paleo-Arabic inscription on the route north of Ṭāʾif. Arabian archaeology and epigraphy 33: 202-215.

DiCoNab: ‘The Digital Corpus of the Nabataean and Developing Arabic Inscriptions’ [Diconab.huma-num.fr]

NEHMÉ, L. (2020): The religious landscape of North-west Arabia as reflected in the Nabataean, Nabataeo-Arabic, and pre-Islamic Arabic inscriptions. Semitica et Classica 13: 127-154.

Majdak, Magdalena

Evolution of the Formulaic Expressions Referring to God in Polish Language History: Analysis of the Correspondence of the Czapski Family

The paper presents part of a research project on formulaic expressions containing the word God in the history of Polish. Its aim is to catalogue formulas (e.g. da Bóg, jeśli Bóg pozwoli, pożal się Boże, z Bogiem) in selected correspondence from the Baroque period, to compare this set with the resources of 20th- and 21st-century Polish, in which they are still present, and to examine whether these constructions maintain, lose or acquire formulaicity. The phrases selected for the presentation contain the unit God but are not always conscious references to God; rather, they are semi-magical formulas referring to a higher authority organizing the world and reality.

The material basis consists of letters obtained as part of the projects Edition of the letters of Magdalena née Czapska to Hieronim Florian Radziwiłł (2013–2016) and Sources on the Czapski Family in the 18th Century: Ego-documents of the Family Members of Pomeranian Voivode Piotr Jan (1685-1736) – A Philological-Historical Study and Edition (2022-2027). The correspondence, consisting of nearly 500 letters, was created in the 18th century in a family that was at once typical and original, whose members belonged to the upper social class and communicated both with the outside world and within the family. The letters were transliterated and transcribed, and some of them were published (2016). This resource was compared with three subcorpora: a) letters from King Jan III Sobieski to his wife Marysieńka (1665-1683) from The Electronic Corpus of 17th- and 18th-century Polish Texts (KorBa), b) the remaining letters from KorBa, and c) the remaining text genres, divided into Baroque and Enlightenment.

The rich search options of the corpus and its advanced query syntax are helpful here: they make it possible to ask about missing variants of formula elements without assuming their components a priori, including variants with variable word order (e.g. uchowaj Boże, Boże uchowaj). They also make it possible to obtain information about the frequency of n-grams containing the unit God in the Czapski correspondence relative to the reference subcorpora. The material will then be compared with formulas containing the unit God extracted using the Kolokator program from the National Corpus of the Polish Language (NKJP), where similar forms are still present, and from the NKJP letter subcorpus.
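The kind of order-insensitive n-gram count described above can be sketched as follows. This is not the KorBa query syntax or the Kolokator program: it is a minimal stand-alone illustration over a lemmatised token sequence, in which the function name and the toy lemma list are assumptions.

```python
from collections import Counter


def unordered_bigrams_with(lemmas, target):
    """Count adjacent lemma pairs containing `target`, ignoring word order."""
    counts = Counter()
    for left, right in zip(lemmas, lemmas[1:]):
        if target in (left, right):
            # Sorting the pair makes "uchowaj Boże" and "Boże uchowaj"
            # fall under the same key.
            counts[tuple(sorted((left, right)))] += 1
    return counts


# Toy lemma sequence (hypothetical data, not taken from the corpus).
lemmas = ["uchować", "Bóg", "żeby", "Bóg", "uchować", "dać", "Bóg"]

counts = unordered_bigrams_with(lemmas, "Bóg")
uchowaj_key = tuple(sorted(("uchować", "Bóg")))
```

Comparing such counts, normalised by subcorpus size, across the Czapski letters and the reference subcorpora would give the relative frequencies mentioned above; the normalisation step is left out of the sketch.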

The analysis includes: 1. extraction of formulas with the unit God from the Czapski family correspondence; 2. determination of the canonical form and variants of the formulas, and assignment of their grammatical and syntactic characteristics; 3. comparison with formulas with the unit God in a) the collection of Sobieski's letters to Marysieńka, b) the reference subcorpora, and c) the NKJP; 4. discussion of changes in the strength of the formula and its equivalent non-formula structures in the examined epistolographic material. The analysis of the aforementioned formulas also includes material from historical and phraseological dictionaries of the Polish language. This allows for deeper reflection on formulaicity based on the potential evolution of the meanings and uses of the formulas studied.

‒ Włodzimierz Gruszczyński, Dorota Adamiec, Renata Bronikowska, Witold Kieraś, Emanuel Modrzejewski, Aleksandra Wieczorek, Marcin Woliński 2021. The Electronic Corpus of 17th- and 18th-century Polish Texts, „Language Resources & Evaluation”, t. 56, z. 1, s. 309-332, https://link.springer.com/article/10.1007/s10579-021-09549-1

‒ Mikhail Mikhailov (2021), God, the Devil, and Christ: A corpus study of Russian syntactic idioms and their English and Finnish translation correspondences [in:] Formulaic language. Theories and methods, Edited by Aleksandar Trklja and Łukasz Grabowski, DOI: 10.5281/ZENODO.4727675

‒ Piotr Pęzik (2012), Wyszukiwarka PELCRA dla danych NKJP. Narodowy Korpus Języka Polskiego. Przepiórkowski A., Bańko M., Górski R., Lewandowska-Tomaszczyk B. (red.). 2012.

‒ Zygmunt Saloni, Marek Świdziński (2007), Składnia współczesnego języka polskiego, Warszawa.

‒ Elena Tognini-Bonelli (2001). Corpus Linguistics at Work, Amsterdam/Philadelphia.

‒ Formulaic Language in Historical Research and Data Extraction https://republic.huygens.knaw.nl/index.php/en/conference-formulaic-language/

‒ Joanna Zaucha (2007). Status językowy porównań standardowych a pojęcie utartości, [in:] Frazeologia a językowe obrazy świata przełomu wieków, red. W. Chlebda, Opole, s. 343-348.

Letters edition

‒ „Gdybym Cię, moje Serce, za męża nie miała, żyć bym nie mogła”. Listy Magdaleny z Czapskich do Hieronima Floriana Radziwiłła z lat 1744–1759, wstęp i opracowanie I. Maciejewska i K. Zawilska, Olsztyn 2016.

Corpora

‒ Electronic Corpus of 17th- and 18th-century Polish Texts (KorBa), https://korba.edu.pl/

‒ National Corpus of the Polish Language, https://nkjp.pl/

Dictionaries

‒ Bańko, M. (Ed.). (2000). Inny słownik języka polskiego. Wydawnictwo Naukowe PWN.

‒ Doroszewski, W. (Ed.). (1958–1969). Słownik języka polskiego PAN. Wydawnictwo Naukowe PWN.

‒ Majdak M. (2024–), Gruszczyński, W. (2004–2023). (Ed.). Elektroniczny słownik języka polskiego XVII i XVIII wieku. Instytut Języka Polskiego PAN. https://sxvii.pl

‒ Mrowcewicz, K., & Potoniec P. (Eds.). (1956–). Słownik polszczyzny XVI wieku. Instytut Badań Literackich.

‒ Skorupka, S. (Ed.). (1967) Słownik frazeologiczny języka polskiego, Warszawa.

‒ Urbańczyk, S. (Ed.). (1953–2002). Słownik staropolski. Instytut Języka Polskiego PAN.

‒ Wielki słownik frazeologiczny PWN z przysłowiami (2022), Warszawa.

‒ Żmigrodzki, P. (2007–). Wielki słownik języka polskiego PAN. Instytut Języka Polskiego PAN.

Marszałek, Jagoda & Wieczorek, Aleksandra

Polish and Latin date formulas used in Polish texts from the 17th and 18th centuries

Although the topic of multilingualism is already well explored for the historical languages of Western Europe (Trotter 2000, Adams 2003, Amsler 2012, Pahta et al. 2018), its features and significance for the history of the Polish language have become the subject of scientific research relatively recently (Axerowa 2007, Walczak and Mielczarek 2017, Zarębski 2021, Masłej 2023).

Polish-Latin bilingualism was present in the Polish-speaking area in the Middle Polish era (16th-18th centuries). The Polish literary language was already fully formed in the 16th century, but Latin continued to function in Polish literature in the following two centuries. Some texts were written entirely in Latin, but Latin elements were often incorporated into otherwise uniform Polish text as inlays (Brajerski 1965, Klemensiewicz 2009: 402–409, Lewaszkiewicz & Rzepka 1978, Leszczyński 1983, Dubisz 2002: 222–229, Kopaczyk 2018, Kopaczyk et al. 2016). The presentation focuses on the coexistence of Latin and Middle Polish, using the example of date formulas in Polish texts from the 17th and 18th centuries. Repetitive expressions referring to time and date are among the most common lexical bundles, especially in some specialized historical texts (cf. e.g. Kopaczyk 2013: 210). Here are some examples from Middle Polish texts:

1) die 3 iulii anno 1732 – Lat. ‘3rd July 1732’
2) dnia 17. Maja 1628 – Pl. ‘17th May 1628’
3) dnia 31. Aprilis, Anno 1646 – Pl. and Lat. ‘31st April 1646’
4) in Anno 1612. et 1613 – Lat. ‘in the years 1612 and 1613’
5) Czternastego Novembra, Roku 1719 – Pl. ‘14th November 1719’

Several types of date formulas, both Polish and Latin, can be distinguished; despite being highly repetitive, they show significant variation in the elements used. As example 3 shows, they often combine the two languages. An additional topic is the choice between Polish month names and month names of Latin origin (listopad vs. november, see example 5). Using date formulas as an example, the study addresses the question of why Latin elements are still present in Polish texts from the period, despite the existence of Polish equivalents.
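As a simple illustration of how the language make-up of such formulas might be tagged automatically, the sketch below labels a date formula as Latin, Polish or mixed by looking up its words in two small word lists. The lists are hypothetical and built only from the five examples above, the function name is an assumption, and this is not the annotation scheme of the corpus or the treebank.

```python
import re

# Hypothetical mini-lexicons covering only the words in the five examples.
LATIN = {"die", "anno", "in", "et", "iulii", "aprilis"}
# "novembra" is a Latin-origin month name with a Polish ending; it is counted
# as Polish here, following the "Pl." label given for that example.
POLISH = {"dnia", "roku", "maja", "czternastego", "novembra"}


def classify_date_formula(formula):
    """Return 'Lat.', 'Pl.', 'Pl. and Lat.' or 'unknown' for a date formula."""
    words = re.findall(r"[^\W\d_]+", formula.lower())  # alphabetic words only
    has_latin = any(word in LATIN for word in words)
    has_polish = any(word in POLISH for word in words)
    if has_latin and has_polish:
        return "Pl. and Lat."
    if has_latin:
        return "Lat."
    if has_polish:
        return "Pl."
    return "unknown"


examples = [
    "die 3 iulii anno 1732",
    "dnia 17. Maja 1628",
    "dnia 31. Aprilis, Anno 1646",
    "in Anno 1612. et 1613",
    "Czternastego Novembra, Roku 1719",
]
labels = [classify_date_formula(example) for example in examples]
```

A word-list lookup of this kind cannot decide borderline cases such as Polonised Latin month names on its own, which is where the treatment of forms like Novembra becomes a genuine annotation decision.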

In addition, the research examines how the extra-linguistic context (genre, topic, etc.) may have influenced the choice of a particular date formula, and how these trends have changed over the course of two centuries. Researchers of Middle Polish texts note that the degree of saturation of the text with Latin elements varied depending on the type of text and other extra-linguistic features (cf. e.g. Walczak-Mikołajczakowa and Mikołajczak 2021).

Next, the formal language of date notation and its standardization in Polish Baroque texts are discussed. Finally, the research examines the possibilities of annotating date formulas in the Middle Polish Dependency Treebank (Wieczorek 2025).

The presented study is corpus-based. The research material comes from the Electronic Corpus of 17th- and 18th-century Polish Texts, which offers many possibilities for searching and analyzing data, including the use of metadata such as publication time or genre (korba.edu.pl; Gruszczyński et al. 2022; cf. also Bronikowska & Kryńska 2020).

Adams J. N. (2003): Bilingualism and the Latin Language. Cambridge.

Amsler M. (2012): Affective Literacies. Writing and Multilingualism in the Late Middle Ages. Turnhout.

Axerowa, A. (2007): Niespodzianki dwujęzyczności szlacheckiej: Pasek jako orator. “Pamiętnik Literacki. Czasopismo kwartalne poświęcone historii i krytyce literatury polskiej” (2): 207–218. (https://pamietnik-literacki.pl/uploads/settings/2023/05/21/646a1260091c57.89377580_9-axerowa.pdf)

Brajerski T. (1965): Ze składni tekstu makaronizowanego. “Studia z Filologii Polskiej i Słowiańskiej” 5: 237–240.

Bronikowska R. and Kryńska K. (2020): Łacina w KorBie. Użyteczność elektronicznego korpusu tekstów polskich XVII i XVIII wieku dla filologa neolatynisty. “Polonica” 40:123–135.

Dubisz S. (2002): Język – historia – kultura (wykłady, studia, analizy). Warszawa.

Gruszczyński W., Adamiec D., Bronikowska R., Kieraś W., Modrzejewski E., Wieczorek A., and Woliński M. (2022): The Electronic Corpus of 17th- and 18th-century Polish Texts. “Language Resources and Evaluation” 56(1):309–332.

Klemensiewicz Z. (2009): Historia języka polskiego. 9. ed., Warszawa.

Kopaczyk J. (2013): The Legal Language of Scottish Burghs: Standardization and Lexical bundles (1380–1560). Oxford University Press 166.

Kopaczyk J. (2018): Administrative multilingualism on the page in early modern Poland: In search of a framework for written code-switching. In P. Pahta, J. Skaffari, L. Wright (eds.): Multilingual Practices in Language History. English and Beyond. – Berlin–Boston: De Gruyter Mouton, 258–275.

Kopaczyk J., Włodarczyk M., and Adamczyk E. (2016): Medieval Multilingualism in Poland: Creating a Corpus of Greater Poland Court Oaths (Rotha). “Studia Anglica Posnaniensia Adam Mickiewicz University” 51, no. 3: 16–20. (https://doi.org/10.1515/stap-2016-0012)

Lewaszkiewicz T. and Rzepka W. R. (1978): Uwagi o leksyce makaronicznej w tekstach polskich z XVII wieku. “Z polskich studiów slawistycznych” 5: 271–277.

Leszczyński Z. (1983): Echa makaronizowania. “Roczniki Humanistyczne” vol. XXX-XXXI, is. 6 – 1982–1983: 97–104. (https://bibliotekanauki.pl/articles/2127932).

Masłej, D. (2023). Średniowieczne zabytki polsko-łacińskie jako przedmiot badań historycznojęzykowych. Perspektywy badawcze. “Biuletyn PTJ”, LXXIX(79), 355-369. https://doi.org/10.5604/01.3001.0054.2635.

Trotter D. A. (ed.) (2000): Multilingualism in Later Medieval Britain. Cambridge.

Walczak B. and Mielczarek A. (2017): Prolegomena historyczne – wielojęzyczność w Rzeczypospolitej Obojga Narodów. “Białostockie Archiwum Językowe” 17: 255–268. (https://www.ceeol.com/search/viewpdf?id=679921)

Walczak-Mikołajczakowa M. and Mikołajczak A. (2021): Kilka uwag o języku i kontekście kulturowym Diariusza podróżnego hetmana Filipa Orlika. “Poznańskie Studia Polonistyczne Seria Językoznawcza” vol. 28 (48), no 2: 162-167. https://doi.org/10.14746/pspsj.2021.28.2.9

Wieczorek A. (2025): Towards Middle Polish Dependency Treebank. In “Native Language in the 21st Century: System, Communication Practices and Education”. V & R Unipress GmbH. (https://czasopisma.uni.lodz.pl/linguistica/article/view/20488)

Zarębski R. (2021): O potrzebie badań bilingwizmu w historii polszczyzny. “Prace Językoznawcze” XXIII/3: 71–86. (http://uwm.edu.pl/polonistyka/pracejezykoznawcze/pol/pliki/Prace_Jezykoznawcze_23_3_2021.pdf)

Martín González, Elena & Konstantopoulou, Stavroula

Formulaic Language in the Oracular Inscriptions of Dodona: Integrating Traditional Epigraphic Analysis and Deep Neural Networks

The 2013 publication of the corpus of oracular inscriptions from Dodona by Dakaris, Vokotopoulou, and Christidis, which contains over four thousand inscriptions, invites a reassessment of the formulaic language used by consultants in their enquiries to Zeus Naios and Dione. Contrary to previous assumptions, the new evidence reveals a broader range of formulae, although identifiable patterns remain.

As part of our work to produce a new edition of these inscriptions, which combines traditional epigraphic analysis with the assistance of the deep neural network Ithaca (Assael et al. 2022), the formulaic language of the oracular enquiries plays a central role. Epigraphic analysis is essential for restoring the highly fragmentary texts on the lead tablets, providing valuable information about the cult practices at the sanctuary, the language of the enquirers, and even the inscriptions' chronology. Meanwhile, applying Artificial Intelligence to these texts offers an exceptional opportunity to test the performance of the Ithaca model, particularly in its reliance on standard oracular formulas for restoration and its ability to attribute texts geographically and chronologically (Bodel et al. 2024).

Our presentation will introduce the dataset and methodology of our research, emphasizing the importance of combining insights from both traditional analysis and deep neural networks to offer a comprehensive, renewed understanding of the formulaic language in the Dodona oracular tablets.

Assael, Y., Sommerschield, T., Shillingford, B. et al. (2022). Restoring and attributing ancient texts using deep neural networks. Nature 603, 280–283. https://doi.org/10.1038/s41586-022-04448-z.

Bodel, J., Prag, J.R.W. and Roueché, C. (2024). Open Scholarship: Epigraphic Corpora in the Digital Age. In Pierre Fröhlich & Milagros Navarro Caballero (eds.), L’épigraphie au XXIe siècle. Actes du XVIe Congrès International d’Épigraphie Grecque et Latine, Bordeaux, 29 août-02 septembre 2022, Bordeaux, Ausonius, 91-117.

Dakaris, S., Vokotopoulou, I. and A.-Ph. Christidis. 2013. Τα χρηστήρια ελάσματα της Δωδώνης των ανασκαφών Δ. Ευαγγελίδη [Ta chresteria elasmata tes Dodones ton anaskaphon D. Euangelide] Ι-ΙΙ, Athens.

Meeder, Sven & Schmidt, Gleb

Formulae of Authority: Formulaic Aspects of Referencing the Bible in Early Medieval Canon Law

The evolution of canon law in the Early Middle Ages was marked by the constant adaptation and re-adaptation of the normative legacy to new social and spiritual contexts. To remain relevant and authoritative, new collections of canonical texts, whether official or forged in an attempt to appear legitimate, had to conform to the strict constraints of tradition. This left compilers with a rich yet limited set of “conceptual building blocks” (existing norms, exempla, interpretations, and precedents), almost always supported by references to authoritative sources, especially Scripture. Additionally, various “linguistic building blocks” (specific formulae and expressions) were available to the compilers, enabling them to achieve particular rhetorical effects.

As a result, the corpus of Early Medieval canonical texts is linguistically repetitive and formulaic in its intellectual structures. This poses significant challenges not only for those seeking to contextualize these texts (e.g., dating or attributing them), but also for understanding how they functioned and why their reception varied so dramatically, with some texts achieving wide and lasting influence, while others saw only limited circulation.

Recent scholarship has acknowledged that the individuality of authors in canonical texts manifested itself in subtle differences in how these various building blocks were framed and interconnected. Building on this, we argue that the “success” and authority of a collection largely depended on the compiler’s ability to employ complex formulaic language to emphasize its continuity with the authoritative tradition.

The ultimate aim of this paper is to demonstrate that compilers were highly conscious in their use of pre-defined elements, developing various techniques to present this legacy to their readers in a convincing, authentic, and authoritative manner.

To pursue this ambition, the SOLEMNE project is building what will be a nearly complete corpus of Early Medieval canonical texts, both already edited and newly collected from original manuscript documents. To work with this rich and growing body of data, we have developed a pipeline that includes embedding-based semantic search, text reuse detection, and retrieval-augmented generation.
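As a rough illustration of what text reuse detection can involve at its simplest, the following sketch compares two passages by the overlap of their word n-grams ("shingles"). It is not the SOLEMNE pipeline: the function names, the Jaccard measure, the n-gram length, and the sample Latin phrases are all assumptions chosen for illustration.

```python
def shingles(text, n=3):
    """Return the set of word n-grams (as tuples) found in a passage."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def reuse_score(passage_a, passage_b, n=3):
    """Share of word n-grams that two passages have in common (0.0 to 1.0)."""
    return jaccard(shingles(passage_a, n), shingles(passage_b, n))


# Hypothetical sample phrases, invented for this sketch.
a = "sicut scriptum est in lege domini"
b = "sicut scriptum est in evangelio"
print(reuse_score(a, b))
```

Here the two invented passages share two of their five distinct word trigrams, so the score is 0.4. A surface measure of this kind only finds near-verbatim reuse; looser paraphrase is presumably where the embedding-based search mentioned above would come in.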

This synthetic approach, combined with close reading, enables us to systematically detect, describe, and interpret what we call “biblical formulaicity” in canonical texts. We argue that “biblical formulaicity”, which we define as biblical intertextuality in its strictly functional aspect, is one of the core stylistic features of these texts. At the most basic, intra-sentence level, it allowed compilers to give their texts almost sacred connotations. More importantly, quoting the Bible in a particular way could strengthen an argument or establish a connection between a norm and a Scriptural exemplum. By altering the framing, changing the introductory formula, interrupting quotations, or adding explanations and commentary, compilers could give the material a new sound.

Having introduced the ways to detect and categorize the recurring literary devices in the canonical corpus, we shall consider in more detail the case of the so-called Pseudo-Isidorean corpus, a voluminous canonical collection forged sometime in the 9th century to justify a nearly complete immunity of bishops from secular power. By analyzing how the recurring patterns detected in a machine-assisted manner manifest themselves in this particular collection, we shall showcase how the forger achieved his goal of establishing his creation as a legitimate reference in legal disputes.

Ubl, Karl, and Daniel Ziemann (eds.). Fälschung als Mittel der Politik? Pseudoisidor im Licht der neuen Forschung. Gedenkschrift für Klaus Zechiel-Eckes. Monumenta Germaniae Historica. Studien und Texte, Bd 57. Wiesbaden: Harrassowitz, 2015.

Mandikal, Priyanka. “Ancient Wisdom, Modern Tools: Exploring Retrieval-Augmented LLMs for Ancient Indian Philosophy”, Proceedings of the 1st Machine Learning for Ancient Languages Workshop, Association for Computational Linguistics (ACL), 2024.

Korkiakangas, Timo. “Documentary formulae as text reuse templates: constat and manifestus clauses in early medieval Latin charters”, in Digital Medievalist 16, 1–44. 2023. https://doi.org/10.16995/dm.8195

Mika, Tomasz

Division of Old Polish Apocrypha: Title Formulas in the Middle Ages

In its historical context, the process of vernacularization faced, among other challenges, the central one of finding meaningful ways to divide large texts into portions of content, which today are recognized as chapters, titles, and subtitles. The formulation of all three, however, varied widely.

At its genesis, a common strategy was to extract and differentiate one of the beginning sentences to act as a chapter; however, this process eventually became viewed as an overview of the chapter itself. Title formulation would evolve further, with some writers choosing experimental formulas for dividing content and others developing rigid and explicit formulas. The Old Polish apocryphal work of the New Testament contains approximately 800 titles and subtitles. The largest Slavic apocryphal text, Meditations on the Life of Jesus (also called Meditations of Przemysl), contains more than 430 subtitles. These chapters, titles, and subtitles owe their creation and evolution to medieval scribes.

Thanks in large part to the rewriting process, many texts became chaptered and titled, with medieval scribes being directly responsible. There is evidence to show how scribes gradually created, adopted, abandoned, and even readopted schemes. Further evidence shows how scribes perfected schemes over successive texts through developing narrative and etiquette, action to the object, and a sense of detachment from time and people. Simultaneously, we also see the increasing importance of sentence structure and the role of the noun phrase. Yet, these medieval scribes faced the challenge of finding critical chapter information while creating and applying the appropriate scheme to express it.

The vernacularization of the Old Polish Apocrypha, coupled with the evolution and process of text division, ultimately produces a stabilized linguistic schema. This paper, albeit highly truncated, aims to illuminate these processes. Consequently, my presentation will explore this matter through statistical and processual analysis. The former will deal with the identification and frequency of schemes. The latter will investigate and reconstruct the stages of schema formation and its mechanisms, which include reduction, derivation, and the transformation of syntactic structures.

Buerki, Andreas. 2020. Formulaic Language and Linguistic Change: A Data-Led Approach, CUP.

Kiparsky, Paul. 1976. “Oral poetry: some linguistic and typological considerations”, in Stolz, Benjamin A. & Shannon, Richard S. (eds), Oral Literature and the Formula, Ann Arbor, 73–106.

Kuiper, Koenraad. 2004. “Formulaic performance in conventionalised varieties of speech”, in Schmitt, Norbert (ed.), Formulaic Sequences: Acquisition, Processing, and Use, Benjamins, 37–54.

Kuiper, Koenraad. 2009. Formulaic Genres, Palgrave.

Mika, Tomasz. 2018. “The oldest Polish texts. New methods and new research issues in Polish historical linguistics”, in Kapetanović, Amir (ed.), The oldest attestations and texts in the Slavic languages, Holzhausen Der Verlag, 212-233.

Mika, Tomasz & Twardzik, Wacław. 2012. “Jak zagadkowe cztery tytuły rozdziałów w „Rozmyślaniu przemyskim” pozwalają wyobrażać sobie jego zagubiony autograf” [‘How the mysterious four chapter titles in The Przemysl Meditation allow us to imagine its lost autograph’], in Podtergera, Irina (ed.), Schnittpunkt Slavistik: Ost und West im wissenschaftlichen Dialog. Festgabe für Helmut Keipert zum 70. Geburtstag, Vandenhoeck & Ruprecht Verlag, 359-375.

Schmitt, Norbert (ed.). 2004. Formulaic Sequences: Acquisition, Processing, and Use, Benjamins.

Wray, Alison. 2008. Formulaic Language: Pushing the Boundaries, OUP.

Murgia, Giulia & Puddu, Nicoletta

Notarial Formularies in Early Modern Sardinia

In the history of the Sardinian language – a Romance language attested in Sardinia since the 11th century – legal-administrative Sardinian represents the only textual tradition that can be traced with continuity from the earliest written manifestations, when Sardinia was subdivided into four autonomous kingdoms (Giudicati). With the start of the Catalan-Aragonese conquest of Sardinia in the 14th century, we witness the penetration of the figure of the Iberian-trained notary (Condorelli 2009). Even in the modern age, the Sardinian language of the administration persists as an area of resistance in writing (especially in notarial production), even though new languages (especially Catalan and Spanish) oust it from the upper echelons of Sardinia’s community language repertoire.

The peculiar sociolinguistic situation just described accounts for the interest in the writings of modern-age legal practitioners in Sardinia: these figures make up a particular community of practice and discourse (Putzu 2021), in which one observes the sedimentation and sharing of practices, including the use of a formulaic language that is often multilingual.

This articulate professional group has a local apprenticeship with Sardinian notaries, and a specific training in Latin that, starting in the 17th century, took place at the newly founded University of Cagliari. Their repertoire, however, is shared by a wider textual community, which draws on the Iberian framework on the one hand, but is fundamentally pan-European, due to the European diffusion of common law in the medieval and modern era. The Sardinian community that drafts notarial deeds is, moreover, variegated and not entirely homogeneous (sometimes showing instances of formulae that are awkwardly re-proposed or the result of contamination of several models): if notaries have a specific and more supervised preparation, the different level of familiarity with a sectorial script is expressed above all when it is the curates who write, especially wills. Moreover, the acquisition of European models does not exclude the presence of clauses and formulae of indigenous matrix, the elaboration of which was necessary to adapt to the peculiarities of Sardinian law (Era 1934).

Previous studies on Sardinian notarial production (Puddu-Talamo 2020) and its formulaic aspects (Murgia-Puddu 2024) have shown – particularly through the study of coordinated binomials – the emergence of traits of considerable interest both for the identification of the multilingual patterns and practices of the writers and, more generally, for the study of the Sardinian legal language.

In this contribution we will focus on the analysis of the formulaic language of the Sardinian-language notarial formularies scattered in the archives of Sardinia, some of which have been published (Carta 2020), while others are still waiting to be brought to light. An integrated approach between philology and linguistics will be adopted, aiming, on the one hand, at the archival recovery of the materials and, on the other hand, at an initial quantitative analysis of the formulaic language in relation to the distribution of formulae within the different textual typologies present in the formularies.

Bach, Ulrich. 2017. “«I do make and ordayne this my last wyll and testament in maner and forme Folowing»: Functions of Binomials in Early Modern English Protestant Wills”, in Kopaczyk, Joanna & Sauer, Hans (eds), Binomials in the history of English: Fixed and flexible, Cambridge, Cambridge University Press, 222-240.

Biber, Douglas. 2009. “A Corpus-Driven Approach to Formulaic Language in English: Multi-Word Patterns in Speech and Writing”, in International Journal of Corpus Linguistics 14 (3): 275-311.

Biber, Douglas. 2010. “What can a corpus tell us about registers and genres?”, in O’Keeffe, Anne & McCarthy, Michael (eds), The Routledge handbook of corpus linguistics, London, Routledge, 241-254.

Blasco Ferrer, Eduardo, Koch, Peter & Marzo, Daniela (eds). 2017. Manuale di linguistica sarda, Berlin/Boston, De Gruyter.

Cadeddu, Maria Eugenia. 2023. “Scrivere in castigliano, parlare in sardo. Esempi di contesti comunicativi in Ogliastra (XVIII secolo)”, in Fresu, Rita, Maninchedda, Paolo, Murgia, Giulia & Serra, Patrizia (eds), Il «traffico delle lingue». Idiomi a contatto in Sardegna e nel Mediterraneo in età preunitaria, Cagliari, UNICApress, 149-174, <https://doi.org/10.13125/unicapress.978-88-3312-108-6>.

Carta, Michele. 2020. «Tabula dessas formulas de differentes instrumentos». Il formulario del notaio Gavino Francesco Pinna Succhioni di Ploaghe, Serramanna, Tipografia 3ESSE.

Condorelli, Orazio. 2009. “Profili del notariato in Italia Meridionale, Sicilia e Sardegna (secoli XII-XIX)”, in Schmoeckel, Mathias & Schubert, Werner (eds), Handbuch zur Geschichte des Notariats der europäischen Traditionen, Baden-Baden, Nomos, 65-123.

Era, Antonio. 1934. Lezioni di storia delle istituzioni giuridiche ed economiche sarde, Roma, s.n.

Korkiakangas, Timo. 2022. “From memory or formulary: how were medieval documentary formulae reproduced?”, in Mirator 22, 4-24. <https://doi.org/10.54334/mirator.v22i1.119760>

Koolen, Marijn & Hoekstra, Rik. 2022. “Detecting formulaic language use in historical administrative corpora”, in Proceedings of the Computational Humanities Research Conference 2022, Antwerp, Belgium, December 12-14, 2022, 127-151. <https://ceur-ws.org/Vol-3290/long_paper5740.pdf>

Kopaczyk, Joanna & Sauer, Hans. 2017. “Defining and Exploring Binomials”, in Kopaczyk, Joanna & Sauer, Hans (eds), Binomials in the history of English: Fixed and flexible, Cambridge, Cambridge University Press, 1-23.

Kopaczyk, Joanna. 2020. “The language of Medieval legal record as a complex multilingual code”, in Armstrong, Jackson W. & Frankot, Edda (eds), Cultures of Law in Urban Northern Europe. Scotland and its Neighbours c. 1350-c.1650, London, Routledge, 58-79.

Kopaczyk, Joanna. 2024. “Unpacking and capturing multilingual practices and their effects in medieval administrative and legal discourse”, in Consani, Carlo, Guazzelli, Francesca & Perta, Carmela (eds), Gruppi professionali come fattore di innovazione linguistica. Evidenze documentarie in Europa tra Tarda Antichità e Medioevo, Alessandria, Edizioni dell’Orso, 13-28.

Murgia, Giulia & Puddu, Nicoletta. 2024. “Su alcuni binomi coordinati in un corpus di documenti sardi di età moderna”, in Consani, Carlo, Guazzelli, Francesca & Perta, Carmela (eds), Gruppi professionali come fattore di innovazione linguistica. Evidenze documentarie in Europa tra Tarda Antichità e Medioevo, Alessandria, Edizioni dell’Orso, 113-133.

Puddu, Nicoletta & Stein, Achim. 2018. “Word-level and higher level annotation of the Sardinian Medieval Corpus”, in Frank, Andrew U., Ivanovic, Christine, Mambrini, Francesco, Passarotti, Marco & Sporleder, Caroline (eds), Proceedings of the Second Workshop on Corpus-Based Research in the Humanities. CRH-2, Vienna, Gerastree, Dept. of Geoinformation, TU, 161-170.

Puddu, Nicoletta & Talamo, Luigi. 2020. “EModSar: A Corpus of Early Modern Sardinian Texts”, in Marras, Cristina, Passarotti, Marco, Franzini, Greta & Litta, Eleonora (eds), Atti del IX Convegno Annuale dell’Associazione per l’Informatica Umanistica e la Cultura Digitale (AIUCD). La svolta inevitabile: sfide e prospettive per l’Informatica Umanistica, Milano, Università Cattolica del Sacro Cuore, 210-215.

Putzu, Ignazio Efisio. 2021. “Comunità di pratica, comunità di discorso e comunità testuali tra sincronia e diacronia: alcune considerazioni preliminari”, in Rhesis, 12.1, 66-88, <https://ojs.unica.it/index.php/rhesis/article/view/5659>.

Schena, Olivetta. 2013. “Notai e notariato nella Sardegna del tardo Medioevo”, in Meloni, Maria Giuseppina (ed.), Elites urbane e organizzazione sociale in area mediterranea fra tardo medioevo e prima età moderna, Atti del seminario di studi Cagliari, 1-2 novembre 2011, Roma, ISEM-CNR, 325-353.

Stefanowitsch, Anatol & Gries, Stefan. 2003. “Collostructions: Investigating the interaction of words and constructions”, in International Journal of Corpus Linguistics 8(2), 209-243.

Virdis, Maurizio. 2023. “Dinamiche linguistiche nella lunga età sardo-iberica”, in RiMe, 13.II, 485-510, <https://rime.cnr.it/index.php/rime/article/view/664>.

Mäkinen, Martti

Exploring formulae through stylometric analysis of Middle English documents

Spelling variation in historical language data has long been a considerable challenge for empirical and corpus linguists, and this is particularly true for studies of historical formulaic language use (cf. Korkiakangas, 2024). This paper investigates the usability of Stylo, a stylometric package written for R (Eder, Rybicki & Kestemont, 2017; Eder 2015a and 2015b), in identifying formulae in Middle English documents. The aim is to test the potential of unsupervised character and word n-gram analysis in charting formulaic language use in texts encumbered by a lot of spelling variation, in advance of detailed, traditional analysis of the texts, thus enabling the use of unannotated and unlemmatized corpora in the study of formulae.

In earlier studies, Stylo has been able to distinguish between Middle English document categories (Mäkinen 2019), and also between Middle English dialect areas, the latter by using sets of less frequent n-grams (Mäkinen 2020). In the current study, the focus is on more frequent word and character n-grams. The choice of analytic units is based on two assumptions: (1) Despite the prevalent spelling variation in Middle English texts, documents would carry somewhat restricted vocabulary, which may enrich the occurrence of certain spelling forms, and thus also the occurrence of certain character n-grams, which makes them analytically a reasonable choice. (2) Documents are more homogeneous with other documents written for the same purpose, i.e. the structure of agreements, grants, leases etc. would have followed a more or less similar pattern. Also, many of them would have been adopted from earlier text types in other languages, like Latin and French, and that emphasizes the fact that the text of formulae did not necessarily belong in the scribes’ own linguistic repertoires. Therefore, early standard spellings may first have occurred in the formulaic parts of documents, thus lessening spelling variation in the passages.
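The reasoning behind character n-grams can be made concrete with a small sketch. It is written in Python for illustration only and does not reproduce Stylo or its R interface; the function names and the two spelling variants are assumptions invented for the example.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams after normalizing case and whitespace."""
    s = " ".join(text.lower().split())
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def shared_ngrams(a, b, n=3):
    """Number of character n-grams (with multiplicity) that two strings share."""
    return sum((char_ngrams(a, n) & char_ngrams(b, n)).values())


# Two illustrative spelling variants of one word (not taken from MELD).
print(char_ngrams("witnesse", 3))
print(shared_ngrams("witnesse", "wytnesse", 3))
```

The two variants differ as whole words, so a word-level count would treat them as unrelated, yet they share four of their six trigrams (tne, nes, ess, sse). This is the sense in which character n-grams can remain informative in unlemmatized text with variable spelling.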

The data for the study is drawn from A Corpus of Middle English Local Documents (henceforth MELD), compiled at the University of Stavanger. It contains documentary texts from 1400 to 1525. The version used in this study is 2017.1, consisting of over 2,000 localizable scribal documents, and c. 850,000 words (MELD). Localizable documents are texts that either contain the information on the provenance of the document in the actual text or provide circumstantial information about the provenance (through the use of personal and place names) so that localizing the origin of the document is possible (Stenroos and Thengs, 2012). Methodologically, this study is inspired by Kopaczyk (2013) and Wynne, McIntyre and Burke (2024).

Eder, M., Rybicki, J. and Kestemont, M. (2017). ‘Stylo’: a package for stylometric analyses. Computational Stylistics Group. 1-36. Available at: https://tinyurl.com/y449xxkk.

Eder, M. (2015a). Visualization in Stylometry: Cluster Analysis Using Networks. Digital Scholarship in the Humanities, 1–15. doi:10.1093/llc/fqv061.

Eder, M. (2015b). Taking stylometry to the limits: Benchmark study on 5,281 texts from Patrologia Latina. [Online]. Digital Humanities 2015: Conference Abstracts. Available at: http://dh2015.org/abstracts.

Kopaczyk, J. (2013). The Legal Language of Scottish Burghs: Standardization and Lexical Bundles (1380-1560), Oxford Studies in Language and Law. OUP.

Korkiakangas, T. (2024). A linguist's viewpoint: formulaic language as a challenge for historical linguistics. In Formulaic Language in Historical Research and Data Extraction (Huygens Institute for the History and Culture of the Netherlands; Royal Netherlands Academy of Arts and Sciences, Amsterdam, 7-9 February, 2024).

Mäkinen M. (2019). Testing a stylometric tool in the study of Middle English documentary texts. In: Bös B. and Claridge C. (eds.). Norms and Conventions in the History of English. John Benjamins (Amsterdam). 149-166.

Mäkinen, M. (2020). Stylo visualisations of Middle English documents. Journal of Data Mining & Digital Humanities. Special issue on Visualisations in Historical Linguistics. 1-10. Available at: https://jdmdh.episciences.org/7022.

MELD = The Middle English Local Documents Corpus, version 2017.1. June 2017, University of Stavanger (Stavanger). Available at: https://www.uis.no/research/history-languages-and-literature/the-mest-programme/a-corpus-of-middle-english-local-documents-meld/.

Stenroos, Merja & Thengs, Kjetil V. (2012). Two Staffordshires: real and linguistic space in the study of Late Middle English dialects. In Jukka Tyrkkö, Matti Kilpiö, Terttu Nevalainen & Matti Rissanen (Eds.), Outposts of Historical Corpus Linguistics: From the Helsinki Corpus to a Proliferation of Resources. (Studies in Variation, Contacts and Change in English 10), Helsinki: VARIENG. [Online]. Available at: http://www.helsinki.fi/varieng/series/volumes/10/stenroos_thengs/.

Wynne, M., McIntyre, D., & Burke, M. (2024). Formulaic language in Early English Books Online: From computational linguistics to classical rhetoric. In Formulaic Language in Historical Research and Data Extraction (Huygens Institute for the History and Culture of the Netherlands; Royal Netherlands Academy of Arts and Sciences, Amsterdam, 7-9 February, 2024).

Norris, Jérôme (part of panel A)

The highly formulaic nature of epigraphic habits in North-West Arabia before Islam

Pre-Islamic Arabia has been described as one of the most extraordinary places in the ancient world from the point of view of epigraphy, due to the impressive number of texts it produced, the high level of literacy of its population and also because it developed its own family of alphabets, the so-called “South Semitic” scripts, which include the “Ancient South Arabian” (ASA) script from southern Arabia and the “Ancient North Arabian” (ANA) scripts from northern Arabia (Macdonald 2015: 1).

The epigraphic situation of North-West Arabia was marked by a profound diversity with the co-occurrence of a multitude of different scripts. These scripts include the local ANA scripts, of which more than 10 are currently distinguished, among them the so-called “Dadanitic”, “Taymanitic”, “Dumaitic”, “Safaitic” and “Hismaic” scripts and plenty of others improperly called “Thamudic” (Hayajneh 2011; Al-Jallad 2018). Besides these, they include scripts of Aramaic origin imported from the Levant, including the Imperial Aramaic script as well as local variants of Aramaic that developed in Arabia such as “Taymāʾ Aramaic” and “Nabataean Aramaic”. The Nabataean script, after three centuries of evolution, developed into the “Arabic” script, passing through the intermediate phases of “Nabataeo-Arabic” (late third-mid fifth centuries AD) and “Palaeo-Arabic” (late fifth-sixth centuries AD) (Nehmé 2010).

Despite their profound diversity, the different writing traditions of pre-Islamic North-West Arabia have the common characteristic of being extremely formulaic. Each epigraphic group, thus, tends to be linked to a limited set of recurring formulae. Although this characteristic has been identified for a long time, no comparative study of the different formulae attested from one group to another has been conducted until now, which is what this contribution aims to do. This analysis leads to a double observation. On the one hand, certain formulae are specific to a given epigraphic tradition. This is, for instance, the case of Taymanitic nṣr l-Ṣlm “he kept watch on behalf of [the deity] Ṣalm”, Thamudic D PN (Personal Name) ʿs²q PN “PN the lover of PN” or Nabataean šlm PN “may PN be secure”. In this case, I would propose the concept of “closed” formulae, comparable to the concept of “closed cults”. On the other hand, other formulae are shared among different writing traditions, such as the invocation ḏkr(t) DN (Divine name) PN “may DN be mindful of PN”, the expression of longing ts²wq ʾl-PN “he longed for PN” or the expression wdd f-PN “love/desire for PN”. For the latter, I would propose the concept of “open” formulae.

The conclusion that will emerge is that, despite their very high literacy, the ancient inhabitants of North-West Arabia used to carve inscriptions to express a very limited number of messages specific to a given writing tradition and following extremely codified rules. As for the identification of “closed” and “open” formulae, it makes it possible to distinguish between writing traditions that developed independently and others that were, on the contrary, in contact with each other, probably reflecting neighbourhood relations and tribal ties between several populations sharing the same territory.

Al-Jallad, A. 2018. What is Ancient North Arabian? In D. Birnstiel and N. Pat-El (eds) Re-engaging Comparative Semitic and Arabic Studies. Wiesbaden: Harrassowitz Verlag (Abhandlungen für die Kunde des Morgenlandes, 115): 1–44.

Hayajneh, H. 2011. Ancient North Arabian. In S. Weninger, G. Khan, M. Streck, J. Watson (eds.), The Semitic Languages: An International Handbook. Berlin/Boston: De Gruyter Mouton (Handbooks of Linguistics and Communication Science, 36): 756–782.

Macdonald, M.C.A. 2015. On the uses of writing in ancient Arabia and the role of palaeography in studying them. Arabian Epigraphic Notes 1: 1–49.

Nehmé, L. 2010. A glimpse of the development of the Nabataean script into Arabic based on old and new epigraphic material. In M.C.A. Macdonald (ed.), The development of Arabic as a written language. (Supplement to the Proceedings of the Seminar for Arabian Studies 40). Oxford: Archaeopress: 47-88.

PANEL A: Norris, Jérôme, Maczuga, Julia & Kootstra-Ford, Fokelien

Shared formulae, continuity, and change in the epigraphy of Northern Arabia

(the abstracts of the panel presentations are listed in alphabetical order among other presentations)

In May 2024, the AlUla Inscriptions Analysis Project (AICAP) started at Ghent University. This project is a collaboration with the Royal Commission for AlUla and aims to read and interpret the tens of thousands of rock inscriptions that are found in AlUla County (Northwest Saudi Arabia), and to bring them together in a single database. The epigraphy from AlUla spans a wide range of periods, scripts, and languages. It ranges from pre-Islamic inscriptions in South Semitic script variants like Dadanitic (6th – 1st c. BCE), through several Greek and Latin inscriptions, different varieties of Aramaic, and early Arabic inscriptions, up to modern Arabic inscriptions written in the 20th century.

The Arabian Peninsula was extremely rich in local scripts and associated writing cultures up until about the 5th century AD (Macdonald 2000). One thing the various writing cultures have in common is the high number of non-official inscriptions, or graffiti, that were left in them. At the same time, these inscriptions are highly formulaic, with individual formulaic usage often being key to identifying script variants and their associated writing cultures (e.g. Winnett 1987; Prioletta 2022).

This panel aims to shed light on formulaic language use in this uniquely varied corpus from the Arabian Peninsula. Combining a meta-discussion of how to define a linguistic formula with more in-depth examination of variation and change within the formulaic usage of individual corpora, this panel will engage with the question of how we can leverage formulae to understand complex connections of continuity and change within the epigraphic record of the Arabian Peninsula and beyond.

Macdonald, Michael C.A. 2000. “Reflections on the Linguistic Map of Pre-Islamic Arabia.” Arabian Archaeology and Epigraphy 11:28–79.

Prioletta, Alessia. 2022. “The Inscriptions in Ancient South Arabian Script from Ḥimā: A Preliminary Historical and Cultural Appraisal.” Proceedings of the Seminar for Arabian Studies 51:271–82.

Winnett, F.V. 1987. “Studies in Ancient North Arabian.” Journal of the American Oriental Society 107 (2): 239–44.

PANEL B: Longrée, Dominique, Vanni, Laurent, Fascione, Sara, Rosa, Arianna & Thon, Valérie

Formulae in Latin Epistolography

(the abstracts of the panel presentations are listed in alphabetical order among other presentations)

Rodek, Ewa

The Role of Keywords in Building Sender-Receiver Relationships: A Case Study of Polish-Language Texts from 1600-1750

The sender-receiver relationship is most fully revealed in prefaces to literary works, as one of the primary functions of these texts is to establish a connection between the author and the reader. Keywords and other fixed elements play a special role in building this relationship. This phenomenon is clearly visible in Polish-language prefaces from the 17th century and the first half of the 18th century.

A corpus study of a collection of 150 prefaces addressed to anonymous readers (dedicatory prefaces to specific individuals were excluded) showed that the reader is referred to by fixed descriptors (kind, pious, noble), which appear at key points in the preface where the reader’s attention is at its peak: in the apostrophe, the incipit, paragraph beginnings, and the closing. Moreover, formulaic sequences include topoi, such as the topos of labor and humility, which are evoked through characteristic phrases and words (servant, labor) and their synonyms (servant, lucubration). The prefaces also feature other conventional strategies that evoke formulaic language. Among them are direct addresses to the reader, which, in real life, almost never appeared in formal situations at that time. Even children addressed their parents, and wives their husbands, with appropriate titles, and certainly strangers addressed each other similarly. Therefore, this device must be recognized as having an important and stable function in the text: the function of building a good, friendly relationship with the reader.

These terms used for the reader, words consistently used in the function of topoi, and other conventional expressions should be considered as keywords understood as thematic words (Knights 2010), activating a familiar schema and thereby facilitating the reception and interpretation of the content, including that of the main work itself. These words open the space for cooperation between the author, who recommends their work, and the reader, who may read it favorably or with reluctance. Additionally, their placement within the text clearly serves an organizing and unifying function, as is often emphasized by the authors of prefaces themselves.

The possibility for readers of prefaces to decode the meaning embedded in these keywords encourages consideration of a cultural definition of keywords (Wierzbicka 1997: 8), as they express values common to contemporary Polish society.

In this presentation, I will illustrate with specific, clear examples how keywords built the relationship between senders and receivers in the 17th century and the first half of the 18th century. I will also present the functions of the constants of individual elements and motifs, examined using corpus-based and pragmalinguistic methods. In conclusion, I would like to emphasize that a holistic approach (Gatos et al. 2006) is essential in analyzing keywords and formulaic expressions, allowing us to achieve a multidimensional portrayal of a particular historical reality.

Wierzbicka A. 1997. Understanding Cultures through Their Key Words: English, Russian, Polish, German, and Japanese. New York: Oxford University Press. 317 pp.

Gatos B., Konidaris T., Pratikakis I., Perantonis S. 2006. A Holistic Methodology for Keyword Search in Historical Typewritten Documents. Proceedings of the 4th Hellenic Conference on Advances in Artificial Intelligence. https://doi.org/10.1007/11752912_52

Knights M. 2010. Towards a Social and Cultural History of Keywords and Concepts by the Early Modern Research Group. History of Political Thought, Vol. 31, No. 3 (Autumn 2010), pp. 427-448. https://www.jstor.org/stable/i26224138

Roldão, Filipa & Serafim, Joana

Formulaic Language in Portuguese Municipal Charters of the Middle Ages: A Historical and Linguistic Analysis

The recent electronic edition of the earliest municipal charters granted by Portuguese monarchs to local communities—known as forais—primarily from the 12th and 13th centuries, offers a corpus of over four hundred documents in Latin and in vernacular, providing valuable resources for various disciplines, including History, Diplomatics, Philology, and Linguistics. Formulaic expressions were extensively employed not only to meet the diplomatic conventions of chancery documents but also to standardize juridical clauses on various subjects addressed within these forais, including, among others, municipal governance, economic regulation, fiscal issues, and judicial matters. This standardization of content across these areas allows for a clear identification of municipal charters that share similarities, as well as those that diverge, resulting in distinct textual traditions. Historians have concluded that certain forais served as templates, replicated whenever a community petitioned the royal chancery for a municipal charter. Approximately three primary 'document families' or models have been identified within this corpus. However, this interpretation has traditionally relied on historical analysis, often overlooking a linguistic approach to the data. This paper seeks to identify and analyse the formulaic language present in these documents by integrating historical and linguistic approaches. Using the CollateX program, the study will compare textual data to examine the topics where formulaic language is employed, as opposed to unique information specific to individual communities or instances of direct speech. Notably, even within these seemingly spontaneous or less formal passages, formulaic expressions persist. This analysis also seeks to illuminate the contextual choices between Latin and the vernacular in these records. 
Municipal charters granted by the earliest Portuguese kings continued to be copied and reproduced throughout the Middle Ages, until the 15th century, as evidenced by recent electronic editions. Considering both historical and linguistic perspectives, this paper will finally reflect on the extent to which formulaic language contributed to the survival of these documents over such an extended period.
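As a rough illustration of the kind of comparison described above, the sketch below aligns two witnesses token by token and keeps the runs of wording they share. It is not the CollateX workflow itself: it substitutes Python's standard `difflib` module, and the two sample strings, the function name `shared_runs` and the `min_len` threshold are all invented for the example.

```python
from difflib import SequenceMatcher


def shared_runs(witness_a, witness_b, min_len=3):
    """Return runs of at least `min_len` consecutive tokens common to both texts.

    This is a stand-in for a collation tool: it aligns two whitespace-tokenized
    texts and reports the matching blocks, i.e. candidate shared formulae.
    """
    a = witness_a.lower().split()
    b = witness_b.lower().split()
    # autojunk=False keeps frequent tokens (e.g. articles) in the alignment.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        " ".join(a[m.a:m.a + m.size])
        for m in matcher.get_matching_blocks()
        if m.size >= min_len
    ]


# Invented placeholder texts, NOT taken from any charter.
charter_a = "in the name of god i the king grant this charter to the men of alpha"
charter_b = "in the name of god i the queen grant this charter to the women of beta"

for run in shared_runs(charter_a, charter_b):
    print(run)
```

Tokens outside the reported runs (here the grantor and the community) are where the community-specific information would be expected to sit.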

Electronic edition: https://deti-iforal.ua.pt/documents

Roldão, F. & Serafim, J. (2021), Os mais antigos forais régios portugueses: uma proposta de estudo e de edição, Poder y Poderes en la Edad Media (Monografía de la Sociedad Española de Estudios Medievales, 16). Coord. Raquel Martínez Peñín y Gregoria Cavero Domínguez, Murcia, Sociedad Española de Estudios Medievales, 375-386. ISBN: 978-84-17865-93-1.

Silvestre, J. P., Pacheco, O., Sousa, J., Roldão, F., & Serafim, J. (2024). A edição digital de forais medievais portugueses com o suporte de um sistema de edição colaborativa em base de dados. Diacrítica, 38(1), 130–145. https://doi.org/10.21814/diacritica.5602

Rosa, Arianna & Thon, Valérie (part of panel B)

Letters Across Time: a Diachronic Study of the Epistolary Formulas in Cicero, Jerome and Peter Damian.

The epistolary genre has its roots in Antiquity and arose from a strong need for communication. Its tone and style vary, however, according to the addressee and the nature of the letter, whether private, administrative, political, consolatory, etc. Despite its different forms and modalities, the epistolary genre also has fixed and formulaic characteristics: the inscriptio with the abbreviation of the name and title of the addressee, as well as the initial and final formulas of salutatio to the addressee.

The goal of our presentation is to explore this formulaic nature of the epistolary genre and its possible developments over time from a diachronic point of view. We will move through the centuries, using examples found in the letter collections of some of the most important epistolary authors: Cicero for the Republican age, Saint Jerome for Late Antiquity and Peter Damian for the Central Middle Ages. Do the epistolary formulas evolve over time? If they do, are these variations related to the socio-cultural context of the author or to linguistic phenomena particular to the Latin language? In other words: can formulaicity, despite its fixed nature, also be subject to change? To answer these questions, we will also explore our corpus using an innovative method: Hyperbase, software developed by the LASLA which enables a statistical and quantitative survey of the Latin language.

Salemenou, Maroula

Diplomatic correspondence in the corpus Demosthenicum: an evaluation of authenticity

This paper combines quantitative and qualitative approaches in order to evaluate the issue of authenticity in the diplomatic correspondence cited in Demosthenes’ speeches. By the quantitative approach I mean the examination of the standardised or formulaic parts of these documents; by the qualitative approach, a focus on the genre of the letters and the way this influences their content and style. The way in which all documents in the corpus Demosthenicum have survived makes it inevitable that speculation is inherent in any discussion of them. Nonetheless, I contend that the study of the formulaic language in the diplomatic correspondence in Demosthenes, which has been overlooked or misunderstood in secondary literature, can provide some grounds for understanding the origin and for defining the degree of authenticity in each such document.

Scapini, Elia & Iezzi, Federico

Θεὸν ἐκ θεοῦ: a case study for semantic retrieval in Ancient Greek

In this joint paper, we present a search tool for stereotyped formulations in Ancient Greek that tolerates variation in wording as long as the meaning is preserved. As part of the ITSERR (Italian Strengthening of ESFRI RI Resilience) infrastructure dealing with the research and development of digital tools for the Digital Humanities, particularly Religious Studies, WP4 DaMSym (Data Mining applied to the Nicene-Constantinopolitan Symbol) uses the creed of Nicaea and Constantinople as a case study and examines it in its various languages of ancient translation (Ancient Greek, Latin, Coptic, Arabic, Sanskrit, Church Slavonic). Our research starts from the fact that the expressions God from God, Light from Light, true God from true God are stereotypical formulations describing an x-from-x causality, where the cause reproduces itself (Barnes 2001). Although these stereotypical formulations run through the 4th century in various forms, rule-based tools for verbatim retrieval such as the TLG do not allow us to collect all the possible x's that go into x-from-x formulations. Indeed, in addition to θεὸν ἐκ θεοῦ, φῶς ἐκ φωτός, θεὸν ἀληθινὸν ἐκ θεοῦ ἀληθινοῦ, in the synodical documents of the 4th century and in the writings of many church authors of this period we also find expressions such as ζωὴν ἐκ ζωῆς, ὅλον ἐξ ὅλου, μόνον ἐκ μόνου, τέλειον ἐκ τελείου, βασιλέα ἐκ βασιλέως, κύριον ἀπὸ κυρίου etc., which cannot be returned by rule-based search tools. To address this deficiency in the state of the art, we have built and will make public a machine learning-based semantic retrieval tool for Ancient Greek that ranks the phrases in a corpus by vector similarity with the query sentences given as input. More than one query phrase can be supplied; all of them are embedded as points in a multi-dimensional space and can thus be related to the corpus expressions closest to them.
We therefore present the first benchmarks of our work by discussing which encoder proves best suited for the purpose, show the sister project for Latin and the intention to combine the two systems into one, suggest the best strategies to exploit this tool to colleagues who might want to make use of it, and list the improvements we plan to make in the future.
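The ranking step described above can be sketched in a few lines. The sketch assumes that an encoder has already turned each phrase into a vector; the three-dimensional vectors, the phrase labels, and the choice to score each corpus phrase by its best match against any query are all assumptions made for illustration, not details of the tool presented here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rank_corpus(query_vectors, corpus_vectors):
    """Order corpus phrases by their highest cosine similarity to any query.

    `corpus_vectors` maps a phrase label to its embedding. Taking the maximum
    over queries is one possible way to combine several query phrases.
    """
    scored = [
        (max(cosine(q, vec) for q in query_vectors), phrase)
        for phrase, vec in corpus_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [phrase for _, phrase in scored]


# Invented toy embeddings; a real encoder would produce hundreds of dimensions.
queries = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0]]
corpus = {
    "x-from-x candidate": [0.8, 0.1, 0.1],
    "unrelated phrase": [0.0, 1.0, 0.0],
    "partly similar phrase": [0.5, 0.5, 0.0],
}

print(rank_corpus(queries, corpus))
```

The quality of such a ranking depends entirely on the encoder that produces the vectors, which is why the choice of encoder is the focus of the benchmarks.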

Schironi, Francesca

Formulae and Formulaic Language in Hellenistic Greek Astronomy

During the Hellenistic period scientific Greek prose develops, as especially attested in Euclid’s Elements, which becomes the model to express mathematical content in all disciplines that the Greeks defined as ‘mathematical sciences’: geometry, arithmetic, mechanics, astronomy, harmonics, optics. While the language of geometry has been studied,[1] the other branches of mathematical sciences still need further research. In this talk I will discuss some examples of formulaic language in Hellenistic Greek astronomy, focusing on Hipparchus of Nicaea (2nd cent. BCE) and some Hellenistic papyri I am studying. Hipparchus is the most important Hellenistic astronomer, who, among other things, discovered the precession of the equinoxes. Despite his importance, almost all of his works have been lost, due especially to the success of Ptolemy, whose Mathematike Syntaxis or Almagest supplanted all the previous treatises on mathematical astronomy. Only Hipparchus’ Exegesis to Eudoxus’ and Aratus’ Phaenomena has survived: a ‘polemical’ commentary in which Hipparchus develops a detailed critique of Eudoxus’ Phaenomena and above all of Aratus’ Phaenomena. My study on this work has shown that Hipparchus was not only an innovator in the science of astronomy and its methodology: he was also an innovator in creating a scientific language to express astronomical concepts. His linguistic strategy consisted both in building a scientific lexicon in which terms were used in a very specific way and in manipulating Greek syntax to turn it into a vehicle to express precise data.

Technical lexicons tend to be standardized, economical, and concise,[2] avoiding polysemy and synonymy. Indeed, the astronomical lexicon used by Hipparchus applies this strategy to the point of becoming a monosemous lexicon, while being at the same time a quite clear and etymologically ‘transparent’ lexicon.[3] However, in the Exegesis Hipparchus also uses what we might term ‘fixed formulas’; for example, in naming certain constellations his strategy seems to have been aimed at avoiding confusion between the names of similar constellations. In addition, in the last section of his Exegesis, which consists of his Catalogue of Simultaneous Risings and Settings, Hipparchus conveys plenty of technical data in an organized and logical structure using continuous prose, which is poorly suited to lists of scientific data, rather than tables or bullet points as modern scientists do. This is achieved through three main stylistic tools: 1) a formulaic structure which uses almost always identical phrases, or phrases with little variation; 2) a syntax reduced to the minimum, so that the reader’s attention is focused on the data themselves; and 3) topicalization. These tools make the resulting prose formulaic to such an extent that the reader soon learns what to expect and can focus on the individual data. This procedure is consistent with, and even further develops, the communication strategies attested in other areas of Greek scientific language, which often makes use of formulas as a way to help either memorization or learning.[4] On the other hand, the astronomical language attested in one famous Hellenistic astronomical papyrus (PParis 1, ca. 165 BCE) shows knowledge but also a partial misunderstanding of syntactic formulas of Greek mathematics. This reflects a lower level of accuracy and familiarity with the technical discipline and its language, in line with the type of text preserved by PParis 1 (a general handbook addressed to non-professionals).

[1] Michel Federspiel wrote a series of articles on the language of Greek mathematics between 1992 and 2006; see also Netz 1999; Acerbi 2021.

[2] For a discussion of technical languages in the Graeco-Roman world, see Langslow 2000, 6-28; Fögen 2003; Willi 2003, 66 and 69.

[3] See Schironi 2024.

[4] See Aujac 1984; Netz 1999, 127-167.

Acerbi, F., 2021, The Logical Syntax of Greek Mathematics,

Aujac, G., 1984, ‘Le Langage Formulaire Dans La Géométrie Grecque’, Revue d'histoire des sciences, pp. 97-109.

Fögen, T., 2003, ‘Metasprachliche Reflexionen Antiker Autoren Zu Den Charakteristika Von Fachtexten Und Fachsprachen’, in Antike Fachschriftsteller: Literarischer Diskurs Und Sozialer Kontext, ed. by Horster, M. and Reitz, C., Stuttgart, pp. 31-60.

Langslow, D.R., 2000, Medical Latin in the Roman Empire, Oxford.

Netz, R., 1999, The Shaping of Deduction in Greek Mathematics: A Study in Cognitive History, Cambridge ; New York.

Schironi, F., 2024, ‘The Language of Hellenistic Astronomy’, in Coming to Terms. Approaches to (Ancient) Terminologies, ed. by Asper, M., Berlin - New York, pp. 11-39.

Willi, A., 2003, The Languages of Aristophanes: Aspects of Linguistic Variation in Classical Attic Greek, Oxford - New York.

Soffiantini, Laura

Pliny’s formulaic language in geographical books

Books 2-6 of Pliny's Naturalis Historia represent the most extensive surviving Latin geographical work from antiquity. Drawing on earlier Greek and Latin sources, Pliny undertook an ambitious synthesis of the known world from Gibraltar to India unmatched by any prior Latin author. In accomplishing this monumental task, Pliny confronted the challenge of developing appropriate language to describe geographical space (Pinkster, 2005). Previous studies have shown that scientific languages, including scientific Latin, exhibit high levels of formulaicity, employing defined terminology and linguistic constructs in (semi)fixed structures to express specific concepts (Langslow, 2000; Netz, 2003). Using a corpus-driven approach, recent research (Fantoli, 2020; Fantoli & Soffiantini, 2023) has demonstrated that Pliny’s language shows formulaic aspects and has tested various methods of pattern extraction on the Naturalis Historia.

In my presentation, I will investigate Pliny’s formulaic language in the geographical books with two main objectives: presenting potential strategies of formula retrieval and analyzing the extracted formulae. After text pre-processing, which includes masking all proper nouns and numerals to reduce variability, two methods will be employed. The first method (1) involves the extraction of n-grams, while the second method (2) identifies longer formulaic patterns containing free slots (also known as non-continuous formulae). In the second method, the text is represented as a graph where each node represents a word, and consecutive words are connected by edges. Preliminary analysis performed on book 4 has revealed recurring bigrams such as ex adverso, in longitudinem, a septentrione, a meridie, ab oriente, which are used to convey orientational indications. Furthermore, by analyzing the structure of the network resulting from (2), it was possible to establish that the text contains various combinations where the preposition ab is followed by a place name two slots later. The intervening slot can be filled by different tokens such as eo, ea, oppido which, being governed by ab, may indicate the starting point of the geographical description.
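The two extraction methods can be sketched as follows. This is a minimal illustration under stated assumptions: the input is taken to be a list of (form, part-of-speech) pairs from some upstream tagger, the tag names `PROPN` and `NUM`, the mask symbols and all function names are hypothetical, and the sample sequence is an invented string of tokens rather than a passage of Pliny.

```python
from collections import Counter


def mask(tagged_tokens):
    """Replace proper nouns and numerals with placeholder symbols."""
    masked = []
    for form, tag in tagged_tokens:
        if tag == "PROPN":
            masked.append("<NAME>")
        elif tag == "NUM":
            masked.append("<NUM>")
        else:
            masked.append(form.lower())
    return masked


def ngrams(tokens, n):
    """Method (1): count contiguous n-grams."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def build_graph(tokens):
    """Method (2): one node per word, one weighted edge per adjacent pair."""
    graph = {}
    for left, right in zip(tokens, tokens[1:]):
        graph.setdefault(left, Counter())[right] += 1
    return graph


def gap_fillers(graph, start, end):
    """Words w for which both edges start -> w and w -> end are attested.

    Note: the two edges may come from different passages, so the candidates
    should be checked against the text before being treated as formulae.
    """
    return sorted(w for w in graph.get(start, {}) if end in graph.get(w, {}))


# Invented token sequence for illustration only.
sample = [
    ("ab", "ADP"), ("eo", "PRON"), ("Alpha", "PROPN"),
    ("ab", "ADP"), ("oppido", "NOUN"), ("Beta", "PROPN"),
    ("in", "ADP"), ("longitudinem", "NOUN"), ("XX", "NUM"),
    ("ab", "ADP"), ("ea", "PRON"), ("Gamma", "PROPN"),
]

tokens = mask(sample)
bigrams = ngrams(tokens, 2)
graph = build_graph(tokens)
print(gap_fillers(graph, "ab", "<NAME>"))
```

Masking is what lets the three different place names collapse into one `<NAME>` node, so that the pattern ab + slot + place name becomes visible in the graph.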

Fantoli, M. 2020. “Res ardua uetustis nouitatem dare, nouis auctoritatem”: Étude contrastive des enjeux linguistiques et communicatifs du deuxième livre de la Naturalis Historia de Pline l’Ancien [PhD Thesis]. ULiège - Université de Liège.

Fantoli, M. & Soffiantini, L. 2023. “Formulaic Language in Latin non-literary texts: computational approaches.” ICLL 2023 22nd International Colloquium on Latin Linguistics. Prague, June 19–23, 2023.

Langslow, D. R. 2000. Medical Latin in the Roman Empire. Oxford: Oxford University Press.

Netz, R. 2003. The Shaping of Deduction in Greek Mathematics: A Study in Cognitive History. Cambridge: Cambridge University Press.

Pinkster, H. 2005. “The language of Pliny the Elder.” In Aspects of the Language of Latin Prose, eds. T. Reinhardt, M. Lapidge, and J. N. Adams, 239–256. Oxford.

Stenroos, Merja

Formulaicity and the individual voice in late medieval English legal statements

Legal documents present a paradoxical picture as historical linguistic evidence. On the one hand, they may be quite precisely dated and localized, and refer to specific individuals and events; some types of documents may also contain individual, even personal content. On the other hand, they tend to be heavily formulaic. This complexity is especially notable in the type of documents that we might categorize as legal statements. These are text types written in the first person and typically conveying a statement or commissive, including attestations, affidavits and vows of allegiance, as well as, in the ecclesiastical sphere, confessions and abjurations; testaments and wills may also be considered to belong to this category, as may receipts. Such documents reflect to a varying extent the voices of the person making the statement and the scribe drawing up the document; to what extent the formulaic content reflects the language of either is a challenging question, to which there can be no single answer. The proposed paper addresses this question with reference to late medieval English documentary materials, and argues that their multilingual and linguistically fluid context makes present-day concepts of formulaicity problematic.

The formulae found in late medieval English legal statements can seldom be described in terms of chunks of identical phrasing. Rather, the formulae make up a wide range of variants expressing approximately the same content but differing in phrasing, length, morphology and spelling. This variability, which has considerable implications for the linguistic study of the texts, reflects both the multilingual context of medieval English administration and the lack of standard models of writing. From the fifteenth century, English was increasingly acceptable as a medium of legal documents; at this point, however, written English had neither an established standard nor available conventions for administrative writing. As Latin continued to be the dominant administrative language, the varied usage in English documents probably reflects individual translations from Latin templates, whether memorized or actually consulted.

The proposed paper explores the different kinds of formulae found in fifteenth- and early sixteenth-century English legal statements and problematizes the concepts of both formulaicity and individual voice in these materials. It argues that a focus on formal identity is of limited use for understanding the interaction of conventions, registers and voices in these early materials, and suggests a more flexible approach to their study, with multiple levels of formulaicity and voice. The empirical material is drawn from A Corpus of Middle English Local Documents (MELD; 2017-), and the study will combine an overview of formulaicity in the statement category with a focussed discussion of two or three types of statement. In order to present the data and explain the approach, I would be very happy to have the extra 10 minutes suggested in the Call for Papers, but will be able to scale the paper as needed.

MELD = A Corpus of Middle English Local Documents. Version 2017.1. Compiled by Merja Stenroos, Kjetil V. Thengs & Geir Bergstrøm. University of Stavanger. www.uis.no/meld

Vatri, Alessandro

Aristotle’s ‘diagrammar’: Formulaicity and multimodality in the Organon

The treatises that compose Aristotle’s Organon contain the earliest systematic treatment of (what we would call) formal logic, a discipline which lends itself both to symbolic representation and to visualization, as medieval mnemonics and study aids testify. The intrinsic multimodal character of Aristotle’s formulations is revealed by his use of denotative letters. Rather than standing for logical variables (as interpreters have often — anachronistically — thought), these have been shown by Netz to be used in the same way as they are in discussions of geometrical objects and problems — an idea that is suggestive of possible oral and visual classroom practices.

This, however, is not the only point of contact between Aristotle’s logic and contemporary geometry. As this paper will show, Aristotle’s logical metalanguage displays striking correspondences with that of mathematical texts, conditioned as both disciplines were by the necessity to express verbally abstract relations in a consistent and rigorous manner in the absence of graphic symbols. Similarly to Greek mathematicians, Aristotle uses formulaic elements to express what modern logicians would express symbolically. Such elements include lexical items (e.g. quantifiers, deontics, etc.), connectives, syntactic patterns and constructions (e.g. kata tinos huparkhein ‘to apply as a predicate’). One of the measurable consequences of the formulaic character of Aristotle’s logic is its significantly low lexical variety in comparison both with other Aristotelian texts (even though discourse-structuring formulae — e.g. phaneron esti ‘it is clear that…’ — are conspicuous in the corpus) and samples of Greek prose of different genres. [As computed from the digital texts included in the Diorisis Ancient Greek Corpus (https://figshare.com/articles/dataset/The_Diorisis_Ancient_Greek_Corpus/6187256).]
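The abstract does not say which measure of lexical variety was computed. One common option, shown here purely as an assumption, is a standardized type-token ratio: the share of distinct word forms is computed over consecutive windows of equal size and then averaged, so that texts of different length remain comparable. The function name and window size are illustrative.

```python
def standardized_ttr(tokens, window=1000):
    """Mean type-token ratio over consecutive, non-overlapping windows.

    Tokens left over after the last full window are ignored, so every
    window contributes a ratio computed over exactly `window` tokens.
    """
    chunks = [
        tokens[i:i + window]
        for i in range(0, len(tokens) - window + 1, window)
    ]
    if not chunks:
        raise ValueError("text is shorter than one window")
    return sum(len(set(chunk)) / window for chunk in chunks) / len(chunks)


# Toy input: the first window repeats two forms, the second has four distinct forms.
toy = ["a", "b", "a", "b", "c", "d", "e", "f"]
print(standardized_ttr(toy, window=4))
```

On lemmatized input, a lower value for one text than for comparison samples of the same window size would be one way of quantifying a more repetitive, formulaic vocabulary.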

Netz, R. 2009. Ludic Proof. Greek Mathematics and the Alexandrian Aesthetic. Cambridge.

Netz, R. 2022. A New History of Greek Mathematics. Cambridge.

Netz, R. 2023. Aristotle’s Three Logical Figures: A Proposed Reconstruction. Phronesis 68, 62–77.

Schironi, F. 2010. Technical Languages: Science and Medicine, in E. J. Bakker (ed.), A Companion to the Ancient Greek Language, Oxford, 338–53.

Schironi, F. 2019. Naming the Phenomena. Technical Lexicon in Descriptive and Deductive Sciences, in A. Willi/P. Derron (eds), Formes et fonctions des langues littéraires en Grèce ancienne, Entretiens Hardt lxv, Vandœuvres, 227–58.

Vezzosi, Letizia & Rosselli Del Turco, Roberto

Poetic formulas in the Germanic literatures of the Middle Ages: semantic annotation and analysis

Since Magoun’s 1953 article, which was based on the theories of Milman Parry (Parry 1932), studies on poetic formulas in Old English and other Germanic languages have flourished, quickly becoming one of the most intriguing topics in literary research within the field of medieval Germanic cultures. The oral-formulaic composition system used in Germanic languages not only reveals significant connections with the Indo-European tradition, but also developed in a particularly sophisticated manner. Combined with other stylistic devices such as kennings and imaginative compounds, formulas play a fundamental role in the compositional process: they serve both as a mnemonic aid for memorization and subsequent recitation, and as an essential tool for the Germanic poet.

In the past, notable research employing quantitative methods has led to a general assessment of the relevance of formulas in Germanic poetic texts (Green 1971). More recent projects, such as the CLASP project (Orchard 2018), allow for identifying formulas in poetic texts and providing a hyperlinked list to assess the formula’s usage in context. However, we believe it is crucial to combine these research methods with a data preparation process that considers the semantic dimension. This would allow us to evaluate not only the context, frequency, and other statistical aspects of the formulas, but also their fundamental structure, the way this structure changes over time, and the specific usage patterns of individual authors, particularly their ability to innovate upon the pre-established repertoire (Russom 1978).

To achieve this goal, we intend to code sample texts in multiple Germanic languages using XML/TEI schemas, developing a flexible encoding model suited for cross-referencing within marked texts. A sufficiently rich level of annotation allows the TEI document to function as a repository of information that can be processed not only for text visualization, but also to generate new knowledge through analytical tools that enable complex searches and comparisons (Rosselli Del Turco 2021). The first step is to define an encoding model that takes into account the structural characteristics of the poetic formulas. This will also make it possible to establish a typology so that the sample texts used for the proposed research can be analyzed and compared in detail.
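To make the idea of a queryable annotation concrete, the sketch below parses a small TEI-style fragment and groups marked formulas by the type they point to. The encoding shown is only a hypothetical model, not the one to be proposed: it assumes that each formula is wrapped in a `seg` element with `type="formula"` and linked to a formula type through `ana`, and the verse text is placeholder wording rather than a quotation from any poem.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical encoding with placeholder text.
SAMPLE = """
<lg xmlns="http://www.tei-c.org/ns/1.0">
  <l n="1"><seg type="formula" ana="#F1">placeholder phrase one</seg> filler</l>
  <l n="2">filler <seg type="formula" ana="#F2">placeholder phrase two</seg></l>
  <l n="3"><seg type="formula" ana="#F1">placeholder phrase one variant</seg></l>
</lg>
"""


def formulas_by_type(xml_text):
    """Map each formula type (value of @ana) to its (line number, text) instances."""
    root = ET.fromstring(xml_text.strip())
    index = defaultdict(list)
    for line in root.iter(TEI_NS + "l"):
        for seg in line.iter(TEI_NS + "seg"):
            if seg.get("type") == "formula":
                # Join nested text nodes and normalize whitespace.
                text = " ".join("".join(seg.itertext()).split())
                index[seg.get("ana")].append((line.get("n"), text))
    return dict(index)


for formula_type, instances in formulas_by_type(SAMPLE).items():
    print(formula_type, instances)
```

Grouping instances under a shared type identifier is what would allow variants of one formula to be compared across lines, texts and languages.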

Orchard, Andy. 2018-. “CLASP: A Consolidated Library of Anglo Saxon Poetry.” Accessed September 25, 2021. https://clasp.ell.ox.ac.uk/.

Green, Donald C. 1971. “Formulas and Syntax in Old English Poetry: A Computer Study.” Computers and the Humanities 6 (2): 85–93. https://www.jstor.org/stable/30199462.

Magoun, Francis P. 1953. “Oral-Formulaic Character of Anglo-Saxon Narrative Poetry.” Speculum 28 (3): 446–67. https://doi.org/10.2307/2847021.

Parry, Milman. 1932. “Studies in the Epic Technique of Oral Verse-Making: II. The Homeric Language as the Language of an Oral Poetry.” Harvard Studies in Classical Philology 43:1–50. https://doi.org/10.2307/310666.

Rosselli Del Turco, Roberto. 2021. “Elaborazione di dati semi-strutturati: ipotesi implementative e casi d’uso tratti da testi in inglese antico.” Umanistica Digitale, no. 10 (September), 387–407. https://doi.org/10.6092/issn.2532-8816/12598.

Russom, Geoffrey R. 1978. “Artful Avoidance of the Useful Phrase in ‘Beowulf’, ‘The Battle of Maldon’, and ‘Fates of the Apostles.’” Studies in Philology 75 (4): 371–90. http://www.jstor.org/stable/4173979.

Wong, Catherine, Fitzmaurice, Susan & Lam, Benson SY

Tracing Formulaic Patterns and Language Change in Early Modern English: A Quantitative and Computational Approach

This study examines how formulaic expressions reveal linguistic change in Early Modern English, focusing on religious and institutional language. It adopts a data-driven approach (Buerki, 2019, 2020; Hilpert & Cuyckens, 2015), which affords a broad overview of trends while also highlighting synchronic differences within the period. Using NLP techniques – including n-gram and temporal analysis – applied to the EEBO-TCP and ECCO-TCP corpora, the study tracks multi-word expressions (MWEs) across yearly and decadal intervals to explore trends leading up to Late Modern English. This approach enables a diachronic analysis of linguistic shifts while capturing the predominance of religious discourse and its gradual blend into institutional language.

In this pilot study, n-gram analysis was applied to over 40,000 documents (approximately two-thirds of the two corpora) to extract meaningful n-grams and identify MWEs. These expressions were examined using both frequency-based and statistical collocation measures. Temporal analysis revealed patterns of language variation over time, particularly in religious discourse, providing insights into historical changes.
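The frequency side of such a temporal analysis can be sketched as follows. The sketch assumes documents are available as (year, text) pairs and uses plain whitespace tokenization; the three sample documents are invented, the function names are illustrative, and the statistical collocation measures mentioned above are not covered.

```python
from collections import Counter, defaultdict


def decade(year):
    """Map a year to the first year of its decade, e.g. 1615 -> 1610."""
    return year // 10 * 10


def ngrams_by_decade(documents, n=3):
    """Count n-grams separately for each decade of publication."""
    counts = defaultdict(Counter)
    for year, text in documents:
        tokens = text.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[decade(year)][tuple(tokens[i:i + n])] += 1
    return counts


# Invented sample documents, not drawn from EEBO-TCP or ECCO-TCP.
documents = [
    (1611, "the lord jesus christ be with you"),
    (1615, "in the name of the lord jesus christ"),
    (1652, "by order of the church of england"),
]

counts = ngrams_by_decade(documents, n=3)
for dec in sorted(counts):
    print(dec, counts[dec].most_common(1))
```

In practice the raw counts would need to be normalized by the number of tokens per decade before any trend is read off them, since the corpora are not evenly distributed over time.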

Religious utter sequences (USs) such as ‘lord jesus christ’, ‘lord thy god’, ‘father son holy spirit’, and ‘father son holy ghost’ dominate the data, functioning as verbal routines in prayer books and liturgy. Commandments like ‘thou shalt love thy neighbour’ also feature prominently, not as literal instructions but as fixed phrases lifted from canonical religious texts that reflect social and religious norms. These formulaic expressions, as Lakoff and Johnson (2008) argue, are phrasal lexical items constructed by ‘metaphorical concepts’. Rather than conveying novel meaning in specific contexts, they are contextually appropriate within religious doctrine, functioning as institutional utterances that reinforce faith and social order. This highlights the prescriptive nature of religious discourse during the period, where language served to reiterate and maintain established religious and societal structures.

In contrast, the multi-word unit ‘chief lord justice’ stands out in the mid-1600s as a non-religious salutation or address, marking the growing salience of secular social hierarchy, although few similar expressions appear. Collocations of ‘church’ offer further insight into the evolving relationship between religious and institutional discourse. While ‘god’ and ‘christ’ as collocates of ‘church’ prevail consistently throughout the period, ‘england’, ‘rome’, and ‘roman’ as collocates of ‘church’ begin to rise after the mid-1600s, peaking around 1690. Investigating these collocations provides a valuable perspective on how the term ‘church’ transitioned from a primarily religious concept to one embedded in civic and institutional language over the course of the 17th century.

This study contributes to the field of historical linguistics by illustrating how formulaic expressions reveal shifts in language related to both religious and institutional contexts during the Early Modern English period. By employing NLP techniques and a data-driven approach, the research highlights the significance of these expressions in tracking linguistic change and underscores their role in reflecting evolving social hierarchies and norms.

Buerki, A. (2019). Furiously fast: On the speed of change in formulaic language. Yearbook of Phraseology, 10(1), 5-38. https://doi.org/10.1515/phras-2019-0003

Buerki, A. (2020). Formulaic Language and Linguistic Change: A Data-Led Approach. Cambridge University Press.

Hilpert, M., & Cuyckens, H. (2015). How do corpus-based techniques advance description and theory in English historical linguistics? An introduction to the special issue. Corpus Linguistics and Linguistic Theory, 11(2), 141-150.

Lakoff, G., & Johnson, M. (2008). Metaphors We Live By (2nd ed.). University of Chicago Press.

Wong, Jorge

The Formulaic Template and Linguistic Innovation in Homer

This paper explores the connection between formulaic language and linguistic innovation in Homeric diction. Since the pioneering studies of Milman Parry, Homeric scholars have equated “formulaic” with “archaic” in the language of Homer, especially regarding Aeolic features, considered vestiges of an older phase of epic poetry. The argument is that the Homeric poets, compelled by the exigencies of extemporaneous oral composition, resorted to the constructions and collocations that they knew best, the ones they had heard from their own masters. Some resistance to this theory was offered by Hoekstra and Hainsworth, both of whom sought to find room for creativity in the Homeric formula either through the modification of formulas or a flexibility inherent in the formulas themselves. Recently, Nussbaum and Tate have pointed out examples of formulaic language not tied to specific formulas. These formulaic templates are abstract syntactic patterns that can generate surface outputs with no lexical items in common.

In this paper, I will show that these formulaic templates also encode specific dialect forms. Specifically, I investigate the distribution of some of the Aeolic features deemed most formulaic in Homer, such as the genitive in -οιο and dative in -εσσι. These endings usually appear either at the feminine caesura or at the end of the verse, two positions long thought to harbor archaisms in Homeric language. Using the framework of the formulaic template, I demonstrate that the poets of the Iliad and Odyssey perceived a relationship between certain parts of the verse and the different dialect endings, so much so that when introducing novel linguistic forms into the poetic dialect, they outfitted them with these Aeolic endings to fit certain caesuras, e.g. νέεσσι (cf. Aeolic νᾶεσσι and Ionic νηυσί) and ἐπέεσσι (cf. Aeolic ἔπεσσι and Ionic ἔπεσι). This same process led to the creation of mixed dialect formulas like line-final Μενελάου κυδαλίμοιο (14x), ὁμοιΐου π(τ)ολέμοιο (8x), and στυγεροῦ πολέμοιο (Δ 240, Ζ 330). More abstract syntactic patterns are reflected in parallel constructions like ἀπὸ βηλοῦ θεσπεσίοιο (Α 591) ~ ἀπὸ χαλκοῦ θεσπεσίοιο (Β 457), υἱὸς ὑπερθύμοιο Κορώνου Καινεΐδαο (Β 747) ~ υἷε δύω Λήθοιο Πελασγοῦ Τευταμίδαο (Β 843), and αὐτὰρ ὃ Ἰφίκλοιο πάϊς τοῦ Φυλακίδαο (Ν 698).

Hainsworth, J.B. 1968. The Flexibility of the Homeric Formula. Oxford.

Hoekstra, A. 1965. Homeric Modifications of Formulaic Prototypes. Amsterdam.

Nussbaum, A.J. 2018. “The Homeric Formulary Template and a Linguistic Innovation in the Epics.” In Language and Meter, edited by D. Gunkel and O. Hackstein. Leiden; Boston. pp. 267–318.

Parry, M. “Studies in the Epic Technique of Oral Verse-Making: II. The Homeric Language as the Language of an Oral Poetry.” Harvard Studies in Classical Philology 43 (1932): 1–50.

Tate, A. P. 2011. “Modularity and the Spectrum of Formularity in the Homeric Corpus.” Cornell Dissertation.

Yiftach, Uri

Teaching AI Greek Syntax: The Taxonomy of the Legal Document as an Experimental Platform

Greek lease contracts from Egypt regularly record the duties of the lessee for the duration of the contract. However, whereas the routine formulation is initially (in the Ptolemaic period and in the Roman Arsinoites) paratactical, with the act recorded in an independent clause, in contemporary Oxyrhynchos, and later on throughout Egypt, the duties of the lessee are recorded hypotactically, introduced through the semiconsequential or semifinal ἐπὶ τῷ, ὥστε, or ἐφʼ ᾧ, all with the infinitive, which is predominantly in the aorist tense. The semiconsequential clause is appended to the 'creation clause', the clause that records the act of lease per se, stressing the tight connection between the act of lease and the consequential duties. In my proposed paper I shall discuss three documents from second- and third-century Oxyrhynchos, namely P.Oxy. L 3596 (ca. 240-255 CE), P.Ross.Georg. II 19 (141 CE), and PSI XIII 1338 (299 CE), that exhibit both the paratactical and the hypotactical formulation. I will investigate the coexistence and respective positions of both types of duty clauses in the same documentary context.

Zilio, Leonardo & Arblaster, Paul

Exploring formulaic language in 17th-century Dutch-language newspaper articles

Journalism is one sphere in which formulaic language is common, since it creates communicative shortcuts that enable news to be conveyed quickly and to be easily fitted into existing mental categories. Europe’s first weekly newspapers began to be published in the early seventeenth century: in Germany from 1605, the Dutch Republic from 1618, and the Spanish Netherlands and England from 1620. These newspapers often reported the same news, sometimes on the basis of the same newsletters or simply by copying it from one another. The focus was on great public events: movements of fleets and armies, military engagements, arrivals and departures of ambassadors, the public life of royal families, and the publication of decrees or proclamations.

While the weekly production of thousands of words of text should in theory create a massive multilingual corpus, in practice survivals are patchy and unpredictable, and do not always overlap. One of the best-preserved news series of the early seventeenth century is that of the Nieuwe Tijdinghen, published in Antwerp in the years 1620-1629 (see Arblaster 2024). Its unusually good survival is due to some owners having issues bound into annual volumes to keep a year’s overview of the news (Pettegree 2015). Across collections in Amsterdam, Antwerp, Ghent, The Hague, London, and above all the Royal Library of Belgium in Brussels, most issues have survived and have been collectively catalogued both digitally [USTC N4-1 to N4-1460] and in print (Der Weduwen 2017). Many issues, particularly those held in Antwerp, Brussels and Ghent, have now been made available online, but only as images rather than text.

This study uses a relatively small sample, mostly transcribed in the late 1990s by Paul Arblaster while working on his doctoral thesis, which after revision was published as From Ghent to Aix: How They Brought the News in the Habsburg Netherlands, 1550-1700 (Arblaster 2014). With the aid of computational tools, we investigate linguistic features of this 51k-word corpus, focusing specifically on recurrent expressions, terminology and formulaic language. We use corpus linguistics tools such as AntConc (Anthony 2005) and Sketch Engine (Kilgarriff et al. 2008) to produce an initial overview of n-grams, collocations, keywords and multi-word terms, and then further organise and expand these lists using a semi-automatic approach.
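The n-gram overview mentioned above can be illustrated with a minimal sketch. It is not how AntConc or Sketch Engine work internally, and it is not the authors' workflow; it only shows the underlying idea of counting recurrent word sequences. The function names and the English sample sentences are invented for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def ngram_counts(tokens, n):
    """Count every contiguous sequence of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def recurrent_ngrams(texts, n=3, min_freq=2):
    """Return (n-gram, frequency) pairs seen at least min_freq times.

    N-grams are counted within each text, never across text boundaries.
    """
    counts = Counter()
    for text in texts:
        counts.update(ngram_counts(tokenize(text), n))
    return [(" ".join(gram), freq)
            for gram, freq in counts.most_common()
            if freq >= min_freq]


if __name__ == "__main__":
    # Invented sample reports, standing in for transcribed news items.
    sample = [
        "letters from vienna report that the emperor",
        "letters from vienna report that the fleet",
    ]
    for gram, freq in recurrent_ngrams(sample, n=3, min_freq=2):
        print(freq, gram)
```

A frequency list of this kind is only a starting point: as the abstract notes, the candidate expressions still need to be organised and checked by hand.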

Because of the historical spelling in these documents, the lists initially extracted with automatic tools need some manual validation: many words that are absent from modern corpora produce false positives, for instance in the analysis of keywords and terms. After a manual revision, Python scripts are employed to search for variant spellings by using edit-distance algorithms. We also briefly test whether tools based on generative artificial intelligence can help in detecting textual regularities and terminology in these historical documents. The resulting lists of expressions and terms are then further contextualised with the help of concordances and example sentences to form a repository of the formulaic language that is present in these historical newspaper articles.
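The edit-distance search for variant spellings can be sketched as follows. This is a minimal illustration assuming plain Levenshtein distance and a threshold of one edit; the abstract does not say which algorithm or threshold the authors' scripts use. The function names and the word list are hypothetical.

```python
from itertools import combinations


def levenshtein(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def variant_candidates(words, max_distance=1):
    """Pair up distinct word forms that lie within max_distance edits."""
    unique = sorted({w.lower() for w in words})
    return [(a, b) for a, b in combinations(unique, 2)
            if abs(len(a) - len(b)) <= max_distance
            and levenshtein(a, b) <= max_distance]


if __name__ == "__main__":
    # Hypothetical word forms, invented to illustrate spelling variation.
    forms = ["tijdinghe", "tijdinge", "tydinghe", "coninck", "coninc", "stadt"]
    for a, b in variant_candidates(forms):
        print(a, "~", b)
```

With a threshold of one edit, a pair differing by a digraph (such as "ij" against "y") is missed, because it costs two edits. Raising the threshold catches more variants at the price of more false matches, which is one reason the candidate pairs would still call for manual checking.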

Anthony, L. (2005, July). AntConc: design and development of a freeware corpus analysis toolkit for the technical writing classroom. In IPCC 2005. Proceedings. International Professional Communication Conference, 2005. (pp. 729-737). IEEE.

Arblaster, P. (2014). From Ghent to Aix: How They Brought the News in the Habsburg Netherlands, 1550-1700. Leiden: Brill.

Arblaster, P. (2024). Las noticias publicadas por Abraham Verhoeven en Amberes en 1621, in Manuel Borrego and Carmen Espejo-Cala (eds.) El mundo en 1621: Avisos, relaciones de sucesos, conexiones culturales. Besançon: Presses universitaires de Franche-Comté, 139-161.

Der Weduwen, A. (2017). Dutch and Flemish Newspapers of the Seventeenth Century, 1618–1700, vol. Leiden: Brill, 334-417.

Kilgarriff, A., Rychly, P., Smrz, P., & Tugwell, D. (2008). The sketch engine. Practical Lexicography: a reader, 297-306.

Pettegree, A. (2015). Tabloid Values: On the Trail of Europe’s First News Hound, in Richard Kirwan and Sophie Mullins (eds.) Specialist Markets in the Early Modern Book World. Leiden: Brill, 15-34.

