Abstracts
Participants and titles
- Bentein, Klaas: Studying ancient interaction ritual: a comparison of two Greek corpora
- Brown, Joshua: Formulae as a source of linguistic innovation in Renaissance Italian: the salutatio and conclusio as loci of language contact
- Chioni, Irene: Request Formula Variations in Greek Petitions from Roman Egypt: Typological Classification and Sociolinguistic Interpretation
- Cook, Samuel Peter: Cross-linguistic formulae and the study of contact-induced language change: the case of Greek and Coptic legal formulae from 6th–8th century Egypt
- Del Grosso, Sarah: Formulaic language in translations in the period after the French Revolution and in the Napoleonic era
- di Bartolo, Giuseppina & Marchesi, Beatrice: Negative concord in Postclassical Greek: Impact and functions of formulaic expressions in documentary papyri
- Di Pasquale, Daniele: Shifts and Standardization of Language Patterns in Korean Old Vernacular Epistles (언간, 諺簡, Ŏn'gan)
- Drigo, Jasmim: Calques or Formulaic Expressions? An analysis of calques in Early Irish religious language
- Elalfy, Doaa: The Canonization and Transmission of Collyria (اشياف) in Greek and Arabic Medical Traditions: Lexical Shifts and Formulaic Patterns
- Elder, Claire: The Fermesse, Fidelity and Faith: A Pragmatic Analysis of the Formulaic Symbols in an Early Modern Scottish Community of Practice
- Fantoli, Margherita & Korkiakangas, Timo: Exploring formulaic language in dependency treebanks using network analysis
- Fascione, Sara: Formularity and idiosyncrasy in Fronto’s letter headings (panel B: Formulae in Latin Epistolography)
- Fezer, Katharina: Tracing and comparing formulae in printed and handwritten texts: Methods, issues, challenges
- Giannikou, Kyriaki: Assessing and Reassessing Formulaicity: are editorial practices a blessing or a curse?
- Ginevra, Riccardo, Biagetti, Erica, Brigada Villa, Luca & Zanchi, Chiara: Comparing Indo-European Poetic Languages: How to Combine Construction Grammar and Digital Resources for the Analysis of Formulaic Phraseology in Vedic Sanskrit and Homeric Greek (30 min)
- Groot, Hester: Identity construction and genre shift through formulaic language in Scottish pauper letters, 1750-1900
- Große, Sybille: Formulaicity in French letters: function and acquisition in theory and empiricism
- Honkanen, Saara: Formulaicity in Medieval Latin Historical Prose: the Case of Freculf of Lisieux
- Iezzi, Luca: The pragmatic usage of formulae: Evidence from the Datini Archive (1382-1402)
- Jauhiainen, Tommi
- Kaislaniemi, Samuli: Address formulas and material practices in seventeenth-century English letters (30 min)
- Kayachev, Boris: ‘Roses are red and violets are blue’: poetic language between formulaicity and intertextuality (the case of purpureus)
- Kootstra-Ford, Fokelien: Formulaic variation: Leveraging formulaic language to understand linguistic variation in Dadanitic inscriptions (6th–1st c. BCE) (panel A: Shared formulae, continuity, and change in the epigraphy of Northern Arabia)
- Kopaczyk, Joanna: To be announced
- Koroli, Aikaterini: Stereotypicality and variation in Greek private papyrus letters: a focus on stereotypical directive speech-acts
- Longrée, Dominique & Vanni, Laurent: New Ways to identify Formulaic Expressions in Latin Epistolography: Between Statistics and AI (panel B: Formulae in Latin Epistolography)
- Maczuga, Julia: The religious formulae attested in the Arabic graffiti from North-West Arabia during the Late pre-Islamic and Early Islamic periods: A study in continuity and change (panel A: Shared formulae, continuity, and change in the epigraphy of Northern Arabia)
- Majdak, Magdalena: Evolution of the Formulaic Expressions Referring to God in Polish Language History: Analysis of the Correspondence of the Czapski Family
- Marszałek, Jagoda & Wieczorek, Aleksandra: Polish and Latin date formulas used in Polish texts from 17th to 18th centuries
- Martín González, Elena & Konstantopoulou, Stavroula: Formulaic Language in the Oracular Inscriptions of Dodona: Integrating Traditional Epigraphic Analysis and Deep Neural Networks
- Meeder, Sven & Schmidt, Gleb: Formulae of Authority: Formulaic Aspects of Referencing the Bible in Early Medieval Canon Law
- Mika, Tomasz: Division of Old Polish Apocrypha: Title Formulas in the Middle Ages (30 min)
- Murel, Jacob, Feng, Steven, Haubold, Johannes & Graziosi, Barbara: Towards an LLM-Assisted Philology of Formulae in Greek Verse
- Murgia, Giulia & Puddu, Nicoletta: Notarial Formularies in Early Modern Sardinia
- Mäkinen, Martti: Exploring formulae through stylometric analysis of Middle English documents
- Norris, Jérôme: The highly formulaic nature of epigraphic habits in North-West Arabia before Islam (30 min; panel A: Shared formulae, continuity, and change in the epigraphy of Northern Arabia)
- PANEL A: Norris, Jérôme, Maczuga, Julia & Kootstra-Ford, Fokelien: Shared formulae, continuity, and change in the epigraphy of Northern Arabia
- PANEL B: Longrée, Dominique, Vanni, Laurent, Fascione, Sara, Rosa, Arianna & Thon, Valérie: Formulae in Latin Epistolography
- Rodek, Ewa: The Role of Keywords in Building Sender-Receiver Relationships: A Case Study of Polish-Language Texts from 1600-1750
- Roldão, Filipa & Serafim, Joana: Formulaic Language in Portuguese Municipal Charters of the Middle Ages: A Historical and Linguistic Analysis
- Rosa, Arianna & Thon, Valérie: Letters Across Time: a Diachronic Study of the Epistolary Formulas in Cicero, Jerome and Peter Damian (panel B: Formulae in Latin Epistolography)
- Salemenou, Maroula: Diplomatic correspondence in the corpus Demosthenicum: an evaluation of authenticity
- Scapini, Elia & Iezzi, Federico: Θεὸν ἐκ θεοῦ: a case study for semantic retrieval in Ancient Greek
- Scappaticcio, Maria Chiara: OVF (Oro Vos Faciatis): Praying and Requesting. On the Formulaic Nature of Canvassing in Literature and Epigraphy
- Schironi, Francesca: Formulae and Formulaic Language in Hellenistic Greek Astronomy
- Soffiantini, Laura: Pliny’s formulaic language in geographical books
- Stenroos, Merja: Formulaicity and the individual voice in late medieval English legal statements (30 min)
- Trombetta, Chiara: Discourse-organizational functions of formulaic language in Italian 16th-century historiographical texts
- Vatri, Alessandro: Aristotle’s ‘diagrammar’: Formulaicity and multimodality in the Organon
- Vezzosi, Letizia & Rosselli Del Turco, Roberto: Poetic formulas in the Germanic literatures of the Middle Ages: semantic annotation and analysis
- Vierros, Marja
- Wong, Catherine, Fitzmaurice, Susan & Lam, Benson: Tracing Formulaic Patterns and Language Change in Early Modern English: A Quantitative and Computational Approach
- Wong, Jorge: The Formulaic Template and Linguistic Innovation in Homer
- Yiftach, Uri: Syntactical Transformations in the Clause recording the Duties of the Lessee in Early Roman Egypt
- Zilio, Leonardo & Arblaster, Paul: Exploring formulaic language in 17th-century Dutch-language newspaper articles
Abstracts
Bentein, Klaas
Studying ancient interaction ritual: a comparison of two Greek corpora
Formulaic language has received a fair amount of attention in fields such as papyrology and epigraphy, disciplines which work with corpora that typically consist of shorter, relatively repetitive texts; scholarship has catalogued the formulaic phrases that can be found, focusing in particular on specific genres, such as letters and petitions in papyrology (e.g. Mascellari 2012; Nachtergaele 2023 for Ancient Greek texts). To a limited extent, these inventories have been incorporated within a broader ‘ecology of writing’ framework (for which, see e.g. Basso 1989). One approach has been to study the texts as instances of ‘technical’ text types (Fachtexte), which can be characterized in terms of an isomorphy of features, whether structural, formal, contextual, visual, or language-related (see Kruschwitz and Halla-aho 2007 for Pompeian wall inscriptions; and for technical literature, see Fögen 2010). A related perspective, rooted in the Anglo-American tradition, examines ‘formulaic’ genres (Kuiper 2000; 2009), which are characterized by fixed discourse components as well as formulaic phrases cueing each of these components (for an outline of the ‘discourse grammar’ of ancient Greek letters, see Bentein 2023a, 433–46).
In this contribution, I intend to develop a third perspective that has received limited attention so far, known as ‘interaction ritual,’ a concept first introduced by Erving Goffman to describe the structured, ritualistic behaviors people use in everyday interactions to uphold social order and manage social identities (Goffman 1967), and more recently expanded by Dániel Kádár (e.g., Kádár 2013; Kádár 2024). Kádár’s typology of ritual interaction offers a set of analytical tools that I believe can be applied to compare different corpora (and the genres within these corpora), providing a deeper understanding of formulaic phraseology. I will apply four features identified by Kádár to examine two corpora that I have been studying for some time: Greek documentary texts preserved on papyrus (Palme 2009) and Byzantine book epigrams (Bernard and Demoen 2019). Despite their chronological and thematic differences, these corpora share a strong formulaic element that can be more deeply explored using Kádár’s interaction ritual framework. The four features of focus are as follows:
- Social Extension of the Ritual: Kádár’s framework allows us to distinguish between ‘in-group’ rituals—practices confined to a specific social group—and broader social rituals, which are widely recognized and accessible across society.
- Pragmatic Complexity: Kádár distinguishes between ‘interactional complexity’—the range of communicative acts involved in a ritual—and ‘relational complexity,’ which concerns the depth and intricacy of relationships between individuals engaged in ritualistic practices (compare Bentein 2023b).
- Ritual Frame Indicating Expressions: Kádár defines formulaic phrases as ‘ritual frame indicating expressions,’ which signal a ritual context in communication and guide participants to follow its norms and expectations. Often linked to specific speech acts such as greetings, apologies, or farewells, these expressions have a recognizable structure and may include non-verbal or visual cues to help manage the flow of ritualized interactions (compare Bentein and Capano 2025).
- Mimesis and Self-Display: Kádár sees rituals as naturally imitative but open to enhanced behaviors, allowing participants to either intensify mimicry or add elaborate displays to reflect social values or showcase relational skill (compare Bentein 2023a).
My discussion will highlight similarities and differences between the two corpora and review how my research team and I have previously studied aspects of these four features (see Ricceri et al. 2023; Bentein 2024 for the relevant digital environments). Along the way, I will introduce a newly launched project, ANNOPHIS (www.annophis.ugent.be), which aims to develop a machine-learning-based annotation platform for a broad range of historical formulaic text corpora, including papyri, book epigrams, and inscriptions.
Basso, K. H. 1989. “The Ethnography of Writing.” In Explorations in the Ethnography of Speaking, edited by J. Sherzer and R. Bauman, 2nd ed., 425–32. Cambridge: Cambridge University Press.
Bentein, Klaas. 2023a. “A Typology of Variations in the Ancient Greek Epistolary Frame (I–III AD).” In Historical Linguistics and Classical Philology, edited by G. Giannakis, E. Crespo, J. de La Villa, and P. Filos, 429–72. Berlin: De Gruyter.
———. 2023b. “Why Say Goodbye Twice? Repetition and Involvement in the Greek Epistolary Frame (I-IV AD).” In La Correspondance Privée Dans La Méditerranée Antique, edited by M. Dana, 173–206. Bordeaux: éditions Ausonius.
———. 2024. “Socio-Semiotic, Multimodal Annotation of Documentary Sources: Digital Infrastructure in the Everyday Writing Project.” In Digital Papyrology III, edited by N. Reggiani. Berlin: De Gruyter.
Bentein, K., and M. Capano. 2025. “Spacing out Speech Acts. Textual Units and Their Visual Organization in Greek Letters on Papyrus.” In Everyday Communication in Antiquity. Frames and Framings, edited by K. Bentein. Venice: Edizioni Ca’ Foscari.
Bernard, F., and K. Demoen. 2019. “Byzantine Book Epigrams.” In A Companion to Byzantine Poetry, edited by W. Hörandner, A. Rhoby, and N. Zagklas, 404–29. Leiden: Brill.
Fögen, T. 2010. “Technical Literature.” In A Companion to Greek Literature, edited by E. Bakker, 266–79. Chichester, UK: John Wiley & Sons, Ltd.
Goffman, E. 1967. Interaction Ritual; Essays in Face-to-Face Behavior. Chicago: Aldine Pub. Co.
Kádár, D.Z. 2013. Relational Rituals and Communication: Ritual Interaction in Groups. London: Palgrave Macmillan.
———. 2024. Ritual and Language. Cambridge, UK; New York: Cambridge University Press.
Kruschwitz, P., and H. Halla-aho. 2007. “The Pompeian Wall Inscriptions and the Latin Language: A Critical Reappraisal.” Arctos: Acta Philologica Fennica 41:31–49.
Kuiper, K. 2000. “On the Linguistic Properties of Formulaic Speech.” Oral Tradition 15 (2): 279–305.
———. 2009. Formulaic Genres. Basingstoke; New York: Palgrave Macmillan.
Mascellari, R. 2012. “Le Petizioni Nell’Egitto Romano. Evoluzione Di Formulario, Procedure e Organizzazione Della Giustizia. Documentazione Su Papiro Dal 30 a.C. al 300 d.C.” Firenze: Università degli Studi di Firenze.
Nachtergaele, D. 2023. The Formulaic Language of the Greek Private Papyrus Letters. Leuven: Trismegistos Online Publications.
Palme, B. 2009. “The Range of Documentary Texts: Types and Categories.” In The Oxford Handbook of Papyrology, edited by R.S. Bagnall, 358–94. New York: Oxford University Press.
Ricceri, R., K. Bentein, F. Bernard, A. Bronselaer, E. De Paermentier, P. De Potter, G. De Tré, et al. 2023. “The Database of Byzantine Book Epigrams Project: Principles, Challenges, Opportunities.” Journal of Data Mining and Digital Humanities.
Brown, Joshua
Formulae as a source of linguistic innovation in Renaissance Italian: the salutatio and conclusio as loci of language contact
Formulaicity and formulaic strings provide forms of language that aid letter writers through prefabricated units, either copied or retrieved whole from memory (Rutten & van der Wal 2012; Serra 2023). In historical corpora, documentary formulae have been characterized as ‘text reuse templates’, allowing the investigation of historical drift of documentary production and cultural change (Korkiakangas 2023). Formulae may also contribute to aspects of in-group membership and identity formation (Laitinen & Norlund 2012). Less focus has been placed on formulae as loci of linguistic innovation (cf. Bybee & Torres Cacoullos 2009), despite the importance of imitation in both language acquisition and letter-writing literacy.
This paper identifies a series of formulae in merchant letters sent from Milan to various locations around the Mediterranean between 1396 and 1402, and currently housed at the Datini Archive, Archivio di Stato di Prato in Tuscany. Defining a corpus of 82 letters, written in the vernacular, I show how the written correspondence of two merchants, Francesco Tanso and Giovanni da Pessano, represents a ‘hybrid’ linguistic variety, with forms of Tuscan, Milanese, and Latin clearly identifiable (Brown 2024). Tracing the infiltration of 1pl. verb forms in the letters of these merchants reveals that formulae represent a clear focus of linguistic innovation, particularly in the salutatio and conclusio of letters. Part of this case study is presented as a proof of concept for a digital project scaling up to all 810 letters sent from Milan in the Datini Archive.
As with many merchants of late medieval Europe, both Francesco Tanso and Giovanni da Pessano sent large numbers of letters, sometimes in rapid succession. Merchants required quick access to information. In creating such an enormous written correspondence, both writers made use of formulae in their letters, especially in the salutatio and conclusio. Repetition of formulae was one way in which particular forms of language spread quickly and made writing easy. Were specific linguistic items transferred from one variety to another through formulae? If so, what were they? The paper concludes by returning to the question of methodology, and how preprocessing of linguistic data can best be achieved to ascertain the presence of formulae in a ‘big data’ corpus, making some comparison to research in similar domains (Granger 2018; Koolen & Hoekstra 2022).
Brown, Joshua. 2024. Dialect levelling and merchant writing in Renaissance Italy. Special issue of “Journal of Historical Sociolinguistics” ed. by Anita Auer & Joshua Brown. 10:(2) 197-223.
Bybee, Joan L & Rena Torres Cacoullos. 2009. The role of prefabs in grammaticization: How the particular and the general interact in language change. In: Corrigan, Roberta, Edith A Moravcsik, Hamid Ouali & Kathleen Wheatley (eds.) Formulaic Language, volume 1: Distribution and historical change. Amsterdam: John Benjamins, pp.187-218.
Granger, Sylviane. 2018. Formulaic sequences in learner corpora: Collocations and lexical bundles. In: Siyanova-Chanturia, A & A Pellicer-Sanchez (eds.) Understanding Formulaic Language: A second language acquisition perspective. London: Routledge, pp.228-247.
Koolen, Marijn & Rik Hoekstra. 2022. Detecting formulaic language use in historical administrative corpora. Proceedings of the Computational Humanities Research Conference 2022, Antwerp, Belgium, December 12-14. 127-151.
Korkiakangas, Timo. 2023. Documentary formulae as text reuse templates: Constat and Manifestus clauses in early medieval Latin charters. Digital Medievalist. 16:1-44.
Rutten, Gijsbert & Marijke J van der Wal. 2012. Functions of epistolary formulae in Dutch letters from the seventeenth and eighteenth centuries. Journal of Historical Pragmatics. 13:(2) 173-201.
Serra, Eleonora. 2023. Learning to write letters in sixteenth-century Florence: Epistolary formulae in the correspondence of Lucrezia Albizzi Ricasoli. Linguistica. 63:(1-2) 273-300.
Chioni, Irene
Request Formula Variations in Greek Petitions from Roman Egypt: Typological Classification and Sociolinguistic Interpretation
A significant number of highly formulaic Greek texts can be identified in the papyrological corpus. Among these, petitions stand out for using similar or identical expressions over an extended period. This linguistic consistency underscores the nature of petitions as a formulaic genre (Kuiper 2009).
While the formulaic language of petitions has been thoroughly examined, particularly for the Ptolemaic and Roman periods (Di Bitonto 1967, 1968, 1976; Mascellari 2021), the variations within these formulas remain largely unexplored, despite their linguistic potential (Kuiper 2000). For example, the study of formulaic variation in letters has resulted in a detailed typology (Bentein 2023), revealing new research opportunities. Similarly, developing a typology of variation in petitions could provide insights into the social contexts and shifts in the power dynamics between petitioners and officials, offering a richer understanding of how formulaic language reflects interactions.
One particularly significant formula within a petition is the request section, as it represents the core communicative purpose of the document: in pragmatic terms, the “head act” of a speech act (House & Kádár 2021: 105-133). Given its centrality, an examination of this section offers a lens through which the functionality of the text can be better understood.
Previous research has identified several fundamental constituents of the request formula in petitions (e.g., Mullins 1962; White 1972; Mascellari 2021). Building on this foundation, I propose a more refined structure of petitions’ request formula, based on the direct analysis of a substantial corpus of Greek petitions from Egypt, dating from the 1st to the 3rd centuries AD, all containing relatively intact request sections.
The essential constituents of the request formula, resulting from my observation, typically include a performative request verb (most often ἀξιόω) and an infinitive specifying the requested action. Inferential expressions frequently introduce the request formula, while courtesy phrases or honorific titles support the request verb. Occasionally, a final plea is added, anticipating the expected benefit or relief resulting from the fulfillment of the request. Other minor elements within the formula, while not extensively discussed in previous scholarship, play a significant role in contributing to its variation.
By analyzing the frequency and the recurrent structure of these unexplored constituents in the formula, I aim to explore the relationship between normative formulaic structures and their variations. To investigate this, I propose a typology of variations in the request formulae, drawing on the typology for formulaic variation in letter openings and closings proposed in Bentein (2023). I suggest that some patterns of variation can be identified (reformulation, addition, repetition, combination, and displacement), while the presence of others remains uncertain. The focus will then shift to the reformulation pattern, specifically to the use of lexical variants. I will demonstrate that changes in the performative request verb offer a sociolinguistic lens through which these variations can be interpreted (Dickey 2009, 2016). Specifically, I will show how such changes reflect the petitioner’s construction of identity and strategic use of language in relation to the social status of the recipient, as explored for inferential expressions in Bentein (2016).
Bentein, K. (2016). “Διό, διὰ τοῦτο, ὅθεν, τοίνυν, οὖν, or rather asyndeton? Inferential expressions and their social value in Greek official petitions (I–IV AD)”. Acta Classica, 59, 23–51.
Bentein, K. (2023). “A Typology of Variations in the Ancient Greek Epistolary Frame (I–III AD)”. In G. K. Giannakis, P. Filos, E. Crespo, & J. De La Villa (Eds.), Classical Philology and Linguistics (429-472). Berlin-Boston: De Gruyter.
Di Bitonto, A. 1967. “Le petizioni al re. Studio sul formulario”. Aegyptus, 47, 5-57.
Di Bitonto, A. 1968. “Le petizioni ai funzionari nel periodo tolemaico. Studio sul formulario”. Aegyptus 48, 53-107.
Di Bitonto, A. 1976. “Frammenti di petizioni del periodo tolemaico. Studio sul formulario”. Aegyptus 56, 109-143.
Dickey, E. (2009). Latin Influence and Greek Request Formulae. In T. V. Evans & D. D. Obbink (Eds.), The Language of the Papyri (208–220). Oxford: Oxford University Press.
Dickey, E. (2016). “Emotional language and formulae of persuasion in Greek papyrus letters”. In E. Sanders & M. Johncock (Eds.), Emotion and Persuasion in Classical Antiquity (237-262). Stuttgart: Franz Steiner Verlag.
House, J. & Kádár, D. Z. (2021). Cross-Cultural Pragmatics. Cambridge: Cambridge University Press.
Kuiper, K. (2000). “On the Linguistic Properties of Formulaic Speech”. Oral Tradition, 15(2), 279-305.
Kuiper, K. (2009). Formulaic genres. Basingstoke: Palgrave Macmillan.
Mascellari, R. (2021). La lingua delle petizioni nell’Egitto romano: Evoluzione di lessico, formule e procedure dal 30 a.C. al 300 d.C. Florence: Firenze University Press.
Mullins, T. Y. (1962). “Petition as a Literary Form”. Novum Testamentum, 5(1), 46-54.
Cook, Samuel Peter
Cross-linguistic formulae and the study of contact-induced language change: the case of Greek and Coptic legal formulae from 6th–8th century Egypt
The legal landscape of Late Antique and Early Islamic Egypt is characterised by its multilingual nature. Prior to the Islamic conquest of 641, Greek was the primary language of law, with legal formulae in part building on pre-existing models in Demotic. Following the conquest, Coptic became more prominent as the use of Greek decreased, with the largest body of Coptic legal contracts attested from the 7th and 8th centuries. Throughout the 6th to 8th centuries, the formulae used in both Greek and Coptic legal texts were intrinsically linked, since the contracts in which they appeared represent a single legal system expressed in two languages. Till even goes so far as to describe Coptic legal formulae as a case of Byzantine formulae in “Coptic translation” (Till 1950, 81). This continuity between Greek and Coptic formulae was noted by 20th-century scholars of Coptic legal documents (Lüddeckens 1979; Wenger 1953; Steinwenter 1920; Boulard 1912). However, the only comprehensive discussion of the topic to date is that of Richter (2009), with scholars tending to specialise either in Coptic or in Greek. Drawing on the results of my PhD thesis (Cook 2019), the present paper discusses the use of Egyptian legal contracts of the 6th to 8th centuries as a vehicle to study how formulaic language navigates two linguistic systems in a multilingual society. On the one hand, I outline the strategies used to express legal formulae in two languages belonging to different language families. On the other, drawing on theoretical models from the fields of historical linguistics and language contact, I identify possible influences from underlying Greek formulae which have led to a new, domain-specific grammatical form: the so-called “performative ⲉⲓⲥⲱⲧⲙ”.
Boulard, Louis. 1912. La vente dans les actes Coptes. P. Geuthner.
Cook, Samuel Peter. 2019. “Linguistic and Legal Continuity in 6th to 8th Century Coptic Documents: A Comparative Study of Greek and Coptic Legal Formulae in Byzantine and Early Islamic Egypt.” Doctoral thesis, Sydney: Macquarie University.
Lüddeckens, Erich. 1979. “Demotische Und Koptische Urkundenformeln.” Enchoria 2:21–31.
Richter, Tonio Sebastian. 2009. “Greek, Coptic and the ‘Language of the Hijra’: The Rise and Decline of the Coptic Language in Late Antique and Medieval Egypt.” From Hellenism to Islam: Cultural and Linguistic Change in the Roman Near East, 401–46.
Steinwenter, Artur. 1920. Studien zu den koptischen Rechtsurkunden aus Oberägypten. Haessel.
Till, Walter. 1950. “Die Koptische Stipulationsklausel.” Orientalia 19 (1): 81–87.
Wenger, Leopold. 1953. Die Quellen des römischen Rechts. A. Holzhausen.
Del Grosso, Sarah
Formulaic language in translations in the period after the French Revolution and in the Napoleonic era
In the period after the French Revolution and in the Napoleonic era, tens of thousands of legal and administrative texts were translated into different European languages (cf. D’hulst 2015: 93-94). These texts were translated because the conquered territories became part of the French Empire, which led to a standardization of law and administration, i.e., the adoption of French laws and administrative structures, and thus the adoption of French legal and administrative texts.
In three projects (for a short description of the three projects and a bibliography see https://en.uepol1789-1815.uni-mainz.de/) under the direction of Michael Schreiber (funded by the Deutsche Forschungsgemeinschaft), bilingually printed texts of various legal and administrative language text types have been collected in archives in Belgium, Italy and the German Rhineland. These texts were imported into three language-pair specific databases that use the tool IDaSTo (cf. Beyer 2016). This tool was developed for the corpus linguistic analysis of historical parallel texts.
Formulaic language is an important element of legal language; it helps to simplify internal information by using existing formulation patterns to create consistency (cf. Pommer 2006: 26). Formulaic language in historical legal and administrative language, especially in relation to translations, has rarely been examined to date (cf. Kjær 2007: 506).
Many of the texts have the same macrostructure, including the typical French one-sentence structure (phrase unique). Although this structure is not known in the target legal cultures, it is generally adopted in the translation, and the formulaic nature of the translations is stabilized over time (cf. Del Grosso 2024, Schreiber 2017). The translations are often strongly orientated towards the source text (cf. Reinke/Schreiber 2015: 700); a strategy of documentary translation dominates (cf. Engberg 1999: 88). Even if individual translator decisions cannot be identified due to a lack of information about the translators and translation processes, case studies have shown that the country or city where the translation was produced has an effect on the formulation patterns in the target language (cf. Del Grosso in press on the Italian translation of formulaic language in court judgements in Genoa and Milan, or Del Grosso 2024 on the German translation of the phrase unique in texts that were translated in Paris and in Mainz).
In my paper, I would like to show how formulation traditions are influenced by the translation of texts with a high degree of formulaic language. By presenting examples from the three corpora, I would like to show which formulaic language patterns of the legal and administrative texts have been recognized by the translators and adopted as such and which translation strategies they used in the different languages. If I could get the extra 10 minutes, I would also like to present the databases and discuss the advantages and limitations of these databases for our work.
Beyer, Rahel (2016): „IdaSTo – Ein Tool zum Taggen und Suchen in historischen Paralleltexten“, in: Fisseni, Bernhard/Schröder, Bernhard/Zesch, Torsten (eds.): Proceedings of the International Conference of the German Society for Computational Linguistics and Language Technology. Sep 30-Oct. 2 2015, University of Duisburg-Essen – Gesellschaft für Sprachtechnologie und Computerlinguistik e.V., 162-169.
D’hulst, Lieven (2015): „‘Localiser’ des traductions nationales. Le Bulletin des lois en version flamande et hollandaise sous la période française (1797–1813)“, in: Dizdar, Dilek/Gipper, Andreas/Schreiber, Michael (eds.): Nationenbildung und Übersetzung, Berlin: Frank & Timme, 93-108.
Del Grosso, Sarah (2024): „Die Übersetzung der phrase unique ins Deutsche am Beispiel der Sammlung der Verordnungen und Beschlüsse im Departement Donnersberg (1799-1802)“, in: trans-kom 17(1), 21-39.
Del Grosso, Sarah (in press): „Die Übersetzung der Abschlussformel in Gerichtsurteilen in Mailand und Genua (1797–1808)“, in: Betz, Katrin/Lützelberger, Florian (eds.): Alt & Neu. Neue Quellen, alte Fragen – alte Quellen, neue Fragen. Beiträge zum XXXVI. Forum Junge Romanistik in Bamberg (31. März 2021). München: AVM, 113-130.
Engberg, Jan (1999): „Übersetzen von Gerichtsurteilen. Der Einfluss der Perspektive“, in: Sandrini, Peter (ed.): Übersetzen von Rechtstexten. Fachkommunikation im Spannungsfeld zwischen Rechtsordnung und Sprache. Tübingen: Narr, 93-101.
Kjær, Anne Lise (2007): „Phrasemes in legal texts“, in: Harald Burger et al. (eds.): Phraseologie. Ein internationales Handbuch der zeitgenössischen Forschung. Phraseology. An International Handbook of Contemporary Research (vol. 2). Berlin: de Gruyter, 506-515.
Pommer, Sieglinde (2006): Rechtsübersetzung und Rechtsvergleichung. Translatologische Fragen zur Interdisziplinarität. Frankfurt a. M.: Lang.
Reinke, Kristin/Schreiber, Michael (2015): „Juristische Fachübersetzungen im Sprachenpaar Französisch-Italienisch in den Jahren 1789 bis 1814“, in: Lavric, Eva/Pöckl, Wolfgang (eds.): Comparatio delectat II, Akten der VII. Internationalen Arbeitstagung zum romanisch-deutschen und innerromanischen Sprachvergleich. Frankfurt a. M.: Lang, 693-706.
Schreiber, Michael (2017): „La phrase unique: Die Ein-Satz-Struktur in Texten der Französischen Revolution und deren Übersetzungen“, in: Dahmen, Wolfgang et al. (eds.): Sprachvergleich und Übersetzung. Die romanischen Sprachen im Kontrast zum Deutschen. Romanistisches Kolloquium XXIX. Tübingen: Narr, 81-98.
Project on “Translation policies in/for Belgium 1792-1814 in the language pair French-Dutch with special attention to the Flemish Departments Escaut (Scheldt) and Lys (Leie)”: https://belgien.uepol.uni-mainz.de/
Project on “Legal, administrative and political translations from French into Italian during the Napoleonic era illustrated by the example of Milan and Genoa”: https://italien.uepol.uni-mainz.de/
Project on “The translation of legal and administrative texts in Rhine-Hesse and the Palatinate region during the ‘Republic of Mainz’ and the French rule”: https://rheinland.uepol.uni-mainz.de/
di Bartolo, Giuseppina & Marchesi, Beatrice
Negative concord in Postclassical Greek: Impact and functions of formulaic expressions in documentary papyri
This paper deals with Ancient Greek and investigates the functions and role of formulaic expressions that contain one or more negative indefinites and/or a negative marker. It is part of a broader research project on the diachrony of the Postclassical Greek negative concord system (Gianollo 2024; di Bartolo, Gianollo & Marchesi forthcoming).
The paper combines a quantitative and qualitative analysis and is based on a corpus of documentary papyri of the early Roman Period (i.e., 1st–2nd cent. CE). The analysis includes all the occurrences of the lemma oudeís (‘no one’) which have been extracted using the Trismegistos database, divided by gender and case, and annotated according to the annotation scheme developed by Gianollo (2024) for the syntactic analysis of negation in the New Testament.
First, the paper discusses the methodological difficulties of dealing with a corpus of documentary papyri (e.g., their heterogeneity in terms of register, cf. Palme 2011 and Bentein 2015). Due to the high number of occurrences of oudeís in ‘formulaic expressions’ (cf. Wray 2008: 3–10) encountered in the corpus, our analysis discusses the methodological choices introduced to minimize the impact that such occurrences, given their frequency and their fixed word-order, might have on the diachronic analysis based on quantitative data. Thus, we present the adjustments made to the annotation scheme of data from documentary papyri and the new labels proposed to differentiate between the different realizations of formulaic expressions.
Secondly, this work provides a list of the formulaic expressions found in the corpus together with an overview of the different types of discourse in which they occur. Even though this study analyses only the ou- series of ‘objective negation’ (Chatzopoulou 2012; Willmott 2013), formulaic expressions present a greater interplay than non-formulaic occurrences between this series and the mḗ- series of ‘subjective negation’, giving rise to qualitative observations regarding negative patterns that might be favoured in contexts with fixed word-order.
Third, the study shows the impact of these formulaic expressions on the distribution of the different syntactic patterns used to describe the negative system in the corpus.
Finally, it addresses the pragmatic functions of these expressions and their role in terms of discourse segmentation according to some of the research questions of the conference.
di Bartolo, G., Gianollo, C. & Marchesi, B. Forthcoming. The system of negative concord in Postclassical Greek: Evidence from documentary papyri. In: G. di Bartolo, P. Filos, G. Giannakis & D. Kölligan (eds.), tba. Berlin/Boston.
Bentein, K. 2015. The Greek documentary papyri as a linguistically heterogeneous corpus: The case of the katochoi of the Sarapeion-archive. In: Classical World 108 (4): 461–484.
Chatzopoulou, K. 2012. Negation and Nonveridicality in the History of Greek. Journal of Greek Linguistics 13 (1): 149-153.
Gianollo, C. 2024. Negative concord and word order in the Greek Bible and New Testament. In: G. di Bartolo, D. Kölligan (eds.), Postclassical Greek: Problems and Perspectives: 187–223. Berlin/Boston.
Palme, B. 2011. The Range of Documentary Texts: Types and Categories. In: R. S. Bagnall (ed.), The Oxford Handbook of Papyrology: 358–394. Oxford.
Willmott, J. C. 2013. Negation in the history of Greek. In D. Willis, C. Lucas & A. Breitbarth (eds.), The history of negation in the languages of Europe and the Mediterranean I. Case studies: 299–340. Oxford: Oxford University Press.
Wray, A. 2008. Formulaic Language: Pushing the Boundaries. Oxford University Press.
Di Pasquale, Daniele
Shifts and Standardization of Language Patterns in Korean Old Vernacular Epistles (언간, 諺簡, Ŏn'gan)
This study investigates the gradual changes and standardization of formulaic expressions in Ŏn'gan, vernacular (Ŏnmun, 언문, 諺文) Korean letters from the late Chosŏn Dynasty (1392-1897). Initially informal and shaped by oral traditions, these letters exhibit repeated use of certain expressions that, over time, became increasingly fixed and formalized. By the 19th century, this standardization culminated in the publication of instructional letter collections known as Ŏn'gandok (언간독, 諺簡牘), which established more rigid norms for letter writing.
By comparing early Ŏn'gan manuscripts with later standardized Ŏn'gandok materials, the study aims to identify key changes in the formulation and use of recurring expressions. This comparative approach will reveal how these expressions shifted from flexible, context-dependent forms to fixed, standardized elements in formal epistolary conventions. By examining patterns of repetition, ellipsis, and conventionality, the study will assess how these linguistic elements reflected the socio-cultural contexts of the time, including social hierarchy, politeness conventions, and relationships between correspondents.
This study, which employs paleographic expertise to decode the manuscript letters in cursive vernacular, aims to investigate the reasons behind the use of formulaic expressions in premodern Korean vernacular letters and what they reveal about this particular communication practice of the time. Early results suggest that these expressions reflected social status, maintained proper etiquette, and preserved oral traditions in written form. The study examines how recurring patterns in Korean epistolary practices contributed to the development of standardized epistolary norms and formulaic expressions, shedding light on early examples of linguistic standardization in Korean historical linguistics.
Drigo, Jasmim
Calques or Formulaic Expressions? An analysis of calques in Early Irish religious language
Some work has been done on Latin borrowings into Early Irish (e.g. McManus 1982, 1983, 1984), but most of it has concentrated on phonetic loanwords and the relative chronology of phonological changes. Meanwhile, simplex calques have only been briefly discussed (e.g. McManus 1982), and structural calques have been ignored.
Structural or syntactic calques are one type of borrowing, more specifically, loan translation of complete sentences (Molnár 1985).
Some Latin religious terms have been passed into Early Irish as loan translations, e.g.:
- OIr. Tír Tairngiri ‘the Promised Land of the Old Testament’ (Wb. 33b6, 2c21, Ml. 68b4, 78c11, 83d4, etc.)
In this example, Tír ‘land’ is combined with Tairngire ‘prophecy’, a word modelled on Latin praedictio: tair ‘before’ + ngire ‘a calling’
- MIr. grásta Dé ‘the grace of God’ from Latin gratia Dei (Dánta Gr. 63.24)
Can some of these Early Irish structural calques also be considered formulaic expressions? Formulaic language consists of fixed expressions used in special contexts, but how do we understand formulaic expressions in bilingual texts and bilingual communities? We know that Medieval Ireland was a place of multilingualism, and this is especially clear in some texts, such as the Old Irish Glosses (Würzburg glosses, Milan glosses, St Gall Priscian glosses) and some Middle Irish texts (e.g. Binchy 1976, Bisagni 2014, Moran 2022, etc.).
In this paper, I will examine the translation of some Latin religious terms into Early Irish, exploring both how they were rendered and how they were used in context. Whether these terms are best classified as calques or formulaic expressions depends on the analytical approach—linguistic or philological—as each may lead to different conclusions based on their respective focuses.
BINCHY, D. A. (1976). “Semantic influence of Latin in Old Irish glosses”. O’MEARA, J. & NAUMANN, B. (Eds). Latin script and letters a.d. 400–900: Festschrift presented to Ludwig Bieler on the occasion of his 70th birthday, 167–73.
BISAGNI, J. (2014). “Prolegomena to the study of code-switching in the Old Irish Glosses”. Peritia 24-25, 1-58.
Mc MANUS, D. (1982). The Latin loanwords in Early Irish. Dissertation: University of Dublin.
Mc MANUS, D. (1983). “A Chronology of the Latin Loan-Words in Early Irish.” Ériu 34, 21-71.
Mc MANUS, D. (1984). “On Final Syllables in the Latin Loan-Words in Early Irish.” Ériu 35, 137-162.
MOLNÁR, N. (1986). The Calques of Greek Origin in the Most Ancient Old Slavic Gospel Texts. Cologne & Vienna: Böhlau.
MORAN, P. (2022). “Latin Grammar Crossing Multilingual Zones: St Gall, Stiftsbibliothek, 904”. In: CLARKE, M. & NÍ MHAONAIGH M. (Eds). Medieval Multilingual Manuscripts: Case Studies from Ireland to Japan, Studies in Manuscript Cultures 24., 35–54.
Elalfy, Doaa
The Canonization and Transmission of Collyria (اشياف) in Greek and Arabic Medical Traditions: Lexical Shifts and Formulaic Patterns
In the vast landscape of medical history, the use of collyria (medicated eye drops) is a prime example of the intersection between pharmacological knowledge and cultural exchange. This paper explores the process of evolution, canonization, and linguistic transmission of collyria, or اشياف (ašyāf), within the ancient Greek and Arabic medical traditions.
By examining both the conventionalized nature of collyria formulations and the shifts in terminology that occurred during the translation of medical texts, this study offers insights into the processes that solidified these remedies within two major cultural and scientific traditions.
Focusing on key figures such as Ḥunayn ibn Isḥāq, Yuḥannā ibn Māsawayh, Qustā ibn Lūqā, and Thābit ibn Qurra, the paper investigates how Greek medical knowledge, including the works of Dioscorides and Galen, was preserved, adapted, absorbed and eventually canonized in the Arabic world. These translators and physicians safeguarded the medical wisdom of antiquity and modified it to fit new cultural and practical contexts, thereby bridging ancient Greek formulations with Islamic medical practices. Their adaptations of collyria reflect broader intercultural and intellectual exchanges that contributed to the enduring legacy of these remedies. The study further analyzes specific collyria and ašyāf formulations found in both Greek and Arabic sources, exploring their therapeutic applications in treating eye conditions such as conjunctivitis and cataracts.
By tracing the formulaic structures in the texts, it highlights how certain conventionalized patterns were transmitted and solidified, thus becoming canonical within medical literature. Additionally, the lexical and semantic shifts that occurred during the translation process reveal how the adaptation of terminology shaped the understanding of these remedies across cultures, contributing to their lasting influence.
Ultimately, this paper situates ašyāf not only as practical therapeutic agents but also as linguistic and cultural markers of canonized medical knowledge. The formulaic nature of collyria, alongside the lexical evolution that occurred through translation, underscores the importance of both language and practice in the canonization of medical traditions. Through this lens, we gain a deeper understanding of the intercultural transmission of pharmacological knowledge and the role that formulaic language played in solidifying ancient medical practices as enduring elements of both Greek and Arabic medical canons.
Elder, Claire M.
The Fermesse, Fidelity and Faith: A Pragmatic Analysis of the Formulaic Symbols in an Early Modern Scottish Community of Practice
This paper examines the evidence of formulaic symbol use within a curated dataset of 183 seventeenth-century letters preserved in the Edinburgh archives. The analysis combines quantitative and qualitative techniques to uncover the pragmatic intentions which underpinned the senders’ decision to inscribe the fermesse symbol in their correspondence.
Studies by Nevalainen and Raumolin-Brunberg (1995), Nevalainen (2001), Nevala (2003, 2004), Dossena (2007, 2013), Wood (2009), Rutten and van der Wal (2013), Pfeiffer and Schiegg (2020) and Bengough-Smith (2023) have previously established the capacity of formulaic language to encode pragmatic function in early modern letters. Similarly, scholars including Gibson (1997), Stewart and Wolfe (2004), Daybell (2012), Meurman-Solin (2013), Wiggins (2017), Evans (2020), and Smith (2020) have revealed the capacity for visual and material features to carry equivalent meaning within such documents. Moreover, the increased accessibility of high-quality, zoomable images afforded by digital scholarly editions of letters has allowed researchers to examine the extratextual pen marks added to such manuscripts (Starza Smith 2013).
As well as the signs of abbreviations and punctuation in regular handwritten use at the time, adding decorative lines, flourishes, and other symbols to superscriptions, subscriptions, and signatures was common practice. Such visual motifs provided writers with an ‘opportunity for self-fashioning’: the process of creating and projecting one's own identity through various means, including visual elements (Williams 2013). This paper will argue that in some instances, these visual features may be categorised as formulae that emblemise the relationship between sender and recipient.
One such feature is the fermesse symbol: an ‘intersected or slashed capital letter S, somewhat like the italic dollar sign $’ (Beal 2008) which had multiple meanings and functions across the early modern era. This symbol, which originated in France, was given the name fermesse in the nineteenth century. The term connects its French translation, fermé S, to, arguably, its most salient meaning: firmness, via a pun (Chareyre 2007, 75). Hobson’s valuable 1935 survey identified countless French examples in letters and other mediums, including ceramics, jewellery, armour and fabric (115). Moreover, recent discussions have recognised fermesse use in the English language correspondence of Queen Anne of Denmark (Somers Cocks 1980; Wolfe 2013) and members of the Sidney circle (Larson 2015; Hannay 2013). However, until now, the fermesse in Scottish letters has been overlooked.
This paper calls for the systematic capture, encoding, and analysis of the formulaic symbols inscribed in early modern letters. By demonstrating that visual formulae such as the fermesse can carry pragmatic significance, just as the linguistic and material features of epistolary manuscripts do, it seeks to spark new conversations about their importance within historical letters.
Cocks, Anna Somers. 1980. Princely Magnificence: Court Jewels of the Renaissance, 1500–1630. London: Debrett's Peerage Ltd. in association with the Victoria and Albert Museum.
Daybell, James. 2012. The Material Letter in Early Modern England: Manuscript Letters and the Cultures and Practices of Letter-writing, 1512–1635. London: Palgrave Macmillan.
Dossena, Marina. 2013. “Mixing Genres and Reinforcing Community Ties in Nineteenth-Century Scottish Correspondence: Formality, familiarity and religious discourse.” In Communities of Practice in the History of English, edited by Joanna Kopaczyk and Andreas H. Jucker, 47–60. Amsterdam: John Benjamins Publishing Company.
Evans, Mel. 2020. Royal Voices: Language and Power in Tudor England. Cambridge: Cambridge University Press.
Gibson, Jonathan. 1997. “Significant Space in Manuscript Letters.” The Seventeenth Century 12(1): 1–10.
Hannay, Margaret P. 2013. Mary Sidney, Lady Wroth. United Kingdom: Ashgate Publishing Limited.
Hobson, G. D. 1935. Les Reliures à La Fanfare: Le Problème De l'S Fermé. London: The Chiswick Press.
Meurman-Solin, Anneli. 2013. “Features of Layout in Sixteenth- and Seventeenth-Century Scottish Letters.” In Annotating Variation and Change (Studies in Variation, Contacts and Change in English 1), edited by Anneli Meurman-Solin and Arja Nurmi. www.helsinki.fi/varieng/series/volumes/14/Meurman-Solin_b/.
Nevala, Minna. 2003. “Family First: Address and Subscription Formulae in English Family Correspondence From the Fifteenth to Seventeenth Century.” In Diachronic Perspectives on Address Term Systems, edited by Irma Taavitsainen and Andreas H. Jucker, 147–76. Amsterdam and Philadelphia: John Benjamins Publishing Company.
Nevala, Minna. 2004. “Inside and Out: Forms of Address in Seventeenth- and Eighteenth-Century Letters.” Journal of Historical Pragmatics (5): 271–296.
Nevalainen, Terttu. 2001. “Continental Conventions in Early English Correspondence.” In Towards a History of English as a History of Genres, edited by Hans-Jürgen Diller and Manfred Görlach, 203–224. Heidelberg: Winter.
Nevalainen, Terttu, and Helena Raumolin-Brunberg. 1995. “Constraints on Politeness: The Pragmatics of Address Formulae in Early English correspondence.” In Historical Pragmatics: Pragmatic Developments in the History of English, edited by Andreas H. Jucker, 541–601. Amsterdam: Benjamins.
Pfeiffer, Christian and Markus Schiegg. 2020. “Religious Formulae in Historical Lower-Class Patient Letters.” In Formulaic Language and New Data, edited by Elisabeth Piirainen, Natalia Filatkina, Sören Stumpf and Christian Pfeiffer, 250–77. Berlin, Boston: De Gruyter.
Rutten, Gijsbert and Marijke van der Wal. 2013. “Epistolary Formulae and Writing Experience in Dutch Letters from the Seventeenth and Eighteenth Centuries.” In Touching the Past: Studies in the Historical Sociolinguistics of Ego-documents, edited by Gijsbert Rutten and Marijke van der Wal, 45–65. Amsterdam: John Benjamins Publishing Company.
Starza Smith, Daniel. 2013. “The Material Features of Early Modern Letters: A Reader’s Guide”, in Bess of Hardwick's Letters: The Complete Correspondence, c.1550-1608. Edited by Alison Wiggins, Alan Bryson, Daniel Starza Smith, Anke Timmermann and Graham Williams, The University of Glasgow. Web development by Katherine Rogers, University of Sheffield Humanities Research Institute. www.bessofhardwick.org/background.jsp?id=143.
Stewart, Alan, and Heather Wolfe. 2004. Letterwriting in Renaissance England. United States: Folger Shakespeare Library.
Wiggins, Alison. 2017. Bess of Hardwick’s Letters: Language, Materiality, and Early Modern Epistolary Culture. Abingdon: Routledge.
Williams, Graham. 2013. “The Language of Early Modern Letters: A Reader's Guide”, in Bess of Hardwick’s Letters: The Complete Correspondence, c.1550–1608. Edited by Alison Wiggins, Alan Bryson, Daniel Starza Smith, Anke Timmermann and Graham Williams, The University of Glasgow. Web development by Katherine Rogers, University of Sheffield Humanities Research Institute. www.bessofhardwick.org/background.jsp?id=168.
Wolfe, Heather. 2013. “A Letter from Queen Anne to Buckingham Locked with Silk Embroidery Floss.” The Collation. https://www.folger.edu/blogs/collation/a-letter-from-queen-anne-to-buckingham-locked-with-silk-embroidery-floss/.
Wood, Johanna L. 2009. “Structures and Expectations: A Systematic Analysis of Margaret Paston’s Formulaic and Expressive Language.” Journal of Historical Pragmatics 10(2): 187–228.
Fantoli, Margherita & Korkiakangas, Timo
Exploring formulaic language in dependency treebanks using network analysis
Our paper explores ways to detect and quantify formulaic language in a corpus of 521 early medieval charters written in non-standard Latin in Tuscia mostly in the 9th century and available in the Late Latin Charter Treebank (LLCT2, 242,411 tokens; Korkiakangas 2021).
Charters consist of diplomatic sections with different functions. Sections which focus on the legal validity of the document are the most conservative and mainly composed of prefabricated phrases with precise slots for variable information, such as dates and names, whereas the language of sections which convey the case-specific motivations, circumstances, and details of the transaction is freer and necessarily relies more on the scribe’s command of Latin as a second language. Sabatini (1965) noticed that such a dichotomy is relevant to linguistic study: formulaic parts display errors that derive from the scribes’ defective knowledge of centuries-old legal language, such as hypercorrections and other misunderstandings, while, in the less formulaic parts, the scribes lapsed into non-standard constructions triggered by their vernacular, which was already close to Romance. Thus, the formula context helps in drawing historical linguistic conclusions on whether a specific Latin construction was still present and whether a specific Romance construction was already present in the spoken language of the time.
We explore the variation in the syntactic structure of the sentences to guide the detection of formulaic sequences. In fact, formulaic sequences of charters are often non-linear and represented by a few core terms alternating with slots filled by variable elements. Hence, raw counts of contiguous word sequences cannot account for such “embedded” elements. We propose to build a network of LLCT2 that captures all syntactic relations of the corpus by merging the trees of the individual sentences. The nodes are represented by the lemmas (which are not subject to spelling variation) and the edges by the dependency relation linking two lemmas. The text is slightly preprocessed in order to group some terms that typically vary from one instance of a formula to another (proper names, numbers, and dates). With this experiment, we build on Passarotti (2015), where the treebank was transformed into a network using a similar procedure.
We carry out the analysis of the network by focussing on two aspects:
- the nodes linked by the heaviest edges, i.e., the lemmas that are the most frequently linked by dependency relations;
- the nodes aggregated in communities based on the Louvain algorithm (Blondel et al. 2008), i.e., the lemmas that appear to share a “syntactic cluster”.
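The first of these two proxies can be illustrated with a minimal sketch. It is not the authors' implementation: the toy sentences, the slot labels `<NAME>` and `<NUM>`, and the function names `build_network` and `heaviest_edges` are all invented for illustration, and the UPOS tags and relation labels merely follow Universal Dependencies conventions. Each edge is assumed to be a (head lemma, relation, dependent lemma) triple whose weight is its frequency across the merged sentence trees. Community detection with the Louvain algorithm is not shown, since it would normally be delegated to a network library.

```python
from collections import Counter

# Assumption: variable slot fillers are recognised by their UPOS tag and
# replaced with a placeholder, so that instances of one formula collapse.
SLOT_BY_UPOS = {"PROPN": "<NAME>", "NUM": "<NUM>"}

# Invented toy data: one list per sentence, one tuple per dependency arc:
# (head lemma, head UPOS, relation, dependent lemma, dependent UPOS).
TOY_SENTENCES = [
    [("signum", "NOUN", "nmod", "manus", "NOUN"),
     ("manus", "NOUN", "nmod", "Petrus", "PROPN")],
    [("signum", "NOUN", "nmod", "manus", "NOUN"),
     ("manus", "NOUN", "nmod", "Iohannes", "PROPN")],
    [("vendo", "VERB", "obj", "terra", "NOUN"),
     ("vendo", "VERB", "obl", "Petrus", "PROPN")],
]


def normalise(lemma, upos):
    """Replace proper names and numbers with a slot label."""
    return SLOT_BY_UPOS.get(upos, lemma)


def build_network(sentences):
    """Merge all sentence trees into one weighted edge list.

    Returns a Counter mapping (head, relation, dependent) to the number
    of times that arc occurs in the corpus.
    """
    edges = Counter()
    for sentence in sentences:
        for head, head_upos, relation, dep, dep_upos in sentence:
            edge = (normalise(head, head_upos), relation,
                    normalise(dep, dep_upos))
            edges[edge] += 1
    return edges


def heaviest_edges(edges, n):
    """Return the n most frequent edges, ties broken alphabetically."""
    return sorted(edges.items(), key=lambda item: (-item[1], item[0]))[:n]


network = build_network(TOY_SENTENCES)
for edge, weight in heaviest_edges(network, 2):
    print(edge, weight)
```

In this toy corpus the two personal names collapse into a single `<NAME>` node, so the arc from manus to the name slot counts twice rather than once per name. That merging is what the preprocessing step is meant to achieve.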
We compare these two proxies of formulaicity to the existing manual binary formulaicity annotation in the corpus (where each sentence is marked as formulaic or non-formulaic; Korkiakangas & Lassila 2013: 66–67) as well as to a sample of more fine-grained manual annotation created for the present investigation.
Preliminary results suggest that the heaviest links correspond to an intuitive idea of formulaicity, while communities, despite binding together words that tend to appear in formulaic expressions, suffer from two methodological shortcomings: the fact that each node can only be assigned to one community (i.e., each lemma is potentially assigned to only one formula) and the fact that every node has to be assigned to one community, which results in very low-frequency nodes being included.
Blondel, Vincent D., Guillaume, Jean-Loup, Lambiotte, Renaud & Lefebvre, Etienne. 2008. ‘Fast unfolding of communities in large networks’, in Journal of Statistical Mechanics: Theory and Experiment 10, P10008, doi: 10.1088/1742-5468/2008/10/P10008. ArXiv: http://arxiv.org/abs/0803.0476
Korkiakangas, Timo. 2021. ‘Late Latin Charter Treebank: contents and annotation’, in Corpora 16:2, 191–203. https://tuhat.helsinki.fi/ws/portalfiles/portal/128999342/corpora_korkiakangas_Accepted_Manuscript_AM_deanonymized.pdf
Korkiakangas, Timo & Lassila, Matti. 2013. ‘Abbreviations, fragmentary words, formulaic language: treebanking medieval charter material’, in Mambrini, F., Passarotti, M. & Sporleder, C. (eds.) Proceedings of The Third Workshop on Annotation of Corpora for Research in the Humanities (ACRH-3), 61–72. Bulgarian Academy of Sciences: Sofia. http://bultreebank.org/wp-content/uploads/2017/06/ACRH-3Proceeding.pdf
LLCT2 = Korkiakangas, Timo, Cecchini, Flavio & Passarotti, Marco. 2020. ‘Late Latin Charter Treebank’, in Zeman, D., Nivre, J., Abrams, M. & al., Universal Dependencies 2.6, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University, Czech. https://github.com/UniversalDependencies/UD_Latin-LLCT.
Passarotti, Marco. 2015. ‘What syntax can do for philosophy: a treebank-based network analysis of the verb sum in Thomas Aquinas’, in Rivista di filosofia neo-scolastica 107(1–2), 309–324.
Sabatini, Francesco. 1965. ‘Esigenze di realismo e dislocazione morfologica in testi preromanzi’, in Rivista di Cultura Classica e Medievale 7, 972–998.
Fascione, Sara (part of panel B)
Formularity and idiosyncrasy in Fronto’s letter headings
Ancient epistolography has very few rules regarding the form of both documentary and literary letters, and most of them concern the headings and salutation formulae, which follow codified and extremely repetitive patterns across the centuries. The general trend in Latin epistolography shows how letter writers used distinctive formulae, repeating them with consistency in their correspondences, even when addressing different recipients. In this scenario, Fronto’s epistolary corpus represents a striking exception. He and his addressees use formulaic, traditional expressions in the headings, but at the same time they seem particularly keen to introduce variations. These either outline the evolution of the relationship between the correspondents, or depend on the social status and age of the addressee, or may even be evidence of discrepancies in the editing phases of the letter collection. In fact, due to structural inconsistencies, Fronto’s corpus has been considered a posthumous edition put together after the death of its author by an anonymous pupil or family member. However, a closer look reveals that the letters are gathered according to precise patterns that connect the various parts of the corpus at different levels. Therefore, the paper aims at analyzing the very peculiar oscillation between formularity and idiosyncratic use of headings emerging from Fronto’s letters, in order to assess whether this element may be seen as evidence of the intervention of the author in the making of his epistolary corpus.
Fezer, Katharina
Tracing and comparing formulae in printed and handwritten texts: Methods, issues, challenges
During the Early Modern Period, letter writing manuals (which usually consisted of a set of theoretical rules on how to write letters plus a collection of model letters) played an essential role in epistolography and formed an important source of formulaic language (cf. Große 2020). The number of reprints and editions indicates that these works were frequently used – even if nobody admitted to using them, letter writing being expected to be an original, creative activity far away from any formulaicity (cf. Haroche-Bouzinac 2010).
However, it has often not been possible to carry out a scientific examination of whether the formulae provided by the letter writing manuals can actually be found in real private letters of the same era: As private texts were often deemed neither worthy nor suitable of conservation, there were hardly any known authentic private letters from these earlier periods that could have been analysed. Various attempts have been made to find such sources in archives and other places, and for some languages, corresponding corpora and studies already exist (cf. Rutten/Van der Wal 2012, 2013, Serra 2023, among others), but for French in particular this problem is still acute (see, however, an initial study by Nakagawa (2022)).
My study aims to trace formulaic language on the basis of a newly compiled 17th century French letter corpus which consists of authentic, handwritten letters on the one hand and printed model letters (drawn from letter writing manuals) on the other.
I will describe the different steps that were necessary for the corpus creation: the strategies to find such sources, the digitization of these texts (including the different types of transcription) and their further processing using tools like MaxQDA and Textométrie. Finally, I will present a few results of the quantitative and qualitative analyses made possible by this corpus: Besides answering the question to what extent the formulae used in the two different types of letters coincide, particular attention will be paid (1) to the formulae that were promoted in the letter writing manuals but criticized in other meta-linguistic works (grammars etc.) of that period, e.g. due to their pleonastic nature, and (2) to hypercorrections or other deviations from the grammatical norms (morphosyntactic alignment etc.) that allow conclusions about the individual linguistic competence of the writers.
Große, Sybille: Über das Wandern von Worten, Formeln und Traditionen in der west- und mitteleuropäischen Epistolographie des 17. und 18. Jahrhunderts, in: Dominika Bopp / Stefaniya Ptashnyk / Kerstin Roth / Tina Theobald (eds.): Wörter – Zeichen der Veränderung, Berlin / Boston 2020, 319–341.
Haroche–Bouzinac, Geneviève: Dames et cavaliers, doctes, épistoliers ordinaires, in Gérard Ferreyrolles (ed.): L’épistolaire au XVIIe siècle, Paris 2010 (Littératures classiques 71), 67–90.
Nakagawa, Ryo: Les formules épistolaires en français aux XVIIe et XVIIIe siècles dans les lettres des réfugiés protestants (Huguenot Library F/AF et F/CA). Linx. Revue des linguistes de l’université Paris X Nanterre 85 (2022).
Rutten, Gijsbert & Marijke J. van der Wal: Functions of epistolary formulae in Dutch letters from the seventeenth and eighteenth centuries. Journal of Historical Pragmatics 13.2 (2012): 173-201.
Rutten, Gijsbert, and Marijke Van der Wal: Epistolary formulae and writing experience in Dutch letters from the seventeenth and eighteenth centuries, in: Touching the past: Studies in the historical sociolinguistics of ego-documents. Amsterdam & Philadelphia: John Benjamins (2013): 45-65.
Serra, Eleonora: Learning to Write Letters in Sixteenth Century Florence: Epistolary Formulae in the Correspondence of Lucrezia Albizzi Ricasoli. Linguistica 63.1-2 (2023): 273-300.
Giannikou, Kyriaki
Assessing and Reassessing Formulaicity: are editorial practices a blessing or a curse?
Formulaicity is a widely discussed concept in the study of historical Greek, primarily due to the influence of the Homeric epics, where it is traditionally understood to arise from oral contexts in which formulaic sequences reduce processing effort during lengthy recitations. Besides that, formulaic language also appears in entirely written contexts, such as post-classical Greek administrative and legal documents, where high standardisation meets the need for accuracy and efficiency (see e.g. Nachtergaele 2023; Saradi 2019). The corpus I focus on, Byzantine book epigrams (short, metrical texts found in the margins of Byzantine manuscripts), presents a unique case. These paratexts, embedded in the medieval manuscript tradition, blend literary and documentary functions without any oral performance context, oscillating between practical precision and creative expression. This paper explores a methodological challenge in studying formulaic language within historical Greek corpora, focusing specifically on the Database of Byzantine Book Epigrams (DBBE).
Even recent comprehensive research on Homer’s formulaic language (Bozzone 2024) relies on modern editions of the Homeric epics that attempt to reconstruct an ‘archetype’ based on medieval manuscript ‘witnesses’. In contrast, the DBBE diverges from strict adherence to traditional editorial practices by presenting epigrams preserving all original scribal choices (‘Occurrences’) while also offering ‘normalised’ versions (‘Types’) that group similar instances of the originals (Ricceri et al. 2023). This raises questions: To what extent can we rely on edited texts to analyse formulaicity? How might editorial choices, driven by the desire for a cohesive text, obscure the original variability of formulaic sequences? Does the interaction between formulaicity and editorial practices facilitate research, or does this create the impression of greater fixedness in formulae, potentially skewing certain aspects of the analysis?
This paper explores the potential impact of editorial intervention on formulaicity research, advocating for a more flexible methodology that balances the use of both edited and original sources. Through a case study on supplications for salvation within a subset of the DBBE corpus, I will demonstrate how formulaic expressions function in this hybrid referential-poetic (cf. Jakobson 1960) context, and how editorial practices may shape our understanding of formulaicity. Ultimately, this study seeks to position this material within the broader framework of formulaicity research and to discuss the implications of editorial practices for linguistic research in historical corpora.
Bird, G. D. 2010. Multitextuality in the Homeric Iliad: The witness of the Ptolemaic papyri. Washington-Cambridge.
Bozzone, C. 2024. Homer’s living language: Formularity, dialect, and creativity in oral-traditional poetry. Cambridge.
Jakobson, R. 1960. ‘Closing Statements: Linguistics and Poetics’. In Thomas A. Sebeok (Ed.). Style In Language. Cambridge, 350–377.
Nachtergaele, D. 2023. The formulaic language of the Greek private papyrus letters. Leuven.
Ricceri, R. et al. 2023. ‘The Database of Byzantine Book Epigrams project: Principles, challenges, opportunities.’ Journal of Data Mining and Digital Humanities.
Saradi, H. 2019. ‘Rhetoric and Legal Clauses in the Byzantine Wills of the Athos Archives: Prooimia and Clauses of Warranty.’ In O. Delouis and K. Smyrlis (Eds.). Lire Les Archives de l’Athos, Actes Du Colloque Réuni à Athènes Du 18 Au 20 Novembre 2015 à l’occasion Des 70 Ans de La Collection Refondée Par Paul Lemerle, 23/2, 357–388.
Small, J. P. 1997. Wax tablets of the mind: Cognitive studies of memory and literacy in classical antiquity. London.
Ginevra, Riccardo, Biagetti, Erica, Brigada Villa, Luca & Zanchi, Chiara
Comparing Indo-European Poetic Languages: How to Combine Construction Grammar and Digital Resources for the Analysis of Formulaic Phraseology in Vedic Sanskrit and Homeric Greek
Soon after Parry (1971[1928]) and Lord (1960) first demonstrated the oral-formulaic character of Homeric poetry, scholars like Magoun (1953) and Kiparsky (1976) drew attention to its relevance for other poetic traditions too – the Old English and the Vedic Sanskrit ones, respectively. Such correspondences allowed Indo-Europeanists to develop a historical-comparative methodology for the analysis and reconstruction of Indo-European formulaic phrases, e.g. Campanile (1977), Watkins (1995), and García Ramón (2021).
Kiparsky (1976) already stressed the strong similarity between formulas and idioms from a linguistic perspective. Research on idiomatic expressions eventually led to Construction Grammar (Fillmore et al. 1988; Goldberg 1995), an approach that assumes no strict division between lexicon and syntax, but rather a continuum from “constructions” (i.e. “learned pairings of form with semantic or discourse function”; Goldberg 2006: 5) that are more phonologically fixed (e.g., lexemes, fixed idioms) to constructions that are more schematic (e.g., syntactic constructions, flexible idioms). Construction-based approaches are able to capture both fixed repetitions and more flexible or schematic patterns of poetic language, as first argued by Bozzone (2014) for Homeric formulas and by Frog (2014) for Old Norse kennings, and are thus highly relevant to the historical-comparative analysis and reconstruction of Indo-European formulaic patterns (Ginevra 2021; 2023).
As proposed by Biagetti (2023), in our presentation we will combine a construction-based approach with two types of digital resources, namely TreeBanks (morphosyntactically parsed corpora; Hellwig et al. 2020; Mambrini 2021) and WordNets (lexicosemantic relational databases; Biagetti et al. 2021), allowing for the automatic extraction of formulaic patterns and making research on poetic language replicable and systematic. Building on previous constructionist and computer-based research on Vedic Sanskrit (Brigada Villa et al. 2023) and Homeric Greek (Brigada Villa et al. forthcoming), we will perform a comparative analysis of formulaic patterns including speech verbs in these Indo-European traditions.
For instance, similar patterns are attested in Vedic Sanskrit (1)–(2) and Homeric Greek (3)–(4), involving verbs with call/invoke semantics governing an object X referring to a human or deity and an element Y expressing a purpose. By means of an inductive approach, we will attempt to evaluate if such parallels are best analyzed as reflexes of an inherited Indo-European formulaic construction or rather as independent developments in two related poetic traditions.
(1) indravāyū́_X manojúvā / víprā havanta ūtáye_Y
Indra-Vāyu.acc.du mind-swift.acc.du poet.nom.pl call.3pl.mid help(f).dat
“Indra and Vāyu, mind-swift, do the inspired poets call for help” (RV 1.23.3ab)
(2) ugrám_X pūrvī́ṣu pūrvyáṁ / hávante vā́jasātaye_Y
strong.acc many.loc.pl foremost.acc call.3pl.mid prize_winning(f).dat
“They call on (you) the strong, foremost among the many (peoples), for the winning of prizes” (RV 5.35.6cd)
(3) […] Aléxandrós se_X kaleî oîkon dè néesthai_Y
Alexander.nom 2sg.acc call.3sg home.acc ptc go.inf.mid
“Alexander calls on you to go home” (Il. 3.390)
(4) kiklḗskous’ Aídēn_X kaì epainḕn Persephóneian_X,
call.ptcp.nom.f Hades.acc and dread.acc.f Persephone(f).acc
/ […] / paidì dómen_Y thánaton […]
son.dat give.inf.aor death.acc
“Calling on Hades and dread Persephone to give death to her son” (Il. 9.569–571)
Biagetti, Erica. 2023. Integrare Sanskrit WordNet e Vedic TreeBank: uno studio pilota sulla formularità del Rigveda tra semantica e sintassi. In: I. Bossolino and C. Zanchi (eds.), E pluribus unum. Prospettive sull’Antico Per i Decennalia dei Cantieri d’Autunno: i seminari dell’Università di Pavia dedicati al mondo antico, 45–62. Pavia: PUP.
Biagetti, Erica, Chiara Zanchi, and William M. Short. 2021. Toward the creation of WordNets for ancient Indo-European languages. In: P. Vossen and C. Fellbaum (eds.), Proceedings of the 11th Global Wordnet Conference, 258–266. University of South Africa (UNISA): Global WordNet Association.
Bozzone, Chiara. 2014. Homeric Constructions. PhD thesis, University of California, Los Angeles.
Brigada Villa, Luca, Erica Biagetti, Riccardo Ginevra, and Chiara Zanchi. 2023. Combining WordNets with Treebanks to study idiomatic language: A pilot study on Rigvedic formulas through the lenses of the Sanskrit WordNet and the Vedic Treebank. In: G. Rigau, F. Bond, and A. Rademaker (eds.), Proceedings of the 12th Global WordNet Conference, 133–139. Donostia - San Sebastian: Global Wordnet Association.
Brigada Villa, Luca, Andrea Farina, and Chiara Zanchi. Forthcoming. Formulaic networks as prototypical categories: Combining the Ancient Greek Dependency Treebank with the Ancient Greek WordNet for a pilot study on the Iliad. In: Proceedings of the International Colloquium of Ancient Greek Linguistics, Madrid 2022.
Campanile, Enrico. 1977. Ricerche di cultura poetica indoeuropea. Pisa: Giardini.
Fillmore, Charles J., Paul Kay and Mary Catherine O'Connor. 1988. Regularity and Idiomaticity in Grammatical Constructions: The Case of Let Alone. Language 64.3.501–538.
Frog. 2014. Oral Poetry as Language Practice: A Perspective on Old Norse dróttkvætt Composition. In: P. Huttu-Hiltunen et al. (eds.), Song and Emergent Poetics – Laulu ja runo – Песня и видоизменяющаяся поэтика, 279-307. Kuhmo: Juminkeko.
García Ramón, José Luis. 2021. Poética, léxico, figuras: fraseología y lengua poética indoeuropea. In: L. Galván (ed.), Mímesis, acción, ficción: Contextos y consecuencias de la «Poética» de Aristóteles, 11–57. Kassel: Reichenberger.
Ginevra, Riccardo. 2021. Metaphor, metonymy, and myth: Persephone’s death-like journey in the Homeric Hymn to Demeter in the light of Greek phraseology, Indo-European poetics, and Cognitive Linguistics. In: I. Rizzato, F. Strik Lievers & E. Zurru (eds.), Variations on Metaphor, 181–211. Newcastle upon Tyne: Cambridge Scholars.
Ginevra, Riccardo. 2023. Loki’s Chains, Agni’s Yoke, Prometheus Bound, and the Old English Boethius: Indo-European Myths of the “Binding/Yoking of Fire-Gods” in the Light of Comparative Poetics and Cognitive Linguistics. Indogermanische Forschungen 128.203-252.
Goldberg, Adele E. 1995. Constructions: A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.
Goldberg, Adele E. 2006. Constructions at Work: The Nature of Generalization in Language. Oxford: Oxford University Press.
Hellwig, Oliver, Salvatore Scarlata, Elia Ackermann, and Paul Widmer. 2020. The Treebank of Vedic Sanskrit. In: N. Calzolari, F. Bechet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi et al. (eds.), Proceedings of The 12th Language Resources and Evaluation Conference (LREC 2020), 5137-5146. Marseille, France: European Language Resources Association.
Kiparsky, Paul. 1976. Oral Poetry: Some Linguistic and Typological Considerations. In: B. A. Stolz and R. S. Shannon (eds.), Oral Literature and the Formula, 73–106. Ann Arbor: Center for Coordination of Ancient and Modern Studies.
Lord, Albert B. 1960. The Singer of Tales. Cambridge MA: Harvard University Press.
Magoun, Francis P. Jr. 1953. Oral-Formulaic Character of Anglo-Saxon Narrative Poetry. Speculum 28.3.446–467.
Mambrini, Francesco. 2021. Universal Dependencies Conversion of the Ancient Greek Dependency Treebank. https://github.com/francescomambrini/katholou/tree/main/ud_treebanks/agdt/data
Parry, Milman. 1971 [1928]. The Traditional Epithet in Homer. In: Adam Parry (ed.), The Making of Homeric Verse: The Collected Papers of Milman Parry, 1–190. Oxford: Oxford University Press.
Watkins, Calvert. 1995. How to Kill a Dragon: Aspects of Indo-European Poetics. New York and Oxford: Oxford University Press.
Groot, Hester
Identity construction and genre shift through formulaic language in Scottish pauper letters, 1750-1900
Letter-writing formulae represent an important genre feature in cross-regional pauper letter traditions, and serve both practical and stylistic functions (Gardner 2023). In the case of Scotland, the eighteenth- and nineteenth-century poor often had little writing experience, but the poor relief system necessitated the composition of letters detailing their plights and making official requests. To formulate these letters properly and lend them legitimacy, paupers drew on genre conventions, stylistic and textual, as a formulating help (Rutten & van der Wal 2014: 171). Formulaic language therefore serves as a genre marker, situating a letter within a textual tradition. Within pauper letters, Jones & King (2015) have observed a continuum between two genre types: the petition (marked by formal, distant language, which often featured archaisms no longer found elsewhere in the language of the Scottish lower-class population) and the familiar letter (less rigidly formulaic, and with more oral characteristics). The occurrence of various formulae helps us identify where a pauper letter falls on this cline, and consequently, the communicative strategy mastered and deployed by the petitioner.
This paper will investigate the formulaic strategies used by the Scottish poor in pauper letters written between 1750 and 1900. These materials are part of the ScotPP corpus (Leiden University, under construction), and offer an important and innovative insight into the voices of the underrepresented historical Scottish lower classes. I compare the writing practices of Scottish paupers to the patterns Gardner (2023) identifies among their English contemporaries, whose use of genre and formulae differs strikingly despite their shared use of written English. In a diachronic and diatopic investigation of letters from 20 Scottish counties, using systematic corpus linguistic methods, I investigate how paupers follow, adapt, or deviate from genre trends through their choice of various formulae. This choice allows them to construct an identity and position themselves hierarchically relative to the parish boards and overseers to whom they are writing, and it reflects the changing hierarchical relations between petitioner and addressee that Jones & King (2019) observed locally in the Scottish Highlands. These findings show how the writers of these letters were, despite often limited schooling, able to exert agency over their self-representation and positionality through the linguistic means and genre constraints at their disposal.
Gardner, Anne-Christine. 2023. English pauper letters in the eighteenth century and beyond: On the variability and evolution of a new text type. Linguistica 63.1-2, 301–336.
Jones, Peter & Steven King. 2015. From petition to pauper letter: the development of an epistolary form. In Peter Jones & Steven King (eds.), Obligation, Entitlement and Dispute under the English Poor Laws, 53–77. Cambridge: Cambridge Scholars.
Jones, Peter & Steven King. 2019. Voices from the far north: Pauper letters and the provision of welfare in Sutherland, 1845–1900. Journal of British Studies 55(1), 76–98.
Rutten, Gijsbert & Marijke van der Wal. 2014. Letters as Loot. A sociolinguistic approach to seventeenth- and eighteenth-century Dutch. Amsterdam: Benjamins.
Große, Sybille
Formulaicity in French letters: function and acquisition in theory and empiricism
Formulas are described in various contexts, which hinders their establishment as a distinct and widely accepted term. This limitation also applies to the formulaic nature of specific components of letters. Bruneton-Governatori and Moreux (1997: 82) refer to the existence of predetermined models (préécrits), which can also be interpreted as writing routines in line with Gülich’s (1997) definition. Rutten and van der Wal (2014: 82) implement a functional typology of specific epistolary formulas, distinguishing between text-type formulas (e.g., letter openings) and text-structure formulas. While this distinction is conceptually clear, accurately differentiating text-type formulas from other components of letters remains a persistent challenge in the digital annotation of letter corpora.
In various studies, we have analysed the openings and closings of French letters written by predominantly less experienced writers (Grosse et al. 2016; Steuckardt et al. 2020, 2022). Our findings indicate that, alongside various communicative routines, the writers we examined employ certain stereotypical formulas in their correspondence, which they either reproduce faithfully in their traditional form or adapt through micro-variations.
In recent years, the role of formulas in discourse has also been explored in conjunction with cognitive considerations and questions in construction grammar, where they are viewed as conventionalized construction patterns. This topic has been addressed in the field of language acquisition research (Filatkina 2018: 38-45).
It is reasonable to assume that these formulas used in written communication are transmitted in diverse ways across different communication communities, a phenomenon typically referred to as ‘discourse tradition’ in Romance studies.
In this presentation on French letters, a distinction will be made between writing routines that tend to be taught implicitly and epistolary formulas that are acquired as part of explicit norm descriptions in different contexts (‘socialisation écrite’ - Große/Sowada 2020).
On a functional level, there has been limited research on how formulas in texts, including letters, can support text production for writers and facilitate text reception for readers (referred to as the ‘cognitive load factor’ - Meier 2020: 21-22). Consequently, our presentation will also focus on the transition from these formulas to the body of the letter.
Bubenhofer, Noah (2009): Sprachgebrauchsmuster. Korpuslinguistik als Methode der Diskurs- und Kulturanalyse, Berlin: de Gruyter.
Bruneton-Governatori, Ariane/Moreux, Bernard (1997): „Un modèle épistolaire populaire“, in: Fabre, Daniel (ed.): Par écrit. Ethnologie des pratiques d’écriture quotidiennes, Paris: Éditions de la Maison des Sciences de l’Homme, 79-103.
Corrigan, Roberta/Moravcsik, Edith/Ouali, Hamid/Wheatley, Kathleen (2009): “Introduction. Approaches to the study of formulae”, in: Corrigan, Roberta et al. (2009a) (eds.): Formulaic language. Volume 1: Distribution and historical change, Amsterdam/Philadelphia: Benjamins, XI-XXIV.
Coulmas, Florian (1981): Routine im Gespräch. Zur pragmatischen Fundierung der Idiomatik, Wiesbaden: Akademische Verlagsgesellschaft Athenaion.
Coulmas, Florian (1979): “On sociolinguistic relevance of routine formulae”. Journal of Pragmatics 3 (3-4): 239-266. https://doi.org/10.1016/0378-2166(79)90033-X.
Filatkina, Natalia (2018): Historisch formelhafte Sprache. Theoretische Grundlagen und methodische Herausforderungen, Berlin/Boston: de Gruyter.
Große, Sybille/Sowada, Lena (2020): "Socialisation écrite et rédaction épistolaire de scripteurs moins expérimentés – lettres des soldats de la Grande Guerre", Romanistisches Jahrbuch 71, 82-129.
Große, Sybille/Steuckardt, Agnès/Sowada, Lena/Dal Bo, Beatrice (2016): "Du rituel à l’individuel dans les correspondances peu lettrées de la Grande Guerre", in: Neveu, Frank et al. (eds.): Actes du 4e Congrès mondial de linguistique française, EPD Sciences, 1-15. DOI 10.1051/shsconf/20162706008.
Gülich, Elisabeth (1997): „Routineformeln und Formulierungsroutinen. Ein Beitrag zur Beschreibung formelhafter Texte“, in: Wimmer, Rainer (eds.): Wortbildung und Phraseologie, Tübingen: Narr, 131-176.
Meier, Kerstin (2020): Semantische und diskurstraditionelle Komplexität. Linguistische Interpretationen zur französischen Kurzprosa, Berlin/Boston: de Gruyter.
Rutten, Gijsbert/van der Wal, Marijke J. (2014): Letters as Loot. A sociolinguistic approach to seventeenth- and eighteenth-century Dutch, Amsterdam / Philadelphia: Benjamins.
Steuckardt, Agnès/Große, Sybille/Dal Bo, Beatrice/Sowada, Lena (2020): "Le rituel et l’individuel dans les pratiques d’écriture : l’exemple de la clôture dans des correspondances peu lettrées de la Grande Guerre", in: Remysen, Wim/Tailleur, Sandrine (eds.): L’individu et sa langue, Laval: Presses de l’Université de Laval, 103-126.
Steuckardt, Agnès/Große, Sybille/Dal Bo, Beatrice/Sowada, Lena (2022): „La routine et le style. Exploration outillée des formules d’ouverture et de clôture dans des correspondances peu-lettrées de la Première Guerre mondiale“, in: Galleron, Ioanna/Idmhand, Fatiha (eds.): Dix ans de corpus d’auteurs, Paris: Editions des archives contemporaines, 203-220. https://doi.org/10.17184/eac.9782813004352.
Stumpf, Sören/Filatkina, Natalia (2018) (eds.): Formelhafte Sprache in Text und Diskurs, Berlin/Boston: de Gruyter.
Wood, David (2015): Fundamentals of formulaic language. An introduction, London et al. : Bloomsbury.
Wray, Alison (2009): “Identifying formulaic language. Persistent challenges and new opportunities”, in: Corrigan, Roberta et al. (2009a) (eds.): Formulaic language. Volume 1: Distribution and historical change, Amsterdam/Philadelphia: Benjamins, 27-51.
Honkanen, Saara
Formulaicity in Medieval Latin Historical Prose: the Case of Freculf of Lisieux
Scholars interested in formulaic syntax have traditionally focused mainly on non-literary texts, whereas the presence of syntactic patterns, or ‘templates’, in literary genres has so far received less attention as a potentially fruitful area of research.
In this presentation I examine the role of formulaic syntax in Medieval Latin historiography by taking a close look at the narrative style of the 9th-century Frankish historian Freculf of Lisieux. Based on a close reading and a detailed syntactic analysis of a series of narrative episodes (mainly battle sequences) selected from Freculf’s chronicle, I define his preferred syntactic template(s) and illustrate these findings with several concrete examples.
Given that in Antiquity and through to the Middle Ages historical prose was one of the genres regarded as representing ‘high style’, it is perhaps surprising to note just how frequently Freculf has recourse to recurring syntactical patterns to build his historical narrative. Freculf’s style – and his continuous balancing between formulaicity and instances of independent narrative creativity – is to be understood in its historical context, the immediate aftermath of the Carolingian Renaissance. I argue that the constant interplay between template-driven phrasings and regular deviations from them reflects the contradictory nature of Freculf’s linguistic capabilities: as a representative of the generation of writers moulded by the Carolingian language reform and as a pupil of some of the reform’s famous educators, Freculf has a sure grasp of Latin syntactical structures and a clear sense of the ideal of expression of his time (perspicuitas), but his attempts at a higher style often lead him into trouble and betray the limits of his linguistic competence. It seems that staying within the safe confines of learned formulaic expression offers Freculf a simple means to move his narrative along in conformity with the perceived ideal narrative style.
Iezzi, Luca
The pragmatic usage of formulae: Evidence from the Datini Archive (1382-1402)
Mercantile letter-writing in Italy in the fourteenth and fifteenth centuries shows common features, which are visible in the documentation of various companies from different places, written by merchants with varying degrees of writing competence (Guazzelli & Ferrari 2024). As in the letters composed by their European colleagues (see Benucci 2009, Del Lungo Camiciotti 2014, Kittler 2020, Palander-Collin 2009, Trivellato 2009, among others), the invocation and the final part of the letter were mostly formulaic. This contribution aims at exploring the letters within the Datini Archive, specifically those belonging to the Milanese correspondence, written from 1382 to 1402 (Frangioni 1994). These letters were composed by merchants from Milan and from Tuscany. More specifically, the objective of the contribution is to analyse and evaluate the relationship between formulaic elements (such as salutation and benediction) and the informative content of the letters, expanding on the findings of a previous contribution (Ferrari & Iezzi 2024). This makes it possible to identify specific tendencies towards both innovation and closeness as characteristic of some writers, in contrast to the adherence to formulaicity and distancing shown by others. I therefore intend to examine whether the formulaic elements actually reflect both social and individual peculiarities, which help to shape the effectiveness and character of the message.
Benucci, F. (2009). Analysis request strategies within a pragmatic framework in a seventeenth-century epistolary corpus. In De Zordo, O. (Ed.), Saggi di anglistica e americanistica. Ricerche in corso, Firenze: Firenze University Press, 3-33.
Del Lungo Camiciotti, G. (2014). Letters and letter writing in early modern culture: An introduction. Journal of Early Modern Studies 3, 17-35.
Ferrari, V., & Iezzi, L. (2024). Discorsi a confronto tra mercanti. Formularità e rapporti sociali tra Milano e la Toscana (1382-1402). In Consani, C., Guazzelli, F., & Perta, C. (Eds.), Gruppi professionali come fattore di innovazione linguistica. Evidenze documentarie in Europa tra tarda antichità e medioevo, Alessandria: Edizioni dell’Orso, 153-169.
Frangioni, L. (1994). Milano fine Trecento. Il carteggio Milanese dell’Archivio Datini di Prato. Firenze: Opus Libri.
Guazzelli, F., & Ferrari, V. (2024). Formazione scrittoria e competenze linguistiche. Alcune lettere mercantili del fondo datiniano. In Consani, C., Guazzelli, F., & Perta, C. (Eds.), Gruppi professionali come fattore di innovazione linguistica. Evidenze documentarie in Europa tra tarda antichità e medioevo, Alessandria: Edizioni dell’Orso, 135-152.
Kittler, J. (2020). “The pen is so noble and excellent an instrument”: How the medieval merchants and Renaissance diplomats invented the newswriting style. Journalism Studies, 21(10), 1403-1419.
Palander-Collin, M. (2009). Patterns of Interaction: Self-Mention and Addressee Inclusion in the Letters of Nathaniel Bacon and his Correspondents. In Nurmi, A., Nevala, M., & Palander-Collin, M. (Eds.), The Language of Daily Life in England (1400-1800), Amsterdam/Philadelphia: John Benjamins, 53-74.
Trivellato, F. (2009). The Familiarity of Strangers. The Sephardic Diaspora, Livorno, and Cross-cultural Trade in the Early Modern Period. New Haven: Yale University Press.
Kaislaniemi, Samuli
Address formulas and material practices in seventeenth-century English letters
For historical sociolinguists, the formulaicity of letter-writing provides excellent opportunities to study how social relationships are codified in language (e.g. Nevalainen & Raumolin-Brunberg 2017). The relationship between sender and recipient is particularly explicitly expressed in address terms and phrases (Nevala 2007). Letter-writing being a taught skill, most early modern European letter-writing manuals have a section instructing the reader on how to address persons of different social ranks (e.g. Day 1586: 32–34). The instructions apply to the address phrases within the letter’s text, but also to the superscription: the ‘address’ (as we still call it) on top of the letter packet.
In the early modern period, material and visual aspects of letters were just as important in negotiating and establishing social relationships as parts of the text. For instance, distance between the body text, the closing formula, and the signature, could be used to indicate respect and humility (Gibson 1997). Historically, envelopes were not used, and instead the paper the letter was written on was folded to form its own packet – this is today known as letterlocking (Dambrogio et al. 2021). Different social situations required different letterlocking types, and these were taught as part of overall letter-writing skills – although they were not described and explained in contemporary letter-writing manuals. Previous research has shown that some letterlocking types carried clear social meaning (Wolfe 2012), but the overall ‘grammar’ of letterlocking remains largely uncharted.
In this paper, I look at how superscriptions in seventeenth-century English letters match known social relationships of senders and recipients. I will also consider how the letters were folded and sealed. Given that superscriptions are one of the most rigidly formulaic parts of the letter, I expect to find some correlation between the letterlocking and the superscriptions. That is to say, I expect the superscription and the folded and sealed letter to form a semiotic whole, which reflects the social relationship between the sender and the recipient. To that end, my study will chart whether and how variation in the superscription formulas is matched by variation in the letterlockings.
My study corpus consists of letters from the Corpus of Early English Correspondence (CEEC). In addition to the texts of the letters, I have photographs of the source manuscripts, and have surveyed their material features.
Since material aspects of letter-writing are not familiar to most historical linguists, I would like to take advantage of the 10 extra minutes in order to have time to adequately explain epistolary materiality. This will also allow me to fully describe my dataset.
CEEC = Corpus of Early English Correspondence. Compiled by Terttu Nevalainen, Helena Raumolin-Brunberg & the CEEC team at the Department of languages, University of Helsinki. See https://varieng.helsinki.fi/CoRD/corpora/CEEC/index.html.
Dambrogio, Jana, Daniel Starza Smith, Jennifer Pellecchia, Alison Wiggins, Andrea Clarke & Alan Bryson. 2021. “The spiral-locked letters of Elizabeth I and Mary, Queen of Scots”. eBLJ 2021. DOI: 10.23636/gyhc-b427.
Day, Angel. 1586. The English Secretorie. London. STC (2nd ed.) / 6401. British Library. EEBO.
Gibson, Jonathan. 1997. “Significant space in manuscript letters”. The Seventeenth Century 12(1): 1-9.
Nevalainen, Terttu & Helena Raumolin-Brunberg. 2017. Historical sociolinguistics: Language change in Tudor and Stuart England. 2nd edn. Abingdon/New York: Routledge.
Wolfe, Heather. 2012. “ ‘Neatly sealed, with silk, and Spanish wax or otherwise’. The practice of letter-locking with silk floss in early modern England”. In S. P. Cerasano & Steven W. May (eds.), In the Prayse of Writing: Early Modern Manuscript Studies Essays in Honour of Peter Beal. London: British Library, pp. 169–189.
Kayachev, Boris
‘Roses are red and violets are blue’: poetic language between formulaicity and intertextuality (the case of purpureus)
The concept of formula plays an important role both in (various strains of) modern linguistics and in the so-called Oral-Formulaic Theory; despite many differences, they share the insight that formulae have a cognitive basis: rather than being constructed ad hoc every time they are used, formulae are retrieved from the mental lexicon as single – prefabricated – units, often with the corollary that they are also semantic units. This perspective makes sense if speech/text production is viewed as a spontaneous process that takes place in the moment, the cognitive/time constraint being a crucial factor. But what is the place of formulae within the framework of a developed literary tradition, such as classical Latin poetry, which allows a lifetime for the author to create, and for the reader to appreciate, a poem, revisiting it again and again? It might be objected that poetic language is profoundly artificial and thus of little interest to the linguist, but arguably there are other kinds of discourse that allow for, and even encourage, premeditation and self-reflexivity.
In this paper I propose to explore what ‘prefabricated’ can mean in the context of Latin poetry, by investigating formulaic noun phrases that include purpureus ‘purple’ as a modifier, in the (partly overlapping) corpora of the PHI5 (classical poetry and prose) and the Musisque deoque (classical and post-classical poetry) databases (some 440 and 344 entries respectively; my approach is very basic, so 20 min. should suffice). Initial soundings suggest that there may be (at least) five different categories (not necessarily mutually exclusive). (1) Formulae borrowed from everyday language, such as uestis purpurea ‘purple fabric’ (= purpura), relatively frequent in both prose and poetry. (2) Formulae adapted from technical discourse, such as flore purpureo in the description of dictamnum at Virgil, Aeneid 12.413–14, evocative of botanical descriptions in Pliny (e.g. 26.95–6). (3) Imitations of Greek poetry, esp. Homer, such as sale purpureo (lit. ‘purple salt’) at Valerius, Argonautica 3.422 mimicking hala porphyreēn ‘heaving sea’ at Iliad 16.391. (4) Formulae cultivated by a particular poet within his oeuvre, such as purpureus pudor ‘purple shame’ in Ovid (Amores 1.3.14, 2.5.34, Tristia 4.3.70). (5) All of the above may, or may not, become (more or less) established as traditional formulae in subsequent poetry.
While this analysis brings out a number of important questions, such as what exactly purpureus means in formulae like sale purpureo or how and to what extent we can distinguish between the different categories (esp. given the overall patchy state of evidence), I propose to conclude by considering a more general question: why are formulae used in literary poetry, where they are not necessitated by cognitive economy? It is often observed that specific formulae may belong to, in the sense of being conditioned by, specific discourses; this I suggest also gives formulae the potential to be intentionally used so as to evoke specific discourses or intertexts, which makes them attractive for literary poets.
R.J. Edgeworth, ‘Does “purpureus” mean “bright”?’, Glotta 57 (1979), 281–91.
J.M. Foley, Immanent Art: From Structure to Meaning in Traditional Oral Epic (Bloomington, 1991).
Frog and W. Lamb, eds., Weathered Words: Formulaic Language and Verbal Art (Cambridge, MA, 2022).
M. Lapidge, ‘Aldhelm’s Latin poetry and Old English verse’, Comparative Literature 31 (1979), 209–31.
E. Minchin, ‘Poet, audience, time, and text: reflections on medium and mode in Homer and Virgil’, in R. Scodel, ed., Between Orality and Literacy: Communication and Adaptation in Antiquity (Leiden, 2014), 267–88.
W. Moskalew, Formular Language and Poetic Design in the Aeneid (Leiden, 1982).
M. Sale, ‘Virgil’s formularity and pius Aeneas’, in A. MacKay, ed., Signs of Orality: The Oral Tradition and Its Influence in the Greek and Roman World (Leiden, 1999), 199–220.
A. Wray, Formulaic Language and the Lexicon (Cambridge, 2002).
Kootstra-Ford, Fokelien (part of panel A)
Formulaic variation: Leveraging formulaic language to understand linguistic variation in Dadanitic inscriptions (6th – 1st c. BCE)
Dadanitic is the name of a script that was used to carve inscriptions in the ancient oasis of Dadan (modern-day AlUla) in Northwest Arabia, between the 6th and late 1st centuries BCE (for the dating of Dadan, see Rohmer and Charloux 2015; for the definition of Dadanitic, see Macdonald 2000). The inscriptions are written in a Semitic language that was linguistically distinct from, but close to, Arabic (Al-Jallad 2018). The corpus of Dadanitic inscriptions is characterized by its highly formulaic language, yet it displays remarkable variation in its orthography, phonology, and morphology (Kootstra 2023).
This contribution will focus on how formulae form the key to understanding linguistic variation in smaller historical corpora like Dadanitic. It will demonstrate how formulae inform the qualitative investigation of linguistic variation, not only by establishing a linguistic baseline, but also by helping to identify influence from other languages and writing cultures. In addition, it will show how a quantitative approach to the distribution of variation across formulaic types can help us understand how different linguistic variants were used in different contexts. This will underline the influence of using ‘someone else’s language’ while highlighting the space for linguistic innovation and personal expression.
Besides illustrating how formulaic language use can be an asset when studying variation, this will show that Dadanitic was written by authors of varying competence and historical awareness. It will also reveal the impact that the rich multilingual environment in which the Dadanitic writing culture developed had on its written record.
Al-Jallad, Ahmad. 2018. “What Is Ancient North Arabian?” In Re-Engaging Comparative Semitic and Arabic Studies, edited by D. Birnstiehl and N. Pat-El, 1–43. Wiesbaden: Harrassowitz.
Kootstra, Fokelien. 2023. The Writing Culture of Ancient Dadan. Studies in Semitic Languages and Linguistics 110. Leiden: Brill.
Macdonald, Michael C.A. 2000. “Reflections on the Linguistic Map of Pre-Islamic Arabia.” Arabian Archaeology and Epigraphy 11:28–79.
Rohmer, J., and G. Charloux. 2015. “From Liḥyān to the Nabataeans: Dating the End of the Iron Age in North-West Arabia.” Proceedings of the Seminar for Arabian Studies 45:297–320.
Koroli, Aikaterini
Stereotypicality and variation in Greek private papyrus letters: a focus on stereotypical directive speech-acts
The presentation will deal with the phenomenon of formulaicity or stereotypicality in the main body of the Greek non-literary letters preserved on papyrus and ostraca, and dated from the Roman and Byzantine periods of Egypt. The discussion will be divided into two parts.
The first part will focus on the definition and delimitation of the concept of formulaicity/stereotypicality and that of variation in the corpus under study, taking into account its particularities. Variation can be perceived either as situation-specificity or as deliberate deviation from fixed forms of expression, e.g. through the enrichment of commonplace structures or expressions and/or literariness. Since these two concepts are actually the poles of a continuum, one should speak of gradations of formulaicity/stereotypicality (or variation, respectively) expressed in several ways. This part of the discussion will revolve around issues such as: a classification of all kinds of formulas, commonplace expressions and repeated structural/rhetorical patterns; the functions of the aforementioned linguistic devices in the main body of the private papyrus letters; the reasons behind the ancient senders’ tendency to construct their text by resorting to commonplace structures and expressions and, conversely, their tendency to distance themselves from clichés or derivatives; the change in the perception and expression of formulaicity/stereotypicality as a result of the transition from the Roman to the Byzantine period.
Following these introductory remarks, the discussion will then focus on stereotypical directives. This part of the presentation will revolve around the definition and the categories of formulaic requests in the corpus under study, as well as their functions with regard to the structure, register and style of the epistolary text.
Longrée, Dominique & Vanni, Laurent (part of panel B)
New Ways to identify Formulaic Expressions in Latin Epistolography: Between Statistics and AI
In this paper, we will first briefly review the different types of patterns considered “phraseological” by linguists and specify which ones could be considered “formulaic”. We will then set out some of the difficulties their automatic detection meets and evaluate some pattern detection techniques (combining data mining and statistics) in order to assess their performance, advantages and disadvantages. We will finally explore to what extent the use of HyperDeep, an AI tool, may prove useful, or even indispensable, in this field. The methods will be applied to a Latin epistolary corpus, from Classical times to the medieval period.
The automatic detection of “formulaic expressions” meets various difficulties when the notion is extended to non-totally fixed patterns (Longrée & Mellet, 2013), as some formulaic expressions may be: unlike repeated segments (Salem, 1986), verbal tense sequences (Longrée & Luong, 2003) or syntactic motifs (Mellet & Longrée, 2009), some non-totally fixed patterns (Fillmore et al., 1988; Sinclair, 1991: 111-112; Gledhill & Frath, 2007) consist of multidimensional phraseological patterns made up of items of several types (forms, lemmas, grammatical categories, etc.) and allowing some variations. Each of these variations is a new challenge for automated detection based on sequential search.
A tricky parameter is the frequency of the patterns: repetitions are necessary to ensure the memorization of patterns (Lavigne et al., 2016; Longrée, Mellet & Lavigne, 2019), but a high frequency of a pattern does not always mean that the pattern is formulaic: for instance, syntactic patterns are highly frequent but cannot be considered formulaic offhand. In fact, formulaic expressions are identified not only by their repetition but also by their textual function (a structuring or characterizing one) and can therefore be considered a particular category of “textual motifs” (Longrée & Mellet, 2018).
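As a rough illustration of why such patterns resist plain sequential search, the following minimal Python sketch matches a pattern whose slots constrain different layers (form, lemma, part of speech) and which tolerates a bounded number of intervening tokens. It is not one of the tools discussed in this abstract; the token layout, the tag labels and the toy Latin tokens are assumptions made for the example only.

```python
from typing import Dict, List

Token = Dict[str, str]  # hypothetical layout: {"form": ..., "lemma": ..., "pos": ...}
Slot = Dict[str, str]   # constraints on any subset of those layers


def slot_matches(token: Token, slot: Slot) -> bool:
    """A token satisfies a slot if every constrained layer agrees."""
    return all(token.get(layer) == value for layer, value in slot.items())


def find_pattern(tokens: List[Token], pattern: List[Slot], max_gap: int = 0) -> List[List[int]]:
    """Return the token indices of every match of `pattern`.

    Up to `max_gap` unconstrained tokens may stand between two
    consecutive slots, which models a non-totally fixed pattern.
    """
    matches: List[List[int]] = []

    def extend(slot_idx: int, lo: int, hi: int, picked: List[int]) -> None:
        if slot_idx == len(pattern):
            matches.append(picked)
            return
        for i in range(lo, min(hi, len(tokens))):
            if slot_matches(tokens[i], pattern[slot_idx]):
                # The next slot must fall within the allowed gap window.
                extend(slot_idx + 1, i + 1, i + 2 + max_gap, picked + [i])

    extend(0, 0, len(tokens), [])
    return matches


# Toy data (invented for illustration): one slot per layer type.
pattern = [{"form": "si"}, {"lemma": "ualeo"}, {"pos": "ADV"}, {"lemma": "sum"}]
fixed = [
    {"form": "si", "lemma": "si", "pos": "SCONJ"},
    {"form": "uales", "lemma": "ualeo", "pos": "VERB"},
    {"form": "bene", "lemma": "bene", "pos": "ADV"},
    {"form": "est", "lemma": "sum", "pos": "VERB"},
]
varied = fixed[:1] + [{"form": "tu", "lemma": "tu", "pos": "PRON"}] + fixed[1:]
```

With `max_gap=0` only the fixed sequence is found; allowing one intervening token also recovers the varied one, at the cost of a larger search space for every additional degree of freedom.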
Despite these difficulties, several tools have been proposed since the beginning of the century in order to detect “textual motifs” (see Ganascia, 2001). We will test some of them: SDMC (Sequential Data Mining under Constraints: https://sdmc.greyc.fr/index.php; Cellier et al., 2012; Quiniou et al., 2012), Le lexicoscope (Kraif 2016; 2019) and Hyperbase (https://hyperbase.unice.fr/; Vanni 2024). In order to assess the advantages and disadvantages of each of these methods, we will apply them to a corpus of Latin epistolary texts processed using LASLA methods. We will finally compare the “linguistic patterns” we extracted with those the new software HyperDeep (based on CNN and Transformers; Vanni et al., 2018, 2023; 2024) identifies in the same corpus and show the added value of this method.
Cellier, P., Quiniou, S., Charnois, Th., & Legallois, D. (2012). What About Sequential Data Mining Techniques to Identify Linguistic Patterns for Stylistics? In Lecture Notes in Computer Science, Springer Vol. 7181, 166-177.
Fillmore C.J., Kay P. & O’Connor M.C. (1988), Regularity and Idiomaticity in Grammatical Constructions: the Case of Let Alone, Language 64 (3), 501-538.
Ganascia, J. G. (2001). Extraction automatique de motifs syntaxiques, in Actes de la 8ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN’2001). Tours (France), edited by Jean Véronis, Laurence Danlos, Pierre Zweigenbaum, Nathalie Gasiglia, and Pascal Amsili. Accessed January 28, 2019. http://talnarchives.atala.org/TALN/TALN-2001/taln-2001-long-017.pdf.
Gledhill C. & Frath P. (2007), Collocation, phrasème, dénomination: vers une théorie de la créativité phraséologique, in La Linguistique 43 (1), 63-88.
Kraif, O. (2016). Le lexicoscope: un outil d’extraction des séquences phraséologiques basé sur des corpus arborés, in Cahiers de lexicologie, 108, 91-106.
Kraif, O. (2019). Explorer la combinatoire lexico-syntaxique des mots et expressions avec le Lexicoscope, in Langue française, 203, 67-83.
Lavigne, F., Longrée, D., Mellet, S., & Mayaffre, D. (2016). Semantic Integration by Pattern Priming: Experiment and Cortical Network Model, in Cognitive Neurodynamics, DOI 10.1007/s11571-016-9410-4, 1-21
Longrée, D., & Luong, X. (2003). Temps verbaux et linéarité du texte: recherches sur les distances dans un corpus de textes latins lemmatisés, in Corpus, 2
Longrée, D., & Mellet, S. (2013). Le motif: une unité phraséologique englobante? Étendre le champ de la phraséologie de la langue au discours, in Langages 189, 68-80.
Longrée, D., & Mellet, S. (2018). Towards a topological grammar of genres and styles: a way to combine paradigmatic quantitative analysis with a syntagmatic approach, in The Grammar of Genres and Styles: From Discrete to Non-Discrete Units, edited by Dominique Legallois, Thierry Charnois, and Meri Larjavaara, 140–163. Berlin/Boston: de Gruyter.
Longrée, D., Luong, X., & Mellet, S. (2008). Les motifs: un outil pour la caractérisation topologique des textes, in S. Heiden & B. Pincemin, Actes des JADT 2008, 9èmes Journées internationales d’Analyse statistique des Données Textuelles, Lyon, 12-14 mars 2008 (pp. 733-744). Lyon, France: Presses ENS
Longrée, D., Mellet, S., & Lavigne, F. (2019). Construction cognitive d’un motif: cooccurrences textuelles et associations mémorielles, in CogniTextes. doi:10.4000/cognitextes.1202
Mellet, S., & Longrée, D. (2009). Syntactical Motifs and Textual Structures. Considerations based on the Study of a Latin historical Corpus, in Belgian Journal of Linguistics, 23. doi:10.1075/bjl.23.13mel
Mellet, S., & Longrée, D. (2012). Légitimité d'une unité textométrique: le motif, in A. Dister, D. Longrée, G. Purnelle (Eds.), Actes des Journée d'analyse des données textuelles 2012 (pp. 715-728).
Quiniou, S., Cellier, P., Charnois, Th., & Legallois, D. (2012). Fouille de données pour la stylistique: l’exemple des motifs émergents, in Actes des 11èmes Journées Internationales d'analyse statistique des données textuelles, Liège, 13-15 juin 2012, 821-833.
Salem, A. (1986), Segments répétés et analyse statistique des données textuelles, in Histoire & Mesure 1986, 1-2, 5-28
Sinclair J. (1991), Corpus, Concordance, and Collocation, Oxford: Oxford University Press.
Vanni L., Hyperbase Web. (Hyper)Bases, Corpus, Langage, in Corpus, 2024, 25, ⟨10.4000/corpus.8770⟩. ⟨hal-04523479⟩
Vanni L., Corneli M., Mayaffre D., & Precioso F (2023). From text saliency to linguistic objects: learning linguistic interpretable markers with a multi-channels convolutional architecture, in Corpus 24 https://journals.openedition.org/corpus/7667
Vanni L., Hadi M., Longrée D & Mayaffre D. (2024), Multi-channel Convolutional Transformer and intertextuality: a Latin case study, in Misuraca M. & Giordano G., New Frontiers in Textual Data Analysis, Springer, forthcoming.
Vanni, L., Mayaffre, D., & Longrée, D. (2018). ADT et deep learning, regards croisés. Phrases-clefs, motifs et nouveaux observables, in 14es Journées internationales d’Analyse statistique des Données Textuelles. JADT 2018, Rome, p. 459-466.
Maczuga, Julia (part of panel A)
The religious formulae attested in the Arabic graffiti from North-West Arabia during the Late pre-Islamic and Early Islamic periods: A study in continuity and change
The rich corpus of graffiti discovered in northern Saudi Arabia provides a unique opportunity to study the evolving Arabic epigraphic writing culture, as it contains both pre-Islamic Arabic inscriptions written in the so-called “Paleo-Arabic” script dating from the fifth and sixth centuries AD (NEHMÉ 2020: 128) and Early Arabic Islamic inscriptions dating from the first three centuries of Islam (seventh-ninth centuries AD).
Both the pre-Islamic and Early Islamic Arabic inscriptions are characterized by their high level of formulaicity. Until now, the academic community has commonly assumed that the arrival of Islam brought about a significant shift in religious formulae with the introduction of new types of invocations and that the Islamic religious graffiti thus developed independently from earlier writing cultures. Although Islamic material does indeed have unique features, this paper aims to demonstrate that there is also some observable continuity between Paleo-Arabic and Arabic writing cultures, not only in terms of script but also in the application of the religious formulae.
In both Paleo-Arabic and Arabic graffiti, certain formulae use the same wording, such as the introductory phrase bi-smi llāh ‘in the name of God’ (Basmala) (AL-JALLAD 2022). However, there are also phrases that are semantically similar and use the same verbal root, but in which the verbs appear in different grammatical forms. For example, in both Paleo-Arabic and Arabic graffiti, the root ĠFR ‘to forgive’ is used, but in Paleo-Arabic, it appears as yistiġfar (3. masc. sing. imperf) (AL-JALLAD and SIDKY 2021: 210), while in Arabic, iġfir occurs in the imperative mood. Conversely, both writing cultures have an expression that conveys a similar meaning but is expressed using different verbal roots. The phrase ‘whoever invokes/says God’s name’ in Paleo-Arabic is expressed with the root DʿW ‘to invoke’ (ArDA 1, see DicoNab) while in Islamic Arabic, the root QWL ‘to say’ is applied. Although some religious formulae were adopted by early Muslims from earlier inscriptions, the difference in grammatical forms provides clear evidence that these religious formulae continued to evolve. A closer look at the formulaic usage in Paleo- and Early Islamic Arabic inscriptions will provide a more nuanced insight into the dynamics of continuity and change in formulaic and linguistic usage in this period.
AL-JALLAD, A. (2022). A pre-Islamic basmala: reflections on its first epigraphic attestation and its original significance. Jerusalem Studies in Arabic and Islam 52: 1-28.
AL-JALLAD, A. and SIDKY, H. (2021): A paleo-Arabic inscription on the route north of Ṭāʾif. Arabian archaeology and epigraphy 33: 202-215.
DiCoNab: ‘The Digital Corpus of the Nabataean and Developing Arabic Inscriptions’ [Diconab.huma-num.fr]
NEHMÉ, L. (2020): The religious landscape of North-west Arabia as reflected in the Nabataean, Nabataeo-Arabic, and pre-Islamic Arabic inscriptions. Semitica et Classica 13: 127-154.
Majdak, Magdalena
Evolution of the Formulaic Expressions Referring to God in Polish Language History: Analysis of the Correspondence of the Czapski Family
The paper is part of a larger research project on formulaic expressions with the word God in the history of Polish. The aim of this paper is to catalogue formulas (e.g. da Bóg, jeśli Bóg pozwoli, pożal się Boże, z Bogiem) in selected correspondence from the Baroque period, to compare this collection with the resources of Polish from the 20th and 21st centuries, in which such formulas are constantly present, and to examine whether these constructions maintain, lose or acquire formulaicity. The phrases selected for the presentation contain the unit God; they are not always conscious references to God, but rather semi-magical formulas referring to a higher authority organizing the world and reality.
The material basis consists of letters obtained as part of the projects Edition of the letters of Magdalena née Czapska to Hieronim Florian Radziwiłł (2013–2016) and Sources on the Czapski Family in the 18th Century: Ego-documents of the Family Members of Pomeranian Voivode Piotr Jan (1685-1736) – A Philological-Historical Study and Edition (2022-2027). The correspondence, consisting of nearly 500 letters, was created in the 18th century in a typical and at the same time original family of people from the upper social class, communicating both with the outside world and within the family. The letters were transliterated and transcribed, and some of them were published (2016). This resource was compared with three subcorpora: a) letters from King Jan III Sobieski to his wife Marysieńka (1665-1683) from The Electronic Corpus of 17th- and 18th-century Polish Texts (KorBa), b) the remaining letters from KorBa, and c) the remaining text genres in KorBa, divided into Baroque and Enlightenment.
The corpus's rich search options and advanced query syntax are helpful here: they make it possible to query for missing variants of a formula's elements without assuming its components a priori, including variants with variable order (e.g. uchowaj Boże, Boże uchowaj). They also provide information about the frequency of n-grams containing the unit God in the Czapski correspondence relative to the reference subcorpora. The material will then be compared with formulas containing the unit God extracted using the Kolokator program from the National Corpus of the Polish Language (NKJP), where similar forms are still present, and from the NKJP letter subcorpus.
The analysis includes: 1. Extraction of formulas with the unit God from the Czapski family correspondence, 2. Determination of the canonical form and variants of the formulas, assigning them grammatical and syntactic characteristics, 3. Comparison of these formulas with formulas with the unit God: a) in the collection of Sobieski's letters to Marysieńka, b) in the reference subcorpora, c) in the NKJP. 4. Discussion of changes in the strength of the formula and its equivalent non-formula structures in the examined epistolographic material. The analysis of the aforementioned formulas also includes material from historical and phraseological dictionaries of the Polish language. This allows for deeper considerations on the subject of formulaicity based on the potential evolution of the meanings and uses of the formulas studied.
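The idea of retrieving variants regardless of word order can be illustrated with a small Python sketch. It does not reproduce the query syntax of KorBa or the Kolokator program; it assumes, purely for illustration, that the letters are already available as lists of lemmas, and it counts the immediate neighbours of a target lemma as unordered pairs, so that uchowaj Boże and Boże uchowaj fall together.

```python
from collections import Counter
from typing import FrozenSet, List


def neighbour_pairs(sentences: List[List[str]], target: str) -> Counter:
    """Count unordered (target, neighbour) lemma pairs.

    Only the lemmas directly before and after each occurrence of
    `target` are considered; word order is deliberately ignored.
    """
    counts: Counter = Counter()
    for lemmas in sentences:
        for i, lemma in enumerate(lemmas):
            if lemma != target:
                continue
            for j in (i - 1, i + 1):
                if 0 <= j < len(lemmas):
                    pair: FrozenSet[str] = frozenset((target, lemmas[j]))
                    counts[pair] += 1
    return counts


def per_thousand(count: int, corpus_tokens: int) -> float:
    """Normalise a raw count so subcorpora of different sizes can be compared."""
    return 1000 * count / corpus_tokens


# Hypothetical lemmatised input for: uchowaj Boże, Boże uchowaj, da Bóg.
sample = [["uchować", "Bóg"], ["Bóg", "uchować"], ["dać", "Bóg"]]
pairs = neighbour_pairs(sample, "Bóg")
```

Comparing the normalised counts obtained for the Czapski letters with those for each reference subcorpus would then show which pairs are over- or under-represented; a real study would of course rely on the corpus's own lemmatisation and a wider window.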
‒ Włodzimierz Gruszczyński, Dorota Adamiec, Renata Bronikowska, Witold Kieraś, Emanuel Modrzejewski, Aleksandra Wieczorek, Marcin Woliński 2021. The Electronic Corpus of 17th- and 18th-century Polish Texts, „Language Resources & Evaluation”, t. 56, z. 1, s. 309-332, https://link.springer.com/article/10.1007/s10579-021-09549-1
‒ Mikhail Mikhailov (2021), God, the Devil, and Christ: A corpus study of Russian syntactic idioms and their English and Finnish translation correspondences [in:] Formulaic language. Theories and methods, Edited by Aleksandar Trklja and Łukasz Grabowski, DOI: 10.5281/ZENODO.4727675
‒ Piotr Pęzik (2012), Wyszukiwarka PELCRA dla danych NKJP. Narodowy Korpus Języka Polskiego. Przepiórkowski A., Bańko M., Górski R., Lewandowska-Tomaszczyk B. (red.). 2012.
‒ Zygmunt Saloni, Marek Świdziński (2007), Składnia współczesnego języka polskiego, Warszawa.
‒ Elena Tognini-Bonelli (2001). Corpus Linguistics at Work, Amsterdam/Philadelphia.
‒ Formulaic Language in Historical Research and Data Extraction https://republic.huygens.knaw.nl/index.php/en/conference-formulaic-language/
‒ Joanna Zaucha (2007). Status językowy porównań standardowych a pojęcie utartości, [in:] Frazeologia a językowe obrazy świata przełomu wieków, red. W. Chlebda, Opole, s. 343-348.
Letters edition
‒ „Gdybym Cię, moje Serce, za męża nie miała, żyć bym nie mogła”. Listy Magdaleny z Czapskich do Hieronima Floriana Radziwiłła z lat 1744–1759, wstęp i opracowanie I. Maciejewska i K. Zawilska, Olsztyn 2016.
Corpora
‒ Electronic Corpus of 17th- and 18th-century Polish Texts (KorBa), https://korba.edu.pl/
‒ National Corpus of the Polish Language, https://nkjp.pl/
Dictionaries
‒ Bańko, M. (Ed.). (2000). Inny słownik języka polskiego. Wydawnictwo Naukowe PWN.
‒ Doroszewski, W. (Ed.). (1958–1969). Słownik języka polskiego PAN. Wydawnictwo Naukowe PWN.
‒ Majdak M. (2024–), Gruszczyński, W. (2004–2023). (Ed.). Elektroniczny słownik języka polskiego XVII i XVIII wieku. Instytut Języka Polskiego PAN. https://sxvii.pl
‒ Mrowcewicz, K., & Potoniec P. (Eds.). (1956–). Słownik polszczyzny XVI wieku. Instytut Badań Literackich.
‒ Skorupka, S. (Ed.). (1967) Słownik frazeologiczny języka polskiego, Warszawa.
‒ Urbańczyk, S. (Ed.). (1953–2002). Słownik staropolski. Instytut Języka Polskiego PAN.
‒ Wielki słownik frazeologiczny PWN z przysłowiami (2022), Warszawa.
‒ Żmigrodzki, P. (2007–). Wielki słownik języka polskiego PAN. Instytut Języka Polskiego PAN.
Marszałek, Jagoda & Wieczorek, Aleksandra
Polish and Latin date formulas used in Polish texts from 17th to 18th centuries
Although the topic of multilingualism is already well explored for the historical languages of Western Europe (Trotter 2000, Adams 2003, Amsler 2012, Pahta et al. 2018), its features and significance for the history of the Polish language have become the subject of scientific research relatively recently (Axerowa 2007, Walczak and Mielczarek 2017, Zarębski 2021, Masłej 2023).
Polish-Latin bilingualism was present in the Polish-speaking area in the Middle Polish era (16th-18th centuries). The Polish literary language was already fully formed in the 16th century, but Latin continued to function in Polish literature in the following two centuries. Some texts were written entirely in Latin, but Latin elements were often incorporated into the uniform Polish text in the form of inlay (Brajerski 1965, Klemensiewicz 2009: 402–409, Lewaszkiewicz & Rzepka 1978, Leszczyński 1983, Dubisz 2002: 222–229, Kopaczyk 2018, Kopaczyk et al. 2016).
The presentation focuses on the coexistence of Latin and Middle Polish, taking as an example the formulaic date expressions in Polish texts from the 17th and 18th centuries. Repetitive expressions referring to time and date constitute one of the most popular lexical bundles, especially in some specialized historical texts (cf. e.g. Kopaczyk 2013: 210). Here are some examples from Middle Polish texts:
1) die 3 iulii anno 1732 – Lat. ‘3rd July 1732’
2) dnia 17. Maja 1628 – Pl. ‘17th May 1628’
3) dnia 31. Aprilis, Anno 1646 – Pl. and Lat. ‘31st April 1646’
4) in Anno 1612. et 1613 – Lat. ‘in the years 1612 and 1613’
5) Czternastego Novembra, Roku 1719 – Pl. ‘14th November 1719’
Several types of date formulas can be pointed out, both Polish and Latin, which, despite being highly repeatable, show significant variation in the elements used. As example 3) shows, they are often themselves a combination of two languages. An additional topic is the use of Polish names of months and names of Latin origin (listopad vs. november, see example 5). Using date formulas as an example, the study addresses the question of why Latin language elements are still present in Polish texts from the period, despite the existence of Polish equivalents.
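The language labels attached to the examples above can be reproduced mechanically, which gives a sense of how such formulas might be sorted in bulk. The Python sketch below is only an illustration under strong assumptions: the two word lists are hand-made from the five examples, and the Polonized month form Novembra is counted as Polish because that is how the example is labelled above. It is not a description of the annotation used in the study.

```python
import re

# Hypothetical mini-lexicons built from the five examples only.
LATIN = {"die", "anno", "in", "et", "iulii", "aprilis"}
# "novembra" is a Latin-origin month name with Polish inflection; it is
# treated as Polish here, following the label given to that example.
POLISH = {"dnia", "roku", "maja", "novembra", "czternastego"}


def date_formula_language(formula: str) -> str:
    """Label a date formula as 'Lat.', 'Pl.', 'Pl. and Lat.' or 'unknown'."""
    # Keep alphabetic words only; day numbers and years carry no language signal.
    words = re.findall(r"[^\W\d_]+", formula.lower())
    languages = set()
    for word in words:
        if word in LATIN:
            languages.add("Lat.")
        elif word in POLISH:
            languages.add("Pl.")
    if languages == {"Lat.", "Pl."}:
        return "Pl. and Lat."
    if len(languages) == 1:
        return next(iter(languages))
    return "unknown"


examples = [
    "die 3 iulii anno 1732",
    "dnia 17. Maja 1628",
    "dnia 31. Aprilis, Anno 1646",
    "in Anno 1612. et 1613",
    "Czternastego Novembra, Roku 1719",
]
labels = [date_formula_language(e) for e in examples]
```

A word-list approach of this kind breaks down quickly on real data (abbreviations, spelling variation, homographs), which is one reason why corpus annotation and metadata are needed for the questions raised here.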
In addition, the research examines how the extra-linguistic context (genre, topic, etc.) may have influenced the choice of a particular date formula, and how these trends have changed over the course of two centuries. Researchers of Middle Polish texts note that the degree of saturation of the text with Latin elements varied depending on the type of text and other extra-linguistic features (cf. e.g. Walczak-Mikołajczakowa and Mikołajczak 2021).
Next, the formal language of date notation and its standardization in Polish baroque texts are discussed. Finally, the research examines the possibilities of annotating date formulas in the Middle Polish Dependency Treebank (Wieczorek 2025).
The presented study is corpus-based. The research material comes from the Electronic Corpus of the 17th- and 18th-century Polish Texts, which gives us many possibilities of searching and analyzing data, also using metadata such as publication time or genre (korba.edu.pl; Gruszczyński et al. 2022; cf. also Bronikowska & Kryńska 2020).
Adams J. N. (2003): Bilingualism and the Latin Language. Cambridge.
Amsler M. (2012): Affective Literacies. Writing and Multilingualism in the Late Middle Ages. Turnhout.
Axerowa, A. (2007): Niespodzianki dwujęzyczności szlacheckiej: Pasek jako orator. “Pamiętnik Literacki. Czasopismo kwartalne poświęcone historii i krytyce literatury polskiej” (2): 207–218. (https://pamietnik-literacki.pl/uploads/settings/2023/05/21/646a1260091c57.89377580_9-axerowa.pdf)
Brajerski T. (1965): Ze składni tekstu makaronizowanego. “Studia z Filologii Polskiej i Słowiańskiej” 5: 237–240.
Bronikowska R. and Kryńska K. (2020): Łacina w KorBie. Użyteczność elektronicznego korpusu tekstów polskich XVII i XVIII wieku dla filologa neolatynisty. “Polonica” 40:123–135.
Dubisz S. (2002): Język – historia – kultura (wykłady, studia, analizy). Warszawa.
Gruszczyński W., Adamiec D., Bronikowska R., Kieraś W., Modrzejewski E., Wieczorek A., and Woliński M. (2022): The Electronic Corpus of 17th- and 18th-century Polish Texts. “Language Resources and Evaluation” 56(1):309–332.
Klemensiewicz Z. (2009): Historia języka polskiego. 9. ed., Warszawa.
Kopaczyk J. (2013): The Legal Language of Scottish Burghs: Standardization and Lexical bundles (1380–1560). Oxford University Press 166.
Kopaczyk J. (2018): Administrative multilingualism on the page in early modern Poland: In search of a framework for written code-switching. In P. Pahta, J. Skaffari, L. Wright (eds.): Multilingual Practices in Language History. English and Beyond. – Berlin–Boston: De Gruyter Mouton, 258–275.
Kopaczyk J., Włodarczyk M., and Adamczyk E. (2016): Medieval Multilingualism in Poland: Creating a Corpus of Greater Poland Court Oaths (Rotha). “Studia Anglica Posnaniensia Adam Mickiewicz University” 51, no. 3: 16–20. (https://doi.org/10.1515/stap-2016-0012)
Lewaszkiewicz T. and Rzepka W. R. (1978): Uwagi o leksyce makaronicznej w tekstach polskich z XVII wieku. “Z polskich studiów slawistycznych” 5: 271–277.
Leszczyński Z. (1983): Echa makaronizowania. “Roczniki Humanistyczne” vol. XXX-XXXI, is. 6 – 1982–1983: 97–104. (https://bibliotekanauki.pl/articles/2127932).
Masłej, D. (2023). Średniowieczne zabytki polsko-łacińskie jako przedmiot badań historycznojęzykowych. Perspektywy badawcze. “Biuletyn PTJ”, LXXIX(79), 355-369. https://doi.org/10.5604/01.3001.0054.2635.
Trotter D. A. (ed.) (2000): Multilingualism in Later Medieval Britain. Cambridge.
Walczak B. and Mielczarek A. (2017): Prolegomena historyczne – wielojęzyczność w Rzeczypospolitej Obojga Narodów. “Białostockie Archiwum Językowe” 17: 255–268. (https://www.ceeol.com/search/viewpdf?id=679921)
Walczak-Mikołajczakowa M. and Mikołajczak A. (2021): Kilka uwag o języku i kontekście kulturowym Diariusza podróżnego hetmana Filipa Orlika. “Poznańskie Studia Polonistyczne Seria Językoznawcza” vol. 28 (48), no 2: 162-167. https://doi.org/10.14746/pspsj.2021.28.2.9
Wieczorek A. (2025): Towards Middle Polish Dependency Treebank. In “Native Language in the 21st Century: System, Communication Practices and Education”. V & R Unipress GmbH. (https://czasopisma.uni.lodz.pl/linguistica/article/view/20488)
Zarębski R. (2021): O potrzebie badań bilingwizmu w historii polszczyzny. “Prace Językoznawcze” XXIII/3: 71–86. (http://uwm.edu.pl/polonistyka/pracejezykoznawcze/pol/pliki/Prace_Jezykoznawcze_23_3_2021.pdf)
Martín González, Elena & Konstantopoulou, Stavroula
Formulaic Language in the Oracular Inscriptions of Dodona: Integrating Traditional Epigraphic Analysis and Deep Neural Networks
The 2013 publication of the corpus of oracular inscriptions from Dodona by Dakaris, Vokotopoulou, and Christidis, which contains over four thousand inscriptions, invites a reassessment of the formulaic language used by consultants in their enquiries to Zeus Naios and Dione. Contrary to previous assumptions, the new evidence reveals a broader range of formulae, although identifiable patterns remain.
As part of our work to produce a new edition of these inscriptions, which combines traditional epigraphic analysis with the assistance of the deep neural network Ithaca (Assael et al. 2022), the formulaic language of the oracular enquiries plays a central role. Epigraphic analysis is essential for restoring the highly fragmentary texts on the lead tablets, providing valuable information about the cult practices at the sanctuary, the language of the enquirers, and even the inscriptions' chronology. Meanwhile, applying Artificial Intelligence to these texts offers an exceptional opportunity to test the performance of the Ithaca model, particularly in its reliance on standard oracular formulas for restoration and its ability to attribute texts geographically and chronologically (Bodel et al. 2024).
Our presentation will introduce the dataset and methodology of our research, emphasizing the importance of combining insights from both traditional analysis and deep neural networks to offer a comprehensive, renewed understanding of the formulaic language in the Dodona oracular tablets.
Assael, Y., Sommerschield, T., Shillingford, B. et al. (2022). Restoring and attributing ancient texts using deep neural networks. Nature 603, 280–283. https://doi.org/10.1038/s41586-022-04448-z.
Bodel, J., Prag, J.R.W. and Roueché, C. (2024). Open Scholarship: Epigraphic Corpora in the Digital Age, in Pierre Fröhlich & Milagros Navarro Caballero (eds.), L’épigraphie au XXIe siècle. Actes du XVIe Congrès International d’Épigraphie Grecque et Latine, Bordeaux, 29 août-02 septembre 2022, Bordeaux, Ausonius, 91-117.
Dakaris, S., Vokotopoulou, I. and A.-Ph. Christidis. 2013. Τα χρηστήρια ελάσματα της Δωδώνης των ανασκαφών Δ. Ευαγγελίδη [Ta chresteria elasmata ton anaskaphon D. Euangelide] Ι-ΙΙ, Athens.
Meeder, Sven & Schmidt, Gleb
Formulae of Authority: Formulaic Aspects of Referencing the Bible in Early Medieval Canon Law
The evolution of canon law in the Early Middle Ages was marked by the constant adaptation and re-adaptation of the normative legacy to new social and spiritual contexts. To remain relevant and authoritative, new collections of canonical texts (whether official or forged in an attempt to appear legitimate) had to conform to the strict constraints of tradition. This left compilers with a rich yet limited set of “conceptual building blocks” (existing norms, exempla, interpretations, and precedents), almost always supported by references to authoritative sources, especially Scripture. Additionally, various “linguistic building blocks” (specific formulae and expressions) were available to the compilers, enabling them to achieve particular rhetorical effects.
As a result, the corpus of Early Medieval canonical texts is linguistically repetitive and formulaic in its intellectual structures. This poses significant challenges not only for those seeking to contextualize these texts (e.g., dating or attributing them), but also for understanding how they functioned and why their reception varied so dramatically, with some texts achieving wide and lasting influence, while others saw only limited circulation.
Recent scholarship has acknowledged that the individuality of authors in canonical texts manifested itself in subtle differences in how these various building blocks were framed and interconnected. Building on this, we argue that the “success” and authority of a collection largely depended on the compiler’s ability to employ complex formulaic language to emphasize its continuity with the authoritative tradition.
The ultimate aim of this paper is to demonstrate that compilers were highly conscious in their use of pre-defined elements, developing various techniques to present this legacy to their readers in a convincing, authentic, and authoritative manner.
To pursue this ambition, the SOLEMNE project is constructing what will be a nearly complete corpus of Early Medieval canonical texts, comprising both previously edited texts and texts newly collected from original manuscripts. To work with this rich and growing body of data, we have developed a pipeline that includes embedding-based semantic search, text reuse detection, and retrieval-augmented generation.
This synthetic approach, combined with close reading, enables us to systematically detect, describe, and interpret what we call “biblical formulaicity” in canonical texts. We argue that “biblical formulaicity”, which we define as biblical intertextuality in its strictly functional aspect, is one of the core stylistic features of these texts. At the most basic, intra-sentence level, it allowed compilers to give their texts almost sacred connotations. More importantly, quoting the Bible in a particular way could strengthen an argument or establish a connection between a norm and a Scriptural exemplum. By altering the framing, changing the introductory formula, interrupting quotations, or adding explanations and commentary, compilers could give the material a new sound.
Having introduced the ways to detect and categorize the recurring literary devices in the canonical corpus, we shall consider in more detail the case of the so-called Pseudo-Isidorean corpus, a voluminous canonical collection forged sometime in the 9th century to justify a nearly complete immunity of bishops from secular power. By analyzing how the recurring patterns detected in a machine-assisted manner manifest themselves in this particular collection, we shall showcase how the forger achieved his goal of establishing his creation as a legitimate reference in legal disputes.
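The text reuse detection step mentioned above can be illustrated with a minimal sketch. The abstract does not say how the SOLEMNE pipeline implements this step, so the following is only a generic word n-gram ("shingle") comparison using Jaccard similarity. The function names, the threshold, and the Latin sample passages are all assumptions invented for illustration and are not taken from the corpus.

```python
import re
from itertools import combinations


def shingles(text, n=3):
    """Return the set of word n-grams (shingles) found in a passage."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def reuse_candidates(passages, n=3, threshold=0.2):
    """Compare every pair of passages and return those that share enough shingles.

    `passages` maps a (hypothetical) passage id to its text. The result is a
    list of (id_1, id_2, score) tuples, highest score first.
    """
    sets = {pid: shingles(text, n) for pid, text in passages.items()}
    hits = []
    for left, right in combinations(sorted(sets), 2):
        score = jaccard(sets[left], sets[right])
        if score >= threshold:
            hits.append((left, right, score))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    # Invented sample passages, not quotations from any canonical collection.
    sample = {
        "A": "sicut scriptum est in lege domini",
        "B": "unde sicut scriptum est in lege moysi",
        "C": "episcopi non iudicentur a laicis",
    }
    for left, right, score in reuse_candidates(sample):
        print(left, right, round(score, 2))
```

Exact shingle matching of this kind only catches near-verbatim reuse. Paraphrased or reframed quotations of the sort the abstract describes would presumably need the embedding-based search it also mentions.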
Ubl, Karl, and Daniel Ziemann. Fälschung als Mittel der Politik? Pseudoisidor im Licht der neuen Forschung. Gedenkschrift für Klaus Zechiel-Eckes. Monumenta Germaniae Historica. Studien und Texte, Bd 57. Wiesbaden: Harrassowitz, 2015.
Mandikal, Priyanka. “Ancient Wisdom, Modern Tools: Exploring Retrieval-Augmented LLMs for Ancient Indian Philosophy”, Proceedings of the 1st Machine Learning for Ancient Languages Workshop, Association for Computational Linguistics (ACL), 2024.
Korkiakangas, Timo. “Documentary formulae as text reuse templates: constat and manifestus clauses in early medieval Latin charters”, in Digital Medievalist 16, 1–44. 2023. https://doi.org/10.16995/dm.8195
Mika, Tomasz
Division of Old Polish Apocrypha: Title Formulas in the Middle Ages
Given the historical context, the process of vernacularization faced, among others, the chief challenge of finding meaningful ways to divide large texts into portions of content, which today are recognized as chapters, titles, and subtitles. However, the formulation of all three varied widely.
At its genesis, a common strategy was to extract and differentiate one of the beginning sentences to act as a chapter; however, this process eventually became viewed as an overview of the chapter itself. Title formulation would evolve further, with some writers choosing experimental formulas for dividing content and others developing rigid and explicit formulas. The Old Polish apocryphal work of the New Testament contains approximately 800 titles and subtitles. The largest Slavic apocryphal text, Meditations on the Life of Jesus (also called Meditations of Przemysl), contains more than 430 subtitles. These chapters, titles, and subtitles owe their creation and evolution to medieval scribes.
Thanks in large part to the rewriting process, many texts became chaptered and titled, with medieval scribes being directly responsible. There is evidence to show how scribes gradually created, adopted, abandoned, and even readopted schemes. Further evidence shows how scribes perfected schemes over successive texts through developing narrative and etiquette, action to the object, and a sense of detachment from time and people. Simultaneously, we also see the increasing importance of sentence structure and the role of the noun phrase. Yet, these medieval scribes faced the challenge of finding critical chapter information while creating and applying the appropriate scheme to express it.
The vernacularization of the Old Polish Apocrypha, coupled with the evolution and process of text division, ultimately produces a stabilized linguistic schema. This paper, albeit highly truncated, aims to illuminate these processes. Consequently, my presentation will explore this matter through statistical and processual analysis. The former will deal with the identification and frequency of schemes. The latter will investigate and reconstruct the stages of schema formation and its mechanisms, which include reduction, derivation, and the transformation of syntactic structures.
Buerki, Andreas. 2020. Formulaic Language and Linguistic Change: A Data-Led Approach, CUP.
Kiparsky, Paul. 1976. “Oral poetry: some linguistic and typological considerations”, in Stolz, Benjamin A. & Stoll Shannon, Richard (eds), Oral Literature and the Formula, Ann Arbor, 73–106.
Kuiper, Koenraad. 2004. “Formulaic performance in conventionalised varieties of speech”, in Schmitt, Norbert (ed.), Formulaic Sequences: Acquisition, Processing, and Use, Benjamins, 37–54.
Kuiper, Koenraad. 2009. Formulaic Genres, Palgrave.
Mika, Tomasz. 2018. “The oldest Polish texts. New methods and new research issues in Polish historical linguistics”, in Kapetanović, Amir (ed.), The oldest attestations and texts in the Slavic languages, Holzhausen Der Verlag, 212-233.
Mika, Tomasz & Twardzik, Wacław. 2012. “Jak zagadkowe cztery tytuły rozdziałów w „Rozmyślaniu przemyskim” pozwalają wyobrażać sobie jego zagubiony autograf” [“How the mysterious four chapter titles in The Przemysl Meditation allow us to imagine its lost autograph”], in Podtergera, Irina (ed.), Schnittpunkt Slavistik: Ost und West im wissenschaftlichen Dialog. Festgabe für Helmut Keipert zum 70. Geburtstag, Vandenhoeck & Ruprecht Verlag, 359-375.
Schmitt, Norbert (ed.). 2004. Formulaic Sequences: Acquisition, Processing, and Use, Benjamins.
Wray, Alison. 2008. Formulaic Language: Pushing the Boundaries, OUP.
Murel, Jacob, Feng, Steven, Haubold, Johannes & Graziosi, Barbara
Towards an LLM-Assisted Philology of Formulae in Greek Verse
The Logion Project develops large language models (LLMs) for philological research and restoration of pre-modern Greek texts. Despite success in using LLMs for the philological analysis of prose (e.g. Michael Psellus and Aristotle),[1] Greek verse poses new challenges. Two crucial issues are formulae and meter.[2] Formulae (e.g. wholly and partially repeated lines, recurrent epithets, etc.) in particular can negatively affect model performance. Our presentation reports on work in progress on how to leverage fixed and flexible formulae to aid LLM-assisted philology and restoration of pre-modern Greek verse.
Recent machine learning research with pre-modern Greek has adopted LLMs to restore Greek inscriptions, such as tombstones or commemorative monuments.[3] Unfortunately, this research does not address how the highly formulaic features of Greek inscriptions may impact model performance. Indeed, our analysis of inscription data used for such restoration models[4] suggests these models may be overtrained on textual formulae, inhibiting their generalizability to new data. Much like Greek inscriptions, Greek poetry is also highly formulaic. It provides a notable case study for examining the impact of formulae on LLMs, given its widely documented use of both fixed and flexible formulae.[5] As we examine how to adapt our LLM research to Greek epic verse, we ask: how might LLMs be trained to restore formulaic texts without negatively impacting generalizability while also predicting variances among flexible formulae? Such an investigation is sorely needed as previous studies of LLMs for Greek philology do not investigate the role and impact of formulae.[6]
To this end, we investigate computational methods for handling textual formulae in pre-modern Greek verse. Several questions drive our research: How might we leverage knowledge from formulae without negatively impacting model predictions of textual errors among non-formulaic features? How does traditional model fine-tuning compare to state-of-the-art retrieval augmented generation (RAG) techniques[7] for handling formulaic texts? How might LLMs be deployed to analyze the role of formulae in epic verse for philological research tasks?
To answer these questions, we compare fine-tuning and retrieval-augmented methods to detect errors among formulae in Greek epic verse. We generate variants of our pre-training data using various degrees of deduplication. We then cross-examine traditional LLM fine-tuning against RAG implementations with target-adjacent texts for restoring and detecting errors within formulae (e.g. τὸν δ' ἀπαμειβόμενος προσέφη πολύμητις Ὀδυσσεύς) and variances among formulae (e.g. τὸν/τὴν δ' ἀπαμειβόμενος προσέφη πολύμητις Ὀδυσσεύς). As a case study, we focus on formulaic language in Homer and Hesiod. We choose these texts for their varying degrees of fixed and flexible formulae,[8] extant fragmentary states, and previous use in computational research.[9] We evaluate using standard masked language modeling evaluation metrics (e.g. top-k accuracy) as well as original philological error detection methods.[10] In doing so, we examine the extent to which formulae may assist in predicting non-formulae, and vice versa. Moreover, we hope to shed light on the role of formulae in machine-assisted philological research tasks for epic verse and how machine learning may illuminate the use of formulae in epic verse.
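Two of the steps named above, deduplicating training data to varying degrees and scoring masked-token predictions by top-k accuracy, can be sketched as follows. This is an illustrative sketch only, not the Logion Project's code: the function names, the idea of capping the number of identical lines, and the toy predictions are assumptions. The one Greek line is the formula quoted in the abstract.

```python
from collections import Counter


def cap_duplicates(lines, max_copies):
    """Keep at most `max_copies` occurrences of each identical line, in order.

    Lower values of `max_copies` give a more aggressively deduplicated
    corpus; this is one assumed reading of "degrees of deduplication".
    """
    seen = Counter()
    kept = []
    for line in lines:
        if seen[line] < max_copies:
            kept.append(line)
            seen[line] += 1
    return kept


def top_k_accuracy(ranked_predictions, gold_tokens, k):
    """Share of masked positions whose gold token is among the top k guesses.

    `ranked_predictions[i]` is the model's ranked candidate list for masked
    position i, best first; `gold_tokens[i]` is the attested token.
    """
    if not gold_tokens:
        return 0.0
    hits = sum(
        gold in predictions[:k]
        for predictions, gold in zip(ranked_predictions, gold_tokens)
    )
    return hits / len(gold_tokens)


if __name__ == "__main__":
    formula = "τὸν δ' ἀπαμειβόμενος προσέφη πολύμητις Ὀδυσσεύς"
    # Placeholder strings stand in for non-formulaic verses.
    corpus = [formula, "placeholder line 1", formula, formula, "placeholder line 2"]
    print(len(cap_duplicates(corpus, max_copies=1)))  # each line kept once

    # Hypothetical ranked predictions for three masked first words.
    predictions = [["τὴν", "τὸν"], ["τὸν", "τὴν"], ["τὴν", "τοὺς"]]
    gold = ["τὸν", "τὸν", "τὸν"]
    print(top_k_accuracy(predictions, gold, k=1))
    print(top_k_accuracy(predictions, gold, k=2))
```

Capping exact duplicates only affects fixed formulae. Flexible formulae such as the τὸν/τὴν alternation survive it unchanged, which is one reason the two kinds may need separate treatment.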
[1] Barbara Graziosi, Johannes Haubold, Charlie Cowen-Breen, and Creston Brooks, “Machine Learning and the Future of Philology: A Case Study,” TAPA, 153:1, 2023, 253-84.
[2] For a summary discussion, see Gregory Nagy. “Formula and Meter: The Oral Poetics of Homer,” Greek Mythology and Poetics, Cornell University Press, 1990.
[3] Yannis Assael, Thea Sommerschield, Brendan Shillingford, et al., “Restoring and attributing ancient texts using deep neural networks,” Nature, 603, 2022, 280–283; Yannis Assael, Thea Sommerschield, Jonathan Prag, “Restoring ancient text using deep learning: a case study on Greek epigraphy,” Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, 2019, 6368-75; Eric Cullhed, “Instruct-Tuning Pretrained Causal Language Models for Ancient Greek Papyrology and Epigraphy,” arXiv, 2024, https://arxiv.org/abs/2409.13870.
[4] Available at: https://github.com/sommerschield/iphi
[5] Paul Kiparsky, “Oral Poetry: Some Linguistic and Typological Considerations,” Oral Literature and the Formula, edited by Benjamin Stolz and Richard Shannon, CCAMS, 1976, pp. 73-106
[6] E.g. Pranaydeep Singh, Gorik Rutten, and Els Lefever, “A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek,” Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, 2021, 128-37. Frederick Riemenschneider and Anette Frank, “Exploring Large Language Models for Classical Philology,” Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, 2023, 15181-99.
[7] E.g. Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, et al., “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,” Advances in Neural Information Processing Systems 33, 2020, 9459-74; Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang, “REALM: Retrieval-Augmented Language Model Pre-Training,” Proceedings of the 37th International Conference on Machine Learning, 2020, 3929-38.
[8] E.g. Athena Kirk. “Swelling Women: Formulaics in the Hesiodic Catalogue.” CHS Research Bulletin 5, no. 2 (2017). http://nrs.harvard.edu/urn-3:hlnc.essay:KirkA.Swelling_Women.2017.
[9] John Pavlopoulos and Maria Konstantinidou, “Computational authorship analysis of the homeric poems,” International Journal of Digital Humanities, 5, 2023, 45–64; John Pavlopoulos, Ryan Sandell, Maria Konstantinidou, Chiara Bozzone, “HoLM: Analyzing Linguistic Unexpectedness in Homeric Poetry,” Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, 2024, 8166-72.
[10] Charlie Cowen-Breen, Creston Brooks, Barbara Graziosi, Johannes Haubold, “Logion: Machine-Learning Based Detection and Correction of Textual Errors in Greek Philology,” Proceedings of the Ancient Language Processing Workshop, 2023, 170-8.
Murgia, Giulia & Puddu, Nicoletta
Notarial Formularies in Early Modern Sardinia
In the history of the Sardinian language – a Romance language attested in Sardinia since the 11th century – legal-administrative Sardinian represents the only textual tradition that can be traced with continuity from the earliest written manifestations, when Sardinia was subdivided into four autonomous kingdoms (Giudicati). With the start of the Catalan-Aragonese conquest of Sardinia in the 14th century, we witness the penetration of the figure of the Iberian-trained notary (Condorelli 2009). Even in the modern age, the Sardinian language of the administration persists as an area of resistance in writing (especially in notarial production), even though new languages (especially Catalan and Spanish) oust it from the upper echelons of Sardinia’s community language repertoire.
The peculiar sociolinguistic situation just described accounts for the interest in the writings of modern-age legal practitioners in Sardinia: these figures make up a particular community of practice and discourse (Putzu 2021), in which one observes the sedimentation and sharing of practices that rely on formulaic language and are often marked by multilingualism.
This articulate professional group has a local apprenticeship with Sardinian notaries, and a specific training in Latin that, starting in the 17th century, took place at the newly founded University of Cagliari. Their repertoire, however, is shared by a wider textual community, which draws on the Iberian framework on the one hand, but is fundamentally pan-European, due to the European diffusion of common law in the medieval and modern era. The Sardinian community that drafts notarial deeds is, moreover, variegated and not entirely homogeneous, sometimes producing formulae that are awkwardly reproduced or that result from the contamination of several models. While notaries have a specific and more closely supervised training, differing levels of familiarity with this specialised kind of writing emerge above all when it is the curates who write, especially wills. Moreover, the acquisition of European models does not exclude the presence of clauses and formulae of indigenous matrix, the elaboration of which was necessary to adapt to the peculiarities of Sardinian law (Era 1934).
Previous studies on Sardinian notarial production (Puddu-Talamo 2020) and its formulaic aspects (Murgia-Puddu 2024) have shown – particularly through the study of coordinated binomials – the emergence of traits of considerable interest both for the identification of the multilingual patterns and practices of the writers and, more generally, for the study of the Sardinian legal language.
In this contribution we will focus on the analysis of the formulaic language of the Sardinian-language notarial formularies scattered in the archives of Sardinia, some of which have been published (Carta 2020), while others are still waiting to be brought to light. An integrated approach between philology and linguistics will be adopted, aiming, on the one hand, at the archival recovery of the materials and, on the other hand, at an initial quantitative analysis of the formulaic language in relation to the distribution of formulae within the different textual typologies present in the formularies.
Bach, Ulrich. 2017. “«I do make and ordayne this my last wyll and testament in maner and forme Folowing»: Functions of Binomials in Early Modern English Protestant Wills”, in Kopaczyk, Joanna & Sauer, Hans (eds), Binomials in the history of English: Fixed and flexible, Cambridge, Cambridge University Press, 222-240.
Biber, Douglas. 2009. “A Corpus-Driven Approach to Formulaic Language in English: Multi-Word Patterns in Speech and Writing”, in International Journal of Corpus Linguistics 14 (3): 275-311.
Biber, Douglas. 2010. “What can a corpus tell us about registers and genres?”, in O’Keeffe, Anne & McCarthy, Michael (eds), The Routledge handbook of corpus linguistics, London, Routledge, 241-254.
Blasco Ferrer, Eduardo, Koch, Peter & Marzo, Daniela (eds). 2017. Manuale di linguistica sarda, Berlin/Boston, De Gruyter.
Cadeddu, Maria Eugenia. 2023. “Scrivere in castigliano, parlare in sardo. Esempi di contesti comunicativi in Ogliastra (XVIII secolo)”, in Fresu, Rita, Maninchedda, Paolo, Murgia, Giulia & Serra, Patrizia (eds), Il «traffico delle lingue». Idiomi a contatto in Sardegna e nel Mediterraneo in età preunitaria, Cagliari, UNICApress, 149-174, <https://doi.org/10.13125/unicapress.978-88-3312-108-6>.
Carta, Michele. 2020. «Tabula dessas formulas de differentes instrumentos». Il formulario del notaio Gavino Francesco Pinna Succhioni di Ploaghe, Serramanna, Tipografia 3ESSE.
Condorelli, Orazio. 2009. “Profili del notariato in Italia Meridionale, Sicilia e Sardegna (secoli XII-XIX)”, in Schmoeckel, Mathias & Schubert, Werner (eds), Handbuch zur Geschichte des Notariats der europäischen Traditionen, Baden-Baden, Nomos, 65-123.
Era, Antonio. 1934. Lezioni di storia delle istituzioni giuridiche ed economiche sarde, Roma, s.n.
Korkiakangas, Timo. 2022. “From memory or formulary: how were medieval documentary formulae reproduced?”, in Mirator 22, 4-24. <https://doi.org/10.54334/mirator.v22i1.119760>
Koolen, Marijn & Hoekstra, Rik. 2022. “Detecting formulaic language use in historical administrative corpora”, in Proceedings of the Computational Humanities Research Conference 2022, Antwerp, Belgium, December 12-14, 2022, 127-151. <https://ceur-ws.org/Vol-3290/long_paper5740.pdf>
Kopaczyk, Joanna & Sauer, Hans. 2017. “Defining and Exploring Binomials”, in Kopaczyk, Joanna & Sauer, Hans (eds), Binomials in the history of English: Fixed and flexible, Cambridge, Cambridge University Press, 1-23.
Kopaczyk, Joanna. 2020. “The language of Medieval legal record as a complex multilingual code”, in Armstrong, Jackson W. & Frankot, Edda (eds), Cultures of Law in Urban Northern Europe. Scotland and its Neighbours c. 1350-c.1650, London, Routledge, 58-79.
Kopaczyk, Joanna. 2024. “Unpacking and capturing multilingual practices and their effects in medieval administrative and legal discourse”, in Consani, Carlo, Guazzelli, Francesca & Perta, Carmela (eds), Gruppi professionali come fattore di innovazione linguistica. Evidenze documentarie in Europa tra Tarda Antichità e Medioevo, Alessandria, Edizioni dell’Orso, 13-28.
Murgia, Giulia & Puddu, Nicoletta. 2024. “Su alcuni binomi coordinati in un corpus di documenti sardi di età moderna”, in Consani, Carlo, Guazzelli, Francesca & Perta, Carmela (eds), Gruppi professionali come fattore di innovazione linguistica. Evidenze documentarie in Europa tra Tarda Antichità e Medioevo, Alessandria, Edizioni dell’Orso, 113-133.
Puddu, Nicoletta & Stein, Achim. 2018. “Word-level and higher level annotation of the Sardinian Medieval Corpus”, in Frank, Andrew U., Ivanovic, Christine, Mambrini, Francesco, Passarotti, Marco & Sporleder, Caroline (eds), Proceedings of the Second Workshop on Corpus-Based Research in the Humanities. CRH-2, Vienna, Gerastree, Dept. of Geoinformation, TU, 161-170.
Puddu, Nicoletta & Talamo, Luigi. 2020. “EModSar: A Corpus of Early Modern Sardinian Texts”, in Marras, Cristina, Passarotti, Marco, Franzini, Greta & Litta, Eleonora (eds), Atti del IX Convegno Annuale dell’Associazione per l’Informatica Umanistica e la Cultura Digitale (AIUCD). La svolta inevitabile: sfide e prospettive per l’Informatica Umanistica, Milano, Università Cattolica del Sacro Cuore, 210-215.
Putzu, Ignazio Efisio. 2021. “Comunità di pratica, comunità di discorso e comunità testuali tra sincronia e diacronia: alcune considerazioni preliminari”, in Rhesis, 12.1, 66-88, <https://ojs.unica.it/index.php/rhesis/article/view/5659>.
Schena, Olivetta. 2013. “Notai e notariato nella Sardegna del tardo Medioevo”, in Meloni, Maria Giuseppina (ed.), Elites urbane e organizzazione sociale in area mediterranea fra tardo medioevo e prima età moderna, Atti del seminario di studi Cagliari, 1-2 novembre 2011, Roma, ISEM-CNR, 325-353.
Stefanowitsch, Anatol & Gries, Stefan. 2003. “Collostructions: Investigating the interaction of words and constructions”, in International Journal of Corpus Linguistics 8(2), 209-243.
Virdis, Maurizio. 2023. “Dinamiche linguistiche nella lunga età sardo-iberica”, in RiMe, 13.II, 485-510, <https://rime.cnr.it/index.php/rime/article/view/664>.
Mäkinen, Martti
Exploring formulae through stylometric analysis of Middle English documents
Spelling variation in historical language data has long been a considerable challenge for empirical and corpus linguists, and this is particularly true for studies of historical formulaic language use (cf. Korkiakangas, 2024). This paper investigates the usability of Stylo, a stylometric package written for R (Eder, Rybicki & Kestemont, 2017, Eder 2015a and 2015b), in identifying formulae in Middle English documents. The aim is to test the potential of unsupervised character and word n-gram analysis for charting formulaic language use in texts with extensive spelling variation, ahead of detailed traditional analysis of the texts, thus enabling the use of unannotated and unlemmatized corpora in the study of formulae.
In earlier studies, Stylo has been able to distinguish between Middle English document categories (Mäkinen 2019), and also between Middle English dialect areas, the latter by using sets of less frequent n-grams (Mäkinen 2020). In the current study, the focus is on more frequent word and character n-grams. The choice of analytic units is based on two assumptions: (1) Despite the prevalent spelling variation in Middle English texts, documents would carry a somewhat restricted vocabulary, which may enrich the occurrence of certain spelling forms, and thus also the occurrence of certain character n-grams, which makes them an analytically reasonable choice. (2) Documents are more homogeneous with other documents written for the same purpose, i.e. the structure of agreements, grants, leases etc. would have followed more or less similar patterns. Also, many of them would have been carried over from earlier text types in other languages, like Latin and French, which emphasizes the fact that the text of formulae did not necessarily belong in the scribes’ own linguistic repertoires. Therefore, early standard spellings may first have occurred in the formulaic parts of documents, thus lessening spelling variation in those passages.
The data for the study is drawn from A Corpus of Middle English Local Documents (henceforth MELD), compiled at the University of Stavanger. It contains documentary texts from 1400 to 1525. The version used in this study is 2017.1, consisting of over 2,000 localizable scribal documents, and c. 850,000 words (MELD). Localizable documents are texts that either contain the information on the provenance of the document in the actual text or provide circumstantial information about the provenance (through the use of personal and place names) so that localizing the origin of the document is possible (Stenroos and Thengs, 2012). Methodologically, this study is inspired by Kopaczyk (2013) and Wynne, McIntyre and Burke (2024).
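The reasoning behind character n-grams can be made concrete with a small sketch. The study itself uses Stylo in R. The Python below is not Stylo and does not reproduce its interface; it only shows, under assumed function names and with invented sample strings (not MELD data), how overlapping character n-grams survive spelling variation that defeats whole-word matching, and how a table of the most frequent n-grams per document can be built.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return overlapping character n-grams after lowercasing and
    collapsing runs of whitespace to single spaces."""
    normalized = " ".join(text.lower().split())
    return [normalized[i:i + n] for i in range(len(normalized) - n + 1)]


def frequency_table(docs, n=3, top=5):
    """Build relative frequencies of the `top` most frequent n-grams.

    `docs` maps a (hypothetical) document name to its text. Returns the
    selected n-grams and, per document, their relative frequencies in the
    same order.
    """
    counts = {name: Counter(char_ngrams(text, n)) for name, text in docs.items()}
    overall = Counter()
    for counter in counts.values():
        overall.update(counter)
    features = [gram for gram, _ in overall.most_common(top)]
    table = {}
    for name, counter in counts.items():
        size = sum(counter.values())
        table[name] = [counter[gram] / size if size else 0.0 for gram in features]
    return features, table


if __name__ == "__main__":
    # Invented spelling variants of one opening phrase.
    docs = {
        "doc_a": "to all men to whom this present writing shall come",
        "doc_b": "to alle men to whome this presente wrytyng shal come",
    }
    # "whom" and "whome" differ as words but share trigrams.
    print(sorted(set(char_ngrams("whom")) & set(char_ngrams("whome"))))
    features, table = frequency_table(docs, n=3, top=5)
    print(features)
    print(table["doc_a"])
```

A table of this kind is the usual input for distance measures and cluster analysis in stylometric work, which is where a dedicated package would take over.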
Eder, M., Rybicki, J. and Kestemont, M. (2017). ‘Stylo’: a package for stylometric analyses. Computational Stylistics Group. 1-36. Available at: https://tinyurl.com/y449xxkk.
Eder, M. (2015a). Visualization in Stylometry: Cluster Analysis Using Networks. Digital Scholarship in the Humanities, 1–15. doi:10.1093/llc/fqv061.
Eder, M. (2015b). Taking stylometry to the limits: Benchmark study on 5,281 texts from Patrologia Latina. [Online]. Digital Humanities 2015: Conference Abstracts. Available at: http://dh2015.org/abstracts.
Kopaczyk, J. (2013). The Legal Language of Scottish Burghs: Standardization and Lexical Bundles (1380-1560), Oxford Studies in Language and Law. OUP.
Korkiakangas, T. (2024). A linguist's viewpoint: formulaic language as a challenge for historical linguistics. In Formulaic Language in Historical Research and Data Extraction (Huygens Institute for the History and Culture of the Netherlands; Royal Netherlands Academy of Arts and Sciences, Amsterdam, 7-9 February, 2024).
Mäkinen, M. (2019). Testing a stylometric tool in the study of Middle English documentary texts. In: Bös B. and Claridge C. (eds.). Norms and Conventions in the History of English. John Benjamins (Amsterdam). 149-166.
Mäkinen, M. (2020). Stylo visualisations of Middle English documents. Journal of Data Mining & Digital Humanities. Special issue on Visualisations in Historical Linguistics. 1-10. Available at: https://jdmdh.episciences.org/7022.
MELD = The Middle English Local Documents Corpus, version 2017.1. June 2017, University of Stavanger (Stavanger). Available at: https://www.uis.no/research/history-languages-and-literature/the-mest-programme/a-corpus-of-middle-english-local-documents-meld/.
Stenroos, Merja & Thengs, Kjetil V. (2012). Two Staffordshires: real and linguistic space in the study of Late Middle English dialects. In Jukka Tyrkkö, Matti Kilpiö, Terttu Nevalainen & Matti Rissanen (Eds.), Outposts of Historical Corpus Linguistics: From the Helsinki Corpus to a Proliferation of Resources. (Studies in Variation, Contacts and Change in English 10), Helsinki: VARIENG. [Online]. Available at: http://www.helsinki.fi/varieng/series/volumes/10/stenroos_thengs/.
Wynne, M., McIntyre, D., & Burke, M. (2024). Formulaic language in Early English Books Online: From computational linguistics to classical rhetoric. In Formulaic Language in Historical Research and Data Extraction (Huygens Institute for the History and Culture of the Netherlands; Royal Netherlands Academy of Arts and Sciences, Amsterdam, 7-9 February, 2024).
Norris, Jérôme (part of panel A)
The highly formulaic nature of epigraphic habits in North-West Arabia before Islam
Pre-Islamic Arabia has been described as one of the most extraordinary places in the ancient world from the point-of-view of epigraphy, due to the impressive number of texts it produced, the high level of literacy of its population and also because it developed its own family of alphabets, the so-called “South Semitic” scripts which include the “Ancient South Arabian” (ASA) script from southern Arabia and the “Ancient North Arabian” scripts from northern Arabia (Macdonald 2015: 1).
The epigraphic situation of North-West Arabia was marked by a profound diversity, with the co-occurrence of a multitude of different scripts. These scripts include the local ANA scripts, of which more than 10 are currently distinguished, among them the so-called “Dadanitic”, “Taymanitic”, “Dumaitic”, “Safaitic” and “Hismaic” scripts and a number of other scripts improperly called “Thamudic” (Hayajneh 2011; Al-Jallad 2018). Besides these, they include scripts of Aramaic origin imported from the Levant, including the Imperial Aramaic script as well as local variants of Aramaic that developed in Arabia such as “Taymāʾ Aramaic” and “Nabataean Aramaic”. The Nabataean script, after three centuries of evolution, developed into the “Arabic” script, passing through the intermediate phases of “Nabataeo-Arabic” (late third-mid fifth centuries AD) and “Palaeo-Arabic” (late fifth-sixth centuries AD) (Nehmé 2010).
Despite their profound diversity, the different writing traditions of pre-Islamic North-West Arabia have the common characteristic of being extremely formulaic. Each epigraphic group, thus, tends to be linked to a limited set of recurring formulae. Although this characteristic has been identified for a long time, no comparative study of the different formulae attested from one group to another has been conducted until now, which is what this contribution aims to do. This analysis leads to a double observation. On the one hand, certain formulae are specific to a given epigraphic tradition. This is, for instance, the case of Taymanitic nṣr l-Ṣlm “he kept watch on behalf of [the deity] Ṣalm”, Thamudic D PN (Personal Name) ʿs²q PN “PN the lover of PN” or Nabataean šlm PN “may PN be secure”. In this case, I would propose the concept of “closed” formulae, comparable to the concept of “closed cults”. On the other hand, other formulae are shared among different writing traditions, such as the invocation ḏkr(t) DN (Divine name) PN “may DN be mindful of PN”, the expression of longing ts²wq ʾl-PN “he longed for PN” or the expression wdd f-PN “love/desire for PN”. For the latter, I would propose the concept of “open” formulae.
The conclusion that will emerge is that, despite their very high literacy, the ancient inhabitants of North-West Arabia used to carve inscriptions to express a very limited number of messages specific to a given writing tradition and following extremely codified rules. As for the identification of “closed” and “open” formulae, it makes it possible to distinguish between writing traditions that developed independently and others that were, on the contrary, in contact with each other, probably reflecting neighbourhood relations and tribal ties between several populations sharing the same territory.
Al-Jallad, A. 2018. What is Ancient North Arabian? In D. Birnstiel and N. Pat-El (eds) Re-engaging Comparative Semitic and Arabic Studies. Wiesbaden: Harrassowitz Verlag (Abhandlungen für die Kunde des Morgenlandes, 115): 1–44.
Hayajneh, H. 2011. Ancient North Arabian. In S. Weninger, G. Khan, M. Streck, J. Watson (eds.), The Semitic Languages: An International Handbook. Berlin/Boston: De Gruyter Mouton (Handbooks of Linguistics and Communication Science, 36): 756–782.
Macdonald, M.C.A. 2015. On the uses of writing in ancient Arabia and the role of palaeography in studying them. Arabian Epigraphic Notes 1: 1–49.
Nehmé, L. 2010. A glimpse of the development of the Nabataean script into Arabic based on old and new epigraphic material. In M.C.A. Macdonald (ed.), The development of Arabic as a written language. (Supplement to the Proceedings of the Seminar for Arabian Studies 40). Oxford: Archaeopress: 47-88.
PANEL A: Norris, Jérôme, Maczuga, Julia & Kootstra-Ford, Fokelien
Shared formulae, continuity, and change in the epigraphy of Northern Arabia
(the abstracts of the panel presentations are listed in alphabetical order among other presentations)
In May 2024, the AlUla Inscriptions Analysis Project (AICAP) started at Ghent University. This project is a collaboration with the Royal Commission for AlUla and aims to read and interpret the tens of thousands of rock inscriptions that are found in AlUla County (Northwest Saudi Arabia), and to bring them together in a single database. The epigraphy from AlUla spans a wide range of periods, scripts, and languages. It ranges from pre-Islamic inscriptions in South Semitic script variants like Dadanitic (6th – 1st c. BCE), including several Greek and Latin inscriptions, different varieties of Aramaic, early Arabic inscriptions, up to modern Arabic inscriptions written in the 20th century.
The Arabian Peninsula was extremely rich in local scripts and associated writing cultures up until about the 5th century AD (Macdonald 2000). One thing the various writing cultures have in common is the high number of non-official inscriptions, or graffiti, that were left in them. At the same time, these inscriptions are highly formulaic, with individual formulaic usage often being key to identifying script variants and their associated writing cultures (e.g. Winnett 1987; Prioletta 2022).
This panel aims to shed light on formulaic language use in this uniquely varied corpus from the Arabian Peninsula. Combining a meta-discussion of how to define a linguistic formula with a more in-depth examination of variation and change within the formulaic usage of individual corpora, this panel will engage with the question of how we can leverage formulae to understand complex connections of continuity and change within the epigraphic record of the Arabian Peninsula and beyond.
Macdonald, Michael C.A. 2000. “Reflections on the Linguistic Map of Pre-Islamic Arabia.” Arabian Archaeology and Epigraphy 11:28–79.
Prioletta, Alessia. 2022. “The Inscriptions in Ancient South Arabian Script from Ḥimā: A Preliminary Historical and Cultural Appraisal.” Proceedings of the Seminar for Arabian Studies 51:271–82.
Winnett, F.V. 1987. “Studies in Ancient North Arabian.” Journal of the American Oriental Society 107 (2): 239–44.
PANEL B: Longrée, Dominique, Vanni, Laurent, Fascione, Sara, Rosa, Arianna & Thon, Valérie
Formulae in Latin Epistolography
(the abstracts of the panel presentations are listed in alphabetical order among other presentations)
Rodek, Ewa
The Role of Keywords in Building Sender-Receiver Relationships: A Case Study of Polish-Language Texts from 1600-1750
The sender-receiver relationship is most fully revealed in prefaces to literary works, as one of the primary functions of these texts is to establish a connection between the author and the reader. Keywords and other fixed elements play a special role in building this relationship. This phenomenon is clearly visible in Polish-language prefaces from the 17th century and the first half of the 18th century.
A corpus study of a collection of 150 prefaces addressed to anonymous readers (dedicatory prefaces to specific individuals were excluded) showed that the reader is referred to by fixed descriptors (kind, pious, noble), which appear at key points in the preface where the reader’s attention is at its peak: in the apostrophe, the incipit, paragraph beginnings, and the closing. Moreover, formulaic sequences include topoi, such as the topos of labor and humility, which are evoked through characteristic phrases and words (servant, labor) and their synonyms (servant, lucubration). The prefaces also feature other conventional strategies that evoke formulaic language. Among them are direct addresses to the reader, which, in real life, almost never appeared in formal situations at that time. Even children addressed their parents, and wives their husbands, with appropriate titles, and certainly strangers addressed each other similarly. Therefore, this device must be recognized as having an important and stable function in the text: the function of building a good, friendly relationship with the reader.
These terms used for the reader, words consistently used in the function of topoi, and other conventional expressions should be considered as keywords understood as thematic words (Knights 2010), activating a familiar schema and thereby facilitating the reception and interpretation of the content, including that of the main work itself. These words open the space for cooperation between the author, who recommends their work, and the reader, who may read it favorably or with reluctance. Additionally, their placement within the text clearly serves an organizing and unifying function, as is often emphasized by the authors of prefaces themselves.
The possibility for readers of prefaces to decode the meaning embedded in these keywords encourages consideration of a cultural definition of keywords (Wierzbicka 1997: 8), as they express values common to contemporary Polish society.
In this presentation, I will illustrate with specific, clear examples how keywords built the relationship between senders and receivers in the 17th century and the first half of the 18th century, and I will present the functions of the constants of individual elements and motifs, examined using corpus-based and pragmalinguistic methods. In conclusion, I would like to emphasize that a holistic approach (Gatos et al. 2006) is essential in analyzing keywords and formulaic expressions, allowing us to achieve a multidimensional portrayal of a particular historical reality.
Wierzbicka A. 1997. Understanding Cultures through Their Key Words: English, Russian, Polish, German, and Japanese. New York: Oxford University Press, 1997. 317 pp.
Gatos B., Konidaris T., Pratikakis I., Perantonis S. 2006. A Holistic Methodology for Keyword Search in Historical Typewritten Documents. Proceedings of the 4th Hellenic Conference on Advances in Artificial Intelligence. https://doi.org/10.1007/11752912_52
Knights M. 2010. Towards a Social and Cultural History of Keywords and Concepts by the Early Modern Research Group. History of Political Thought, Vol. 31, No. 3 (Autumn 2010), pp. 427-448. https://www.jstor.org/stable/i26224138
Roldão, Filipa & Serafim, Joana
Formulaic Language in Portuguese Municipal Charters of the Middle Ages: A Historical and Linguistic Analysis
The recent electronic edition of the earliest municipal charters granted by Portuguese monarchs to local communities—known as forais—primarily from the 12th and 13th centuries, offers a corpus of over four hundred documents in Latin and in vernacular, providing valuable resources for various disciplines, including History, Diplomatics, Philology, and Linguistics. Formulaic expressions were extensively employed not only to meet the diplomatic conventions of chancery documents but also to standardize juridical clauses on various subjects addressed within these forais, including, among others, municipal governance, economic regulation, fiscal issues, and judicial matters. This standardization of content across these areas allows for a clear identification of municipal charters that share similarities, as well as those that diverge, resulting in distinct textual traditions. Historians have concluded that certain forais served as templates, replicated whenever a community petitioned the royal chancery for a municipal charter. Approximately three primary 'document families' or models have been identified within this corpus. However, this interpretation has traditionally relied on historical analysis, often overlooking a linguistic approach to the data. This paper seeks to identify and analyse the formulaic language present in these documents by integrating historical and linguistic approaches. Using the CollateX program, the study will compare textual data to examine the topics where formulaic language is employed, as opposed to unique information specific to individual communities or instances of direct speech. Notably, even within these seemingly spontaneous or less formal passages, formulaic expressions persist. This analysis also seeks to illuminate the contextual choices between Latin and the vernacular in these records. 
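The comparison step described above can be illustrated with a minimal sketch. It does not use CollateX itself: as a stand-in, it aligns two witnesses with Python's standard-library difflib and keeps the token runs they share, which are candidates for formulaic passages. The function name shared_spans, the min_tokens threshold, and both sample witnesses are assumptions made for illustration; the witnesses are invented placeholders, not quotations from any foral.

```python
from difflib import SequenceMatcher


def shared_spans(witness_a, witness_b, min_tokens=3):
    """Return token runs shared by two witnesses (hypothetical helper).

    A stand-in for a collation tool: it aligns the two token sequences
    and keeps aligned runs of at least `min_tokens` tokens.
    """
    a = witness_a.lower().split()
    b = witness_b.lower().split()
    # autojunk=False keeps frequent tokens (e.g. "the") in the alignment.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    spans = []
    for block in matcher.get_matching_blocks():
        if block.size >= min_tokens:
            spans.append(" ".join(a[block.a:block.a + block.size]))
    return spans


# Invented placeholder witnesses (assumption, not real charter text).
charter_a = "in the name of god amen i the king grant to the men of town_a this charter"
charter_b = "in the name of god amen i the king give to the settlers of town_b this charter"

print(shared_spans(charter_a, charter_b))
print(shared_spans(charter_a, charter_b, min_tokens=2))
```

Text outside the shared runs (here the verb and the names of the recipients) would correspond to the community-specific information that the abstract contrasts with formulaic language. A real study would need normalization of spelling and abbreviation before alignment.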
Municipal charters granted by the earliest Portuguese kings continued to be copied and reproduced throughout the Middle Ages, until the 15th century, as evidenced by recent electronic editions. Considering both historical and linguistic perspectives, this paper will finally reflect on the extent to which formulaic language contributed to the survival of these documents over such an extended period.
Electronic edition: https://deti-iforal.ua.pt/documents
Roldão, F. & Serafim, J. (2021), Os mais antigos forais régios portugueses: uma proposta de estudo e de edição, Poder y Poderes en la Edad Media (Monografía de la Sociedad Española de Estudios Medievales, 16). Coord. Raquel Martínez Peñín y Gregoria Cavero Domínguez, Múrcia, Sociedad Española de Estudios Medievales, 375-386. ISBN: 978-84-17865-93-1.
Silvestre, J. P., Pacheco, O., Sousa, J., Roldão, F., & Serafim, J. (2024). A edição digital de forais medievais portugueses com o suporte de um sistema de edição colaborativa em base de dados. Diacrítica, 38(1), 130–145. https://doi.org/10.21814/diacritica.5602
Rosa, Arianna & Thon, Valérie (part of panel B)
Letters Across Time: a Diachronic Study of the Epistolary Formulas in Cicero, Jerome and Peter Damian.
The epistolary genre has its roots in Antiquity and is born from a strong need for communication: the tone and style vary, however, according to the addressee and the nature of the letter, whether it is private, administrative, political, consolatory, etc. Despite its different forms and modalities, the epistolary genre also has fixed and formulaic characteristics: the inscriptio with the abbreviation of the name and title of the addressee, as well as the initial and final formulas of salutatio to the addressee.
The goal of our presentation is to explore this formulaic nature of the epistolary genre and its possible developments over time from a diachronic point of view. We will go through the various centuries, using examples found in the letter collections of some of the most important epistolary authors: Cicero for the Republican age, Saint Jerome for Late Antiquity and Peter Damian for the Central Middle Ages. Do the epistolary formulas evolve over time? If they do, are these variations related to the socio-cultural context of the author or to linguistic phenomena particular to the Latin language? In other words: can formulaicity, despite its fixed nature, also be subject to change? To answer these questions, we will also explore our corpus using an innovative method: Hyperbase, software developed by the LASLA which enables a statistical and quantitative survey of the Latin language.
Salemenou, Maroula
Diplomatic correspondence in the corpus Demosthenicum: an evaluation of authenticity
This paper combines quantitative and qualitative approaches in order to evaluate the issue of authenticity in the diplomatic correspondence cited in Demosthenes’ speeches. By the quantitative approach I mean the examination of the standardised or formulaic parts of these documents. By the qualitative approach I mean a focus on the genre of the letters and the way this influences their content and style. The way in which all documents in the corpus Demosthenicum have survived makes it inevitable that speculation is inherent in any discussion of them. Nonetheless, I contend that the study of the formulaic language in the diplomatic correspondence in Demosthenes, which has been overlooked or misunderstood in secondary literature, can provide some grounds for understanding the origin and for defining the degree of authenticity in each such document.
Scapini, Elia & Iezzi, Federico
Θεὸν ἐκ θεοῦ: a case study for semantic retrieval in Ancient Greek
In this joint paper, we present a search tool for stereotyped formulations in Ancient Greek that tolerates some variation in wording as long as the meaning is preserved. As part of the ITSERR (Italian Strengthening of ESFRI RI Resilience) infrastructure dealing with the research and development of digital tools for the Digital Humanities, particularly Religious Studies, WP4 DaMSym (Data Mining applied to the Nicene-Constantinopolitan Symbol) uses the creed of Nicaea and Constantinople as a case study and examines it in its various languages of ancient translation (Ancient Greek, Latin, Coptic, Arabic, Sanskrit, Church Slavonic). Our research starts from the fact that the expressions God from God, Light from Light, true God from true God are stereotypical formulations describing an x-from-x causality, where the cause reproduces itself (Barnes 2001). Although these stereotypical formulations run through the 4th century in various forms, rule-based tools for verbatim retrieval such as the TLG do not allow us to collect all the possible x's that go into x-from-x formulations. Indeed, in addition to θεὸν ἐκ θεοῦ, φῶς ἐκ φωτός, θεὸν ἀληθινὸν ἐκ θεοῦ ἀληθινοῦ, in the synodical documents of the 4th century and in the writings of many church authors of this period we also find expressions such as ζωὴν ἐκ ζωῆς, ὅλον ἐξ ὅλου, μόνον ἐκ μόνου, τέλειον ἐκ τελείου, βασιλέα ἐκ βασιλέως, κύριον ἀπὸ κυρίου etc. which cannot be returned by rule-based search tools. To address this deficiency in the state of the art, we have built and will make public a machine learning-based semantic retrieval tool for Ancient Greek that reorders the phrases in a corpus based on vector similarity with the query sentences assigned as input. More than one phrase can be searched for at a time: all query phrases are embedded as points in a multi-dimensional space and can be related to the expressions in the corpus closest to them.
We therefore present the first benchmarks of our work by discussing which encoder proves best suited for the purpose, show the sister project for Latin and the intention to combine the two systems into one, suggest the best strategies to exploit this tool to colleagues who might want to make use of it, and list the improvements we plan to make in the future.
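The ranking principle described in this abstract (several query phrases, each embedded as a vector, with corpus phrases reordered by similarity) can be sketched as follows. This is only an illustration under stated assumptions: toy_embed is a placeholder that counts character bigrams and has no semantic knowledge, whereas the tool described relies on a trained encoder; the function names and the choice of taking the maximum similarity over all queries are likewise assumptions, not the authors' design.

```python
import math
from collections import Counter


def toy_embed(phrase, n=2):
    """Placeholder encoder (assumption): counts of character n-grams."""
    text = f" {phrase.lower()} "
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(queries, corpus, embed=toy_embed):
    """Order corpus phrases by their best similarity to any query phrase."""
    query_vecs = [embed(q) for q in queries]
    scored = []
    for phrase in corpus:
        vec = embed(phrase)
        scored.append((max(cosine(vec, q) for q in query_vecs), phrase))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Query and corpus phrases are taken from the abstract above.
queries = ["θεὸν ἐκ θεοῦ", "φῶς ἐκ φωτός"]
corpus = ["ζωὴν ἐκ ζωῆς", "φῶς ἐκ φωτός", "μόνον ἐκ μόνου"]

for score, phrase in rank(queries, corpus):
    print(f"{score:.3f}  {phrase}")
```

Replacing toy_embed with a learned sentence encoder leaves the ranking code unchanged, which is why the encoder is passed in as a parameter.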
Scappaticcio, Maria Chiara
OVF (Oro Vos Faciatis): Praying and Requesting. On the Formulaic Nature of Canvassing in Literature and Epigraphy
Canvassing for office was a distinctive feature of Roman culture, integral to civic life alongside the elections themselves (see, for example, Tatum 2018: 4–19, with bibliography). The only extant treatise theorizing canvassing in ancient Rome, the Commentariolum petitionis (‘Brief Handbook on Canvassing for Office’, henceforth CP), along with insights from Cicero’s letters and speeches, provides readers with exegetical keys to understanding a sociopolitical phenomenon shaped by the necessity of communicational strategies and cultural imprinting. Additionally, the passions and emotions of individual participants become apparent, particularly when exploring the remains of the ancient city of Pompeii and examining its electoral posters. Notably, electoral propaganda was characterized by a rich array of formulaic expressions, both in terms of abbreviated phrases and in the syntactical and stylistic structures used.
An example of this is the most common electoral formula, OVF. Candidates were expected to ‘beg’ for office: the Latin verb oro (‘I beg’) is part of the vocabulary of supplication and appears frequently in the electoral propaganda that adorned the walls of ancient Pompeii. The phrase ‘I beg you to elect’ (oro vos faciatis, often abbreviated to OVF) ranks among the most common electoral formulas. The busiest streets of Pompeii were filled with electoral posters, highlighting analogies with Republican canvassing, particularly due to the pervasive use of supplicatory language. Terms such as oro, petitio (‘a request’), petere (‘to importune’), rogare (‘to entreat’), servire (‘to be a slave to’), and supplicare (‘to grovel for’) all belong to the canvassing lexicon, emphasizing the humble posture that candidates needed to adopt during their electoral campaigns, whether for the position of aedilis or consul. While CP is permeated with this vocabulary of supplication (as noted by Flaig 2003: 20ff.; Tatum 2007: 112ff.), it notably lacks the term oro, which is so prevalent on the walls of Pompeii.
This paper aims to provide, for the first time, a comprehensive examination of the ‘formularity’ characterising Roman canvassing, bridging Latin epigraphy and literature. The electoral propaganda of Pompeii is indeed rich in abbreviated formulas that imply familiarity with the populace. However, while many of these abbreviated formulas are understood (albeit not thoroughly explored), numerous inscriptions (dipinti) from Pompeii remain unresolved and obscure, often absent from common textual databases.
Chiavia, C. (2002) Programmata. Manifesti elettorali nella colonia romana di Pompei. Torino
Flaig, E. (2003) Ritualisierte Politik: Zeichen, Gesten und Herrschaft im Alten Rom. Göttingen
Flaig, E. (2013) Die Mehrheitsentscheidung: Entstehung und kulturelle Dynamik. Paderborn
Tatum, W. J. (2018) Quintus Cicero. A Brief Handbook on Canvassing for Office (Commentariolum petitionis). Oxford
Schironi, Francesca
Formulae and Formulaic Language in Hellenistic Greek Astronomy
During the Hellenistic period scientific Greek prose develops, as especially attested in Euclid’s Elements, which become the model to express mathematical content in all disciplines that the Greeks defined as ‘mathematical sciences’: geometry, arithmetic, mechanics, astronomy, harmonics, optics. While the language of geometry has been studied,[1] the other branches of mathematical sciences still need further research. In this talk I will discuss some examples of formulaic language in Hellenistic Greek astronomy, focusing on Hipparchus of Nicaea (2nd cent. BCE) and some Hellenistic papyri I am studying. Hipparchus is the most important Hellenistic astronomer, who, among other things, discovered the precession of the equinoxes. Despite his importance, almost all of his works have been lost, due especially to the success of Ptolemy, whose Mathematike Syntaxis or Almagest supplanted all the previous treatises on mathematical astronomy. Only Hipparchus’ Exegesis to Eudoxus’ and Aratus’ Phaenomena has survived, a ‘polemical’ commentary in which Hipparchus develops a detailed critique of Eudoxus’ Phaenomena and above all of Aratus’ Phaenomena. My study on this work has shown that Hipparchus was not only an innovator in the science of astronomy and its methodology: he was also an innovator in creating a scientific language to express astronomical concepts. His linguistic strategy consisted both in building a scientific lexicon in which terms were used in a very specific way and in manipulating Greek syntax to turn it into a vehicle to express precise data.
Technical lexicons tend to be standardized, economical, and concise,[2] avoiding polysemy and synonymy. Indeed, the astronomical lexicon used by Hipparchus applies this strategy to the point of becoming a monosemous lexicon, while being at the same time a quite clear and etymologically ‘transparent’ lexicon.[3] However, in the Exegesis Hipparchus also uses what we might term ‘fixed formulas’; for example, in naming certain constellations his strategy seems to have been aimed at avoiding confusion between the names of similar constellations. In addition, in the last section of his Exegesis, which consists of his Catalogue of Simultaneous Risings and Settings, Hipparchus conveys plenty of technical data in an organized and logical structure, by using continuous prose, which is poorly suited to expressing lists of scientific data, rather than using tables or bullet points like modern scientists do. This is achieved through three main stylistic tools: 1) a formulaic structure which uses almost always identical phrases, or with little variation; 2) a syntax reduced to the minimum, so that the reader’s attention is focused on the data themselves, and 3) topicalization. These tools make the resulting prose formulaic to an extent that the reader soon learns what to expect and can focus on the individual data. This procedure is consistent with, and even further develops, the communication strategies attested in other areas of Greek scientific language, which often makes use of formulas as a way to help either memorization or learning.[4] On the other hand, the astronomical language attested in one famous Hellenistic astronomical papyrus (PParis 1, ca. 165 BCE) shows knowledge but also a partial misunderstanding of syntactic formulas of Greek mathematics. This reflects a lower level of accuracy and familiarity with the technical discipline and its language, in line with the type of text preserved by PParis 1 (a general handbook addressed to non-professionals).
[1] Michel Federspiel wrote a series of articles on the language of Greek mathematics between 1992 and 2006; see also Netz 1999; Acerbi 2021.
[2] For a discussion of technical languages in the Graeco-Roman world, see Langslow 2000, 6-28; Fögen 2003, Willi 2003, 66 and 69.
[3] See Schironi 2024.
[4] See Aujac 1984; Netz 1999, 127-167.
Acerbi, F., 2021, The Logical Syntax of Greek Mathematics.
Aujac, G., 1984, ‘Le Langage Formulaire Dans La Géométrie Grecque’, Revue d'histoire des sciences, pp. 97-109.
Fögen, T., 2003, ‘Metasprachliche Reflexionen Antiker Autoren Zu Den Charakteristika Von Fachtexten Und Fachsprachen’, in Antike Fachschriftsteller: Literarischer Diskurs Und Sozialer Kontext, ed. by Horster, M. and Reitz, C., Stuttgart, pp. 31-60.
Langslow, D.R., 2000, Medical Latin in the Roman Empire, Oxford.
Netz, R., 1999, The Shaping of Deduction in Greek Mathematics: A Study in Cognitive History, Cambridge ; New York.
Schironi, F., 2024, ‘The Language of Hellenistic Astronomy’, in Coming to Terms. Approaches to (Ancient) Terminologies, ed. by Asper, M., Berlin - New York, pp. 11-39.
Willi, A., 2003, The Languages of Aristophanes: Aspects of Linguistic Variation in Classical Attic Greek, Oxford - New York.
Soffiantini, Laura
Pliny’s formulaic language in geographical books
Books 2-6 of Pliny's Naturalis Historia represent the most extensive surviving geographical Latin work from antiquity. Drawing on earlier Greek and Latin sources, Pliny undertook an ambitious synthesis of the known world from Gibraltar to India unmatched by any prior Latin author. In accomplishing this monumental task, Pliny confronted the challenge of developing appropriate language to describe geographical space (Pinkster, 2005). Previous studies have shown that scientific languages, including scientific Latin, exhibit high levels of formulaicity employing defined terminology and linguistic constructs in (semi)fixed structures to express specific concepts (Langslow, 2000; Netz, 2003). Using a corpus-driven approach, recent research (Fantoli, 2020; Fantoli-Soffiantini 2023) has demonstrated that Pliny’s language shows formulaic aspects and has tested various methods of pattern extraction on the Naturalis Historia.
With my presentation, I will investigate Pliny’s formulaic language in geographical books with two main objectives: presenting potential strategies of formulae retrieval and analyzing the extracted formulae. After text pre-processing which includes masking all proper nouns and numerals to reduce variability, two methods will be employed. The first method (1) involves the extraction of n-grams, while the second method (2) identifies longer formulaic patterns containing free spots (also known as non-continuous formulae). In the second method, the text is represented as a graph where each node represents a word, and consecutive words are connected by edges. Preliminary analysis performed on book 4 has revealed recurring bigrams such as ex adverso, in longitudinem, a septentrione, a meridie, ab oriente, which are used to convey orientational indications. Furthermore, by analyzing the structure of the network resulting from (2), it was possible to reconstruct that the text contains various combinations where the preposition ab is followed by a place name two slots later. The intervening slot can be filled by different tokens such as eo, ea, oppido which by depending on ab may indicate the starting point of the geographical description.
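The two retrieval strategies described above can be sketched in a few lines. This is a simplified illustration, not the pipeline used in the study: the helper names (mask, ngrams, gap_fillers), the mask labels, and the token sequence are assumptions. The sequence is an invented toy example with placeholder place names (Alpha, Beta, Gamma), not a passage from Pliny, and numerals are detected with str.isdigit, which would not catch Roman numerals.

```python
from collections import Counter


def mask(tokens, proper_nouns):
    """Replace proper nouns and numerals with placeholder labels."""
    masked = []
    for tok in tokens:
        if tok in proper_nouns:
            masked.append("<NAME>")
        elif tok.isdigit():
            masked.append("<NUM>")
        else:
            masked.append(tok)
    return masked


def ngrams(tokens, n=2):
    """Method (1): count contiguous n-grams.

    For n=2 the counts can also be read as weighted edges of a graph in
    which each node is a word and consecutive words are connected.
    """
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def gap_fillers(tokens, head, target="<NAME>"):
    """Method (2), simplified: count the tokens that fill the single free
    slot in the pattern `head _ target`."""
    return Counter(
        tokens[i + 1]
        for i in range(len(tokens) - 2)
        if tokens[i] == head and tokens[i + 2] == target
    )


# Invented toy sequence (assumption), using words quoted in the abstract.
toy_tokens = ("ab eo Alpha in longitudinem 40 ab oppido Beta a meridie "
              "ab eo Gamma in longitudinem 12").split()
masked = mask(toy_tokens, proper_nouns={"Alpha", "Beta", "Gamma"})

print(ngrams(masked).most_common(3))
print(gap_fillers(masked, "ab"))
```

Masking is what makes the pattern visible: without it, each place name would produce a distinct sequence and the shared frame around the names would not be counted together.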
Fantoli, M. 2020. “Res ardua uetustis nouitatem dare, nouis auctoritatem”: Étude contrastive des enjeux linguistiques et communicatifs du deuxième livre de la Naturalis Historia de Pline l’Ancien [PhD Thesis]. ULiège - Université de Liège.
Fantoli, M. & Soffiantini, L. 2023. “Formulaic Language in Latin non-literary texts: computational approaches.” ICLL 2023 22nd International Colloquium on Latin Linguistics. Prague, June 19–23, 2023.
Langslow, D. R. 2000. Medical Latin in the Roman Empire. Oxford: Oxford University Press.
Netz, R. 2003. The Shaping of Deduction in Greek Mathematics: A Study in Cognitive History. Cambridge: Cambridge University Press.
Pinkster, H. 2005. “The language of Pliny the Elder.” In Aspects of the Language of Latin Prose, eds. T. Reinhardt, M. Lapidge, and J. N. Adams, 239–256. Oxford.
Stenroos, Merja
Formulaicity and the individual voice in late medieval English legal statements
Legal documents present a paradoxical picture as historical linguistic evidence. On the one hand, they may be quite precisely dated and localized, and refer to specific individuals and events; some types of documents may also contain individual, even personal content. On the other hand, they tend to be heavily formulaic. This complexity is especially notable in the type of documents that we might categorize as legal statements. These are text types written in the first person and typically conveying a statement or commissive, including attestations, affidavits and vows of allegiance, as well as, in the ecclesiastical sphere, confessions and abjurations; testaments and wills may also be considered to belong to this category, as may receipts. Such documents reflect to a varying extent the voices of the person making the statement and the scribe drawing up the document; to what extent the formulaic content reflects the language of either is a challenging question, to which there can be no single answer. The proposed paper addresses this question with reference to late medieval English documentary materials, and argues that their multilingual and linguistically fluid context makes present-day concepts of formulaicity problematic.
The formulae found in late medieval English legal statements can seldom be described in terms of chunks of identical phrasing. Rather, the formulae make up a wide range of variants expressing approximately the same content but differing in phrasing, length, morphology and spelling. This variability, which has considerable implications for the linguistic study of the texts, reflects both the multilingual context of medieval English administration and the lack of standard models of writing. From the fifteenth century, English was increasingly acceptable as a medium of legal documents; at this point, however, written English had neither an established standard nor available conventions for administrative writing. As Latin continued to be the dominant administrative language, the varied usage in English documents probably reflects individual translations from Latin templates, whether memorized or actually consulted.
The proposed paper explores the different kinds of formulae found in fifteenth- and early sixteenth-century English legal statements and problematizes the concepts of both formulaicity and individual voice in these materials. It argues that a focus on formal identity is of limited use for understanding the interaction of conventions, registers and voices in these early materials, and suggests a more flexible approach to their study, with multiple levels of formulaicity and voice. The empirical material is drawn from A Corpus of Middle English Local Documents (MELD; 2017-), and the study will combine an overview of formulaicity in the statement category with a focussed discussion of two or three types of statement. In order to present the data and explain the approach, I would be very happy to have the extra 10 minutes suggested in the Call for Papers, but will be able to scale the paper as needed.
MELD = A Corpus of Middle English Local Documents. Version 2017.1. Compiled by Merja Stenroos, Kjetil V. Thengs & Geir Bergstrøm. University of Stavanger. www.uis.no/meld
Trombetta, Chiara
Discourse-organizational functions of formulaic language in Italian 16th-century historiographical texts
The use of formulaic language is a key strategy for structuring, organizing, and managing information within texts, while also enhancing communication with the reader. This proposal seeks to examine the various functions that formulaic language performs in 16th-century Italian historiographical works, with particular emphasis on two major texts from the Florentine literary tradition: Guicciardini’s Storia d’Italia and Machiavelli’s Istorie fiorentine.
Written texts are semantic-pragmatic entities composed of interconnected units across three structural levels: the logical-argumentative, thematic-referential, and enunciative-polyphonic levels (Ferrari 2022, p. 282). In the texts under consideration, formulaic and semi-fixed patterns contribute to the construction of these levels and facilitate transitions between them, making these shifts explicit to the reader. The significance of these phenomena becomes especially evident when considering that, in their original form, these texts lacked the modern organization of paragraphs and chapters, elements that were introduced only in later editions. Initially, these works were structured solely into books, following the humanistic canon. In this context, the use of fixed sequences serves multiple functions, aiding both the narrator in structuring the narrative and the readers in processing the information.
The recurring patterns identified in these works can be interpreted as conventionalized strategies aimed at addressing recurrent communicative needs within the text. Their fixed formal configurations and consistent associations with specific functions make them readily recognizable to the readers, especially after they have become familiar with the work, its rhythm, and its regularities. This aligns with Günthner and Knoblauch’s (1995, p. 8) broader definition of ‘communicative genres’, which are understood as historically and culturally specific, pre-patterned, and complex solutions to recurrent communicative challenges. Moreover, the use of specific fixed expressions connects to a broader context beyond the text itself, reflecting the authors’ intention to situate their work within a particular textual genre and discursive tradition. In fact, by adhering to and emulating established conventions, the authors not only conform to the norms of the genre but also engage with other texts, thereby demonstrating their familiarity with the literary tradition.
The talk will focus on formulaic expressions and semi-fixed patterns in the aforementioned texts, analyzing their role as mechanisms for transitioning between distinct textual levels. This includes their function in shifting between discursive topics, navigating enunciative levels during the introduction of direct speech, and facilitating narrative digressions and commentary transitions by the narrator.
De Roberto, E. (2013). Usi formulari delle costruzioni assolute in italiano antico: dal discorso alla grammatica. In C. Giovanardi & E. De Roberto, Il linguaggio formulare in italiano tra sintassi, testualità e discorso. Casoria: Loffredo, pp. 153-212.
De Roberto, E. (2023). La ripetizione dal discorso alla grammatica: L’apporto della prospettiva formulare (con una presentazione del progetto ForMa). In D. Mastrantonio, I. G. M. Abdelsayed, M. Marrucci, M. Bellinzona, O. Paris & V. Bianchi, Repetita iuvant, perseverare diabolicum. Un approccio multidisciplinare alla ripetizione. Siena: Edizioni Università per Stranieri di Siena.
Ferranti, L. (2010). Aspetti della sintassi e della testualità della storiografia toscana cinquecentesca. Doctoral Thesis, Università degli studi Roma Tre.
Ferrari, A. (2002). Il testo come intreccio di gerarchie. In Italiano LinguaDue 1, pp. 582-594.
Guicciardini, F. (1971). Storia d’Italia. A cura di Silvana Seidel Menchi in 3 volumi. Torino: Einaudi (I millenni).
Günthner, S., Knoblauch, H. A. (1995). Culturally patterned speaking practices - the analysis of communicative genres. In: Pragmatics 5, pp. 1-32.
Longrée, D., Mellet, S. (2017). A text structure indicator and two topological methods: New ways for studying Latin historic narratives. In Digital Scholarship in the Humanities, Vol. 32, No. 3, pp. 577-590.
Machiavelli, N. (2010). Istorie fiorentine. A cura di A. Montevecchi & C. Varotti, coord. di G.M. Anselmi, tt. 1 e 2, Edizione nazionale delle opere di Niccolò Machiavelli, II. Opere storiche. Roma, pp. 77-785.
Mellet, S., Longrée, D. (2009). Syntactical Motifs and Textual Structures: Considerations based on the Study of Latin historical Corpus. In Belgian Journal of Linguistics 23, pp. 161-174.
Tacke, F. (2021). Sprache, Genres und Diskurstraditionen. Kognitionslinguistische Modelle im Lichte der romanistischen Theoriebildung. In Romanistisches Jahrbuch 72(1), pp. 118-55.
Vatri, Alessandro
Aristotle’s ‘diagrammar’: Formulaicity and multimodality in the Organon
The treatises that compose Aristotle’s Organon contain the earliest systematic treatment of (what we would call) formal logic, a discipline which lends itself both to symbolic representation and to visualization, as medieval mnemonics and study aids testify. The intrinsic multimodal character of Aristotle’s formulations is revealed by his use of denotative letters. Rather than standing for logical variables (as interpreters have often — anachronistically — thought), these have been shown by Netz to be used in the same way as they are in discussions of geometrical objects and problems — an idea that is suggestive of possible oral and visual classroom practices.
This, however, is not the only point of contact between Aristotle’s logic and contemporary geometry. As this paper will show, Aristotle’s logical metalanguage displays striking correspondences with that of mathematical texts, conditioned as both disciplines were by the necessity to express verbally abstract relations in a consistent and rigorous manner in the absence of graphic symbols. Similarly to Greek mathematicians, Aristotle uses formulaic elements to express what modern logicians would express symbolically. Such elements include lexical items (e.g. quantifiers, deontics, etc.), connectives, syntactic patterns and constructions (e.g. kata tinos huparkhein ‘to apply as a predicate’). One of the measurable consequences of the formulaic character of Aristotle’s logic is its significantly low lexical variety in comparison both with other Aristotelian texts (even though discourse-structuring formulae — e.g. phaneron esti ‘it is clear that…’ — are conspicuous in the corpus) and samples of Greek prose of different genres. [As computed from the digital texts included in the Diorisis Ancient Greek Corpus (https://figshare.com/articles/dataset/The_Diorisis_Ancient_Greek_Corpus/6187256).]
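The comparison of lexical variety mentioned above could be approximated along the following lines. This is only a minimal sketch: the abstract does not state which measure was used, so the standardized type-token ratio (the mean ratio of distinct items to total items over fixed-size windows, which limits the effect of text length) is an assumption, as are the function name and the window size. The input may be a list of word forms or, for a lemmatized corpus, a list of lemmas.

```python
from typing import List


def standardized_ttr(tokens: List[str], window: int = 1000) -> float:
    """Mean type-token ratio over consecutive, non-overlapping windows.

    Hypothetical helper: the measure and the window size are assumptions,
    not the method reported in the abstract.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    ratios = []
    # Only full windows are scored, so every ratio has the same denominator.
    for start in range(0, len(tokens) - window + 1, window):
        chunk = tokens[start:start + window]
        ratios.append(len(set(chunk)) / window)
    if not ratios:
        # Text shorter than one window: fall back to the plain ratio.
        return len(set(tokens)) / len(tokens) if tokens else 0.0
    return sum(ratios) / len(ratios)
```

A lower value for one sample than for another of comparable genre would point to a more repetitive, and in that sense more formulaic, vocabulary.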
Netz, R. 2009. Ludic Proof. Greek Mathematics and the Alexandrian Aesthetic. Cambridge.
Netz, R. 2022. A New History of Greek Mathematics. Cambridge.
Netz, R. 2023. Aristotle’s Three Logical Figures: A Proposed Reconstruction. Phronesis 68, 62–77.
Schironi, F. 2010. Technical Languages: Science and Medicine, in E. J. Bakker (ed.), A Companion to the Ancient Greek Language, Oxford, 338–53.
Schironi, F. 2019. Naming the Phenomena. Technical Lexicon in Descriptive and Deductive Sciences, in A. Willi/P. Derron (eds), Formes et fonctions des langues littéraires en Grèce ancienne, Entretiens Hardt lxv, Vandœuvres, 227–58.
Vezzosi, Letizia & Rosselli Del Turco, Roberto
Poetic formulas in the Germanic literatures of the Middle Ages: semantic annotation and analysis
Since Magoun’s 1953 article, which was based on the theories of Milman Parry (Parry 1932), studies on poetic formulas in Old English and other Germanic languages have flourished, quickly becoming one of the most intriguing topics in literary research within the field of medieval Germanic cultures. The oral-formulaic composition system used in Germanic languages not only reveals significant connections with the Indo-European tradition, but also developed in a particularly sophisticated manner. Combined with other stylistic devices such as kennings and imaginative compounds, formulas play a fundamental role in the compositional process: they serve both as a mnemonic aid for memorization and subsequent recitation, and as an essential tool for the Germanic poet.
In the past, notable research employing quantitative methods has led to a general assessment of the relevance of formulas in Germanic poetic texts (Green 1971). More recent projects, such as the CLASP project (Orchard 2018), allow for identifying formulas in poetic texts and providing a hyperlinked list to assess the formula’s usage in context. However, we believe it is crucial to combine these research methods with a data preparation process that considers the semantic dimension. This would allow us to evaluate not only the context, frequency, and other statistical aspects of the formulas, but also their fundamental structure, the way this structure changes over time, and the specific usage patterns of individual authors, particularly their ability to innovate upon the pre-established repertoire (Russom 1978).
To achieve this goal, we intend to code sample texts in multiple Germanic languages using XML/TEI schemas, developing a flexible encoding model suited for cross-referencing within marked texts. A sufficiently rich level of annotation allows the TEI document to function as a repository of information that can be processed not only for text visualization, but also to generate new knowledge through analytical tools that enable complex searches and comparisons (Rosselli Del Turco 2021). The first step is to define an encoding model that takes into account the structural characteristics of the poetic formulas. This will also make it possible to establish a typology so that the sample texts used for the proposed research can be analyzed and compared in detail.
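As an illustration of how such an annotated TEI document might be queried, the sketch below parses a small fragment and groups the marked formulas by type. The encoding shown is an assumption, not the project's model: it uses the standard TEI elements `lg`, `l` and `seg`, but the `type="formula"` value and the `ana` pointers (`#speech-intro`, `#epithet`) are hypothetical labels chosen for the example. The two verses are well-known speech introductions from Beowulf.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical encoding: each formula is wrapped in <seg type="formula">,
# and @ana points to an entry in an assumed typology of formulas.
SAMPLE = """<lg xmlns="http://www.tei-c.org/ns/1.0">
  <l><seg type="formula" ana="#speech-intro">Hroðgar maþelode</seg>, <seg type="formula" ana="#epithet">helm Scyldinga</seg></l>
  <l><seg type="formula" ana="#speech-intro">Beowulf maþelode</seg>, <seg type="formula" ana="#epithet">bearn Ecgþeowes</seg></l>
</lg>"""


def formulas_by_type(xml_text):
    """Return {typology pointer: [formula texts]} for all marked formulas."""
    root = ET.fromstring(xml_text)
    grouped = defaultdict(list)
    for seg in root.iterfind(".//tei:seg[@type='formula']", TEI_NS):
        grouped[seg.get("ana", "")].append("".join(seg.itertext()))
    return dict(grouped)


if __name__ == "__main__":
    for pointer, texts in formulas_by_type(SAMPLE).items():
        print(pointer, texts)
```

Grouping by a typology pointer rather than by surface string is what would let structurally related formulas with different wording be compared across texts and languages.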
Orchard, Andy. 2018-. “CLASP: A Consolidated Library of Anglo Saxon Poetry.” Accessed September 25, 2021. https://clasp.ell.ox.ac.uk/.
Green, Donald C. 1971. “Formulas and Syntax in Old English Poetry: A Computer Study.” Computers and the Humanities 6 (2): 85–93. https://www.jstor.org/stable/30199462.
Magoun, Francis P. 1953. “Oral-Formulaic Character of Anglo-Saxon Narrative Poetry.” Speculum 28 (3): 446–67. https://doi.org/10.2307/2847021.
Parry, Milman. 1932. “Studies in the Epic Technique of Oral Verse-Making: II. The Homeric Language as the Language of an Oral Poetry.” Harvard Studies in Classical Philology 43:1–50. https://doi.org/10.2307/310666.
Rosselli Del Turco, Roberto. 2021. “Elaborazione di dati semi-strutturati: ipotesi implementative e casi d’uso tratti da testi in inglese antico.” Umanistica Digitale, no. 10 (September), 387–407. https://doi.org/10.6092/issn.2532-8816/12598.
Russom, Geoffrey R. 1978. “Artful Avoidance of the Useful Phrase in ‘Beowulf’, ‘The Battle of Maldon’, and ‘Fates of the Apostles.’” Studies in Philology 75 (4): 371–90. http://www.jstor.org/stable/4173979.
Wong, Catherine, Fitzmaurice, Susan & Lam, Benson SY
Tracing Formulaic Patterns and Language Change in Early Modern English: A Quantitative and Computational Approach
This study examines how formulaic expressions reveal linguistic change in Early Modern English, focusing on religious and institutional language. The methodology draws on the data-driven approach (Buerki, 2019, 2020; Hilpert & Cuyckens, 2015), which affords a broad overview of trends while also highlighting synchronic differences within the period. Using NLP techniques – including n-gram and temporal analysis – applied to the EEBO-TCP and ECCO-TCP corpora, the study tracks multi-word expressions (MWEs) across yearly and decadal intervals to explore trends leading up to Late Modern English. This approach enables a diachronic analysis of linguistic shifts while capturing the predominance of religious discourse and its gradual blending into institutional language.
In this pilot study, n-gram analysis was applied to over 40,000 documents (approximately two-thirds of the two corpora) to extract meaningful n-grams and identify MWEs. These expressions were examined using both frequency-based and statistical collocation measures. Temporal analysis revealed patterns of language variation over time, particularly in religious discourse, providing insights into historical changes.
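The combination of frequency-based and statistical collocation measures can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not name the association measure, so pointwise mutual information (PMI) is used here as one common choice, and the function names and the `min_count` threshold are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bigram_pmi(tokens, min_count=2):
    """Score bigrams by PMI, skipping those rarer than min_count.

    PMI = log2( P(w1, w2) / (P(w1) * P(w2)) ); an assumed measure,
    not necessarily the one used in the study.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(ngrams(tokens, 2))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


if __name__ == "__main__":
    toy = "lord jesus christ our lord jesus christ".split()
    print(Counter(ngrams(toy, 3)).most_common(1))
    print(bigram_pmi(toy))
```

Raw frequency favours sequences built from very common words, whereas PMI rewards words that co-occur more often than their individual frequencies predict. Using both is one way to separate genuine multi-word expressions from accidental sequences.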
Religious utter sequences (USs) such as ‘lord jesus christ’, ‘lord thy god’, ‘father son holy spirit’, and ‘father son holy ghost’ dominate the data, functioning as verbal routines in prayer books and liturgy. Commandments like ‘thou shalt love thy neighbour’ also feature prominently, not as literal instructions but as fixed phrases lifted from canonical religious texts that reflect social and religious norms. These formulaic expressions, as Lakoff and Johnson (2008) argue, are phrasal lexical items constructed by ‘metaphorical concepts’. Rather than conveying novel meaning in specific contexts, they are contextually appropriate within religious doctrine, functioning as institutional utterances that reinforce faith and social order. This highlights the prescriptive nature of religious discourse during the period, where language served to reiterate and maintain established religious and societal structures.
In contrast, the multi-word unit ‘chief lord justice’ stands out in the mid-1600s as a non-religious salutation or address, marking the growing salience of secular social hierarchy, although few similar expressions appear. Collocations of ‘church’ offer further insight into the evolving relationship between religious and institutional discourse. While ‘god’ and ‘christ’ as collocates of ‘church’ prevail consistently throughout the period, ‘england’, ‘rome’, and ‘roman’ as collocates of ‘church’ begin to rise after the mid-1600s, peaking around 1690. Investigating these collocations provides a valuable perspective on how the term ‘church’ transitioned from a primarily religious concept to one embedded in civic and institutional language over the course of the 17th century.
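A decade-by-decade view of the collocates of a node word such as ‘church’ might be computed roughly as below. The sketch assumes that each document comes with a publication year and a token list; the function name, the window size and the toy documents in the usage example are invented for illustration and are not drawn from the corpora.

```python
from collections import Counter, defaultdict


def collocates_by_decade(docs, node, window=4):
    """Count words occurring within `window` tokens of `node`, per decade.

    docs: iterable of (year, tokens) pairs (an assumed input format).
    Returns {decade: Counter({collocate: count})}.
    """
    result = defaultdict(Counter)
    for year, tokens in docs:
        decade = (year // 10) * 10
        for i, tok in enumerate(tokens):
            if tok != node:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    result[decade][tokens[j]] += 1
    return dict(result)


if __name__ == "__main__":
    toy_docs = [
        (1612, ["the", "church", "of", "god"]),
        (1691, ["church", "of", "england"]),
        (1695, ["the", "church", "of", "rome"]),
    ]
    print(collocates_by_decade(toy_docs, "church", window=2))
```

For a real comparison across decades, the counts would need to be normalized, for example per occurrence of the node word, since the number of surviving texts differs from decade to decade.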
This study contributes to the field of historical linguistics by illustrating how formulaic expressions reveal shifts in language related to both religious and institutional contexts in the Early Modern English period. By employing NLP techniques and a data-driven approach, the research highlights the significance of these expressions in tracking linguistic change and underscores their role in reflecting evolving social hierarchies and norms.
Buerki, A. (2019). Furiously fast: On the speed of change in formulaic language. Yearbook of Phraseology, 10(1), 5-38. https://doi.org/10.1515/phras-2019-0003
Buerki, A. (2020). Formulaic Language and Linguistic Change: A Data-Led Approach. Cambridge University Press.
Hilpert, M., & Cuyckens, H. (2015). How do corpus-based techniques advance description and theory in English historical linguistics? An introduction to the special issue. Corpus Linguistics and Linguistic Theory, 11(2), 141-150.
Lakoff, G., & Johnson, M. (2008). Metaphors We Live By (2nd ed.). University of Chicago Press.
Wong, Jorge
The Formulaic Template and Linguistic Innovation in Homer
This paper explores the connection between formulaic language and linguistic innovation in Homeric diction. Since the pioneering studies of Milman Parry, Homeric scholars have equated “formulaic” with “archaic” in the language of Homer, especially regarding Aeolic features, considered vestiges of an older phase of epic poetry. The argument is that the Homeric poets, compelled by the exigencies of extemporaneous oral composition, resorted to the constructions and collocations that they knew best, the ones they had heard from their own masters. Some resistance to this theory was offered by Hoekstra and Hainsworth, both of whom sought to find room for creativity in the Homeric formula either through the modification of formulas or a flexibility inherent in the formulas themselves. Recently, Nussbaum and Tate have pointed out examples of formulaic language not tied to specific formulas. These formulaic templates are abstract syntactic patterns that can generate surface outputs with no lexical items in common.
In this paper, I will show that these formulaic templates also encode for specific dialect forms. Specifically, I investigate the distribution of some of the Aeolic features deemed most formulaic in Homer, such as the genitive in -οιο and dative in -εσσι. These endings usually appear either at the feminine caesura or at the end of the verse, positions long thought to harbor archaisms in Homeric language. Using the framework of the formulaic template, I demonstrate that the poets of the Iliad and Odyssey perceived a relationship between certain parts of the verse and the different dialect endings, so much so that when introducing novel linguistic forms into the poetic dialect, they outfitted them with these Aeolic endings to fit certain caesuras, e.g. νέεσσι (cf. Aeolic νᾶεσσι and Ionic νηυσί) and ἐπέεσσι (cf. Aeolic ἔπεσσι and Ionic ἔπεσι). This same process led to the creation of mixed dialect formulas like line-final Μενελάου κυδαλίμοιο (14x), ὁμοιΐου π(τ)ολέμοιο (8x), and στυγεροῦ πολέμοιο (Δ 240, Ζ 330). More abstract syntactic patterns are reflected in parallel constructions like ἀπὸ βηλοῦ θεσπεσίοιο (Α 591) ~ ἀπὸ χαλκοῦ θεσπεσίοιο (Β 457), and υἱὸς ὑπερθύμοιο Κορώνου Καινεΐδαο (Β 747) ~ υἷε δύω Λήθοιο Πελασγοῦ Τευταμίδαο (Β 843), and αὐτὰρ ὃ Ἰφίκλοιο πάϊς τοῦ Φυλακίδαο (Ν 698).
Hainsworth, J.B. 1968. The Flexibility of the Homeric Formula. Oxford.
Hoekstra, A. 1965. Homeric Modifications of Formulaic Prototypes. Amsterdam.
Nussbaum, A.J. 2018. “The Homeric Formulary Template and a Linguistic Innovation in the Epics.” In Language and Meter, edited by D. Gunkel and O. Hackstein. Leiden; Boston. pp. 267–318.
Parry, M. 1932. “Studies in the Epic Technique of Oral Verse-Making: II. The Homeric Language as the Language of an Oral Poetry.” Harvard Studies in Classical Philology 43: 1–50.
Tate, A. P. 2011. “Modularity and the Spectrum of Formularity in the Homeric Corpus.” Cornell Dissertation.
Yiftach, Uri
Syntactical Transformations in the Clause recording the Duties of the Lessee in Early Roman Egypt
Greek lease contracts from Egypt regularly record the duties of the lessee for the duration of the contract. However, whereas the routine formulation of the text is initially, in the Ptolemaic period and in the Roman Arsinoites, paratactical (the act being recorded in an independent clause), in contemporary Oxyrhynchos, and later on throughout Egypt, the duties of the lessee are recorded hypotactically, introduced through the semiconsequential or semifinal ἐπὶ τῷ, ὥστε, or ἐφʼ ᾧ, all with the infinitive, which is predominantly in the aorist tense. The semiconsequential clause is appended to the 'creation clause', the clause that records the act of lease per se, stressing the tight connection between the act of lease and the consequential duties. In my proposed paper I shall discuss three documents from second- and third-century Oxyrhynchos that exhibit both the paratactical and the hypotactical formulation: P.Oxy. L 3596 (ca. 240-255 CE), P.Ross.Georg. II 19 (141 CE), and PSI XIII 1338 (299 CE). I will investigate the cohabitation and respective position of both types of duty clauses in the same documentary context.
Zilio, Leonardo & Arblaster, Paul
Exploring formulaic language in 17th-century Dutch-language newspaper articles
Journalism is one sphere in which formulaic language is common, since it creates communicative shortcuts that enable news to be conveyed quickly and to be easily fitted into existing mental categories. Europe’s first weekly newspapers began to be published in the early seventeenth century: in Germany from 1605, the Dutch Republic from 1618, and the Spanish Netherlands and England from 1620. These newspapers often reported the same news, sometimes on the basis of the same newsletters or simply by copying it from one another. The focus was on great public events: movements of fleets and armies, military engagements, arrivals and departures of ambassadors, the public life of royal families, and the publication of decrees or proclamations.
While the weekly production of thousands of words of text should in theory create a massive multilingual corpus, in practice survivals are patchy and unpredictable, and do not always overlap. One of the best-preserved news series of the early seventeenth century is that of the Nieuwe Tijdinghen, published in Antwerp in the years 1620-1629 (see Arblaster 2024). Its unusually good survival is due to some owners having issues bound into annual volumes to keep a year’s overview of the news (Pettegree 2015). Between collections in Amsterdam, Antwerp, Ghent, The Hague, London, and above all the Royal Library of Belgium in Brussels, most issues survived and have been collectively catalogued both digitally [USTC N4-1 to N4-1460] and in print (Der Weduwen 2017). Many issues, particularly those held in Antwerp, Brussels and Ghent, have now been made available online, but only as images rather than text.
This study uses a relatively small sample, mostly transcribed in the late 1990s by Paul Arblaster while working on his doctoral thesis, which after revision was published as From Ghent to Aix: How They Brought the News in the Habsburg Netherlands, 1550-1700 (Arblaster 2014). With the aid of computational tools, we investigate linguistic features of this 51k-word corpus, focusing specifically on recurrent expressions, terminology and formulaic language. We use corpus linguistics tools, such as AntConc (Anthony 2005) and Sketch Engine (Kilgarriff et al. 2008) to produce an initial overview of n-grams, collocations, keywords and multi-word terms, to then further organise and expand these lists using a semi-automatic approach.
The historical writing that is present in these documents means that the original lists extracted with automatic tools need to be manually validated to some extent, as many words that are not present in modern corpora will result in false positives for the analysis, for instance, of keywords and terms. After a manual revision, Python scripts are employed to search for variant spellings by using edit-distance algorithms. We also briefly test whether tools based on generative artificial intelligence can help in detecting textual regularities and terminology in these historical documents. The resulting lists of expressions and terms are then further contextualised with the help of concordances and example-sentences to form a repository of the formulaic language that is present in these historical newspaper articles.
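The search for variant spellings by edit distance can be sketched as follows. This is a minimal illustration, not the authors' scripts: it assumes plain Levenshtein distance (insertions, deletions and substitutions each costing 1), and the function names, the `max_distance` threshold and the example spellings are hypothetical.

```python
def levenshtein(a, b):
    """Levenshtein distance between two strings (two-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def variant_candidates(target, vocabulary, max_distance=2):
    """Return vocabulary items within max_distance edits of target, sorted."""
    return sorted(w for w in vocabulary
                  if w != target and levenshtein(target, w) <= max_distance)


if __name__ == "__main__":
    # Illustrative spellings only; not taken from the transcribed corpus.
    vocab = {"tijdinghen", "tydinghen", "tijdingen", "antwerpen"}
    print(variant_candidates("tijdinghen", vocab))
```

A fixed threshold produces false matches among short words, so in practice the candidates would still need the kind of manual validation described above. Scaling the threshold to word length is one possible refinement.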
Anthony, L. (2005, July). AntConc: design and development of a freeware corpus analysis toolkit for the technical writing classroom. In IPCC 2005. Proceedings. International Professional Communication Conference, 2005. (pp. 729-737). IEEE.
Arblaster, P. (2014). From Ghent to Aix: How They Brought the News in the Habsburg Netherlands, 1550-1700. Leiden: Brill.
Arblaster, P. (2024). Las noticias publicadas por Abraham Verhoeven en Amberes en 1621, in Manuel Borrego and Carmen Espejo-Cala (eds.) El mundo en 1621: Avisos, relaciones de sucesos, conexiones culturales. Besançon: Presses universitaires de Franche-Comté, 139-161.
Der Weduwen, A. (2017). Dutch and Flemish Newspapers of the Seventeenth Century, 1618–1700, vol. Leiden: Brill, 334-417.
Kilgarriff, A., Rychly, P., Smrz, P., & Tugwell, D. (2008). The sketch engine. Practical Lexicography: a reader, 297-306.
Pettegree, A. (2015). Tabloid Values: On the Trail of Europe’s First News Hound, in Richard Kirwan and Sophie Mullins (eds.) Specialist Markets in the Early Modern Book World. Leiden: Brill, 15-34.