phraseo-mwe2026 : Multi-word expressions and phraseology – corpus-based and computer-processed
Vienna (Austria)
29-29 Sep 2026
Workshop date: 29 September 2026, 09:00–14:00
Key Words: Phraseology, Multi Word Expressions (MWEs), Corpus, Lexicography, Computer Science
Phraseology (Cowie 1998; Granger & Meunier 2008; Mitkov 2017; Mel’čuk 2023; Polguère 2002, 2014; Mejri 2018; Chen 2021) and multiword expressions (MWEs) (Savary 2008; Constant 2012) occupy a central position in linguistic description, language use, and language learning. Idioms, collocations, lexical bundles, and fixed or semi-fixed expressions constitute a significant part of natural language, yet they remain challenging to model, describe, and represent in lexicographic resources (Pecina 2010; Mel’čuk 2011; Polguère 2014; Chen 2025). With the rapid development of corpus linguistics, Natural Language Processing (NLP), and Artificial Intelligence (AI), phraseology and lexicography are undergoing a profound methodological and conceptual transformation.
From a modeling perspective, large-scale corpora (Mitkov 2017), both general and specialized, enable the systematic identification of MWEs through statistical, distributional, and syntactic approaches. Methods such as n-gram extraction, association measures (PMI, t-score), syntactic patterning, and embedding-based similarity allow researchers to capture degrees of fixedness, semantic compositionality, and contextual variability. These advances open new perspectives for representing phraseological knowledge in structured models, including ontologies, lexical networks, and standards such as OntoLex-Lemon (McCrae; Bosque-Gil et al. 2017; Bosque-Gil et al. 2019).
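As a rough illustration of the association measures mentioned above, the following minimal Python sketch scores adjacent word pairs with PMI and t-score. The function name `bigram_association`, the `min_count` threshold, and the toy sentence are assumptions made for this example only; it also uses the common simplification of estimating the expected frequency as c(w1)·c(w2)/N, and it is not a description of any particular tool or of the methods expected from workshop contributions.

```python
import math
from collections import Counter


def bigram_association(tokens, min_count=2):
    """Score adjacent word pairs with PMI and t-score.

    Hypothetical helper for illustration. The expected frequency of a
    pair under independence is approximated as c(w1) * c(w2) / N,
    where N is the number of tokens.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))

    scores = {}
    for (w1, w2), observed in bigrams.items():
        if observed < min_count:
            continue  # rare pairs give unreliable (inflated) PMI values
        expected = unigrams[w1] * unigrams[w2] / n
        pmi = math.log2(observed / expected)
        t_score = (observed - expected) / math.sqrt(observed)
        scores[(w1, w2)] = (pmi, t_score)
    return scores


# Toy corpus (invented for the example, not real corpus data).
tokens = "he kicked the bucket and she kicked the bucket too".split()
scores = bigram_association(tokens)

for pair, (pmi, t_score) in sorted(scores.items()):
    print(pair, round(pmi, 3), round(t_score, 3))
```

PMI tends to favour rare but exclusive pairs, whereas the t-score favours frequent ones, which is why the two measures are often reported side by side and combined with a frequency threshold such as `min_count`.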
Technological tools have also profoundly reshaped lexicographic practices (Atkins & Rundell 2008; Granger & Paquot 2012). Digital corpora, web data, and annotation platforms facilitate the semi-automatic extraction and validation of phraseological units (Mitkov 2017; Evert 2008; Gries 2008; Constant et al. 2017). New-generation lexical resources integrate rich metadata, usage examples, frequency information, and semantic relations, making them more dynamic and interoperable (Polguère 2014; Cimiano et al. 2016; McCrae et al. 2017). Computational approaches also enable multilingual and contrastive resources essential for studying phraseological variation across languages and cultures (Paquot 2015; Mel’čuk 2011).
New technologies play a crucial role in the design of pedagogical dictionaries and learning-oriented resources (Bogaards & van der Kloot 2002; Lew 2012). Phraseology is often a major obstacle for language learners, as MWEs cannot always be interpreted compositionally (Wray 2002; Howarth 1998; Granger 1998). Corpus-based examples, learner-oriented definitions, and adaptive digital interfaces can significantly enhance the accessibility of phraseological information. AI-driven tools and LLM-assisted lexicography offer promising avenues for generating contextualized examples and learner-adapted usage notes (Heift & Schulze 2007; Godwin-Jones 2023; Bender & Koller 2020).
In translation lexicography, phraseological units pose well-known challenges due to their idiomaticity and phraseocultural specificity (Chen 2022a; 2022b). Parallel corpora, alignment tools, and machine translation systems provide valuable data for identifying translation equivalents and strategies (Chen et al. 2024). Digital translation dictionaries can now incorporate cross-linguistic mappings, semantic annotations, and attested examples, bridging lexicographic description and translational practice.
Finally, computational analysis of existing dictionaries (Béjoint 2010; Lew 2013) offers new insights into lexicographic traditions. Large-scale comparison of entries, coverage, and microstructure contributes to understanding how dictionaries evolve in response to technological and societal changes (Hausmann 1989; Tarp 2012).
The intersection of phraseology, MWEs, lexicography, and new technologies constitutes a fertile research domain, fostering innovative models, richer resources, and more effective tools for analysis, learning, and translation.
| Date | Milestone |
| --- | --- |
| March 8, 2026 | Abstract submission |
| April 2, 2026 | Abstract acceptance notification |
| April 17, 2026 | Deadline for confirmation of workshop attendance |
| July 10, 2026 | Early Bird registration deadline for EURALEX |
| July 24, 2026 | Submission of camera-ready abstracts |
| September 1, 2026 | End of registration for EURALEX |
| September 29, 2026 | Phraseo-MWE-2026 at the Austrian Academy of Sciences, Vienna |
Dr. CHEN Lian 陈恋, Laboratoire LLL, University of Orléans; CRLAO – CNRS – INALCO, France, https://lianchen.fr/
Dr. KABASHI Besim, Eberhard Karls Universität Tübingen, Germany, https://www.besim-kabashi.net/