IPN at MWE-2026 PARSEME 2.0 Subtask 1: MWE identification via related languages and harnessing thinking mode

Conference contribution (Article) | Research | Peer reviewed

Publication data


By: Anna Hülsing, Noah-Manuel Michael, Daniel Ignacio Mora Melanchthon, Andrea Horbach
Original language: English
Published in: Proceedings of the 22nd Workshop on Multiword Expressions (MWE 2026)
Pages: 177-186
Editor (Publisher): Association for Computational Linguistics
ISBN: 979-8-89176-363-0
DOI/Link: https://doi.org/10.18653/v1/2026.mwe-1.24 (Open Access)
Publication status: Published – 03.2026

We present IPN, our system for Subtask 1 of the PARSEME 2.0 Shared Task, which targets the identification of multiword expressions (MWEs) in 17 languages. Overall, IPN outperformed a baseline model with substantially more parameters, yet a performance gap to the top-performing systems remains. To better understand these results, we investigate Qwen3-32B’s suitability for mono-, cross-, and multilingual MWE identification. We also explore whether this model benefits from prepending automatically generated thinking data to the gold label during instruction-tuning. We find that target-language data is vital for instruction-tuning. Prepending generated thinking data to a subset of the training data slightly improves performance for two out of three languages, but more detailed evaluation is required.