Projects

MAVA

2025-2026

Project partners:

Dr. Teodora Vuković, Dr. Jeremy Zehr and Prof. Dr. Noah Bubenhofer (UZH Linguistic Research Infrastructure), Prof. Dr. Josephine Diecke and Dr. Simon Spiegel (UZH Seminar für Filmwissenschaft), Prof. Dr. Ralph Ewerth and Dr. Eric Müller-Budack (Visual Analytics Group, Technische Informationsbibliothek, Leibniz University Hannover), Dr. Cristina Grisot (CLARIN-CH National Coordinator), Dr. Stefan Milosavljevic (Swiss Data Service Center)

Funded through the Swiss Data Service Center (SDSC)

Multimodal Audio-Visual Analysis Cluster for Infrastructure Interoperability and Data Exchange – MAVA

The MAVA Cluster aims to improve the interoperability and data exchange among three research tools – VideoScope, TIB-AV-A, and VIAN. By standardizing data formats and developing shared APIs, the project will enhance data sharing and analysis capabilities across linguistic, media studies, and audiovisual research. This infrastructure will, for the first time, enable shared research workflows and ensure adherence to FAIR principles, enhancing the accessibility and reusability of research data. The project’s integration with the CLARIN infrastructure will align it with European research standards, significantly aiding the research community’s ability to handle complex multimodal datasets.

Leading House: Linguistic Research Infrastructure LiRI (UZH)

My role:

Project partner – Data exchange and link to CLARIN-CH


MORCDA

2025-2026

Project partners:

Prof. Dr. Philipp Dreesen (ZHAW), Prof. Dr. Waldemar Czachur (University of Warsaw), Prof. Dr. Stephanie Evert (Friedrich-Alexander-Universität Erlangen-Nürnberg), Dr. Cristina Grisot (National Coordinator CLARIN-CH), Prof. Dr. Goranka Rocco (University of Ferrara)

Funded through the OSCARS Cascading Grants

Making Open Research Data Suitable for Comparative Discourse Analysis – MORCDA

The promotion of a European public sphere is critical for the cultivation of democratic discourse and opinion formation. In this context, discourse analysis offers profound insights into transnational media discourses, supported by multilingual and comparative perspectives. Recent advancements in discourse analysis leverage machine learning and corpus analysis, expanding the scope for cross-border linguistic research. However, the full potential of open research data (ORD) for comparative, multilingual discourse analysis remains underutilised.

Project objective

MORCDA will bridge gaps between comparative discourse researchers and open research data communities, facilitating the adoption of ORD practices and optimising research infrastructures (RIs) for enhanced comparative analysis across European languages and contexts. It will do so through workshops with academic partners in Europe. In particular, it will focus on:

  • Data and Metadata Standardisation and Access: Process linguistic data from diverse sources (media, parliament) using advanced computational tools to ensure compatibility and adherence to FAIR principles in, e.g., CLARIN.
  • Best Practices and Tool Development: Identify needs within the comparative discourse analysis community and develop best practices and digital tools tailored for multilingual and comparative research.
  • Community Engagement and Training: Develop targeted training materials to elevate ORD skills among comparative discourse researchers, promoting the integration of digital language resources.

Leading House: Zurich University of Applied Sciences (ZHAW)

My role:

Project partner – contribution about Open Science and Open Research Data; Link to CLARIN Infrastructure


EOSC Data Commons

2025-2028

Project partners:

Austria: EODC Earth Observation Data Centre
Croatia: Sveučilište u Zagrebu, Sveučilišni Računski Centar (SRCE)
Czech Republic: CESNET z. s. p. o.
Finland: Helsingin Yliopisto / University of Helsinki
France: Centre National de la Recherche Scientifique (CNRS)
Germany: Albert-Ludwigs-Universität Freiburg, Forschungszentrum Jülich GmbH, Technische Universität Dresden
Italy: Consorzio Interuniversitario Risonanze Magnetiche di Metallo Proteine (CIRMPP), Istituto Nazionale di Fisica Nucleare (INFN)
Netherlands: DANS Koninklijke Nederlandse Akademie van Wetenschappen (KNAW), Universiteit Utrecht, Stichting COAR
Poland: Akademia Górniczo-Hutnicza w Krakowie CYFRONET
Spain: Agencia Estatal Consejo Superior de Investigaciones Científicas, Universitat Politècnica de València
Switzerland: Organisation Européenne pour la Recherche Nucléaire (CERN), Eidgenössische Technische Hochschule Zürich, SIB Swiss Institute of Bioinformatics, SWITCH, Universität Zürich

Funded through HORIZON-INFRA-2024-EOSC-01-05 — Innovative and customizable services for EOSC Exchange

EOSC Data Commons: Services for inter- and cross-disciplinary data discovery, access, sharing and reuse in the EOSC Federation

In the current European Research Area, a wealth of datasets, analysis platforms, and digital services of national and pan-European relevance are available. However, their research value chain is only partly realised due to policy and technical barriers. This project addresses this challenge by supporting research data lifecycle management through the development and provisioning of new services for the integrated discovery, analysis, deposition, preservation, sharing, use and reuse of research data. The project builds on the existing functional capabilities of the European Open Science Cloud (EOSC) EU Node, and extends these with six new technical solutions captured by the project Key Exploitable Results.

The project involves 12 national repositories of cross-institute and cross-border pan-European relevance in the future EOSC federation.

The ultimate impact of the project is to contribute to the establishment of EOSC as the European Research Commons, ‘a global trusted ecosystem that provides seamless access to high-quality interoperable research outputs and services’ (Payne et al. 2023) that enables European researchers to collaborate more easily, be more productive and achieve higher levels of excellence. The EOSC Data Commons technical solutions leverage the resources and capabilities of Research Infrastructures and e-Infrastructures, and other EOSC stakeholders at national, regional and European level, elevating their technology readiness and integrability.

Leading House: Stichting EGI (EGI)

My role:

Project partner – coordination of the SSHOC-CH national consortium (UZH, DaSCH, SWISSUbase)


FAIR-FI-LD

July 2024-June 2025

Project partners:

Prof. Dr. Noah Bubenhofer (LiRI, University of Zurich), Prof. Dr. Julia Krasselt (ZHAW), Prof. Dr. Johanna Miecznikowski (USI), Dr. Cristina Grisot (National Coordinator CLARIN-CH)

Funded by swissuniversities

Moving towards a national FAIR-compliant ecosystem of Federated Infrastructure for Language Data – FAIR-FI-LD

This is a swissuniversities ORD-funded 12-month project (July 2024-June 2025) hosted by the University of Zurich, with the participation of the Zurich University of Applied Sciences and the Università della Svizzera italiana. Over the last 5-10 years, Swiss higher education institutions (HEIs) have been building national services for language data. So far, these include the Linguistic Research Infrastructure (LiRI-UZH), the Swiss-AL Platform for Applied Sciences (ZHAW), LaRS@SWISSUbase (UNIL, UZH), a national repository for the publication and long-term preservation of language data, and various smaller tools and services. These units, however, are not all interoperable, which reduces the potential for collaboration and data reuse. In addition, fields such as interactional linguistics or second language acquisition lack adequate infrastructure.

With the foundation of the CLARIN-CH consortium in 2020 (9 HEIs and the SAGW), the HEIs’ efforts took a new direction: working together to build a FAIR-compliant, sustainable and expandable CLARIN-CH ecosystem of federated infrastructure that answers the needs of researchers and professionals using language data in Switzerland and beyond, and that is interoperable at the national and European levels. The present project aims to take important steps towards this mid- and long-term goal, in compliance with the Swiss ORD strategy, by prototyping and producing:

  • interoperable underlying software using NLP techniques and exploratory AI techniques
  • harmonized metadata between the existing Swiss infrastructure components and the European CLARIN infrastructure
  • CLARIN federated content search (FCS) to query each component of the infrastructure
  • a multilingual FCS landing page hosted on the CLARIN-CH website
  • a frontend of the VIAN-DH@LiRI environment to visualize, query and analyze multimodal talk-in-interaction data, hosted at USI
  • documentation and training to support the use of the infrastructure and inform about legal and ethical issues related to language data in the context of Open Science.

My role:

Principal investigator: project management and responsibility for two work packages.


UpLORD

March 2023-June 2025

Project partners:

Prof. Dr. Noah Bubenhofer (LiRI, University of Zurich), Dr. Andrea Malits (Zurich University Library), Dr. Cristina Grisot (National Coordinator CLARIN-CH)

Funded by swissuniversities

Upgrading the linguistic ORD-ecosystem – UpLORD

This is a swissuniversities ORD-funded 2-year project (2023-2024) hosted by the University of Zurich, with the support of the Zurich University Library and the CLARIN-CH Consortium. Since 2018, a consortium of partners has been working on building a national ecosystem of infrastructures that covers the whole linguistic data lifecycle according to ORD requirements (FAIR principles: Findable, Accessible, Interoperable, Reusable), from data generation, processing and analysis to data sharing and archiving. This ecosystem includes the national technology platform LiRI and the national repository for publishing and archiving linguistic data (SWISSUbase) as service providers, a database of Swiss media texts, and a platform for hosting and searching large text and audio/video corpora.

The project focuses on upgrading workflows and interoperability of existing infrastructure services, establishing working groups at the national level, documenting and promoting best practices, raising awareness of and providing training in ORD practices in the context of teaching, research and publishing, and building a robust practice of data curation. In the long term, this project will contribute significantly to a strong foundation for a sustainable ORD strategy for linguistic data in Switzerland.

My role:

Principal investigator: project management and responsibility for two work packages.


Temporalité et morphologie flexionnelle verbale en français

2021-2025

Project partners:

Prof. Marion Fossard (U. of Neuchâtel), Dr. Peggy D’Honincthun (Clinical Neurosciences Dept., CHUV), Prof. Jean-François Démonet (Faculty of Biology and Medicine, U. of Lausanne), Prof. Noémie Auclair-Ouellet (Sciences and Disorders, McGill University), Dr. Cristina Grisot (Language scientist, UNINE)

Funded by the Swiss SNSF

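Temporalité et morphologie flexionnelle verbale en français : comment les marques flexionnelles témoignent-elles des compétences en référence temporelle? (Temporality and verbal inflectional morphology in French: how do inflectional markers reflect temporal reference abilities?)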

In this project, we propose to revisit the ‘cognitive-linguistic’ architecture of temporal reference. More specifically, our ambition is to go beyond the morpho-phonological or morpho-semantic aspects of temporal reference marking and to integrate the conceptual and memory-related aspects of temporality into the analysis of verbal tenses. To achieve this, we will work with different groups of participants: healthy participants, participants with aphasia (PAA), and participants with Alzheimer’s disease (PMA), a population for which knowledge about temporal grammatical abilities is scarce (or even non-existent for French). This ‘threefold perspective’ will give us an extremely rich dataset that is well suited to meeting our objectives.

My role:

Project partner: provide linguistic and experimental expertise for planning the experimental tasks and building the materials; collaborate on interpreting the data and reporting the results.


Intercepter avec des interprètes

2019-2023

Project partners:

Prof. Nadja Capus (Law Faculty, U. of Neuchâtel), Prof. Mag. Dr. Mira Kadric-Scheiber (U. of Vienna), Prof. Dr. Esther González-Martínez (U. of Fribourg), Dr. Cristina Grisot (Language scientist, UNINE)

Funded by the Swiss SNSF

Le travail des interprètes lors de l’interception de communications dans le cadre d’enquêtes pénales (The work of interpreters during the interception of communications in criminal investigations)

Intercepting wire, oral, or electronic communication is an important element of criminal investigations. The goal is to transform communication intercepts into evidence of probable cause. This measure of secret surveillance is technically and legally possible, but expensive, and of course only of use if the content of the conversations can be understood, that is, made available by interpreters. Hence, criminal justice depends entirely on the quality of the interpreters’ work. Interpreters lay the very foundation for subsequent interrogations and for decisions by the Public Prosecutor on whether to take further coercive measures.

According to the Swiss Criminal Procedure Code, jurisprudence and legal doctrine have so far neglected the significant and powerful role of these interpreters, whose activities are very different from those of courtroom or police interrogation interpreters. Scientific research has also mostly focused on courtroom interpreting, presumably because its context makes it more accessible. However, interpreters involved in interception face specific challenges and must have different qualities than courtroom interpreters, including special linguistic skills such as dialect knowledge, voice recognition skills, criminal investigation flair, even insider knowledge. Interpreters listen, select extracts, interpret, and transcribe. They are important contributors to the inevitable “entextualization” process—that is, the ways in which parts of intercepted conversations are categorized as incriminating and thus converted into criminal evidence.

My role:

Project partner: carry out an empirical analysis of linguistic strategies used by interpreters and report the results in a scientific article.


Temporal reference in Mandarin

2022-2025

Project partners:

Dr. Juan Sun (Sun Yat-sen University), Dr. Cristina Grisot (Language scientist, UNINE, UZH)

Funded by the Chinese Government

Temporal reference in Mandarin

This project investigates the expression of temporal reference in Mandarin, a tenseless language. Typologically, Mandarin is an isolating (or analytic) language with a limited number of grammatical markers. For example, there is no overt grammatical marking of tense. The aim of the project is to investigate, in a systematic and quantitative way by means of authentic corpus data, how temporal reference is expressed in Mandarin (in the absence of verbal tenses). 

My role:

Project partner: provide linguistic and empirical expertise for planning corpus analyses, analyzing and interpreting the data, reporting on the results.


VTS

2018-2021

Project partners:

Prof. Jacques Moeschler (U. of Geneva), Prof. Genoveva Puskas (U. of Geneva), Dr. Joanna Blochowiak (CNRS, Lyon), Dr. Juan Sun (Sun Yat-sen University), Dr. Cristina Grisot (Language scientist, UNIGE)

Funded by the Swiss SNSF

Verbal Tenses and Subjectivity in a cognitive pragmatics perspective

The research carried out in the VTS project follows four main research lines: (i) the pragmatics of verbal tenses and their relation to subjectivity in French; (ii) the link between subjectivity and verbal tenses, specifically the Historical Present and how it differs from Free Indirect Discourse (FID); (iii) the linguistic cues of speaker’s subjectivity; and (iv) the expression of temporal reference in typologically different languages: Mandarin on the one hand and French and English on the other.

My role:

Post-doctoral researcher: Planning and collaborating in the management of the project; collecting, cleaning, analyzing, interpreting and managing linguistic corpus and experimental research data; writing, editing and revising the ensuing scientific articles.


Processing temporal relations

April 2018 – March 2019

Project partners:

Dr. Cristina Grisot (Language scientist, UNIGE), Dr. Hannah Rohde (University of Edinburgh)

Funded by the Swiss SNSF

Pieces of discourse such as “The plane landed, (then) passengers got off” and “The plane landed unexpectedly because it had a technical problem” show that discourse connectives, generally used to express a discourse relation explicitly, are sometimes necessary and sometimes redundant. Scholars to date have not attended to the divergent behaviour of the various types of discourse relations and their explicit expression through discourse connectives. Addressing this issue is necessary for an accurate understanding of how humans process pieces of discourse and how linguistically encoded information guides this process.

My role:

Principal investigator and postdoctoral researcher: Planning, submitting for funding and managing the research project, delivering and distributing tasks to a research team, collecting, analyzing and managing research data, managing the budget and writing the financial report.


MODERN

2014-2018

Project partners:

Prof. Jacques Moeschler (U. of Geneva), Prof. Andrei Popescu-Belis (IICT informatique et télécommunication HEIG-VD), Prof. Martin Volk (Institut für Computerlinguistik Universität Zürich), Prof. Ted Sanders (Utrecht Institute of Linguistics OTS University of Utrecht), Prof. Sandrine Zufferey (U. of Bern)

Funded by the Swiss SNSF

Modelling Discourse Entities and Relations for Coherent Machine Translation

The goal of MODERN is to model and automatically detect textual dependencies across sentences and to study their integration into machine translation systems, with the aim of demonstrating improved translation quality. The project focuses specifically on the interplay between referring expressions and discourse relations in four languages: English, French, German and Dutch.

My role:

PhD student: Carrying out quantitative corpus studies, disseminating the research results, working in an interdisciplinary team.


COMTIS

2010-2014

Project partners:

Prof. Jacques Moeschler (U. of Geneva), Prof. Andrei Popescu-Belis (IICT informatique et télécommunication HEIG-VD), Prof. Sandrine Zufferey (U. of Bern), Prof. Paola Merlo (U. of Geneva), Dr. James Henderson (IDIAP Research Institute), Dr. Bruno Cartoni (Google), Dr. Thomas Meyer (Google)

Funded by the Swiss SNSF

Improving the Coherence of Machine Translation Output by Modeling Intersentential Relations

The objective of the COMTIS SNSF Sinergia project was to use insights from linguistic modeling and corpus linguistics in order to build computational models of discourse-level phenomena and to combine them with statistical machine translation systems, thus improving the quality of translated texts. The COMTIS researchers have advanced the state of the art in their fields and with respect to the overall objective, thanks to the close collaboration of all partners, also reflected in the quality and number of joint publications. More specifically, we have proposed multilingual models of discourse connectives and verb tenses, strongly grounded in empirical evidence from parallel corpora, mainly in English and French, but also German, Italian, and Arabic. These models have generated features which served to implement automatic labelling systems for discourse connectives, verbs, and pronouns, which were further combined, in several ways, with state-of-the-art statistical MT systems and with innovative tree-based decoding algorithms. This has led to demonstrable improvements of the MT output, as assessed by humans but also by an automatic reference-based metric which COMTIS proposed and validated.

My role:

PhD student: carrying out quantitative corpus studies, disseminating the research results, acquiring solid competences in statistical methods, organizing and participating in numerous workshops and training sessions, working in an interdisciplinary team.