Excited to go to Brno in September as Keynote speaker at the Historical Network Research Conference!
Our NWO Vrije Competitie Project Framing Situations in the Dutch Language got funded!
A recent version of my CV can be found here
My main interest lies in methodological aspects of research in Computational Linguistics. I am driven by the question of how computational models of language work: what patterns and systems are found in natural language? How can they be modeled computationally? Which computational methods are suitable for modeling or analyzing which phenomena?
I am currently working on two main topics. Together with my PhD student, Pia Sommerauer, I am exploring semantic models in order to gain a better understanding of (word) embeddings. What is the impact of the chosen algorithm, the distribution in the data, or the random settings of the neural network used? This interest came out of my main research topic: investigating in what (subtle) ways people express perspectives. Here, I design and implement tools that extract patterns of how specific groups of people, events or concepts are described in large amounts of text. For instance, do media systematically talk differently about events when the actors have a certain ethnic background? What is said about health and weight, and how has that changed over time? The basic system extracts transparent patterns, using labels that historians, social scientists and other interested researchers can read directly. In the final stage of this research, I plan to combine this with the observations made about aspects of meaning captured by embeddings.
Over the last five years, I have mainly focused on methodological aspects of the application of NLP to digital humanities. This work is mainly carried out as part of the BiographyNet project. In this project, we (a historian, a computer scientist and I) work together to see how we can use NLP and Semantic Web technology to enhance historical research on the Biography Portal of the Netherlands. My research addresses how we can identify information in text that is useful for historians, and how we can make sure that historians can assess the reliability of the output of tools whose inner workings they do not know.
The Network Institute projects Time will tell a different story and Political Discourse in the News also addressed the question of how NLP can be used in historical research and communication science, respectively.
As part of investigating methodological issues, I have also worked on the system architecture in NewsReader and am coordinating the Enlighten Your Research project Can we Handle the News, in which we push the limits of large-scale processing and investigate what would be needed to process all the news that is published every day.
My PhD thesis proposed a new methodology for developing linguistic precision grammars. The main idea of the proposal, storing alternative analyses in a metagrammar so that they may be compared at different stages of the development process, can be applied in any theory. I looked in particular at grammars developed within the DELPH-IN consortium, which develops open-source HPSG-based grammars. The method is also closely related to the LinGO Grammar Matrix.
I am currently working as a researcher at the Computational Lexicology and Terminology Lab and as a visiting researcher at the Web and Media group at VU University Amsterdam. I am also part of the Network Institute.
Recently started projects:
- CLARIAH project HHuCap: mining careers in text
- College voor de Rechten van de Mens: Identifying age discrimination in job advertisements
- Academy Assistant project: A linguistic and behavioral assessment of a possible generic optimistic bias in individuals
Recent invited talks
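- 12 March 2018. Moslims in het News (Muslims in the News). Joint lecture with Abdessamad Bouabid. At: Over Beeldvorming gesproken, hoe kunnen we moslimdiscriminatie voorkomen? (Speaking of Perceptions: How Can We Prevent Discrimination against Muslims?). The Hague.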
- 7-8 December 2017. An introduction to distributional semantics. Invited speaker, Reading like a human workshop, Amsterdam.
- 12 October 2017. A closer look at distant reading. At: National eScience Symposium. Science in a Digital World.
- 11 September 2017. Possibilities and risks of using distributional semantics for identifying concept drift. Keynote Speaker Drift-a-LOD workshop. Semantic Conference, Amsterdam
- 8 June 2017. Panel member “Unhinging the National Framework: Diaries and the Digital”. IABA Conference, King's College, London.
- 16 May 2017: Presenting at the CLS Speaker Series, University of Amsterdam
- 10-11 March 2017: Invited Talk about investigating Biographical data, Suwon, Korea.
- 6 February 2017: Presentation on Linked Data and Linguistic Research LD4LR.
- 22 November 2016: DESIDERIA workshop on concept drift
- 30 November 2016: Invited talk at the Tilburg Center for Cognition and Communication
- 16 November 2016: Keynote presentation at the SWE-CLARIN Workshop
- 23 September 2016: Guest speaker at LAP launch in Oslo, Norway
- 26 September 2016: Guest speaker at CRETA-Werkstatt in Stuttgart, Germany.
Events I (co-)organized
- Workshop on Muslim stereotyping: 20 October 2017, Leiden
- BD2017: 6-7 November 2017. Linz, Austria.
- 18-21 April 2017: Workshop on Language, Knowledge and People in Perspectives.
- 11 July 2016, the Workshop on Biographical Data and Datamodels co-located with DH2016 took place in Krakow
- 18 December 2015, CLIN26 (The 26th Computational Linguistics in the Netherlands) took place in Amsterdam.
- 9 April 2015, the first Conference Biographical Data in a Digital World took place in Amsterdam.