Methods for Missing Data, Model Selection and Model Averaging
April 26, 2022 - Research proposal for a master's thesis
We are offering a master's thesis together with the BAdW:
Recognition of handwritten cards within an old Occitan dictionary project
[Supervisors: C. Heumann, E. Garces Arias, M. Schöffel]
April 05, 2022 - Paper accepted at workshop at ACL 2022
Our paper "Pre-trained language models evaluating themselves - A comparative study" was accepted at the 3rd Workshop on Insights from Negative Results in NLP co-located with ACL 2022!
[Authors: P. Koch, M. Aßenmacher and C. Heumann]
February 01, 2022 - New member of the group
Esteban joined our working group as a new PhD student today.
Model selection and model averaging are two important techniques for finding statistical models with good properties. While many approaches have been proposed for complete data, it is often unclear how to proceed when the data contain missing values. In collaboration with our partners in South Africa (Dr. Michael Schomaker), we have proposed several approaches using multiple imputation. Furthermore, we have combined multiple imputation with resampling (bootstrap) to obtain confidence intervals whose actual coverage is reasonably close to the nominal level.
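A minimal sketch of how bootstrap resampling and multiple imputation might be combined to obtain a percentile confidence interval. It is an illustration under stated assumptions, not the procedure from the group's publications: the data are simulated, the target quantity is simply the mean of a partially missing variable, values are missing completely at random, and the imputation is a simplified stochastic regression imputation that does not draw the regression parameters. The function names `impute_once` and `boot_mi_ci` are hypothetical.

```python
import numpy as np


def impute_once(x, y, rng):
    """One stochastic regression imputation of missing y given x.

    Simplified: regression parameters are estimated, not drawn,
    so this is not a fully 'proper' multiple imputation.
    """
    obs = ~np.isnan(y)
    slope, intercept = np.polyfit(x[obs], y[obs], 1)
    resid = y[obs] - (intercept + slope * x[obs])
    sigma = resid.std(ddof=2)  # residual SD, two estimated parameters
    y_imp = y.copy()
    n_mis = int((~obs).sum())
    y_imp[~obs] = intercept + slope * x[~obs] + rng.normal(0.0, sigma, n_mis)
    return y_imp


def boot_mi_ci(x, y, n_boot=200, n_imp=5, alpha=0.05, seed=0):
    """Percentile interval for the mean of y: bootstrap first, then impute."""
    rng = np.random.default_rng(seed)
    n = len(y)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)  # resample rows with replacement
        xb, yb = x[idx], y[idx]
        # Average the estimate over n_imp imputed versions of this resample.
        estimates[b] = np.mean(
            [impute_once(xb, yb, rng).mean() for _ in range(n_imp)]
        )
    lo, hi = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


# Simulated example data (assumption): y depends linearly on x,
# and about 30% of y is missing completely at random.
rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(size=200)
y[rng.random(200) < 0.3] = np.nan

lo, hi = boot_mi_ci(x, y)
print(f"95% bootstrap-MI interval for the mean of y: [{lo:.2f}, {hi:.2f}]")
```

Here the resampling is done before the imputation, so every bootstrap sample is imputed separately. Imputing first and bootstrapping each completed data set is another possible ordering, and the choice between them affects the resulting interval.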
Part of our group is currently working in the field of Natural Language Processing (NLP). We are specifically interested in the comparability, evaluation, and benchmarking of large pre-trained language models used for transfer learning. We are also particularly interested in the use of such models as potential substitutes for traditional knowledge bases.