BioASQ 2022-DisTEMIST [NLP system]: Biomedical Entity Linking with Transformers - Florian Borchert
Author: Biomedical Text Mining
Uploaded: 2022-09-05
Views: 344
Description:
HPI-DHC @ BioASQ DisTEMIST: Spanish Biomedical Entity Linking with Pre-trained Transformers and Cross-lingual Candidate Retrieval
Presenter: Florian Borchert
Digital Health Center, Hasso Plattner Institute, University of Potsdam, Germany
Abstract:
Biomedical named entity recognition and entity linking are important building blocks for various clinical applications and downstream NLP tasks. In the clinical domain, language resources for developing entity linking solutions are scarce: only a few datasets have been annotated on the level of concepts and the majority of concept aliases in target ontologies are only available in English. In such a resource-constrained setting, pre-training and cross-lingual transfer are promising approaches to improve performance of entity linking systems. In this paper, we describe our contribution to the BioASQ DisTEMIST shared task. The goal of the task is to extract disease mentions from Spanish clinical case reports and map them to concepts in SNOMED CT. Our system comprises a Transformer-based named entity recognition model, a hybrid candidate generation approach, and a rule-based reranking step. For candidate generation, we employ an ensemble of 1) a TF-IDF vectorizer based on character n-grams and 2) a cross-lingual SapBERT model. Our best run for the entity linking subtrack achieves a micro-averaged F1 score of 0.566, which is the best score across all submissions in this track. A detailed analysis of system performance highlights the importance of task-specific entity ranking and the benefits of cross-lingual candidate retrieval.
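The TF-IDF half of the candidate generation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, the n-gram range (2 to 4), the class name `CharTfidfIndex`, the helper `char_ngrams`, and the concept IDs `C1` to `C3` are all assumptions made for illustration, and the cross-lingual SapBERT retriever and the reranking step are left out entirely.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=2, n_max=4):
    """Return all character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


class CharTfidfIndex:
    """Toy TF-IDF index over concept aliases (hypothetical helper class)."""

    def __init__(self, aliases):
        # aliases: list of (concept_id, alias_text) pairs
        self.aliases = aliases
        docs = [Counter(char_ngrams(alias)) for _, alias in aliases]
        # Document frequency: in how many aliases each n-gram occurs
        doc_freq = Counter(gram for doc in docs for gram in doc)
        n_docs = len(docs)
        # Smoothed inverse document frequency
        self.idf = {
            gram: math.log((1 + n_docs) / (1 + count)) + 1.0
            for gram, count in doc_freq.items()
        }
        self.vectors = [self._vectorize(doc) for doc in docs]

    def _vectorize(self, counts):
        """Build an L2-normalized TF-IDF vector; unseen n-grams are dropped."""
        weights = {
            gram: tf * self.idf[gram]
            for gram, tf in counts.items()
            if gram in self.idf
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        return {g: w / norm for g, w in weights.items()} if norm else {}

    def search(self, mention, k=3):
        """Return the top-k (cosine score, concept_id, alias) candidates."""
        query = self._vectorize(Counter(char_ngrams(mention)))
        scored = []
        for (concept_id, alias), vec in zip(self.aliases, self.vectors):
            score = sum(w * vec.get(g, 0.0) for g, w in query.items())
            scored.append((score, concept_id, alias))
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:k]


# Hypothetical mini-ontology; the IDs are placeholders, not SNOMED CT codes.
index = CharTfidfIndex([
    ("C1", "diabetes mellitus"),
    ("C2", "hipertensión arterial"),
    ("C3", "insuficiencia renal"),
])

# A misspelled mention still retrieves candidates via shared character n-grams.
candidates = index.search("diabetes melitus", k=2)
```

Character n-grams make the lookup tolerant of spelling variants and inflection, which is presumably why such a retriever pairs well with a dense cross-lingual model: the two can then be ensembled by merging their candidate lists before reranking.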
Track Webpage:
https://temu.bsc.es/distemist/
Resources:
https://github.com/hpi-dhc/distemist_...
Paper: http://ceur-ws.org/Vol-3180/paper-15.pdf