Worldwide Thesis Database & PhD tips

Névmási anaforafeloldási kísérletek a magyar nyelvben (Pronominal anaphora resolution experiments in Hungarian)

The thesis was published by Kovács Viktória in February 2022 at the University of Szeged.

Abstract:

The dissertation examines how well currently used supervised machine learning methods perform at automatic anaphora resolution in Hungarian texts.
I used two corpora for the experiments: the SzegedKoref Corpus, which is the coreference-annotated subcorpus of the Szeged Corpus, and, for comparison, the KorKorpusz. The machine learning experiments were performed in the Weka software, using the mention-pair model and the random forest algorithm. In these experiments the classifier decides, for each pair of mentions, whether the two are anaphorically related, so I evaluated the results with the MUC task evaluation metrics.
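To make the mention-pair setup concrete, here is a minimal Python sketch of how training instances of this kind are commonly built: each pronoun is paired with every preceding mention, and the pair is labeled positive only if both belong to the same gold coreference chain. The `Mention` class, the `make_pairs` function, and the toy sentence are illustrative assumptions, not the thesis's actual pipeline, which used Weka and a feature set not listed in this abstract.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Mention:
    """Hypothetical mention record; the field names are assumptions."""
    text: str
    is_pronoun: bool
    chain_id: Optional[int]  # gold coreference chain, None if unannotated


def make_pairs(mentions: List[Mention]) -> List[Tuple[str, str, int]]:
    """Pair each pronoun with every preceding mention.

    Returns (candidate, pronoun, label) triples, where label is 1 if both
    mentions share a gold chain and 0 otherwise.
    """
    pairs = []
    for j, anaphor in enumerate(mentions):
        if not anaphor.is_pronoun:
            continue
        for candidate in mentions[:j]:
            same_chain = (
                anaphor.chain_id is not None
                and anaphor.chain_id == candidate.chain_id
            )
            pairs.append((candidate.text, anaphor.text, int(same_chain)))
    return pairs


# Toy document in order of appearance: "Anna ... the book ... she"
mentions = [
    Mention("Anna", False, 1),
    Mention("the book", False, 2),
    Mention("she", True, 1),
]
pairs = make_pairs(mentions)
```

Even in this toy case one pronoun yields one positive and one negative pair; in longer texts the negative pairs typically far outnumber the positive ones, which is why the balance of training examples becomes a question at all.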
My null hypothesis is that pronominal anaphora in Hungarian texts can be resolved automatically without semantic information, based only on morphological, syntactic, and other surface-structure features. My first hypothesis is that the models achieve the best results if the number of positive or negative examples in the training files is not reduced manually. My second hypothesis is that selecting the pronoun-antecedent pair with the highest probability value improves efficiency. My third hypothesis is that adding features based on cognitive linguistics to the machine learning experiments makes model building more successful.
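The second hypothesis describes what is often called best-first selection: instead of accepting every pair the classifier marks as positive, only the highest-scoring candidate is kept for each pronoun. A minimal Python sketch follows, assuming the classifier has already produced a probability for each pair; the `best_first` function, the 0.5 threshold, and the scores are hypothetical and not taken from the thesis.

```python
from typing import Dict, Iterable, Tuple


def best_first(
    scored_pairs: Iterable[Tuple[str, str, float]],
    threshold: float = 0.5,  # assumed cut-off, not a value from the thesis
) -> Dict[str, str]:
    """Keep, for each pronoun, the candidate with the highest probability.

    scored_pairs holds (pronoun, candidate, probability) triples. Pronouns
    whose best candidate scores below the threshold get no antecedent.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for pronoun, candidate, prob in scored_pairs:
        if prob < threshold:
            continue
        if pronoun not in best or prob > best[pronoun][1]:
            best[pronoun] = (candidate, prob)
    return {pronoun: cand for pronoun, (cand, _) in best.items()}


# Made-up classifier scores for two pronouns and two candidates each
scored = [
    ("she", "Anna", 0.81),
    ("she", "the book", 0.12),
    ("it", "Anna", 0.30),
    ("it", "the book", 0.64),
]
chosen = best_first(scored)
```

Here `chosen` maps "she" to "Anna" and "it" to "the book". This sketch returns a single antecedent per pronoun, so it does not cover the setting where several antecedents are to be identified.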
I showed that three factors are important: (1) the type of the text itself, as the results of the machine learning experiments differ greatly between the two corpora; (2) the type of the annotation, as it affects the quantity and quality of positive and negative examples; and (3) the type of the pronoun, as pronouns behave differently from each other with respect to the examined aspects. The experiments also showed that, when measuring the distance between the two expressions, it is important to consider not just the number of clauses but also the relationship between the clauses. A further finding is that the effect of the features I examined may differ when the goal is to identify more than one antecedent.

The full thesis can be downloaded at:
https://doktori.bibl.u-szeged.hu/id/eprint/10950/
https://doktori.bibl.u-szeged.hu/id/eprint/10950/14/disszertacio.pdf
