Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), 2017
DOI: 10.18653/v1/s17-2013

UdL at SemEval-2017 Task 1: Semantic Textual Similarity Estimation of English Sentence Pairs Using Regression Model over Pairwise Features

Abstract: This paper describes the UdL model we proposed to solve the semantic textual similarity task of the SemEval 2017 workshop. The track we participated in was estimating the semantic relatedness of a given set of sentence pairs in English. The best of our three submitted runs achieved a Pearson correlation score of 0.8004 against a hidden human annotation of 250 pairs. We used random forest ensemble learning to map an expandable set of extracted pairwise features into a semantic similarity estimate.
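The abstract's core idea, regressing from hand-crafted pairwise features to a gold similarity score, can be sketched as below. This is a minimal illustration assuming scikit-learn; the feature functions, training pairs, and hyperparameters here are invented for the example and are not the actual UdL feature set.

```python
# Sketch of a regression-over-pairwise-features STS pipeline.
# Features and data are illustrative, not the actual UdL features.
from sklearn.ensemble import RandomForestRegressor

def pairwise_features(s1: str, s2: str) -> list[float]:
    """Toy pairwise features: Jaccard word overlap and length ratio."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    overlap = len(w1 & w2) / max(len(w1 | w2), 1)
    len_ratio = min(len(s1), len(s2)) / max(len(s1), len(s2), 1)
    return [overlap, len_ratio]

# Tiny illustrative training set: (sentence pair, gold similarity in [0, 5]).
train_pairs = [
    (("A man is playing a guitar.", "A man plays a guitar."), 4.8),
    (("A dog runs in the park.", "The stock market fell today."), 0.2),
    (("Two kids are eating lunch.", "Children are having a meal."), 3.9),
]
X = [pairwise_features(a, b) for (a, b), _ in train_pairs]
y = [score for _, score in train_pairs]

# Random forest regression maps the feature vector to a similarity estimate.
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
pred = model.predict([pairwise_features("A man is playing a guitar.",
                                        "Someone plays guitar.")])[0]
print(round(pred, 2))
```

Because the forest averages training targets at its leaves, predictions stay inside the observed 0-5 score range; the paper's actual feature set is richer (the excerpts below mention PoS/NE-aligned embeddings and character n-grams).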

Cited by 11 publications (8 citation statements)
References 7 publications
“…Also in the SemEval 2017 workshop, [14] introduced the UdL model to determine the semantic relatedness of a set of sentence pairs in English. This model uses random forest ensemble learning with word embedding vectors to align Part-of-Speech (PoS) and Named Entity (NE) tagged tokens of each sentence pair; the model experiments with a character-based range of n-grams instead of words to compute the features.…”
Section: Sentences Comparison
confidence: 99%
“…[17], in their review specifically about short text similarity (STS) tasks, broaden this classification to a more generic overview, in which they classify the tasks as string-based, corpus-based, knowledge-based, and hybrid-based. Our work will use a hybrid approach (corpus-based and string-based), using the work of [14] as a reference method. To evaluate the proposed method, we will use SemEval's STSBenchmark dataset.…”
Section: Tweet Similarity Approach 4.1 Overview
confidence: 99%
“…The size of each set was 200 × 182 = 36,400 pairs. In order to estimate the semantic relatedness score of each pair in each of the 3 sets, we used a pre-trained model [31] which provides an estimation score between 0.0 and 5.0. This model was trained on open-access datasets.…”
Section: E Sentence Semantic Relatedness Measure
confidence: 99%
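The pair enumeration described in this excerpt (every sentence from one set paired with every sentence from the other, 200 × 182 = 36,400 pairs, each scored in [0, 5]) can be sketched as follows. The `score_pair` function is a placeholder assumption standing in for the pre-trained model [31], and the sentence sets are synthetic.

```python
# Enumerate all cross-set sentence pairs and score each one.
# score_pair is a toy stand-in for the pre-trained STS model in the excerpt.
from itertools import product

def score_pair(a: str, b: str) -> float:
    """Placeholder scorer: word overlap scaled to the 0-5 range (assumption)."""
    w1, w2 = set(a.split()), set(b.split())
    return 5.0 * len(w1 & w2) / max(len(w1 | w2), 1)

set_a = [f"sentence {i}" for i in range(200)]   # 200 sentences (illustrative)
set_b = [f"candidate {j}" for j in range(182)]  # 182 sentences (illustrative)

scores = {(a, b): score_pair(a, b) for a, b in product(set_a, set_b)}
print(len(scores))  # 200 * 182 = 36400 pairs
```

The Cartesian product makes the quadratic cost explicit: scoring grows with the product of the set sizes, which is why a cheap pre-trained scorer matters at this scale.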
“…Semantic Web techniques and natural language processing (NLP) provide computers with the ability to decipher the meaning of information and process phrases written or spoken in natural language [19]. The combination of these techniques can allow users to search for data using natural language sentences [20,21] in multiple languages, and they can also be used to find semantic similarities among disparate documents from different disciplines [22][23][24][25][26].…”
Section: Introduction
confidence: 99%