Chen Danqi's group has done nice work here. Traditional text similarity reduces a sentence pair to a single score, but a pair can be similar in many different ways depending on the perspective taken, so this effectively redefines the task. In addition, the data is constructed by generating it with models, which makes the whole pipeline convenient and fast. The final experiments also show that even GPT-4 makes mistakes from time to time, so this direction merits further study and discussion.
Let's take a closer look at the authors' point of view.
Paper: C-STS: Conditional Semantic Textual Similarity
Address: https://arxiv.org/abs/2305.15093
Affiliations: Princeton, Allen AI, etc.
Semantic textual similarity (STS) has been a cornerstone task in NLP, measuring the degree of similarity between a pair of sentences, with applications in information retrieval, question answering, and embedding methods.
However, this is an inherently ambiguous task, and sentence similarity depends on specific aspects of interest.
We address this ambiguity by proposing a new task, Conditional STS (C-STS), which measures similarity with respect to an aspect (the condition) articulated in natural language.
For example, the similarity between the sentences "NBA player shoots a 3-pointer" and "A person throws a tennis ball in the air" is higher under the condition "the motion of the ball" (both go up) and lower under "the size of the ball" (one is big, the other small).
C-STS has dual advantages : (1) it reduces the subjectivity and ambiguity of STS, and (2) different conditions can be used for fine-grained similarity evaluation.
C-STS contains nearly 20,000 instances from diverse domains. We evaluate several state-of-the-art models and show that even the best-performing fine-tuned and in-context learning models (GPT-4, Flan, SimCSE) find the task challenging, with Spearman correlation scores below 50.
We encourage the community to evaluate their models on C-STS to provide a more comprehensive view of semantic similarity and natural language understanding.
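To make the task definition above concrete, here is a minimal sketch of what a C-STS instance looks like (a sentence pair plus a condition plus a gold rating) and of the Spearman rank correlation used to score models against the gold labels. The field names, the 1-5 rating scale, and the tie-free Spearman formula are illustrative assumptions, not the paper's exact data schema or evaluation code.

```python
from dataclasses import dataclass

@dataclass
class CSTSInstance:
    # One illustrative C-STS example: a sentence pair, a natural-language
    # condition, and a gold similarity rating (assumed 1-5 scale).
    sentence1: str
    sentence2: str
    condition: str
    label: float

# The paper's running example: the same sentence pair rated under two conditions.
examples = [
    CSTSInstance("NBA player shoots a 3-pointer",
                 "A person throws a tennis ball in the air",
                 "the motion of the ball", 5.0),   # similar: both balls go up
    CSTSInstance("NBA player shoots a 3-pointer",
                 "A person throws a tennis ball in the air",
                 "the size of the ball", 1.0),     # dissimilar: big vs. small
]

def spearman(xs, ys):
    """Spearman rank correlation (no tie correction), the metric used
    to compare model predictions with gold labels on C-STS."""
    def ranks(vs):
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0] * len(vs)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

A model scoring each (sentence1, sentence2, condition) triple would then be evaluated with `spearman(predictions, [ex.label for ex in examples])`; a score below 0.5 (the "<50" in the abstract, on a 0-100 scale) means the model's ranking of pairs only weakly agrees with human judgments.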
Experiment and Analysis