Meituan SemEval 2022 Structured Sentiment Analysis Cross-language Track Champion Method Summary

Meituan's Voice Interaction Department tackled two difficulties in the cross-lingual structured sentiment analysis task: the scarcity of annotated data for low-resource languages, and the high cost of migrating traditionally optimized methods across languages. The resulting method won first place in the cross-lingual track of the SemEval 2022 structured sentiment analysis task.

1. Background

SemEval (International Workshop on Semantic Evaluation) is a series of international natural language processing (NLP) workshops and an authoritative competition in the field. Its mission is to advance research in semantic analysis and to create high-quality datasets for a series of increasingly challenging problems in natural language semantics. SemEval-2022 (The 16th International Workshop on Semantic Evaluation) comprises 12 tasks covering topics such as idiom detection and embedding, sarcasm detection, and multilingual news similarity. Enterprises and research institutions including Alibaba, Alipay, Didi, Huawei, ByteDance, and Stanford University participated.

Among them, Task 10: Structured Sentiment Analysis belongs to the field of information extraction. The task contains two subtasks (Monolingual Subtask-1 and Zero-shot Crosslingual Subtask-2) and covers seven datasets in five languages (English, Spanish, Catalan, Basque, and Norwegian); Subtask-1 uses all seven datasets, while Subtask-2 uses three of them (Spanish, Catalan, and Basque). Among the more than 30 participating teams, we took second place in Subtask-1 and first place in Subtask-2. The work is summarized in the paper MT-Speech at SemEval-2022 Task 10: Incorporating Data Augmentation and Auxiliary Task with Cross-Lingual Pretrained Language Model for Structured Sentiment Analysis, published at the NAACL 2022 SemEval workshop.

2. Introduction to the competition

The goal of Structured Sentiment Analysis (SSA) is to extract people's opinions about ideas, products, or policies from text and represent them structurally as opinion tuples Oi(h, t, e, p), whose four elements are the Holder (h), the Target (t), the sentiment Expression (e), and the Polarity (p): the holder directs the expression, with the given polarity, at the target. Opinion tuples can be stored and visualized as Sentiment Graphs (Figure 1 below). The figure shows two example sentences, in English and Basque respectively, both conveying roughly "Some people give the new UMUC five stars; I don't find them credible." The English example contains two opinion tuples: O1(h, t, e, p) = (Some others, the new UMUC, 5 stars, positive) and O2(h, t, e, p) = (, them, don't believe, negative), where the Holder of O2 is empty.

e4616b97a9cc5f8fd3c4466a5affe177.jpeg

Figure 1. The opinion tuples contained in the example sentence "Some people give the new UMUC five stars; I don't find them credible" (expressed in English and Basque, respectively), displayed as Sentiment Graphs
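The opinion tuple structure above can be sketched as a small data type; the field names here are illustrative, not the competition's official schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class OpinionTuple:
    """One opinion tuple O(h, t, e, p) extracted from a sentence."""
    holder: Optional[str]      # may be empty (implicit holder)
    target: Optional[str]      # may be empty
    expression: str            # the sentiment expression span
    polarity: str              # "positive" / "negative" / "neutral"

# The two tuples from the English example sentence in Figure 1:
o1 = OpinionTuple("Some others", "the new UMUC", "5 stars", "positive")
o2 = OpinionTuple(None, "them", "don't believe", "negative")
```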

The competition has two tasks:

  1. Monolingual task: the language of the test set is known, and labeled data in the same language may be used for training. The total score is the macro-averaged Sentiment Graph F1 over the seven datasets.

  2. Crosslingual task: labeled data in the same language as the test set may not be used for training (the evaluation datasets are the three low-resource datasets: Spanish, Catalan, and Basque).

Data introduction

2f6646e5a46bab45ec4a26d4056cbcf0.png

Evaluation metric

The evaluation metric is Sentiment Graph F1 (SF1; the abbreviation follows paper [5]), which measures how well the predicted opinion tuples overlap with the gold ones. Besides the usual True Positive (TP), False Positive (FP), False Negative (FN), and True Negative (TN) counts, the metric additionally defines a Weighted True Positive (WTP) [5] for partial matches at the tuple level: when a predicted tuple's polarity is correct, its WTP is the average span overlap between the predicted and gold spans of the three elements (Holder, Target, Expression); if several gold tuples match, the one with the highest average overlap is taken (see [5] for details). If WTP is greater than 0, TP is 1, otherwise TP is 0; if the polarity is wrong, both WTP and TP are 0. The Holder or Target of a gold tuple may be empty, in which case the corresponding predicted span must also be empty for the match to count. Exact matching of opinion tuples is therefore very demanding.

  • Precision over opinion tuples: Precision = Σ WTP / (TP + FP)

  • Recall over opinion tuples: Recall = Σ WTP / (TP + FN)

  • The final Sentiment Graph F1 (SF1) is the harmonic mean: SF1 = 2 · Precision · Recall / (Precision + Recall)
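The metric can be sketched as follows; this is a simplified reading of the SF1 definition in [5] (spans are sets of token indices; an empty set means the element is absent), not the official scorer:

```python
def overlap(pred, gold):
    """Token-overlap ratio; both spans empty counts as a perfect match."""
    if not pred and not gold:
        return 1.0
    if not pred or not gold:
        return 0.0  # one side empty, the other not -> no match
    return len(pred & gold) / len(gold)

def weighted_tp(pred_tuple, gold_tuple):
    (ph, pt, pe, pp), (gh, gt, ge, gp) = pred_tuple, gold_tuple
    if pp != gp:
        return 0.0  # wrong polarity -> no credit at all
    # Average overlap over the Holder, Target, and Expression spans.
    return (overlap(ph, gh) + overlap(pt, gt) + overlap(pe, ge)) / 3

def sf1(preds, golds):
    # Each tuple is credited with its best-matching counterpart's WTP.
    wtp_p = [max((weighted_tp(p, g) for g in golds), default=0.0) for p in preds]
    wtp_r = [max((weighted_tp(p, g) for p in preds), default=0.0) for g in golds]
    precision = sum(wtp_p) / len(preds) if preds else 0.0
    recall = sum(wtp_r) / len(golds) if golds else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0
```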

3. Existing Methods and Issues

The mainstream approach to structured sentiment analysis is a pipeline: first extract the Holder, Target, and Expression spans as separate subtasks, then classify the sentiment. However, such methods cannot capture the dependencies among the subtasks, and errors propagate between stages.

To address this, Barnes et al. (2021) [5] cast the task as graph-based dependency parsing to capture the dependencies among the elements of the opinion tuple: holders, targets, and sentiment expressions are nodes, and the relations between them are arcs. The model achieved state-of-the-art performance on the SSA task at the time. However, this method still has some problems. First, the knowledge in pretrained language models (PLMs) is not fully exploited: because Barnes et al. (2021) [5] did not resolve the mapping between graph relations and subword tokens, the PLM could only be used to generate static token embeddings and could not be fine-tuned jointly with the model.

In fact, cross-lingual PLMs contain rich information about the interaction between different languages. Second, such data-driven models rely on large amounts of labeled data, while real scenarios often have little or none. In this task, for example, the training set of MultiBEU (Barnes et al., 2018) [4] has only 1063 samples, and that of the similar MultiBCA (Barnes et al., 2018) [4] only 1174. The cross-lingual subtask further forbids using training data in the target language, which severely limits the performance of such methods.

4. Our approach

To address the problems above, we propose a unified end-to-end SSA model (Figure 2) that uses a PLM as the backbone and trains it end to end, and that applies data augmentation and auxiliary tasks to greatly improve performance in the cross-lingual zero-shot setting.

Specifically, we adopt XLM-RoBERTa (Conneau and Lample, 2019; Conneau et al., 2019) [10,11] as the backbone encoder to make full use of its multilingual and cross-lingual knowledge; a BiLSTM [12] strengthens sequence encoding; and a final bilinear attention matrix models the dependency graph, from which the opinion tuples are decoded. To alleviate the shortage of labeled data, we adopt two data augmentation methods: adding in-domain labeled data from the same task during training, and generating augmented samples with XLM-RoBERTa via the masked language model (MLM) objective (Devlin et al., 2018) [13].
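As a sketch of the decoding head, the bilinear (biaffine) attention over encoder states can be written as below in PyTorch; the class name and dimensions are illustrative, and this is not the exact competition code:

```python
import torch
import torch.nn as nn

class BiaffineArcScorer(nn.Module):
    """Scores every (head, dependent) token pair for every arc label."""

    def __init__(self, hidden_dim: int, n_labels: int):
        super().__init__()
        self.head_mlp = nn.Linear(hidden_dim, hidden_dim)
        self.dep_mlp = nn.Linear(hidden_dim, hidden_dim)
        # One bilinear map per arc label; +1 on each side for bias terms.
        self.U = nn.Parameter(torch.empty(n_labels, hidden_dim + 1, hidden_dim + 1))
        nn.init.xavier_uniform_(self.U)

    def forward(self, h):  # h: (batch, seq, hidden) from BiLSTM over XLM-R
        head = torch.relu(self.head_mlp(h))
        dep = torch.relu(self.dep_mlp(h))
        bias = h.new_ones(h.size(0), h.size(1), 1)
        head = torch.cat([head, bias], dim=-1)  # append bias feature
        dep = torch.cat([dep, bias], dim=-1)
        # scores: (batch, n_labels, seq, seq), one score per (head, dep, label)
        return torch.einsum("bih,lhd,bjd->blij", head, self.U, dep)
```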

In addition, we add two auxiliary tasks: 1) a sequence labeling task that predicts the Holder/Target/Expression spans in the text, and 2) sentiment polarity classification. Neither auxiliary task requires additional annotation.

ea04e4fc011b4ef545f3e92b30fa5401.jpeg

Figure 2. Overall framework

5. Method implementation and experimental analysis

5.1 Model selection

Many pretrained models are available as backbones, such as multilingual BERT (mBERT) (Devlin et al., 2018) [13], XLM-RoBERTa (Conneau et al., 2019) [10], and InfoXLM (Chi et al., 2021) [9]. We chose XLM-RoBERTa: the Monolingual task involves corpora in five languages and the Crosslingual task is a cross-lingual zero-shot problem, and both benefit from XLM-RoBERTa's multilingual training text and the Translation Language Model (TLM) training objective.

The TLM and masked language modeling (MLM) objectives used in the XLM family outperform mBERT, which is trained on multilingual corpora with only the MLM objective. XLM-RoBERTa also offers a Large version with more parameters and more training data, which performs better on downstream tasks. We did not use InfoXLM because it focuses on sentence-level classification objectives and is less suited to this structured prediction task.

dff606dfc7f0dd304c86255608c1ed42.jpeg

Table 1. Effect of different encoders on the officially released Monolingual validation set; all models use the same bilinear attention decoder

To demonstrate the effectiveness of the cross-lingual pretrained language model XLM-RoBERTa, we compare it with the following baselines: 1) w2v + BiLSTM: word2vec (Mikolov et al., 2013) [20] embeddings with a BiLSTM; 2) mBERT: multilingual BERT (Devlin et al., 2018) [13]; 3) mBERT + BiLSTM; 4) XLM-RoBERTa + BiLSTM. Table 1 shows that XLM-RoBERTa + BiLSTM achieves the best performance on all benchmarks, with an average score 6.7% higher than the strongest baseline (mBERT + BiLSTM). Adding a BiLSTM improves performance by 3.7%, showing that BiLSTM layers capture sequence information that benefits the encoding of serialized inputs (Cross and Huang, 2016) [12].

We use the officially released development set as our test set and randomly split the original training set into a training set and a development set, keeping the split development set the same size as the official one.

5.2 Data Augmentation

Data Augmentation (DA1) - Data Merging in the Same Domain

If M datasets in different languages belong to the same domain, they can be merged into one large training set to improve results on each sub-dataset. In this evaluation, four datasets belong to hotel reviews: MultiBEU, MultiBCA, OpeNerES, and OpeNerEN (Agerri et al., 2013) [1]. We merge these in-domain datasets during training, which improves performance on each of them. We additionally add a Portuguese hotel-review dataset, BOTE-rehol (Barros and Bona, 2021) [7]. We observe that these datasets share similar features despite being in different languages.
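DA1 itself is a straightforward concatenation of the in-domain training sets. A minimal sketch, assuming each dataset ships as a JSON list of annotated sentences (the file names here are hypothetical):

```python
import json
from pathlib import Path

def merge_indomain(files):
    """Concatenate in-domain training sets into one training pool."""
    merged = []
    for path in files:
        samples = json.loads(Path(path).read_text(encoding="utf-8"))
        for s in samples:
            s["source"] = Path(path).stem  # keep provenance for analysis
        merged.extend(samples)
    return merged

# e.g. merge_indomain(["multibeu.json", "multibca.json", "opener_es.json"])
```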

Specifically, the languages of these datasets share similar words (in terms of Latin-alphabet similarity) for some of the same objects or concepts. For example, Catalan and Spanish use the same word for "hotel" as English, and Basque has the similar word "hotela". Moreover, reviewers express the same sentiment polarities in the hotel-review domain, such as appreciation of "excellent service" or a "clean and tidy space". MultiBEU, the dataset with the least data, gains the most from this augmentation.

dcd918f7cf0f649d55a3b5aed101dcc9.jpeg

Table 2. For each target dataset, in-domain data is merged as the augmented training set; the table lists the best-performing data combinations

bd102f9c70c0263c7bf959e5cdd085e9.jpeg

Table 3. Effect of data augmentation DA1 on the official Monolingual validation set; "w/ DA1" means DA1 is used, and the model backbone is XLM-R + BiLSTM

Data Augmentation (DA2) - Generating new samples via masked language model

During pretraining, the masked language model (MLM) randomly replaces tokens in the text with the [MASK] token and is trained to predict the original tokens at the masked positions. For each sample containing a valid opinion tuple, we randomly mask a small fraction of tokens in the training text and use an XLM-RoBERTa model pretrained on the task dataset to generate new tokens at the masked positions, yielding new samples that keep the original labels. Note that tokens inside Expression spans must not be masked, because the model might generate words whose polarity differs from the original label.
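The masking step of DA2 can be sketched as below. The `fill_mask` argument stands in for the fine-tuned XLM-RoBERTa MLM head and is a hypothetical callable; the key constraint from the text is that Expression tokens are never masked:

```python
import random

def mask_for_augmentation(tokens, expression_idx, ratio=0.15, seed=0):
    """Mask a small fraction of tokens, never touching Expression spans."""
    rng = random.Random(seed)
    candidates = [i for i in range(len(tokens)) if i not in expression_idx]
    k = max(1, int(len(tokens) * ratio))
    masked = set(rng.sample(candidates, min(k, len(candidates))))
    out = [tok if i not in masked else "<mask>" for i, tok in enumerate(tokens)]
    return out, masked

def augment(tokens, expression_idx, fill_mask):
    """fill_mask (hypothetical) replaces every "<mask>" with a predicted
    token; the new sample reuses the original opinion-tuple labels."""
    masked_tokens, _ = mask_for_augmentation(tokens, expression_idx)
    return fill_mask(masked_tokens)
```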

72ef7ebe8248003127d4a73cefab3469.jpeg

Table 4. Effect of the two data augmentation methods on the Crosslingual task, where OpeNerEN means only OpeNerEN data is used for training and "w/ DA1-2" means DA1 and DA2 are used together

Tables 3 and 4 show that both data augmentation methods help, improving performance on almost every benchmark. The Crosslingual task improves most markedly, presumably because in the zero-shot setting the model never sees the text or labels of the target dataset during training. DA2 improves the Crosslingual task but has little effect on the Monolingual task, presumably because the Monolingual model already sees training samples from the same dataset.

5.3 Auxiliary tasks

The SSA task combines structured prediction with sentiment polarity classification, and solving both end to end is nontrivial for the model. We propose two auxiliary tasks that provide extra training signal for each. For structured prediction, we add a sequence labeling task (Figure 3 below) in which the model predicts the type of each token (Holder, Target, or Expression) and incurs an auxiliary loss.

3b5556a382f32b2da26819d135329ee3.png

Figure 3. Sequence labeling task
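The sequence-labeling targets can be derived directly from the existing span annotations, so no extra labeling is needed. A sketch using a BIO scheme (the exact tag scheme is an assumption):

```python
def spans_to_bio(n_tokens, spans):
    """spans: list of (start, end, role) with end exclusive and
    role in {"Holder", "Target", "Expression"}. Returns BIO tags."""
    tags = ["O"] * n_tokens
    for start, end, role in spans:
        tags[start] = f"B-{role}"
        for i in range(start + 1, end):
            tags[i] = f"I-{role}"
    return tags
```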

For polarity classification, we convert the evaluation training data into a sentence-level polarity classification task: a sentence whose opinion tuples all share one polarity is labeled with that polarity, and a sentence containing tuples of multiple polarities is labeled Neutral. For the datasets in different languages, we also add related open-source sentence-level sentiment classification datasets and configure a multi-layer perceptron (MLP) classifier for each dataset. The average pooling of the model's BiLSTM hidden states serves as the sentence-level vector, which is fed to the corresponding classifier for sentence-level polarity classification, yielding the auxiliary loss L_cls. The total training loss is the weighted sum of the main loss L_main and the two auxiliary losses: L = L_main + α · L_seq + β · L_cls.
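The combination is a plain weighted sum of the three losses; the weights are hyperparameters, and the default values below are placeholders, not the tuned ones:

```python
def total_loss(l_main, l_seq, l_cls, alpha=0.1, beta=0.1):
    """Weighted sum of the main dependency-graph loss and the two
    auxiliary losses (sequence labeling, polarity classification)."""
    return l_main + alpha * l_seq + beta * l_cls
```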

7400f80599b3d8bb5d9235d59ad6e688.jpeg

Table 5. Effect of the model on the official development set after adding the auxiliary tasks. The MPQA (Wiebe et al., 2005) [32], DSUnis, and OpeNerEN datasets use RoBERTa-base (Liu et al., 2019) [19] as the auxiliary encoder; OpeNerES uses bert-base-spanish-wwm-cased (Cañete et al., 2020) [8]; MultiBCA uses roberta-base-ca (Armengol-Estapé et al., 2021) [3]; MultiBEU uses berteus-base-cased (Agerri et al., 2020) [2]; NoReCFine (Øvrelid et al., 2020) [23] uses norwegian-roberta-base (https://huggingface.co/patrickvonplaten/norwegian-roberta-base). Each dataset uses an open-source medium-sized language model in the same language as the target dataset, keeping the cost of the ablation experiments low

6. Comparison with other participating teams

Compared with the other teams, we lead on the average score and on multiple sub-datasets. On the zero-shot datasets of Subtask-2 (Table 7), our average score is 5.2 pp higher than the second place. On Subtask-1 (Table 6), we rank first on several datasets (MultiBEU, MultiBCA, OpeNerES, and OpeNerEN), and our average score is only 0.3 pp below first place.

c2026421ea1b830db341c1ece2df54b2.jpeg

Table 6. Comparison of team results on Subtask-1 (numbers in brackets are per-dataset rankings; Average is the mean)

007ad16cf8606b1595848807f451d64d.jpeg

Table 7. Comparison of team results on Subtask-2 (numbers in brackets are per-dataset rankings; Average is the mean)

7. Summary

In this evaluation, we explored the structured sentiment analysis task. To address the lack of interaction between data in different languages and the scarcity of annotation resources, we applied a cross-lingual pretrained language model together with two data augmentation methods and two auxiliary tasks. Experiments demonstrate the effectiveness of our method and model, which won second place in Subtask-1 (Table 6) and first place in Subtask-2 (Table 7) of SemEval-2022 Task 10, Structured Sentiment Analysis.

In the future, we will continue to explore more effective multilingual/cross-lingual resources and ways of applying cross-lingual pretrained models. We are also applying the techniques from this competition to Meituan's business, such as the Voice Interaction Department's intelligent customer service and intelligent outbound robots, to help improve automated resolution capabilities and user satisfaction.

8. Author of this article

Chen Cong, Jian Shuo, Liu Cao, Yang Fan, Guang Lu, Jinxiong, etc. are all from the Meituan Platform/Voice Interaction Department.

9. References

[1] Rodrigo Agerri, Montse Cuadros, Sean Gaines, and German Rigau. 2013. OpeNER: Open polarity enhanced named entity recognition. In Spanish Society for Natural Language Processing, volume 51, pages 215–218.

[2] Rodrigo Agerri, Iñaki San Vicente, Jon Ander Campos, Ander Barrena, Xabier Saralegi, Aitor Soroa, and Eneko Agirre. 2020. Give Your Text Representation Models Some Love: The Case for Basque. In Proceedings of the 12th International Conference on Language Resources and Evaluation.

[3] Jordi Armengol-Estapé, Casimiro Pio Carrino, Carlos Rodriguez-Penagos, Ona de Gibert Bonet, Carme Armentano-Oller, Aitor Gonzalez-Agirre, Maite Melero, and Marta Villegas. 2021. Are multilingual models the best choice for moderately underresourced languages? A comprehensive assessment for Catalan. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 4933–4946, Online. Association for Computational Linguistics.

[4] Jeremy Barnes, Toni Badia, and Patrik Lambert. 2018. MultiBooked: A corpus of Basque and Catalan hotel reviews annotated for aspect-level sentiment classification. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation(LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA).

[5] Jeremy Barnes, Robin Kurtz, Stephan Oepen, Lilja Øvrelid, and Erik Velldal. 2021. Structured sentiment analysis as dependency graph parsing. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 3387–3402, Online. Association for Computational Linguistics.

[6] Jeremy Barnes, Oberländer Laura Ana Maria Kutuzov, Andrey and, Enrica Troiano, Jan Buchmann, Rodrigo Agerri, Lilja Øvrelid, Erik Velldal, and Stephan Oepen. 2022. SemEval-2022 task 10: Structured sentiment analysis. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval2022), Seattle. Association for Computational Linguistics.

[7] José Meléndez Barros and Glauber De Bona. 2021. A deep learning approach for aspect sentiment triplet extraction in portuguese. In Brazilian Conference on Intelligent Systems, pages 343–358. Springer.

[8] José Cañete, Gabriel Chaperon, Rodrigo Fuentes, Jou-Hui Ho, Hojin Kang, and Jorge Pérez. 2020. Spanish pre-trained BERT model and evaluation data. In PML4DC at ICLR.

[9] Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, Heyan Huang, and M. Zhou. 2021. Infoxlm: An information-theoretic framework for cross-lingual language model pre-training. In NAACL.

[10] Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116.

[11] Alexis Conneau and Guillaume Lample. 2019. Crosslingual language model pretraining. Advances in neural information processing systems, 32.

[12] James Cross and Liang Huang. 2016. Incremental parsing with minimal features using bi-directional lstm. ArXiv, abs/1606.06406.

[13] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

[14] Timothy Dozat and Christopher D Manning. 2016. Deep biaffine attention for neural dependency parsing. arXiv preprint arXiv:1611.01734.

[15] E. Kiperwasser and Yoav Goldberg. 2016. Simple and accurate dependency parsing using bidirectional lstm feature representations. Transactions of the Association for Computational Linguistics, 4:313–327.

[16] Robin Kurtz, Stephan Oepen, and Marco Kuhlmann. 2020. End-to-end negation resolution as graph parsing. In IWPT.

[17] Xin Li, Lidong Bing, Piji Li, and Wai Lam. 2019. A unified model for opinion target extraction and target sentiment prediction. ArXiv, abs/1811.05082.

[18] Bing Liu. 2012. Sentiment analysis and opinion mining. Synthesis lectures on human language technologies, 5(1):1–167.

[19] Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.

[20] Tomas Mikolov, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. In ICLR.

[21] Margaret Mitchell, Jacqui Aguilar, Theresa Wilson, and Benjamin Van Durme. 2013. Open domain targeted sentiment. In EMNLP.

[22] Stephan Oepen, Omri Abend, Lasha Abzianidze, Johan Bos, Jan Hajic, Daniel Hershcovich, Bin Li, Timothy J. O’Gorman, Nianwen Xue, and Daniel Zeman. 2020. Mrp 2020: The second shared task on crossframework and cross-lingual meaning representation parsing. In CONLL.

[23] Lilja Øvrelid, Petter Mæhlum, Jeremy Barnes, and Erik Velldal. 2020. A fine-grained sentiment dataset for Norwegian. In LREC.

[24] Lilja Øvrelid, Petter Mæhlum, Jeremy Barnes, and Erik Velldal. 2020. A fine-grained sentiment dataset for Norwegian. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 5025– 5033, Marseille, France. European Language Resources Association.

[25] Bo Pang, Lillian Lee, et al. 2008. Opinion mining and sentiment analysis. Foundations and Trends® in information retrieval, 2(1–2):1–135.

[26] Maria Pontiki, Dimitris Galanis, John Pavlopoulos, Haris Papageorgiou, Ion Androutsopoulos, and Suresh Manandhar. 2014. SemEval-2014 task 4: Aspect based sentiment analysis. In SemEval@COLING 2014.

[27] Alec Radford, Jeff Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners.

[28] Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. ArXiv, abs/1910.10683.

[29] Mike Schuster and Kuldip K Paliwal. 1997. Bidirectional recurrent neural networks. IEEE transactions on Signal Processing, 45(11):2673–2681.

[30] Cigdem Toprak, Niklas Jakob, and Iryna Gurevych. 2010. Sentence and expression level annotation of opinions in user-generated discourse. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 575–584, Uppsala, Sweden. Association for Computational Linguistics.

[31] Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jacob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. ArXiv, abs/1706.03762.

[32] Janyce Wiebe, Theresa Wilson, and Claire Cardie. 2005. Annotating expressions of opinions and emotions in language. Language Resources and Evaluation, 39(2-3):165–210.

[33] Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, and Jamie Brew. 2019. Huggingface’s transformers: State-of-the-art natural language processing. ArXiv, abs/1910.03771.

[34] Lu Xu, Hao Li, Wei Lu, and Lidong Bing. 2020. Position-aware tagging for aspect sentiment triplet extraction. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2339–2349, Online. Association for Computational Linguistics.

[35] Zhilin Yang, Zihang Dai, Yiming Yang, Jaime G. Carbonell, Ruslan Salakhutdinov, and Quoc V. Le. 2019. Xlnet: Generalized autoregressive pretraining for language understanding. In NeurIPS.

[36] Elena Zotova, Rodrigo Agerri, Manuel Nunez, and German Rigau. 2020. Multilingual stance detection: The Catalan independence corpus. arXiv preprint arXiv:2004.00050.

