Please use this identifier to cite or link to this item:
https://digital.lib.ueh.edu.vn/handle/UEH/72835
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Đặng Ngọc Hoàng Thành | en_US |
dc.contributor.author | Nguyễn Quỳnh Khánh Hà | en_US |
dc.contributor.other | Nguyễn Quốc Việt | en_US |
dc.contributor.other | Nguyễn Nhật Quang | en_US |
dc.date.accessioned | 2024-11-19T04:14:54Z | - |
dc.date.available | 2024-11-19T04:14:54Z | - |
dc.date.issued | 2024 | - |
dc.identifier.uri | https://digital.lib.ueh.edu.vn/handle/UEH/72835 | - |
dc.description.abstract | Discovering customer intents from their written or spoken language plays a vital role in natural language understanding and automated dialogue response. However, labeling intents for new domains from the ground up is a daunting and time-consuming process, often requiring extensive manual effort from domain experts. To address this challenge, this paper proposes an unsupervised approach for discovering intents and automatically producing meaningful intention labels from a collection of unlabeled utterances in the context of a banking domain. In the initial stage, we deploy Deep Embedded Clustering (DEC) to simultaneously learn feature representations and cluster assignments to create a set of coherent clusters where the utterances within each cluster have the same intent. For enhanced performance, we modify the joint loss functions of DEC to preserve the local structure of the model (known as Improved Deep Embedded Clustering with Local Structure Preservation). Importantly, we explore the use of a state-of-the-art optimization technique called Sophia Optimizer and employ the Jensen-Shannon divergence as a measure of similarity in the clustering algorithm. We empirically show that our proposed modification achieves state-of-the-art results in terms of NMI score, surpassing all prior unsupervised DEC architectures. In the second stage, intent labels for each cluster are automatically generated by extracting the ACTION-OBJECT pair from each utterance using a dependency parser. The proposed unsupervised approach is capable of automatically generating meaningful intent labels while obtaining high evaluation scores in utterance clustering and intent discovery. While initially developed to build an intent model for conversational systems, this framework can also be adapted for short text clustering in various general applications. | en_US |
dc.format.medium | 64 p. | en_US |
dc.language.iso | en | en_US |
dc.publisher | University of Economics Ho Chi Minh City | en_US |
dc.relation.ispartofseries | Giải thưởng Nhà nghiên cứu trẻ UEH 2024 (UEH Young Researcher Award 2024) | en_US |
dc.title | Improving deep embedded clustering for intent mining with Jensen-Shannon divergence and Sophia optimizer | en_US |
dc.type | Research Paper | en_US |
ueh.speciality | Data science and artificial intelligence | en_US |
ueh.award | C Prize | en_US |
item.languageiso639-1 | en | - |
item.cerifentitytype | Publications | - |
item.grantfulltext | reserved | - |
item.openairetype | Research Paper | - |
item.fulltext | Full texts | - |
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | - |
Appears in Collections: | Nhà nghiên cứu trẻ UEH (UEH Young Researcher) |
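
The abstract mentions using the Jensen-Shannon divergence as the similarity measure in the DEC clustering step. The following is a minimal sketch of what such a clustering loss could look like, not the authors' implementation. It assumes the standard DEC Student's t soft assignments and sharpened target distribution, and assumes that the Jensen-Shannon divergence simply takes the place of the KL divergence between the two. All function names are hypothetical.

```python
import math


def soft_assign(z, centroids, alpha=1.0):
    """Student's t soft assignments q_ij between embeddings and centroids (as in DEC)."""
    rows = []
    for point in z:
        kernel = [
            (1.0 + sum((a - b) ** 2 for a, b in zip(point, c)) / alpha)
            ** (-(alpha + 1.0) / 2.0)
            for c in centroids
        ]
        total = sum(kernel)
        rows.append([k / total for k in kernel])
    return rows


def target_distribution(q):
    """Sharpened targets p_ij proportional to q_ij^2 / f_j, with f_j = sum_i q_ij."""
    freq = [sum(col) for col in zip(*q)]
    rows = []
    for row in q:
        weights = [qij ** 2 / fj for qij, fj in zip(row, freq)]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing; b_i > 0 whenever a_i > 0 here.
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0.0)

    m = [0.5 * (x + y) for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def clustering_loss(z, centroids):
    """Assumed loss: mean JS divergence between targets P and assignments Q."""
    q = soft_assign(z, centroids)
    p = target_distribution(q)
    return sum(js_divergence(pi, qi) for pi, qi in zip(p, q)) / len(z)
```

Unlike the KL divergence, the Jensen-Shannon divergence is symmetric and bounded above by ln 2, which is one plausible reason to prefer it as a clustering objective. In a full pipeline this term would be combined with the autoencoder reconstruction loss and minimized by the optimizer; that part is omitted here.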
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.