Improving User Intent Detection in Urdu Web Queries with Capsule Net Architectures
Detecting the communicative intent behind user queries is essential for search engines to understand a user’s search goal and retrieve the desired results. Due to increased web searching in local languages, there is an emerging need to support language understanding for languages other...
Main Authors: | Sana Shams, Muhammad Aslam |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-11-01 |
Series: | Applied Sciences |
Subjects: | Urdu; search queries; intent detection; capsule network; word embeddings |
Online Access: | https://www.mdpi.com/2076-3417/12/22/11861 |
_version_ | 1797465928928067584 |
---|---|
author | Sana Shams; Muhammad Aslam |
author_facet | Sana Shams; Muhammad Aslam |
author_sort | Sana Shams |
collection | DOAJ |
description | Detecting the communicative intent behind user queries is essential for search engines to understand a user’s search goal and retrieve the desired results. Due to increased web searching in local languages, there is an emerging need to support language understanding for languages other than English. This article presents a distinctive capsule neural network architecture for intent detection from search queries in Urdu, a widely spoken South Asian language. The proposed two-tiered capsule network uses LSTM cells and an iterative routing mechanism between the capsules to discriminate diversely expressed search intents. Since no Urdu query dataset was available, a benchmark intent-annotated dataset of 11,751 queries was developed, covering 11 query domains and annotated with Broder’s intent taxonomy (i.e., navigational, transactional and informational intents). Through rigorous experimentation, the proposed model attained a state-of-the-art accuracy of 91.12%, significantly improving upon several alternative classification techniques and strong baselines. An error analysis revealed systematic error patterns owing to class imbalance and large lexical variability in Urdu web queries. |
first_indexed | 2024-03-09T18:29:30Z |
format | Article |
id | doaj.art-99c2bb86421147978395231a480c64ad |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-09T18:29:30Z |
publishDate | 2022-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-99c2bb86421147978395231a480c64ad; 2023-11-24T07:42:07Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-11-01; volume 12; issue 22; article 11861; 10.3390/app122211861; Improving User Intent Detection in Urdu Web Queries with Capsule Net Architectures; Sana Shams (0); Muhammad Aslam (1); (0) Department of Computer Science, University of Engineering and Technology, Lahore, Pakistan; (1) Department of Computer Science, University of Engineering and Technology, Lahore, Pakistan; https://www.mdpi.com/2076-3417/12/22/11861; Urdu; search queries; intent detection; capsule network; word embeddings |
title | Improving User Intent Detection in Urdu Web Queries with Capsule Net Architectures |
topic | Urdu; search queries; intent detection; capsule network; word embeddings |
url | https://www.mdpi.com/2076-3417/12/22/11861 |
work_keys_str_mv | AT sanashams improvinguserintentdetectioninurduwebquerieswithcapsulenetarchitectures AT muhammadaslam improvinguserintentdetectioninurduwebquerieswithcapsulenetarchitectures |
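
The description mentions an iterative routing mechanism between capsules but this record does not say which routing algorithm the article uses. As a rough illustration only, the sketch below assumes the widely used routing-by-agreement scheme with a squash nonlinearity; it is not the authors' implementation. The function names (`squash`, `softmax`, `route`), the capsule counts, and the random prediction vectors (standing in for the outputs of the LSTM-based lower tier) are all hypothetical.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-9):
    """Scale each vector so its length lies in [0, 1) while keeping its direction."""
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def route(u_hat, n_iters=3):
    """Routing-by-agreement (assumed variant, n_iters >= 1).

    u_hat: array of shape (n_lower, n_upper, dim) holding each lower capsule's
           prediction for each upper capsule.
    Returns (v, c): upper capsule outputs of shape (n_upper, dim) and the
    coupling coefficients of shape (n_lower, n_upper) used to compute them.
    """
    n_lower, n_upper, _ = u_hat.shape
    b = np.zeros((n_lower, n_upper))  # routing logits, start uniform
    for _ in range(n_iters):
        c = softmax(b, axis=1)                      # each lower capsule splits its vote
        s = np.einsum("ij,ijd->jd", c, u_hat)       # weighted sum per upper capsule
        v = squash(s)                               # upper capsule outputs
        b = b + np.einsum("ijd,jd->ij", u_hat, v)   # reward agreeing predictions
    return v, c


# Hypothetical sizes: 6 lower capsules, 3 intent capsules, 8-dimensional vectors.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 3, 8))
v, c = route(u_hat)

# The length of each upper capsule vector is read as the score for that intent.
intents = ["navigational", "transactional", "informational"]
intent_scores = np.linalg.norm(v, axis=-1)
predicted = intents[int(np.argmax(intent_scores))]
```

In this sketch the three upper capsules correspond to the three intents of Broder’s taxonomy named in the description, and the predicted intent is simply the capsule with the longest output vector.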