Improving User Intent Detection in Urdu Web Queries with Capsule Net Architectures

Bibliographic Details
Main Authors: Sana Shams, Muhammad Aslam
Format: Article
Language: English
Published: MDPI AG 2022-11-01
Series: Applied Sciences
Subjects: Urdu; search queries; intent detection; capsule network; word embeddings
Online Access: https://www.mdpi.com/2076-3417/12/22/11861
author Sana Shams
Muhammad Aslam
collection DOAJ
description Detecting the communicative intent behind user queries is critical for search engines, which must understand a user’s search goal to retrieve the desired results. With web searching in local languages on the rise, there is a growing need to support language understanding for languages other than English. This article presents a distinctive capsule neural network architecture for intent detection from search queries in Urdu, a widely spoken South Asian language. The proposed two-tiered capsule network uses LSTM cells and an iterative routing mechanism between the capsules to discriminate effectively between diversely expressed search intents. Since no Urdu query dataset was available, a benchmark intent-annotated dataset of 11,751 queries was developed, covering 11 query domains and annotated with Broder’s intent taxonomy (i.e., navigational, transactional and informational intents). In rigorous experiments, the proposed model attained a state-of-the-art accuracy of 91.12%, significantly improving upon several alternative classification techniques and strong baselines. An error analysis revealed systematic error patterns owing to class imbalance and large lexical variability in Urdu web queries.
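The iterative routing mentioned in the description can be illustrated with a small sketch. This record does not give the paper's exact formulation, so the code below assumes the widely used routing-by-agreement scheme with a squash nonlinearity and three routing iterations. The names `squash`, `softmax`, `route`, `predict` and `INTENTS` are hypothetical, and the LSTM tier that would produce the prediction vectors is left out.

```python
import math

# Assumption: one upper-level capsule per intent in Broder's taxonomy.
INTENTS = ["navigational", "transactional", "informational"]


def squash(v):
    """Rescale a vector so its length lies in [0, 1) and its direction is kept."""
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0.0:
        return [0.0 for _ in v]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in v]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(u_hat, iterations=3):
    """Routing-by-agreement (assumed scheme, not confirmed by this record).

    u_hat[i][j] is the prediction vector that lower capsule i makes for
    upper capsule j. Returns one output vector per upper capsule.
    """
    n_lower, n_upper, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_upper for _ in range(n_lower)]  # routing logits
    v = []
    for _ in range(iterations):
        # Coupling coefficients: each lower capsule spreads weight over upper capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_upper):
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_lower))
                 for d in range(dim)]
            v.append(squash(s))
        # Raise the logit wherever a prediction agrees with the capsule output.
        for i in range(n_lower):
            for j in range(n_upper):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v


def predict(u_hat):
    """Pick the intent whose capsule output vector is longest."""
    lengths = [math.sqrt(sum(x * x for x in vec)) for vec in route(u_hat)]
    return INTENTS[lengths.index(max(lengths))]
```

In this sketch the length of each output vector acts as the score for its intent, which is why `predict` selects the longest one. A real model would learn the transformation that produces `u_hat` from the query encoding.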
first_indexed 2024-03-09T18:29:30Z
format Article
id doaj.art-99c2bb86421147978395231a480c64ad
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-09T18:29:30Z
publishDate 2022-11-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-99c2bb86421147978395231a480c64ad; 2023-11-24T07:42:07Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2022-11-01; vol. 12, no. 22, art. 11861; doi:10.3390/app122211861
Sana Shams (Department of Computer Science, University of Engineering and Technology, Lahore, Pakistan)
Muhammad Aslam (Department of Computer Science, University of Engineering and Technology, Lahore, Pakistan)
title Improving User Intent Detection in Urdu Web Queries with Capsule Net Architectures
topic Urdu
search queries
intent detection
capsule network
word embeddings
url https://www.mdpi.com/2076-3417/12/22/11861