Learning with Small Data: Subgraph Counting Queries

Abstract: Deep Learning (DL) has been widely used in many applications, and its success relies on large training data. A key issue is how to provide a DL solution when there is little training data to learn from initially. In this paper, we explore a meta-learning approach for a specific problem, subgraph isomorphism counting, a fundamental problem in graph analysis: given a data graph g and a pattern graph p, count the number of subgraphs of g that match p. There are many different data graphs and pattern graphs, and a subgraph isomorphism counting query is specified by a pair (g, p). The problem is NP-hard, and a DL model for it naturally requires large training data. We design a Gaussian Process (GP) model that combines a Graph Neural Network with Bayesian nonparametrics, and we train the GP with a meta-learning algorithm on a small set of training data. Through meta-learning, we obtain a generalized meta-model that better encodes the information of data and pattern graphs and captures the prior of small tasks. With the learned meta-model, we handle a collection of pairs (g, p) as a task, where some pairs may be associated with ground-truth counts and the remaining pairs are the queries to answer. There are two cases: either some pairs come with ground-truth counts (few-shot), or none do (zero-shot). We provide solutions for both. In particular, for zero-shot, we propose a new data-driven approach to predict the count values; zero-shot learning for such regression tasks is difficult, and there is no hands-on solution in the literature. We conducted extensive experimental studies to confirm that our approach is robust to model degeneration on small training data and that our meta-model adapts quickly to new queries through few-shot and zero-shot learning.
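To make the query semantics concrete, here is a minimal brute-force sketch in Python (assuming the networkx library; the helper name count_subgraph_isomorphisms is hypothetical). It only illustrates what a query (g, p) asks for, not the learned GP/GNN estimator of the paper, and it counts induced-subgraph isomorphism mappings found by the VF2 matcher; other counting conventions (e.g., dividing out the automorphisms of p) also appear in the literature.

import networkx as nx
from networkx.algorithms import isomorphism

def count_subgraph_isomorphisms(g: nx.Graph, p: nx.Graph) -> int:
    # Count label-preserving isomorphism mappings from induced subgraphs of g onto p.
    matcher = isomorphism.GraphMatcher(
        g, p,
        node_match=isomorphism.categorical_node_match("label", None),
    )
    return sum(1 for _ in matcher.subgraph_isomorphisms_iter())

# Example query (g, p): a labeled triangle pattern in a small data graph.
g = nx.Graph()
g.add_nodes_from([(0, {"label": "A"}), (1, {"label": "B"}),
                  (2, {"label": "A"}), (3, {"label": "B"})])
g.add_edges_from([(0, 1), (1, 2), (2, 0), (2, 3)])

p = nx.Graph()
p.add_nodes_from([(0, {"label": "A"}), (1, {"label": "B"}), (2, {"label": "A"})])
p.add_edges_from([(0, 1), (1, 2), (2, 0)])

print(count_subgraph_isomorphisms(g, p))  # 2 mappings for this (g, p)

Exact enumeration like this is exponential in the worst case, which is why the paper instead learns to predict the count from small training data.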


Bibliographic Details
Main Authors: Kangfei Zhao (Beijing Institute of Technology), Zongyan He (The Chinese University of Hong Kong), Jeffrey Xu Yu (The Chinese University of Hong Kong), Yu Rong (Tencent AI Lab)
Format: Article
Language: English
Published: SpringerOpen, 2023-09-01
Series: Data Science and Engineering, Vol. 8, No. 3, pp. 292-305
ISSN: 2364-1185, 2364-1541
Subjects: Subgraph isomorphism; Subgraph counting; Meta-learning
Online Access: https://doi.org/10.1007/s41019-023-00223-w