Disentangled variational auto-encoder enhanced by counterfactual data for debiasing recommendation
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Springer, 2024-01-01 |
Series: | Complex & Intelligent Systems |
Subjects: | |
Online Access: | https://doi.org/10.1007/s40747-023-01314-x |
Summary: | Recommender systems suffer from various recommendation biases that seriously hinder their development. A series of debiasing methods have therefore been proposed, especially for the two most common biases: popularity bias and amplified subjective bias. However, existing debiasing methods usually concentrate on correcting a single bias. Such single-functionality methods neglect the bias-coupling issue, in which recommended items are jointly attributable to multiple biases. In addition, previous work cannot handle the lack of supervised signals caused by sparse data, which is commonplace in recommender systems. In this work, we introduce a disentangled debiasing variational auto-encoder framework (DB-VAE) to address the single-functionality issue, together with a counterfactual data enhancement method to mitigate the adverse effects of data sparsity. Specifically, DB-VAE first extracts two types of extreme items, each affected by only a single bias, based on collider theory; these are used, respectively, to learn the latent representation of the corresponding bias, thereby decoupling the biases. An accurate unbiased user representation can then be learned with the help of these decoupled bias representations. Furthermore, the data generation module employs Pearl’s framework to produce large amounts of counterfactual data to help fully train the model, compensating for the supervised signals that are missing because of sparse data. Extensive experiments on three real-world data sets demonstrate the effectiveness of the proposed model. Specifically, our model outperforms the best baseline by 19.5% in terms of Recall@20 and 9.5% in terms of NDCG@100 in the best scenario. In addition, the counterfactual data can further improve DB-VAE, especially on the data set with low sparsity. |
---|---|
ISSN: | 2199-4536 2198-6053 |
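
The summary reports results in terms of Recall@20 and NDCG@100. For readers unfamiliar with these ranking metrics, the following is a minimal sketch of their common binary-relevance definitions. It is not taken from the article: the function names are hypothetical, and the article's exact normalization is an assumption (some papers divide recall by min(k, number of relevant items) rather than by the number of relevant items).

```python
import math


def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of the relevant items that appear in the top-k of the ranking.

    Assumption: the denominator is the number of relevant items; the
    article may normalize differently.
    """
    if not relevant_items:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / len(relevant_items)


def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance NDCG: DCG of the top-k divided by the ideal DCG."""
    if not relevant_items:
        return 0.0
    # A hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    # Ideal ranking places all relevant items (up to k) at the top.
    ideal_hits = min(k, len(relevant_items))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg
```

For example, with the ranking `["a", "b", "c", "d"]` and the relevant set `{"a", "d", "e"}`, `recall_at_k(..., k=2)` counts one hit out of three relevant items, and `ndcg_at_k` equals 1.0 only when every relevant item that fits within the top k is ranked ahead of all irrelevant ones.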