Summary: | Predictions based on big data have recently become more successful, and research using images or text is bringing a long-imagined future within reach. However, the data often contain considerable noise, or the model does not adequately account for the data, and both increase uncertainty. Moreover, the gap between accuracy and likelihood is widening in modern predictive models, which may further increase the uncertainty of predictions. Applications such as self-driving cars and healthcare, in particular, can be directly threatened by these uncertainties. Previous studies have proposed methods for reducing uncertainty in applications that use images or signals. Although natural language processing is being actively studied, uncertainty in text classification has not yet been discussed sufficiently. We therefore propose a method that uses Variational Bayes to reduce the difference between accuracy and likelihood in text classification. To confirm the practical applicability of the proposed method, we conduct an experiment using patent data from the field of technology management. In the experiment, the calibrated confidence of the model was very small, ranging from a minimum of 0.02 to a maximum of 0.04. Furthermore, statistical tests at the 0.05 significance level showed that the proposed method calibrated the confidence more effectively than the model before calibration.
|