Game Tag Extraction Model for Korean Game Classification
Game Classification, Deep Learning, BERT, Word Embedding
Hang Yeol LEE, Jae Wook LEE, So-Young PARK
As the number of games on electronic software distribution (ESD) platforms increases, it becomes difficult for users to find the games they want. Instead, games can be recommended to a user based on game keyword tags. In this paper, we propose a method that automatically generates game keyword tags from a game description using the deep learning model BERT. To generate appropriate tags, the proposed method extracts the 100 most representative game keyword tags from the game publishing platform Steam and performs binary classification per tag. Finally, it selects the 4 game keyword tags with the highest softmax scores. To improve accuracy, a Korean game description set is used for fine-tuning and optimization, updating both the BERT model pretrained on approximately 3.3 billion multilingual words and the KoBERT model pretrained on approximately 50 million Korean words. Experiments show that the fine-tuned BERT model outperforms the KoBERT model by 9.19% in F-score. This suggests that the size of the pretraining data set matters more than the characteristics of the specific language.
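The final tag-selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each of the 100 per-tag binary classifiers has already produced a positive-class softmax score, and the tag names and scores shown are hypothetical examples.

```python
def select_top_tags(tag_scores, k=4):
    """Given per-tag positive-class softmax scores from the per-tag
    binary classifiers, return the k tags with the highest scores."""
    ranked = sorted(tag_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [tag for tag, _ in ranked[:k]]

# Hypothetical scores for a handful of the 100 Steam tags.
scores = {"RPG": 0.91, "Indie": 0.84, "Puzzle": 0.12,
          "Strategy": 0.77, "Horror": 0.05, "Casual": 0.66}
print(select_top_tags(scores))  # → ['RPG', 'Indie', 'Strategy', 'Casual']
```

In practice the scores would come from running the fine-tuned BERT model once per tag on the game description; the selection itself is a simple top-k over those scores.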