International Journal of Web Research

MultiCGCN: Multi-Label Text Classification using GCNs and Heterogeneous Graphs

Document Type: Original Article

Authors
School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
Abstract
Multi-label text classification is a core challenge in natural language processing: the goal is to assign multiple labels to a given document. Recent work has focused mainly on deep learning approaches, yet many of these fail to capture the relationships between documents and labels. In this paper, we propose MultiCGCN, a method that applies Graph Convolutional Networks (GCNs) to multi-label text classification by modeling text as a heterogeneous graph. This unified graph combines document similarities, label relationships, and document-label associations, so the model can capture both document and label dependencies. We recast multi-label classification as a link prediction task: document similarity is computed with Term Frequency–Inverse Document Frequency (TF-IDF), and GCNs predict the document-label links that represent label assignments. In our empirical evaluation, MultiCGCN improves F1 score by 10% over traditional baseline models. The approach opens new avenues for improving the accuracy of multi-label classification across domains.
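The pipeline summarized in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the similarity threshold `sim_threshold`, label-label edges derived from label co-occurrence, one-hot input features, a single propagation step with no learned weights, and a dot-product score for candidate document-label links. All function names are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per tokenized document (smoothed IDF)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
         for t, c in Counter(doc).items()}
        for doc in docs
    ]

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def build_graph(docs, doc_labels, num_labels, sim_threshold=0.1):
    """Adjacency matrix over [documents | labels] with three edge types."""
    n_docs = len(docs)
    size = n_docs + num_labels
    adj = [[0.0] * size for _ in range(size)]
    vecs = tfidf_vectors(docs)
    # Document-document edges weighted by TF-IDF cosine similarity.
    for i in range(n_docs):
        for j in range(i + 1, n_docs):
            sim = cosine(vecs[i], vecs[j])
            if sim >= sim_threshold:  # threshold is an assumption
                adj[i][j] = adj[j][i] = sim
    for i, labels in doc_labels.items():
        for a in labels:
            # Document-label edges from known (training) assignments.
            adj[i][n_docs + a] = adj[n_docs + a][i] = 1.0
            # Label-label edges from co-occurrence (an assumption).
            for b in labels:
                if a != b:
                    adj[n_docs + a][n_docs + b] = 1.0
    return adj

def gcn_propagate(adj, features):
    """One GCN-style step, D^-1/2 (A + I) D^-1/2 X, without learned weights."""
    size = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(size)]
             for i in range(size)]
    deg = [sum(row) for row in a_hat]
    return [
        [sum(a_hat[i][k] / math.sqrt(deg[i] * deg[k]) * features[k][f]
             for k in range(size))
         for f in range(len(features[0]))]
        for i in range(size)
    ]

def score_labels(embeddings, doc, n_docs, num_labels):
    """Score each candidate document-label link with a dot product."""
    return [
        sum(x * y for x, y in zip(embeddings[doc], embeddings[n_docs + label]))
        for label in range(num_labels)
    ]

# Toy corpus: documents 0 and 1 are labeled, document 2 is unlabeled.
docs = [
    ["graph", "neural", "network"],
    ["protein", "folding", "biology"],
    ["graph", "network", "convolution"],
]
doc_labels = {0: [0], 1: [1]}
adj = build_graph(docs, doc_labels, num_labels=2)
one_hot = [[1.0 if i == j else 0.0 for j in range(len(adj))] for i in range(len(adj))]
embeddings = gcn_propagate(adj, one_hot)
scores = score_labels(embeddings, doc=2, n_docs=len(docs), num_labels=2)
print(scores)
```

In this toy corpus, the unlabeled document 2 shares vocabulary only with document 0, so its link to label 0 scores higher than its link to label 1. A trained model would replace the fixed propagation step with learned GCN layers and fit the link scores against the known document-label edges.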