MultiCGCN: Multi-Label Text Classification using GCNs and Heterogeneous Graphs

Allahgholi, Milad; Rahmani, Hossein; Soltanzadeh, Parinaz; Naebzadeh, Aylin

doi:10.22133/ijwr.2024.485064.1243

Document Type: Original Article

Authors

Milad Allahgholi

Hossein Rahmani

Parinaz Soltanzadeh

Aylin Naebzadeh

School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran

Abstract

Multi-label text classification is a critical challenge in natural language processing, where the goal is to assign multiple labels to a given document. Recent advances have primarily focused on deep learning approaches, yet many fail to adequately capture the intricate relationships between documents and labels. In this paper, we propose a novel method called MultiCGCN, in which we leverage Graph Convolutional Networks (GCNs) for multi-label text classification by modeling text as a heterogeneous graph. This unified graph incorporates document similarities, label relationships, and document-label associations, enabling the model to effectively capture both document and label dependencies. We transform the multi-label classification problem into a link prediction task, using Term Frequency–Inverse Document Frequency (TF-IDF) for document similarity and applying GCNs to predict label assignments. Our empirical evaluations demonstrate that MultiCGCN achieves a significant performance boost, improving F1 score by 10% over traditional baseline models. This approach opens new avenues for enhancing the accuracy of multi-label classification in various domains.
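
The unified graph described in the abstract can be illustrated with a minimal sketch. It places documents and labels in one node set, connects documents by TF-IDF cosine similarity, connects documents to their known labels, and connects labels to each other. Several details here are assumptions and are not taken from the paper: the similarity threshold `sim_threshold`, the use of Jaccard overlap for label-label weights, the exact TF-IDF weighting, and all function names. The `normalize` helper applies the symmetric normalization with self-loops that is commonly used before GCN propagation. The sketch stops at graph construction and does not implement the GCN or the link-prediction step itself.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each tokenized document to a sparse {term: weight} TF-IDF dict."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        length = max(len(doc), 1)  # guard against empty documents
        vectors.append({
            term: (count / length) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_graph(docs, doc_labels, num_labels, sim_threshold=0.1):
    """Build a dense adjacency matrix over document nodes, then label nodes.

    docs: list of token lists; doc_labels: {doc index: [label indices]} for
    documents with known labels. The threshold and the Jaccard weighting
    below are illustrative assumptions, not the paper's settings.
    """
    n_docs = len(docs)
    size = n_docs + num_labels
    adj = [[0.0] * size for _ in range(size)]

    # Document-document edges: TF-IDF cosine similarity above a threshold.
    vectors = tfidf_vectors(docs)
    for i in range(n_docs):
        for j in range(i + 1, n_docs):
            sim = cosine(vectors[i], vectors[j])
            if sim >= sim_threshold:
                adj[i][j] = adj[j][i] = sim

    # Document-label edges: one edge per known label assignment.
    members = [set() for _ in range(num_labels)]
    for d, labels in doc_labels.items():
        for lab in labels:
            adj[d][n_docs + lab] = adj[n_docs + lab][d] = 1.0
            members[lab].add(d)

    # Label-label edges: Jaccard overlap of the labels' document sets.
    for a in range(num_labels):
        for b in range(a + 1, num_labels):
            union = members[a] | members[b]
            if union:
                w = len(members[a] & members[b]) / len(union)
                adj[n_docs + a][n_docs + b] = adj[n_docs + b][n_docs + a] = w
    return adj


def normalize(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    size = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(size)]
             for i in range(size)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(size)]
            for i in range(size)]
```

Under this framing, a document with no entry in `doc_labels` still joins the graph through its similarity edges, and predicting its labels amounts to scoring the missing document-label links.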

Keywords

Text Classification

Graph Convolutional Neural Networks

Multi-label Text Classification

Subjects

Web Retrieval & Content Analysis

Volume 7, Issue 4
Autumn 2024
Pages 29-37

International Journal of Web Research
