[1] T. Brown et al., “Language Models Are Few-Shot Learners,” Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901, 2020.
https://proceedings.neurips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
[2] V. Sanh et al., “Multitask Prompted Training Enables Zero-Shot Task Generalization,” in Proceedings of the 10th International Conference on Learning Representations (ICLR), Apr. 2022.
https://openreview.net/pdf?id=9Vrb9D0WI4
[3] R. Thoppilan et al., “LaMDA: Language Models for Dialog Applications,” arXiv preprint arXiv:2201.08239, 2022.
https://doi.org/10.48550/arXiv.2201.08239
[4] M. Chen et al., “Evaluating Large Language Models Trained on Code,” arXiv preprint arXiv:2107.03374, 2021. https://doi.org/10.48550/arXiv.2107.03374
[5] A. Lewkowycz et al., “Solving Quantitative Reasoning Problems with Language Models,” Advances in Neural Information Processing Systems, vol. 35, pp. 3843–3857, Dec. 2022.
https://proceedings.neurips.cc/paper_files/paper/2022/hash/18abbeef8cfe9203fdf9053c9c4fe191-Abstract-Conference.html
[6] J. D. Blom, A Dictionary of Hallucinations. New York, NY: Springer, 2010.
https://doi.org/10.1007/978-1-4419-1223-7
[7] Y. Bang et al., “A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity,” in Proceedings of the 13th International Joint Conference on Natural Language Processing (IJCNLP-AACL 2023), Bali, Indonesia, Nov. 2023, pp. 675–718.
https://doi.org/10.48550/arXiv.2302.04023
[8] Z. Ji et al., “Survey of Hallucination in Natural Language Generation,” ACM Comput. Surv., vol. 55, no. 12, Art. no. 248, pp. 1–38, Mar. 2023.
https://doi.org/10.1145/3571730
[9] L. Huang et al., “A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions,” ACM Trans. Inf. Syst., vol. 43, no. 2, Art. no. 28, pp. 1–55, 2025.
https://doi.org/10.1145/3703155
[10] R. Friel and A. S. Sanyal, “ChainPoll: A High Efficacy Method for LLM Hallucination Detection,”
arXiv preprint arXiv:2310.18344, 2023.
https://doi.org/10.48550/arXiv.2310.18344
[11] T. Zare and M. Shamsfard, “Detecting Hallucinations Generated by Large Language Models Using Paraphrasing Technique,” in
Proceedings of the 10th International Conference on Web Research (ICWR), Tehran, Iran, Apr. 2024, pp. 1–6.
https://www.sid.ir/paper/1147671/en
[12] S. Lin, J. Hilton, and O. Evans, “TruthfulQA: Measuring How Models Mimic Human Falsehoods,” in
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), S. Muresan, P. Nakov, and A. Villavicencio, Eds., Dublin, Ireland: Association for Computational Linguistics, May 2022, pp. 3214–3252.
https://doi.org/10.18653/v1/2022.acl-long.229
[13] P. Manakul, A. Liusie, and M. Gales, “SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models,” in
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, H. Bouamor, J. Pino, and K. Bali, Eds., Singapore: Association for Computational Linguistics, Dec. 2023, pp. 9004–9017.
https://doi.org/10.18653/v1/2023.emnlp-main.557
[14] T. Liu, K. Wang, L. Sha, B. Chang, and Z. Sui, “Table-to-Text Generation by Structure-Aware Seq2seq Learning,”
Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, Apr. 2018.
https://doi.org/10.1609/aaai.v32i1.11925
[15] J. Li, X. Cheng, X. Zhao, J.-Y. Nie, and J.-R. Wen, “HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models,” in
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, H. Bouamor, J. Pino, and K. Bali, Eds., Singapore: Association for Computational Linguistics, Dec. 2023, pp. 6449–6464.
https://doi.org/10.18653/v1/2023.emnlp-main.397
[16] B. M. Lattimer, P. H. Chen, X. Zhang, and Y. Yang, “Fast and Accurate Factual Inconsistency Detection Over Long Documents,” in
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, H. Bouamor, J. Pino, and K. Bali, Eds., Singapore: Association for Computational Linguistics, Dec. 2023, pp. 1691–1703.
https://doi.org/10.18653/v1/2023.emnlp-main.105
[17] Y. Yehuda, I. Malkiel, O. Barkan, J. Weill, R. Ronen, and N. Koenigstein, “InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers,” in
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), L.-W. Ku, A. Martins, and V. Srikumar, Eds., Bangkok, Thailand: Association for Computational Linguistics, Aug. 2024, pp. 9333–9347.
https://doi.org/10.18653/v1/2024.acl-long.506
[18] G. Sriramanan, S. Bharti, V. S. Sadasivan, S. Saha, P. Kattakinda, and S. Feizi, “LLM-Check: Investigating Detection of Hallucinations in Large Language Models,”
Advances in Neural Information Processing Systems, vol. 37, pp. 34188–34216, Dec. 2024.
https://proceedings.neurips.cc/paper_files/paper/2024/hash/3c1e1fdf305195cd620c118aaa9717ad-Abstract-Conference.html
[19] Y.-S. Chuang, L. Qiu, C.-Y. Hsieh, R. Krishna, Y. Kim, and J. R. Glass, “Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps,” in
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Y. Al-Onaizan, M. Bansal, and Y.-N. Chen, Eds., Miami, Florida, USA: Association for Computational Linguistics, Nov. 2024, pp. 1419–1436.
https://doi.org/10.18653/v1/2024.emnlp-main.84
[20] S. Zhang, T. Yu, and Y. Feng, “TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space,” in
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), L.-W. Ku, A. Martins, and V. Srikumar, Eds., Bangkok, Thailand: Association for Computational Linguistics, Aug. 2024, pp. 8908–8949.
https://doi.org/10.18653/v1/2024.acl-long.483