The Application and Analysis of Large Language Models in Linguistics

Enrui Mei

doi:10.54097/wnxext10

Keywords:

Large language model, linguistics, reasoning ability, Cohen's κ coefficient.

Abstract

Prevailing large language models still need to improve their reasoning ability. Studying large language models in linguistics can not only improve their reasoning ability on complex questions, especially in ambiguous tasks, but also give linguists and other researchers a useful tool for researching ancient literacy and ancient cultures. This paper compares different methods of training large language models using linguistics. It shows that some models are able to identify ambiguity in sentences by means of Linguistic Recursion and Chain-of-Thought. It also shows that commercial models are sometimes not as good as open-source models, so researchers have to choose carefully which large language model to use. No model is perfect: researchers who want high accuracy have to give up some efficiency, and users who need high efficiency may find that the accuracy of large language models is not good enough. This paper also identifies some problems that still need to be solved. Hardware is a significant factor limiting improvements in the reasoning ability of large language models. Linguists also need to provide more explicit ambiguous data on which large language models can be trained. Finally, large language models still perform poorly at systematic reasoning.
