A Word Difficulty Classification Research Based on K-Means Method

Authors

  • Chengxi Wei
  • Di Kuang
  • Keran Xu

DOI:

https://doi.org/10.54097/hset.v70i.12188

Keywords:

Wordle, Word Attributes, K-means, Spearman.

Abstract

Wordle is a globally popular game, and many researchers have studied its game mechanics. However, few studies have explored how a word's attributes influence its difficulty. This paper therefore uses the K-means algorithm to classify word difficulty based on selected word attributes. The main attributes selected as affecting word difficulty are the number of character repeats, the presence of "th" or "er", the initial letter (s/c/a/t), and the final letter (e/y/r/t). Different numbers of difficulty categories are compared, and three is found to be the most appropriate. Finally, validation with the Spearman correlation indicates that the method is highly reliable.
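
The attribute encoding and clustering described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. The function names (word_features, kmeans), the definition of "character repeat times" as word length minus the number of distinct letters, the binary encoding of the other three attributes, the sample words, and the plain Lloyd-style K-means loop are all assumptions made for the example.

```python
import random


def word_features(word):
    """Encode a word with the four attributes named in the abstract.

    Assumption: 'character repeat times' is taken as the word length minus
    the number of distinct letters; the other attributes are 0/1 flags.
    """
    w = word.lower()
    repeats = len(w) - len(set(w))
    has_bigram = int("th" in w or "er" in w)
    starts_common = int(w[0] in "scat")
    ends_common = int(w[-1] in "eyrt")
    return [float(repeats), float(has_bigram),
            float(starts_common), float(ends_common)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=3, iters=50, seed=0):
    """Plain Lloyd-style K-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k randomly chosen data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical sample words, chosen only to exercise the encoding.
words = ["there", "mummy", "slate", "crane", "party", "fuzzy", "other", "eerie"]
points = [word_features(w) for w in words]
labels, centroids = kmeans(points, k=3)
for w, lab in zip(words, labels):
    print(w, lab)
```

The cluster indices returned here are arbitrary labels. Mapping them to difficulty levels, and checking that mapping against observed player performance with a rank correlation, would be separate steps that this sketch does not cover.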

Published

15-11-2023

How to Cite

Wei, C., Kuang, D., & Xu, K. (2023). A Word Difficulty Classification Research Based on K-Means Method. Highlights in Science, Engineering and Technology, 70, 215-222. https://doi.org/10.54097/hset.v70i.12188