Wordle word difficulty classification based on K-means


  • Chen Xiong
  • Xinbo Yang
  • Jiahui Zhang




Wordle Games, K-means, Difficulty Classification, Word Property.


Wordle is a popular word-guessing game, and analyzing Wordle results can inform future iterations of the game. In this paper, a word difficulty classification model based on K-means cluster analysis is developed. By clustering a dataset of the number of guesses players needed for each word, the words are classified into three difficulty categories: hard, medium and easy, labeled 3, 2 and 1 respectively. Counting and analyzing the attributes of the words in each category shows that 1) the more common a word is, the less difficult it is, and 2) the more repeated letters a word contains, the more difficult it is. When the example word EERIE is fed into the model, it is classified as medium difficulty. The silhouette coefficient of the model is 0.372. The cluster analysis method applied in this paper gives good training results and is suitable as technical support for Wordle game analysis.
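The clustering step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the single feature (mean number of guesses per word), the toy values, the evenly spaced initialization, and the helper names `kmeans`, `silhouette` and `classify` are all assumptions made for the example.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(points, k, max_iter=100):
    """Plain K-means with a deterministic, evenly spaced initialization."""
    ordered = sorted(points)
    n = len(ordered)
    centers = [ordered[i * (n - 1) // max(k - 1, 1)] for i in range(k)]
    assign = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_assign = [min(range(k), key=lambda c: dist(p, centers[c]))
                      for p in points]
        if new_assign == assign:
            break  # converged: no point changed cluster
        assign = new_assign
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = centroid(members)
    return assign, centers

def silhouette(points, assign):
    """Mean silhouette coefficient over all points."""
    total = 0.0
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(assign[j], []).append(dist(p, q))
        own = by_cluster.pop(assign[i], None)
        if own is None or not by_cluster:
            continue  # singleton or single cluster: contributes 0
        a = sum(own) / len(own)                                 # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())   # separation
        total += (b - a) / max(a, b)
    return total / len(points)

# Toy, made-up data: hypothetical mean number of guesses for nine words.
points = [[3.6], [3.7], [3.8], [4.2], [4.3], [4.4], [5.0], [5.1], [5.2]]
assign, centers = kmeans(points, k=3)

# Rank clusters by center so that 1 = easy, 2 = medium, 3 = hard.
order = sorted(range(3), key=lambda c: centers[c][0])
rank = {c: r + 1 for r, c in enumerate(order)}
labels = [rank[a] for a in assign]
score = silhouette(points, assign)

def classify(point):
    """Difficulty label of a new word: the rank of its nearest center."""
    return rank[min(range(3), key=lambda c: dist(point, centers[c]))]

print(labels, round(score, 3), classify([4.25]))
```

Ranking the clusters by their centers is what turns arbitrary cluster indices into the ordered labels 1, 2 and 3. The silhouette coefficient lies between -1 and 1, with higher values indicating tighter, better separated clusters. The toy data here is far more cleanly separated than real guess distributions would be.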








How to Cite

Xiong, C., Yang, X., & Zhang, J. (2024). Wordle word difficulty classification based on K-means. Highlights in Science, Engineering and Technology, 82, 19-26. https://doi.org/10.54097/6e8q3v02