MapReduce Word Frequency Statistics Based on Serverless Platform

Authors

  • Yuewei Zhang

DOI:

https://doi.org/10.54097/cxghwq45

Keywords:

Serverless; MapReduce; Alibaba Cloud Function Compute; Word Frequency Statistics.

Abstract

In the past few years, demand for fast and efficient information extraction has grown sharply, driving a surge in demand for text data processing. Traditional distributed computing frameworks based on virtual machines (VMs) suffer from inherent drawbacks such as wasteful resource reservation, high overhead, and rigid scaling, which lead to low utilization and poor performance under bursty workloads. This paper designs and implements a Serverless MapReduce (MR) system on Alibaba Cloud Function Compute (FC) and Object Storage Service (OSS). The Map and Reduce phases are split into independent, event-driven functions that are automatically scheduled and executed across FC instances. Experimental results on text word frequency counting show that the system is elastic and cost-efficient, particularly for lightweight applications with high concurrency and transient workloads. This work provides an efficient and manageable big data processing platform for small and medium-sized teams, and serves as an example of how to apply Serverless computing to a distributed processing model.
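To make the Map/Reduce split described in the abstract concrete, the following is a minimal local sketch of word frequency counting with separate map and reduce functions. It is not the paper's implementation: the in-memory `store` dictionary is a hypothetical stand-in for an OSS bucket, and `map_handler`, `reduce_handler`, `run_job`, and the object key names are illustrative assumptions. On a real platform each handler call would be a separate function invocation triggered by an event.

```python
import re
from collections import Counter

# Hypothetical stand-in for an object storage bucket:
# maps an object key to raw text or to partial word counts.
store = {}


def map_handler(input_key, output_key):
    """Map step: count the words in one input chunk and store the partial result."""
    words = re.findall(r"[a-z']+", store[input_key].lower())
    store[output_key] = dict(Counter(words))


def reduce_handler(partial_keys, output_key):
    """Reduce step: merge all partial counts into a single total."""
    total = Counter()
    for key in partial_keys:
        total.update(store[key])  # Counter.update adds counts from a mapping
    store[output_key] = dict(total)


def run_job(chunks):
    """Local driver (assumption): runs the handlers sequentially in one process."""
    partial_keys = []
    for i, text in enumerate(chunks):
        store[f"input/{i}"] = text
        map_handler(f"input/{i}", f"partial/{i}")
        partial_keys.append(f"partial/{i}")
    reduce_handler(partial_keys, "output/result")
    return store["output/result"]
```

Calling `run_job` with a list of text chunks returns a dictionary mapping each word to its total count. Because each map call reads and writes only its own keys, the map steps are independent of one another and could run concurrently.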

Published

15-03-2026

Section

Articles

How to Cite

Zhang, Y. (2026). MapReduce Word Frequency Statistics Based on Serverless Platform. Mathematical Modeling and Algorithm Application, 9(1), 709-714. https://doi.org/10.54097/cxghwq45