Implementation of an Apache Spark computing cluster based on Raspberry PI microcomputers

Publisher

Видавництво Львівської політехніки
Lviv Politechnic Publishing House

Abstract

The paper presents the implementation of an Apache Spark distributed computing cluster based on Raspberry Pi microcomputers. The cluster consists of three Raspberry Pi 4 devices (one master node and two worker nodes), each equipped with 8 GB of RAM and a high-speed network connection. The configuration was tuned by adjusting the SPARK_WORKER_MEMORY and SPARK_WORKER_CORES parameters to make full use of the available hardware resources. Communication between nodes was secured through authentication with 4096-bit SSH keys. The cluster was verified with a test application that demonstrated efficient distribution of the computational load across nodes. The developed solution costs $400, about one quarter of the cost of equivalent cloud resources for one year. The results show that the Raspberry Pi cluster provides all the capabilities needed for hands-on learning of distributed computing technologies, offering physical access to all system components at a low cost.
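
The abstract names the two worker settings that were tuned but does not give their values. As a rough illustration only, the sketch below renders the corresponding lines for a worker's conf/spark-env.sh file in a Spark standalone deployment. The helper name `render_worker_env` and the choice to reserve 1 GB of RAM for the operating system are assumptions made for this example, not values reported in the paper; the four cores correspond to the quad-core CPU of the Raspberry Pi 4.

```python
def render_worker_env(total_ram_gb: int, total_cores: int,
                      reserved_ram_gb: int = 1) -> str:
    """Return spark-env.sh lines for one Spark standalone worker.

    reserved_ram_gb is a hypothetical safety margin left to the OS;
    the paper does not state the values it actually used.
    """
    if reserved_ram_gb >= total_ram_gb:
        raise ValueError("reserved memory must be smaller than total RAM")
    worker_mem_gb = total_ram_gb - reserved_ram_gb
    return "\n".join([
        # Total memory the worker may hand out to executors.
        f"export SPARK_WORKER_MEMORY={worker_mem_gb}g",
        # Number of CPU cores the worker may hand out to executors.
        f"export SPARK_WORKER_CORES={total_cores}",
    ])


if __name__ == "__main__":
    # Raspberry Pi 4 with 8 GB of RAM and a quad-core CPU.
    print(render_worker_env(total_ram_gb=8, total_cores=4))
```

Giving the worker all cores and most of the memory reflects the goal stated in the abstract of maximizing hardware use; a smaller memory value may be safer if other services run on the same node.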

Citation

Vlakh-Vyhrynovska H. Implementation of an Apache Spark computing cluster based on Raspberry PI microcomputers / Halyna Vlakh-Vyhrynovska, Bohdan Boretskyi // Measuring Equipment and Metrology. — Lviv : Lviv Politechnic Publishing House, 2025. — Vol 86. — No 2. — P. 92–97.
