Implementation of an Apache Spark computing cluster based on Raspberry PI microcomputers
Publisher
Видавництво Львівської політехніки
Lviv Politechnic Publishing House
Abstract
The paper presents the implementation of an Apache Spark distributed computing cluster based on Raspberry Pi
microcomputers. The solution consists of three Raspberry Pi 4 devices (one master node and two worker nodes), each equipped with
8 GB of RAM and a high-speed network connection. The cluster configuration was optimized by adjusting the
SPARK_WORKER_MEMORY and SPARK_WORKER_CORES parameters to maximize the use of available hardware resources.
Secure communication between nodes was established through authentication with 4096-bit SSH keys. The functionality of the
cluster was verified with a test application that demonstrated efficient distribution of the computational load across nodes. The developed
solution costs $400, one quarter of the cost of using equivalent cloud resources for one year. The results show that the
Raspberry Pi cluster provides all the necessary capabilities for practical learning of distributed computing technologies, offering
physical access to all system components at a low cost.
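The worker tuning mentioned in the abstract can be illustrated with a minimal Python sketch that derives SPARK_WORKER_MEMORY and SPARK_WORKER_CORES for one node and renders them as lines for a spark-env.sh file. The abstract does not state the values the authors chose, so the 2 GB of RAM reserved for the operating system is an assumption made only for illustration, and the helper names worker_env and render_spark_env are hypothetical. The core count of 4 reflects the quad-core CPU of the Raspberry Pi 4.

```python
def worker_env(total_ram_gb, total_cores, reserved_ram_gb=2, reserved_cores=0):
    """Derive Spark standalone worker settings from a node's hardware.

    reserved_ram_gb and reserved_cores are headroom left for the OS and
    the Spark daemons; the defaults are illustrative assumptions, not
    values taken from the paper.
    """
    if reserved_ram_gb >= total_ram_gb:
        raise ValueError("reserved RAM must be smaller than total RAM")
    if reserved_cores >= total_cores:
        raise ValueError("reserved cores must be fewer than total cores")
    return {
        # Spark accepts memory sizes with a unit suffix such as "6g".
        "SPARK_WORKER_MEMORY": f"{total_ram_gb - reserved_ram_gb}g",
        "SPARK_WORKER_CORES": str(total_cores - reserved_cores),
    }


def render_spark_env(settings):
    """Render the settings as export lines for conf/spark-env.sh."""
    return "".join(f"export {name}={value}\n" for name, value in settings.items())


if __name__ == "__main__":
    # One Raspberry Pi 4 worker: 8 GB of RAM, 4 CPU cores.
    print(render_spark_env(worker_env(total_ram_gb=8, total_cores=4)), end="")
```

Under these assumptions each worker would offer 6 GB of memory and all 4 cores to Spark executors. A smaller memory reserve uses more of the hardware but leaves less room for the operating system, so the right value depends on what else runs on the node.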
Citation
Vlakh-Vyhrynovska H. Implementation of an Apache Spark computing cluster based on Raspberry PI microcomputers / Halyna Vlakh-Vyhrynovska, Bohdan Boretskyi // Measuring Equipment and Metrology. — Lviv : Lviv Politechnic Publishing House, 2025. — Vol 86. — No 2. — P. 92–97.