Designing a Crawler System Using a Distributed Task Architecture

Heri Santoso
Indyah Hartami Santi
Ni’ma Kholila

Abstract

PDC Media Group is a company engaged in online trading, for which data insight into online marketplaces is very important. Collecting the large volume of data involved requires automation, such as crawling marketplace websites. Because of that volume, crawler systems often perform suboptimally. Applying a distributed task architecture to the crawler system makes it easier to scale the server both vertically and horizontally, so that large and growing data can be handled. The application was developed in Python, with the application server running on Google Cloud Computing. A distributed task architecture requires a message broker as one of its components; the message broker used in this design is RabbitMQ. The crawler system was tested in three scenarios: 1 worker, 2 workers, and 3 workers. With 1 worker, the system achieved 19.3 requests per second with a response time of 332 ms. With 2 workers, it achieved 41.4 requests per second with a response time of 328 ms. With 3 workers, it achieved 60 requests per second with a response time of 331 ms.
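
The distributed task pattern described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the system in the article uses RabbitMQ as the message broker, whereas the sketch below substitutes an in-process `queue.Queue` so that it runs with the standard library alone. The names `fetch_page`, `worker`, and `crawl`, as well as the example URLs, are hypothetical and chosen only for illustration.

```python
import queue
import threading


def fetch_page(url):
    """Hypothetical stand-in for an HTTP request to a marketplace page."""
    return f"<html>listing for {url}</html>"


def worker(name, tasks, results):
    """Consume crawl tasks from the shared queue until a None sentinel arrives."""
    while True:
        url = tasks.get()
        if url is None:  # sentinel: no more work for this worker
            tasks.task_done()
            break
        results.put((name, url, fetch_page(url)))
        tasks.task_done()


def crawl(urls, num_workers):
    """Publish one task per URL and let num_workers consume them concurrently."""
    tasks = queue.Queue()    # assumption: plays the role of the broker queue
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(f"worker-{i + 1}", tasks, results))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for url in urls:         # producer side: enqueue one task per URL
        tasks.put(url)
    for _ in threads:        # one sentinel per worker so every worker stops
        tasks.put(None)
    for t in threads:
        t.join()
    # All workers have finished, so the result count is now stable.
    return [results.get() for _ in range(results.qsize())]


if __name__ == "__main__":
    urls = [f"https://example.com/item/{i}" for i in range(6)]
    pages = crawl(urls, num_workers=3)
    print(len(pages), "pages crawled")
```

Because producers and workers share only the queue, adding a worker requires no change to the producer. In a deployment like the one the abstract describes, the same separation would presumably let workers run on separate servers connected through the broker, which is what makes horizontal scaling straightforward.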

How to Cite
[1] H. Santoso, I. H. Santi, and N. Kholila, “Perancangan System Crawler dengan Menerapkan Arsitektur Distributed Task”, JuTISI, vol. 8, no. 1, pp. 74 –, Apr. 2022.