Play Store Data Scrapping and Preprocessing done as Sentiment Analysis Material
DOI:
https://doi.org/10.64021/ijmst.1.1.16-21.2025
Keywords:
e-commerce, preprocessing, sentiment analysis, scrapping, shopee
Abstract
Sentiment analysis is a computational technique for interpreting user opinions about a product from textual reviews. This research prepares a dataset for use in further studies, sentiment analysis among them. A total of 12,000 recent reviews, posted between July 2024 and January 2025, were collected through web scraping. The research process includes preprocessing steps such as case folding and data cleaning, which transform the raw data into a usable format. The data, from the raw reviews through the preprocessed versions, have been uploaded to the Mendeley Data repository so that they can be reused in further research, such as sentiment analysis.
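As an illustration of the two preprocessing steps named above, the following is a minimal Python sketch and not the authors' actual pipeline. The function names `preprocess_review` and `preprocess_reviews` are hypothetical. The specific cleaning rules (removing URLs, keeping only the letters a-z, collapsing whitespace, dropping reviews that end up empty) are assumptions, since the abstract does not say which rules were applied.

```python
import re

# Assumed cleaning rules; the abstract does not specify them.
URL_RE = re.compile(r"https?://\S+|www\.\S+")  # web links
NON_LETTER_RE = re.compile(r"[^a-z\s]")        # digits, punctuation, emoji, non-ASCII
SPACE_RE = re.compile(r"\s+")                  # runs of whitespace


def preprocess_review(text: str) -> str:
    """Apply case folding, then a simple cleaning pass, to one review."""
    text = text.lower()                  # case folding
    text = URL_RE.sub(" ", text)         # data cleaning: drop links
    text = NON_LETTER_RE.sub(" ", text)  # data cleaning: keep letters a-z only
    return SPACE_RE.sub(" ", text).strip()


def preprocess_reviews(reviews):
    """Clean every review and discard those left empty after cleaning."""
    cleaned = (preprocess_review(review) for review in reviews)
    return [review for review in cleaned if review]


if __name__ == "__main__":
    sample = ["Barang BAGUS!!! cek https://shopee.co.id/x 100%", "12345"]
    print(preprocess_reviews(sample))
```

Keeping only a-z is a deliberately blunt rule: it also strips accented letters and emoji, which may or may not suit a given sentiment model, so it should be adjusted to the data at hand.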
License
Copyright (c) 2025 Indonesian Journal of Modern Science and Technology

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
