Grouping of Village Status in West Java Province Using the Manhattan, Euclidean and Chebyshev Methods on the K-Mean Algorithm

Gatot Tri Pranoto, Wahyu Hadikristanto, Yoga Religia

Abstract


The Ministry of Villages, Development of Disadvantaged Areas and Transmigration (Ministry of Village PDTT) is the Indonesian government ministry in charge of village and rural development, empowerment of rural communities, accelerated development of disadvantaged areas, and transmigration. The Village Potential Data for 2014 (Podes 2014) for West Java Province, issued by the Central Statistics Agency in collaboration with the Ministry of Village PDTT, is unlabeled data (suited to unsupervised learning) covering 5319 villages. The Podes 2014 data for West Java Province were compiled according to the level of village development (village-specific) in Indonesia, with the village as the unit of analysis. Based on Regulation of the Minister of Villages, Disadvantaged Areas and Transmigration of the Republic of Indonesia Number 2 of 2016 concerning the Village Development Index, villages are classified into five statuses, namely Very Disadvantaged Village, Disadvantaged Village, Developing Village, Advanced Village and Independent Village, based on their ability to manage and increase the potential of their social, economic and ecological resources. Village status is inseparable from village development, which depends on government funding. However, village development funds have not been distributed effectively and accurately according to the conditions and potential of each village, because clear information about village status is lacking. As a result, there is still little information on which villages should be prioritized for funding and attention from the government. Data mining can be used to group the objects in a data set into classes that share the same criteria (clustering). One algorithm that can be used for clustering is k-means, which groups data by assigning each data point to the centroid at the closest distance.
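The assignment of each data point to its closest centroid can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the function names (`assign_to_nearest`, `update_centroids`, `kmeans`), the mean-based centroid update for all three distances, and the toy points are assumptions made for the example. The distance function is passed in as a parameter so that Manhattan, Euclidean and Chebyshev can be swapped without changing the rest of the loop.

```python
def manhattan(a, b):
    # Sum of absolute coordinate differences (L1 distance).
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    # Square root of the sum of squared differences (L2 distance).
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def chebyshev(a, b):
    # Largest absolute coordinate difference (L-infinity distance).
    return max(abs(x - y) for x, y in zip(a, b))

def assign_to_nearest(points, centroids, distance):
    # For every point, pick the index of the centroid at the smallest distance.
    labels = []
    for p in points:
        dists = [distance(p, c) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels

def update_centroids(points, labels, centroids):
    # Assumption: centroids are recomputed as coordinate-wise means,
    # whichever distance is used; empty clusters keep their old centroid.
    updated = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            updated.append(old)
            continue
        dim = len(old)
        updated.append(tuple(sum(m[d] for m in members) / len(members)
                             for d in range(dim)))
    return updated

def kmeans(points, centroids, distance, max_iter=100):
    # Alternate assignment and update until the centroids stop moving.
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        labels = assign_to_nearest(points, centroids, distance)
        updated = update_centroids(points, labels, centroids)
        if updated == centroids:
            break
        centroids = updated
    # Final labels always refer to the returned centroids.
    return assign_to_nearest(points, centroids, distance), centroids

# Hypothetical toy data: two well-separated groups in two dimensions.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans(points, [(0, 0), (10, 10)], chebyshev)
```

On real village indicators the three distances can produce different assignments, because each one weights coordinate differences differently; on this toy data they agree.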
This study compares three distance measures in the k-means algorithm: Manhattan, Euclidean and Chebyshev. The results were validated using execution time and the Davies-Bouldin index. In this test, the Podes 2014 data for West Java Province were grouped into all five village statuses: 694 villages in the Very Disadvantaged Village cluster, 567 in the Disadvantaged Village cluster, 1440 in the Developing Village cluster, 1557 in the Advanced Village cluster and 1061 in the Independent Village cluster. Among the distance measures, Chebyshev had the shortest accumulated execution time at 1 second, compared with 1.6 seconds for Euclidean and 2.4 seconds for Manhattan. Meanwhile, the Euclidean method achieved the best (lowest) Davies-Bouldin index, 0.886, compared with 0.926 for Manhattan and 0.990 for Chebyshev.
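For readers unfamiliar with the Davies-Bouldin index, the sketch below shows one common way to compute it: for each cluster, take the worst ratio of combined within-cluster scatter to between-centroid distance against any other cluster, then average over clusters. Lower values indicate more compact, better-separated clusters. The function name `davies_bouldin`, the choice of Euclidean distance and the toy data are assumptions for illustration; the sketch also assumes every cluster is non-empty and no two centroids coincide. It is not the tool the authors used.

```python
def euclidean(a, b):
    # Square root of the sum of squared coordinate differences.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def davies_bouldin(points, labels, centroids, distance=euclidean):
    k = len(centroids)
    # Scatter of cluster i: mean distance of its members to its centroid.
    scatter = []
    for i in range(k):
        members = [p for p, lab in zip(points, labels) if lab == i]
        scatter.append(sum(distance(p, centroids[i]) for p in members)
                       / len(members))
    # For each cluster, keep the worst (largest) similarity ratio
    # against any other cluster, then average over all clusters.
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j])
                     / distance(centroids[i], centroids[j])
                     for j in range(k) if j != i)
    return total / k

# Hypothetical toy clustering: two tight groups far apart.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels = [0, 0, 1, 1]
centroids = [(0, 1), (10, 1)]
score = davies_bouldin(points, labels, centroids)
```

Here each cluster has scatter 1 and the centroids are 10 apart, so the index is (1 + 1) / 10 = 0.2. Moving the two groups closer together or spreading their members out would raise the value.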

Keywords


Village Development; k-means; Manhattan; Euclidean; Chebyshev; Davies-Bouldin index


DOI: https://doi.org/10.31326/jisa.v5i1.1097


Copyright (c) 2022 Gatot Tri Pranoto, Wahyu Hadikristanto, Yoga Religia

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


JOURNAL IDENTITY

Journal Name: JISA (Jurnal Informatika dan Sains)
e-ISSN: 2614-8404, p-ISSN: 2776-3234
Publisher: Program Studi Teknik Informatika Universitas Trilogi
Publication Schedule: June and December 
Language: Indonesia & English