Development of a Hybrid Fuzzy Geographically Weighted K-Prototype Clustering and Genetic Algorithm for Enhanced Spatial Analysis: Application to Rural Development Mapping

Authors

A. B. Santoso, A. C. Candra, R. Nooraeni, A. W. Wijayanto

DOI:

https://doi.org/10.34123/jurnalasks.v16i2.789

Keywords:

Clustering, Geographically Weighted Cluster, Mixed-type Data, Village Development

Abstract

Introduction/Main Objectives: Clustering methods are crucial for geodemographic analysis (GDA) because they enable a more accurate and distinct characterization of a region. This characterization supports the design of socio-economic policies and contributes to the overall advancement of the region.

Background Problems: The fuzzy geographically weighted clustering (FGWC) method, a GDA technique, primarily handles numerical data and is prone to getting stuck in local optima.

Novelty: This study proposes two novel clustering methods: fuzzy geographically weighted k-prototypes (FKP-GW) and a hybrid model that adds genetic algorithm-based optimization (GA-FKP-GW).

Research Methods: This research conducts a simulation study comparing the two proposed clustering methods. For the empirical application, the clustering technique is applied to data from the official Village Potential Survey of Temanggung, Indonesia.

Finding/Results: The evaluation results of experiments on simulated data and on the case study indicate that the proposed method yields clustering results distinct from those of the previous method while remaining comparably efficient. The empirical application identifies four distinct groups of villages, each displaying unique characteristics. The results of this research have the potential to benefit the development of GDA methods and to assist the local government in formulating more effective development policies.
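The abstract names three ingredients without spelling them out: a mixed-type (k-prototypes) dissimilarity, fuzzy memberships, and a geographic adjustment of those memberships. As a rough illustration only, and not the authors' implementation, the Python sketch below shows one common textbook form of each. All function names, the parameters (`gamma`, `alpha`, `beta`, `a`, `b`), the use of population as the interaction weight, and the choice to normalize by the sum of weights are assumptions made for this example; the genetic algorithm step is omitted.

```python
from typing import List, Sequence


def mixed_distance(x_num: Sequence[float], x_cat: Sequence[str],
                   c_num: Sequence[float], c_cat: Sequence[str],
                   gamma: float = 1.0) -> float:
    """K-prototypes-style dissimilarity: squared Euclidean distance on the
    numeric part plus gamma times the number of categorical mismatches."""
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, c_num))
    mismatches = sum(1 for a, b in zip(x_cat, c_cat) if a != b)
    return numeric + gamma * mismatches


def fuzzy_memberships(distances: Sequence[float], m: float = 2.0) -> List[float]:
    """Fuzzy c-means style memberships of one object, given its (squared)
    dissimilarity to each cluster prototype. m > 1 is the fuzzifier."""
    # An object lying exactly on one or more prototypes belongs fully to them.
    zeros = [i for i, d in enumerate(distances) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(distances))]
    power = 1.0 / (m - 1.0)
    inverse = [(1.0 / d) ** power for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


def geo_adjust(U: Sequence[Sequence[float]], pop: Sequence[float],
               dist: Sequence[Sequence[float]], alpha: float = 0.5,
               beta: float = 0.5, a: float = 1.0,
               b: float = 1.0) -> List[List[float]]:
    """Blend each region's memberships with a weighted average of the other
    regions' memberships. Weights follow a gravity-like form,
    w_ij = (pop_i * pop_j)^b / dist_ij^a (assumed here for illustration)."""
    n, k = len(U), len(U[0])
    adjusted = []
    for i in range(n):
        w = [((pop[i] * pop[j]) ** b) / (dist[i][j] ** a) if j != i else 0.0
             for j in range(n)]
        total_w = sum(w)
        row = []
        for c in range(k):
            if total_w > 0:
                neighbour = sum(w[j] * U[j][c] for j in range(n)) / total_w
            else:
                neighbour = U[i][c]  # no neighbours: keep own membership
            row.append(alpha * U[i][c] + beta * neighbour)
        s = sum(row)
        adjusted.append([v / s for v in row])  # renormalize to sum to 1
    return adjusted


if __name__ == "__main__":
    # Three hypothetical villages, two clusters.
    U = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
    pop = [100.0, 200.0, 150.0]
    dist = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
    for row in geo_adjust(U, pop, dist):
        print([round(v, 3) for v in row])
```

In a full clustering loop these pieces would be iterated together with a prototype update until the memberships stabilize; a genetic algorithm could then be used to search over starting points, which is the general motivation the abstract gives for the hybrid variant.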

Published

2024-12-24

How to Cite

Santoso, A. B., Candra, A. C., Nooraeni, R., & Wijayanto, A. W. (2024). Development of a Hybrid Fuzzy Geographically Weighted K-Prototype Clustering and Genetic Algorithm for Enhanced Spatial Analysis: Application to Rural Development Mapping. Jurnal Aplikasi Statistika & Komputasi Statistik, 16(2), 122–139. https://doi.org/10.34123/jurnalasks.v16i2.789