Open Access Article

Improving Scalable K-Means++

Faculty of Information Technology, University of Jyväskylä, 40014 Jyväskylä, Finland
*
Author to whom correspondence should be addressed.
Received: 25 November 2020 / Revised: 17 December 2020 / Accepted: 21 December 2020 / Published: 27 December 2020
Two new initialization methods for K-means clustering are proposed. Both proposals apply a divide-and-conquer approach to the K-means‖ type of initialization strategy. The second proposal also uses multiple lower-dimensional subspaces produced by the random projection method for the initialization. The proposed methods are scalable and can be run in parallel, which makes them suitable for initializing large-scale problems. In the experiments, the proposed methods are compared with the K-means++ and K-means‖ methods using an extensive set of reference and synthetic large-scale datasets. For the latter, a novel high-dimensional clustering data generation algorithm is given. The experiments show that the proposed methods compare favorably to the state of the art by improving clustering accuracy and the speed of convergence. We also observe that the currently most popular K-means++ initialization behaves like random initialization in very high-dimensional cases.
Keywords: clustering initialization; K-means‖; K-means++; random projection
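
For readers unfamiliar with the baseline that the abstract compares against, the following is a minimal sketch of standard K-means++ seeding (distance-squared sampling). It is not the divide-and-conquer or random-projection method proposed in the article, whose details are not given on this page. The names `squared_distance` and `kmeanspp_seeds` are hypothetical and chosen only for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, k, rng=None):
    """Pick k initial centers from `points` using K-means++ style seeding.

    Illustrative sketch only: the first center is drawn uniformly, and each
    later center is drawn with probability proportional to its squared
    distance to the nearest center chosen so far.
    """
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # First center: uniform random choice.
    centers = [points[rng.randrange(len(points))]]
    # d2[i] holds the squared distance from points[i] to its nearest center.
    d2 = [squared_distance(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(d2) == 0.0:
            # Every point coincides with a chosen center; fall back to uniform.
            idx = rng.randrange(len(points))
        else:
            # Distance-squared sampling: far-away points are more likely.
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        centers.append(points[idx])
        # Update nearest-center distances with the newly added center.
        d2 = [min(old, squared_distance(p, points[idx]))
              for old, p in zip(d2, points)]
    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0), (-10.0, 5.0)]
    print(kmeanspp_seeds(data, 3, random.Random(0)))
```

Each new center requires a full pass over the data, so this seeding is inherently sequential in k. That sequential cost is the usual motivation for parallel variants such as K-means‖, which the article takes as its starting point.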

MDPI and ACS Style

Hämäläinen, J.; Kärkkäinen, T.; Rossi, T. Improving Scalable K-Means++. Algorithms 2021, 14, 6. https://0-doi-org.brum.beds.ac.uk/10.3390/a14010006

AMA Style

Hämäläinen J, Kärkkäinen T, Rossi T. Improving Scalable K-Means++. Algorithms. 2021; 14(1):6. https://0-doi-org.brum.beds.ac.uk/10.3390/a14010006

Chicago/Turabian Style

Hämäläinen, Joonas; Kärkkäinen, Tommi; Rossi, Tuomo. 2021. "Improving Scalable K-Means++" Algorithms 14, no. 1: 6. https://0-doi-org.brum.beds.ac.uk/10.3390/a14010006
