Article

Self-Configuring (1 + 1)-Evolutionary Algorithm for the Continuous p-Median Problem with Agglomerative Mutation †

Reshetnev Siberian State University of Science and Technology, Institute of Informatics and Telecommunications, 660037 Krasnoyarsk, Russia
*
Author to whom correspondence should be addressed.
† This paper is an extended version of our paper: Comparative study of local search in SWAP and agglomerative neighbourhoods for the continuous p-median problem. In Proceedings of the 9th International Workshop on Mathematical Models and their Applications, Krasnoyarsk, Russia, 16–18 November 2020.
Submission received: 28 March 2021 / Revised: 17 April 2021 / Accepted: 18 April 2021 / Published: 22 April 2021
(This article belongs to the Special Issue Mathematical Models and Their Applications II)

Abstract

The continuous p-median problem (CPMP) is one of the most popular and widely used models in location theory. It minimizes the sum of distances from known demand points to the sought points called centers or medians. This NP-hard location problem is also useful for clustering (automatic grouping), where the sought points are considered cluster centers. Unlike the similar k-means model, p-median clustering is less sensitive to noisy data and to the appearance of outliers (separately located demand points that do not belong to any cluster). Local search algorithms, including Variable Neighborhood Search, as well as evolutionary algorithms demonstrate rather accurate results. Various algorithms based on the use of greedy agglomerative procedures are capable of obtaining very accurate results that are difficult to improve on with other methods. The computational complexity of such procedures limits their use for large problems, although computations on massively parallel systems significantly expand their capabilities. In addition, the efficiency of agglomerative procedures is highly dependent on the setting of their parameters. For the majority of practically important p-median problems, one can choose a very efficient algorithm based on agglomerative procedures; however, the parameter values that ensure its high efficiency are difficult to predict. We introduce the concept of the AGGLr neighborhood based on the application of the agglomerative procedure, and investigate the search efficiency in such a neighborhood depending on its parameter r. Using the similarities between local search algorithms and (1 + 1)-evolutionary algorithms, as well as the ability of the latter to adapt their search parameters, we propose a new algorithm based on a greedy agglomerative procedure with an automatically tuned parameter r. Our new algorithm does not require preliminary tuning of the parameter r of the agglomerative procedure; it adjusts this parameter online, thus representing a more versatile computational tool. The advantages of the new algorithm are shown experimentally on problems with a data volume of up to 2,000,000 demand points.

1. Introduction

1.1. Problem Statement

One of the central problems of location theory is the p-median problem. The goal of the continuous p-median problem is to find p points (centers, medians) such that the sum of the distances from N known points (called demand points or data vectors) to the nearest of the p centers reaches its minimum.
Let us assume that we have N known demand points A1, …, AN in a continuous space, $A_i = (a_{i,1}, \dots, a_{i,d})$, $A_i \in \mathbb{R}^d$, and $S = \{X_1, \dots, X_p\} \subset \mathbb{R}^d$ is the set of sought points (medians, centers). The objective function (sum of distances) of the p-median problem is [1]:
$F(X_1, \dots, X_p) = F(S) = \sum_{i=1}^{N} \min_{j=\overline{1,p}} L(A_i, X_j) \to \min_{X_1, \dots, X_p \in \mathbb{R}^d}$.  (1)
Here, integer p must be known in advance.
The distance metric L(·,·) between objects is a key concept in location theory. As a rule, a metric is understood as a function that determines the distance between any two objects in a metric space [2]. A metric space is a set of points with a defined distance function. The distance of order l between two points is determined by the Minkowski function [3]:
$L_l(X_j, A_i) = \left( \sum_{k=1}^{d} |x_{j,k} - a_{i,k}|^l \right)^{1/l}$.  (2)
The parameter l is determined by the researcher; by increasing or decreasing l, we progressively increase or decrease the relative weight of large coordinate-wise differences. Special cases of the Minkowski function are distinguished by the value of l. For l = 2, the function calculates the Euclidean metric between two points. By default, location problems use the Euclidean metric:
$L_2(X_j, A_i) = \left( \sum_{k=1}^{d} (x_{j,k} - a_{i,k})^2 \right)^{1/2}$.  (3)
Here, $X_j = (x_{j,1}, \dots, x_{j,d})$, $j = \overline{1,p}$, are the sought centers, also called medians, and $A_i = (a_{i,1}, \dots, a_{i,d})$, $i = \overline{1,N}$, are the known points called demand points or data vectors.
For l = 1, the function calculates the Manhattan distance [3]. For l = +∞, the function calculates the Tschebychev distance (the L∞ metric), and for l = 0, the function calculates the Hamming distance [4].
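For illustration, the following sketch computes the Minkowski distance (2) and the p-median objective (1). The names Point, minkowski, and objective are ours, chosen only for this example; they are not part of any implementation discussed below.

```cpp
// Minimal sketch: Minkowski distance of order l and the p-median objective (1).
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<double>;

double minkowski(const Point& x, const Point& a, double l) {
    double s = 0.0;
    for (std::size_t k = 0; k < x.size(); ++k)
        s += std::pow(std::fabs(x[k] - a[k]), l);
    return std::pow(s, 1.0 / l);   // l = 2 gives the Euclidean metric (3)
}

// Objective (1): sum over demand points of the distance to the nearest center.
double objective(const std::vector<Point>& centers, const std::vector<Point>& demand) {
    double f = 0.0;
    for (const Point& a : demand) {
        double best = std::numeric_limits<double>::max();
        for (const Point& x : centers)
            best = std::min(best, minkowski(x, a, 2.0));
        f += best;
    }
    return f;
}
```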
Weber, in his work [5], investigated the problem of finding the center of a set of weighted points (the Weber problem, or 1-median problem [5]), which is a special (simplest) case of problem (1) for p = 1. At the same time, the Weber problem is a generalization of the similar Fermat problem [6] with three demand points (N = 3, p = 1). Weiszfeld, in his work [7], proved a theorem formulated by Sturm [8] and derived a sequence that converges to the optimal solution of the Weber problem. This was a version of the gradient descent algorithm [9] for the Weber problem.
The clustering problem is to divide a given set (sample) of N objects (data vectors) into p disjoint subsets, called clusters, so that each cluster consists of similar objects and the objects of different clusters differ significantly. In the process of grouping a set of objects into certain groups (subsets), the common features of the objects and the applied algorithms play an important role. Some automatic grouping problems can be considered from the point of view of location problems. The most popular clustering model, k-means [10], can be described by Equation (1) where L(·,·) is the squared Euclidean distance between two points. The existence of a trivial non-iterative solution of the corresponding Weber problem with squared Euclidean distances [11] makes the k-means problem one of the most popular optimization models for clustering. A disadvantage of this model is its high sensitivity to outliers (separately located data points); the p-median model is free from this drawback.
The network p-median problem [12] is to find p nodes on a network that minimize the sum of the weighted distances from all nodes to the nearest of the p nodes selected as centers (medians). Both the network and continuous p-median problems have been proved to be NP-hard [13,14].

1.2. State-of-the-Art

Weiszfeld proposed an iterative procedure for solving the Weber problem (1-median problem), one of the simplest location problems, based on the iteratively reweighted least squares method [7]. This algorithm determines a set of weights that are inversely proportional to the distances from the current estimate to the sample points and creates a new estimate as the weighted average of the sample according to those weights. The Weiszfeld procedure is very slow if the solution coincides with one of the demand points. Many researchers have developed modifications of the Weiszfeld algorithm [6,15,16,17,18,19,20] to avoid this disadvantage. The Weiszfeld algorithm can be embedded into more complex algorithms such as the Alternate Location-Allocation (ALA) procedure, also called Lloyd’s algorithm. This algorithm alternates the allocation step, which divides the demand points into p groups according to the index of the closest center, and the location step, which recalculates the center of each group (i.e., solves the Weber problem for each group). The idea of the ALA algorithm is applicable to both network and continuous p-median problems.
Reviews of various solution techniques for p-median problems can be found in [11,21,22].
Hakimi considered the problem of finding the median of a graph (the Weber problem on a network) [12] and generalized this problem to finding the p-medians [23]. The authors of [24,25,26] proposed branch-and-bound algorithms for the network p-median problem, and in [27,28], the authors considered algorithms based on graph theory for rather small problems. Lagrangian relaxation enables us to obtain an approximate solution to medium-sized problems [29,30].
Many heuristic approaches have been developed for large-scale problems. The simplest of them consist in local search, in which a set of network nodes adjacent to the nodes of the current intermediate solution is considered the search neighborhood [31,32]. Drezner et al. [33,34,35,36] presented local search approaches for solving continuous p-median location problems, including the Variable Neighborhood Search (VNS). Bernabe-Loranca et al. [37] introduced a hybrid VNS/Tabu algorithm to determine the correct location of p centers. In addition, Drezner et al. [34] proposed heuristic procedures, including a genetic algorithm (GA), for rather small datasets.
Modern literature on location methods offers many heuristic approaches [21,38] to setting the initial centers of the ALA procedure, which are mainly various evolutionary and random search methods. A standard local search algorithm starts with some initial solution S and goes to a neighboring solution if this solution turns out to be superior. Moreover, finding the set of neighbor solutions n(S) is the key issue. Elements of this set are formed by applying a certain procedure to a solution S. At each local search step, the neighborhood function n(S) specifies the set of possible search directions [39,40] and determines the efficiency of the algorithm [41].
Local search methods have been further developed in metaheuristics. Mladenovich and Hansen [41,42,43,44] proposed a search algorithm with alternating neighborhoods (Variable Neighborhood Search, VNS). In the works of Kochetov, Mladenovich, and Hansen [41,42,45,46,47,48], the authors provide an overview of local search methods based on the idea of alternating neighborhoods, including ways of combining them with other metaheuristics, which reduces the dependence of the result on the choice of the neighborhood. Flexibility and high efficiency explain the competitiveness of VNS in solving NP-hard problems, in particular, automatic grouping and location problems [49,50], the multiple Weber problem [51], the p-median problem [50,51], and many others. For example, VNS algorithms [35,52] and agglomerative algorithms [53,54] demonstrate good results for the p-median problem. Initialization procedures for local search algorithms may include random seeding or accurate seeding based on the estimation of the demand point density [38]. However, multiple launches of simple local search algorithms from various randomly generated solutions do not provide a solution close to the global optimum. More advanced algorithms allow us to obtain objective function values many times better than those of simple local search methods [52].
Evolutionary algorithms are a class of search algorithms that are often used as function optimizers for static objective functions. Using specific evolutionary operators, evolutionary algorithms recombine and modify a set (population) of candidate solutions to a problem. Efficient evolutionary algorithms have been developed for almost all types of optimization problems, including machine learning problems. The use of genetic and other evolutionary algorithms for solving discrete and continuous location problems [55,56,57,58,59,60] is also a popular approach. For example, in [61], the authors proposed an online-learning-based reference vector evolutionary many-objective algorithm guided by a reference-vector-based decomposition strategy, which employs a learning-based technology to enhance its generalization capability. This strategy uses a learning automaton to capture the features of the problem and the search state from environmental feedback, adjusting the mutation strategy for each sub-problem in different situations and enhancing the optimization performance. The authors use clustering to form groups of reference vectors with the same mutation strategy. The work [62] focuses on a discrete optimization problem of improving the seaside operations at marine container terminals. The authors proposed a new Adaptive Island Evolutionary Algorithm for the berth scheduling problem, aiming to minimize the total weighted service cost of vessels. The developed algorithm simultaneously executes separate evolutionary algorithms in parallel on its “islands” (subpopulations of solutions) and exchanges individuals between the islands based on an adaptive mechanism, which enables more efficient exploration of the problem search space. The authors of [63] proposed an alternative many-objective EA, called AnD. In evolutionary many-objective optimization, there exists the problem of selecting promising individuals from the population. AnD uses two strategies, angle-based selection and shift-based density estimation, to remove poor individuals one by one: in a pair of individuals with the minimum vector angle (the most similar search directions), one of the individuals is considered a candidate for elimination. EAs are also a popular solution methodology showing promising performance in solving various types of vehicle routing problems [64,65]. In [66], the authors presented an efficient EA developed to solve a mathematical model of the vehicle routing problem with a “factory-in-a-box”. In [67], the authors proposed an approach to distinguishing between bacterial and viral meningitis using genetic programming and decision trees, able to determine the type of meningitis for any combination of clinical parameters while achieving 100% sensitivity.
EAs are also popular for training various machine learning models. In [68], the Salp Swarm Algorithm (SSA) was deployed to train a Multilayer Perceptron (MLP) for data classification. The proposed method shows superior results compared with other evolutionary-algorithm-based training methods.
In the case of p-median and similar location problems, evolutionary algorithms recombine initial solutions obtained by the ALA procedure or a local search algorithm. In genetic algorithms, a set (population) of solutions is successively improved with the use of special genetic operators (algorithms) of initialization, selection, crossover, and mutation. The crossover operator recombines the elements (“genes”) of two parent solutions to generate a new (“offspring”) solution. The one-point and two-point crossovers, which are standard for many other genetic algorithms, have been shown to present drawbacks when applied to problems such as the k-means and p-median problems [69] because they generate offspring solutions too far from their parents. GAs with a greedy agglomerative crossover operator demonstrate better results [58,70,71]. In this crossover operator, as well as in the IBC algorithms [53,54], the number of centers is successively reduced down to the desired number p. Originally developed for the network p-median problem, such operators were adapted for continuous p-median and k-means problems [72]. Metaheuristic approaches, such as genetic algorithms [73], aim at finding the global optimum. However, on large-scale instances, such approaches require very significant computational costs, especially if they are adapted to solving continuous problems [58]. The greedy agglomerative procedure is a computationally expensive algorithm, especially in the case of continuous problems, when it includes multiple executions of a local search algorithm. However, this procedure allows us to find solutions that are difficult to improve by other methods without significantly increasing the calculation time.
The concept of using greedy agglomerative procedures as the crossover operator in genetic algorithms is as follows. First, for two selected parent solutions (sets of medians) S1 and S2, the algorithm constructs an intermediate solution S′ by adding r randomly selected medians (centers) from the second parent solution S2 to the first solution S1: S′ = S1 ∪ S2′′. Here, S2′′ is a randomly chosen subset of S2, |S2′′| = r. The integer parameter r can be given in advance (often, r = 1 or r = p). After improving this new intermediate solution with the ALA or local search algorithm, one or more excessive medians are removed until |S′| = p. At each iteration, the greedy agglomerative procedure removes the median whose elimination results in the least significant increase of the objective Function (1). A minimal sketch of this idea is given below.
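The following C++ sketch illustrates the crossover concept under simplifying assumptions: F() evaluates objective (1) with Euclidean distances, and improve() is a placeholder for the ALA/local search step (here, the identity). All names are ours, chosen for illustration; this is not the authors' code.

```cpp
// Sketch of the greedy agglomerative crossover idea.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <limits>
#include <random>
#include <vector>

using Point = std::vector<double>;
using Solution = std::vector<Point>;

static double dist(const Point& x, const Point& a) {
    double s = 0.0;
    for (std::size_t k = 0; k < x.size(); ++k) s += (x[k] - a[k]) * (x[k] - a[k]);
    return std::sqrt(s);
}
static double F(const Solution& s, const std::vector<Point>& demand) {
    double f = 0.0;                              // objective (1)
    for (const Point& a : demand) {
        double best = std::numeric_limits<double>::max();
        for (const Point& x : s) best = std::min(best, dist(x, a));
        f += best;
    }
    return f;
}
static Solution improve(Solution s) { return s; } // placeholder for ALA/local search

Solution agglomerativeCrossover(Solution s, const Solution& s2, std::size_t r,
                                std::size_t p, const std::vector<Point>& demand,
                                std::mt19937& rng) {
    // S' = S1 ∪ S2'', where S2'' is a random r-subset of S2
    std::sample(s2.begin(), s2.end(), std::back_inserter(s), r, rng);
    s = improve(s);
    while (s.size() > p) {         // drop the center whose removal hurts (1) least
        std::size_t worst = 0;
        double bestF = std::numeric_limits<double>::max();
        for (std::size_t i = 0; i < s.size(); ++i) {
            Solution t = s; t.erase(t.begin() + i);
            double f = F(t, demand);
            if (f < bestF) { bestF = f; worst = i; }
        }
        s.erase(s.begin() + worst);
        s = improve(s);
    }
    return s;
}
```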
In [74], the authors systematized approaches to constructing algorithms for searching in neighborhoods (denoted GREEDYr) formed using greedy algorithmic procedures for the k-means problem. Searching in SWAP and GREEDYr neighborhoods has advantages over the simplest Lloyd’s procedure [75,76]. However, the results strongly depend on the parameter of the greedy agglomerative procedures, and the optimal values of this parameter differ significantly between test problems. Nevertheless, the GREEDYr search outperforms the SWAP search in terms of accuracy, and such algorithms often outperform more complex genetic algorithms.
Embedding the computationally expensive greedy agglomerative procedures into metaheuristics is even more computationally expensive, which limits the application of such a combination of algorithms to large problems. Nevertheless, the development of massively parallel systems expands the capabilities of such algorithmic combinations [74,77].
The (1 + 1)-evolutionary algorithms, or (1 + 1)-EAs, are similar to the local search and VNS algorithms, but they have no clearly defined neighborhood; therefore, they can reach any point of the search space in a single step [78]. Unlike genetic algorithms, (1 + 1)-EAs have no true population of solutions; usually, they successively improve a single current solution by applying the mutation operator to it.
In a (1 + 1)-evolutionary algorithm [78] for pseudo-Boolean optimization problems, the bitwise mutation operator flips each bit independently of the others with some probability pm that depends on the length of the bit string. The current bit string is replaced by the new one if the fitness of the current bit string is not superior to the fitness of the new string.
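As an illustration, here is a minimal (1 + 1)-EA for pseudo-Boolean maximization with bitwise mutation at rate pm = 1/n. The OneMax fitness is our stand-in objective, used only to make the sketch self-contained.

```cpp
// Minimal (1 + 1)-EA sketch: flip each bit with probability 1/n,
// keep the offspring if its fitness is not worse.
#include <cstddef>
#include <random>
#include <vector>

int fitness(const std::vector<int>& x) {          // OneMax: count of ones
    int s = 0;
    for (int b : x) s += b;
    return s;
}

std::vector<int> onePlusOneEA(std::size_t n, int iterations, std::mt19937& rng) {
    std::bernoulli_distribution init(0.5), flip(1.0 / n);
    std::vector<int> x(n);
    for (auto& b : x) b = init(rng);              // random initial bit string
    for (int t = 0; t < iterations; ++t) {
        std::vector<int> y = x;
        for (auto& b : y) if (flip(rng)) b ^= 1;  // bitwise mutation
        if (fitness(y) >= fitness(x)) x = y;      // accept if not worse
    }
    return x;
}
```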
In their works, Borisovsky and Eremeev studied the performance of the (1 + 1)-EA [79] and compared other evolutionary algorithms to the (1 + 1)-EA [80]. In [80], the authors studied the conditions under which the (1 + 1)-EA is competitive with other evolutionary algorithms (EAs) in terms of the distribution of the fitness function at a given iteration and of the average optimization time. Their approach is applicable when the (1 + 1)-EA mutation operator dominates the reproduction operator of the evolutionary algorithm. In this case, the lower bounds obtained for the expected optimization time of the (1 + 1)-EA can be extended to other EAs based on the dominated operator. They proved that under the domination condition, the (1 + 1)-EA is an optimal search technique with respect to the probability of finding solutions of sufficient quality after a given number of iterations. In the case of domination, the (1 + 1)-EA is also preferable with respect to the expected fitness at any iteration and the expected optimization time.
Reference [81] considers the scenario of the (1 + 1)-EA and randomized local search (RLS) with memory. The authors present two new algorithms: (1 + 1)-EA-m (with a raw list and a hashtable option) and RLS-m+ (or RLS-m if the function is known in advance to be unimodal). These algorithms can be regarded as very simple forms of tabu search. Empirical results, under a reasonable assumption on the fitness evaluation time, verify that (1 + 1)-EA-m and RLS-m+ are superior to their conventional counterparts. In paper [82], the authors investigate the (1 + 1)-EA for optimizing functions over the space {0, …, r}^n, where r is a positive integer. They show that for linear functions over {0, 1, 2}^n, the expected runtime of this algorithm is O(n log n). This result generalizes an existing result on pseudo-Boolean functions and is derived using drift analysis. They also show that for large values of r, no upper bound for the runtime of the (1 + 1)-EA for linear functions on {0, …, r}^n can be obtained with this approach, nor with any other approach based on drift analysis with weight-independent linear potential functions.
In [83], the authors investigate the performance of the (1 + 1)-EA on the maximum independent set problem (MISP) from a theoretical point of view. They show that the (1 + 1)-EA can obtain an approximation ratio of (Δ + 1)/2 on this problem in expected time O(n^4), where Δ and n denote the maximum vertex degree and the number of nodes in the graph, respectively. They reveal that the (1 + 1)-EA has better performance than the local search algorithm on an instance of MISP and demonstrate that the local search algorithm with a 3-flip neighborhood will be trapped in a local optimum, while the (1 + 1)-EA can find the global optimum in expected running time O(n^4).
In [84], the authors theoretically investigate the approximation performance of the (1 + 1)-EA, on the minimum degree spanning tree (MDST) problem which is a classical NP-hard optimization problem and show its capability of obtaining an approximate solution for the MDST problem with a limited maximum degree in expected polynomial time. In [85], the authors analyze the expected runtime of the (1 + 1)-EA solving robust linear optimization problems (i.e., linear problems under robust scenarios) with a cardinality constraint. They consider two common robust scenarios, i.e., deletion-robust and worst-case, and disclose the potential of (1 + 1)-EAs for robust optimization.
An interesting approach was proposed in [86]. The authors propose a (1 + 1)-fast evolutionary algorithm ((1 + 1)-FEA) with an original approach to the adjustment of the mutation rate. This algorithm uses a random variable λ ∈ {1, …, n} which takes its values in accordance with some distribution D. The mutation operator of this (1 + 1)-FEA generates an instance λ′ of λ in accordance with this distribution and applies the mutation with the mutation rate λ′ to the current solution (replacing λ′ bits with random values); if the newly obtained solution improves the objective function value, then it replaces the current solution, and the probability distribution D is corrected so that the probability of generating the current value λ′ increases. The worst-case estimate of the (1 + 1)-FEA for any black-box function is significantly smaller than that of the original (1 + 1)-EA with a random mutation rate.
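A minimal sketch of this self-adjusting mechanism, as we read it from the description above: λ is sampled from a distribution D over {1, …, n}, and the probability of the used λ is boosted after an improvement. The boost factor and the data structure are our assumptions.

```cpp
// Sketch of a self-adjusting mutation-rate distribution D over {1..n}.
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

struct AdaptiveRate {
    std::vector<double> w;                       // unnormalized weights of D
    explicit AdaptiveRate(std::size_t n) : w(n, 1.0) {}
    std::size_t sample(std::mt19937& rng) {      // draw λ' in proportion to D
        std::discrete_distribution<std::size_t> d(w.begin(), w.end());
        return d(rng) + 1;                       // λ' in {1..n}
    }
    void reward(std::size_t lambda) {            // call after an improvement
        w[lambda - 1] *= 1.1;                    // boost factor: our assumption
        double s = std::accumulate(w.begin(), w.end(), 0.0);
        for (double& x : w) x /= s;              // renormalize
    }
};
```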

1.3. Research Gap and Our Contribution

Many important practical problems require obtaining a solution to a problem with a minimum error. For the p-median problem, such cases may include optimal location problems with a high cost of error, determined, for example, by the cost of transportation between sites. If the p-median problem is considered as a mathematical statement for clustering problems, it may be necessary to obtain a solution that would be difficult to improve by known methods without a multiple increase in the computation time. When comparing the accuracy of various algorithms, we need a reference solution which is not necessarily the exact global optimum of the problem, but at the same time is the best known solution. In this work, aimed at obtaining the most accurate solutions, we understand the accuracy of the algorithm exclusively as the achieved value of the objective function.
Algorithms based on the use of greedy agglomerative procedures, despite their computational cost and the lack of guarantees of an exact result (or even a result of predetermined accuracy), enable one to obtain solutions to practical p-median problems that are difficult to improve by other known methods in comparable time [71,77]. Such algorithms are extremely sensitive to the parameter r of these procedures, which determines the number of centers (medians) added to the intermediate solution [77]. In the field of (1 + 1)-evolutionary algorithms for pseudo-Boolean optimization problems, there are known approaches that can adjust a similar mutation parameter that determines the number of replaced bits in the current solution.
The work hypotheses of this article are as follows:
  • The efficiency of the greedy agglomerative procedure applied to successive improvement of the p-median problem solution embedded into a more complex algorithm, such as evolutionary algorithm, highly depends on its parameter r (a number of excessive centers to be eliminated), and this dependence is hardly predictable.
  • The principle of adjusting the numerical parameter of the mutation operator by changing the probability distribution of its values, which is used in (1 + 1)-evolutionary algorithms with 0–1 coding of solutions for pseudo-Boolean optimization problems, can also be effectively applied to adjust the numerical parameter of the agglomerative mutation operator based on the use of agglomerative procedures in an evolutionary algorithm with real coding of solutions when solving the p-median problems.
We propose a new algorithm based on the ideas of known search algorithms with greedy agglomerative procedures and (1 + 1)-evolutionary algorithms which successively improves the current solution with the use of greedy agglomerative procedures with the randomly chosen value of parameter r and simultaneously adjusts the probability distribution of parameter r in such procedures.
This article is an extended version of the paper presented at the International Workshop on Mathematical Models and their Applications (IWMMA’2020, 16–18 November 2020, Krasnoyarsk, Russia) [79].
The rest of this article is organized as follows. In Section 2, we describe known algorithms for the p-median problem, including the algorithms with greedy agglomerative procedures, and investigate the dependence of their efficiency on the parameter r. We propose a massively parallel (CUDA) version of the greedy agglomerative procedures. In addition, using the similarities between local search algorithms and (1 + 1)-evolutionary algorithms, as well as the ability of the latter to adapt their search parameters, we propose a new algorithm based on the use of greedy agglomerative procedures with an automatically tuned parameter r. In Section 3, we describe the results of our computational experiments on various datasets with up to 2 million demand points. The experiments demonstrate the advantage of our new algorithm in comparison with known ones. In Section 4, we discuss the applicability of the new algorithm to practical problems and propose possible directions for the further development of such algorithms. In Section 5, we give a short conclusion.

2. Methods

2.1. The Basic Algorithmic Approaches

Based on the algorithm proposed by Lloyd [75] for discrete p-median problems on a network, the Alternate Location-Allocation (ALA) procedure [87,88], also known as the standard k-means procedure [76], is the most popular algorithm for the k-means and p-median problems. This simple algorithm can be considered a special case of the similar Expectation Maximization (EM) procedure [89,90,91,92]. The ALA procedure sequentially improves an intermediate solution, enabling us to find a local minimum of the objective Function (1).
In terms of continuous optimization, the ALA procedure is not a true local search algorithm, since it searches for a new solution not necessarily in the ε-neighborhood of the existing solution. However, it usually outperforms simpler local optimization methods such as gradient or coordinate descent.
In the case of a p-median problem, Algorithm 1 solves the Weber problem (i.e., the center search problem, or 1-median problem) for each cluster. The iterative Weiszfeld procedure gives an approximate solution to the Weber problem with a given accuracy $\varepsilon_1$. At each subsequent iteration, the coordinates of the new solution (center) $X'$ of a subset $S' = \{A'_1, \dots, A'_{N'}\} \subseteq \{A_1, \dots, A_N\}$ of $N'$ demand points are calculated from the previous solution $X$ as follows:
$X' \gets \left[\sum_{i=1}^{N'} \frac{A'_i}{L(X, A'_i)}\right] \Big/ \left[\sum_{i=1}^{N'} \frac{1}{L(X, A'_i)}\right]$.  (4)
Algorithm 1 ALA (): Alternate Location-Allocation (Lloyd’s procedure)
Require: Set S of p initial centers S = { X 1 , , X p } .
1. For each center $X_i$, $i = \overline{1,p}$, define its cluster $C_i \subseteq \{A_1, \dots, A_N\}$ as the subset of demand points whose closest center is $X_i$.
2. For each cluster C i , i = 1 , p ¯ , recalculate its center X i , i.e., solve the Weber problem:
$X_i \gets \arg\min_{X} \sum_{Y \in C_i} L(X, Y)$.
3. Repeat from Step 1 if Steps 1, 2 made any changes.
These calculations are repeated until $L(X, X') < \varepsilon_1$. Usually, the Weiszfeld procedure converges quickly. For the majority of problems, no more than 50 iterations are required to reach the accuracy limits of the double data type, except in the cases where the solution coincides with one of the demand points. Such situations are easily bypassed:
$X' \gets \left[\sum_{i \in \overline{1,N'},\, L(X, A'_i) > \varepsilon_2} \frac{A'_i}{L(X, A'_i)}\right] \Big/ \left[\sum_{i \in \overline{1,N'},\, L(X, A'_i) > \varepsilon_2} \frac{1}{L(X, A'_i)}\right]$.  (5)
Here, ε 2 < ε 1 is a minimum distance at which two points are considered different.
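A sketch of one Weiszfeld update (4) with the bypass (5) might look as follows: demand points closer than ε2 to the current center are excluded from the weighted sums, and the caller repeats the step until L(X, X′) < ε1. The Point type and the Euclidean dist() helper from the crossover sketch above are reused; the function name is ours.

```cpp
// One Weiszfeld iteration for the Weber problem of a single cluster;
// reuses Point and dist() from the earlier sketch.
Point weiszfeldStep(const Point& x, const std::vector<Point>& cluster, double eps2) {
    Point num(x.size(), 0.0);                    // numerator: sum of A_i / L(X, A_i)
    double den = 0.0;                            // denominator: sum of 1 / L(X, A_i)
    for (const Point& a : cluster) {
        double L = dist(x, a);
        if (L <= eps2) continue;                 // bypass (5): skip coincident points
        for (std::size_t k = 0; k < x.size(); ++k) num[k] += a[k] / L;
        den += 1.0 / L;
    }
    if (den == 0.0) return x;                    // all points coincide with X
    for (double& v : num) v /= den;
    return num;                                  // the new estimate X'
}
```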
Embedding the Weiszfeld iterative procedure in the iterative ALA algorithm, which, in turn, is embedded in more complex algorithms, makes such algorithmic constructions difficult to use efficiently for large-scale problems. Steps 1 and 2 of the ALA algorithm are both aimed at the reduction of the objective function. Knowing the coordinates of the centers $X_i$ and without changing them, Step 1 redistributes the demand points $A_1, \dots, A_N$ among the centers (i.e., forms clusters of demand points around the centers) so that the sum of the distances from each demand point to its nearest center becomes minimal. Step 1 thus solves a simple discrete optimization problem. Step 2 solves a series of continuous optimization problems (Weber problems) which minimize the total distance from the center $X_i$ to the demand points of its cluster. Each iteration of the Weiszfeld procedure (4) is also aimed at a gradual improvement of the current solution X. For a gradual improvement of solutions, the Weber problem does not have to be solved to the given accuracy $\varepsilon_1$ at each iteration of the ALA algorithm. Thus, Steps 2 and 3 can take the following form:
Step 2: For each cluster $C_i$, $i = \overline{1,p}$, correct its center $X_i$ with one iteration of the Weiszfeld procedure:
$X_i \gets \left[\sum_{Y \in C_i,\, L(X_i, Y) > \varepsilon_2} \frac{Y}{L(X_i, Y)}\right] \Big/ \left[\sum_{Y \in C_i,\, L(X_i, Y) > \varepsilon_2} \frac{1}{L(X_i, Y)}\right]$.  (6)
Step 3: Repeat from Step 1 if Steps 1 and 2 moved at least one of the centers $X_i$ by a distance exceeding $\varepsilon_1$.
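The modified loop can then be sketched as follows, reusing dist() and weiszfeldStep() from the sketches above. This is an illustration of Steps 1–3 under our naming, not the authors' implementation.

```cpp
// Compact ALA loop sketch: assignment (Step 1) alternates with a single
// Weiszfeld iteration per center (Step 2) until no center moves more than eps1.
void alaLoop(std::vector<Point>& centers, const std::vector<Point>& demand,
             double eps1, double eps2) {
    bool moved = true;
    while (moved) {                              // Step 3: repeat while changes occur
        moved = false;
        // Step 1: cluster demand points by their nearest center
        std::vector<std::vector<Point>> clusters(centers.size());
        for (const Point& a : demand) {
            std::size_t best = 0;
            for (std::size_t j = 1; j < centers.size(); ++j)
                if (dist(centers[j], a) < dist(centers[best], a)) best = j;
            clusters[best].push_back(a);
        }
        // Step 2: one Weiszfeld iteration per center
        for (std::size_t j = 0; j < centers.size(); ++j) {
            Point next = weiszfeldStep(centers[j], clusters[j], eps2);
            if (dist(next, centers[j]) > eps1) moved = true;
            centers[j] = next;
        }
    }
}
```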
The essence of the VNS [42] is as follows: for some intermediate solution, we determine a set of neighborhoods of this solution. From this set, the next type of neighborhood is selected, and we apply the corresponding local search algorithm for searching in this neighborhood. If this algorithm finds an improved solution, the intermediate solution is replaced by the new solution, and the search continues in the same neighborhood. If the local search algorithm cannot improve the solution, a new search neighborhood is selected from the set of neighborhoods. The stop criterion is a time limitation.
Several algorithms use only two types of neighborhoods. The first is the ε-neighborhood (i.e., at this step, a continuous optimization problem is solved); the second is the SWAP neighborhood, a popular tool for solving p-median and k-means problems. The j-means algorithm [93], which uses these neighborhoods, is one of the most accurate methods for solving such problems. The essence of the search in SWAP neighborhoods is as follows. Let S = {X1, …, Xp} be a local optimum of the p-median problem in the ε-neighborhood (such a solution can be obtained by the ALA algorithm or other algorithms such as gradient descent). The search algorithm makes an attempt to replace one of the medians X1, …, Xp with one of the data vectors A1, …, AN. If the value of the objective function improves due to this replacement, the search continues in the ε-neighborhood of the newly obtained solution (the continuous optimization problem is solved again). The search in SWAP neighborhoods can be regular (all possible replacements are enumerated) or randomized (the medians and demand points are picked for replacement at random).
Let us denote by SWAPr a neighborhood (set of solutions) of a current solution obtained by the replacement of r medians by demand points.
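A single randomized SWAP1 move can be sketched as follows, reusing the Solution type and the F() and improve() placeholders introduced in the crossover sketch; the function name is ours.

```cpp
// Sketch of one randomized SWAP1 move: replace a random median with a random
// demand point, re-optimize locally, keep the result only if (1) improves.
#include <random>

bool randomizedSwapMove(Solution& s, const std::vector<Point>& demand,
                        std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pickC(0, s.size() - 1);
    std::uniform_int_distribution<std::size_t> pickA(0, demand.size() - 1);
    Solution t = s;
    t[pickC(rng)] = demand[pickA(rng)];   // the swap itself
    t = improve(t);                       // descend to a local optimum
    if (F(t, demand) < F(s, demand)) { s = t; return true; }
    return false;                         // reject: keep the current solution
}
```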
In our computational experiments (described in detail in Section 3) on various problems from repositories [94,95], we investigated the dependence of the efficiency (ability to reach the minimum values of the objective functions) of the search in SWAPr neighborhood on the parameter r. The results obtained after the fixed runtime are given in Figure 1. Obviously, these results are highly dependent on r.

2.2. Greedy Agglomerative Procedures

The agglomerative approach to solving the p-median problem is often successful [70]. To solve the p-median problem in a continuous space, the authors of [60] use genetic algorithms with various crossover operators based on greedy agglomerative procedures. Alp, Erkut, and Drezner in [70] presented a genetic algorithm for facility location problems, where evolution is facilitated by a greedy agglomerative heuristic procedure. A genetic algorithm with a faster greedy heuristic procedure for clustering and location problems was also proposed in [71].
Greedy agglomerative procedures can be used as independent algorithms or embedded in VNS or evolutionary algorithms [52,96]. Such procedures can be described as follows in Algorithm 2:
Algorithm 2 BasicAggl (S)
Require: Set of initial centroids S = {X1, …, XK}, |S| = K > k, and the required final number of centroids k.
1. Improve S with the two-step local search algorithm if possible;
2. while |S| > k do
3.    for $i = \overline{1, |S|}$ do
         $F_i \gets F(S \setminus \{X_i\})$;
      end for;
4.    Select a subset S′ ⊂ S of $r_{toremove}$ centroids with the minimum values of the corresponding variables $F_i$; // by default, $r_{toremove}$ = 1
5.    S ← S \ S′; improve this new solution with the two-step local search algorithm;
6. end while.
To improve the performance of such a procedure, the number of simultaneously eliminated centers can be calculated as $r_{toremove} = \max\{1, \lfloor (|S| - p) \cdot r_{coef} \rfloor\}$. In [52,97], the authors used the elimination coefficient value $r_{coef}$ = 0.2. This means that at each iteration, up to 20% of the excessive centers are eliminated; such values have been shown to make the algorithm faster. The schedule implied by this formula is traced in the sketch below.
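A few lines of code make the schedule concrete; with r_coef = 0.2, the number of removed centers shrinks as |S| approaches p. The starting value |S| = 2p is our arbitrary choice for the trace.

```cpp
// Trace of the elimination schedule r_toremove = max{1, floor((|S| - p) * r_coef)}.
#include <algorithm>
#include <cmath>
#include <cstdio>

int main() {
    const double r_coef = 0.2;
    int S = 60, p = 30;                 // |S| shrinks from 2p down to p
    while (S > p) {
        int r_toremove = std::max(1, (int)std::floor((S - p) * r_coef));
        std::printf("|S| = %d, removing %d\n", S, r_toremove);
        S -= r_toremove;                // large batches first, then one at a time
    }
}
```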
The agglomerative procedure for obtaining an AGGLr neighborhood can be defined as follows in Algorithm 3:
Algorithm 3 AGGLr (S, S2)
  Require: Two sets of centers S and S2, |S| = |S2| = p, and the number r of centers of the solution S2 used to obtain the resulting solution, r ∈ {1, …, p}.
  for $i = \overline{1, n_{repeats}}$ do
     1. Randomly select a subset S′′ ⊂ S2, |S′′| = r;
     2. S′ ← BasicAggl(S ∪ S′′); // Algorithm 2 with k = p
     3. if F(S′) < F(S) then
        S ← S′
     end if;
  end for;
  return S.
Such procedures use various values of r ∈ {1, …, p}, and $n_{repeats}$ depends on r: $n_{repeats} = \max\{1, \lfloor p/r \rfloor\}$.
If the solution S2 is fixed, then all possible results of applying the AGGLr(S, S2) procedure form a neighborhood of the solution S, with S2 and r as parameters of this neighborhood. If S2 is a randomly chosen locally optimal solution obtained by the ALA(S2′) procedure applied to a randomly chosen subset $S'_2 \subset \{A_1, \dots, A_N\}$, $|S'_2| = p$, then we deal with a randomized neighborhood.
Let us denote such a neighborhood by AGGLr (S). Our experiments demonstrate that the obtained result of the local search in AGGLr neighborhoods strongly depends on r (see Figure 2).
As mentioned above, the greedy agglomerative procedure AGGLr can be used as the crossover operator of the genetic algorithms. Algorithms proposed in [52,71] use such procedures with r = 1 and r = p for solving the k-means and p-median problems.
As a rule, the VNS algorithms move from neighborhoods of lower cardinality to wider neighborhoods. For instance, in [52], the authors propose a sequential search in the neighborhoods AGGL1 → AGGLrandom → AGGLp → AGGL1 → … Here, AGGLrandom is a neighborhood with a randomly selected r ∈ {2, …, p − 1}. In this case, the initial neighborhood type has a strong influence on the result [52]. Figures 1 and 2 show that the dependence of the obtained objective function value on r is complex. For specific problems, there may exist specific values of the parameter r which enable us to obtain better results, and these values are hardly predictable. Nevertheless, a search algorithm may collect information on the most efficient values of r during its runtime.

2.3. CUDA Implementation of Greedy Agglomerative Procedures

Compute Unified Device Architecture (CUDA) is a hardware-software parallel computing architecture designed for the effective use of non-graphical computing on graphics processing units (GPUs); it can significantly increase computational performance [98]. A GPU is a set of multiprocessors that simultaneously execute parallel threads grouped into blocks. The threads execute the same instructions on different data in parallel. Communication and synchronization between blocks are impossible.
A thread is assigned an identifier (threadIdx.x) depending on its position in the block, and a block identifier (blockIdx.x) is also assigned to a thread depending on its position in the grid. The thread and block identifiers are available at run time, which allows us to set specific memory access patterns. Each thread in the GPU performs the same procedure, known as the kernel [98].
CUDA versions of algorithms for solving discrete p-median problems [99,100,101] (which actually solve rather small discrete problems from the OR (Operations Research) Library), as well as parallel implementations of the ALA algorithm for the k-means problem, are described in the modern literature [74,98,102]. In our research, we take into account the specificity of continuous problems.
For large amounts of data, calculating the distances $L(X_i, A_j)$ from demand points to cluster centers is the most computationally expensive part of the ALA algorithm. The distances must be calculated to all centers, since the algorithm does not know in advance which center is the nearest one. This approach, traditionally used in solving the k-means problem, can be accelerated by using the triangle inequality [103,104] in the case of p-median problems: if the displacement of the i-th center relative to the previous iteration was $\Delta X_i = L(X_i, X'_i)$, and its distance to the j-th demand point was $L_{ij} = L(X_i, A_j)$, then the new distance lies within $L_{ij} \pm \Delta X_i$. If there exists $i^* \in \overline{1,p}$ such that
$L_{i^*j} + \Delta X_{i^*} < L_{ij} - \Delta X_i$,  (7)
then for the new distances, $L(X'_{i^*}, A_j) < L(X'_i, A_j)$, i.e., the i-th center cannot be the nearest one to $A_j$. Knowing all distances $L_{ij}$ from the previous iteration and the shifts of the centers $X_i$, we can avoid calculating a significant part of the distances at each iteration, which is especially important when processing multidimensional data. Nevertheless, for large-scale problems, the matrix $\{L_{ij}\}$ requires a significant additional amount of memory ($p \cdot N$ values). Moreover, checking condition (7) for each pair $(A_j, X_i)$, although it requires significantly less computation than the distance calculation
$L_{ij} = L(A_j, X_i) = \sqrt{\sum_{l=1}^{d} (a_{j,l} - x_{i,l})^2}$,  (8)
will not accelerate the execution of the algorithm in the case of its CUDA implementation, which assumes strictly simultaneous execution of all instructions by all threads. Therefore, in this work, we do not use the triangle inequality to accelerate the CUDA version of the ALA algorithm, although more complex CUDA implementations could probably take advantage of it.
Note that the distances (8) are calculated both in Step 1 of Algorithm 1 and in its Step 2, which includes the iteration of the Weiszfeld procedure (4).
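Condition (7) itself is a one-line test. The sketch below shows the pruning decision for a single demand point j; Lij and dX hold the previous-iteration distances and center displacements, and the array layout is our assumption.

```cpp
// Pruning test (7): if the lower bound for center i* already beats the upper
// bound for center i, point j cannot belong to center i, so its distance to
// center i need not be recomputed.
#include <vector>

bool canSkipCenter(const std::vector<double>& Lij,  // L[i] = L(X_i, A_j), previous iteration
                   const std::vector<double>& dX,   // dX[i] = L(X_i, X'_i), displacement
                   std::size_t i, std::size_t iStar) {
    return Lij[iStar] + dX[iStar] < Lij[i] - dX[i]; // condition (7)
}
```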
We used the following CUDA implementation of Step 1 of Algorithm 1. The step is divided into two parts. First, the variables common to all threads are initialized (Algorithm 4):
Algorithm 4 CUDA implementation of Step 1 in the ALA algorithm, part 1 (initialization)
$X'_j \gets 0 \;\; \forall j = \overline{1,p}$ // $X'_j$ are d-dimensional vectors used for the recalculation of the centers.
$counter_j \gets 0 \;\; \forall j = \overline{1,p}$ // counters of the objects assigned to each center.
$D_j \gets 0 \;\; \forall j = \overline{1,p}$ // scalars used to accumulate $\sum_{i:\, C_i = j,\, L(X_j, A_i) > \varepsilon_2} 1/L(X_j, A_i)$.
$D'_j \gets 0 \;\; \forall j = \overline{1,p}$ // d-dimensional vectors used to accumulate $\sum_{i:\, C_i = j,\, L(X_j, A_i) > \varepsilon_2} A_i/L(X_j, A_i)$.
For the rest of the CUDA implementation of Step 1 of the ALA algorithm, we used $N_{threads}$ = 512 threads for each CUDA block. The number of blocks is calculated as
$N_{blocks} = (N + N_{threads} - 1) / N_{threads}$.  (9)
Thus, each thread processes a single demand point in Algorithm 5:
Algorithm 5 CUDA implementation of Step 1 in the ALA algorithm
i ← blockIdx.x × blockDim.x + threadIdx.x
if i > N then
return
end if
j ← $\arg\min_{j' = \overline{1,p}} \|A_i - X_{j'}\|$ // index of the nearest center for the i-th data point
d ← $\min_{j' = \overline{1,p}} \|A_i - X_{j'}\|$; $C_i$ ← j // assign $A_i$ to center j
D ← 1/d; $D_j$ ← $D_j$ + D; $D'_j$ ← $D'_j$ + $A_i$ · D; $counter_j$ ← $counter_j$ + 1
Synchronize threads.
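A CUDA C++ sketch of this kernel is shown below. It assumes row-major d-dimensional arrays and a GPU with native double-precision atomicAdd (compute capability 6.0+, which covers the GTX 1050 Ti used in our experiments); the shared accumulators D, D′, and counter must be updated atomically, a detail the pseudocode leaves implicit. The ε2 bypass from (5) is omitted for brevity, and all names are ours.

```cpp
// CUDA sketch of the Step 1 kernel (Algorithm 5): one thread per demand point.
#include <cmath>

__global__ void assignStep(const double* A, const double* X, int N, int p, int d,
                           int* C, double* D, double* Dp, int* counter) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= N) return;                           // guard (0-indexed)
    int best = 0;
    double bestDist = 1e300;
    for (int j = 0; j < p; ++j) {                 // full scan: nearest center
        double s = 0.0;
        for (int l = 0; l < d; ++l) {
            double diff = A[i * d + l] - X[j * d + l];
            s += diff * diff;
        }
        double dist = sqrt(s);
        if (dist < bestDist) { bestDist = dist; best = j; }
    }
    C[i] = best;                                  // assign A_i to its center
    double w = 1.0 / bestDist;                    // Weiszfeld weight 1/L(X_j, A_i)
    atomicAdd(&D[best], w);                       // D_j  += 1/L
    for (int l = 0; l < d; ++l)
        atomicAdd(&Dp[best * d + l], A[i * d + l] * w); // D'_j += A_i / L
    atomicAdd(&counter[best], 1);
}

// Launch configuration per (9):
// assignStep<<<(N + 511) / 512, 512>>>(A, X, N, p, d, C, D, Dp, counter);
```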
The parallel implementation of Step 2 of the ALA algorithm (see Algorithm 6) uses $N_{threads}$ = 512 threads for each CUDA block. The number of blocks is $N_{blocks2} = (p + N_{threads} - 1) / N_{threads}$. The variable changed, common to all threads, is initialized to 0.
Algorithm 6 CUDA implementation of Step 2 in the ALA algorithm
j ← blockIdx.x × blockDim.x + threadIdx.x
if j > p then
return;
end if;
$X'_j$ ← $D'_j / D_j$;
if $L(X_j, X'_j) > \varepsilon_2$ then
changed ← 1;
end if;
$X_j$ ← $X'_j$;
Synchronize threads.
Variable changed is used at the last step of the ALA algorithm and signals the need to continue the iterations.
In addition, we implemented Step 3 of the BasicAggl algorithm on the GPU. At this step, Algorithm 2 calculates the sum of the distances after the removal of a single center with index i: $F_i = F(S \setminus \{X_i\})$. Knowing the objective function value $F(S)$ for the set S of centers, we can calculate its new value as
$F_i \gets F(S \setminus \{X_i\}) = F(S) + \sum_{l=1}^{N} \Delta D_l$,  (10)
where
$\Delta D_l = \begin{cases} 0, & C_l \ne i, \\ \min_{j \in \overline{1,p},\, j \ne i} L(A_l, X_j) - L(A_l, X_{C_l}), & C_l = i. \end{cases}$  (11)
Here, $C_l$ is the index of the center to which the demand point $A_l$ is assigned (its nearest center). We also used 512 threads per block; the number of blocks is calculated according to (9).
After the initialization of the common variable Dsum ← 0, Algorithm 7 is started in a separate thread for each data point:
Algorithm 7 CUDA implementation of Step 3 in Algorithm 2
l ← blockIdx.x × blockDim.x + threadIdx.x;
if l > N then
return;
end if;
Calculate $\Delta D_l$ in accordance with (11);
if $\Delta D_l$ > 0 then
Dsum ← Dsum + $\Delta D_l$;
end if;
Synchronize threads.
After running this algorithm in parallel, $F_i$ is calculated in accordance with (10) at Step 3 of Algorithm 2: $F_i \gets F(S) + Dsum$.
The remaining parts of our algorithms, including selection, are implemented on the central processing unit and do not have any significant effect on the calculation speed. This CUDA implementation of the ALA algorithm enables us to use both the ALA algorithm and the more complex algorithms that include it on large amounts of data, up to millions of demand points.

2.4. New Algorithm

As in local search algorithms, and in accordance with the general idea of (1 + 1)-EAs, our new algorithm successively improves a single intermediate solution. To improve it, the AGGLr procedure (see Algorithm 3) is applied to it, with an additional intermediate solution, randomly generated and improved by the ALA procedure, as the second parameter S2. A similar idea was applied to the k-means problem in VNS algorithms with randomized neighborhoods [54], which we included in the list of algorithms for comparison. The key feature of the new algorithm is that it neither fixes the value of the parameter r nor generates r values with equal probability; instead, it adjusts the generation probability of each possible r value in accordance with the results obtained. If the use of the AGGLr procedure has resulted in an improvement of the objective function value, our algorithm increases the probability $P_r$ for the used value of r and for values close to it.
The new algorithm can be described as follows in Algorithm 8.
Algorithm 8 Aggl-EA ()
Randomly select a subset S ⊂ {A1, …, AN}, |S| = p; S ← ALA(S); assign $P_i \gets 1/p \;\; \forall i = \overline{1,p}$;
repeat
  randomly select a subset S2 ⊂ {A1, …, AN}, |S2| = p; S2 ← ALA(S2);
  in proportion to the values of the probabilities $P_i$, $i = \overline{1,p}$, choose the value of r ∈ {1, …, p};
  S3 ← AGGLr(S, S2);
  if F(S3) < F(S) then
     $P_i \gets k_1 \cdot P_i \;\; \forall i = \overline{r/k_2, \min\{p, k_2 \cdot r\}}$; // we used k1 = 1.1 and k2 = 1.5
     $P_i \gets P_i / \sum_{j=1}^{p} P_j \;\; \forall i = \overline{1,p}$; // probability normalization
     S ← S3;
  end if;
until the time limitation is reached;
return S.
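The core loop of Algorithm 8 can be sketched as follows. ALA(), AGGLr(), and randomCenters() are placeholders for Algorithms 1 and 3 and for random initialization; the Solution type and F() come from the earlier sketches. The constants k1 = 1.1 and k2 = 1.5 are those used in our experiments; the rounding of the boosted index range and the fixed iteration budget standing in for the time limitation are our assumptions.

```cpp
// Sketch of the Aggl-EA probability-adjustment loop (Algorithm 8).
#include <algorithm>
#include <cmath>
#include <numeric>
#include <random>
#include <vector>

Solution ALA(Solution s, const std::vector<Point>& demand);          // Algorithm 1
Solution AGGLr(const Solution& s, const Solution& s2, std::size_t r,
               const std::vector<Point>& demand, std::mt19937& rng); // Algorithm 3
Solution randomCenters(std::size_t p, const std::vector<Point>& demand,
                       std::mt19937& rng);        // p random demand points

Solution agglEA(std::size_t p, const std::vector<Point>& demand,
                std::mt19937& rng, int maxIters) {
    Solution s = ALA(randomCenters(p, demand, rng), demand);
    std::vector<double> P(p, 1.0 / p);            // probabilities for r = 1..p
    for (int t = 0; t < maxIters; ++t) {
        Solution s2 = ALA(randomCenters(p, demand, rng), demand);
        std::discrete_distribution<std::size_t> pick(P.begin(), P.end());
        std::size_t r = pick(rng) + 1;            // choose r in proportion to P
        Solution s3 = AGGLr(s, s2, r, demand, rng);
        if (F(s3, demand) < F(s, demand)) {
            const double k1 = 1.1, k2 = 1.5;      // reward r and its neighbors
            std::size_t lo = std::max<std::size_t>(1, (std::size_t)std::ceil(r / k2));
            std::size_t hi = std::min<std::size_t>(p, (std::size_t)std::floor(r * k2));
            for (std::size_t i = lo; i <= hi; ++i) P[i - 1] *= k1;
            double sum = std::accumulate(P.begin(), P.end(), 0.0);
            for (double& x : P) x /= sum;         // normalize
            s = s3;
        }
    }
    return s;
}
```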

3. Results of Computational Experiments

In all our experiments, we used classic datasets from the UCI Machine Learning and Clustering basic benchmark repositories [94,95,105]: (a) Individual Household Electric Power Consumption (IHEPC): energy consumption data of households over several years, more than 2 million demand points (data vectors) in $R^2$, 0–1 normalized, with the “date” and “time” columns removed; (b) BIRCH3: 100 groups of points of random size, 100,000 demand points in $R^2$; (c,d) S1 and S4 datasets: Gaussian clusters with cluster overlap (5000 demand points in $R^2$ each); (e) Mopsi-Joensuu: geographic locations of users (6014 data vectors, 2 dimensions) in the city of Joensuu; (f) Mopsi-Finland: geographic locations of users (13,467 data vectors, 2 dimensions) in Finland.
For our computational experiments, we used the following test system: Intel Core 2 Duo E8400 CPU, 16 GB RAM, and an NVIDIA GeForce GTX 1050 Ti GPU with 4096 MB RAM and a floating-point performance of 2138 GFLOPS. This choice of GPU hardware was made due to its prevalence and its favorable price/performance ratio. The program code was written in C++. We used the Visual C++ 2017 compiler embedded into Visual Studio v.15.9.5, NVIDIA CUDA 10.0 Wizards, and NVIDIA Nsight Visual Studio Edition CUDA Support v.6.0.0.
We examined the following algorithms: (a) Lloyd: the ALA algorithm in the multi-start mode; (b) j-means: j-means algorithm (regular search in SWAP1 neighborhood in combination with the ALA algorithm) in the multi-start mode; (c) AGGLr: randomized search in the AGGLr neighborhood, r = 1 , p ¯ ; (d) SWAPr: local search in SWAPr neighborhoods, r = 1 , p ¯ (only the best result for r = 1 , p ¯ is given); (e–g) GH-VNS1, GH-VNS2, GH-VNS3: Variable Neighborhood Search algorithms with neighborhoods formed by application of AGGLr() procedure, see [52]; (h) GA-1POINT: genetic algorithm with a standard 1-point crossover; (i) GA-UNIFORM: genetic algorithm with a standard 1-point crossover and uniform random mutation [56]; (j) Aggl-EA: our new algorithm.
For all datasets, 30 attempts were made to run each of the algorithms (see Table 1 and Table A1, Table A2, Table A3, Table A4, Table A5, Table A6, Table A7, Table A8, Table A9, Table A10, Table A11, Table A12, Table A13, Table A14 and Table A15 in Appendix A). In Table 1, we present the results of our new algorithm Aggl-EA() and the best of listed known algorithms, i.e., known algorithms which provided the best average or median values of the objective function (1) after 30 runs.
For the smallest and simplest test problems (see Table A1 and Table A4 in Appendix A), several algorithms including Aggl-EA() and search in the AGGL neighborhoods resulted in the same objective function value (standard deviation of the result is 0.0) which is probably the global minimum. Nevertheless, the Aggl-EA() algorithm outperforms both genetic algorithms and simplest local search algorithms such as Lloyd() or j-means which do not reach this minimum value in all attempts. For more complex test problems, the comparative efficiency varies. However, there were no test problems for which our new algorithm demonstrated a statistically significant disadvantage in comparison with the best of known algorithms.
The best (minimum), worst (maximum), average, and median values of the achieved objective function were computed over 30 runs of each algorithm with a fixed time limitation. The best average and median values of the objective Function (1) are underlined. We compared our new Aggl-EA() algorithm with the known algorithm having the best median or average results. The significance of the advantage/disadvantage of the Aggl-EA() algorithm was estimated with the t-test [106,107] and the non-parametric Wilcoxon rank sum test [108,109].
In the analysis of algorithm efficiency, researchers often estimate the computational expenses as the number of objective function calculations. The ALA procedure is the most computationally expensive part of all examined algorithms, and within it, the objective Function (1) is never calculated directly. The program code profiling results show that Step 2 of Algorithm 1 and Step 3 of Algorithm 2 occupy more than 99% of the computational resources (processor time); these steps were implemented on the GPU. Moreover, we use the same implementation of the ALA procedure and the BasicAggl() algorithm for all examined algorithms. Thus, the consumed time could be proportional to the number of iterations of the ALA algorithm (its Step 2) and of the most time-consuming iterations of Algorithm 2 (its Step 3). However, the time consumption of these iterations depends on the number of centers, which successively decreases during the work of the BasicAggl() algorithm. Thus, for the p-median problems, we used the astronomical (wall-clock) time as the most informative unit of computational expense.
The time limitation plays an important role: the fastest algorithms may stop converging after several iterations and, vice versa, the slowest algorithms may continue their slow convergence and outperform the fastest ones if we enlarge their time limits. However, our new algorithm demonstrates its comparative advantage within wide time intervals (see Figure 3).

4. Discussion

Genetic algorithms and Variable Neighborhood Search are algorithmic frameworks useful for creating efficient solvers for various problems, including location problems such as the p-median problem. The greedy agglomerative approach can be efficiently used as the crossover genetic operator [58,70,71,102] as well as in the VNS [52]. However, as our experiments show (Appendix A), the correct choice of the parameter r in such procedures plays the most important role, and simple search algorithms with a constant value of r may outperform more complex VNS algorithms if r is tuned correctly.
Unlike the majority of evolutionary algorithms, the (1 + 1)-evolutionary algorithms, as well as local search algorithms, focus on the successive improvement of a single intermediate solution. The successful application of various local search algorithms, such as the SWAP search, j-means, or various VNS algorithms, to the p-median problem shows that the correct choice of a neighborhood or an algorithmic procedure for the successive improvement of a single current solution can be a more profitable strategy for solving practical problems than operating with a large population of solutions. In this work, we have increased the efficiency of one of these procedures by adjusting its parameter. Nevertheless, adjusting the parameters of greedy agglomerative procedures can also enhance the capabilities of genetic algorithms with a greedy agglomerative crossover operator, which, in our opinion, is an urgent area of further research. In addition, similar algorithms can be developed for a wider range of problems, including k-means, k-medoids, and the problem of separating a mixture of probability distributions.
Our new algorithm demands significant computational resources. Nevertheless, obtaining the most accurate results, in addition to being of immediate practical importance in the case of a high cost of error, solves the problem of obtaining reference solutions with which the results of other, less computationally expensive algorithms can be compared.
As in the case of the k-means problem [74], the most complex dependence of the greedy procedure efficiency on its parameter r, and the most important advantage of our new algorithm, are observed for the largest tested problems (IHEPC dataset) as well as for the “geographic” location problems (Mopsi datasets), whose sets of demand points are formed under the influence of natural factors as well as factors associated with the development of urban infrastructure.
For the majority of test problems, our computational experiments demonstrate the advantage of our new algorithm or its approximately equal effectiveness in comparison with known algorithms. For large-scale problems, the effect of our new algorithm is more significant. Nevertheless, even with results equal to those of known algorithms, our new Aggl-EA algorithm is a more versatile tool due to its ability to adjust its parameter in accordance with its behavior. Sometimes, our new algorithm demonstrates less stable results (a higher standard deviation of the objective function), which may limit its scope or demand running it in the multi-start mode. Thus, further study of the causes and factors leading to the instability of the result is required.
We considered the use of the self-adjusted agglomerative procedures as the mutation operator of the simplest evolutionary algorithm with no true population of solutions. The efficiency of embedding similar mutation operators (without any self-adjustment) was shown in the paper [96]. Thus, the investigation of the self-adjustment features of other evolutionary operators (first of all, crossover operator of genetic algorithms) is a promising direction for the further research.

5. Conclusions

In this article, we introduced the concept of the AGGLr neighborhood based on the application of the agglomerative procedure and investigated the search efficiency in such a neighborhood depending on the parameter r. Using the similarities between local search algorithms and (1 + 1)-evolutionary algorithms, as well as the ability of the latter to adapt their search parameters, we introduced our new Aggl-EA() algorithm with an embedded greedy agglomerative procedure with an automatically tuned parameter r. Our computational experiments on large-scale and medium-scale problems demonstrate the advantages of the new algorithm in comparison with known algorithms, including algorithms based on greedy agglomerative procedures.
Our computational experiments confirmed the proposed working hypotheses and led to the following conclusions:
The agglomerative mutation operator, when used as part of an evolutionary algorithm, not only improves its solutions; it can also be efficiently used as the only evolutionary operator in a (1 + 1)-evolutionary algorithm, with results outperforming more complex genetic algorithms. Therefore, the self-adjustment capability of other evolutionary operators based on agglomerative procedures is a promising direction for further research.
The algorithm with the adjustment of the parameter r (the number of excess centers to be removed) of the agglomerative procedure shows the most impressive results when solving large problems, as well as problems of a “geographic” nature, where the set of demand points has a complex structure formed under the influence of natural factors.
Such algorithms, implemented for massively parallel systems, are capable of solving problems with several million demand points in a reasonable time, despite the high computational complexity of the embedded agglomerative procedure, the ALA algorithm, and the Weiszfeld procedure.
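As promised above, here is a minimal sketch of the greedy agglomerative procedure with parameter r, under the common scheme from the literature: r centers borrowed from a donor solution are appended to the current p centers, and the excess centers are then removed one at a time, always dropping the center whose removal increases the objective function (1) the least. The helper names and the random choice of donor centers are illustrative assumptions rather than the exact published implementation.

```python
import random

def greedy_agglomerative_sketch(centers, donor_centers, r, objective):
    """Illustrative greedy agglomerative step with parameter r.

    Assumed scheme: merge p + r centers, then greedily remove r of them,
    each time deleting the center whose removal degrades the objective
    the least; `objective` evaluates a list of candidate centers.
    """
    p = len(centers)
    merged = list(centers) + random.sample(list(donor_centers), r)
    while len(merged) > p:
        # Evaluate every single-center removal and keep the cheapest one.
        best_i = min(range(len(merged)),
                     key=lambda i: objective(merged[:i] + merged[i + 1:]))
        del merged[best_i]
    return merged
```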

Author Contributions

Conceptualization, L.K.; methodology, L.K. and I.R.; software, L.K.; validation, I.R.; formal analysis, L.K. and G.S.; investigation, I.R.; resources, L.K.; data curation, I.R.; writing—original draft preparation, L.K. and G.S.; writing—review and editing, L.K.; visualization, I.R.; supervision, L.K.; project administration, L.K.; funding acquisition, L.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by The Ministry of Science and Higher Education of the Russian Federation, project No. FEFE-2020-0013.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: http://cs.joensuu.fi/sipu/datasets/ (accessed on 21 April 2020), https://archive.ics.uci.edu/ml/index.php (accessed on 21 April 2020).

Conflicts of Interest

The authors declare no conflict of interest. The funder had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A. Detailed Results of Computational Experiments

Table A1. Comparative results for BIRCH3 dataset. 10⁵ data vectors in R², p = 30 centers, time limitation 10 s. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 4.38183 × 10⁹ | 4.67540 × 10⁹ | 4.52387 × 10⁹ | 4.51929 × 10⁹ | 7.82433 × 10⁷ |
| j-means (SWAP1 + Lloyd) | 3.56107 × 10⁹ | 5.25877 × 10⁹ | 3.92248 × 10⁹ | 3.88872 × 10⁹ | 3.09203 × 10⁸ |
| AGGL1 | 3.45692 × 10⁹ | 3.56599 × 10⁹ | 3.46697 × 10⁹ | 3.45692 × 10⁹ | 2.40890 × 10⁷ |
| AGGL2 | 3.45057 × 10⁹ | 3.45692 × 10⁹ | 3.45650 × 10⁹ | 3.45692 × 10⁹ | 1.61204 × 10⁶ |
| AGGL3 | 3.45057 × 10⁹ | 3.45692 × 10⁹ | 3.45650 × 10⁹ | 3.45692 × 10⁹ | 1.61204 × 10⁶ |
| AGGL5–20 (equal results) | 3.45692 × 10⁹ | 3.45692 × 10⁹ | 3.45692 × 10⁹ | 3.45692 × 10⁹ | 0.00000 |
| AGGL25 | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 46.7390 |
| AGGL30 | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 127.558 |
| GH-VNS1–3 (equal results) | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 0.00000 |
| SWAP1 (the best of SWAP) | 3.70386 × 10⁹ | 4.43612 × 10⁹ | 3.89665 × 10⁹ | 3.89201 × 10⁹ | 1.24246 × 10⁸ |
| GA-1POINT | 3.51505 × 10⁹ | 3.63646 × 10⁹ | 3.56546 × 10⁹ | 3.56848 × 10⁹ | 3.24395 × 10⁷ |
| GA-UNIFORM | 3.45057 × 10⁹ | 3.59155 × 10⁹ | 3.49515 × 10⁹ | 3.49778 × 10⁹ | 3.07081 × 10⁷ |
| Aggl-EA | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 0.00000 |
Note (for all tables): the best results are underlined.
Table A2. Comparative results for BIRCH3 dataset. 10⁵ data vectors in R², p = 100 centers, time limitation 10 s. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 2.27163 × 10⁹ | 2.54224 × 10⁹ | 2.44280 × 10⁹ | 2.46163 × 10⁹ | 7.42330 × 10⁷ |
| j-means (SWAP1 + Lloyd) | 1.52130 × 10⁹ | 1.88212 × 10⁹ | 1.69226 × 10⁹ | 1.67712 × 10⁹ | 8.03684 × 10⁷ |
| AGGL1 | 1.56737 × 10⁹ | 2.14554 × 10⁹ | 1.81181 × 10⁹ | 1.80137 × 10⁹ | 1.60897 × 10⁸ |
| AGGL2 | 1.49271 × 10⁹ | 1.57474 × 10⁹ | 1.49822 × 10⁹ | 1.49445 × 10⁹ | 1.51721 × 10⁷ |
| AGGL3 | 1.49232 × 10⁹ | 1.59406 × 10⁹ | 1.50429 × 10⁹ | 1.49447 × 10⁹ | 2.64626 × 10⁷ |
| AGGL5 | 1.49341 × 10⁹ | 1.57503 × 10⁹ | 1.50433 × 10⁹ | 1.49439 × 10⁹ | 2.44200 × 10⁷ |
| AGGL7 | 1.49341 × 10⁹ | 1.59630 × 10⁹ | 1.51531 × 10⁹ | 1.49515 × 10⁹ | 3.58332 × 10⁷ |
| AGGL10 | 1.49261 × 10⁹ | 1.60585 × 10⁹ | 1.51345 × 10⁹ | 1.50733 × 10⁹ | 2.79785 × 10⁷ |
| AGGL12 | 1.49358 × 10⁹ | 1.61554 × 10⁹ | 1.51105 × 10⁹ | 1.49754 × 10⁹ | 3.16092 × 10⁷ |
| AGGL15 | 1.49414 × 10⁹ | 1.57599 × 10⁹ | 1.50791 × 10⁹ | 1.49667 × 10⁹ | 2.39108 × 10⁷ |
| AGGL20 | 1.49406 × 10⁹ | 1.57432 × 10⁹ | 1.51096 × 10⁹ | 1.50880 × 10⁹ | 1.83795 × 10⁷ |
| AGGL25 | 1.49386 × 10⁹ | 1.53280 × 10⁹ | 1.51017 × 10⁹ | 1.51159 × 10⁹ | 1.23389 × 10⁷ |
| AGGL30 | 1.49355 × 10⁹ | 1.53113 × 10⁹ | 1.50656 × 10⁹ | 1.50311 × 10⁹ | 1.12716 × 10⁷ |
| AGGL50 | 1.49380 × 10⁹ | 1.52818 × 10⁹ | 1.50495 × 10⁹ | 1.50073 × 10⁹ | 1.11666 × 10⁷ |
| AGGL75 | 1.49415 × 10⁹ | 1.56935 × 10⁹ | 1.51541 × 10⁹ | 1.51203 × 10⁹ | 1.95288 × 10⁷ |
| AGGL100 | 1.49413 × 10⁹ | 1.56603 × 10⁹ | 1.51133 × 10⁹ | 1.49943 × 10⁹ | 2.01570 × 10⁷ |
| GH-VNS1 | 1.49362 × 10⁹ | 1.66273 × 10⁹ | 1.54540 × 10⁹ | 1.51469 × 10⁹ | 5.07461 × 10⁷ |
| GH-VNS2 | 1.49413 × 10⁹ | 1.59073 × 10⁹ | 1.50692 × 10⁹ | 1.50235 × 10⁹ | 1.87495 × 10⁷ |
| GH-VNS3 | 1.49247 × 10⁹ | 1.55423 × 10⁹ | 1.50561 × 10⁹ | 1.49497 × 10⁹ | 1.84421 × 10⁷ |
| SWAP1 (the best of SWAP) | 1.70412 × 10⁹ | 2.00689 × 10⁹ | 1.83218 × 10⁹ | 1.83832 × 10⁹ | 6.04418 × 10⁷ |
| GA-1POINT | 1.61068 × 10⁹ | 1.85786 × 10⁹ | 1.69042 × 10⁹ | 1.65346 × 10⁹ | 7.23930 × 10⁷ |
| GA-UNIFORM | 1.67990 × 10⁹ | 1.92521 × 10⁹ | 1.77471 × 10⁹ | 1.76783 × 10⁹ | 6.95451 × 10⁷ |
| Aggl-EA | 1.49199 × 10⁹ | 1.57449 × 10⁹ | 1.50670 × 10⁹ | 1.49495 × 10⁹ | 2.46374 × 10⁷ |
Table A3. Comparative results for BIRCH3 dataset. 10⁵ data vectors in R², p = 300 centers, time limitation 10 s. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 1.54171 × 10⁹ | 1.64834 × 10⁹ | 1.59859 × 10⁹ | 1.60045 × 10⁹ | 2.62443 × 10⁷ |
| j-means (SWAP1 + Lloyd) | 9.95926 × 10⁸ | 1.15171 × 10⁹ | 1.05927 × 10⁹ | 1.04561 × 10⁹ | 4.63788 × 10⁷ |
| AGGL1 | 1.14784 × 10⁹ | 1.88404 × 10⁹ | 1.47311 × 10⁹ | 1.42352 × 10⁹ | 1.94210 × 10⁸ |
| AGGL2 | 9.96893 × 10⁸ | 1.23746 × 10⁹ | 1.08456 × 10⁹ | 1.07354 × 10⁹ | 5.76487 × 10⁷ |
| AGGL3 | 9.87064 × 10⁸ | 1.21980 × 10⁹ | 1.07148 × 10⁹ | 1.06503 × 10⁹ | 4.84664 × 10⁷ |
| AGGL5 | 9.68357 × 10⁸ | 1.13508 × 10⁹ | 1.06002 × 10⁹ | 1.06617 × 10⁹ | 4.50726 × 10⁷ |
| AGGL7 | 1.00537 × 10⁹ | 1.19050 × 10⁹ | 1.07440 × 10⁹ | 1.06799 × 10⁹ | 4.73520 × 10⁷ |
| AGGL10 | 9.78925 × 10⁸ | 1.17041 × 10⁹ | 1.06123 × 10⁹ | 1.05264 × 10⁹ | 5.10834 × 10⁷ |
| AGGL12 | 9.94720 × 10⁸ | 1.16133 × 10⁹ | 1.05659 × 10⁹ | 1.05930 × 10⁹ | 3.59604 × 10⁷ |
| AGGL15 | 9.64129 × 10⁸ | 1.18512 × 10⁹ | 1.05884 × 10⁹ | 1.04935 × 10⁹ | 5.54774 × 10⁷ |
| AGGL20 | 9.53544 × 10⁸ | 1.15383 × 10⁹ | 1.03729 × 10⁹ | 1.03124 × 10⁹ | 4.58983 × 10⁷ |
| AGGL25 | 9.98103 × 10⁸ | 1.13062 × 10⁹ | 1.03990 × 10⁹ | 1.02958 × 10⁹ | 3.39141 × 10⁷ |
| AGGL30 | 9.59132 × 10⁸ | 1.08646 × 10⁹ | 1.02528 × 10⁹ | 1.02102 × 10⁹ | 3.48713 × 10⁷ |
| AGGL50 | 9.48152 × 10⁸ | 1.07480 × 10⁹ | 1.00313 × 10⁹ | 9.97259 × 10⁸ | 2.65919 × 10⁷ |
| AGGL75 | 9.44390 × 10⁸ | 1.05320 × 10⁹ | 9.90754 × 10⁸ | 9.86511 × 10⁸ | 3.04626 × 10⁷ |
| AGGL100 | 9.33977 × 10⁸ | 1.02524 × 10⁹ | 9.70365 × 10⁸ | 9.63525 × 10⁸ | 2.48207 × 10⁷ |
| AGGL150 | 9.20206 × 10⁸ | 1.02162 × 10⁹ | 9.61877 × 10⁸ | 9.60465 × 10⁸ | 2.42984 × 10⁷ |
| AGGL200 | 9.18310 × 10⁸ | 1.01805 × 10⁹ | 9.52666 × 10⁸ | 9.42071 × 10⁸ | 2.73422 × 10⁷ |
| AGGL250 | 9.08532 × 10⁸ | 9.78792 × 10⁸ | 9.36947 × 10⁸ | 9.29497 × 10⁸ | 1.97893 × 10⁷ |
| AGGL300 | 9.10975 × 10⁸ | 9.77193 × 10⁸ | 9.39030 × 10⁸ | 9.39434 × 10⁸ | 1.27907 × 10⁷ |
| GH-VNS1 | 1.00289 × 10⁹ | 1.08471 × 10⁹ | 1.03856 × 10⁹ | 1.03485 × 10⁹ | 2.20294 × 10⁷ |
| GH-VNS2 | 9.68045 × 10⁸ | 1.11832 × 10⁹ | 1.01461 × 10⁹ | 1.00404 × 10⁹ | 3.80607 × 10⁷ |
| GH-VNS3 | 9.12455 × 10⁸ | 9.44414 × 10⁸ | 9.31225 × 10⁸ | 9.32001 × 10⁸ | 8.25653 × 10⁶ |
| SWAP2 (the best of SWAP by avg.) | 1.23379 × 10⁹ | 1.46395 × 10⁹ | 1.33987 × 10⁹ | 1.35305 × 10⁹ | 5.64094 × 10⁷ |
| SWAP3 (the best of SWAP by median) | 1.25432 × 10⁹ | 1.44136 × 10⁹ | 1.34388 × 10⁹ | 1.34424 × 10⁹ | 5.01104 × 10⁷ |
| GA-1POINT | 1.11630 × 10⁹ | 1.38598 × 10⁹ | 1.23404 × 10⁹ | 1.22828 × 10⁹ | 7.05936 × 10⁷ |
| GA-UNIFORM | 1.17534 × 10⁹ | 1.37190 × 10⁹ | 1.26758 × 10⁹ | 1.25424 × 10⁹ | 5.20153 × 10⁷ |
| Aggl-EA | 9.14179 × 10⁸ | 9.71905 × 10⁸ | 9.34535 × 10⁸ | 9.33403 × 10⁸ | 1.51920 × 10⁷ |
Table A4. Comparative results for S1 dataset. 5000 data vectors in R², p = 15 centers, time limitation 1 s. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 1.69034 × 10⁸ | 2.02847 × 10⁸ | 1.79099 × 10⁸ | 1.69034 × 10⁸ | 1.56374 × 10⁷ |
| j-means (SWAP1 + Lloyd) | 1.69034 × 10⁸ | 2.18797 × 10⁸ | 1.74375 × 10⁸ | 1.69034 × 10⁸ | 1.40223 × 10⁷ |
| AGGL1 | 1.69034 × 10⁸ | 2.02619 × 10⁸ | 1.71265 × 10⁸ | 1.69034 × 10⁸ | 8.49108 × 10⁶ |
| AGGL2–15 (equal results) | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 0.00000 |
| SWAP1 (the best of SWAP) | 1.69034 × 10⁸ | 2.08453 × 10⁸ | 1.70348 × 10⁸ | 1.69034 × 10⁸ | 7.19690 × 10⁶ |
| GA-1POINT | 1.69034 × 10⁸ | 2.02783 × 10⁸ | 1.76877 × 10⁸ | 1.69034 × 10⁸ | 1.44603 × 10⁷ |
| GA-UNIFORM | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 2.92119 |
| Aggl-EA | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 0.00000 |
Table A5. Comparative results for S1 dataset. 5000 data vectors in R², p = 50 centers, time limitation 1 s. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 1.14205 × 10⁸ | 1.16737 × 10⁸ | 1.15594 × 10⁸ | 1.15666 × 10⁸ | 6.33573 × 10⁵ |
| j-means (SWAP1 + Lloyd) | 1.13500 × 10⁸ | 1.17631 × 10⁸ | 1.15035 × 10⁸ | 1.14954 × 10⁸ | 9.38353 × 10⁵ |
| AGGL1 | 1.13121 × 10⁸ | 1.15265 × 10⁸ | 1.14219 × 10⁸ | 1.14035 × 10⁸ | 6.23645 × 10⁵ |
| AGGL2 | 1.12430 × 10⁸ | 1.12572 × 10⁸ | 1.12475 × 10⁸ | 1.12464 × 10⁸ | 3.76180 × 10⁴ |
| AGGL3 | 1.12426 × 10⁸ | 1.12548 × 10⁸ | 1.12465 × 10⁸ | 1.12457 × 10⁸ | 2.92460 × 10⁴ |
| AGGL5 | 1.12446 × 10⁸ | 1.12552 × 10⁸ | 1.12487 × 10⁸ | 1.12483 × 10⁸ | 3.00588 × 10⁴ |
| AGGL7 | 1.12437 × 10⁸ | 1.12591 × 10⁸ | 1.12481 × 10⁸ | 1.12478 × 10⁸ | 3.26658 × 10⁴ |
| AGGL10 | 1.12462 × 10⁸ | 1.12558 × 10⁸ | 1.12500 × 10⁸ | 1.12500 × 10⁸ | 2.68212 × 10⁴ |
| AGGL15 | 1.12462 × 10⁸ | 1.12579 × 10⁸ | 1.12526 × 10⁸ | 1.12528 × 10⁸ | 2.83854 × 10⁴ |
| AGGL20 | 1.12461 × 10⁸ | 1.12590 × 10⁸ | 1.12535 × 10⁸ | 1.12538 × 10⁸ | 3.27419 × 10⁴ |
| AGGL25 | 1.12472 × 10⁸ | 1.12632 × 10⁸ | 1.12541 × 10⁸ | 1.12543 × 10⁸ | 4.34545 × 10⁴ |
| AGGL30 | 1.12523 × 10⁸ | 1.12662 × 10⁸ | 1.12572 × 10⁸ | 1.12566 × 10⁸ | 3.55121 × 10⁴ |
| AGGL50 | 1.12541 × 10⁸ | 1.12848 × 10⁸ | 1.12666 × 10⁸ | 1.12651 × 10⁸ | 7.48936 × 10⁴ |
| GH-VNS1 | 1.12419 × 10⁸ | 1.12796 × 10⁸ | 1.12467 × 10⁸ | 1.12446 × 10⁸ | 7.47785 × 10⁴ |
| GH-VNS2 | 1.12472 × 10⁸ | 1.12601 × 10⁸ | 1.12525 × 10⁸ | 1.12519 × 10⁸ | 3.37221 × 10⁴ |
| GH-VNS3 | 1.12531 × 10⁸ | 1.12969 × 10⁸ | 1.12712 × 10⁸ | 1.12708 × 10⁸ | 9.60927 × 10⁴ |
| SWAP1 (the best of SWAP) | 1.13142 × 10⁸ | 1.16627 × 10⁸ | 1.14430 × 10⁸ | 1.14412 × 10⁸ | 8.48529 × 10⁵ |
| GA-1POINT | 1.14271 × 10⁸ | 1.16790 × 10⁸ | 1.15443 × 10⁸ | 1.15343 × 10⁸ | 7.16204 × 10⁵ |
| GA-UNIFORM | 1.13119 × 10⁸ | 1.15805 × 10⁸ | 1.14384 × 10⁸ | 1.14405 × 10⁸ | 6.75227 × 10⁵ |
| Aggl-EA | 1.12476 × 10⁸ | 1.15978 × 10⁸ | 1.14049 × 10⁸ | 1.13985 × 10⁸ | 1.05017 × 10⁶ |
Table A6. Comparative results for S4 dataset. 5000 data vectors in R², p = 15 centers, time limitation 1 s. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 25.4490 |
| j-means (SWAP1 + Lloyd) | 2.27694 × 10⁸ | 2.60475 × 10⁸ | 2.30812 × 10⁸ | 2.27694 × 10⁸ | 7.33507 × 10⁶ |
| AGGL1 | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 26.4692 |
| AGGL2 | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 6.88293 |
| AGGL3 | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 7.84212 |
| AGGL5 | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 14.9936 |
| AGGL7 | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 26.1335 |
| AGGL10 | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 112.307 |
| AGGL12 | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 41.2754 |
| AGGL15 | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 81.9819 |
| GH-VNS1 | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 0.00000 |
| GH-VNS2 | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 35.8018 |
| GH-VNS3 | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 81.1213 |
| SWAP1 (the best of SWAP) | 2.27694 × 10⁸ | 2.38720 × 10⁸ | 2.28342 × 10⁸ | 2.27694 × 10⁸ | 2.48983 × 10⁶ |
| GA-1POINT | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 17.3669 |
| GA-UNIFORM | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 8.55973 |
| Aggl-EA | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 78.6011 |
Table A7. Comparative results for S4 dataset. 5000 data vectors in R², p = 50 centers, time limitation 1 s. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 1.35718 × 10⁸ | 1.38141 × 10⁸ | 1.37187 × 10⁸ | 1.37299 × 10⁸ | 4.98608 × 10⁵ |
| j-means (SWAP1 + Lloyd) | 1.35353 × 10⁸ | 1.38227 × 10⁸ | 1.36673 × 10⁸ | 1.36589 × 10⁸ | 7.55140 × 10⁵ |
| AGGL1 | 1.35378 × 10⁸ | 1.37459 × 10⁸ | 1.36291 × 10⁸ | 1.36286 × 10⁸ | 5.78265 × 10⁵ |
| AGGL2 | 1.35237 × 10⁸ | 1.35416 × 10⁸ | 1.35353 × 10⁸ | 1.35378 × 10⁸ | 5.39798 × 10⁴ |
| AGGL3 | 1.35237 × 10⁸ | 1.35450 × 10⁸ | 1.35372 × 10⁸ | 1.35384 × 10⁸ | 5.64379 × 10⁴ |
| AGGL5 | 1.35229 × 10⁸ | 1.35675 × 10⁸ | 1.35388 × 10⁸ | 1.35388 × 10⁸ | 6.46057 × 10⁴ |
| AGGL7 | 1.35232 × 10⁸ | 1.35449 × 10⁸ | 1.35306 × 10⁸ | 1.35294 × 10⁸ | 5.64815 × 10⁴ |
| AGGL10 | 1.35248 × 10⁸ | 1.35438 × 10⁸ | 1.35325 × 10⁸ | 1.35314 × 10⁸ | 5.64928 × 10⁴ |
| AGGL12 | 1.35259 × 10⁸ | 1.35422 × 10⁸ | 1.35321 × 10⁸ | 1.35312 × 10⁸ | 4.98093 × 10⁴ |
| AGGL15 | 1.35267 × 10⁸ | 1.35424 × 10⁸ | 1.35323 × 10⁸ | 1.35312 × 10⁸ | 4.16267 × 10⁴ |
| AGGL20 | 1.35294 × 10⁸ | 1.35448 × 10⁸ | 1.35347 × 10⁸ | 1.35331 × 10⁸ | 4.45414 × 10⁴ |
| AGGL25 | 1.35264 × 10⁸ | 1.35470 × 10⁸ | 1.35348 × 10⁸ | 1.35342 × 10⁸ | 5.03965 × 10⁴ |
| AGGL30 | 1.35260 × 10⁸ | 1.35453 × 10⁸ | 1.35345 × 10⁸ | 1.35336 × 10⁸ | 4.55230 × 10⁴ |
| AGGL50 | 1.35254 × 10⁸ | 1.35467 × 10⁸ | 1.35354 × 10⁸ | 1.35336 × 10⁸ | 5.50925 × 10⁴ |
| GH-VNS1 | 1.35219 × 10⁸ | 1.35758 × 10⁸ | 1.35408 × 10⁸ | 1.35383 × 10⁸ | 1.27073 × 10⁵ |
| GH-VNS2 | 1.35239 × 10⁸ | 1.35433 × 10⁸ | 1.35327 × 10⁸ | 1.35324 × 10⁸ | 4.90352 × 10⁴ |
| GH-VNS3 | 1.35299 × 10⁸ | 1.35469 × 10⁸ | 1.35374 × 10⁸ | 1.35382 × 10⁸ | 5.29113 × 10⁴ |
| SWAP1 (the best of SWAP) | 1.35930 × 10⁸ | 1.39391 × 10⁸ | 1.37402 × 10⁸ | 1.37351 × 10⁸ | 8.40302 × 10⁵ |
| GA-1POINT | 1.35943 × 10⁸ | 1.38086 × 10⁸ | 1.36776 × 10⁸ | 1.36723 × 10⁸ | 4.96830 × 10⁵ |
| GA-UNIFORM | 1.35260 × 10⁸ | 1.36698 × 10⁸ | 1.35917 × 10⁸ | 1.35937 × 10⁸ | 3.50731 × 10⁵ |
| Aggl-EA | 1.35241 × 10⁸ | 1.35438 × 10⁸ | 1.35313 × 10⁸ | 1.35304 × 10⁸ | 5.26194 × 10⁴ |
Table A8. Comparative results for Mopsi-Joensuu dataset. 6014 data vectors in R², p = 100 centers, time limitation 5 s. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 99.7279 | 123.1325 | 110.4963 | 109.9706 | 5.6551 |
| j-means (SWAP1 + Lloyd) | 48.3311 | 57.1689 | 51.6673 | 51.1146 | 2.0749 |
| AGGL1 | 51.9696 | 81.9848 | 64.2454 | 62.4350 | 8.3403 |
| AGGL2 | 44.4866 | 50.0924 | 46.1544 | 46.0973 | 1.0708 |
| AGGL3 | 43.8615 | 46.3881 | 45.1735 | 45.3303 | 0.7306 |
| AGGL5 | 43.7373 | 46.0294 | 44.8369 | 44.7245 | 0.6888 |
| AGGL7 | 44.0820 | 48.4483 | 45.2256 | 45.0851 | 0.9135 |
| AGGL10 | 43.8603 | 45.7149 | 44.6455 | 44.6387 | 0.5400 |
| AGGL15 | 43.7995 | 45.5927 | 44.7005 | 44.7173 | 0.5203 |
| AGGL20 | 43.6629 | 46.0316 | 44.4626 | 44.3353 | 0.5953 |
| AGGL25 | 43.6357 | 46.5550 | 44.5179 | 44.3408 | 0.6543 |
| AGGL30 | 43.6728 | 44.5830 | 44.1677 | 44.1137 | 0.2945 |
| AGGL50 | 43.6202 | 45.0433 | 44.2257 | 44.3137 | 0.3590 |
| AGGL75 | 43.6375 | 45.6856 | 44.1450 | 43.9860 | 0.4683 |
| AGGL100 | 43.6054 | 44.9035 | 44.0472 | 43.9875 | 0.3061 |
| GH-VNS1 | 47.7171 | 59.6970 | 53.4896 | 53.1948 | 3.4123 |
| GH-VNS2 | 43.7781 | 46.0085 | 44.8602 | 44.9149 | 0.5772 |
| GH-VNS3 | 43.8585 | 46.4490 | 44.7263 | 44.6593 | 0.6018 |
| SWAP1 (the best of SWAP) | 47.8800 | 52.4030 | 49.8287 | 49.5523 | 1.1239 |
| GA-1POINT | 61.2677 | 76.9114 | 67.2297 | 66.9182 | 3.8341 |
| GA-UNIFORM | 68.6276 | 102.0867 | 84.8705 | 83.8152 | 7.4129 |
| Aggl-EA | 43.5826 | 44.4452 | 43.7486 | 43.7560 | 0.1624 |
Table A9. Comparative results for Mopsi-Joensuu dataset. 6014 data vectors in R², p = 30 centers, time limitation 5 s. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 190.1025 | 229.0199 | 218.3214 | 219.6611 | 8.1825 |
| j-means (SWAP1 + Lloyd) | 146.3243 | 158.0798 | 151.0530 | 150.6979 | 2.7908 |
| AGGL1 | 145.7872 | 163.1270 | 152.6856 | 153.2432 | 5.6803 |
| AGGL2 | 145.7738 | 146.2262 | 145.8728 | 145.7798 | 0.1532 |
| AGGL3 | 145.7745 | 146.2161 | 145.9185 | 145.7841 | 0.1744 |
| AGGL5 | 145.7742 | 146.1392 | 145.8211 | 145.7830 | 0.1088 |
| AGGL7 | 145.7738 | 146.1416 | 145.8188 | 145.7829 | 0.1084 |
| AGGL10 | 145.7770 | 146.1370 | 145.8295 | 145.7831 | 0.1209 |
| AGGL15 | 145.7742 | 146.1392 | 145.8211 | 145.7830 | 0.1088 |
| AGGL20 | 145.7784 | 145.8113 | 145.7869 | 145.7847 | 0.0084 |
| AGGL25 | 145.7753 | 146.1479 | 145.8043 | 145.7893 | 0.0680 |
| AGGL30 | 145.7791 | 146.1466 | 145.8022 | 145.7877 | 0.0670 |
| GH-VNS1 | 145.7738 | 146.2193 | 145.8769 | 145.7789 | 0.1537 |
| GH-VNS2 | 145.7729 | 146.1506 | 145.8323 | 145.7780 | 0.1243 |
| GH-VNS3 | 145.7721 | 146.1273 | 145.7932 | 145.7761 | 0.0664 |
| SWAP1 (the best of SWAP) | 145.9265 | 155.8482 | 148.4986 | 148.0995 | 1.8866 |
| GA-1POINT | 148.2359 | 164.0893 | 154.9963 | 154.6930 | 3.4858 |
| GA-UNIFORM | 153.8591 | 201.7184 | 175.8969 | 174.2682 | 10.9540 |
| Aggl-EA | 145.7721 | 145.7752 | 145.7738 | 145.7739 | 0.0008 |
Table A10. Comparative results for Mopsi-Joensuu dataset. 6014 data vectors in R², p = 300 centers, time limitation 5 s. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 47.8874 | 54.4850 | 51.6395 | 51.1162 | 1.6453 |
| j-means (SWAP1 + Lloyd) | 23.6798 | 35.3805 | 31.2554 | 31.8670 | 2.7425 |
| AGGL1 | 34.3990 | 54.5291 | 42.5890 | 41.8879 | 5.0023 |
| AGGL2 | 18.6255 | 22.8995 | 20.4858 | 20.3853 | 1.0785 |
| AGGL3 | 16.7389 | 19.8415 | 18.5376 | 18.6063 | 0.7018 |
| AGGL5 | 16.5944 | 19.4305 | 17.7512 | 17.5557 | 0.7611 |
| AGGL7 | 16.1609 | 20.0563 | 17.8918 | 17.8397 | 0.8182 |
| AGGL10 | 16.4099 | 19.0087 | 17.3922 | 17.2220 | 0.6228 |
| AGGL15 | 16.0706 | 17.4835 | 16.7584 | 16.7537 | 0.4276 |
| AGGL20 | 15.7783 | 17.6122 | 16.6852 | 16.7338 | 0.4712 |
| AGGL25 | 15.6854 | 18.3011 | 16.7847 | 16.6473 | 0.5885 |
| AGGL50 | 15.7963 | 17.7948 | 16.2860 | 16.2920 | 0.4237 |
| AGGL100 | 15.1942 | 16.3370 | 15.6951 | 15.6738 | 0.3081 |
| AGGL150 | 15.2025 | 16.4996 | 15.7898 | 15.7990 | 0.2939 |
| AGGL200 | 15.1805 | 16.2245 | 15.7252 | 15.7801 | 0.2843 |
| AGGL250 | 15.0975 | 16.8500 | 15.7103 | 15.6757 | 0.3640 |
| AGGL300 | 15.2509 | 16.2803 | 15.7108 | 15.7224 | 0.2694 |
| GH-VNS1 | 21.5583 | 31.4467 | 27.7853 | 27.9011 | 2.5876 |
| GH-VNS2 | 15.5928 | 17.6197 | 16.6424 | 16.6345 | 0.5166 |
| GH-VNS3 | 15.3488 | 16.8864 | 15.9644 | 15.8850 | 0.4229 |
| SWAP5 (the best of SWAP by avg.) | 22.3193 | 30.7398 | 27.5174 | 27.9711 | 1.9760 |
| SWAP7 (the best of SWAP by median) | 24.0243 | 30.9356 | 27.6329 | 27.8934 | 1.9714 |
| GA-1POINT | 34.4429 | 45.0868 | 40.1539 | 39.6896 | 2.3970 |
| GA-UNIFORM | 37.4806 | 53.3750 | 43.8100 | 43.7275 | 3.7585 |
| Aggl-EA | 14.8354 | 19.4531 | 15.4576 | 15.3310 | 0.8131 |
Table A11. Comparative results for Individual Household Electric Power Consumption (IHEPC) dataset. 2,075,259 data vectors in R⁷, p = 30 centers, time limitation 5 min. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 88,145.6484 | 93,677.3281 | 90,681.1663 | 89,967.9297 | 1818.1249 |
| j-means (SWAP1 + Lloyd) | 87,907.7813 | 95,055.2422 | 89,657.8895 | 88,702.9297 | 2442.1302 |
| AGGL1 | 91,021.6016 | 110,467.5625 | 104,694.9900 | 109,976.7266 | 9278.2649 |
| AGGL2 | 86,291.1406 | 109,972.0781 | 99,788.2522 | 109,817.8125 | 12,557.4381 |
| AGGL3 | 109,817.8125 | 109,999.1328 | 109,913.6953 | 109,972.0781 | 90.3626 |
| AGGL5 | 86,240.7344 | 109,999.1328 | 103,145.4487 | 109,817.8281 | 11,519.4208 |
| AGGL7 | 86,345.9297 | 109,817.8359 | 99,781.6283 | 109,817.8203 | 12,517.3778 |
| AGGL10 | 86,414.0938 | 109,999.1563 | 106,500.3449 | 109,817.8281 | 8857.4620 |
| AGGL15 | 87,253.8281 | 109,817.8438 | 103,391.2009 | 109,817.8359 | 10,975.6502 |
| AGGL20 | 87,616.3984 | 109,999.1563 | 106,677.1384 | 109,817.8438 | 8405.2571 |
| AGGL25 | 87,409.8203 | 109,817.8438 | 106,616.6953 | 109,817.8438 | 8469.4358 |
| AGGL30 | 87,852.5938 | 109,878.0156 | 106,707.1674 | 109,853.0547 | 8314.1220 |
| GH-VNS1 | 86,228.4844 | 109,999.1250 | 99,811.1853 | 109,817.7734 | 12,593.8724 |
| GH-VNS2 | 86,368.7891 | 109,817.8281 | 99,807.6674 | 109,817.8125 | 12,485.0346 |
| GH-VNS3 | 86,304.9141 | 109,817.8281 | 96,518.5525 | 86,865.2813 | 12,441.4202 |
| SWAP1 (the best of SWAP) | 86,672.6250 | 112,683.0469 | 100,566.9531 | 109,979.8594 | 12,666.0305 |
| GA-1POINT | 88,170.4141 | 100,368.3438 | 92,664.3739 | 92,332.9922 | 4102.0677 |
| GA-UNIFORM | 87,650.1016 | 110,271.6172 | 96,631.6763 | 92,752.2813 | 9762.9279 |
| Aggl-EA | 86,147.6953 | 109,817.7969 | 91,515.1658 | 86,393.3750 | 10,377.0956 |
Table A12. Comparative results for Individual Household Electric Power Consumption (IHEPC) dataset. 2,075,259 data vectors in R⁷, p = 300 centers, time limitation 5 min. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 49,901.8086 | 58,872.1094 | 53,113.4275 | 52,599.0273 | 2781.6693 |
| j-means (SWAP1 + Lloyd) | 44,074.6445 | 47,562.4922 | 45,375.1657 | 44,608.1953 | 1505.4071 |
| AGGL1 | 67,508.2344 | 69,943.5000 | 69,070.9118 | 69,328.3750 | 845.3473 |
| AGGL2 | 43,966.8594 | 67,854.2500 | 60,923.9386 | 67,576.6875 | 11,525.2718 |
| AGGL3 | 43,867.1211 | 68,638.1953 | 58,670.0921 | 67,070.9375 | 11,658.5487 |
| AGGL5 | 44,337.6641 | 68,679.3125 | 61,875.6590 | 67,577.0625 | 10,517.0718 |
| AGGL7 | 43,688.3750 | 68,976.5469 | 64,480.6272 | 67,636.7188 | 9192.1896 |
| AGGL10 | 66,566.2969 | 68,004.2500 | 67,175.4855 | 67,188.0938 | 510.9257 |
| AGGL15 | 43,160.3125 | 67,295.1797 | 57,406.8198 | 66,625.8047 | 12,069.9982 |
| AGGL20 | 43,819.1289 | 67,265.2188 | 60,462.2919 | 66,883.5938 | 11,234.3160 |
| AGGL25 | 43,324.3203 | 67,231.0859 | 60,237.4286 | 66,842.1719 | 11,452.1535 |
| AGGL50 | 42,628.9141 | 66,279.0781 | 62,340.2411 | 65,467.8828 | 8700.5474 |
| AGGL100 | 64,635.6055 | 65,408.7891 | 65,055.4102 | 65,155.6328 | 284.9207 |
| AGGL150 | 64,458.3047 | 65,008.1641 | 64,682.8622 | 64,706.7969 | 188.7753 |
| AGGL200 | 41,754.1680 | 64,694.6328 | 58,240.8147 | 64,582.5625 | 10,883.6398 |
| AGGL250 | 64,385.2461 | 64,646.9141 | 64,538.0787 | 64,547.0820 | 82.7326 |
| AGGL300 | 41,373.4453 | 65,051.3594 | 61,283.1027 | 64,492.8828 | 8781.8445 |
| GH-VNS1 | 67,290.1406 | 68,655.8125 | 67,645.6563 | 67,467.4375 | 478.1909 |
| GH-VNS2 | 43,395.0273 | 67,046.0234 | 63,081.6233 | 66,222.0313 | 8689.9374 |
| GH-VNS3 | 42,124.9375 | 64,425.2734 | 61,204.4487 | 64,358.4844 | 8413.3478 |
| SWAP3 (the best of SWAP by avg.) | 46,137.9570 | 70,000.2813 | 63,618.6362 | 68,361.8672 | 9467.2699 |
| SWAP100 (the best of SWAP by median) | 51,026.5859 | 70,514.3359 | 65,787.6618 | 67,617.5781 | 6667.2423 |
| GA-1POINT | 45,618.3828 | 52,858.6875 | 49,198.6964 | 47,304.4453 | 3129.2994 |
| GA-UNIFORM | 46,171.7148 | 62,034.5273 | 52,218.1077 | 50,642.4961 | 5712.0544 |
| Aggl-EA | 41,795.5078 | 65,057.9375 | 51,888.0642 | 42,525.5000 | 12,122.2332 |
Table A13. Comparative results for Individual Household Electric Power Consumption (IHEPC) dataset. 2,075,259 data vectors in R⁷, p = 100 centers, time limitation 5 min. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 61,803.3047 | 65,997.1406 | 64,612.3371 | 65,037.8281 | 1431.8760 |
| j-means (SWAP1 + Lloyd) | 61,374.5742 | 70,854.6953 | 63,433.8962 | 62,227.1172 | 3308.9649 |
| AGGL1 | 60,709.2500 | 88,798.9453 | 82,890.4255 | 86,253.2539 | 9221.4664 |
| AGGL2 | 58,098.7578 | 85,587.0938 | 78,612.5859 | 84,406.7891 | 11,517.2424 |
| AGGL3 | 58,508.1367 | 84,416.4219 | 75,654.4905 | 83,573.3906 | 11,916.7366 |
| AGGL5 | 58,326.4648 | 83,891.5781 | 69,386.2210 | 59,046.0703 | 13,415.1144 |
| AGGL7 | 58,623.8203 | 84,465.2500 | 69,618.7757 | 59,477.0938 | 13,118.7602 |
| AGGL10 | 58,665.4414 | 84,494.0938 | 72,985.3951 | 82,966.5156 | 13,078.1135 |
| AGGL15 | 58,306.4961 | 83,287.7891 | 69,407.9023 | 59,881.0156 | 12,347.6134 |
| AGGL20 | 58,666.2383 | 83,110.8438 | 76,161.1618 | 82,986.7578 | 11,757.2349 |
| AGGL25 | 58,533.1406 | 83,046.0703 | 69,305.3025 | 59,392.3672 | 12,821.3261 |
| AGGL50 | 58,768.5430 | 82,938.1797 | 69,450.1853 | 60,540.5391 | 12,428.4255 |
| AGGL75 | 58,538.9922 | 82,544.6875 | 75,635.4358 | 82,477.6172 | 11,673.3413 |
| AGGL100 | 58,460.1914 | 82,531.0000 | 78,951.6490 | 82,422.8125 | 9036.9781 |
| GH-VNS1 | 59,228.6055 | 85,246.7969 | 73,899.8482 | 83,708.7031 | 13,239.8092 |
| GH-VNS2 | 82,966.8750 | 83,334.9688 | 83,061.5056 | 83,033.9844 | 128.9717 |
| GH-VNS3 | 59,417.9375 | 82,291.0391 | 78,905.5815 | 82,124.6094 | 8593.7433 |
| SWAP10 (the best of SWAP) | 61,196.4414 | 84,698.2109 | 70,051.5938 | 62,676.3125 | 10,652.0083 |
| GA-1POINT | 62,192.1719 | 64,413.2578 | 63,051.7500 | 63,028.9922 | 733.3966 |
| GA-UNIFORM | 60,873.5859 | 66,829.2969 | 63,656.8555 | 64,155.6875 | 2084.6202 |
| Aggl-EA | 57,594.0703 | 72,909.5000 | 65,782.6094 | 58,506.6484 | 3208.6879 |
Table A14. Comparative results for Mopsi-Finland dataset. 13,467 data vectors in R², p = 100 centers, time limitation 5 s. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 6.28454 × 10⁶ | 7.25139 × 10⁶ | 6.92349 × 10⁶ | 6.95864 × 10⁶ | 198,384 |
| j-means (SWAP1 + Lloyd) | 3.76240 × 10⁶ | 4.00651 × 10⁶ | 3.85266 × 10⁶ | 3.84859 × 10⁶ | 59,481.8 |
| AGGL1 | 4.17611 × 10⁶ | 5.71925 × 10⁶ | 4.87105 × 10⁶ | 4.79499 × 10⁶ | 417,612 |
| AGGL2 | 3.64502 × 10⁶ | 3.67299 × 10⁶ | 3.65556 × 10⁶ | 3.65350 × 10⁶ | 7644.61 |
| AGGL3 | 3.64508 × 10⁶ | 3.77758 × 10⁶ | 3.66037 × 10⁶ | 3.65225 × 10⁶ | 28,242.8 |
| AGGL5 | 3.64502 × 10⁶ | 3.77520 × 10⁶ | 3.65956 × 10⁶ | 3.65221 × 10⁶ | 31,464.1 |
| AGGL7 | 3.64352 × 10⁶ | 3.77238 × 10⁶ | 3.66457 × 10⁶ | 3.65259 × 10⁶ | 36,374.3 |
| AGGL10 | 3.64503 × 10⁶ | 3.74222 × 10⁶ | 3.65639 × 10⁶ | 3.65191 × 10⁶ | 18,087.8 |
| AGGL12 | 3.64508 × 10⁶ | 3.77121 × 10⁶ | 3.65745 × 10⁶ | 3.65299 × 10⁶ | 22,688.4 |
| AGGL15 | 3.64351 × 10⁶ | 3.78305 × 10⁶ | 3.66115 × 10⁶ | 3.65050 × 10⁶ | 32,121.2 |
| AGGL20 | 3.64289 × 10⁶ | 3.74971 × 10⁶ | 3.65333 × 10⁶ | 3.64965 × 10⁶ | 19,010.0 |
| AGGL25 | 3.64354 × 10⁶ | 3.66605 × 10⁶ | 3.64913 × 10⁶ | 3.64784 × 10⁶ | 5184.35 |
| AGGL30 | 3.64290 × 10⁶ | 3.67201 × 10⁶ | 3.65106 × 10⁶ | 3.64702 × 10⁶ | 7780.79 |
| AGGL50 | 3.64511 × 10⁶ | 3.67594 × 10⁶ | 3.65120 × 10⁶ | 3.64940 × 10⁶ | 6920.69 |
| AGGL75 | 3.64510 × 10⁶ | 3.68136 × 10⁶ | 3.65552 × 10⁶ | 3.65355 × 10⁶ | 7722.98 |
| AGGL100 | 3.64508 × 10⁶ | 3.67524 × 10⁶ | 3.65651 × 10⁶ | 3.65395 × 10⁶ | 9106.84 |
| GH-VNS1 | 3.69265 × 10⁶ | 5.20291 × 10⁶ | 4.05038 × 10⁶ | 3.93463 × 10⁶ | 336,130 |
| GH-VNS2 | 3.64503 × 10⁶ | 3.66884 × 10⁶ | 3.65123 × 10⁶ | 3.65015 × 10⁶ | 6284.54 |
| GH-VNS3 | 3.64503 × 10⁶ | 3.67655 × 10⁶ | 3.65735 × 10⁶ | 3.65370 × 10⁶ | 9247.64 |
| SWAP1 (the best of SWAP) | 3.68260 × 10⁶ | 3.79031 × 10⁶ | 3.74321 × 10⁶ | 3.74447 × 10⁶ | 25,846.1 |
| GA-1POINT | 4.38348 × 10⁶ | 5.29721 × 10⁶ | 4.83383 × 10⁶ | 4.82569 × 10⁶ | 198,278 |
| GA-UNIFORM | 5.00518 × 10⁶ | 6.29115 × 10⁶ | 5.62017 × 10⁶ | 5.60439 × 10⁶ | 347,120 |
| Aggl-EA | 3.64285 × 10⁶ | 3.65052 × 10⁶ | 3.64473 × 10⁶ | 3.64502 × 10⁶ | 1467.87 |
Table A15. Comparative results for Mopsi-Finland dataset. 13,467 data vectors in R², p = 300 centers, time limitation 5 s. Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| Lloyd (multistart) | 3.28826 × 10⁶ | 3.58614 × 10⁶ | 3.46203 × 10⁶ | 3.47443 × 10⁶ | 80,996.0 |
| j-means (SWAP1 + Lloyd) | 1.59039 × 10⁶ | 2.13796 × 10⁶ | 1.74473 × 10⁶ | 1.71790 × 10⁶ | 120,717 |
| AGGL1 | 2.34524 × 10⁶ | 3.59085 × 10⁶ | 3.03226 × 10⁶ | 2.97788 × 10⁶ | 300,773 |
| AGGL2 | 1.47203 × 10⁶ | 1.68652 × 10⁶ | 1.53552 × 10⁶ | 1.52156 × 10⁶ | 55,909.7 |
| AGGL3 | 1.41824 × 10⁶ | 1.65971 × 10⁶ | 1.49980 × 10⁶ | 1.49186 × 10⁶ | 50,351.7 |
| AGGL5 | 1.41376 × 10⁶ | 1.69573 × 10⁶ | 1.48142 × 10⁶ | 1.47572 × 10⁶ | 53,828.0 |
| AGGL7 | 1.42594 × 10⁶ | 1.67197 × 10⁶ | 1.49188 × 10⁶ | 1.47882 × 10⁶ | 50,893.8 |
| AGGL10 | 1.42666 × 10⁶ | 1.61224 × 10⁶ | 1.48780 × 10⁶ | 1.48111 × 10⁶ | 37,416.1 |
| AGGL15 | 1.38648 × 10⁶ | 1.68429 × 10⁶ | 1.51728 × 10⁶ | 1.50331 × 10⁶ | 61,906.0 |
| AGGL20 | 1.43027 × 10⁶ | 1.67186 × 10⁶ | 1.51494 × 10⁶ | 1.49912 × 10⁶ | 63,635.8 |
| AGGL25 | 1.42352 × 10⁶ | 1.74796 × 10⁶ | 1.51280 × 10⁶ | 1.49324 × 10⁶ | 75,237.2 |
| AGGL30 | 1.42871 × 10⁶ | 1.66177 × 10⁶ | 1.49960 × 10⁶ | 1.48752 × 10⁶ | 51,468.1 |
| AGGL50 | 1.40068 × 10⁶ | 1.47648 × 10⁶ | 1.44410 × 10⁶ | 1.44245 × 10⁶ | 17,875.2 |
| AGGL75 | 1.38586 × 10⁶ | 1.48921 × 10⁶ | 1.42238 × 10⁶ | 1.42229 × 10⁶ | 23,714.1 |
| AGGL100 | 1.37383 × 10⁶ | 1.43596 × 10⁶ | 1.40575 × 10⁶ | 1.41181 × 10⁶ | 17,442.5 |
| AGGL150 | 1.36240 × 10⁶ | 1.40877 × 10⁶ | 1.38853 × 10⁶ | 1.38929 × 10⁶ | 11,719.5 |
| AGGL200 | 1.35686 × 10⁶ | 1.42109 × 10⁶ | 1.38415 × 10⁶ | 1.38600 × 10⁶ | 17,504.0 |
| AGGL250 | 1.35268 × 10⁶ | 1.40887 × 10⁶ | 1.38198 × 10⁶ | 1.38182 × 10⁶ | 13,577.4 |
| AGGL300 | 1.35163 × 10⁶ | 1.41077 × 10⁶ | 1.38198 × 10⁶ | 1.37998 × 10⁶ | 15,384.2 |
| GH-VNS1 | 1.70022 × 10⁶ | 2.04728 × 10⁶ | 1.87817 × 10⁶ | 1.88773 × 10⁶ | 101,777 |
| GH-VNS2 | 1.43218 × 10⁶ | 1.78314 × 10⁶ | 1.51093 × 10⁶ | 1.49528 × 10⁶ | 70,431.8 |
| GH-VNS3 | 1.36493 × 10⁶ | 1.43665 × 10⁶ | 1.38037 × 10⁶ | 1.37633 × 10⁶ | 14,603.0 |
| SWAP1 (the best of SWAP) | 1.49832 × 10⁶ | 1.72366 × 10⁶ | 1.57134 × 10⁶ | 1.56292 × 10⁶ | 47,997.3 |
| GA-1POINT | 2.69754 × 10⁶ | 3.29563 × 10⁶ | 2.91834 × 10⁶ | 2.88716 × 10⁶ | 144,687 |
| GA-UNIFORM | 2.55194 × 10⁶ | 3.53833 × 10⁶ | 2.91604 × 10⁶ | 2.89257 × 10⁶ | 205,946 |
| Aggl-EA | 1.36496 × 10⁶ | 1.40537 × 10⁶ | 1.38311 × 10⁶ | 1.38149 × 10⁶ | 12,302.2 |

References

1. Drezner, Z.; Hamacher, H. Facility Location: Applications and Theory; Springer: Berlin, Germany, 2004.
2. Khachumov, M.V. Distances, metrics and data clustering. Sci. Tech. Inf. Proc. 2012, 39, 310–316.
3. Çolakoglu, H.B. A Generalization of the Minkowski Distance and a New Definition of the Ellipse. Available online: https://arxiv.org/abs/1903.09657v1 (accessed on 12 March 2021).
4. France, S.; Carroll, J.D.; Xiong, H. Distance metrics for high dimensional nearest neighborhood recovery: Compression and normalization. Inform. Sci. 2012, 184, 92–110.
5. Weiszfeld, E.; Plastria, F. On the point for which the sum of the distances to n given points is minimum. Ann. Oper. Res. 2009, 167, 7–41.
6. Kuhn, H.W. A note on Fermat's problem. Math. Program. 1973, 4, 98–107.
7. Weiszfeld, E. Sur le point sur lequel la somme des distances de n points donnés est minimum. Tohoku Math. J. 1937, 43, 335–386.
8. Sturm, R. Ueber den Punkt kleinster Entfernungssumme von gegebenen Punkten. J. Reine Angew. Math. 1884, 97, 49–61.
9. Beck, A. Weiszfeld's method: Old and new results. J. Optim. Theory Appl. 2015, 164, 1–40.
10. Garey, M.; Johnson, D.; Witsenhausen, H. The complexity of the generalized Lloyd–Max problem (Corresp.). IEEE Trans. Inf. Theory 1982, 28, 255–256.
11. Farahani, R.Z.; Hekmatfar, M. Facility Location Concepts, Models, Algorithms and Case Studies; Springer: Berlin/Heidelberg, Germany, 2009.
12. Hakimi, S.L. Optimum locations of switching centers and the absolute centers and medians of a graph. Oper. Res. 1964, 12, 450–459.
13. Masuyama, S.; Ibaraki, T.; Hasegawa, T. The computational complexity of the m-center problems on the plane. Trans. Inst. Electron. Commun. Eng. Jpn. 1981, 64E, 57–64.
14. Kariv, O.; Hakimi, S.L. An algorithmic approach to network location problems. II: The p-medians. SIAM J. Appl. Math. 1979, 37, 539–560.
15. Cooper, L. The Weber problem revisited. Comput. Math. Appl. 1981, 7, 225–234.
16. Ostresh, L.M. On the convergence of a class of iterative methods for solving the Weber location problem. Oper. Res. 1978, 26, 597–609.
17. Plastria, F.; Elosmani, M. On the convergence of the Weiszfeld algorithm for continuous single facility location–allocation problems. TOP 2008, 16, 388–406.
18. Vardi, Y. The multivariate L1-median and associated data depth. Proc. Natl. Acad. Sci. USA 2000, 97, 1423–1426.
19. Badoiu, M. Approximate clustering via core-sets. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing, Montréal, QC, Canada, 19–21 May 2002; pp. 250–257.
20. Kuhn, H.W. An efficient algorithm for the numerical solution of the generalized Weber problem in spatial economics. J. Reg. Sci. 1962, 4, 21–34.
21. Mladenovic, N.; Brimberg, J.; Hansen, P.; Moreno-Perez, J.A. The p-median problem: A survey of metaheuristic approaches. Eur. J. Oper. Res. 2007, 179, 927–939.
22. Reese, J. Solution methods for the p-median problem: An annotated bibliography. Networks 2006, 48, 125–142.
23. Hakimi, S.L. Optimum distribution of switching centers in a communication network and some related graph theoretic problems. Oper. Res. 1965, 13, 462–475.
24. Kuenne, R.E.; Soland, R.M. Exact and approximate solutions to the multisource Weber problem. Math. Program. 1972, 3, 193–209.
25. Ostresh, L.M.J. The stepwise location–allocation problem: Exact solutions in continuous and discrete spaces. Geogr. Anal. 1978, 10, 174–185.
26. Rosing, K.E. An optimal method for solving the (generalized) multi-Weber problem. Eur. J. Oper. Res. 1992, 58, 414–426.
27. Rabbani, M. A novel approach for solving a constrained location allocation problem. Int. J. Ind. Eng. Comput. 2013, 4, 203–214.
28. Fathali, J.; Rad, N.J.; Sherbaf, S.R. The p-median and p-center problems on bipartite graphs. Iran. J. Math. Sci. Inf. 2014, 9, 37–43.
29. Avella, P.; Sassano, A.; Vasil'ev, I. Computational study of large-scale p-median problems. Math. Program. 2007, 109, 89–114.
30. Avella, P.; Boccia, M.; Salerno, S.; Vasilyev, I. An aggregation heuristic for large-scale p-median problem. Comput. Oper. Res. 2012, 39, 1625–1632.
31. Resende, M.G.C. Metaheuristic hybridization with greedy randomized adaptive search procedures. INFORMS TutORials Oper. Res. 2008, 295–319.
32. Resende, M.G.C.; Ribeiro, C.C.; Glover, F.; Marti, R. Scatter search and path relinking: Fundamentals, advances, and applications. In Handbook of Metaheuristics; Gendreau, M., Potvin, J.-Y., Eds.; Springer: Boston, MA, USA, 2010; pp. 87–107.
33. Brimberg, J.; Drezner, Z.; Mladenovic, N.; Salhi, S. A new local search for continuous location problems. Eur. J. Oper. Res. 2014, 232, 256–265.
34. Drezner, Z.; Brimberg, J.; Mladenovic, N.; Salhi, S. New heuristic algorithms for solving the planar p-median problem. Comput. Oper. Res. 2015, 62, 296–304.
35. Drezner, Z.; Brimberg, J.; Mladenovic, N.; Salhi, S. Solving the planar p-median problem by variable neighborhood and concentric searches. J. Glob. Optim. 2015, 63, 501–514.
36. Mladenovic, N.; Alkandari, A.; Pei, J.; Todosijevic, R.; Pardalos, P.M. Less is more approach: Basic variable neighborhood search for the obnoxious p-median problem. Int. Trans. Oper. Res. 2019, 27, 480–493.
37. Bernábe-Loranca, M.; González-Velázquez, R.; Granillo-Martinez, E.; Romero-Montoya, M.; Barrera-Cámara, R. P-median problem: A real case application. In Intelligent Systems Design and Applications (ISDA 2019); Advances in Intelligent Systems and Computing; Springer: Cham, Switzerland, 2021; Volume 1181.
38. Arthur, D.; Vassilvitskii, S. k-means++: The advantages of careful seeding. In Proceedings of SODA'07, SIAM, New Orleans, LA, USA, 7–9 January 2007; pp. 1027–1035.
39. Hromkovic, J. Algorithmics for Hard Problems: Introduction to Combinatorial Optimization, Randomization, Approximation, and Heuristics; Springer: Berlin/Heidelberg, Germany, 2011.
40. Ng, T. Expanding Neighborhood Tabu Search for facility location problems in water infrastructure planning. In Proceedings of the 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC), San Diego, CA, USA, 5–8 October 2014; pp. 3851–3854.
41. Kochetov, Y.; Mladenovic, N.; Hansen, P. Local search with alternating neighborhoods. Discret. Anal. Oper. Res. 2003, 10, 11–43. (In Russian)
42. Hansen, P. Variable neighborhood search: Principles and applications. Eur. J. Oper. Res. 2001, 130, 449–467.
43. Hansen, P.; Mladenovic, N. Development of variable neighborhood search. In Essays and Surveys in Metaheuristics; Ribeiro, C.C., Hansen, P., Eds.; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2002; pp. 415–440.
44. Mladenovic, N. Variable neighborhood search. Comput. Oper. Res. 1997, 24, 1097–1100.
45. Kochetov, Y.A. Local Search Methods for Discrete Location Problems. Ph.D. Thesis, Sobolev Institute of Mathematics SB RAS, Novosibirsk, Russia, 19 January 2010. (In Russian)
46. Hansen, P. Variable neighborhood search. In Search Methodology; Burke, E.K., Kendall, G., Eds.; Springer: Berlin/Heidelberg, Germany, 2005; pp. 211–238.
47. Brimberg, J.; Mladenovic, N. A variable neighborhood algorithm for solving the continuous location-allocation problem. Stud. Locat. Anal. 1996, 10, 1–12.
48. Hansen, P.; Mladenovic, N.; Perez-Brito, D. Variable neighborhood decomposition search. J. Heuristics 2001, 7, 335–350.
49. Brimberg, J.; Hansen, P.; Mladenovic, N.; Taillard, E. Improvements and comparison of heuristics for solving the uncapacitated multisource Weber problem. Oper. Res. 2000, 48, 444–460.
50. Kochetov, Y.; Alekseeva, E.; Levanova, T.; Loresh, M. Large neighborhood local search for the p-median problem. Yugosl. J. Oper. Res. 2005, 15, 53–63.
51. Lopez, F.G.; Batista, B.M.; Moreno-Perez, J.; Moreno-Vega, M. The parallel variable neighborhood search for the p-median problem. J. Heuristics 2002, 8, 375–388.
52. Rozhnov, I.P.; Orlov, V.I.; Kazakovtsev, L.A. VNS-based algorithms for the centroid-based clustering problem. FACTA Univ. Ser. Math. Inform. 2019, 34, 957–972.
53. Still, S.; Bialek, W.; Bottou, L. Geometric clustering using the information bottleneck method. In Advances in Neural Information Processing Systems 16; MIT Press: Cambridge, MA, USA, 2004.
54. Sun, Z.; Fox, G.; Gu, W.; Li, Z. A parallel clustering method combined information bottleneck theory and centroid-based clustering. J. Supercomput. 2014, 69, 452–467.
55. Houck, C.R.; Joines, J.A.; Kay, M.G. Comparison of genetic algorithms, random restart and two-opt switching for solving large location-allocation problems. Comput. Oper. Res. 1996, 23, 587–596.
56. Maulik, U.; Bandyopadhyay, S. Genetic algorithm-based clustering technique. Pattern Recognit. 2000, 33, 1455–1465.
57. Krishna, K.; Murty, M.M. Genetic k-means algorithm. IEEE Trans. Syst. Man Cybern. Part B 1999, 29, 433–439.
58. Neema, M.N.; Maniruzzaman, K.M.; Ohgai, A. New genetic algorithms based approaches to continuous p-median problem. Netw. Spat. Econ. 2011, 11, 83–99.
59. Tuba, E.; Strumberger, I.; Tuba, I.; Bacanin, N.; Tuba, M. Water cycle algorithm for solving continuous p-median problem. In Proceedings of the SACI 2018 IEEE 12th International Symposium on Applied Computational Intelligence and Informatics, Timisoara, Romania, 17–19 May 2018; pp. 351–354.
60. Levanova, T.V.; Gnusarev, A.Y. Simulated annealing for competitive p-median facility location problem. J. Phys. Conf. Ser. 2018, 1050, 012044.
61. Zhao, H.; Zhang, C. An online-learning-based evolutionary many-objective algorithm. Inf. Sci. 2020, 509, 1–21.
62. Dulebenets, M.A. An adaptive island evolutionary algorithm for the berth scheduling problem. Memetic Comp. 2020, 12, 51–72.
63. Liu, Z.Z.; Wang, Y.; Huang, P.Q. AnD: A many-objective evolutionary algorithm with angle-based selection and shift-based density estimation. Inf. Sci. 2020, 509, 400–419.
64. Ruiz, E.; Soto-Mendoza, V.; Barbosa, A.E.R.; Reyes, R. Solving the open vehicle routing problem with capacity and distance constraints with a biased random key genetic algorithm. Comput. Ind. Eng. 2019, 133, 207–219.
65. Bae, H.; Moon, I. Multi-depot vehicle routing problem with time windows considering delivery and installation vehicles. Appl. Math. Model. 2016, 40, 6536–6549.
66. Pasha, J.; Dulebenets, M.A.; Kavoosi, M.; Abioye, O.F.; Wang, H.; Guo, W. An optimization model and solution algorithms for the vehicle routing problem with a "factory-in-a-box". IEEE Access 2020, 8, 134743–134763.
67. D'Angelo, G.; Pilla, R.; Tascini, C.; Rampone, S. A proposal for distinguishing between bacterial and viral meningitis using genetic programming and decision trees. Soft Comput. 2019, 23, 11775–11791.
68. Panda, N.; Majhi, S.K. How effective is the salp swarm algorithm in data classification. In Computational Intelligence in Pattern Recognition; Advances in Intelligent Systems and Computing; Das, A., Nayak, J., Naik, B., Pati, S., Pelusi, D., Eds.; Springer: Singapore, 2020; Volume 999.
69. Falkenauer, E. Genetic Algorithms and Grouping Problems; Wiley: New York, NY, USA, 1998.
70. Alp, O.; Erkut, E.; Drezner, Z. An efficient genetic algorithm for the p-median problem. Ann. Oper. Res. 2003, 122, 21–42.
71. Kazakovtsev, L.A.; Antamoshkin, A.N. Genetic algorithm with fast greedy heuristic for clustering and location problems. Informatica 2014, 38, 229–240.
72. Hosage, C.M.; Goodchild, M.F. Discrete space location-allocation solutions from genetic algorithms. Ann. Oper. Res. 1986, 6, 35–46.
73. Blum, C.; Roli, A. Metaheuristics in combinatorial optimization: Overview and conceptual comparison. ACM Comput. Surv. 2003, 35, 268–308.
74. Kazakovtsev, L.; Rozhnov, I.; Popov, A.; Tovbis, E.M. Self-adjusting variable neighborhood search algorithm for near-optimal k-means clustering. Computation 2020, 8, 90.
75. Lloyd, S.P. Least squares quantization in PCM. IEEE Trans. Inf. Theory 1982, 28, 129–137.
76. MacQueen, J.B. Some methods of classification and analysis of multivariate observations. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 21 June–18 July 1965; Volume 1, pp. 281–297.
77. Kazakovtsev, L.A.; Rozhnov, I.P. Comparative study of local search in SWAP and agglomerative neighbourhoods for the continuous p-median problem. In Proceedings of the III International Conference MIST: Aerospace 2020: Advanced Technologies in Aerospace, Mechanical and Automation Engineering, Krasnoyarsk, Russia, 20–21 November 2020; IOP Conference Series: Materials Science and Engineering; Volume 1047.
78. Droste, S.; Jansen, T.; Wegener, I. On the analysis of the (1+1) evolutionary algorithm. Theor. Comput. Sci. 2002, 276, 51–81.
79. Borisovsky, P.A.; Eremeev, A.V. A study on performance of the (1+1)-evolutionary algorithm. In Foundations of Genetic Algorithms; De Jong, K., Poli, R., Rowe, J., Eds.; Morgan Kaufmann: San Francisco, CA, USA, 2003; pp. 271–287.
80. Eremeev, A.V.; Borisovsky, P.A. Comparing evolutionary algorithms to the (1+1)-EA. Theor. Comput. Sci. 2008, 403, 33–41.
81. Sung, C.W.; Yuen, S.Y. Analysis of (1+1) evolutionary algorithm and randomized local search with memory. Evol. Comput. 2011, 19, 287–323.
82. Doerr, B.; Johannsen, D.; Schmidt, M. Runtime analysis of the (1+1) evolutionary algorithm on strings over finite alphabets. In Proceedings of the 11th Workshop on Foundations of Genetic Algorithms (FOGA'11), Schwarzenberg, Austria, 5–9 January 2011; pp. 119–126.
83. Peng, X. Performance analysis of (1+1)EA on the maximum independent set problem. In Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2015; Volume 9483.
84. Xia, X.; Zhou, Y. Approximation performance of the (1+1) evolutionary algorithm for the minimum degree spanning tree problem. In Communications in Computer and Information Science; Springer: Berlin/Heidelberg, Germany, 2015; Volume 562.
85. Bian, C.; Qian, C.; Tang, K.; Yu, Y. Running time analysis of the (1+1)-EA for robust linear optimization. Theor. Comput. Sci. 2020, 843, 57–72.
86. Doerr, B.; Le, H.P.; Makhmara, R.; Nguyen, T.D. Fast genetic algorithms. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2017); Bosman, P.A.N., Ed.; ACM: New York, NY, USA, 2017; pp. 777–784.
87. Cooper, L. Heuristic methods for location-allocation problems. SIAM Rev. 1964, 6, 37–53.
88. Jiang, J.L.; Yuan, X.M. A heuristic algorithm for constrained multi-source Weber problem: The variational inequality approach. Eur. J. Oper. Res. 2007, 187, 357–370.
89. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 1977, 39, 1–38.
90. O'Callaghan, L.; Mishra, N.; Meyerson, A.; Guha, S.; Motwani, R. Streaming-data algorithms for high-quality clustering. In Proceedings of the 18th International Conference on Data Engineering, San Jose, CA, USA, 26 February–1 March 2002; pp. 685–694.
91. Ackermann, M.R.; Martens, M.; Raupach, C.; Swierkot, K.; Lammersen, C.; Sohler, C. StreamKM++: A clustering algorithm for data streams. J. Exp. Algorithmics 2010, 17, Article 2.4.
92. Kazakovtsev, L.; Stashkov, D.; Gudyma, M.; Kazakovtsev, V. Algorithms with greedy heuristic procedures for mixture probability distribution separation. Yugosl. J. Oper. Res. 2018, 29, 51–67.
93. Nikolaev, A.; Mladenovic, N.; Todosijevic, R. J-means and I-means for minimum sum-of-squares clustering on networks. Optim. Lett. 2017, 11, 359–376.
94. Clustering Basic Benchmark. Available online: http://cs.joensuu.fi/sipu/datasets/ (accessed on 25 September 2020).
95. Dua, D.; Graff, C. UCI Machine Learning Repository, 2019. Available online: http://archive.ics.uci.edu/ml (accessed on 30 September 2020).
96. Kazakovtsev, L.; Rozhnov, I.; Shkaberina, G.; Orlov, V. K-means genetic algorithms with greedy genetic operators. Math. Probl. Eng. 2020, 2020, 8839763.
97. Kazakovtsev, L.; Rozhnov, I. Application of algorithms with variable greedy heuristics for k-medoids problems. Informatica 2020, 44, 55–61.
98. Luebke, D.; Humphreys, G. How GPUs work. Computer 2007, 40, 96–100.
99. Lim, G.; Ma, L. GPU-based parallel vertex substitution algorithm for the p-median problem. Comput. Ind. Eng. 2013, 64, 381–388.
100. AlBdaiwi, B.F.; AboElFotoh, H.M.F. A GPU-based genetic algorithm for the p-median problem. J. Supercomput. 2017, 73, 4221–4244.
101. Herda, M. Parallel genetic algorithm for capacitated p-median problem. Procedia Eng. 2017, 192, 313–317.
102. Zechner, M.; Granitzer, M. Accelerating k-means on the graphics processor via CUDA. In Proceedings of the International Conference on Intensive Applications and Services, Valencia, Spain, 20–25 April 2009; pp. 7–15.
103. Charikar, M.; Guha, S.; Tardos, E.; Shmoys, D.B. A constant-factor approximation algorithm for the k-median problem. In Proceedings of the 31st Annual ACM Symposium on Theory of Computing, Atlanta, GA, USA, 1–4 May 1999; pp. 1–10.
104. Jain, K.; Vazirani, V. Approximation algorithms for metric facility location and k-median problems using the primal-dual schema and Lagrangian relaxation. J. ACM 2001, 48, 274–296.
105. Fränti, P.; Sieranoja, S. K-means properties on six clustering benchmark datasets. Appl. Intell. 2018, 48, 4743–4759.
106. Smucker, M.D.; Allan, J.; Carterette, B.A. A comparison of statistical significance tests for information retrieval evaluation. In Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management (CIKM'07), Lisbon, Portugal, 6–10 November 2007; pp. 623–632.
107. Park, H.M. Comparing Group Means: The t-Test and One-Way ANOVA Using STATA, SAS, and SPSS; Indiana University: Bloomington, IN, USA, 2009.
108. Mann, H.B.; Whitney, D.R. On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 1947, 18, 50–60.
109. Fay, M.P.; Proschan, M.A. Wilcoxon–Mann–Whitney or t-test? On assumptions for hypothesis tests and multiple interpretations of decision rules. Stat. Surv. 2010, 4, 1–39.
Figure 1. Comparative results of the search in SWAPr neighborhoods. Dependence of the result (1) on r: (a) BIRCH3 dataset, search for 300 centers, 100,000 data vectors, time limitation 10 s; (b,c) IHEPC dataset, search for 100 and 300 centers, respectively, 2,075,259 data vectors, time limitation 300 s; (d) Mopsi-Finland dataset, search for 30 centers, 13,467 data vectors, time limitation 5 s.

Figure 2. Comparative results of the search in AGGLr neighborhoods. Dependence of the result (1) on r: (a,b) BIRCH3 dataset, search for 100 and 300 centers, respectively, 100,000 data vectors, time limitation 10 s; (c,d) IHEPC dataset, search for 30 and 100 centers, respectively, 2,075,259 data vectors, time limitation 300 s; (e) Mopsi-Finland dataset, search for 300 centers, 13,467 data vectors, time limitation 5 s; (f) Mopsi-Joensuu dataset, search for 100 centers, 6014 data vectors, time limitation 5 s.

Figure 3. Comparative analysis of the convergence speed. Dependence of the median result on computation time for: (a) Individual Household Electric Power Consumption (IHEPC) dataset, 50 centers, 2,075,259 data vectors, time limitation 300 s; (b) Mopsi-Joensuu dataset, search for 300 centers, 6014 data vectors, time limitation 5 s; (c) Mopsi-Finland dataset, search for 300 centers, 13,467 data vectors, time limitation 5 s.
Table 1. Comparative results for all datasets (best of known algorithms vs. new algorithm). Achieved objective function values (1) are summarized after 30 runs.

| Algorithm or Neighborhood | Min (Record) | Max (Worst) | Average | Median | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| BIRCH3 dataset. 100,000 data vectors in R², k = 30 clusters, time limitation 10 s | | | | | |
| GH-VNS1 | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 0.00000 |
| Aggl-EA ↔⇔ | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 3.45057 × 10⁹ | 0.00000 |
| BIRCH3 dataset. 100,000 data vectors in R², k = 100 clusters, time limitation 10 s | | | | | |
| AGGL2 | 1.49271 × 10⁹ | 1.57474 × 10⁹ | 1.49822 × 10⁹ | 1.49445 × 10⁹ | 1.51721 × 10⁷ |
| AGGL5 | 1.49341 × 10⁹ | 1.57503 × 10⁹ | 1.50433 × 10⁹ | 1.49439 × 10⁹ | 2.44200 × 10⁷ |
| Aggl-EA ↔⇔ | 1.49199 × 10⁹ | 1.57449 × 10⁹ | 1.50670 × 10⁹ | 1.49495 × 10⁹ | 2.46374 × 10⁷ |
| BIRCH3 dataset. 100,000 data vectors in R², k = 300 clusters, time limitation 10 s | | | | | |
| AGGL250 | 9.08532 × 10⁸ | 9.78792 × 10⁸ | 9.36947 × 10⁸ | 9.29497 × 10⁸ | 1.97893 × 10⁷ |
| GH-VNS3 | 9.12455 × 10⁸ | 9.44414 × 10⁸ | 9.31225 × 10⁸ | 9.32001 × 10⁸ | 8.25653 × 10⁶ |
| Aggl-EA ↔⇔ | 9.14179 × 10⁸ | 9.71905 × 10⁸ | 9.34535 × 10⁸ | 9.33403 × 10⁸ | 1.51920 × 10⁷ |
| Mopsi-Joensuu dataset. 6014 data vectors in R², k = 30 clusters, time limitation 5 s | | | | | |
| AGGL20 | 145.7784 | 145.8113 | 145.7869 | 145.7847 | 0.0084 |
| GH-VNS3 | 145.7721 | 146.1273 | 145.7932 | 145.7761 | 0.0664 |
| Aggl-EA ↑⇑ | 145.7721 | 145.7752 | 145.7738 | 145.7739 | 0.0008 |
| Mopsi-Joensuu dataset. 6014 data vectors in R², k = 100 clusters, time limitation 5 s | | | | | |
| AGGL75 | 43.6375 | 45.6856 | 44.1450 | 43.9860 | 0.4683 |
| AGGL100 | 43.6054 | 44.9035 | 44.0472 | 43.9875 | 0.3061 |
| Aggl-EA ↑⇑ | 43.5826 | 44.4452 | 43.7486 | 43.7560 | 0.1624 |
| Mopsi-Joensuu dataset. 6014 data vectors in R², k = 300 clusters, time limitation 5 s | | | | | |
| AGGL250 | 15.0975 | 16.8500 | 15.7103 | 15.6757 | 0.3640 |
| Aggl-EA ↔⇔ | 14.8354 | 19.4531 | 15.4576 | 15.3310 | 0.8131 |
| IHEPC dataset. 2,075,259 data vectors in R⁷, k = 30 clusters, time limitation 5 min | | | | | |
| j-means | 87,907.7813 | 95,055.2422 | 89,657.8895 | 88,702.9297 | 2442.1302 |
| Aggl-EA ↔⇔ | 86,147.6953 | 109,817.7969 | 91,515.1658 | 86,393.3750 | 10,377.0956 |
| IHEPC dataset. 2,075,259 data vectors in R⁷, k = 100 clusters, time limitation 5 min | | | | | |
| GA-1POINT | 62,192.1719 | 64,413.2578 | 63,051.7500 | 63,028.9922 | 733.3966 |
| Aggl-EA ↔⇑ | 57,594.0703 | 72,909.5000 | 65,782.6094 | 58,506.6484 | 3208.6879 |
| IHEPC dataset. 2,075,259 data vectors in R⁷, k = 300 clusters, time limitation 5 min | | | | | |
| j-means | 44,074.6445 | 47,562.4922 | 45,375.1657 | 44,608.1953 | 1505.4071 |
| Aggl-EA ↔⇔ | 41,795.5078 | 65,057.9375 | 51,888.0642 | 42,525.5000 | 12,122.2332 |
| Mopsi-Finland dataset. 13,467 data vectors in R², k = 30 clusters, time limitation 5 s | | | | | |
| AGGL7 | 1.03013 × 10⁷ | 1.03013 × 10⁷ | 1.03013 × 10⁷ | 1.03013 × 10⁷ | 3.35641 |
| Aggl-EA ↔⇔ | 1.03013 × 10⁷ | 1.03013 × 10⁷ | 1.03013 × 10⁷ | 1.03013 × 10⁷ | 2.42022 |
| Mopsi-Finland dataset. 13,467 data vectors in R², k = 100 clusters, time limitation 5 s | | | | | |
| AGGL25 | 3.64354 × 10⁶ | 3.66605 × 10⁶ | 3.64913 × 10⁶ | 3.64784 × 10⁶ | 5184.35 |
| AGGL30 | 3.64290 × 10⁶ | 3.67201 × 10⁶ | 3.65106 × 10⁶ | 3.64702 × 10⁶ | 7780.79 |
| Aggl-EA ↔⇔ | 3.64285 × 10⁶ | 3.65052 × 10⁶ | 3.64473 × 10⁶ | 3.64502 × 10⁶ | 1467.87 |
| Mopsi-Finland dataset. 13,467 data vectors in R², k = 300 clusters, time limitation 5 s | | | | | |
| GH-VNS3 | 1.36493 × 10⁶ | 1.43665 × 10⁶ | 1.38037 × 10⁶ | 1.37633 × 10⁶ | 14,603.0 |
| Aggl-EA ↔⇔ | 1.36496 × 10⁶ | 1.40537 × 10⁶ | 1.38311 × 10⁶ | 1.38149 × 10⁶ | 12,302.2 |
| S1 dataset. 5000 data vectors in R², k = 15 clusters, time limitation 1 s | | | | | |
| GH-VNS1 | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 0.00000 |
| Aggl-EA ↔⇔ | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 1.69034 × 10⁸ | 0.00000 |
| S1 dataset. 5000 data vectors in R², k = 50 clusters, time limitation 1 s | | | | | |
| AGGL3 | 1.12426 × 10⁸ | 1.12548 × 10⁸ | 1.12465 × 10⁸ | 1.12457 × 10⁸ | 2.92460 × 10⁴ |
| GH-VNS1 | 1.12419 × 10⁸ | 1.12796 × 10⁸ | 1.12467 × 10⁸ | 1.12446 × 10⁸ | 7.47785 × 10⁴ |
| Aggl-EA ↔⇔ | 1.12476 × 10⁸ | 1.15978 × 10⁸ | 1.14049 × 10⁸ | 1.13985 × 10⁸ | 1.05017 × 10⁶ |
| S4 dataset. 5000 data vectors in R², k = 15 clusters, time limitation 1 s | | | | | |
| GH-VNS1 | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 0.00000 |
| Aggl-EA ↔⇔ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 2.27694 × 10⁸ | 78.6011 |
| S4 dataset. 5000 data vectors in R², k = 50 clusters, time limitation 1 s | | | | | |
| AGGL7 | 1.35232 × 10⁸ | 1.35449 × 10⁸ | 1.35306 × 10⁸ | 1.35294 × 10⁸ | 5.64815 × 10⁴ |
| Aggl-EA ↔⇔ | 1.35241 × 10⁸ | 1.35438 × 10⁸ | 1.35313 × 10⁸ | 1.35304 × 10⁸ | 5.26194 × 10⁴ |

Note: "↑" and "⇑" mean that the advantage of the new algorithm over the known algorithms is statistically significant ("↑" for the t-test, "⇑" for the Mann–Whitney U test); "↓" and "⇓" mean that its disadvantage is statistically significant; "↔" and "⇔" mean that the advantage or disadvantage is statistically insignificant. Significance level is 0.05. The best results are underlined.