The majority of research on the automation of map generalization has thus far been dedicated to topographic maps, with a focus on the generalization of roads, boundaries, river networks, and buildings; that is, on linear objects or small area objects. The generalization of thematic map types, including categorical maps (e.g., geological, soil, or land use maps), has received less attention, perhaps because categorical maps contain polygons of potentially arbitrary shapes and sizes, rendering them more complex than the typical shapes found, for instance, for buildings on a topographic map. Nevertheless, categorical maps, with geological maps as a prominent representative, are a frequent map type and require specific methods to automate their generalization. For instance, in topographic maps, buildings are usually of regular shape and are often arranged in a regular fashion (e.g., in linear alignments), while in categorical maps, polygon features can be of arbitrary shape, occurring in arbitrary spatial arrangements. Merely reusing the approaches and processes of topographic map generalization for categorical mapping will not provide a proper solution [1], as the requirements of categorical, and thus geological, map generalization differ from those of topographic mapping, as detailed in Section 2.
Over the past two decades, constraint-based techniques have evolved into the leading paradigm for modeling and automating the map generalization process [3] and for evaluating generalization results [7]. However, while the constraint-based paradigm is widespread in topographic mapping, generalization methods for categorical and geological maps have rarely adopted this approach (see [8] for one exception). This is somewhat disappointing, as the complexity of categorical maps would favor an approach that allows differentiated and adaptive modeling and monitoring of the conditions that govern the map generalization process.
The objective of this paper is to present a methodology using constraints, more specifically so-called size constraints, as a basis for geological map generalization. Size constraints deal with minimum area and distance relations in individual or pairs of map features. They have also been termed metric or graphical constraints [4] and express the natural limits of human perception in measurable, minimal dimensions [8]. The proposed approach first identifies a list of size constraints, their goal values, and measures; it also prioritizes the logical treatment of constraints, which in turn dictates the sequence of generalization operators and algorithms to be used in case a constraint is violated. The main driver of the proposed methodology is the “minimum area” (MA) constraint, which influences other constraints and is coupled with descriptive attributes of the polygon features being generalized. In an adaptive workflow, the MA and related constraints are successively tested and, if violated, trigger appropriate generalization operators, including object elimination (i.e., selection), enlargement, aggregation, and displacement.
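The adaptive test-and-trigger logic described above can be sketched as a simple decision loop. The following Python fragment is purely illustrative: the class, attribute names, threshold value, and the order in which operators are preferred are our own assumptions, not the paper's implementation.

```python
# Illustrative sketch of the constraint-driven workflow: the MA constraint is
# tested per polygon and, if violated, an operator is triggered. All names
# and thresholds are hypothetical, chosen only to make the idea concrete.
from dataclasses import dataclass

@dataclass
class Polygon:
    area: float          # ground area in m^2
    importance: float    # semantic importance derived from attributes (0..1)
    has_close_neighbor_same_class: bool

MIN_AREA = 62_500.0      # assumed goal value of the MA constraint, in m^2

def generalize(polygons):
    """Return a (decision, polygon) pair for each input polygon."""
    decisions = []
    for p in polygons:
        if p.area >= MIN_AREA:
            decisions.append(("keep", p))        # MA constraint satisfied
        elif p.has_close_neighbor_same_class:
            decisions.append(("aggregate", p))   # merge with a nearby neighbor
        elif p.importance > 0.5:
            decisions.append(("enlarge", p))     # protect an important unit
        else:
            decisions.append(("eliminate", p))   # drop an insignificant patch
    return decisions

decisions = generalize([
    Polygon(100_000, 0.2, False),   # large enough -> keep
    Polygon(10_000, 0.9, False),    # small but important -> enlarge
    Polygon(10_000, 0.1, True),     # small, has neighbor -> aggregate
    Polygon(10_000, 0.1, False),    # small, unimportant -> eliminate
])
print([d for d, _ in decisions])    # ['keep', 'enlarge', 'aggregate', 'eliminate']
```

In the actual methodology, displacement would additionally resolve proximity conflicts between retained polygons; the sketch only covers the MA-driven branch.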
The use of the constraint-based approach in map generalization has predominantly been linked to the agent-based paradigm [3]. Since building an agent engine, however, is by no means a trivial task, the uptake of the agent-based approach, and with it the constraint-based approach, in practice has been limited. Hence, the main contribution of this research is to demonstrate the use of constraints, with a focus on size constraints, for the automated generalization of geological maps, to which this approach has so far not been explicitly applied. Furthermore, we use a workflow-based approach consisting of a sequence of several generalization operators, as this has better potential to be adopted in practice. Although designed with geological maps in mind, the approach is also applicable to other categorical maps: maps that are entirely covered by polygonal features (i.e., so-called polygonal subdivisions), such as soil maps, vegetation maps, or land-use and land-cover maps. In experiments, we show that the proposed methodology, despite its relative simplicity, is capable of generalizing geological maps more appropriately, with better local control over the applied generalization operations, than state-of-the-art solutions that do not use a constraint-based approach. Finally, we are aware that, with its focus on size constraints only, our methodology has limitations, which we point out in Section 6, providing leads for further research.
In Section 2, the characteristics of geological maps are discussed. Section 3 provides a review of related work on geological map generalization. Section 4 represents the core of the article, introducing the elements of the size constraint-based methodology one by one. Section 5 presents the results of a series of experiments, first to illustrate the effect of the individual generalization operators, and then for different parameterizations of the combined workflow. Finally, Section 6 provides a discussion of the experimental results, while Section 7 ends the paper with conclusions and an outlook.
2. Geological Maps: Purpose and Peculiarities
Geological maps are among the most complex thematic maps, with various elaborate shapes and structures. This renders the generalization process more demanding, and thus an in-depth analysis of these structures is required before generalization [2]. Creating geological maps requires not only cartographic expertise but also general geological knowledge. Thus, for instance, knowledge of the formation and structure of the rock types occurring in a study area is crucial when it comes to identifying the relative significance of map features. However, the importance of certain map features may vary depending on the purpose and type of geological map to be made.
Bedrocks assist geologists in portraying the natural history of a study area and identifying associated rock formations. They also carry essential mineral resources such as coal, oil, and uranium, which are the focus of the mining industry. Thus, a map for mining purposes highlights ancient bedrocks that may carry particular minerals, while neglecting sedimentary rocks as they are more recent [11]. Geophysicists, in turn, place more emphasis on intrinsic characteristics of features such as porosity and permeability in rock and sediments [12].
The goal of a geological map is “to interpretively portray the spatial and temporal interactions between rocks, earth materials, and landforms at the Earth’s surface” [13]. Geological materials are the igneous, metamorphic, and sedimentary rock and surficial sediments that form the landscape around us. Most geological maps use colors and labels to show areas of different geological materials, called geological units. Geological structures are the breaks and bends in these geological materials resulting from the slow but strong forces that form the world [14]. “Geological maps show the location of these structures with different types of lines. Because the Earth is complex, no two maps show the same materials and structures, and so the meaning of the colors, labels, and lines is explained on each map” [15].
Geological maps consist of diverse patterns formed by a fabric of polygons, plus additional linear objects (e.g., fault lines) and point objects (e.g., wells), which, however, are not of concern here. The polygons can be described by spatial, structural, and semantic properties to evaluate similarities or differences and thus infer spatial patterns relating to the perceptual structure or arrangement of polygon features on the map [16]. Figure 1 shows some sample excerpts of geological maps, ordered by increasing geometric and graphical complexity. In the simple map extract of Figure 1a, few geological units are involved, with relatively simple shapes, which could be generalized merely by using simplification operators. The next level of complexity comprises many small polygons of the same or similar geological units (Figure 1b), which may be generalized by removing units or, alternatively, aggregating units or sub-units into a single group. Another level consists of a series of elongated polygons of the same unit embedded in, and possibly crossing, other units (Figure 1c), where a cartographer may recommend merging neighboring units while trying to maintain their overall arrangement (i.e., using the so-called typification operator). Another complicated form found in geological maps is the tree-like, dendritic form, created at a later stage of the Quaternary period by rivers and streams carrying sediments and other minerals; this type also traces the position of a river system (Figure 1d). In this case, a possible solution is to replace several branches with a smaller number of simplified tree branches. Figure 1e shows that various kinds of units consisting of small and large, long and narrow polygons and tree-like structures may also co-exist, rendering the generalization process even more challenging. The generalization of such complex fabrics requires making multiple, interrelated (and possibly conflicting) generalization decisions.
The few examples in Figure 1 illustrate that the polygonal layer of geological maps exhibits far greater variability in category, size, shape, boundary sinuosity, and topological (in particular containment) relations of the concerned polygons than can typically be found on topographic maps. The spatial arrangement of polygons in geological maps can take many different forms, as is noticeable in Figure 1, although even in the complex map of Figure 1e we can perceive alignments and clusters of polygons. The key feature classes in topographic maps are predominantly anthropogenic and hence tend to have more regular shapes and a lesser degree of variability, and they are often arranged in regular alignments (e.g., grid street networks or straight alignments of buildings). “Natural” feature classes in topographic maps, such as land cover, are usually restricted to a few categories (e.g., woodlands, waterbodies, built-up areas) and are of secondary priority. Hence, we would argue that while the same operators (elimination, simplification, aggregation, typification, displacement, etc.) are valid in both domains, different generalization algorithms, or at least different combinations of algorithms, have to be used to adapt to the peculiarities of geological maps or, more generally, categorical maps.
The next section offers a review of generalization methods that have been explicitly proposed to deal with the peculiarities of geological maps outlined above.
3. Related Work
The earliest significant attempt at specifically generalizing geological maps was made by the authors of [17], who tried to automate the generalization of a 1:250,000 scale geological map from a 1:50,000 scale source map (a product of the British Geological Survey (BGS)), using the conceptual model previously suggested in [18]. While the results obtained were encouraging, the BGS concluded that the strategy still required intervention by a human operator, rendering it less flexible and more subjective. Importantly, however, this early work highlighted the significance of basing the generalization process upon an understanding of the essential structures and patterns inherent to the source map, a task that is of course not unique to the case of geological maps [18].
The authors of [20] presented a conceptual workflow model dedicated to the semi-automated generalization of thematic maps, with three main phases: structural analysis, generalization, and visualization. Structural analysis (or “structure recognition” according to [18]) was deemed especially crucial, as once all relevant structures present in a map are known, this information can support the decision of “when and how to generalize” [21]. The second step of their conceptual model consisted of constraint modeling in a multi-agent system, aiming to build an objective and flexible workflow.
Inspired by the aforementioned conceptual model [20], the author of [22] developed a generalization workflow based on ArcGIS tools. A sample geological map at a 1:24,000 scale was generalized to three target scales, namely 1:50,000, 1:100,000, and 1:250,000. The results were compared with the corresponding U.S. Geological Survey (USGS) geological maps and helped to summarize the strengths and limitations of the generalization tools available within ArcGIS at the time.
Another experiment on the effectiveness of ArcGIS was conducted by Smirnoff et al. [23], who developed a cellular automata (CA) approach specifically for geological map generalization using ArcGIS tools as a basis. When comparing their results with those of a process using the generalization tools directly available in ArcGIS, they concluded that the cell-based, or cellular, automata model had essential advantages for the automated generalization of geological maps. In more recent research [24], the ArcGIS toolbox called “GeoScaler” was tested on surficial and bedrock maps. The results were evaluated and found to be adequate, and repeatable results were obtained while retaining some degree of human intervention in the process. Very importantly, since the GeoScaler toolbox was made freely available, this methodology can be used in the practice of geological mapping and has thus defined the state of the art in tools for geological map generalization, which persists today. However, the approach does not consider the individual, local properties of geological features, such as the size, shape, and orientation of the polygons or the distance between them, which are crucial for carrying over the critical patterns of the source map to the derived map. Moreover, as most geological map data exist in vector format, conversion from vector to raster format (and possibly back to vector format again) causes additional, uncontrolled loss of data accuracy.
Hence, a vector-based approach that uses a combination of generalization operators that can be applied adaptively depending on the local situation seems more appropriate. In this vein, the authors of [25] proposed algorithms for several operators of polygon generalization, including elimination, enlargement, aggregation, and displacement, based on a rolling ball principle, conforming Delaunay triangulation, and skeleton approaches. The authors of [8] proposed a list of constraints and an agent-based process for the generalization of polygonal maps, extending earlier work by the authors of [4].
Probably the most comprehensive work regarding constraint modeling for geological maps was presented by Edwardes and Mackaness [27], who developed and illustrated a rich set of constraints and proposed ways to represent these in formal language. While conceptually intriguing, the proposed set of constraints was unfortunately not linked to particular generalization algorithms and was not implemented. The study also demonstrated the vast complexity that is involved when trying to comprehensively model the constraints governing geological maps and cast these into a computer-based process.
Müller and Wang [28] proposed an automated methodology for area patch generalization that encompassed elimination, enlargement/contraction, merging, and displacement operators. They addressed a problem similar to the one dealt with in this paper, using a similar set of operators. However, their approach considered all polygons as semantically identical, whereas geological maps easily comprise over 20 rock types, demanding simultaneous consideration of the geometrical as well as attribute properties of the polygons. Their approach also leaves rather little flexibility for modifying the control parameters, as it lacks the capability of testing different solutions for a given conflict.
As argued in Section 1, the best way to detect cartographic conflicts and to formalize and control generalization algorithms for resolving such situations is by using constraints, as has been shown in topographic map generalization. In this paper, we seek to demonstrate the application of the constraint-based paradigm to the case of geological maps. As shown in Section 2 and another study [27], an almost infinite complexity is involved when trying to solve geological map generalization comprehensively. Thus, we focus on a particular generalization problem: small, free-standing polygons (area patches in the terminology of [28]), which represent a frequent case in geological maps (Figure 1b,d,e) and other types of categorical maps, such as soil maps. Since such small polygons are prone to legibility problems under scale reduction but also often represent geological units of superior economic value (Section 4.4.2), we focus on size constraints, which are simple yet allow many of the problems tied to small polygons to be addressed. A limited set of rather simple constraints also facilitates tracing the effects of using a constraint-based approach in polygonal map generalization.
This paper presented a methodology for geological map generalization using a constraint-based approach. The main objective was to demonstrate the usefulness of the constraint-based approach when applied to geological maps, not to solve geological map generalization comprehensively (more feature classes are addressed in [23]). This was done against the background that the constraint-based approach has so far not been explicitly applied to the automated generalization of geological maps. Hence, the proposed methodology focused on a sub-problem of geological map generalization, though nevertheless an important one. In particular, it focused on size constraints as the primary driving force for maintaining map legibility through map generalization. Furthermore, we focused on the case of small, free-standing polygons (or area patches; [28]), as they represent a frequent case not only in geological maps but also in other types of categorical maps, such as soil maps or land cover maps.
Based on the experiments reported in Section 5, we can make several general observations:
The proposed methodology resolves the main legibility problems associated with small polygons in a step-by-step manner. The resulting map is more readable, and map features remain distinguishable after generalization (Figure 13).
Although the goal values of the size constraints are defined globally, each polygon is treated individually according to its own specific properties, rather than by a global process such as cellular automata. Hence, by considering the semantics of individual polygons, important polygons can be protected by enlargement; by considering shape properties, the shape characteristics of the individual polygons are largely maintained.
The approach, by being based on testing whether the goal values of constraints are met, allows self-evaluation and can guarantee that the legibility limits are met, but not more than that. Again, this differs from global approaches such as cellular automata [23].
Few parameters are required to control the generalization methodology. The process is initially triggered by the MA constraint and further assisted by a few additional size constraints (most importantly, the OS constraint). Once the goal values of the size constraints and additional algorithm-specific parameters have been set, the methodology operates automatically, without further human intervention.
The constraints’ goal values are, first of all, a function of map legibility at the target scale, and hence allow adapting to the desired scale transition. Furthermore, the goal values also allow for controlling the overall granularity of the output map (Table 4 and Figure 15), depending on the map purpose.
Despite the rather low number of constraints and control parameters, the methodology is modular and features several generalization operators, thus achieving considerable flexibility.
Because the proposed methodology builds on the universally valid concept of constraints relating to map legibility, the potential exists to apply the same approach to the generalization of other categorical maps, such as vegetation or soil maps. Furthermore, since the workflow is modular, any of the algorithms used in our methodology could be replaced by another algorithm that is perhaps better suited for the peculiarities of the given generalization problem. Thus, for instance, the aggregation or displacement algorithms used in this paper could be replaced by more elaborate algorithms if so desired.
Using a workflow approach, we showed how the proposed methodology could be implemented using the libraries of a general-purpose GIS, achieving behavior similar to that of an agent-based system, but with less implementation effort and less effort for parameter tuning.
Comparing Figure 10, Figure 11, Figure 12 and Figure 13, which present the results of the individual generalization operators, it becomes obvious that elimination and aggregation have the most far-reaching effects. Elimination may retain, or alternatively destroy, both the patterns of spatial arrangements of polygons and the balance of area proportions of the different categories appearing on the map. Below, we discuss the available elimination algorithms separately through Test Case 1. Aggregation is the other operator that can lead to significant alteration of the map image, in this case, however, through area gain and shape modification of the polygons concerned. As Figure 12 suggests, the aggregation algorithm implemented in our workflow can sometimes lead to undesirable shape and area changes, and would better be replaced by a typification algorithm [36], which, however, would require preceding detection of group patterns amenable to typification [10].
Elimination is the first but also the most radical step of the methodology, as some polygons are permanently removed from the target map. Selective elimination of small island polygons, supported by importance values, makes it possible to preserve polygons that are important relative to others, thus reflecting geological importance or economic value. However, the elimination process does not consider the connectivity between neighboring polygons, which may lead to the removal or uncontrolled dissolution of certain group patterns (e.g., clusters, alignments) of polygons. Examples of this effect can be observed by comparing Figure 10, left (the original map), and Figure 10, right (the map after elimination).
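The importance-supported elimination just described can be illustrated with a minimal ranking sketch. The field names, the combined score, and the example values are hypothetical; in particular, note that neighborhood or group membership plays no role in the score, which is exactly the shortcoming discussed above.

```python
# Hypothetical sketch of selective elimination: island polygons below the
# minimum area are ranked by a combined size/importance score, so that small
# but important polygons survive. All names and weights are our assumptions.
def elimination_candidates(polygons, min_area, n_remove):
    """Return the ids of the n_remove least significant sub-minimum polygons."""
    too_small = [p for p in polygons if p["area"] < min_area]
    # The score mixes geometry and semantics; connectivity to neighboring
    # polygons is deliberately NOT considered, mirroring the limitation
    # noted in the text (group patterns may be dissolved).
    too_small.sort(key=lambda p: p["area"] * p["importance"])
    return [p["id"] for p in too_small[:n_remove]]

polys = [
    {"id": "a", "area": 5_000, "importance": 0.9},   # small but important
    {"id": "b", "area": 4_000, "importance": 0.1},
    {"id": "c", "area": 80_000, "importance": 0.1},  # above the threshold
    {"id": "d", "area": 6_000, "importance": 0.2},
]
print(elimination_candidates(polys, min_area=10_000, n_remove=2))  # ['b', 'd']
```

Polygon "a" is smaller than "d" yet survives because of its high importance, while "c" is never a candidate since it satisfies the minimum area.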
To address this issue, we ran Test Case 1 on three selection methods (Section 5.6.1): Radical Law selection, area loss–gain selection, and category-wise selection. As Figure 14 and Table 3 show for the scale transition to 1:50,000, area loss–gain selection preserved the number and structure of polygons best. In contrast, Radical Law selection had the most destructive effect on polygon groups. However, when the scale transition is larger, e.g., with a target scale of 1:100,000 or 1:200,000, more polygons have to be removed to balance the area gain induced by enlargement. Thus, category-wise selection seems to be the better approach for the elimination operation at greater scale reductions, as it distributes the removal of polygons evenly across the categories. In general, Radical Law selection is not recommended for use in categorical map generalization unless more strongly generalized (“overgeneralized”) maps are desired. One of the essential requirements of categorical map generalization is to preserve areas as much as possible across scale ranges, which can be achieved using the area loss–gain balance selection method. Based on Test Case 1, however, the category-wise selection method ultimately seems more appropriate, as it evens out the elimination of polygons across categories and thus represents a good compromise. Nevertheless, regardless of which of the three selection methods is used, none can safeguard against the inadvertent destruction of group patterns, especially at greater scale reductions. The actual generalization stage should thus be preceded by a stage devoted to the recognition of essential collective patterns present in the source map [16], which led us to develop a process for the recognition of polygon group patterns [10].
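For reference, the Radical Law (Töpfer's law) underlying the first selection method estimates the number of features to retain from the ratio of the scale denominators. The sketch below uses illustrative scales, not the case-study values of this paper.

```python
# Töpfer's Radical Law: n_target = n_source * sqrt(source_denom / target_denom).
# The scales and feature count below are illustrative examples only.
import math

def radical_law(n_source, source_denom, target_denom):
    """Number of features to retain after reducing from 1:source to 1:target."""
    return round(n_source * math.sqrt(source_denom / target_denom))

# e.g., 400 polygons at 1:25,000 generalized to 1:50,000 and 1:100,000:
print(radical_law(400, 25_000, 50_000))    # 283
print(radical_law(400, 25_000, 100_000))   # 200
```

Because the law only prescribes a global count, it says nothing about which polygons to drop, which is why it tends to destroy polygon groups, as observed in Test Case 1.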
In Test Case 2, documented in Figure 15 and Table 4, we compared three different sets of goal values to explore their effect on the granularity of the resulting map. The most fine-grained of the three sets (FG) was originally developed for topographic mapping [30]. As Figure 15 shows, it however reaches its limits for the generalization of geological maps, which are usually populated by tinted, irregularly shaped polygons that often have reduced visual contrast (as is particularly the case, for instance, with the yellow pegmatite polygons). For such maps, the goal values of the FG set seem too detailed. Thus, in [31], the use of considerably higher goal values for geological maps was proposed, which produces a rather coarse-grained result with more polygons being enlarged and aggregated. As Figure 15 suggests, this Coarse-grained (CG) set of goal values may lead to overgeneralization and excessive loss of detail, particularly when the focus is on small polygons, as is the case in this paper. This is confirmed by the average polygon area and the number of polygons (Table 4), which are markedly different from those of the other two sets of goal values. The Compromise (CM) set of goal values allows a more differentiated picture to be retained, while ensuring that legibility is maintained. Note that in this set, the values were chosen to lie closer to those of the FG set than to those of the CG set, with the intention of retaining as much detail as possible, thus also allowing a greater range of scale reduction. This is also reflected by the results reported in Table 4 for the average polygon area and the number of polygons, which are very similar to those obtained with the FG set.
In Test Case 3, we used the Compromise goal values and the Category-wise selection strategy to produce a series of generalized maps, shown in Figure 16 at the corresponding target scales. While the resulting map at 1:50,000 overall looks convincing, the map at 1:100,000 starts showing signs of visual imbalance. At the scale of 1:200,000, the map breaks down completely. The visual impression is supported by Table 5, which shows that the relative area per class becomes heavily imbalanced beyond 1:100,000. In the original map, the Amphibolite and Pegmatite categories are the most frequent ones in terms of the number of polygons, owing to the fact that they mostly occur as small polygons. Hence, they are candidates for removal due to their generally small size, and their number decreases significantly in the transition from 1:100,000 to 1:200,000 (Table 5). However, as they are so numerous, these two categories are also the ones most affected by enlargement and particularly by aggregation; hence, the area they gain in the generalization process is disproportional. Again, this effect suggests that beyond a scale of 1:100,000 the results are no longer pleasing. Once again, this points to the necessity of an approach that is based on the recognition of local group structures, which can then inform improved contextual generalization operators [10].
A further comparison was made between the proposed constraint-based methodology and the CA approach of GeoScaler [23]; both methods indeed yielded comparable results in that legibility was improved. However, the CA approach, as it is based on a moving-window operator that applies the same, essentially majority-based principle across the entire map, has a tendency to dilate larger polygons while eroding small polygons. Hence, more small polygons disappeared than in the result of the proposed methodology, and some polygons became hardly visible. GeoScaler includes post-processing operations, such as the enlargement of too small polygons, which can be applied to ensure the legibility of all polygons. However, even that post-processing operation will not bring back small polygons that have already been removed. Thus, overall, the proposed constraint-based methodology seems to outperform the CA-based approach regarding the adequate maintenance of small polygons. Additionally, due to the raster-based majority filtering of the CA approach, characteristic polygon shapes also become more uniform than in our proposed methodology.
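The erosion effect of majority-based filtering can be made concrete with a toy example: a categorical raster cell whose 3x3 neighborhood is dominated by another class is simply overruled. This is a minimal sketch of a generic mode filter, not GeoScaler's actual implementation.

```python
# Minimal majority (mode) filter over a categorical raster, illustrating why
# a CA-style moving-window approach erodes small polygons: a one-cell patch
# of class 2 inside class 1 is overruled by its neighborhood majority.
# Toy sketch only; not GeoScaler's actual algorithm.
from collections import Counter

def majority_filter(grid):
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            # collect the 3x3 neighborhood, clipped at the raster borders
            neigh = [grid[rr][cc]
                     for rr in range(max(0, r - 1), min(rows, r + 2))
                     for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = Counter(neigh).most_common(1)[0][0]
    return out

grid = [
    [1, 1, 1],
    [1, 2, 1],   # a small, one-cell "polygon" of class 2
    [1, 1, 1],
]
print(majority_filter(grid))  # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

The class-2 cell vanishes entirely, and no amount of post-processing on the filtered raster can restore it, which mirrors the behavior discussed above.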
We have already mentioned the lack of explicit polygon group recognition in the elimination operator as the first shortcoming of the proposed methodology. Here, we see a second weakness: because the methodology focuses on small polygons and favors small polygons of important geological units by enlarging them, those small polygons grow disproportionately when the scale reduction factor is large. Once again, this points to the necessity of treating groups of polygons rather than individual polygons. While the methodology of this paper includes contextual generalization operators (aggregation, displacement), “context” is merely understood as the immediate, first-order neighborhood defined by the OS constraint. If polygons are separated by a distance slightly exceeding the OS limit, no link is detected. Likewise, there is no facility for detecting higher-order neighbors and hence forming larger, contiguous groups of polygons. Hence, for a full treatment of the small polygons forming the focus of this paper, group pattern detection procedures are needed, as well as contextual generalization operators that can make use of such group patterns. In related work [10], we therefore propose a methodology that considers proximity as well as geometrical similarities (shape, size, and orientation of map features) and attribute similarities to find, refine, and form groups that can subsequently inform contextual generalization operators, such as aggregation and typification, in order to overcome persistent limitations of generalization approaches such as those shown in this paper.