A New Way of Cataloging Research through Grounded Theory

Navas, Gustavo; Yagüe, Agustín

doi:10.3390/app13105889

by Gustavo Navas ¹,²,* and Agustín Yagüe ¹

¹ Escuela Técnica Superior de Ingeniería de Sistemas Informáticos ETSISI, Universidad Politécnica de Madrid (UPM), Calle Alan Turing s/n, Ctra. de Valencia, Km. 7, 28031 Madrid, Spain

² IDEIAGEOCA Research Group, Universidad Politécnica Salesiana, Morán Valverde S/N y Rumichaca, Quito 170702, Ecuador

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(10), 5889; https://0-doi-org.brum.beds.ac.uk/10.3390/app13105889

Submission received: 28 March 2023 / Revised: 19 April 2023 / Accepted: 25 April 2023 / Published: 10 May 2023

(This article belongs to the Section Computing and Artificial Intelligence)

Abstract:

Grounded theory (GT) has been extensively used in social studies through surveys and interviews. However, its application in software development has not been appropriately categorized, limiting its in-depth study in this field. Additionally, the qualitative analysis provided by GT is in increasing demand in software engineering, presenting a significant opportunity to further investigate this topic. This article discusses the identification and analysis of key GT elements beyond traditional data sources, such as research results, engineering artifacts, and written documents, and introduces the role of basic coding, the master core category, and the emerging theory, thus showing a way to present the results of GT studies in software development. The study provides valuable insights for researchers and practitioners interested in applying GT in software development. The article also explores the crucial role of constant comparison until saturation and the challenges it presents. Additionally, the integration of Glaserian grounded theory (GGT) with systematic mapping study (SMS) is examined, resulting in a novel approach called Glaserian systematic mapping study (GSMS), which defines saturation through three equations, providing a set of components that satisfactorily categorize GT in software development.

Keywords:

Glaserian systematic mapping study; grounded theory; software development

1. Introduction

The study of GT in software development environments is challenging due to the elusive classification of GT. The variety of versions, diverse forms of application, and the relative novelty of qualitative analysis in software engineering, combined with the multitude of knowledge fields in which GT can be used, make its use difficult. In software engineering, reviewers and expert editors still face difficulties in understanding the nature and boundaries of GT. Researchers tend to be overly cautious in writing their articles and provide excessive explanations of and justifications for the GT method [1]. This conservative approach may be due to the perception that engineers receive insufficient training in the research methodologies of social sciences [2]. As a result, there are limited literature reviews on GT in software development and related topics, though the number of scientific publications that focus on literature reviews and systematic mapping is increasing [3,4,5].

It is clear from various bibliographic databases that the main challenge is the difficulty in classifying scientific papers in software development within the variants of GT [3,4,5], which has hindered the ability to conduct a deeper analysis. This lack of clear classification has led to difficulties in comparing and synthesizing findings across studies, limiting the depth of analysis and hindering the advancement of knowledge in the field of software development. Qualitative analysis, including GT, has paved the way for new methods of studying and analyzing software development. GT provides new insights into the study of humans and their environment [6,7,8], particularly regarding the cost of a project and the performance of software equipment [2].

Data are a crucial aspect in both software development and GT. However, the relationship between data in GT and software development has not been thoroughly examined. Software development deals with a wide variety of data that impacts the various aspects of its processes. At the same time, GT involves the collection and analysis of data until theories emerge and saturation is reached [2,5,9].

GT is typically utilized in the social study of individuals, groups, or communities. However, when applied to software engineering, GT faces new challenges as it becomes a socio-technical issue [10]; in this context, the subjective (sociological) aspect is intertwined with the objective (technical) aspect [10]. This convergence of social and technical aspects adds complexity to the GT analysis in the context of software engineering.

In software development, various types of data are processed, including both the volume of information and data related to processes, activities, artifacts, and the individuals and teams involved. Additionally, a diverse range of data [4] is used in the GT studies applied to software development (all data are relevant) [6,8,11]. However, this diversity has not been adequately addressed in previous studies.

Many articles applying GT in software development do not clearly state which variant of GT they are using or if they are combining variants [5]. There are also some challenges when using GT in software development. Firstly, GT is demanding and requires strict adherence to its methodology, making it difficult to implement in practice [12]. Secondly, the complexity of the processes in software development adds to the difficulty of conducting GT [13]. Additionally, there are several variants of GT, the most prominent being Glaserian and Straussian, which have significant differences [8]. Lastly, it is unclear whether it is appropriate to mix these variants [7] or not [8] when conducting GT research.

This research employs the Glaserian systematic mapping study (GSMS) methodology developed by Navas [14] to thoroughly search through selected scientific articles. This approach allows for the consideration of new questions during the data analysis process and the implementation of a series of iterations guided by three equations, which the authors call search saturation [14]. In GSMS, the Glaserian grounded theory (GGT) is employed to analyze the results and refine the SMS classification schemas through iterative processes, ultimately leading to identifying the core category and associated emerging theories [14]. Literature review documents are considered data in this context [15].

This research aims to categorize the use of GT in the field of software development using grounded theory elements (GTe or GT Elements), which are described in Section 4.4 “Data Analysis: Iteration 1”.

This paper is organized as follows: Section 2 covers GT definitions and concepts, the differences between versions, and the diversity of criteria around them. Section 3 describes the application of the GSMS methodology. Section 4 addresses the discussion. Finally, Section 5 draws the conclusions.

2. Grounded Theory: Concepts, Differences, and Disagreements

GT is a qualitative research methodology that aims to develop or test a theory about the subject under study. It is commonly used to study human behavior and interactions with the environment. GT involves integrating empirical data into the creation of a theory through the development of abstract concepts based on time, place, and people [16]. The process involves searching for relevant data and conducting theoretical sampling to establish categories and subcategories until no new ones emerge [7].

GT was created in 1965 through research on the awareness of death by two sociologists, Glaser and Strauss [17], who developed a methodology to understand death in terminally ill patients. It was consolidated in 1972 in the book by the same authors [6]. Later, the way in which the methodology was applied changed, resulting in two GT variants: Glaserian and Straussian. Glaserian grounded theory (GGT) is viewed as more faithful to the original method and is also called classical grounded theory. Straussian grounded theory (SGT) is also called Evolved Grounded Theory [8]. The differences between the two GTs are shown in Table 1.

The fundamental differences between the two GT variants are:

Glaser emphasized the search for emerging theories. In contrast, Strauss highlighted the importance of a systematic approach and validation criteria [18].
Glaser paid special attention to the meaning of the data and asked: “What do we have here?”. On the other hand, Strauss studied every word present in the data and asked: “What if?” [8].
In GGT, the research problems and questions are not previously defined, and the literature should not be revised at the beginning of the GT process, while SGT proposes the opposite [19].
In GGT, open coding starts the data analysis; in SGT, open coding is used for data to be decomposed and reach higher levels of abstraction [19].

Table 1. Specific differences between the two GTs. Adapted from [8,19].

Glaserian	Straussian
Open Coding
Start the data analysis	The datum is broken down to a higher level of abstraction
Axial Coding
Not used	It runs parallel to open coding
Selective Coding
Identification of the core category	It is similar to axial coding but with a higher level of abstraction
Regarding Research Questions
The problem is not established initially; therefore, it does not start with the research questions. There is only one central concern.	There are research questions at the beginning
Theoretical Sensitivity
Generates concepts from the data through the process of abstraction	The process of abstraction is based on the researcher and their knowledge about the subject
Core Category
It is the crux and the theory is developed around it	Selective coding provides an abstract category; if this does not adequately cover the GT, a new abstract category is chosen, and the cycle is repeated until reaching the core category.

The two GT variants differ significantly in their approach, initial considerations, analysis principles, coding techniques, memo writing, use of diagrams, writing phase, and evaluation criteria. Both have their merits as research methods, but researchers must decide which method they want to use since they cannot be combined [8]. Researchers are advised to state clearly which GT method they use and not to become involved in the debate surrounding which is the “best” method [19,20].

3. Research Methodology

This section summarizes the applied Glaserian systematic mapping study (GSMS) methodology [14]. GSMS generates a theory about the results of the research work conducted [14,19], as depicted in Figure 1. The first phase is data collection (S1), which starts with collecting information about the topic of interest. The second phase is data analysis (S2), where the data are analyzed using constant comparisons to develop a theory and identify new categories until saturation is reached. The third phase is literature comparison (S3) [14], where the results of the data analysis are compared with the existing literature, as shown in Figure 1.

The terms “phase,” “stage,” and “step” were used similarly to the terminology used in Navas’ paper [14]. Specifically, “phase” refers to the processes in GSMS, “stage” refers to GGT, and “step” refers to SMS.

The GSMS Data collection S1 phase takes the following steps from SMS: Research Questions Definition (p1), Conduct Search (p2), and Screening of Papers (p3). GSMS Data Analysis S2 comprises a complex grid structure using the SMS steps Keywording (p4), Mapping (p5), and Synthesis (p7) along with GGT processes (open, selective, and theoretical Coding), conducted through iterations to obtain an appropriate abstraction level until saturation [14]. Finally, in the comparison with the literature S3 phase, the Rigor and relevance assessment (p6) step is incorporated from SMS.
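
The assignment of SMS steps to GSMS phases described above can be written down as a small lookup structure. The following Python sketch is only illustrative: the names `GSMS_PHASES` and `phase_of` are hypothetical and introduced here for the example; they are not part of the GSMS definition in [14].

```python
# Illustrative mapping of GSMS phases to the SMS steps they reuse.
# GSMS_PHASES and phase_of are hypothetical names; the content follows the text.
GSMS_PHASES = {
    "S1": {
        "name": "Data collection",
        "sms_steps": {
            "p1": "Research Questions Definition",
            "p2": "Conduct Search",
            "p3": "Screening of Papers",
        },
        "ggt_coding": [],
    },
    "S2": {
        "name": "Data analysis",
        "sms_steps": {"p4": "Keywording", "p5": "Mapping", "p7": "Synthesis"},
        # Each S2 step is carried out through the three GGT coding stages.
        "ggt_coding": ["open", "selective", "theoretical"],
    },
    "S3": {
        "name": "Comparison with the literature",
        "sms_steps": {"p6": "Rigor and relevance assessment"},
        "ggt_coding": [],
    },
}


def phase_of(step: str) -> str:
    """Return the GSMS phase (S1, S2, or S3) that incorporates an SMS step."""
    for phase, info in GSMS_PHASES.items():
        if step in info["sms_steps"]:
            return phase
    raise KeyError(f"unknown SMS step: {step}")
```

For instance, `phase_of("p6")` returns `"S3"`, reflecting that the rigor and relevance assessment is moved out of the analysis loop into the literature comparison.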

GSMS incorporates the GGT sub-processes of open, selective, and theoretical coding, whose outcomes are concepts, categories, and propositions, respectively [2]. The concepts emerge from the data and are the basic ideas that, when related, lead to the categories [14]. Finally, the propositions build on the previous concepts and categories to produce a discursive set of theoretical statements.

The application of GSMS data analysis is conducted through several iterations. The beginning of the iteration uses previous iteration outcomes, a sample of which can be seen in Section 4.3 and Section 4.4. In each iteration in GSMS [14], each SMS step is described through the GGT stages, ending with the corresponding outputs for the next iteration. For the case of iteration 0, its inputs are the outcomes of the data collection stage, as presented in Section 4.3.

Sometimes, within an iteration, it is necessary to study several components or elements, and for each detail, the Keywording, Mapping, and Synthesis stages must be developed in a particular loop; see, for example, Section 4.4.1 and Section 4.4.2.

Finally, in the comparison with the literature S3 phase, in the works where the reviews of scientific articles focusing on software engineering are conducted within the variants of GT [3,4,5], there are different levels of classification with relatively low values within them, probably because it has not been adequately assessed.

4. Study Case

The following section explains the previously described methodology used in a specific case related to applying grounded theory in software development environments.

4.1. GSMS Data Collection Phase

This section describes the GSMS data collection phase and comprises three subsections: definition of the research questions, search performance, and screening of papers (see Section 3).

4.1.1. Definition of the Research Questions Phase

The research questions definition phase of GSMS establishes the initial research question or questions. In GSMS, it is recommended that the research question be generic, following the paradigm proposed by GGT, ‘What do we have here?’ This phase also determines the outcome, ‘Review Scope’, on the topic under study and therefore drives the data analysis phase, as illustrated in Figure 2 (label P1), with its corresponding outcome Review Scope (Figure 2, O1).

In this research, it is interesting to know which areas of software development are suitable for studying with GT and which are not. The goal is to locate the application domain of GT in the area of software development. The aim is to establish relationships between different types of GT and various fields of software development, as well as different types of article distributions, such as annual mapping, types of GT, and areas of software development.

Additionally, it is desired to know if GT is appropriately applied within software development and among its various activities. Due to qualitative data analysis in general and GT being relatively new in software development, its application and correct use in the area are essential.

Finally, it is interesting to determine if GT is useful or not for software engineers and to see its practical utility in the industry. Given the importance of software engineering and its applicability in the industry, it is essential to locate the usefulness of GT.

The previous paragraphs determined the three research questions of this paper, which are:

RQ1: In what context is GT appropriate in the field of software development?

RQ2: Is GT applied correctly in the process and tasks of software development?

RQ3: Is GT useful for software engineers in the industry?

4.1.2. Label P2: Search Performance Phase

The search performance phase collected initial sources from scientific article libraries’ databases, including ISI Web of Science, IEEE, Scopus, and ACM. A rigorous and systematic procedure was employed [21], which started with a search string established based on the requirements of this paper. The aim was to determine if this search string was appropriate for achieving the desired outcome. To this end, a series of tests were conducted.

This work tested different search strings on the ISI Web of Science database to define the final one. The first attempt was to use long and complex strings, such as ((“software development” OR “requirements engineering” OR “information systems” OR “computer systems”) & “Grounded Theory”), which could lead to errors or exclude relevant topics [21]. The second attempt was to use short terms, such as “software,” “software development,” and “Grounded Theory.” This approach yielded more articles, as shown in Table 2. The first rows are related to terms associated with software; the following rows are about GT and the combination of the two terms.

The combination was addressed by combining the two search strings “Grounded Theory” & “Software Development”, while excluding terms such as “requirement,” “design,” or “development.” This search yielded 197 articles, as shown in Table 2.
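
The two styles of search string discussed above can be composed programmatically, which makes it easier to test variants systematically. The sketch below is an assumption-laden illustration: the helpers `quote`, `or_group`, and `and_join` are hypothetical, and the `&` operator is copied from the strings quoted in the text; each database has its own query syntax, which should be checked before use.

```python
# Minimal sketch for composing boolean search strings.
# Helper names are hypothetical; "&" mirrors the operator quoted in the text.

def quote(term: str) -> str:
    """Wrap a term in double quotes so it is searched as an exact phrase."""
    return f'"{term}"'


def or_group(terms: list[str]) -> str:
    """Join quoted terms with OR inside one pair of parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def and_join(*parts: str) -> str:
    """Join already-formatted parts with the & operator."""
    return " & ".join(parts)


# First attempt: a long, complex string.
long_query = "(" + and_join(
    or_group([
        "software development",
        "requirements engineering",
        "information systems",
        "computer systems",
    ]),
    quote("Grounded Theory"),
) + ")"

# Second attempt: a short combination of two terms.
short_query = and_join(quote("Grounded Theory"), quote("Software Development"))
```

Keeping the terms in lists rather than in hand-written strings reduces the risk of unbalanced parentheses or quotes, one of the error sources of long strings.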

4.1.3. P3: Screening of Papers

In the screening phase (P3), the search string “Grounded Theory and Software” was used (Table 2), which resulted in 700 articles from the Web of Science database, 348 from IEEE, 418 from Scopus, and 284 from ACM, for a total of 1750 articles. Inclusion and exclusion criteria were applied to these articles, as outlined in Table 3 and listed below.

The inclusion criteria were:

Peer-reviewed documents.
Only primary papers were used.

The exclusion criteria were:

Duplicated papers.
Paper not written in English.
GT was not mentioned in the abstract.
Documents in which the GT was used as an alternative methodology.
Software development and GT were not the focus of the article.
GT within a software implementation study.
Dark literature.
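
The criteria above amount to a filter over candidate papers. The following Python sketch shows one possible way to encode them; the `Paper` record and all of its field names are hypothetical and chosen only for illustration, since the study itself applied the criteria manually and with the databases' own tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical screening record; every field is an illustrative assumption."""
    title: str
    peer_reviewed: bool = True
    primary_study: bool = True
    duplicate: bool = False
    in_english: bool = True
    gt_in_abstract: bool = True
    gt_as_alternative_method: bool = False
    sd_and_gt_are_focus: bool = True
    gt_within_implementation_study: bool = False
    dark_literature: bool = False


def exclusion_reasons(p: Paper) -> list[str]:
    """Return every criterion the paper fails; an empty list means it is kept."""
    checks = [
        (not p.peer_reviewed, "not peer-reviewed"),
        (not p.primary_study, "not a primary paper"),
        (p.duplicate, "duplicated paper"),
        (not p.in_english, "not written in English"),
        (not p.gt_in_abstract, "GT not mentioned in the abstract"),
        (p.gt_as_alternative_method, "GT used as an alternative methodology"),
        (not p.sd_and_gt_are_focus, "software development and GT not the focus"),
        (p.gt_within_implementation_study, "GT within a software implementation study"),
        (p.dark_literature, "dark literature"),
    ]
    return [reason for failed, reason in checks if failed]


def is_selected(p: Paper) -> bool:
    """A paper is selected only if it fails none of the criteria."""
    return not exclusion_reasons(p)
```

Returning the list of reasons, rather than a single boolean, makes it possible to tabulate how many papers each criterion removes, as a table of per-criterion reductions requires.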

The initial list of articles was obtained from the four selected databases, as shown at the top of Table 3. After applying the inclusion and exclusion criteria, the number of articles was reduced, as shown at the bottom of Table 3, resulting in 70 papers. Then, a thorough review of the documents was conducted using a snowball process, leading to the removal of three articles [22,23,24], which included a literature review, and the addition of three more [25,26,27].

Table 3. Database with the exclusion and inclusion criteria.

Database	Web of Science		IEEE		Scopus		ACM
Description	Minus	Total	Minus	Total	Minus	Total	Minus	Total
Initial articles	-	700	-	348	-	418	-	284
Duplicates in the previous database	-	700	121	227	264	154	21	263
Duplicates	43	657	16	211	0	154	25	238
Not in English	22	635	0	211	0	154	0	238
The word “Grounded” does not appear in the article’s title and/or abstract	56	579	19	192	0	154	186	52
Software development is not the topic	214	365	-	192	0	154	36	16
Grounded theory is not the topic	238	127	125	67	137	17	0	16
Software development and grounded theory are not core topics	104	53	61	6	15	2	7	9
Total = 70 [28]		53		6		2		9
Total, after the inclusion criteria (primary papers)		53		6		2		9
Total = 70 (after the snowball process)		53		6		2		9
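
As a quick consistency check on the totals reported for the screening, the initial and final per-database counts can be summed. The snippet below is a minimal sketch that uses only the first and last rows of Table 3; the variable names are assumptions made for the example.

```python
# Initial hits and final selected papers per database, taken from Table 3.
initial_hits = {"Web of Science": 700, "IEEE": 348, "Scopus": 418, "ACM": 284}
selected = {"Web of Science": 53, "IEEE": 6, "Scopus": 2, "ACM": 9}

total_initial = sum(initial_hits.values())
total_selected = sum(selected.values())

# Share of the initial hits that survived screening, per database (in percent).
retention = {
    db: round(100 * selected[db] / initial_hits[db], 1) for db in initial_hits
}

print(total_initial, total_selected)
print(retention)
```

The two sums reproduce the 1750 initial articles and the 70 selected papers stated in the text.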

Table 4. Software development fields.

Description	No. Papers	Percent (%) over Classified Articles (42)	Percent (%) over Total Papers
Software Requirements	6	14.3	10.0
Software Design	4	9.5	4.3
Software Construction	5	11.9	7.1
Software Testing	6	14.3	8.6
Software Maintenance	1	2.4	1.4
Software Engineering Process	5	11.9	7.1
Software Engineering Models and Methods	12	28.6	15.7
Software Quality	2	4.7	2.9
Software Engineering Management	1	2.4	1.4
Software Configuration Management	0
Total	42	100	58.5

4.2. GSMS Data Analysis Phase

This article only covers the first two iterations (0 and 1) related to the paper’s objectives, even though a complete data analysis was performed with four iterations. The first iteration deals primarily with GT, while the second focuses on software development. The third iteration examines the relationship between GT and software development. The final iteration aims to demonstrate that saturation was achieved by fulfilling the three equations proposed by Navas and Yagüe [14].

According to the three equations of Navas [14], the data analysis starts with RQ₁, RQ₂, and RQ₃, which are the input for iteration 0.

$Q_{DC} = \{\sum_{i=1}^{n} {RQ}_{i}\} = \{{RQ}_{1}, {RQ}_{2}, {RQ}_{3}\}$, (1)

where $Q_{DC}$ is the set of three research questions of the data collection.

$A_{DC}(k) = \sum_{j=1}^{m} {AQ}_{j} = \emptyset$, (2)

where $A_{DC}(k)$ is the set of answered questions of the data collection, which is currently empty.

4.3. Data Analysis: Iteration 0

Navas and Yagüe established, in their methodology [14], that inputs of the current iteration are outcomes of the previous iteration. In the case of iteration 0, its inputs are outputs of the data collection phase. It comprises the research questions to be answered and the set of papers to be analyzed. In the case of this research, they are:

Research questions “RQ1, RQ2, RQ3”.
The 70 papers concerned with grounded theory and software engineering.

It is essential to note that open, selective, and theoretical coding lead to concepts, categories, and propositions, respectively. Figure 3 shows the activity diagram corresponding to iteration 0. For the purpose of clarity, the codes associated with concepts, categories, and propositions are presented in curly brackets, for example, {Grounded Theory} and {Software Development}.

4.3.1. Keywording

Keywording is a crucial component of the GSMS methodology, which involves the constant comparison of data and abstractions through open, selective, and theoretical coding that are associated with concepts, categories, and propositions, respectively [14]. The goal of GSMS keywording is to develop a classification schema based on the previous knowledge in the literature. Figure 4 shows the saturation search process.

In open coding, two concepts were identified: software development and GT. These concepts have been extensively studied in the scientific literature, and they were compiled in two lists of relevant terms: one for {Software Development}, obtained from Swebok v3.0 [13], and the other for {Grounded Theory variants} [5,8,9].

In selective coding, categories were established based on these lists. The {Grounded Theory list} [5,8,9] had two variants: {Glaserian Grounded} and {Straussian Grounded}. The {Software Development list} [13] included ten categories that are the body of knowledge for software development: 1. {Software Requirements}, 2. {Software Design}, 3. {Software Construction}, 4. {Software Testing}, 5. {Software Maintenance}, 6. {Software Configuration Management}, 7. {Software Engineering Process}, 8. {Software Engineering Models and Methods}, 9. {Software Quality}, and 10. {Software Engineering Management}.

Theoretical coding seeks to establish propositions, which are abstract and deeper levels of the categories obtained in selective coding, and they encompass several concepts. These propositions were refined and evolved into specific ways about software development and GT, and were unified into what is called the {Pre-classification schema}.

Keywording aims to find a classification schema in an SMS. On the other hand, the {Pre-classification schema} is defined as arising from previous knowledge for the current iteration and refers to the two lists related to the knowledge available about software development and grounded theory.
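
The {Pre-classification schema} resulting from keywording is essentially two lists of codes. A minimal Python sketch of that structure follows; the names `PRE_CLASSIFICATION_SCHEMA` and `split_codes` are hypothetical, while the list contents are the categories named in the text.

```python
# The two lists of the pre-classification schema, as named in the text.
PRE_CLASSIFICATION_SCHEMA = {
    "Software Development": [
        "Software Requirements",
        "Software Design",
        "Software Construction",
        "Software Testing",
        "Software Maintenance",
        "Software Configuration Management",
        "Software Engineering Process",
        "Software Engineering Models and Methods",
        "Software Quality",
        "Software Engineering Management",
    ],
    "Grounded Theory variants": ["Glaserian Grounded", "Straussian Grounded"],
}


def split_codes(paper_codes: set[str]) -> dict[str, list[str]]:
    """Group a paper's codes by schema list; codes outside the schema are ignored."""
    return {
        list_name: [code for code in codes if code in paper_codes]
        for list_name, codes in PRE_CLASSIFICATION_SCHEMA.items()
    }
```

A paper coded with {Software Testing} and {Glaserian Grounded} would then appear in both lists, while a paper with no schema codes would yield two empty lists.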

4.3.2. Mapping

In the GSMS methodology, mapping involves open, selective, and theoretical coding [14].

Open coding began with each element of the two lists of the {Pre-classification schema}, creating a code for each. The articles were then categorized within the codes of the lists.

In selective coding, the aim is to count the articles that can be categorized within the two lists of the pre-classification schema, assigning each paper to one of the codes created in open coding. Table 4 shows the number of documents that can be organized under the software development topics: 41 articles, corresponding to 58.5% of the total. One paper was classified into two topics and is therefore counted twice, giving 42 entries. An additional column gives the percentage of each topic taking these 42 entries as 100%.

Table 5 shows the two variants of GT. A total of 48.57% of the articles were classified in these two variants. The third column gives the percentages taking the 34 classified items as 100%.

The elements described in Table 4 and Table 5 were converted into codes within Atlas.ti, and the articles were assigned to one of these codes from the two lists corresponding to software development and GT.

Theoretical coding involves establishing propositions that summarize the previous results. These theoretical propositions lead to four levels of classification:

Papers that are on both lists and are fully classified.
Articles in the software development list are partially classified in software development.
Papers in the grounded theory list are partially classified in GT.
Articles that are not in either list remain unclassified.

This work analyzed how the articles were classified in the two lists of the {Pre-classification schema}. It was determined that 20 articles (28.6%) were in both lists and 18 papers (25.7%) were in neither list. A total of 19 papers (27.1%) were classified in the software development list but not in the GT list. Additionally, 13 articles (18.6%) were classified in the GT list but not in the software development list, as shown in Table 6.
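
The four classification levels and their shares can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function name `classification_level` and the level labels are hypothetical, and only the four counts are taken from the text.

```python
def classification_level(in_sd_list: bool, in_gt_list: bool) -> str:
    """Map membership in the two schema lists to one of the four levels."""
    if in_sd_list and in_gt_list:
        return "fully classified"
    if in_sd_list:
        return "partially classified (software development)"
    if in_gt_list:
        return "partially classified (GT)"
    return "unclassified"


# Counts reported in the text for the selected papers.
counts = {
    "fully classified": 20,
    "partially classified (software development)": 19,
    "partially classified (GT)": 13,
    "unclassified": 18,
}

total = sum(counts.values())
shares = {level: round(100 * n / total, 1) for level, n in counts.items()}
print(total, shares)
```

The four counts add up to the 70 selected papers, and the rounded shares match the percentages given above (28.6%, 27.1%, 18.6%, and 25.7%).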

In Atlas.ti, four stored queries [29] were performed to determine the articles’ numerical distribution within each pre-classification schema list, as shown in Table 6.

4.3.3. Synthesis and Outcomes

The synthesis phase highlights the lack of a proper categorization of the selected works. We did not obtain enough representative results when classifying the documents. This process was similar to the traditional SMS conducted by [21,28,30], with an additional phase that we called the pre-classification schema. The two topics were consolidated in the specialized publications, resulting in a low classification in the GT and software list versions.

The lists of the pre-classification schema were obtained from fully consolidated lists on software development and GT. However, some articles could not be classified. This led us to pose two new research questions, as follows:

RQ4: Is there a way to categorize the documents within GT in software development to increase the percentage of cataloged papers?

RQ5: Are there other ways to categorize the software than that given by the pre-classification schema?

4.3.4. Comparison with the Literature

In the works where reviews of scientific articles on software engineering are made within the variants of GT [3,4,5], they show different levels of classification with relatively low values within them, probably because it has not been adequately assessed.

4.3.5. Summary Iteration 0

Applying the equations of Navas [14] to iteration 0, the research questions were increased by two, RQ₄ and RQ₅. Together, RQ₁, RQ₂, RQ₃, RQ₄, and RQ₅ are the input for iteration 1.

$Q_{0} = \{\sum_{i=1}^{n} {RQ}_{i}\} = \{{RQ}_{1}, {RQ}_{2}, {RQ}_{3}, {RQ}_{4}, {RQ}_{5}\}$, (3)

where $Q_{0}$ is the set of research questions of iteration 0.

$A_{0}(k) = \sum_{j=1}^{m} {AQ}_{j}$, (4)

where $A_{0}(k)$ is the set of answered questions of iteration 0; it is empty.

$A_{0}(1) = \emptyset;\ A_{0}(2) = \emptyset;\ A_{0}(3) = \emptyset;\ A_{0}(4) = \emptyset;\ A_{0}(5) = \emptyset$

where $A_{0}(1)$ is the answered question to ${RQ}_{1}$, and so on. In general, $A_{0}(n)$ is the answered question to ${RQ}_{n}$.

$A_{0}(k) = \emptyset$, (5)

To finish the GSMS data analysis, three conditions must be met, which together lead to saturation [14]. The first is that no new questions arise in the current iteration. Because two additional research questions arose in iteration 0, this condition is not met, and a new iteration must be initiated.
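
The bookkeeping of question sets across iterations can be sketched with plain Python sets. Only the first saturation condition (no new questions in the current iteration) is stated in this section, so only that one is modeled here; the other two conditions are defined in [14] and are not reproduced. The function names are hypothetical.

```python
def new_questions(previous: set[str], current: set[str]) -> set[str]:
    """Questions present in the current iteration but not in the previous one."""
    return current - previous


def first_condition_met(previous: set[str], current: set[str]) -> bool:
    """First saturation condition as described in the text: no new questions."""
    return not new_questions(previous, current)


# Question sets following Equations (1) and (3).
q_dc = {"RQ1", "RQ2", "RQ3"}      # data collection
q_0 = q_dc | {"RQ4", "RQ5"}       # iteration 0 adds two questions

# Answered questions after iteration 0, following Equation (5): none yet.
a_0: set[str] = set()

if not first_condition_met(q_dc, q_0):
    print("New questions:", sorted(new_questions(q_dc, q_0)), "-> start iteration 1")
```

Using sets makes the comparison between consecutive iterations a simple set difference, which is why the two questions added in iteration 0 are enough to trigger iteration 1.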

4.4. Data Analysis: Iteration 1

The inputs for this iteration include:

The five research questions from iteration zero, which are RQ1, RQ2, RQ3, RQ4, and RQ5.
The pre-classification schema.
The 70 selected papers.
The first of the three equations required to reach saturation [14] is not fulfilled.
Two new research questions are added: $\{{RQ}_{4}, {RQ}_{5}\}$.

$Q_{1} = Q_{0} + \{{R Q}_{4}, {R Q}_{5}\}$

Figure 5 shows the activity diagram for iteration 1, which contains two loops. Each loop consists of the keywording (see Section 4.3.1), mapping (see Section 4.3.2), and synthesis (Section 4.3.3) steps. Each step also includes its respective open, selective, and theoretical coding stages (see Section 4.3).

This iteration focuses on achieving a complete understanding of grounded theory in two respects. The first is the fundamental findings in each article, and the second is the concepts presented within the papers. These two issues lead to the two loops.

4.4.1. Loop 1

Loop 1 focuses on the fundamental findings in each paper through the constant comparison of data and abstractions. Figure 6 shows the progress of the keywording, mapping, and synthesis steps, each of which includes open, selective, and theoretical coding stages.

Open coding focuses on general concepts, while selective coding evolves from general concepts to specific categories. Finally, theoretical coding deepens the categories on theoretical propositions.

Table 7 expands on the data and examples shown in Figure 6. The keywording section of Table 7 shows examples of the codes per article, illustrating the evolution from keywording open coding to selective coding. The keywording theoretical coding concludes that the findings in each article can be expressed through two essential propositions, which can be synthesized as {A GT of…} and {Use GT to…}.

{A GT of…} presents a theory about the study, while {Use GT to…} uses GT to study the subject in depth. In the mapping phase, the sub-phase open coding unified {A GT of…} and {Use GT to…} conceptually. Selective coding deepened the analysis and unified the two codes. Theoretical coding led to the proposition that 98% of articles were categorized. Finally, synthesis emphasized the need to clearly express the {Emerging theory} in these two ways. See Figure 6 and Table 7 for more details on the keywording, mapping, and synthesis steps and their concepts, categories, and propositions.

4.4.2. Loop 2

GSMS’s keywording involves the constant comparison of data through an in-depth study of each cycle. In this loop, the starting point is the final propositions of loop 1, which become the initial concepts of loop 2 and concepts that arise from the keywording, which can be obtained from each paper.

Figure 7 shows the progress from keywording to synthesis, each with their respective open, selective, and theoretical coding.

Table 8 expands on the data and examples shown in Figure 7. In the keywording column, examples of the codes per paper of the evolution from keywording open coding to selective coding are shown. Finally, keywording theoretical coding proposes that the findings in each article can be expressed through two essential propositions.

These two propositions were studied in-depth and characterized. They changed and went through various names or codes.

The first one had the following names or codes: {Basic elements}, {Initial coding}, and {Basic coding}. The second proposition had the following codes: {Result products}, {Core category}, and {Master Core category}.

Finally, the two selected codes were {Basic coding} and {Master Core category}. These resulted from a process of maturation of the name and are in accordance with the terminology of GT, but should not be confused with existing terms, such as {initial coding}, used in constructivist GT by Charmaz [7], or {core category}, used in the GGT and SGT processes.

These two propositions were analyzed in the mapping stage, which established that 84.3% and 87.1% of the articles (59 and 61 of the 70 papers), respectively, contain these codes.
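The mapping percentages can be reproduced from the paper counts reported in Table 8 (59 and 61 of the 70 studied papers). The following Python sketch is only illustrative: the function name `coverage_percent` and the dictionary layout are assumptions made for this example, not part of the GSMS methodology.

```python
# Illustrative sketch: share of the 70 studied papers containing each code.
# Counts are taken from Table 8; all names here are hypothetical.
TOTAL_PAPERS = 70

code_counts = {
    "Basic coding": 59,
    "Master Core category": 61,
}


def coverage_percent(count: int, total: int) -> float:
    """Return the percentage of papers containing a code, to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)


coverage = {
    code: coverage_percent(n, TOTAL_PAPERS) for code, n in code_counts.items()
}

for code, pct in coverage.items():
    print(f"{{{code}}}: {pct}% of {TOTAL_PAPERS} papers")
```

The same function can be applied to any other code once its paper count is known.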

Finally, the code resulting from loop 1 ({Emerging theory}) and the two codes obtained from loop 2 ({Basic coding} and {Master Core category}) were present in the open coding concepts in the synthesis phase. Table 9 shows examples of how these three codes are used in the articles.

{Basic coding}, {Master core category}, and {Emerging theory} were analyzed in depth and characterized to find relationships between them. The idea behind this was to unify or relate these elements and reduce them to common concepts. The result is a set of diverse relationships between them, unified into common concepts. Two propositions arise, {Data constant comparison} and {Data analysis in GT}, which are the forces that move and relate the findings of the previous loops.

These five propositions were analyzed together to determine whether any additional element complements them; the element found was {source of data}. This entire process can be observed in Figure 7 and Table 8.

Six elements compose the unifying concept concerning GT studies in software development, {GT Elements (GTe)}.

4.4.3. Comparison with the Literature

According to Navas and Yagüe [14], the findings should be compared with tertiary articles or reviewed for scientific rigor and industrial relevance at the end of each iteration.

Our findings on GT elements, although independent, match the criteria of the two papers that named them “key components of GT” [4] and “principles and coding procedures” [5]. Table 10 compares these two with our {GT Elements}.

In Stol’s first paper, 98 articles were reviewed [4]. Forty-six of these articles used GT ambiguously, while the others used it more specifically. The use of GT was classified as “Using a grounded theory approach […]”, “We used grounded theory to […]”, and “We generated a grounded theory” [4]. Our findings show two ways of manifesting the theory that emerges: {A GT of…} and {Use of GT to…}. This allowed us to classify the emerging theories in these two ways by establishing specific sentences for each paper.

4.4.4. Summary of Iteration 1

In summary, GT has the following elements:

{Data source} is the data collection conducted through interviews, questionnaires, and observations.
{Data analysis} comprises the various stages of GT, such as open coding, selective coding, and theoretical coding, which are interrelated [9].
{Constant comparison} is developed throughout the GT process.
{Basic Coding} comprises the concepts, indicators, patterns, categories, or initial properties that support the GT structure. {Basic Coding} can be found at an early or middle stage of the GT process and is the foundation of the {Emerging theory}.
The {Master Core category} is the final category obtained after undertaking the entire GT process. It takes the form of propositions, recommendations, strategies, contingencies, consequences, and changes.
{Emerging theory} emerges from the core category and is formed through the resulting products or final categories, which were expressed in two ways:

{A GT of…} corresponds to the central theory statement that the article deals with.

{Use of GT to…} corresponds to the main theory statement used to accomplish something.
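The six elements listed above can be pictured as one record per paper. The Python sketch below is a hypothetical data structure written for illustration only: the class names, field names, and helper methods are assumptions, not an artifact of the study. The example values come from the Stray et al. (2016) [39] row of Table 9; elements that Table 9 does not report are left as `None` rather than filled in.

```python
from dataclasses import dataclass, fields
from enum import Enum
from typing import List, Optional


class EmergingTheoryForm(Enum):
    """The two ways of expressing the {Emerging theory}."""
    A_GT_OF = "A GT of"
    USE_OF_GT_TO = "Use of GT to"


@dataclass
class GTElements:
    """One paper described through the six GT elements (hypothetical layout)."""
    document: str
    data_source: Optional[str] = None
    data_analysis: Optional[str] = None
    constant_comparison: Optional[str] = None
    basic_coding: Optional[str] = None
    master_core_category: Optional[str] = None
    emerging_theory: Optional[str] = None

    def emerging_theory_form(self) -> Optional[EmergingTheoryForm]:
        """Classify the emerging theory by its leading phrase, if any."""
        if self.emerging_theory is None:
            return None
        for form in EmergingTheoryForm:
            if self.emerging_theory.startswith(form.value):
                return form
        return None

    def missing_elements(self) -> List[str]:
        """Names of the GT elements not yet recorded for this paper."""
        return [
            f.name
            for f in fields(self)
            if f.name != "document" and getattr(self, f.name) is None
        ]


# Values taken from Table 9; the other three elements are not given there.
stray = GTElements(
    document="Stray et al. (2016) [39]",
    basic_coding="Seven constructs or features of DSM",
    master_core_category=(
        "Six propositions that affect the DSM process positively or negatively"
    ),
    emerging_theory="A GT of DSM",
)

print(stray.emerging_theory_form())
print(stray.missing_elements())
```

Keeping unknown elements as `None` makes it explicit which of the six elements still have to be located in a paper before it can be counted as fully categorized.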

The current iteration corresponds to number 1, as shown in Figure 7. GT Elements is the first answer to RQ4, which can be expressed as:

Q_{4} = \{{AQ4}_{1} : \{GTe\}\}

(6)

One additional question that arises is RQ6:

RQ6: How are GT elements associated with software development?

In iteration 1, the number of research questions increased by one through the addition of RQ6.

Q_{1} = \{\sum_{i = 1}^{n} {RQ}_{i}\} = \{{RQ}_{1}, {RQ}_{2}, {RQ}_{3}, {RQ}_{4}, {RQ}_{5}, {RQ}_{6}\}

(7)

where Q_{1} is the set of research questions of iteration 1.

A_{1}(k) = \sum_{j = 1}^{m} {AQ}_{j} = \{{AQ4}_{1} : GTe\}

(8)

where A_{1}(k) is the set of answered questions of iteration 1, which is {AQ4}_{1}.
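The two sets defined for iteration 1 can be mirrored with ordinary Python collections. This is a minimal sketch under stated assumptions: the label format `"AQ4_1"` (answer 1 to research question 4) and the helper `answered_research_questions` are hypothetical conventions introduced here, and the sketch only reflects the answers recorded in A_1, not answers obtained in other iterations.

```python
# Q_1: the six research questions of iteration 1, RQ1 to RQ6 (Equation (7)).
Q1 = {f"RQ{i}" for i in range(1, 7)}

# A_1(k): answered questions of iteration 1 (Equation (8)).
# Hypothetical label convention: "AQ<question number>_<answer index>".
A1 = {"AQ4_1": "GTe"}


def answered_research_questions(answers: dict) -> set:
    """Map answer labels such as 'AQ4_1' back to their question 'RQ4'."""
    questions = set()
    for label in answers:
        question_number = label[len("AQ"):].split("_")[0]
        questions.add(f"RQ{question_number}")
    return questions


answered = answered_research_questions(A1)

# Questions with no answer recorded in A_1. This says nothing about
# answers that may have been obtained in other iterations.
without_answer_in_A1 = Q1 - answered

print(sorted(answered))
print(sorted(without_answer_in_A1))
```

Representing the questions as a set makes the growth from one iteration to the next a simple union, and the questions still lacking an answer a simple set difference.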

5. Discussion

GT elements have an established pattern and were characterized according to the data, its collection, its process, and its result. We called these {GT Elements}. They were used to classify the articles efficiently and allowed us to include a significant number of them.

{Basic coding}, {Master Core category}, and {Emerging Theory} are the elements driven through the {data source} and related to each other through {data analysis} and {constant data comparison}, as shown in Figure 8.

Concerning GT, the pre-classification scheme in iteration 0 contains the {Glaserian} and {Straussian} variants, which are the traditional classification within GT. In iteration 1, the {GT Elements} were reached, making it possible to check whether the {GTe} achieve a broader classification than the variants. Through the six elements of GTe, it was possible to establish a more significant categorization of articles, as recommended by Navas and Yagüe [14].

Table 11 includes 18 articles (from 2020 to 2023) from the IEEE, Web of Science, and ACM libraries related to the topic of study. The components of the six GTe were searched in these articles, and the results were very satisfactory. In all cases, the emerging theory was classified through a code using one of the two options: {A GT of…} (11) and {Use of GT to…} (7). Additionally, a different code was written for each article for basic coding and master core category, maintaining the format of the number and description. The existence of another three items, which are {constant data comparison}, {data collection}, and {data analysis}, was also validated. There was only one article [45] in which the {constant data comparison} code was not found.

These 18 papers were added to the 70 papers studied, resulting in a total of 88 papers. The results concerning the GTe are more satisfactory than with the original 70 articles, as the percentages of the GTe components increased. Regarding the GT variants of the 18 added papers, seven were Glaserian, four were Straussian, four were Constructivist, and the remaining three did not have a defined variant, as shown in Figure 9.
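The counts reported for the added papers can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only; it assumes that the variant counts (seven, four, and four) refer to the 18 added papers, and all variable names are hypothetical.

```python
# Figures quoted in the text; variable names are hypothetical.
ORIGINAL_PAPERS = 70
ADDED_PAPERS = 18

# Emerging-theory codes assigned to the added papers.
emerging_theory_forms = {"A GT of": 11, "Use of GT to": 7}

# Assumption: these variant counts describe the 18 added papers.
variants_in_added = {"Glaserian": 7, "Straussian": 4, "Constructivist": 4}

total_papers = ORIGINAL_PAPERS + ADDED_PAPERS
classified_forms = sum(emerging_theory_forms.values())
undefined_variant = ADDED_PAPERS - sum(variants_in_added.values())

print(f"Total papers: {total_papers}")
print(f"Added papers with an emerging-theory code: {classified_forms}")
print(f"Added papers without a defined variant: {undefined_variant}")
```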

Table 12 compares our research papers’ {GT variants} and the {GT Elements} of the 88 papers.

The first column is the article code.
The following two columns show the GT classification.
The following six columns show the {GT Elements}.

The present methodology incorporates the emergence of questions within the process of continuous iteration and their subsequent answering as the iterative process progresses. In this study, RQ4 was addressed. The results correspond to the six elements of the GTe, enabling a more comprehensive answer to the remaining research questions. For future work, we intend to address the other questions that are part of this study and that emerged during the application of the GSMS methodology. This will make it possible to see how qualitative analysis enriches software engineering and, conversely, how software engineering enriches qualitative analysis. In other words, the contribution runs in both directions.

6. Conclusions

It was demonstrated that, when using GT in the field of software engineering, it can be applied to elements beyond traditional sources, such as interviews, observations, surveys, focus groups, and social studies. It can be used on research results, engineering artifacts, and written documents to obtain strategies and conclusions through the expression of emerging theories.

The grounded theory elements (GTe) formalize the GT processes in software development. GTe provides an answer to research question 4 (RQ4) in iteration 1, and iteration 2 confirmed that there are no further answers to RQ4. Additionally, no previous attempts to formalize the GT processes have been discovered, making the determination of GT Elements a significant finding.

The process of {constant comparison} until saturation is essential in GT, and different GT processes in software development focus on different moments. The main challenge of {constant comparison} may be the diversity of data sources or dimensions in finding the core category in each paper. Therefore, the {data analysis} and constant comparative techniques were created to complete the {GTe} and are a common part of all GT processes.
This leads to the question of when the initial elements in the GT process emerge, which we called {Basic Coding}. For the Glaserian variant, it could be the first result of open coding; for the Straussian variant, it could be the data of the initial research question; and for the constructivist variant, it could be initial coding.
The result is the {Master Core Category}, which was initially called {Result product}. The relationship between {Basic Coding} and the {Master Core category} leads to a holistic and multidimensional result.
If a highlighted quote is classified as {Basic Coding} or {Master Core category}, it is always shown as a list, matrix, or chart of topics and is quantifiable. Therefore, the description of the code begins with a number.
Two ways to present the GT theory were found: {A GT of…} and {Use of GT to…}. “A GT of” and “Use of GT to” were later converted into a short description of what we called {Emerging theory}.
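The rule that a {Basic Coding} or {Master Core category} description begins with a number can be checked mechanically. The following Python sketch is a hypothetical helper written for illustration; the function name and the list of spelled-out numbers (one to ten) are assumptions, and the example strings are taken from Tables 8 and 9.

```python
# Illustrative check: does a code description begin with a number?
# Assumption: spelled-out numbers from one to ten are enough for the examples.
NUMBER_WORDS = {
    "one", "two", "three", "four", "five",
    "six", "seven", "eight", "nine", "ten",
}


def starts_with_number(description: str) -> bool:
    """Return True if the first word is a digit string or a number word."""
    words = description.strip().split()
    if not words:
        return False
    first = words[0].lower()
    return first.isdigit() or first in NUMBER_WORDS


examples = [
    "Seven types of variability",                              # Table 9
    "15 guidelines to conduct software engineering research",  # Table 8
    "A GT of DSM",                                             # emerging theory
]
results = {text: starts_with_number(text) for text in examples}

for text, ok in results.items():
    print(f"{ok!s:5} {text}")
```

An emerging theory such as "A GT of DSM" fails the check, which matches the text: only the two quantifiable codes carry a leading number.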

The emergence of codes followed a path of discovery that is different from that of the traditional GT analysis. They appeared at various stages of theoretical coding. Initially, the {Emerging theory} appeared in two forms, {A GT of…} and {Use of GT to…}, which occurred at the end of loop 1 in the synthesis stage. Then, two elements, {Basic Coding} and {Master Core category}, emerged during the keywording of loop 2. Lastly, the remaining three codes, {Data analysis}, {Constant Comparison}, and {Data collection}, emerged during the synthesis of loop 2. All six codes together form what was defined as {GT Elements}.

Due to its ease of adaptation, GTe could be applied in various areas where qualitative elements are a fundamental component, providing a way to write the codes for the {Master Core category} and {Emerging theory}. Table 11 shows an example of this coding.

Integrating the methodology of GGT with SMS works systematically and rigorously, supported by equations, leading to saturation to obtain new theories. Therefore, GSMS creates a novel way of defining saturation through three equations to handle new research questions, which provide a set of components for categorizing GT in software development. It allows for a rigorous and systematic analysis of the existing literature, identifying key elements of GT and their application in the context of software development. Clarifying the position of GSMS in bibliographic analysis could highlight the unique contribution of this approach and provide a clear framework for future research in the field.

This holistic approach of GSMS acknowledges the complex interplay between social and technical factors in software development research, providing a more comprehensive understanding of the field. Therefore, the relevance of GSMS in addressing the challenges for bibliographic analysis could be emphasized.

Although saturation is commonly used in grounded theory, we can also apply it in engineering and software development. In some cases, we may create algorithms, programs, codes, computer designs, modules, classes, architectures, or any exceptionally well-designed and aesthetically pleasing artifact, which we can consider as reaching saturation in engineering.

Author Contributions

Conceptualization, G.N. and A.Y.; methodology, G.N.; validation, G.N. and A.Y.; formal analysis, G.N.; investigation, G.N.; resources, G.N. and A.Y.; data curation, G.N.; writing—original draft preparation, G.N.; writing—review and editing, A.Y.; visualization, G.N.; supervision, A.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

I would like to express my gratitude to ETSISI of the Polytechnic University of Madrid, and specifically to Juan Garbajosa. Additionally, I would like to thank the members of the IDEIAGEOCA research group of the Universidad Politécnica Salesiana, as well as Elena Reascos, Gabriel and Gustavo Navas-Reascos, who provided invaluable assistance throughout this investigation.

Conflicts of Interest

The authors declare no conflict of interest.

References

Birks, D.F.; Fernandez, W.; Levina, N.; Nasirin, S. Grounded theory method in information systems research: Its nature, diversity and opportunities. Eur. J. Inf. Syst. 2013, 22, 1–8.
Adolph, S.; Hall, W.; Kruchten, P. A Methodological Leg to Stand on: Lessons Learned Using Grounded Theory to Study Software Development. In Proceedings of the 2008 Conference of the Center for Advanced Studies on Collaborative Research: Meeting of Minds, CASCON’08, New York, NY, USA, 27–30 October 2008; pp. 13:166–13:178.
Kroeger, T.A.; Davidson, N.J.; Cook, S.C. Understanding the characteristics of quality for software engineering processes: A Grounded Theory investigation. Inf. Softw. Technol. 2014, 56, 252–271.
Stol, K.-J.; Ralph, P.; Fitzgerald, B. Grounded theory in software engineering research: A Critical Review and Guidelines. In Proceedings of the ICSE’16: 38th International Conference on Software Engineering, Austin, TX, USA, 14–22 May 2016; pp. 120–131.
Matavire, R.; Brown, I. Profiling grounded theory approaches in information systems research. Eur. J. Inf. Syst. 2013, 22, 119–129.
Glaser, B.G.; Strauss, A.L. The Discovery of Grounded Theory: Strategies for Qualitative Research; Aldine Transaction: Piscataway, NJ, USA, 1973.
Charmaz, K. Constructing Grounded Theory. A Practical Guide Through Qualitative Analysis; Sage: Thousand Oaks, CA, USA, 2006; p. 224. ISBN 9780761973539.
Van Niekerk, J.C.; Roode, J. Glaserian and Straussian Grounded Theory: Similar or Completely Different? In Proceedings of the 2009 Annual Research Conference of the South African Institute of Computer Scientists and Information Technologists, Emfuleni, South Africa, 12–14 October 2009; pp. 96–103.
Urquhart, C. An encounter with grounded theory: Tackling the practical and philosophical issues. In Qualitative Research in IS: Issues and Trends; IGI Global: Queensland, Australia, 2001; pp. 104–140.
Zayour, I.; Hamdar, A. A qualitative study on debugging under an enterprise IDE. Inf. Softw. Technol. 2016, 70, 130–139.
Strauss, A.; Corbin, J. Grounded Theory Methodology: An Overview. In Handbook of Qualitative Research; Denzin, N.K., Lincoln, Y.S., Eds.; SAGE: Thousand Oaks, CA, USA, 1994; Chapter 17; pp. 273–285.
Adolph, S.; Kruchten, P.; Hall, W. Reconciling perspectives: A grounded theory of how people manage the process of software development. J. Syst. Softw. 2012, 85, 1269–1286.
Bourque, P.; Fairley, R.E. Guide to the Software Engineering Body of Knowledge (SWEBOK(R)): Version 3.0, 3rd ed.; IEEE Computer Society Press: Los Alamitos, CA, USA, 2014.
Navas, G.; Yagüe, A. Glaserian Systematic Mapping Study: An Integrating Methodology. In Proceedings of the 17th International Conference on Evaluation of Novel Approaches to Software Engineering, Online, 25–26 April 2022; pp. 519–527.
Biaggi, C.; Wa-Mbaleka, S. Grounded Theory: A Practical Overview of the Glaserian School. JPAIR Multidiscip. Res. 2018, 32, 1–29.
Kaskenpalo, P.; MacDonell, S.G. Valuing evaluation: Methodologies to bridge research and practice. In Proceedings of the EAST’12: 2nd International Workshop on Evidential Assessment of Software Technologies, Lund, Sweden, 22 September 2012.
Glaser, B.G.; Strauss, A.L. Awareness of Dying; Routledge: Chicago, IL, USA, 1965.
Gandomani, T.J.; Zulzalil, H.; Ghani, A.A.A.; Sultan, A.B.M.; Sharif, K.Y. How human aspects impress Agile software development transition and adoption. Int. J. Softw. Eng. Appl. 2014, 8, 129–148.
Adolph, S.; Hall, W.; Kruchten, P. Using grounded theory to study the experience of software development. Empir. Softw. Eng. 2011, 16, 487–513.
Adolph, S.; Kruchten, P. Generating a useful theory of software engineering. In Proceedings of the 2013 2nd SEMAT Workshop on a General Theory of Software Engineering, GTSE 2013, San Francisco, CA, USA, 26 May 2013; pp. 47–50.
Petersen, K.; Feldt, R.; Mujtaba, S.; Mattsson, M. Systematic Mapping Studies in Software Engineering. In Proceedings of the EASE’08: 12th international conference on Evaluation and Assessment in Software Engineering, Swindon, UK, 26–27 June 2008; pp. 68–77.
Moghadas, M.; Hashemi, M.R. Toward a Unified Characterization of Mapping Algorithms in Cloud and MPSoC Environments Using a Literature-Based Approach. Can. J. Electr. Comput. Eng. 2015, 38, 204–218.
Brown, J.; Lindgaard, G.; Biddle, R. Stories, sketches, and lists: Developers and interaction designers interacting through artefacts. In Proceedings of the Agile 2008 Conference, Toronto, ON, Canada, 4–8 August 2008; pp. 39–50.
Coleman, G. EXtreme Programming (XP) as a ‘minimum’ software Process: A grounded theory. In Proceedings of the International Computer Software and Applications Conference, Hong Kong, China, 28–30 September 2004; pp. 30–31.
Sedano, T.; Ralph, P.; Peraire, C. Sustainable Software Development through Overlapping Pair Rotation. In Proceedings of the ESEM’16: 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, New York, NY, USA, 8–9 September 2016; pp. 1–10.
Würfel, D.; Lutz, R.; Diehl, S. Grounded requirements engineering: An approach to use case driven requirements engineering. J. Syst. Softw. 2016, 117, 645–657.
Hoda, R.; Noble, J. Becoming Agile: A Grounded Theory of Agile Transitions in Practice. In Proceedings of the 2017 IEEE/ACM 39th International Conference on Software Engineering, ICSE 2017, Buenos Aires, Argentina, 20–28 May 2017; pp. 141–151.
Paternoster, N.; Giardino, C.; Unterkalmsteiner, M.; Gorschek, T.; Abrahamsson, P. Software development in startup companies: A systematic mapping study. Inf. Softw. Technol. 2014, 56, 1200–1218.
Atlas.ti GmbH. ATLAS.ti 8 Windows–User Manual. 2018. Available online: http://downloads.atlasti.com/docs/manual/atlasti_v8_manual_en.pdf?_ga=2.109817989.1433951203.1546789831-208372801.1521470586 (accessed on 20 February 2021).
Petersen, K.; Vakkalanka, S.; Kuzniarz, L. Guidelines for conducting systematic mapping studies in software engineering: An update. Inf. Softw. Technol. 2015, 64, 1–18.
Seth, F.P.; Mustonen-Ollila, E.; Taipale, O.; Smolander, K. Software quality construction in 11 companies: An empirical study using the grounded theory. Softw. Qual. J. 2015, 23, 627–660.
Gandomani, T.J.; Zulzalil, H.; Ghani, A.A.A.; Abu, A.B.; Parizi, R.M. The impact of inadequate and dysfunctional training on agile transformation process: A grounded theory study. Inf. Softw. Technol. 2015, 57, 295–309.
Waterman, M.; Noble, J.; Allan, G. How much up-front? A grounded theory of agile architecture. In Proceedings of the International Conference on Software Engineering, Florence, Italy, 16–24 May 2015; pp. 347–357.
Fagerholm, F.; Ikonen, M.; Kettunen, P.; Münch, J.; Roto, V.; Abrahamsson, P. Performance Alignment Work: How software developers experience the continuous adaptation of team performance in Lean and Agile environments. Inf. Softw. Technol. 2015, 64, 132–147.
Galster, M.; Avgeriou, P. An industrial case study on variability handling in large enterprise software systems. Inf. Softw. Technol. 2015, 60, 16–31.
Clarke, P.; O’Connor, R.V. The situational factors that affect the software development process: Towards a comprehensive reference framework. Inf. Softw. Technol. 2012, 54, 433–447.
Yu, L.; Xu, X.; Liu, C.; Sheng, B. Using grounded theory to understand testing engineers’ soft skills of third-party software testing centers. In Proceedings of the ICSESS 2012: 2012 IEEE International Conference on Computer Science and Automation Engineering, Beijing, China, 22–24 June 2012; pp. 403–406.
Dorairaj, S.; Noble, J.; Malik, P. Understanding lack of trust in distributed agile teams: A grounded theory study. In Proceedings of the 16th International Conference on Evaluation & Assessment in Software Engineering (EASE 2012), Ciudad Real, Spain, 14–15 May 2012.
Stray, V.; Sjøberg, D.I.K.; Dybå, T. The daily stand-up meeting: A grounded theory study. J. Syst. Softw. 2016, 114, 101–124.
Schenk, J. Evaluating awareness information in distributed collaborative editing by software-engineers. In Proceedings of the 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012, Zurich, Switzerland, 5 June 2012; pp. 35–38.
Ghanbari, H. Seeking technical debt in critical software development projects: An exploratory field study. In Proceedings of the Annual Hawaii International Conference on System Sciences, Koloa, HI, USA, 5–8 January 2016; pp. 5407–5416.
Santos, V.; Goldman, A.; de Souza, C.R.B. Fostering effective inter-team knowledge sharing in agile software development. Empir. Softw. Eng. 2015, 20, 1006–1051.
Waterman, M.; Noble, J.; Allan, G. How much architecture? Reducing the up-front effort. In Proceedings of the Agile India 2012, Bengaluru, India, 17–19 February 2012; pp. 56–59.
Fernández, D.M.; Wagner, S. Naming the pain in requirements engineering: A design for a global family of surveys and first results from Germany. Inf. Softw. Technol. 2015, 57, 616–643.
Ayas, H.M.; Leitner, P.; Hebig, R. Facing the giant: A grounded theory study of decision-making in microservices migrations. In Proceedings of the International Symposium on Empirical Software Engineering and Measurement, Bari, Italy, 11–15 October 2021.
Moshtari, S.; Okutan, A.; Mirakhorli, M. A Grounded Theory Based Approach to Characterize Software Attack Surfaces. In Proceedings of the 44th International Conference on Software Engineering, Pittsburgh, PA, USA, 21–29 May 2022.
De Souza Santos, R.E.; Ralph, P. A Grounded Theory of Coordination in Remote-First and Hybrid Software Teams. In Proceedings of the International Conference on Software Engineering, Pittsburgh, PA, USA, 21–29 May 2022; pp. 25–35.
Rodríguez, P.; Urquhart, C.; Mendes, E. A Theory of Value for Value-Based Feature Selection in Software Engineering. IEEE Trans. Softw. Eng. 2022, 48, 466–484.
Pillay, N.; Wing, J. Agile UX: Integrating good UX development practices in Agile. In Proceedings of the 2019 Conference on Information Communications Technology and Society, ICTAS 2019, Durban, South Africa, 6–8 March 2019; pp. 2–7.
Danilova, A.; Naiakshina, A.; Smith, M. One size does not fit all: A grounded theory and online survey study of developer preferences for security warning types. In Proceedings of the International Conference on Software Engineering, Seoul, Republic of Korea, 27 June–19 July 2020; pp. 136–148.
Pina, D.; Seaman, C.; Goldman, A. Technical Debt Prioritization: A Developer’s Perspective. In Proceedings of the International Conference on Technical Debt 2022, Pittsburgh, PA, USA, 16–18 May 2022; pp. 46–55.
MacArthy, R.W.; Bass, J.M. The Role of Skillset in the Determination of DevOps implementation Strategy. In Proceedings of the 2021 IEEE/ACM Joint 15th International Conference on Software and System Processes and 16th ACM/IEEE International Conference on Global Software Engineering, ICSSP/ICGSE 2021, Madrid, Spain, 17–19 May 2021; pp. 50–60.
Chitchyan, R.; Bird, C. Theory as a Source of Software Requirements. In Proceedings of the IEEE International Conference on Requirements Engineering, Zurich, Switzerland, 31 August–4 September 2020; pp. 227–237.
Tuape, M.; Hasheela-Mufeti, V.T.; Kasurinen, J. Theory on Non-Technical Characteristics Affecting Process Adoption in Small Software Companies: A Grounded Theory Study. IEEE Access 2022, 10, 103382–103400.
Ardo, A.A.; Bass, J.M.; Gaber, T. Towards Secure Agile Software Development Process: A Practice-Based Model. In Proceedings of the 48th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2022, Gran Canaria, Spain, 31 August–2 September 2022; pp. 149–156.
Salman, I.; Rodriguez, P.; Turhan, B.; Tosun, A.; Gureller, A. What Leads to a Confirmatory or Disconfirmatory Behavior of Software Testers? IEEE Trans. Softw. Eng. 2022, 48, 1351–1368.
Farias, R.S.; de Souza, R.M.; McGregor, J.D.; de Almeida, E.S. Designing smart city mobile applications: An initial grounded theory. Empir. Softw. Eng. 2019, 24, 3255–3289.
Masood, Z.; Hoda, R.; Blincoe, K. How agile teams make self-assignment work: A grounded theory study. Empir. Softw. Eng. 2020, 25, 4962–5005.
Dissanayake, N.; Zahedi, M.; Jayatilaka, A.; Babar, M.A. A Grounded Theory of the Role of Coordination in Software Security Patch Management. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Greece, 23–28 August 2021; Association for Computing Machinery: New York, NY, USA, 2021; pp. 793–805.
Dissanayake, N.; Zahedi, M.; Jayatilaka, A.; Babar, M.A. Investigating technological risks and mitigation strategies in software projects. In Proceedings of the ACM Symposium on Applied Computing, Virtual Event, 25–29 April 2022; pp. 1527–1535.
Rafi, S.; Yu, W.; Akbar, M.A. Towards a Hypothetical Framework to Secure DevOps Adoption: Grounded Theory Approach. In Proceedings of the ACM International Conference Proceeding Series, Trondheim, Norway, 15–17 April 2020; pp. 457–462.
Lubin, J. How Statically-Typed Functional Programmers Author Code. In Proceedings of the Conference on Human Factors in Computing Systems, Yokohama, Japan, 8–13 May 2021.

Figure 1. GSMS methodology, adapted from Navas [14].

Figure 2. Data collection in the GSMS process as an activity diagram.

Figure 3. Activity diagram for iteration 0.

Figure 4. Saturation search within keywording in iteration 0.

Figure 5. Activity diagram for iteration 1.

Figure 6. Loop 1 of iteration 1.

Figure 7. Loop 2 of iteration 1.

Figure 8. GT elements: basic coding generates the core category, and the GT emerges through the analysis and constant comparison of data.

Figure 9. (a) GT variants vs. quantity of articles. (b) GT Elements vs. quantity of articles.

Table 2. Quantity of papers according to the search string in the ISI Web of Science database.

Topic	Search String	Quantity
About Software	Software	1,994,441
	software development	376,823
	“software development”	92,070
About Grounded Theory	Grounded Theory	195,631
About Grounded Theory	“Grounded Theory”	24,725
Combined	“Grounded Theory” & “Software Development”	197
Combined	“Grounded Theory” & “Software” ¹	700

¹ Selected search string.

Table 5. Grounded theory variants.

Description	Qty Papers	Percent (%) over the Classified Articles (34)	Percent (%) over the Total Papers
Glaserian Grounded or Classical Grounded	12	35.3	17.14
Straussian Grounded or Evolved Grounded	22	64.7	31.43
Total	34	100	48.57

Table 6. Paper classification levels.

Documents Classified in the List of Software Development	Documents Classified in the List of Grounded Theory	Quantity of Papers	Percentage (%)
X	X	20	28.6
X		20	28.6
	X	14	20.0
		16	22.8
Total		70	100

Table 7. Loop 1 of iteration 1 in GSMS.

	Open Coding Concepts	Selective Coding Categories	Theoretical Coding Propositions
Keywording phase	{GGT} {SD} {Software Quality} {GGT type} [31]	{A GT of human factors in software quality construction} [31].	{A GT of …}
	{training} {agile} {Agile transformation process (ATP)} [32].	{A GT of the impact of inadequate and dysfunctional training on the Agile transformation process (ATP)} [32].
	{agile}{architecture} {agile architecture}	{A GT of agile architecture} [33].
	{teams} {high-performing teams} [34]	{Use GT to construct the core category of Performance Alignment Work in high-performing teams} [34].	{Use GT to …}
	{variability} [35]	{Use GT to define the mapping between variability types and mechanisms to handle variability} [35].
	{software process} {situational factor} [36]	{Use GT to construct a reference framework for the software process’s situational factors} [36].
	{testing} {engineers’ soft skills} [37]	{Use GT to understand testing engineers’ soft skills from different viewpoints} [37].
	{agile} {Agile software development} {Distributed teams} [38]	{Use GT to investigate the impact of trust on Agile software development with distributed teams} [38].
Mapping phase	{A GT of …} {Use GT to …}	{A GT of …} (48.6%) {Use GT to …} (54.3%)	Most articles were categorized in these two ways (98.6%)
Synthesis phase	{A GT of …} {Use GT to …}	Categorized in two ways	{Emerging theory}

Table 8. Loop 2 of iteration 1 in GSMS.

	Open Coding Concepts	Selective Coding Categories	Theoretical Coding Propositions
Keywording phase	Examples: {process}, {data}, {tools}	{Seven constructs or features of DSM} [39].	{Basic elements} {Initial Coding}, {Basic Coding}
	{Programmers} {tasks}	{Ten tasks performed by programmers} [10].
	{Awareness information} {Main categories}	{Four main categories in collaboration in awareness of information} [40].
	{GT method} {process}	{Three phases of the GT process} [19]
	{DSM} {process}	{Six propositions that affect the DSM process positively or negatively} [39].	{Result products} {Master core category}
	{Bug resolution}	{Four categories of activities during bug resolution} [10].
	{Evaluating awareness} {Discussion}	{Three methods and discussions for evaluating awareness} [40].
	{Software engineering} {Research}	{15 guidelines to conduct software engineering research} [19].
			Two propositions: {Basic Coding} & {Master core category}
Mapping phase	{Basic coding}	59 papers {Basic coding} (84.3%) and	Of the papers categorized in these two codings, over 80% were categorized.
Mapping phase	{Master core category}	61 {Master core category} (87.1%)
Synthesis phase	{Emerging theory} {Basic coding} {Master core category}	{Basic Coding} and {Master core category} lead to expressions with systematic rules	{Data constant comparative} {Data analysis in GT}
			Data Source
			GT Elements
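
The mapping-phase figures in Table 8 can be checked for consistency with a short sketch that searches for the total number of papers reproducing both reported percentages. The table does not state this total, so the helper name `consistent_totals` and the search bound of 500 are assumptions made for illustration only.

```python
def consistent_totals(observations, max_total=500):
    """Return every total N for which each (count, percentage) pair
    matches the reported value after rounding to one decimal place."""
    return [
        n
        for n in range(1, max_total + 1)
        if all(
            count <= n and round(100 * count / n, 1) == pct
            for count, pct in observations
        )
    ]


# (count, reported percentage) pairs copied from the mapping phase of Table 8.
table8_mapping = [(59, 84.3), (61, 87.1)]

totals = consistent_totals(table8_mapping)
print(totals)  # [70]
```

Under these assumptions, a corpus of 70 papers is the only total below the search bound that agrees with both percentages.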

Table 9. Examples of basic coding, master core category, and {A GT of…} and {Use of GT to…}.

Document	Basic Coding	Master Core Category	Emerging Theory
Stray et al. (2016) [39]	Seven constructs or features of DSM	Six propositions that affect the DSM process positively or negatively	A GT of DSM
Zayour et al. (2016) [10]	Ten tasks performed by programmers	Four categories of activities during bug resolution	A GT of debugging coding in an IDE
Ghanbari et al. (2016) [41]		Four categories make software processes challenging and could lead to the production of technical debts	A GT of technical debt in critical domains
Seth et al. (2015) [31]	Eight super-categories	Five findings that illustrate the human factor in the software quality construction	A GT of human factors in software quality construction
Santos et al. (2015) [42]		One inter-team knowledge-sharing effectiveness	A GT of effective knowledge sharing in agile teams
Warweman et al. (2012) [43]	Six forces to the agile architecture	Five strategies to the agile architecture	A GT of agile architecture
Gandomani et al. (2015) [32]	Six coding with six C’s family	One inadequate and dysfunctional training	A GT of the impact of inadequate and dysfunctional training on the agile transformation process
Fernández et al. (2015) [44]	Four subsets of the hypothesis	Five results structured about the suggested survey
Document	Basic Coding	Core Category	Emerging Theory
Fagerholm et al. (2015) [34]	Four factors: two positive and two negative	Three main categories of how teams adapt their performance	Use of GT to construct the core category of performance alignment work in high-performing teams [34]
Galster et al. (2015) [35]	Seven types of variability	Eight mechanisms to handle variability	Use of GT to define the mapping between variability types and mechanisms to handle variability
Clarke et al. (2012) [36]	Seven related research domains	Eight factors with 34 sub-factors	Use of GT to construct a reference framework of the situational factors affecting the software process
Yu et al. (2012) [37]	Four interview rounds	Five topics for further research	Use of GT to understand the testing engineers’ soft skills from different viewpoints
Dorairaj et al. (2012) [38]	Six causes of lack of trust	Four consequences of lack of trust	Use of GT to investigate the impact of trust on agile software development within distributed teams
Schenk et al. (2012) [40]	Four main categories in collaboration in awareness of information	Three methods and discussions to evaluate awareness	Use of GT to improve the evaluation of awareness in distributed collaborative teams
Adolph et al. (2011) [19]	Three phases of the GT process	15 guidelines to conduct software engineering research	Use of GT to apply GT to empirical software engineering research

Table 10. Comparison between GT elements and other GT characterizations.

Key Components of GT	Principles and Coding Procedure	Our {GT Elements}
Limited exposure in the literature	Role of a priori theory and literature review	It is a recommendation, not an element
Treat everything as data	GTM and types of data	Data collection in software development could be broader: program code, software architecture, GT emerging
Immediate and continuous data analysis		Data analysis
Theoretical sampling	Theoretical sampling.	Data analysis
Coding	Coding procedures: Glaserian is associated with open, selective, and theoretical coding. Straussian is associated with open, axial, and selective.	Basic coding and data analysis
Memoing		Core category
Memo sorting		Core category
Constant comparison	Constant comparative analysis	Constant comparison
Theoretical saturation: the theoretical saturation is reached and the theory emerges	The principle of emergence: Both the outcome (grounded theory) and the research design process should be emergent	GT emerging theory

Table 11. GT elements in another 18 papers.

1. Reference	2. GT Theory	3. Basic Coding	4. Master Core Category	5 ¹	6 ²	7 ³
Moshtari et al. (2022) [46]	{Use of GT to construct a generic attack surface model, using vulnerability data}	3 branches considered within attack surface models	4 major categories for each of these branches	✔	✔	✔
De Souza Santos and Ralph (2022) [47]	{A GT of coordination within software teams in the context of remote and hybrid work arrangements}	4 possible ways to have work arrangements	4 factors that influence coordination	✔	✔	✔
Rodríguez et al. (2022) [48]	{A GT of value-based feature selection in software engineering}	4 research gaps motivate the study of value-based feature selection	5 characterizations of value propositions	✔	✔	✔
Pillay and Wing (2019) [49]	{A GT of Agile UX integration practices and UX vision}	2 themes considered on Agile UX Integration and UX Vision	2 sprints were conducted around Agile UX Integration Practices and UX Vision	✔	✔	✔
Danilova et al. (2020) [50]	{A GT of developer preferences for security warning types}	4 aspects to consider for security warning design	3 phases based on the GT approach	✔	✔	✔
Pina et al. (2022) [51]	{A GT of technical debt prioritization from a developer’s perspective}	15 categories to group criteria	5 super-categories: 2 related to paying the technical debt and 3 related to not paying	✔	✔	✔
MacArthy and Bass (2021) [52]	{A GT of the role of skillset in the determination of DevOps implementation strategy}	7 memos emerged	6 strategies used by organisations to implement DevOps	✔	✔	✔
Chitchyan and Bird (2020) [53]	{Use of GT to generate additional software requirements through theory development in energy demand–response systems}	8 categories during the focused coding activity	2 approaches to research questions about the use and theory for DSR adoption	✔	✔	✔
Tuape et al. (2022) [54]	{A GT of non-technical characteristics affecting process adoption in small software companies (SSC)}	5 non-technical characteristics	5 hypotheses for predicting and explaining the adoption of software engineering processes by small software companies	✔	✔	✔
Ardo et al. (2022) [55]	{A GT of secure agile software development process}	3 ways of organizing groups	2 contributions about security practices and collaborative ceremonies	✔	✔	✔
Salman et al. (2022) [56]	{A GT of antecedents to confirmatory and disconfirmatory behavior of software testers}	4 antecedent classifications of confirmatory and disconfirmatory behavior	9 categories resulting from the antecedents	✔	✔	✔
Farias et al. (2019) [57]	{Use of GT to develop smart city mobile applications (SCMA)}	21 categories related to SCMA	16 propositions related to SCMA	✔	✔	✔
Masood et al. (2020) [58]	{Use of GT to explore self-assignment work in agile teams}	6 coding-paradigm adaptations about self-assignment in agile teams	6 coding-paradigm results about self-assignment in agile teams	✔	✔	✔
Dissanayake et al. (2021) [59]	{A GT of the role of coordination in security patch management}	3 codes relevant to the role of coordination in security patch management	4 inter-related dimensions of coordination in security patch management	✔	✔	✔
Dantas et al. (2022) [60]	{Use of GT to investigate technological risks and mitigation strategies in software projects}	8 codes to form the integration axioms	9 technological risk factors	✔	✔	✔
Rafi et al. (2020) [61]	{Use of GT to create a hypothetical framework to secure DevOps adoption}	15 security concerns	2 types of security concerns	✔	✔	✔
Ayas et al. (2021) [45]	{A GT of decision-making in microservices migrations}	3 decision-making processes	3 types of migrations		✔	✔
Lubin (2021) [62]	{Use of GT to study how statically typed functional programmers author code}	4 ways programmers approach problem domain modeling	4 observations applied in the statically-typed functional programming process	✔	✔	✔

¹ 5. Constant comparative Data; ² 6. Data collection; ³ 7. Data analysis.
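
One possible way to make the catalog in Table 11 machine-checkable is to store each row as a record and ask which GT elements a paper lacks. The sketch below is only an illustration: the class `CatalogEntry`, its field names, and the method `missing_elements` are hypothetical, and only the two rows shown are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One row of Table 11 (field names are illustrative assumptions)."""
    reference: str
    gt_theory: str
    basic_coding: str
    master_core_category: str
    constant_comparison: bool
    data_collection: bool
    data_analysis: bool

    def missing_elements(self):
        """List the GT elements that are absent from this entry."""
        checks = {
            "GT theory": bool(self.gt_theory),
            "basic coding": bool(self.basic_coding),
            "master core category": bool(self.master_core_category),
            "constant comparison": self.constant_comparison,
            "data collection": self.data_collection,
            "data analysis": self.data_analysis,
        }
        return [name for name, present in checks.items() if not present]


entries = [
    CatalogEntry(
        "Moshtari et al. (2022) [46]",
        "Use of GT to construct a generic attack surface model, using vulnerability data",
        "3 branches considered within attack surface models",
        "4 major categories for each of these branches",
        True, True, True,
    ),
    CatalogEntry(
        "Ayas et al. (2021) [45]",
        "A GT of decision-making in microservices migrations",
        "3 decision-making processes",
        "3 types of migrations",
        False, True, True,  # column 5 is empty for this row in Table 11
    ),
]

for entry in entries:
    print(entry.reference, "->", entry.missing_elements() or "all elements present")
```

Keeping the check inside the record means a new row can be validated as soon as it is added, without re-reading the whole table.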

Table 12. GT variants vs. GT elements.

	GT Variants			{GT Elements}
	Glaserian Grounded	Straussian Grounded	Total Number of Papers: Glaserian + Straussian Grounded	Data Collection	Constant Comparative Method	Data Analysis	Basic Coding	Master Core Category	GT Emerging	The Average Number of Papers
Quantity of papers	19	26	45	71	57	74	77	79	87	74.2
Percentage %	21.6	29.5	51.1	80.7	64.8	84.1	87.5	89.8	98.9	84.3
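
The percentage row and the averages in Table 12 can be recomputed from the paper counts with the following sketch. The total of 88 papers is not stated in the table; it is an assumption inferred from the reported values (for example, 19 papers corresponding to 21.6%), and the variable names are chosen for illustration only.

```python
# Paper counts copied from Table 12.
variant_counts = {
    "Glaserian grounded": 19,
    "Straussian grounded": 26,
}
element_counts = {
    "Data collection": 71,
    "Constant comparative method": 57,
    "Data analysis": 74,
    "Basic coding": 77,
    "Master core category": 79,
    "GT emerging": 87,
}

# Assumption: total number of papers, inferred from the reported percentages.
TOTAL_PAPERS = 88


def percentage(count, total=TOTAL_PAPERS):
    """Share of the corpus, rounded to one decimal place as in Table 12."""
    return round(100 * count / total, 1)


variant_total = sum(variant_counts.values())
variant_pct = {name: percentage(n) for name, n in variant_counts.items()}
element_pct = {name: percentage(n) for name, n in element_counts.items()}

# Average over the six GT elements (last column of Table 12).
mean_papers = sum(element_counts.values()) / len(element_counts)
mean_count = round(mean_papers, 1)
mean_pct = percentage(mean_papers)

print(variant_total, percentage(variant_total))
print(element_pct)
print(mean_count, mean_pct)
```

With the assumed total, every recomputed value agrees with the corresponding entry reported in Table 12.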

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
