Article

Securing Big Data for Knowledge Management into a Circular Economy

by Radu Bucea-Manea-Tonis 1,*, Valentin Kuleto 2, Šemsudin A. Plojović 2, Adrian Beteringhe 3 and Milena P. Ilić 2

1 Economic Sciences, Hyperion University of Bucharest, 030615 Bucuresti, Romania
2 Information Technology School ITS-Belgrade, LINK Group Belgrade, Faculty of Contemporary Arts Belgrade, University Business Academy in Novi Sad, 11000 Belgrade, Serbia
3 Informatics and Social Assistance, Danubius University of Galati, 800654 Galati, Romania
* Author to whom correspondence should be addressed.
Sustainability 2022, 14(22), 14851; https://0-doi-org.brum.beds.ac.uk/10.3390/su142214851
Submission received: 29 September 2022 / Revised: 4 November 2022 / Accepted: 6 November 2022 / Published: 10 November 2022

Abstract

The main purpose of this article is to find a solution for securing data transfer. Big data, a hallmark of Industry 4.0, comes with the risk of exposing knowledge in transit, and data management at this scale is becoming harder to administer; Big Data (BD) technologies are a natural answer. Sending data that contains information about actions to be taken by a system may expose the values of the system's parameters, and securing the data stops any third party from interfering with the communication and sending its own commands to the remote system. The article therefore presents a case study that shows how to secure BD read/write operations using the Gauss prime-counting function and how to use Secure Shell (SSH), a cryptographic network protocol, to generate key pairs for safely operating services over an unsecured network.
Keywords:
big data; NoSQL; RSA

1. Introduction

Today, data volume is no longer measured in terabytes but in yottabytes (1000 zettabytes). Older technologies, such as data warehouses and OLAP (On-Line Analytical Processing), are overwhelmed by the immense volume of data that companies must process and interpret to make sound decisions. Even BI (Business Intelligence) solutions are inefficient unless paired with Big Data (BD). BD covers the collection, storage, preprocessing, and visualization of data. More and more devices are connected to the Internet, producing, sending, and receiving data, and more and more devices at work, in homes, and in our environment are getting “smart” successors. Companies also create very large amounts of business data and information that need to be processed, saved, and made available and searchable. The sharp increase in processing and storage needs requires the development of both physical and electronic infrastructure: improved hardware, but also more efficient ways of storing, packaging, and searching data. This demands new resources, and hence new resource consumption, which is both an environmental challenge and an economic expense; any improvement in data compression or in easier, faster processing therefore enables significant savings. Being a collection of data that is huge in volume and growing exponentially with time, BD is increasingly associated with distributed storage and cloud computing (e.g., grid and stream processing). The Hadoop ecosystem, written in Java and accessible from other JVM (Java Virtual Machine)-based languages, is broadly adopted for storage.

2. Literature Review

The current economic, social, and ecological issues need to be resolved immediately; if they are not, the effects on future generations will be catastrophic. Therefore, a system of novel and long-lasting solutions ought to be left as a legacy to the future. Innovation, waste management, innovative economic and ecological systems, smart consumption and production, business strategies, green public procurement, and agile management, all based on Big Data technology, are possible sustainable solutions [1].
A circular economy may be able to address systemic issues and provide adequate conditions for developing a secure data management solution utilizing blockchain technology [1,2,3]. While taking into account the elements of a country's risk premium assessment, coherent data management will result in economic risk reduction, a decline in unemployment, thriving foreign investment, and GDP (gross domestic product) growth [2,3]. There is no doubt that anyone involved in socioeconomic or societal activities will eventually realize how important a circular economy is for the preservation of data warehouses, for economic expansion and the creation of new jobs as the main sources of innovation and competitive advantage, and for the efficient use of resources by businesses [4,5,6,7].
The generalization of competitive intelligence, used to produce more digital drivers for a more efficient market and a richer society, together with human resources capable of using blockchain technology and developing innovative data management solutions, form the main pillars of the sustainable implementation of CE (circular economy) principles. Hull et al. concluded that circular-economy approaches to education must inspire and instruct incubator members, and that circular-economy incubators would help entrepreneurs uncover more possibilities by depending more on community participation and accountability than on government aid. The incubator's informational resources must be carefully managed with Big Data to ensure both security and accuracy [8,9]. Within an integrated CE concept approach, CE plans are anticipated to open the door to intelligent assets and unleash the potential of the circular economy [10].
According to [10,11], there are nine known security vulnerabilities for NoSQL databases, the leading storage solution for BD (a minimal illustration of the first item follows the list):
  • Injection;
  • Broken Authentication and Session Management;
  • Cross-Site Scripting (XSS);
  • Insecure Direct Object References;
  • Security Misconfiguration;
  • Sensitive Data Exposure;
  • Missing Function Level Access Control;
  • Cross-Site Request Forgery (CSRF);
  • Unvalidated Redirects and Forwards.
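To make the first item concrete, the following is a minimal C++ sketch of our own (not taken from the cited analyses) showing how a query built by string concatenation lets attacker input change the query's structure; the JSON shape, field name, and mitigation are illustrative assumptions. Real NoSQL drivers avoid the problem by building queries as data structures rather than strings.

#include <iostream>
#include <string>

// Naive query construction: user input is spliced directly into a
// JSON-like query document, so its structure can be hijacked.
std::string naive_query(const std::string& user) {
    return "{ \"name\": \"" + user + "\" }";
}

// Minimal mitigation: reject input that could break out of the string
// literal; production code should build the query as a data structure.
bool looks_safe(const std::string& user) {
    return user.find('"') == std::string::npos &&
           user.find('\\') == std::string::npos &&
           user.find('$') == std::string::npos;
}

int main() {
    std::string evil = "rmanea\", \"$where\": \"1 == 1";
    std::cout << naive_query(evil) << '\n';               // structure hijacked
    std::cout << (looks_safe(evil) ? "ok" : "rejected") << '\n';
    return 0;
}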

3. Materials and Methods

The HBase system has been developed since 2006 out of the necessity to overcome the limitations present in relational database management systems (RDBMS). HBase is the implementation of a logically distributed (HBase, BigTable) [19] and physically distributed (Hadoop File System—HDFS) system. Except for the row key, no field is required, resulting in a sparse array—a table with rows whose lengths may differ.

3.1. NoSQL Database

NoSQL designates a category of databases that are not built according to the relational database model (RDBMS, Relational Database Management Systems). Since researchers have shown that traditional databases are not suitable for Big Data management [12] and that NoSQL databases are better in terms of scalability and availability [13], NoSQL databases are increasingly used in cloud computing [14,15]. NoSQL systems do not use SQL as a query language and do not provide full ACID guarantees. Their querying capabilities are based on the data and distribution models and on the consistency guarantees [16]. Usually, consistency is guaranteed only for transactions limited to a single data element. For multimedia, however, the challenge is to transfer data where opportunistic protocols usually need large messages for the restoration of the cluster, leading to higher energy consumption and packet loss [17].
The most notable advantage of the distributed architecture is fault tolerance. Depending on how the data are recorded, NoSQL systems can be divided into:
  • Key-value pairs (Apache Cassandra, Dynamo, Hibari, OpenLink Virtuoso, Riak, Apache River, Apache Hadoop, Tarantool, etc.);
  • BigTable implementations (BigTable is a proprietary Google storage format, compressed and high-performance);
  • Document-oriented databases (BaseX, Clusterpoint, Apache CouchDB, eXist, OrientDB, RavenDB, SimpleStore, MongoDB);
  • Graph databases, in which relationships are well represented in the form of a graph (AllegroGraph, DEX, Neo4j, OWLIM).
The CAP theorem identifies three properties that describe the behavior of distributed databases. Consistency (C) refers to the consistency of data from the point of view of the system's clients; in other words, all clients see the same data at all times. Availability (A) guarantees that each request will receive a response, and partition tolerance (P) is the ability of the system to keep functioning when nodes fail. According to the CAP theorem, a system cannot simultaneously satisfy all three properties, but it can excel at any two of them. Thus, we may define three storage strategies:
  • Key-Value—e.g., Riak; advantages: A + P;
  • Document—e.g., Couchbase and MongoDB; advantages: C + P;
  • Columnar—e.g., BigTable, HBase, and Hypertable; advantages: C + P.
The first attempt towards columnar databases was made by Stephen Tarin, who patented the Trans-Relational (TR) model, considered the most significant contribution since Codd's relational model. The Tarin transformation method targets data storage, i.e., the physical level within relational databases. This improves the physical independence of the data, since the stored representation can be optimized independently of the logical schema, resulting in:
  • Direct image systems;
  • No index is needed;
  • A single optimal resulting model.
The BD system is scalable and easy to manage. In a TR table, the intersection of a row with a column is called a cell and can be addressed using an index pair [i, j]. The values contained in such cells are numbers corresponding to a row index and are stored in the record reconstruction table. The obvious advantage is the ordered and distinct storage of data in the field-oriented table of values. In other words, the entries in the cells of the record reconstruction table are pointers to the rows of the field-oriented table of values. Note that the reconstruction table is unique only in the case of the condensed version of the field-oriented table. It is very useful to consider this table (along with the row domains) as a set of histograms, one for each condensed column, and the record reconstruction table as a set of permutations; these histograms and permutations are, in fact, the essence of the TR representation [18]. A minimal sketch of this idea follows.
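As a rough single-column illustration of the TR representation (our own sketch, with invented values), the field-oriented table of values stores the column in sorted order, while the record reconstruction table is a permutation of indices that recovers the original rows:

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

int main() {
    // One column of the original relational table (illustrative values).
    std::vector<std::string> field = {"braila", "galati", "braila", "arad"};

    // Field-oriented table of values: the column stored in sorted order.
    std::vector<std::string> sorted_vals = field;
    std::sort(sorted_vals.begin(), sorted_vals.end());

    // Record reconstruction table: for each original row i, the index of
    // its value inside the sorted column (a permutation over row indices).
    std::vector<std::size_t> recon(field.size());
    for (std::size_t i = 0; i < field.size(); ++i)
        recon[i] = std::lower_bound(sorted_vals.begin(), sorted_vals.end(),
                                    field[i]) - sorted_vals.begin();

    // Reconstruct row 2 of the original table from the two structures.
    std::cout << sorted_vals[recon[2]] << '\n'; // prints "braila"
    return 0;
}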

3.1.1. Logical Level of Data

The HBase architecture consists of the following components: Master Server, Zookeeper, MapReduce, and HDFS. The Master Server assigns regions to region servers with the help of Apache Zookeeper and is responsible for balancing the region load between region servers; it relieves busy servers and moves regions to less busy ones.
We note that the region servers allocate rows in the memory of client computers redundantly according to the following model, see Table 1:
Zookeeper provides services such as maintaining configuration information, naming, distributed synchronization, and facilitating clients' communication with the region servers.

3.1.2. The Physical Level of Data

BD integrates first-level vertical apps, log data apps, media apps, data as a service, BI, and analytics/visualization. This level is supported in the cloud by infrastructure as a service, structured databases, and other analytic and operational databases. On the physical level, HBase flushes data from the MemStore into column-family files (HFiles) managed by the Hadoop Distributed File System (HDFS), see Figure 1:
In Hadoop, the physical level of data consists of:
  • MapReduce;
  • HDFS (Hadoop distributed File System);
  • YARN (Yet Another Resource Framework)—manages resources, allocates tasks, and schedules jobs;
  • Common Utilities—Java libraries.
MapReduce is a Java programming paradigm that allocates data to computer memory according to a three-dimensional scheme in order to process large amounts of data in a parallel and distributed fashion (a sketch of the data model follows the list):
  • Columns are stored in associative structures (Map) consisting of key (Key)–value (Value) pairs;
  • Column families are data dictionaries that allow data to be managed in columns by a timestamp (Timestamp);
  • Rows are associative structures that aggregate the family of columns by row number (Row Id).
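A rough single-machine C++ model of this three-dimensional addressing (the table name, column qualifier, and timestamps are our own illustrative assumptions, not the HBase API) could look as follows:

#include <iostream>
#include <map>
#include <string>

// Logical layout: row id -> "family:qualifier" column -> timestamp -> value.
using Timestamp = long long;
using Cell      = std::map<Timestamp, std::string>;
using RowData   = std::map<std::string, Cell>;
using Table     = std::map<std::string, RowData>;

int main() {
    Table customer;
    customer["rmanea"]["addr:city"][1668038400LL] = "braila";
    customer["rmanea"]["addr:city"][1668124800LL] = "galati"; // newer version

    // The current version of a cell is the entry with the largest timestamp.
    const Cell& cell = customer["rmanea"]["addr:city"];
    std::cout << cell.rbegin()->second << '\n'; // prints "galati"
    return 0;
}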
The stages of the MapReduce process are as follows (a single-process word-count sketch is given after the list):
  • Mapping the individual components that form the input, e.g., creating pairs of the type (word, frequency) or (year, consumption);
  • Ordering and grouping the pairs by key (Combining), e.g., by word or year;
  • Reducing, i.e., aggregating the values using an aggregation function, e.g., SUM (frequency) or AVG (consumption).
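The classic word-count example sketches the three stages in a single process (in real MapReduce each stage is distributed across the cluster; the input string here is our own):

#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

int main() {
    std::string input = "big data big tables big data";

    // Map: emit a (word, 1) pair for every token of the input.
    std::vector<std::pair<std::string, int>> pairs;
    std::istringstream in(input);
    for (std::string w; in >> w; ) pairs.emplace_back(w, 1);

    // Combine: order and group the pairs by key (std::map keeps keys sorted).
    std::map<std::string, std::vector<int>> groups;
    for (const auto& p : pairs) groups[p.first].push_back(p.second);

    // Reduce: aggregate each group with SUM.
    for (const auto& g : groups) {
        int sum = 0;
        for (int v : g.second) sum += v;
        std::cout << g.first << ": " << sum << '\n';
    }
    return 0;
}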
HDFS stores metadata (file name, size, block size, and the type of operation to execute) using the NameNode, one per cluster, and stores the data itself, grouped in racks, using the DataNodes.

3.2. Improving the Security in Apache HBase

The Java Cryptography software supports secure streams and sealed objects [20]. Another supported cipher is the custom HBase Cipher implementation from org.apache.hadoop [21].
In [22], the security analysis on the HBase database shows the following:
  • HBase supports token-based authentication for MapReduce tasks, and user authentication is done using SASL (Simple Authentication and Security Layer) with Kerberos;
  • HBase supports authorization by access control lists, with permissions including create, read, write, and admin; HBase also supports auditing;
  • HBase does not support encryption of data at rest but does support encryption of the communication between client and server;
  • There is no report of a denial-of-service attack;
  • There is no report for script injection.
All values written to HBase are stored in cells. Cells can now also carry an arbitrary number of tags, i.e., metadata considered distinct from the key and the value, see Figure 2.
According to [22], enhancing data security in the cloud is ensured by:
  • Data Anonymization;
  • Encryption;
  • Data Fragmentation with RPF (Random Pattern Fragmentation).
HFile blocks are encrypted as they are written and decrypted as they are read, thanks to a new extensible cryptographic codec and key management framework in HBase, see Figure 2.
Key management is simple, with the Java KeyStore being integrated as the default provider. The steps for transparent encryption are as follows (a minimal HBase shell sketch is given after the list):
  • Segregate sensitive information into one or a few column families;
  • Trigger major compaction to normalize encryption and data-keying state;
  • Store a copy of the current master key with an alternate alias, e.g., “hbase-master-alt”.
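As a minimal HBase shell sketch of the first two steps (assuming the transparent-encryption codec and a key provider are already configured in hbase-site.xml, and using an illustrative table name):

# Segregate sensitive information into a column family encrypted with AES
# (assumes hbase.crypto.keyprovider and the master key are configured).
create 'customer', {NAME => 'addr', ENCRYPTION => 'AES'}

# Trigger a major compaction to normalize the encryption and
# data-keying state of the existing store files.
major_compact 'customer'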

4. Case Study

In our case study, we use an encryption algorithm with pairs of public and private keys, based on the difficulty of identifying the prime divisors of large natural numbers. The first step in highlighting these divisors was made by Mersenne [23], who discovered the series of powers p for which the formula 2^p − 1 generates prime numbers: 2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127.

5. Results

Starting from Fermat's little theorem [24], Fermat proposed another formula for generating prime numbers, 2^(2^n) + 1, which yields the series 3, 5, 17, 257, 65,537 for natural values of n less than or equal to 4. Since the occurrence of these prime numbers on the axis of natural numbers seems random, Gauss created the function π(x), which for any natural value of x returns the number of primes smaller than x [25]. The distribution of the values π(x)/x overlaps the function 1/ln(x), see Figure 3.
In this specific use case, the coding and decoding algorithms have to be stored in the same computing subsystem that performs all the other necessary actions, so the algorithms' implementations have to be developed in the same programming environment. The code for Algorithm 1, which evaluates the function π(x)/x up to the number 10,000, is presented below:
Algorithm 1: The code evaluates the function π(x)/x up to the number 10,000
#include <stdio.h>

/* Returns n if n is prime, 0 otherwise. */
int chk_prime(int n)
{
  int i, flag = 0;
  for (i = 2; i <= n / 2; ++i)
  {
    /* condition for a non-prime number */
    if (n % i == 0)
    {
      flag = 1;
      break;
    }
  }
  if (n == 1)
  {
    printf("1 is neither a prime nor a composite number.");
  }
  else
  {
    if (flag == 0)
      return n;
  }
  return 0;
}

/* Prints pi(p)/p, the density of primes up to p. */
int cnt_prime(double p)
{
  double s = 0;
  for (int i = 2; i <= p; i++)
    if (chk_prime(i) > 0)
      s++;
  double pi = s / p;
  printf("%lf \n", pi);
  return 0;
}

int main()
{
  /* The limit is 10000; the literal 10.000 would be the double 10.0
     in C and would stop the loop at 10. */
  for (int d = 2; d <= 10000; d++)
  {
    cnt_prime(d);
  }
  return 0;
}
As we can see in Figure 3, the function tends slowly to zero, which means that for higher natural values, the probability of identifying a prime number decreases.
The basic formula of the RSA algorithm starts from the concept of congruence in arithmetic, relevant for determining the largest prime divisor of a number; for example, 17 ≡ 2 (mod 5) states that 17 − 2 is a multiple of 5. The keys satisfy d·e ≡ 1 (mod (p − 1)(q − 1)), where e is the public key and d is the private key. The two prime numbers p and q, from which the first part of the public key, the modulus n, is obtained (n = p × q), are distinct for each client and are generated, when the client account is created, by randomly choosing from a set of the first 10,000 prime numbers.
For establishing the criteria for choosing the second part of the public key, the exponent e, Euler's totient function is computed as φ(n) = (p − 1) × (q − 1). The value of e is chosen as big as possible, as long as it is a prime number smaller than φ(n) and not a factor of n. The private key exponent d is computed as the modular inverse of e modulo the least common multiple of p − 1 and q − 1, where the extended Euclidean algorithm is used for computing the greatest common divisor and the inverse. For bigger values of p and q, the primes are chosen from the last 1000 values of the set, which offers enough distinct combinations for one million clients. A toy numerical sketch of this scheme is given below.
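For illustration, the following is a minimal, non-production C++ sketch of RSA key generation, encryption, and decryption with deliberately tiny primes; the values p = 61, q = 53, e = 17, and the message 42 are our own assumptions, and d is computed here modulo φ(n) rather than modulo lcm(p − 1, q − 1), which is an equally valid variant:

#include <iostream>

// Extended Euclid: returns x with a*x ≡ 1 (mod m), assuming gcd(a, m) = 1.
long long mod_inverse(long long a, long long m) {
    long long old_r = a, r = m, old_x = 1, x = 0;
    while (r != 0) {
        long long q = old_r / r, t;
        t = r; r = old_r - q * r; old_r = t;
        t = x; x = old_x - q * x; old_x = t;
    }
    return ((old_x % m) + m) % m;
}

// Fast modular exponentiation by repeated squaring.
long long mod_pow(long long base, long long exp, long long mod) {
    long long result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

int main() {
    long long p = 61, q = 53;          // two distinct primes (toy sizes)
    long long n = p * q;               // modulus, first part of the public key
    long long phi = (p - 1) * (q - 1); // Euler's totient of n
    long long e = 17;                  // public exponent, coprime to phi
    long long d = mod_inverse(e, phi); // private exponent: d*e ≡ 1 (mod phi)

    long long msg = 42;
    long long c = mod_pow(msg, e, n);      // encryption: c = msg^e mod n
    std::cout << mod_pow(c, d, n) << '\n'; // decryption prints 42
    return 0;
}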
Currently, both probabilistic and deterministic methods are used for generating large prime numbers, following Euler's model of the polynomial form x^2 + x + q, which returns prime numbers for values of x between 0 and q − 2 (for suitable values of q, such as 41). In addition, the ssh-keygen -t rsa command is used to generate a key pair with SSH for performing different operations on the cluster, such as start, stop, and distributed daemon shell operations, see Figure 4.
Further on, we need to copy the public key from the test_rsa.pub file to the server's authorized keys in order to use SSH key-based authentication to log in. Now we may safely perform HBase CRUD operations, e.g., put '/user/user01/customer', 'rmanea', 'addr:city', 'braila' or get '/user/user01/customer', 'rmanea'. A sketch of this sequence follows.
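A minimal shell sketch of the sequence (the key file name, user, and host are our own illustrative assumptions):

# Generate an RSA key pair.
ssh-keygen -t rsa -f ~/.ssh/test_rsa

# Copy the public key into the server's authorized keys.
ssh-copy-id -i ~/.ssh/test_rsa.pub user01@cluster-node

# Log in with key-based authentication and open the HBase shell.
ssh -i ~/.ssh/test_rsa user01@cluster-node
hbase shell

# CRUD operations can now run over the secured channel, e.g.:
#   put '/user/user01/customer', 'rmanea', 'addr:city', 'braila'
#   get '/user/user01/customer', 'rmanea'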

6. Discussion

The idea of this paper was centered on prime number identification as a possible solution for double-key cryptographic systems. Furthermore, we implemented the Gauss π function, and its results were validated by comparing them with the 1/ln(x) function; both tend slowly to zero, meaning that for higher natural values, the probability of identifying a prime number decreases. After that, we tested the RSA algorithm based on the extended Euclidean formula and concluded that using a deterministic method for generating primes is a feasible way to secure HBase CRUD operations. Further research will address the minimal requirements that the public key generation algorithm should meet to use SSH key-based authentication.
A paper by Kaighobadi and Fernandez [26] also aims to find a way to minimize the requirements for SSH key-based authentication. However, their process is based on the use of an SKE algorithm relying on the results of the Diffie–Hellman key exchange, and it demands more resources than the solution proposed in this paper, which is also referred to in a paper by Ramakrishna and colleagues [27].

7. Conclusions

Based on what has been said in this paper, the following conclusions can be drawn:
The risk of knowledge transfer is a hallmark of Industry 4.0's use of big data, and large amounts of data may represent a challenge for data administration. To demonstrate how to secure read/write operations using the Gauss prime-counting function and Secure Shell (SSH)-generated key pairs, the article provided a case study in this respect. Compared with the implementation of a classical algorithm, which needed an average time (over 10 runs) of 674 ms for computing primes with an upper limit xmax of 1000, the optimized implementation needed only 278.52 ms for the same limit under the same conditions, and 16.85 s for xmax = 10,000, see Figure 5:
The efficiency of Algorithm 2 was measured using the POSIX time.h C library. Thus, we called the clock_gettime function with the start and end parameters before the main loop is executed and after the cnt_prime function returns the last value, respectively:
Algorithm 2: Measuring the efficiency using the POSIX time.h library
// start timer:
clock_gettime(CLOCK_MONOTONIC, &start);
// unsync the C and C++ standard I/O streams:
ios_base::sync_with_stdio(false);

for (int d = 2; d <= 10000; d++)
{
  cnt_prime(d);
}

// stop timer:
clock_gettime(CLOCK_MONOTONIC, &end);

// calculate the time taken by the execution, in seconds:
double time_taken;
time_taken = (end.tv_sec - start.tv_sec) * 1e9;
time_taken = (time_taken + (end.tv_nsec - start.tv_nsec)) * 1e-9;

cout << "Time taken by program is: " << fixed << setprecision(9) << time_taken;
cout << " sec" << endl;
A NoSQL database system is suitable for the scalability and availability issues of Big Data knowledge transfer. Since data in a NoSQL system are stored in one table and access is regulated by the key of each user, data retrieval is significantly faster than with conventional data storage methods. On the other hand, the question arises whether, and to what extent, these inclinations can endanger the security of the system. A very large number of users with simultaneous access to data, in a system where data come in different formats and memory capacities, causes concerns about system security and data transfer.
As shown by the research conducted in this paper, it is critical to define pairs of keys based on large prime numbers in order to improve the security of data/knowledge transfer to the highest possible level and to reduce the risk of data corruption and leakage to a minimum. All of the above points converge to the conclusion that Big Data access should mainly be granted through NoSQL data storage systems using a BigTable columnar architecture (e.g., HBase) for secured knowledge transfer.

Author Contributions

Conceptualization, R.B.-M.-T. and Š.A.P.; methodology, R.B.-M.-T. and V.K.; software, R.B.-M.-T. and Š.A.P.; validation, M.P.I. and V.K.; formal analysis, V.K. and A.B.; investigation, A.B. and M.P.I.; resources, M.P.I. and A.B.; data curation, A.B. and M.P.I.; writing—original draft preparation, R.B.-M.-T. and V.K.; writing—review and editing, R.B.-M.-T. and Š.A.P.; visualization, V.K. and M.P.I.; supervision, Š.A.P. and A.B.; project administration, R.B.-M.-T. and M.P.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Franco-García, M.; Carpio Aguilar, J.-C.; Bressers, H.T. Towards Zero Waste—Circular Economy Boost, Waste to Resources; Springer International: London, UK, 2019. [Google Scholar]
  2. Raluca-Florentina, T. The Utility of Blockchain Technology in the Electronic Commerce of Tourism Services: An Exploratory Study on Romanian Consumers. Sustainability 2022, 14, 943. [Google Scholar] [CrossRef]
  3. Ilić, M.P.; Ranković, M.; Dobrilović, M.; Bucea-Manea-Țoniş, R.; Mihoreanu, L.; Gheța, M.I.; Simion, V.-E. Challenging Novelties within the Circular Economy Concept under the Digital Transformation of Society. Sustainability 2022, 14, 702. [Google Scholar] [CrossRef]
  4. Hysa, E.; Kruja, A.; Naqeeb, R.; Laurenti, R. Circular Economy Innovation and Environmental Sustainability Impact on Economic Growth: An Integrated Model for Sustainable Development. Sustainability 2020, 12, 4831. [Google Scholar] [CrossRef]
  5. Trica, C.L.; Banacu, C.S.; Busu, M. Environmental Factors and Sustainability of the Circular Economy Model at the European Union level. Sustainability 2019, 11, 1114. [Google Scholar] [CrossRef] [Green Version]
  6. Laurenti, R.; Singh, J.; Frostell, B.; Sinha, R.; Binder, C.R. The Socio-Economic Embeddedness of the Circular Economy: An Integrative Framework. Sustainability 2018, 10, 2129. [Google Scholar] [CrossRef] [Green Version]
  7. Lazarevic, D.; Valve, H. Narrating expectations for the circular economy: Towards a common and contested European transition. Energy Res. Soc. Sci. 2017, 31, 60–69. [Google Scholar] [CrossRef]
  8. Hull, C.E.; Millette, S.; William, E. Challenges and opportunities in building circular-economy incubators: Stakeholder perspectives in Trinidad and Tobago. J. Clean. Prod. 2021, 296, 126412. [Google Scholar] [CrossRef]
  9. Bucea-Manea-Țoniş, R.; Šević, A.; Ilić, M.P.; Bucea-Manea-Țoniş, R.; PopovićŠević, N.; Mihoreanu, L. Untapped Aspects of Innovation and Competition within a European Resilient Circular Economy. A Dual Comparative Study. Sustainability 2021, 13, 8290. [Google Scholar] [CrossRef]
  10. Nica, E.; Tudorica, B.G.; Dusmanescu, D.-M.; Popescu, G.; Breaz, A.M. Databases security issues—A short analysis on the emergent security problems generated by NoSQL databases. Econ. Comput. Econ. Cybern. Stud. Res. 2019, 53, 115–119. [Google Scholar] [CrossRef]
  11. Sadalage, P.J.; Fowler, M. NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence; Addison-Wesley: Upper Saddle River, NJ, USA, 2013. [Google Scholar]
  12. Ramzan, S.; Bajwa, I.S.; Ramzan, B.; Anwar, W. Intelligent Data Engineering for Migration to NoSQL Based Secure Environments. IEEE Access 2019, 7, 69042–69057. [Google Scholar] [CrossRef]
  13. Mostajabi, F.; Safaei, A.; Sahafi, A. A Systematic Review of Data Models for the Big Data Problem. IEEE Access 2021, 9, 128889–128904. [Google Scholar] [CrossRef]
  14. Han, J.; Song, M.; Song, J. A Novel Solution of Distributed Memory NoSQL Database for Cloud Computing. In Proceedings of the 10th IEEE/ACIS International Conference on Computer and Information Science, Sanya, China, 16–18 May 2011. [Google Scholar]
  15. Gessert, F.; Wingerath, W.; Friedrich, S.; Ritter, N. NoSQL database systems: A survey and decision guidance. Comput. Sci. Res. Dev. 2017, 32, 353–365. [Google Scholar] [CrossRef]
  16. Rajasekar, V.; Jayapaul, P.; Krishnamoorthi, S.; Saracevic, M.; Elhoseny, M.; Al-Akaidi, M. Enhanced WSN Routing Protocol for Internet of Things to Process Multimedia Big Data. Wireless Pers. Commun. 2021. [Google Scholar] [CrossRef]
  17. Date, C.J. An Introduction to Database Systems, 8th ed.; Pearson: London, UK, 2004; pp. 154–196. [Google Scholar]
  18. Wiese, L. Advanced Data Management: For SQL. Cloud and Distributed Databases; Walter de Gruyter GmbH & Co KG: Berlin, Germany, 2015. [Google Scholar]
  19. The Java EE 6 Tutorial. Available online: https://docs.oracle.com/javaee/6/tutorial/doc/bnbwy.html (accessed on 8 February 2021).
  20. Security Features in Apache HBase. Available online: https://www.slideshare.net/HBaseCon/features-session-2 (accessed on 8 February 2021).
  21. ElDahshan, K.A.; AlHabshy, A.A.; Abutaleb, G.E. Data in the time of COVID-19: A general methodology to select and secure a NoSQL DBMS for medical data. PeerJ Comput. Sci. 2020, 10, 13–14. [Google Scholar] [CrossRef] [PubMed]
  22. Santos, N.L.; Ghita, B.; Masala, G.L. Enhancing Data Security in Cloud Using Random Pattern Fragmentation and a Distributed NoSQL Database. In Proceedings of the 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), Bari, Italy, 6–9 October 2019. [CrossRef]
  23. Mersenne, M. Preface to Cogitata Physico-Mathematica; Bertier: Paris, France, 1644; pp. 4–10. [Google Scholar]
  24. Golomb, S.W. Combinatorial proof of Fermat’s “Little” Theorem. Am. Math. Mon. 1956, 63, 718. [Google Scholar] [CrossRef] [Green Version]
  25. Albert, A.A. Modern Higher Algebra, 2nd ed.; Cambridge University Press: Cambridge, UK, 2015; ISBN 978-1-107-54462-8. [Google Scholar]
  26. Kaighobadi, K.; Fernandez, E.B. A Pattern for the Secure Shell Protocol. In Proceedings of the 1st LACCEI International Symposium on Software Architecture and Patterns (LACCEI-ISAP-MiniPLoP’2012), Panama City, Panama, 23–27 July 2012. [Google Scholar]
  27. Ramakrishna, P.; Harivamsi, R.; Sujan, K.; Vijayakumar, P.; Jagannath, M. FPGA Implementation of Hybrid Asymmetric Key-Based Digital Signature and Diffie–Hellman Key Exchange Algorithm for IoT Application. Int. J. Electron. Secur. Digit. Forensics 2022, 1, 1. [Google Scholar] [CrossRef]
Figure 1. New HBase data record.
Figure 2. HBase new extensible key management.
Figure 3. π(x)/x distribution of data variation is close to that of the 1/ln(x) function.
Figure 4. Generate a key-value pair using SSH.
Figure 5. Generating primes up to 10,000.
Table 1. RAID (Redundant Array of Inexpensive Disks) is similar to the NoSQL data redundancy model for performance improvement.

Hosts       Node 1   Node 2   Node 3   Node 4
Row No. 1   1        2        3        4
Row No. 2   3        4        5        6
Row No. 3   5        6        7        8
Row No. 4   7        8        9        10
