1. Introduction
With the rise and popularity of cloud computing technology, more and more data users are motivated to outsource their data to cloud for storage, and start to enjoy the various advantages brought by cloud computing, such as large storage capacity, flexible accessibility and vast computationality. By outsourcing data to the cloud, data users are relieved from the heavy burden of local storage and management costs. However, as the cloud server is not fully trusted and the outsourced data may contain sensitive information, sensitive data privacy and security in the cloud naturally become a primary concern for users. In order to address this issue, sensitive data should be encrypted before outsourcing to the cloud. Typical practical applications include cloud-based storage systems, such as electronic medical record systems where electronic health record data-sharing can help doctors to effectively assess patients’ conditions and make correct treatments. To prevent the leakage of patient information, we store the encrypted medical data on the cloud. Although encryption can ensure the confidentiality of data, it brings the inconvenience of data access and processing. For example, when doctors want to query the required medical data, the cloud server needs to perform search functions without knowing the content of the data. However, the conventional information retrieval methods based on plaintext can not directly be used on ciphertext. Therefore, how to search over encrypted data is of importance and becomes a new challenge.
Homomorphic encryption [
1,
2] can operate on encrypted data, it can be used to search the keyword over encrypted data theoretically. However, constructions based on homomorphic encryption needs large key sizes and brings huge communication costs. To better solve the keyword search problem over encrypted data, searchable encryption (SE) was proposed as an efficient solution. SE not only enables users to search queries over encrypted data stored on an untrusted server but also protects data privacy and search privacy. Compared with homomorphic encryption, SE is designed for keyword search, which is more efficient.
Song et al. [
3] seminally proposed the notion of searchable encryption, which was the first concrete symmetric searchable encryption (SSE) construction. Subsequently, a number of efficient SSE schemes [
4,
5,
6,
7,
8,
9,
10] have been proposed. However, data users have to securely share key for encryption in SSE schemes due to the property of symmetric cryptography. Therefore, most SSE schemes inevitably suffer from secret key distribution and management problems. Later, at Eurocrypt’04, Boneh et al. [
11] first introduced the concept of public key searchable encryption based on the public key encryption algorithm, namely public key encryption with keyword search (PEKS). PEKS solved the secret key management and distribution yield by SSE, so PEKS has aroused great attention in research and has a broader application prospect.
PEKS is a critically important and promising technique, and this paper concentrates on PEKS. At present, various PEKS schemes have been proposed in the literature [
12,
13,
14,
15,
16,
17,
18,
19,
20], and PEKS has been rapidly developed [
21]. Most existing PEKS schemes focus on data users’ rich search functionalities, such as conjunctive keyword search [
14] and multi-keyword search [
20]. In the above schemes, all search users can be regarded as authorized users, who can have unrestricted access to the system. However, it is not suitable for practical requirements. For example, only the mail receiver is allowed to search on the encrypted emails in the cloud-based email system, while other users have no such permissions. To solve this problem, CP-ABE technology is widely adopted as a viable tool to achieve flexible data access control over encrypted data, which can gain one-to-many encryption instead of one-to-one.
Zheng et al. [
22] constructed an attribute-based keyword search scheme on the basis of CP-ABE, however, computation and storage costs grow linearly with the number of system attributes. Yin et al. [
23] improved the work of Zheng, but encryption and key generation operations also lead to much higher computational overheads. Thus, in this paper, we propose a new privacy-preserving and efficient public key encryption with keyword search scheme based on ciphertext-policy attribute-based encryption (PEKS-CPABE) to support both fine-grained access control and keyword search over encrypted data simultaneously. Specifically, the main contributions of this paper can be summarized as follows.
We propose an efficient and privacy-preserving PEKS-CPABE scheme, which enables the data owner to control the search and use of its outsourced encrypted data in the cloud according to its access control policy. Our scheme can achieve keyword search queries over encrypted data with fine-grained access control;
We formalize the security definition, and prove that our scheme achieves selective indistinguishability security against an adaptive chosen keyword attack and can also ensure keyword privacy;
We present the performance analysis in terms of theoretical analysis and experimental analysis, and demonstrate the efficiency of our scheme. Finally, the experimental results show that our PEKS-CPABE scheme performs better than the CP-ABKS scheme proposed in [
22] and CP-ABSE scheme proposed in [
23].
The remainder of this paper is organized as follows:
Section 2 presents an overview of related work.
Section 3 reviews some necessary and basic preliminary notions used in this paper.
Section 4 defines the system model, algorithm description and security model of our proposed scheme.
Section 5 introduces the concrete construction of our proposed scheme.
Section 6 provides detailed security proof of our proposed scheme.
Section 7 shows the performance analysis of our proposed scheme in terms of theoretical analysis and experimental analysis. Finally,
Section 8 summarizes this paper.
2. Related Work
In 2004, Boneh et al. [
11] devised the first public key searchable encryption scheme based on a standard public key encryption algorithm. In this scheme, multiple users are allowed to encrypt data using the public key, but only the private key holder can search the keyword over encrypted data. Inspired by Boneh et al. pioneering work, more and more researchers have been working on the public key searchable encryption to achieve various functionalities. Park et al. [
13] first proposed a PEKS scheme to support conjunctive keyword search on encrypted data. Subsequently, Boneh and Waters [
14] extended PEKS to support conjunctive, subset, and range queries on encrypted data. Later on, Lv et al. [
18] proposed an expressive and secure PEKS scheme based on composite-order groups to support conjunctive, disjunctive and negation search operations that can be extended to support a range search. Huang and Li [
19] proposed a public key authenticated encryption with keyword search scheme that can resist the inside keyword guessing attack, in which the data user can use the secret key to authenticate the data. Recently, Zhang et al. [
20] constructed a novel PEKS scheme supporting semantic multi-keyword search by leveraging an efficient inner product encryption technology. These schemes guarantee that data users are able to efficiently perform search queries and prevent the untrusted cloud server from infringing data confidentiality and search privacy.
Attribute-based encryption (ABE), as a useful data encryption tool to address the problem of fine-grained data sharing and decentralized access control, was first proposed by Sahai and Waters [
24]. This kind of new cryptographic technology enables users to achieve access control over encrypted data by using an access control policy associated with the private key or ciphertext. There are two types of attribute-based encryption, depending on whether the private key or ciphertext is associated with the access control policy, namely key-policy attribute-based encryption (KP-ABE) and ciphertext-policy attribute-based encryption (CP-ABE). Goyal et al. [
25] proposed the first KP-ABE scheme, where the private key is associated with an access control policy and the ciphertext is associated with attributes. A user can decrypt a ciphertext only if the attributes that are used for encryption satisfy the access policy on the user private key. Bethencourt et al. [
26] proposed the first CP-ABE scheme, where the ciphertext is associated with an access control policy and the key is associated with attributes. A user can decrypt a ciphertext only if the attributes on the user private key satisfy the access policy associated with the ciphertext. Compared with KP-ABE, CP-ABE is a preferred choice for designing an access control mechanism, because it is conceptually closer to traditional role-based access control.
Based on the special features of ABE, Zheng et al. [
22] proposed a ciphertext-policy attribute-based keyword search (CP-ABKS) scheme, in which keywords are encrypted with an access control policy so that only data users with a legitimate credential can generate the trapdoor used to search over encrypted data. Dong et al. [
27] introduced an enhanced attribute-based keyword search scheme via an online/offline approach. Their schemes mainly consider resource-constrained mobile devices, and allow data owners and data users to perform related operations online and offline. However, two encryption and decryption operations incur huge computational costs. Li et al. [
28] proposed an attribute-based keyword search scheme with outsourcing key-issuing and outsourcing decryption, which is shown to be secure against chosen-plaintext attack, but it introduces three cloud storage providers and needs a number of expenses. Sun et al. [
29] designed an authorized keyword search scheme with user revocation by utilizing CP-ABE technology, however, the computational costs in the trapdoor generation grow linearly with the number of system attributes. Qiu et al. [
30] proposed a hidden policy attribute-based keyword search scheme to protect the privacy of the access control policy, which incurs abundant computational costs at the same time. Recently, Yin et al. [
23] proposed a ciphertext-policy attribute-based searchable encryption scheme, and improved the scheme of Zheng et al. Inspired by both the work of Zheng and Yin, in this paper, we propose a new privacy-preserving and efficient public key encryption with a keyword search scheme based on ciphertext-policy attribute-based encryption (PEKS-CPABE) to support both fine-grained access control and keyword search over encrypted data simultaneously.
3. Preliminaries
In this section, we review some necessary and basic preliminary notions used in this paper.
3.1. Bilinear Map
Let and be two multiplicative cyclic groups with the same prime order, p. g is a generator of . Let be a computable bilinear map. The map has the following properties:
Bilinearity: Given and for all , there exists ;
Non-degeneracy: ;
Computability: Given and for all , there is an efficient algorithm to compute .
3.2. Decisional Bilinear Diffie-Hellman Assumption
Decisional Bilinear Diffie–Hellman (DBDH) Assumption is given as follows: Let be a cyclic group with prime order p. g is a generator of . is a bilinear map. We define the advantage as of a PPT adversary to solve the DBDH problem as , where . We say that the DBDH assumption holds if the PPT adversary has a negligible advantage in solving the DBDH problem.
3.3. Access Structure
Access structure is defined by the concepts of an authorized access subset and unauthorized access subset. Let be a set of parties. A collection is monotone if : if and , then . A monotone access structure is a monotone collection which is a non-empty subset for , i.e., . The set in is called an authorized set, and the set not in is called an unauthorized set.
3.4. Access Tree
An access tree is usually used to represent an access control policy. Let T be an access tree, and each non-leaf represents a threshold gate described by a threshold value and the number of its children nodes. Let represent the number of children of node x, and the children be labeled from the left to the right as . Let represent the threshold value of x with . If , the logic gate of node x is an “OR” gate. If , the logic gate of node x is an “AND” gate. In an access tree T, a leaf is associated with an attribute, so each leaf node of T is described by an attribute and a threshold value .
Let denote the parent of node x, denote the attribute associated with leaf node x, and denote the set of leaves of the access tree T. Let denote the label of the node x, and represent the subtree of T rooted at node x (thus, ). If an attribute set meets the access tree , we denotes it as . Otherwise, . When x is a leaf node, if and only if , then . When x is a non-leaf node, we can compute for each child node of node x. If at least children of the node x return 1, then .
6. Security Proof
In this section, we give the security proof of the proposed PEKS-CPABE scheme. Based on the aforementioned security model, we provide a detailed security proof that our scheme achieves selective indistinguishability security against an adaptive chosen keyword attack.
Theorem 1. PEKS-CPABE scheme achieves IND-CKA security on the condition that DBDH problem is intractable.
Proof. Assume that is an adversary that can break our proposed scheme, then we can construct a simulator , and the goal is to distinguish between the DBDH tuple and a random tuple where . The IND-CKA security game between the adversary and the simulator is conducted as follows.
Init: The adversary first chooses an access tree to be challenged, which is sent to the simulator ;
Setup: The simulator randomly chooses , and computes ,, . Then, returns the public parameter which is sent to adversary . is simulated as follows. If the has not been queried before, the simulator randomly chooses , adds to the list and outputs ; otherwise, the simulator returns by searching from ;
Phase 1: can adaptively ask the simulator for the trapdoors of a series of keywords , and issue the and oracles as follows.
: The adversary can adaptively ask the simulator for a group of private keys of some attributes . The attribute sets embedded into corresponding private keys do not satisfy . The simulator randomly selects and computes . For each attribute , compute . The simulator sends the private key to , and maintains a list of private keys.
: The simulator first searches the oracle to obtain the secret key as
. Then, computes . For each , compute . Finally, generates the trapdoor as , and sends to the adversary ;
Challenge: The adversary sends the simulator two keywords , on which it wishes to be challenged. The simulator randomly selects a bit to encrypt and generates the encrypted keyword index as follows: , , , . Namely, where . Then, sends to the adversary ;
Phase 2: The adversary can repeat query in Phase 1 and continue to ask for any keyword of his choice, except for the ,;
Guess: The adversary outputs its guess of b. If , outputs 1; otherwise randomly outputs 0 or 1.
There are two conditions, as follows:
If , a valid ciphertext is given to the adversary , and the adversary has an advantage to win this game.
The ciphertexts is valid. , . Since are randomly selected, , let , can be denoted as , . This means is the valid ciphertext. Through is the advantage that the adversary outputs a correct guess, therefore we have that ;
If , is a random ciphertext. has nothing to do with the guess, so the adversary cannot acquire any advantage in this IND-CKA security game but a random guess. Therefore, we have .
Finally, the overall advantage that can solve the DBDH problem in the IND-CKA security is as follows:
Thus, we have a conclusion that if the probabilistic polynomial time adversary has a non-negligible advantage in breaking IND-CKA security, then we can construct a simulator to solve the DBDH problem with the non-negligible advantage . From the above analysis, we know that our proposed PEKS-CPABE scheme is IND-CKA secure on the condition that the DBDH problem is intractable. This completes the proof. □
7. Performance Analysis
In this section, we show the performance analysis of our proposed PEKS-CPABE scheme in terms of theoretical analysis and experimental analysis, and further compare our proposed scheme with the state-of-the-art CP-ABKS [
22] scheme and CP-ABSE [
23] scheme.
Table 1 shows the notations of the performance analysis. For the theoretical analysis, we mainly focus on the computation cost and storage cost. For convenience of comparison, we mainly consider several time-consuming operations, as follows: the bilinear pairing operation
mapping two elements in group
to group
, the hash function operation
H mapping the arbitrary string to an element in group
, a modular exponentiation operation
in
and a modular exponentiation operation
in
. We ignore the multiplication operations and hash operations mapping the arbitrary string to an element in group
because of the increased efficiency.
7.1. Theoretical Analysis
In
Table 2, we evaluate the computation cost of
KeyGen algorithm,
Encryption algorithm,
TrapGen algorithm and
Search algorithm under the same access control policy tree, respectively. We observe that our scheme is much more efficient than the CP-ABKS [
22] scheme and CP-ABSE [
23] scheme in the
KeyGen algorithm and
Encryption algorithm. Although we notice that our scheme has a higher computational overhead than the CP-ABSE [
23] scheme in the
TrapGen algorithm, the hash function operation time is minimal. In the
Search algorithm, our scheme performs similarly to that of CP-ABSE [
23] scheme, and the PEKS-CPABE scheme has a better performance than the CP-ABKS [
22] scheme.
In
Table 3, we show the storage cost comparison. We observe that the storage cost of our scheme outperforms the CP-ABKS [
22] scheme and CP-ABSE [
23] scheme in the
KeyGen algorithm. In the execution
Encryption algorithm and
TrapGen algorithm, the storage cost of our scheme is the same as that of the CP-ABSE [
23] scheme, but much less than that of the CP-ABKS [
22] scheme. Although the CP-ABSE scheme has better search performance than the CP-ABKS scheme, the cost of
KeyGen algorithm and
Encryption algorithm bring in a much higher overhead. Our scheme solves this problem at the same time, with greater efficiency, therefore, with respect to theoretical analysis, our scheme is acceptable in the cloud.
7.2. Experimental Analysis
To evaluate the actual performance of our scheme, we implement CP-ABKS [
22], CP-ABSE [
23] and our scheme using Java language based on the Java Pairing Based Cryptography Library (JPBC) [
31]. Our experimental platform is based on a Windows 10 server with Intel(R) Core(TM) i7-8565U CPU @ 1.80 GHz and 8.00GB RAM. The running environment of our experiment is Java Runtime Environment 1.8 (JRE1.8). In our experiment, we instantiated the bilinear map with Type A:
. For comparison convenience, we set the access policy tree as “AND” access tree “
AND
AND, ⋯, AND
”, and the number of a data user’s attributes
S is equal to the number of attributes that are involved in the access policy, from 1 to 50 with step length 10, namely,
. We conduct each experiment many times to obtain the average execution time under the same access control policy.
Figure 2 shows the performance comparison of various schemes.
As shown in
Figure 2a, we can find that the time of key generation in our scheme is more efficient than the CP-ABKS scheme and CP-ABSE scheme. For example, when setting
, our scheme needs 1061 ms to generate keys, however, both CP-ABKS scheme and CP-ABSE scheme need 4333 and 4314 ms, respectively. As illustrated in
Figure 2b, in our scheme, the time cost to generate ciphertexts is the lowest out of the three schemes. For example, when setting
, our scheme needs 3074 ms to generate ciphertexts, however, both the CP-ABKS scheme and CP-ABSE scheme need 4375 and 4277 ms, respectively. From
Figure 2c,d, we can show that our scheme has the better search performance. In the execution of trapdoor generation and search, our scheme and the CP-ABSE scheme outperform the CP-ABKS scheme. Therefore, our scheme is acceptable in practice and suitable for the cloud.