Fuzzy Keyword Searchable Encryption Scheme Based on Blockchain

Jiang, Yongbo; Lu, Juncheng; Feng, Tao

doi:10.3390/info13110517

Open AccessArticle

Fuzzy Keyword Searchable Encryption Scheme Based on Blockchain

by

Yongbo Jiang

,

Juncheng Lu

and

Tao Feng

^*

School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China

^*

Author to whom correspondence should be addressed.

Information 2022, 13(11), 517; https://0-doi-org.brum.beds.ac.uk/10.3390/info13110517

Submission received: 20 September 2022 / Revised: 21 October 2022 / Accepted: 27 October 2022 / Published: 28 October 2022

Download

Browse Figures

Review Reports Versions Notes

Abstract

:

Searchable encryption is a keyword-based ciphertext retrieval scheme, which can selectively retrieve encrypted documents on encrypted cloud data. Most existing searchable encryption schemes focus only on exact keyword searches and cannot return data of interest in fuzzy search. In addition, during the searchable encryption, the cloud server may return invalid results to the data user to save computing costs or for other reasons. At the same time, the user may refuse to pay the service fee after receiving the correct result. To solve the above problems, this paper proposes a fuzzy keyword searchable encryption scheme based on blockchain, which uses edit distance to generate fuzzy keyword sets and generates a secure index with verification tags for each fuzzy keyword set to verify the authenticity of the returned results. The penalty mechanism is introduced through the blockchain to realize the fairness of service payment between users and cloud servers. Security analysis shows that the scheme achieves non-adaptive semantic security. Performance analysis and functional comparison show that the scheme is effective and can meet the requirements of searching applications in the cloud environment.

Keywords:

blockchain; searchable encryption; fuzzy keyword; fair payment

1. Introduction

With the continuous expansion of Internet data, cloud storage has become a popular data storage mode. It has the feature of paying on demand, so cloud servers provide services to users who are willing to pay for them. Due to its flexible and efficient computing power, a large number of people prefer to store their data in the cloud to save local space and maintenance costs. To protect the privacy of high-sensitivity data, users need to encrypt the data before uploading. Although this method protects the privacy of sensitive data, it also causes inconvenience in retrieving encrypted documents.

In order to solve the inconvenience of ciphertext search, searchable encryption was first proposed in 2000 [1]. However, the traditional searchable encryption scheme is an honest and curious system that relies on cloud servers. That is, the server must return the correct result to the user without verification. However, in practice, the cloud server is a semi-honest and curious entity, so there may be malicious behavior, and the server may return a subset of search results or some irrelevant documents. In addition, users may refuse to pay the service fee by falsely claiming that they did not receive the correct results. This situation leads to unfair payment for services and mutual distrust between users and servers. Ideally, if the server does not return search results or returns incorrect results to the user, the user would like to be able to redeem the promised service fees, and the server would be financially penalized. Conversely, the server is guaranteed to receive a service fee when it returns the correct search results. Therefore, a reliable ciphertext retrieval scheme is urgently needed to solve the unfair payment of services between cloud servers and users, which can effectively prevent malicious operations and achieve fair payment. To solve the problem of unfair payment, we usually rely on trusted third parties, such as banks or other trusted institutions. When conflicts occur, they are resolved by trusted third parties. In order to avoid the collusive threat of third-party organizations and achieve fair service payment, blockchain technology has attracted much attention.

Blockchain has the characteristic of decentralization and is tamper-proof, which can be well combined with cloud technology. A smart contract on blockchain is a range of digitally defined commitments that can be written directly into code and executed automatically. Smart contracts allow trusted transactions without third parties that can be tracked and irreversible. As such, smart contracts are well suited to serve as verification nodes for searchable encryption systems to ensure that users get the right results. In such systems, smart contracts compel each party to perform actions in accordance with the contract. That is, users must pay the corresponding service fee after getting the correct search results, and as long as the cloud server returns the correct results, the corresponding service fee can be obtained. If the participants act in bad faith, they will be punished accordingly. Therefore, blockchain can solve the problem of unfairness between users and cloud servers in searchable encryption systems.

The blockchain-based searchable encryption scheme not only realizes ciphertext search but also solves the fairness problem between the user and the cloud. However, a scheme that can realize fair payment and support fuzzy keyword query is still lacking. Simply put, the search keyword is inconsistent with the preset keyword. For example, the default keyword is “Word”, but the user enters “Ward” for a search, which is very common in practice. When the above situation occurs, the cloud server will return a large amount of irrelevant data to the user, which will greatly reduce the availability of the system and make the user experience very poor. The fuzzy keyword searchable encryption algorithm can well solve the problem of keyword inconsistency and return the files required by users as much as possible, which greatly enhances the usability of the searchable encryption system based on blockchain. In addition, during data encryption, the system also needs to consider data updates, such as adding, deleting, and modifying data. Therefore, a searchable encryption scheme that supports fuzzy keyword search with verification of search results, fair payment, and dynamic data update is very important.

2. Related Work

In order to solve the retrieval problem on ciphertext, Song et al. [1] first proposed a symmetric searchable encryption (SSE) scheme, which realizes the search function based on ciphertext but needs to scan all documents, so the efficiency is low. To improve the search efficiency of the ciphertext, Goh et al. [2] established a secure index according to the keywords, and the cloud server returned ciphertext documents required by the user through the match between the trapdoor and index. Curtmola et al. [3] redefined the security of symmetric searchable encryption (SSE) and proposed two more secure SSE schemes. Guo et al. [4] used a Bloom filter to build the index tree and proposed a multi-keyword searchable encryption scheme, which not only supported the ranking of results but also realized the dynamic operation of data, which greatly improved the availability of the system. In reality, cloud servers are semi-honest, inquisitive entities that can be dishonest in their search. In order to prevent dishonest behavior of cloud servers, Chai et al. [5] proposed a verifiable SSE scheme. Jiang et al. [6] proposed a verifiable multi-keyword ranking search scheme, which constructed a special data structure QSet based on the inverted index structure to achieve efficient multi-keyword search. Zhang et al. [7] proposed an efficient and verifiable multi-keyword search encryption scheme. This scheme not only realizes the privacy of search mode and forward security of files but also realizes the integrity verification of outsourced data. Although the above scheme considers malicious behaviors of the cloud server, it does not consider malicious behaviors of the user, such as refusing the service fee after receiving correct results. Therefore, there is an unfair service payment problem between the cloud server and the user in the retrieval process.

In order to realize the fairness of ciphertext retrieval, Cai et al. [8] used smart contracts to record encrypted logs on the blockchain as unforgeable evidence and designed a fair protocol to detect and punish dishonest behaviors of cloud servers and users so as to realize the fairness of the system. Li et al. [9] stored encrypted data and secure indexes on the blockchain in order to avoid malicious attacks from the cloud and proposed two corresponding searchable encryption schemes according to the data size. Hu et al. [10] constructed a distributed searchable encryption scheme based on blockchain and encouraged participants to honestly implement contract rules by designing smart contracts to ensure that cloud servers and users get corresponding rewards. Zhang et al. [11] used digital signatures to construct secure indexes and proposed a searchable encryption scheme that supports two-way authentication, which can conduct a trusted keyword search without a third party and support result verification. In addition, the scheme implements server-side verifiability to prevent cloud servers from being attacked by malicious data owners. Chen et al. [12] proposed a blockchain-based EHR searchable encryption scheme. In this scenario, indexes are constructed and stored in the blockchain through complex logical expressions so that data users can use the expressions to search for indexes. Wang et al. [13] proposed a verifiable health record sharing scheme based on blockchain, which combined symmetric searchable encryption and attribute-based encryption technology to realize ciphertext search and support fine-grained access control. Li et al. [14] proposed a verifiable public-key searchable encryption scheme based on blockchain and outsourced the verification work to the TrueBit network, which greatly reduced the computational cost of miners. However, the above schemes only support an exact keyword search; that is, the correct results can be obtained only when the search keyword entered by the user is consistent with the predefined keyword. However, few schemes support both fuzzy keyword search and fair payment.

Exact keyword searchable encryption does not tolerate keyword inconsistency problems. In order to solve this problem, Li et al. [15] implemented fuzzy keyword retrieval on encrypted cloud data for the first time and constructed fuzzy keyword sets by using edit distance and wildcard technology. Wang et al. [16] constructed a new Trie-traverse search index based on the existing fuzzy keyword search scheme so that it has a constant search time. Ge et al. [17] proposed a verifiable fuzzy keyword searchable encryption scheme, which generates an index vector for each fuzzy keyword set to reduce computation cost and storage space. Jiang et al. [18] proposed a new verifiable fuzzy keyword search scheme, in which a confusion function is calculated for each fuzzy keyword set index to encrypt the true index. In addition, the cloud can decrypt the corresponding index directly by fuzzy keywords, which greatly simplifies the search process and improves search efficiency. None of the above fuzzy keyword search schemes can guarantee fairness between cloud servers and users. Yan et al. [19] proposed a fuzzy searchable encryption scheme based on blockchain. Although the scheme supports fuzzy keyword query and realizes fair payment, the efficiency of using an RSA accumulator to realize result verification is low, so it is not applicable in practical application. Therefore, an efficient and fair fuzzy keyword search scheme is urgently needed.

Based on the above analysis, the main contributions of this paper are as follows:

(1): Fuzzy keyword search. In this paper, fuzzy keyword sets are constructed by editing distance, and security indexes are constructed for each fuzzy keyword set. The cloud server can return search results by searching for matches between trap doors and security indexes. When a keyword inconsistency problem occurs, our scheme will deduce some valid keywords from the input keyword as much as possible, and return the closest matching document.
(2): Fair payment. In order to ensure the correctness of the results obtained by the data user, the data user generates an authentication tag for each fuzzy keyword set to verify the search results. At the same time, the smart contract is deployed to perform the validation operation and return the correct results to the user. In addition, smart contracts can automatically detect malicious behaviors of users or cloud servers and punish them accordingly to ensure the honest behavior of all participants so as to achieve fair payment.
(3): Dynamic update. To improve system availability, our scheme supports the addition, modification, and deletion of cloud data. In addition, through security analysis and performance comparison analysis, it is proven that the scheme is secure and effective and can meet the requirements of ciphertext search applications in the cloud environment.

3. Preliminaries

3.1. Symmetric Searchable Encryption

The symmetric searchable encryption scheme includes five polynomial time algorithms

S S E = (G e n, E n c, T r o p d o o r, S e a r c h, D e c)

. Algorithm

G e n

is responsible for generating key

S K

; algorithm

E n c

is responsible for encrypting document

C

and building index

I

;

T r o p d o o r

is responsible for generating search traps. The algorithm

S e a r c h

is responsible for the search operation and outputs the ciphertext document

C (W)

containing the keyword

W

. Algorithm

D e c

is responsible for decrypting the returned ciphertext document

C (W)

. Symmetric encryption schemes should be able to resist selective plaintext attacks. We choose the AES encryption algorithm that meets the above requirements to encrypt the document.

3.2. Pseudorandom Function and Message Authentication Function

The pseudorandom function (PRF) and the pseudorandom permutation function (PRP) are computable random functions that cannot be distinguished by any polynomial-time adversary. In our scheme, PRF-f is used to blind the index vectors, and PRP-π is used to blur the positions of keywords. The parameters of the above two functions are as follows:

π : {0, 1}^{k} \times {0, 1}^{l} \to {0, 1}^{l}

(1)

f : {0, 1}^{k} \times {0, 1}^{l} \to {0, 1}^{N}

(2)

where l is the maximum length of the keyword, k is the key, and N is the number of documents. Set:

MAC : {0, 1}^{k} \times {0, 1}^{*} \to {0, 1}^{N}

(3)

which is a verification tag generating function, which has the characteristics of irreversibility and message unforgeability. We use the MAC function to generate verification labels for the encrypted data to verify the correctness of the search results returned from the cloud server. In our scheme, we define

Tag = MAC (k_{0}, m)

, where

k_{0}

is the key, and

m

is the message.

3.3. Edit Distance

In our scheme, edit distance is used to evaluate the similarity between two keywords. Edit distance is the minimum number of operations required to convert keyword A to keyword B using character operations. There are generally three basic operations:

Substitution: convert one character in a word to another character.

Delete: remove a character from words.

Insert: insert a character into words.

For example, the following would be required to convert the keyword “word” to the keyword “wear”:

(1): word → ward (o → a)
(2): ward → war (delete d)
(3): war → wear (insert e)

Assuming given keyword

w_{i}

and edit distance

d

, the fuzzy keyword set based on wildcards can be expressed as follows:

S w_{i, t} = {w_{i, 1}, w_{i, 2}, \cdot \cdot \cdot w_{i, t}}

(4)

If we assume that

d

= 1, the fuzzy keyword set of keyword “word” is as follows:

S_{w o r d, 1} = {w o r d, * o r d, w * r d, w o * d, w o r *, w o r d *, w o r * d, w o * r d, w * o r d, * w o r d}

(5)

3.4. Inverted Index

In this paper, the index table is constructed based on the inverted index [20], as shown in Table 1. Denote the index vector of keyword

w_{i}

by

v (w_{i})

and set

a d d r_{i} = w_{i}

. If the file

F_{j} (1 \leq j \leq N)

contains the keyword

w_{i}

, then let

v (w_{i}) [j] = 1

; otherwise, let

v (w_{i}) [j] = 0

.

3.5. Security Index

For any fuzzy keyword set

s_{w_{i, d}}

, the confusion function can be calculated as follows:

Φ (w_{i}) = g^{\prod w_{i, j} \in s_{w_{i, d}}^{π_{k_{2}} (w_{i, j})}}

(6)

To encrypt the real index, by using PRP-π to disrupt the real position of keywords, the index vector

v (w_{i})

is blinded with PRF-f to get the

π k_{1} (w_{i})

and security index:

E v (w_{i}) \leftarrow f (Φ (w_{i})) \oplus v (w_{i})

(7)

For any fuzzy keyword

w_{i, t} \in s_{w_{i, d}}

, it is calculated as follows:

φ (w_{i, t}) = g^{\prod w_{i, j} \in s_{w_{i, d}} - w_{i . t}^{π_{k_{2}} (w_{i, j})}}

(8)

For any fuzzy keyword

w_{i, a}

,

w_{i, b} \in s_{w_{i, d}}

, the same confusion function can be obtained by using

π_{k_{2}} (w_{i, a})

,

π_{k_{2}} (w_{i, b})

.

\begin{array}{l} Φ (w_{i}) = φ {(w_{i, a})}^{π_{k_{2}} (w_{i, a})} = φ {(w_{i, b})}^{π_{k_{2}} (w_{i, b})} \\ = g^{π_{k_{2}} (w_{i, 1}) \cdot π_{k_{2}} (w_{i, 2}) \cdot \cdot \cdot π_{k_{2}} (w_{i, t})} \\ = g^{\prod w_{i, j} \in s_{w_{i, d}}^{π_{k_{2}} (w_{i, j})}} \end{array}

(9)

Therefore, any fuzzy keyword

w_{i, t} \in s_{w_{i, d}}

can be used to decrypt the security index and obtain the inverted index:

v (w_{i}) \leftarrow f (Φ (w_{i})) \oplus E v (w_{i}) = f (φ {(w_{i, t})}^{π_{k_{2}} (w_{i, t})}) \oplus E v (w_{i})

(10)

4. Problem Formulation

4.1. System Model

The model of the fuzzy keyword searchable encryption system based on blockchain is shown in Figure 1. It consists of four actors, namely the data owner (DO), the data user (DU), the cloud server (CSP), and the blockchain (BC). DO is primarily responsible for encrypting data and building indexes and uploading them to cloud servers. DU is mainly responsible for generating search traps and uploading them to the cloud server to complete the search. The CSP performs the search operation and sends the search results to the blockchain for verification. The BC performs the verification operation, and the smart contract motivates the CSP and DU to perform the contract operation honestly to get the correct result and the corresponding service fee.

4.2. Algorithm Definition

The blockchain-based fuzzy keyword searchable encryption scheme consists of the following seven probabilistic polynomial-time algorithms.

(1): $K e y G e n (k)$ : The algorithm is executed by the DO, which inputs security parameter $k$ and outputs the key set $K = (s k, k_{0}, k_{1}, k_{2})$ .
(2): $B u i l d I n d e x (K, F)$ : The algorithm is executed by the DO, which inputs key set $K$ and the plaintext set $F = (F_{1}, F_{2}, \dots, F_{N})$ and then outputs the security index $I$ and the encrypted document set $C = (C_{1}, C_{2}, \dots, C_{N})$ .
(3): $T r a p d o o r (K, w)$ : The algorithm is executed by the DU, which inputs security parameter $K$ and query keyword $w$ and then outputs the search trap door $T w$ .
(4): $S e a r c h (T w, I, C)$ : The algorithm is executed by the CSP. The CSP inputs trap door $T w$ , security index $I$ , and encrypted document set $C$ and then outputs encrypted document result $C (w)$ and verification tag pair $(φ (w), T a g_{w})$ .
(5): $V e r i f y (K, C (w), T w, (φ (w), T a g_{w}))$ : The algorithm is executed by a smart contract. The smart contract inputs security parameter $K$ , the encrypted document set $C (w)$ , search trap door $T w$ , and verification tag pair $(φ (w), T a g_{w})$ and then outputs receive or reject.
(6): $D e c (s k, C (w))$ : The algorithm is executed by the DU, which inputs key $s k$ and encrypted document set $C (w)$ and then outputs plaintext set $D (w)$ .
(7): $U p d a t a (K, F_{n + 1}, F_{j})$ : The algorithm is executed by the DO. The DO inputs key set $K$ , adds the new document $F_{n + 1}$ , modifies or deletes the document $F_{j}$ , and then outputs the new encrypted document set $C^{'}$ , security index $I^{'}$ , and verification tag ${T a g w}^{'}$ .

4.3. Specific Construction

The algorithm of the scheme is described as follows:

(1): $K e y G e n (k)$ : The data owner inputs the security parameters $k$ and outputs the key set $K = (s k, k_{0}, k_{1}, k_{2})$ , where $s k \leftarrow S K E . G e n (1^{k})$ is the key for encrypting and decrypting documents; $k_{0}, k_{1}, k_{2} \leftarrow {0, 1}^{k}$ , $k_{0}$ is the key of MAC, $k_{1}$ is the key of PRP-π, and $k_{2}$ is the key of PRF-f.
(2): $B u i l d I n d e x (K, F)$ : In order to improve the search efficiency of the system, we extend the inverted index and construct a linked list with three nodes as the index. The structure of the index is shown in Figure 2.

a. The data owner extracts the keywords from the documents that need to be uploaded to the cloud and builds the keyword set

W

. Next, the document set

F (w_{i})

is established according to the keyword

w_{i} \in W

.

b. After completing step a, build the index according to

F (w_{i})

. Let

V (w_{i}) (1 \leq I \leq n)

denote the index vector of keyword

w_{i}

. If

w_{i}

is in file

F_{j}

, the data owner will set

v (w_{i}) [j] = 1

; otherwise,

v (w_{i}) [j] = 0

.

c. The data owner constructs a fuzzy keyword

S_{w_{i, d}}

, where

d

represents the edit distance,

w_{i}

represents the included keywords, and

W_{i, t} (1 \leq i \leq n, 1 \leq t \leq | S w_{i}_{, d} |)

denotes the keywords in

S_{w_{i, d}}

.

d. The data owner calculates

C (w_{i}) \leftarrow S K E . E n c_{s k} (F (w_{i}))

to encrypt all documents. Calculate

π_{k 1} (w_{i}_{, t}) (1 \leq i \leq n, 1 \leq t \leq | S w_{i}_{, d} |)

for each fuzzy key in

S w_{i}_{, d}

and store it in the first node. The data owner computes

E v (w_{i}) \leftarrow f (Φ (w_{i})) \oplus v (w_{i})

and stores

E v (w_{i})

in the second node. Finally, the authentication tag pair

(φ (w_{i, t}), T a g w_{i, t})

is calculated for all fuzzy keywords in

S w_{i}_{, d}

, where:

φ (w_{i, t}) = g^{\prod w_{i, j} \in s_{w_{i, d} - w_{i, t}}^{π_{k_{2}} (w_{i, j})}}

(11)

T a g w_{i, t} = M A C (π_{k 1} (w_{i}), π_{k 1} (w_{i}_{, t}), E v (w_{i}), C (w_{i}), φ (w_{i, t}))

(12)

e. Finally, the data owner stores the linked lists in a hash table as a secure index

I

.

(3): $T r a p d o o r (K, w)$ : When data users want to search for files that contain the keyword $w^{'}$ , they first calculate trap ${T w}^{'} = {(π k_{1} ({w^{'}}_{1}), π k_{2} ({w^{'}}_{1})), (π k_{1} ({w^{'}}_{2}), π k_{2} ({w^{'}}_{2})), \dots, (π k_{1} ({w^{'}}_{t}), π k_{2} ({w^{'}}_{t}))}$ and send it to the cloud server.
(4): $S e a r c h (T w, I, C)$ : When the cloud server receives the trapdoor, it looks for all $π k_{1} ({w^{'}}_{i})$ in ${T w}^{'} = {(π k_{1} ({w^{'}}_{1}), π k_{2} ({w^{'}}_{1}), (π k_{1} ({w^{'}}_{2}), π k_{2} ({w^{'}}_{2})), \dots, (π k_{1} ({w^{'}}_{t}), π k_{2} ({w^{'}}_{t}))}$ . If the $π k_{1} (w_{i, a})$ exists in the index ${I w}^{'}_{i} = {(π k_{1} (w_{i, 1}), φ (w_{i, 1})), (π k_{1} (w_{i, 2}), φ (w_{i, 2})), \dots, (π k_{1} (w_{i, t}), φ (w_{i, t}))}$ , make $π k_{1} (w_{i, a}) = π k_{1} ({w^{'}}_{i})$ . Then, find the corresponding index file $C_{j}$ and corresponding proof $(φ (w_{i, a}), T a g w_{i . a})$ and calculate $Φ (w_{i}) = φ (w_{i, a})^{π_{k 2} (w_{i, a})}$ to decrypt the security index $E v (w_{i})$ , which is used to obtain the inverted index $v (w_{i}) \leftarrow f (Φ (w_{i})) \oplus E v (w_{i})$ . If $v (w_{i}) [j] = 1$ , the cloud server adds $C_{j}$ to $C (w^{'})$ ; otherwise, do not add it. Finally, the cloud service submits $C (w^{'})$ , $(φ (w_{i, t}), T a g w_{i, a})$ , $π k_{1} ({w^{'}}_{i})$ together with the margin to the blockchain to complete the verification.
(5): $V e r i f y (K, {T a g}_{i, t}, C (w^{'}), π_{k 1} (w_{i, 1}), T_{w^{'}} φ (w_{i, t}))$ : Users invoke smart contracts by submitting service fees and a search trapdoor to the blockchain. The blockchain extracts $v (w^{'})$ from $C (w^{'})$ . If $C (w^{'})$ is in $C_{j}$ , the jth bit in $v (w^{'})$ is 1; otherwise, it is 0. Then, the blockchain computes $E v (w_{i}) \leftarrow f (Φ (w_{i})) \oplus v (w_{i})$ , where $Φ (w^{'}) = φ (w_{i, t})^{π_{k 2} (w_{i, t})}$ . Then, check if $T a g w_{i, t}$ is equal to $M A C (π_{k 1} (w_{i}), π_{k 1} (w_{i, t}), E v (w_{i}), C (w_{i}), φ (w_{i, t}))$ . If so, the blockchain sends the result to the data user and submits the deposit and service fee to the cloud server; otherwise, it rejects the result. In addition, the deposit of the cloud server is deducted as a penalty, and the service fee is returned to the user.
(6): $D e c (s k, C (w))$ : The data user decrypts $C (w^{'})$ by computing $F (w^{'}) \leftarrow S K E . D e c_{s k} (C (w^{'}))$ .
(7): $U p d a t a (K, F_{n + 1}, F_{j})$ : DO can add document $F_{n + 1}$ , delete document $F_{j}$ or dynamically modify document $F_{j}$ , output the updated encrypted document $C^{'}$ , security index $I^{'}$ , and verification tag ${T a g w}^{'}$ , and upload them to the cloud again.

Case 1:

Adding. The DO adds a document

F_{n + 1}

and then encrypts the document

F_{n + 1}

using the

C (w_{i}) \leftarrow S K E . E n c_{s k} (F (w_{i}))

algorithm and calls

B u i l d I n d e x (K, F)

algorithm to update the security index

I^{'}

. Finally, the data user sends the encrypted document

C^{'}

and the security index

C^{'}

to the CSP.

Case 2:

Deleting. The DO tends to delete the document

F_{j}

. First of all, DO gets encrypted document

C_{j}

and security index

I^{'}

by calculating

C (w_{i}) \leftarrow S K E . E n c_{s k} (F (w_{i}))

and

B u i l d I n d e x (K, F)

and then sends

I^{'}

and

C_{j}

to the CSP. After receiving a delete request from DO, the CSP removes

C_{j}

from the encrypted document set

C

.

Case 3:

Modifying. The DO wants to replace

F_{j}

with

{F_{j}}^{'}

. The DO encrypts

{F_{j}}^{'}

to get the corresponding ciphertext

{C_{j}}^{'}

and generates a new security index

I^{'}

. After receiving a modification request from DO, the CSP replaces

C_{j}

with

{C_{j}}^{'}

in the encrypted document set

C

.

5. Scheme Analysis

5.1. Security Analysis

In this paper, we demonstrate the security of the system by designing real games and simulation games. Real games are played by a challenger and an adversary A, while simulation games are played by a simulator Sim and an adversary A.

Real Games

(G a m e_{r e a l})

:

(1): Adversary A selects an arbitrary set of document sets and keyword sets $(F, W)$ and sends them to the challenger.
(2): The challenger uses the key generation algorithm to obtain the key set K and, on this basis, uses the encryption algorithm and index generation algorithm to obtain $(I, C)$ and then sends $(I, C)$ to adversary A.
(3): Adversary A selects a keyword $w$ and sends it to the challenger.
(4): Challenger sends trapdoor $T w$ to adversary A.
(5): Adversary A outputs $b \in {0, 1}$ .

Simulation Game

(G a m e_{s i m})

:

(1): Adversary A chooses the same set $(F, W)$ as in the real games and sends it to the simulator Sim.
(2): Simulator Sim calculates $(I^{'}, C^{'})$ and sends it to adversary A.
(3): Adversary A selects a keyword $w$ and sends it to simulator Sim.
(4): Simulator Sim calculates the trapdoor $T^{'} w$ and sends it to adversary A.
(5): Adversary A outputs $b \in {0, 1}$ .

If there exists any probabilistic polynomial-time (PPT) simulator Sim and PPT adversary A, such that |Pr(A outputs b = 1 in

G a m e_{r e a l}

)−Pr(A outputs b = 1 in

G a m e_{s i m}

)| is negligible, it can be verified that the symmetric searchable encryption scheme satisfies security.

5.2. Fairness Analysis

The fairness of the proposed scheme is guaranteed by the smart contract, which can automatically perform the predefined verification operations and handle the payment process of both parties.

Due to the irreversible and tamper-proof characteristics of blockchain, smart contracts cannot be modified once deployed. In our deployed validation contract, only if the cloud server returns the correct search results to the user, it can get the corresponding service fee. At the same time, users need to pay a deposit as a service fee during the search before getting the results. In addition, if the blockchain detects that the result returned by the cloud server is invalid, the verification contract returns the deposit originally temporarily stored to the user. Therefore, due to the strict reward and punishment mechanism of the blockchain, both cloud servers and users will honestly follow the contract rules to satisfy their own needs.

In our scheme, we combine blockchain technology with searchable encryption technology to make the system realize fair payment as well as ciphertext search.

5.3. Performance Analysis

In this paper, we compared our scheme with the schemes proposed in the literature [13,17,19] in terms of function, as shown in Table 2. Both schemes [13,19] achieve result verification and fair payment through smart contracts, but scheme [13] does not support fuzzy keyword search. Although scheme [17] also realizes result verification, its verification operation is performed by the user, which undoubtedly increases the computational burden of the user. Both schemes [17,19] support fuzzy keyword search, but they do not use wildcard technology when building fuzzy keyword sets, which leads to large index space and low efficiency.

In terms of functions, our proposed scheme has more complete functions and higher system availability, so it is more suitable for cloud storage applications.

The effectiveness of the proposed scheme is compared with other schemes in the four aspects of index generation time cost, trapdoor generation time cost, search time cost, and verification time cost, as shown in Table 3. Among them, N indicates the number of documents, n indicates the number of exact keywords, M indicates the maximum number of fuzzy keywords, |W’| indicates the number of documents containing search keywords, XOR indicates the exclusive or operation, E indicates the exponential operation, F indicates the pseudo-random function, and H indicates the hash operation.

In the index generation stage, both schemes [17,19] and this scheme use edit distance to construct a fuzzy keyword set. However, this scheme uses wildcard technology to greatly reduce the number of fuzzy keywords in the fuzzy keyword set, so the efficiency of this scheme is better than that of the scheme [17,19].

In the search phase, this scheme uses the hash table structure instead of the linked list structure used by schemes [17,19] to construct the index, so the time complexity of this scheme in the search phase is O(1). In addition, this paper uses the confusion function to encrypt and confuse the real index, so that the cloud can directly decrypt the index by fuzzy keywords. In this scheme, the cloud can find the corresponding encrypted index by searching πk₁(w) and then obtain the corresponding proof to decrypt the secure index. Instead of using fuzzy keywords to match the exact keywords, the corresponding index can be obtained, which greatly simplifies the search process and improves the search efficiency.

In the verification phase, scheme [19] needs to calculate one exponential operation and two hash operations for each document containing keywords, while scheme [17] and this scheme only need simple XOR operation and hash operation. Therefore, in terms of time cost, this scheme is comparable to scheme [17] but is far less than scheme [19]. In addition, the verification operation in this paper is performed by the smart contract instead of the user, so the user’s computation overhead is reduced.

6. Conclusions

In this paper, we propose a fuzzy keyword searchable encryption scheme based on blockchain. Firstly, the fuzzy keyword set is constructed for each exact keyword, then the index vector is generated for the fuzzy keyword set, and the secure index is constructed by using a three-node hash table. In addition, an obfuscation function is calculated for each fuzzy keyword set index to encrypt and obfuscate the real index so that the cloud can directly decrypt the corresponding index through the fuzzy keyword, instead of finding the exact keyword first and then using the returned keyword to construct the decryption scheme again to obtain the index, which simplifies the search process and greatly improves the search efficiency. Finally, in order to resist the malicious attack of the cloud server, a verification tag is generated for each fuzzy keyword set, and the verification is completed by the smart contract to achieve the verifiability of the returned results and the fairness of the transaction. Through security analysis and performance comparison, the security and effectiveness of our scheme are proven.

Author Contributions

Conceptualization, J.L. and Y.J.; methodology, J.L.; validation, J.L., Y.J. and T.F.; formal analysis, T.F.; writing—original draft preparation, J.L.; writing—review and editing, J.L.; supervision, Y.J.; funding acquisition, T.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (No. 62162039) and (61762060), and the Key Research and Development Program of Gansu Province (No. 20YF3GA016).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Song, D.; Wagner, D.; Perrig, A. Practical techniques for searches on encrypted data. In Proceedings of the 2000 IEEE Security and Privacy Symposium, Berkeley, CA, USA, 14–17 May 2000; pp. 44–55. [Google Scholar]
Goh, E.-J. Secure Indexes. Cryptology ePrint Archive. 2003. Available online: http://eprint.iacr.org/2003/216 (accessed on 19 September 2022).
Curtmola, R.; Garay, J.; Kamara, S.; Ostrovsky, R. Searchable symmetric encryption: Improved definitions and efficient constructions. J. Comput. Secur. 2011, 19, 895–934. [Google Scholar] [CrossRef] [Green Version]
Guo, C.; Zhuang, R.; Chang, C.-C.; Yuan, Q. Dynamic multi-keyword ranked search based on Bloom filter over encrypted cloud data. IEEE Access 2019, 7, 35826–35837. [Google Scholar] [CrossRef]
Chai, Q.; Gong, G. Verifiable symmetric searchable encryption for semi-honest-but-curious cloud servers. In Proceedings of the IEEE International Conference on Communications(ICC), Ottawa, ON, Canada, 10–15 June 2012; pp. 917–922. [Google Scholar]
Jiang, X.; Yu, J.; Yan, J.; Hao, R. Enabling efficient and verifiable multikeyword ranked search over encrypted cloud data. Inf. Sci. Int. J. 2017, 403–404, 22–41. [Google Scholar]
Zhang, J.; Wu, M.; Wang, J.; Liu, P.; Jiang, Z.; Peng, C. Secure and verifiable multi-keyword searchable encryption scheme in cloud. J. Commun. 2021, 42, 139–149. [Google Scholar]
Cai, C.; Weng, J.; Yuan, X.; Wang, C. Enabling reliable key-word search in encrypted decentralized storage with fairness. IEEE Trans. Depend. Sec. Comput. 2018, 18, 131–144. [Google Scholar] [CrossRef]
Li, H.; Zhang, F.; He, J.; Tian, H. A searchable symmetric encryption scheme using BlockChain. arXiv 2017, arXiv:1711.01030. [Google Scholar]
Hu, S.; Cai, C.; Wang, Q.; Wang, C.; Luo, X.; Ren, K. Searching an encrypted cloud meets blockchain: A decentralized, reliable and fair realization. In Proceedings of the IEEE INFOCOM 2018—IEEE Conference on Computer Communications, Honolulu, HI, USA, 16–19 April 2018; pp. 792–800. [Google Scholar]
Zhang, Y.; Deng, R.H.; Shu, J.; Yang, K.; Zheng, D. TKSE: Trust-worthy keyword search over encrypted data with two-side verifiability via blockchain. IEEE Access 2018, 6, 31077–31087. [Google Scholar] [CrossRef]
Chen, L.; Lee, W.-K.; Chang, C.-C.; Choo, K.-K.-R.; Zhang, N. Blockchain based searchable encryption for electronic health record sharing. Future Gener. Comput. Syst. 2019, 95, 420–429. [Google Scholar] [CrossRef]
Wang, S.; Zhang, D.; Zhang, Y. Blockchain-based personal health records sharing scheme with data integrity verifiable. IEEE Access 2019, 7, 102887–102901. [Google Scholar] [CrossRef]
Li, H.; Wang, T.; Qiao, Z.; Yang, B.; Gong, Y.; Wang, J.; Qiu, G. Blockchain-based searchable encryption with efficient result verification and fair payment. J. Inf. Secur. Appl. 2021, 58, 102791. [Google Scholar] [CrossRef]
Li, J.; Wang, Q.; Wang, C.; Cao, N.; Ren, K.; Lou, W. Fuzzy key-word search over encrypted data in cloud computing. In Proceedings of the IEEEINFOCOM, San Diego, CA, USA, 14–19 March 2010; pp. 1–5. [Google Scholar]
Wang, C.; Ren, K.; Yu, S.; Mahendra Raje Urs, K. Achieving usableand privacy-assured similarity search over outsourced cloud data. In Proceedings of the IEEEINFOCOM, Orlando, FL, USA, 25–30 March 2012; pp. 451–459. [Google Scholar]
Ge, X.; Yu, J.; Hu, C.; Zhang, H.; Hao, R. Enabling efficient verifiable fuzzy keyword search over encrypted data in cloud computing. IEEE Access 2018, 6, 45725–45739. [Google Scholar] [CrossRef]
Jiang, J.; Cai, L.; Wei, P.; Li, L. Aretrieval scheme supporting verifiable ciphertext fuzzy keyword. Comput. Eng. Appl. 2020, 56, 74–80. [Google Scholar]
Yan, X.; Yuan, X.; Ye, Q.; Tang, Y. Blockchain-Based Searchable Encryption Scheme With Fair Payment. IEEE Access 2020, 8, 109687–109706. [Google Scholar] [CrossRef]
Curtmola, R.; Garay, J.; Kamara, S.; Ostrovsky, R. Searchable symmetric encryption: Improved definitions and efficient constructions. In Proceedings of the ACM Conference on Computer and Communications Security, Alexandria, VA, USA, 30 October–3 November 2006; pp. 79–88. [Google Scholar]

Figure 1. System model.

Figure 2. Index structure.

Table 1. Inverted index.

	F₁	F₂	F₃	···	F_N
w₁	1	1	0	···	0
w₂	0	1	1	···	0
w₃	1	0	0	···	1
⋮	⋮	⋮	⋮		⋮
w_n	0	0	1	···	1

Table 2. Function comparison.

Scheme	Fair Payment	Keyword Type	Verification Side	Update Dynamically	Wildcard Technology
Scheme [13]	√	Exact keyword	Smart contract	×	×
Scheme [17]	×	Fuzzy keyword	User node	×	×
Scheme [19]	√	Fuzzy keyword	Smart contract	√	×
Our scheme	√	Fuzzy keyword	Smart contract	√	√

Table 3. Performance comparison.

Scheme	Index Generation	Trapdoor Generation	Search	Verification
Scheme [13]	O(n)	O(1)	O(1)	\|w’\|²
Scheme [17]	O(n)	O(1)	O(m)	1XOR + 1F
Scheme [19]	O(n)	O(1)	O(m)	1E + 2H\|w’\|
Our scheme	O(n)	O(1)	O(1)	1XOR + 1F

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Jiang, Y.; Lu, J.; Feng, T. Fuzzy Keyword Searchable Encryption Scheme Based on Blockchain. Information 2022, 13, 517. https://0-doi-org.brum.beds.ac.uk/10.3390/info13110517

AMA Style

Jiang Y, Lu J, Feng T. Fuzzy Keyword Searchable Encryption Scheme Based on Blockchain. Information. 2022; 13(11):517. https://0-doi-org.brum.beds.ac.uk/10.3390/info13110517

Chicago/Turabian Style

Jiang, Yongbo, Juncheng Lu, and Tao Feng. 2022. "Fuzzy Keyword Searchable Encryption Scheme Based on Blockchain" Information 13, no. 11: 517. https://0-doi-org.brum.beds.ac.uk/10.3390/info13110517

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Fuzzy Keyword Searchable Encryption Scheme Based on Blockchain

Abstract

1. Introduction

2. Related Work

3. Preliminaries

3.1. Symmetric Searchable Encryption

3.2. Pseudorandom Function and Message Authentication Function

3.3. Edit Distance

3.4. Inverted Index

3.5. Security Index

4. Problem Formulation

4.1. System Model

4.2. Algorithm Definition

4.3. Specific Construction

5. Scheme Analysis

5.1. Security Analysis

5.2. Fairness Analysis

5.3. Performance Analysis

6. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI