Next Article in Journal
Dynamic Route Discovery Using Modified Grasshopper Optimization Algorithm in Wireless Ad-Hoc Visible Light Communication Network
Next Article in Special Issue
Time-Variant Front-End Read-Out Electronics for High-Data-Rate Detectors
Previous Article in Journal
A Hybrid Radix-4 and Approximate Logarithmic Multiplier for Energy Efficient Image Processing
Previous Article in Special Issue
Design Trade-Offs in Common-Mode Feedback Implementations for Highly Linear Three-Stage Operational Transconductance Amplifiers
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Review

Structural Decomposition in FSM Design: Roots, Evolution, Current State—A Review

by
Alexander Barkalov
1,2,
Larysa Titarenko
1,3 and
Kazimierz Krzywicki
4,*
1
Institute of Metrology, Electronics and Computer Science, University of Zielona Góra, ul. Licealna 9, 65-417 Zielona Góra, Poland
2
Department of Mathematics and Information Technology, Vasyl’ Stus Donetsk National University, 600-richya str. 21, 21021 Vinnytsia, Ukraine
3
Department of Infocommunication Engineering, Faculty of Infocommunications, Kharkiv National University of Radio Electronics, Nauky Avenue 14, 61166 Kharkiv, Ukraine
4
Department of Technology, The Jacob of Paradies University, ul. Teatralna 25, 66-400 Gorzów Wielkopolski, Poland
*
Author to whom correspondence should be addressed.
Submission received: 14 April 2021 / Revised: 29 April 2021 / Accepted: 12 May 2021 / Published: 14 May 2021

Abstract

:
The review is devoted to methods of structural decomposition that are used for optimizing characteristics of circuits of finite state machines (FSMs). These methods are connected with the increasing the number of logic levels in resulting FSM circuits. They can be viewed as an alternative to methods of functional decompositions. The roots of these methods are analysed. It is shown that the first methods of structural decomposition have appeared in 1950s together with microprogram control units. The basic methods of structural decomposition are analysed. They are such methods as the replacement of FSM inputs, encoding collections of FSM outputs, and encoding of terms. It is shown that these methods can be used for any element basis. Additionally, the joint application of different methods is shown. The analysis of change in these methods related to the evolution of the logic elements is performed. The application of these methods for optimizing FPGA- based FSMs is shown. Such new methods as twofold state assignment and mixed encoding of outputs are analysed. Some methods are illustrated with examples of FSM synthesis. Additionally, some experimental results are represented. These results prove that the methods of structural decomposition really improve the characteristics of FSM circuits.

1. Introduction

The development of information technologies has led to the widespread use of various digital systems in different areas of mankind’s activity [1,2,3,4,5,6,7,8,9]. It is known that digital systems consist of various combinational and sequential blocks [10,11]. As a rule, the circuits of combinational blocks are regular [12]. A designer can use standard library elements of computer-aided design (CAD) systems to implement such circuits [11]. For example, a multi-bit adder can be represented as a composition of standard single-bit adders. The sequential blocks could be very complex (for example, control units of computers) or rather simple (such as binary counters). It is known that the circuits of complex sequential blocks are irregular [10,12]. As a rule, there are no standard library solutions for such blocks. It means that each sequential block is synthesised anew. To synthesise the logic circuit of a sequential block, some tools are used to present the law of its behaviour.
Very often, the behaviour of sequential blocks is represented using the model of a finite state machine (FSM) [10,13,14]. Three characteristics of an FSM circuit significantly influence the characteristics of a digital system. These characteristics are the hardware amount, the operating frequency (the performance), and the power consumption. Because of it, there is continuous interest in developing the various approaches that aimed at optimizing the basic characteristics of FSM circuits. As a rule, the less hardware is consumed by a sequential block’s circuit, the less power it requires [15,16,17,18,19,20]. Accordingly, it is very important to reduce the amount of hardware that is consumed by an FSM circuit.
The development of various methods of optimizing the characteristics of the FSM circuits has been started since 1951. A characteristic feature of such methods is the consideration of the characteristics of the logic elements that are used for the design of FSM circuits. At various times, various logic elements were used for implementing FSM circuits. Among these elements, there are logic gates, decoders, multiplexors, read-only memories (ROMs), programmable ROMs (PROMs), programmable logic arrays (PLAs), programmable array logic (PAL), complex programmable logic devices (CPLDs), and field-programmable gate arrays (FPGAs). Some of these logic elements have been used together.
The structural decomposition is one of the approaches used for reducing the hardware amount [21,22,23,24]. The roots of this approach go back to 1951, when M. Wilkes put forward the idea of a microprogram control unit (MCU) [25,26]. Over the following times, Wilkes’ ideas were modified with a change in the elemental basis used for implementing FSM circuits.
The main idea of the structural decomposition is the following. An FSM circuit is represented by some big logic blocks. Each such a block has its own unique input variables and output functions. The outputs of some blocks are used as the inputs of other blocks. This allows for eliminating the direct connection between FSM inputs and outputs. In the best case, the logic circuit of each block has exactly a single level of logic elements [21,27,28]. In this article, we present a rather brief survey of the known methods of structural decomposition. At the same time, almost half of the article is devoted to the methods of structural decomposition used for optimizing the circuits of FPGA-based FSMs.
The main contribution of this paper is a survey of methods of structural decomposition of FSM circuits. The analysis of these methods shows that the structural decomposition is a powerful tool that allows for significantly improving the characteristics of FSM circuits as compared to their counterparts based on other known approaches.
The rest of the paper is organized as the following. Section 2 presents the theoretical background of finite state machines. Section 3 discusses the methods of implementing microprogram control units. Section 4 presents the methods of structural decomposition used in application-specific integratedcircuits. Section 5 considers the methods targeting simple programmable logic devices. Section 6 considers the structural decomposition of FPGA-based FSMs. A brief conclusion ends the paper.

2. Implementing Circuits of Finite State Machines

An FSM can be defined as a tuple A = S , I , O , δ , λ , s 1 [10,13], where S = { s 1 , , s M } is a set of internal states, I = { i 1 , , i L } is a set of inputs, O = { o 1 , , o N } is a set of outputs, δ is a transition function, λ is a function of output, and s 1 S is an initial state. An FSM can be represented using such tools as: state transition graphs [10], binary decision diagrams [29], and-inverter graphs [30], graph-schemes of algorithms [13].
The most obvious way to represent an FSM is the state transition graph. For example, the STG that is shown in Figure 1 represents a Mealy FSM A 1 .
The FSM states are represented by the nodes s 1 , , s 4 . The arcs define the interstate transitions that are determined by the input signals that are the conjunctions of inputs (or their complements). These conjunctions are written above the arcs together with the outputs generated during these transitions. Using STG (Figure 1), we can find the following parameters of Mealy FSM A 1 : the number of inputs L = 3 , the number of outputs N = 5 , the number of states M = 4 , and the number of transitions H = 8 . Additionally, this STG uniquely defines the functions of transitions and output of FSM A 1 .
To design an FSM circuit, an STG should be transformed into the corresponding STT. An STT includes the following columns [10,13]: s C is a current state; s T is a state of transition; I h an input signal determining a transition from s C to s T ; and, O h is a subset of the set of outputs generated during the transition from the current state s C to the state of transition s T . We name this subset a collection of outputs. The numbers of the transitions ( h { 1 , , H } ) are shown in the last column of the STT.
In this article, we mostly use STTs for initial representation of FSMs. For example, the FSM A 1 is represented by the STT (Table 1). We hope that there is the transparent connection between the STG (Figure 1) and STT (Table 1).
There are two main types of FSM, namely, Mealy [31] and Moore [32] FSMs. The first of them was proposed in 1955 by G. Mealy; the second was proposed in 1956 by E. Moore. In both cases, the function δ determines the states of transition as functions depending on the current states and inputs. So, it is the following function:
δ ( A , X ) = A .
For Mealy FSMs, the function λ determines the outputs as functions depending on the current states and inputs. It gives the following function:
λ ( A , X ) = O .
For Moore FSMs, the function λ determines the outputs as functions depending only on the current states. So, it is the following function:
λ ( A ) = O .
The difference among (2) and (3) leads to a difference in the synthesis methods of Mealy and Moore FSMs. We now explain the stages of Mealy FSM’s synthesis starting from Table 1.
In 1965, Viktor Glushkov proved a theorem of the structural completeness [33]. According to this theorem, an FSM circuit is represented as a composition of the combinational part and the memory. The memory is necessary for keeping the history of the FSM’s operation. The history is represented by FSM internal states. This fundamental approach is still widely used for the synthesis of FSM circuits [34,35,36,37,38].
An FSM logic circuit is represented by some systems of Boolean functions (SBFs) [10,13]. To find these SBFs for Mealy FSMs, it is necessary to [13]: (1) encode states s m S by binary codes K ( s m ) ; (2) construct sets of state variables T = { T 1 , , T R } and input memory functions (IMFs) D = { D 1 , , D R } ; and, (3) transform an initial STT into a direct structure table (DST). The states s m S are encoded during the step of state assignment [10].
The minimum possible number of state variables R S is determined as
R S = l o g 2 M .
The approach based on (4) defines so-called maximum binary codes [10]. This method is used, for example, in the well-known academic system SIS [39]. However, the number of state variables can be different from (4). For example, the one-hot state codes with R = M are used in the academic system ABC [30,40] of Berkeley. The maximum binary codes and one-hot codes define the extreme points of the encoding space. There are other approaches for state assignment where the following relation holds: l o g 2 M R S M .
A state register (RG) keeps the state codes. The register includes R memory elements (flip-flops) having shared inputs of synchronization (Clock) and reset (Start). Very often, master–slave D flip-flops are used to organize state registers [41,42]. The pulse Clock allows the functions D r D to change the RG content.
After the execution of the state assignment, we should create a direct structure table. A DST includes all of the columns of an STT and three additional columns. These columns include the current state codes K ( s C ) and the codes K ( s T ) of the states of transitions. Finally, a column Φ h includes the symbols D r D corresponding to 1’s in the code K ( s T ) from the row h of a DST ( h { 1 , , H } ). A DST is a base to construct the following SBFs:
D = D ( T , I ) ;
Y = Y ( T , I ) .
The systems (5) and (6) determine a structural diagram of P Mealy FSM (Figure 2) [42].
The block of input memory functions generates the functions (5). The block of outputs generates the system (6). The pulse Start loads the code of the initial state to RG. The pulse of synchronization Clock allows information to be written to the register.
A DST of Moore FSM is a base for deriving the systems (5) and
O = O ( T ) .
A P Moore FSM is represented by a structural diagram that is similar to the one shown in Figure 2. However, as follows from SBF (7), there is no connection between the inputs i l I and block of outputs.
We now discuss how to obtain systems (5) and (6) for P Mealy FSM A 1 . There is M = 4 . Using (4) gives the value of R S = 2 . This determines the sets T = { T 1 , T 2 } and D = { D 1 , D 2 } . Let us encode states in the trivial way: K ( s 1 ) = 00 , , K ( s 4 ) = 11 . Having state codes allows transforming Table 1 (the initial STT) to Table 2. Table 2 is the DST of P FSM A 1 .
To fill the column D h , we should take into account that the value of D r D is equal to the value of the r-th bit of code K ( s T ) [13]. Systems (5) and (6) are represented as a sum-of-products (SOPs) [10,43]. These SOPs include product terms F h F corresponding to rows of a DST. The elements of the set of terms F are determined as
F h = S C I h ( h 1 , , H ) .
In (8), the first member S C is a conjunction of state variables corresponding to a code of the current state K ( s C ) from the h-th row of DST. There are the following conjunctions S C in the discussed case: S 1 = T 1 ¯ T 2 ¯ , , S 4 = T 1 T 2 .
Using Table 2, we can obtain the following SBFs:
o 1 = F 1 F 4 = T 1 ¯ T 2 ¯ i 1 T 1 ¯ T 2 i 3 ; o 2 = F 1 F 3 F 4 F 7 ; o 3 = F 2 F 6 ; o 4 = F 3 F 7 ; o 5 = F 5 F 8 = T 1 ¯ T 2 i 3 ¯ T 1 T 2 .
D 1 = [ F 2 F 3 ] F 4 F 7 = T 1 ¯ T 2 ¯ i 1 ¯ T 1 ¯ T 2 i 3 T 1 T 2 ¯ i 1 ¯ ; D 2 = F 1 F 3 F 5 F 7 F 8 .
The SBF (9) determines the circuit of block of outputs and the SBF (10) determines the circuit of block of input memory functions.
The hardware amount in an FSM circuit depends on the combination of SBF characteristics (the numbers of literals, functions, and product terms of SOPs) and specifics of the used logic elements (the number of inputs, outputs and product terms). Denote, by N A ( f i , F h ) , the number of literals in a term F h of the SOP of a function f i , and, by N T ( f i ) , the number of terms in a SOP of this function. Obviously, the following conditions are true for a SOP of any function f i D O :
N A ( f i , F h ) L + R S ;
N T ( f i ) H .
Consider the SOP of function D 1 from SBF (10). Each term of this SOP includes N A ( D 1 , F h ) = 3 literals. There are N T ( D 1 ) = 3 terms in this SOP. If NAND gates having N I N A N D = 3 inputs are used for implementing a logic circuit corresponding to D 1 , then there are four gates and two levels of gates in the circuit. This is the best solution, because the circuit includes the minimum possible number of gates (the minimum hardware amount), their levels (the maximum operating frequency), and interconnections.
However, if there is N I N A N D = 2 , then the SOP should be transformed. After the transformation, the SOP is represented by the following formula:
D 1 = T 1 ¯ T 2 ¯ ¯ ¯ i 1 ¯ ¯ · T 1 ¯ T 2 ¯ ¯ i 3 ¯ ¯ ¯ · T 1 T 2 ¯ ¯ ¯ i 1 ¯ ¯ ¯
Twelve gates are necessary for implementing the function (13). The resulting circuit has six levels of gates. Thus, an imbalance between the characteristics of the function and logic elements leads to an increase in the number of gates and levels of logic in the resulting logic circuit.
This situation can occur for any logical elements (logic gates, ROMs, PROMs, PLAs, PALs, CPLDs, FPGAs, and so on). In this case, it is necessary to optimize the characteristics of a resulting logic circuit. The structural decomposition is one of the ways for such an optimization [21].

3. Roots of Structural Decomposition

The control units’ circuits of the first computers were characterized by an irregular structure [45,46,47,48] with all the ensuing consequences. In 1951, Professor of Cambridge M. Wilkes proposed a principle of microprogram control [25,26]. According to this principle, each computer instruction is represented as a microprogram kept into a special control memory (CM). A microprogram consists of microinstructions. Each microinstruction has an operational part with control outputs (microoperations) and an address part having data used for generating an address of transition (the address of the next microinstruction to be executed). A special register is used to keep the microinstruction address. This approach allows for obtaining a microprogram control unit (MCU) with a regular circuit, which is quite simple to implement and test. A trivial structural diagram of the MCU is shown in Figure 3.
The MCU (Figure 3) uses the microinstruction address from the register and logical conditions (inputs) i l I to generate outputs o n O and the next address represented by variables T r T . A comparison of Figure 2 and Figure 3 shows that the MCU is a finite state machine in which blocks of input memory functions and outputs are replaced by the control memory. At the same time, microinstructions correspond to FSM states; microinstruction addresses correspond to state codes. This connection between FSMs and MCUs was first noted in [49].
The circuit of control memory was implemented using ROM [50,51,52]. For MCU (Figure 3), the required volume of such a ROM, V R O M , is determined as
V R O M = ( N + R S ) 2 L + R S .
For average FSMs [13], there is N = 50 , R S = 8 , L = 30 . If such a control unit is implemented as MCU (Figure 3), then it is necessary for 10 13 bits of control memory. In the 1950s, the use of such a big control memory would lead to a significant increase in the cost of a computer. Because of it, Figure 3 rather shows an idea of MCU, not the practical way of its implementation.
In order to diminish the required value of V R O M , two approaches have been proposed by M. Wilkes. The first of them is the selection of an input that should be used for generation of the transition address. As a rule, only a single logic condition is selected in each cycle of MCU operation. This allows for reducing the length of the address part of microinstruction. This approach leads to a two-level MCU shown in Figure 4.
The second approach is an encoding of collections of microoperations Y q O by maximum binary codes K ( Y q ) having R Q bits. In practical cases [53], there is R Q 8 . This allows reducing the length of the operational part of microinstruction up to
R Q = l o g 2 Q .
In (15), we use Q to denote the number of different COs for a particular STT.
If an MCU is implemented starting from STT (Table 1), then the following collections of outputs (COs) can be found: Y 1 = { o 1 , o 2 } , Y 2 = { o 3 } , Y 3 = { o 2 , o 4 } , Y 4 = { o 5 } . Accordingly, there is Q = 4 . Using (15) gives R Q = 2 . Let us use elements of the set Z = { z 1 , , z R Q } for encoding of the COs. It gives the set Z = { z 1 , z 2 } . Figure 5 shows one of the possible outcomes of encoding.
As follows from Figure 5, there is K ( Y 1 ) = 00 , , K ( Y 4 ) = 11 . The system of outputs is represented by the following SOP:
o 1 = Y 1 = z 1 ¯ z 2 ¯ ; o 2 = Y 1 Y 3 = z 1 ¯ ; o 3 = Y 2 = z 1 z 2 ¯ ; o 4 = Y 3 = z 1 ¯ z 2 ; o 5 = Y 4 = z 1 z 2 .
To implement the system (16), we should include in the MCU a block of outputs. This block consists of a decoder (DC) and a coder. Hence, this block has two levels of logic. The decoder transforms codes of COs into one-hot codes corresponding to COs. The coder transforms these one-hot codes into outputs. In the general case, the outputs are represented by the system
O = O ( Z ) .
If both of the approaches are used simultaneously, then there are three levels of logic blocks in the MCU. Figure 6 shows the structural diagram of three-level MCU with the compulsory addressing of microinstructions [50,54].
In the case of the compulsory addressing of microinstructions, the microinstruction format includes the operational part having R Q bits and the address part having R L = l o g 2 L bits with a code of logical condition to be checked and two address fields. The first address field includes an address of transition, if a logical condition to be checked is equal to 0 (or an address of unconditional transition). The second address field includes an address of transition if a logical condition to be checked is equal to 1. If a microprogram includes M microinstructions, then the number of address bits is determined by (4). Accordingy, each microinstruction has R Q + R L + 2 × R S bits. If R Q = R S = 8 and R L = 6 , then the value of V R O M is equal to 30 × 256 = 7680 bits.
The block of addressing generates address variables D r D . These variables depend on the inputs and microinstruction address part. This block is implemented on multiplexers (MXs) [52]. The block of outputs generates outputs as functions of the MCU operational part. Hence, an MCU is a Moore FSM.
The MCU with the block of addressing became the prototype of the FSMs with replacement of inputs. In literature [41] such FSMs are called M P FSMs, where “M” means “multiplexer”. The MCU with the block of outputs became the prototype of P Y FSMs with encoding of collections of outputs. The three-level MCU (Figure 6) corresponds to M P Y FSM. This means that various methods of structural decomposition can be used together.
The method of encoding of fields of compatible outputs (FCOs) was proposed to eliminate the coder from the block of outputs [55]. The outputs are compatible if they are not written in the same rows of STT. The set O is divided by I classes of compatible outputs:
O = O 1 O 2 O I .
Outputs o n O i are encoded by maximum binary codes K i ( o n ) . There are R i bits in the code K i ( o n ) :
R i = l o g 2 ( N i + 1 ) .
In (19), we use the symbol N i to denote the number of outputs in the class O i . The one is added to N i to take the relation o n O i into account.
The outputs o n O i are encoded using variables z r Z i . The total number of operational part bits, R F C O , is determined by summation of the values of (19). The structural diagram of the MCU based on this principle is the same as the one shown in Figure 6. However, the block of outputs consists of I decoders D C i . A decoder D C i generates outputs from the field F C O i .
This approach was used in optimizing control units of IBM/360 [56]. Additionally, they became the prototypes of P D FSMs [41].
There are three possible organizations of the block of outputs that are shown in Figure 7.
As follows from Figure 7, the one-hot organization (Figure 7a) leads to the fastest MCUs having the longest operational part. The block of outputs is absent. The maximum encoding of collections of outputs (Figure 7b) results in the two-level block of outputs. This is the slowest solution, but it provides the shortest operational part. As follows from Figure 7c, the encoding of FCOs results in a single-level block of outputs. This approach provides a compromise solution with the average delay and hardware amount.
The value of V R O M can be reduced due to using the nanomemory [44,54]. We now explain the idea of this approach (Figure 8).
If there are M o unique microinstructions in a microprogram, then they are encoded using R o variables v r V , where R o is determined as it is for R S . These codes are kept into a micro-memory (the first level of control memory). There are M codes in the micro-memory. The second level of memory (nanomemory) keeps the operational and addresses parts of these microinstructions. For example, there is M = 1024 , M 0 = 256 , and a microinstruction contains 64 bits. In the case of MCU (Figure 6), there are V 0 = 1024 × 64 = 64 k = 65,536 bits. We have R 0 = 10 . Accordingly, there are 1024 × 10 bits of the micro-memory and 256 × 64 = 16 k bits of the nanomemory. Hence, there are 26,624 bits of the control memory for MCU (Figure 8). It means that that approach allows reducing the volume of control memory by 2.46 time when compared to the MCU (Figure 6). This approach is a prototype of P H FSMs with encoding of product terms [41].
One fundamental law follows from the analysis of different methods of minimizing the value of V R O M . This is the following: the reducing hardware amount leads to an increase in the delay time of the resulting circuit (due to an increase in the number of logic levels). This law holds for all methods of structural decomposition.

4. Structural Decomposition in Matrix-Based Fsms

If an FSM is a part of an application-specific integrated circuit (ASIC) [57], then its circuit can be implemented using custom matrices [13,58]. These matrices are used as either AND-planes or OR-planes [59]. Each plane is a system of wires connected by CMOS transistors. Two wires (direct and compliment values of corresponding arguments) represent each literal of a SOP. Each term of a SOP corresponds to a wire.
To implement a matrix circuit of Mealy FSM, it is enough to use a single AND-matrix M 1 and a single OR-matrix M 2 . This is a trivial matrix implementation of P Mealy FSM (Figure 9).
The trivial matrix circuit (Figure 9) represents a P Mealy FSM [41]. This is the fastest matrix solution. However, such a solution is very redundant.
The hardware amount of matrix circuits is defined in conventional units of area (CUA) of matrices [13]. These areas are determined as the following:
S ( M 1 ) = 2 ( L + R S ) · H ;
S ( M 2 ) = H ( N + R S ) .
In (20) and (21), the symbols S ( M 1 ) , S ( M 2 ) stand for area of matrices M 1 and M 2 , respectively.
If there is L = 30 , N = 50 , R S = 8 , and H = 2000 (an average FSM [13]), then S ( M 1 ) = 152,000 CUA and S ( M 2 ) = 116,000 CUA. It gives the total area equal to 268,000 CUA. If each product term of SBFs (5) and (6) includes 3 + R = 11 literals, then there are 11 × 2000 = 22,000 useful interconnections in M 1 . If each term enters SOPs of five functions, then there are 5 × 2000 = 10,000 useful interconnections in M 2 . Hence, only 32,000 interconnections are used for implementing an FSM circuit. Because there are 268,000 interconnections, only 12% of the area is really used.
Two methods of structural decomposition were used to reduce the chip area that is occupied by an FSM circuit, namely [58]:
  • The replacement of inputs ( M P FSM).
  • The encoding of collections of outputs ( P Y FSM).
To design an M P FSM, it is necessary to replace the set I by some set P = { p 1 , , p G } . This makes sense if
G L .
The value of G is determined by the maximum number of inputs causing transitions from states s m S [58]. Consider the DST of Mealy FSM A 2 (Table 3).
In the case of A 2 , we have G = 2 . Accordingly, there is a set P = { p 1 , p 2 } .
To replace inputs, it is necessary to create the following SBF:
P = P ( T , I ) .
This SBF is constructed using a table of replacement. In the discussed case, Table 4 presents the table of replacement.
Using Table 4 gives the following SBF:
p 1 = T 1 ¯ T 2 ¯ T 3 ¯ i 1 T 1 ¯ T 2 ¯ T 3 i 2 T 1 ¯ T 2 T 3 ¯ i 4 T 1 ¯ T 2 T 3 i 7 ; p 2 = T 1 ¯ T 2 ¯ T 3 i 3 T 1 ¯ T 2 T 3 ¯ i 5 T 1 T 2 ¯ T 3 ¯ i 6 T 1 ¯ T 2 T 3 i 8 .
The SOP for p 1 includes terms v 1 v 4 ; the SOP for p 2 includes terms v 5 v 8 . Hence, there is N T ( P ) = 8 .
We should construct a table of M P FSM to design the circuit of M P FSM. It can be done by a transformation of the DST of P FSM. The transformation is reduced to the replacement of the column I h by the column P h [58]. In the discussed case, this leads to Table 5.
From Table 5, we can find that the IMFs and outputs of M P FSM are represented by the following SBFs:
D = D ( T , P ) ;
O = O ( T , P ) .
Systems (23)–(25) determine a matrix circuit of M P FSM that is shown in Figure 10.
In the M P Mealy FSM (Figure 10), the matrix M 3 implements terms of SBF (23). The matrix M 4 transforms terms v r V into functions (23). The matrix M 1 implements terms F h F . These terms correspond to the rows of DST. The matrix M 2 generates functions (25) and (26). These matrices have the following areas:
S ( M 1 ) = 2 ( G + R S ) · H ; S ( M 2 ) = H ( N + R S ) ; S ( M 3 ) = ( L + 2 R S ) · N T ( P ) ; S ( M 4 ) = N T ( P ) · G .
To optimize the matrix M 2 , the method of encoding of COs can be used [58]. As it is for MCU, Q COs are encoded by binary codes K ( Y q ) . These codes have R Q bits, where the expression (15) determines the value of R Q .
For FSM A 2 , the following COs can be found: Y 1 = , Y 2 = { o 1 , o 2 } , Y 3 = { o 3 } , Y 4 = { o 2 , o 4 } , Y 5 = { o 5 } , Y 6 = { o 3 , o 5 } . Accordingly, there is Q = 6 , R Q = 3 , Z = { z 1 , z 2 , z 3 } . To minimize the number of literals in (17), it is necessary to encode COs Y q O using the approach [60]. In the discussed case, Figure 11 shows the outcome of encoding.
Using codes (Figure 11), we can get the following SBF:
o 1 = Y 2 = z 2 z 3 ; o 2 = Y 2 Y 4 = z 1 ¯ z 2 ; o 3 = Y 3 Y 6 = z 1 z 3 ¯ ; o 4 = Y 4 = z 1 ¯ z 2 z 3 ¯ ; o 5 = Y 6 = z 1 z 2 ¯ z 3 ¯ .
There are N = 5 terms in (28). In the general case, there are N T ( O ) terms in (17). They form a set W.
To implement a P Y FSM circuit, it is necessary to create a DST of P Y FSM. For the FSM A 2 , it is Table 6.
The DST is a base for deriving SBFs (5) and
Z = Z ( T , I ) .
The SBFs (5), (17), and (29) determine a P Y Mealy FSM whose structural diagram is shown in Figure 12.
In P Y FSM, the matrix M 2 implements functions D r D and variables z r Z . The matrix M 5 transforms z r Z into terms of SBF (17). The matrix M 6 generates outputs o n D . These matrices have the following areas:
S ( M 1 ) = 2 ( L + R S ) · H ; S ( M 2 ) = H ( R S + R Q ) ; S ( M 5 ) = 2 R Q · N T ( O ) ; S ( M 6 ) = N T ( O ) · N .
These approaches can be used simultaneously [58]. This leads to M P Y Mealy FSM (Figure 13).
In the matrix circuit (Figure 13), the matrices M 3 and M 4 implement the SBF (23), the matrices M 5 and M 6 implement the SBF (17). The matrices M 1 and M 2 implement SBFs (25) and
Z = Z ( T , P ) .
There are two levels of logic in the matrix circuit of P Mealy FSM (Figure 9). This circuit has six levels of logic. Obviously, the P FSM is three times faster than an equivalent M P Y FSM (Figure 13). Let us compare areas of equivalent FSMs.
As shown in [58], the average FSMs have the following characteristics: L = 30 , N = 50 , R S = 8 , H = 2000 , G = 4 , N T ( P ) = 50 , R Q = 6 , N T ( O ) = 80 . This gives the following: S ( M 1 ) = 2 ( G + R S ) · H = 48,000 , S ( M 2 ) = H ( R S + R Q ) = 28,000 , S ( M 3 ) = ( L + 2 R S ) · N T ( P ) = 2300 , S ( M 4 ) = N T ( P ) · G = 200 , S ( M 5 ) = 2 R Q · N T ( O ) = 960 , S ( M 6 ) = N T ( O ) · N = 4000 . Now, we have the following total area of M P Y FSM circuit: 83,460 CUA. There are 268,000 CUA of the area of P Mealy FSM (Figure 9). This gives around 69% of economy. Accoridngly, an increase in the number of levels of a matrix circuit leads to an average reduction in area by 3.23 times. Of course, the FSM performance practically decreases to the same extent.
Accordingly, the methods of structural decomposition can be used for optimizing matrix circuits of Mealy FSMs. The same is true for Moore FSMs [61]. For further reducing the area, it is necessary to apply various methods of joint minimization of SBFs [43].

5. Structural Decomposition in Spld-Based Fsms

In the 1970s, a wide range of so-called simple programmable logic devices (SPLDs) appeared. This class includes programmable logic arrays (PLAs), programmable read-only memories (PROMs), and programmable array logic (PAL) [62,63,64,65]. A SPLD is a general purpose chip whose hardware can be configured by an end user to implement a particular product [66,67,68,69].
There is one common feature of SPLDs. Namely, they can be viewed as a composition of AND and OR arrays [62,63,64,65,70]. A typical SPLD structure is exactly the same as the one shown in Figure 9. Accordingly, SPLDs can implement SOPs representing the systems of Boolean functions.
In the case of PROM, the AND-array is fixed. It creates an address decoder. The OR-array is programmable. A PROM is the best tool for implementing SBFs that are represented by truth tables [10]. The number of address inputs of a PROM was rather small. Acccordingly, PROMs were used for implementing only parts of FSM circuits [71].
The joint using PROMs and multiplexers (MXs) leads to M P FSMs. The MXs implement the replacement of inputs that are represented by (23). The PROMs implement systems (25) and (26). To keep state codes, the register RG is used (Figure 14a). The joint using PROMs, decoders (DCs), and MXs leads to M P D FSMs (Figure 14b). To implement M P Y FSMs, it is necessary to use MXs and PROMs (Figure 14c).
As follows from Figure 14, different logic elements implement different parts of FSM circuits. This approach is a heterogeneous implementation of FSM circuit [71]. Of course, it is enough to use only memory blocks for implementing an FSM circuit [72].
The PLAs have the following specifics: both of the arrays are programmable [62,63]. Because of it, PLAs are used for implementing reduced SOPs [43] of SBFs [13,41,70]. Typical PLAs have S P L A = 16 inputs, t P L A = 8 outputs, and q P L A = 48 terms [73,74].
As a rule, FSM circuits were represented by networks of PLAs [13,75]. To optimize the number of chips in a circuit, the methods of structural decomposition were used. Additionally, the principle of heterogeneous implementation was used. For example, M P Y FSMs could be implemented using MXs, PLAs, and PROMs (Figure 15).
Different approaches were used for optimizing characteristics of PLA-based FSMs [76,77,78,79,80,81,82]. One of the new approaches was an encoding of FSM terms [78], leading to P H FSMs.
In this case, terms F h F , corresponding to rows of STT, were encoded by binary codes K ( F h ) having R H bits:
R H = l o g 2 H .
To encode terms, variables z r Z were used, where | Z | = R H . The following SBFs represent P H FSMs:
Z = Z ( I , T ) ; D = D ( Z ) ; O = O ( Z ) .
These SBFs were implemented using PLAs (for Z) and PROMs (for D , O ). Such a composition of PLAs and PROMs leads to P H FSM (Figure 16).
To implement a P H FSM, it is necessary to: (1) encode terms F h F ; (2) create a DST of P H FSM; (3) create SBFs (33); and, (4) program PLAs and PROMs. For example, there is H = 14 for Mealy FSM A 2 (Table 3). Using (32) gives R H = 4 and Z = { z 1 , , z 4 } . Let us encode terms in the trivial way: K ( F 1 ) = 0000 , , K ( F 14 ) = 1101 . Table 7 is a DST of P H Mealy FSM A 2 . Table 8 shows the PROMs’ contents.
Obviously, P H FSM (Figure 16) can be transformed into M P H , M P H Y , M P H D , P H Y , and P H D FSMs. To optimize circuits with decoders, the method [50] can be used.
To optimize hardware of PLA-based FSMs, it is possible to use the methods that are based on transformation of objects [27,83,84]. The following objects are characteristic for the Mealy FSMs [71]: states, outputs, and collections of outputs. The main idea of this approach is a representation of some objects as functions of other objects and additional variables.
The transformation of states into outputs leads to P S FSMs (Figure 17a). The transformation of states into COs leads to P S Y FSMs (Figure 17b). The transformation of COs into states leads to P O Y FSMs (Figure 17c).
As follows from Figure 17a, additional variables v r V replace inputs i e I in the SBF of outputs:
O = O ( T , V ) .
If | V | L , then the SOPs of (34) are much simpler than SOPs of (6). In P S Y FSMs (Figure 17b), the following SBFs are generated:
V = V ( T , I ) ;
Z = Z ( T , V ) .
In the case of P O Y FSMs (Figure 17c), the following new SBF is implemented:
D = D ( V , Z ) .
As follows from [83], the transformation of objects improves performance as compared with M P Y FSMs. Because of it, they are used in FPGA-based design [21].
The PAL chips have the following specific [64,85]: the AND array is programmable and OR-array is fixed. the terms of PAL are assigned to macrocells [23,74]. The evolution of this conception led to complex programmable logic devices (CPLDs) [15,69,86]. There are a huge number of publications related to PAL- and CPLD-based synthesis [64,73,85,87,88,89,90,91]. We do not discuss these methods in this survey. However, we note that the structural decomposition is used in CPLD-based FSMs [23].

6. Structural Decomposition in Fpga-Based Fsms

6.1. Basic Methods of Structural Decomposition in Design with Luts and Embs

Field-programmable gate arrays are widely used for implementing circuits of various digital systems [12,15,69,92]. To implement an FSM circuit, the following internal resources of FPGA chip can be used: look-up table (LUT) elements, embedded memory blocks (EMBs), programmable flip-flops, programmable interconnections, input-output blocks, and block of synchronization. LUTs and flip-flops form configurable logic blocks (CLBs). The “island-style” architecture is used in the majority of FPGAs [17,93,94].
A LUT is a block having S L inputs and a single output [95,96,97,98]. If a Boolean function depends on up to S L arguments [67], then the corresponding circuit only includes a single LUT. However, the number of LUT inputs is very limited [95,96,97]. Due to it, the methods of functional decomposition are used to implement the FPGA-based FSM circuits [99,100,101,102,103]. As a result, the FSM circuits have a lot of logic levels and a complex systems of interconnections [29]. Such circuits resemble programs that are based on intensive use of “go-to” operators [104]. Using terminology from programming, we can say that the functional decomposition produces the “spaghetti-type” LUT-based FSM circuits.
Modern FPGAs include a lot of configurable embedded memory blocks [95,96]. These CLBs allow for implementing systems of regular functions [28]. If at least a part of the FSM circuit is implemented using EMBs, then the characteristics of this circuit can be significantly improved [16]. Because of it, there are a lot of design methods targeting EMB-based FSMs [16,105,106,107,108,109,110,111,112,113,114,115]. In [28], there is the survey of various methods of EMB-based FSM design. However, very often, practically all available EMBs are used for implementing the operational blocks of digital systems. Accordingly, the EMB-based FSM design methods can only be applied if a designer has some “free” EMBs.
An EMB can be characterized by a pair S A , t F , where S A is a number of address inputs and t F is a number of memory cell outputs. A single EMB can keep a truth table of an SBF including up to t F Boolean functions depended on up to S A arguments [116]. A pair S A , t F defines a configuration of an EMB with the constant total number of bits (size of EMB):
V 0 = 2 S A × t F .
The parameters S A and t F could be defined by a designer [66]. It means that EMBs are configurable memory blocks [67]. The following configurations exist for modern EMBs [95,96]: 15 , 1 , 14 , 2 , , 9 , 64 . Accordingly, modern EMBs are very flexible and can be tuned to meet characteristics of a particular FSM. This explains the existence of a wide spectrum of EMB-based design methods [16,105,106,107,108,109,110,111,112,113,114,115].
If the condition
2 ( R S + L ) ( R S + N ) V 0
holds, then a single EMB implements an FSM circuit [28]. If (39) is violated, then an FSM circuit could be implemented as: (1) a homogenous network of EMBs or (2) a heterogeneous network where LUTs and EMBs are used together [16,114].
There are three approaches for implementing combinational parts of CLB-based FSMs. They are the following: (1) using only LUTs; (2) using only EMBs; and, (3) using the heterogeneous approach, when both LUTs and EMBs are applied [28].
One of the most crucial steps in the CLB-based design flow is the technology mapping [29,117,118]. The outcome of the technology mapping is a network of interconnected CLBs representing an FSM circuit. This step largely determines the resulting characteristics of an FSM circuit. These characteristics are strongly interrelated.
A chip area occupied by a CLB-based FSM circuit is mostly determined by the number of CLBs and the system of their interconnections. Obviously, to reduce the area, it is necessary to reduce the CLB count in an FSM circuit. As follows from [119], the more LUTs are included into an FSM circuit, the more power it consumes. Now, “process technology has scaled considerably …with current design activity at 14 and 7 nm. Due to it, interconnection delay now dominates logic delay” [18]. As noted in [120], the interconnections are responsible for the consume up to 70% of power. Accordingly, it is very important to reduce the amount of interconnections to improve the characteristics of FSM circuits. All of this can be done using methods of structural decomposition.
As follows from (39), an FSM circuit can be implemented by a single EMB if the following conditions hold for a configuration S A , t F :
S A R S + L ;
t F R S + N .
As a rule, the modern EMBs are synchronous blocks. Hence, there is no need in an additional register to keep FSM state codes [28]. Figure 18 shows a trivial EMB-based circuit of Mealy FSM.
To design such a circuit, it is necessary to [28]: (1) execute the state assignment; (2) construct a DST on the base of an STT; and, (3) create the truth table corresponding to the DST. This truth table has L + R S columns containing an address of a particular cell. Each cell has R S + N bits. Transitions from any state s m S are represented by H ( s m ) rows of the truth table [28]:
H ( s m ) = 2 L .
The following parameters can be found for A 1 (Table 2): the number of inputs L = 3 , and the number of state variables R S = 2 . Accordingly, using (42) gives H ( s m ) = 8 . If an input i e I is insignificant for transitions from a state s m S , then there are the same values of IMFs and outputs for cells with addresses having either i e = 0 or i e = 1 . This rule is illustrated by Table 9 with the transitions from state s 2 from Table 2.
In Table 9, the number of a cell is shown in the column q. The column h is added to compare Table 2 and Table 9. The even rows of Table 9 correspond to i 3 = 1 , and the odd rows correspond to i 3 = 0 .
The transition from LUTs to EMBs is similar to the transition from gates to large scale integration circuits. This transition improves all the characteristics of an FSM circuit, namely, the chip area that is occupied by FSM circuit, the FSM performance and power consumption. If conditions (40) and (41) are violated, then methods of structural decomposition can be used [21]. In this case, an FSM circuit is represented as a network of EMBs and LUTs.
The analysis of numerous literature has shown that the following methods of structural decomposition are used in EMB-based FSM design:
  • The replacement of inputs i e I by additional variables p g P leading to M P FSMs [16,107,108,109,110,111,112,113].
  • The maximum encoding of collections of outputs leading to P Y FSMs [28].
  • Mixed encoding of outputs leading to P Y M FSMs [121].
  • The encoding of product terms leading to P H FSMs [122].
Following the notation of [21], we denote, as LUTer, a block consisting of LUTs and. as EMBer, a block consisting of EMBs. The structural diagram of M P Mealy FSM is shown in Figure 19.
In M P FSM, the LUTerP implements SBF (23), the EMBer contains a truth table of SBFs (25) and (26). As follows from Figure 18 and Figure 19, the outputs o n O are synchronized. This is necessary to stabilize FSM outputs [42]. The M P Mealy FSM can be used if the following condition holds:
2 G + R S ( N + R S ) V o .
Clearly, the M P FSM (Figure 19) uses an idea of the two-level MCU (Figure 4) in an FPGA environment. The state variables create the address part of microinstructions. The number of EMBs in EMBer is determined as
n E M B = R S + N t F .
To diminish the value of n E M B , the maximum encoding of COs Y q O can be used [21]. The replacement of inputs can be used together with this approach. This results in the M P Y Mealy FSM (Figure 20).
In M P Y FSM, the EMBer implements SBFs (23) and (31). The LUTerO transforms codes K ( O q ) into outputs o n O . To do it, SBF (17) is implemented by LUTerO. Now, the number of EMBs in EMBer is determined as
n E M B = R S + R Q t F .
The value of R Q is determined by (15).
If the condition
R Q S L
holds, then a single-level circuit of LUTerO includes up to N LUTs. If (46) is violated, then a mixed encoding of outputs [121] can be used. The idea of this approach is the following.
Let it be Q = 17 , R Q = 5 , and S L = 4 . The analysis of these values shows that the condition (46) is violated. Let the set of COs include COs Y 5 = { o 1 , o 3 , o 4 } and Y 8 = { o 1 , o 4 } . If we eliminate o 3 O from Y 5 , then Y 5 Y 8 . Now, there is R Q = 4 = S L . The eliminated outputs form a set O E . The set of outputs is represented as O = O E O L , where O L O E = . This leads to M P Y M Mealy FSM (Figure 21).
In M P Y M FSM, the outputs o n O E are represented by SBF (26). The outputs o n O L are represented by (17). The outputs o n O E are represented by one-hot codes, the outputs o n O L by maximum binary codes. Because of that, this is a mixed encoding of outputs.
In [121], there is proposed a method allowing to create such a partition of the set O. It allows for eliminating the minimum possible number elements of O to create the set O E .
This approach can be used to diminish the number of CLBs in the circuit of LUTerO. For example, there is S L = 6 for LUTs of Virtex 7 [96]. If R Q = 6 , then the number of LUTs in the circuit of LUTerO is equal to N. However, the CLB can be organized as two LUTs having five shared inputs. If the mixed encoding of outputs gives the set O L with R Q = 5 , then the number of LUTs in LUTerO is determined as | O L | / 2 . The closer the values of N and | O L | are, the greater the saving in the number of CLBs.
Two approaches are possible for implementing EMB-based Mealy FSMs [122]. In both cases, the binary codes K ( F h ) encode the terms F h F . These codes have R H bits. The variables z r Z are used for encoding of terms, where | Z | = R H . The value of R H is determined by (32). The system Z = Z ( T , I ) represents the block of terms [122]. This system can be implemented as either the network of LUTs (Figure 22a) or the network of EMBs (Figure 22b).
Both methods should be used. Finally, the method leading to the minimum hardware should be selected [122].

6.2. Structural Decomposition in Lut-Based Design

As mentioned in [12], EMBs are widely used for implementing various blocks of digital systems. Accordingly, it is quite possible that only LUTs can be used for implementing FSM circuits. The methods of structural decomposition may be used in LUT-based FSMs [21]. They are used to improve LUT counts (and other characteristics) of LUT-based P Mealy FSMs (Figure 23).
In P FSMs, the LUTerD implements SBF (5) and the LUTerO implements SBF (6). Each function f i D O is represented by a SOP having N A ( f i ) literals. In the best case, there are R S LUTs in the circuit of LUTerD and N LUTs in the circuit of LUTerO. The following relation determines this case:
N A ( f i ) S L ( i { 1 , , R S + N } ) .
If (47) is violated, then a P FSM is represented by a multi-level circuit. To improve LUT count of such circuits, the model of M P Y FSM can be used.
This approach is proposed in [123]. It leads to a three-level circuit that is shown in Figure 24.
In M P Y FSM, the LUTerP implements system (23). It generates additional variables p g P replacing inputs i e I . The LUTerD generates input memory functions that are represented by (25). The LUTerZ generates variables z r Z used for encoding of collections of outputs. This block implements SBF (31). The LUTerO implements outputs o n O that are represented by SBF (17).
The method of synthesis of LUT-based M P Y FSM includes the following steps [123]:
  • Executing the replacement of inputs.
  • Executing the state assignment optimizing (23).
  • Deriving collections of outputs from the STT.
  • Executing the encoding of COs.
  • Creating the DST of M P Y FSM.
  • Deriving SBFs (25) and (31) from the DST.
  • Implementing FSM circuit using particular LUTs.
In [123], the results of experiments conducted to compare the characteristics of various models of LUT-based FSMs are shown. The standard benchmarks [124] were used for investigation. These benchmarks are Mealy FSMs; they are represented in KISS2 format. Table 10 contains the characteristics of these benchmark FSMs.
To conduct experiments [123], the CAD tool Vivado (ver. 2019.1) [125] was used with the target chip XC7VX690T2FFG1761 (Xilinx Virtex 7) [126]. There is S L = 6 for LUTs of Virtex 7 family.
Four other methods were compared with M P Y FSMs. They were Auto of Vivado, one-hot of Vivado, JEDI [39,127], and DEMAIN [128]. The benchmarks were divided by five categories. To do it, the values of R S + L and S L = 6 were used. If R S + L 6 , then benchmarks belong to category 0; if 6 < R S + L 12 , it is the category 1; if 12 < R S + L 18 , then it defines the category 2; if 18 < R S + L 24 , then benchmarks belong to category 3; finally, the relation R S + L > 24 , determines category 4.
Table 11 (the LUT counts) and Table 12 (the maximum operating frequency) represent the results of investigations [123]. As follows from Table 11, MPY-based FSMs have minimum number of LUTs. As follows from Table 12, MPY-based FSMs are the slowest. However, this disadvantage is reduced with the increase in the number of category.

6.3. New Methods of Structural Decomposition

In all thw discussed methods, only maximum state codes are used when the value of R S is determined by (4). In [129,130,131], there is a method of twofold state assignment proposed. In this case, any state s m S has two codes. The code K ( s m ) determines the state as an element of the set S. The code C ( s m ) defines the state as an element of some partition class.
To use the method [129,130], it is necessary to construct a partition Π S = { S 1 , , S K } of the set of states S. For each class S k Π S , the following condition holds:
R k + L k S L ( k = 1 , K ¯ ) .
In (48), the symbol R k denotes the length (the number of bits) of a code C ( s m ) for states s m S k ; the symbol L k defines the number of inputs i e I determining the transitions from states s m S k .
Each class S k Π S determines a DST k with transitions from states s m S k . This table includes inputs from the set I k I , outputs from the set O k O , and IMFs that are equal to 1 for transitions from states s m S k . These IMFs form a set D k D . A DST k determines the SBFs
D k = D k ( τ k , I k ) ;
O k = O k ( τ k , I k ) .
The variables τ r τ k encode states as elements of the set S k S .
This approach determines P T Mealy FSMs. The logic circuits of P T FSMs include three levels of logic blocks. Figure 25 showsn the structural diagram of P T FSM.
In P T Mealy FSM, the LUTerk ( k { 1 , , K } ) implements SBF (49) and (50). The LUTerTO implements the following SBFs:
D = D ( D 1 , , D K ) ;
O = O ( O 1 , , O K ) .
The LUTer τ transform state codes K ( s m ) into state codes C ( s m ) . To do it, the following SBF is implemented:
τ = τ ( T ) .
The structural diagram (Figure 25) determines a case of the one-hot encoding of outputs [130]. In [129], there was a method proposed combining the twofold state assignment with the maximum encoding of COs. This leads to P T Y Mealy FSM, as shown in Figure 26.
In P T Y FSM, the SBFs (50) and (52) are replaced by SBFs:
Z k = Z k ( τ k , I k ) ;
Z = Z ( Z 1 , , Z K ) .
Because of (48), each function (49), (50), and (54) are implemented as a single-level circuit; moreover, each function is implemented by a circuit having exactly one LUT. If there is
K S L ,
then it is enough a single LUT to implement a circuit for each determined by (52) and (54). If there is
R S S L ,
then the circuit of the LUTer τ is a single-level one. If the condition (46) holds, then there are up to N LUTs in the circuit of LUTerO.
In the best case, the conditions (46), (48), (56), and (57) are true. This best case determines the three-level LUT-based circuits of both P T and P T Y Mealy FSMs. Logic circuits of P T Y FSMs consume fewer LUTs than equivalent P Y FSMs, as shown in [129]. The experimental results [130] show that the logic circuits of P T FSMs consume fewer LUTs than this is for the equivalent P Mealy FSMs.
Using the twofold state assignment improves the characteristics of EMB-based FSMs, as shown in [122]. In [122], this method is used to improve LUT count in P H Mealy FSMs (Figure 22b). The method is based on finding a partition Π F = { F 1 , , F k } of the set of terms F. For each class of this partition, the following condition holds:
R k S L ( k { 1 , , K } ) .
The value of R k can be found as l o g 2 H k , where H k is a number of elements in the set F k .
The binary codes K ( F h ) encode the classes F k Π F . These codes have R c bits, where
R c = l o g 2 K .
The code of a term F h F is represented as
K ( F h ) = C ( F k ) * C ( F h ) .
In (60), C ( F h ) is a code of a term as an element of the set F k F , * is a sign of concatenation. To encode terms, the variables z r Z are used. To use free outputs of EMB, the set D is represented as D E D L and the set O is represented as O E O L . The classes of Π F are encoded using variables v r V . Now, the P H FSM is represented, as shown in Figure 27.
In [122], the results of experiments are shown. The following models were compared: P FSMs (Figure 23), M P FSMs (Figure 19), P H FSMs (Figure 22b), and the proposed approach (Figure 27). Table 13 (LUT counts), Table 14 (the maximum operating frequency), and Table 15 (the consumed power) show the results of experiments for some benchmarks [124].
The experiments have been conducted for the benchmarks [124], the evolution board with chip XC7VX690TFFG1761-2 [126] and CAD tool Vivado [125]. It is enough a single EMB of Virtex 7 to implement the logic circuits for any from 33 benchmarks [124], as shown in [122]. A network of LUTs and EMBs is used to implement circuits for other benchmarks.
It is possible to improve the characteristics of LUT-based FSM circuits using the transformation of objects [21]. For example, there is a structural diagram of P o Y Mealy FSM shown in Figure 28 [132].
In P o Y FSM, the LUTerZV implements SBFs (35) and
Z = Z ( T , I ) .
The LUTerT generates the functions from the SBF (37) and the LUTerO implements SBF (17). This approach is used to: (1) improve the operating frequency of multi-level M P Y FSMs and (2) reduce the LUT count as compared with P FSMs if the condition (47) is violated.
If condition (47) is violated for functions f i V Z , then the LUTerZV is represented by a multi-level circuit. To improve the characteristics of P o Y FSMs, the following approach is proposed in [132].
The set S is divided by classes S k Π S , such that the condition (48) holds for each class of Π S . Next, states s m S k are encoded by codes C ( s m ) having the minimum possible number of bits. The following SBFs should be implemented [132]: (54), (55), (17), (37), and
V k = V k ( τ k , I k ) ;
V = V ( V 1 , , V K ) .
This approach leads to P o T Y FSMs. The circuit of P o T Y FSM includes three levels of LUTs (Figure 29).
In P o T Y FSM, the LUTerk ( k { 1 , , K } ) implements SBFs (54) and (62). The LUTerZV generates functions z r Z and v r V . They are represented by SBFs (55) and (63). The LUTerO implements SBF (17), the LUTer τ generates functions (37).
There are experimental results in [132] that are obtained using the CAD tool Vivado [125] and the evolution board with Virtex 7 FPGA chip [126]. The following characteristics have been compared: the LUT counts (Table 16), maximum operating frequency (Table 17), and area-time products (Table 18).
As follows from Table 16, the P o Y FSMs require fewer LUTs than other investigated methods. The P o T Y FSMs consume more LUTs (8.84%) when compared to P o Y FSMs. However, other FSMs are based on functional decomposition. Their circuits require more LUTs than for P o T Y FSMs. The gain increases along with the growth of the category number.
As follows from Table 17, the P o T Y -based FSMs have the highest operating frequency as compared to other investigated methods. The following can be found from Table 18: the P o T Y -based FSMs produce circuits with better area-time products then it is for other investigated methods. Starting from average FSMs, P o T Y -based circuits have better area-time products.
Hence, using the methods of structural decomposition allows for improving characteristics of FPGA-based FSMs. Three-level circuits improve the LUT count and two-level circuits improve the performance. These methods can be applied together with other optimization methods used in FSM design [21].

7. Conclusions

Since the 1950s, digital systems have increasingly influenced different areas of our lives. The control units and other sequential blocks are very important parts of digital systems. Very often, the behaviour of sequential blocks is represented using a model of finite state machine. During these 70 years, several generations of logic elements that are used to implement FSM circuits have changed. However, one thing remained unchanged: regardless of the generation of logic elements, there is always the problem of reducing their number in the FSM circuit. This problem arises if a single-level FSM circuit with minimum possible amount of elements cannot be implemented. One of the ways for reducing the required hardware is the applying various methods of structural decomposition.
These approaches have roots in various methods that are used for optimizing the size of the control memory of microprogram control units. The following basic methods of structural decomposition are known: the replacement of FSM inputs, encoding of the collections of outputs, encoding of product terms corresponding to interstate transitions, and transformation of objects. Using these methods requires taking the peculiarities of logic elements into account. Recently, two new methods of structural decomposition have appeared. These new methods are: (1) the twofold state assignment and (2) the mixed encoding of FSM outputs. These methods are focused on FPGA-based FSMs.
This orientation is related to the fact that FPGA devices are very often used for implementing digital systems. These chips include a lot of LUT elements and embedded memory blocks. It allows implementing very complex digital systems. Embedded memory blocks are effective tools for implementing FSM circuits. However, it is quite possible that all available EMBs are used for implementing various blocks of a digital system. In this case, an FSM circuit is implemented as a network of LUTs. The main specific of LUTs is a very small number of inputs (for the vast majority of FPGAs the value of S L is less than 7). This feature makes it necessary to use the methods of functional decomposition in the FPGA-based design. As a rule, this leads to multi-level FSM circuits that are characterized by the very complex systems of “spaghetti-type” interconnections.
The optimization of the chip area that is occupied by a LUT-based FSM circuit can be achieved due to applying various methods of structural decomposition. Numerous studies show that the structural decomposition produces the FSM circuits having better characteristics than their counterparts based on the functional decomposition. The FSM circuits that are based on the structural decomposition are characterized by the regular system of interconnections and predicted number of logic levels. The same is true for the heterogeneous implementation of FSM circuits when LUTs and EMBs are used simultaneously.
In this review, we have shown the roots of structural decomposition methods and their development starting from the 1950s. Our research shows that these methods can be used for optimizing FSM circuits that were implemented with any logic elements (PROMs, PLAs, PALs, CPLDs, FPGAs, and custom matrices of ASIC). Now, the majority of digital systems are implemented using FPGAs and ASICs. It is difficult to imagine what elements will replace them in the future. However, one thing remains clear: these elements will also have limits on the number of inputs, outputs, and terms. The results of the research presented in this article allow us to conclude that the methods of structural decomposition will be used in the future generations of the logic elements implementing FSM circuits.

Author Contributions

Conceptualization, A.B., L.T. and K.K.; methodology, A.B., L.T. and K.K.; formal analysis, A.B., L.T. and K.K.; writing—original draft preparation, A.B., L.T. and K.K.; supervision, A.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available in the article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CLBconfigurable logic block
COFcollection of output functions
COcollection of output
CPLDcomplex programmable logic device
DSTdirect structure table
EMBembedded memory block
ESCextended state code
FCOfield of compatible outputs
FDfunctional decomposition
FSMfinite state machine
FPGAfield-programmable gate array
IMFinput memory function
LUTlook-up table
PALprogrammable array Logic
PLAprogrammable logic array
PROMprogrammable read-only memory
ROMread-only memory
SBFsystems of Boolean functions
SDstructural decomposition
SOPsum-of-products
SRGstate register
STTstate transition table

References

  1. Alur, R. Principles of Cyber-Physical Systems; MIT Press: Cambridge, MA, USA, 2015. [Google Scholar]
  2. Suh, S.C.; Tanik, U.J.; Carbone, J.N.; Eroglu, A. Applied Cyber-Physical Systems; Springer: New York, NY, USA, 2014. [Google Scholar]
  3. Krzywicki, K.; Barkalov, A.; Andrzejewski, G.; Titarenko, L.; Kolopienczyk, M. SoC research and development platform for distributed embedded systems. Prz. Elektrotech. 2016, 92, 262–265. [Google Scholar] [CrossRef] [Green Version]
  4. Nowosielski, A.; Małecki, K.; Forczmański, P.; Smoliński, A.; Krzywicki, K. Embedded Night-Vision System for Pedestrian Detection. IEEE Sens. J. 2020, 20, 9293–9304. [Google Scholar] [CrossRef]
  5. Barkalov, A.; Titarenko, L.; Mazurkiewicz, M. Foundations of Embedded Systems; Springer International Publishing: New York, NY, USA, 2019. [Google Scholar]
  6. Lee, E.A.; Seshia, S.A. Introduction to Embedded Systems: A Cyber-Physical Systems Approach; MIT Press: Cambridge, MA, USA, 2017. [Google Scholar]
  7. Barkalov, A.; Titarenko, L.; Andrzejewski, G.; Krzywicki, K.; Kolopienczyk, M. Fault detection variants of the CloudBus protocol for IoT distributed embedded systems. Adv. Electr. Comput. Eng. 2017, 17, 3–10. [Google Scholar] [CrossRef]
  8. Zajac, W.; Andrzejewski, G.; Krzywicki, K.; Królikowski, T. Finite State Machine Based Modelling of Discrete Control Algorithm in LAD Diagram Language With Use of New Generation Engineering Software. Procedia Comput. Sci. 2019, 159, 2560–2569. [Google Scholar] [CrossRef]
  9. Marwedel, P. Embedded System Design: Embedded Systems Foundations of Cyber-Physical Systems, and the Internet of Things, 3rd ed.; Springer International Publishing: New York, NY, USA, 2018. [Google Scholar]
  10. De Micheli, G. Synthesis and Optimization of Digital Circuits; McGraw-Hill: Cambridge, MA, USA, 1994. [Google Scholar]
  11. Gajski, D.D.; Abdi, S.; Gerstlauer, A.; Schirner, G. Embedded System Design: Modeling, Synthesis and Verification; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2009. [Google Scholar]
  12. Sklyarov, V.; Skliarova, I.; Barkalov, A.; Titarenko, L. Synthesis and Optimization of FPGA-Based Systems; Vol. 231 of Lecture Notes in Electrical Engineering; Springer: Berlin, Germany, 2014; Volume 294. [Google Scholar]
  13. Baranov, S. Logic Synthesis of Control Automata; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1994. [Google Scholar]
  14. El-Maleh, A.H. Finite state machine-based fault tolerance technique with enhanced area and power of synthesised sequential circuits. IET Comput. Digit. Tech. 2017, 11, 159–164. [Google Scholar] [CrossRef]
  15. Jenkins, J.H. Designing with FPGAs and CPLDs; Prentice Hall: Hoboken, NJ, USA, 1994. [Google Scholar]
  16. Tiwari, A.; Tomko, K.A. Saving Power by Mapping Finite-State Machines into Embedded Memory Blocks in FPGAs. In Proceedings of the Proceedings Design, Automation and Test in Europe Conference and Exhibition, Paris, France, 16–20 February 2004; Volume 2, pp. 916–921. [Google Scholar]
  17. Trimberger, S.M. Field-Programmable Gate Array Technology; Springer Science & Business Media: Berlin, Germany, 2012. [Google Scholar]
  18. Feng, W.; Greene, J.; Mishchenko, A. Improving FPGA Performance with a S44 LUT Structure. In Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA”18), Monterey, CA, USA, 25–27 February 2018; p. 6. [Google Scholar] [CrossRef]
  19. Benini, L.; De Micheli, G. State assignment for low power dissipation. IEEE J. Solid-State Circuits 1995, 30, 258–268. [Google Scholar] [CrossRef]
  20. Agrawal, R.; Borowczak, M.; Vemuri, R. A State Encoding Methodology for Side-Channel Security vs Power Trade-Off Exploration. In Proceedings of the 2019 32nd International Conference on VLSI Design and 2019 18th International Conference on Embedded Systems (VLSID), Delhi, India, 5–9 January 2019; pp. 70–75. [Google Scholar]
  21. Barkalov, A.; Titarenko, L.; Mielcarek, K.; Chmielewski, S. Logic Synthesis for FPGA-Based Control Units—Structural Decomposition in Logic Design; Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2020; Volume 636. [Google Scholar]
  22. Barkalov, A.; Titarenko, L. Logic Synthesis for FSM-Based Control Units; Springer: Berlin, Germany, 2009; Volume 53. [Google Scholar]
  23. Czerwinski, R.; Kania, D. Finite State Machine Logic Synthesis for Complex Programmable Logic Devices; Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2013; Volume 231. [Google Scholar]
  24. Barkalov, A.; Titarenko, L.; Mazurkiewicz, M.; Krzywicki, K. Improving LUT count of FPGA-based sequential blocks. Bull. Pol. Acad. Sci. Tech. Sci. 2021. [Google Scholar] [CrossRef]
  25. Wilkes, M.V. The Best Way to Design an Automatic Calculating Machine. In Proceedings of the Manchester University Computer Inaugural Conference, London, UK, 9–12 July 1951; pp. 16–18. [Google Scholar]
  26. Wilkes, M.V.; Stringer, J.B. Micro-programming and the design of the control circuits in an electronic digital computer. In Mathematical Proceedings of the Cambridge Philosophical Society; Cambridge University Press: Cambridge, UK, 1953; Volume 49, pp. 230–238. [Google Scholar]
  27. Barkalov, A.; Titarenko, L.; Barkalov, A., Jr. Structural decomposition as a tool for the optimization of an FPGA-based implementation of a Mealy FSM. Cybern. Syst. Anal. 2012, 48, 313–322. [Google Scholar] [CrossRef]
  28. Barkalov, A.; Titarenko, L.; Kolopienczyk, M.; Mielcarek, K.; Bazydlo, G. Logic Synthesis for FPGA-Based Finite State Machines; Springer: Chem, Switzerland, 2015; pp. 2–31. [Google Scholar]
  29. Kubica, M.; Opara, A.; Kania, D. Technology Mapping for LUT-Based FPGA; Springer: Cham, Switzerland, 2021. [Google Scholar]
  30. Brayton, R.; Mishchenko, A. ABC: An Academic Industrial-Strength Verification Tool. In Computer Aided Verification; Touili, T., Cook, B., Jackson, P., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 24–40. [Google Scholar]
  31. Mealy, G.H. A method for synthesizing sequential circuits. Bell Syst. Tech. J. 1955, 34, 1045–1079. [Google Scholar] [CrossRef]
  32. Moore, E.F. Gedanken-experiments on sequential machines. Autom. Stud. 1956, 34, 129–153. [Google Scholar]
  33. Glushkov, V.M. Synthesis of Digital Automata; Foreign Technology Div Wright-Patterson Afb Ohio: Dayton, OH, USA, 1965. [Google Scholar]
  34. Issa, H.H.; Ahmed, S.M.E. FPGA implementation of floating point based cuckoo search algorithm. IEEE Access 2019, 7, 134434–134447. [Google Scholar] [CrossRef]
  35. Senhadji-Navarro, R.; Garcia-Vargas, I. Methodology for Distributed-ROM-based Implementation of Finite State Machines. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 2020. [Google Scholar] [CrossRef]
  36. Klimowicz, A. Combined State Splitting and Merging for Implementation of Fast Finite State Machines in FPGA. In International Conference on Computer Information Systems and Industrial Management; Springer: Cham, Switzerland, 2020; pp. 65–76. [Google Scholar]
  37. Gazi, O.; Arlı, A.Ç. VHDL Implementation of Finite State Machines and Practical Applications. In State Machines Using VHDL; Springer: Cham, Switzerland, 2021; pp. 55–113. [Google Scholar]
  38. Yan, Z.; Jiang, H.; Li, B.; Yang, M. A Flowchart Based Finite State Machine Design and Implementation Method for FPGA. In International Conference on Internet of Things as a Service; Springer: Cham, Switzerland, 2020; pp. 295–310. [Google Scholar]
  39. Sentowich, E.; Singh, K.J.; Lavagno, L.; Moon, C.; Murgai, R.; Saldanha, A.; Savoj, H.; Stephan, P.R.; Brayton, R.K.; Sangiovanni-Vincentelli, A. SIS: A System for Sequential Circuit Synthesis; University of California: Berkely, CA, USA, 1992. [Google Scholar]
  40. ABC System. Available online: https://people.eecs.berkeley.edu/~alanmi/abc/ (accessed on 6 April 2021).
  41. Baranov, S.; Skliarov, V. Digital Devices with Programmable LSIs with Matrix Structure. Radio and Communications; Radio Sviaz: Moscow, Russia, 1986. [Google Scholar]
  42. Skliarov, V. Synthesis of Automata with Matrix LSIs; Nauka i Technika: Minsk, Belarus, 1984. [Google Scholar]
  43. McCluskey, E.J. Logic Design Principles with Emphasis on Testable Semicustom Circuits; Prentice-Hall, Inc.: Hoboken, NJ, USA, 1986. [Google Scholar]
  44. Agerwala, T. Microprogram optimization: A survey. IEEE Comput. Archit. Lett. 1976, 25, 962–973. [Google Scholar] [CrossRef]
  45. Agrawala, A.K.; Rauscher, T.G. Foundations of Microprogramming; Academic Press: New York, NY, USA, 1976. [Google Scholar]
  46. Chu, Y. Computer Organization and Microprogramming; Prentice Hall: Hoboken, NJ, USA, 1972. [Google Scholar]
  47. Flynn, M.J.; Rosin, R.F. Microprogramming: An introduction and a viewpoint. IEEE Trans. Comput. 1971, 100, 727–731. [Google Scholar] [CrossRef]
  48. Habib, S. Microprogramming and Firmware Engineering Methods; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1988. [Google Scholar]
  49. Palagin, A.; Rakitskij, A. Three structures of microprogram control units. Control Mach. Syst. 1984, 3, 40–43. [Google Scholar]
  50. Kravcov, L.; Chernicki, G. Design of Microprogram Control Units; Energy: Leningrad, Russia, 1976. [Google Scholar]
  51. Dasgupta, S. The organization of microprogram stores. ACM Comput. Surv. (CSUR) 1979, 11, 39–65. [Google Scholar] [CrossRef]
  52. Husson, S.S.; Mm, S. Microprogramming: Principles and Practices; Prentice-Hall Inc.: Englewood Cliffs, NJ, USA, 1970. [Google Scholar]
  53. Salisbury, A.B. Microprogrammable Computer Architectures; Elsevier Science Inc.: Amsterdam, The Netherlands, 1976. [Google Scholar]
  54. Baranov, S.; Barkalov, A. Microprogramming: Principles, methods, applications. Foreign Radioelectron. 1984, 5, 3–29. [Google Scholar]
  55. Schwartz, S.J. An Algorithm for Minimizing Read only Memories for Machine Control. In Proceedings of the 9th Annual Symposium on Switching and Automata Theory, Schenedtady, NY, USA, 15–18 October 1968; pp. 28–33. [Google Scholar]
  56. Tucker, S.G. Microprogram control for System/360. IBM Syst. J. 1967, 6, 222–241. [Google Scholar] [CrossRef]
  57. Solovjev, V.; Chyzy, M. Refined CPLD Macrocell Architecture for the Effective FSM Implementation. In Proceedings of the 25th EUROMICRO Conference, Informatics: Theory and Practice for the New Millennium, Milan, Italy, 8–10 September 1999; Volume 1, pp. 102–109. [Google Scholar]
  58. Baranov, S.I. Synthesis of Microprogram Machines; Energiya: Leningrad, Russia, 1979. [Google Scholar]
  59. Navabi, Z. Embedded Core Design with FPGAs; McGraw-Hill Professional: New York, NY, USA, 2006. [Google Scholar]
  60. Achasova, S. Synthesis algorithms for automata with PLAs. Sov. Radio 1987, 3, 22–33. [Google Scholar]
  61. Barkalov, A.; Węgrzyn, M. Design of Control Units with Programmable Logic; University of Zielona Góra Press: Zielona Góra, Poland, 2006. [Google Scholar]
  62. Baranov, S.; Barkalov, A. Application of programmable logic arrays in digital systems. Foregin Radioelectron. 1982, 6, 67–79. [Google Scholar]
  63. Baranov, S.; Sinjov, V. Programmable logic arrays in digital systems. Foregin Radioelectron. 1976, 1, 78–84. [Google Scholar]
  64. Palagin, A.; Barkalov, A.; Usifov, S.; Szwets, A. Synthesis of microprogram automata with PLIs. Kiev IC NAN 1992, 92, 18–26. [Google Scholar]
  65. Gorman, K. The programmable logic array: A new approach to microprogramming. Electron. Des. News 1973, 18, 68–75. [Google Scholar]
  66. Maxfield, C. The Design Warrior’s Guide to FPGAs: Devices, Tools and Flows; Elsevier: Amsterdam, The Netherlands, 2004. [Google Scholar]
  67. Maxfield, C. FPGAs: Instant Access; Elsevier: Amsterdam, The Netherlands, 2011. [Google Scholar]
  68. Hemel, A. The PLA: A different kind of ROM. Electron. Des. 1976, 24, 28–47. [Google Scholar]
  69. Brown, S.; Rose, J. Architecture of FPGAs and CPLDs: A tutorial. IEEE Des. Test Comput. 1996, 13, 42–57. [Google Scholar] [CrossRef] [Green Version]
  70. Bibilo, P. Synthesis of Combinational PLA Structures for VLSI; Nauka i Tehnika: Minsk, Belarus, 1992. [Google Scholar]
  71. Below, P.L.A.L. Digital Systems Design with Programmable Logic; Addison-Wesley: Boston, MA, USA, 1990. [Google Scholar]
  72. Sasao, T. Memory-Based Logic Synthesis; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2011. [Google Scholar]
  73. Solovjov, V. Design of functional blocks of digital systems with programmable logic devices. Bestprint 1996, 7, 40–52. [Google Scholar]
  74. Solovjov, V. Design of Digital Systems Basing on Programmable Logic Integrated Circuits; Hotline–Telecom: Moscow, Russia, 2001. [Google Scholar]
  75. Baranov, S.I. Logic and System Design of Digital Systems; TUT Press: Tallinn, Estonia, 2008. [Google Scholar]
  76. Baer, J.L.; Koyama, B. On the minimization of the width of the control memory of microprogrammed processors. IEEE Comput. Archit. Lett. 1979, 28, 310–316. [Google Scholar] [CrossRef]
  77. Novikov, S. Synthesis Of Logic-Circuits With Programmable Logic-Arrays. Avtomat. Vychislitelnaya Tekh. 1977, 5, 1–4. [Google Scholar]
  78. Skilarov, V. Synthesis of Microprogram Automata with Standard PLAs; Automatic Control and Computer Sciences; Allerton Press Inc.: New York, NY, USA, 1983; pp. 28–35. [Google Scholar]
  79. Skilarov, V. Using decoders in microprogram automata with matrix structure. Izwiestia Wuzow Priborostrojenie 1982, 12, 27–31. [Google Scholar]
  80. Sorokin, B. A Method of Synthesis of Microprogram Automata on Standard ROMs and PLAs. Avtomat. Vychislitelnaya Tekh. 1984, 2, 69–77. [Google Scholar]
  81. Barkalov, A. Multilevel PLA schemes for microprogram automata. Cybern. Syst. Anal. 1995, 31, 489–495. [Google Scholar]
  82. Barkalov, A. Optimization of multilevel circuit of mealy FSM with PLAs. Control Syst. Mach. 1994, 93, 13–16. [Google Scholar]
  83. Barkalov, A.A.; Barkalov, A.A.J. Design of Mealy finite-state machines with the transformation of object codes. Int. J. Appl. Math. Comput. Sci. 2005, 15, 151–158. [Google Scholar]
  84. Barkalov, A.; Titarenko, L.; Mielcarek, K.; Wegrzyn, M. Design of EMB-based mealy FSMs with transformation of output functions. IFAC-PapersOnLine 2015, 48, 197–201. [Google Scholar] [CrossRef]
  85. Palagin, A.; Barkalov, A.; Usifov, S.; Starodubov, K.; Svetc, A. Realization of microprogrammed automata on CPLD. Control Syst. Mach. 1991, 8, 18–22. [Google Scholar]
  86. Zeidman, B. Designing with FPGAS and CPLDS; CRC Press: Boca Raton, FL, USA, 2002. [Google Scholar]
  87. Kania, D. Two-level logic synthesis on PALs. Electron. Lett. 1999, 35, 879–880. [Google Scholar] [CrossRef]
  88. Kania, D. Two-Level Logic Synthesis on PAL-Based CPLD and FPGA Using Decomposition. In Proceedings of the 25th EUROMICRO Conference, Informatics: Theory and Practice for the New Millennium, Milan, Italy, 8–10 September 1999; Volume 1, pp. 278–281. [Google Scholar]
  89. Kania, D. Coding capacity of PAL-based logic blocks included in CPLDs and FPGAs. IFAC Proc. Vol. 2000, 33, 167–172. [Google Scholar] [CrossRef]
  90. Kania, D. An Efficient Algorithm for Output Coding in PAL Based CPLDs. Int. J. Eng. 2002, 15, 325–328. [Google Scholar]
  91. Kania, D.; Milik, A. Logic Synthesis based on decomposition for CPLDs. Microprocess. Microsyst. 2010, 34, 25–38. [Google Scholar] [CrossRef]
  92. Bomar, B.W. Implementation of microprogrammed control in FPGAs. IEEE Trans. Ind. Electron. 2002, 49, 415–422. [Google Scholar] [CrossRef]
  93. Kuon, I.; Tessier, R.; Rose, J. FPGA Architecture: Survey and Challenges; Now Publishers Inc.: Delft, The Netherlands, 2008. [Google Scholar]
  94. Trimberger, S.M.S. Three Ages of FPGAs: A Retrospective on the First Thirty Years of FPGA Technology: This Paper Reflects on How Moore’s Law Has Driven the Design of FPGAs Through Three Epochs: The Age of Invention, the Age of Expansion, and the Age of Accumulation. IEEE Solid-State Circuits Mag. 2018, 10, 16–29. [Google Scholar] [CrossRef]
  95. Altera. Cyclone IV Device Handbook. Available online: http://www.altera.com/literature/hb/cyclone-iv/cyclone4-handbook.pdf (accessed on 6 April 2021).
  96. Xilinx FPGAs. Available online: https://www.xilinx.com/products/silicon-devices/fpga.html (accessed on 6 April 2021).
  97. Intel FPGAs and Programmable Devices. Available online: https://www.intel.pl/content/www/pl/pl/products/programmable.html (accessed on 6 April 2021).
  98. Kilts, S. Advanced FPGA Design: Architecture, Implementation, and Optimization; Wiley-IEEE Press: Hoboken, NJ, USA, 2007. [Google Scholar]
  99. Łuba, T.; Rawski, M.; Jachna, Z. Functional Decomposition as a Universal method of Logic Synthesis for Digital Circuits. In Proceedings of the 9th International Conference Mixed Design of Integrated Circuits and Systems MixDes, Wroclaw, Poland, 20–22 June 2002; Volume 2, pp. 285–290. [Google Scholar]
  100. Scholl, C. Functional Decomposition with Applications to FPGA Synthesis; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013. [Google Scholar]
  101. Nowicka, M.; Luba, T.; Rawski, M. FPGA-based decomposition of boolean functions. Algorithms and implementation. In ACS’98: Advanced Computer Systems; Instytut Informatyki Politechniki Szczecinskiej: Szczecin, Poland, 1998; pp. 502–509. [Google Scholar]
  102. Łuba, T. Multi-level logic synthesis based on decomposition. Microprocess. Microsyst. 1994, 18, 429–437. [Google Scholar] [CrossRef]
  103. Machado, L.; Cortadella, J. Support-reducing decomposition for FPGA mapping. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2018, 39, 213–224. [Google Scholar] [CrossRef]
  104. Dahl, O.J.; Dijkstra, E.W.; Hoare, C.A.R. Structured Programming; Academic Press Ltd.: Cambridge, MA, USA, 1972. [Google Scholar]
  105. Kolopienczyk, M.; Barkalov, A.; Titarenko, L. Hardware Reduction for RAM-Based Moore FSMs. In Proceedings of the 7th International Conference on Human System Interactions (HSI), Costa da Caparica, Portugal, 16–18 June 2014; pp. 255–260. [Google Scholar]
  106. Kołopieńczyk, M.; Titarenko, L.; Barkalov, A. Design of EMB-based Moore FSMs. J. Circuits Syst. Comput. 2017, 26, 1750125. [Google Scholar] [CrossRef]
  107. Das, N.; Priya, P.A. FPGA implementation of reconfigurable finite state machine with input multiplexing architecture using hungarian method. Int. J. Reconfigurable Comput. 2018. [Google Scholar] [CrossRef] [Green Version]
  108. Garcia-Vargas, I.; Senhadji-Navarro, R. Finite state machines with input multiplexing: A performance study. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2015, 34, 867–871. [Google Scholar] [CrossRef]
  109. Garcia-Vargas, I.; Senhadji-Navarro, R.; Jiménez-Moreno, G.; Civit-Balcells, A.; Guerra-Gutierrez, P. ROM-Based Finite State Machine Implementation in Low Cost FPGAs. In Proceedings of the 2007 IEEE International Symposium on Industrial Electronics, Vigo, Spain, 4–7 June 2007; pp. 2342–2347. [Google Scholar]
  110. Senhadji-Navaro, R.; Garcia-Vargas, I. High-speed and area-efficient reconfigurable multiplexer bank for RAM-based finite state machine implementations. J. Circuits Syst. Comput. 2015, 24, 1550101. [Google Scholar] [CrossRef]
  111. Senhadji-Navarro, R.; Garcia-Vargas, I. High-performance architecture for Binary-Tree-Based finite state machines. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2017, 37, 796–805. [Google Scholar] [CrossRef] [Green Version]
  112. Senhadji-Navarro, R.; Garcia-Vargas, I.; Jimenez-Moreno, G.; Civit-Ballcels, A. ROM-based FSM implementation using input multiplexing in FPGA devices. Electron. Lett. 2004, 40, 1249–1251. [Google Scholar] [CrossRef]
  113. Sklyarov, V. Synthesis and implementation of RAM-based finite state machines in FPGAs. In International Workshop on Field Programmable Logic and Applications; Springer: Berlin/Heidelberg, Germany, 2000; pp. 718–727. [Google Scholar]
  114. Rawski, M.; Selvaraj, H.; Łuba, T. An application of functional decomposition in ROM-based FSM implementation in FPGA devices. J. Syst. Archit. 2005, 51, 424–434. [Google Scholar] [CrossRef]
  115. Rawski, M.; Tomaszewicz, P.; Borowik, G.; Łuba, T. 5 logic synthesis method of digital circuits designed for implementation with embedded memory blocks of FPGAs. In Design of Digital Systems and Devices; Springer: Berlin/Heidelberg, Germany, 2011; pp. 121–144. [Google Scholar]
  116. Rafla, N.I.; Gauba, I. A Reconfigurable Pattern Matching Hardware Implementation Using on-Chip RAM-Based FSM. In Proceedings of the 2010 53rd IEEE International Midwest Symposium on Circuits and Systems, Seattle, WA, USA, 1–4 August 2010; pp. 49–52. [Google Scholar]
  117. Mishchenko, A.; Chattarejee, S.; Brayton, R. Improvements to technology mapping for LUT-based FPGAs. IEEE Trans. CAD 2006, 27, 240–253. [Google Scholar]
  118. Kubica, M.; Kania, D.; Kulisz, J. A technology mapping of fsms based on a graph of excitations and outputs. IEEE Access 2019, 7, 16123–16131. [Google Scholar] [CrossRef]
  119. Cong, J.; Yan, K. Synthesis for FPGAs with Embedded Memory Blocks. In Proceedings of the 2000 ACM/SIGDA Eighth International Symposium on Field Programmable Gate Arrays, Monterey, CA, USA, 10–11 February 2000; pp. 75–82. [Google Scholar]
  120. Barkalov, A.; Bukowiec, A. Synthesis of mealy finite states machines for interpretation of verticalized flow-charts. Theor. Appl. Inform. 2005, 5, 39–51. [Google Scholar]
  121. Barkalov, A.; Titarenko, L.; Chmielewski, S. Mixed encoding of collections of output variables for LUT-based mealy FSMs. J. Circuits Syst. Comput. 2019, 28, 1950131. [Google Scholar] [CrossRef]
  122. Barkalov, A.; Titarenko, L.; Mazurkiewicz, M.; Krzywicki, K. Encoding of terms in EMB-based Mealy FSMs. Appl. Sci. 2020, 10, 2762. [Google Scholar] [CrossRef]
  123. Barkalov, A.; Titarenko, L.; Krzywicki, K. Reducing LUT Count for FPGA-Based Mealy FSMs. Appl. Sci. 2020, 10, 5115. [Google Scholar] [CrossRef]
  124. McElvain, K. LGSynth93 Benchmark; Mentor Graphics: Wilsonville, OR, USA, 1993. [Google Scholar]
  125. Vivado Design Suite User Guide: Synthesis UG901 (v2019.1). Available online: https://www.xilinx.com/support/documentation/sw_manuals/xilinx2019_1/ug901-vivado-synthesis.pdf (accessed on 6 April 2021).
  126. VC709 Evaluation Board for the Virtex-7 FPGA User Guide; UG887 (v1.6); Xilinx, Inc.: San Jose, CA, USA, 2019.
  127. Lin, B. Synthesis of Multiple-Level Lgic from Symbolic High-Level Description Languages. In Proceedings of the IFIP International Conference on Very Large Scale Integration, Munich, Germany, 16–18 August 1989. [Google Scholar]
  128. Rawski, M.; Łuba, T.; Jachna, Z.; Tomaszewicz, P. The influence of functional decomposition on modern digital design process. In Design of Embedded Control Systems; Springer: Boston, MA, USA, 2005; pp. 193–204. [Google Scholar]
  129. Barkalov, O.; Titarenko, L.; Mielcarek, K. Hardware reduction for LUT-based Mealy FSMs. Int. J. Appl. Math. Comput. Sci. 2018, 28, 595–607. [Google Scholar] [CrossRef] [Green Version]
  130. Barkalov, A.; Titarenko, L.; Mielcarek, K. Improving characteristics of LUT-based Mealy FSMs. Int. J. Appl. Math. Comput. Sci. 2020, 30, 745–759. [Google Scholar]
  131. Barkalov, A.; Titarenko, L.; Krzywicki, K.; Saburova, S. Improving Characteristics of LUT-Based Mealy FSMs with Twofold State Assignment. Electronics 2021, 10, 901. [Google Scholar] [CrossRef]
  132. Barkalov, A.; Titarenko, L.; Krzywicki, K.; Saburova, S. Improving the Characteristics of Multi-Level LUT-Based Mealy FSMs. Electronics 2020, 9, 1859. [Google Scholar] [CrossRef]
Figure 1. The state transition graph of Mealy finite state machine (FSM) A 1 .
Figure 1. The state transition graph of Mealy finite state machine (FSM) A 1 .
Electronics 10 01174 g001
Figure 2. Structural diagram of P Mealy FSM.
Figure 2. Structural diagram of P Mealy FSM.
Electronics 10 01174 g002
Figure 3. Trivial implementation of microprogram control unit.
Figure 3. Trivial implementation of microprogram control unit.
Electronics 10 01174 g003
Figure 4. Two-level microprogram control unit.
Figure 4. Two-level microprogram control unit.
Electronics 10 01174 g004
Figure 5. Maximum codes of collections of outputs.
Figure 5. Maximum codes of collections of outputs.
Electronics 10 01174 g005
Figure 6. Three-level microprogram control unit.
Figure 6. Three-level microprogram control unit.
Electronics 10 01174 g006
Figure 7. Organization of block of outputs with one-hot (a), maximum encoding (b) and encoding of FCO (c).
Figure 7. Organization of block of outputs with one-hot (a), maximum encoding (b) and encoding of FCO (c).
Electronics 10 01174 g007
Figure 8. Organization of microprogram control unit (MCU) with nanomemory.
Figure 8. Organization of microprogram control unit (MCU) with nanomemory.
Electronics 10 01174 g008
Figure 9. Trivial matrix implementation of P Mealy FSM.
Figure 9. Trivial matrix implementation of P Mealy FSM.
Electronics 10 01174 g009
Figure 10. Structural diagram of M P Mealy FSM.
Figure 10. Structural diagram of M P Mealy FSM.
Electronics 10 01174 g010
Figure 11. Optimal codes of collections of outputs.
Figure 11. Optimal codes of collections of outputs.
Electronics 10 01174 g011
Figure 12. Structural diagram of P Y Mealy FSM.
Figure 12. Structural diagram of P Y Mealy FSM.
Electronics 10 01174 g012
Figure 13. Structural diagram of M P Y Mealy FSM.
Figure 13. Structural diagram of M P Y Mealy FSM.
Electronics 10 01174 g013
Figure 14. Organization of M P (a), M P D (b), and M P Y (c) programmable read-only memory (PROM)-based FSMs.
Figure 14. Organization of M P (a), M P D (b), and M P Y (c) programmable read-only memory (PROM)-based FSMs.
Electronics 10 01174 g014
Figure 15. Heterogeneous circuit of M P Y FSM.
Figure 15. Heterogeneous circuit of M P Y FSM.
Electronics 10 01174 g015
Figure 16. The structural diagram of P H FSM.
Figure 16. The structural diagram of P H FSM.
Electronics 10 01174 g016
Figure 17. Structural diagrams of Mealy FSMs with transformation of states into outputs (a), states into collections of outputs (b), and collections of outputs into states (c).
Figure 17. Structural diagrams of Mealy FSMs with transformation of states into outputs (a), states into collections of outputs (b), and collections of outputs into states (c).
Electronics 10 01174 g017
Figure 18. A trivial embedded memory block (EMB)-based circuit of Mealy FSM.
Figure 18. A trivial embedded memory block (EMB)-based circuit of Mealy FSM.
Electronics 10 01174 g018
Figure 19. Structural diagram of field-programmable gate array (FPGA)-based M P Mealy FSM.
Figure 19. Structural diagram of field-programmable gate array (FPGA)-based M P Mealy FSM.
Electronics 10 01174 g019
Figure 20. The structural diagram of FPGA-based M P Y Mealy FSM.
Figure 20. The structural diagram of FPGA-based M P Y Mealy FSM.
Electronics 10 01174 g020
Figure 21. Structural diagram of M P Y M Mealy FSM.
Figure 21. Structural diagram of M P Y M Mealy FSM.
Electronics 10 01174 g021
Figure 22. Structural diagrams of P H Mealy FSM with L U T e r , E M B e r (a) and E M B e r , L U T e r (b) organization.
Figure 22. Structural diagrams of P H Mealy FSM with L U T e r , E M B e r (a) and E M B e r , L U T e r (b) organization.
Electronics 10 01174 g022
Figure 23. Structural diagram of look-up table (LUT)-based P Mealy FSM.
Figure 23. Structural diagram of look-up table (LUT)-based P Mealy FSM.
Electronics 10 01174 g023
Figure 24. Structural diagram of LUT-based M P Y Mealy FSM.
Figure 24. Structural diagram of LUT-based M P Y Mealy FSM.
Electronics 10 01174 g024
Figure 25. Structural diagram of P T Mealy FSM.
Figure 25. Structural diagram of P T Mealy FSM.
Electronics 10 01174 g025
Figure 26. Structural diagram of P T Y Mealy FSM.
Figure 26. Structural diagram of P T Y Mealy FSM.
Electronics 10 01174 g026
Figure 27. The structural diagram of P H Mealy FSM.
Figure 27. The structural diagram of P H Mealy FSM.
Electronics 10 01174 g027
Figure 28. Structural diagram of P o Y Mealy FSM.
Figure 28. Structural diagram of P o Y Mealy FSM.
Electronics 10 01174 g028
Figure 29. The structural diagram of LUT-based P o T Y Mealy FSM.
Figure 29. The structural diagram of LUT-based P o T Y Mealy FSM.
Electronics 10 01174 g029
Table 1. State transition table (STT) of Mealy finite state machine (FSM) A 1 .
Table 1. State transition table (STT) of Mealy finite state machine (FSM) A 1 .
s C s T I h O h h
s 1 s 2 i 1 o 1 o 2 1
s 3 i 1 ¯ i 2 o 3 2
s 4 i 1 ¯ i 2 ¯ o 2 o 4 3
s 2 s 3 i 3 o 1 o 2 4
s 2 i 3 ¯ o 5 5
s 3 s 1 i 2 o 3 6
s 4 i 2 ¯ o 2 o 4 7
s 4 s 2 1 o 5 8
Table 2. Direct structure table (DST) of Mealy FSM A 1 .
Table 2. Direct structure table (DST) of Mealy FSM A 1 .
s C K ( s C ) s T K ( s T ) I h O h D h h
s 1 00 s 2 01 i 1 o 1 o 2 D 2 1
s 3 10 i 1 ¯ i 2 o 3 D 1 2
s 4 11 i 1 ¯ i 2 ¯ o 2 o 4 D 1 D 2 3
s 2 01 s 3 10 i 3 o 1 o 2 D 1 4
s 2 01 i 3 ¯ o 5 D 2 5
s 3 10 s 1 00 i 2 o 3 -6
s 4 11 i 2 ¯ o 2 o 4 D 1 D 2 7
s 4 11 s 2 011 o 5 D 2 8
Table 3. Direct structure table (DST) of Mealy FSM A 2 .
Table 3. Direct structure table (DST) of Mealy FSM A 2 .
s C K ( s C ) s T K ( s T ) I h O h D h h
s 1 000 s 2 001 i 1 o 1 o 2 D 3 1
s 3 010 i 1 ¯ o 3 D 2 2
s 2 001 s 2 001 i 2 o 2 o 4 D 3 3
s 3 010 i 2 ¯ i 3 o 1 o 2 D 2 4
s 4 100 i 2 ¯ i 3 ¯ o 5 D 1 5
s 3 010 s 4 100 i 4 o 3 D 1 6
s 5 011 i 4 ¯ i 5 o 3 o 5 D 2 D 3 7
s 6 101 i 4 ¯ i 5 ¯ o 1 o 2 D 1 D 3 8
s 4 100 s 3 010 i 6 o 2 o 4 D 2 9
s 1 000 i 6 ¯ --10
s 5 011 s 2 001 i 7 o 3 D 3 11
s 2 001 i 7 ¯ i 8 o 1 o 2 D 3 12
s 4 100 i 7 ¯ i 8 ¯ o 3 o 5 D 1 13
s 6 101 s 1 0001--14
Table 4. Table of replacement of P Mealy FSM A 2 .
Table 4. Table of replacement of P Mealy FSM A 2 .
s m s 1 s 2 s 3 s 4 s 5 s 6
p 1 i 1 i 2 i 4 - i 7 -
p 2 - i 3 i 5 i 6 i 8 -
K ( s m ) 000001010100011101
Table 5. DST of M P Mealy FSM A 2 .
Table 5. DST of M P Mealy FSM A 2 .
s C K ( s C ) s T K ( s T ) P h O h D h h
s 1 000 s 2 001 p 1 o 1 o 2 D 3 1
s 3 010 p 1 ¯ o 3 D 2 2
s 2 001 s 2 001 p 1 o 2 o 4 D 3 3
s 3 010 p 1 ¯ p 2 o 1 o 2 D 2 4
s 4 100 p 1 ¯ p 2 ¯ o 5 D 1 5
s 3 010 s 4 100 p 1 o 3 D 1 6
s 5 011 p 1 ¯ p 2 o 3 o 5 D 2 D 3 7
s 6 101 p 1 ¯ p 2 ¯ o 1 o 2 D 1 D 3 8
s 4 100 s 3 010 p 2 o 2 o 4 D 2 9
s 1 000 p 2 ¯ --10
s 5 011 s 2 001 p 1 o 3 D 3 11
s 2 001 p 1 ¯ p 2 o 1 o 2 D 3 12
s 4 100 p 1 ¯ p 2 ¯ o 3 o 5 D 1 13
s 6 101 s 1 0001--14
Table 6. DST of P Y Mealy FSM A 2 .
Table 6. DST of P Y Mealy FSM A 2 .
s C K ( s C ) s T K ( s T ) I h Z h Φ h h
s 1 000 s 2 001 i 1 z 2 z 3 D 3 1
s 3 010 i 1 ¯ z 1 z 2 D 2 2
s 2 001 s 2 001 i 2 z 2 D 3 3
s 3 010 i 2 ¯ i 3 z 2 z 3 D 2 4
s 4 100 i 2 ¯ i 3 ¯ z 1 z 3 D 1 5
s 3 010 s 4 100 i 4 z 1 z 2 D 1 6
s 5 011 i 4 ¯ i 5 z 1 D 2 D 3 7
s 6 101 i 4 ¯ i 5 ¯ z 2 z 3 D 1 D 3 8
s 4 100 s 3 010 i 6 z 2 D 2 9
s 1 000 i 6 ¯ --10
s 5 011 s 2 001 i 7 z 1 z 2 D 3 11
s 2 001 i 7 ¯ i 8 z 2 z 3 D 3 12
s 4 100 i 7 ¯ i 8 ¯ z 1 D 1 13
s 6 101 s 1 0001--14
Table 7. DST of P H Mealy FSM A 2 .
Table 7. DST of P H Mealy FSM A 2 .
s C K ( s C ) I h F h K ( F h ) Z h h
s 1 000 i 1 F 1 0000-1
i 1 ¯ F 2 0001 z 4 2
s 2 001 i 2 F 3 0010 z 3 3
i 2 ¯ i 3 F 4 0011 z 3 z 4 4
i 2 ¯ i 3 ¯ F 5 0100 z 2 5
s 3 010 i 4 F 6 0101 z 2 z 4 6
i 4 ¯ i 5 F 7 0110 z 2 z 3 7
i 4 ¯ i 5 ¯ F 8 0111 z 2 z 3 z 4 8
s 4 100 i 6 F 9 1000 z 2 9
i 6 ¯ F 10 1001 z 2 z 4 10
s 5 011 i 7 F 11 1010 z 2 z 3 11
i 7 ¯ i 8 F 12 1011 z 1 z 3 z 4 12
i 7 ¯ i 8 ¯ F 13 1100 z 1 z 2 13
s 6 1011 F 14 1101 z 1 z 2 z 4 14
Table 8. Contents of programmable read-only memories (PROMs) of P H Mealy FSM A 2 .
Table 8. Contents of programmable read-only memories (PROMs) of P H Mealy FSM A 2 .
F h K ( F h ) o 1 o 2 o 3 o 4 o 5 D 1 D 2 D 3 h
F 1 0000110000011
F 2 0001001000102
F 3 0010010100013
F 4 0011110000104
F 5 0100000011005
F 6 0101001001006
F 7 0110001010117
F 8 0111110001018
F 9 1000010100109
F 10 10010000000010
F 11 10100010000111
F 12 10111100000112
F 13 11000010110013
F 14 11010000000014
Table 9. Part of truth table for FSM A 1 .
Table 9. Part of truth table for FSM A 1 .
AddressContents of Cellsqh
T 1 T 2 i 1 i 2 i 3 o 1 o 2 o 3 o 4 o 5 D 1 D 2
01000000010195
010011100010104
010100000101115
010111100010124
011000000101135
011011100010144
011100000101155
011111100010164
Table 10. Characteristics of Mealy FSM benchmarks.
Table 10. Characteristics of Mealy FSM benchmarks.
BenchmarkLN R + L M / R H
Category 0
bbtas2269/424
dk1723616/432
dk2712510/414
dk51213624/515
ex322614/436
ex522616/432
lion2155/311
lion921611/425
mc3568/310
modulo1211512/424
shiftreg11516/416
Category 1
bbara42812/460
bbsse771226/556
beecount34710/428
cse771232/591
dk1435826/556
dk1535817/532
dk1623975/7108
donfile21724/596
ex222725/572
ex4691118/521
ex658914/434
ex7221217/536
keyb771222/5170
mark15161022/522
opus561018/522
s2741811/434
s386771223/564
s841815/420
sse771226/556
Categories 2–4
ex19191680/7138
kirkman1261848/6370
planet7191486/7115
planet17191486/7115
pma881449/673
s1871454/6106
s148881915112/7251
s149481915118/7250
s1a861586/7107
s2081121737/6153
styr9101667/7166
tma791363/644
sand1191888/7184
s42019227137/8137
s51019727172/877
s82018192578/7232
s83218192576/7245
Table 11. Experimental results for M P Y Mealy FSMs [123] (LUT counts).
Table 11. Experimental results for M P Y Mealy FSMs [123] (LUT counts).
BenchmarkAutoOne-HotJEDIDEMAIN MPY
Category 0
bbtas55558
dk17512568
dk2735447
dk512101091012
ex3999911
ex5999910
lion25226
lion9611558
mc47456
modulo1277779
shiftreg26224
Category 1
bbara171710910
bbsse3337242626
beecount1919141614
cse4066363833
dk141027101212
dk15516566
dk161534121411
donfile3131222621
ex299898
ex41513121311
ex62436222321
ex745446
keyb4361404237
mark12323202119
opus2828222621
s27618666
s3862639222520
s899999
sse3337303226
Categories 2–4
ex17074535740
kirkman4258394133
planet131131889478
planet1131131889478
pma9494869172
s16599616454
s148812413110811289
s149412613211011790
s1a4981435438
s208123110119
styr93120818870
tma4539394130
sand13213211412199
s42010319108
s5104848323922
s8208882687652
s8328079627050
Total17922104148016011321
Percentage135.65%159.27%112.04%121.20%100%
Table 12. Experimental results for M P Y Mealy FSMs [123] (the operating frequency, MHz).
Table 12. Experimental results for M P Y Mealy FSMs [123] (the operating frequency, MHz).
BenchmarkAutoOne-HotJEDIDEMAIN MPY
Category 0
bbtas204.16204.16206.12208.32194.43
dk17199.28167.00199.39172.19147.22
dk27206.02201.90204.18205.10181.73
dk512196.27196.27199.75197.49175.63
ex3194.86194.86195.76193.43174.44
ex5180.25180.25181.16181.76162.56
lion202.43204.00202.35201.32185.74
lion9205.30185.22206.38205.86167.28
mc196.66195.47196.87192.53178.02
modulo12207.00207.00207.13207.37189.7
shiftreg262.67263.57276.26276.14248.79
Category 1
bbara193.39193.39212.21198.46183.32
bbsse157.06169.12182.34178.91159.24
beecount166.61166.61187.32184.21156.72
cse146.43163.64178.12174.19153.24
dk14191.64172.65193.85187.32162.78
dk15192.53185.36194.87188.54175.42
dk16169.72174.79197.13189.83164.16
donfile184.03184.00203.65194.83174.28
ex2198.57198.57200.14199.75188.95
ex4180.96177.71192.83178.14168.39
ex6169.57163.80176.59174.12156.42
ex7200.04200.84200.60200.32191.43
keyb156.45143.47168.43157.16136.49
mark1162.39162.39176.18169.65153.48
opus166.20166.20178.32168.79157.42
s27198.73191.50199.13198.43185.15
s386168.15173.46179.15169.21164.65
s8180.02178.95181.23180.39168.32
sse157.06169.12174.63169.69158.14
Categories 2–4
ex1150.94139.76176.87186.14164.32
kirkman141.38154.00156.68143.76155.36
planet132.71132.71187.14185.73174.68
planet1132.71132.71187.14185.73173.29
pma146.18146.18169.83153.57156.12
s1146.41135.85157.16149.17145.32
s1488138.50131.94157.18153.12141.27
s1494149.39145.75164.34159.42155.63
s1a153.37176.40169.17158.12166.36
s208174.34176.46178.76172.87166.42
styr137.61129.92145.64138.83118.02
tma163.88147.80164.14168.19137.48
sand115.97115.97126.82120.63120.07
s420173.88176.46177.25172.87186.35
s510177.65177.65198.32183.18199.05
s820152.00153.16176.58166.29175.69
s832145.71153.23173.78160.03174.39
Total8127.088061.228718.878461.107917.10
Percentage102.65%101.82%110.13%106.87%100%
Table 13. Experimental results [122] (LUT counts).
Table 13. Experimental results [122] (LUT counts).
BenchmarkP MP PH [122]
ex122194836
kirkman30262711
planet21165138
planet121165138
pma28232714
s126232412
s148824215237
s149428245039
s208292387
s420383687
s51039362215
s82040344736
s83241344735
sand27232916
styr26203118
Total440374522359
Percentage123%104%145%100%
Table 14. Experimental results [122] (the operating frequency, MHz).
Table 14. Experimental results [122] (the operating frequency, MHz).
BenchmarkP MP PH [122]
ex1141.43105.78158.28212.93
kirkman125.78107.81155.11174.73
planet122.01105.41124.31187.95
planet1122.01105.41124.31187.95
pma115.41114.49127.65186.22
s1124.49117.80132.85178.84
s1488127.80112.79131.77186.37
s1494122.79124.92135.73181.62
s208144.92128.04144.05209.36
s420148.04112.66152.65192.14
s510122.66111.42138.75192.87
s820121.4288.65133.36163.18
s83298.65115.57100.53184.69
sand135.57104.68146.78178.65
styr114.68116.47115.69181.22
Total1887.661671.902021.822798.72
Percentage67.4%59.7%72.2%100%
Table 15. Experimental results [122] (the consumed power, Watts).
Table 15. Experimental results [122] (the consumed power, Watts).
BenchmarkP MP PH [122]
ex13.5603.2903.0142.918
kirkman4.9223.5622.8112.476
planet3.2223.7561.7271.527
planet13.2223.7561.7271.527
pma4.7784.9154.2573.683
s13.6943.8133.5783.058
s14881.5862.4121.4491.785
s14941.7302.3981.4531.302
s2083.0053.5442.5742.248
s4201.6043.3841.5431.292
s5101.8831.9961.8781.682
s8202.4652.1611.7561.843
s8322.5152.5042.1931.732
sand2.5792.5782.3852.017
styr1.4671.5561.3071.112
Total42.23245.62533.65230.202
Percentage139.8%151%111.4%100%
Table 16. Experimental results [132] (LUT counts).
Table 16. Experimental results [132] (LUT counts).
BenchmarkAutoOne-HotJEDI P o Y P oT Y
Category 0
bbtas55589
dk175125810
dk2735479
dk512101091214
ex39991114
ex59991012
lion25268
lion96115810
mc47468
modulo12777911
shiftreg26246
Category 1
bbara1717101014
bbsse3337242629
beecount1919141416
cse4066363335
dk141627101214
dk15151612811
dk161534121113
donfile3131242124
ex2998810
ex41513121113
ex62436222123
ex745468
keyb4361403740
mark12323201921
opus2828222123
s27618668
s3862639222022
s8999911
sse3337302629
Categories 2–4
ex17074534044
kirkman4258393335
planet131131887882
planet1131131887882
pma9494867276
s16599615458
s14881241311088993
s14941261321109094
s1a4981433842
s208123110911
styr93120817078
tma4539393034
sand13213211499103
s42010319810
s5104848322223
s8208882685256
s8328079625052
Total18082104148913201448
Percentage124.86%145.30%102.83%91.16%100%
Table 17. Experimental results [132] (the maximum operating frequency, MHz).
Table 17. Experimental results [132] (the maximum operating frequency, MHz).
BenchmarkAutoOne-HotJEDI P o Y P oT Y
Category 0
bbtas204.16204.16206.12194.43201.47
dk17199.28167.00199.39147.22172.99
dk27206.02201.90204.18181.73190.32
dk512196.27196.27199.75175.63187.45
ex3194.86194.86195.76174.44187.26
ex5180.25180.25181.16162.56162.56
lion202.43204.00202.35185.74195.73
lion9205.30185.22206.38167.28183.45
mc196.66195.47196.87178.02182.95
modulo12207.00207.00207.13189.70201.74
shiftreg262.67263.57276.26248.79253.72
Category 1
bbara193.39193.39212.21183.32210.21
bbsse157.06169.12182.34159.24193.43
beecount166.61166.61187.32156.72194.47
cse146.43163.64178.12153.24182.62
dk14191.64172.65193.85162.78201.39
dk15192.53185.36194.87175.42206.74
dk16169.72174.79197.13164.16199.14
donfile184.03184.00203.65174.28206.83
ex2198.57198.57200.14188.95196.58
ex4180.96177.71192.83168.39196.18
ex6169.57163.80176.59156.42187.53
ex7200.04200.84200.60191.43204.16
keyb156.45143.47168.43136.49178.59
mark1162.39162.39176.18153.48182.37
opus166.20166.20178.32157.42186.34
s27198.73191.50199.13185.15201.26
s386168.15173.46179.15164.65192.34
s8180.02178.95181.23168.32191.32
sse157.06169.12174.63158.14171.18
Categories 2–4
ex1150.94139.76176.87164.32180.72
kirkman141.38154.00156.68155.36184.62
planet132.71132.71187.14174.68212.45
planet1132.71132.71187.14173.29212.45
pma146.18146.18169.83156.12192.43
s1146.41135.85157.16145.32145.32
s1488138.50131.94157.18141.27182.14
s1494149.39145.75164.34155.63186.49
s1a153.37176.40169.17166.36188.92
s208174.34176.46178.76166.42192.15
styr137.61129.92145.64118.02164.52
tma163.88147.80164.14137.48182.72
sand115.97115.97126.82120.07
s420173.88176.46177.25186.35218.62
s510177.65177.65198.32199.05221.19
s820152.00153.16176.58175.69195.73
s832145.71153.23173.78174.39199.18
Total8127.088061.228718.877873.369005.11
Percentage90.25%89.52%96.82%87.43%100%
Table 18. Experimental results [132] (area-time products, LUTs × n s ).
Table 18. Experimental results [132] (area-time products, LUTs × n s ).
BenchmarkAutoOne-HotJEDI P o Y P oT Y
Category 0
bbtas24.4924.4924.2641.1544.67
dk1725.0971.8625.0854.3457.81
dk2714.5624.7619.5938.5247.29
dk51250.9550.9545.0668.3374.69
ex346.1946.1945.9763.0674.76
ex549.9349.9349.6861.5273.82
lion9.8824.519.8832.3040.87
lion929.2359.3924.2347.8254.51
mc20.3435.8120.3233.7043.73
modulo1233.8233.8233.8047.4454.53
shiftreg7.6122.767.2416.0823.65
Category 1
bbara87.9187.9147.1254.5566.60
bbsse210.11218.78131.62163.28149.93
beecount114.04114.0474.7489.3382.27
cse273.17403.32202.11215.35191.65
dk1483.49156.3951.5973.7269.52
dk1577.9186.3261.5845.6053.21
dk1688.38194.5260.8767.0165.28
donfile168.45168.48117.85120.50116.04
ex245.3245.3239.9742.3450.87
ex482.8973.1562.2365.3266.27
ex6141.53219.78124.58134.25122.65
ex720.0024.9019.9431.3439.18
keyb274.85425.18237.49271.08223.98
mark1141.63141.63113.52123.79115.15
opus168.47168.47123.37133.40123.43
s2730.1993.9930.1332.4139.75
s386154.62224.84122.80121.47114.38
s849.9950.2949.6653.4757.50
sse210.11218.78171.79164.41169.41
Categories 2–4
ex1463.76529.48299.66243.43243.47
kirkman297.07376.62248.91212.41189.58
planet987.11987.11470.24446.53385.97
planet1987.11987.11470.24450.11385.97
pma643.04643.04506.39461.18394.95
s1443.96728.74388.14371.59399.12
s1488895.31992.88687.11630.00510.60
s1494843.43905.66669.34578.29504.05
s1a319.49459.18254.18228.42222.32
s20868.83175.6855.9454.0857.25
styr675.82923.65556.17593.12474.11
tma274.59263.87237.60218.21186.08
sand1138.231138.23898.91824.52719.58
s42057.51175.6850.7842.9345.74
s510270.19270.19161.36110.52103.98
s820578.95535.39385.09295.98286.11
s832549.04515.56356.77286.71261.07
Total12228.6114168.648844.908554.937877.31
Percentage155.24%179.87%112.28%108.60%100%
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Barkalov, A.; Titarenko, L.; Krzywicki, K. Structural Decomposition in FSM Design: Roots, Evolution, Current State—A Review. Electronics 2021, 10, 1174. https://0-doi-org.brum.beds.ac.uk/10.3390/electronics10101174

AMA Style

Barkalov A, Titarenko L, Krzywicki K. Structural Decomposition in FSM Design: Roots, Evolution, Current State—A Review. Electronics. 2021; 10(10):1174. https://0-doi-org.brum.beds.ac.uk/10.3390/electronics10101174

Chicago/Turabian Style

Barkalov, Alexander, Larysa Titarenko, and Kazimierz Krzywicki. 2021. "Structural Decomposition in FSM Design: Roots, Evolution, Current State—A Review" Electronics 10, no. 10: 1174. https://0-doi-org.brum.beds.ac.uk/10.3390/electronics10101174

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop