# Multi-objective Optimization Approach for VLSI Implementation of FIR Filter

Jitesh R Shinde<sup>1</sup>, Suresh Salankar<sup>2</sup>, Shilpa J Shinde<sup>3</sup>

<sup>1</sup>(Research Scholar, Nagpur University, India) <sup>2</sup>(Electronics & Telecommunication, G.H Raisoni College of Engineering, Nagpur, India) <sup>3</sup>(M.Tech Scholar, Nagpur University, India)

**Abstract:** This paper presents a new approach for the simultaneous multi-objective optimization of area, delay and power in the VLSI implementation of a digital finite impulse response (FIR) filter. It is based on the multiple constant multiplication (MCM) approach with partial product sharing and coefficient reuse in the multiplier module and/or a digit-serial architecture in the adder module, along with fixed-point arithmetic. Designs are synthesized using Synopsys Design Compiler in 90 nm and 45 nm process technologies. The synthesis results show that the proposed approach provides a good multi-objective optimization technique for digital FIR filters compared with previous findings published in the last decade.

*Keywords:* Direct form (DF) FIR filter, Transposed form, Multiple Constant Multiplication (MCM), Computation Sharing Multiplication (CSHM)

### I. Introduction

The modular design approach has become a recent trend in the development of complex VLSI systems, and most of these modular systems need to be optimized with respect to the basic constraints of VLSI design, i.e. area, power and speed, while maintaining an acceptable performance level [1].

Moreover, in the present decade, the growing awareness of environmental heating, the drive to deliver lighter mobile computers with longer battery life, and the emerging demand for portable and faster consumer electronic products have created the need for VLSI implementations in which all three basic constraints (area, power and delay) are optimized simultaneously [2].

The main hurdle in the VLSI implementation of digital circuits is that a design can be made area efficient, power efficient or speed efficient, but not all three simultaneously. Optimizing one parameter affects the others, as seen in the delay equation below:

$$T_d = \frac{C_L * V_{dd}}{I} \tag{1.1}$$

Where I is device current defined in terms of  $(V_{dd} - V_{th})$ .
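The trade-off expressed by equation (1.1) can be illustrated with a small numerical sketch. It assumes a square-law current model $I = k (V_{dd} - V_{th})^2$ and the usual switching power estimate $P = C_L V_{dd}^2 f$; the current model, the constant `K` and all numeric values are illustrative assumptions and are not taken from this work.

```python
def delay(c_load, vdd, vth, k):
    """Equation (1.1): T_d = C_L * V_dd / I, with an assumed square-law current."""
    current = k * (vdd - vth) ** 2  # assumed device current model
    return c_load * vdd / current


def dynamic_power(c_load, vdd, freq):
    """Common switching-power estimate P = C_L * V_dd^2 * f (an assumption here)."""
    return c_load * vdd ** 2 * freq


# Hypothetical values, chosen only to show the trend.
C_LOAD = 10e-15  # load capacitance in farads
VTH = 0.3        # threshold voltage in volts
K = 1e-4         # current factor in A/V^2
FREQ = 100e6     # switching frequency in hertz

for vdd in (1.2, 1.0, 0.8):
    t_ps = delay(C_LOAD, vdd, VTH, K) * 1e12
    p_uw = dynamic_power(C_LOAD, vdd, FREQ) * 1e6
    print(f"Vdd = {vdd:.1f} V: delay = {t_ps:.1f} ps, dynamic power = {p_uw:.2f} uW")
```

Under these assumptions, lowering the supply voltage reduces the dynamic power but increases the delay, which is the conflict that motivates a multi-objective treatment.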

Single-objective solutions to design optimization problems in VLSI circuit design must therefore be augmented to deal with this changing scenario. Thus, in today's era of VLSI circuit design, there is a need for multi-objective optimization of VLSI circuits.

Multi-objective optimization involves minimizing or maximizing multiple objective functions subject to a set of constraints. Example problems include analyzing design tradeoffs, selecting optimal product or process designs, or any other application where you need an optimal solution with tradeoffs between two or more conflicting objectives.

This paper describes the possible multi-objective optimization trade-off approaches that allow a designer to optimize the performance of the Multiply Accumulate (MAC) unit, specifically its multiplier and adder blocks, depending on the needs of the VLSI implementation of a digital application. This paper is an extended version of our previously published work [22, 23]. In this paper, results obtained on 90 nm process technology are presented and compared with FIR filter approaches presented in the last decade.

The proposed concept uses Multiple Constant Multiplication (MCM) to implement the multiplier blocks with wires only, and a digit-serial architecture for the adder block, in fixed-point arithmetic. These approaches give results that reliably match the theoretical results and the results obtained in MATLAB, which makes them valuable trade-off strategies for enhancing the performance of VLSI implementations of digital signal processing (DSP) applications with reasonable accuracy. A digital low pass finite impulse response (FIR) filter was selected as the case study in this research work. The rest of the paper is organized as follows:

Section II describes the traditional approach for VLSI implementation of digital FIR filters. Section III presents the multi-objective optimization approaches introduced in the VLSI implementation of the digital FIR filter. In Section IV, experimental implementations are discussed. Section V describes the simulation results of the various FIR filter structures obtained on 45 nm and 90 nm CMOS process technologies and their comparison. In Section VI, the results of the direct form FIR filters based on the approaches presented in this paper are compared with benchmark FIR filter implementation approaches from the last decade. The paper is concluded in Section VII.

### II. Digital FIR Filter

FIR filters are a basic component of any DSP application. The block diagram of a discrete-time direct form FIR digital filter is shown in figure 2.1, and its time domain representation is defined by the following equation:

$$y[n] = \sum_{i=0}^{N} b_{i}\, x[n-i] \tag{2.1}$$

where $x[n]$ is the input sequence, $y[n]$ is the output sequence, $b_i$ are the filter coefficients and $N$ is the filter order.



Fig. 2.1: A discrete-time FIR filter of order N
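As a minimal behavioral sketch of equation (2.1), the direct form sum can be written as a pair of nested loops. The function name, the coefficient values and the zero initial state are assumptions made for illustration; this is not the RTL used in the work.

```python
def fir_direct_form(b, x):
    """Evaluate the sum over i of b[i] * x[n - i], taking x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0
        for i, coeff in enumerate(b):
            if n - i >= 0:
                acc += coeff * x[n - i]  # one multiply and one add per tap
        y.append(acc)
    return y


# Hypothetical integer coefficients b_0..b_2 (a filter of order N = 2).
b = [1, 2, 3]
impulse = [1, 0, 0, 0, 0]
print(fir_direct_form(b, impulse))  # the impulse response reproduces the coefficients
```

Each output sample costs N + 1 multiplications and N additions, which is why the multiplier and adder modules dominate the cost of the filter.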

From equation 2.1, it is seen that the critical operations in an FIR filter are many multiplications and additions. Hence, for real-time signal processing, high speed and high throughput multiplier and adder modules are key to achieving high performance in a digital signal processing system or a digital communication system.

Therefore, the performance and optimization of the VLSI implementation of any DSP application greatly depends on how the digital FIR filters are implemented, and in turn on how their multiplier and adder modules are implemented. Hence, a symmetric digital low pass FIR filter was selected as the case study in this research work.

The previous works on digital FIR filter structures proposed in the last two decades were mostly single-objective [3, 4, 5, 6, 7 and 8], i.e. the objective was area, power or speed, or in very few cases a combination of two [13]. So, in this paper the focus is on how area, power and speed can be optimized simultaneously without affecting the functionality and performance of the circuit.

### III. Design Approach

Fixed-point arithmetic was used in the VLSI implementation of the digital FIR filter because of the reduced hardware cost and higher processing speed it offers over floating-point arithmetic [16].

The Multiple Constant Multiplication (MCM) concept with partial product sharing was used in the VLSI implementation of the multiplier module of the digital FIR filter. With the MCM approach, the shift-and-add loop of a traditional multiplier is replaced with a set of high speed wire-shifts whose outputs are then added in one quick step. The concept of MCM with and without partial product sharing is shown in figure 3.1. Without partial product sharing, the products 29x and 43x are implemented in three stages and six adder blocks are required, whereas with partial product sharing the products are implemented in just two stages and only four adder blocks are required.
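The adder counts quoted for 29x and 43x can be reproduced with a short behavioral sketch. The decompositions below (binary digits for the unshared case, shared partial products 3x and 5x for the shared case) are one choice consistent with the counts of six and four adders; they are an assumption and may differ in detail from figure 3.1. Shifts are free in hardware because they are only wiring.

```python
def without_sharing(x):
    """29x and 43x built independently from their binary digits (6 adders)."""
    p29 = (x << 4) + (x << 3) + (x << 2) + x  # 29 = 16 + 8 + 4 + 1, 3 adders
    p43 = (x << 5) + (x << 3) + (x << 1) + x  # 43 = 32 + 8 + 2 + 1, 3 adders
    return p29, p43, 6


def with_sharing(x):
    """29x and 43x built from the shared partial products 3x and 5x (4 adders)."""
    x3 = (x << 1) + x     # 3x, first stage
    x5 = (x << 2) + x     # 5x, first stage
    p29 = (x3 << 3) + x5  # 24x + 5x, second stage
    p43 = (x5 << 3) + x3  # 40x + 3x, second stage
    return p29, p43, 4


print(without_sharing(7))
print(with_sharing(7))
```

Both functions return the same products; only the number of adders and the depth of the adder chain differ.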

A digit-serial architecture was used in the VLSI implementation of the adder module because it gives higher throughput: each computation is carried out in only W/N clock cycles, instead of W as in the bit-serial case. In digit-serial computation, data words of size W bits are partitioned into digits of size N bits (the digit size N is a divisor of the word size W) and are processed serially one digit at a time, least significant digit first. A complete word is processed in P = W/N clock cycles, and consecutive words follow each other without a break. The number of bits per digit, or digit size, is the width of the digit-serial signal, and W/N is its length. The sequence of W/N clock cycles is the sample period [17, 18, & 19]. The RTL view of the digit adder module used in this work is shown in figure 3.2.



**Fig. 3.1:** Shift-adds implementations of 29x and 43x (a) without partial product sharing; (b) with partial product sharing



Fig.3.2: RTL view of digit adder

The digit size in this adder is two bits. In the first clock cycle, the two least significant bits (LSBs) of A (A0 & A1) and of B (B0 & B1) are presented to the array of full adder cells. The carry-in to the LSB position is zero. During the clock cycle, a ripple carry addition of the two digits is performed, producing a two-bit sum and a carry-out bit. This carry-out bit is delayed by one clock cycle and fed back to the LSB position, where in the next clock cycle it is combined with the next digit of the operands. Thus the operation is carried out two bits at a time [17].
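The digit-serial addition described above can be sketched as a cycle-by-cycle behavioral model. The function name and the 8-bit word size are assumptions for illustration; the ripple carry addition inside each digit is modeled with ordinary integer addition rather than individual full adder cells.

```python
def digit_serial_add(a, b, word_bits=8, digit_bits=2):
    """Add two unsigned words one digit per clock cycle, least significant digit first."""
    assert word_bits % digit_bits == 0, "digit size must divide the word size"
    mask = (1 << digit_bits) - 1
    cycles = word_bits // digit_bits  # P = W / N clock cycles per word
    carry = 0                         # carry-in to the first digit is zero
    result = 0
    for cycle in range(cycles):
        shift = cycle * digit_bits
        digit_a = (a >> shift) & mask
        digit_b = (b >> shift) & mask
        total = digit_a + digit_b + carry  # digit-wide addition with the delayed carry
        result |= (total & mask) << shift  # sum digit produced in this cycle
        carry = total >> digit_bits        # carry-out, used in the next cycle
    return result, carry, cycles


total, carry_out, cycles = digit_serial_add(183, 109)
print(total, carry_out, cycles)
```

With a word size of 8 bits and a digit size of 2 bits, a complete addition takes four cycles, compared with eight cycles for a bit-serial adder.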

### IV. Implementation

In the work, five LTI filters were designed viz.

- filter 0 (direct form)
- filter 1 (transposed form)
- filter 2 (Optimized direct form with only partial product sharing and MCM approach)
- filter 3 (TDF with digit serial adder approach)
- filter 4 (Optimized direct form with partial product sharing with MCM and digit adder approach)

and then their performances were compared with respect to area, dynamic power dissipation and propagation delay.

First, a simple direct form FIR filter structure was implemented in MATLAB using the FDA (Filter Design & Analysis) tool with the following specifications:

- Design Method: FIR equiripple
- Response type: Low pass
- Filter order: 17

From the FDA tool, the filter coefficients for the direct form FIR filter structure were obtained. These coefficients included negative values and were in floating-point format. In order to optimize the resources used (i.e. gates and hence area) at the RTL (Register Transfer Level), processing performance, system cost and ease of use, and since the dynamic range of the output is known, the floating-point coefficients were converted into fixed-point coefficients by multiplying them by 1000 and rounding the result. The negative coefficients were then converted into positive coefficients by taking the absolute value. The process is illustrated in figure 4.1.



Fig.4.1: Steps for Problem elimination
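The conversion steps of figure 4.1 (scale by 1000, round, take the absolute value) can be sketched as follows. The coefficient values are hypothetical placeholders, not the coefficients produced by the FDA tool, and rounding half away from zero is assumed to mimic the behavior of MATLAB's `round`.

```python
import math


def to_fixed_point(coeffs, scale=1000):
    """Scale, round to the nearest integer and drop the sign of each coefficient."""
    fixed = []
    for c in coeffs:
        # floor(|c| * scale + 0.5) equals |round(c * scale)| when rounding half away from zero
        fixed.append(math.floor(abs(c) * scale + 0.5))
    return fixed


# Hypothetical floating-point coefficients, for illustration only.
float_coeffs = [-0.0123, 0.0457, 0.1208, -0.0034]
print(to_fixed_point(float_coeffs))
```

A scale factor of 1000 keeps three decimal digits of each coefficient, so the resulting integers fit comfortably in the word lengths used at RTL level.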

The magnitude response of the direct form FIR filter using the values obtained from the FDA tool in MATLAB and the magnitude response using the values obtained after the process illustrated in figure 4.1 were found to be the same (figure 4.2).

Next, using these filter coefficients, the filter 0, filter 1, filter 2, filter 3 and filter 4 FIR filter structures were implemented using Active HDL, and their performance with respect to area, timing and dynamic power consumption was analyzed at the RTL level using Altera's Quartus tool and the Xilinx tool, and using the Synopsys Design Vision tool.

The partial RTL views of filter 0 (direct form with ripple carry adder (RCA) based adder module), filter 1 (transposed form with RCA based adder module), filter 2 (optimized direct form with only partial product sharing and MCM approach, with RCA based adder module) and filter 3 (TDF with digit serial adder approach and RCA based adder module), as implemented in the Active HDL software tool, are shown in figures 4.3, 4.4, 4.5 and 4.6 respectively. The multiplier modules of the MAC in filter 0, filter 1 and filter 3 were implemented using a generic (array) multiplier. The MCM based multiplier modules used in filter 2 and filter 4 were implemented using the structure shown in figure 3.1(b).



Fig 4.2: Magnitude Response of FIR Filter



Fig.4.3: Partial RTL view of filter 0 (direct form)



Fig.4.4: Partial RTL view of filter 1 (transposed form)



Fig.4.5: Partial RTL view of filter 2 (Optimized direct form with only partial product sharing and MCM approach)



Fig.4.6: Partial RTL view of filterdigitadder0 module of filter 3 (TDF with digit serial adder approach)

The drawback of the approach used in filter 0 and filter 1 is that the size of the adders, and hence of the latches, in the transposed or direct form structure increases after each stage, which adds to the area and power consumption of the circuit.

To address the problems faced in implementing the multiplier and adder modules of the direct form digital filter structure, a slight modification of the direct form FIR filter structure is suggested: coefficient reuse with the MCM approach in the multiplier module and a digit-serial architecture in the adder module. The DF FIR filter was redesigned with this approach, which considerably reduces the number of multiplier modules used in the design and improves the efficiency of the adder module.

From the initial design of the symmetric low pass digital FIR filter in MATLAB, it was observed that five filter coefficients were repeated two or three times. Hence, the multiplier modules corresponding to these filter coefficients were designed only once and reused elsewhere in the design, based on the approach shown in figure 3.1(b).
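The idea of coefficient reuse can be sketched as a lookup of constant products: each distinct coefficient value gets one constant multiplier, and every tap with that value refers to the same product. The sketch assumes that all constants multiply the same input sample, as in the MCM block of figure 3.1; the function name and the coefficient values are hypothetical.

```python
def mcm_products(coeffs, x):
    """Compute one product per distinct coefficient and share it among the taps."""
    distinct = sorted(set(coeffs))
    products = {c: c * x for c in distinct}        # one multiplier module per distinct value
    tap_products = [products[c] for c in coeffs]   # taps reuse the shared products
    return tap_products, len(distinct)


# Hypothetical symmetric fixed-point coefficients, for illustration only.
coeffs = [3, 12, 46, 121, 121, 46, 12, 3]
taps, multipliers = mcm_products(coeffs, 5)
print(taps, multipliers)
```

For this symmetric example, eight taps are served by four multiplier modules, which is the kind of saving that repeated coefficients make possible.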

The top module RTL view of filter 4 (optimized direct form with partial product sharing with MCM and digit adder approach) is shown in figure 4.7. The partial RTL view of the filterdigitadder2optmul1 module of filter 4 is shown in figure 4.8. In this design, the latches are removed from the upper branch and their function is covered by the shift and load process of the adderfsm1 module: when all bits have been shifted and accumulated in the shift register, the result is transferred to the output register on the load clock, which is equivalent to the latch process. The RTL view of the adder module is shown in figure 4.9. The finite state machine (adderfsm1 module) that controls the adder module of filter 4 is shown in figure 4.10. It generates the sel\_sumcount and sel control signals used in the digit adder module. The logic used in the implementation of the multiplier block based on the MCM approach is shown in figure 4.11.



Fig.4.7: RTL view of top module of filter 4 (Optimized direct form with partial product sharing with MCM and digit adder approach)



Fig.4.8: Partial RTL view of filter 4 (Optimized direct form with partial product sharing with MCM and digit adder approach)

![](_page_7_Figure_0.jpeg)

![](_page_7_Figure_2.jpeg)

![](_page_7_Figure_3.jpeg)

Fig.4.10: State diagram of adderfsm1 module in filter 4

```
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity mulh0 is
    Port ( a : in  std_logic_vector(7 downto 0);
           y : out std_logic_vector(13 downto 0));
end mulh0;

architecture mulh0 of mulh0 is
begin
    -- shift by 4: y = 16*a (2 pad bits + 8 data bits + 4 zero bits = 14 bits)
    y <= "00" & a & "0000";
end mulh0;
```

Fig.4.11: Logic used in implementation of multiplier module (mulh0) in filter 4
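The wire-only multiplication of figure 4.11 can be checked with a small bit-level model. It assumes a 4-bit left shift, as indicated by the comment in the figure (shift by 4, 16*X), and a 14-bit output word; the Python function is only a behavioral stand-in for the VHDL concatenation.

```python
def mulh0(a):
    """Place an 8-bit input in a 14-bit word with two leading and four trailing zeros."""
    assert 0 <= a < 256, "input must fit in 8 bits"
    bits = "00" + format(a, "08b") + "0000"  # concatenation only, no adders
    assert len(bits) == 14
    return int(bits, 2)


print(mulh0(5))
```

Because the constant 16 is a power of two, the product needs no adder at all: the multiplier reduces to routing the input bits to higher bit positions.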

### V. Results & Comparison

The compilation summary for total cell area and total dynamic power dissipation of filter 0, filter 1, filter 2, filter 3 and filter 4 of the digital FIR filter on 45 nm process technology (without wire load model (WLM)), obtained with the Synopsys Design Vision tool, is given in table 5.1. The Advanced HDL Synthesis Timing Report and the report stating the number of adders and multipliers inferred in the design at RTL level (Family: Virtex 5, Device: XC5VLX30, Speed grade: -3, Package: FF324), obtained with the Xilinx tool, are given in tables 5.2 and 5.3 respectively. Table 5.3 shows that the multiplier modules in filter 2 and filter 4 are implemented simply by means of wires because of the MCM approach. Table 5.3 also shows that the number of adder blocks is drastically reduced in filter 4 in comparison with filter 0, filter 1 and filter 2.

| Parameter | Filter 0 (DF + Array Multiplier + RCA) | Filter 1 (TF + Array Multiplier + RCA) | Filter 2 (MCM + Partial product sharing + RCA) | Filter 3 (TF + Array Multiplier + DA) | Filter 4 (MCM + Partial product sharing + DA) |
|---|---|---|---|---|---|
| Technology without WLM | 45 nm | 45 nm | 45 nm | 45 nm | 45 nm |
| Global operating voltage | 1.1 V | 1.1 V | 1.1 V | 1.1 V | 1.1 V |
| Total cell area in µm sq. | 28836.138528 | 34037.859675 | 20489.168487 | 13424.326756 | 988.815092 |
| Total dynamic power in mW | 5.8902 | 12.8132 | 2.3453 | 7.3128 | 0.1492223 |
| Maximum output required time after clock (RTL report) in nsec | 17.179 | 4.065 | 4.076 | 2.699 | 2.699 |
| Maximum combinational path delay (RTL report) | 17.440 | 7.336 | No path found | No path found | No path found |

Table 5.1: Compilation Summary of Filter 0, Filter 1, Filter 2, Filter 3 & Filter 4 of order N=17

Further, graphical analyses of filter 0, filter 1, filter 2, filter 3 and filter 4 of the digital FIR filter with respect to total cell area, total dynamic power and circuit latency, with reference to table 5.1, are shown in figures 5.1, 5.2 and 5.3 respectively.

| Name of filter | Minimum period in nsec | Maximum frequency in MHz | Minimum input arrival time before clock in nsec | Maximum output required time after clock in nsec | Maximum combinational path delay |
|---|---|---|---|---|---|
| Filter 4 | 2.650 | 377.333 | 1.801 | 2.699 | No path found |
| Filter 3 | 2.650 | 377.333 | 1.801 | 2.699 | No path found |
| Filter 2 | 1.809 | 552.884 | 1.376 | 4.076 | No path found |
| Filter 1 | 1.891 | 528.905 | 5.162 | 4.065 | 7.336 |
| Filter 0 | 0.585 | 1710.279 | 0.850 | 17.179 | 17.440 |

Table 5.2: Advanced HDL Synthesis Timing Report of Filter 0, Filter 1, Filter 2, Filter 3 & Filter 4 of order N=17
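The relation between the minimum period and the maximum frequency columns of table 5.2 can be checked with a one-line conversion. The values below are copied from the table; recomputed frequencies can differ slightly from the reported ones because the periods are rounded to three decimals.

```python
def max_frequency_mhz(min_period_ns):
    """Maximum clock frequency in MHz for a minimum period given in nanoseconds."""
    return 1000.0 / min_period_ns


# (minimum period in ns, reported maximum frequency in MHz), from table 5.2.
timing = {
    "Filter 4": (2.650, 377.333),
    "Filter 3": (2.650, 377.333),
    "Filter 2": (1.809, 552.884),
    "Filter 1": (1.891, 528.905),
    "Filter 0": (0.585, 1710.279),
}

for name, (period_ns, reported_mhz) in timing.items():
    print(f"{name}: computed {max_frequency_mhz(period_ns):.3f} MHz, reported {reported_mhz} MHz")
```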

| Name of filter | Adders | Multipliers |
|---|---|---|
| Filter 4 | Total: 6, 14-bit adder: 5, 4-bit adder: 1 | |
| Filter 3 | Total: 1, 4-bit adder: 1 | Total: 18, 8x8-bit multiplier: 18 |
| Filter 2 | Total: 24, 14-bit adder: 5, 31-bit adder: 18, 8-bit adder: 1 | |
| Filter 1 | Total: 18, 31-bit adder: 18 | Total: 18, 8x8-bit multiplier: 18 |
| Filter 0 | Total: 18, 31-bit adder: 18 | Total: 18, 8x8-bit multiplier: 18 |

Table 5.3: Advanced HDL Synthesis Report of Filter 0, Filter 1, Filter 2, Filter 3 & Filter 4 of order N=17

![](_page_9_Figure_2.jpeg)

Fig.5.1: Comparative analysis in terms of area

![](_page_9_Figure_5.jpeg)

Fig.5.2: Comparative analysis in terms of power

![](_page_9_Figure_7.jpeg)

Fig.5.3: Comparative analysis in terms of delay

The summaries of experimental results on 45 nm technology are given below:

- Area reduction in filter 4 vs. Filter 0: 96.52 %
- Area reduction in filter 4 vs. Filter 2: 95.29 %
- Area reduction in filter 4 vs. Filter 1: 97.09 %
- Area reduction in filter 4 vs. Filter 3: 92.64 %
- Area reduction in filter 3 vs. Filter 1: 61.23 %
- Dynamic Power reduction in filter 4 vs. Filter 0 : 97.44 %
- Dynamic Power reduction in filter 4 vs. Filter 2 : 93.64 %
- Dynamic Power reduction in filter 4 vs. Filter 1 : 98.84 %
- Dynamic Power reduction in filter 4 vs. Filter 3 : 97.96 %
- Dynamic Power reduction in filter 3 vs. Filter 1 : 42.8 %
- Circuit latency reduction in filter 4 vs. Filter 0 : 84.29 %
- Circuit latency reduction in filter 4 vs. Filter 2 : 33.79 %
- Circuit latency reduction in filter 4 vs. Filter 1 : 33.61 %
- Circuit latency reduction in filter 3 vs. Filter 1 : 63.21 %
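The reductions listed above follow the usual relative-reduction formula, which can be sketched as below using the values of table 5.1. The helper name is an assumption, and recomputed values can differ slightly from the listed figures depending on rounding and on which timing row of the table is taken as the baseline.

```python
def reduction_percent(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Values copied from table 5.1 (45 nm, without WLM), indexed by filter number.
area_um2 = {0: 28836.138528, 1: 34037.859675, 2: 20489.168487, 3: 13424.326756, 4: 988.815092}
power_mw = {0: 5.8902, 1: 12.8132, 2: 2.3453, 3: 7.3128, 4: 0.1492223}
output_time_ns = {0: 17.179, 1: 4.065, 2: 4.076, 3: 2.699, 4: 2.699}

for label, table in (("area", area_um2), ("power", power_mw), ("latency", output_time_ns)):
    for ref in (0, 1, 2, 3):
        value = reduction_percent(table[ref], table[4])
        print(f"{label}: filter 4 vs. filter {ref}: {value:.2f} %")
```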

The compilation summary for total dynamic power dissipation of filter 0, filter 1, filter 2, filter 3 and filter 4 of the digital FIR filter on 90 nm technology, obtained with the Synopsys Design Vision tool, is given in table 5.4.

| Parameter | Filter 0 | Filter 1 | Filter 2 | Filter 3 | Filter 4 |
|---|---|---|---|---|---|
| Technology with WLM | 90 nm | 90 nm | 90 nm | 90 nm | 90 nm |
| Global operating voltage | 1.2 V | 1.2 V | 1.2 V | 1.2 V | 1.2 V |
| Total dynamic power in mW | 0.9964143 | 1.9999 | 0.5749835 | 0.1052175 | 0.1052098 |
| Total area in µm sq. | 52881.060093 | 66098.457501 | 23439.235934 | 56187.065995 | 54596.246247 |
| Data required time (nsec) | 19.90 | 19.84 | 19.88 | 19.82 | 19.82 |
| Data arrival time (nsec) | -14.83 | -8.19 | -2.67 | -4.28 | -4.31 |
| Slack (nsec) | 5.07 | 11.64 | 17.21 | 15.54 | 15.51 |

Table 5.4: Compilation Summary of Filter 0, Filter 1, Filter 2, Filter 3 & Filter 4 of order N=17

The experimental results obtained with the 90 nm technology file on the Synopsys Design Vision tool showed that a digital FIR filter designed in fixed-point arithmetic with MCM, partial product sharing and coefficient reuse gives better results than the traditional direct form or transposed form digital FIR filter structures.
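The slack row of table 5.4 is the difference between the data required time and the data arrival time. A short check is sketched below; it assumes that the arrival times are listed with a negative sign, as in the table, so that slack is obtained by adding the two entries. Differences of about 0.01 ns from the reported values can come from rounding in the report.

```python
def slack_ns(required_ns, arrival_ns):
    """Timing slack, with the arrival time given as a negative number as in table 5.4."""
    return required_ns + arrival_ns


# (data required time, data arrival time, reported slack) in ns, from table 5.4.
timing_90nm = {
    "Filter 0": (19.90, -14.83, 5.07),
    "Filter 1": (19.84, -8.19, 11.64),
    "Filter 2": (19.88, -2.67, 17.21),
    "Filter 3": (19.82, -4.28, 15.54),
    "Filter 4": (19.82, -4.31, 15.51),
}

for name, (required, arrival, reported) in timing_90nm.items():
    print(f"{name}: computed slack {slack_ns(required, arrival):.2f} ns, reported {reported} ns")
```

A positive slack for every filter indicates that all five designs meet the timing constraint used in synthesis.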

### VI. Proposed FIR Filter Implementation Approach Validation

The FIR filter implementation approaches proposed in this paper were found to be better than the Computation Sharing High Speed Multiplier (CSHM) approach proposed in [24], published in 2015, whose authors claimed that their CSHM approach is better than the existing CSHM approach published in [25].

The proposed approaches were also found to be better than the CSHM approach proposed in [27], published in 2013, in which a floating-point arithmetic scheme was used in the implementation.

The proposed approaches were also found to be better than the approach presented in [26], published in 2015, which proposed a new design paradigm for programmable FIR filters that exploits the Extended Double Base Number System (EDBNS) to maximize sub-expression sharing among all filter coefficients of a given word length.

The benchmark experimental readings of [24, 25, 26 & 27] are compared with the filter implementations proposed in this paper, viz. filter 2, filter 3 and filter 4, in table 6.1.

| Multiplier size in FIR: 8×8 | Filter 2 | Filter 3 | Filter 4 | Existing CSHM [70] | [70] | [80] | Existing CSHM [71] | [79] |
|---|---|---|---|---|---|---|---|---|
| Filter order | N=17 | N=17 | N=17 | N=4 | N=4 | N=10 | N=10 | N=20 |
| Technology | 90 nm | 90 nm | 90 nm | 180 nm | 180 nm | 180 nm | 180 nm | 180 nm |
| Number of cells | 1426 | 3369 | 3319 | 1288 | 904 | 46521 | | 4774 |
| Total cell area in µm sq. | 22965.350 | 54799.257 | 53236.224 | 30093.9 | 24854.027 | 475442 | | |
| Total area in µm sq. | 23439.235 | 56187.065 | 54596.246 | | | | $5 \times 10^{6}$ | |
| Total dynamic power in mW | 0.5749835 | 0.1052175 | 0.1052098 | 2.098 | 1.299 | 41.77 | 286.6 | |
| Data arrival time or delay (nsec) | 2.67 | 4.28 | 4.31 | 9 | 8.151 | 44 | 5.7 | 2.59 |

Table 6.1: Comparison of proposed multipliers with approaches proposed in last decade

The above comparison of the FIR filter implementation approaches proposed in this paper with the recent benchmark results in [24, 25 & 27] has established that the proposed multiplier module (MUL3) is a good multi-objective solution for the VLSI implementation of various digital signal processing (DSP) and digital communication applications where area, power and speed are design constraints.

### VII. Conclusion

This paper has presented a new approach for multi-objective optimization of the VLSI implementation of a direct form FIR filter module, and it has shown how considerable optimization in area, power and delay can be achieved simultaneously without affecting the functionality of the filter module. This approach is based on the MCM approach with partial product sharing and coefficient reuse in the multiplier module and/or a digit-serial architecture in the adder module, along with fixed-point arithmetic.

Section VI has established that the approaches presented in this paper are a better alternative to the benchmark (CSHM based) FIR filter VLSI implementations proposed in the last decade [70, 71, 79 & 80]. This is because the approaches proposed in [70, 71, 79 & 80] suggest the transposed direct form FIR filter structure for optimization or do not consider the fact that, in the transposed structure, the latch size increases after each addition; hence the adder size increases after each stage, which may increase the area and power dissipation of the circuit. Further, coefficient reuse in the multiplier module and/or a digit-serial architecture in the adder module was not used in their implementations of the FIR filter module. Moreover, the multiplier module in these works was implemented using adders and shifters, whereas in the proposed concept the multiplier module is implemented simply by means of a set of high speed wires.

The suggested modifications were also found to be a better multi-objective optimization strategy for the VLSI implementation of digital FIR filters, without affecting the functionality of the circuit, when compared with other promising findings available in the literature [6, 8, 12, 13, and 20]. Similarly, the suggested VLSI implementation of the multiplier module based on the MCM approach, combined with the adder module based on the digit-serial architecture, was also found to give a better multi-objective solution than the promising finding available in [28].

Thus, the approach suggested in this paper may provide competitive solutions for realizing area, power and speed efficient optimized designs for VLSI circuits or DSP systems.

#### Reference

- [1]. Zhipeng Zeng, R. Sedaghat, A. Sengupta, "A novel framework of Optimizing modular computing architecture for multi objective VLSI designs", Microelectronics (ICM), 2009 International IEEE Conference, Marrakech, ISBN: 978-1-4244-5814-1, 19-22 Dec. 2009.
- [2]. Moiseev, K., Wimer S., and Kolodny A., "Power-Delay optimization in VLSI microprocessors by wire spacing", ACM Transactions on Design Automation of Electronic Systems, Vol. 14, No. 4, Article 55, Pub. date: August 2009.
- [3]. A. E. Cetin, O.N. Gerek, Y. Yardimci, "Equiripple FIR filter design by the FFT algorithm," IEEE Signal Processing Magazine, pp. 60-64, March 1997.
- [4]. K. Johansson, O. Gustafsson, and L. Wanhammar, "Multiple Constant Multiplication for Digit-Serial Implementation of Low Power FIR Filters," WSEAS Transactions on Circuits and Systems, vol. 5, no. 7, pp. 1001-1008, 2006.
- [5]. Y. Voronenko and M. Püschel, "Multiplierless Multiple Constant Multiplication," ACM Transactions on Algorithms, vol. 3, no. 2, 2007.
- [6]. H. Nguyen and A. Chatterjee, "Number-Splitting with Shift-and-Add Decomposition for Power and Hardware Optimization in Linear DSP Synthesis," *IEEE Trans. on VLSI, vol. 8, no. 4, pp. 419-424, 2000.*
- [7]. L. Aksoy, C. Lazzari, E. Costa, P. Flores, and J. Monteiro, "Efficient shift-adds design of digit-serial multiple constant multiplications," in Proc. Great Lakes Symp. VLSI, 2011, pp. 61-66.
- [8]. A. Dempster and M. Macleod, "Use of Minimum-Adder Multiplier Blocks in FIR Digital Filters," IEEE TCAS II, vol. 42, no. 9, pp. 569-577. 1995.
- [9]. Kyung-Saeng Kim and Kwyro Lee, "Low Power & Area Efficient FIR filter Implementation Suitable for Multiple taps", IEEE Transactions on VLSI Systems, Vol. 11, No. 1, February 2003.
- [10]. Anantha P. Chandrakasan, S. Sheng and R. Brodersen, "Low Power CMOS Digital Design", IEEE Journal of Solid State Circuits, Vol. 27, No. 4, April 1992.
- [11]. Phil Lapsley, Jeff Bier, Amit Shoham & Edward A. Lee, "DSP Processor Fundamentals Architectures & Features", Chapter 9, Page no. 99, A Volume in the IEEE Press Series on Signal Processing, Wiley-Interscience Publication, ISBN no. 978-81-265-2354-
- [12]. N. Sankarayya, K. Roy and D. Bhattacharya, "Optimizing Computations in Transposed Direct Form Realization of Floating-Point LTI FIR Systems", Computer-Aided Design, 1997, Digest of Technical Papers, IEEE/ACM International Conference, 1997.
- [13]. Levent Aksoy, Cristiano Lazzari, Eduardo Costa, Paulo Flores and Jose Monteiro, "Optimization of Area in Digit-Serial Multiple Constant Multiplications at Gate-Level", Circuits and Systems (ISCAS), IEEE International Symposium, Rio de Janeiro, 2011.

- [14]. Kyung-Saeng Kim and Kwyro Lee, "Low Power & Area Efficient FIR filter Implementation Suitable for Multiple taps", *IEEE Transactions on VLSI Systems, Vol. 11, No. 1, February 2003.*
- [15]. Suhap Sahin, Yasar Becerikli, and Suleman Yazici, "Neural Networks Implementation in Hardware using FPGAs", ICONIP 2006, Part III, LNCS 4234, pp.1105, 2006, © Springer-Verlag Berlin Heidelberg 2006.
- [16]. A.Nagoor Kani, "Digital Signal Processing", Chapter 8, pp.8.1-8.16, Second Edition, Tata McGrawHill.
- [17]. Richard Hartley, Keshab Parhi, "Digit-Serial Computation", Chapter1, Page no.14-16, Springer Science & Business Media, ISBN 978-1-4615-2327-7, First edition, 1995.
- [18]. Keshab Parhi, Ching-Yi Wang, "Digit Serial DSP-architectures", International Conference on Application Specific Array Processor, CH2920-7/90/0000/034@1990, IEEE.
- [19]. Javier Valls, Marcos Peiro, Trini Sansaloni & Eduardo Boemo, "A Study About FPGA-Based Digital Filters", Departamento de Ingeniería Electrónica, Universidad Politécnica de Valencia, Camino de Vera s/n, 46071 Valencia, Spain; Escuela Técnica Superior de Ingeniería Informática, Universidad Autónoma de Madrid, Ctra. Colmenar Km.15, 28049 Madrid, Spain.
- [20]. A. E. Cetin, O.N. Gerek, Y. Yardimci, "Equiripple FIR filter design by the FFT algorithm," *IEEE Signal Processing Magazine, pp.* 60-64, March 1997.
- [21]. Shahzad Asif and Yinan Kong, "Low-Area Wallace Multiplier", Hindawi Publishing Corporation, VLSI Design, Volume 2014, Article ID 343960.
- [22]. Jitesh Shinde, S. Salankar, "Multi-objective Optimization for VLSI Circuits", IEEE International Conference on Computational Intelligence & Communication Networks, November 14-16, 2014, Kolkata, India.
- [23]. Jitesh Shinde, S. Salankar, "Optimal Multi-objective Approach for VLSI Implementation of Digital FIR Filters", International Journal of Engineering Research & Technology (IJERT), Vol. 3, pg. no. 2470-74, Issue 2, February 2014.
- [24]. S. Umadevi, T. Vigneswaran, S. Kadam Vinay and V. Seerengasamy, "A Novel Less Area Computation Sharing High Speed Multiplier Architecture for FIR Filter Design", *Research Journal of Applied Sciences, Engineering and Technology 10(7): 816-823, July 10, 2015.*
- [25]. J.Park, W.Jeong, H.M Meimand, Y.Wang and Kaushik Roy, "Computation Sharing Programmable FIR Filters for Low Power & High Performance Applications", *IEEE Journal of Solid State Circuits, Vol 39, No 2, February 2004.*
- [26]. Jiajia Chen, Chip-Hong Chang, Feng Feng, Weiao Ding and Jiatao Ding, "Novel Design Algorithm for Low Complexity Programmable FIR Filters Based on Extended Double Base Number System", IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 62, no. 1, pp. 224-233, Jan. 2015.
- [27]. Sivanantham S, Jagannadha Naidu K, Balamurugan S, Bhuvana Phaneendra D, "Low Power Floating Point Computation Sharing Multiplier for Signal Processing Applications", *International Journal of Engineering and Technology (IJET)*, Vol 5 No 2 Apr-May 2013.
- [28]. Shahzad Asif and Yinan Kong, "Low-Area Wallace Multiplier", Hindawi Publishing Corporation, VLSI Design, Volume 2014, Article ID 343960.