# Design and Implementation of Optimized Area and PDP Multiplier for High Speed Digital Circuit Applications

### M. Kathirvelu

Abstract: Low-power, high-speed multipliers are needed for high-speed switching applications such as Digital Signal Processing (DSP), microprocessors and filters. Various multiplier architectures have been implemented by researchers. In the 8-bit array multiplier, partial products are obtained through AND gates and added sequentially through Full Adders and Half Adders. Because each stage of the array multiplier depends on the partial sums computed by the previous stage, the delay to produce the final output is high. In the proposed architecture, partial products are added in parallel to obtain the product with less delay. The power dissipation of the full adder is minimized by implementing it in CMOS technology. The designed 8-bit multiplier is implemented and simulated with the Cadence Virtuoso tool in 90nm technology, and its power, speed and area are analyzed.

Index Terms: Full Adder, Half Adder, Multiplier, Carry lookahead adder.

### I. INTRODUCTION

A multiplier is one of the primary hardware blocks in most digital circuits and is required to multiply any two binary numbers. It plays a major role in Digital Signal Processing, microprocessors, filters and ALUs. Various full adder (FA) structures and other adder circuits have been used to reduce power and delay in order to design an improved multiplier in CMOS technology. Several optimized multiplier algorithms exist, among them the Booth, Dadda, Vedic and Wallace tree multipliers. In these optimized algorithms the delay is lower than in earlier designs, but it still has to be reduced significantly, which helps in large circuits where multipliers play a key role. In the proposed work, a multiplier is designed and implemented in 90nm technology, adding the partial products in parallel to reduce the delay. Considerable research has been carried out on designing full adders with high speed and low power. In the proposed structure, a CMOS-based 10T FA is used to build the multiplier.

#### Revised Manuscript Received on December 22, 2018.

**M. Kathirvelu**, Electronics and Communication Engineering, GMR Institute of Technology, Rajam, Andhra Pradesh, India.

#### **II. EXISTING METHOD**

The array multiplier has a regular structure and is easy to design. It is used for multiplication of unsigned numbers using FAs and half adders (HA). The FAs and HAs are connected horizontally, vertically and diagonally to obtain the sum of the partial products. The partial products are aligned by simple routing, which does not require any logic. Each row of adders adds a partial product to the total, creating a new partial sum and a set of carries.

A 4-bit multiplier based on the Dadda algorithm using FAs and HAs is analyzed; the FAs and HAs are designed using pass transistor logic and complementary MOS process technology to limit propagation delay and power dissipation [1]. Several multipliers are compared in terms of delay, power, complexity and area, because in VLSI implementations a good multiplier is essential to a low-power, fast processor [2]. A comparison of the Wallace tree, Booth and Dadda multipliers concluded that the Wallace tree multiplier has lower power dissipation. Power minimization is vital for a number of reasons, ranging from the growing demand for portable computing to the problem of hot chips caused by increasing clock frequencies and device counts. The Wallace tree multiplier is the fastest among the multipliers compared, whereas the Booth multiplier is the best choice when speed is not critical [4]. Multipliers are becoming one of the most important basic building blocks in DSP and several other applications, and the speed of the multiplier is one of the most critical parameters of a VLSI system. A proposed 8x8 Dadda multiplier reduces the computation of the partial products by using a carry lookahead adder in the last stage of addition instead of a ripple carry adder (RCA). In that work, the latency is improved by one third relative to the regular Wallace tree multiplier and by one fifth relative to the ordinary Dadda tree multiplier. The results showed that the proposed architecture is more efficient in terms of speed than the existing one, with a slight increase in power [5]. A modification of the Wallace/Dadda multiplier uses carry lookahead adders (CLA) instead of full adders to implement the reduction of the bit product matrix into the two numbers that are summed to make the product. Each CLA reduces up to 9 partial products in the same time, which leads to fewer stages than a conventional Wallace/Dadda multiplier [6].

High power dissipation raises the temperature of the chip and affects the performance of the design. Numerous techniques at various levels of the design process have been proposed to reduce power dissipation. Speed is an essential requirement of high-performance systems. An 8\*8 tree multiplier combining the Wallace and Dadda methods has been proposed; by using both methods, the power, delay and area of the multiplier are reduced [8]. Arithmetic operations are performed by the Arithmetic and Logical Unit (ALU), which is a fundamental unit of a processor. Reducing delay reduces the overall computation time. A Carry Propagating Adder (CPA) takes a long time because the carry has to propagate to the last adder [10]. The array multiplier delay is determined by the time taken by the signals to propagate through the AND gates, FAs and HAs.

Retrieval Number: F11850476S519/19©BEIESP

Published By: Blue Eyes Intelligence Engineering & Sciences Publication

#### III. 8-BIT ARRAY MULTIPLICATION

An 8 x 8 array multiplier takes two 8-bit inputs and generates a 16-bit output. Let the multiplier and multiplicand be A0 - A7 and B0 - B7, and let the outputs be P0 - P15. The array multiplication process is shown in figure 1.

|  |  |  |  |  |  |  |  | A7 | A6 | A5 | A4 | A3 | A2 | A1 | A0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  |  |  |  |  |  |  | × | B7 | B6 | B5 | B4 | B3 | B2 | B1 | B0 |
|  |  |  |  |  |  |  |  | A7B0 | A6B0 | A5B0 | A4B0 | A3B0 | A2B0 | A1B0 | A0B0 |
|  |  |  |  |  |  |  | A7B1 | A6B1 | A5B1 | A4B1 | A3B1 | A2B1 | A1B1 | A0B1 |  |
|  |  |  |  |  |  | A7B2 | A6B2 | A5B2 | A4B2 | A3B2 | A2B2 | A1B2 | A0B2 |  |  |
|  |  |  |  |  | A7B3 | A6B3 | A5B3 | A4B3 | A3B3 | A2B3 | A1B3 | A0B3 |  |  |  |
|  |  |  |  | A7B4 | A6B4 | A5B4 | A4B4 | A3B4 | A2B4 | A1B4 | A0B4 |  |  |  |  |
|  |  |  | A7B5 | A6B5 | A5B5 | A4B5 | A3B5 | A2B5 | A1B5 | A0B5 |  |  |  |  |  |
|  |  | A7B6 | A6B6 | A5B6 | A4B6 | A3B6 | A2B6 | A1B6 | A0B6 |  |  |  |  |  |  |
|  | A7B7 | A6B7 | A5B7 | A4B7 | A3B7 | A2B7 | A1B7 | A0B7 |  |  |  |  |  |  |  |
| P15 | P14 | P13 | P12 | P11 | P10 | P9 | P8 | P7 | P6 | P5 | P4 | P3 | P2 | P1 | P0 |

Figure 1: Array multiplication
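The partial product array of figure 1 can be illustrated with a short Python sketch. It is a behavioural model only, not the circuit simulated in this work; the function names `partial_products` and `sum_partial_products` are hypothetical and chosen for illustration. Each bit AiBj is the AND of one multiplicand bit and one multiplier bit and carries the column weight 2^(i+j).

```python
def partial_products(a: int, b: int, width: int = 8) -> list[list[int]]:
    """Return rows[j][i] = A_i AND B_j, one row per multiplier bit B_j."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> j) & 1 for j in range(width)]
    return [[ai & bj for ai in a_bits] for bj in b_bits]


def sum_partial_products(rows: list[list[int]]) -> int:
    """Add every A_i B_j bit at its column weight 2**(i + j)."""
    total = 0
    for j, row in enumerate(rows):
        for i, bit in enumerate(row):
            total += bit << (i + j)
    return total


# Example operands (arbitrary values chosen for illustration).
rows = partial_products(0b10110101, 0b01101110)
product = sum_partial_products(rows)
```

For 8-bit operands the sketch produces 8 rows of 8 bits, that is, the 64 partial products, and their weighted sum equals the ordinary integer product.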

In the array multiplier, most of the inputs for computing a partial sum depend on the outputs of previous adders. Hence, many full adders are idle until the previous output is received, and the delay to compute the final product is high. To reduce this delay, the inputs have to be processed in parallel in the proposed multiplier. The partial products are added simultaneously, which reduces the number of full adder delays. The regular structure of the 8-bit array multiplier is shown in figure 2. The final stage of the array multiplier is a ripple carry adder (RCA) structure that computes the final output. The ripple carry adder has the highest delay for computing the output, and hence it is proposed to replace the final stage with a carry look ahead adder to decrease the delay further.



Figure 2: Architecture of 8\*8 array multiplier
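The dependence of each adder on the previous carry, which makes the ripple carry final stage slow, can be seen in a minimal behavioural sketch. The helper names `full_adder` and `ripple_carry_add` are assumptions made for this illustration and do not describe the transistor-level 10T adder used in the design.

```python
def full_adder(x: int, y: int, c_in: int) -> tuple[int, int]:
    """One-bit full adder: return (sum, carry_out)."""
    p = x ^ y
    return p ^ c_in, (x & y) | (p & c_in)


def ripple_carry_add(x: int, y: int, width: int = 8) -> int:
    """Add two width-bit numbers one bit position at a time."""
    carry = 0
    result = 0
    for i in range(width):
        # Bit i cannot be resolved before the carry from bit i-1 is known,
        # so the worst-case delay grows linearly with the width.
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result | (carry << width)  # the final carry is the extra top bit
```

The loop is inherently sequential: every iteration consumes the carry produced by the one before it, which mirrors the chain of full adder delays in the hardware.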

#### IV. PROPOSED METHODOLOGY

#### A. PROPOSED METHOD WITH RIPPLE CARRY ADDER:

Though there are different algorithms to reduce the delay in the multiplier, the proposed methodology gave the lowest delay among the designs evaluated here. The propagation delays created during the addition of the partial products are reduced. The proposed 8x8 multiplier has a total of 64 partial products, so the height is eight. In the proposed multiplier, partial products are added in parallel with the help of FAs and HAs. The full adder is designed in 10-transistor CMOS technology to reduce the power consumption. In the regular array multiplier, only one adder performs an addition at a time and the remaining adders wait for the results of the previous computation, which increases the delay of the multiplication output. The carry and sum of the previous outputs are used by the successive stages in the regular array multiplier.

The stages of computation are shown in figure 3 to figure 9. In the first stage of the proposed multiplier, 17 adders compute their results simultaneously and their outputs are propagated to the next stage. In the second stage, 11 full adders perform additions in parallel, and the third stage performs 8 computations in parallel. The fourth and fifth stages each compute only 3 parallel additions. The proposed multiplier requires only 6 full adder delays to compute an 8-bit multiplication. The final stage requires an 8-bit ripple carry adder to compute the last 9 bits of the output. The delay of the proposed multiplier is further reduced by replacing the ripple carry structure with a carry look ahead adder. An algorithm is developed for the proposed structure, and the design is coded in Verilog and simulated in Cadence. The height is reduced from eight to two.



Further, all the remaining stages similarly reduce the height of the tree, up to the sixth stage. The schematic of the proposed method with ripple carry adder is shown in figure 10.
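The idea of reducing the partial product height from eight to two with parallel full and half adders can be sketched as follows. This is a generic Wallace-style column compression written for illustration; it does not reproduce the adder schedule of figures 3 to 9 (17, 11, 8, 3 and 3 adders per stage), and its stage count need not match the six stages reported here. The names `reduce_to_two_rows`, `full_adder` and `half_adder` are hypothetical.

```python
def full_adder(x: int, y: int, z: int) -> tuple[int, int]:
    """Compress three bits of equal weight into (sum, carry)."""
    return x ^ y ^ z, (x & y) | (y & z) | (x & z)


def half_adder(x: int, y: int) -> tuple[int, int]:
    """Compress two bits of equal weight into (sum, carry)."""
    return x ^ y, x & y


def reduce_to_two_rows(a: int, b: int, width: int = 8) -> tuple[int, int, int]:
    """Return (row0, row1, stages) with row0 + row1 == a * b."""
    # columns[c] holds the bits of weight 2**c that still have to be added.
    columns = [[] for _ in range(2 * width)]
    for i in range(width):
        for j in range(width):
            columns[i + j].append(((a >> i) & 1) & ((b >> j) & 1))

    stages = 0
    while max(len(col) for col in columns) > 2:
        nxt = [[] for _ in range(len(columns) + 1)]
        # All adders of one stage work on the previous stage's bits only,
        # so in hardware they operate in parallel.
        for c, col in enumerate(columns):
            k = 0
            while len(col) - k >= 3:
                s, carry = full_adder(col[k], col[k + 1], col[k + 2])
                nxt[c].append(s)
                nxt[c + 1].append(carry)
                k += 3
            rest = col[k:]
            if len(rest) == 2 and len(col) > 2:
                s, carry = half_adder(rest[0], rest[1])
                nxt[c].append(s)
                nxt[c + 1].append(carry)
            else:
                nxt[c].extend(rest)  # one or two bits pass through unchanged
        columns = nxt
        stages += 1

    row0 = sum(col[0] << c for c, col in enumerate(columns) if len(col) > 0)
    row1 = sum(col[1] << c for c, col in enumerate(columns) if len(col) > 1)
    return row0, row1, stages
```

The two rows returned by the sketch correspond to the operands of the final adder stage; adding them with an RCA or a CLA yields the product.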



Figure 3: First stage of computations



Figure 4: Second stage of computations



Figure 5: Third stage of computations





Figure 7: Fifth stage of computations

A7B7 FS27 FS34 FS38 FS37 FS40 A7B1 FS30 FS39 HS5 HS4 HS3 HS2 HS1 A0B0 FC27 FC34 FC38 FC37 FC40 FS36 FC29 HC5 FC39

Figure 8: Sixth stage of computations

A7B7 FS27 FS34 FS38 FS37 FS40 HS7 FS41 HS6 HS5 HS4 HS3 HS2 HS1 A0B0

FC27 FC34 FC38 FC37 FC40 HC7 FC41 HC6

Figure 9: Seventh stage of computations



Figure 10: Schematic diagram for proposed method with ripple carry adder

#### B. PROPOSED METHOD WITH CARRY LOOK AHEAD ADDER

The RCA requires the carry of the previous stage to calculate the output of the next stage, and hence takes more time to produce the output. The delay is reduced by replacing the RCA structure with a carry look ahead adder (CLA). This circuit performs a pre-processing step in which the carry look ahead structure produces the propagate and generate terms. The sum and carry of the CLA are given in equations 1 to 4.

$$P_i = X_i \oplus Y_i \tag{1}$$

$$G_i = X_i Y_i \tag{2}$$

$$S_i = P_i \oplus C_{i-1} \tag{3}$$

$$C_i = G_i + P_i C_{i-1} \tag{4}$$

The CLA is a fast parallel adder that uses the concept of generating and propagating carries. Using the propagate and generate terms, the carry look ahead adder produces the carries for the addition without waiting for them to ripple through the preceding stages. Hence, the delay is reduced considerably compared to the RCA. The schematic of the proposed multiplier with the carry look ahead structure is shown in figure 11.
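Equations 1 to 4 can be checked with a small behavioural sketch. The function name `carry_lookahead_add` and the input carry parameter `c_in` are assumptions made for this illustration. Each carry is written only in terms of the propagate terms, the generate terms and the input carry, by unrolling equation 4; the software loop is sequential, whereas the hardware evaluates these terms in parallel.

```python
def carry_lookahead_add(x: int, y: int, width: int = 8, c_in: int = 0) -> int:
    """Add two width-bit numbers using propagate/generate terms."""
    xb = [(x >> i) & 1 for i in range(width)]
    yb = [(y >> i) & 1 for i in range(width)]
    p = [xi ^ yi for xi, yi in zip(xb, yb)]  # equation (1)
    g = [xi & yi for xi, yi in zip(xb, yb)]  # equation (2)

    def carry_out(i: int) -> int:
        """Equation (4) unrolled: C_i from P, G and c_in only."""
        carry, prop = 0, 1
        for k in range(i, -1, -1):
            carry |= g[k] & prop  # G_k propagated through bits k+1..i
            prop &= p[k]
        return carry | (c_in & prop)

    result = 0
    for i in range(width):
        carry_into_i = c_in if i == 0 else carry_out(i - 1)
        result |= (p[i] ^ carry_into_i) << i  # equation (3)
    return result | (carry_out(width - 1) << width)
```

Because no carry expression refers to another carry, none of the sum bits has to wait for a neighbouring bit position, which is the property the final stage of the proposed multiplier exploits.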






Figure 11: Schematic diagram for proposed method with Carry look ahead adder

### **V. SIMULATED RESULTS & DISCUSSION**

The designed 8-bit array multiplier and the proposed multipliers are simulated in 90nm technology using the Cadence tool. The proposed multiplier is implemented with an RCA and with a CLA at the final stage. The multiplier inputs A0 to A7 and B0 to B7 are applied to the AND gates to produce the partial products. The partial products are added using the designed full adders and half adders. The simulated waveform is shown in figure 12. The power consumed, delay and area occupied by the various structures are given in table 1. The performance metrics (power, delay, area and power delay product (PDP)) are plotted in figures 13 to 16.

| Architecture | Timings (ps) | Total power (mW) | Area (nm<sup>2</sup>) | PDP (×10<sup>-12</sup> Ws) |
|--------------|--------------|------------------|-----------------------|----------------------------|
| Array Multiplier | 14930 | 66.05 | 1323.82 | 0.986 |
| Proposed Multiplier with RCA | 3949.3 | 60.99 | 1317.76 | 0.240 |
| Proposed Multiplier with CLA | 3514.1 | 65.88 | 1353.34 | 0.231 |
**Table 1: Performance of various Multipliers** 
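The PDP column is the product of the delay and power columns. A minimal sketch, using the values of Table 1 as printed and a hypothetical helper `pdp_mantissa`, reproduces the tabulated mantissas. With picoseconds and milliwatts taken literally, the raw product is in units of 10⁻¹⁵ W·s, so dividing by 10⁶ gives multiples of 10⁻⁹ W·s; the sketch only checks that the mantissas agree with the tabulated column and does not try to settle the power-of-ten prefix.

```python
# (delay, power, tabulated PDP) copied from Table 1 as printed.
TABLE_1 = {
    "Array Multiplier": (14930.0, 66.05, 0.986),
    "Proposed Multiplier with RCA": (3949.3, 60.99, 0.240),
    "Proposed Multiplier with CLA": (3514.1, 65.88, 0.231),
}


def pdp_mantissa(delay: float, power: float) -> float:
    """Delay times power, scaled by 1e-6 to match the table's mantissa."""
    return delay * power / 1e6


computed = {
    name: pdp_mantissa(delay, power)
    for name, (delay, power, _) in TABLE_1.items()
}

for name, (_, _, tabulated) in TABLE_1.items():
    print(f"{name}: computed {computed[name]:.4f}, tabulated {tabulated:.3f}")
```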



Figure 12: Simulated waveform of the multiplier

Compared with the array multiplier, the proposed multiplier requires 75% less delay with the RCA and 76.4% less delay with the CLA.



Figure 13: Comparison of delay of different multipliers

Compared with the array multiplier, the proposed multiplier consumes 8% less power with the RCA and 3% less power with the CLA.



Figure 14: Comparison of Total Power of different multipliers

In terms of power delay product, the proposed multiplier with RCA and with CLA is superior to the array multiplier by 75% and 76%, respectively.






Figure 15: Comparison of Power Delay Product of different multipliers

In terms of area, the array multiplier requires 2% less than the proposed multiplier with the CLA.



Figure 16: Comparison of Areas of different multipliers
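The comparisons in this section are relative changes with respect to the array multiplier. They can be recomputed from Table 1 with a short sketch; the dictionary layout and the helper `percent_change` are assumptions made for illustration. Values obtained this way come from the table as printed and need not coincide exactly with the percentages quoted in the text.

```python
# Values copied from Table 1 as printed (units as given there).
TABLE_1 = {
    "array": {"delay": 14930.0, "power": 66.05, "area": 1323.82, "pdp": 0.986},
    "rca": {"delay": 3949.3, "power": 60.99, "area": 1317.76, "pdp": 0.240},
    "cla": {"delay": 3514.1, "power": 65.88, "area": 1353.34, "pdp": 0.231},
}


def percent_change(new: float, reference: float) -> float:
    """Signed change of new relative to reference, in percent."""
    return 100.0 * (new - reference) / reference


# Negative values mean the proposed design is lower than the array multiplier.
changes = {
    name: {
        metric: percent_change(value, TABLE_1["array"][metric])
        for metric, value in row.items()
    }
    for name, row in TABLE_1.items()
    if name != "array"
}

for name, row in changes.items():
    print(name, {metric: round(value, 2) for metric, value in row.items()})
```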

#### VI. CONCLUSION

The proposed 8-bit multiplier has been implemented with parallel computation of partial products. The designed circuits are simulated in 90nm technology using the Cadence Virtuoso software. The proposed multiplier is simulated with RCA and CLA structures at the final stage, and its performance is compared with the array multiplier. The power consumption of the proposed multiplier is up to 8% lower than that of the array multiplier, and its delay is up to 76% lower. The power delay product (PDP) of the proposed structure is 76% lower than that of the array multiplier, at the cost of a 2% increase in area. Hence, the proposed multiplier is an optimised structure for high-speed applications.

#### REFERENCES

1. Tariq Kamal, Muhammad Hussnain Riaz, "Low Power 4x4 Bit Multiplier Design using Dadda Algorithm and Optimized Full Adder", International Bhurban Conference on Applied Sciences & Technology, DOI: 10.1109/IBCAST.2018.8312254, March 2018.
2. Sumit Vaidya, Deepak Dandekar, "Delay-Power performance comparison of multiplier in VLSI circuit design", International Journal of Computer Networks & Communications (IJCNC), vol. 2, No. 4, July 2010.
3. M. Kathirvelu, "Design of Floor Plan based Optimized PDP Multiplier using Novel CMOS CPL Hybrid Full Adder", Asian Journal of Research in Social Sciences and Humanities, ISSN 2249-7315, Volume 6, Issue No 9, pp. 430-441, Sep 2016.
4. Deepika Purohit, Himanshu Joshi, "Comparative Study and Analysis of Fast Multipliers", International Journal of Engineering and Technical Research (IJETR), ISSN: 2321-0869, Volume-2, Issue-7, July 2014.
5. T. Arunachalam and S. Kirubaveni, "Analysis of High Speed Multipliers", International Conference on Communication and Signal Processing, April 3-5, 2013, India.
6. Wesley Chu, Ali I. Unwala, Pohan Wu, and Earl E. Swartzlander, "Implementation of a High Speed Multiplier Using Carry Lookahead Adders", ISSN: 1058-6393, 08 May 2014.
7. M. Kathirvelu and Manigandan, "Design of Area Optimized, Low power, High Speed Multiplier using Optimized PDP full adder", International Journal of Electrical Engg, ISSN 0974-2158, Volume 6, Number 2, pp. 173-185, June 2013.
8. P. Anitha, Dr. P. Ramanathan, "A new hybrid Multiplier using Dadda and Wallace method", International Conference on Computer Communication and Informatics (IEEE 2014), 13-14 Feb. 2014.
9. R Arun Sekar, Balaji G Naveen, A Gautami, B Sivasankari, "High efficient carry skip adder in various multiplier structures", Advances in Natural and Applied Sciences, Vol. 10, Issue 14, Sep. 2016, pp. 193-198.
10. S. Rajaram, K. Vanithamani, "Improvement of Wallace multipliers using parallel prefix adders", International Conference on Signal Processing, Communication, Computing and Networking Technologies, INSPEC Accession Number: 12318870, September 2011.


