# FPGA Implementation of High Speed and Low Area Four Port Network-On-Chip (NoC) Router

Santrupti M. Sobarad<sup>1</sup>, Sayantam Sarkar<sup>2</sup> and Shubhangi Lagali<sup>3</sup>

<sup>1</sup>Programmer Analyst, Cognizant Technology Solutions, Bangalore, India. <sup>2</sup>Assistant Professor, E.C.E. Department, Vijaya Vittala Institute of Technology, Bangalore, India. <sup>3</sup>Associate R&D Engineer, Sai-Tektronix Pvt. Ltd., Bangalore, India.

**Abstract:** High-speed devices have become essential components of daily life because they reduce the effort of day-to-day work. Such devices must therefore operate at very high speed, and multi-core processing architectures are normally used to achieve this. In a multi-core system, the total task is subdivided into multiple sub-tasks, and each processing core executes one sub-task in parallel with the others. To compute the correct result of the total task, however, the individual processors must share some variables, depending on the task. Bi-directional communication is therefore necessary between all processing elements in the multi-core system. A conventional bus architecture runs into many issues when supporting this kind of communication, and a Network-On-Chip (NoC) router is better suited to it. In this paper we propose a four-port NoC router that operates at high speed while consuming less area.

Keywords: FPGA Architecture, Matrix Switch, Network Architectures, NoC, VLSI Techniques.

## I. Introduction

The sophistication of electronic devices is increasing rapidly due to revolutionary developments in microchip (IC) technology. As a result, the number of IP blocks on a System on Chip (SoC) keeps growing, which increases the amount of data transferred from one IP block to another on the chip. Existing bus architectures cannot transfer the required amount of data without reducing the operating frequency of the device. To overcome this problem, a Network on Chip (NoC) is used for communication between the IP blocks on the chip. In other words, the NoC is mainly used to build a network interface (NI) between the various IP blocks on the chip. A NoC normally uses basic network topologies such as mesh, ring, and torus [1] for communication.

## II. Literature Survey

Mathew and Mugilan [2] proposed a reconfigurable router architecture based on a heterogeneous router design. The router can change its buffer length dynamically and uses multiplexers to reduce power consumption, but it requires a large area and many hardware resources. Nasim Nasirian and Magdy Bayoumi [3] proposed a power-efficient adaptive routing algorithm that directs traffic in the network according to the status of the routers. This reduces the static power consumption and the average delay of the circuit, but it also increases the area and the cell leakage power. Shaoteng Liu et al. [4] combined circuit switching and packet switching techniques. Their technique gives high throughput and low latency but requires a large setup time. Somashekhar and Rekha [5] proposed a ten-port router using a crossbar switch. This reduces power consumption and delay, but the architecture uses an arbiter, which increases the area requirement. Paria and Reza [6] proposed a reconfigurable NoC router in which a few routers are replaced by five-port switches built from standard SRAM cells. This architecture reduces delay and power consumption and increases the throughput of the circuit at the cost of area overhead. Suraj et al. [7] proposed a dual-crossbar NoC router architecture, which decreases device utilization at the cost of latency. Giuseppe Ascia et al. [8] proposed the Neighbors-on-Path adaptive routing algorithm, which uses the congestion level of the immediate neighbors for adaptive routing and requires less area. Maurizio Palesi et al. [9] proposed the application-specific routing algorithm (APSRA), which is designed for a specific set of applications to maximize adaptivity and performance. M. Palesi et al. [10] proposed a traffic-distributing routing algorithm for NoC. Rodrigo et al. [11] proposed Logic Based Distributed Routing (LBDR), which operates without a routing table.
Manevich et al. [12] proposed a centralized adaptive routing technique for NoC together with a specific mesh topology. This algorithm continuously monitors the traffic load and modifies the packets depending on the load. Jose Flich et al. [13] classified the existing routing algorithms according to their important properties. Martha et al. [14] proposed a multi-objective adaptive immune algorithm ($M^2AIA$) to overcome the problem of mapping a NoC for a large set of applications. The latency and power consumption of this algorithm depend on the range of applications considered. The reported efficiency of the algorithm is good, and it is coded in C++. Sergio Saponara et al. [15] proposed a multi-processor NoC-based architecture for image/video enhancement. The architecture uses packet-switched data for communication, is implemented in 65 nm CMOS technology, produces high throughput, and operates at 400 MHz. Virginie Fresse et al. [16] proposed a predictive NoC architecture for image analysis using the PIV algorithm. The total hardware resources of the architecture are broken down into many single blocks in a way that improves predictability, and each block can be optimized separately to maximize the operating frequency.

## III. Proposed Architecture

The block diagram of the proposed NoC router is shown in Fig. 1. It consists of four FIFOs and one controller unit. The router has input and output ports in all four directions, namely East, West, North and South. Depending on the control logic, one of the four input data words is routed to one of the four output ports, with some delay inserted by the FIFO. In addition, the empty and full ports indicate whether the FIFO of the respective port is empty or full. These flags are mainly used to avoid data collision, a common problem in communication networks.



### 3.1. FIFO

The block diagram of the FIFO is shown in Fig. 2. The FIFO serves two main purposes: it synchronizes the input data with the clock signal at the output side, and it holds each data packet temporarily for a predefined period.



The internal structure of the FIFO is shown in Fig. 3. The FIFO mainly consists of three D flip-flops; the number of storage stages is known as the FIFO depth [17]. The Counter and Decision block indicates the memory condition (i.e. full or empty).
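The FIFO behavior described above can be illustrated with a small behavioral model in Python. This is only a sketch under stated assumptions, not the authors' VHDL: the class name `Fifo`, the `clock` method and its `write_en`/`read_en` arguments are hypothetical, and the paper does not specify how a write to a full FIFO is handled (here it is simply ignored).

```python
class Fifo:
    """Behavioral model of a small synchronous FIFO with full/empty flags."""

    def __init__(self, depth=3):
        self.depth = depth   # three storage stages, matching the described depth
        self.slots = []      # stored words, oldest first

    @property
    def count(self):
        # Plays the role of the counter in the Counter and Decision block.
        return len(self.slots)

    @property
    def empty(self):
        return self.count == 0

    @property
    def full(self):
        return self.count == self.depth

    def clock(self, write_en=False, data_in=None, read_en=False):
        """Advance one clock edge and return the word read, or None."""
        # Flags are sampled before the edge, as registered flags would be.
        can_read = read_en and not self.empty
        can_write = write_en and not self.full
        data_out = self.slots.pop(0) if can_read else None
        if can_write:
            self.slots.append(data_in)
        return data_out


# Usage: fill the FIFO, attempt one write too many, then read the oldest word.
fifo = Fifo(depth=3)
for word in (0xA, 0xB, 0xC):
    fifo.clock(write_en=True, data_in=word)
was_full = fifo.full
fifo.clock(write_en=True, data_in=0xD)   # ignored because the FIFO is full
first = fifo.clock(read_en=True)
```

In this model the full flag is what prevents the fourth word from overwriting stored data, which is the collision-avoidance role the flags play in the router.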



### 3.2. Controller Unit

The controller unit controls the direction of data flow through the router by controlling the ports that represent data flow in the respective directions. The internal structure of the controller unit is shown in Fig. 4. This block mainly consists of multiplexers, a demultiplexer, D flip-flops, and a DEMUX controller. Depending on the port select line, each multiplexer passes one of the four input signals to its output port. One D flip-flop at each output port synchronizes the data with the clock signal. The enable port of each D flip-flop is controlled by the DEMUX and the DEMUX controller. The DEMUX controller routes the request signal (req) to the rst port of the D flip-flop selected by the destination address, which activates the D flip-flop for that direction so that the data is routed in the proper direction.
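A behavioral sketch of this data path in Python may help to fix the idea. It is an illustration under assumptions rather than the authors' design: the names `PORTS`, `demux_request` and `Controller` are hypothetical, and the req signal is modeled as a load enable on the output register of the destination port.

```python
PORTS = ("east", "west", "north", "south")


def demux_request(req, dest):
    """Route the single req bit to the one output port selected by dest."""
    return {port: bool(req and port == dest) for port in PORTS}


class Controller:
    """Four 4:1 multiplexers, each followed by one output register."""

    def __init__(self):
        # One D flip-flop per output port, initially holding no data.
        self.out_regs = {port: None for port in PORTS}

    def clock(self, inputs, select, req, dest):
        """Advance one clock edge.

        inputs: word present on each input port
        select: for each output port, the input port chosen by its multiplexer
        req, dest: request bit and destination port for the DEMUX controller
        """
        enables = demux_request(req, dest)
        for port in PORTS:
            if enables[port]:
                # Only the enabled register captures its multiplexer output.
                self.out_regs[port] = inputs[select[port]]
        return dict(self.out_regs)


# Usage: send the word arriving on the south input to the north output.
ctrl = Controller()
inputs = {"east": 0x11, "west": 0x22, "north": 0x33, "south": 0x44}
select = {"east": "west", "west": "east", "north": "south", "south": "north"}
outputs = ctrl.clock(inputs, select, req=True, dest="north")
```

Only the register of the destination port changes on the clock edge; the other three outputs keep their previous values, which is how a single request steers data in one direction.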



## IV. FPGA Implementation

The proposed architecture is implemented on a Spartan-6 (XC6SLX45-3csg324) FPGA board using Xilinx ISE 14.5, and its functionality is verified with the ISim simulator, version P.58f. The architecture is coded in VHDL.

### 4.1. RTL Schematic

A snapshot of the RTL schematic generated by the software is shown in Fig. 5. It shows the internal blocks and their interconnections.



Fig.5: RTL Schematic of Proposed Router

### 4.2. Technology Schematic

An FPGA maps digital logic onto device primitives such as lookup tables (LUTs). The internal schematic of the FPGA mapping is shown in Fig. 6.



Fig.6: Technology Schematic of Proposed Router

### 4.3. Synthesis Report

The hardware utilization of the proposed structure on the Spartan-6 (XC6SLX45-3csg324) board is given in Table I. The proposed NoC router uses 224 slice registers, 162 slice LUTs and 128 LUT-FF pairs. The maximum operating frequency is 292.987 MHz.

| Parameters                        | Device Utilization |
|-----------------------------------|--------------------|
| Number of Slice Registers         | 224                |
| Number of Slice LUTs              | 162                |
| Number of fully used LUT-FF Pairs | 128                |
| Maximum Operating Frequency (MHz) | 292.987            |

Table I: Device Utilization Summary of Proposed Architecture
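To relate the reported maximum frequency to timing, the short Python calculation below converts it into a minimum clock period and a peak per-port data rate. It is a rough sketch: the function names are hypothetical, the 8-bit data width is an assumption (the paper does not state the word width), and the rate assumes one word is transferred per clock cycle.

```python
F_MAX_MHZ = 292.987   # maximum operating frequency from Table I


def min_clock_period_ns(f_max_mhz):
    """Shortest clock period, in nanoseconds, for a given maximum frequency."""
    return 1_000.0 / f_max_mhz


def peak_port_rate_mbps(f_max_mhz, data_width_bits):
    """Peak rate of one port in Mbit/s, assuming one word per clock cycle."""
    return f_max_mhz * data_width_bits


period_ns = min_clock_period_ns(F_MAX_MHZ)
# ASSUMPTION: 8-bit words, chosen only for illustration.
rate_mbps = peak_port_rate_mbps(F_MAX_MHZ, data_width_bits=8)
print(f"minimum clock period: {period_ns:.3f} ns")
print(f"peak per-port rate at 8 bits: {rate_mbps:.1f} Mbit/s")
```

The period is roughly 3.4 ns; the data rate scales linearly with whatever word width the actual design uses.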

## V. Comparison with Existing Techniques

Hardware comparisons of the proposed technique with existing techniques are given in Table II. The architecture proposed by Suraj et al. [7] was implemented on a Virtex-5 board and uses 1322 slice registers and 1022 slice LUTs. The architecture proposed by Afroz and Shaik [18] was implemented on a Spartan-3 board and uses 31056 slice registers, 56690 slice LUTs and 22163 LUT-FF pairs, with a maximum operating frequency of 49.718 MHz. The architecture proposed by Ashis and Bhoyar [19] was implemented on a Virtex-2 board and uses 529 slice registers, 954 slice LUTs and 523 LUT-FF pairs, with a maximum operating frequency of 226.19 MHz. The proposed architecture, implemented on a Spartan-6 board, uses 224 slice registers, 162 slice LUTs and 128 LUT-FF pairs, with a maximum operating frequency of 292.987 MHz.

**Table II:** Hardware Comparisons of Existing Architectures with Proposed Architecture

| Parameters                        | Suraj et al. [7]     | Afroz and Shaik [18] | Ashis and Bhoyar [19] | Proposed             |
|-----------------------------------|----------------------|----------------------|-----------------------|----------------------|
| Board                             | Virtex-5 (XC5VLX50T) | Spartan-3 (XC3S500E) | Virtex-2 (XC2VP30)    | Spartan-6 (XC6SLX45) |
| Number of Slice Registers         | 1322                 | 31056                | 529                   | 224                  |
| Number of Slice LUTs              | 1022                 | 56690                | 954                   | 162                  |
| Number of fully used LUT-FF Pairs |                      | 22163                | 523                   | 128                  |
| Maximum Operating Frequency (MHz) |                      | 49.718               | 226.19                | 292.987              |
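The figures in Table II can be turned into relative savings with a few lines of Python. The sketch below is illustrative only: the dictionary layout and function names are hypothetical, the numbers are copied from the table, and the designs target different FPGA families and port counts, so the percentages are indicative rather than a like-for-like measure.

```python
# Resource counts and maximum frequency (MHz) copied from Table II.
# None marks values that the table leaves blank.
DESIGNS = {
    "Suraj et al. [7]": {"regs": 1322, "luts": 1022, "fmax": None},
    "Afroz and Shaik [18]": {"regs": 31056, "luts": 56690, "fmax": 49.718},
    "Ashis and Bhoyar [19]": {"regs": 529, "luts": 954, "fmax": 226.19},
}
PROPOSED = {"regs": 224, "luts": 162, "fmax": 292.987}


def reduction_pct(existing, proposed):
    """Percentage by which the proposed value is smaller than the existing one."""
    return 100.0 * (existing - proposed) / existing


def gain_pct(existing, proposed):
    """Percentage by which the proposed value is larger than the existing one."""
    return 100.0 * (proposed - existing) / existing


for name, d in DESIGNS.items():
    regs = reduction_pct(d["regs"], PROPOSED["regs"])
    luts = reduction_pct(d["luts"], PROPOSED["luts"])
    line = f"{name}: {regs:.2f}% fewer registers, {luts:.2f}% fewer LUTs"
    if d["fmax"] is not None:
        line += f", {gain_pct(d['fmax'], PROPOSED['fmax']):.2f}% higher fmax"
    print(line)

# Values against the closest competitor in the table, kept for inspection.
ref19 = DESIGNS["Ashis and Bhoyar [19]"]
regs_vs_19 = reduction_pct(ref19["regs"], PROPOSED["regs"])
luts_vs_19 = reduction_pct(ref19["luts"], PROPOSED["luts"])
fmax_vs_19 = gain_pct(ref19["fmax"], PROPOSED["fmax"])
```

Against the design of Ashis and Bhoyar [19], for example, this gives roughly 58% fewer slice registers, 83% fewer slice LUTs and about 30% higher maximum frequency.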

## VI. Conclusion

In this paper we propose a four-port Network-On-Chip (NoC) router that transfers data bi-directionally. The architecture operates at high speed without consuming a large area, because it is built from basic digital logic elements such as multiplexers, demultiplexers and D flip-flops. The proposed architecture is also suitable for VLSI implementation.

## References

- [1]. B. Forouzan, "Data Communication and Networking", Tata McGraw Hill, 4th Edition, 2006.
- [2]. Minu Mathew and D Mugilan, "Reconfigurable Router Design for Network-On-Chip", International Conference on Circuit, Power and Computing Technologies, pp. 1268-1272, India, March 2014.
- [3]. Nasim Nasirian and Magdy Bayoumi, "Low-Latency Power-Efficient Adaptive Router Design for Network-on-Chip", 28<sup>th</sup> IEEE International System-on-Chip Conference, pp. 287-291, China, September 2015.
- [4]. Shaoteng Liu, Axel Jantsch and Zhonghai Lu, "Analysis and Evaluation of Circuit Switched NoC and Packet Switched NoC", 16th Euromicro Conference on Digital System Design, pp. 21-28, California, September 2013.
- [5]. Somashekhar Malipatil and Rekha S, "Design and Analysis of 10 Port Routers for Network on Chip (NoC)", International Conference on Pervasive Computing, pp. 1-3, India, January 2015.
- [6]. Paria Darbani and Hamid Reza Zarandi, "A Reconfigurable Network-on-Chip Architecture to Improve Overall Performance and Throughput", 22<sup>nd</sup> Iranian Conference on Electrical Engineering, pp. 943-948, Iran, May 2014.
- [7]. M. S. Suraj, D. Muralidharan and K. Seshu Kumar, "A HDL Based Reduced Area NoC Router Architecture", International Conference on Emerging Trends in VLSI, Embedded System, Nano-Electronics and Telecommunication Systems, pp. 1-3, India, 2013.
- [8]. Giuseppe Ascia, Vincenzo Catania, Maurizio Palesi and Davide Patti, "Neighbors-on-Path: A New Selection Strategy for On-Chip Networks", Proceeding of the IEEE/ACM/IFIP Workshop on Embedded system for Real Time Multimedia, pp. 79-84, South Korea, October 2006.
- [9]. M. Palesi, R. Holsmark, S. Kumar and V. Catania, "Application Specific Routing Algorithms for Network on Chip", *IEEE Transactions on Parallel and Distributed Systems*, Vol. 20, Issue. 3, pp. 316-330, March 2009.
- [10]. M. Palesi, S. Kumar and V. Catania, "Bandwidth-Aware Routing Algorithms for Networks-on-Chip Platforms", IET Computers & Digital Techniques, Vol. 3, Issue. 5, pp. 413-429, September 2009.
- [11]. S. Rodrigo, S. Medardoni, J. Flich, D. Bertozzi and J. Duato, "Efficient implementation of distributed routing algorithms for NoCs", IET Computers & Digital Techniques, Vol. 3, Issue. 5, pp. 460-475, September 2009.
- [12]. R. Manevich, I. Cidon, A. Kolodny and I. Walter, "Centralized Adaptive Routing for NoCs", IEEE Computer Architecture Letters, Vol. 9, Issue. 2, pp. 57-60, February 2010.
- [13]. Jose Flich, T. Skeie, A. Mejia, O. Lysne, P. Lopez, A. Robles, J. Duato, M. Koibuchi, T. Rokicki and J. C. Sancho, "A Survey and Evaluation of Topology-Agnostic Deterministic Routing Algorithms", *IEEE Transactions on Parallel and Distributed Systems*, Vol. 23, Issue. 3, pp. 405-425, March 2012.
- [14]. Martha Johanna Sepulveda, Wang Jiang Chu, Guy Gogniat and Marius Strum, "A Multi-Objective Adaptive Immune Algorithm for Multi-Application NoC Mapping", *International Journal of Analog Integrated Circuits and Signal Processing*, Springer, Vol. 73, Issue. 3, pp. 851-860, December 2012.
- [15]. Sergio Saponara, Luca Fanucci and Esa Petri, "A Multi-Processor NoC based Architecture for Real Time Image/Video Enhancement", *International Journal of Real-Time Image Processing*, Springer, Vol. 8, Issue. 1, pp. 111-125, March 2013.
- [16]. Virginie Fresse, Alain Aubert and Nathalie Bochard, "A Predictive NoC Architecture for Vision Systems Dedicated to Image Analysis", EURASIP Journal on Embedded Systems, Hindawi, pp. 1-13, March 2007.
- [17]. [Online] http://www.asic-world.com/tidbits/fifo\_depth.html.
- [18]. Afroz Fatima and Shaik Mohammed Waseem, "Rapid On-Chip Communication in 2D Networks using 8-port Router in a Multicast Environment and Their Research", *International Journal of Science and Research*, Vol. 3, Issue. 9, pp. 117-120, September 2014.
- [19]. Ashis Khodwe and C. N. Bhoyar, "Efficient FPGA Based Bi-directional Network on Chip Router Through Virtual Channel Regulator", 2<sup>nd</sup> International Conference on Emerging Trends in Engineering and Management, Vol. 3, Issue. 3, pp. 82-87, July 2013.