Research Page of Sumanta Chaudhuri

Differential Scanning Techniques for Detection of Security Vulnerabilities

A technique to discern security bugs in a SoC well in advance of tapeout.

Pros: A simulation-based technique, very similar to assertion-based verification (ABV).

Cons: Not static (i.e., not like lint); it needs a simulation up and running, which is not so easy for big SoCs.

Here is our invited talk on the same topic at DAC 2018:

Accelerator Design With OpenCL

The objective of this one-week ATHENS course is to introduce students to the concepts of programming with OpenCL. There is a recent trend in computer architecture towards heterogeneous systems (HSA), in which accelerators such as FPGAs and GPUs are integrated on the same die as chip multi-processors. Compute-intensive tasks are then offloaded to these accelerators. OpenCL (Open Computing Language) is an industry-standard language for parallel programming that has been adopted by industry leaders such as Intel, Xilinx, and ARM for programming accelerators (e.g., Intel FPGAs, ARM Mali GPUs). After following this course, a student should be able to:

  1. Write basic OpenCL programs (both host program and kernel) for FPGAs.

  2. Write basic OpenCL programs for programming GPUs.

  3. Be familiar with notions of optimization for performance.
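
The host/kernel split named in the first objective can be illustrated without any OpenCL toolchain. The sketch below is plain Python, not OpenCL: it emulates a one-dimensional NDRange by calling a kernel function once per work-item index, which is the role `get_global_id(0)` plays in a real OpenCL kernel. The names `vector_add_kernel`, `enqueue_nd_range`, and `host_program` are hypothetical and chosen for illustration only.

```python
def vector_add_kernel(global_id, a, b, out):
    """Body of one work-item: adds a single pair of elements."""
    out[global_id] = a[global_id] + b[global_id]


def enqueue_nd_range(kernel, global_size, *args):
    """Emulates a 1-D NDRange launch by running every work-item in turn.

    A real accelerator would run the work-items in parallel; this loop
    is sequential and only models the indexing.
    """
    for global_id in range(global_size):
        kernel(global_id, *args)


def host_program(a, b):
    """Host side: allocates the output buffer and launches the kernel."""
    if len(a) != len(b):
        raise ValueError("input vectors must have the same length")
    out = [0] * len(a)
    enqueue_nd_range(vector_add_kernel, len(a), a, b, out)
    return out


if __name__ == "__main__":
    print(host_program([1, 2, 3, 4], [10, 20, 30, 40]))
```

In an actual OpenCL program the host side would additionally create a context, a command queue, and device buffers before launching the kernel; those steps are omitted here.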


Day 1: Introduction to the OpenCL API and host program.

Day 2: Practical work with the ARM Mali OpenCL SDK.

Day 3: Hands-on experience: programming GPUs with ODROID XU4 boards.

Day 4: Practical work with Xilinx SDSoC.

Day 5: Hands-on experience: programming FPGAs with Pynq-Z1 boards.


Computer Architecture, VLSI, C/C++

Course exam

The students will be marked based on

  1. Practical Work

  2. Quiz at the end of the course.

Course Material


Architecture Recap

GPU Architecture

FPGA Architecture

OpenCL Syntax



Multi-Valued Routing in FPGAs

Just as the performance of a processor-based system depends largely on the available memory bandwidth, the performance of a gate array (FPGA) is intertwined with its interconnect speed and density. Accordingly, 70% of the FPGA area is taken up by interconnect switches and buffers. This is no surprise, because memory and interconnect are two sides of the same coin: while one bit of memory transfers one bit of information from point A in time to point B in time, a one-bit wire transfers one bit of information from point A to point B in space.

Flash memory chips already use multi-valued (MLC) flash transistors to increase density. In this research we try to increase interconnect density by using multi-valued routing wires, i.e., four voltage levels to encode two bits of information. Things get much more technical from here, as handling four-valued signals with binary switches is not straightforward. Check out this article to see how we try to implement this with FDSOI transistors (not possible with ordinary CMOS).
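
The encoding idea, four levels carrying two bits per wire, can be sketched as follows. This is an illustrative model only: the level indices 0 to 3 stand in for voltage levels, and the bit-to-level mapping (plain binary weighting) is an assumption, not necessarily the mapping used in the article. The function names are hypothetical.

```python
def encode_bits(bits):
    """Packs a list of 0/1 bits into 4-valued symbols, two bits per wire."""
    if len(bits) % 2 != 0:
        raise ValueError("need an even number of bits")
    if any(bit not in (0, 1) for bit in bits):
        raise ValueError("bits must be 0 or 1")
    # Assumed mapping: the first bit of each pair is the more significant one.
    return [2 * bits[i] + bits[i + 1] for i in range(0, len(bits), 2)]


def decode_levels(levels):
    """Recovers the original bit list from 4-valued symbols."""
    bits = []
    for level in levels:
        if level not in (0, 1, 2, 3):
            raise ValueError("levels must be in the range 0..3")
        bits.extend([level >> 1, level & 1])
    return bits


if __name__ == "__main__":
    bus = [1, 0, 0, 1, 1, 1, 0, 0]
    wires = encode_bits(bus)
    print(len(bus), "bits carried on", len(wires), "wires:", wires)
```

Under this model an 8-bit bus needs four 4-valued wires instead of eight binary ones, which is where the density gain comes from.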

GAGA: A SoCFPGA Cluster for Fun and Profit

With some help from our students (Zhengyu Xu) and Karim (Karim Ben Kalia), we have put together a heterogeneous computing cluster called the GAGA (GPU And Gate Array) cluster. Yeah, sorry for the name. While this is not the first cluster of this type, GAGA is a bit different. It is an embedded super-computing cluster, i.e., the whole software stack is rebuilt for the given application, and it runs only one application at a time. In the spirit of embedded computing, everything (Linux kernel, MPI, OpenCL, FPGA hardware) is tuned to a technical context.


  • DE1-SoC boards from Terasic, with a Cyclone V SoCFPGA from Altera (dual-core ARM Cortex-A9 Hard Processor System at 925 MHz, 85K LUT FPGA, 1 GB DDR3 SDRAM).

  • ODROID-XU board from Hardkernel [3], with an Exynos Octa SoC (quad-core ARM Cortex-A15/A7, PowerVR/Mali embedded GPUs, 2 GB LPDDR3).

  • HP Procurve 2530G PoE+ switch (24 1 Gb ports, 4 SFP ports, switching capacity 56 Gbps, PoE power capability 195 W).


Only the libraries necessary for a given technical context go on board. The whole executable software can be contained within 10 MB, which allows fast booting and wake-up, and leaves more resources dedicated to applications.

Software Stack                   Build System
Linux Kernel                     Kernel Config
InitRamFS                        Buildroot
FPGA OpenCL Runtime              Altera OpenCL SDK
GPU OpenCL Runtime               PowerVR GPU Compute
Custom Libraries (e.g., BLAS)    GCC X-Compile