Computation and Communication Blocks for Xilinx Kintex7 FPGA with UTIA EdkDSP Accelerators. Vivado 2013.4 Designs with SW Demos.

Key features

This application note describes precompiled Vivado 2013.4 Kintex7 designs with floating point EdkDSP accelerators, together with examples of the basic communication and computation blocks used in video and image processing applications. The MicroBlaze SoC design with the AXI bus is based on the Xilinx BIST (built-in self-test) design provided by Xilinx for the Kintex7 KC705 board and the Vivado 2013.4 design flow. The network HW controller supports the 1 Gbit/100 Mbit/10 Mbit standards with HW DMA and a SW stack based on the lwIP library described in the Xilinx application note XAPP1026. The MicroBlaze processor controls 8 EdkDSP floating point accelerators. Each accelerator is organised as an 8xSIMD reconfigurable data path controlled by a PicoBlaze6 controller. This evaluation package is provided by UTIA for the Xilinx KC705 board with the 28nm Kintex7 xc7k325t-2 FPGA part.

This application note explains how to install and use the demonstrator on Windows 7 (32 or 64 bit) with the Xilinx KC705 board. The evaluation package can also be installed and used on a PC running Linux (32 or 64 bit) or Win XP (32 or 64 bit) with the corresponding Xilinx SDK 2013.4. These key features are demonstrated:

  • WWW server running on the Kintex7 KC705 board, with the lwIP stack running in RAW mode or in SOCKET mode with the Xilkernel support of POSIX compatible threads.
  • TFTP server running on the Kintex7 KC705 board, with the lwIP stack running in RAW mode or SOCKET mode.
  • RAM based file system with files in the DDR3 memory on the KC705 board.
  • 8 reprogrammable floating point accelerators for local embedded computing on the Kintex7 28nm chip.
  • Demo implementation of adaptive acoustic noise cancellation: 1 of the 8 accelerators computes the recursive adaptive LMS algorithm for identification of a regression filter with 2000 coefficients in single precision floating point arithmetic, with this sustained performance:
    • 1012.0 MFLOP/s on a single 175 MHz (8xSIMD) EdkDSP accelerator (only 1 of the 8 units is used)
    • 7.6 MFLOP/s on the 100 MHz MicroBlaze processor with the floating point HW unit
  • The EdkDSP accelerators can be reprogrammed by firmware. The firmware can be written in C using the UTIA EDKDSP C compiler. Accelerators can be programmed with two firmware programs, and designs can swap the firmware at runtime in only a few clock cycles.
  • The alternative firmware can be downloaded to the EdkDSP accelerators from the internet in parallel with the execution of the current firmware. This is demonstrated by downloading firmware through the TFTP server and by swapping the firmware for the FIR filter room response with the firmware for the adaptive LMS identification of the filter coefficients in the acoustic noise cancellation demo.
  • The EdkDSP accelerator provides single precision floating point results that are bit-exact with the reference software implementations running on the MicroBlaze with the Xilinx HW single precision floating point unit.
  • A single 175 MHz (8xSIMD) EdkDSP accelerator is 132x faster than the performance optimized 100 MHz MicroBlaze with the HW floating point unit in the presented case of the 2000 tap adaptive LMS filter.
  • The floating point FIR filter with 2000 coefficients (acoustic room model) is computed by a single 175 MHz (8xSIMD) EdkDSP accelerator with a floating point performance of 1403 MFLOP/s. The theoretical peak performance of a single 175 MHz (8xSIMD) EdkDSP accelerator is 2.8 GFLOP/s.
  • The peak performance of the eight 175 MHz (8xSIMD) EdkDSP accelerators implemented in this demo design is 22.4 GFLOP/s (this is only a theoretical peak figure).
  • This evaluation package presents two (8xSIMD) EdkDSP accelerator families: one family without a pipelined floating point divider data path and one family with a single pipelined floating point divider data path. The members of both families differ in size and in the supported vector floating point operations.
  • The floating point operations are scheduled inside the EdkDSP accelerator by the Xilinx PicoBlaze6 processor. Each PicoBlaze6 firmware program has a maximum size of 4096 words (18 bits wide).
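
As a rough illustration of the computation behind the FIR and LMS figures in the list above, the following is a minimal textbook sketch of the FIR output and one LMS coefficient update in single precision C. It is not the UTIA firmware or the MicroBlaze reference code. The function names fir_output, lms_step and lms_identify_demo, the 4-tap test system, the step size and the pseudo-random input are all assumptions chosen for illustration (the demo in this package uses 2000 coefficients).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FIR output: dot product of coefficients w and delay line x (n taps). */
float fir_output(const float *w, const float *x, size_t n)
{
    assert(w != NULL && x != NULL);
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += w[i] * x[i];
    return acc;
}

/* One textbook LMS step: e = d - w.x, then w += mu * e * x. Returns e. */
float lms_step(float *w, const float *x, size_t n, float d, float mu)
{
    float e = d - fir_output(w, x, n);   /* a-priori prediction error */
    float g = mu * e;                    /* common gain for all taps */
    for (size_t i = 0; i < n; i++)
        w[i] += g * x[i];
    return e;
}

/* Hypothetical demo: identify a made-up 4-tap system from noise-free data.
 * Returns the largest absolute coefficient error after adaptation. */
float lms_identify_demo(void)
{
    enum { N = 4, STEPS = 5000 };
    const float h[N] = { 0.5f, -0.25f, 0.125f, 0.0625f }; /* assumed system */
    float w[N] = { 0 };      /* adaptive estimate of h */
    float x[N] = { 0 };      /* input delay line */
    uint32_t state = 1u;     /* seed of a simple linear congruential generator */

    for (int k = 0; k < STEPS; k++) {
        /* Shift the delay line and insert a new sample in [-1, 1). */
        for (size_t i = N - 1; i > 0; i--)
            x[i] = x[i - 1];
        state = state * 1664525u + 1013904223u;
        x[0] = (float)(state >> 8) / 8388608.0f - 1.0f;

        float d = fir_output(h, x, N);   /* desired signal from the system */
        (void)lms_step(w, x, N, d, 0.05f);
    }

    float worst = 0.0f;
    for (size_t i = 0; i < N; i++) {
        float diff = w[i] - h[i];
        if (diff < 0.0f)
            diff = -diff;
        if (diff > worst)
            worst = diff;
    }
    return worst;
}
```

Each call to lms_step performs about 4n floating point operations (2n for the FIR output and 2n for the coefficient update). Both loops are plain multiply-accumulate passes over vectors, which is the kind of work an 8xSIMD data path is suited to.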

What is included

The evaluation package includes precompiled Vivado 2013.4 Kintex7 designs with floating point EdkDSP accelerators and SW examples of communication and computation in the form of Xilinx SDK 2013.4 SW projects for Windows 7 (32 or 64 bit), or for a PC running Linux (32 or 64 bit) or Win XP (32 or 64 bit), with the corresponding installation of Xilinx SDK 2013.4:

  • 8 evaluation versions of precompiled Kintex7 designs. Each design contains one MicroBlaze and eight instances of the EdkDSP accelerators. Each accelerator has 8xSIMD floating point data paths and a programmable PicoBlaze6 controller for scheduling the floating point vector operations in the accelerator. The MicroBlaze runs on a 100 MHz system clock and the EdkDSP accelerators use a 175 MHz clock. The MicroBlaze processor works with 1 Gb Ethernet with a DMA controller and 1 GB of DDR3 memory. Designs are compiled in Xilinx Vivado 2013.4.
  • UTIA provides source code for the demo applications and SW projects for the Xilinx SDK 2013.4. These source code projects are compiled with the UTIA library libwal.a, which handles the EdkDSP communication, and the library libmfsimage.a, which contains the initial file system supporting the simple www server GUI.
  • The included evaluation versions of the UTIA EdkDSP accelerators have a HW limit on the maximum number of performed vector operations.
  • The UTIA EDKDSP C compiler is provided as 4 binary applications for Ubuntu in the VMware Player.
  • The firmware for the accelerators is provided in source code and also as binary files, to enable the initial evaluation of the EdkDSP accelerators without the need to install the EDKDSP C compiler.
  • UTIA partners in the Artemis Almarvi [6] project can obtain from UTIA, free of charge, the release version of the Vivado 2013.4 HW design projects with the evaluation versions of the EdkDSP accelerators (in the Vivado 2013.4 IP netlist format). See chapter 6 for the specification of deliverables for the Artemis Almarvi [6] project partners and license details.
  • Release versions of the Vivado 2013.4 HW design projects and of the EdkDSP accelerators for the Xilinx KC705 board are offered by UTIA. Any customer can order and buy the release version of this demo from UTIA. It includes the Vivado 2013.4 HW design projects with the EdkDSP accelerators (in the Vivado 2013.4 IP netlist format), with the HW limitation of the maximal number of performed vector operations removed. See section 7 of this application note for the specification of deliverables and license details.

Package Summary

Title: Evaluation version of precompiled Vivado 2013.4 Kintex7 designs with evaluation versions of UTIA (8xSIMD) EdkDSP accelerator cores
Filename: d_34_7k.zip, Utia_EdkDSP_Vivado_2013_4_KC705.pdf
License: See Utia_EdkDSP_Vivado_2013_4_KC705.pdf for licensing conditions.
Package content: ZIP archive with precompiled Vivado 2013.4 projects demonstrating Utia_EdkDSP HW floating point accelerators and source code of SDK 2013.4 software projects with Utia_EdkDSP libraries for Kintex7 FPGA on KC705.
Size: ZIP file: 41413140 bytes; PDF file: 3028185 bytes
Required tools & platform: Xilinx Vivado 2013.4, Xilinx SDK 2013.4, Xilinx KC705 Evaluation board
Installation notes: See application note

Result Category

Project number: 7H14004
Year: 2014
RIV category: Gfunk
Comment: Functional sample (demo)

Contact Person

Please do not hesitate to contact Jiri Kadlec for more information.