Asymmetric Multiprocessing (AMP) on ZYNQ with EdkDSP Accelerators on Xilinx ZC702 Board - ISE 14.5

Key features

This application note describes asymmetric multiprocessing (AMP) on ZYNQ with EdkDSP accelerators on the Xilinx ZC702 board (ISE 14.5), based on the Xilinx application note XAPP1093. The ARM Cortex A9 processor works together with the MicroBlaze processor, sharing the terminal and block RAM. Both processors execute their programs from the same external DDR3 memory. The MicroBlaze processor controls four EdkDSP floating point accelerators. Each accelerator is organised as an 8xSIMD reconfigurable data path controlled by a PicoBlaze6 controller. This evaluation package is provided by UTIA for Xilinx ZC702 designs with the AXI bus. The application note explains how to install and use the demonstrator on Windows 7 (32 or 64 bit) and the Xilinx ZC702 board. These key features are demonstrated:

  • Adaptive acoustic noise cancellation is implemented on one of the four accelerators. The accelerator computes the recursive adaptive LMS algorithm for identification of a regression filter with 2000 coefficients in single precision floating point arithmetic, with sustained performance of:
    • 627 MFLOP/s on the 100 MHz EdkDSP
    • 146 MFLOP/s on the 666 MHz ARM Cortex A9 (with the vector floating point unit)
    • 5 MFLOP/s on the 100 MHz MicroBlaze processor with the floating point HW unit
  • The EdkDSP accelerators can be reprogrammed by firmware. Programming is possible in C using the UTIA EdkDSP C compiler. Accelerators can be programmed with two firmware programs. Designs can swap the firmware at runtime in only a few clock cycles.
  • The alternative firmware can be downloaded to the EdkDSP accelerators in parallel with the execution of the current firmware. This is demonstrated by swapping the firmware for the FIR filter room response with the firmware for adaptive LMS identification of the filter coefficients in the acoustic noise cancellation demo.
  • The EdkDSP accelerator provides single-precision floating point results that are bit-exact with the reference software implementation running on MicroBlaze with the Xilinx HW single precision floating point unit.
  • In the presented case of the 2000 tap adaptive LMS filter, the 100 MHz 8xSIMD EdkDSP accelerator is 4.3x faster than the 666 MHz ARM Cortex A9 (with the vector processing unit) and 125x faster than the area optimized 100 MHz MicroBlaze with the HW floating point unit.
  • The floating point FIR filter with 2000 coefficients (acoustic room model) is computed by a single 100 MHz (8xSIMD) EdkDSP accelerator with floating point performance of 994 MFLOP/s. The theoretical peak performance of a single 100 MHz (8xSIMD) EdkDSP accelerator is 1.6 GFLOP/s.
  • The peak performance of the four 100 MHz (8xSIMD) EdkDSP accelerators implemented in this demo design is 6.4 GFLOP/s (a theoretical peak figure only).
  • This evaluation package presents two (8xSIMD) EdkDSP accelerator families: one without a pipelined floating point divider data path and one with a single pipelined floating point divider data path. The members of both families differ in size and in the supported vector floating point operations.
  • The floating point applications can be scheduled inside the EdkDSP accelerator by the Xilinx PicoBlaze6 processor. Each firmware program has a maximum size of 4096 words (18 bits wide).

What is included

The evaluation package for asymmetric multiprocessing (AMP) on ZYNQ with the EdkDSP platform contains these deliverables for Windows 7 (32 or 64 bit):

  • 10 evaluation versions of AMP designs. Each design uses one ARM Cortex A9 processor core, one MicroBlaze, and four instances of the EdkDSP accelerators with 8xSIMD floating point data paths on the AXI-lite bus (ARM 666 MHz, MicroBlaze 100 MHz, accelerators 100 MHz). Designs are compiled in Xilinx XPS 14.5.
  • UTIA provides source code for the demo applications and SW projects for the Xilinx SDK 14.5. These source code projects are compiled with the UTIA library libwal.a, which serves for communication with the EdkDSP.
  • The included evaluation versions of the UTIA EdkDSP accelerators have a HW limitation on the maximum number of performed vector operations.
  • The UTIA EdkDSPC C compiler is provided as 3 executable applications for Ubuntu in the VMware Player.
  • The firmware is also provided as binary files to enable testing of the accelerators without the C compiler.
  • Partners of the Artemis EMC2 project can obtain from UTIA, for free, the HW design projects with the evaluation versions of the EdkDSP accelerators in the form of AXI netlist pcores. See chapter 6 for the specification of deliverables for the EMC2 project partners and license details.
  • Release versions of the AMP designs with the EdkDSP package for the Xilinx ZC702 board are offered by UTIA. All customers can order and buy the release version of this AMP demo from UTIA. It includes the HW design projects with the EdkDSP accelerators in the form of AXI netlist pcores with the main limitations removed. See section 7 of this application note for the specification of deliverables and license details.

Package Summary

Title: Asymmetric Multiprocessing (AMP) on ZYNQ with EdkDSP accelerators on Xilinx ZC702 board - ISE 14.5
Filename: d_145_7z.zip, Utia_EdkDSP_145_EMC2_ZC702.pdf
License: See application note for licensing conditions.
Package content: ZIP archive with precompiled ISE 14.5 HW projects for asymmetric multiprocessing designs on ZYNQ for ZC702, demonstrating Utia_EdkDSP HW floating-point accelerators, and source code of SDK 14.5 software projects with Utia_EdkDSP libraries.
Size: ZIP file 31686593 bytes, PDF file 2304507 bytes
Required tools & platform: Xilinx ISE 14.5, Xilinx SDK 14.5, Xilinx ZC702 Evaluation board
Installation notes: See application note

Result Category

Project number: 7H14005
Year: 2014
RIV category: Gfunk
Comment: Functional sample (demo)

Contact Person

Please do not hesitate to contact Jiri Kadlec for more information.