Evaluation of Asymmetric Multiprocessing for Zynq System-on-Modules TE0720-02-2IF, TE0720-02-1CF, TE0720-02-1QF with Carrier Board TE0701-05


This application note describes an asymmetric multiprocessing (AMP) design based on the Xilinx application note XAPP1093. The AMP design is ported from the ISE 14.5 design flow to the Xilinx Vivado 2015.2 and SDK 2015.2 design flow. The ARM Cortex A9 processor works together with the MicroBlaze processor, sharing the terminal and block RAM. Both processors execute programs from the same external DDR3 memory. The MicroBlaze processor controls 4 EdkDSP floating point accelerators. Each accelerator is organised as an 8xSIMD reconfigurable data path controlled by a PicoBlaze6 controller.

This evaluation package is provided by UTIA for these three system-on-modules on the TE0701-05 carrier board:

  • TE0720-02-2IF Xilinx XC7Z020-2CLG484I Zynq; temperature range -40 ... +85 °C
  • TE0720-02-1CF Xilinx XC7Z020-1CLG484C Zynq; temperature range 0 ... +70 °C
  • TE0720-02-1QF Xilinx XA7Z020-1CLG484Q Automotive Zynq; temperature range -40 ... +105 °C

This application note explains how to install and use the demonstrator on a 64-bit Windows 7 PC.

These key features are demonstrated:

  • Adaptive acoustic noise cancellation is implemented on 1 of the 4 accelerators. It computes the recursive adaptive LMS algorithm for identification of a regression filter with 1984 coefficients in single-precision floating point arithmetic. Sustained performance of this algorithm:
    • 775 MFLOP/s on the 125 MHz MicroBlaze processor with a single 125 MHz (8xSIMD) EdkDSP accelerator
    • 338 MFLOP/s on a single 666 MHz ARM Cortex A9 (with the 32-bit vector floating point NEON unit)
    • 160 MFLOP/s on a single 666 MHz ARM Cortex A9 (with the HW floating point unit)
    • 10 MFLOP/s on a single 125 MHz MicroBlaze processor (with the 32-bit floating point HW unit)
  • EdkDSP accelerators can be reprogrammed by firmware. Programming is possible in C, compiled with the UTIA EDKDSP C compiler. Each accelerator can hold two firmware programs and can swap from one to the other at runtime in only a few clock cycles.
  • The floating point applications are scheduled inside the 8xSIMD EdkDSP accelerator by the Xilinx PicoBlaze6 processor. Each firmware program has a maximum size of 4096 words (18 bits wide).
  • The alternative firmware can be downloaded to the EdkDSP accelerators in parallel with the execution of the current firmware. The enclosed acoustic noise cancellation demo shows this by swapping from the FIR filter (room response) firmware to the firmware for adaptive LMS identification of the filter coefficients.
  • The 8xSIMD EdkDSP accelerator provides single-precision floating point results that are bit-exact with the reference software implementation running on the MicroBlaze processor with the Xilinx HW 32-bit floating point unit.
  • The 125 MHz 8xSIMD EdkDSP accelerator is 2.3x faster than the 666 MHz ARM Cortex A9 with the NEON vector processing unit, 4.8x faster than the 666 MHz ARM Cortex A9 running code not optimized for NEON, and 78x faster than the 125 MHz MicroBlaze with the HW floating point unit. These figures were measured for the presented adaptive LMS filter with 1984 coefficients.
  • The FIR filter with 1984 coefficients is computed by a single 125 MHz (8xSIMD) EdkDSP accelerator at 1227 MFLOP/s.
  • The peak performance of the four 125 MHz (8xSIMD) EdkDSP accelerators implemented in this demo design is 8 GFLOP/s (single-precision floating point).
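
The adaptive LMS identification named in the list above can be illustrated with a minimal sketch of the textbook LMS update. This is not the UTIA firmware and not the EdkDSP API: the function names, the step size `mu`, the 8-tap test filter and the noise-free test signal are all assumptions chosen to keep the example short. It also uses Python's double-precision floats rather than the single-precision arithmetic of the accelerators, and 8 coefficients instead of 1984.

```python
import random


def fir(h, x):
    """Direct-form FIR: y[n] = sum_k h[k] * x[n-k], with x[n] = 0 for n < 0."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def lms_identify(x, d, n_taps, mu):
    """Textbook LMS: adapt w so that the FIR output w . window tracks d."""
    w = [0.0] * n_taps
    window = [0.0] * n_taps                # most recent input sample first
    errors = []
    for x_n, d_n in zip(x, d):
        window = [x_n] + window[:-1]       # shift the new sample into the delay line
        y_n = sum(wi * xi for wi, xi in zip(w, window))        # filter output
        e_n = d_n - y_n                                        # a priori error
        w = [wi + mu * e_n * xi for wi, xi in zip(w, window)]  # coefficient update
        errors.append(e_n)
    return w, errors


# Hypothetical test case: identify a short "room response" from white noise.
rng = random.Random(0)
h_true = [0.5, -0.3, 0.2, 0.1, -0.05, 0.03, 0.02, -0.01]
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
d = fir(h_true, x)                         # noise-free desired signal

w, errors = lms_identify(x, d, n_taps=len(h_true), mu=0.05)
print("largest coefficient error:", max(abs(wi - hi) for wi, hi in zip(w, h_true)))
```

Each input sample costs one FIR evaluation and one coefficient update over all taps, which is the kind of regular vector work that maps onto an 8xSIMD data path.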

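The speedup factors and the peak figure quoted in the list above can be cross-checked with a few lines of arithmetic. The MFLOP/s values are taken from the text. The assumption that each SIMD lane completes 2 floating point operations per clock cycle is not stated in the source; it is used here only because it reproduces the quoted 8 GFLOP/s peak. The dictionary keys are labels made up for this sketch.

```python
# Sustained LMS throughput quoted in the text, in MFLOP/s.
lms_mflops = {
    "EdkDSP 8xSIMD @ 125 MHz": 775,
    "ARM Cortex A9 + NEON @ 666 MHz": 338,
    "ARM Cortex A9 + HW FPU @ 666 MHz": 160,
    "MicroBlaze + HW FPU @ 125 MHz": 10,
}

edkdsp = lms_mflops["EdkDSP 8xSIMD @ 125 MHz"]
speedups = {name: edkdsp / rate for name, rate in lms_mflops.items()}

# ASSUMPTION (not from the source): 2 floating point operations
# per SIMD lane per clock cycle.
FLOPS_PER_LANE_PER_CYCLE = 2
LANES = 8
CLOCK_MHZ = 125
ACCELERATORS = 4

peak_per_accel_mflops = LANES * FLOPS_PER_LANE_PER_CYCLE * CLOCK_MHZ
peak_total_mflops = ACCELERATORS * peak_per_accel_mflops

for name, factor in speedups.items():
    print(f"EdkDSP vs {name}: {factor:.1f}x")
print("peak per accelerator [MFLOP/s]:", peak_per_accel_mflops)
print("peak of four accelerators [MFLOP/s]:", peak_total_mflops)
```

Under the same assumption, the quoted sustained rates of 775 MFLOP/s (LMS) and 1227 MFLOP/s (FIR) are fractions of a 2000 MFLOP/s per-accelerator peak.
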
What is included

The asymmetric multiprocessing on ZYNQ (AMP) with the EdkDSP platform evaluation package contains these deliverables for 64-bit Windows 7:

  • Evaluation versions of the AMP designs. The designs use one ARM Cortex A9 processor core with the NEON vector floating point unit, one MicroBlaze, and either one or four instances of the EdkDSP accelerator, each with an 8xSIMD floating point data path. The designs are compiled in Xilinx Vivado 2015.2.
  • Clocks: ARM 666 MHz; MicroBlaze 100 MHz and 125 MHz; EdkDSP accelerators 100 MHz and 125 MHz.
  • UTIA provides source code for the demo applications and SW projects for Xilinx SDK 2015.2. These projects are built with the UTIA library libwal.a, which handles communication with the EdkDSP accelerators.
  • The included evaluation versions of the UTIA EdkDSP accelerators have a HW limit on the maximum number of vector operations performed.
  • The UTIA EdkDSPC C compiler is provided as 3 executable applications for Ubuntu. It can be run in VMware Workstation 12 Player on a 64-bit Windows 7 PC.
  • The firmware is also provided as binary files to enable testing of the accelerators without the C compiler.
  • Partners of the Artemis EMC2 project can obtain the Vivado 2015.2 HW design projects with the evaluation versions of the EdkDSP accelerators (in the Vivado 2015.2 IP netlist format) from UTIA free of charge. See chapter 6 of this application note for the specification of deliverables for the EMC2 project partners and license details.
  • UTIA offers a release of the AMP designs with the commercial version of the EdkDSP accelerators for the Trenz TE0720-02-2IF, TE0720-02-1CF and automotive TE0720-02-1QF system-on-modules on the Trenz TE0701-05 carrier board. It includes the Vivado 2015.2 HW design projects with the EdkDSP accelerator IP cores (in netlist format) with the main limitations of the evaluation version removed. See section 7 of this application note for the specification of deliverables and license details.

Package Summary

Title: Evaluation of Asymmetric Multiprocessing for Zynq System-on-Modules TE0720-02-2IF, TE0720-02-1CF, TE0720-02-1QF with Carrier Board TE0701-05
Filename: Utia_EdkDSP_2015_2_EMC2_te0720_te0701.zip
License: Utia_EdkDSP_Vivado_2015_2_EMC2_te0720_te0701.pdf for licensing conditions.
Package content: ZIP archive with precompiled Vivado 2015.2 projects demonstrating Utia_EdkDSP HW Floating-point accelerators and source code of SDK 2015.2 software projects with Utia_EdkDSP libraries.
Size: ZIP file: 158874025 Bytes; PDF file: 3980882 Bytes
Required tools & platform: Xilinx Vivado Lab Tools 2015.2, Xilinx SDK 2015.2, TE0720-02-2IF, TE0720-02-1CF, TE0720-02-1QF with Carrier Board TE0701-05
Installation notes: See application note

Result Category

Project number: 7H14005
Year: 2015
RIV category: Gfunk
Comment: Functional sample (demo)

Contact Person

Please do not hesitate to contact Jiri Kadlec for more information.