[AIE] Update readme
stephenneuendorffer committed Apr 30, 2024
1 parent b8670fd commit 8a05b7b
Showing 1 changed file with 7 additions and 6 deletions.
13 changes: 7 additions & 6 deletions README.md
@@ -4,7 +4,7 @@ This repository extends the LLVM framework to generate code for use with AMD/Xil

## Architecture Overview

-Generally speaking, AI Engine processors are in-order, exposed-pipeline VLIW processors. These processors are implemented as part of an array of processors focused on application acceleration targeting AI, Machine Learning, and DSP applications. They have been integrated in a number of commercial devices including the [Versal AI Core Series](https://www.xilinx.com/products/silicon-devices/acap/versal-ai-core.html) and [Ryzen-AI SOCs](https://www.amd.com/en/products/processors/consumer/ryzen-ai.html).
+Generally speaking, AI Engine processors are in-order, exposed-pipeline VLIW processors. These processors are implemented as part of an array of processors focused on application acceleration targeting AI, Machine Learning, and DSP applications. They have been integrated in a number of commercial devices including the [Versal AI Core Series](https://www.amd.com/en/products/adaptive-socs-and-fpgas/versal/ai-core-series.html) and [Ryzen-AI SOCs](https://www.amd.com/en/products/processors/consumer/ryzen-ai.html).

Each VLIW instruction bundle specifies the behavior of one or more functional units, which begin executing a new instruction at the same time. The processor pipeline does not include stall logic, and instructions will continue executing in order regardless of other instructions in the pipeline. As a result, the compiler is able to schedule machine instructions which access the same register in ways that potentially overlap. e.g.

@@ -23,8 +23,8 @@ Each VLIW instruction bundle specifies the behavior of one or more functional un
Other key architectural characteristics include varying width instruction slots between different instruction encodings and relatively small address spaces (20-bit pointer registers). The presence of varying-width instruction slots implies some code alignment restrictions for instructions which are branch or return targets.
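
The exposed-pipeline behavior described in this overview can be illustrated with a toy model. The sketch below is only an assumption-laden illustration, not AIE hardware behavior or LLVM code: the `ExposedPipeline` class, the register name `r0`, and the 3-cycle latency are all hypothetical. It shows that, with no stall logic, a read issued while a write is still in flight simply observes the old register value.

```python
from dataclasses import dataclass, field


@dataclass
class ExposedPipeline:
    """Toy register file: a write lands a fixed number of cycles after issue."""
    regs: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)  # (commit_cycle, reg, value)
    cycle: int = 0

    def issue_write(self, reg, value, latency):
        # Hypothetical latency: the result becomes visible `latency` cycles
        # after issue. Nothing stalls in the meantime.
        self.pending.append((self.cycle + latency, reg, value))

    def read(self, reg):
        # A read sees only what has been committed so far.
        return self.regs[reg]

    def step(self):
        # Advance one cycle and commit every write whose time has come.
        self.cycle += 1
        still_in_flight = []
        for commit_cycle, reg, value in self.pending:
            if commit_cycle <= self.cycle:
                self.regs[reg] = value
            else:
                still_in_flight.append((commit_cycle, reg, value))
        self.pending = still_in_flight


cpu = ExposedPipeline(regs={"r0": 1})
cpu.issue_write("r0", 99, latency=3)  # issued in cycle 0, lands in cycle 3
cpu.step()                            # cycle 1
early = cpu.read("r0")                # write still in flight: old value
cpu.step()
cpu.step()                            # cycle 3: write has landed
late = cpu.read("r0")
```

In this model the old value of `r0` stays usable for the cycles in which the new value is in flight, which is the kind of overlapping register use a compiler for such a machine can exploit.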

For more information, see:
-[AIE1 architecture manual](https://docs.xilinx.com/r/en-US/am009-versal-ai-engine) and
-[AIE2 architecture manual](https://docs.xilinx.com/r/en-US/am020-versal-aie-ml)
+[AIE1 architecture manual](https://docs.amd.com/r/en-US/am009-versal-ai-engine) and
+[AIE2 architecture manual](https://docs.amd.com/r/en-US/am020-versal-aie-ml)


## Implementation
@@ -33,18 +33,19 @@ The AIE target includes basic support for both AIE1 and AIE2 instruction sets.

| Subtarget | Instruction Selection | VLIW scheduling | Software Pipelining | Hardware Loops | delay slot filling | Floating point | Vector Intrinsics | post-increment addressing |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
-| AIE1 | SelectionDAG | PostRA Scheduler | Not Yet | Not Yet | Not Yet | Yes, vector fp32 | Yes, i8, i16, i32 | No |
+| AIE1 | SelectionDAG | PostRA Scheduler | No | No | No | Yes, vector fp32 | Yes, i8, i16, i32 | No |
| AIE2 | GlobalISel | Machine Scheduler | Experimental | Experimental | Yes | Yes, vector bfloat | Yes, i8, i16 | Yes |

Support for Clang, LLD, binutils (e.g. 'llvm-objdump'), Compiler-RT, and LLVM-LIBC is also included.

In order to support the unusual architecture features of AI Engine, this repository adds LLVM support for several specific features:
- support for non-power of 2 pointers;
- improved TableGen support for specifying operand latencies and resource conflicts of exposed pipeline instructions;
-- scheduler support for negative operand latencies;
+- scheduler support for negative operand latencies (i.e. an instruction writing to a register may be scheduled *after* a corresponding use);
- scheduler support for slot assignment of instructions that can be issued in multiple VLIW slots;
- support for selecting relocations for instructions with multiple encodings;
- support for architectures with code alignment restrictions;
-- improved register allocation support for complex register hierarchies;
+- improved register allocation support for complex register hierarchies, specifically related to spills of sub-registers of large compound-registers;
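
The negative operand latencies mentioned in the list above can be sketched as simple arithmetic. This is a hedged illustration under stated assumptions, not the scheduler implementation in this repository: the helper names `operand_latency` and `is_legal`, and the stage numbers, are hypothetical. The model assumes a producer makes its result visible `def_write_stage` cycles after issue, a consumer samples its operand `use_read_stage` cycles after issue, and a read in the same cycle as the write sees the new value.

```python
def operand_latency(def_write_stage: int, use_read_stage: int) -> int:
    """Minimum value of (use_issue - def_issue) for the use to see the new value.

    Negative when the consumer reads its operand later in its pipeline than
    the producer writes its result.
    """
    return def_write_stage - use_read_stage


def is_legal(def_issue: int, use_issue: int,
             def_write_stage: int, use_read_stage: int) -> bool:
    # The dependence holds as long as the issue distance respects the
    # (possibly negative) operand latency.
    return use_issue - def_issue >= operand_latency(def_write_stage, use_read_stage)


# Hypothetical stages: the producer writes in stage 1, the consumer reads in stage 3.
lat = operand_latency(def_write_stage=1, use_read_stage=3)

# The use is issued one cycle BEFORE the def and still observes the new value:
# the write lands in cycle 5 + 1 = 6, the read happens in cycle 4 + 3 = 7.
use_before_def = is_legal(def_issue=5, use_issue=4,
                          def_write_stage=1, use_read_stage=3)

# Issuing the use three cycles early is too much: it would read in cycle 5,
# before the write lands in cycle 6.
too_early = is_legal(def_issue=5, use_issue=2,
                     def_write_stage=1, use_read_stage=3)
```

Under these assumed stages the latency is negative, so a schedule that places the writing instruction after its use can still be legal, while a larger gap is rejected.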

## Disclaimer
