https://github.com/Accelergy-Project/timeloop-accelergy-exercises
We recommend using the Docker containers. There are instructions at https://github.com/Accelergy-Project/timeloop-accelergy-exercises.
Dynamic energy, static (leakage) energy, energy per component, and energy per dataspace are reported in the stats file. Energy per action is reported in the Energy Reference Table (ERT).
Overall area is reported in the stats file. The area of each component is reported in the Area Reference Table (ART).
Access counts for each component (e.g., number of reads, writes, updates), sparse statistics, mappings, workload tensor sizes, bandwidth, cycles, and throughput are reported in the stats file. The mapping is also reported in the mapping file.
Multiply the cycle count reported in the stats file by the global_cycle_seconds variable to get the time in seconds.
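As a minimal sketch of that arithmetic, the snippet below converts a cycle count to seconds. The helper name cycles_to_seconds and both numeric values are hypothetical placeholders chosen for illustration. They are not read from any real stats file; substitute the cycle count from your own run and the global_cycle_seconds value from your own configuration.

```python
def cycles_to_seconds(cycles: int, global_cycle_seconds: float) -> float:
    """Convert a cycle count to wall-clock seconds (hypothetical helper)."""
    return cycles * global_cycle_seconds


# Hypothetical example values, not taken from any real run.
cycles = 1_000_000            # cycle count as reported in the stats file
global_cycle_seconds = 1e-9   # duration of one cycle in seconds (1 ns, i.e. 1 GHz)

latency_s = cycles_to_seconds(cycles, global_cycle_seconds)
print(f"Latency: {latency_s * 1e3:.3f} ms")
```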
Constraining the mapspace can speed up the mapper.
Yes, you can set the attributes.datawidth attribute independently for each component in the architecture.
Yes. If you use constraints such that a dataspace bypasses all levels above a given level, then the dataspace will be assumed to be kept in the highest non-bypassed level at the start and/or end of a layer. For example, consider a system with DRAM and global buffer. You are running Layer 1 and Layer 2, for which the outputs of Layer 1 are the inputs of Layer 2. You'd like to keep this tensor on-chip in the global buffer between the two layers. To simulate this, when simulating Layer 1, have outputs bypass the DRAM and be kept in the global buffer. When simulating Layer 2, have the inputs bypass the DRAM and be kept in the global buffer.