FAQ

Tutorials

Where are the tutorials? Where are the examples?

https://github.com/Accelergy-Project/timeloop-accelergy-exercises

Installation

My install is breaking because ...

We recommend using the Docker containers. There are instructions at https://github.com/Accelergy-Project/timeloop-accelergy-exercises.

Reported Statistics

What energy statistics does Timeloop+Accelergy report?

The stats file reports dynamic energy, static (leakage) energy, per-component energy, and per-dataspace energy. Per-action energy is reported in the ERT (Energy Reference Table).

What area statistics does Timeloop+Accelergy report?

Overall area is reported in the stats file. The area of each component is reported in the ART (Area Reference Table).

What other statistics does Timeloop+Accelergy report?

The stats file also reports per-component access counts (e.g., number of reads, writes, and updates), sparse statistics, workload tensor sizes, bandwidth, cycles, and throughput. The best mapping found is reported in the mapping file.

How can I calculate latency?

Multiply the cycle count reported in the stats file by the global_cycle_seconds variable (the duration of one cycle in seconds).
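As a quick sketch of the arithmetic (the numbers below are illustrative, not taken from a real stats file):

```python
# Latency = cycle count (from the stats file) * global_cycle_seconds
# (seconds per cycle). Both values below are made up for illustration.
cycles = 1_000_000            # "Cycles" entry from the stats file
global_cycle_seconds = 1e-9   # 1 GHz clock -> 1 ns per cycle

latency_seconds = cycles * global_cycle_seconds
print(latency_seconds)        # 0.001 s, i.e., 1 ms
```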

Running the Mapper

Why is the mapper taking a long time? Why is the mapper slow? How can I speed up the mapper?

Constraining the mapspace is the most effective way to speed up the mapper: every constraint shrinks the space of candidate mappings that must be searched.
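As a sketch, a constraints fragment might fix loop bounds, loop order, and dataspace placement at one level so the mapper no longer searches over them. The component name (GlobalBuffer), dimension names (R, S, P, Q), and the exact top-level key are assumptions here; they must match your architecture specification and Timeloop version.

```yaml
# Hypothetical constraints fragment; target names must match the
# component names in your architecture specification.
constraints:
  targets:
    # Fix the temporal loop bounds and order at the GlobalBuffer level.
    - target: GlobalBuffer
      type: temporal
      factors: R=3 S=3 P=1 Q=1
      permutation: SR
    # Restrict which dataspaces the GlobalBuffer may hold.
    - target: GlobalBuffer
      type: bypass
      keep: [Inputs, Outputs]
      bypass: [Weights]
```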

Why is the mapper converging on a sub-optimal mapping?
  • If the mapper is timing out, constraining the mapspace or increasing the mapper timeout and/or victory conditions may help.
  • If the mapper is converging on mappings that appear poor, investigate the stats file to see if anything is limiting the mapping (such as a limited buffer capacity, limited bandwidth, or limited spatial fanout).
  • Enabling diagnostics in the mapper may also help.
  • If there is a known mapping that you think the mapper should be finding, try constraining the mapspace until only that mapping is possible. See if the mapper finds that mapping when constraints are in place, and if it does not find that mapping, check the diagnostics to see why not.
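The mapper knobs mentioned above live in the mapper section of the specification. A sketch, with illustrative values:

```yaml
# Mapper settings sketch; all values here are illustrative.
mapper:
  algorithm: random-pruned
  optimization-metrics: [delay, energy]
  num-threads: 8
  timeout: 30000            # consecutive invalid mappings before a search thread gives up
  victory-condition: 1000   # consecutive valid mappings without improvement before declaring victory
  diagnostics: true         # report why candidate mappings are rejected
```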

Architecture

Can I change the precision of operands throughout the architecture?

Yes. Set the datawidth attribute (under attributes) independently for each component in the architecture specification.
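For example, a v0.3-style architecture fragment could give DRAM 8-bit operands and the register file and MAC 16-bit operands. The component names and attribute values below are illustrative:

```yaml
# Architecture fragment (v0.3-style); names and values are illustrative.
architecture:
  version: 0.3
  subtree:
    - name: system
      local:
        - name: DRAM
          class: DRAM
          attributes:
            datawidth: 8       # 8-bit operands at DRAM
      subtree:
        - name: PE
          local:
            - name: RegisterFile
              class: regfile
              attributes:
                depth: 64
                datawidth: 16  # 16-bit operands in the register file
            - name: MACC
              class: intmac
              attributes:
                datawidth: 16  # 16-bit MAC
```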

Can I keep operands on-chip between layers?

Yes. If you use constraints such that a dataspace bypasses all levels above a given level, then the dataspace will be assumed to be kept in the highest non-bypassed level at the start and/or end of a layer. For example, consider a system with DRAM and global buffer. You are running Layer 1 and Layer 2, for which the outputs of Layer 1 are the inputs of Layer 2. You'd like to keep this tensor on-chip in the global buffer between the two layers. To simulate this, when simulating Layer 1, have outputs bypass the DRAM and be kept in the global buffer. When simulating Layer 2, have the inputs bypass the DRAM and be kept in the global buffer.
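The bypass directives for the two runs might look like the following sketch (two separate constraint files, one per layer). The level names and the exact constraint-file schema are assumptions that must match your setup:

```yaml
# Run for Layer 1: Outputs bypass DRAM, so they are kept in the
# GlobalBuffer at the end of the layer.
constraints:
  targets:
    - target: DRAM
      type: bypass
      keep: [Inputs, Weights]
      bypass: [Outputs]
---
# Run for Layer 2: Inputs (Layer 1's outputs) bypass DRAM, so they are
# assumed to start the layer already resident in the GlobalBuffer.
constraints:
  targets:
    - target: DRAM
      type: bypass
      keep: [Weights, Outputs]
      bypass: [Inputs]
```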