Trace

Timeloop has a tracing feature that emits a trace of the point-sets that the nest analysis visits at each coordinate in space-time.

To see a complete trace, you will have to disable temporal (and possibly spatial) extrapolation. Note that this slows simulation down massively, because with extrapolation disabled Timeloop behaves more like a cycle-level simulator than a fast analytical model. Tracing should generally be used only with timeloop-model on a specific mapping. Using tracing with the mapper produces a large volume of output that is hard to parse.

To enable all this, set the following environment variables:

TIMELOOP_ENABLE_TRACING=1
TIMELOOP_DISABLE_TEMPORAL_EXTRAPOLATION=1
TIMELOOP_DISABLE_SPATIAL_EXTRAPOLATION=1

and then run timeloop-model as usual on a workload + architecture + mapping.
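If you prefer to drive the run from a script, the steps above can be sketched in Python as follows. This is a minimal sketch, not part of Timeloop: the helper names `tracing_env` and `run_traced_model` are hypothetical, and the configuration file paths passed in are placeholders for your own workload, architecture, and mapping files.

```python
import os
import subprocess

# The three environment variables listed above, as strings.
TRACE_ENV = {
    "TIMELOOP_ENABLE_TRACING": "1",
    "TIMELOOP_DISABLE_TEMPORAL_EXTRAPOLATION": "1",
    "TIMELOOP_DISABLE_SPATIAL_EXTRAPOLATION": "1",
}


def tracing_env(base=None):
    """Return a copy of the environment with the tracing variables set.

    `base` defaults to the current process environment; the caller's
    mapping is never modified.
    """
    env = dict(os.environ if base is None else base)
    env.update(TRACE_ENV)
    return env


def run_traced_model(config_files):
    """Run timeloop-model on the given config files with tracing enabled.

    `config_files` is a list of paths (hypothetical placeholders here).
    Assumes timeloop-model is on PATH; raises CalledProcessError if the
    run fails.
    """
    return subprocess.run(
        ["timeloop-model", *config_files],
        env=tracing_env(),
        check=True,
    )
```

For example, `run_traced_model(["arch.yaml", "problem.yaml", "mapping.yaml"])` would launch one traced run, where the three file names stand in for your own inputs. Setting the variables per run, rather than exporting them in your shell, makes it harder to accidentally leave extrapolation disabled for later mapper runs.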

The trace output will look something like this:

    t/7/ s/0/ Weights: { [0,0,0,0:2,256,1,1), } Inputs: { [0,0,0,14:1,2,8,28), } Outputs: { [0,0,14,0:1,256,28,8), } 
      t/8/0/ s/0/0/ Weights: { [0,0,0,0:2,16,1,1), } Inputs: { [0,0,8,14:1,2,16,15), } Outputs: { [0,0,14,8:1,16,15,16), } 
      t/8/1/ s/0/0/ Weights: { [0,128,0,0:2,144,1,1), } Inputs: { [0,0,8,14:1,2,16,15), } Outputs: { [0,128,14,8:1,144,15,16), } 
      t/8/2/ s/0/0/ Weights: { [0,0,0,0:2,16,1,1), } Inputs: { [0,0,8,16:1,2,16,17), } Outputs: { [0,0,16,8:1,16,17,16), } 
      t/8/3/ s/0/0/ Weights: { [0,128,0,0:2,144,1,1), } Inputs: { [0,0,8,16:1,2,16,17), } Outputs: { [0,128,16,8:1,144,17,16), } 
    t/8/ s/0/ Weights: { [0,0,0,0:2,256,1,1), } Inputs: { [0,0,8,14:1,2,16,28), } Outputs: { [0,0,14,8:1,256,28,16), } 
  t/ s/ Weights: { [0,0,0,0:2,256,1,1), } Inputs: { [0,0,0,0:1,2,56,56), } Outputs: { [0,0,0,0:1,256,56,56), }

Here's how to interpret the trace:

  • t/.../.../... is a time-stamp.
  • s/.../.../... is a space-stamp.
  • The indentation level and the number of coordinates in the space/time stamp tell you the hardware tiling level you're looking at. E.g., the rank-0 stamps t/ and s/ refer to the outermost (e.g., DRAM) level, because the tile never changes there over space or time -- it's the complete tensor. The rank-1 stamps t/8/ and s/0/ refer to the next-inner level (e.g., a GlobalBuffer), and in this case tell you the tile resident at the GlobalBuffer space-coordinate 0 and at time-step 8. As you go deeper into the hierarchy, the rank of the time and space stamps increases.
  • Weights: { [0,0,0,0:2,16,1,1), } says that at this space/time coordinate the mapping installs a Weights tile that is represented by an axis-aligned hyper-rectangle between the points [0,0,0,0] (inclusive) and [2,16,1,1] (exclusive).
  • Note that the tiles being printed out are what we call the "T-relation", i.e., they are the tiles that are present at the hardware space-time coordinate. They are not the "Delta-relation", i.e., they do not represent the incremental data that is moved in to construct the tile. If you care about data movement rather than tile residency, the Delta trace is probably what you want. It should be relatively straightforward to extend the existing tracing code in nest-analysis.cpp to optionally emit the Delta trace as well. This would be a valuable contribution to the tool.
  • Also note that this tracing is at the abstract nest analysis level, and so does not understand bypassing. So even if your mapping does not store, e.g., Outputs, at the GlobalBuffer, the trace will show that tensor there. Bypassing is modeled as a post-processing step in tiling.cpp. By the time Timeloop gets to that stage of processing, all fine-grained information about space and time has been discarded, so it cannot generate a trace there. You may have to do some outboard post-processing if you want to incorporate bypassing into the trace.
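The format described above is regular enough to parse with a short script. The following Python sketch is not part of Timeloop, and all of its names (`Box`, `TraceEntry`, `parse_line`, `volume`, `delta_volume`) are hypothetical. It assumes every line looks like the sample trace shown earlier. The `delta_volume` helper is only a rough stand-in for the Delta-relation: it counts the points of the current tile that were not in a previous tile, which may differ from what Timeloop computes internally.

```python
import re
from typing import NamedTuple

# t/<n>/<n>/... s/<n>/<n>/... at the start of a line (after indentation).
STAMP_RE = re.compile(r"\s*t/((?:\d+/)*)\s+s/((?:\d+/)*)")
# "Name: { ... }" for each tensor.
TENSOR_RE = re.compile(r"(\w+):\s*\{([^}]*)\}")
# "[lo,lo,...:hi,hi,...)" for each hyper-rectangle.
BOX_RE = re.compile(r"\[([\d,\s-]+):([\d,\s-]+)\)")


class Box(NamedTuple):
    lo: tuple  # inclusive corner
    hi: tuple  # exclusive corner


class TraceEntry(NamedTuple):
    time: tuple   # time-stamp coordinates; length = tiling-level rank
    space: tuple  # space-stamp coordinates
    tiles: dict   # tensor name -> list of Box


def _ints(text, sep):
    return tuple(int(x) for x in text.split(sep) if x.strip())


def parse_line(line):
    """Parse one trace line; return None if it is not a trace line."""
    m = STAMP_RE.match(line)
    if m is None:
        return None
    tiles = {}
    for name, body in TENSOR_RE.findall(line[m.end():]):
        tiles[name] = [Box(_ints(lo, ","), _ints(hi, ","))
                       for lo, hi in BOX_RE.findall(body)]
    return TraceEntry(_ints(m.group(1), "/"), _ints(m.group(2), "/"), tiles)


def volume(box):
    """Number of points in a box; empty boxes have volume 0."""
    v = 1
    for lo, hi in zip(box.lo, box.hi):
        v *= max(0, hi - lo)
    return v


def delta_volume(prev, cur):
    """Points in `cur` that are not in `prev` (simplified delta)."""
    overlap = Box(tuple(max(a, b) for a, b in zip(prev.lo, cur.lo)),
                  tuple(min(a, b) for a, b in zip(prev.hi, cur.hi)))
    return volume(cur) - volume(overlap)
```

Applied to the sample line stamped `t/8/0/ s/0/0/`, `parse_line` yields time `(8, 0)`, space `(0, 0)`, and a Weights box from `(0, 0, 0, 0)` to `(2, 16, 1, 1)`. Because the trace knows nothing about bypassing, a script like this is also a natural place to drop tensors that your mapping bypasses at a given level, keyed on `len(entry.time)`.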

For more background on hierarchical space/time stamps please refer to this paper: Hardware Abstractions for Targeting EDDO Architectures with the Polyhedral Model