Timeloop has a tracing feature that emits a trace of the point-sets that the nest analysis visits at each coordinate in space-time.
To see a complete trace, you will have to disable temporal (and possibly spatial) extrapolation. Note that this will drastically slow down the simulation, because with extrapolation disabled Timeloop behaves more like a cycle-level simulator than a fast analytical model. Tracing should generally only be used with timeloop-model on a specific mapping. Using tracing with the mapper will just generate a large amount of noise that is hard to parse.
To enable all this, set the following environment variables:
TIMELOOP_ENABLE_TRACING=1
TIMELOOP_DISABLE_TEMPORAL_EXTRAPOLATION=1
TIMELOOP_DISABLE_SPATIAL_EXTRAPOLATION=1
and then run timeloop-model as usual on a workload + architecture + mapping.
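If you prefer to script the run, the steps above can be sketched in Python roughly as follows. This is only an illustration: the helper names (`tracing_env`, `run_traced_model`), the output file name, and the configuration file names are hypothetical, and the sketch assumes the trace is printed to the console, so it captures both stdout and stderr into one file.

```python
import os
import subprocess

# The three variables listed above. Environment values are always strings.
TRACING_ENV = {
    "TIMELOOP_ENABLE_TRACING": "1",
    "TIMELOOP_DISABLE_TEMPORAL_EXTRAPOLATION": "1",
    "TIMELOOP_DISABLE_SPATIAL_EXTRAPOLATION": "1",
}


def tracing_env(base=None):
    """Return a copy of `base` (default: the current environment) with tracing enabled."""
    env = dict(os.environ if base is None else base)
    env.update(TRACING_ENV)
    return env


def run_traced_model(config_files, trace_path="trace.txt"):
    """Run timeloop-model on the given config files and save its console output.

    Assumption: the trace is written to stdout or stderr, so both are
    redirected into `trace_path`.
    """
    with open(trace_path, "w") as out:
        subprocess.run(
            ["timeloop-model", *config_files],
            env=tracing_env(),
            stdout=out,
            stderr=subprocess.STDOUT,
            check=True,  # raise if timeloop-model exits with an error
        )
    return trace_path


# Hypothetical file names; substitute your own architecture, workload and mapping:
# run_traced_model(["arch.yaml", "problem.yaml", "mapping.yaml"])
```

Keeping the variables in a per-process environment, rather than exporting them in your shell, avoids accidentally leaving extrapolation disabled for later mapper runs.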
The trace output will look something like this:
t/7/ s/0/ Weights: { [0,0,0,0:2,256,1,1), } Inputs: { [0,0,0,14:1,2,8,28), } Outputs: { [0,0,14,0:1,256,28,8), }
t/8/0/ s/0/0/ Weights: { [0,0,0,0:2,16,1,1), } Inputs: { [0,0,8,14:1,2,16,15), } Outputs: { [0,0,14,8:1,16,15,16), }
t/8/1/ s/0/0/ Weights: { [0,128,0,0:2,144,1,1), } Inputs: { [0,0,8,14:1,2,16,15), } Outputs: { [0,128,14,8:1,144,15,16), }
t/8/2/ s/0/0/ Weights: { [0,0,0,0:2,16,1,1), } Inputs: { [0,0,8,16:1,2,16,17), } Outputs: { [0,0,16,8:1,16,17,16), }
t/8/3/ s/0/0/ Weights: { [0,128,0,0:2,144,1,1), } Inputs: { [0,0,8,16:1,2,16,17), } Outputs: { [0,128,16,8:1,144,17,16), }
t/8/ s/0/ Weights: { [0,0,0,0:2,256,1,1), } Inputs: { [0,0,8,14:1,2,16,28), } Outputs: { [0,0,14,8:1,256,28,16), }
t/ s/ Weights: { [0,0,0,0:2,256,1,1), } Inputs: { [0,0,0,0:1,2,56,56), } Outputs: { [0,0,0,0:1,256,56,56), }
Here's how to interpret the trace:
t/.../.../... is a time-stamp.
s/.../.../... is a space-stamp.
t/ and s/ refer to the outermost (e.g., DRAM) level, because the tile never changes there over space or time: it is the complete tensor. The rank-1 stamps t/8/ and s/0/ refer to the next-inner level (e.g., a GlobalBuffer), and in this case tell you the tile resident at GlobalBuffer space-coordinate 0 at time-step 8. As you go deeper into the hierarchy, the rank of the time and space stamps increases.
Weights: { [0,0,0,0:2,16,1,1), } says that at this space/time coordinate the mapping installs a Weights tile that is represented by an axis-aligned hyper-rectangle between the points [0,0,0,0] (inclusive) and [2,16,1,1] (exclusive).
[...] nest-analysis.cpp to optionally emit the Delta trace as well. This will be a valuable contribution to the tool.
[...] tiling.cpp. By the time Timeloop gets to that stage of processing, all fine-grained information about space and time is discarded, and it is not possible to generate a trace there. So you may have to do some outboard post-processing if you want to incorporate bypassing into the trace.
For more background on hierarchical space/time stamps please refer to this paper: Hardware Abstractions for Targeting EDDO Architectures with the Polyhedral Model
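As a starting point for outboard post-processing, here is a minimal Python sketch that parses trace lines in the format shown above into time stamps, space stamps, and per-tensor hyper-rectangles. The function names are hypothetical, and the regular expressions assume the exact layout of the sample trace. The `new_points` helper counts the points of one tile that were not in another, which is one simple reading of a delta and not necessarily what Timeloop computes internally.

```python
import re
from math import prod

STAMPS = re.compile(r"t/((?:\d+/)*)\s+s/((?:\d+/)*)")  # e.g. "t/8/0/ s/0/0/"
TENSOR = re.compile(r"(\w+):\s*\{(.*?)\}")             # e.g. "Weights: { ... }"
BOX = re.compile(r"\[([-\d,]+):([-\d,]+)\)")           # e.g. "[0,0,0,0:2,16,1,1)"


def parse_stamp(body):
    """'8/0/' -> (8, 0); the empty outermost stamp -> ()."""
    return tuple(int(x) for x in body.split("/") if x)


def parse_line(line):
    """Return (time_stamp, space_stamp, tiles), or None for non-trace lines.

    tiles maps a tensor name to a list of (lo, hi) corner pairs, where lo is
    inclusive and hi is exclusive.
    """
    m = STAMPS.search(line)
    if m is None:
        return None
    tiles = {}
    for name, body in TENSOR.findall(line):
        tiles[name] = [
            (tuple(map(int, lo.split(","))), tuple(map(int, hi.split(","))))
            for lo, hi in BOX.findall(body)
        ]
    return parse_stamp(m.group(1)), parse_stamp(m.group(2)), tiles


def volume(box):
    """Number of points in a half-open hyper-rectangle (0 if it is empty)."""
    lo, hi = box
    return prod(max(h - l, 0) for l, h in zip(lo, hi))


def new_points(prev, cur):
    """Points of `cur` that are not in `prev` (both are (lo, hi) boxes)."""
    lo = tuple(max(a, b) for a, b in zip(prev[0], cur[0]))
    hi = tuple(min(a, b) for a, b in zip(prev[1], cur[1]))
    return volume(cur) - volume((lo, hi))


step0 = parse_line("t/8/0/ s/0/0/ Weights: { [0,0,0,0:2,16,1,1), } "
                   "Inputs: { [0,0,8,14:1,2,16,15), } Outputs: { [0,0,14,8:1,16,15,16), }")
step1 = parse_line("t/8/1/ s/0/0/ Weights: { [0,128,0,0:2,144,1,1), } "
                   "Inputs: { [0,0,8,14:1,2,16,15), } Outputs: { [0,128,14,8:1,144,15,16), }")

time_stamp, space_stamp, tiles = step0
print(time_stamp, space_stamp, volume(tiles["Weights"][0]))
```

For the two sample lines, the Weights tile changes completely between steps `t/8/0/` and `t/8/1/`, while the Inputs tile is identical, so `new_points` reports 32 and 0 points respectively. A bypass-aware post-processing pass could be layered on top of `parse_line` by dropping tensors at the levels your mapping bypasses.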