Tag
The paper introduces Temporal Difference in Vision (TDV), a self-supervised learning method for video that relies only on a causal assumption that past causes future, avoiding strong inductive biases while matching state-of-the-art on dense spatial tasks.