A heavy network (like ResNet) extracts "deep features" only from selected frames, usually called key frames.
Feature network: usually a Convolutional Neural Network (CNN) that extracts spatial information from a frame.
For intermediate frames, the model uses a "flow field" (optical flow) to warp the features of the previous key frame forward instead of recomputing them.
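The key-frame and warping steps can be sketched in a few lines. This is a minimal, dependency-free illustration, not the reference implementation: the names `bilinear_sample`, `warp_features`, `run_video`, and `key_interval` are hypothetical, the feature map is a single-channel 2D list, and `feature_net` / `flow_net` are placeholder callables standing in for the heavy and light networks. It assumes the flow at each location of the current frame points back to the matching location in the key frame, and it leaves out any learned rescaling of the warped features.

```python
import math


def bilinear_sample(feat, y, x):
    """Sample a 2D feature map at fractional (y, x), clamping to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = (1 - wx) * feat[y0][x0] + wx * feat[y0][x1]
    bottom = (1 - wx) * feat[y1][x0] + wx * feat[y1][x1]
    return (1 - wy) * top + wy * bottom


def warp_features(key_feat, flow):
    """Warp key-frame features to the current frame.

    flow[y][x] is a (dy, dx) pair pointing from a location in the current
    frame back to where that content sits in the key frame (an assumption
    of this sketch).
    """
    h, w = len(key_feat), len(key_feat[0])
    return [
        [bilinear_sample(key_feat, y + flow[y][x][0], x + flow[y][x][1])
         for x in range(w)]
        for y in range(h)
    ]


def run_video(frames, feature_net, flow_net, key_interval=10):
    """Run the heavy network on key frames only; warp features elsewhere."""
    key_frame, key_feat, outputs = None, None, []
    for i, frame in enumerate(frames):
        if i % key_interval == 0:
            # Key frame: pay the full cost of the heavy feature network.
            key_frame, key_feat = frame, feature_net(frame)
            outputs.append(key_feat)
        else:
            # Intermediate frame: estimate flow, then reuse the key features.
            flow = flow_net(key_frame, frame)
            outputs.append(warp_features(key_feat, flow))
    return outputs


if __name__ == "__main__":
    key_feat = [[1, 2, 3], [4, 5, 6]]
    # Content moved one pixel to the right, so each location looks one pixel left.
    flow = [[(0.0, -1.0)] * 3 for _ in range(2)]
    print(warp_features(key_feat, flow))  # [[1.0, 1.0, 2.0], [4.0, 4.0, 5.0]]
```

A fixed `key_interval` is the simplest scheduling choice. In a real pipeline the same sampling is usually done on multi-channel tensors with a library routine for bilinear resampling.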
Flow network: a lighter network (like FlowNet) that estimates the motion between frames.
Most researchers using these specific file IDs are implementing or testing Deep Feature Flow. This framework addresses the problem that running a heavy network on every frame is too slow for real-time video.
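A rough cost model shows where the speed gain comes from. The symbols here are assumptions introduced for illustration: $l$ is the number of frames per key frame, $C_{\text{feat}}$ is the cost of the heavy feature network on one frame, $C_{\text{flow}}$ is the cost of flow estimation plus warping on one frame, and $r = C_{\text{flow}} / C_{\text{feat}}$. The cost of any task-specific head is ignored.

```latex
% Assumed symbols: l, C_feat, C_flow, r = C_flow / C_feat (see lead-in).
\begin{align}
\text{per-frame baseline, cost of } l \text{ frames} &= l \, C_{\text{feat}} \\
\text{one key frame plus } l-1 \text{ warped frames} &= C_{\text{feat}} + (l-1)\, C_{\text{flow}} \\
\text{speedup } s &= \frac{l \, C_{\text{feat}}}{C_{\text{feat}} + (l-1)\, C_{\text{flow}}}
                   = \frac{l}{1 + (l-1)\, r}
\end{align}
```

Under these assumptions the speedup approaches $l$ when the flow network is much cheaper than the feature network ($r \ll 1$), and disappears when the two cost about the same ($r \approx 1$).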
In this context, a "deep feature" refers to a high-level numerical representation extracted from a video frame by a deep learning model.

🏗️ Deep Feature Flow (DFF)