Learning Unsupervised Visual Representations using 3D Convolutional Autoencoder with Temporal Contrastive Modeling for Video Retrieval
The rapid growth of tag-free user-generated videos (on the Internet), surgical recorded videos, and surveillance videos has necessitated the need for effective content-based video retrieval systems.Earlier methods for video representations are based on hand-crafted, which hardly performed well on the video retrieval tasks.Subsequently, deep learnin