

Example of image inpainting with Adobe Photoshop. In masking this image I specified the region that I want to inpaint, and Photoshop replaced the foreground (the selected area) with the background texture using “Content-Aware Fill”.

For a video this process would be more laborious, as we would need to mark every frame in which the object occurs.

Video inpainting is conceptually similar to image inpainting, with one complication: the need to satisfy temporal consistency across the whole video. A video changes gradually over time, so the inpainted region should not show flickering artifacts or sudden changes in the colors or shapes of objects. Many variables affect how difficult temporal consistency is to achieve: the complexity of the scene, a changing camera position, changes in the scene, or movement of the area selected for inpainting (for example, a moving object). While video inpainting is more challenging than image inpainting because of this requirement, it inherently has more cues, as valid pixels for the missing region of one frame may appear in other frames.

Available methods can generally be distinguished by their inpainting mechanism. They are either copy-and-paste, where the information for a missing pixel is searched for in nearby frames and, once found, copied to the specified location, or generative, where some form of generative model is used to hallucinate the pixel information in the region based on the content of the whole video.

Both approaches have their pros and cons. Copy-and-paste methods work well in scenarios where pixels can be tracked through the video, and fail when the information cannot be retrieved, for example in a stationary video with a stationary region removed. In that case generative models would be more suitable. At the same time, generative methods might not be as accurate in reconstruction, often producing a blurry solution averaged across the possible options. Proposed solutions generally employ Convolutional Neural Networks (CNNs), Vision Transformers or 3D CNNs, with the latter being less popular due to memory and time constraints.

To copy the contents of a missing pixel from a nearby frame, we first need to trace the position of that pixel.
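To make the copy-and-paste idea concrete, here is a minimal sketch in Python with NumPy. It is not the method of any particular paper or tool. It assumes a static camera, so a pixel keeps the same coordinates in every frame and no motion estimation is needed; a real method would first trace where each pixel moved, for example with optical flow. The function name `copy_paste_inpaint` and its arguments are hypothetical, chosen only for illustration.

```python
import numpy as np


def copy_paste_inpaint(frames, masks):
    """Fill missing pixels by copying them from the temporally nearest frame.

    Assumption: the camera is static, so pixel (y, x) shows the same scene
    point in every frame and its position does not have to be traced.

    frames: array of shape (T, H, W) or (T, H, W, C)
    masks:  boolean array of shape (T, H, W), True where a pixel is missing
    Returns the filled video and a mask of pixels that could not be filled.
    """
    frames = np.asarray(frames)
    masks = np.asarray(masks, dtype=bool)
    out = frames.copy()          # the input video is left untouched
    remaining = masks.copy()     # pixels that still need a value
    num_frames = frames.shape[0]

    for t in range(num_frames):
        # Visit the other frames in order of increasing temporal distance.
        for dt in range(1, num_frames):
            if not remaining[t].any():
                break
            for s in (t - dt, t + dt):
                if 0 <= s < num_frames:
                    # Pixels missing in frame t but valid in frame s.
                    fill = remaining[t] & ~masks[s]
                    out[t][fill] = frames[s][fill]
                    remaining[t] &= ~fill
    return out, remaining
```

The second return value marks pixels that are masked in every frame. This is exactly the failure case described in this section, a stationary region removed from a stationary video, where a generative model would have to hallucinate the content instead.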
