Two Stream Semantic Compression of Videos with Dynamic Backgrounds
Solomon Garber
Brandeis University
Antonella DiLillo
Brandeis University
James Storer
Brandeis University
Abstract
Video containing only oscillatory motion can be compressed and approximately reconstructed using static descriptors and global motion parameters. In this work we propose a system for generating these descriptors for video in which semantic foreground objects move and may occlude the background oscillations in different regions at different times, for example an outdoor soccer video. Our technique improves visual quality in the video background over traditional video codecs while preserving fidelity in the semantically important foreground regions. These improvements are most pronounced at low bitrates.
Figure 3. Sample frames from four versions of a 1m48s 1080p video, 59.94 fps, 5868 frames total. (a): video encoded with AVC, original quality (397.9 MB, 0.26 bpp, approximately 29 Mbit/s). (b): same frame encoded with AVC, low quality settings (3.4 MB, 0.0023 bpp, approximately 255 Kbit/s). (c): same frame encoded with our method (3.4 MB, 0.0023 bpp, approximately 255 Kbit/s). (d): A/B comparison of our method and AVC. Click on the still image to view the corresponding video.
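The bits-per-pixel and bitrate figures quoted in the Figure 3 caption follow from the file size, frame count, resolution, and duration. The sketch below reproduces the arithmetic for the original-quality encode in (a). It rests on assumptions not stated in the caption: that "1080p" means 1920x1080 pixels and that 1 MB is 10^6 bytes. The function names are illustrative only and are not part of the system described here.

```python
def bits_per_pixel(size_bytes, width, height, frames):
    """Average number of encoded bits spent on each pixel of each frame."""
    return size_bytes * 8 / (width * height * frames)


def bitrate_bps(size_bytes, duration_s):
    """Average bitrate in bits per second over the whole clip."""
    return size_bytes * 8 / duration_s


# Values from the Figure 3 caption, panel (a).
size_bytes = 397.9e6        # assumption: 1 MB = 10**6 bytes
width, height = 1920, 1080  # assumption: "1080p" means 1920x1080
frames = 5868
duration_s = 1 * 60 + 48    # 1m48s

bpp = bits_per_pixel(size_bytes, width, height, frames)
rate = bitrate_bps(size_bytes, duration_s)
print(f"{bpp:.2f} bpp, {rate / 1e6:.1f} Mbit/s")
```

Under these assumptions the computed values come out close to the caption's rounded figures of 0.26 bpp and approximately 29 Mbit/s.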
Figure 5. The video clip is 2 minutes and 40 seconds in duration, shot at 23.976 frames per second, a total of 3714 frames. (a) The clip, when encoded with our HEVC tool at its default good quality setting, was 62.7 MB (0.065 bpp, approximately 3 Mbit/s). (b) With our HEVC tool at its lowest quality setting, the clip was 2 MB (0.0021 bpp, approximately 102 Kbit/s). (c) Our encoder produced a clip of 2 MB (0.0021 bpp, approximately 102 Kbit/s). (d) A side-by-side comparison of our approach and the HEVC tool at its lowest quality.