2:3 Pulldown Removal Example

Contents

Buffer a chunk of frames for pulldown detection

First, let's read 50 frames of our video into memory. We'll use this chunk of frames to try to detect the phase of the pulldown. This demo script only works for 2:3:2:3 pulldown.

% On Windows, this requires the Haali Media Splitter and ffdshow,
% configured as described in ../readme.html, except skip the "SelectEvery"
% and "Weave" calls.  This also requires my videoIO package
% (http://sourceforge.net/projects/videoio).
clc; clear all; close all;
vr = videoReader('../hg10-pf24-clip2.mts', 'preciseFrames',-1);
info = getinfo(vr);

pulldownBlockSize = 10; % # of fields in a complete pulldown block ((2+3)*2 = 10)
nBlocks = 5; % how many pulldown blocks to acquire
fields = zeros(info.height, info.width, 3, pulldownBlockSize*nBlocks+1);
assert(logical(seek(vr,-1)));
for i=0:pulldownBlockSize*nBlocks
  fields(:,:,:,i+1) = double(getnext(vr))/255;
end

interactive = 0; % disable interactivity when using publish-to-html

Per-field frame differencing

Here we compute the mean square frame difference error (MSE) between pairs of adjacent frames, on a per-field basis (so we compute the difference between the top field in frame 1 with the top field in frame 2 and the bottom field in frame 1 with the bottom field in frame 2).

Some decoders (i.e. some versions of Microsoft's Windows Media 9 decoder) are not "consistent." By this I mean that if you take a video and decode some frames, then seek back to the beginning and re-decode those same frames, the pixel values of the two decodes may in fact be different. Visually they'll look the same, but numerically they can differ. Decoders that are consistent always give the same decoded output for a given frame, no matter what.

If the encoder was consistent (it encoded the duplicate fields exactly the same) and decoder is consistent, then the MSE will be 0 every time there is a duplicate field. If either the encoder or decoder is inconsistent, then the MSE should still be very small for duplicated fields.

pull2err = zeros(pulldownBlockSize*nBlocks-2,1);
for i=1:pulldownBlockSize*nBlocks - 1;
  f1 = fields(:,:,:,i);
  f2 = fields(:,:,:,i+2);
  pull2err(i) = mean((f1(:) - f2(:)).^2);
end

Plot per-field frame differencing results

Here we look at the per-field frame differencing, as described above. For this video, ffmpeg is a consistent decoder so we actually get 0-touches every 5th frame. The rest of this code does not assume a consistent decoder (it would be simpler if we did assume it).

if interactive, clf; else close all; end
plot(0:length(pull2err)-1, pull2err);
xlabel('field number');
ylabel('MSE');
title({'MSE between pairs of matched top/top and bottom/bottom fields',...
  '(we assume top frame first)'});

Find the block starting frame

The very first frame we see may not be at the beginning of a block of 5 frames (10 fields) in a pulldown sequence. This could be because the camera chose to start at some non-standard time in the sequence or because the splitter eats some frames (hint: as best we can tell, Haali's eats at least 2 frames). We expect the MSE error to dip to zero (or near zero) every 5 fields for a 2:3:2:3 pulldown sequence. For smoothly-varying motion in our image, a robust way of detecting this is to take a discrete Fourier transform and look at the phase of the frequency component corresponding to an every-5-fields undulation. By looking at the phase of that FFT coefficient, we can compute when those dips occur. Other approaches might work too...this was the first thing that we tried that worked well.

if interactive, clf; else close all; end

% compute the fft
p = pull2err((1:end-pulldownBlockSize+1)+0);
f = p;
F = fft(f,length(p));

% Zero out all elements we don't care about (since we know exactly the
% frequency we want).  Make sure that we keep both the positive and
% negative frequency components so we get a cosine out of ifft.
FF = zeros(size(F));
keeper = length(FF)/(pulldownBlockSize/2);
keepers = [keeper length(FF)-keeper]+1;
FF(keepers) = F(keepers);

% Visualizations
subplot(311); plot(0:length(f)/2-1,abs(F(1:length(F)/2)), 'x:');
xlabel('pseudo frequency'); ylabel('fft magnitude');
subplot(312); plot(0:length(f)/2-1,angle(F(1:length(F)/2)),'x:'); grid on;
xlabel('pseudo frequency'); ylabel('fft phase (radians)');
subplot(313); plot(0:length(f)-1,f,'b-x', (0:length(f)-1),ifft(FF),'r:x'); grid on;
xlabel('field'); ylabel('MSE');
legend('original MSE', 'reconstructed MSE for phase detection');

% Compute which field was the first to be pulled 3 times instead of 2.
% This will be the same field that has (near-)zero error.
firstPull3Field = round((pi*3/2 - angle(F(keeper+1))) / (pi*2) * pulldownBlockSize/2) % 1-indexed

% Compute the shift that tells us what part of the pulldown block the first
% field corresponds to.
if (mod(firstPull3Field,2)==0)
  % first pull3 is a bottom field
  pull23shift = -(firstPull3Field-1 - 7);
else
  % first pull3 is a top field
  pull23shift = -(firstPull3Field-1 - 2);
end
pull23shift = mod(pull23shift+pulldownBlockSize, pulldownBlockSize)
firstPull3Field =

     5


pull23shift =

     8

Show some decoded frames

Here we decode some frames and use our learned phase information to do the pulldown reversal. We reread our frames each time to make debugging easier. A real implementation would avoid some of the work duplication.

We show some results starting after the first frame to see some interesting movement and avoid some mistakes made by the decoder since it was not fed a few initial fields (this is the splitter's fault, not the decoder's, assuming we're correct and it's the splitter that is dropping the first 4 fields).

if interactive
  clf;subplot('Position',[0 0 1 1]);
  fieldset = 0:2:floor(info.numFrames/2)-1 - 4;
else
  close all;
  fieldset = 60:2:100; % just 10 frames for publish-to-html
end

for i=fieldset
  if ~interactive
    figure('Position',[0 0 1440 1080]); subplot('Position',[0 0 1 1]);
  end

  % get the next 4 fields
  assert(logical(seek(vr,i)));
  at = getframe(vr); ab = getnext(vr); bt = getnext(vr); bb = getnext(vr);

  % remove the pulldown
  progressive = ivtc(at,ab,bt,bb, pull23shift+i, [2 3 2 3]);

  % Show results if the pulldown removal doesn't indicate we should drop
  % this field.
  if ~isempty(progressive)
    imshow(progressive);
    t = sprintf('frame %d', i/2);  title(t);  ylabel(t);
    if interactive
      pause;
    end
  elseif ~interactive
    a = axis;
    text(a(1)+a(2)/2, a(3)+a(4)/2, ...
      {sprintf('Frame %d is being dropped', i/2),'(this is good)'}, ...
      'FontSize',72, ...
      'VerticalAlignment','middle', 'HorizontalAlignment','center');
  end
end
Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space

Cleanup

close all;
if ~interactive
  vr = close(vr);
end