Digital image stabilization for fixed cameras: a correlation approach

Introduction

I decided to write this article after reading the article "Massively parallel image stabilization", which describes an image stabilization algorithm for PTZ cameras. As it happens, I had earlier implemented an image stabilization algorithm for fixed cameras, which is used in the MagicBox IP video server and in some other products of Synesis, the company where I currently work. The algorithm turned out to be quite successful in terms of speed. In particular, it implements a very efficient search for the offset of the current image relative to the background. This efficiency made it possible to reuse its core (with some modifications, of course) for tracking objects, as well as for verifying that they are stationary.

The stabilization algorithm consists of the following main elements: detecting the offset of the current frame, compensating for this offset, and periodically updating the background against which stabilization is performed. Below I describe each of them in detail.

Fig. 1. Image stabilization is sometimes very useful.

Finding the offset of the current frame

The basic idea of the correlation approach to determining the offset can be briefly described as follows:
1) Take the central part of the background image. The width of the margin left around it determines the maximum displacement that can be detected. The central part should not be too small; otherwise the correlation function (see below) will not have enough data to work reliably.
2) From the current frame, select a region of the same size, but shifted relative to the center of the image.
3) For each offset, compute a metric that describes the correlation between this region of the current frame and the central part of the background. For example, the sum of squared differences (SSD) or the sum of absolute differences (SAD) over all pixels of the two regions can be used.
4) The shift at which the correlation is maximal (that is, the SSD or SAD is minimal) is the required offset.
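
The four steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the author's implementation: the function names, the synthetic one-blob test image and the choice of SAD as the metric are all made up for the example, and images are plain lists of rows of gray values.

```python
import math


def make_scene(width, height, shift_x=0, shift_y=0, sigma=5.0):
    """Synthetic test image (an assumption): one bright blob, optionally shifted."""
    cx, cy = width / 2 + shift_x, height / 2 + shift_y
    return [[int(255 * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2)))
             for x in range(width)] for y in range(height)]


def sad(background, frame, margin, dx, dy):
    """Sum of absolute differences between the central part of the background
    and the equally sized part of the frame displaced by (dx, dy)."""
    height, width = len(background), len(background[0])
    return sum(abs(background[y][x] - frame[y + dy][x + dx])
               for y in range(margin, height - margin)
               for x in range(margin, width - margin))


def find_offset_exhaustive(background, frame, margin):
    """Evaluate every offset within +/- margin and keep the one with the smallest SAD."""
    candidates = [(sad(background, frame, margin, dx, dy), dx, dy)
                  for dy in range(-margin, margin + 1)
                  for dx in range(-margin, margin + 1)]
    _, dx, dy = min(candidates)
    return dx, dy


background = make_scene(32, 32)
frame = make_scene(32, 32, shift_x=3, shift_y=-2)  # simulated camera shake
print(find_offset_exhaustive(background, frame, margin=4))
```

Here `margin` plays the role of the indentation from step 1: the search covers offsets from `-margin` to `+margin` in each direction.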

Fig. 2. Offset of the current frame relative to the background.

Naturally, if this approach is applied head-on, the algorithm will be catastrophically slow, even though the correlation function itself can be computed very quickly. This is not surprising, since we have to try every possible offset of the images relative to each other (the complexity of the algorithm can be estimated as O(n^2), where n is the number of pixels in the image).

The first optimization is to replace the exhaustive search over all possible offsets with a gradient descent style search. First, the correlation is calculated for the 3x3 neighborhood of the zero offset. Then the shift with the maximum correlation is selected, and the process is repeated from that shift until a local maximum is found. This method is considerably faster, but in the worst case of large displacements it has complexity O(n^1.5), which is still unacceptable.
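
A minimal sketch of such a local search is shown below. It assumes SAD as the metric, so "maximum correlation" becomes "minimum SAD"; the function names and the synthetic one-blob test image are hypothetical and chosen only to make the example self-contained.

```python
import math


def make_scene(width, height, shift_x=0, shift_y=0, sigma=5.0):
    """Synthetic test image (an assumption): one bright blob, optionally shifted."""
    cx, cy = width / 2 + shift_x, height / 2 + shift_y
    return [[int(255 * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2)))
             for x in range(width)] for y in range(height)]


def sad(background, frame, margin, dx, dy):
    """Sum of absolute differences for the frame region displaced by (dx, dy)."""
    height, width = len(background), len(background[0])
    return sum(abs(background[y][x] - frame[y + dy][x + dx])
               for y in range(margin, height - margin)
               for x in range(margin, width - margin))


def find_offset_local(background, frame, margin):
    """Start at zero offset and repeatedly move to the best of the 8 neighbours
    until no neighbour improves the metric (a local optimum)."""
    dx, dy = 0, 0
    best = sad(background, frame, margin, dx, dy)
    while True:
        neighbours = [(dx + sx, dy + sy)
                      for sy in (-1, 0, 1) for sx in (-1, 0, 1)
                      if (sx or sy) and abs(dx + sx) <= margin and abs(dy + sy) <= margin]
        score, nx, ny = min((sad(background, frame, margin, nx, ny), nx, ny)
                            for nx, ny in neighbours)
        if score >= best:
            return dx, dy  # no neighbour is better: local optimum reached
        best, dx, dy = score, nx, ny


background = make_scene(32, 32)
frame = make_scene(32, 32, shift_x=3, shift_y=-2)  # simulated camera shake
print(find_offset_local(background, frame, margin=4))
```

The search only finds the true offset if the metric has no misleading local optima between the starting point and the answer, which holds for this smooth test image but is not guaranteed for real scenes.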

Fig. 3. Searching for the maximum of the correlation function by gradient descent.

The way out is to use multi-scale images (each scale level reduces the image by a factor of two). The local maximum of the correlation is now first sought at the coarsest scale and then successively refined at finer scales. This reduces the complexity of the algorithm to O(n), which is quite acceptable.
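
The coarse-to-fine idea can be sketched as follows. The code assumes that each level halves both image dimensions by averaging 2x2 blocks and that SAD is the metric; all function names and the synthetic test image are hypothetical, and this is not the author's implementation.

```python
import math


def make_scene(width, height, shift_x=0, shift_y=0, sigma=8.0):
    """Synthetic test image (an assumption): one bright blob, optionally shifted."""
    cx, cy = width / 2 + shift_x, height / 2 + shift_y
    return [[int(255 * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2)))
             for x in range(width)] for y in range(height)]


def reduce2x2(image):
    """Halve both dimensions by averaging each 2x2 block (with rounding)."""
    height, width = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * y][2 * x] + image[2 * y][2 * x + 1] +
              image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1] + 2) // 4
             for x in range(width)] for y in range(height)]


def sad(background, frame, margin, dx, dy):
    height, width = len(background), len(background[0])
    return sum(abs(background[y][x] - frame[y + dy][x + dx])
               for y in range(margin, height - margin)
               for x in range(margin, width - margin))


def refine(background, frame, margin, dx, dy):
    """Local 3x3 search starting from (dx, dy); stops at a local optimum."""
    best = sad(background, frame, margin, dx, dy)
    while True:
        score, nx, ny = min((sad(background, frame, margin, dx + sx, dy + sy), dx + sx, dy + sy)
                            for sy in (-1, 0, 1) for sx in (-1, 0, 1)
                            if (sx or sy) and abs(dx + sx) <= margin and abs(dy + sy) <= margin)
        if score >= best:
            return dx, dy
        best, dx, dy = score, nx, ny


def find_offset_multiscale(background, frame, margin, levels):
    """Estimate the offset at the coarsest level, then double and refine it level by level."""
    pyramid = [(background, frame, margin)]
    for _ in range(levels - 1):
        b, f, m = pyramid[-1]
        pyramid.append((reduce2x2(b), reduce2x2(f), (m + 1) // 2))
    dx = dy = 0
    for b, f, m in reversed(pyramid):
        # Scale the coarser estimate up and keep it inside this level's search range.
        dx = max(-m, min(m, 2 * dx))
        dy = max(-m, min(m, 2 * dy))
        dx, dy = refine(b, f, m, dx, dy)
    return dx, dy


background = make_scene(64, 64)
frame = make_scene(64, 64, shift_x=6, shift_y=-4)  # simulated camera shake
print(find_offset_multiscale(background, frame, margin=8, levels=3))
```

Under the halving assumption the levels contain n, n/4, n/16, ... pixels, so if each level needs only a bounded number of refinement steps, the total work is at most about 4n/3 pixel operations per step, which is consistent with the O(n) estimate.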

Fig. 4. Multiscale image.

Subpixel precision

If camera jitter is compensated only with whole-pixel precision, the stabilized image still twitches quite noticeably. Fortunately, this can be corrected. If we carefully analyze the neighborhood of the peak of the correlation function (see Fig. 3), we can see that the values of the function are not symmetric about the maximum, which suggests that the true maximum is not at the point (3, 2) but somewhere between it and the point (1, 4). If we approximate the behavior of the correlation function near the maximum by a paraboloid z = A*x^2 + B*x*y + C*y^2 + D*x + E*y + F, the task of refining the position of the maximum reduces to choosing the paraboloid parameters that minimize its deviation from the actual values at the known points. Experience shows that the accuracy of the refinement obtained in this way is on the order of 0.1-0.2 pixels. When shake is compensated with this precision, the stabilized image shows almost no twitching.
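
One possible way to do this refinement is a least-squares fit of the paraboloid to the 3x3 neighborhood of the best integer offset, sketched below. The article does not say which points or which fitting procedure were actually used, so the 3x3 neighborhood, the closed-form solution and the name `subpixel_peak` are assumptions made for illustration.

```python
def subpixel_peak(values):
    """values[j][i] is the metric at integer offset (i - 1, j - 1) around the best match.

    Fits z = A*x^2 + B*x*y + C*y^2 + D*x + E*y + F by least squares and returns the
    fractional (x, y) position of its extremum, or (0.0, 0.0) if there is no clear one.
    """
    grid = [(i - 1, j - 1, values[j][i]) for j in range(3) for i in range(3)]
    s = sum(z for _, _, z in grid)
    sxx = sum(x * x * z for x, _, z in grid)
    syy = sum(y * y * z for _, y, z in grid)
    # Closed-form least-squares solution, valid for the symmetric 3x3 grid.
    a = sxx / 2 - s / 3
    c = syy / 2 - s / 3
    b = sum(x * y * z for x, y, z in grid) / 4
    d = sum(x * z for x, _, z in grid) / 6
    e = sum(y * z for _, y, z in grid) / 6
    det = 4 * a * c - b * b
    if det <= 0:
        return 0.0, 0.0  # saddle or flat surface: keep the integer offset
    # Solve 2*A*x + B*y + D = 0 and B*x + 2*C*y + E = 0.
    return (b * e - 2 * c * d) / det, (b * d - 2 * a * e) / det


# Hypothetical metric values sampled from a paraboloid with its minimum at (0.3, -0.2).
values = [[(x - 0.3) ** 2 + 2 * (y + 0.2) ** 2 + 5 for x in (-1, 0, 1)] for y in (-1, 0, 1)]
print(subpixel_peak(values))
```

The returned fractional correction is added to the integer offset. The same formulas locate a maximum or a minimum, so they apply whether the metric is a correlation to maximize or a SAD to minimize.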

Offset compensation

The offset is compensated as follows: the current image is shifted by the found offset taken with the opposite sign. The empty areas that appear near the edges are filled with the background. For sub-pixel shifts, bilinear interpolation is used. This, however, may slightly blur the stabilized image. If that is critical, bicubic interpolation can be used instead.
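
The compensation step might look like the sketch below. The sign convention (the found offset `(shift_x, shift_y)` means that frame pixel `(x + shift_x, y + shift_y)` corresponds to background pixel `(x, y)`) and the function name are assumptions for this example, and the result is left as floating-point values rather than rounded back to integers.

```python
import math


def compensate_shift(frame, background, shift_x, shift_y):
    """Resample the frame so that it lines up with the background.

    Output pixel (x, y) is read from frame position (x + shift_x, y + shift_y)
    using bilinear interpolation; pixels with no source data keep the background value.
    """
    height, width = len(frame), len(frame[0])
    result = [list(row) for row in background]
    for y in range(height):
        for x in range(width):
            sx, sy = x + shift_x, y + shift_y
            if sx < 0 or sy < 0 or sx > width - 1 or sy > height - 1:
                continue  # outside the frame: leave the background pixel
            x0, y0 = math.floor(sx), math.floor(sy)
            x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
            fx, fy = sx - x0, sy - y0
            top = frame[y0][x0] * (1 - fx) + frame[y0][x1] * fx
            bottom = frame[y1][x0] * (1 - fx) + frame[y1][x1] * fx
            result[y][x] = top * (1 - fy) + bottom * fy
    return result


# Hypothetical data: a linear ramp as the frame and a constant background.
frame = [[10 * x + y for x in range(8)] for y in range(6)]
background = [[-1] * 8 for _ in range(6)]
stabilized = compensate_shift(frame, background, shift_x=1.5, shift_y=-0.25)
print(stabilized[2][3], stabilized[0][0])
```

In the example the top row and the rightmost columns have no source data after the shift, so they are taken from the background.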

Updating the background

In principle, any previous frame can be used as the background. However, the quality of stabilization improves markedly if the background is an image averaged over many frames. It is desirable to update the background periodically to compensate for possible changes in scene illumination. When updating the background, make sure that it has sufficient contrast and is non-uniform. Otherwise the correlation function will not have a clear peak, which greatly reduces the accuracy of the stabilizer. It is also highly desirable that the background contain no moving objects.
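
As an illustration, the averaging and the contrast check could be sketched like this. The article does not specify how the averaging or the check is done, so the exponential moving average, the `alpha` parameter, the gradient-based texture score and both function names are assumptions.

```python
def update_background(background, frame, alpha=0.05):
    """Exponential moving average: blend a fraction alpha of the
    (already stabilized) frame into the background."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
            for b_row, f_row in zip(background, frame)]


def texture_score(image):
    """Mean absolute difference between horizontally and vertically adjacent pixels.
    A value near zero means a flat image with nothing for the correlation to lock onto."""
    height, width = len(image), len(image[0])
    total = 0
    for y in range(height - 1):
        for x in range(width - 1):
            total += abs(image[y][x + 1] - image[y][x]) + abs(image[y + 1][x] - image[y][x])
    return total / (2 * (height - 1) * (width - 1))


# Hypothetical data: a flat background, a bright frame and a checkerboard.
flat = [[0.0] * 4 for _ in range(4)]
bright = [[100.0] * 4 for _ in range(4)]
checker = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]

updated = update_background(flat, bright, alpha=0.25)
print(updated[0][0], texture_score(flat), texture_score(checker))
```

A candidate background whose texture score falls below some application-specific threshold could simply be rejected, keeping the previous background in use.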

Working in tandem with a motion detector

If the stabilizer works together with a motion detector, updating the background becomes much easier. A motion detector usually already maintains a background averaged over many frames, against which it detects motion. The same background can be used by the stabilizer. The stabilized image, in turn, reduces the number of false positives of the motion detector. In addition, the motion detector produces, in the course of its work, a mask of the areas where motion is present. This mask, obtained by the motion detector on the previous frame, can be used to exclude the moving areas when computing the correlation function, which also has a positive effect on the work of the image stabilizer.
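
Excluding moving areas from the metric could be sketched as follows. The mask layout (one truthy value per background pixel where motion was detected) and the function name are assumptions for this example and do not describe the signature of any library function.

```python
def sad_masked(background, frame, mask, margin, dx, dy):
    """Sum of absolute differences over the central part of the background,
    skipping every pixel that the motion mask marks as moving."""
    height, width = len(background), len(background[0])
    total = 0
    for y in range(margin, height - margin):
        for x in range(margin, width - margin):
            if mask[y][x]:
                continue  # moving object: do not let it distort the metric
            total += abs(background[y][x] - frame[y + dy][x + dx])
    return total


# Hypothetical data: the frame equals the background except for a small moving object.
background = [[x * y for x in range(8)] for y in range(8)]
frame = [list(row) for row in background]
frame[3][3] = frame[3][4] = 255

no_mask = [[False] * 8 for _ in range(8)]
motion_mask = [[(y == 3 and x in (3, 4)) for x in range(8)] for y in range(8)]

print(sad_masked(background, frame, no_mask, 1, 0, 0),
      sad_masked(background, frame, motion_mask, 1, 0, 0))
```

Because the mask is defined in background coordinates, every candidate offset is scored over the same set of pixels, so the scores stay comparable.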

Advantages of the proposed approach

1) High speed. In particular, stabilizing a 1280x720 image in BGRA32 format on a Core i7-4470 processor (using one core) takes 1.5 milliseconds.
2) Compensation of camera shake with sub-pixel accuracy.

Disadvantages of the proposed approach

1) In the current implementation, image stabilization is only possible for fixed cameras.
2) Only translational shift of the camera is detected and compensated; camera rotation is not compensated.
3) The background must be sufficiently sharp and non-uniform; otherwise the correlation function has nothing to lock onto. Therefore, stabilization will not work properly in the dark or in fog.
4) The background must be stationary. Stabilization against a background of moving waves, for example, is impossible.

Notes on practical implementation

To begin with, note that a grayscale image is sufficient for determining the shift: the color information does not affect the accuracy but naturally slows down the computation.
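
For illustration, a grayscale conversion could look like the sketch below. The BT.601 luma weights (0.299 R, 0.587 G, 0.114 B) are a common choice and are assumed here; the article does not state which weights the actual implementation uses, and the function name is hypothetical.

```python
def bgr_to_gray(pixels):
    """Convert rows of (blue, green, red) tuples to gray values
    using integer BT.601 weights with rounding."""
    return [[(114 * b + 587 * g + 299 * r + 500) // 1000 for b, g, r in row]
            for row in pixels]


# Hypothetical 1x5 image: white, black, pure green, pure red, pure blue.
image = [[(255, 255, 255), (0, 0, 0), (0, 255, 0), (0, 0, 255), (255, 0, 0)]]
print(bgr_to_gray(image))
```

The offset search then runs on one channel instead of three or four, which is where the speed gain comes from.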

When implementing the stabilizer, it is desirable to use optimized image-processing functions. I used the Simd library for this purpose. In particular, it provides:
1) SimdAbsDifferenceSum and SimdAbsDifferenceSumMasked - for calculating the correlation function.
2) SimdReduceGray2x2, SimdReduceGray3x3, SimdReduceGray4x4 and SimdReduceGray5x5 - for building multi-scale images.
3) SimdBgrToGray - for obtaining a grayscale image.
4) SimdShiftBilinear - for compensating the shift.

Results of the algorithm

Example 1:

Example 2:


