Rapid Object Detection using a Boosted Cascade of Simple Features
Introduction of a new image representation called the Integral Image which allows the features used by the detector to be computed very quickly.
A learning algorithm, based on AdaBoost, which selects a small number of critical visual features from a larger set and yields extremely efficient classifiers.
A method for combining increasingly complex classifiers in a cascade, which allows background regions of the image to be quickly discarded while spending more computation on promising object-like regions. The cascade can be viewed as an object-specific focus-of-attention mechanism which, unlike previous approaches, provides statistical guarantees that discarded regions are unlikely to contain the object of interest.
Their object detection procedure classifies images based on the value of simple features.
There are many motivations for using features rather than the pixels directly.
The most common reason is that features can act to encode ad-hoc domain knowledge that is difficult to learn using a finite quantity of training data.
A feature-based system also operates much faster than a pixel-based system.
They use three kinds of features.
The value of a two-rectangle feature is the difference between the sums of the pixels within two rectangular regions.
A three-rectangle feature computes the sum within two outside rectangles subtracted from the sum in a center rectangle.
A four-rectangle feature computes the difference between diagonal pairs of rectangles.
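The three feature types can be sketched directly as pixel-sum differences. This is a minimal numpy sketch, not the paper's implementation; the layout conventions (e.g. horizontally adjacent rectangles for the two-rectangle feature, with (y, x) the top-left corner and h, w the size of each sub-rectangle) are illustrative assumptions:

```python
import numpy as np

def rect_sum(img, y, x, h, w):
    """Sum of pixels in the h-by-w rectangle whose top-left corner is (y, x)."""
    return int(img[y:y + h, x:x + w].sum())

def two_rect_feature(img, y, x, h, w):
    """Difference between two horizontally adjacent rectangles."""
    return rect_sum(img, y, x, h, w) - rect_sum(img, y, x + w, h, w)

def three_rect_feature(img, y, x, h, w):
    """Sum in a center rectangle minus the sums in the two flanking rectangles."""
    return rect_sum(img, y, x + w, h, w) - (
        rect_sum(img, y, x, h, w) + rect_sum(img, y, x + 2 * w, h, w))

def four_rect_feature(img, y, x, h, w):
    """Difference between diagonal pairs of rectangles in a 2x2 arrangement."""
    return (rect_sum(img, y, x, h, w) + rect_sum(img, y + h, x + w, h, w)) - (
        rect_sum(img, y, x + w, h, w) + rect_sum(img, y + h, x, h, w))
```

Computed this way each feature costs time proportional to its area; the integral image below removes that dependence.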
Rectangle features can be computed very rapidly using an intermediate representation for the image which we call the integral image.
The integral image at location (x, y) contains the sum of the pixels above and to the left of (x, y), inclusive: ii(x, y) = Σ_{x'≤x, y'≤y} i(x', y'), where ii is the integral image and i is the original image.
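A minimal numpy sketch of the integral image and the resulting constant-time rectangle sum (the (y, x)/h/w rectangle convention is an illustrative assumption):

```python
import numpy as np

def integral_image(img):
    """ii(x, y) = sum of all pixels above and to the left of (x, y), inclusive.
    Cumulative sums over both axes build it in a single pass over the image."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum_ii(ii, y, x, h, w):
    """Sum of pixels in the rectangle with top-left (y, x), height h, width w,
    using only four array references regardless of rectangle size."""
    total = ii[y + h - 1, x + w - 1]
    if y > 0:
        total -= ii[y - 1, x + w - 1]      # strip above the rectangle
    if x > 0:
        total -= ii[y + h - 1, x - 1]      # strip left of the rectangle
    if y > 0 and x > 0:
        total += ii[y - 1, x - 1]          # corner subtracted twice, add back
    return int(total)
```

Because any rectangular sum now takes four lookups, a two-rectangle feature costs eight array references and a four-rectangle feature at most sixteen, independent of scale.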
Learning Classification Functions
In their system, a variant of AdaBoost is used both to select a small set of features and to train the classifier.
The AdaBoost learning algorithm is used to boost the classification performance of a simple (sometimes called weak) learning algorithm.
The key insight is that generalization performance is related to the margin of the examples, and that AdaBoost achieves large margins rapidly.
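The boosting loop can be sketched generically as follows. This is a minimal sketch of discrete AdaBoost with {0, 1} labels using the β = e/(1−e) weight-update form; the epsilon clamp and the `candidates` list of weak-classifier functions are illustrative assumptions, not the paper's exact variant:

```python
import numpy as np

def adaboost(X, y, candidates, rounds):
    """Discrete AdaBoost sketch with {0,1} labels.
    candidates: list of functions h(X) -> array of {0,1} predictions."""
    n = len(y)
    w = np.full(n, 1.0 / n)                  # example weights
    chosen, alphas = [], []
    for _ in range(rounds):
        w = w / w.sum()                      # normalize the weights
        # select the candidate with lowest weighted error
        errs = [np.sum(w * (h(X) != y)) for h in candidates]
        t = int(np.argmin(errs))
        err = min(max(errs[t], 1e-10), 1 - 1e-10)  # numerical guard (assumption)
        beta = err / (1.0 - err)
        pred = candidates[t](X)
        w = w * beta ** (pred == y)          # down-weight correctly classified examples
        chosen.append(candidates[t])
        alphas.append(np.log(1.0 / beta))
    def strong(Xq):
        # weighted majority vote against half the total alpha
        score = sum(a * h(Xq) for a, h in zip(alphas, chosen))
        return (score >= 0.5 * sum(alphas)).astype(int)
    return strong
```

Each round thus both picks one feature (via its weak classifier) and fixes its vote weight, which is how feature selection and training happen in the same loop.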
There are over 180,000 rectangle features associated with each 24×24 image sub-window, a number far larger than the number of pixels.
The weak learning algorithm is designed to select the single rectangle feature which best separates the positive and negative examples.
For each feature, the weak learner determines the optimal threshold classification function, such that the minimum number of examples are misclassified.
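The optimal threshold for one feature can be found in a single pass over the examples sorted by feature value, tracking the positive/negative weight totals and the running weight below each split. A minimal sketch under that bookkeeping (function name and polarity convention are illustrative assumptions):

```python
import numpy as np

def best_threshold(values, labels, weights):
    """Weighted-error-optimal threshold for a single feature.
    values: feature value per example; labels: {0,1}; weights: example weights."""
    order = np.argsort(values)
    v, y, w = values[order], labels[order], weights[order]
    t_pos = w[y == 1].sum()              # total positive weight
    t_neg = w[y == 0].sum()              # total negative weight
    s_pos = np.cumsum(w * (y == 1))      # positive weight at or below each example
    s_neg = np.cumsum(w * (y == 0))      # negative weight at or below each example
    # error if examples at/below the split are called negative (polarity +1)
    err_pos = s_pos + (t_neg - s_neg)
    # error if examples at/below the split are called positive (polarity -1)
    err_neg = s_neg + (t_pos - s_pos)
    errs = np.minimum(err_pos, err_neg)
    i = int(np.argmin(errs))
    polarity = 1 if err_pos[i] <= err_neg[i] else -1
    return v[i], polarity, float(errs[i])
```

With polarity +1 the resulting weak classifier predicts positive when the feature value exceeds the threshold; polarity -1 flips the inequality.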
The Attentional Cascade
The key insight is that smaller, and therefore more efficient, boosted classifiers can be constructed which reject many of the negative sub-windows while detecting almost all positive instances (i.e. the threshold of a boosted classifier can be adjusted so that the false negative rate is close to zero).
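The cascade's control flow is simple: a sub-window must pass every stage to be reported, and is rejected the moment any stage's score falls below that stage's (deliberately low) threshold. A minimal sketch, where each stage is a hypothetical (classifier, threshold) pair:

```python
def cascade_classify(window, stages):
    """Evaluate a sub-window through a cascade of boosted stages.
    stages: list of (classifier, threshold) pairs, cheapest stage first.
    Most background windows exit after very few feature evaluations."""
    for classifier, threshold in stages:
        if classifier(window) < threshold:
            return 0   # rejected: almost certainly background
    return 1           # passed every stage: candidate detection
```

Because each threshold is tuned for a near-zero false negative rate, a rejection at any stage carries the statistical guarantee mentioned above, while the expected cost per window is dominated by the early, tiny classifiers.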