Talha Hanif Butt
3 min read · Dec 15, 2019


I recently worked on camera calibration for stereo vision, and I studied the topic a little along the way. What follows is my understanding of it.

Camera Calibration

A camera projects 3D world points onto the 2D image plane. Calibration is the process of finding the quantities that affect this imaging process.

The problem can be stated as follows: write the projection equations that link the known coordinates of a set of 3D points to their 2D projections, then solve those equations for the camera parameters.

Why it is required

With a calibrated camera, the height of an object and its distance from the camera can be measured from images.

Calibration parameters can be used to recover 3D quantitative measures from 2D images.

Precise calibration is required for 3D interpretation of images.

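Camera calibration involves 5 intrinsic parameters and 6 extrinsic parameters. The intrinsic ones are two focal lengths (one per image axis), the skew, and the two coordinates of the principal point. The extrinsic ones are three rotation angles and three translation components.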

Intrinsic parameters depend only on the camera's characteristics, while extrinsic parameters depend on the camera's position and orientation in the world.

Intrinsic Parameters

Intrinsic parameters describe the relationship between camera coordinates and pixel coordinates. They are usually collected into a 3 by 3 matrix, often called K.
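As a minimal sketch of what the intrinsic matrix does, the snippet below builds a 3 by 3 matrix K from two focal lengths, a skew term and the principal point, then maps a point given in camera coordinates to pixel coordinates. The names `fx`, `fy`, `cx`, `cy` and `skew`, and all numeric values, are illustrative assumptions rather than values from a real camera.

```python
def intrinsic_matrix(fx, fy, cx, cy, skew=0.0):
    """Return the 3x3 intrinsic matrix K as nested lists (pinhole model assumed)."""
    return [
        [fx, skew, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ]


def camera_to_pixel(K, point_cam):
    """Project a 3D point in camera coordinates (Z > 0) to pixel coordinates."""
    x, y, z = point_cam
    # Multiply K by the point, then divide by the third component (perspective divide).
    u = (K[0][0] * x + K[0][1] * y + K[0][2] * z) / z
    v = (K[1][1] * y + K[1][2] * z) / z
    return u, v


# Hypothetical values: 800-pixel focal length, 640x480 image, centred principal point.
K = intrinsic_matrix(fx=800.0, fy=800.0, cx=320.0, cy=240.0)
on_axis = camera_to_pixel(K, (0.0, 0.0, 2.0))
off_axis = camera_to_pixel(K, (0.5, -0.25, 2.0))
print(on_axis, off_axis)
```

A point on the principal axis lands exactly on the principal point (`cx`, `cy`), which is a quick sanity check for any intrinsic matrix.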


Focal Length

The focal length is the distance between the pinhole and the image plane.
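Calibration tools usually report the focal length in pixels rather than in millimetres. As a rough illustration, assuming square pixels and a known sensor width, the conversion is a simple ratio. The function name and the sensor numbers below are made up for the example.

```python
def focal_length_in_pixels(focal_mm, sensor_width_mm, image_width_px):
    """Convert a focal length from millimetres to pixels (assumes square pixels)."""
    pixels_per_mm = image_width_px / sensor_width_mm
    return focal_mm * pixels_per_mm


# Hypothetical camera: 4 mm lens, 6.4 mm wide sensor, 1280-pixel-wide image.
f_px = focal_length_in_pixels(4.0, 6.4, 1280)
print(f_px)
```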

Principal point Offset

The camera’s “principal axis” is the line perpendicular to the image plane that passes through the pinhole. Its intersection with the image plane is referred to as the “principal point.”

The principal point offset is the location of the principal point relative to the origin of the film (the image plane).

Extrinsic Parameters

Extrinsic parameters describe the camera’s location and orientation in the world. They relate the camera coordinate frame to a known world frame. In matrix form, they consist of a 3 by 3 rotation matrix R and a 3 by 1 translation vector T, usually written together as [R|T].
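One common way to write the extrinsic parameters is given below. The symbol names are assumptions: R denotes the 3 by 3 rotation matrix, T the translation vector, and the subscripts w and c mark world and camera coordinates.

```latex
\begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix}
= R \begin{bmatrix} X_w \\ Y_w \\ Z_w \end{bmatrix} + T,
\qquad
[R \mid T] =
\begin{bmatrix}
r_{11} & r_{12} & r_{13} & t_x \\
r_{21} & r_{22} & r_{23} & t_y \\
r_{31} & r_{32} & r_{33} & t_z
\end{bmatrix}
```

R has nine entries but only three degrees of freedom, which together with the three entries of T gives the 6 extrinsic parameters.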



The rotation matrix can also be written as a product of three elementary rotations, one about each coordinate axis.

Three rotations are required as extrinsic parameters: pitch, roll and yaw.

Rotation around the front-to-back axis is called roll.

Rotation around the side-to-side axis is called pitch.

Rotation around the vertical axis is called yaw.
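The three angles can be combined into a single rotation matrix. The sketch below assumes one particular convention, which is not stated in the text above: x is the front-to-back axis, y the side-to-side axis, z the vertical axis, and the rotations are composed as R = Rz(yaw) Ry(pitch) Rx(roll). Other axis assignments and orderings are in use, so treat the function names and the convention as assumptions.

```python
import math


def matmul3(A, B):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_from_rpy(roll, pitch, yaw):
    """Build R = Rz(yaw) @ Ry(pitch) @ Rx(roll); angles in radians.

    Assumed convention: x points front-to-back, y side-to-side, z vertical.
    """
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    Rx = [[1, 0, 0], [0, cr, -sr], [0, sr, cr]]   # roll about x
    Ry = [[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]]   # pitch about y
    Rz = [[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]]   # yaw about z
    return matmul3(Rz, matmul3(Ry, Rx))


# A pure yaw of 90 degrees turns the x axis into the y axis.
R = rotation_from_rpy(0.0, 0.0, math.pi / 2)
x_axis_rotated = [R[i][0] for i in range(3)]
print([round(v, 6) for v in x_axis_rotated])
```

Because matrix multiplication does not commute, applying the same three angles in a different order generally gives a different rotation, so the convention has to be fixed before the angles mean anything.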


Three values are required for the translation, one each along the x, y and z axes.


In stereo vision, the baseline is the translation between the two cameras, that is, the distance between their optical centers.
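The baseline matters because, for a rectified stereo pair, it links the pixel disparity of a point to its depth through Z = f * B / d. The sketch below assumes both cameras share the same focal length f (in pixels), that the images are rectified, and that the disparity d is already known. The function name and the numbers are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_px * baseline_m / disparity_px


# Hypothetical rig: 800-pixel focal length, 0.12 m baseline, 16-pixel disparity.
depth_m = depth_from_disparity(800.0, 0.12, 16.0)
print(depth_m)
```

Depth is inversely proportional to disparity, so distant points produce small disparities and their depth estimates are more sensitive to calibration errors.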

Projection Matrix

The combination of intrinsic and extrinsic parameters is known as the projection matrix. It is a 3 by 4 matrix, obtained by multiplying the 3 by 3 intrinsic matrix K with the 3 by 4 extrinsic matrix [R|T].

P = K[R|T]

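Given the 3D coordinates of a point, the calibration parameters map it to 2D pixel coordinates. Going the other way, a 2D pixel maps only to a ray in 3D, so recovering the 3D point also requires its depth, for example from a second camera in a stereo setup.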
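Both directions of this mapping can be sketched in a few lines. The example assumes the convention X_cam = R X_world + T, zero skew in K, and a depth measured along the camera's z axis. All function names and numeric values are hypothetical.

```python
def matmul(A, B):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def projection_matrix(K, R, T):
    """Return the 3x4 matrix P = K [R | T]."""
    Rt = [R[i] + [T[i]] for i in range(3)]
    return matmul(K, Rt)


def project(P, point_world):
    """Project a 3D world point to pixel coordinates (u, v)."""
    X = [[c] for c in list(point_world) + [1.0]]  # homogeneous column vector
    x, y, w = (row[0] for row in matmul(P, X))
    return x / w, y / w


def back_project(K, R, T, u, v, depth):
    """Recover the world point seen at pixel (u, v), given its depth (zero skew assumed)."""
    xc = (u - K[0][2]) / K[0][0] * depth
    yc = (v - K[1][2]) / K[1][1] * depth
    cam = [xc - T[0], yc - T[1], depth - T[2]]
    # Apply R transposed, which is the inverse of a rotation matrix.
    return tuple(sum(R[k][i] * cam[k] for k in range(3)) for i in range(3))


# Hypothetical setup: camera axes aligned with the world, translated by 0.1 along x.
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
T = [0.1, 0.0, 0.0]
P = projection_matrix(K, R, T)
u, v = project(P, (0.4, 0.0, 2.0))
recovered = back_project(K, R, T, u, v, depth=2.0)
print((u, v), recovered)
```

Without the `depth` argument, `back_project` could only return a direction, which is why a single calibrated camera is not enough to measure distances and a stereo pair is used.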

This projection involves three coordinate systems, which are as follows:

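World Coordinate System

A 3D frame fixed in the scene, in which the positions of objects and of the camera are expressed.

Camera Coordinate System

A 3D frame attached to the camera, with its origin at the pinhole and one axis along the principal axis.

Image Coordinate System

A 2D frame on the image plane, in which point positions are measured in pixels.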

What’s Next

I am currently working on object detection and will try to explain my approach soon, if it works, hopefully.


References

Robotics 2: Camera Calibration

Three Dimensional Rotation Matrices

Chapter 12: Calibration

Camera Calibration

Dissecting the Camera Matrix, Part 3: The Intrinsic Matrix

Roll, Pitch, and Yaw