Extrinsic Camera Calibration

April 04, 2023 12:46

I am working on a project where I have an RGB camera, a depth camera and an IR camera. I want to have a common coordinate system for all the sensors. I am using an Intel RealSense D435i camera as the depth and RGB camera.

I have several questions regarding this extrinsic camera calibration.

1. What is the location of the origin for the camera coordinate frame of the Depth camera, and could you please provide the distance from the RGB camera?
2. If we obtain a transformation matrix from the global coordinate system (somewhere in the room) to the camera coordinate system, do we need to calculate a separate transformation matrix for the Depth camera, or is it internally calibrated?
3. If we do need to calibrate the extrinsic parameters for the Depth camera separately, could you please clarify whether there is a separate origin for the Depth camera and RGB camera, or whether all cameras share a single origin?
4. Assuming that I don't need to calibrate the Depth camera, would it be possible to use the pose estimated from the RGB camera to transform my point cloud data collected from the Depth module?

Comments

5 comments

MartyG

April 04, 2023 15:07
Hi Neevkumar Hareshbhai Manavar, thanks very much for your questions.

1. Information about the D435i's coordinate system can be found at the link below.

https://github.com/IntelRealSense/librealsense/blob/master/doc/d435i.md#sensor-origin-and-coordinate-system

The 0,0,0 origin of depth is the center-line of the left infrared sensor. When looking at the camera from the front, the left-side infrared sensor is on the right side of the camera.

On the D435i camera model the RGB sensor is on the end of the right side of the camera. There is a 15 mm distance between the center-lines of the RGB sensor and the left infrared sensor beside it.

2. XY camera coordinates can be converted to real-world XYZ coordinates by generating a 3D pointcloud or by converting a single specific coordinate from a 2D color pixel to a 3D depth pixel. It is also possible to project 3D XYZ back to 2D XY. The RealSense SDK has built-in support for these procedures.

3. Each sensor has its own coordinate system. More information about this can be found in the RealSense SDK's Projection documentation.

https://dev.intelrealsense.com/docs/projection-in-intel-realsense-sdk-20

The RealSense Dynamic Calibration tool can perform a robust calibration of both the depth and RGB sensors.

4. If you are asking whether depth and color can be aligned together into a combined image, then yes they can. You can align depth coordinates to color coordinates (depth to color alignment), or color coordinates to depth coordinates (color to depth alignment). You can also generate a textured pointcloud that maps together color and depth.
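
To illustrate the 2D-to-3D conversion mentioned in point 2, here is a minimal sketch of the pinhole model with lens distortion ignored. It is not the SDK's implementation, and the intrinsics below (fx, fy, ppx, ppy) are made-up values for illustration; real values would come from the camera's stream intrinsics.

```python
import numpy as np

def deproject_pixel_to_point(u, v, depth, fx, fy, ppx, ppy):
    """Pixel (u, v) plus its depth along Z -> 3D point in that sensor's frame."""
    x = (u - ppx) / fx  # normalized image-plane coordinates
    y = (v - ppy) / fy
    return np.array([depth * x, depth * y, depth])

def project_point_to_pixel(point, fx, fy, ppx, ppy):
    """3D point in the sensor's frame -> pixel (u, v)."""
    x, y, z = point
    return np.array([fx * x / z + ppx, fy * y / z + ppy])

# Hypothetical intrinsics, chosen only so the arithmetic is easy to follow.
fx, fy, ppx, ppy = 600.0, 600.0, 320.0, 240.0

point = deproject_pixel_to_point(400.0, 300.0, 1.5, fx, fy, ppx, ppy)
pixel = project_point_to_pixel(point, fx, fy, ppx, ppy)
```

Projecting the deprojected point returns the original pixel, which is a quick sanity check when wiring up either direction.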

Neevkumar Hareshbhai Manavar

April 05, 2023 08:31

Edited
Thank you for your answer.
1. Just to confirm, if I understand correctly: the Depth camera has its origin located at the left IR sensor, while the origin of the RGB camera is at the center of the RGB camera. The distance between the RGB camera origin and the Depth camera origin is 15 mm. Therefore, when capturing a point cloud or depth image, all distances are calculated from the Depth camera origin. Is that correct?
2. For instance, consider the diagram below, which shows two images taken from the Depth camera and RGB camera. If we overlay these images, we may notice a slight shift in the location of the object.
- In summary, there are two coordinate systems: one for the Depth camera, which is located at the left IR sensor, and one for the RGB camera. Is that correct?
- If I have a 4x4 extrinsic matrix that represents the rotation and translation from the world coordinate system to the RGB camera coordinate system, and I want to transform from the Depth camera coordinates to the world coordinate system, I need to account for the physical shift of 15 mm in the X direction.
extrinsic matrix for RGB:

[[ 0.04279714 -0.95447858 0.29519322 -0.0312782 ]
[-0.99595292 -0.06413105 -0.06296819 0.1577129 ]
[ 0.07903284 -0.29130369 -0.95336036 1.83856794]
[ 0. 0. 0. 1. ]]

import numpy as np

# Transformation matrix representing the offset between the RGB and depth cameras,
# where d is the physical shift (15 mm in our case).
T_offset = np.array([[1, 0, 0, d],
                     [0, 1, 0, 0],
                     [0, 0, 1, 0],
                     [0, 0, 0, 1]])

# Depth camera extrinsic matrix (Extrinsic_RGB is the 4x4 RGB matrix above)
Transformation_depth = Extrinsic_RGB @ T_offset

I can use this matrix to transform my point cloud data. Please let me know if there is anything I have missed or misunderstood.

MartyG

April 05, 2023 08:52

Edited
1. That is mostly correct. However, when depth is aligned to color while generating an RGB-textured pointcloud, the center-line of the RGB sensor becomes the origin instead of the left IR sensor.

2. Yes, depth and RGB have separate coordinate systems.

In regard to a slight shift in the location of an object after alignment, bear in mind that the IR and RGB sensors are physically in different positions on the front of the camera and so will have slightly different field of view perspectives. The difference in translation and rotation between two sensors is described by extrinsics such as 'Infrared 1 to Color' or 'Color to Infrared 1'.
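
As a rough sketch of how such a sensor-to-sensor extrinsic could be chained with a world-to-RGB matrix, the example below uses the RGB extrinsic posted above together with a hypothetical depth-to-color extrinsic. The pure 15 mm X translation with identity rotation, its sign, and the use of metres are all assumptions made for illustration; the calibrated values stored on the camera would normally be used instead.

```python
import numpy as np

# World -> RGB camera extrinsic, copied from the post above.
T_rgb_from_world = np.array([
    [ 0.04279714, -0.95447858,  0.29519322, -0.0312782 ],
    [-0.99595292, -0.06413105, -0.06296819,  0.1577129 ],
    [ 0.07903284, -0.29130369, -0.95336036,  1.83856794],
    [ 0.0,         0.0,         0.0,         1.0       ],
])

# ASSUMPTION: depth -> RGB is a pure 15 mm shift along X with no rotation,
# expressed in metres. Sign and units are illustrative, not calibrated values.
T_rgb_from_depth = np.eye(4)
T_rgb_from_depth[0, 3] = 0.015

# Chain: p_world = inv(T_rgb_from_world) @ T_rgb_from_depth @ p_depth
T_world_from_depth = np.linalg.inv(T_rgb_from_world) @ T_rgb_from_depth

def transform_points(T, points):
    """Apply a 4x4 homogeneous transform to an (N, 3) array of points."""
    points = np.asarray(points, dtype=float)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return (homogeneous @ T.T)[:, :3]

# Two made-up points in the depth sensor's frame.
cloud_depth = np.array([[0.0, 0.0, 1.0], [0.1, -0.2, 2.0]])
cloud_world = transform_points(T_world_from_depth, cloud_depth)
```

The multiplication order depends on which direction each matrix maps, so it helps to name every matrix by its source and target frame, as done here.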

When translating 2D camera coordinates to 3D world coordinates, you are usually only converting the values for that particular coordinate system (e.g. depth). So you would not have to take into account the 15 mm distance between the left IR sensor and the RGB sensor, because only one of the sensors is involved in the XY to XYZ 'deprojection' (XYZ to XY is 'projection').

One of the methods for converting XY to XYZ to obtain a 3D pointcloud is to first perform depth to color alignment and then use the RealSense SDK function rs2_deproject_pixel_to_point. Converting from XYZ to XY can be done with rs2_project_point_to_pixel.

This method may have a small amount of inaccuracy introduced by the alignment process though. Better accuracy can be obtained by generating an RGB-textured pointcloud with pc.map_to and pc.calculate and then obtaining the XYZ values by retrieving the cloud's vertices with points.get_vertices() and reading the individual X, Y and Z values of the vertices.
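
The vertex-based approach described above might look roughly like the following with the Python wrapper (pyrealsense2). This is a minimal sketch and not tested on hardware here; the 640x480 at 30 FPS stream settings and the helper names vertices_to_xyz and capture_textured_cloud are illustrative choices.

```python
import numpy as np

def vertices_to_xyz(vertices):
    """Convert a buffer of (x, y, z) float32 vertices into an (N, 3) array."""
    return np.asanyarray(vertices).view(np.float32).reshape(-1, 3)

def capture_textured_cloud():
    """Grab one depth/color frame pair and return XYZ vertices (needs a camera)."""
    import pyrealsense2 as rs  # imported here so the helper above works without the SDK

    pipeline = rs.pipeline()
    config = rs.config()
    # Stream settings are assumptions; pick modes your camera supports.
    config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
    config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
    pipeline.start(config)
    try:
        frames = pipeline.wait_for_frames()
        depth_frame = frames.get_depth_frame()
        color_frame = frames.get_color_frame()

        pc = rs.pointcloud()
        pc.map_to(color_frame)              # texture the cloud with the color frame
        points = pc.calculate(depth_frame)  # build the cloud from the depth frame
        return vertices_to_xyz(points.get_vertices())
    finally:
        pipeline.stop()
```

Each row of the returned array holds the X, Y and Z values of one vertex, so individual coordinates can be read with ordinary array indexing.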

MartyG

April 05, 2023 08:53
The link below compares obtaining vertices versus using deprojection.

https://github.com/IntelRealSense/librealsense/issues/4315

Neevkumar Hareshbhai Manavar

April 05, 2023 09:01
Thanks for the support.

I think my doubt is cleared now.