coordinate mapping issue between data captured in separate functions

MartyG

Following on some previous discussions regarding mapping between 2D pixel points and 3D coordinates, I have some points that I'd like to make.

Of the three methods suggested for calculating the above mapping, one method calculates a point cloud. In this method, we enabling both the depth and the RGB stream, and then used aligning to color. The point cloud is calculated from the aligned data, and then it calculates vertices that give us 3D position for each 2D pixel point.

rs2::pointcloud pc;
rs2::points points;
rs2::config cfg;

rs2::align align_to(RS2_STREAM_COLOR);
   
cfg.enable_stream(RS2_STREAM_DEPTH, 848, 480, RS2_FORMAT_Z16, 30);          // optimal resolution for depth stream is 848x480
cfg.enable_stream(RS2_STREAM_COLOR, 1280, 720, RS2_FORMAT_BGR8, 30);

auto selection = pipe.start(cfg);
    
// setting some optimizations for the depth sensor
    auto sensor = selection.get_device().query_sensors()[0];
    sensor.set_option(RS2_OPTION_EMITTER_ENABLED,1);
    sensor.set_option(rs2_option::RS2_OPTION_VISUAL_PRESET,
rs2_rs400_visual_preset::RS2_RS400_VISUAL_PRESET_MEDIUM_DENSITY);     
  

    const rs2::stream_profile color_profile = selection.get_stream(RS2_STREAM_COLOR);
    const rs2::stream_profile depth_profile = selection.get_stream(RS2_STREAM_DEPTH);


    // un-aligned frames    
    rs2::frameset frames = pipe.wait_for_frames();
    rs2::depth_frame depth = frames.get_depth_frame();   
    auto color = frames.get_color_frame(); 


        
    rs2_intrinsics intr_depth = depth.get_profile().as<rs2::video_stream_profile>().get_intrinsics(); // Calibration data
    rs2_intrinsics intr_color = color.get_profile().as<rs2::video_stream_profile>().get_intrinsics();
    rs2_extrinsics color_extrin_to_depth = color.get_profile().as<rs2::video_stream_profile>().get_extrinsics_to(depth.get_profile().as<rs2::video_stream_profile>());
    rs2_extrinsics depth_extrin_to_color = depth.get_profile().as<rs2::video_stream_profile>().get_extrinsics_to(color.get_profile().as<rs2::video_stream_profile>());
    
    // aligned frames
    frames = frames.apply_filter(align_to);
    auto color_aligned = frames.get_color_frame();
    auto depth_aligned = frames.get_depth_frame();    
    
    // Tell pointcloud object to map to this color frame
    pc.map_to(color_aligned);
   // Generate the pointcloud and texture mappings
    points = pc.calculate(depth_aligned);

    int w = color_aligned.get_width();
    int h = color_aligned.get_height();

    int c, r;     // pixel coordinates

    int index = c + w * r;  // index in the point cloud data

   // find corresponding depth point from vertices
   const rs2::vertex* tv = points.get_vertices();

   // 3d coordinates x, y, z
    float x = tv[idx].x;
    float y = tv[idx].y;
    float z = tv[idx].z;

My question is the following.

If we have captured some images (RGB images from aligned data), and thereafter we are interested to detect objects, create bounding boxes. Let's call this process object detection.

Do we need to perform object detection in the same function that we calculate the point cloud.

If we want to retrieve the 3D data corresponding to bounding boxes that are calculated via object detection process, would it be valid still, if we take 2d pixel points from the object detection process that we performed separately, and ask the information stored (complete map) in the point cloud function, give me 3d point corresponding to the 2d pixel. Would that mapping be still valid. Or the point cloud information is highly temporal?

I hope, I am able to make my point clear.

February 07, 2024 09:21

Hi Zahid Iqbal The RealSense SDK does not have built-in bounding box and object detection features. Typically, RealSense users use a tool called YOLO for object detection.

If you would prefer to define a bounding box using RealSense program scripting then this can be a complicated process. The link below suggests a method of doing so for C++

https://github.com/IntelRealSense/librealsense/issues/2016#issuecomment-403804157

Zahid Iqbal

February 07, 2024 09:50

For object detection and creating bounding box, I have been using openCV. Let's say, I have the bounding box available and hence the pixel coordinates for the vertices of the bounding box. Now, I want a mapping to the 3D coordinates. For this reason, if I pass my query to the point cloud (calculated as described in my post), and ask for the 3D coordinates. would this be valid mapping.

Point cloud of the same space,
bounding boxes for the objects in the same space

The above two have been calculated in separate functions, but each using realsense APIs (streams creation, frame capture, alignment).

thanks,

February 07, 2024 11:55

There are very few available references for use of bounding boxes with RealSense and even fewer for 3D bounding boxes, unfortunately. The best known methods are with darknet for 2D bounding boxes and darknet_3d_ros for 3D bounding boxes.

https://github.com/IntelRealSense/realsense-ros/issues/2311

coordinate mapping issue between data captured in separate functions

Comments

Didn't find what you were looking for?