I created a system based on Intel RealSense D455 and OpenVINO 2022

July 21, 2022 07:39

My specific request is about OpenVINO.

A brief description of my system: it is written in Python and runs as a web server that shows RealSense measurements in a browser, for example on a smartphone.

It is structured with 3 different detector models:

- OpenVINO face-detection-retail-0005

- OpenVINO person-detection-retail-0013

- and finally TensorFlow (tensorflow.compat.v1) HEAD_DETECTION_300x300_ssd_mobilenetv2.pb

The latter replaced a slower model that I had trained myself with YOLOv3. However, I am still not satisfied with the result, because today it runs at around 18-19 FPS.

I'm buying a NUCbox kb5 mini PC with an n5015 processor and Intel UHD graphics to take advantage of the Intel GPU with OpenVINO.

In short, it is a lot of money for a private person, so I would like to get the most out of this configuration, and the weak link of my system is precisely the third detector model.

A single OpenVINO model currently runs at 40-50 FPS.

Two OpenVINO models run at 30-40 FPS.

With the addition of the tensorflow.compat.v1 model, everything runs at 18-19 FPS.
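To make these numbers easier to compare, here is a small sketch that converts them into per-frame times. It assumes the midpoint of each FPS range quoted above (45, 35 and 18.5) and assumes the models run one after the other on every frame, so the figures are rough estimates rather than measurements.

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps

# Midpoints of the ranges quoted in the post (an assumption, not measured values).
one_model_ms = frame_time_ms(45.0)      # one OpenVINO model
two_models_ms = frame_time_ms(35.0)     # two OpenVINO models
three_models_ms = frame_time_ms(18.5)   # two OpenVINO models + TensorFlow model

# Estimated cost of the TensorFlow head detector alone.
tf_model_ms = three_models_ms - two_models_ms

# Time left for a third model if the whole loop must reach 30 FPS.
headroom_ms = frame_time_ms(30.0) - two_models_ms

print(f"TensorFlow model costs about {tf_model_ms:.1f} ms per frame")
print(f"Headroom for a third model at 30 FPS: {headroom_ms:.1f} ms")
```

Under these assumptions the TensorFlow detector accounts for roughly half of the time spent on each frame, which is why replacing it matters more than tuning the other two models.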

If I replaced this last detector with a native OpenVINO model, so that the whole project is Intel based, I would expect at least 20-30 FPS, and more with the NUCbox GPU (I checked the Intel driver list, and the GPU is compatible with OpenVINO). I would then be able to reach 30 FPS as per my project specification, and since I run the RealSense D455 at 60 FPS, I would not lose even one FPS.
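As a rough illustration of how the GPU could be preferred when it is present, the sketch below picks a device name from a list of available devices. The helper `pick_device` is hypothetical (it is not part of OpenVINO); the list itself would normally come from `Core().available_devices` in the OpenVINO 2022 Python API, and the code falls back to a placeholder list when OpenVINO is not installed.

```python
def pick_device(available, preferred=("GPU", "CPU")):
    """Return the first available device matching the preference order.

    Hypothetical helper: accepts names such as "GPU" or "GPU.0".
    """
    for want in preferred:
        for name in available:
            if name == want or name.startswith(want + "."):
                return name
    raise RuntimeError(f"none of {preferred} found in {available}")

try:
    from openvino.runtime import Core  # OpenVINO 2022.x Python API
    devices = Core().available_devices
except ImportError:
    devices = ["CPU"]  # placeholder so the sketch runs without OpenVINO

device = pick_device(devices)
print("Would compile the models for:", device)
```

The chosen name could then be passed as the device argument when compiling each model, so the same script runs on a machine with or without a supported Intel GPU.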

The third detector model is dedicated to the head of a human subject seen from behind, which the first model cannot detect and the second model can only partially detect.

I would definitely like to eliminate TensorFlow and instead use a third OpenVINO model dedicated exclusively to detecting the head of a human subject seen from behind! A head-detection-retail :-)

I couldn't find one in the Open Model Zoo.

Converting my YOLOv3 model with Model Optimizer was not successful. I believe that some step of the conversion to .pb format did not produce a model compatible with Model Optimizer, and the result is a model that plots detections at random.

So my request is this:

I would like to avoid converting anything else, because given my results so far I am not very experienced with it. Do you already have a model in one of your archives that I can use turnkey?

If so, could you kindly provide it, so that I can complete my project with an entirely Intel-based system?

Sincerely, Marco Pirondini


MartyG

July 19, 2022 17:01
Hi Marco Pirondini, have you seen the RealSense SDK's OpenVINO face detection example program rs-face-vino at the link below? It uses the trained model Intermediate Representation files face-detection-adas-0001.xml and .bin.

https://github.com/IntelRealSense/librealsense/tree/master/wrappers/openvino/face
Marco Pirondini

July 21, 2022 06:42

Well, your link is a very interesting repository, because it describes how asynchronous detection works: a frame is queued, and its results are processed only when the next frame is available!
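The idea of queueing one frame and reading its result when the next frame arrives can be sketched in plain Python. This is only an illustration of the pattern, not the code from the linked example: `run_pipelined` and `infer` are hypothetical names, and a worker thread stands in for the inference request.

```python
from concurrent.futures import ThreadPoolExecutor

def run_pipelined(frames, infer):
    """Yield (frame, result) pairs with one frame of lag.

    While frame N is being inferred in the worker thread, the caller is
    free to grab frame N+1; the result of frame N is collected only then.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = None  # (frame, future) for the frame currently in flight
        for frame in frames:
            future = pool.submit(infer, frame)  # queue the new frame
            if pending is not None:
                prev_frame, prev_future = pending
                yield prev_frame, prev_future.result()
            pending = (frame, future)
        if pending is not None:  # flush the last queued frame
            prev_frame, prev_future = pending
            yield prev_frame, prev_future.result()

# Usage with a stand-in "model" that doubles its input.
for frame, result in run_pipelined([1, 2, 3], lambda x: x * 2):
    print(frame, result)
```

Results come back in submission order, one frame late, so anything drawn on screen belongs to the previous frame rather than the one just captured.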

Second, I just tried face-detection-adas-0001.xml and .bin as you suggested, but that model also detects only faces, whereas I am looking for an OpenVINO model dedicated exclusively to detecting the head of a human subject seen from behind! I want to combine a back-of-head detector with a face detector so that detection is never lost when the subject walks forward, walks backward, or simply turns their head without looking into the lens!

Thank you
MartyG

July 21, 2022 07:18
OpenVINO also has a head pose estimation model called head-pose-estimation-adas-0001, though its documentation does not specify whether the rear of the head is detectable.

https://docs.openvino.ai/2021.3/omz_models_model_head_pose_estimation_adas_0001.html

Marco Pirondini

February 15, 2023 09:54

Originally the project was structured with 3 different detector models:

- OpenVINO face-detection-retail-0005

- OpenVINO person-detection-retail-0013

- and finally TensorFlow (tensorflow.compat.v1) HEAD_DETECTION_300x300_ssd_mobilenetv2.pb

After a while I solved my issue with an approach similar to MartyG's suggestion.

Now the project is structured with 3 new detector models (the order has changed):

- OpenVINO human-pose-estimation-0001 (instead of HEAD_DETECTION_300x300_ssd_mobilenetv2.pb)

- OpenVINO person-detection-retail-0013 (and/or person-detection-0200); '0200' is better for a person close to the camera

- OpenVINO face-detection-adas-0001 (instead of face-detection-retail-0005); '0001' is better for a distant face

The first one works at both far and close range, but it has two 'problems' to solve:

First problem: it is asynchronous. Only when it has resolved its last queued frame can I apply the last two models in the list (the last two models run nested inside that step, but they work).

Second problem: the first model needs a parser that identifies the head. A simple but complete algorithm can do it.

Here is my Python code for this algorithm, which detects the head from the 'human-pose-estimation-0001' output:

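import cv2
import numpy as np

# This fragment runs inside the capture loop. hpe_pipeline is assumed to be an
# asynchronous pipeline wrapping human-pose-estimation-0001; image,
# next_frame_id and next_frame_id_to_show come from the surrounding loop.
if hpe_pipeline.is_ready():
    # Resize to the model input size and remember the scale back to the image
    frame = cv2.resize(image,(456, 256))
    ratio = (image.shape[1] / 456, image.shape[0] / 256)
    # Submit for inference
    hpe_pipeline.submit_data(frame, next_frame_id, {'img': image, 'ratio_frame': ratio})

    next_frame_id += 1

else:
    # Wait for empty request
    hpe_pipeline.await_any()

if hpe_pipeline.callback_exceptions:
    raise hpe_pipeline.callback_exceptions[0]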

results = hpe_pipeline.get_result(next_frame_id_to_show)
if results:
    threshold = 0.1
    next_frame_id_to_show += 1 
    (poses, scores), frame_meta = results
    img = frame_meta['img']
    ratio_frame = frame_meta['ratio_frame']

    if poses.size > 0:
        for pose in poses:
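            points = pose[:, :2].astype(np.int32)
            #points = output_transform.scale(points)
            points_scores = pose[:, 2]
            # Keypoints scaled back to the original image size; (0, 0) marks a
            # keypoint below the confidence threshold.
            # Note: this rebinds the name `poses`; the enclosing for loop keeps
            # iterating over the original array.
            poses = np.zeros((19, 2), np.int32)

            for i, (p, v) in enumerate(zip(points, points_scores)):
                if v > threshold:
                    #cv2.circle(img, tuple(p), 1, colors[i], 2)
                    #cv2.putText(img, str(i), tuple(p), cv2.FONT_HERSHEY_PLAIN, 1, (255,255,255), 1, cv2.LINE_AA) 
                    poses[i,0] = p[0] * ratio_frame[0]
                    poses[i,1] = p[1] * ratio_frame[1]

            # Indices used below, assuming COCO keypoint order:
            # 0 nose, 1-2 eyes, 3-4 ears, 5-6 shoulders.
            # trovato is Italian for "found".
            trovato = False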

            if ((trovato == False) and
                (poses[3,0] != 0 and poses[3,1] != 0) and
                (poses[4,0] != 0 and poses[4,1] != 0)):
                centerX = min(poses[3,0], poses[4,0]) + int(abs(poses[3,0] - poses[4,0])/2)
                centerY = min(poses[3,1], poses[4,1]) + int(abs(poses[3,1] - poses[4,1])/2)
                radius = int(abs(poses[3,0] - poses[4,0])/2)
                trovato = True
                
            if ((trovato == False) and
                (poses[3,0] != 0 and poses[3,1] != 0) and
                (poses[2,0] != 0 and poses[2,1] != 0)):
                centerX = min(poses[3,0], poses[2,0]) + int(abs(poses[3,0] - poses[2,0])/2)
                centerY = min(poses[3,1], poses[2,1]) + int(abs(poses[3,1] - poses[2,1])/2)
                radius = int(abs(poses[3,0] - poses[2,0])/2)
                trovato = True
                
            if ((trovato == False) and
                (poses[4,0] != 0 and poses[4,1] != 0) and
                (poses[1,0] != 0 and poses[1,1] != 0)):
                centerX = min(poses[1,0], poses[4,0]) + int(abs(poses[1,0] - poses[4,0])/2)
                centerY = min(poses[1,1], poses[4,1]) + int(abs(poses[1,1] - poses[4,1])/2)
                radius = int(abs(poses[1,0] - poses[4,0])/2)
                trovato = True          

            if ((trovato == False) and
                (poses[3,0] != 0 and poses[3,1] != 0) and
                (poses[5,0] != 0 and poses[5,1] != 0) and       
                (poses[6,0] != 0 and poses[6,1] != 0)):
                centerXs = min(poses[5,0], poses[6,0]) + int(abs(poses[5,0] - poses[6,0])/2)
                centerX = min(centerXs, poses[3,0]) + int(abs(centerXs - poses[3,0])/2)
                centerY = poses[3,1] 
                radius = int(abs(poses[3,0] - centerXs)/2)
                trovato = True

            if ((trovato == False) and
                (poses[4,0] != 0 and poses[4,1] != 0) and
                (poses[5,0] != 0 and poses[5,1] != 0) and       
                (poses[6,0] != 0 and poses[6,1] != 0)):
                centerXs = min(poses[5,0], poses[6,0]) + int(abs(poses[5,0] - poses[6,0])/2)
                centerX = min(centerXs, poses[4,0]) + int(abs(centerXs - poses[4,0])/2)     
                centerY = poses[4,1] 
                radius = int(abs(centerXs - poses[4,0])/2)
                trovato = True
                
            if ((trovato == False) and
                (poses[3,0] != 0 and poses[3,1] != 0) and
                (poses[0,0] != 0 and poses[0,1] != 0)):
                centerX = min(poses[0,0], poses[3,0]) + int(abs(poses[0,0] - poses[3,0])/2)
                centerY = min(poses[0,1], poses[3,1]) + int(abs(poses[0,1] - poses[3,1])/2)
                radius = int(abs(poses[0,0] - poses[3,0])/2)
                trovato = True

            if ((trovato == False) and
                (poses[0,0] != 0 and poses[0,1] != 0) and
                (poses[4,0] != 0 and poses[4,1] != 0)):
                centerX = min(poses[0,0], poses[4,0]) + int(abs(poses[0,0] - poses[4,0])/2)
                centerY = min(poses[0,1], poses[4,1]) + int(abs(poses[0,1] - poses[4,1])/2)
                radius = int(abs(poses[0,0] - poses[4,0])/2)
                trovato = True
                
            if ((trovato == False) and
                (poses[4,0] != 0 and poses[4,1] != 0) and
                (poses[2,0] != 0 and poses[2,1] != 0)):
                centerX = min(poses[4,0], poses[2,0]) + int(abs(poses[4,0] - poses[2,0])/2)
                centerY = min(poses[4,1], poses[2,1]) + int(abs(poses[4,1] - poses[2,1])/2)
                radius = int(abs(poses[4,0] - poses[2,0])/2)
                trovato = True
                
            if ((trovato == False) and
                (poses[3,0] != 0 and poses[3,1] != 0) and
                (poses[1,0] != 0 and poses[1,1] != 0)):
                centerX = min(poses[1,0], poses[3,0]) + int(abs(poses[1,0] - poses[3,0])/2)
                centerY = min(poses[1,1], poses[3,1]) + int(abs(poses[1,1] - poses[3,1])/2)
                radius = int(abs(poses[1,0] - poses[3,0])/2)
                trovato = True   
                
            if ((trovato == False) and
                (poses[3,0] != 0 and poses[3,1] != 0)):

                centerX = 0
                if (poses[5,0] != 0 and poses[5,1] != 0):
                    centerXs = min(poses[5,0], poses[3,0]) + int(abs(poses[5,0] - poses[3,0])/2)
                    centerX = poses[3,0] 
                    
                elif (poses[6,0] != 0 and poses[6,1] != 0):
                    centerXs = min(poses[6,0], poses[3,0]) + int(abs(poses[6,0] - poses[3,0])/2)
                    centerX = poses[3,0] 
                    
                if (centerX != 0):
                    centerY = poses[3,1] 
                    radius = int(abs(poses[3,0] - centerXs))
                    trovato = True

            if ((trovato == False) and
                (poses[4,0] != 0 and poses[4,1] != 0)):
                
                centerX = 0
                if (poses[5,0] != 0 and poses[5,1] != 0):
                    centerXs = min(poses[5,0], poses[4,0]) + int(abs(poses[5,0] - poses[4,0])/2)
                    centerX = poses[4,0] 
                    
                elif (poses[6,0] != 0 and poses[6,1] != 0):
                    centerXs = min(poses[6,0], poses[4,0]) + int(abs(poses[6,0] - poses[4,0])/2)
                    centerX = poses[4,0] 
                    
                if (centerX != 0):
                    centerY = poses[4,1] 
                    radius = int(abs(poses[4,0] - centerXs))
                    trovato = True  
                    
            if (trovato == True):         
                if ((poses[5,0] != 0 and poses[5,1] != 0) and       
                    (poses[6,0] != 0 and poses[6,1] != 0)):
                    centerYs = min(poses[6,1], poses[5,1]) + int(abs(poses[6,1] - poses[5,1])/2)
                    radius2 = int(abs(centerYs - centerY)/2) 
                    if radius < radius2: radius = radius2 
                elif ((poses[5,0] != 0 and poses[5,1] != 0)):
                    radius2 = int(abs(poses[5,1] - centerY)/2) 
                    if radius < radius2: radius = radius2 
                elif ((poses[6,0] != 0 and poses[6,1] != 0)):
                    radius2 = int(abs(poses[6,1] - centerY)/2) 
                    if radius < radius2: radius = radius2 
                                        
            if (trovato == True):
                # Put stuff here when you want to find Head

                # Cast to plain int so the drawing calls get Python integers
                x1 = int(centerX)
                y1 = int(centerY)
                x0 = x1 - radius
                y0 = y1 - int(radius * 1.3)
                # w0, h0 are the bottom-right corner, not a width and height
                w0 = x1 + radius
                h0 = y1 + int(radius * 1.3)

                cv2.rectangle(img, (x0,y0), (w0,h0), (0,0,255), 2)
                # OR better
                cv2.ellipse(img, (x1,y1), (int((w0-x0)/2),int((h0-y0)/2)), 0, 0, 360, (0,255,0), 2)
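For readers who want a shorter starting point, the pair-based cases of the cascade above can be written as a table-driven helper. This is a simplified sketch, not a replacement for the code above: it covers only the two-keypoint cases (ears, eyes, nose) and leaves out the shoulder fallbacks and the radius correction. The names `PAIR_PRIORITY`, `is_detected`, `head_from_pair` and `find_head` are hypothetical.

```python
import numpy as np

# Keypoint index pairs tried in the same order as the pair cases above.
PAIR_PRIORITY = [(3, 4), (3, 2), (4, 1), (0, 3), (0, 4), (4, 2), (3, 1)]

def is_detected(point):
    """A keypoint left at (0, 0) was below the confidence threshold."""
    return point[0] != 0 and point[1] != 0

def head_from_pair(a, b, aspect=1.3):
    """Return the head box (x0, y0, x1, y1) centred between two keypoints."""
    ax, ay, bx, by = int(a[0]), int(a[1]), int(b[0]), int(b[1])
    center_x = min(ax, bx) + abs(ax - bx) // 2
    center_y = min(ay, by) + abs(ay - by) // 2
    radius = abs(ax - bx) // 2
    half_height = int(radius * aspect)  # head is taller than it is wide
    return (center_x - radius, center_y - half_height,
            center_x + radius, center_y + half_height)

def find_head(keypoints):
    """Return a head box from the first detected pair, or None."""
    for i, j in PAIR_PRIORITY:
        if is_detected(keypoints[i]) and is_detected(keypoints[j]):
            return head_from_pair(keypoints[i], keypoints[j])
    return None

# Usage: both ears visible, as when the subject is seen from behind.
keypoints = np.zeros((19, 2), np.int32)
keypoints[3] = (100, 50)
keypoints[4] = (120, 50)
print(find_head(keypoints))
```

Keeping the priority order in a list makes it easy to reorder or extend the cases without copying another if block.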

MartyG

February 15, 2023 11:02
Thanks so much Marco Pirondini for sharing your knowledge and your code with the RealSense community!