Open Access Paper
2 February 2023 Research on binocular ranging method based on feature point extraction and matching
Ganquan Su, Lei Cheng
Proceedings Volume 12462, Third International Symposium on Computer Engineering and Intelligent Communications (ISCEIC 2022); 1246218 (2023) https://doi.org/10.1117/12.2660775
Event: International Symposium on Computer Engineering and Intelligent Communications (ISCEIC 2022), 2022, Xi'an, China
Abstract
Artificial intelligence is currently a very active field, and visual ranging and image processing are particularly important to its development. Many existing ranging methods are neither fast enough nor accurate enough, so the estimated distance can deviate considerably from the actual distance. As a result, unmanned vehicles and unmanned aircraft cannot judge distances or avoid obstacles accurately during operation, and robots grasp with positional deviations, which can endanger personal safety and cause property damage. To address these problems, this paper uses the SIFT and ORB algorithms to extract feature points, matches them with BFmatcher, FlannBasedMatcher, and KnnMatch, and finally obtains the corresponding distance; the methods are compared in terms of speed and accuracy. The experimental measurements show that FlannBasedMatcher after SIFT extraction balances speed and accuracy, while the ORB algorithm is faster than the SIFT algorithm.

1.

INTRODUCTION

Artificial intelligence is now a widely discussed topic, and research on computer vision is growing with it. As a key research direction in this field, binocular technology is attracting increasing attention and has broad prospects in robot vision, driverless prediction, and obstacle avoidance. A binocular camera perceives the depth information of a 3D scene and thus provides an effective basis for robot vision and 3D modeling [1]. Ranging can be implemented in many ways. Traditionally, distance is measured directly with a ruler, whereas modern techniques obtain it by calculation from a geometric model. Besides binocular ranging, there are laser ranging, ultrasonic ranging, infrared ranging, etc. These methods can achieve high accuracy, but they require more complex equipment and a higher implementation cost.

In this work, binocular stereo vision ranging is applied to object ranging in a favorable indoor environment. Binocular vision technology uses a computer to simulate the way human eyes observe the surrounding environment. A fixed binocular camera simultaneously captures a pair of images from different angles [2]. As with human eyes, the image information of the object is captured and passed to the computer, which processes it to obtain the corresponding results. The key steps of binocular ranging are camera calibration and matching. For calibration, images are first acquired with VC++ and OpenCV and then loaded into the Stereo Camera Calibrator toolbox of MATLAB to obtain the internal and external parameters of the camera. For distance measurement, feature points are extracted with the local-feature algorithms SIFT and ORB and then matched with BFmatcher [3], FlannBasedMatcher [4], and KnnMatch [5]. The disparity (parallax) is calculated from the feature matches, and the distance of the target object is then obtained from the disparity according to the principle of similar triangles [6]. The final ranging result is computed by a program written in VS2019 with the help of OpenCV [7].

2.

SYSTEM DESIGN

2.1

Ranging Principle and System Process of Binocular Camera

The human eye can perceive the distance of an object because the two eyes see slightly different images of it; this difference is termed parallax. The farther the target, the smaller the parallax; the closer the target, the greater the parallax. Binocular vision ranging applies the same principle: distance is measured from the captured image pair using geometry and knowledge of the camera parameters.

Assume a point P that can move freely within the field of view of the camera. As P moves, the positions of its imaging points on the left and right cameras change accordingly. Based on the principle of binocular parallax triangulation, the depth of the object is obtained, which realizes distance measurement with the binocular camera. Figure 1 shows the ranging principle. In the figure, P is the position of the moving point; Ol and Or are the optical centers of the left and right cameras; F is the focal length of the camera; B is the baseline length; Pl and Pr are the left and right imaging points.

Figure 1.

Schematic diagram of binocular camera ranging

00044_PSISDG12462_1246218_page_2_4.jpg

The core of binocular ranging is to solve for the disparity D:

$$D = x_l - x_r \tag{1}$$

where $x_l$ and $x_r$ are the horizontal image coordinates of the imaging points Pl and Pr.

According to the principle of similar triangles, the depth Z satisfies:

$$\frac{B - D}{Z - F} = \frac{B}{Z} \tag{2}$$

Simplifying gives:

$$Z = \frac{F \cdot B}{D} \tag{3}$$

For a given binocular camera, the focal length F and the baseline length B are fixed parameters determined by the selected camera. Only the disparity D and the depth Z need to be calculated by the computer.
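As a minimal sketch of this relationship, the depth can be computed from the disparity as the product of focal length and baseline divided by the disparity. The function name and the numerical values below are hypothetical and are not taken from the calibration results of this paper; the sketch also assumes that the focal length and the disparity are both expressed in pixels, so that the depth has the unit of the baseline.

```python
def depth_from_disparity(focal_px: float, baseline_mm: float, disparity_px: float) -> float:
    """Return the depth Z = F * B / D for a rectified stereo pair.

    focal_px and disparity_px are in pixels, so Z has the unit of baseline_mm.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_px * baseline_mm / disparity_px


# Hypothetical camera parameters, chosen only for illustration.
F = 800.0  # focal length in pixels (assumption)
B = 60.0   # baseline length in millimetres (assumption)

for d in (120.0, 96.0, 80.0):
    z = depth_from_disparity(F, B, d)
    print(f"D = {d:6.1f} px -> Z = {z:7.1f} mm")
```

The loop illustrates the inverse relation stated above: as the disparity decreases, the computed depth increases.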

The binocular ranging process is divided into the following steps: image acquisition, camera calibration, stereo correction, feature point extraction and matching, and ranging. Image acquisition is implemented in a program using OpenCV. Camera calibration obtains the internal parameters, such as focal length, principal point, radial distortion, and tangential distortion, and the external parameters, namely the translation and rotation parameters. Stereo correction uses the calibrated parameters to eliminate distortion and to row-align the two images as required for ranging. Finally, feature points are extracted from the left and right images and matched to find the disparity, from which the depth is calculated and the ranging result is acquired. Figure 2 is the flow chart of the system.

Figure 2.

Flow chart of binocular camera ranging system

00044_PSISDG12462_1246218_page_3_1.jpg

2.2

Image Acquisition and Camera Calibration

Image acquisition uses OpenCV to control the binocular camera from a program and capture images from different angles.

Camera calibration obtains the internal and external parameters of the camera. These parameters are constant and depend only on the selected equipment, so once they have been obtained accurately they can be reused directly. Together, the internal and external parameters determine the correspondence between the image coordinate system and the world coordinate system. The internal parameters describe the transformation from the image plane to pixels and depend only on the physical characteristics of the camera itself. The external parameters describe the transformation between the camera coordinate system and the world coordinate system and are determined by the internal parameters and the baseline length [8].

This experiment uses the common checkerboard calibration method. The template is a rectangular black-and-white checkerboard with 10×7 corner points and squares of 29 mm × 29 mm, as shown in Figure 3.

Figure 3.

10×7 black and white checkerboard

00044_PSISDG12462_1246218_page_3_2.jpg

Three coordinate systems are used in calibration: the image coordinate system, the camera coordinate system, and the world coordinate system. The image coordinate system is further divided into the pixel coordinate system and the physical coordinate system. The pixel coordinate system, with axes U and V, describes the image captured by the camera and returned to the computer, where it is converted into a digital image whose elements are called pixels. The physical coordinate system is generally represented by X and Y, with its origin at the intersection of the optical axis of the lens and the image plane. Let dx and dy be the size of a pixel along the X and Y axes; the transformation between pixel coordinates and physical coordinates is then

00044_PSISDG12462_1246218_page_4_1.jpg

The camera coordinate system is represented by Xc, Yc and Zc. According to the camera coordinate system, the imaging position of the moving point P on the image is

00044_PSISDG12462_1246218_page_4_2.jpg

The matrix form is

00044_PSISDG12462_1246218_page_4_3.jpg

The internal parameter matrix can be obtained by combining matrices (4) and (6)

00044_PSISDG12462_1246218_page_4_4.jpg

The world coordinate system is represented by Xw, Yw and Zw. Like the camera coordinate system, it is three-dimensional, so the transformation between the two requires only a rotation and a translation, and the resulting matrix is the external parameter matrix

00044_PSISDG12462_1246218_page_4_5.jpg
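The coordinate transformations described in this subsection correspond to the standard pinhole camera model. As a hedged sketch, one common way of writing these relations is given below. The principal point symbols $u_0$ and $v_0$ are assumptions introduced here, the focal length is written as F to agree with Section 2.1, and the exact notation of the paper's equations (4) to (8) may differ.

```latex
\begin{aligned}
% pixel coordinates from physical image coordinates
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
&= \begin{bmatrix} 1/dx & 0 & u_0 \\ 0 & 1/dy & v_0 \\ 0 & 0 & 1 \end{bmatrix}
   \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}, \\[4pt]
% perspective projection of a point in the camera coordinate system
X &= \frac{F\,X_c}{Z_c}, \qquad Y = \frac{F\,Y_c}{Z_c}, \\[4pt]
% combined internal parameter matrix
Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
&= \begin{bmatrix} F/dx & 0 & u_0 \\ 0 & F/dy & v_0 \\ 0 & 0 & 1 \end{bmatrix}
   \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix}, \\[4pt]
% external parameters: rotation R and translation T
\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix}
&= \begin{bmatrix} R & T \\ 0^{\top} & 1 \end{bmatrix}
   \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}.
\end{aligned}
```

Substituting the second line into the first gives the third line, which shows how the internal parameter matrix combines the focal length, the pixel size, and the principal point.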

Thirty pairs of images are input into the Stereo Camera Calibrator toolbox of MATLAB, of which 29 pairs satisfy the calibration conditions after screening. Corner extraction and related operations are then performed on these pairs. Finally, the one pair with a large error is removed, leaving 28 pairs for calibration. The calibration results are as follows:

00044_PSISDG12462_1246218_page_4_6.jpg
00044_PSISDG12462_1246218_page_5_1.jpg
00044_PSISDG12462_1246218_page_5_2.jpg
00044_PSISDG12462_1246218_page_5_3.jpg
00044_PSISDG12462_1246218_page_5_4.jpg
00044_PSISDG12462_1246218_page_5_5.jpg

where Ml and Mr are the internal parameter matrices of the left and right cameras, Dl and Dr are their distortion parameters, R is the rotation between the left and right cameras, and T is the translation between them.

2.3

Stereo Correction

Stereo correction eliminates the distortion in the images acquired by the left and right cameras using the calibration results. The undistorted images are then brought into strict row correspondence, which requires reprojecting the image planes of the two cameras.

This process prepares for the later acquisition of the disparity through feature point extraction and matching. With row-aligned images, the stereo disparity is simplest to calculate, which reduces the computational complexity of matching and improves the accuracy of feature point extraction and matching [9].

The Bouguet algorithm [10] uses the rotation parameter R and translation parameter T obtained from the binocular camera calibration above to minimize the change that reprojection produces in each image while maximizing the overlapping observation area of the left and right images. To minimize the reprojection distortion, the matrix R that rotates the right camera image plane to the left camera image plane is split into two parts, so that each camera is rotated by half of R.
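The splitting of R can be illustrated with a short sketch. It covers only the halving step: R is converted to an axis and an angle, and a rotation by half the angle and its inverse are built with Rodrigues' formula. The function names are hypothetical, the sketch is not the OpenCV implementation, and the subsequent row alignment using T is not shown. Which half is assigned to which camera depends on the convention used for R, which is an assumption left open here.

```python
import math


def rodrigues_to_matrix(axis, angle):
    """Rotation matrix for a unit axis and an angle in radians (Rodrigues' formula)."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [t * x * x + c,     t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c,     t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def matrix_to_axis_angle(R):
    """Unit axis and angle of a rotation matrix (valid for 0 <= angle < pi)."""
    trace = R[0][0] + R[1][1] + R[2][2]
    angle = math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0)))
    if angle < 1e-12:
        return (1.0, 0.0, 0.0), 0.0  # identity rotation: the axis is arbitrary
    k = 2.0 * math.sin(angle)
    axis = ((R[2][1] - R[1][2]) / k, (R[0][2] - R[2][0]) / k, (R[1][0] - R[0][1]) / k)
    return axis, angle


def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def split_rotation(R):
    """Return (half, half_inverse) such that half * half equals R."""
    axis, angle = matrix_to_axis_angle(R)
    return rodrigues_to_matrix(axis, angle / 2.0), rodrigues_to_matrix(axis, -angle / 2.0)


# Hypothetical relative rotation: 0.2 rad about the vertical axis.
R = rodrigues_to_matrix((0.0, 1.0, 0.0), 0.2)
half, half_inverse = split_rotation(R)
print(matmul(half, half)[0])  # first row, equal to the first row of R up to rounding
```

Applying the half rotation to one camera and its inverse to the other removes the relative rotation while keeping the change to each image small.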

First, the calibration data are entered into the program as parameters. The corresponding OpenCV functions are then used to complete the correction. Finally, the corrected images are output.

Before stereo correction, the corner points of the black-and-white checkerboard do not lie on the same rows in the pictures taken by the left and right cameras (as shown in Figure 4). This makes the later matching calculation much more difficult.

Figure 4.

Images taken by left and right cameras without correction

00044_PSISDG12462_1246218_page_5_6.jpg

Figure 5.

Image pair by left and right cameras after correction

00044_PSISDG12462_1246218_page_6_1.jpg

After correction, the images obtained by the left and right cameras (as shown in Figure 5) show that corresponding positions on the black-and-white checkerboard all lie on the same row.
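One simple way to quantify this row alignment is to compare the row coordinates of corresponding points in the two images. The sketch below is an illustration under assumptions: the function name and the point coordinates are hypothetical, and the paper does not state that such a check was performed.

```python
def mean_row_offset(left_points, right_points):
    """Mean absolute difference of the row (v) coordinates of corresponding points.

    Each point is an (x, y) tuple in pixels; a value near zero indicates
    that corresponding points lie on the same image row.
    """
    if len(left_points) != len(right_points) or not left_points:
        raise ValueError("need two non-empty point lists of equal length")
    total = sum(abs(l[1] - r[1]) for l, r in zip(left_points, right_points))
    return total / len(left_points)


# Hypothetical checkerboard corners before and after correction.
before_left = [(100.0, 50.0), (200.0, 52.0)]
before_right = [(60.0, 58.0), (161.0, 61.0)]
after_left = [(100.0, 50.0), (200.0, 50.0)]
after_right = [(62.0, 50.0), (162.0, 50.0)]

print(mean_row_offset(before_left, before_right))  # larger offset before correction
print(mean_row_offset(after_left, after_right))    # zero offset after correction
```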

Figure 6.

Image of 350mm

00044_PSISDG12462_1246218_page_6_2.jpg

2.4

Feature Point Extraction, Matching and Ranging

A distinctive pixel of an image can be regarded as a feature of that image and is called a feature point. Feature points are characterized by repeatability, distinguishability, efficiency, locality, rotation invariance, and scale invariance. Distinguishability is used for detection, and repeatability is used for matching. A feature point consists of a keypoint and a descriptor. SIFT is a widely used scale-invariant feature detection method in which each feature point is described by a 128-dimensional vector. It searches for extreme points in scale space and extracts location, scale, and rotation invariants [11]. The ORB algorithm combines the FAST detector [12] with the BRIEF descriptor [13] and has scale invariance and rotation invariance [14]. After feature points are extracted with either algorithm, they are matched with the corresponding matcher, and the disparity is obtained from the matched points. The distance is finally obtained from the disparity through the depth calculation based on the principle of similar triangles.
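The matching and disparity steps can be sketched in plain Python as follows. This is an illustration under assumptions, not the implementation used in the experiments: it performs a brute-force search for the two nearest neighbours of each descriptor, screens the matches with a ratio test (the threshold 0.75 is an assumed value), and takes the mean horizontal offset of the matched keypoints as the disparity. The paper does not state how matches were screened or how a single disparity was derived from them, and all function names and data below are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance, suitable for float descriptors such as those of SIFT."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming(a, b):
    """Hamming distance between equal-length byte strings, suitable for binary descriptors such as those of ORB."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def ratio_test_matches(desc_left, desc_right, distance, ratio=0.75):
    """Brute-force 2-nearest-neighbour matching followed by a ratio test.

    Returns a list of (left_index, right_index) pairs.
    """
    matches = []
    for i, d in enumerate(desc_left):
        ranked = sorted((distance(d, e), j) for j, e in enumerate(desc_right))
        if len(ranked) < 2:
            continue
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:  # keep only clearly unambiguous matches
            matches.append((i, j))
    return matches


def mean_disparity(kp_left, kp_right, matches):
    """Mean of x_l - x_r over matched keypoints given as (x, y) tuples."""
    if not matches:
        raise ValueError("no matches to compute a disparity from")
    values = [kp_left[i][0] - kp_right[j][0] for i, j in matches]
    return sum(values) / len(values)


# Hypothetical two-dimensional descriptors and keypoints of a rectified pair.
desc_l = [(0.0, 0.0), (10.0, 10.0)]
desc_r = [(10.0, 10.5), (0.0, 0.5), (50.0, 50.0)]
kp_l = [(120.0, 40.0), (300.0, 90.0)]
kp_r = [(204.0, 90.0), (20.0, 40.0), (400.0, 10.0)]

pairs = ratio_test_matches(desc_l, desc_r, euclidean)
print(pairs, mean_disparity(kp_l, kp_r, pairs))
```

Passing `hamming` instead of `euclidean` adapts the same routine to binary descriptors, which reflects the main practical difference between matching SIFT and ORB features.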

3.

EXPERIMENT RESULTS AND ANALYSIS

3.1

Experiment Process and Results

The experiment was divided into five groups, and in each group the same target was ranged at six different distances. The first group used the SIFT+BFmatcher algorithm; the second group used SIFT+FlannBasedMatcher; the third group used SIFT+KnnMatch; the fourth group used ORB+BFmatcher; and the fifth group used ORB+KnnMatch. The content of each group is the same: the same target picture is placed at 350mm, 400mm, 450mm, 500mm, 550mm and 600mm, and ranging is then carried out to check the accuracy. The left and right shots at each distance are shown in Figures 6 to 11.

Figure 7.

Image of 400mm

00044_PSISDG12462_1246218_page_6_3.jpg

Figure 8.

Image of 450mm

00044_PSISDG12462_1246218_page_6_4.jpg

Figure 9.

Image of 500mm

00044_PSISDG12462_1246218_page_6_5.jpg

Figure 10.

Image of 550mm

00044_PSISDG12462_1246218_page_7_1.jpg

Figure 11.

Image of 600mm

00044_PSISDG12462_1246218_page_7_2.jpg

First, the camera is fixed in position and adjusted so that the two cameras are on the same horizontal line and perpendicular to the target plane. A ruler is then used to measure the distance between the camera and the target plane. The picture is pasted flat on a vertical plane and placed in front of the camera at the corresponding distance. Image acquisition and ranging are then carried out. The ranging results of the SIFT+BFmatcher algorithm are shown in Table 1, SIFT+FlannBasedMatcher in Table 2, SIFT+KnnMatch in Table 3, ORB+BFmatcher in Table 4, and ORB+KnnMatch in Table 5. Taking 600mm ranging as an example, the computation time of each algorithm is given in Table 6.

Table 1.

Experiment results of SIFT+BFmatcher binocular ranging

| Serial number | Soft ruler ranging/mm | Binocular ranging/mm | Absolute error/mm | Relative error/% |
| --- | --- | --- | --- | --- |
| 1 | 350 | 352.6164 | 2.6164 | 0.7475 |
| 2 | 400 | 389.7640 | 10.2360 | 2.5590 |
| 3 | 450 | 433.5267 | 16.4733 | 3.6607 |
| 4 | 500 | 478.3204 | 21.6796 | 4.3359 |
| 5 | 550 | 520.8069 | 29.1931 | 5.3078 |
| 6 | 600 | 567.2150 | 32.7850 | 5.4642 |
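The error columns of the result tables follow directly from the two distance columns. As a minimal sketch, assuming the absolute error is the magnitude of the difference between the binocular and soft ruler measurements and the relative error is that difference as a percentage of the soft ruler measurement, the second row of Table 1 can be reproduced as follows. The function name is hypothetical.

```python
def ranging_errors(reference_mm: float, measured_mm: float):
    """Return (absolute error in mm, relative error in %) of a ranging result."""
    absolute = abs(measured_mm - reference_mm)
    relative = absolute / reference_mm * 100.0
    return absolute, relative


# Second row of Table 1: soft ruler 400 mm, binocular ranging 389.7640 mm.
abs_err, rel_err = ranging_errors(400.0, 389.7640)
print(f"absolute error = {abs_err:.4f} mm, relative error = {rel_err:.4f} %")
```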

Table 2.

Experiment results of SIFT+FlannBasedMatcher binocular ranging

| Serial number | Soft ruler ranging/mm | Binocular ranging/mm | Absolute error/mm | Relative error/% |
| --- | --- | --- | --- | --- |
| 1 | 350 | 352.6164 | 2.6164 | 0.7475 |
| 2 | 400 | 393.1052 | 6.8948 | 1.7237 |
| 3 | 450 | 433.5267 | 16.4733 | 3.6607 |
| 4 | 500 | 502.9664 | 2.9664 | 0.5933 |
| 5 | 550 | 569.5153 | 19.5153 | 3.5482 |
| 6 | 600 | 591.2716 | 8.7284 | 1.4547 |

Table 3.

Experiment results of SIFT+KnnMatch binocular ranging

| Serial number | Soft ruler ranging/mm | Binocular ranging/mm | Absolute error/mm | Relative error/% |
| --- | --- | --- | --- | --- |
| 1 | 350 | 352.6164 | 2.6164 | 0.7475 |
| 2 | 400 | 393.1052 | 6.8948 | 1.7237 |
| 3 | 450 | 433.5267 | 16.4733 | 3.6607 |
| 4 | 500 | 502.9664 | 2.9664 | 0.5933 |
| 5 | 550 | 569.5153 | 19.5153 | 3.5482 |
| 6 | 600 | 591.2716 | 8.7284 | 1.4547 |

Table 4.

Experiment results of ORB+BFmatcher binocular ranging

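| Serial number | Soft ruler ranging/mm | Binocular ranging/mm | Absolute error/mm | Relative error/% |
| --- | --- | --- | --- | --- |
| 1 | 350 | 348.9470 | 1.0530 | 0.3009 |
| 2 | 400 | 394.4507 | 5.5493 | 1.3873 |
| 3 | 450 | 435.1743 | 14.8257 | 3.2946 |
| 4 | 500 | 488.1520 | 11.8480 | 2.3696 |
| 5 | 550 | 550.3675 | 0.3675 | 0.0668 |
| 6 | 600 | 569.9466 | 30.0534 | 5.0089 |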

Table 5.

Experiment results of ORB+KnnMatch binocular ranging

| Serial number | Soft ruler ranging/mm | Binocular ranging/mm | Absolute error/mm | Relative error/% |
| --- | --- | --- | --- | --- |
| 1 | 350 | 343.9796 | 6.0204 | 1.7201 |
| 2 | 400 | 389.8436 | 10.1564 | 2.5391 |
| 3 | 450 | 435.1743 | 14.8257 | 3.2946 |
| 4 | 500 | 488.1520 | 11.8480 | 2.3696 |
| 5 | 550 | 535.1565 | 14.8435 | 2.6988 |
| 6 | 600 | 569.1147 | 30.8853 | 5.1476 |

Table 6.

Computation time of each algorithm, taking ranging at 600mm as an example

| Algorithm | SIFT+BFmatcher | SIFT+FlannBasedMatcher | SIFT+KnnMatch | ORB+BFmatcher | ORB+KnnMatch |
| --- | --- | --- | --- | --- | --- |
| Time (s) | 1.2959 | 0.0586 | 0.0673 | 0.2475 | 0.1796 |

3.2

Analysis of the Experiment Results

The five groups of experiments show that the SIFT+FlannBasedMatcher and SIFT+KnnMatch algorithms give identical results: the feature points retained after matching and screening are exactly the same, and therefore so are the ranging results. However, SIFT+FlannBasedMatcher takes less time than SIFT+KnnMatch, which makes it the best matching scheme after SIFT extraction. The ORB algorithm, in turn, extracts feature points faster than the SIFT algorithm. In terms of ranging accuracy, SIFT+FlannBasedMatcher and SIFT+KnnMatch are the most precise overall, while ORB+BFmatcher is more accurate in the close-range measurements of this experiment. In general, SIFT+FlannBasedMatcher is the most efficient of the five groups: it is both accurate and fast, and can better meet the needs of binocular camera ranging and related research.
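The comparison can be summarized numerically from the tabulated data. The sketch below copies the relative errors of Tables 1 to 5 and the times of Table 6 and computes the mean relative error of each group. Using the mean as the summary measure is a choice made here for illustration and is not a metric reported in the paper.

```python
# Relative errors (%) at 350 mm to 600 mm, copied from Tables 1 to 5.
relative_error = {
    "SIFT+BFmatcher":         [0.7475, 2.5590, 3.6607, 4.3359, 5.3078, 5.4642],
    "SIFT+FlannBasedMatcher": [0.7475, 1.7237, 3.6607, 0.5933, 3.5482, 1.4547],
    "SIFT+KnnMatch":          [0.7475, 1.7237, 3.6607, 0.5933, 3.5482, 1.4547],
    "ORB+BFmatcher":          [0.3009, 1.3873, 3.2946, 2.3696, 0.0668, 5.0089],
    "ORB+KnnMatch":           [1.7201, 2.5391, 3.2946, 2.3696, 2.6988, 5.1476],
}

# Computation time (s) for ranging at 600 mm, copied from Table 6.
time_s = {
    "SIFT+BFmatcher": 1.2959,
    "SIFT+FlannBasedMatcher": 0.0586,
    "SIFT+KnnMatch": 0.0673,
    "ORB+BFmatcher": 0.2475,
    "ORB+KnnMatch": 0.1796,
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Map each algorithm to (mean relative error in %, time in s).
summary = {name: (mean(errs), time_s[name]) for name, errs in relative_error.items()}

# Print the groups ordered by mean relative error, smallest first.
for name, (err, t) in sorted(summary.items(), key=lambda kv: kv[1][0]):
    print(f"{name:24s} mean relative error = {err:.4f} %, time = {t:.4f} s")
```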

Figure 12.

Relationship between tape-measured distance and relative error

00044_PSISDG12462_1246218_page_10_1.jpg

Figure 13.

Relationship between ranging algorithm and operation speed

00044_PSISDG12462_1246218_page_10_2.jpg

4.

CONCLUSION

This paper studies the differences in speed and accuracy among five ranging algorithms based on binocular technology. Starting from the ranging principle of binocular vision, the steps of image acquisition, camera calibration, stereo correction, feature point extraction and matching, and ranging are studied through experiments, and the ranging information required by the experiment is calculated. Among the five ranging methods, SIFT+FlannBasedMatcher gives the best result, and the reasons for this are analyzed. Future work will build on continued progress in computer technology and upgrades of the related algorithms to develop a faster and more accurate ranging method.

REFERENCES

[1]

Xu Jie, Chen Yimin, Shi Zhilong, “Binocular Vision Zoom Ranging Technology [J],” Journal of Shanghai University (Natural Science Edition), 15 (2), 169–174 (2009).

[2]

Shasha Yu, Hao Huang, Yangjie Liu, “A Low-complexity Autonomous 3D Localization Method for Unmanned Aerial Vehicles by Binocular Stereovision Technology [A],” in IEEE. 2018 10th International Conference on Intelligent Human-Machine Systems and Cybernetics, 344–347 (2018).

[3]

Amila Jakubović, Jasmin Velagić, “Image Feature Matching and Object Detection Using Brute-Force Matchers [C],” in International Symposium ELMAR, (2018).

[4]

Vineetha Vijayan, Pushpalatha Kp, “FLANN Based Matching with SIFT Descriptors for Drowsy Features Extraction [C],” in International Conference on Image Information Processing, (2019).

[5]

Fengquan Zhang, Yahui Gao, Liuqing Xu, “An adaptive image feature matching method using mixed rich-KD tree [J],” Multimedia Tools and Applications, 23 (a24), 16421–16439 (2020).

[6]

Wang Hao, Xu Zhiwen, Xie Kun, “Binocular Ranging System Based on OpenCV [J],” Journal of Jilin University (Information Science Edition), 32 (2), 188–194 (2014).

[7]

Li Dejun, Ma Xiaohui, “The Binocular Measuring System Research Based on OpenCV [C],” in International Conference on Material and Manufacturing Technology, (2012).

[8]

Hu Jinbo, Zhang Feixiong, Wan Zekun, Huang Hao, “Research on Indoor Three-dimensional Measurement Algorithm Based on Binocular Technology [J],” Computer Measurement and Control, 27 (9), 66–67 (2019).

[9]

Zelin Meng, Xiangbo Kong, Lin Meng, Hiroyuki Tomiyama, “Distance Measurement and Camera Calibration based on Binocular Vision Technology [C],” in International Conference on Advanced Mechatronic Systems, (2018).

[10]

Gunen Mehmet Akif, Besdok Erkan, Civicioglu Pinar, Atasever Umit Haluk, “Camera Calibration by Using Weight Differential Evolution Algorithm: A Comparative Study with ABC, PSO, COBIDE, DE, CS, GWO, TLBO, MVMO, FOA, LSHADE, ZHANG and BOUGUET [J],” Neural Computing & Applications, 32 (23), (2020).

[11]

David G. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints [J],” International Journal of Computer Vision, 60, 91–110 (2004). https://doi.org/10.1023/B:VISI.0000029664.99615.94

[12]

Edward Rosten, Tom Drummond, “Machine Learning for High-speed Corner Detection [C],” in European Conference on Computer Vision, (2006).

[13]

Michael Calonder, Vincent Lepetit, Christoph Strecha, Pascal Fua, “BRIEF: Binary Robust Independent Elementary Features [C],” in European Conference on Computer Vision, (2010).

[14]

Ethan Rublee, Vincent Rabaud, Kurt Konolige, “ORB: An Efficient Alternative to SIFT or SURF [A],” in IEEE. 2011 International Conference on Computer Vision, 2564–2571 (2011).
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ganquan Su and Lei Cheng "Research on binocular ranging method based on feature point extraction and matching", Proc. SPIE 12462, Third International Symposium on Computer Engineering and Intelligent Communications (ISCEIC 2022), 1246218 (2 February 2023); https://doi.org/10.1117/12.2660775
KEYWORDS
Ranging

Cameras

Calibration

Imaging systems

Image processing

Detection and tracking algorithms

Feature extraction
