Open Access
22 August 2022 Special Section Guest Editorial: Biologically Inspired Computer Vision and Image Processing
Keping Yu, Wei Wang, Muhammad Tariq
Abstract

Guest Editors Keping Yu, Wei Wang, Moayad Aloqaily, and Muhammad Tariq introduce the Special Section on Biologically Inspired Computer Vision and Image Processing.

The domain of computer vision and image processing has evolved dramatically in recent years and is useful across a wide range of fields. It is a broad area of research concerned with extracting meaningful information from images. The key driving factor behind computer vision applications is “machine vision,” which enables a machine to see and extract intelligence. From a biological perspective, computer vision techniques aim to build computational models of the human visual system so that machines can autonomously perform tasks similar to those of human vision (and sometimes far beyond it). At present, deep learning assisted computer vision techniques such as convolutional neural networks offer promising results with improved accuracy across applications such as medical imaging, self-driving cars, multimedia, Earth observation, and many more.

In practice, computer vision and image processing have numerous applications in research, industry, and daily life. Because these techniques can process images efficiently, they can be applied to natural images, satellite images, radar images, seismic data, and so on. The scope widens further when the two techniques, computer vision and image processing, are combined, for example in robotic vision systems. However, these systems face notable challenges when they perform pixel-to-pixel transformations. Even after extensive research, they work only under certain constraints, and vision-related problems are not completely solved. This is because human vision is remarkably good at many tasks. For example, the human eye can recognize a face under varying viewpoint, illumination, expression, etc., whereas machine vision may find the same task difficult in certain situations.

The ultimate goal of this special section is to explore how novel computer vision and image processing approaches can be developed from biological insights. It offers a unique forum to develop and scale models rooted in experimental biology such as neurophysiology, psychophysics, etc. This may lead to an exciting synergy between computer vision and biological vision, enabling state-of-the-art advancements in this field. If appropriately explored, research on biologically inspired computer vision and image processing may lead to a new interdisciplinary research discipline with enormous opportunities.

We received a very good response to our open call for papers for the special section. All of the articles were rigorously evaluated according to the normal reviewing process of the Journal of Electronic Imaging. The evaluation process took into consideration factors pertaining to originality, technical quality, presentational quality, and overall contribution. In all, 17 articles were accepted for publication. In the following, we will introduce these articles and highlight their main contributions.

Almutiry examines the effect of the distribution of the most irregular iris images under visible light. A gray level co-occurrence matrix is proposed for segmentation, and VGG16 and VGG19 are applied for iris classification. The proposed iris recognition method divides the iris segmentation process into two steps: localization of the iris area of the eye and segmentation. Well-designed VGG16 and VGG19 networks are used to distinguish and locate the eyes.

Singh, Gaba, and Hedabou propose an effective denoising method that works well on both grayscale and color images. The proposed two-stage method involves (i) noisy pixel detection and (ii) noisy pixel restoration. In the first stage, the conglomerate method detects the noisy pixel position; in the second stage, the corrupted value of the noisy pixel is restored using a conglomerate method. The proposed method was used to process a variety of grayscale and color images.

Rao et al. use a cluster-based feature selection method to derive features for statistical and structural approaches. The proposed strategy is implemented in three stages. First, features are extracted using the local binary pattern, the gray level co-occurrence matrix, or the Gabor filter. Next, a modified K-means clustering method selects features using four standard distance measures: Euclidean, Minkowski, Chebyshev, and City Block. Finally, benchmark classifiers such as naive Bayes, support vector machine, and K-nearest neighbor are used to compute the classification accuracy.
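
For readers less familiar with the four distance measures named above, the following is a minimal sketch of their standard textbook definitions in NumPy. It is not the authors' modified K-means implementation; the function names and the default Minkowski order p = 3 are assumptions chosen for illustration.

```python
import numpy as np

def euclidean(a, b):
    # Square root of the sum of squared coordinate differences.
    return float(np.sqrt(np.sum((a - b) ** 2)))

def minkowski(a, b, p=3):
    # Generalized distance of order p (p = 3 is an illustrative default).
    return float(np.sum(np.abs(a - b) ** p) ** (1.0 / p))

def chebyshev(a, b):
    # Largest absolute coordinate difference.
    return float(np.max(np.abs(a - b)))

def city_block(a, b):
    # Sum of absolute coordinate differences (Manhattan distance).
    return float(np.sum(np.abs(a - b)))

a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])
print(euclidean(a, b), chebyshev(a, b), city_block(a, b))  # 5.0 4.0 7.0
```

Minkowski distance reduces to City Block for p = 1 and to Euclidean for p = 2, which is why the choice of measure changes which features a clustering step groups together.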

Tian et al. give a detailed introduction to the background of instance segmentation technology, its development, and the common datasets in this field. They further discuss in depth the key issues that have arisen in the development of the field and propose future development directions for instance segmentation technology. The review provides an important reference for future research on this technology.

Sharma and Chakraborty aim to identify the different optimization algorithms used in image steganography after embedding the data to improve the resilience, visibility, and payload carrying capacity. Additionally, the paper highlights several bioinspired algorithms, including particle swarm optimization, ant colony optimization, firefly optimization, and artificial bee colony optimization, and evaluates them through performance measures such as peak signal-to-noise ratio (PSNR) and mean square error (MSE). The performance metrics generated from the collected data indicate that the firefly method produced a higher PSNR and a lower MSE, namely 72.42 dB and 0.13, respectively.
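
As a reminder of how the two performance measures are usually computed, here is a minimal sketch using the standard definitions of MSE and PSNR. The peak value of 255 (8-bit images) and the toy cover/stego arrays are assumptions for illustration; the paper's own evaluation settings are not given here.

```python
import numpy as np

def mse(cover, stego):
    # Mean of squared pixel differences, computed in floating point
    # so that unsigned 8-bit inputs do not wrap around.
    cover = np.asarray(cover, dtype=np.float64)
    stego = np.asarray(stego, dtype=np.float64)
    return float(np.mean((cover - stego) ** 2))

def psnr(cover, stego, peak=255.0):
    # PSNR in dB; identical images give infinite PSNR.
    err = mse(cover, stego)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))

# Toy example: a flat 4x4 cover image with one pixel changed by 4 levels.
cover = np.full((4, 4), 100, dtype=np.uint8)
stego = cover.copy()
stego[0, 0] = 104
print(mse(cover, stego))             # 1.0
print(round(psnr(cover, stego), 2))  # 48.13
```

A higher PSNR and a lower MSE both indicate that the stego image is closer to the cover image, which is the sense in which the two measures are reported together.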

Sharma et al. introduce a robust watermarking method to efficiently embed multiple marks into ECG signals using redundant discrete wavelet transform and generalized singular value decomposition. Multiple marks are embedded within the signal to eliminate any ownership conflict and provide copyright protection. A hybrid optimization technique is also used to balance invisibility and robustness. Further, wavelet-based compression is adapted to compress the marked signal before transmission, which reduces the bandwidth demand.

Yi argues that paying attention to the local features of the eye can improve the accuracy of age estimation to a certain extent. The multilevel feature convolutional neural network (MLFCNN) is proposed, which emphasizes eye features and combines them with face features to jointly estimate human age. MLFCNN performs two rounds of estimation based on the extracted features. First, the age range of the sample is estimated as the age group, and then, on this basis, the fine age of the sample is further estimated.

Ishiyama et al. propose a deep learning model for the task of single image reflection removal (SIRR). The assumed scenes of reflection vary, and there is little training data because it is difficult to obtain true values. The authors focus on the latter problem and propose an SIRR method based on meta-learning, adopting model-agnostic meta-learning (MAML) to train the deep learning model. The iterative boost convolutional long short-term memory network is adopted as the deep learning model.

Sharma et al. offer an automatic recognition method based on deep learning that makes use of the Grupo de Procesado Digital de Señales. The biggest publicly accessible handwritten signature dataset, the synthetic signature dataset, was used to classify the signatures of 100 people, each of whom possessed 24 genuine signatures and 30 forged signatures. An Inception V3 transfer learning model is proposed by hyper-tuning different layers from the middle of its architecture, and this model is fine-tuned by adding layers such as flatten, dense (1024), dropout (0.5), and dense (1). This study will aid researchers in developing more effective CNN-based models for offline signature verification with application to computer vision.

Lu et al. propose a multitask deep active contour model for off-angle iris image segmentation. Specifically, the proposed approach combines coarse and fine localization results. The coarse localization detects the approximate position of the iris area and initializes the iris contours through a series of robust preprocessing operations. Then, the iris contours are represented by 40 ordered isometric sampling polar points, and their corresponding offset vectors are regressed multiple times via a convolutional neural network to obtain the precise inner and outer boundaries of the iris. Next, the predicted iris boundaries are used as a constraint to limit the segmentation range of the noise-free iris mask. In addition, an efficient channel attention module is introduced in the mask prediction to make the network focus on the valid iris region. A differentiable, fast, and efficient SoftPool operation is also used in place of traditional pooling to retain more detail for more accurate pixel classification. Finally, the proposed iris segmentation approach is combined with off-the-shelf iris feature extraction models, including the traditional OM and the deep learning-based FeatNet, for iris recognition.
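
To illustrate why SoftPool retains more detail than average pooling while staying differentiable, the sketch below implements the commonly used softmax-weighted formulation on a single 2D feature map with non-overlapping 2 x 2 windows. The function name, the window size, and the single-channel input are assumptions for illustration; the exact variant used by the authors is not specified here.

```python
import numpy as np

def softpool2x2(x):
    # Softmax-weighted pooling over non-overlapping 2x2 windows.
    # Each output is sum(w_i * a_i) with w_i = exp(a_i) / sum_j exp(a_j).
    h, w = x.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    # Gather each 2x2 window into the last axis: shape (h/2, w/2, 4).
    win = x.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3)
    win = win.reshape(h // 2, w // 2, 4)
    # Subtract the window maximum before exponentiating for stability;
    # this leaves the softmax weights unchanged.
    weights = np.exp(win - win.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return (weights * win).sum(axis=-1)

feature_map = np.arange(16, dtype=np.float64).reshape(4, 4)
pooled = softpool2x2(feature_map)
print(pooled.shape)  # (2, 2)
```

Each pooled value lies between the average and the maximum of its window, so strong activations dominate without the remaining activations being discarded as in max pooling.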

Sohan, Basalamah, and Solaiman collect a series of publicly available COVID-19 x-ray and CT image datasets, then assess and compare performance on them using a proposed 22-layer convolutional neural network model along with ResNet-18 and VGG16. The paper investigates eight individual datasets known as Twitter, SIRM x-ray, COVID-19 Image Repository, EURORAD, BMICV, SIRM CT, COVID-CT, and SARS-CoV-2 CT. The model obtained classification accuracy of 91%, 81%, 59%, 98%, 58%, 79%, and 97%, respectively. The proposed model obtained the highest classification accuracy on four datasets (Twitter, COVID-19 Image Repository, COVID-CT, and SARS-CoV-2 CT). Similarly, ResNet-18 performed best on three (EURORAD, BMICV, and SIRM CT), whereas VGG16 performed best only on the SIRM x-ray dataset. The results of this investigation provide a meaningful comparison of performance across the datasets.

To fully consider the exposure in various distorted images, Rahman et al. propose a diverse image enhancement model that improved the brightness and contrast, processed the colors, and eliminated the hazy effect. Accordingly, an input red green blue color image was transformed into a hue, saturation, value color image. The V component was inverted and enhanced using three steps. In the first step, the hyperbolic and statistical methods were applied, and then their results were combined using an adjusted logarithmic methodology. This method properly adjusted the high-contrast and low-contrast impact while preserving the vital image information. In the second step, the output of the first step was inverted back and fed into a complete optimization algorithm to estimate the illumination map. Then, the exposure ratio map was estimated using an illumination map, which was adjusted using the camera response function. In the third step, a nonlinear stretching function was introduced to control brightness and contrast. For instance, a lower value of α yielded maximum stretching, and a higher value of α eliminated haze in the image to a great extent.
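
The first preprocessing step described above (RGB to HSV conversion followed by inversion of the V component) can be sketched for a single pixel with Python's standard `colorsys` module. The function name and the 8-bit input range are assumptions for illustration, and the later enhancement steps of the method are not reproduced.

```python
import colorsys

def invert_value(rgb):
    # Convert an 8-bit RGB pixel to HSV (all channels in [0, 1])
    # and invert only the V (value) channel.
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return h, s, 1.0 - v

# A dark gray pixel becomes bright in the inverted V channel,
# while hue and saturation are left untouched.
print(invert_value((51, 51, 51)))
```

Operating on V alone means brightness and contrast can be adjusted without shifting the hue of the image.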

Zhai et al. propose a biologically inspired crowd counting method named group-split attention network (GSANet). The GSANet consists of three principal modules, namely the GS module, the dual-aware attention module, and the aggregation module. The GS module groups the input feature map and processes the subfeatures of each group in parallel to reduce the computational cost. The dual-aware attention module synergizes the spatial and channel dimensional information to alleviate the estimation error in background regions. The aggregation module adopts a learning-based cross-group strategy to aggregate and facilitate the fusion of feature maps along different channel dimensions.

Gobinath and Gopinath present the squeeze excitation residual UNet (SER-UNet) model for vessel segmentation. The proposed model uses a new type of residual block, called the SER residual block, for vessel segmentation. Initially, the fundus image is read and downsampled by converting the input image into vector values. Then, the model performs segmentation by adding an attention mechanism and a residual structure to the convolution blocks to find vessel regions accurately and aggregate the characteristics of tiny vessels. It helps segment the image of the glaucoma-affected region in the retina. Together with a pixelwise cross-entropy loss function, it shows excellent performance on fundus image segmentation.

Huo et al. propose a precise and fast iris segmentation algorithm. First, an efficient feature extraction network that combines depth-wise separable convolution with dilated convolution is designed to reduce model parameters while maintaining segmentation accuracy. Then, an attention mechanism is introduced to suppress noise interference and enhance the discriminability of the iris region. Finally, an auxiliary training branch is proposed to overcome the vanishing gradient problem.
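
The parameter savings from depth-wise separable convolution, and the enlarged receptive field from dilation, follow from simple arithmetic. The sketch below uses the standard formulas with bias terms ignored; the kernel size and channel counts are hypothetical values chosen for illustration, not figures from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    # One k x k filter per (input channel, output channel) pair.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # One k x k depth-wise filter per input channel,
    # followed by a 1 x 1 point-wise convolution.
    return k * k * c_in + c_in * c_out

def dilated_kernel_extent(k, dilation):
    # Spatial extent covered by a k x k kernel with the given dilation.
    return k + (k - 1) * (dilation - 1)

print(standard_conv_params(3, 64, 128))        # 73728
print(depthwise_separable_params(3, 64, 128))  # 8768
print(dilated_kernel_extent(3, 2))             # 5
```

In this hypothetical layer, the separable form needs roughly one eighth of the weights, while a dilation of 2 lets a 3 x 3 kernel cover a 5 x 5 area without adding any parameters.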

Li, Xue, and Gao propose the Center Normalization De-Trended Fluctuation Analysis (CNDFA) feature extraction algorithm for contours, based on the process of contour feature extraction for coal and gangue. Based on an analysis of the representative features of coal and gangue, the extraction process for target contour features is established based on hardness difference. Combined with the overall features of the contour curve after de-trending, a center normalized CNDFA feature extraction algorithm is proposed. First, the de-trended analysis of the contour curve is realized by least squares optimal fitting, and then the de-trended data are normalized. Finally, the contour features are described quantitatively by the multi-fractal spectrum to form the geometric features of the target contour curve, which are used to train an SVM classifier.
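
The de-trending and normalization steps can be sketched as follows, under stated assumptions: the trend is a polynomial fitted by least squares (order 1 by default), and normalization means centering the residual and scaling it to unit maximum magnitude. The paper's actual fitting order and normalization scheme may differ, and the multi-fractal spectrum stage is not shown.

```python
import numpy as np

def detrend_and_normalize(signal, order=1):
    # Fit a polynomial trend by least squares and subtract it.
    signal = np.asarray(signal, dtype=np.float64)
    t = np.arange(signal.size)
    coeffs = np.polyfit(t, signal, order)
    residual = signal - np.polyval(coeffs, t)
    # Center the residual and scale it to a maximum magnitude of 1
    # (assumed normalization; skipped if the residual is numerically zero).
    centered = residual - residual.mean()
    scale = np.max(np.abs(centered))
    return centered / scale if scale > 1e-12 else centered

# Hypothetical contour-distance signal: a linear drift plus an oscillation.
t = np.arange(64)
contour = 0.5 * t + 3.0 + np.sin(2.0 * np.pi * t / 8.0)
features = detrend_and_normalize(contour)
```

Removing the trend first means the subsequent descriptors respond to the fluctuations of the contour rather than to its overall drift.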

Selvanambi et al. propose a unique method for reducing noise in still images using a combination of block matching and a dilated convolutional neural network. In this approach, an existing algorithm is first used to preprocess the image. Then, the block matching technique slides a four-by-four window across the image to select blocks that are similar to one another. The matching blocks are highly correlated with one another. They are then fed to the deep dilated convolutional neural network to remove noise and obtain a cleaner image. Finally, the quality of the denoised image is evaluated using standard metrics.
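
A minimal sketch of the block matching step is given below, assuming a stride of one pixel and the sum of squared differences (SSD) as the similarity measure. Both choices, along with the function name and the `top_k` parameter, are assumptions for illustration; the paper's matching criterion is not stated here.

```python
import numpy as np

def match_blocks(image, ref_top_left, block=4, top_k=3):
    # Slide a block x block window over the image and return the top_k
    # positions whose content is closest (lowest SSD) to the reference block.
    image = np.asarray(image, dtype=np.float64)
    r0, c0 = ref_top_left
    ref = image[r0:r0 + block, c0:c0 + block]
    height, width = image.shape
    scores = []
    for r in range(height - block + 1):
        for c in range(width - block + 1):
            if (r, c) == (r0, c0):
                continue  # skip the reference block itself
            candidate = image[r:r + block, c:c + block]
            scores.append((float(np.sum((candidate - ref) ** 2)), (r, c)))
    scores.sort(key=lambda item: item[0])
    return scores[:top_k]

# Toy image in which the top-left block is repeated at the bottom right.
rng = np.random.default_rng(0)
image = rng.random((12, 12))
image[8:12, 8:12] = image[0:4, 0:4]
best = match_blocks(image, (0, 0), top_k=1)
print(best[0][1])  # (8, 8)
```

Grouping similar blocks in this way gives the denoising network several correlated views of the same underlying structure.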

We would like to express our sincere thanks to all the authors for submitting their articles and to the reviewers for their valuable comments and suggestions that significantly enhanced the quality of these articles. We are also grateful to Editor-in-Chief Prof. Zeev Zalevsky for the great support throughout the whole review and publication process of this special section, and, of course, all the editorial staff. We hope that this special section will serve as a useful reference for researchers, scientists, engineers, and academics.

Biography

Keping Yu received his ME and PhD degrees from the Graduate School of Global Information and Telecommunication Studies, Waseda University, Tokyo, Japan, in 2012 and 2016, respectively. He was a research associate and a junior researcher with the Global Information and Telecommunication Institute, Waseda University, from 2015 to 2019 and 2019 to 2020, respectively, where he is currently a researcher. He has hosted and participated in more than ten projects, is involved in many standardization activities organized by ITU-T and ICNRG of IRTF, and has contributed to ITU-T Standards Y.3071 and Supplement 35. His research interests include smart grids, information-centric networking, the Internet of Things, blockchain, and information security. He has been the general co-chair and publicity co-chair of the IEEE VTC2020-Spring 1st EBTSRA workshop, general co-chair of IEEE ICCC2020 2nd EBTSRA workshop, TPC co-chair of SCML2020, local chair of MONAMI 2020, session co-chair of CcS2020, and session chair of ITU Kaleidoscope 2016. Moreover, he has served as TPC member of more than ten international conferences, including the ITU Kaleidoscope, IEEE VTC, IEEE CCNC, IEEE WCNC, etc. He has been a lead guest editor for Sensors, Peer-to-Peer Networking and Applications, Energies and guest editor for IEICE Transactions on Information and Systems, Intelligent Automation & Soft Computing, Computer Communications. He is an editor of the IEEE Open Journal of Vehicular Technology.

Wei Wang received his BSc degree in electronic information science and technology from Shenyang University in 2012 and his PhD degree in software engineering from Dalian University of Technology in 2018. He is now an associate professor at the School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China. He has authored/co-authored over 30 scientific papers in international journals and conferences, including IEEE Transactions on Big Data, IEEE Transactions on Emerging Topics in Computing, IEEE Transactions on Human-Machine Systems, WWW, and so on. He received the best paper award of the IEEE International Conference on Ubiquitous Intelligence and Computing in 2014. His research interests include computational social science, data mining, and mobile social networks.

Muhammad Tariq assumed charge as director of the National University of Computer & Emerging Sciences (NUCES), Peshawar Campus, on January 1, 2018. He is an associate professor in the Department of Electrical Engineering. Before taking charge as director, he was head of the Electrical Engineering Department. He held an assistant professor position from 2012 to 2017. He has also been a visiting researcher at Waseda University, Japan, since 2012 and a visiting research collaborator at Princeton University since 2016. He has co-authored a book on smart grids with leading researchers from Europe, China, Japan, and the USA, which was published by John Wiley & Sons in April 2015. In 2017, he was selected by the Chinese government as a High-End Foreign Expert through the International Cooperation Project funded by the State Administration of Foreign Experts Affairs, P.R. China. He has served on the technical committees of IEEE ICC (2018), IEEE Globecom (2017), IEEE CISS (2016), IEEE AFRICON (2015), SENSORCOMM (2011–2015), IEEE ISADS (2014), and ICET (2015). He is a regular technical reviewer for leading journals such as IEEE Transactions on Comm., Smart Grids, Industrial Electronics, and Sensors. He won the Fulbright fellowship for his postdoc in the USA, the MEXT scholarship for his PhD in Japan, and the HEC scholarship for his MS in South Korea. He also won the Brain Korea 21 research grant from the Ministry of IT, government of South Korea, in 2008.

© 2022 Society of Photo-Optical Instrumentation Engineers (SPIE)
Keping Yu, Wei Wang, and Muhammad Tariq "Special Section Guest Editorial: Biologically Inspired Computer Vision and Image Processing," Journal of Electronic Imaging 31(4), 041201 (22 August 2022). https://doi.org/10.1117/1.JEI.31.4.041201
Published: 22 August 2022
KEYWORDS
Iris recognition

Machine vision

Computer vision technology

Image processing

Image segmentation

Convolutional neural networks

Eye models
