Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224701 (2022) https://doi.org/10.1117/12.2639938
This PDF file contains the front matter associated with SPIE Proceedings Volume 12247, including the Title Page, Copyright information, Table of Contents, and Conference Committee listings.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks.
You are receiving this notice because your organization may not have SPIE eBooks access.*
*Shibboleth/Open Athens users: please sign in to access your institution's subscriptions.
To obtain this item, you may purchase the complete book in print or electronic format on SPIE.org.
Image Recognition Processing and Remote Sensing Imaging Technology
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224702 (2022) https://doi.org/10.1117/12.2636926
With the development of technology and society, people increasingly seek artistic enjoyment. People want to create artistic works themselves, and many photo applications therefore offer image style transfer features to meet this demand. To improve on existing algorithms for this feature and obtain a more effective solution, we adopt the VGG-16 structure and improve it for the image style transfer problem. After comparing different networks, weight ratios, and activation functions, we find that VGG-16 is the best choice, as it requires the fewest iterations and the shortest running time among the four networks. In addition, ReLU and LeakyReLU outperform Tanh and Sigmoid as activation functions because they handle texture better. L-BFGS outperforms Adam, SGD, and RMSprop as the optimizer because it needs fewer iterations. Finally, we obtain an optimized image style transfer model based on the VGG-16 network.
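The abstract does not give the loss it optimizes. As a rough illustration of what a content/style "weight ratio" usually controls in style transfer, the sketch below uses the common Gram-matrix formulation. This is an assumption, not the authors' confirmed method: the function names, the default weights, and the plain-list feature layout are all hypothetical stand-ins for feature maps that would normally come from VGG-16 layers.

```python
def gram_matrix(features):
    """features: C channels, each a flat list of N activations.
    Returns the C x C Gram matrix, normalized by N."""
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]


def style_loss(gen_features, style_features):
    """Mean squared difference between the two Gram matrices."""
    g = gram_matrix(gen_features)
    s = gram_matrix(style_features)
    c = len(g)
    return sum((g[i][j] - s[i][j]) ** 2
               for i in range(c) for j in range(c)) / (c * c)


def content_loss(gen_features, content_features):
    """Mean squared difference between raw activations."""
    diffs = [(a - b) ** 2
             for fg, fc in zip(gen_features, content_features)
             for a, b in zip(fg, fc)]
    return sum(diffs) / len(diffs)


def total_loss(gen, content, style, content_weight=1.0, style_weight=1000.0):
    """Weighted sum; the ratio of the two weights is the tunable 'weight ratio'.
    The default values here are illustrative only."""
    return (content_weight * content_loss(gen, content)
            + style_weight * style_loss(gen, style))
```

An optimizer such as L-BFGS or Adam would then minimize `total_loss` with respect to the pixels of the generated image.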
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224703 (2022) https://doi.org/10.1117/12.2637092
Based on 116 phases of Landsat satellite remote sensing data acquired between 1984 and 2021, this paper inverts the long-term distribution of water quality parameters such as dissolved oxygen (DO), oxidation-reduction potential (ORP), and chlorophyll-a (Chl-a) in Baiyangdian Lake, and analyzes the spatiotemporal distribution characteristics and variations of its water quality over 37 years. On the temporal scale, the inter-annual variation of DO is fairly stable: images in which the pollution-free area reaches 90% or more account for 88.7% of the total, indicating no pollution in terms of DO. For ORP, images in which the pollution-free area reaches 90% or more account for 81.2% of the total, indicating no pollution to light pollution. The inter-annual variation of Chl-a concentration is more volatile, with an overall level of light to moderate pollution, although the pollution has eased in recent years. The pollution status of Baiyangdian Lake in terms of Chl-a and ORP shows a certain correlation. On the spatial scale, the distribution pattern of DO and ORP is stable: most areas are pollution-free, while a few areas with more frequent human activity show light to moderate pollution.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224704 (2022) https://doi.org/10.1117/12.2636918
Deep convolutional neural networks have achieved superior recognition performance on many public object detection datasets. However, under rainy or foggy weather conditions, the scarcity of samples has always restricted the accuracy of detection and identification. To solve this problem, this paper proposes an object detection method for heavy fog scenes based on image defogging and sample enhancement. First, a generative adversarial network (GAN) is adopted to remove the fog from images; sample enhancement is then achieved with a style transfer network, which keeps the image content essentially unchanged while transforming the style of the image texture. The fog-free dataset obtained after sample enhancement reduces the influence of texture information on the network model and makes it pay more attention to the contour information of object shapes. Experimental results on the I-HAZE and RESIDE datasets show that the proposed method effectively improves object detection precision, with the mean average precision (mAP) improved by up to 15%.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224705 (2022) https://doi.org/10.1117/12.2636974
This paper discusses the important role of deep learning technology in optimizing the image recognition function of digital libraries. On this basis, it analyzes the necessity and feasibility of building an image recognition mechanism based on deep learning technology, and expounds the main contents of a recognition mechanism constructed with deep neural networks, including the mobile vision big data input layer, the mobile visual resource organization layer, the deep neural network analysis layer, and the mobile vision service interaction layer, with the aim of realizing personalized smart services in libraries.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224706 (2022) https://doi.org/10.1117/12.2636786
The performance of deep-learning-based rain removal methods is largely affected by the model design and the training datasets. Most current state-of-the-art methods focus on how to construct powerful deep models. In this paper, we instead start from both perspectives: the training dataset and the model. We propose a novel rain model that includes a rain layer, a background layer, and a description of how a rainy image is generated. Based on this model, we develop a multi-task deep learning architecture that learns features of both the rain layer and the clean background layer. The additional rain-layer information is important because its loss function provides a further strong training signal to the network. We then collected a large number of images of real rain streaks and outdoor scenes and produced datasets for training. The effectiveness of our model and architecture was shown in tests on synthetic datasets.
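The abstract does not state how the rainy image is generated from the two layers. The sketch below assumes the simplest common choice, an additive composite clipped to the valid range, together with a two-term multi-task loss. Both the composition rule and the names `compose_rainy`, `multitask_loss`, and `rain_weight` are hypothetical and serve only to make the two-layer idea concrete.

```python
def compose_rainy(background, rain):
    """Assumed additive model: O = min(1, B + R), per pixel, values in [0, 1]."""
    return [min(1.0, b + r) for b, r in zip(background, rain)]


def mse(pred, target):
    """Mean squared error between two flat pixel lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multitask_loss(pred_bg, pred_rain, true_bg, true_rain, rain_weight=0.5):
    """Supervise both outputs: the clean background and the rain layer.
    rain_weight is an illustrative value, not taken from the paper."""
    return mse(pred_bg, true_bg) + rain_weight * mse(pred_rain, true_rain)
```

Under this assumption, a synthetic training pair is `(compose_rainy(B, R), (B, R))`, so each rainy input comes with ground truth for both tasks.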
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224707 (2022) https://doi.org/10.1117/12.2636811
Classification of images is highly useful in medicine, agriculture, industry, and other fields. Improving classification accuracy with a small number of parameters is a challenging problem. This paper presents a study based on EfficientNet and the convolutional block attention module (CBAM). In particular, the Spatial Group-wise Enhance (SGE) module is used to adjust the importance of sub-features and suppress possible noise. Stochastic gradient descent (SGD) is chosen as the optimizer. Experiments show that the EfficientNet model with the CBAM and SGE modules achieves higher accuracy in image classification, reaching 98.53% on the Flowers dataset.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224708 (2022) https://doi.org/10.1117/12.2636840
Magnetic resonance (MR) imaging is an important computer-aided diagnosis technique that provides rich pathological information, but physical and physiological constraints seriously limit its applicability. Computed tomography (CT)-based radiotherapy, by contrast, is more popular on account of its imaging speed and simpler environmental requirements. It is therefore of great theoretical and practical significance to design a method that can construct an MR image from the corresponding CT image. In this paper, we treat MR imaging as a machine vision problem and propose a multiconditional generative adversarial network (GAN) for MR imaging from CT scan data. Considering the reversibility of GANs, a generator and a reverse generator are designed for MR and CT imaging respectively; they constrain each other and improve the consistency between features of CT and MR images. In addition, we use the VGG16 model to extract semantic features, and design a loss that fuses the perceptual error and voxel error with the original GAN loss to enhance the similarity of MR image structure and detailed texture features. Experimental results on a challenging public CT-MR imaging dataset show a distinct performance improvement over other GANs used in medical imaging and demonstrate the effectiveness of our method for medical image modality transformation.
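One common way for a generator and a reverse generator to "constrain each other" is a cycle-consistency term, sketched below. This is an assumption about the general idea rather than the paper's actual loss; `ct_to_mr` and `mr_to_ct` are hypothetical callables standing in for the two networks, and images are flattened to plain lists for brevity.

```python
def l1(a, b):
    """Mean absolute difference between two flat voxel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(ct, mr, ct_to_mr, mr_to_ct):
    """Penalize round trips that do not return to the starting image:
    CT -> MR -> CT should reproduce ct, and MR -> CT -> MR should reproduce mr."""
    ct_cycle = mr_to_ct(ct_to_mr(ct))
    mr_cycle = ct_to_mr(mr_to_ct(mr))
    return l1(ct_cycle, ct) + l1(mr_cycle, mr)
```

With toy mappings that are exact inverses, for example doubling and halving every voxel, the loss is zero. Any mismatch between the two directions increases it.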
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224709 (2022) https://doi.org/10.1117/12.2636807
Native image data captured with conventional photographic devices often contain a large amount of redundant information, which is detrimental to the transmission, storage, display, and recognition of the original image data. Principal component analysis can extract the main characteristics of the original image and thus reduce the correlated information and redundant signals in the image data. The method is then applied to image signal processing tasks, including offset correction of images, face recognition, and image compression, with excellent experimental results, further enriching and promoting the application of principal component analysis in image processing.
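As a minimal illustration of how principal component analysis removes redundancy, the sketch below treats each image row as a sample, finds the first principal component by power iteration, and rebuilds the rows from that single component. It is a toy written in plain Python under stated assumptions, not the paper's pipeline; all function names are hypothetical, and a practical implementation would keep several components and use a numerical library.

```python
import math


def mean_center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    mean = [sum(col) / n for col in zip(*rows)]
    return mean, [[x - m for x, m in zip(row, mean)] for row in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of mean-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=200):
    """Power iteration for the dominant eigenvector of cov.
    The non-uniform start vector reduces, but does not eliminate, the chance
    of starting orthogonal to the dominant direction."""
    d = len(cov)
    v = [1.0 / (i + 1) for i in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def compress_rank1(rows):
    """Project each row onto the first component and reconstruct it."""
    mean, centered = mean_center(rows)
    v = first_component(covariance(centered))
    scores = [sum(x * c for x, c in zip(row, v)) for row in centered]
    return [[m + s * c for m, c in zip(mean, v)] for s in scores]
```

Storing one score per row plus the mean and the component, instead of every pixel, is the compression. Rows that are nearly linearly dependent, which is the redundancy the abstract refers to, are reconstructed with little error.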
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470A (2022) https://doi.org/10.1117/12.2636813
This paper proposes a new liver tumor segmentation method based on a mixed-domain attention mechanism. First, the well-known SENet is combined with a channel attention architecture in which channels interact at the excitation stage through multiple one-dimensional convolution kernels. Then, multiple dilated convolutions are used to enlarge the receptive field in the spatial attention module of BAM. Next, the channel and spatial attention feature maps are fused to correct the original feature map. Finally, a gating mechanism is introduced at the sampling skip connection of the decoder to filter important features. Experimental results show that, on many evaluation indexes, the accuracy of this method is higher than that of related segmentation methods, so the proposed method can provide some guidance for the clinical diagnosis of liver tumors.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470B (2022) https://doi.org/10.1117/12.2636856
Under the influence of total ionizing dose, a CMOS image sensor degrades and its fixed pattern noise increases. To improve the efficiency and effectiveness of current fixed pattern noise removal, an image correction method based on statistical filtering and wavelet analysis (SFAWA) is proposed. First, the image is segmented by selective downsampling. Next, a specially shaped Gaussian filter kernel is used to blur the image. Statistics are then extracted from the blurred image and the original image and compared. Finally, after median processing, the correction result is obtained. Experimental results show that SFAWA achieves better processing performance.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470C (2022) https://doi.org/10.1117/12.2636924
To effectively evaluate the quality score of blurred images, a no-reference blurred image quality evaluation method based on salient regions is proposed. The method first calculates the saliency value of the image with an improved SDSP algorithm and binarizes the image with an improved adaptive threshold to extract its salient region. The image is then re-blurred with a Gaussian low-pass filter to obtain a reference image. Next, two similarities are calculated: the similarity of the blur detection probability of the salient region before and after re-blurring, and the similarity of the standard deviation of the image before and after re-blurring. Finally, the two are fused to obtain the final evaluation result. Comparisons with well-recognized evaluation methods on the LIVE and CSIQ image databases show that the proposed algorithm outperforms traditional methods and has high versatility and accuracy.
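The re-blurring idea can be illustrated with a small sketch: a sharp image loses more contrast when blurred again than an already blurred one, so the standard deviations before and after re-blurring are more similar for blurry inputs. The code below covers only that standard-deviation term, under assumptions: the 3x3 kernel, the similarity formula, and the constant `c` are illustrative choices, and the saliency step and blur detection probability from the abstract are not modeled.

```python
import math

# 3x3 binomial approximation of a Gaussian low-pass kernel (weights sum to 16).
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]


def gaussian_blur(img):
    """Blur a 2-D grayscale image (list of rows), replicating edge pixels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += KERNEL[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc / 16.0
    return out


def std_dev(img):
    """Population standard deviation of all pixel values."""
    vals = [v for row in img for v in row]
    mean = sum(vals) / len(vals)
    return math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))


def reblur_similarity(img, c=1e-6):
    """Similarity in (0, 1] between the std-dev before and after re-blurring.
    Values nearer 1 suggest the input was already blurred."""
    a = std_dev(img)
    b = std_dev(gaussian_blur(img))
    return (2 * a * b + c) / (a * a + b * b + c)
```

For a sharp step edge, `reblur_similarity` is lower than for the same edge after one pass of `gaussian_blur`, which is the ordering a no-reference blur score needs.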
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470D (2022) https://doi.org/10.1117/12.2636940
Despite the remarkable progress made in deep compressed sensing (DCS), improving reconstruction quality remains a major challenge. Existing DCS models still have shortcomings, especially in recovering details. In this paper, a new parallel enhanced network (PENet) is proposed for image compressed sensing. PENet consists of a sampling network and a parallel network, the latter containing a basic network and an enhanced network. The basic network provides the initial reconstructed image. The enhanced network is trained to progressively acquire module details, in stages, through connections with each block of the basic network. The final reconstructed image is the accumulated result of the two branches of the parallel network. Experimental results show that PENet achieves high reconstruction quality with a running time comparable to that of existing advanced DCS methods.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470E (2022) https://doi.org/10.1117/12.2636784
Ship detection in synthetic aperture radar (SAR) images is receiving more and more attention, and the need for high-precision, intelligent ship detection is becoming increasingly urgent. To further improve detection performance in SAR images, this paper proposes an improved Faster R-CNN based on a feature pyramid network (FPN) and a cascade network for SAR ship detection. First, Faster R-CNN is used as the baseline network for ship detection. Then, FPN is used to deal with the multi-scale problem of ships. Finally, the cascade network is used to further improve ship detection performance. Experimental results on the public SAR ship detection dataset (SSDD) show that this method has better detection performance than YOLOv2, Faster R-CNN, Cascade R-CNN, and RetinaNet.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470F (2022) https://doi.org/10.1117/12.2636830
The radiometric calibration accuracy of a hyperspectral imager is key to its quantitative application. Side-slither radiometric calibration enables high-frequency, full-field-of-view relative radiometric correction of the hyperspectral imager, which lays the foundation for the subsequent quantitative application of hyperspectral data. In this paper, a relative radiometric calibration method and a data processing method for hyperspectral imagers, based on a 90° yaw maneuver of the satellite platform, are proposed. The method is verified using in-orbit data from the Ziyuan1 (02D) satellite. The results show that the proposed method performs well for relative radiometric calibration: the relative radiometric calibration accuracy of all sensor pixels can reach 1%, and the fringe noise in the original image is well removed.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470G (2022) https://doi.org/10.1117/12.2636943
To reproduce clear scenes from visible-light images captured in hazy weather and effectively suppress the loss of contrast and clarity caused by haze, this paper proposes a natural image defogging method based on clarity evaluation indicators. General defogging methods do not take into account the uneven distribution of fog concentration and defog the whole image directly, whereas defogging of outdoor scenes needs to account for that distribution. We select representative indicators for fog concentration classification based on depth information and also obtain global transmittance maps. Experiments show that, compared with traditional and deep learning defogging methods, our results are better on various metrics and are robust to inhomogeneous fog.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470H (2022) https://doi.org/10.1117/12.2636789
The current national standards for face recognition do not clearly describe the placement of light; they state only that the light should be uniform during face recognition, with no overexposure or underexposure. In this paper, we present a novel database and use different face recognition algorithms to determine the placement of light during face recognition. For this study, we have chosen ten widely used face recognition algorithms: principal component analysis (PCA), linear discriminant analysis (LDA), independent component analysis (ICA), nonnegative matrix factorization (NMF), histogram of oriented gradients (HOG), scale-invariant feature transform (SIFT), face recognition, CosFace, ArcFace, and Combined Margin. The database contains face images with a wide range of illumination angles, which we use to analyze the relationship between these algorithms and illumination angle. The results indicate that both the horizontal and the vertical illumination angle of the light should be kept within the range from -10° to 10°.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470I (2022) https://doi.org/10.1117/12.2636847
As a quantitative phase imaging method, white light diffraction phase microscopy is widely used in biological cell research. However, the system uses low-coherence white light illumination, which produces a halo effect around objects; this blurs the objects themselves and causes phase information of adjacent objects to be lost. The asymmetric U-Net proposed in this paper can eliminate the halo effect in high-resolution white light diffraction phase images. Compared with the iterative deconvolution algorithm, the deep learning method greatly improves efficiency. Our samples include standard particles, red blood cells, HeLa cells, and a USAF phase-resolution plate. The effectiveness and robustness of the method are verified.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470J (2022) https://doi.org/10.1117/12.2636795
The transmission speed of pictures is very important, and with the development of science and technology more and more people have paid attention to this field in recent years. This paper discusses a client-server system that enables an interactive user to efficiently navigate through a database of related photos, using regions of interest (ROI) within each image as navigation "portals". The study's objective is to emphasise the importance of a highly efficient interactive communication system. Three approaches are considered: the first method (the existing system) is to transmit an entire image; the second method is to communicate the region of interest (ROI); the third method is to communicate only SIFT features [14]. JPEG2000 and the Scale-Invariant Feature Transform (SIFT) are powerful tools in these three methods. This paper investigates the responsiveness of the three methods; finding faster and more resource-efficient methods makes the system faster. The experiment used a database of photos taken by the author, from which the experimental data were collected. It was concluded that method 3 is the fastest.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470K (2022) https://doi.org/10.1117/12.2636829
Face recognition is a biometric recognition technology that uses human facial features to recognize identity. In recent years, with the development of the Internet of Things (IoT) and the construction of smart cities, face recognition has been applied more and more widely. Focusing on several key existing problems of face recognition, this paper presents a face recognition algorithm based on an improved BP neural network with Softmax regression. In the face detection phase, a face feature extraction method based on brightness detection and a grey-value opening operation is introduced to fully mine facial features and effectively improve the accuracy of face recognition.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470L (2022) https://doi.org/10.1117/12.2636832
Since the start of the 21st century, technology has developed rapidly, and computer and image processing technology has entered a peak period of development. However, as the natural environment has deteriorated and smog in particular has become more common, outdoor images suffer from reduced sharpness and color distortion. Using these degraded images as system input seriously affects the system's analysis and understanding and greatly reduces the performance of the visual system. Studying how to effectively restore images degraded in foggy environments therefore has important application value and practical significance for improving the performance and robustness of vision systems. The minimum filtering step in the original algorithm, which uses local filtering, inevitably leads to block effects. In this paper, a filter whose window size varies with the image size is used to address the blocking artifacts and halo effect that the original algorithm tends to produce. At the same time, a sky region recognition algorithm based on a binary mask map is used to prevent color distortion of the sky in the restored image. Then, a guided filter, which refines the transmittance well at good speed, is applied to further optimize the transmittance map.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470M (2022) https://doi.org/10.1117/12.2636938
Image dehazing is an important topic in the field of computer vision. Traditional single-image dehazing algorithms are susceptible to serious halo artifacts and color distortion. To address this problem, this paper proposes an improved single-image dehazing method based on the dark channel prior and guided filtering. First, to deal with the halo artifacts produced by the traditional dark channel prior, a compensation model is proposed; it identifies halo regions by a differential operation on two minimum-filtered images and adjusts the filtering window scale to a more suitable value. Second, rough transmittance correction combined with sky area judgement is applied to counter the color degradation caused by heavy fog and haze cover. Finally, the guided filter algorithm is improved to optimize the transmittance: edge points are located with the Canny operator, and the smoothing factor in the loss function is modified using the ratio of the gradient operator to the window gradient mean as the edge weight. Experimental results demonstrate that the proposed algorithm efficiently removes halo effects and preserves more detail in both near and far regions, while improving the visual quality and color saturation of images. Objective experimental evaluations also verify the effectiveness of the proposed algorithm.
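For readers unfamiliar with the baseline this abstract improves on, the sketch below computes the standard dark channel (a per-pixel channel minimum followed by a minimum filter over a square window) and the rough transmittance estimate t = 1 - ω · dark(I / A). It shows only the textbook dark channel prior, not the compensation model, sky judgement, or improved guided filter. The function names, the `radius` parameter, and the plain-list image layout are illustrative assumptions.

```python
def dark_channel(img, radius):
    """img: H x W grid of (r, g, b) tuples in [0, 1].
    Returns the minimum over channels and over a (2*radius+1)^2 window."""
    h, w = len(img), len(img[0])
    pixel_min = [[min(px) for px in row] for row in img]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = min(
                pixel_min[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1)))
    return out


def rough_transmission(img, airlight, radius, omega=0.95):
    """t(x) = 1 - omega * dark_channel(I / A); airlight A is one (r, g, b) tuple.
    omega < 1 keeps a little haze so distant objects still look natural."""
    normalized = [[tuple(c / a for c, a in zip(px, airlight)) for px in row]
                  for row in img]
    dark = dark_channel(normalized, radius)
    return [[1.0 - omega * d for d in row] for row in dark]
```

The window `radius` is the parameter whose choice trades block and halo artifacts against accuracy: a larger window gives a more reliable dark channel but a coarser transmittance map, which is why a refinement step such as guided filtering normally follows.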
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470N (2022) https://doi.org/10.1117/12.2636801
Based on Tian Yu's mid-19th-century drawing of the Fuyin garden and surviving image data of the garden from the same period, this paper studies the garden using the coordinate reduction method and the perspective reverse method. The coordinate reduction method is used to restore the overall layout and spatial location of the Fuyin garden based on the boundary painting attribute. The perspective reverse method is used to obtain data on the spatial layout and building sizes at different locations in the Fuyin garden based on photorealism. The two sets of results are pieced together, integrated, verified, and corrected in an attempt to produce a conjectural restoration of the Fuyin garden. The research covers image data collection, image contour extraction and vectorization, image perspective reversal, multi-view matching analysis, verification, and correction.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470P (2022) https://doi.org/10.1117/12.2636803
With the development of technology, demand for high-quality images is increasing in many areas. However, existing imaging systems are constrained by hardware limits such as the density of their sensor arrays, so the image resolution sometimes fails to meet requirements. Super-resolution technology improves image quality in software without changing the hardware. By the number of input images, super-resolution is generally divided into single-image and multi-image super-resolution reconstruction. Limited by the details contained in the input image itself, single-image super-resolution reconstruction often struggles to achieve the desired effect. Compared with single-image algorithms, multi-image super-resolution algorithms can make full use of the complementary information between images in a sequence, and therefore tend to have stronger super-resolution capability. However, it is very difficult to capture an image sequence of the same scene with complementary information. A camera array system can capture the same scene from multiple perspectives, which better solves the problem of acquiring image sequences with complementary information. Because most current camera array methods are based on traditional techniques, we propose a deep learning method for super-resolution of multi-view images from a camera array. In this paper, we use a set of multi-view images as input. The features of each image are then refined by the Local Feature Comparison (LFC) module using redundant information (i.e., angular information) from the features of its adjacent images. The refined features are aggregated by the proposed Global Feature Fusion (GFF) module. Finally, a single high-resolution image is obtained via the high-resolution reconstruction module. Benefiting from this design, the proposed method achieves better visual quality with more realistic and natural textures.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470Q (2022) https://doi.org/10.1117/12.2636923
Aiming at the problem of insufficient data in the detection of train wheel tread damage, a tread image generation method based on generative adversarial network is proposed. In order to ensure the authenticity of the generated tread image, a symmetric skip connection network is used in the model to build a generative network, and the Wasserstein distance is introduced into the loss function of the network. This method effectively solves the problem of the shortage of tread damage image datasets, provides a data basis for the later detection of train wheel tread damage, and also provides technical support for the construction of train wheel image datasets.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470R (2022) https://doi.org/10.1117/12.2636788
Extracting ground objects from remote sensing images with deep learning is a current research hotspot, and building information has attracted much attention as an important artificial feature. Convolutional neural networks have shown great potential for building extraction tasks. In view of the current state of research, this paper proposes ECAU-Net, a lightweight improved network that combines the U-Net semantic segmentation network with an efficient channel attention mechanism, and applies it to the automatic extraction of buildings from high-resolution remote sensing images. Experiments show that the proposed method achieves good extraction results on the Massachusetts building data set, with accuracy, recall, precision, and F1 score all better than those of the original U-Net.
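For readers unfamiliar with efficient channel attention, the following NumPy sketch shows the general mechanism: global average pooling per channel, a small 1D convolution across neighbouring channels, a sigmoid, and channel-wise rescaling. The function name `eca` and the kernel values are illustrative; in a trained network the kernel is learned, and the abstract does not say where ECAU-Net places the attention blocks inside U-Net.

```python
import numpy as np

def eca(features, kernel):
    """Efficient-channel-attention style gating for one feature map.

    features: array of shape (C, H, W)
    kernel:   odd-length 1D weights shared across channels (learned in practice)
    """
    pooled = features.mean(axis=(1, 2))             # global average pooling -> (C,)
    pad = len(kernel) // 2
    padded = np.pad(pooled, pad, mode="constant")   # zero padding at the channel ends
    mixed = np.array([padded[i:i + len(kernel)] @ kernel
                      for i in range(len(pooled))])  # 1D convolution over channels
    weights = 1.0 / (1.0 + np.exp(-mixed))          # sigmoid gate in (0, 1)
    return features * weights[:, None, None]        # rescale each channel
```

The gate adds only `len(kernel)` parameters per block, which is what keeps this kind of attention lightweight.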
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470S (2022) https://doi.org/10.1117/12.2637664
With the development of high technology, the digitization of works of art and cultural relics is no longer a problem. How to extract sufficient and accurate information from a large amount of complex information is a common practical problem. With the continuous improvement of digital inkjet technology and color ink quality, many digital products have appeared on the market, such as digital oil painting, digital traditional Chinese painting, digital calligraphy, digital printmaking, digital paper cutting, digital embroidery, and other digital art replicas. However, in addition to modeling, pattern is also an important part of art, and Chinese traditional patterns have a long history and brilliant achievements. After solving the problem of production automation, leading enterprises in various industries began to seek detection automation and assembly automation based on machine vision, and machine vision technology has become one of the hot spots in industry. With the rapid development of 3D technology, its applications are becoming increasingly commonplace, from cultural relic restoration, simulation manufacturing, and digital sculpture to ceramic art production. Machine vision uses image recognition technology instead of human eyes to measure and judge, which is more efficient, more accurate, more objective, and endlessly repeatable, greatly improving the degree of production automation. This paper focuses on the three-dimensional patterns of ceramic artworks. Aiming at high-quality reproduction and design re-creation, it studies the acquisition and image stitching of these patterns in depth and constructs an image reproduction system that realizes the whole digital process: three-dimensional pattern shooting, preprocessing, image correction, two-dimensional transformation of three-dimensional images, image registration, image fusion, and generation of complete two-dimensional patterns.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470T (2022) https://doi.org/10.1117/12.2636967
With the continuous development of digital information technology and the improvement of chip processing capacity, image processing applications are becoming more and more popular[1]. Image processing has become an important modern technology. Current image acquisition equipment has its own limitations, but fusing the image information collected by two sensors can produce a better image. The proposed image fusion system uses cameras to collect images and applies the Laplacian pyramid fusion method. The fusion of two images is verified by fusing high-resolution and low-resolution RGB and greyscale images, which yields a better image, enriches the information in the fused result, and overcomes the deficiencies of a single sensor. The system has certain application value.
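Laplacian pyramid fusion can be sketched in a few lines. The version below is a simplification under stated assumptions: it uses 2x2 box averaging and nearest-neighbour upsampling instead of the Gaussian filtering of the classic formulation, works on single-channel images whose sides are divisible by `2**levels`, and uses a max-absolute-value rule for detail layers with averaging for the base layer. The function names and the fusion rule are illustrative, not taken from the paper.

```python
import numpy as np

def downsample(img):
    """Halve each dimension by averaging 2x2 blocks."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double each dimension by nearest-neighbour repetition."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def laplacian_pyramid(img, levels):
    """Return [detail_0, ..., detail_{levels-1}, base]."""
    pyramid = []
    current = img.astype(float)
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append(current - upsample(smaller))  # detail lost by downsampling
        current = smaller
    pyramid.append(current)                          # low-resolution base
    return pyramid

def fuse(a, b, levels=3):
    """Fuse two registered greyscale images of equal size."""
    pa, pb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    # Detail layers: keep the coefficient with the larger magnitude.
    fused = [np.where(np.abs(la) >= np.abs(lb), la, lb)
             for la, lb in zip(pa[:-1], pb[:-1])]
    # Base layer: average the two low-resolution images.
    fused.append((pa[-1] + pb[-1]) / 2.0)
    # Collapse the pyramid from coarse to fine.
    result = fused[-1]
    for detail in reversed(fused[:-1]):
        result = upsample(result) + detail
    return result
```

Fusing an image with itself returns the image unchanged, which is a quick sanity check that decomposition and reconstruction are consistent.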
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470U (2022) https://doi.org/10.1117/12.2636771
In this paper, we propose a tubular structure segmentation method based on an adaptive front propagation (AFP) scheme. The segmentation is carried out by thresholding a geodesic distance map associated with a given metric. We adopt an asymmetric quadratic metric for estimating the geodesic distance values via a freezing-front fast marching algorithm. The asymmetry encoded in our method alleviates the front leakage problem encountered in the traditional AFP method. Experimental results on both synthetic and real images demonstrate the ability of the introduced method to extract tubular structures.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470V (2022) https://doi.org/10.1117/12.2637648
In this work, three algorithms (HOG with SVM, VGG, and ResNet) are chosen to perform image recognition on different categories of birds. HOG with SVM performs worse than ResNet with SVM because HOG is better at recognizing objects than at identifying categories of objects. Hyper-parameters in HOG and SVM affect the accuracy. (Bochen). To improve the accuracy of the image recognition, the VGG algorithm is adopted with different hyper-parameters. Yujin Wang employs “Deep residual learning for image recognition” from He K, Zhang X, Ren S, et al. to show his understanding of the ResNet algorithm. He tests different hyper-parameters in ResNet and shows readers how they affect the results.
Intelligent Signal Processing and Radar Target Detection
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470W (2022) https://doi.org/10.1117/12.2636845
This paper analyzes the basic definition, development stage, application value and other contents of 3D digital animation. Combining the correlation between 3D digital animation and landscape architecture, it studies the application process of 3D digital animation in landscape architecture design, and points for attention such as defining the theme of landscape architecture design, doing a good job in zoning planning and design, making full use of the advantages of big data technology, etc. Its purpose is to give full play to the application value of 3D digital animation and improve the rationality of landscape architecture design content.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470X (2022) https://doi.org/10.1117/12.2636799
When a high-resolution radar illuminates the sea at a low grazing angle, the detection of slow, small targets is susceptible to sea clutter, which reduces the detection performance of marine radar. Based on the fluctuation characteristics of radar echo signals, a novel target detection method using the relative echo signal sharpness (RESS) is put forward. First, the feature points of the radar echo signals are extracted by the local extremum method, and a sparse matrix is constructed from the feature points and their corresponding pulses. Second, the ratio of the Manhattan distances of adjacent points in the matrix is calculated. Finally, the RESS is obtained by referring to the reference bins, and target detection is carried out by a cell-averaging constant false alarm rate (CA-CFAR) detector. The experimental results show that the RESS of the target bin is higher than that of the sea clutter bins. Furthermore, compared with the traditional method, the proposed method improves the detection performance for slow, small sea-surface targets.
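The final CA-CFAR stage can be sketched as follows. This is a generic one-dimensional cell-averaging detector, not the paper's implementation: the function name, the default numbers of training and guard cells, and the false alarm rate are assumptions, and the threshold factor uses the textbook formula for exponentially distributed noise power, which may not match the statistics of the RESS feature.

```python
import numpy as np

def ca_cfar(power, num_train=8, num_guard=2, pfa=1e-3):
    """Cell-averaging CFAR over a 1D array of cell powers.

    num_train: training cells on EACH side of the cell under test
    num_guard: guard cells on EACH side, excluded from the noise estimate
    pfa:       desired probability of false alarm
    """
    n = 2 * num_train
    alpha = n * (pfa ** (-1.0 / n) - 1.0)   # threshold scaling factor
    half = num_train + num_guard
    detections = np.zeros(len(power), dtype=bool)
    for i in range(half, len(power) - half):
        leading = power[i - half:i - num_guard]
        trailing = power[i + num_guard + 1:i + half + 1]
        noise = (leading.sum() + trailing.sum()) / n  # local noise estimate
        detections[i] = power[i] > alpha * noise
    return detections
```

Because the threshold follows the local average, a strong cell raises the threshold for its neighbours but is still detected itself, provided the guard cells keep its own energy out of the estimate.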
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470Y (2022) https://doi.org/10.1117/12.2636825
Chopping and interleaving (C&I) jamming is a very effective form of jamming that combines the effects of deception jamming and suppression jamming, and it poses a great challenge to radar signal processing. In this paper, a new method based on PRF agility to counter C&I jamming is presented, analyzed theoretically, and simulated numerically. The simulation results show that, with this method, C&I jamming cannot disable target tracking, and the range tracking error is reduced from 350 m to nearly 20 m.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122470Z (2022) https://doi.org/10.1117/12.2636854
Radar and radar jamming are enduring research topics. Frequency agile radar has excellent anti-jamming performance and has therefore been widely studied. In this paper, comb spectrum jamming of frequency agile tracking radar is simulated. First, the signal model of frequency agile tracking radar is established. Then the principle of comb spectrum jamming is analyzed. On this basis, the jamming effectiveness of comb spectrum jamming is simulated. The research in this paper can provide a reference for the design of jamming against frequency agile tracking radar.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224710 (2022) https://doi.org/10.1117/12.2636934
Object detection under low lighting is still a serious challenge because the feature details of a target in a low-illumination image are so indistinct that it is very difficult to extract sufficient features for target detection. Many low-illuminance enhancement networks have therefore been proposed for feature extraction. However, most current low-illuminance enhancement methods employ a single network to enhance input images with different illuminances, which leads to excessive enhancement of some input images. In addition, image enhancement inevitably produces excessive noise. To address these two issues, we propose an embedded low-illumination enhancement network based on the Scale-YOLOv4 object detection network. It iteratively generates enhanced images with different illumination levels, automatically selects the enhanced image best suited to target detection, and adds a noise removal module to eliminate the generated noise. The proposed network is tested on our (ExDark + MS COCO unlabeled) data set and achieves 55.7% mAP (70.4% AP50), which demonstrates its effectiveness.
Yi Wan, Qile Chen, Jiaxu Zhang, Yanjun Chen, Enming Guan
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224711 (2022) https://doi.org/10.1117/12.2636792
As the main product of fuze equipment, the radio fuze has the advantages of strong anti-interference ability and high precision. In an increasingly complex battlefield environment, however, the radio fuze is more and more susceptible to environmental interference, which increases the drift of the burst point so that the optimal damage effect is lost. The fuze needs to be able to actively identify the environment and make decisions. For example, dense jungle at the end of the trajectory can interfere with the fuze's control of the burst point. To address the issues raised by this situation, this paper analyzes the Michigan microwave vegetation scattering model and summarizes the advantages and disadvantages of the fuze platform. An echo model of leaf clusters based on a pendulum model is proposed, and the echo micro-Doppler characteristics of different blade sizes are simulated. The characteristics of the two methods are summarized, which lays a foundation for further analysis of fuze echo signals.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224712 (2022) https://doi.org/10.1117/12.2636820
Aiming at the problem that traditional convolutional neural network (CNN) for solar cell defect detection cannot learn complex invariance, a defect detection method based on improved tiled convolutional neural network (TCNN) is proposed. First, the image is preprocessed by morphological smoothing method to remove grid lines and noise in the image. Then, a random forest classifier is used to replace the TCNN output layer to enhance the generalization ability of TCNN. Finally, TCNN is used to learn the complex invariance of defect images for defect detection. In order to avoid TCNN falling into local optimum, differential evolution algorithm (DE) is introduced to optimize TCNN. The experimental results show that the improved TCNN can quickly and accurately detect the surface defects of solar cells, and the current overall recognition rate is as high as 96.8%, which is 2.3% higher than the traditional CNN, which verifies the effectiveness of the proposed method.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224713 (2022) https://doi.org/10.1117/12.2636841
The pin is an important component of transmission lines that fixes the nut in place. However, pin defects are small, so annotation inevitably introduces noise, and noisy annotations are harmful to model learning. Hard example mining is a common technique in pin defect detection, but it is sensitive to noise. As far as we know, no existing work considers both hard example mining and noise robustness for pin defect detection, and existing noise-robust methods perform poorly on this task. In this paper, we explore the combination of noise robustness and hard example mining for pin defect detection. We find that noise has little influence in the early stage of training. Based on this observation and building on existing methods, we propose a noise distillation and correction (NDC) method, which distills data in the mature training stage, then reduces their loss weights and corrects the noisy labels. NDC is simple and adds no operations at test time. Extensive experiments show that our method achieves state-of-the-art performance.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224714 (2022) https://doi.org/10.1117/12.2636980
The working environment of marine navigation equipment is poor. Limited by factors such as volume, energy consumption and cost price, higher integration is adopted in hardware design and the number of components is reduced. In the process of equipment testing, the decline of Beidou navigation signal quality has a particularly important impact on equipment performance. The signal degradation will lead to the correlation peak distortion in receivers, which will induce large tracking errors and finally affect positioning, navigation, and timing (PNT) performance. The objective of this paper is to simulate and detect Beidou signal distortions. An offline analysis method is introduced to effectively assess the signal quality in multiple domains based on the software receiver processing. Compared to a previous multi-correlator method which detects “evil waveforms” (EWFs) through correlation peak symmetry tests in the receiver, it characterizes the pre-correlation signal quality more clearly and expressively.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224715 (2022) https://doi.org/10.1117/12.2636952
Studying eye movement trajectories with an eye tracker can provide insights into the mechanisms that distinguish memory from reasoning. For given text or images, memory-based and reasoning-based observations are useful in multiple key areas of psychology, cognition, and related fields. In our project, the EyeLink 1000 Plus is used to capture fixations, saccades, blinks, and other events in the eye tracking protocol of each participant during trials. Based on the fixation sequence of each participant, our trained deep recurrent neural network (an LSTM) classifies whether the participant is performing the given text-based task by memorizing it or by inference. The trial sentences follow the syllogism form of deductive reasoning. Sixty university students (mature readers) participated in both memory-based and reasoning-based trials. Participants were divided into two equal groups: one group was instructed to perform the memory task first and the other the reasoning task first, after which the order of the tasks was switched. From sixty-one different sentences, sixty randomly selected sentences were presented to each participant. The sequential fixation signals of each participant were then processed to obtain the results. The high accuracy of our trained LSTM model in classifying memory-based and reasoning-based reading supports the significance of our work, which provides a solid base for future work on eye movements to build intelligent techniques in AI-backed psychology, healthcare, and neuro-marketing. Our work (a) has the potential to achieve very high accuracy for memory and reasoning classification, (b) saves considerable time in learning from data, (c) offers an economical and comfortable way to record trial data via the EyeLink 1000 Plus, and (d) lists possible future work opportunities.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224716 (2022) https://doi.org/10.1117/12.2636936
Registration between intraoperative 2D digital subtraction angiography and preoperative 3D computed tomography angiography can improve the visual perception of doctors during vascular interventional surgery and provide three-dimensional information about blood vessels. Improving the accuracy and robustness of 2D-3D vascular registration is therefore key to vascular interventional surgery. In this paper, we propose a similarity measure that fuses normalized mutual information with gradient difference, and we add a multi-resolution strategy to the registration framework. Experiments show that the mTRE of the proposed method is 2.1 mm and the time of each registration iteration is 174.6 s. Compared with normalized mutual information alone and gradient difference alone, the proposed method has higher accuracy and efficiency, and achieves better results in a shorter time.
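As background for the similarity measure, here is a minimal sketch of normalized mutual information computed from a joint intensity histogram, using the common definition NMI = (H(A) + H(B)) / H(A, B). The function name and the bin count are assumptions. The gradient difference term and the way the paper fuses the two measures are not given in the abstract and are not shown.

```python
import numpy as np

def normalized_mutual_information(a, b, bins=32):
    """NMI = (H(A) + H(B)) / H(A, B) from a joint intensity histogram.

    a, b: images of identical shape. Values range from 1 (independent)
    to 2 (identical up to binning). Undefined if both images are constant.
    """
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()          # joint probability table
    px = pxy.sum(axis=1)               # marginal of a
    py = pxy.sum(axis=0)               # marginal of b

    def entropy(p):
        p = p[p > 0]                   # skip empty bins (0 * log 0 = 0)
        return -np.sum(p * np.log(p))

    return (entropy(px) + entropy(py)) / entropy(pxy)
```

In a registration loop this value would be evaluated between the 2D image and a projection of the 3D volume at each candidate pose, with higher values indicating better alignment.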
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224717 (2022) https://doi.org/10.1117/12.2636843
With modern information technology, the scale of battlefield electromagnetic information is increasing exponentially, which makes the electromagnetic environment more complex and brings both opportunities and challenges to the development of anomaly detection algorithms for complex electromagnetic environments. Since research in this field is still in its infancy and the body of work is not yet systematic, this paper reviews and analyzes the relevant literature at home and abroad and summarizes traditional methods and deep learning methods separately, according to the type of anomaly detection algorithm. The research difficulties and development trends of anomaly detection algorithms in complex electromagnetic environments are also discussed, to provide reference and suggestions for subsequent research.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224718 (2022) https://doi.org/10.1117/12.2636812
Deep learning target detection has always been a major research direction in the field of artificial intelligence. Its research results are widely used in the fields of automatic driving, security system and medical treatment. This paper proposes a method to improve the detection effect of small targets, which realizes the detection of objects of different scales in the input image, especially to improve the detection effect of small-scale targets. Before the collected data set is sent to the neural network for training, it is first divided into three different scales according to the size of the target to be detected in the image of the data set. Then one or several images in the large target data set are stitched, the images in the small target data set are enlarged, and the above two types of images are used to form a new data set. Finally, the new data set and the original data set are sent to the neural network for training. In this paper, the YOLOX target detection network is used for verification. The results show that the detection effect of the network obtained by this method on small targets has been improved, and the missed detection rate has been reduced from 31.2% to 27.5%. At the same time, the detection effect of large and medium-sized targets has not been sacrificed.
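The first step, dividing the data set into three scales by target size, might look like the sketch below. The area thresholds of 32x32 and 96x96 pixels follow the COCO small/medium/large convention and are an assumption here, since the abstract does not state the paper's criteria. The function names and the one-target-per-record annotation format are also hypothetical.

```python
def scale_bucket(box_width, box_height, small_max=32 * 32, medium_max=96 * 96):
    """Classify a target by bounding-box area in pixels (COCO-style thresholds)."""
    area = box_width * box_height
    if area < small_max:
        return "small"
    if area < medium_max:
        return "medium"
    return "large"

def split_by_scale(annotations):
    """Group image ids by target scale.

    annotations: iterable of (image_id, box_width, box_height) tuples,
    assuming one target of interest per record.
    """
    buckets = {"small": [], "medium": [], "large": []}
    for image_id, width, height in annotations:
        buckets[scale_bucket(width, height)].append(image_id)
    return buckets
```

The "large" bucket would then feed the stitching step and the "small" bucket the enlargement step described in the abstract.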
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224719 (2022) https://doi.org/10.1117/12.2636808
To evaluate the static ash holdup of cigarette combustion from multiple angles, this paper presents an objective and accurate real-time measurement method. Three cameras are evenly distributed around the test cigarette, with each lens perpendicular to the cigarette's axis, to acquire images of the sample during static combustion. By shooting three sides of a single burning cigarette simultaneously and tracking it in real time, the system achieves three-side full-vision acquisition of the static ash holdup. The experimental results show large differences among the three groups of static ash holdup data collected in this way, so the three-side measurements need to be averaged.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471A (2022) https://doi.org/10.1117/12.2637184
To address existing problems in digital urban traffic signals, a mechatronic co-simulation model is constructed, and a subway shielded door control system and a simulation test platform are designed. The test results show that, under normal conditions, the pulse count when the brake engages during closing is about 10600, the pulse count after closing is 10720, and the speed at the moment of closing is about 80 mm/s. When the door encounters an obstacle, the motor still operates normally according to the standard obstacle detection process. The research provides ideas for the efficient application of urban traffic signals in the future.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471B (2022) https://doi.org/10.1117/12.2636959
With the development of science and technology in China, UAV aerial survey technology has come into wide use. Drones combine an aircraft, built-in cameras and intelligent processing chips to achieve effective monitoring, and can complete aerial photography tasks that are otherwise difficult. UAV oblique photogrammetry is now widely used in the integrated surveying of premises, where it has improved work efficiency and reduced operating costs. This paper discusses the key technologies of UAV oblique photogrammetry in real estate integration, analyzes its advantages, and discusses how information technology can be used to complete the surveying and mapping of real estate integration projects and to improve the real estate information of houses. It also explores key production steps and new directions for oblique photogrammetry, especially UAV measurement.
Yan Guo, Chenglong Xu, Mei Bin, Chaohui Xu, Limin Liu, Shuhua Zhao, Yichao Li, Yushan Bai, Jiangfen Jia, et al.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471C (2022) https://doi.org/10.1117/12.2636835
The oil content of oilfield sludge is an important indicator of how well oily sludge has been treated and whether it can be discharged. To investigate the applicability of mid-IR laser spectroscopy to determining oil content at Huabei Oilfield, we randomly selected dehydrated crude oil. The results indicate that mid-IR laser spectroscopy has clear advantages over the visible light spectrophotometry recommended by the petroleum industry standard: its average relative error and standard deviation are both smaller, indicating better accuracy. Mid-IR laser spectroscopy can therefore replace traditional visible light spectrophotometry to meet the oily sludge testing needs of Huabei Oilfield.
Meizhen Liao, Wei Wei, Xiaojie Zhang, Yuzhong Long, Hanxi Li
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471D (2022) https://doi.org/10.1117/12.2636942
In this work, we propose a way to generate a supervised training dataset for 6-DoF object pose estimation with minimal human labor. A semi-automatic labelling board is designed so that its pose can be estimated accurately with a learned deep model. For a training sequence of more than 1000 frames, one only needs to manually mark the key points of the object in 2-5 frames, and the ground-truth pose of the object is then calculated automatically for the whole sequence. Experiments show that a state-of-the-art pose estimation method can be trained well with only 20-50 human-labelled images, and that the resulting model performs better than one learned from a manually labelled dataset of more than 800 images.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471E (2022) https://doi.org/10.1117/12.2637108
To understand the fish resources and their spatial distribution in Miyun Reservoir, a water-receiving area, before the South-to-North Water Transfer, the size composition, density and spatial distribution of fish in the reservoir were surveyed and evaluated using acoustic technology. The hydroacoustic data showed that the average target strength of fish was (-46.8±7.6) dB, body length ranged from 3 cm to 74.1 cm, and the average body length was 10.15 cm. Across the entire reservoir, the average fish density was 3.3 ind./1000 m³ in the upper water layer, 570.66 ind./1000 m³ in the middle layer and 20.55 ind./1000 m³ in the lower layer. There were highly significant differences in the spatial distribution of fish density among different waters (P<0.01). The maximum fish density, 865.2 ind./1000 m³, appeared in the central waters of the reservoir, and the minimum, 2.87 ind./1000 m³, in the waters of the inner lake. The distribution of fish density among water layers was also extremely uneven (P<0.01): most fish were found in the middle of the water column, with a density of 570.66 ind./1000 m³. The total number of fish in Miyun Reservoir was estimated at 1.76×10⁸ ind., and the total weight at about 3.4×10³ t.
Kang Yu, Qing Tao, Runsheng Yin, Jingyao Fang, Di Wang
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471F (2022) https://doi.org/10.1117/12.2636922
With modern information technology, the volume of battlefield electromagnetic information grows exponentially, making the electromagnetic environment more complex and bringing both opportunities and challenges to the development of anomaly detection algorithms for complex electromagnetic environments. Because research in this field is still in its infancy and lacks a systematic framework, this paper reviews and analyzes the relevant literature at home and abroad, and summarizes traditional methods and deep learning methods separately according to the type of anomaly detection algorithm. It also identifies the research difficulties and development trends of anomaly detection in complex electromagnetic environments, to provide reference and suggestions for subsequent research.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471G (2022) https://doi.org/10.1117/12.2636804
Conventional visual management systems for underground cable pipelines suffer from problems such as insufficient signal coverage, which leads to long system response times. This paper designs a GIS-based visual management system for underground cable pipelines. The Visual DSP++ environment linker is used to access source files, and the characteristics of the underground cable piping are then extracted. The temperature control equation of the heat source area is calculated, and the signal coverage is adjusted using GIS. By reading the layer vector data and the different laying legends, the different cable laying modes are displayed. Finally, the visual management function of the system software is designed. Experiments show that the average response times of the system in this paper are 5.475 s, 8.027 s and 7.938 s, respectively. The system takes less time than the other two types of underground cable pipeline visual management systems, which indicates that a visual management system integrated with GIS technology is more reliable.
Computer Algorithms and Network Data Model Recognition
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471H (2022) https://doi.org/10.1117/12.2636920
As China becomes an aging society, the number of elderly people living alone has become a matter of general concern. Accidental falls are a major threat to the safety of the elderly, so designing an intelligent nursing system with fall detection is of great significance. To meet this demand, this paper proposes a fall detection method based on a convolutional neural network (CNN) that can be applied to intelligent nursing equipment. Deep learning requires a large number of samples, but large-scale labelled images are not easy to obtain, so transfer learning from a pre-trained Inception v3 model is used to construct a new CNN model, implemented with TensorFlow and Keras. The base layers are transferred from Inception v3 with their weights kept untrainable, and a fully connected layer, a dropout layer and an output layer are added for fine-tuning. A small-scale image data set is established for training and testing. The cross-entropy loss function and the Adam optimization algorithm are used during training. The experimental results show that the trained model detects falls automatically with an accuracy of 95.38%, which has practical significance.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471I (2022) https://doi.org/10.1117/12.2636798
With the in-depth development of informatization, the Internet has become an important front for data security protection, an important network terminal of key infrastructure, and an important target for infiltration and long-term lurking by hostile forces. In view of the risk that important national and enterprise information is disclosed through the illegal transmission of important files by internal personnel, this paper studies the key technologies of NLP-based content security and proposes a text classification method based on a label tree, which effectively improves the precise management of terminal data.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471J (2022) https://doi.org/10.1117/12.2636826
The cross-language news topic discovery task aims to cluster news texts in different languages that describe the same topic and to describe each topic in the form of keywords. At present, most cross-language topic discovery methods rely on machine translation or on external resources such as bilingual dictionaries and parallel sentences to bridge languages. However, Vietnamese is a low-resource language, and manually annotating Chinese-Vietnamese aligned bilingual corpora is difficult and expensive. To solve this problem, this paper proposes a Chinese-Vietnamese cross-language topic discovery method based on generative adversarial networks (GAN). First, news texts are represented as vectors by BERT; the bilingual vectors are then mapped into the same semantic space by a GAN. Finally, the k-means clustering algorithm is used to cluster the representation vectors and extract the topics. Experiments on a Chinese-Vietnamese bilingual news topic discovery corpus show that the proposed method outperforms the baseline.
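The final clustering step of this pipeline can be illustrated with a small k-means sketch. It assumes the BERT encoding and GAN alignment have already produced vectors in a shared space; the 2-D points, the initial centers and the function name `kmeans` are hypothetical stand-ins, not the authors' data or code.

```python
import math


def kmeans(vectors, centers, iterations=10):
    """Plain Lloyd's k-means with caller-supplied initial centers."""
    centers = [list(c) for c in centers]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest center by Euclidean distance.
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(v, centers[k]))
            for v in vectors
        ]
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy 2-D stand-ins for aligned Chinese and Vietnamese news vectors.
shared_space = [(0.1, 0.2), (0.0, 0.1), (0.9, 1.0), (1.0, 0.8)]
labels, centers = kmeans(shared_space, centers=[(0.0, 0.0), (1.0, 1.0)])
print(labels)  # → [0, 0, 1, 1]
```

Texts that land in the same cluster are treated as covering the same topic, regardless of their source language.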
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471K (2022) https://doi.org/10.1117/12.2636815
Background objects obscured in some sub-apertures of a light-field (LF) camera can be seen through other sub-apertures, so occluded surfaces can be reconstructed from LF images. Current LF-based foreground occlusion removal approaches usually extract only the complementary information about background objects across sub-aperture images to obtain an occlusion-free center view, and they fall short of reconstructing visually realistic and semantically plausible pixels in occluded areas. In this paper, we propose a simple but efficient LF foreground occlusion removal method using a dual-pathway fusion network, an encoder-decoder network built on convolution operations. We first stack all sub-aperture images (SAIs) into an input tensor and feed it to the encoder to incorporate information across SAIs. In addition to a pathway that synthesizes the center view, we add a second pathway that predicts the foreground occlusion. By fusing the outputs of the two pathways, we not only preserve more information belonging to occluded surfaces but also fill the occluded regions with better visual quality. Experimental results indicate that our method outperforms state-of-the-art approaches and that the occlusion-free view looks more realistic. Our source code will be made available.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471L (2022) https://doi.org/10.1117/12.2636925
Regulating mask wearing is an important means of preventing respiratory infectious diseases, and using machines instead of humans to check mask wearing reduces safety risks. In this paper, we propose a deep learning-based detector for correct mask wearing. The method uses feature pyramids to fuse multi-level features and employs a multi-scale detection strategy to improve the detection accuracy of facial landmarks and masks. In addition, a context-sensitive prediction module for facial landmark and mask detection is proposed. We compared the proposed model with RetinaNet, YOLOv4 and Faster R-CNN. The results show that the proposed model is superior to the existing models for detecting whether masks are worn correctly.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471M (2022) https://doi.org/10.1117/12.2636846
In most wireless sensor network (WSN) engineering applications, the position of each sensor node needs to be determined. This paper proposes a sensor node localization scheme based on mobile beacon nodes. A multi-modal method is employed: the moving beacon node obtains its coordinates through GPS and broadcasts messages carrying a time tag and coordinate information, while each WSN sensor node listens for both the beacon messages and sound pulses and measures the time delay between the sound and radio pulses. The minimum time delay between the sound and radio pulses is then used, together with the coordinates in the corresponding beacon message, to fix the node's x- and y-coordinates. According to the ID of the beacon node antenna, the method can accurately estimate the position of the sensor. Error analysis and computer simulation show that the positioning scheme, called DAMS, is robust to various WSN parameters and is not affected by obstacles in the environment.
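The ranging idea behind this scheme can be sketched as follows. The abstract does not give the exact geometry or formulas, so this is only an illustration under stated assumptions: the radio pulse is treated as arriving instantly, sound is assumed to travel at 343 m/s, and the function names and sample readings are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 °C


def range_from_delay(delay_s):
    """Distance implied by the gap between radio and sound arrival.

    The radio pulse is treated as arriving instantly, so the whole delay
    is attributed to the travel time of the sound pulse.
    """
    return SPEED_OF_SOUND_M_S * delay_s


def closest_beacon_message(observations):
    """Pick the beacon broadcast heard with the smallest sound/radio delay.

    `observations` is a list of (beacon_x, beacon_y, delay_s) tuples,
    one per beacon message received by the sensor node.
    """
    return min(observations, key=lambda obs: obs[2])


# Hypothetical readings as the beacon moves along the line y = 0.
observations = [(0.0, 0.0, 0.050), (5.0, 0.0, 0.030), (10.0, 0.0, 0.045)]
x, y, delay = closest_beacon_message(observations)
print(x, round(range_from_delay(delay), 2))
```

Under these assumptions, the broadcast with the smallest delay marks the beacon's closest approach, so its x-coordinate approximates the node's x-coordinate and the implied range approximates the node's offset from the beacon's path.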
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471N (2022) https://doi.org/10.1117/12.2636796
Simulating physical lens effects can intuitively and effectively reproduce effects that appear in real lenses, and a real-time depth-of-field effect plays an important role in the virtual testing of unmanned ships. To address the realism and real-time requirements of current physical lens effect simulation, this paper develops a real-time depth-of-field simulation based on UE4 and applies it to environment perception in the virtual testing of unmanned ships. The algorithm uses UE4's visual Blueprint programming and HLSL code to simulate the depth-of-field effect. Experimental results show that the algorithm simulates the depth-of-field effect well.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471O (2022) https://doi.org/10.1117/12.2636976
With the development of information and communication technology (ICT) and the rapid spread of smart mobile terminals, a large amount of crowd-sourced user-generated content (UGC) is being produced in the cloud, providing new technical means and methods for analyzing and calculating the image of tourism destinations. This paper proposes a technical framework for tourism destination image calculation based on cognitive theory, combining content analysis and curve fitting methods, and uses Huashan travel blogs from Qunar.com as an example. It analyzes the word frequency distribution of the blog data, calculates travel sentiment scores, and constructs a travel sentiment word cloud. The perceived image of Huashan tourism is calculated from three aspects (cognitive, emotional and overall), a semantic network map of Huashan tourist attractions is constructed, and the three-layer structure of this map is further analyzed together with its internal connection characteristics, attraction relationship chains and resources. This research provides a technical framework and methodological basis for extracting tourism destination images, and offers a scientific reference for building tourism recommendation systems and for planning and management by tourism departments.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471P (2022) https://doi.org/10.1117/12.2636833
Particle size and position in a particle field are of great significance in scientific research, engineering and other fields. In this paper, after the particle hologram is obtained by in-line digital holography, the angular spectrum algorithm is used to reconstruct the particle field and obtain its amplitude and phase information. The gray gradient compound method is used for particle recognition, separating the particles from the background. The watershed segmentation algorithm is then used to segment overlapping particles, and the Hough transform is used to extract the particle size, transverse position and other information. Finally, the depth of each particle is determined by the Laplace operator function method, giving the three-dimensional distribution of the particle field. The experimental results show that, for polystyrene particles with a standard diameter of 50 μm, the measurement error is 1.53 μm and the measurement error rate is 3.06%, which demonstrates the feasibility of the method.
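The depth-determination step can be illustrated with a Laplacian focus measure. This sketch assumes that the "Laplace operator function method" scores each reconstructed slice by the energy of its Laplacian and takes the sharpest slice as the particle depth; the abstract does not say which statistic is used, and the function names and toy slices are hypothetical.

```python
def laplacian_energy(image):
    """Sum of squared 4-neighbour Laplacian responses over interior pixels."""
    total = 0.0
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            lap = (image[r - 1][c] + image[r + 1][c]
                   + image[r][c - 1] + image[r][c + 1]
                   - 4 * image[r][c])
            total += lap * lap
    return total


def best_focus_depth(slices):
    """Return the depth whose reconstructed slice has the largest Laplacian energy.

    `slices` maps a reconstruction distance to a 2-D list of amplitude values.
    """
    return max(slices, key=lambda z: laplacian_energy(slices[z]))


# Toy reconstructions: the particle is sharpest at z = 20 (hypothetical units).
slices = {
    10: [[0.4, 0.5, 0.4], [0.5, 0.6, 0.5], [0.4, 0.5, 0.4]],
    20: [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]],
    30: [[0.3, 0.4, 0.3], [0.4, 0.7, 0.4], [0.3, 0.4, 0.3]],
}
print(best_focus_depth(slices))  # → 20
```

The reported error rate is consistent with the reported error: 1.53 μm / 50 μm = 3.06%.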
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471Q (2022) https://doi.org/10.1117/12.2636850
The purpose of person re-identification (Re-ID) is to retrieve a person of interest from a set of images taken by multiple cameras. In some current work, simple global and local features do not allow the model to achieve excellent performance. In this paper, we propose an end-to-end person re-identification network that integrates multi-granularity pedestrian features. Our model contains multiple feature extraction branches: two global feature extraction modules, two auxiliary modules and two attention modules. To enhance the feature extraction capability of the model, we embed an improved parameter-free attention module in the backbone network, which significantly improves performance. Comprehensive experiments on the mainstream evaluation datasets Market-1501 and DukeMTMC-reID show that our method outperforms most existing methods. For example, on the Market-1501 dataset, with the help of the re-ranking (RK) strategy, we obtain rank-1/mAP = 95.8%/94.0%.
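The two reported metrics, rank-1 accuracy and mean average precision (mAP), can be computed as in the sketch below. It is a simplified illustration: standard Re-ID protocols also discard same-camera and junk gallery images before scoring, which is omitted here, and the function name and toy rankings are hypothetical.

```python
def rank1_and_map(ranked_matches):
    """Compute rank-1 accuracy and mean average precision.

    `ranked_matches` holds one list per query; each list gives, in ranked
    order, whether the gallery image at that position shows the query person.
    """
    rank1_hits = 0
    average_precisions = []
    for matches in ranked_matches:
        if matches and matches[0]:
            rank1_hits += 1
        hits = 0
        precisions = []
        for position, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / position)
        # Queries with no true match in the gallery contribute an AP of 0 here.
        average_precisions.append(sum(precisions) / hits if hits else 0.0)
    n = len(ranked_matches)
    return rank1_hits / n, sum(average_precisions) / n


rank1, mean_ap = rank1_and_map([
    [True, False, True],   # AP = (1/1 + 2/3) / 2
    [False, True, False],  # AP = 1/2
])
print(rank1, round(mean_ap, 4))  # → 0.5 0.6667
```

Rank-1 only checks the top result of each query, while mAP rewards placing every true match early in the ranking, which is why the two numbers are reported together.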
Lingjun Chen, Caidan Zhao, Xiangyu Huang, YiLin Wang, Junjie Deng
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471R (2022) https://doi.org/10.1117/12.2636944
Fog seriously affects human visual perception and reduces the quality of captured images. This paper proposes a dehazing generative adversarial network based on multi-scale feature extraction. The method is an end-to-end dehazing network that avoids dependence on physical models. An edge feature extraction module added to the generator network captures the high-frequency information of the foggy image, which effectively improves attention to edge information. In addition, multi-scale features of the image are extracted, and the foggy image is then enhanced by a unique feature fusion mechanism. The discriminator network combines a global discriminator and a local discriminator to make a joint judgement, which further improves dehazing performance. Compared with state-of-the-art approaches in the literature, the proposed algorithm obtains better subjective and objective image quality scores on the cityscape foggy image synthesis dataset.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471S (2022) https://doi.org/10.1117/12.2636793
With the development of avionics systems, ARINC 818 video, PAL video and digital map video have been widely used in airborne avionics systems. For these three kinds of video, a system for airborne video processing and digital map video generation is designed. The airborne video processing uses a programmable logic device (FPGA) as its core and realizes airborne video capture and reception, gating output, format conversion and [...]. Digital map generation uses a CPU main processor plus a GPU graphics processor as its core and realizes the reception of digital map data and the generation of map videos. The system solution uses a low-power, high-performance FPGA and CPU+GPU as the core processors, which simplifies the [...]. This design has been successfully used in a certain type of display control management system.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471T (2022) https://doi.org/10.1117/12.2636819
The loss function plays a key role in the performance of online multi-label classification algorithms. Building on the design of the multi-label hinge loss in the adaptive label thresholding algorithm, this paper extends several binary classification loss functions to new multi-label classification loss functions, proposes several adaptive label thresholding algorithms based on them, and investigates the impact of the different loss functions on multi-label classification performance. The experimental results show that the adaptive label thresholding algorithm based on the logistic loss achieves the best performance, and that the adaptive label thresholding algorithms with all of the tested loss functions outperform several advanced comparison algorithms.
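One generic way to extend a binary loss to threshold-based multi-label classification is sketched below. The abstract does not give the paper's exact formulation, so this is an assumption: each label's margin is taken as its signed gap to a threshold score, and the chosen binary loss is summed over all labels. The function names and sample scores are hypothetical.

```python
import math


def hinge(margin):
    """Binary hinge loss on a signed margin."""
    return max(0.0, 1.0 - margin)


def logistic(margin):
    """Binary logistic loss on a signed margin."""
    return math.log1p(math.exp(-margin))


def thresholded_multilabel_loss(scores, threshold_score, relevant, binary_loss):
    """Sum a binary loss over all labels, measured against a threshold score.

    A relevant label should score above the threshold and an irrelevant one
    below it, so the margin is the signed gap to the threshold.
    """
    total = 0.0
    for label, score in enumerate(scores):
        sign = 1.0 if label in relevant else -1.0
        total += binary_loss(sign * (score - threshold_score))
    return total


scores = [2.0, 0.5, -1.0]  # hypothetical label scores
relevant = {0}             # ground-truth label set
print(thresholded_multilabel_loss(scores, 0.0, relevant, hinge))  # → 1.5
print(round(thresholded_multilabel_loss(scores, 0.0, relevant, logistic), 4))
```

Swapping `binary_loss` is the only change needed to compare loss functions, which mirrors the kind of comparison the abstract describes.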
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471U (2022) https://doi.org/10.1117/12.2636839
Fourier phase recovery techniques focus on reconstructing object information from phaseless measurements. Generally, model-based phase recovery algorithms struggle to obtain high-quality reconstructions in the presence of noise. Hence, we propose a phase retrieval algorithm with deep denoiser networks. First, an optimization model is constructed for the phase retrieval problem; then the alternating direction method of multipliers (ADMM) is used to solve the optimization problem iteratively. In addition, a well-trained deep neural network acts as a plug-and-play denoiser within the algorithm. Our method combines the model information of traditional phase retrieval algorithms with the fitting ability of deep neural networks. Experiments show that it achieves higher reconstruction quality on noisy images, and that its generalization ability is improved compared with end-to-end methods.
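The plug-and-play ADMM loop described above can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the authors' implementation: the data term 0.5*|| |Fx| - b ||^2, the function names, the penalty `rho`, and especially the 3x3 mean filter standing in for the trained deep denoiser are all choices made for the example.

```python
import numpy as np

def box_denoise(img):
    """Stand-in denoiser: 3x3 mean filter with wrap-around borders.
    A trained deep network would take this role in the method above."""
    acc = np.zeros_like(img)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            acc += np.roll(img, (dy, dx), axis=(0, 1))
    return acc / 9.0

def prox_magnitude(v, b, rho):
    """Data-fidelity step for 0.5*|| |F x| - b ||^2 + (rho/2)*||x - v||^2
    with a unitary FFT; the real part keeps the estimate real-valued."""
    V = np.fft.fft2(v, norm="ortho")
    X = (b * np.exp(1j * np.angle(V)) + rho * V) / (1.0 + rho)
    return np.real(np.fft.ifft2(X, norm="ortho"))

def pnp_admm(b, denoiser=box_denoise, rho=1.0, n_iter=50, seed=0):
    """Plug-and-play ADMM: alternate the data step, the denoiser, and the dual update."""
    rng = np.random.default_rng(seed)
    z = rng.random(b.shape)          # random initial estimate
    u = np.zeros(b.shape)            # scaled dual variable
    for _ in range(n_iter):
        x = prox_magnitude(z - u, b, rho)   # enforce measured Fourier magnitudes
        z = denoiser(x + u)                 # prior step (denoiser replaces a proximal operator)
        u = u + x - z                       # dual ascent on the constraint x = z
    return z

rng = np.random.default_rng(1)
truth = rng.random((8, 8))
b = np.abs(np.fft.fft2(truth, norm="ortho"))   # phaseless measurements
estimate = pnp_admm(b, n_iter=20)
```

Because the denoiser is passed in as a function, any callable that maps an image to an image of the same shape can be substituted without touching the ADMM loop.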
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471V (2022) https://doi.org/10.1117/12.2636921
As China becomes an aging society, elderly people living alone have become an issue of general concern. Accidental falls are a major factor endangering the safety of the elderly. Therefore, designing an intelligent nursing system with a fall detection function is of great significance for ensuring their safety. To meet this demand, this paper proposes a fall detection method based on a convolutional neural network (CNN) that can be applied to intelligent nursing equipment. Because deep learning requires a large number of samples and large-scale labelled images are not easy to obtain, transfer learning based on the pre-trained Inception v3 model is applied to construct a new CNN model, implemented with TensorFlow and Keras. The base layers are transferred from Inception v3 with their weights kept untrainable. A fully connected layer, a dropout layer and an output layer are added for fine-tuning. A small-scale image data set is established for training and testing. The cross-entropy loss function and the Adam optimization algorithm are used during training. Finally, the experimental results show that the trained model can effectively detect falls automatically, with an accuracy of 95.38%, which has practical significance.
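The transfer-learning setup described above (frozen Inception v3 base, then a fully connected layer, dropout and an output layer, trained with cross entropy and Adam) might look roughly like this in Keras. It is a sketch under assumptions: the function name `build_fall_detector`, the two-class output, the hidden width of 256, the dropout rate of 0.5 and the global average pooling are not given in the abstract.

```python
import tensorflow as tf

def build_fall_detector(num_classes=2, weights="imagenet",
                        input_shape=(299, 299, 3),
                        hidden_units=256, dropout_rate=0.5):
    """Frozen Inception v3 feature extractor with a small trainable head.
    Layer sizes are illustrative assumptions, not values from the paper."""
    base = tf.keras.applications.InceptionV3(
        include_top=False, weights=weights,
        input_shape=input_shape, pooling="avg")
    base.trainable = False                      # keep transferred weights fixed

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)            # run batch norm in inference mode
    x = tf.keras.layers.Dense(hidden_units, activation="relu")(x)
    x = tf.keras.layers.Dropout(dropout_rate)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling `build_fall_detector()` downloads the ImageNet weights on first use; passing `weights=None` builds the same architecture with random weights, which is handy for checking shapes offline. Only the two dense layers contribute trainable weights.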
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471W (2022) https://doi.org/10.1117/12.2636802
To improve the recognition accuracy of target EEG signals, a classification model combining a Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU) is proposed. The CNN extracts frequency-domain and spatial-domain features of the EEG signals; after the fully connected layer, its output is fed to a bidirectional GRU to further mine the deep temporal information of the data, and finally a softmax layer classifies the EEG data into target and non-target signals. The model obtained an average classification accuracy of 95.88% on the UC San Diego Rapid Serial Visual Presentation (RSVP) EEG target detection dataset, outperforming the comparison method. The results show that the proposed method can effectively extract the feature information of target EEG signals and improve EEG signal classification accuracy.
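The layer ordering described above (convolution, fully connected layer, bidirectional GRU, softmax) can be sketched in Keras as below. Everything beyond that ordering is an assumption for illustration: the function name, the input layout of time steps by electrode channels, the single Conv1D layer, and all layer sizes.

```python
import tensorflow as tf

def build_cnn_bigru(n_timesteps=128, n_channels=64, n_classes=2):
    """CNN -> per-step dense layer -> bidirectional GRU -> softmax.
    Input layout (time steps, EEG channels) is an assumed convention."""
    inputs = tf.keras.Input(shape=(n_timesteps, n_channels))
    # Convolution over time that mixes electrode channels into feature maps.
    x = tf.keras.layers.Conv1D(32, kernel_size=5, padding="same",
                               activation="relu")(inputs)
    x = tf.keras.layers.MaxPooling1D(pool_size=2)(x)
    # A Dense layer on a 3-D tensor acts on the last axis, i.e. per time step.
    x = tf.keras.layers.Dense(32, activation="relu")(x)
    # The bidirectional GRU reads the feature sequence in both directions.
    x = tf.keras.layers.Bidirectional(tf.keras.layers.GRU(16))(x)
    outputs = tf.keras.layers.Dense(n_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

With `sparse_categorical_crossentropy`, labels are plain integers (0 for non-target, 1 for target in this sketch), so no one-hot encoding is needed.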
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471X (2022) https://doi.org/10.1117/12.2636827
Sensor-based human behavior recognition is a classification task widely used in medical care, ambient assisted living, and other fields. However, when multiple sensors sense impaired behavior, the correlation between sensors is usually not considered. In this paper, a multi-head Siamese neural network with weight sharing is proposed based on deep learning theory. The network hyperparameters are tuned by Bayesian optimization. Because the Adam optimizer introduces over-fitting during impaired behavior recognition, L2 regularization is improved by using the AdamW optimizer. Results on raw data show that the network achieves a classification accuracy of 96.0%. Compared with the baseline network and the single-input network, its accuracy increases by 6.1% and 8.8% respectively. Compared with the multiple-input network, its accuracy increases by 2.4% while the number of trainable parameters is reduced by 92%. These results verify the effectiveness of the proposed network for impaired behavior recognition.
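The difference between Adam with an L2 penalty and AdamW is easiest to see in a single update step. The sketch below follows the standard formulation of decoupled weight decay; the function names and hyperparameter values are illustrative and are not taken from the paper.

```python
import numpy as np

def adamw_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW update: the decay term is applied to the weights directly
    and does not pass through the adaptive gradient scaling."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps) - lr * weight_decay * w
    return w, m, v

def adam_l2_step(w, grad, m, v, t, lr=1e-3, l2=1e-2, **kwargs):
    """Adam with a classic L2 penalty: the penalty is folded into the
    gradient, so it is rescaled by the adaptive step like everything else."""
    return adamw_step(w, grad + l2 * w, m, v, t, lr=lr,
                      weight_decay=0.0, **kwargs)

w0 = np.array([1.0, -2.0])
zeros = np.zeros(2)
# With a zero loss gradient, AdamW shrinks each weight proportionally,
# while Adam + L2 moves every weight by roughly the same step size lr.
w_adamw, _, _ = adamw_step(w0, zeros, zeros, zeros, t=1, lr=0.1, weight_decay=0.5)
w_l2, _, _ = adam_l2_step(w0, zeros, zeros, zeros, t=1, lr=0.1, l2=0.5)
```

In the toy call, the larger weight is shrunk more under AdamW, whereas under Adam with L2 both weights move by about the same amount regardless of magnitude, which is the behavior decoupled decay is meant to avoid.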
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471Y (2022) https://doi.org/10.1117/12.2636794
Based on the characteristics of punching circle defects in angle steel, this paper proposes a defect detection method that uses a genetic algorithm to optimize a BP neural network. Using a BP neural network to detect punching circle defects can effectively handle the nonlinear mapping between the input and output of the defect detection problem. However, the traditional BP neural network model easily falls into a local minimum, which risks model failure. The genetic algorithm is therefore used to optimize the weights and thresholds of the BP neural network, and the resulting optimal weights and thresholds are substituted into the defect prediction model, which improves the stability and predictive ability of the model. Experiments show that the GA-BP network has higher accuracy and generalization ability than the unoptimized BP network, and can accurately detect punching circle defects in power tower angle steel.
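A genetic search over the weights and thresholds of a small network can be sketched as follows. This is a generic GA under assumptions, not the paper's configuration: the one-hidden-layer architecture, truncation selection, uniform crossover, Gaussian mutation, population size and the toy XOR data are all illustrative.

```python
import numpy as np

def predict(theta, X, n_hidden):
    """Forward pass of a one-hidden-layer net whose weights and thresholds
    (biases) are packed into the flat chromosome theta."""
    n_in = X.shape[1]
    i = n_in * n_hidden
    W1 = theta[:i].reshape(n_in, n_hidden)
    b1 = theta[i:i + n_hidden]
    W2 = theta[i + n_hidden:i + 2 * n_hidden]
    b2 = theta[i + 2 * n_hidden]
    h = np.tanh(X @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))

def mse(theta, X, y, n_hidden):
    return float(np.mean((predict(theta, X, n_hidden) - y) ** 2))

def ga_optimize(X, y, n_hidden=4, pop_size=30, generations=50,
                mutation_scale=0.3, seed=0):
    """Evolve chromosomes; returns the best one and the best loss per generation."""
    rng = np.random.default_rng(seed)
    n_genes = X.shape[1] * n_hidden + 2 * n_hidden + 1
    pop = rng.normal(0.0, 1.0, size=(pop_size, n_genes))
    history = []
    for _ in range(generations):
        losses = np.array([mse(ind, X, y, n_hidden) for ind in pop])
        order = np.argsort(losses)
        history.append(float(losses[order[0]]))
        elite = pop[order[: pop_size // 2]]          # selection: keep the better half
        n_children = pop_size - len(elite)
        pa = elite[rng.integers(0, len(elite), size=n_children)]
        pb = elite[rng.integers(0, len(elite), size=n_children)]
        mask = rng.random(pa.shape) < 0.5            # uniform crossover
        children = np.where(mask, pa, pb)
        children = children + rng.normal(0.0, mutation_scale, size=children.shape)
        pop = np.vstack([elite, children])           # elites survive unchanged
    losses = np.array([mse(ind, X, y, n_hidden) for ind in pop])
    history.append(float(losses.min()))
    return pop[np.argmin(losses)], history

# Toy data (XOR), standing in for real defect features.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0, 1.0, 0.0])
best_theta, history = ga_optimize(X, y)
```

In a GA-BP pipeline the returned chromosome would typically serve as the starting point for ordinary backpropagation, so the gradient-based training begins from a region the global search has already found promising.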
Xingnan Li, Xiaozhi Deng, Jiangang Lu, Zhan Shi, Yutu Liang, Bo Li
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 122471Z (2022) https://doi.org/10.1117/12.2636814
Low-voltage power line high-speed carrier communication, abbreviated as HPLC, is a power line carrier communication technology mostly used for local communication (such as meter reading) in electricity consumption information collection systems in low-voltage stations. The communication method adopts OFDM technology. Different communication frequency bands can be configured through different sub-carrier shielding schemes. The typical communication frequency bands are 2-12 MHz, 2.4-5.6 MHz, 1.7-3 MHz, 0.7-3 MHz, etc. The sampling rate is 25 MHz, the subcarrier spacing is 24.414 kHz, the encoding algorithm is Turbo dual binary encoding, the physical block comes in 5 sizes (PB16, PB72, PB136, PB264, PB520), the code rates are 1/2 and 16/18, and the modulation methods include BPSK, QPSK and QAM16. Using different diversity copy modes, communication rates from 100 kbps to 1 Mbps can be achieved under different noise and channel conditions.
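The quoted sampling rate and subcarrier spacing imply an FFT size of 1024, since 25 MHz / 1024 = 24.414 kHz. The short calculation below makes that explicit and estimates which subcarrier indices fall inside a nominal band. The helper names are hypothetical, and the index ranges come from simple rounding of the band edges; the subcarrier masks defined in the HPLC specification itself may differ slightly.

```python
import math

SAMPLING_RATE_HZ = 25e6
SUBCARRIER_SPACING_HZ = 24.414e3   # rounded value as quoted in the text

def fft_size(sampling_rate_hz=SAMPLING_RATE_HZ,
             spacing_hz=SUBCARRIER_SPACING_HZ):
    """FFT length implied by the sampling rate and subcarrier spacing."""
    return round(sampling_rate_hz / spacing_hz)

def active_subcarriers(f_low_hz, f_high_hz,
                       sampling_rate_hz=SAMPLING_RATE_HZ, n_fft=1024):
    """First and last subcarrier index whose centre frequency lies in
    [f_low_hz, f_high_hz] (naive estimate, not the official mask)."""
    spacing = sampling_rate_hz / n_fft           # exact spacing: 24414.0625 Hz
    first = math.ceil(f_low_hz / spacing)
    last = math.floor(f_high_hz / spacing)
    return first, last

n_fft = fft_size()
first, last = active_subcarriers(2e6, 12e6)      # the 2-12 MHz band
n_active = last - first + 1
```

Narrower bands such as 2.4-5.6 MHz use the same FFT grid with more subcarriers masked off, which is why one sampling rate can serve every band listed.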
Jiangang Lu, Zhan Shi, Yutu Liang, Bo Li, Xingnan Li, Xiaozhi Deng
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224720 (2022) https://doi.org/10.1117/12.2636837
Power line carrier communication is a power system communication that uses the transmission line as the transmission medium of the carrier signal. Since the transmission line has a very solid support structure and is equipped with more than 3 conductors (generally there are three good conductors and one or two overhead ground wires), the transmission line is used to transmit the carrier signal at the same time as the power frequency current is transmitted. This kind of comprehensive utilization has long been the unique communication means adopted by all power sectors in the world. In the process of making the broadband carrier communication data transmission protocol and developing the broadband carrier communication technology, it is necessary to synchronously study the broadband carrier communication conformance testing technology and system. Conformance testing can not only verify whether an implementation of a protocol is consistent with the compliant protocol specification, but also ensure that communication equipment developed by different equipment manufacturers can be interconnected. Therefore, the development of the broadband carrier communication protocol conformance test system is of great significance for promoting the standardization of broadband carrier communication and equipment interconnection.
Yanming Jin, Zhuonan Li, Xinli Xiao, Min Liu, Lingyu Chen
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224721 (2022) https://doi.org/10.1117/12.2636785
With the large-scale development of new energy in China, large numbers of energy storage batteries will be decommissioned. Most of the decommissioned batteries of power grid companies are located in substations, which span many branches and a large number of sites but hold a small total quantity. As a result, their recycling is not strongly competitive commercially. Therefore, it is necessary to study the construction of an energy storage battery recycling and reuse system in advance, evaluate its commercial value, and reduce recycling costs, with targeted optimization of the key links in the recycling and reuse process. This paper focuses on the problems faced by power grid companies in recycling energy storage batteries and proposes a model for identifying the key factors in their used-battery recycling system. Finally, this paper proposes seven elements for constructing a business model of hazardous waste recycling and reuse and describes their interrelationships, which provides a reference for the recycling and reuse of hazardous wastes in power grid enterprises.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224722 (2022) https://doi.org/10.1117/12.2637015
To improve the recognition rate of teachers' voice emotion recognition, this paper proposes a new model of teachers' voice emotion recognition for online education in colleges and universities. During online education, teachers' voice signals contain errors and noise, which need to be corrected and denoised. On this basis, teachers' voice emotion features are selected and reduced in dimensionality, and a template for teachers' speech emotion recognition is constructed from the reduced feature parameters to realize real-time recognition of teachers' speech emotion. The experimental results show that, across different experimental subjects, the accuracy of the proposed model is much higher than that of the comparison model, which demonstrates that the proposed model performs well in emotion recognition.
Proceedings Volume International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2022), 1224723 (2022) https://doi.org/10.1117/12.2636836
Soil enzymes play an important role in the material circulation and energy flow of soil ecosystems, and the detection of their activity is the basis for soil enzyme research. Traditional detection methods can reflect soil enzyme activity, but they can neither reflect the real state of soil enzymes in situ nor distinguish the continuous changes of enzyme activity in time and space. Based on fluorescent substrates, in-situ zymography can obtain two-dimensional images of the in-situ distribution of soil enzyme activity, reflect its continuous spatial changes at the microscale, and distinguish between hot spots and non-hot spots of soil enzyme activity. It has the advantages of high accuracy, high spatial resolution and high temporal resolution. This paper summarizes the working principle and technical advantages of in-situ zymography and discusses the prospects for combining it with other technologies, in order to promote the development and application of this technology and to provide research directions for the further study of soil enzymology.