In recent years, vision transformers (ViTs) have achieved significant breakthroughs in computer vision and have shown great potential in large-scale models. However, quantization methods developed for convolutional neural networks do not perform well on ViTs and cause a significant drop in accuracy. We extend the Hessian-based quantization parameter optimization method and apply it to the quantization of the LayerNorm module in ViT models. This approach reduces the impact of LayerNorm quantization on task accuracy and enables more comprehensive quantization of ViT models. To achieve fast quantization of ViTs, we propose a dedicated quantization framework: Hessian matrix–aware post-training quantization for vision transformers (HAPTQ). Experimental results on various models and datasets show that HAPTQ, after quantizing the LayerNorm module of various ViT models, achieves lossless quantization (defined here as an accuracy drop of less than 1%) on ImageNet classification tasks. Specifically, HAPTQ achieves 85.81% top-1 accuracy on the ViT-L model.
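The idea of choosing quantization parameters by minimizing a sensitivity-weighted error can be illustrated with a minimal sketch. It is not the HAPTQ implementation: the function names, the grid search over candidate scales, and the per-element `weights` (standing in for a diagonal Hessian approximation) are all assumptions made for illustration.

```python
from typing import List, Sequence


def quantize(values: Sequence[float], scale: float, bits: int = 8) -> List[float]:
    """Symmetric uniform quantization followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -qmax - 1
    out = []
    for v in values:
        q = round(v / scale)           # map to the integer grid
        q = max(qmin, min(qmax, q))    # clamp to the representable range
        out.append(q * scale)          # map back to real values
    return out


def weighted_error(values: Sequence[float], dequant: Sequence[float],
                   weights: Sequence[float]) -> float:
    """Squared error weighted per element (weights are a hypothetical
    stand-in for diagonal Hessian entries)."""
    return sum(w * (v - d) ** 2 for v, d, w in zip(values, dequant, weights))


def search_scale(values: Sequence[float], weights: Sequence[float],
                 bits: int = 8, steps: int = 100) -> float:
    """Grid-search the clipping range and return the scale with the
    smallest weighted error. The grid search is an assumption."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return 1.0
    best_scale, best_err = max_abs / qmax, float("inf")
    for i in range(1, steps + 1):
        scale = (max_abs * i / steps) / qmax
        err = weighted_error(values, quantize(values, scale, bits), weights)
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale
```

With uniform weights this reduces to an ordinary mean-squared-error search; non-uniform weights let elements that matter more to the task loss dominate the choice of scale.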
KEYWORDS: Semantics, Feature extraction, Computing systems, Deep learning, Machine learning, Systems modeling, Detection and tracking algorithms, Data modeling, Transformers
As computer systems become increasingly complex, deep learning methods for rapidly analyzing and pinpointing anomalies in system logs are being widely applied to ensure smooth system operation. Many existing methods require substantial amounts of labeled data and make insufficient use of the temporal features in logs. To address these issues, we propose a semi-supervised log anomaly detection method. First, log templates are extracted using the Drain template parsing technique. Next, BERT is employed to extract deep semantic feature vectors from the logs, and time feature vectors are derived. Unsupervised clustering algorithms are then used to estimate labels for unlabeled samples, tackling the shortage of annotated data in practical log anomaly detection scenarios. Finally, anomaly detection is performed using an attention-based Bi-LSTM model. Experimental results on two datasets, HDFS and BGL, show that the proposed method achieves notable improvements in accuracy and recall, validating its effectiveness.
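The label-estimation step can be sketched as follows. The abstract does not name the clustering algorithm, so this example assumes plain k-means with a deterministic initialization and a majority vote over the known labels in each cluster; `kmeans` and `pseudo_label` are hypothetical names, and the feature vectors would in practice come from the BERT and time features described above.

```python
from collections import Counter
from typing import List, Optional, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Vector], k: int, iters: int = 20) -> List[int]:
    """Tiny k-means; the first k points seed the centroids (an assumption
    made to keep the sketch deterministic)."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def pseudo_label(points: List[Vector], labels: List[Optional[str]],
                 k: int = 2) -> List[Optional[str]]:
    """Give each unlabeled sample (None) the majority known label of its
    cluster; clusters with no known labels stay unlabeled."""
    assign = kmeans(points, k)
    votes = {c: Counter() for c in range(k)}
    for a, y in zip(assign, labels):
        if y is not None:
            votes[a][y] += 1
    result: List[Optional[str]] = []
    for a, y in zip(assign, labels):
        if y is not None:
            result.append(y)
        elif votes[a]:
            result.append(votes[a].most_common(1)[0][0])
        else:
            result.append(None)
    return result
```

The estimated labels can then be combined with the original labeled samples to train the downstream detector.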
KEYWORDS: Electrocardiography, Education and training, Data modeling, Deep learning, Feature extraction, Ablation, Signal processing, Performance modeling, Statistical modeling, Machine learning
The high cost of labeling ECG data has led to a shortage of labeled ECG datasets, and the segmentation precision of current models remains unsatisfactory. To address these challenges, this paper proposes an ECG segmentation model, NGA-Net. The model is based on RRU-Net, with the addition of the ASPNL module and an improved Ghost module. The improved Ghost module generates more feature maps with fewer parameters, thereby boosting computational efficiency, while the ASPNL module captures ECG signal features at multiple scales to improve feature extraction. Experiments on the publicly available LUDB dataset show that NGA-Net outperforms other methods, demonstrating its effectiveness. In addition, we adopt a semi-supervised learning strategy for training NGA-Net in scenarios with small sample sizes, using data augmentation and consistency training. The experimental findings confirm that semi-supervised learning improves the performance of deep learning models.
KEYWORDS: Video, Detection and tracking algorithms, Video surveillance, Feature extraction, Cameras, Video compression, Data conversion, Neural networks
A keyframe is a crucial image frame used to describe a shot, and keyframe technology can significantly reduce the amount of data needed for video retrieval. Applications include video-on-demand, face recognition from camera footage, and key shot retrieval in medical images. Current video keyframe extraction methods suffer from low extraction accuracy and cannot meet real-time requirements. To address these problems, this paper proposes CTM-NN, a real-time video keyframe extraction algorithm that combines the inter-frame difference method with clustering and a neural network. The algorithm applies a threshold-based inter-frame difference method, extracts HOG and HSV first-order moment features, applies the K-means++ clustering algorithm, and finally trains its own ResNet-50 model, aiming to extract keyframes from real-time video accurately and efficiently. To verify the proposed algorithm, experiments were carried out on finished news video, landscape video, and real-time concrete mixing video. The experimental results show that the proposed method meets the accuracy and speed requirements of real-time keyframe extraction, so that keyframes can be saved and automatically labeled while preserving their temporal order. Overall, the CTM-NN algorithm achieves good results in the extraction and storage of real-time video keyframes.
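The threshold-based inter-frame difference step can be sketched as below. This is only an illustration of the general technique, not the CTM-NN implementation: the mean absolute pixel difference as the metric, the grayscale list-of-rows frame representation, and the names `mean_abs_diff` and `candidate_keyframes` are assumptions.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[float]]  # grayscale image as rows of pixel intensities


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two same-sized frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def candidate_keyframes(frames: List[Frame], threshold: float) -> List[int]:
    """Return indices of frames whose difference from the previous frame
    exceeds the threshold. The first frame is always kept."""
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            kept.append(i)
    return kept
```

Because the indices are returned in increasing order, the temporal order of the candidates is preserved; later stages such as feature extraction and clustering would then refine this candidate set.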