The smart walker is a service robot that assists the elderly in walking. Deploying gait recognition algorithms on a smart walker can help the elderly identify relatives and friends from a distance in real time and obtain pedestrian information to avoid danger. In actual usage scenarios, however, it is difficult to quickly obtain qualified pedestrian silhouettes as input images for gait recognition algorithms. Unlike the fixed cameras in traditional gait recognition scenarios, the camera of a smart walker sees a dynamic background because the robot is moving, which makes it hard to accurately segment moving pedestrians from that background. Moreover, the instance segmentation algorithms commonly used to segment silhouettes in static images require high-resolution input images. To quickly segment gait silhouettes from dynamic backgrounds, we propose a color filter-based gait silhouette extraction method (CFSE) for dynamic backgrounds. We use an instance segmentation algorithm to distinguish foreground from background, a background subtraction algorithm to obtain a coarse portrait silhouette, and finally color filtering to eliminate residual background colors and obtain the final silhouette. Experimental results show that the proposed method extracts silhouettes more accurately in moving scenes, and its speed meets real-time requirements.
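The abstract's last two stages (background subtraction followed by color filtering) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the instance-segmentation stage is omitted, and the thresholds, the known-background-color list, and the function names are all hypothetical.

```python
import numpy as np

def extract_silhouette(frame, background, bg_colors, diff_thresh=30, color_tol=40):
    """Toy two-stage silhouette extraction (hypothetical parameters).

    frame, background : HxWx3 uint8 RGB arrays
    bg_colors         : list of RGB triples known to belong to the background
    """
    # Stage 1: background subtraction -- mark pixels that differ enough
    # from the (here assumed static) background model as foreground.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16)).sum(axis=2)
    mask = diff > diff_thresh

    # Stage 2: color filtering -- suppress foreground pixels whose color
    # is close to any known background color (leaked background pixels).
    for c in np.asarray(bg_colors, dtype=np.int16):
        close = np.abs(frame.astype(np.int16) - c).sum(axis=2) < color_tol
        mask &= ~close
    return mask

# Tiny synthetic example: gray background, one bright "person" pixel.
bg = np.full((4, 4, 3), 100, dtype=np.uint8)
fr = bg.copy()
fr[1, 1] = (255, 255, 255)
sil = extract_silhouette(fr, bg, bg_colors=[(100, 100, 100)])
```

In a real dynamic-background setting the background model would itself have to be updated per frame, which is exactly the difficulty the abstract describes.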
As a recognition method that does not require the target's cooperation, gait recognition is of significant value in criminal investigation. As the input of a gait recognition network, the gait silhouette occupies large storage space yet has low information density, which places high hardware demands on gait recognition systems. We propose a gait silhouette compression method based on graphical edges, which extracts silhouette edges and combines them into a gait silhouette edge image. Storage space is reduced by 90.6%, and information density increases from 4.64% to 49.48%-71.09%. Since the compressed image is not applicable to the original network, we also propose GSEInet, an improved network derived from GaitSet. The experimental results show that using our compressed images reduces network running time by 91.1% and graphics memory demand by 48.8%, without decreasing the recognition rate. This helps transplant gait recognition tasks from large servers to small embedded devices and improves deployment flexibility.
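The core idea of keeping only silhouette edges can be sketched as a simple morphological operation: an edge pixel is a foreground pixel with at least one background 4-neighbor. This is a generic illustration, not the paper's exact edge extractor or its combination scheme.

```python
import numpy as np

def silhouette_edges(mask):
    """Keep only the boundary pixels of a boolean HxW silhouette mask.

    Interior pixels (all four 4-neighbors are also foreground) are dropped,
    which is where the storage savings in an edge representation come from.
    """
    p = np.pad(mask, 1, constant_values=False)
    interior = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
                & p[1:-1, :-2] & p[1:-1, 2:])
    return mask & ~interior

# A filled 5x5 square: 25 foreground pixels, of which only the
# 16 perimeter pixels survive edge extraction.
m = np.zeros((8, 8), dtype=bool)
m[1:6, 1:6] = True
e = silhouette_edges(m)
```

Storing only edge coordinates (or the much sparser edge image) is what drives the reported storage reduction; the increase in "information density" reflects that a higher fraction of the stored pixels are informative boundary pixels.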
For the inverse kinematics of a series-parallel hybrid 7-DOF humanoid manipulator, an analytical algorithm based on adaptive parameterization of a joint posture angle is proposed. First, the schematic diagram of the mechanism is established according to the configuration of the humanoid manipulator, and the forward kinematics is derived. Second, the posture angle of the shoulder joint is adaptively parameterized by the moderate principle, and the remaining joint posture angles are then deduced analytically to complete the inverse kinematics solution. Finally, the correctness of the proposed inverse kinematics method is verified by simulation analysis, and the quality of the resulting joint posture angles is assessed with an evaluation function, which shows that adopting the moderate principle to parameterize the joint posture angles is reasonable.
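The abstract does not define the "moderate principle" or its parameterization, so it cannot be reproduced here. As a generic illustration of why a 7-DOF arm needs one extra parameter at all, the sketch below uses the common swivel-angle parameterization of the redundant degree of freedom: for fixed shoulder and wrist positions, the elbow lies on a circle, and one scalar angle selects a point on it. All names and the choice of parameterization are assumptions, not the paper's method.

```python
import numpy as np

def elbow_position(S, W, L1, L2, phi, ref=(0.0, 0.0, 1.0)):
    """Elbow point of a 7-DOF arm for a given swivel angle phi.

    S, W   : shoulder and wrist positions (3-vectors)
    L1, L2 : upper-arm and forearm lengths
    The redundancy circle is centered on the shoulder-wrist axis; phi
    sweeps the elbow around it (the extra DOF the IK must fix).
    """
    S, W, ref = (np.asarray(v, float) for v in (S, W, ref))
    u = W - S
    d = np.linalg.norm(u)
    assert abs(L1 - L2) < d < L1 + L2, "wrist out of reach"
    u /= d
    a = (d**2 + L1**2 - L2**2) / (2 * d)   # center offset along shoulder-wrist axis
    r = np.sqrt(L1**2 - a**2)              # radius of the redundancy circle
    n1 = np.cross(u, ref)
    if np.linalg.norm(n1) < 1e-9:          # ref parallel to axis: pick another
        n1 = np.cross(u, np.array([1.0, 0.0, 0.0]))
    n1 /= np.linalg.norm(n1)
    n2 = np.cross(u, n1)
    return S + a * u + r * (np.cos(phi) * n1 + np.sin(phi) * n2)

E = elbow_position([0, 0, 0], [1, 0, 1], 1.0, 1.0, phi=0.7)
```

Whatever criterion selects phi (here, presumably the moderate principle), the remaining joint angles then follow analytically, which matches the solution structure the abstract describes.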
The human eye needs only a few snapshots to recognize an action lasting several seconds, but an action recognition network needs hundreds of input frames per action. This results in a large number of floating-point operations (16 to 100 GFLOPs) to process a single sample, which hampers the deployment of graph convolutional network (GCN)-based action recognition methods when computation capability is restricted. A common strategy is to retain only a portion of the frames, but this loses important information carried by the discarded frames. Furthermore, the key-frame selection process is too independent and lacks connections with the other frames. To solve these two problems, we propose a fusion sampling network that generates fused frames from which key frames are extracted. Temporal aggregation is used to fuse adjacent similar frames, reducing both information loss and redundancy, and the concept of self-attention is introduced to strengthen the long-term association of key frames. Experimental results on three benchmark datasets show that the proposed method achieves performance competitive with state-of-the-art methods while using only 16.7% of the frames (∼50 of 300 frames in total). On the NTU 60 dataset, the FLOPs and parameter count with a single-channel input are 3.776 G and 3.53 M, respectively. This greatly reduces the excessive computational cost that action recognition incurs in practical applications due to the large amount of data it must process.
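The temporal-aggregation step (fusing runs of adjacent, similar frames into single fused frames) can be sketched as follows. This is a simplified stand-in for the paper's network, with a hypothetical cosine-similarity threshold and per-frame feature vectors in place of learned aggregation; the self-attention stage is omitted.

```python
import numpy as np

def temporal_aggregate(frames, sim_thresh=0.99):
    """Fuse runs of adjacent similar frames by averaging.

    frames : (T, D) array of per-frame feature vectors
    Adjacent frames whose cosine similarity exceeds sim_thresh are merged
    into one fused frame, shrinking T while keeping each run's content.
    """
    fused, run = [], [frames[0]]
    for f in frames[1:]:
        prev = run[-1]
        cos = f @ prev / (np.linalg.norm(f) * np.linalg.norm(prev) + 1e-12)
        if cos > sim_thresh:
            run.append(f)                      # extend the current similar run
        else:
            fused.append(np.mean(run, axis=0)) # close the run as one fused frame
            run = [f]
    fused.append(np.mean(run, axis=0))
    return np.stack(fused)

# Three near-identical frames followed by a different one -> 2 fused frames.
seq = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
out = temporal_aggregate(seq)
```

Averaging a run instead of discarding all but one frame is what distinguishes aggregation from plain key-frame sampling: redundant frames still contribute to the fused representation rather than being thrown away.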