KEYWORDS: 3D modeling, Video, Motion models, Data modeling, Education and training, Video coding, Process modeling, Transformers, Feature extraction, Semantic video
Digital rehabilitation plays a crucial role in the treatment of chronic diseases, as it enables the assessment of disease grades and the recommendation of treatment measures. In this paper, we propose a generative pre-trained transformer for rehabilitation (RehabGPT), delivered as a model-as-a-service (MaaS) solution, to facilitate foundation model building for digital rehabilitation on Alibaba's ModelScope platform. It offers the scalable computational resources needed for large pre-trained models. It also provides tools for multi-modal feature extraction, 3D human mesh reconstruction, and analysis of video sequences. RehabGPT automates various aspects of the model development workflow, such as hyper-parameter tuning and architecture selection, making it easier to achieve the desired results in rehabilitation tasks.
Attribute information in fine-grained image recognition often provides richer and more accurate category-related information. How to effectively use such knowledge to guide image classification has been a research hotspot in computer vision in recent years. We believe that fusing attribute information according to the associations between attributes yields a more accurate representation of the image. In this paper, we propose a novel Multi-Task Attribute Fusion Model (MTAF), which makes two major improvements to the traditional multi-task learning framework: 1) Attribute-Aware Feature Discrimination: spatial attention and channel attention are combined to enhance the CNN feature map, so that each attribute can be associated with the important positions and channels of the image; 2) Transformer-Based Feature Fusion: a Transformer model is introduced to better learn the logical associations between attributes, so that the reconstructed features achieve the best classification performance. We have verified our algorithm on two datasets: a self-collected medical dataset for benign and malignant thyroid identification, and an open dataset widely used for fine-grained image recognition. Experimental results on both datasets demonstrate that the proposed method achieves higher classification accuracy than the baselines.
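The two components named in the abstract can be illustrated with a minimal sketch. It is not the MTAF implementation: the abstract gives no layer details, so the sketch assumes parameter-free sigmoid gates for channel and spatial attention and a single-head self-attention with identity projections for attribute fusion. The function names `channel_spatial_attention` and `fuse_attributes` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def channel_spatial_attention(fmap):
    """Reweight a C x H x W feature map (nested lists) by a channel gate,
    then by a spatial gate.

    Assumption: both gates are sigmoids of pooled activations with no
    learned weights; a trained model would learn these gates.
    """
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    # Channel gate: global average pool per channel, squashed to (0, 1).
    ch_gate = [sigmoid(sum(sum(row) for row in ch) / (H * W)) for ch in fmap]
    gated = [[[v * ch_gate[c] for v in row] for row in fmap[c]]
             for c in range(C)]
    # Spatial gate: mean over channels at each position, squashed to (0, 1).
    sp_gate = [[sigmoid(sum(gated[c][i][j] for c in range(C)) / C)
                for j in range(W)] for i in range(H)]
    return [[[gated[c][i][j] * sp_gate[i][j] for j in range(W)]
             for i in range(H)] for c in range(C)]


def fuse_attributes(attr_vecs):
    """Single-head scaled dot-product self-attention over attribute vectors.

    Assumption: identity query/key/value projections (no learned weights),
    so each output is a similarity-weighted mix of all attribute vectors.
    """
    d = len(attr_vecs[0])
    fused = []
    for q in attr_vecs:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in attr_vecs]
        weights = softmax(scores)
        fused.append([sum(w * v[t] for w, v in zip(weights, attr_vecs))
                      for t in range(d)])
    return fused
```

In a full model, the attended feature map would be pooled into one vector per attribute before fusion, and the fused vectors would feed the per-attribute and category classifiers. Those steps depend on design choices the abstract does not state, so they are omitted here.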