Previous 3D pose and shape estimation methods often suffer from depth ambiguity. We present a novel method that reduces this ambiguity by explicitly considering the depth of a person’s body surface. The key idea is to minimize the difference between the depth estimated from an input image and the projected depth of the reconstructed 3D mesh. This allows the proposed method to estimate 3D pose and body shape with plausible 3D joint locations. Evaluations show that the proposed method produces more appropriate 3D meshes and reduces both 3D pose and shape estimation errors.
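The depth-consistency idea above can be sketched as a simple loss term: render the per-pixel depth of the reconstructed mesh with the camera and penalize its deviation from the monocular depth estimate. The function name, the L1 penalty, and the visibility mask below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def depth_consistency_loss(estimated_depth, projected_mesh_depth, mask):
    """Mean L1 difference between the depth map estimated from the image
    and the depth rendered from the reconstructed 3D mesh.

    estimated_depth:      (H, W) monocular depth estimate
    projected_mesh_depth: (H, W) mesh depth projected with the camera
    mask:                 (H, W) boolean, True where the body is visible
    """
    diff = np.abs(estimated_depth - projected_mesh_depth)
    return diff[mask].mean()

# toy example: the two depth maps differ by 0.5 m on the body region
est = np.full((4, 4), 2.0)
proj = np.full((4, 4), 2.5)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
loss = depth_consistency_loss(est, proj, mask)  # 0.5
```

Minimizing such a term pushes the mesh toward the depth the image itself supports, which is how the method constrains otherwise ambiguous joint depths.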
Gestures complement the content of an utterance and help listeners understand it. In the field of gesture generation, the task of generating gestures from utterances has attracted attention. The main approach is to associate utterances with gestures using deep neural networks, which requires a co-speech gesture dataset. However, building such datasets is costly and time-consuming because it requires a reliable pose estimation system (such as motion capture) and manual adjustment. We propose an automatic method to collect a co-speech gesture dataset from online speech videos. The method extracts diverse utterance and gesture pairs from online speech videos. In addition, we use the collected dataset to train a deep neural network and confirm that our automatically collected dataset can serve as a supervisory signal for speech-driven gesture generation.
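One step of such a collection pipeline, pairing transcribed utterance segments with detected gesture segments by temporal overlap, could look like the following sketch. The `Segment` type, the overlap threshold, and the pairing rule are assumptions for illustration, not the paper's actual procedure.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start: float  # seconds
    end: float

def pair_utterances_with_gestures(utterances, gestures, min_overlap=0.5):
    """Pair each utterance with every gesture segment that overlaps it in
    time by at least `min_overlap` seconds (threshold is illustrative)."""
    pairs = []
    for u in utterances:
        for g in gestures:
            overlap = min(u.end, g.end) - max(u.start, g.start)
            if overlap >= min_overlap:
                pairs.append((u, g))
    return pairs

# toy timeline: one utterance, two gestures, only the first overlaps it
utts = [Segment(0.0, 3.0)]
gests = [Segment(1.0, 2.5), Segment(5.0, 6.0)]
pairs = pair_utterances_with_gestures(utts, gests)
```

In a full pipeline, the utterance segments would come from automatic speech recognition and the gesture segments from a pose estimator run on the video frames.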
We propose a refinement module that improves action recognition by considering the semantic relevance between verbs and nouns. Existing methods recognize an action as a combination of a verb and a noun. However, they occasionally produce semantically implausible combinations, such as “drink a cupboard” or “open a carrot”. To tackle this problem, we propose a method that incorporates a word embedding model into an action recognition network. The word embedding model is trained to capture the co-occurrence between verbs and nouns and is used to refine the initial class probabilities estimated by the network. Experimental results show that our method improves the estimation accuracy of verbs and nouns on the EPIC-KITCHENS dataset.
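The refinement step can be illustrated as reweighting the network's initial verb and noun probabilities with a verb–noun co-occurrence matrix, so that pairs that never co-occur are suppressed. The blending parameter `alpha` and the specific rescaling rule here are hypothetical, not the paper's exact formulation.

```python
import numpy as np

def refine_with_cooccurrence(p_verb, p_noun, cooc, alpha=0.5):
    """Rescale initial verb/noun probabilities with co-occurrence scores.

    p_verb: (V,) initial verb probabilities from the network
    p_noun: (N,) initial noun probabilities from the network
    cooc:   (V, N) verb-noun co-occurrence scores in [0, 1]
    alpha:  blending weight (hypothetical parameter); alpha=0 fully
            trusts the co-occurrence model
    """
    # joint plausibility of every (verb, noun) pair
    joint = np.outer(p_verb, p_noun) * (alpha + (1 - alpha) * cooc)
    joint /= joint.sum()
    # refined marginals for verbs and nouns
    return joint.sum(axis=1), joint.sum(axis=0)

# toy vocabulary: verbs ("open", "drink"), nouns ("cupboard", "water");
# "drink cupboard" and "open water" never co-occur
p_verb = np.array([0.4, 0.6])
p_noun = np.array([0.7, 0.3])
cooc = np.array([[1.0, 0.0],
                 [0.0, 1.0]])
rv, rn = refine_with_cooccurrence(p_verb, p_noun, cooc, alpha=0.0)
```

In the toy example the network initially prefers "drink", but because the most probable noun is "cupboard" and "drink cupboard" has zero co-occurrence, the refined verb distribution shifts toward "open".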
Visual tracking produces the trajectory of a person’s movement in a video and is an important element of human behavior analysis. However, most visual tracking methods do not achieve the precision and speed required for real-world use. High accuracy and low computational complexity are both demanded, yet the two trade off against each other. We therefore aim to propose a tracking method that balances accuracy and computational complexity. In this paper, we propose a method named Dual Cost Graph (DCG)-Tracker, which uses two graphs: a clique graph and a flow network. We evaluated DCG-Tracker on the PNNL Parking Lot 1 dataset and showed that it balances accuracy and speed.
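As a rough illustration of the data-association problem such a tracker must solve, the sketch below matches tracks to detections by solving an assignment problem over a Euclidean cost matrix with a gating threshold. This is a simplified stand-in for DCG-Tracker's clique-graph and flow-network optimization, not its actual algorithm; the distance cost and `max_dist` threshold are assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks, detections, max_dist=50.0):
    """Frame-to-frame association: assign detections to tracks by
    minimizing total Euclidean distance, then drop matches whose
    cost exceeds the gating threshold.

    tracks:     (T, 2) last known track positions
    detections: (D, 2) detection positions in the current frame
    """
    cost = np.linalg.norm(tracks[:, None, :] - detections[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    return [(int(r), int(c)) for r, c in zip(rows, cols)
            if cost[r, c] <= max_dist]

# toy frame: the first detection continues track 0; the second is too
# far from any track and starts a new one
tracks = np.array([[0.0, 0.0], [10.0, 10.0]])
dets = np.array([[1.0, 0.0], [100.0, 100.0]])
matches = associate(tracks, dets)  # [(0, 0)]
```

A flow-network formulation generalizes this per-frame assignment to whole trajectories, which is the direction the abstract's flow network points to.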
In classification tasks, the accuracy of a classifier depends on its training data, and it is known that inter-class imbalanced data degrades classification accuracy. Previous approaches tend to use data augmentation to address inter-class imbalance, but the possibility of intra-class imbalance has been ignored. In this paper, we propose a novel method to address intra-class imbalance with a Generative Adversarial Network (GAN). The key idea is to examine the distribution of the training data in latent space. We experimentally demonstrate that the proposed method generates diverse images and improves classification accuracy on the CIFAR-10 dataset.
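One way to "examine the distribution of training data in latent space", as the abstract puts it, is to estimate each sample's local density and draw the sparse, under-represented intra-class modes more often when generating with the GAN. The k-NN density heuristic below is an illustrative assumption, not the paper's exact procedure.

```python
import numpy as np

def intra_class_sampling_weights(latent, k=5):
    """Weight each sample by its average distance to its k nearest
    neighbours in latent space: samples in sparse (under-represented)
    regions of a class get larger weights, steering generation toward
    the minority modes. An illustrative heuristic only.

    latent: (N, d) latent codes of one class's training samples
    """
    # pairwise distances, with self-distances excluded
    d = np.linalg.norm(latent[:, None, :] - latent[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    knn = np.sort(d, axis=1)[:, :k]   # distances to k nearest neighbours
    w = knn.mean(axis=1)              # large in sparse regions
    return w / w.sum()

# toy class: a dense mode near the origin plus one isolated sample
dense = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                  [0.1, 0.1], [0.05, 0.05], [0.2, 0.1]])
latent = np.vstack([dense, [[10.0, 10.0]]])
w = intra_class_sampling_weights(latent)
```

The isolated sample receives by far the largest weight, so a generator conditioned on these weights would produce more examples of that rare mode.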