Chest radiographs are complex, heterogeneous medical images that depict many different types of tissues, and many
different types of abnormalities. A radiologist develops a sense of what visual textures are typical for each anatomic
region within chest radiographs by viewing a large set of "normal" radiographs over a period of years. As a result, an
expert radiologist is able to readily detect atypical features. In our previous research, we modeled this type of learning by
(1) collecting a large set of "normal" chest radiographs, (2) extracting local textural and contour features from
anatomical regions within these radiographs, in the form of high-dimensional feature vectors, (3) using a distance-based
transductive machine learning method to learn what is typical for each anatomical region, and (4) computing atypicality
scores for the anatomical regions in test radiographs. That research demonstrated that the transductive One-Nearest-Neighbor (1NN) method was effective for identifying atypical regions in chest radiographs. However, the large set of
training instances (and the need to compute a distance to each of these instances in a high-dimensional space) made the
transductive method computationally expensive. This paper discusses a novel online Variance Based Instance Selection
(VBIS) method for use with the Nearest Neighbor classifier that (1) substantially reduced the computational cost of the
transductive 1NN method, while maintaining a high level of effectiveness in identifying regions of chest radiographs
with atypical content, and (2) allowed the incremental incorporation of training data from new informative chest
radiographs as they are encountered in day-to-day clinical work.
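The transductive 1NN scoring described above can be sketched as follows. This is an illustrative sketch only: the function name is ours, and the toy 2-D vectors stand in for the actual high-dimensional texture and contour feature vectors.

```python
import numpy as np

def atypicality_scores(train_features, test_features):
    # Transductive 1NN: a test region's atypicality score is its distance
    # to the closest "normal" training feature vector -- no model is built.
    scores = []
    for x in test_features:
        # Distance to every training instance: this full scan over a large
        # training set is the cost that instance selection aims to reduce.
        dists = np.linalg.norm(train_features - x, axis=1)
        scores.append(dists.min())
    return np.array(scores)

# Toy 2-D stand-ins for the high-dimensional texture/contour vectors.
normal = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
tests = np.array([[0.1, 0.1], [5.0, 5.0]])
scores = atypicality_scores(normal, tests)
print(scores)  # the second test region scores far higher (more atypical)
```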
Inductive learning refers to machine learning algorithms that learn a model from a set of training data instances. Any test
instance is then classified by comparing it to the learned model. When the set of training instances lends itself well
to modeling, the use of a model substantially reduces the computational cost of classification. However, some training data
sets are complex, and do not lend themselves well to modeling. Transductive learning refers to machine learning
algorithms that classify test instances by comparing them to all of the training instances, without creating an explicit
model. This can produce better classification performance, but at a much higher computational cost.
Medical images vary greatly across human populations, constituting a data set that does not lend itself well to modeling.
Our previous work showed that the wide variations seen across training sets of "normal" chest radiographs make it
difficult to successfully classify test radiographs with an inductive (modeling) approach, and that a transductive approach
leads to much better performance in detecting atypical regions. The problem with the transductive approach is its high
computational cost.
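The failure mode of modeling on such data can be seen in a minimal toy example. The bimodal "normal" set and the single-centroid model below are hypothetical stand-ins, not the actual feature distributions or models from our experiments:

```python
import numpy as np

# Toy "normal" set with two distinct modes -- e.g., two body habitus groups.
train = np.array([[0.0, 0.0], [0.2, 0.0], [10.0, 10.0], [10.2, 10.0]])
test = np.array([0.1, 0.0])  # clearly typical: it sits inside the first mode

# Inductive: summarize the training data into one model (here, a centroid).
# The centroid falls between the modes, so the typical instance looks atypical.
centroid = train.mean(axis=0)
inductive = np.linalg.norm(test - centroid)       # ~7.1: wrongly flagged

# Transductive: compare against every training instance, with no model.
transductive = np.linalg.norm(train - test, axis=1).min()  # 0.1: correct
print(inductive, transductive)
```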
This paper develops and demonstrates a novel semi-transductive framework that can address the unique challenges of
atypicality detection in chest radiographs. The proposed framework combines the superior performance of transductive
methods with the reduced computational cost of inductive methods. Our results show that the proposed semi-transductive
approach provides both effective and efficient detection of atypical regions within a set of chest radiographs
previously labeled by Mayo Clinic expert thoracic radiologists.
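The details of VBIS are developed in the body of the paper. Purely to illustrate how online instance selection shrinks the set a 1NN classifier must scan, a condensed-nearest-neighbor-style sketch (our stand-in, NOT the VBIS algorithm itself) might look like:

```python
import numpy as np

def online_select(stream, threshold):
    # Keep an incoming training instance only if it lies at least
    # `threshold` away from every instance already kept. This is a
    # condensed-nearest-neighbor style heuristic, not VBIS; it only
    # illustrates online selection over instances as they arrive.
    kept = []
    for x in stream:
        if not kept or min(np.linalg.norm(np.array(kept) - x, axis=1)) >= threshold:
            kept.append(x)
    return np.array(kept)

# Redundant instances in the stream are dropped; novel ones are kept.
stream = np.array([[0.0, 0.0], [0.05, 0.0], [1.0, 1.0], [1.02, 1.0], [3.0, 0.0]])
kept = online_select(stream, threshold=0.5)
print(len(kept))  # 3 -- near-duplicates were not retained
```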
Active learning methods have gained popularity as a way to reduce the human effort required to annotate examples for training a classifier. When faced with large amounts of data, an active learning algorithm automatically selects the data samples that are most relevant for training the classifier. Typical active learning approaches select one data instance (one face image, for example) per iteration of the algorithm, and the classifier is trained with the selected instances one by one. More recently, active learning research has turned to selecting a batch of examples for labeling at each step, rather than selecting a single example and updating the hypothesis. In this work, a novel batch-mode active learning scheme based on numerical optimization of an appropriate objective function is applied to the biometric recognition problem. In problems such as face recognition, real-world data is often generated in batches, such as frames of video in a capture session. In such scenarios, selecting the most appropriate instances from these batches (which usually have high redundancy) to train a classifier is a significant challenge. In this work, instance selection is formulated as a mathematical optimization problem, and the framework is extended to handle learning from multiple sources of information. Results obtained on the widely used NIST Multiple Biometric Grand Challenge (MBGC) and VidTIMIT biometric datasets corroborate the potential of this method for real-world biometric recognition problems involving large amounts of data.
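A sketch of batch selection from a redundant pool is given below. It uses a greedy farthest-first diversity heuristic, which is deliberately simpler than the paper's optimization-based formulation; the function name and toy pool are illustrative assumptions.

```python
import numpy as np

def select_batch(pool, k):
    # Greedy farthest-first selection of k diverse instances from a
    # redundant pool (e.g., near-duplicate video frames). Illustrative
    # heuristic only -- not the optimization formulation of the paper.
    chosen = [0]  # seed with the first instance
    while len(chosen) < k:
        # Pick the pool point farthest from everything chosen so far.
        dists = np.min(
            np.linalg.norm(pool[:, None, :] - pool[chosen][None, :, :], axis=2),
            axis=1)
        chosen.append(int(np.argmax(dists)))
    return chosen

# Pool with near-duplicate pairs: indices (0, 1) and (2, 3) are redundant.
pool = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [0.0, 9.0]])
chosen = select_batch(pool, 3)
print(chosen)  # one representative per region; no near-duplicate pair
```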
Experienced radiologists are in short supply, and are sometimes called upon to read many images in a short amount of
time. The resulting time pressure can lead to fatigue and stress, which are sources of error: a hurried radiologist may
overlook subtle abnormalities that he or she would otherwise catch. Another factor in error rates is satisfaction of
search, where a radiologist misses a second (typically subtle) abnormality after finding the first. These types of
errors are due primarily to a lack of attention to an important region of the image during the search. In
this paper we discuss the use of eye tracker technology, in combination with image analysis and machine learning
techniques, to learn what types of features catch the eye of experienced radiologists when reading chest x-rays for
diagnostic purposes, and to then use that information to produce saliency maps that predict what regions of each image
might be most interesting to radiologists. We found that, out of 13 popular feature types that are widely extracted to
characterize images, 4 are particularly useful for this task: (1) Localized Edge Orientation Histograms, (2) Haar
Wavelets, (3) Gabor Filters, and (4) Steerable Filters.
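As a sketch of one of those feature types, a Gabor filter response can be computed with a hand-rolled kernel. The kernel size, wavelength, and envelope width below are arbitrary illustrative choices, not the parameters used in our experiments.

```python
import numpy as np

def gabor_kernel(size, theta, lam, sigma):
    # Oriented sinusoid (wavelength lam, orientation theta) under an
    # isotropic Gaussian envelope of width sigma.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) * np.cos(2.0 * np.pi * xr / lam)

# Toy patch of vertical stripes: a vertically tuned filter (theta = 0)
# should respond much more strongly than a horizontally tuned one.
yy, xx = np.mgrid[-4:5, -4:5]
patch = np.cos(2.0 * np.pi * xx / 4.0)
v = float((gabor_kernel(9, 0.0, 4.0, 2.0) * patch).sum())
h = float((gabor_kernel(9, np.pi / 2.0, 4.0, 2.0) * patch).sum())
print(v > 10.0 * abs(h))  # → True
```

A bank of such kernels at several orientations and wavelengths, applied across the image, yields the kind of orientation-sensitive responses that feed a saliency map.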