This paper proposes a system that detects the presence of the tongue and locates its relative position within the mouth using video from a web camera. The system includes an offline phase, carried out before operation by the end user, in which a three-layer cascade of SVM classifiers is trained on a database of 'tongue vs. not-tongue' images. These are segmented images containing the region of interest: the mouth with the tongue in one of three possible positions (center, left, or right). The first stage decides whether the tongue is present and passes its output to the next stage, which evaluates whether the tongue is in the center of the mouth; the last stage distinguishes left from right. Because of the novelty of the proposed system, a database had to be created from images gathered from people of different ethnic backgrounds. Although the system has yet to be tested online, results from the offline phase indicate that real-time performance is feasible in the near future. Finally, several applications of this prototype are introduced, showing that the tongue can serve as an effective alternative input device for a broad range of users, including people with physical disabilities.
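The three-stage decision order described above can be sketched as follows. This is only an illustration of the cascade logic: the stage functions are hypothetical stand-ins for trained SVM classifiers, and the two-element feature vector is an assumption made for the example, not the feature set used in the paper.

```python
from typing import Callable, Sequence

Features = Sequence[float]
Stage = Callable[[Features], bool]  # hypothetical: one trained binary classifier


def classify_tongue(x: Features, has_tongue: Stage,
                    is_center: Stage, is_left: Stage) -> str:
    """Apply the three binary decisions in the order given in the abstract."""
    if not has_tongue(x):      # stage 1: tongue vs. not-tongue
        return "none"
    if is_center(x):           # stage 2: center vs. not-center
        return "center"
    return "left" if is_left(x) else "right"  # stage 3: left vs. right


# Toy stand-ins (assumptions): x[0] is "tongue evidence", x[1] a horizontal offset.
def toy_has_tongue(x: Features) -> bool:
    return x[0] > 0.5


def toy_is_center(x: Features) -> bool:
    return abs(x[1]) < 0.2


def toy_is_left(x: Features) -> bool:
    return x[1] < 0


print(classify_tongue([0.9, -0.6], toy_has_tongue, toy_is_center, toy_is_left))
```

A later stage runs only when the earlier one has accepted the sample, so each classifier needs to be trained only on the images that can reach it.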
Digital images are nowadays represented on square lattices. Everyday items such as digital cameras and displays, as well as many vision and image processing systems, use square lattices to represent an image. However, because the distance between adjacent pixels is not constant, any filter based on a square lattice has inherent anisotropy. Ando introduced consistent gradient filters to cope with this problem, deriving the filters so as to minimize inconsistency. Square lattices are not, however, the only way to arrange pixels. Another arrangement is found, for example, in the human retina, where receptors adopt a hexagonal structure. In contrast to square lattices, the distance between adjacent pixels is constant in such structures. The principal advantage of filters based on hexagonal matrices is therefore their isotropy. In this paper, we derive consistent gradient filters of hexagonal matrices, following Ando's method for deriving consistent gradient filters of square matrices. The resulting hexagonal consistent gradient filters are compared with the square ones. The results indicate that the hexagonal filters derived in this paper are superior to the square ones in consistency, in the ratio of consistency to output power, and in localization.
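The difference in neighbor distances can be checked numerically. The sketch below assumes unit pixel spacing, an 8-neighborhood for the square lattice, and six neighbors 60 degrees apart for the hexagonal lattice; it does not derive the gradient filters themselves.

```python
import math

# Square lattice: the 8 neighbors of a pixel on a unit grid.
square_neighbours = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0)]

# Hexagonal lattice: six neighbors at unit distance, 60 degrees apart.
hex_neighbours = [(math.cos(math.radians(60 * k)), math.sin(math.radians(60 * k)))
                  for k in range(6)]


def distinct_distances(offsets, ndigits=9):
    """Sorted set of center-to-neighbor distances, rounded to absorb float noise."""
    return sorted({round(math.hypot(dx, dy), ndigits) for dx, dy in offsets})


print(distinct_distances(square_neighbours))  # two values: 1 and sqrt(2)
print(distinct_distances(hex_neighbours))     # a single value: 1
```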
Three-dimensional (hereafter, 3D) imaging is a powerful tool for helping people understand the spatial relationships of objects. Various glasses-free 3D imaging technologies have been developed for 3D TV, personal computers, PDAs, and cellular phones. These devices are often viewed for long periods, and most people who watch 3D images for a long time experience asthenopia, or eye fatigue. This paper reports a preliminary study that attempted to find the basic cause of the problem using MEG and other devices. Further neurophysiological study on this subject is planned. The purpose of my study is to design a standard or guidelines for shooting, image processing, and displaying 3D images so as to produce suitable images with higher quality and little or no asthenopia. Although it is difficult to avoid asthenopia completely when viewing 3D images, guidelines for producing such images that reduce its severity would be useful. The final goal of my research is to formulate such guidelines on an objective basis derived from measurements with MEG and other devices. In addition to this study, I was in charge of installing the world's largest glasses-free 3D display at the Japan Pavilion Nagakute at the 2005 World Exposition, Aichi, Japan, held from March 25 to September 25, 2005. Several types of large screens for 3D movies were also available for testing, and the results of those tests are added to this report.
In this paper, a new image registration methodology for matching anatomical medical images is presented. It is based on a point-to-point matching methodology that uses the Cauchy-Navier spline transformation to model the deformable anatomical behavior associated with non-rigid-body medical image registration. These transformations are illustrated by matching corresponding CT and MR images of the thorax. Applying the Cauchy-Navier spline transformation to landmarks created with a segmentation method yields improved performance. The Cauchy-Navier spline is compared with the thin plate spline and multi-quadratic methods.
We are developing a spectral characteristics database for evaluating color reproduction in image input devices. The database is designed so that spectral characteristics are systematically divided into categories according to their intended application, and every category has a sufficient variety of samples. The categories are as follows: photographic materials, graphic color printing, computer color printer output, paints, flowers, leaves, human skin colors, and historical Krinov data. The total number of colors is 49,672. The database is being proposed for publication as Japanese/International standard technical reports, for use in establishing a new color reproduction evaluation method.
KEYWORDS: Image compression, Image processing, Signal to noise ratio, Medical imaging, Image quality, Image filtering, Computer programming, Magnetic resonance imaging, Digital filtering, Algorithm development
Noise greatly degrades the quality of an image and the performance of any image compression algorithm. This paper presents an approach to the representation and compression of noisy images. A new concept, the region-based prediction (RBP) model, is first introduced and then applied to noisy images. In conventional predictive coding techniques, the context for prediction is always composed of the individual pixels surrounding the pixel to be processed. The RBP model instead uses the regions surrounding the pixel to be predicted. A practical algorithm for implementing RBP is developed. In our experiments, the practical algorithm is applied to noisy synthetic images. Encoding the noisy synthetic image gives a bit rate of 1.10 bits/pixel, and the decompressed image achieves a peak SNR of 42.59 dB. Compared with the peak SNR of 41.01 dB for the noisy synthetic image, the decompressed synthetic image is better in the MSE sense. At the same bit rate, the JPEG image compression standard gives a peak SNR of 33.17 dB for the noisy synthetic image, and a conventional median filter with a 3 by 3 window gives a peak SNR of 25.89 dB. The RBP model is also applied to a medical MRI image. Subjective evaluation of the decompressed MRI image shows that our method is superior to the conventional methods.
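The peak SNR figures quoted above follow the usual definition, PSNR = 10 log10(peak^2 / MSE), so a higher peak SNR means a lower mean squared error against the reference image. A minimal sketch, assuming 8-bit images (peak value 255) given as lists of rows; the function name `psnr` is illustrative and the RBP coder itself is not reproduced here:

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak SNR in dB between two equally sized images (lists of pixel rows)."""
    ref = [p for row in reference for p in row]
    tst = [p for row in test for p in row]
    if not ref or len(ref) != len(tst):
        raise ValueError("images must be non-empty and have the same size")
    mse = sum((a - b) ** 2 for a, b in zip(ref, tst)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


clean = [[10, 20], [30, 40]]
slightly_off = [[11, 20], [30, 40]]
far_off = [[50, 20], [30, 40]]
print(psnr(clean, slightly_off) > psnr(clean, far_off))  # smaller error, higher PSNR
```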
A method for obtaining intelligent behavior and influencing the shapes of artificial creatures following an evolutionary model is described. The creatures acquire intelligent behavior by interacting with the variable conditions of the environment in which they live. Our algorithm proposes a way for the creatures to evade enemies, overcome obstacles, search for a mate, and look after their own needs by using the five senses. The evolutionary model used in our research is based on Genetic Algorithms (GA) and offers a novel way to generate new shapes from intelligent behaviors. This paper proposes a method to generate intelligent behaviors for artificial creatures and to evolve their shapes by extending our previously proposed evolutionary model.
KEYWORDS: Visualization, Visual compression, Image compression, Principal component analysis, Computed tomography, 3D displays, 3D visualizations, Magnetic resonance imaging, 3D modeling, Data compression
This paper presents one part of our work on 'hierarchical visualization.' Here, we propose an efficient hierarchical structure for the visualization of 3D volumetric data, where for instance, data compression techniques are embedded into the hierarchical structure. The hierarchical structure consists of two layers. The first hierarchy roughly visualizes the whole lossy compressed data so that we can quickly understand and interpret the outline of the whole 3D volumetric dataset. This step allows users to choose the desired parts of the 3D volumetric data. The second hierarchy is used for scrutinizing the detail of the chosen parts using lossless compressed data. To implement the hierarchical structure we propose a partitioning algorithm for 3D volumetric data based on principal component analysis. With principal component analysis, the original 3D volumetric space can be divided into a number of 3D volumetric blocks such that each block contains similar data. Analyzing these specific blocks and taking advantage of an octree hierarchy, from the whole 3D volumetric dataset we can easily access only the parts which are required for display. The proposed partitioning algorithm should be very useful in reducing the amount of rendering, which means that we will need only to precisely render the blocks required for display, and not the whole 3D volumetric dataset.
This paper proposes a hair volume algorithm that is effective in producing a realistic hair model for an individual. The hair volume is produced from three images of the head taken from the right, back, and top views. This hair volume represents the real-space volume where hairs are found and guides the hair strands to move in the desired direction. Hair strands are randomly generated on the skull. The outline region acquired through image processing, together with the hair volume, ensures that the randomly generated hair strands fall neatly into the hair volume to produce a hair model resembling the input images. This hair model can find many applications in the generation of synthetic humans and creatures in movies, multimedia, and computer game productions.
Recent efforts in image morphing research aim at improving both the user interface and the warping results. Specifying feature points in two images through the user interface takes much time, and specifying the warp by hand is very tedious work for the user. In this paper, we propose a semi-automatic algorithm based on an active contour model to specify the feature correspondence between two given images. It allows a user to extract a contour that defines a facial feature such as the lips, mouth, or profile by specifying only the endpoints of the contour around the feature. The proposed algorithm uses these two points as anchor points and automatically computes the image information around them to provide boundary conditions. We then optimize the contour by taking this information into account close to its extremities. During the iterative optimization process, the image forces move progressively from the contour's extremities towards its center to define the feature. Once the feature correspondence points are paired, the intermediate images are generated by linearly interpolating the positions of the feature points. The proposed algorithm helps the user to define the exact position of a feature easily. It may also reduce the time taken to establish feature correspondence between two images.
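The final step, linear interpolation of paired feature points, can be sketched as follows. This is a minimal illustration that assumes the correspondences are already given as two equally long lists of (x, y) points; the function name `interpolate_points` is hypothetical, and the contour extraction and warping steps are not shown.

```python
def interpolate_points(src, dst, t):
    """Linearly interpolate paired feature points; t=0 gives src, t=1 gives dst."""
    if len(src) != len(dst):
        raise ValueError("feature lists must be paired one-to-one")
    return [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(src, dst)]


# Two paired features in the source and destination images (toy coordinates).
source_features = [(0, 0), (10, 0)]
target_features = [(0, 10), (20, 10)]

# Feature positions for five in-between frames, including both end frames.
for step in range(5):
    print(interpolate_points(source_features, target_features, step / 4))
```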
The aim of this study is to propose a new solution to the following image morphing problems: feature specification, image warping, and cross dissolve between two deformed images. First, we adopt a semi-automatic algorithm based on the Ziplock snake to specify the feature correspondence between two given images. It allows a user to extract a contour that defines a facial feature such as the lips, mouth, or profile by specifying only the endpoints of the contour around the feature. We use these two points as anchor points and automatically compute the image information around them to provide boundary conditions. Next, we optimize the contour by taking this information into account, at first only close to its extremities. During the iterative optimization process, the image forces move progressively from the contour's extremities towards its center to define the feature. This helps the user to define the exact position of features easily, and it may also reduce the time taken to establish feature correspondence between two images. For the second image morphing problem, this paper presents a warping algorithm using the thin plate spline, a well-known scattered data interpolation method that has several advantages. It is efficient in time complexity and produces smoothly interpolated morphed images with only a remarkably small number of specified feature points. It allows each feature point to be mapped to the corresponding feature point in the warped image. Once the images are warped to align the positions of features and their shapes, the in-between animation between the two given images can be defined by cross dissolving the positions of corresponding features together with their shapes and colors. We describe an efficient cross dissolve algorithm to generate the in-between images.
KEYWORDS: Image processing, 3D modeling, Data modeling, Solid modeling, 3D image processing, Feature extraction, Skull, Head, Data processing, Multimedia
In the field of human animation, hair represents one of the most challenging problems and has therefore been one of the least satisfactory aspects of human images rendered to date. This paper proposes a method to generate a realistic hair model for individuals based on image processing. The analysis and recognition of hair strands by image processing provides valuable data, particularly the hair outline and the flow direction of the hair, for rendering a realistic hair model for individuals. The image is binarized prior to line extraction, and the hair region is determined by a series of expansions and contractions. These data provide the basic guidance for the generation of the hair model. A simplified spring model is used for the hair modeling. In this spring system, a strand of hair is modeled as a series of interconnected masses, springs, and hinges. Hair strands are randomly generated on the skull. The outline region acquired through image processing ensures that the randomly generated hair strands fall neatly into the hair region. These strands are randomly rotated to shuffle them. When hair strands fall outside the outline region, weights are added or reduced at the interconnected masses in order to move the strands back into the hair region. The lines extracted by image processing serve as guidelines directing the hair strands to point in the desired direction. This realistic hair model can find many applications in the generation of synthetic humans and creatures in movies, multimedia, and computer game productions.
A method is given for synthesizing a texture by using the interface of a conventional drawing tool. The majority of conventional texture generation methods are based on the procedural approach, and can generate a variety of textures that are adequate for generating a realistic image. But it is hard for a user to imagine what kind of texture will be generated simply by looking at its parameters. Furthermore, it is difficult to design a new texture freely without a knowledge of all the procedures for texture generation. Our method offers a solution to these problems, and has the following four merits: First, a variety of textures can be obtained by combining a set of feature lines and attribute functions. Second, data definitions are flexible. Third, the user can preview a texture together with its feature lines. Fourth, people can design their own textures interactively and freely by using the interface of a conventional drawing tool. For users who want to build this texture generation method into their own programs, we also give the language specifications for generating a texture. This method can interactively provide a variety of textures, and can also be used for typographic design.
In this paper, we attempt to predict the location of a cable in the next frame from parameters measured in the current frame. As an initial condition, the approximate location of the cable must be given. Applying the Hough transform yields a point with a high accumulated value in Hough space. Using that point, the inverse Hough transform then gives the cable's location, and the cable is marked in the original image. Based on actual working conditions and performance indices, as well as the location in the current frame, we evaluate the maximum range of angles within which the cable is likely to appear in the next frame. Furthermore, a narrow range of locations in the next frame is determined according to the speed of the robot. This narrow range minimizes the influence of the background on the detected object, so the location of the cable can be detected more accurately. Finally, as the ultimate target of this project, an approach for detecting a dangerous state of the cable is shown. In this approach, the predicted area is divided into three parts by a simple area partition. From the relative changes over time in the size of the area between the shadow and the cable, we can conclude whether the cable is suspended in water.
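The Hough step, and the idea of restricting the search to a predicted angle range, can be sketched as follows. This is a minimal illustration using the standard line parameterization x cos(theta) + y sin(theta) = rho with 1-degree and 1-pixel bins. The function name `hough_peak`, the bin sizes, and the example angle window are assumptions; the paper's rule for choosing the window from the robot's speed is not reproduced.

```python
import math


def hough_peak(points, theta_degrees=range(180)):
    """Return the (theta in degrees, rho) accumulator cell that collects the most votes."""
    votes = {}
    for x, y in points:
        for deg in theta_degrees:
            theta = math.radians(deg)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(deg, rho)] = votes.get((deg, rho), 0) + 1
    return max(votes, key=votes.get)


# Edge pixels of a horizontal "cable" at y = 5.
cable_pixels = [(x, 5) for x in range(100)]

# Full search over all angles in the current frame.
print(hough_peak(cable_pixels))

# Next frame: search only a window around the previous angle (assumed +/- 10 degrees).
print(hough_peak(cable_pixels, theta_degrees=range(80, 101)))
```

Restricting `theta_degrees` reduces the number of votes cast and keeps background edges at other orientations out of the accumulator.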
An attenuated phase-shifting mask with a single-layer absorptive shifter of CrO, CrON, MoSiO, or MoSiON film has been developed. The optical parameters of these films can be controlled through the sputtering deposition conditions. These films satisfy the shifter requirements for i-line: a 180-degree phase shift and a transmittance between 5 and 20%. MoSiO and MoSiON films also satisfy the requirements for KrF excimer laser light. Conventional mask processes, such as etching, cleaning, defect inspection, and defect repair, can be used for the mask fabrication. Defect-free masks for hole layers of 64 M-bit DRAM are obtained. Using this mask, the focus depth of a 0.35-micrometer hole is improved from 0.6 micrometers to 1.5 micrometers for i-line lithography. The printing of 0.2-micrometer hole patterns is achieved by combining this mask with KrF excimer laser lithography.
Properties of the neural networks employed in image data compression are studied, and a method for increasing the compression capability is proposed. Since multiple-gray-level images contain a large quantity of data, the limited mapping capacity of the neural network is the main cause of poor data compression capability. To increase the compression capability, the proposed method first divides an image into subimages, or blocks. These blocks are then divided into several classes, and several independent neural networks are assigned adaptively to the blocks according to their classes. Since the mapping capacity is proportional to the number of neural networks and the quantity of data does not increase, the compression capability is increased efficiently by our method. Computer simulation results show that the signal-to-noise ratio (SNR) of the reconstructed images was increased by about 1 to 2 dB by our method. The visual image quality in particular was improved.
There are many kinds of image processing that are essentially sequential. Raster scan is commonly used for such sequential operations; it scans an image from left to right, line by line. Another scan, called the Peano scan, traverses an image from each pixel to a neighboring one, changing direction frequently. This scan avoids the periodic patterns that are sometimes observed in images transformed in raster scan order. However, the Peano scan can be applied only to square images whose horizontal and vertical sizes are a power of two. We present a new scan, called the ternary scan, which has the same property as the Peano scan and can fit any rectangular image. An application of the ternary scan to Floyd-Steinberg halftoning is shown.
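For reference, Floyd-Steinberg halftoning in conventional raster-scan order can be sketched as follows. This is the baseline that the abstract contrasts with, not the ternary scan, whose traversal order is not specified here. The sketch assumes a grayscale image given as rows of values from 0 to 255 and uses the standard 7/16, 3/16, 5/16, 1/16 error weights.

```python
def floyd_steinberg(image, threshold=128):
    """Halftone a grayscale image to 0/255 values, diffusing error in raster order."""
    height, width = len(image), len(image[0])
    work = [[float(v) for v in row] for row in image]  # accumulates diffused error
    out = [[0] * width for _ in range(height)]
    for y in range(height):          # top to bottom
        for x in range(width):       # left to right
            old = work[y][x]
            new = 255 if old >= threshold else 0
            out[y][x] = new
            err = old - new
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


gray = [[128] * 8 for _ in range(8)]
for row in floyd_steinberg(gray):
    print("".join("#" if v else "." for v in row))
```

Because the error always flows right and down along fixed scan lines, flat regions can show regular patterns. That is the artifact a scan with frequently changing direction is meant to avoid.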
A method of pattern recognition using a three-layered feedforward neural network is described. Experiments were carried out with the neural network on handwritten katakana written in a frame. Handwritten characters vary in scale, position, and orientation. A neural network, however, does not function well if the input patterns are shifted in position, rotated, or varied in scale. We therefore describe a method that handles these variations using a three-layered feedforward neural network. We used two kinds of moment values that are invariant to these variations: regular moments and Zernike moments, the latter being a set of orthogonal complex moments of an image. We also describe the problem of the structure of neural networks and the relation between the recognition rate and the data sets for similar and different patterns.
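As an illustration of a moment value that does not change under shifts and rotations, the sketch below computes Hu's first invariant, (mu20 + mu02) / m00^2, from regular (geometric) moments. The abstract does not say which invariants were fed to the network, so this choice is an assumption, and the function name `hu_phi1` is illustrative.

```python
def hu_phi1(image):
    """Hu's first moment invariant of a 2D pattern given as a list of rows."""
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            m00 += v
            m10 += x * v
            m01 += y * v
    if m00 == 0:
        raise ValueError("pattern is empty")
    cx, cy = m10 / m00, m01 / m00  # centroid: removes dependence on position
    mu20 = sum(v * (x - cx) ** 2 for row in image for x, v in enumerate(row))
    mu02 = sum(v * (y - cy) ** 2 for y, row in enumerate(image) for v in row)
    return (mu20 + mu02) / m00 ** 2  # normalized second-order central moments


def rotate90(image):
    """Rotate a pattern by 90 degrees (clockwise)."""
    return [list(row) for row in zip(*image[::-1])]


# An L-shaped stroke, and the same stroke shifted within the frame.
pattern = [[1, 0, 0, 0, 0],
           [1, 0, 0, 0, 0],
           [1, 1, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0]]
shifted = [[0, 0, 0, 0, 0],
           [0, 0, 1, 0, 0],
           [0, 0, 1, 0, 0],
           [0, 0, 1, 1, 0],
           [0, 0, 0, 0, 0]]
print(hu_phi1(pattern), hu_phi1(shifted), hu_phi1(rotate90(pattern)))
```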
We propose a new method to extract arbitrary shapes, such as lines, circles, ellipses, and other complex shapes, from noisy binary images. In this method, shapes in a given binary image are traced by a tracer named PVFT (Pseudo View Field Tracer). The movement of the PVFT is similar to that of the view field of a person who recognizes arbitrary shapes with a restricted view field. That is, the PVFT selects line segments of a shape in a noisy image and traces them. Moreover, the movement of the PVFT is controlled according to the shape to be extracted.