This PDF file contains the front matter associated with SPIE Proceedings Volume 9400 including the Title Page, Copyright information, Table of Contents, Introduction, and Conference Committee listing.
This study focuses on accelerating the optimization of motion estimation algorithms, which are widely used in video coding standards, by combining the Altera Custom Instruction paradigm with an efficient use of the SDRAM and on-chip memory of the Nios II processor. First, a complete code profile is produced before optimization in order to locate the execution-time hotspots in the motion compensation algorithms. A multi-cycle Custom Instruction to be added to the specific embedded design is then implemented. The deployed approach optimizes SoC performance through an efficient placement of the reset vector, exception vector, stack, heap, read/write data (.rwdata), read-only data (.rodata), and program text (.text) across on-chip memory and SDRAM. Furthermore, the approach enhances the algorithms by incorporating Custom Instructions into the Nios II ISA. Finally, the two methods are combined to build the final embedded system. The present contribution thus facilitates motion coding on low-cost soft-core microprocessors, in particular the RISC architecture of the Nios II implemented in an FPGA, and enables an SoC that processes 50×50 pixel frames at 180 fps.
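The abstract itself contains no code; as an illustration of the kernel involved, the sketch below shows the sum-of-absolute-differences (SAD) block matching that dominates the run time of motion estimators of this kind and is the natural candidate for a multi-cycle Custom Instruction. All names and parameters are illustrative, not taken from the paper.

```python
import numpy as np

def sad(block_a: np.ndarray, block_b: np.ndarray) -> int:
    """Sum of absolute differences: the inner kernel that a multi-cycle
    Nios II Custom Instruction would typically offload."""
    return int(np.abs(block_a.astype(np.int32) - block_b.astype(np.int32)).sum())

def full_search(ref: np.ndarray, cur: np.ndarray, bx: int, by: int,
                bsize: int = 8, radius: int = 4):
    """Exhaustive block matching around (bx, by) in the reference frame;
    returns the best motion vector and its SAD cost."""
    h, w = ref.shape
    target = cur[by:by + bsize, bx:bx + bsize]
    best, best_cost = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = by + dy, bx + dx
            if 0 <= y <= h - bsize and 0 <= x <= w - bsize:
                cost = sad(target, ref[y:y + bsize, x:x + bsize])
                if cost < best_cost:
                    best_cost, best = cost, (dx, dy)
    return best, best_cost
```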
PNG (Portable Network Graphics) is a lossless compression method for real-world pictures, and since its specification it has continued to attract the interest of the image processing community. PNG is an extensible file format for the portable, well-compressed storage of raster images, and it supports black-and-white (binary mask), grayscale, indexed-color, and truecolor images. Within the framework of the Demat+ project, which intends to offer a complete solution for the storage and retrieval of scanned documents, this paper addresses a hardware design that accelerates the PNG encoder for binary mask compression on an FPGA. An optimized architecture is proposed as part of a hybrid hardware/software co-operating system. For its evaluation, the newly designed PNG IP has been implemented on the Altera Arria II GX EP2AGX125EF35 FPGA. The experimental results show a good match between the achieved compression ratio, the computational cost, and the hardware resources used.
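PNG's encoding core is per-scanline filtering followed by DEFLATE, the two stages a hardware encoder pipelines. As a hedged software reference, the sketch below writes a minimal but valid 1-bit grayscale PNG for a binary mask, using filter type 0 (None) on every scanline; the hardware IP's filter selection and architecture are not reproduced here.

```python
import struct, zlib
import numpy as np

def _chunk(ctype: bytes, data: bytes) -> bytes:
    """A PNG chunk: length, type, payload, CRC-32 over type + payload."""
    return (struct.pack(">I", len(data)) + ctype + data
            + struct.pack(">I", zlib.crc32(ctype + data)))

def encode_binary_mask_png(mask: np.ndarray) -> bytes:
    """Encode a 0/1 mask as a 1-bit grayscale PNG (filter type 0 per line)."""
    h, w = mask.shape
    raw = bytearray()
    for row in mask.astype(np.uint8):
        raw.append(0)                       # scanline filter byte: 0 = None
        raw.extend(np.packbits(row).tobytes())  # 8 pixels/byte, MSB first
    ihdr = struct.pack(">IIBBBBB", w, h, 1, 0, 0, 0, 0)  # 1-bit, grayscale
    return (b"\x89PNG\r\n\x1a\n" + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(bytes(raw), 9))
            + _chunk(b"IEND", b""))

# Example: a 64x64 checkerboard mask.
mask = (np.indices((64, 64)).sum(axis=0) % 2).astype(np.uint8)
with open("mask.png", "wb") as f:
    f.write(encode_binary_mask_png(mask))
```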
We present results from a prototype CMOS camera system implementing a multiple-sampled, pixel-level algorithm ("Last Sample Before Saturation") in real time to create High-Dynamic-Range (HDR) images that approach the dynamic range of CCDs. The system is built around a commercial 1280 × 1024 CMOS image sensor with 10 bits per pixel and up to a 500 Hz full frame rate, with higher frame rates available through windowing. We provide details of the system architecture and present images collected with the system.
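The "Last Sample Before Saturation" rule can be illustrated with a small fusion model: given several non-destructive reads at increasing exposure times within one frame, each pixel keeps its last unsaturated read, normalized by that read's exposure time. This is a behavioral sketch under assumed array shapes, not the pixel-level hardware implementation.

```python
import numpy as np

def last_sample_before_saturation(samples, exposure_times, full_scale=1023):
    """Fuse multiple non-destructive reads of one exposure into an HDR frame.

    samples: (N, H, W) pixel values read at increasing exposure_times (N,).
    For each pixel, keep the last read still below saturation and divide it
    by its exposure time, extending dynamic range beyond 10 bits."""
    samples = np.asarray(samples, dtype=np.float64)
    times = np.asarray(exposure_times, dtype=np.float64)
    unsat = samples < full_scale            # (N, H, W) per-read saturation mask
    # Index of the last unsaturated read per pixel (0 if every read saturated).
    idx = np.where(unsat.any(axis=0), unsat.cumsum(axis=0).argmax(axis=0), 0)
    h, w = samples.shape[1:]
    yy, xx = np.mgrid[0:h, 0:w]
    return samples[idx, yy, xx] / times[idx]
```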
The semivariogram is a statistical measure of the spatial distribution of data and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically found application in the geosciences and remote sensing; applications in medical imaging have recently been investigated, creating a need for efficient real-time implementations. The semivariogram is a plot of semivariances for different lag distances between pixels. A semivariance, γ(h), is defined as half of the expected squared difference of pixel values between any two data locations separated by a lag distance h. Because every pair of pixels in the image or sub-image being processed must be examined, the base algorithm complexity for an image window with n pixels is O(n²). Field Programmable Gate Arrays (FPGAs) are an attractive platform for such demanding applications because of their parallel processing capability. Although FPGAs operate at relatively modest clock rates of a few hundred megahertz, they can perform tens of thousands of calculations per clock cycle while consuming little power. This paper presents a technique for fast computation of the semivariogram using two custom FPGA architectures. The design consists of several modules dedicated to the constituent computational tasks, and a modular approach is chosen to allow replication of processing units, yielding high throughput through concurrent processing of pixel pairs. The current implementation is restricted to isotropic semivariogram computation; an anisotropic implementation is anticipated as an extension of the current architecture, based on refinements to the existing modules. The algorithm is benchmarked in VHDL on a Xilinx XUPV5-LX110T Development Kit, which uses a Virtex-5 FPGA. Medical image data from MRI scans are used in the experiments, and computational speedup is measured against a MATLAB implementation on a personal computer with an Intel i7 multi-core processor. Preliminary simulation results indicate that the architectures attain a significant speed advantage, making the algorithm viable for implementation in medical devices.
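Written out, the empirical estimator is γ(h) = (1 / 2N(h)) Σ (z_i − z_j)² over the N(h) pixel pairs separated by (rounded) distance h. The sketch below is a direct O(n²) software reference of the isotropic case, useful as a functional baseline for the FPGA architectures; names are illustrative.

```python
import numpy as np

def isotropic_semivariogram(img: np.ndarray, max_lag: int) -> np.ndarray:
    """Empirical isotropic semivariogram of an image window: for each lag h,
    half the mean squared difference over all pixel pairs whose rounded
    Euclidean separation equals h. O(n^2) pairs, hence the FPGA motivation."""
    h_, w_ = img.shape
    ys, xs = np.mgrid[0:h_, 0:w_]
    coords = np.column_stack([ys.ravel(), xs.ravel()]).astype(np.float64)
    vals = img.ravel().astype(np.float64)
    sums = np.zeros(max_lag + 1)
    counts = np.zeros(max_lag + 1)
    for i in range(len(vals)):              # every unordered pixel pair (i, j>i)
        d = np.rint(np.hypot(coords[i + 1:, 0] - coords[i, 0],
                             coords[i + 1:, 1] - coords[i, 1])).astype(int)
        sq = (vals[i + 1:] - vals[i]) ** 2
        keep = d <= max_lag
        np.add.at(sums, d[keep], sq[keep])
        np.add.at(counts, d[keep], 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, sums / (2 * counts), np.nan)
```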
Converting existing 2D data for release as 3D content is a hot topic for providers and, in general, for the success of 3D applications. It relies entirely on the synthesis of a virtual second view from the original 2D video. Disparity map (DM) estimation is the central task in 3D generation, but rendering novel images precisely remains a very difficult problem. Different approaches to DM reconstruction exist, among them manual and semi-automatic methods that can produce high-quality DMs but are time-consuming and computationally expensive. In this paper, several hardware implementations of frameworks for automatic 3D color video generation from real 2D video sequences are proposed. The novel framework processes stereo pairs through the following blocks: CIE L*a*b* color space conversion; stereo matching via a pyramidal scheme; color segmentation by k-means on the a*b* color plane; DM estimation using stereo matching between left and right images (or neighboring frames in a video); adaptive post-filtering; and, finally, anaglyph 3D scene generation. The technique has been implemented on a TMS320DM648 DSP, in MATLAB's Simulink on a PC running Windows 7, and on a graphics card (NVIDIA Quadro K2000), demonstrating that the proposed approach can be applied in real-time processing mode. The run times, mean Structural Similarity Index Measure (SSIM), and Bad Matching Pixels (B) values for the different hardware implementations (GPU, single CPU, and DSP) are reported in this paper.
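As a minimal illustration of the pipeline's final stages, the sketch below synthesizes a second view from a disparity map by horizontal pixel shifting (nearest-neighbor, no occlusion handling) and fuses the pair into a red/cyan anaglyph. It is a simplified stand-in for the paper's full framework; names are illustrative.

```python
import numpy as np

def shift_view(img: np.ndarray, disparity: np.ndarray) -> np.ndarray:
    """Synthesize a second view by shifting each pixel horizontally by its
    per-pixel disparity (clamped at the image borders)."""
    h, w = disparity.shape
    xs = np.clip(np.arange(w)[None, :] + np.rint(disparity).astype(int), 0, w - 1)
    return img[np.arange(h)[:, None], xs]

def red_cyan_anaglyph(left_rgb: np.ndarray, right_rgb: np.ndarray) -> np.ndarray:
    """Fuse a stereo pair into a red/cyan anaglyph: red channel from the
    left view, green/blue channels from the right view."""
    out = right_rgb.copy()
    out[..., 0] = left_rgb[..., 0]
    return out
```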
The SIFT algorithm is one of the most popular feature extraction methods and is therefore widely used in all sorts of video analysis tasks, such as instance search and duplicate/near-duplicate detection. We present an efficient GPU implementation of the SIFT descriptor extraction algorithm using CUDA. The major steps of the algorithm are presented, and for each step we describe how to parallelize it massively, how to take advantage of unique GPU capabilities such as shared memory and texture memory, and how to avoid or minimize common GPU performance pitfalls. We compare the GPU implementation with the reference CPU implementation in terms of runtime and quality, achieving a speedup factor of approximately 3-5 for SD and 5-6 for Full HD video with respect to a multi-threaded CPU implementation, which allows us to run SIFT descriptor extraction in real time on SD video. Furthermore, quality tests show that the GPU implementation matches the quality of the reference CPU implementation from the HessSIFT library. We further describe the benefits of GPU-accelerated SIFT descriptor calculation for video analysis applications such as near-duplicate video detection.
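For orientation, the sketch below shows the descriptor's core computation for a single keypoint patch: 4×4 spatial cells of 8-bin gradient orientation histograms, giving the 128-D vector. It omits SIFT's Gaussian weighting and trilinear interpolation, and is a simplified CPU reference of the per-keypoint work that a CUDA implementation typically assigns to one thread block with gradients staged in shared memory; it is not the HessSIFT code.

```python
import numpy as np

def sift_descriptor_patch(patch: np.ndarray) -> np.ndarray:
    """Simplified SIFT descriptor for one 16x16 patch already rotated to
    the keypoint orientation: 4x4 cells x 8 orientation bins -> 128-D."""
    assert patch.shape == (16, 16)
    gy, gx = np.gradient(patch.astype(np.float64))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)
    desc = np.zeros((4, 4, 8))
    for cy in range(4):
        for cx in range(4):
            m = mag[4 * cy:4 * cy + 4, 4 * cx:4 * cx + 4].ravel()
            a = ang[4 * cy:4 * cy + 4, 4 * cx:4 * cx + 4].ravel()
            bins = np.minimum((a / (2 * np.pi) * 8).astype(int), 7)
            np.add.at(desc[cy, cx], bins, m)   # magnitude-weighted histogram
    v = desc.ravel()
    v /= max(np.linalg.norm(v), 1e-12)         # normalize, clamp, renormalize
    v = np.minimum(v, 0.2)
    return v / max(np.linalg.norm(v), 1e-12)
```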
This paper discusses an Android app for removing the blur introduced by handshake when taking images with a smartphone. The algorithm uses two images to achieve deblurring in a computationally efficient manner without the artifacts associated with deconvolution-based deblurring algorithms. The first image is the normal or auto-exposure image; the second is a short-exposure image captured automatically immediately before or after the auto-exposure image. A low-rank approximation image is obtained by applying singular value decomposition to the auto-exposure image, which may appear blurred due to handshake. This approximation does not suffer from blurring while still carrying the image's brightness and contrast information. The singular values extracted from the low-rank approximation are then combined with those from the short-exposure image. The deblurring app is shown to be computationally more efficient than the adaptive tonal correction algorithm previously developed for the same purpose.
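A hedged sketch of the two-image idea follows: the blurred auto-exposure image supplies the singular vectors (and thus the brightness/contrast structure), while its leading singular values are blended with those of the sharp short-exposure image. The parameters k and alpha and the blending rule are assumptions for illustration; the paper's exact combination rule is not given in the abstract.

```python
import numpy as np

def svd_deblur(auto_exp: np.ndarray, short_exp: np.ndarray,
               k: int = 20, alpha: float = 0.5) -> np.ndarray:
    """Two-image SVD deblurring sketch (grayscale, equal-sized inputs):
    keep the auto-exposure image's singular vectors, blend its leading
    singular values with those of the short-exposure image."""
    Ua, sa, Vta = np.linalg.svd(auto_exp.astype(np.float64), full_matrices=False)
    ss = np.linalg.svd(short_exp.astype(np.float64), compute_uv=False)
    s = sa.copy()
    s[:k] = alpha * sa[:k] + (1 - alpha) * ss[:k]   # combine the two spectra
    out = (Ua * s) @ Vta                            # reconstruct with mixed values
    return np.clip(out, 0, 255).astype(np.uint8)
```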
Accurately focusing on a moving object is difficult, yet essential for photographing the target successfully with a digital camera. Because the object often moves randomly and changes its shape frequently, the position and distance of the target must be estimated in real time so that the camera can focus on the object precisely.

We propose a new real-time object tracking method to perform auto-focus on a moving target in a digital camera. The camera's video stream is used for tracking the moving target. A particle filter handles the target's random movement and shape changes, with color and edge features used as measurements of the object's state. A parallel processing algorithm is developed so that real-time particle filter tracking can be realized readily in the hardware environment of the digital camera.

A movement prediction algorithm is also proposed to remove the focus error caused by the difference between the tracking result and the target's actual position at the moment the photo is taken.

Simulation and experimental results in a digital camera demonstrate the effectiveness of the proposed method. We embedded the real-time object tracking algorithm in the digital camera; the position and distance of the moving target are obtained accurately by tracking it in the video stream, and a SIMD processor is used to provide parallel real-time processing. A processing time of less than 60 ms per frame is achieved in the digital camera, whose CPU runs at only 162 MHz.
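As an illustration of the tracking core, the sketch below implements one predict/update/resample cycle of a bootstrap particle filter for a 2-D target position. The measure callback standing in for the paper's color/edge feature likelihood is hypothetical; note that each stage operates independently per particle, which is what makes SIMD parallelization natural.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, measure, motion_std=3.0):
    """One cycle of a bootstrap particle filter over 2-D positions.
    `measure(particles)` returns a likelihood per particle (hypothetical
    stand-in for the paper's color/edge measurement)."""
    # Predict: a random-walk motion model tolerates erratic target movement.
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    # Update: reweight each hypothesis by its feature similarity.
    weights = weights * measure(particles)
    weights /= weights.sum()
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < len(weights) / 2:
        idx = rng.choice(len(weights), size=len(weights), p=weights)
        particles = particles[idx]
        weights = np.full(len(weights), 1.0 / len(weights))
    estimate = np.average(particles, axis=0, weights=weights)
    return particles, weights, estimate
```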
For several years, face recognition has been a hot topic in the image processing field; the technique is applied in several domains, such as CCTV and the unlocking of electronic devices. In this context, this work studies the efficiency of a wavelet-based face recognition method in terms of robustness to subject position and performance on various systems. The use of the wavelet transform has a limited impact on the position robustness of PCA-based face recognition. This work shows, for a well-known database (Yale Face Database B), that the subject position in 3D space can vary by up to 10% of the original ROI size without decreasing recognition rates. Face recognition is performed on the approximation coefficients of the image's wavelet transform, and results remain satisfactory after three levels of decomposition. Furthermore, the face database size can be divided by a factor of 64 (2^(2K) with K = 3). In ultra-embedded vision systems, memory footprint is one of the key concerns, which is why compression techniques such as the wavelet transform are attractive; moreover, this leads to a low-complexity face detection stage compatible with the limited computational resources available on such systems. The approach described in this work is tested on three platforms, from a standard x86-based computer to nanocomputers such as the Raspberry Pi and SECO boards. For K = 3 and a database of 40 faces, the mean execution time for one frame is 0.64 ms on an x86-based computer, 9 ms on a SECO board, and 26 ms on a Raspberry Pi (Model B).
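A minimal sketch of the pipeline follows, assuming a Haar wavelet (the abstract does not name the filter): K levels of 2-D DWT keep only the approximation subband, shrinking each face signature by 2^(2K) (a factor of 64 for K = 3), and PCA (eigenfaces) is then performed on those compact signatures.

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_signature(face: np.ndarray, levels: int = 3) -> np.ndarray:
    """K-level 2-D DWT keeping only the approximation subband; each level
    halves both dimensions, so storage shrinks by 2^(2K) overall."""
    coeffs = face.astype(np.float64)
    for _ in range(levels):
        coeffs, _ = pywt.dwt2(coeffs, "haar")   # keep cA, drop detail subbands
    return coeffs.ravel()

def pca_basis(signatures: np.ndarray, n_components: int = 20):
    """Eigenfaces on wavelet approximations: mean plus leading right
    singular vectors of the centered signature matrix (one row per face)."""
    mean = signatures.mean(axis=0)
    _, _, vt = np.linalg.svd(signatures - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(sig, mean, basis):
    """Recognition then compares projections, e.g. by nearest neighbor."""
    return basis @ (sig - mean)
```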
The Dynamic Adaptive Streaming over HTTP (DASH) standard is becoming increasingly popular for real-time adaptive HTTP streaming of Internet video in response to unstable network conditions. Integration of DASH streaming techniques with the new H.265/HEVC video coding standard is a promising area of research. The performance of HEVC-DASH systems has previously been evaluated by a few researchers using objective metrics; however, subjective evaluation provides a better measure of the user's Quality of Experience (QoE) and of the overall performance of the system.

This paper presents a subjective evaluation of an HEVC-DASH system implemented in a hardware testbed. Previous studies in this area have focused on the current H.264/AVC (Advanced Video Coding) or H.264/SVC (Scalable Video Coding) codecs, and moreover there is no established standard test procedure for the subjective evaluation of DASH adaptive streaming. In this paper, we define a test plan for HEVC-DASH with a carefully justified data set, employing video sequences long enough to demonstrate the bitrate switching operations triggered by various network condition patterns. We evaluate the end user's real-time QoE online by investigating the perceived impact of delay, different packet loss rates, fluctuating bandwidth, and different DASH video stream segment sizes on a streaming session using different video sequences. The Mean Opinion Score (MOS) results give insight into the performance of the system and the expectations of users. The results show the impact of different network impairments and different video segment sizes on users' QoE, and further analysis and study may help in optimizing system performance.
Creating panoramic images has become a popular feature in modern smartphones, tablets, and digital cameras. A user can create a 360-degree field-of-view photograph from only a few images. The quality of the resulting image depends on the number of source images, their brightness, and the algorithm used for stitching and blending them. One algorithm that provides excellent results in terms of background color uniformity and reduction of ghosting artifacts is multi-band blending. The algorithm relies on decomposing the image into multiple frequency bands using a dyadic filter bank; hence, the results also depend heavily on the filter bank used.

In this paper we analyze the performance of the FIR filters used for multi-band blending. We present a set of five filters that showed the best results in both the literature and our experiments. The set includes a Gaussian filter, biorthogonal wavelets, and custom-designed maximally flat and equiripple FIR filters. The presented filter comparison is based on several no-reference image quality metrics. We conclude that the 5/3 biorthogonal wavelet produces the best results on average, especially considering its short length. Furthermore, we propose a real-time FPGA implementation of the blending algorithm using a 2D non-separable systolic filtering scheme. Its pipelined architecture requires no hardware multipliers and achieves very high operating frequencies. The implemented system processes 1080p (1920×1080) frames at 91 fps.
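For reference, the sketch below shows the multi-band blending principle on grayscale images: decompose both inputs into band-pass (Laplacian) pyramids, blend each band with a mask smoothed in proportion to the band's scale, and collapse. Gaussian filtering and bilinear resampling stand in for the dyadic FIR filter banks the paper compares; this is not the proposed systolic FPGA design.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def pyramids(img, levels):
    """Gaussian and Laplacian (band-pass) pyramids via a dyadic filter bank."""
    g = [img.astype(np.float64)]
    for _ in range(levels):
        g.append(zoom(gaussian_filter(g[-1], 1.0), 0.5, order=1))
    lap = [g[i] - zoom(g[i + 1], np.array(g[i].shape) / np.array(g[i + 1].shape),
                       order=1)
           for i in range(levels)]
    return lap + [g[-1]]                    # band-pass levels + coarse residual

def multiband_blend(a, b, mask, levels=4):
    """Blend two aligned grayscale images: each frequency band is mixed with
    a mask blurred in proportion to the band's scale, suppressing seams."""
    pa, pb = pyramids(a, levels), pyramids(b, levels)
    pm = [zoom(gaussian_filter(mask.astype(np.float64), 2.0 ** i),
               np.array(pa[i].shape) / np.array(mask.shape), order=1)
          for i in range(levels + 1)]
    out = pm[-1] * pa[-1] + (1 - pm[-1]) * pb[-1]
    for i in range(levels - 1, -1, -1):     # collapse from coarse to fine
        out = zoom(out, np.array(pa[i].shape) / np.array(out.shape), order=1)
        out += pm[i] * pa[i] + (1 - pm[i]) * pb[i]
    return out
```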
An efficient parallel architecture design for the iris unwrapping process in a real-time iris recognition system using the Bresenham circle algorithm is presented in this paper. Based on the characteristics of the model parameters, this algorithm was chosen over the widely used polar conversion technique as the iris unwrapping model. The architecture is parallelized to increase system throughput and is suitable for processing an input image of 320 × 240 pixels in real time using Field Programmable Gate Array (FPGA) technology. The design is described in the VHSIC Hardware Description Language (VHDL) and implemented, verified, and analyzed with Quartus software. The system's predicted processing time is faster than that of the modern iris unwrapping techniques in use today.
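The appeal of the Bresenham circle algorithm here is that it generates the pixel ring for each radius with integer arithmetic only, avoiding the trigonometric evaluations of polar conversion. A minimal software reference follows (bounds checking omitted; names are illustrative).

```python
def bresenham_circle(cx: int, cy: int, r: int):
    """Integer-only midpoint/Bresenham circle: the pixel ring of radius r
    around (cx, cy), generated via 8-way symmetry."""
    pts, x, y, d = [], 0, r, 3 - 2 * r
    while x <= y:
        for sx, sy in ((x, y), (y, x), (-x, y), (-y, x),
                       (x, -y), (y, -x), (-x, -y), (-y, -x)):
            pts.append((cx + sx, cy + sy))
        if d < 0:
            d += 4 * x + 6
        else:
            d += 4 * (x - y) + 10
            y -= 1
        x += 1
    return pts

def unwrap_iris(img, cx, cy, r_inner, r_outer):
    """Unwrap the annulus between pupil and limbus into rows of ring
    samples, one row per radius (rows have varying lengths)."""
    return [[img[y][x] for (x, y) in bresenham_circle(cx, cy, r)]
            for r in range(r_inner, r_outer + 1)]
```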
In this paper, we propose a fast thumbnail extraction algorithm for HEVC that uses partial decoding. The proposed algorithm reconstructs only the 4×4 block boundaries and TU boundaries needed for the thumbnail image. Experimental results show that the proposed method significantly reduces the computational complexity and the thumbnail extraction time. In addition, the visual quality of the thumbnails produced by the proposed method differs little from that of the conventional method.
This work is dedicated to the analysis of the forward and inverse problems in order to obtain a better approximation to the Electrical Impedance Tomography equation. We employ a numerical method based on Taylor series in formal powers for the forward problem and the Finite Element Method for the inverse problem.

For the forward problem, we propose a novel algorithm that employs a regularization technique for stability and uses parallel computing to obtain the solution faster; this modification yields an efficient solution of the forward problem. The solution found is then used in the inverse problem for the approximation employing the Finite Element Method.

The algorithms in this work are developed in the structured programming paradigm in C++, including parallel processing; the running-time analysis is performed only for the forward problem, because the Finite Element Method, owing to its highly recursive structure, does not lend itself to parallelization.

Several examples are evaluated, employing different conductivity functions in two kinds of cases: for the analytical cases, exponential and sinusoidal functions are used, and for the geometrical cases, a centered circle and a five-disk structure are examined. The Lebesgue measure is used as the error metric in the forward problem, while in the inverse problem the PSNR, SSIM, and MSE criteria are applied to determine the convergence of both methods.
Nowadays, vision systems are used for countless purposes, and motion estimation is a discipline that allows relevant information to be extracted, such as pattern segmentation, 3D structure, or object tracking. However, the real-time requirements of most applications have limited its adoption, since high-performance systems are needed to meet response times. With the emergence of highly parallel devices known as accelerators, this gap has narrowed. Two extreme endpoints in the spectrum of common accelerators are Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs), which usually offer higher performance than general-purpose processors. Using GPUs as accelerators requires efficient exploitation of the parallelism in the target application, which is not easy because performance is affected by many aspects that programmers must overcome. In this paper, we evaluate the OpenACC standard, a directive-based programming model that eases porting code to a GPU, in the context of a motion estimation application. The results confirm that this programming paradigm is suitable for such image processing applications, achieving very satisfactory acceleration in convolution-based problems such as the well-known Lucas & Kanade method.
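To make the convolution-based structure concrete, the sketch below is a dense Lucas & Kanade estimator: every stage is a small convolution or an independent per-pixel 2×2 solve, exactly the data-parallel loop nests that OpenACC directives map onto a GPU in a C/Fortran port. This is a generic textbook formulation, not the paper's code.

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter

def lucas_kanade(frame0, frame1, win=7):
    """Dense Lucas & Kanade optical flow via windowed least squares."""
    f0 = frame0.astype(np.float64)
    f1 = frame1.astype(np.float64)
    ix = convolve(f0, np.array([[-1, 0, 1]]) / 2.0)   # spatial gradients
    iy = convolve(f0, np.array([[-1], [0], [1]]) / 2.0)
    it = f1 - f0                                      # temporal gradient
    # Window sums of the structure-tensor terms (more convolutions).
    sxx = uniform_filter(ix * ix, win)
    syy = uniform_filter(iy * iy, win)
    sxy = uniform_filter(ix * iy, win)
    sxt = uniform_filter(ix * it, win)
    syt = uniform_filter(iy * it, win)
    det = sxx * syy - sxy ** 2
    det = np.where(np.abs(det) < 1e-6, np.nan, det)   # mask ill-conditioned pixels
    # Per-pixel solve of [sxx sxy; sxy syy] [u; v] = -[sxt; syt].
    u = -(syy * sxt - sxy * syt) / det
    v = -(sxx * syt - sxy * sxt) / det
    return u, v
```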
This paper describes a prototype smart imager capable of adjusting the photo-integration time of multiple regions of interest concurrently, automatically, and asynchronously within a single exposure period. The operation is supported by two intertwined photo-diodes at the pixel level and two digital registers at the periphery of the pixel matrix. These registers divide the focal plane into independent regions within which automatic concurrent adjustment of the integration time takes place. At the pixel level, one photo-diode senses the pixel value itself, whereas the other, in collaboration with its counterparts in a particular ROI, senses the mean illumination of that ROI. Additional circuitry interconnecting the two photo-diodes enables asynchronous adjustment of the integration time for each ROI according to this sensed illumination. The sensor can be reconfigured on the fly according to the requirements of a vision algorithm.
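A behavioral model may help convey the idea: each ROI integrates until the mean illumination sensed by its second photo-diode reaches a target, then latches its own integration time. The sketch below simulates this per-ROI behavior; all names, targets, and thresholds are assumptions for illustration, and none of the mixed-signal circuitry is modeled.

```python
import numpy as np

def simulate_roi_exposure(scene, rois, t_max=1.0, target=0.5, full_well=1.0):
    """Behavioral model of per-ROI auto-integration.

    scene: per-pixel photo-current (H, W); rois: list of (y0, y1, x0, x1).
    Each ROI stops integrating when its mean accumulated charge would hit
    `target`, or at t_max; returns the image and the latched times."""
    out = np.zeros_like(scene, dtype=np.float64)
    t_roi = np.zeros(len(rois))
    for i, (y0, y1, x0, x1) in enumerate(rois):
        flux = scene[y0:y1, x0:x1]
        mean_flux = flux.mean()             # what the mean-sensing diode sees
        t_stop = min(t_max, target / max(mean_flux, 1e-9))
        out[y0:y1, x0:x1] = np.minimum(flux * t_stop, full_well)
        t_roi[i] = t_stop                   # integration time latched per ROI
    return out, t_roi
```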
In recent years video traffic has become the dominant application on the Internet with global year-on-year increases in
video-oriented consumer services. Driven by improved bandwidth in both mobile and fixed networks, steadily reducing
hardware costs and the development of new technologies, many existing and new classes of commercial and industrial
video applications are now being upgraded or emerging. Some of the use cases for these applications include areas such
as public and private security monitoring for loss prevention or intruder detection, industrial process monitoring and
critical infrastructure monitoring. The use of video is becoming commonplace in defence, security, commercial,
industrial, educational and health contexts.
To approach optimal performance, the design or optimisation of each of these applications should be context-aware and task-oriented, with the characteristics of the video stream (frame rate, spatial resolution, bandwidth, etc.) chosen to match the use-case requirements. For example, in the security domain, a task-oriented consideration may be that higher-resolution video is required to identify an intruder than to simply detect his presence, while in the same case contextual factors, such as the requirement to transmit over a resource-limited wireless link, may impose constraints on the selection of optimum task-oriented parameters.

This paper presents a novel, conceptually simple, and easily implemented method of assessing video quality relative to its suitability for a particular task, and of dynamically adapting video streams during transmission to ensure that the task can be completed successfully. First, we define two principal classes of task: recognition tasks and event detection tasks. These task classes are further subdivided into a set of task-related profiles, each of which is associated with a set of task-oriented attributes (minimum spatial resolution, minimum frame rate, etc.). For example, in the detection class, profiles for intruder detection require different temporal characteristics (frame rate) from those used for detecting high-motion objects such as vehicles or aircraft. We also define a set of contextual attributes associated with each instance of a running application, including resource constraints imposed by the transmission system employed and by the hardware platforms used as the source and destination of the video stream. Empirical results are presented and analysed to demonstrate the advantages of the proposed schemes.
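A task-related profile of this kind is naturally a small record of minimum attributes checked against contextual constraints. The sketch below is one hypothetical encoding; the profile names, attribute values, and check are illustrative assumptions, not the paper's definitions.

```python
from dataclasses import dataclass

@dataclass
class TaskProfile:
    """A task-related profile: the minimum stream attributes under which
    its task can still be completed (values below are illustrative)."""
    name: str
    task_class: str          # "recognition" or "event_detection"
    min_width: int
    min_height: int
    min_fps: float

INTRUDER_DETECTION = TaskProfile("intruder_detection", "event_detection",
                                 320, 240, 5.0)
VEHICLE_DETECTION = TaskProfile("vehicle_detection", "event_detection",
                                320, 240, 25.0)   # high-motion: higher fps
FACE_RECOGNITION = TaskProfile("face_recognition", "recognition",
                               1280, 720, 10.0)   # identification: higher res

def stream_satisfies(p: TaskProfile, width, height, fps,
                     bitrate_kbps, link_budget_kbps) -> bool:
    """Task-oriented minima plus a contextual link constraint; a stream
    failing this check is a candidate for dynamic adaptation."""
    return (width >= p.min_width and height >= p.min_height
            and fps >= p.min_fps and bitrate_kbps <= link_budget_kbps)
```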
Video streaming and other multimedia applications account for an ever-increasing proportion of all network traffic. The recent adoption of High Efficiency Video Coding (HEVC) as the H.265 standard provides many opportunities for new and improved multimedia services and applications in the consumer domain. Since the delivery of version one of H.265, the Joint Collaborative Team on Video Coding has been working towards standardisation of a scalable extension (SHVC) to the H.265 standard, together with a series of range extensions and new profiles. As these enhancements are added to the standard, the range of potential applications and research opportunities will expand. For example, the use of video is also growing rapidly in other sectors such as safety, security, defence, and health, with real-time, high-quality video transmission playing an important role in areas like critical infrastructure monitoring and disaster management, each of which may benefit from enhanced HEVC/H.265 and SHVC capabilities.

The majority of existing research into HEVC/H.265 transmission has focussed on the consumer domain, addressing issues such as broadcast transmission and delivery to mobile devices, with the lack of freely available tools widely cited as an obstacle to conducting this type of research. In this paper we present a toolset which facilitates the transmission and evaluation of HEVC/H.265 and SHVC encoded video on the popular open-source NCTUns simulator. Our toolset provides researchers with a modular, easy-to-use platform for evaluating video transmission and adaptation proposals on large-scale wired, wireless, and hybrid architectures. The toolset consists of pre-processing, transmission, SHVC adaptation, and post-processing tools to gather and analyse statistics. It has been implemented using HM15 and SHM5, the latest versions of the HEVC and SHVC reference software implementations, to ensure that currently adopted proposals for scalable and range extensions to the standard can be investigated.

We demonstrate the effectiveness and usability of our toolset by evaluating SHVC streaming and adaptation to meet terminal constraints and network conditions in a range of wired, wireless, and large-scale wireless mesh network scenarios, each designed to simulate a realistic environment. Our results are compared with those for H.264/SVC, the scalable extension to the existing H.264/AVC advanced video coding standard.
The recent explosion in video-related Internet traffic has been driven by the widespread use of smart mobile devices,
particularly smartphones with advanced cameras that are able to record high-quality videos. Although many of these
devices offer the facility to record videos at different spatial and temporal resolutions, primarily with local storage
considerations in mind, most users only ever use the highest quality settings. The vast majority of these devices are
optimised for compressing the acquired video using a single built-in codec and have neither the computational resources
nor battery reserves to transcode the video to alternative formats. This paper proposes a new low-complexity dynamic
resource allocation engine for cloud-based video transcoding services that are both scalable and capable of being
delivered in real-time. Firstly, through extensive experimentation, we establish resource requirement benchmarks for a
wide range of transcoding tasks. The set of tasks investigated covers the most widely used input formats (encoder type, resolution, amount of motion, and frame rate) associated with mobile devices, together with the most popular output formats derived from a comprehensive set of use cases, e.g. a mobile news reporter transmitting videos directly to TV audiences with various video format requirements, with minimal use of resources both at the reporter's end and at the cloud infrastructure end for transcoding services.
The trend of accessing mobile cloud services through wireless network connectivity has grown globally among both enterprises and home end users. Although existing public cloud service vendors such as Google and Microsoft Azure provide on-demand cloud services at affordable cost for mobile users, a number of challenges remain in achieving high-quality mobile cloud-based video applications, especially due to the bandwidth-constrained and error-prone mobile network connectivity, which is the communication bottleneck for end-to-end video delivery. In addition, the accessible cloud networking architectures differ in terms of their implementation, services, resources, storage, pricing, support, and so on, and these differences have varied impacts on the performance of cloud-based real-time video applications. Nevertheless, these challenges and impacts have not been thoroughly investigated in the literature.
In our previous work, we have implemented a mobile cloud network model that integrates localized and decentralized
cloudlets (mini-clouds) and wireless mesh networks. In this paper, we deploy a real-time framework consisting of
various existing Internet cloud networking architectures (Google Cloud, Microsoft Azure and Eucalyptus Cloud) and a
cloudlet based on Ubuntu Enterprise Cloud over wireless mesh networking technology for mobile cloud end users. It is
noted that the increasing trend to access real-time video streaming over HTTP/HTTPS is gaining popularity among both
research and industrial communities to leverage the existing web services and HTTP infrastructure in the Internet. To
study the performance under different deployments using different public and private cloud service providers, we employ
real-time video streaming over the HTTP/HTTPS standard, and conduct experimental evaluation and in-depth
comparative analysis of the impact of different deployments on the quality of service for mobile video cloud users.
Empirical results are presented and discussed to quantify and explain the different impacts resulting from the various cloud deployments, video applications, wireless/mobile network settings, and user mobility. Additionally, this paper analyses
the advantages, disadvantages, limitations and optimization techniques in various cloud networking deployments, in
particular the cloudlet approach compared with the Internet cloud approach, with recommendations of optimized
deployments highlighted. Finally, federated clouds and inter-cloud collaboration challenges and opportunities are
discussed in the context of supporting real-time video applications for mobile users.
While denoising is an extensively studied task in signal processing research, most denoising methods are designed and evaluated using readily processed image data, e.g. the well-known Kodak data set, with the noise model usually being additive white Gaussian noise (AWGN). Such test data do not correspond to today's real-world image data taken with a digital camera. Using unrealistic data to test, optimize, and compare denoising algorithms may lead to incorrect parameter tuning or suboptimal choices in research on real-time camera denoising algorithms. In this paper we derive a precise analysis of the noise characteristics at the different steps of the color processing chain. Based on real camera noise measurements and simulation of the processing steps, we obtain a good approximation of the noise characteristics, and we show how this approximation can be used in standard wavelet denoising methods. We improve wavelet hard thresholding and bivariate thresholding based on our noise analysis results, and both visual quality and objective quality metrics show the advantage of the proposed method. Because the method is implemented using look-up tables calculated before the denoising step, it has very low computational complexity and can process HD video sequences in real time on an FPGA.
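To illustrate the LUT idea, the sketch below performs wavelet hard thresholding where the threshold is driven by a precomputed intensity-to-noise-std table rather than a single global AWGN sigma. The table, wavelet choice, and the simplification to one threshold per level are assumptions for illustration; the paper's LUTs come from its processing-chain noise analysis and its thresholds are signal-dependent per coefficient.

```python
import numpy as np
import pywt  # PyWavelets

def noise_adaptive_hard_threshold(img, noise_lut, wavelet="db2",
                                  levels=3, k=3.0):
    """Wavelet hard thresholding with a camera-noise-aware threshold.

    noise_lut: 256-entry intensity -> noise-std table (stand-in for the
    LUTs the paper precomputes). This sketch collapses the local sigmas
    to one threshold per decomposition level."""
    sigma_map = noise_lut[np.clip(img, 0, 255).astype(np.uint8)]
    coeffs = pywt.wavedec2(img.astype(np.float64), wavelet, level=levels)
    out = [coeffs[0]]                       # keep the approximation subband
    thr = k * float(sigma_map.mean())
    for bands in coeffs[1:]:                # hard-threshold detail subbands
        out.append(tuple(np.where(np.abs(c) > thr, c, 0.0) for c in bands))
    return pywt.waverec2(out, wavelet)

# Toy LUT: noise std grows with intensity, as for shot-noise-dominated sensors.
noise_lut = 1.0 + 0.05 * np.arange(256)
```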
In this paper, a novel filtering design based on exploring the pixel neighborhood with digital paths is presented. The paths start from the boundary of a filtering window and reach its center, and the cost of transitions between adjacent pixels is defined in a hybrid spatial-color space. An optimal path of minimum total cost, leading from the pixels of the window's boundary to its center, is then determined, and the cost of this optimal path serves as a degree of similarity of the central pixel to the samples in the local processing window. If a pixel is an outlier, all paths starting from the window's boundary have high costs, so the minimum cost is also high. The filter output is calculated as a weighted mean of the central pixel and an estimate constructed using the minimum cost assigned to each image pixel: first, the costs of the optimal paths are used to build a smoothed image, and in a second step the minimum cost of the central pixel is used to construct the weights of a soft-switching scheme. Experiments performed on a set of standard color images reveal that the efficiency of the proposed algorithm is superior to state-of-the-art filtering techniques in terms of objective restoration quality measures, especially at high noise contamination ratios. Owing to its low computational complexity, the proposed filter can be applied to real-time image denoising and to the enhancement of video streams.
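The minimum-cost path from the window boundary to the center can be found with a standard shortest-path search. The sketch below uses Dijkstra over 8-connected pixels with the transition cost simplified to an absolute intensity difference, a stand-in for the paper's hybrid spatial-color cost; a large returned value flags the central pixel as an outlier.

```python
import heapq
import numpy as np

def min_path_cost(window: np.ndarray) -> float:
    """Minimum total cost over digital paths from the window boundary to
    its center (Dijkstra, 8-connectivity, |intensity difference| cost)."""
    h, w = window.shape
    cy, cx = h // 2, w // 2
    dist = np.full((h, w), np.inf)
    heap = []
    for y in range(h):                      # all boundary pixels are sources
        for x in range(w):
            if y in (0, h - 1) or x in (0, w - 1):
                dist[y, x] = 0.0
                heapq.heappush(heap, (0.0, y, x))
    while heap:
        d, y, x = heapq.heappop(heap)
        if d > dist[y, x]:
            continue                        # stale queue entry
        if (y, x) == (cy, cx):
            return d                        # reached the central pixel
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (dy or dx) and 0 <= ny < h and 0 <= nx < w:
                    nd = d + abs(float(window[ny, nx]) - float(window[y, x]))
                    if nd < dist[ny, nx]:
                        dist[ny, nx] = nd
                        heapq.heappush(heap, (nd, ny, nx))
    return float(dist[cy, cx])
```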
This paper establishes a real-time auto-exposure method that guarantees that surveillance cameras in uncontrolled light conditions take advantage of their whole dynamic range while producing neither under- nor overexposed images. State-of-the-art auto-exposure methods base their control on the brightness of the image measured in a limited region where the foreground objects are mostly located. Unlike these methods, the proposed algorithm derives a set of indicators from the image histogram that define its shape and position. Moreover, since the location of the objects to be inspected is usually unknown in surveillance applications, the whole image is monitored in this approach. To control the camera settings, we define a parameters function (Ef) that depends linearly on the shutter speed and the electronic gain, and is inversely proportional to the square of the lens aperture diameter. When the current image is not overexposed, the algorithm computes the value of Ef that would move the histogram up to the maximum value that still does not overexpose the capture. When the current image is overexposed, it computes the value of Ef that would move the histogram to a value that does not underexpose the capture while remaining close to the overexposed region. If the image is both under- and overexposed, the whole dynamic range of the camera is already in use, and a default value of Ef that does not overexpose the capture is selected. This decision follows the idea that underexposed images are preferable to overexposed ones, because the noise produced in the lower regions of the histogram can be removed in a post-processing step, while the saturated pixels of the higher regions cannot be recovered.

The proposed algorithm was tested on a video surveillance camera overlooking an outdoor parking lot surrounded by buildings and trees that cast moving shadows on the ground. During the daytime over seven days, the algorithm ran alternately with a representative auto-exposure algorithm from the recent literature. Besides sunrises and nightfalls, multiple weather conditions produced light changes in the scene: sunny hours with sharp shadows and highlights, cloud cover that softened the shadows, and cloudy and rainy hours that dimmed the scene. Several indicators were used to measure the performance of the algorithms, providing objective quality in terms of the time the algorithms take to recover from under- or overexposure, the brightness stability, and the deviation from the optimal exposure. The results demonstrate that our algorithm reacts faster than the selected state-of-the-art algorithm to all of the light changes, and that it acquires well-exposed images and keeps the brightness stable for longer. Summing up the results, we conclude that the proposed algorithm provides a fast and stable auto-exposure method that maintains an optimal exposure for video surveillance applications. Future work will involve the evaluation of this algorithm in robotics.
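The abstract defines Ef only by its proportionalities; the sketch below writes it out and adds a hypothetical control step driven by the histogram indicators described, under a rough linear brightness-vs-Ef assumption. All constants, thresholds, and the update rules are illustrative, not the paper's.

```python
def exposure_value(shutter_s: float, gain: float, aperture_d: float) -> float:
    """Ef as described: linear in shutter speed and electronic gain,
    inversely proportional to the squared lens aperture diameter (any
    normalization constant is omitted, as the abstract gives none)."""
    return shutter_s * gain / (aperture_d ** 2)

def next_ef(hist, ef_now, sat_frac=0.01, default_ef=0.01):
    """Hedged control step over a 256-bin frame histogram: push the
    histogram toward full range without clipping highlights; prefer
    underexposure when both ends are clipped (shadows are recoverable)."""
    n = sum(hist)
    over = hist[-1] / n > sat_frac          # clipped highlights
    under = hist[0] / n > sat_frac          # crushed shadows
    top = max(i for i, c in enumerate(hist) if c)  # highest occupied bin
    if under and over:
        return default_ef                   # safe default avoiding saturation
    if over:
        return ef_now * 0.8                 # back off toward the overexposed edge
    return ef_now * (250.0 / max(top, 1))   # raise top edge near (not past) max
```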
In this paper, a system for real-time recognition of objects in multidimensional video signals is proposed. Object recognition is done by projecting patterns onto tensor subspaces obtained from the factorization of signal tensors representing the input signal. Instead of using only the intensity signal, however, the novelty of this paper is to first build an Extended Structural Tensor representation from the intensity signal, which conveys information on signal intensities as well as on higher-order statistics of the input signals. In this way, higher-order input pattern tensors are built from the training samples. The tensor subspaces are then constructed based on the Higher-Order Singular Value Decomposition (HOSVD) of the prototype pattern tensors. Finally, recognition relies on measuring the distance of a test pattern projected onto the tensor subspaces obtained from the training tensors. Due to the high dimensionality of the input data, tensor-based methods require substantial memory and computational resources; however, recent advances in multi-core microprocessors and graphics cards allow real-time operation of these multidimensional methods, as shown and analyzed in this paper on real examples of object detection in digital images.
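A compact numpy sketch of the HOSVD subspace machinery follows: factor matrices come from the SVDs of the mode unfoldings of a training tensor whose last mode stacks the samples, and a test pattern is projected by mode-wise multiplication. This is the generic HOSVD construction, not the paper's Extended Structural Tensor pipeline; names and ranks are illustrative.

```python
import numpy as np

def unfold(t: np.ndarray, mode: int) -> np.ndarray:
    """Mode-n unfolding of a tensor into a matrix."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def hosvd_bases(train: np.ndarray, ranks):
    """Per-mode factor matrices of a training tensor whose last mode stacks
    the samples: leading left singular vectors of each mode unfolding
    (one entry in `ranks` per non-sample mode)."""
    return [np.linalg.svd(unfold(train, m), full_matrices=False)[0][:, :r]
            for m, r in enumerate(ranks)]

def project(pattern: np.ndarray, bases):
    """Project a test pattern onto the tensor subspace: multiply each mode
    by the transposed factor matrix."""
    core = pattern.astype(np.float64)
    for m, u in enumerate(bases):
        core = np.moveaxis(np.tensordot(u.T, np.moveaxis(core, m, 0), axes=1),
                           0, m)
    return core

# Recognition sketch: build one set of bases per class from its training
# tensor, then assign a test pattern to the class whose projection norm is
# largest (equivalently, whose reconstruction residual is smallest) -- a
# simplified stand-in for the paper's distance measurement.
```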
An effective color night vision system for ground vehicle navigation should operate in near real time to be practical. We describe a system that uses a public database as the source of color information to colorize night vision imagery. Such an approach presents several problems due to differences between the acquired and reference imagery. Our system performs registration, colorization, and reference updating in near real time to help drivers of ground vehicles see a colorized view of the scene at night.