For decades, the Discrete Cosine Transform (DCT) has played a crucial role in video and image compression, ever since Chen and Pratt proposed a DCT-based image compression application. The energy compaction property of the DCT makes compression highly efficient when it is combined with an entropy coder and a specific scan order. Exploiting this property, the DCT has been widely used in image and video compression over the past 20 years, from JPEG, the well-known image compression format, to High Efficiency Video Coding (HEVC), the latest video compression standard. Since the DCT was first applied to image compression, several transforms have been proposed to achieve better compression performance. The best known of these is the Karhunen–Loève transform (KLT), which offers the best energy compaction. However, the KLT must transmit its transform basis as extra information, which the DCT does not require; as a result, its overall compression performance is worse and its complexity is higher than that of the DCT. To achieve the energy compaction performance of the KLT without extra information, we propose a machine learning network, TransNet, for image and video transforms. TransNet is trained to achieve better energy compaction than the DCT while maintaining image quality. To find the optimal trade-off between reconstructed image quality and energy compaction, we propose a new loss function based on the orthogonal transform property and a regularization term. To evaluate the compression performance of the proposed network, we compared the DCT and TransNet using a JPEG encoder. In terms of BD-rate based on Peak Signal-to-Noise Ratio (PSNR), the proposed network shows a gain of about 11% over the DCT.
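The two properties this abstract relies on, orthogonality and energy compaction, can be illustrated with a minimal NumPy sketch. It does not reproduce TransNet or its loss function. It only builds the standard orthonormal 8x8 DCT-II and applies it to a smooth block. The test block, the helper names `dct_matrix` and `energy_in_top_k`, and the choice of the four largest coefficients are assumptions made for illustration.

```python
import numpy as np

def dct_matrix(n: int = 8) -> np.ndarray:
    """Orthonormal DCT-II basis as an n x n matrix (rows are basis vectors)."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # DC row needs a different scale
    return c

def energy_in_top_k(values: np.ndarray, k: int) -> float:
    """Fraction of total energy held by the k largest-magnitude entries."""
    energy = np.sort((values ** 2).ravel())[::-1]
    return float(energy[:k].sum() / energy.sum())

C = dct_matrix(8)

# Hypothetical smooth 8x8 block: a gentle two-dimensional ramp.
x, y = np.meshgrid(np.arange(8), np.arange(8))
block = 100.0 + 5.0 * x + 3.0 * y

coeffs = C @ block @ C.T              # forward 2-D transform
recon = C.T @ coeffs @ C              # inverse is the transpose (orthogonality)

orth_err = float(np.abs(C @ C.T - np.eye(8)).max())
pixel_share = energy_in_top_k(block, 4)   # energy in 4 largest pixels
dct_share = energy_in_top_k(coeffs, 4)    # energy in 4 largest coefficients

print(f"max |C C^T - I| = {orth_err:.2e}")
print(f"top-4 energy share: pixels {pixel_share:.3f}, DCT {dct_share:.3f}")
```

Because the transform is orthogonal, total energy is preserved and the block is recovered exactly by the transpose. For a smooth block, a few coefficients hold nearly all of the energy, which is what an entropy coder exploits after quantization.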
In the past, video codecs such as VC-1 and H.263 used a technique that encodes reduced-resolution video and restores the original resolution at the decoder to improve coding efficiency. In VC-1 this technique is called dynamic frame resizing, and in H.263 Annex Q it is called reduced-resolution update mode. However, these techniques have not been widely used because their performance gains are limited and they work well only under specific conditions. In this paper, a video frame resizing (reduce/restore) technique based on machine learning is proposed to improve coding efficiency. In the proposed method, a convolutional neural network (CNN) produces low-resolution video at the encoder, and a CNN reconstructs the original resolution at the decoder. The proposed method shows improved subjective performance across the high-resolution videos that dominate current consumption. To assess the subjective quality of the proposed method, Video Multi-method Assessment Fusion (VMAF), which has shown high reliability among subjective quality measurement tools, was used as the metric. Moreover, to assess general performance, a range of bitrates was tested. Experimental results showed that the VMAF-based BD-rate improved by about 51% compared with conventional HEVC. VMAF values improved significantly at low bitrates in particular. In subjective tests, the method also showed better visual quality at similar bitrates.
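For readers unfamiliar with how a BD-rate figure such as the one above is obtained, the following is a minimal sketch of the commonly used Bjøntegaard calculation. It fits log-bitrate as a cubic polynomial of quality for each codec and averages the difference over the shared quality range. The abstract does not state the exact procedure the authors used, so the cubic fit, the function name `bd_rate`, and the rate/VMAF points below are illustrative assumptions, not data from the paper.

```python
import numpy as np

def bd_rate(rate_ref, qual_ref, rate_test, qual_test) -> float:
    """Average bitrate change (%) of the test codec vs. the reference
    at equal quality. Negative values mean the test codec saves bitrate."""
    log_ref = np.log10(np.asarray(rate_ref, dtype=float))
    log_test = np.log10(np.asarray(rate_test, dtype=float))

    # Fit log10(rate) as a cubic polynomial of the quality score.
    p_ref = np.polyfit(qual_ref, log_ref, 3)
    p_test = np.polyfit(qual_test, log_test, 3)

    # Integrate both fits over the overlapping quality interval only.
    lo = max(min(qual_ref), min(qual_test))
    hi = min(max(qual_ref), max(qual_test))
    i_ref = np.polyint(p_ref)
    i_test = np.polyint(p_test)
    area_ref = np.polyval(i_ref, hi) - np.polyval(i_ref, lo)
    area_test = np.polyval(i_test, hi) - np.polyval(i_test, lo)

    # Mean log-rate difference, converted back to a percentage.
    avg_diff = (area_test - area_ref) / (hi - lo)
    return float((10.0 ** avg_diff - 1.0) * 100.0)

# Hypothetical rate (kbps) and VMAF points, four per codec.
vmaf = [60.0, 75.0, 85.0, 92.0]
ref_rates = [1000.0, 2000.0, 4000.0, 8000.0]
test_rates = [r / 2.0 for r in ref_rates]   # same quality at half the rate

result = bd_rate(ref_rates, vmaf, test_rates, vmaf)
print(f"BD-rate: {result:.2f}%")
```

In this constructed case the test codec reaches every quality level at half the bitrate, so the calculation reports a 50% bitrate saving. Real rate-quality curves rarely overlap this cleanly, which is why the integration is restricted to the shared quality interval.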
KEYWORDS: High dynamic range imaging, Visualization, Visual analytics, Statistical analysis, Computer programming, Video coding, Video, Quantization, Video compression, Visual compression
In this paper, the visual quality of different solutions for high dynamic range (HDR) compression is analyzed using MPEG test content. We also simulate an efficient HDR compression method that is based on the statistical properties of the signal. The method is compliant with the HEVC specification and is also easily compatible with alternative methods that might require changes to the HEVC specification. It was subjectively tested on commercial TVs and compared with alternative solutions for HDR coding. Subjective visual quality tests were performed on a Samsung JS9500 SUHD TV with a maximum luminance of up to 1000 nit. The solution based on statistical properties improves both objective performance and visual quality compared with other HDR solutions, while remaining compatible with the HEVC specification.