KEYWORDS: Genetics, Data hiding, Organisms, Data communications, Proteins, Molecules, Digital watermarking, Computer programming, Steganography, Bacteria
A number of methods have been proposed over the last decade for embedding information within deoxyribonucleic
acid (DNA). Since a DNA sequence is conceptually equivalent to a unidimensional digital signal, DNA data
embedding (variously called DNA watermarking or DNA steganography) can be seen either as a traditional
communications problem or as an instance of communications with side information at the encoder, similar to
data hiding. These two cases correspond to the use of noncoding or coding DNA hosts, which, respectively, denote
DNA segments that cannot or can be translated into proteins. A limitation of existing DNA data embedding
methods is that none of them has been designed according to optimal coding principles. Neither is it possible
to evaluate how close to optimality these methods are without determining the Shannon capacity of DNA data
embedding. This is the main topic studied in this paper, where we consider that DNA sequences may be subject
to substitution, insertion, and deletion mutations.
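As a rough illustration of the mutation channel considered above, the following sketch applies i.i.d. substitution, insertion and deletion mutations to a quaternary DNA sequence. The mutation rates and the example sequence are arbitrary assumptions, not values from the paper.

```python
import random

ALPHABET = "ACGT"

def mutate(seq, p_sub=0.01, p_ins=0.005, p_del=0.005, rng=random.Random(0)):
    """Apply i.i.d. substitution, insertion and deletion mutations to a DNA string."""
    out = []
    for base in seq:
        if rng.random() < p_del:      # deletion: drop this base
            continue
        if rng.random() < p_sub:      # substitution: replace with a different base
            base = rng.choice([b for b in ALPHABET if b != base])
        out.append(base)
        if rng.random() < p_ins:      # insertion: append a random extra base
            out.append(rng.choice(ALPHABET))
    return "".join(out)

print(mutate("ACGTACGTACGT" * 4))
```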
KEYWORDS: Distortion, Stochastic processes, Digital watermarking, Fourier transforms, Data hiding, Process modeling, Image quality, Image enhancement, Signal to noise ratio, Steganography
Desynchronization attacks based on fine resampling of a watermarked signal can be very effective from the point of view of degrading decoding performance. Nevertheless, the actual perceptual impact brought about by these attacks has not been considered in enough depth in previous research. In this work, we investigate geometric distortion measures which aim at being simultaneously general, related to human perception, and easy to compute in stochastic contexts. Our approach is based on combining the stochastic characterization of the sampling grid jitter applied by the attacker with empirically relevant perceptual measures. Using this procedure, we show that the variance of the sampling grid, which is a customary geometric distortion measure, has to be weighted in order to carry more accurate perceptual meaning. Indeed, the spectral characteristics of the geometric jitter signal have to be relevant from a perceptual point of view, as intuitively seen when comparing constant shift resampling and white jitter resampling. Finally, as the geometric jitter signal does not fully describe the resampled signal, we investigate more accurate approaches to producing a geometric distortion measure that takes into account the amplitude modifications due to resampling.
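The intuition about constant shift versus white jitter can be illustrated with a small numerical sketch (the test signal, jitter variance and interpolation method below are assumptions for illustration, not the measures developed in the paper): both grids have the same displacement variance, yet a constant shift is perceptually a mere delay while white jitter is not, which is why the variance has to be spectrally weighted.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024
t = np.arange(n, dtype=float)
x = np.sin(2 * np.pi * 0.01 * t) + 0.5 * np.sin(2 * np.pi * 0.04 * t)

sigma = 0.5                                   # jitter strength, in samples
jitter_const = np.full(n, sigma)              # constant shift: E[d^2] = sigma^2
jitter_white = rng.normal(0.0, sigma, n)      # white jitter:   E[d^2] = sigma^2

def resample(signal, displacement):
    # Linear interpolation on the displaced grid t + displacement.
    return np.interp(t + displacement, t, signal)

for name, d in [("constant shift", jitter_const), ("white jitter", jitter_white)]:
    y = resample(x, d)
    print(f"{name:14s}  grid E[d^2] = {np.mean(d**2):.3f}  plain MSE = {np.mean((y - x)**2):.4f}")
```

Both attacks report the same grid variance, and their plain amplitude MSE is of the same order, even though only the white-jitter version is perceptually objectionable.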
KEYWORDS: Distortion, Multimedia, Digital watermarking, Computer programming, Data hiding, Signal detection, Computer security, Signal processing, Quantization, Information security
This work deals with practical and theoretical issues raised by the information-theoretical framework for authentication
with distortion constraints proposed by Martinian et al.1 The optimal schemes proposed by these
authors rely on random codes which bear close resemblance to the dirty-paper random codes which show up
in data hiding problems. On the one hand, this would suggest implementing practical authentication methods
employing lattice codes, but these are too easy to tamper with in authentication scenarios. Lattice codes
must be randomized in order to hide their structure. One particular multimedia authentication method based
on randomizing the scalar lattice was recently proposed by Fei et al.2 We reexamine this method here in the
light of the aforementioned information-theoretical study, and we extend it to general lattices, thus providing a
more general performance analysis for lattice-based authentication. We also propose improvements to Fei et al.'s
method based on the analysis by Martinian et al., and we discuss some weaknesses of these methods and their
solutions.
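As a loose illustration of the lattice randomization idea discussed above, the sketch below applies a key-dependent dither to a scalar (uniform) lattice quantizer so that its structure is not exposed; the step size and key are arbitrary assumptions, and this is not Fei et al.'s exact construction.

```python
import numpy as np

def randomized_quantize(x, step, key):
    """Scalar lattice quantization with a secret, key-dependent dither."""
    rng = np.random.default_rng(key)
    dither = rng.uniform(0.0, step, size=x.shape)
    return np.round((x + dither) / step) * step - dither

x = np.random.default_rng(1).normal(0.0, 1.0, 8)
y = randomized_quantize(x, step=0.5, key=1234)
print(np.round(y - x, 3))    # residues bounded by +/- step/2, but on a hidden lattice
```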
KEYWORDS: Quantization, Modulation, Optical spheres, Digital watermarking, Data hiding, Signal to noise ratio, Error analysis, Chemical elements, Matrices, Binary data
Spread-Transform Dither Modulation (STDM) is a side-informed data hiding method based on the quantization of a linear projection of the host signal. This projection affords a signal to noise ratio gain which is exploited by Dither Modulation (DM) in the projected domain. Similarly, the signal to noise ratio gain afforded by the so-called sphere-hardening effect on the norm of a vector can be exploited to the same end. In this paper we describe the Sphere-hardening Dither Modulation (SHDM) data hiding method, which is based on the application of DM to the magnitude of a host signal vector, and we give an analysis of its characteristics. It is shown that, in the same sense as STDM can be deemed to be the side-informed counterpart of additive spread spectrum (SS) with repetition coding, SHDM is the side-informed counterpart of multiplicative SS with repetition. Indeed, we demonstrate that SHDM performs similarly to STDM under additive independent distortions, but with the particularity that this is achieved through different quantization regions: the quantization hyperplanes which characterize STDM are replaced by quantization spheres in SHDM. The issue of securing SHDM is also studied.
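A minimal sketch of the contrast drawn above, with the step size and projection direction chosen arbitrarily: STDM applies DM to a linear projection of the host vector, while SHDM applies DM to its Euclidean norm, so the marked vector lands on a quantization hyperplane in one case and on a quantization sphere in the other.

```python
import numpy as np

def dm(value, bit, delta):
    """Dither Modulation of a scalar: quantize to the sublattice selected by the bit."""
    d = bit * delta / 2.0
    return np.round((value - d) / delta) * delta + d

def stdm_embed(x, bit, delta, p):
    p = p / np.linalg.norm(p)
    proj = x @ p
    return x + (dm(proj, bit, delta) - proj) * p      # move along the projection direction

def shdm_embed(x, bit, delta):
    r = np.linalg.norm(x)
    return x * (dm(r, bit, delta) / r)                 # rescale onto a quantized radius

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 16)
print(np.linalg.norm(shdm_embed(x, 1, 0.5)))                      # quantized norm
print(stdm_embed(x, 1, 0.5, np.ones(16)) @ (np.ones(16) / 4.0))   # quantized projection
```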
Steganographic embedding is generally guided by two performance constraints at the encoder. Firstly, as is typical in the field of watermarking, all the transmission codewords must conform to an average power constraint. Secondly, for the embedding to be statistically undetectable (secure), the density of the watermarked signal must be equal to the density of the host signal. If this is not the case, statistical steganalysis will achieve a probability of detection error less than 1/2 and the communication may be terminated. Recent work has shown that some common watermarking algorithms can be modified such that both constraints are met. In particular, spread spectrum (SS) communication can be secured by a specific scaling of the host before embedding. Also, a side-informed scheme called stochastic quantization index modulation (SQIM) maintains security with the use of an additive stochastic element during the embedding. In this work the performance of both techniques is analysed under the AWGN channel assumption. It will be seen that the robustness of both schemes is lessened by the steganographic constraints when compared to the standard algorithms on which they are based. Specifically, the probability of decoding error in the SS technique increases when security is required, and the achievable rate of SQIM is shown to be lower than that of dither modulation (on which the scheme is based) for a finite alphabet size.
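As a hedged numerical sketch of the host-scaling idea mentioned for SS (the Gaussian host, Gaussian watermark component and the particular scaling rule are assumptions made here for illustration, not necessarily the scheme analysed in the text), shrinking the host before adding the watermark can keep the marked-signal density equal to the host density:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_x, sigma_w, n = 1.0, 0.3, 100_000
x = rng.normal(0.0, sigma_x, n)                 # Gaussian host
w = rng.normal(0.0, sigma_w, n)                 # watermark component (fixed message)

nu = np.sqrt(1.0 - sigma_w**2 / sigma_x**2)     # host attenuation preserving the variance
y = nu * x + w                                  # marked signal, still N(0, sigma_x^2)

print(round(np.var(x), 3), round(np.var(y), 3)) # both close to 1.0
```

Under this assumption, part of the embedding distortion budget is spent on attenuating the host rather than on the watermark itself, consistent with the loss of robustness noted above.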
In this paper we present a statistical analysis of a particular audio fingerprinting method proposed by Haitsma et al.1 Due to the excellent robustness and synchronisation properties of this particular fingerprinting method, we would like to examine its performance for varying values of the parameters involved in the computation and ascertain its capabilities. For this reason, we pursue a statistical model of the fingerprint (also known as a hash, message digest or label). Initially we follow a previous attempt by Doets and Lagendijk2-4 to obtain such a statistical model. By reformulating the representation of the fingerprint as a quadratic form, we present a model in which the parameters derived by Doets and Lagendijk may be obtained more easily. Furthermore, our model allows further insight into certain aspects of the behaviour of the fingerprinting algorithm not previously examined. Using our model, we then analyse the probability of error (Pe) of the hash. We identify two particular error scenarios and obtain an expression for the probability of error in each case. We present three methods of varying accuracy to approximate Pe following Gaussian noise addition to the signal of interest. We then analyse the probability of error following desynchronisation of the signal at the input of the hashing system and provide an approximation to Pe for different parameters of the algorithm under varying degrees of desynchronisation.
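For orientation, a much-simplified sketch of a Haitsma-Kalker style binary fingerprint is given below (the band energies and matrix sizes are placeholder assumptions): each hash bit is the sign of a time-frequency difference of band energies, which is the kind of structure the quadratic-form reformulation above models.

```python
import numpy as np

def fingerprint_bits(energies):
    """energies: (frames, bands) matrix of spectral band energies."""
    d = np.diff(energies, axis=1)     # difference across adjacent frequency bands
    dd = np.diff(d, axis=0)           # difference across adjacent time frames
    return (dd > 0).astype(int)       # one bit per (frame, band) pair

rng = np.random.default_rng(0)
E = rng.random((10, 33))              # e.g. 33 bands give 32 bits per frame
print(fingerprint_bits(E).shape)      # (9, 32) sub-fingerprint bits
```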
The vulnerability of quantization-based data hiding schemes to amplitude scaling has required the formulation of countermeasures to this relatively simple attack. Parameter estimation is one approach, where the applied scaling is estimated from the received signal at the decoder. As scaling of the watermarked signal creates a mismatch with respect to the quantization step assumed by the decoder, this estimate can be used to correct the mismatch prior to decoding. In this work we first review previous approaches utilizing parameter estimation as a means of combating the scaling attack on DC-DM. We then present a method for maximum likelihood estimation of the scaling factor for this quantization-based method. Using iteratively decodable codes in conjunction with DC-DM, the estimation method exploits the reliabilities provided by the near-optimal decoding process in order to iteratively refine the estimate of the applied scaling. By performing estimation in cooperation with the decoding process, the complexity of which is tackled using the expectation maximization algorithm, reliable estimation is possible at very low watermark-to-noise power ratios by using sufficiently low rate codes.
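A toy version of the scaling-estimation problem can convey the idea (this sketch uses plain DM and an exhaustive search rather than the ML/EM estimation coupled with iterative decoding described above; all parameters are assumptions): the estimate is the factor that best realigns the received signal with the decoder's quantization lattice.

```python
import numpy as np

rng = np.random.default_rng(0)
delta, g_true = 1.0, 0.8
host = rng.normal(0.0, 5.0, 5000)
marked = np.round(host / delta) * delta                             # DM-marked samples (zero dither)
received = g_true * (marked + rng.normal(0.0, 0.05, marked.size))   # scaled and noisy observation

def residue_power(z, g, delta):
    r = z / g
    return np.mean((r - np.round(r / delta) * delta) ** 2)

candidates = np.linspace(0.5, 1.5, 201)
g_hat = candidates[np.argmin([residue_power(received, g, delta) for g in candidates])]
print(g_hat)                                                        # close to the true 0.8
```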
Digital steganography is the art of hiding information in multimedia
content, such that it remains perceptually and statistically unchanged. The detection of such covert communication is referred to as steganalysis. To date, steganalysis research has focused primarily on either the extraction of features from a document that are sensitive to the embedding, or the inference of some statistical difference between marked and unmarked objects. In this work, we evaluate the statistical limits of such techniques by developing asymptotically optimal (maximum likelihood) tests for a number of side-informed embedding schemes. The required probability density functions (pdfs) are derived for Dither Modulation (DM) and Distortion-Compensated Dither Modulation (DC-DM/SCS) from a steganalyst's point of view. For both embedding techniques, the pdfs are derived in the presence and absence of a secret dither key. The resulting tests are then compared to a robust blind steganalytic test based on feature extraction. The performance of the tests is evaluated using an integral measure and receiver operating characteristic (ROC) curves.
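To make the keyless case concrete, here is a hedged sketch (not the paper's likelihood-ratio tests) of a simple statistic exploiting the fact that keyless DM forces samples onto a known lattice: the quantization residues of a marked signal concentrate near the lattice points, while an unmarked host yields roughly uniform residues.

```python
import numpy as np

def dm_detection_statistic(z, delta):
    """Mean distance to the nearest lattice point, normalized to ~1 for unmarked data."""
    r = np.abs(z - np.round(z / delta) * delta)
    return np.mean(r) / (delta / 4.0)

rng = np.random.default_rng(0)
delta = 1.0
host = rng.normal(0.0, 4.0, 10_000)
marked = np.round(host / delta) * delta + rng.normal(0.0, 0.05, host.size)
print(dm_detection_statistic(host, delta))    # about 1 (no embedding)
print(dm_detection_statistic(marked, delta))  # well below 1 (DM embedding, no dither key)
```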
Compact representation of perceptually relevant parts of multimedia data, referred to as robust hashing or fingerprinting, is often used for efficient retrieval from databases and authentication. In previous work, we introduced a framework for robust hashing which improves the performance of any particular feature extraction method. The hash generation was achieved from a feature vector in three distinct stages, namely: quantization, bit assignment and application of the decoding stage of an error correcting code. Results were obtained for unidimensional quantization and bit assignment, using one code only. In this work, we provide a generalisation of those techniques to higher dimensions. Our framework is analysed under different conditions at each stage. For the quantization, we consider both the case where the codevectors are uniformly distributed and the case where they are nonuniformly distributed. For multidimensional quantizers, bit assignment to the resulting indexes is a non-trivial task, and a number of techniques are evaluated. We show that judicious assignment of binary indices to the codevectors of the quantizer improves the performance of the hashing method. Finally, the robustness provided by a number of different channel codes is evaluated.
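The three-stage hash generation described above can be sketched as follows; the uniform scalar quantizer, Gray-code bit assignment and rate-1/3 repetition code used here are placeholder choices, not the configurations evaluated in the work.

```python
import numpy as np

def gray(i):
    return i ^ (i >> 1)                       # binary-reflected Gray code

def hash_features(features, step=0.5, levels=8):
    # Stage 1: quantization of each feature to one of `levels` cells.
    idx = np.clip(np.round(features / step).astype(int) + levels // 2, 0, levels - 1)
    # Stage 2: bit assignment (Gray labels, 3 bits per index).
    bits = [int(b) for i in idx for b in format(gray(int(i)), "03b")]
    # Stage 3: decoding stage of an error correcting code (majority vote, rate 1/3).
    return [int(sum(bits[k:k + 3]) >= 2) for k in range(0, len(bits), 3)]

f = np.random.default_rng(0).normal(0.0, 1.0, 12)
print(hash_features(f))
```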
KEYWORDS: Expectation maximization algorithms, Data hiding, Distortion, Digital watermarking, Computer programming, Forward error correction, Lead, Reliability, Quantization, Signal to noise ratio
Distortion-Compensated Dither Modulation (DC-DM), also known as Scalar Costa Scheme (SCS), has been theoretically shown to be near-capacity achieving thanks to its use of side information at the encoder. In practice, channel coding is needed in conjunction with this quantization-based scheme in order to approach the achievable rate limit. The most powerful coding methods use iterative decoding (turbo codes, LDPC), but they require knowledge of the channel model. Previous works on the subject have assumed the latter to be known by the decoder. We investigate here the possibility of undertaking blind iterative decoding of DC-DM, using maximum likelihood estimation of the channel model within the decoding procedure. The unknown attack is assumed to be i.i.d. and additive. Before each iterative decoding step, a new optimal estimation of the attack model is made using the reliability information provided by the previous step. This new model is used for the next iterative decoding stage, and the procedure is repeated until convergence. We show that the iterative Expectation-Maximization algorithm is suitable for solving the problem posed by model estimation, as it can be conveniently intertwined with iterative decoding.
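A single re-estimation step of the kind interleaved with decoding can be sketched as follows (the Gaussian attack model and the toy posteriors are assumptions for illustration): the decoder's reliabilities weight the residues with respect to each candidate lattice offset, yielding an updated attack-model estimate for the next decoding iteration.

```python
import numpy as np

def em_variance_update(z, post_bit1, delta):
    """One EM-style update of a Gaussian attack variance from per-bit posteriors."""
    r0 = z - np.round(z / delta) * delta                              # residue, bit-0 sublattice
    z1 = z - delta / 2.0
    r1 = z1 - np.round(z1 / delta) * delta                            # residue, bit-1 sublattice
    return np.mean((1.0 - post_bit1) * r0 ** 2 + post_bit1 * r1 ** 2)

rng = np.random.default_rng(0)
delta = 1.0
z = np.round(rng.normal(0.0, 4.0, 5000)) + rng.normal(0.0, 0.1, 5000)  # bit-0 lattice + noise
print(em_variance_update(z, post_bit1=np.zeros(5000), delta=delta))    # close to 0.1**2
```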
Data hiding using quantization has proven to be an effective way of taking into account side information at the encoder. When quantizing more than one host signal sample there are two choices: (1) using the Cartesian product of several one-dimensional quantizers, as done in the Scalar Costa Scheme (SCS); or (2) performing vectorial quantization. The second option seems better, as rate-distortion theory affirms that higher dimensional quantizers yield improved performance due to better sphere-packing properties. Although the embedding problem does resemble that of rate-distortion, no attacks or host signal characteristics are usually considered when designing the quantizer in this way. We show that attacks worsen the performance of the a priori optimal lattice quantizer through a counterexample: the comparison under Gaussian distortion of hexagonal lattice quantization against bidimensional Distortion-Compensated Quantized Projection (DC-QP), a data hiding alternative based on quantizing a linear projection of the host signal. Apart from empirical comparisons, theoretical lower bounds on the probability of decoding error of hexagonal lattices under Gaussian host signal and attack are provided and compared to the already analyzed DC-QP method.
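For reference, nearest-point quantization on the hexagonal (A2) lattice used in the counterexample can be sketched as below; the generator matrix and the brute-force neighbour search are illustrative choices.

```python
import numpy as np

G = np.array([[1.0, 0.0],
              [0.5, np.sqrt(3.0) / 2.0]])        # rows are basis vectors of the A2 lattice

def hex_quantize(x):
    """Return the hexagonal lattice point nearest to the 2-D point x."""
    u = np.round(np.linalg.solve(G.T, x))        # rough integer lattice coordinates
    best, best_d = None, np.inf
    for du in np.ndindex(3, 3):                  # search the surrounding lattice points
        cand = (u + np.array(du) - 1) @ G
        d = np.sum((cand - x) ** 2)
        if d < best_d:
            best, best_d = cand, d
    return best

print(hex_quantize(np.array([0.7, 0.9])))        # nearest hexagonal lattice point
```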
The performance of data hiding techniques in still images may be greatly improved by means of coding. In previous approaches, repetition coding was first used to obtain N identical Gaussian channels, over which block and convolutional coding were then applied. However, since repetition coding can be improved upon, we turn our attention to coding directly at the sample level. Bounds on both hard and soft decision decoding performance are provided, and the use of concatenated coding and turbo coding for this approach is explored.
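As a baseline illustration of hard versus soft decision decoding in this setting, the sketch below compares both decoders for a simple repetition code over a Gaussian channel; the rate, repetition factor and noise level are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bits, rep, sigma = 10_000, 5, 1.0
bits = rng.integers(0, 2, n_bits)
tx = np.repeat(2 * bits - 1, rep).astype(float)        # BPSK symbols, repeated `rep` times
rx = tx + rng.normal(0.0, sigma, tx.size)              # AWGN channel

hard = (np.mean((rx > 0).reshape(n_bits, rep), axis=1) > 0.5).astype(int)  # majority vote
soft = (np.sum(rx.reshape(n_bits, rep), axis=1) > 0).astype(int)           # sum then sign
print("hard BER:", np.mean(hard != bits), " soft BER:", np.mean(soft != bits))
```

Soft combining outperforms hard per-sample decisions, which is one reason repetition-based constructions leave room for improvement.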