KEYWORDS: Databases, Digital watermarking, Quantization, Steganography, Video, Fourier transforms, Transmitters, Data conversion, Amplifiers, Systems modeling
A method of embedding data in an audio signal using cepstral domain modification is described. Building on successful
embedding in the spectral points of perceptually masked regions in each frame of speech, the technique was first
extended to embedding in the log spectral domain. This extension yielded approximately 62 bits/s of embedding
with a bit error rate (BER) of less than 2 percent for clean cover speech (from the TIMIT database) and about 2.5
percent for noisy speech (from an air traffic controller database) when all frames, including silence and transitions
between voiced and unvoiced segments, were used. BER increased significantly when the log spectrum in the
vicinity of a formant was modified.
In the next procedure, embedding by altering the mean cepstral values of two ranges of indices was studied. Tests on
both a noisy utterance and a clean utterance indicated a barely noticeable perceptual change in speech quality when the lower
range of cepstral indices, corresponding to the vocal tract region, was modified in accordance with the data. With an
embedding capacity of approximately 62 bits/s (one bit per frame, regardless of frame energy or type of
speech), initial results showed a BER of less than 1.5 percent for a payload capacity of 208 embedded bits using the
clean cover speech. A BER of less than 1.3 percent resulted for the noisy host with a capacity of 316 bits. When the
cepstrum was modified in the region of excitation, BER increased to over 10 percent. With quantization causing no
significant problem, the technique warrants further study with different cepstral ranges and sizes. Pitch-synchronous
cepstrum modification, for example, may be more robust to attacks. In addition, cepstrum modification in regions of
speech that are perceptually masked, analogous to embedding in frequency-masked regions, may yield imperceptible
stego audio with low BER.
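The abstract does not give the exact embedding rule, so the following is only a minimal sketch of one way a single bit per frame could be carried by the mean cepstral values of two index ranges. The rule itself (make the mean of range A exceed that of range B for a 1, and the reverse for a 0), the function names, the ranges `range_a` and `range_b`, the margin `delta`, and the 256-sample frame are all assumptions made for illustration, not the authors' parameters.

```python
import numpy as np

EPS = 1e-12  # guards against log(0); a detail of this sketch only


def real_cepstrum(frame):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.rfft(frame)
    return np.fft.irfft(np.log(np.abs(spectrum) + EPS), n=len(frame))


def embed_bit(frame, bit, range_a=(2, 16), range_b=(16, 30), delta=0.05):
    """Embed one bit by forcing mean(cep[A]) - mean(cep[B]) to +/- delta.

    The ranges and delta are hypothetical values chosen for illustration.
    """
    n = len(frame)
    spectrum = np.fft.rfft(frame)
    cep = np.fft.irfft(np.log(np.abs(spectrum) + EPS), n=n)

    idx_a = np.arange(*range_a)
    idx_b = np.arange(*range_b)
    current = cep[idx_a].mean() - cep[idx_b].mean()
    target = delta if bit else -delta
    shift = (target - current) / 2.0

    # Shift both ranges in opposite directions, mirroring each change at
    # n - k so the cepstrum stays symmetric and the log spectrum stays real.
    for idx, s in ((idx_a, shift), (idx_b, -shift)):
        cep[idx] += s
        cep[n - idx] += s

    # Rebuild the frame from the modified log magnitude and original phase.
    log_mag = np.fft.rfft(cep).real
    stego_spectrum = np.exp(log_mag) * np.exp(1j * np.angle(spectrum))
    return np.fft.irfft(stego_spectrum, n=n)


def extract_bit(frame, range_a=(2, 16), range_b=(16, 30)):
    """Recover the bit by comparing the mean cepstral values of the ranges."""
    cep = real_cepstrum(frame)
    return int(cep[slice(*range_a)].mean() > cep[slice(*range_b)].mean())
```

With non-overlapping 256-sample frames at an assumed 16 kHz sampling rate, one bit per frame gives 62.5 bits/s, which is in line with the roughly 62 bits/s quoted above. The sketch ignores quantization, attacks, and perceptual quality, all of which affect the BER figures reported in the abstract.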
KEYWORDS: Steganography, Receivers, Global system for mobile communications, Data hiding, Quantization, Databases, Data communications, Video, Digital watermarking, Visualization
This paper presents the results of embedding short covert message utterances on a host, or cover, utterance by modifying the phase or amplitude of perceptually masked or significant regions of the host. In the first method, the absolute phase at selected, perceptually masked frequency indices was changed to fixed, covert data-dependent values. Embedded bits were retrieved at the receiver from the phase at the selected frequency indices. Tests on embedding a GSM-coded covert utterance on clean and noisy host utterances showed no noticeable difference in the stego compared to the hosts in speech quality or spectrogram. A bit error rate of 2 out of 2800 was observed for a clean host utterance while no error occurred for a noisy host. In the second method, the absolute phase of 10 or fewer perceptually significant points in the host was set in accordance with covert data. This resulted in a stego with successful data retrieval and a slightly noticeable degradation in speech quality. Modifying the amplitude of perceptually significant points caused perceptible differences in the stego even with small changes of amplitude made at five points per frame. Finally, the stego obtained by altering the amplitude at perceptually masked points showed barely noticeable differences and excellent data recovery.
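As an illustration of the first method, the sketch below sets the absolute phase at a few frequency indices to fixed, data-dependent values and reads the bits back from the phase at the same indices. It is a simplified example under stated assumptions: the phase values (+π/2 for a 1, -π/2 for a 0), the function names, and the hand-picked `indices` are hypothetical. The paper selects perceptually masked indices, which would require a masking model that is not implemented here.

```python
import numpy as np


def embed_phase_bits(frame, bits, indices):
    """Set the phase at the given FFT bins to +pi/2 (bit 1) or -pi/2 (bit 0).

    Magnitudes are left unchanged. The indices must exclude the DC and
    Nyquist bins, whose phase cannot be set freely for a real signal.
    """
    spectrum = np.fft.rfft(frame)
    idx = np.asarray(indices)
    phases = np.where(np.asarray(bits) == 1, np.pi / 2, -np.pi / 2)
    spectrum[idx] = np.abs(spectrum[idx]) * np.exp(1j * phases)
    return np.fft.irfft(spectrum, n=len(frame))


def extract_phase_bits(frame, indices):
    """Recover bits from the sign of the phase at the selected bins."""
    spectrum = np.fft.rfft(frame)
    return (np.angle(spectrum[np.asarray(indices)]) > 0).astype(int)
```

Both ends must agree on the indices. In this sketch they are passed in explicitly; how the receiver in the paper determines the selected indices is not stated in the abstract.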