Abstract: Information about relevant acoustic parameters, such as formant frequency values, the direction and duration of formant transitions, and stop burst/aspiration duration, was exploited in synthesizing the labial-velar stops [kp] and [gb] from a sequence of [stop + w] (where the stop is labial, alveolar or velar) in Edo disyllabic utterances of the VCCV type. By editing the acoustic signal to combine the closure phase of a labial, alveolar or velar stop with a shortened portion of the formant transitions of the labial-velar glide, perceptually distinct labial-velar stops were obtained. These findings agree with the results of earlier studies on speech perception, according to which the perceptual information necessary for discriminating stop consonants is located at the release portion of the stop. The results of the present study clearly demonstrate that natural speech tokens can be acoustically edited to produce speech materials comparable in perceptual quality to natural speech. The availability of highly sophisticated electronic hardware and software, which could be used to tamper with recorded speech presented as evidence in criminal litigation (especially in cases involving disputed utterances) without any appreciable distortion of the original signal, therefore calls for extra caution in handling tape authentication in forensic phonetic investigations. Moreover, the relative stability of the third formant frequency (F3) at the steady-state portion in all vowel contexts for the subject used in this study (between 2494 Hz and 2581 Hz) suggests that F3 might be a robust cue for speaker identification. Further studies with more subjects are needed to support this claim.
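
The editing step described above (stop closure, then a shortened glide transition, then the following vowel) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure actually used in the study: the function name `splice_stop_glide`, the segment boundaries, and the choice to keep the final part of the transition (the part adjacent to the vowel) are all hypothetical, and a real edit would normally cut at zero crossings to avoid audible clicks.

```python
def splice_stop_glide(samples, sample_rate, closure, transition, vowel, keep_ms):
    """Join a stop closure, a shortened glide transition, and a vowel.

    samples     -- list of audio samples for one [stop + w] token
    sample_rate -- samples per second
    closure, transition, vowel -- (start, end) times in seconds; these
                   boundaries are assumed to come from manual inspection
    keep_ms     -- how many milliseconds of the transition to retain
    """
    def to_index(seconds):
        # Convert a time in seconds to the nearest sample index.
        return int(round(seconds * sample_rate))

    c0, c1 = to_index(closure[0]), to_index(closure[1])
    t0, t1 = to_index(transition[0]), to_index(transition[1])
    v0, v1 = to_index(vowel[0]), to_index(vowel[1])

    # Never keep more than the transition actually contains.
    keep = min(to_index(keep_ms / 1000.0), t1 - t0)

    # Assumption: retain the end of the transition, next to the vowel.
    shortened = samples[t1 - keep:t1]

    return samples[c0:c1] + shortened + samples[v0:v1]


# Toy signal: sample values equal their own index, so cuts are easy to see.
signal = list(range(1000))
edited = splice_stop_glide(signal, 1000, (0.1, 0.2), (0.2, 0.3), (0.3, 0.5), 20)
```

In this toy case the 100 ms transition is cut down to 20 ms, so the edited token is 80 ms shorter than the original closure-to-vowel stretch.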
