Applying Speech Enhancement to Audio Surveillance

    Volume 35, Issue 5 (September 1990)

    ISSN: 0022-1198

    CODEN: JFSOAD

    Page Count: 10


    Chu, C-C
    Professor, professor, and scientists, INRS-Telecommunications, Université du Québec, Nuns Island, Quebec

    Bernardi, D
    Professor, professor, and scientists, INRS-Telecommunications, Université du Québec, Nuns Island, Quebec

    O'Shaughnessy, D
    Professor, professor, and scientists, INRS-Telecommunications, Université du Québec, Nuns Island, Quebec

    Moncet, J-L
    Professor, professor, and scientists, INRS-Telecommunications, Université du Québec, Nuns Island, Quebec

    Barbeau, L
    Professor, professor, and scientists, INRS-Telecommunications, Université du Québec, Nuns Island, Quebec

    Kabal, P
    Professor, professor, and scientists, INRS-Telecommunications, Université du Québec, Nuns Island, Quebec

    (Received 10 May 1989; accepted 15 September 1989)

    Abstract

    Audio surveillance tapes are prime candidates for speech enhancement because of the many degradations and sources of interference that mask the speech signals on such tapes. In this paper, we describe ways to cancel interference when an available reference signal is not synchronized with the surveillance recording, for example, when the reference is obtained later from a phonograph record or an air check recording from a broadcast source. As a specific example, we discuss our experience processing a wiretap recording used in an actual court case. We transformed the reference signal to reflect room and transmission effects and then subtracted the resulting secondary signal from the primary intercept signal, thus enhancing the speech of the desired talkers by removing interfering sounds. Before the secondary signal could be subtracted, the two signals had to be aligned properly in time. The intercept signal was subjected to time-scale modifications made necessary by the varying phonograph and tape recorder speeds. While these speed differences are usually small enough not to affect the perceived quality, they adversely affect the ability to cancel interference automatically. In working with recording devices, we took into account four factors that affect the signal quality: frequency response, nonlinear distortion, noise, and speed variations. The two most successful enhancement methods were least-mean-squares (LMS) adaptive cancellation and spectral subtraction.
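
    The time alignment mentioned in the abstract can be illustrated with a brute-force cross-correlation search. This is a minimal sketch, not the authors' procedure: the function name `estimate_lag`, the `max_lag` parameter, and the assumption that one constant lag describes the whole excerpt are all illustrative choices made here.

```python
import numpy as np

def estimate_lag(primary, reference, max_lag):
    """Estimate the delay (in samples) of the reference content inside primary.

    A positive result means the reference material shows up that many
    samples later in the primary (intercept) signal. Hypothetical helper,
    assuming a single constant lag over the excerpt.
    """
    primary = np.asarray(primary, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n = min(len(primary), len(reference))
    # Remove the mean so a DC offset does not dominate the correlation.
    p = primary[:n] - primary[:n].mean()
    r = reference[:n] - reference[:n].mean()

    lags = np.arange(-max_lag, max_lag + 1)
    scores = np.empty(len(lags))
    for k, lag in enumerate(lags):
        if lag >= 0:
            # Compare primary[lag + j] with reference[j].
            scores[k] = np.dot(p[lag:], r[:n - lag])
        else:
            # Compare primary[j] with reference[j - lag].
            scores[k] = np.dot(p[:n + lag], r[-lag:])
    # The lag with the largest correlation is taken as the alignment.
    return int(lags[np.argmax(scores)])
```

    Because the abstract reports varying phonograph and tape recorder speeds, a single lag would likely not hold for a whole recording. Applying such an estimate to short consecutive blocks is one plausible way to track the drift.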

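    The LMS adaptive cancellation named in the abstract can be sketched as follows. This is a textbook LMS canceller under stated assumptions, not the authors' implementation: the function name `lms_cancel`, the tap count, and the step size are hypothetical, and the signals are assumed to be already time-aligned and sampled at the same rate.

```python
import numpy as np

def lms_cancel(primary, reference, num_taps=8, step=0.01):
    """Adaptively filter the reference and subtract it from the primary.

    Returns (output, weights): output is the error signal, i.e. the
    primary with the estimated interference removed; weights are the
    final FIR coefficients. Illustrative parameters only.
    """
    primary = np.asarray(primary, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n = min(len(primary), len(reference))
    weights = np.zeros(num_taps)
    output = np.zeros(n)
    # Zero-pad so the earliest samples still see a full tap window.
    padded = np.concatenate([np.zeros(num_taps - 1), reference[:n]])
    for i in range(n):
        # Most recent reference sample first: x[i], x[i-1], ...
        window = padded[i:i + num_taps][::-1]
        estimate = weights @ window      # estimated interference
        error = primary[i] - estimate    # enhanced output sample
        output[i] = error
        # Standard LMS update: step the weights along error * input.
        weights += step * error * window
    return output, weights
```

    The adaptive filter plays the role of the transformation that makes the reference reflect room and transmission effects. The step size must be small relative to the reference power for the update to remain stable, so the default here only suits signals of roughly unit variance.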

    Paper ID: JFS12940J

    DOI: 10.1520/JFS12940J

    ASTM International is a member of CrossRef.

    Committee E30