Introduction

Otoacoustic emissions (OAEs) are sounds produced by the ear when the basilar membrane of the cochlea is stimulated by a sound. When the brain detects a sound, the "tension" of the basilar membrane decreases to make the detection even easier. This phenomenon is analogous to an active amplifier. Thus, when excited, the outer hair cells vibrate and produce a nearly inaudible sound that propagates out of the cochlea and into the middle ear. In addition to the vibrating hair cells, intrinsic mechanical irregularities cause additional reflections off the basilar membrane that propagate into the middle ear.
This report simulates and analyzes three different types of OAEs.

Click Evoked OAE

The Click Evoked OAE (CEOAE), also known as the Transient Evoked OAE (TEOAE), is measured with an impulse-like, broadband pulse as the stimulus. Since the stimulus is broadband, it contains all frequencies of interest. Additionally, it is possible to extract frequency-band specific OAEs due to the temporal dispersion of frequencies. This dispersion occurs because of the distances different frequency components have to travel to reach their characteristic place along the basilar membrane. It takes low frequencies longer to travel towards the apex and back again than it takes high frequencies to reach their basal location.

There is, however, an inherent challenge in measuring CEOAEs. Since the stimulus is broadband, it cannot simply be removed by filtering the measurement. The stimulus also rings for several milliseconds before decaying, which overpowers the high-frequency components that have the shortest travel times. Because the OAE results from a non-linear, compressive basilar membrane response, it can be extracted using the derived non-linear residue (DNLR) technique.

The DNLR technique is useful for extracting the non-linear component of a system from its linear components. The theory is simple: for a linear system H with input x, H(x) + H(x) - H(2x) = 0 by linearity. In this case, however, H is non-linear, so the result is non-zero. That result, or residue, is the OAE!
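
As a worked example, assume purely for illustration that the system is a memoryless polynomial with one linear and one cubic term. The coefficients a and b are hypothetical and are not taken from any ear model. The linear terms cancel in the residue and only the cubic term survives:

```latex
\begin{align*}
H(x) &= a x + b x^{3} \\
H(x) + H(x) - H(2x) &= 2\,(a x + b x^{3}) - (2 a x + 8 b x^{3}) \\
&= -6\, b\, x^{3}
\end{align*}
```

With b = 0 the system is linear and the residue vanishes, which matches the linearity argument above.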

Tone-burst OAE

The Tone-burst OAE (TBOAE), also known as the Stimulus-Frequency OAE (SFOAE), is measured with a single tone as the stimulus. This method is useful for measuring frequency-specific OAEs, as it maximizes the energy at the characteristic place of interest. The temporal dispersion of frequencies still affects the measured OAE; however, since the OAE contains mostly just the tone frequency, the dispersion appears as latency.

The same issues with measuring the CEOAE arise when measuring the TBOAE. The DNLR technique can be used again to remove the stimulus artifact. In their paper, Kalluri and Shera mention two more methods for separating the true TBOAE from the stimulus: two-tone suppression, and a signal processing approach that applies spectral smoothing. Two-tone suppression takes advantage of the brain's unilateral control of the cochlear amplifier. Playing a higher tone at a higher level in one ear and the target tone in the other results in no OAE being generated for the target tone. This measurement can then be compared against the normal measurement. The spectral smoothing approach attempts to extract the magnitude and phase of the OAE by convolving the measurement with a smoothing function.

References:
Kalluri, R., & Shera, C.A. (2007). Comparing stimulus-frequency otoacoustic emissions measured by compression, suppression and spectral smoothing. Journal of the Acoustical Society of America, 122, 3562-3575.

Distortion Product OAE

The Distortion Product OAE (DPOAE) makes use of the non-linear frequency distortions in the ear to measure the OAE at a frequency not present in the stimulus. The stimulus is made up of two tones played at the same time, for example, a 1 kHz tone and a 1.2 kHz tone. When processed by the inner ear, distortion products occur at combinations of the two stimulus frequencies. One of the most prevalent is the lower cubic distortion product. Since the induced OAE frequency is not present in the stimulus, the OAE can be extracted by applying either a high-order low-pass or band-pass filter. The lower cubic distortion product frequency is 2f1 - f2, where f1 is the lower and f2 the higher stimulus frequency.
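
To illustrate how a non-linearity creates energy at 2f1 - f2, here is a minimal sketch that passes the 1 kHz and 1.2 kHz example tones through a hypothetical memoryless cubic non-linearity. The sample rate, the 0.1 coefficient, and the helper `tone_amplitude` are assumptions made for this example. The cubic term stands in for the cochlea only in the loosest sense.

```python
import math

def tone_amplitude(signal, freq_hz, fs):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / fs) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / fs) for i, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

fs = 16000            # sample rate in Hz (assumed)
n = 1600              # 0.1 s, so every tone below sits on an exact DFT bin
f1, f2 = 1000.0, 1200.0
f_dp = 2 * f1 - f2    # lower cubic distortion product frequency

two_tone = [math.sin(2 * math.pi * f1 * i / fs) + math.sin(2 * math.pi * f2 * i / fs)
            for i in range(n)]

# Hypothetical non-linearity: a small cubic term added to a linear path.
distorted = [x - 0.1 * x ** 3 for x in two_tone]

amp_in = tone_amplitude(two_tone, f_dp, fs)    # stimulus has no energy at f_dp
amp_out = tone_amplitude(distorted, f_dp, fs)  # the cubic term creates it

print(f"f_dp = {f_dp:.0f} Hz, before: {amp_in:.2e}, after: {amp_out:.4f}")
```

Expanding (sin a + sin b)^3 gives a component of amplitude 3/4 at 2f1 - f2, so the distorted signal carries 0.1 x 0.75 = 0.075 at that frequency while the stimulus carries none.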

Typically, DPOAEs are measured by sweeping two tones while maintaining the ratio of their instantaneous frequencies. The recording is then filtered using a sliding bandpass filter to extract the broadband OAE over time.

Simulation of CEOAE

To simulate the CEOAEs, I created 100 microsecond pulses and used them as the input to the Verhulst model. In order to successfully capture the OAE, the stimulus itself must be removed from the recording. There are several ways to do this.

Change the model parameters

The Verhulst model is programmed so that the irregularities and non-linearities of the basilar membrane can be programmatically turned on or off during a simulation. If we run the model twice with the same pulse signal -- one with the irregularities on, and the other with the irregularities off -- subtracting the second output from the first output yields the OAE. This is referred to and plotted as the "Model Reference OAE".

Window the model output

Windowing the output signal is another approach to removing the stimulus artifact. Here, windowing simply zeroes the recorded output before the OAE is expected to arrive. I applied a rectangular window that zeroed out everything from t = 0 ms to t = 5 ms.

This method is easily the worst, for a couple of reasons. First, windowing does not remove the residual effects of the stimulus artifact beyond t = 5 ms. Second, windowing induces spectral leakage (multiplication in time is convolution in the frequency domain), which is evident in the frequency response of the Windowed OAE. The leakage can be mitigated by selecting a different window, but it cannot be eliminated.
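
The gating step can be sketched as follows. The sample rate, the 1 ms ramp, and the decaying sinusoid used as a placeholder recording are all assumptions for illustration. `onset_gate` is a hypothetical helper, not part of any toolbox. Setting `ramp_s` to zero gives the rectangular window described above, while a non-zero ramp gives a raised-cosine onset, one possible "different window" that softens the abrupt edge.

```python
import math

def onset_gate(n_samples, fs, t_open_s, ramp_s=0.0):
    """Gate that is 0 before t_open_s and 1 afterwards.

    ramp_s = 0 gives a rectangular gate; ramp_s > 0 adds a raised-cosine
    rise that starts at t_open_s and lasts ramp_s seconds.
    """
    n_open = int(round(t_open_s * fs))
    n_ramp = int(round(ramp_s * fs))
    gate = []
    for i in range(n_samples):
        if i < n_open:
            gate.append(0.0)
        elif i < n_open + n_ramp:
            gate.append(0.5 * (1.0 - math.cos(math.pi * (i - n_open) / n_ramp)))
        else:
            gate.append(1.0)
    return gate

fs = 48000          # sample rate in Hz (assumed)
n_samples = 960     # 20 ms of placeholder "recording"
recording = [math.exp(-i / 200.0) * math.sin(2 * math.pi * 2000 * i / fs)
             for i in range(n_samples)]

rect = onset_gate(n_samples, fs, 0.005)                  # zero out 0 to 5 ms
smooth = onset_gate(n_samples, fs, 0.005, ramp_s=0.001)  # same, with a 1 ms ramp

gated_rect = [g * x for g, x in zip(rect, recording)]
gated_smooth = [g * x for g, x in zip(smooth, recording)]
```

Neither gate removes stimulus ringing that persists past 5 ms. The ramp only changes how much leakage the gate edge itself contributes.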

Implement the DNLR technique

For the DNLR technique, three simulations were run. The first two used identical clicks; the third used a click of opposite polarity at a level 6 dB higher (double the amplitude) than the other two. Summing the three simulation outputs cancels the linear components and leaves the non-linear OAE component.
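
The three-run procedure can be sketched with a toy system in place of the ear model. Everything about `toy_ear` is hypothetical: it is a linear echo plus a small compressive (tanh) term, chosen only so that the residue is easy to inspect. The 100 kHz sample rate is also an assumption.

```python
import math

def click(n_samples, fs, width_s, amplitude):
    """Rectangular click of the given width starting at t = 0."""
    n_on = int(round(width_s * fs))
    return [amplitude if i < n_on else 0.0 for i in range(n_samples)]

def toy_ear(stimulus, nonlinear=True, delay=20):
    """Hypothetical stand-in for the model: direct path, linear echo, compressive echo."""
    out = []
    for i, x in enumerate(stimulus):
        delayed = stimulus[i - delay] if i >= delay else 0.0
        y = x + 0.5 * delayed
        if nonlinear:
            y += 0.1 * math.tanh(delayed)
        out.append(y)
    return out

def dnlr_residue(model, n_samples, fs, amplitude=1.0):
    """Sum of two identical runs and one inverted, double-amplitude run."""
    x = click(n_samples, fs, 100e-6, amplitude)
    x_big = [-2.0 * v for v in x]  # opposite polarity, twice the amplitude
    y1, y2, y3 = model(x), model(x), model(x_big)
    return [a + b + c for a, b, c in zip(y1, y2, y3)]

fs = 100000   # sample rate in Hz (assumed)
residue = dnlr_residue(toy_ear, 200, fs)
residue_linear = dnlr_residue(lambda s: toy_ear(s, nonlinear=False), 200, fs)

peak = max(abs(v) for v in residue)
peak_linear = max(abs(v) for v in residue_linear)
print(f"non-linear residue peak: {peak:.4f}, linear residue peak: {peak_linear:.1e}")
```

In this toy case the direct click and the linear echo cancel, and only the compressive echo remains in the residue.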

Simulation of DPOAE

To simulate DPOAEs, the stimulus signal is generated by combining two tones of different frequencies. It's important to note that the combination is a weighted sum of the two tones: the lower frequency tone is played at 70 dB SPL while the higher frequency tone is played at 60 dB SPL. The DPOAE was then extracted from the model output by applying a high-order FIR bandpass filter centered on the cubic distortion frequency. The filter was designed using the window method. The table below shows the stimulus frequencies used and the resulting lower cubic distortion frequency.

f1	f2	2f1 - f2
500 Hz	610 Hz	390 Hz
1000 Hz	1220 Hz	780 Hz
1500 Hz	1830 Hz	1170 Hz
2000 Hz	2440 Hz	1560 Hz
3000 Hz	3660 Hz	2340 Hz
4000 Hz	4880 Hz	3120 Hz
6000 Hz	7320 Hz	4680 Hz
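
A minimal sketch of the stimulus weighting and of a window-method band-pass design for the 1000 Hz / 1220 Hz row of the table. The sample rate, the 801-tap length, the 100 Hz passband, the Hamming window, and the conversion of dB SPL to pascals are assumptions. The filter actually used in the simulations is not specified beyond "high-order FIR, window method".

```python
import math

def spl_to_peak_pa(level_db_spl):
    """Peak pressure (Pa) of a sine whose RMS level is level_db_spl re 20 uPa."""
    return 20e-6 * 10 ** (level_db_spl / 20.0) * math.sqrt(2.0)

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def bandpass_fir(num_taps, f_lo, f_hi, fs):
    """Window method: ideal band-pass impulse response times a Hamming window."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for k in range(num_taps):
        n = k - mid
        ideal = (2 * f_hi / fs) * sinc(2 * f_hi / fs * n) \
              - (2 * f_lo / fs) * sinc(2 * f_lo / fs * n)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def gain_at(taps, freq_hz, fs):
    """Magnitude of the filter's frequency response at freq_hz."""
    w = 2 * math.pi * freq_hz / fs
    re = sum(h * math.cos(w * k) for k, h in enumerate(taps))
    im = sum(h * math.sin(w * k) for k, h in enumerate(taps))
    return math.hypot(re, im)

fs = 16000                       # sample rate in Hz (assumed)
f1, f2 = 1000.0, 1220.0
f_dp = 2 * f1 - f2               # 780 Hz

a1 = spl_to_peak_pa(70.0)        # lower tone at 70 dB SPL
a2 = spl_to_peak_pa(60.0)        # higher tone at 60 dB SPL
stimulus = [a1 * math.sin(2 * math.pi * f1 * i / fs) +
            a2 * math.sin(2 * math.pi * f2 * i / fs) for i in range(fs // 10)]

taps = bandpass_fir(801, f_dp - 50.0, f_dp + 50.0, fs)
gain_dp = gain_at(taps, f_dp, fs)   # close to 1 at the distortion frequency
gain_f1 = gain_at(taps, f1, fs)     # strongly attenuated at the nearest stimulus tone
```

The filter has to be long because the distortion frequency sits only 220 Hz below f1. A shorter filter would have a transition band too wide to reject the stimulus.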

[Figures: Bandpass Filter Responses; DPOAE Impulse Response; DPOAE Frequency Response; DPOAE Distortion Frequency Energy]

Swept DPOAE

Similar to Click Evoked OAEs, Distortion Product OAEs can be used to measure a broadband otoacoustic emission. However, unlike the CEOAE stimulus, the DPOAE stimulus does not act instantaneously. For a broadband DPOAE, the stimulus is a pair of simultaneously swept tones at two different frequencies. In order to extract the resulting OAE, the recording must be filtered using a high-order bandpass filter that tracks the cubic distortion frequency over the duration of the sweep. Below is my attempt at implementing this approach for measuring a swept DPOAE.

Signal Generation

First, two sine sweeps, s1 and s2, are generated independently. The first tone, s1, is swept from 500 Hz to 6 kHz. The second tone, s2, is swept from 610 Hz to 7.32 kHz. The tones are then combined using a weighted sum; s2 is 10 dB lower than s1. The combined sweeps are then given to the Verhulst model as the stimulus signal.
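
The sweep generation described above can be sketched as follows. The sweep shape (linear in frequency), the 1 s duration, and the 32 kHz sample rate are assumptions, since none of them are stated. With these endpoints, a linear sweep keeps the instantaneous frequency ratio f2/f1 fixed at 1.22 throughout.

```python
import math

def inst_freq(f_start, f_end, duration_s, t):
    """Instantaneous frequency of a linear sweep at time t."""
    return f_start + (f_end - f_start) * t / duration_s

def linear_sweep(f_start, f_end, duration_s, fs, gain=1.0):
    """Sine sweep whose phase is the integral of the instantaneous frequency."""
    n = int(round(duration_s * fs))
    rate = (f_end - f_start) / duration_s   # Hz per second
    out = []
    for i in range(n):
        t = i / fs
        phase = 2 * math.pi * (f_start * t + 0.5 * rate * t * t)
        out.append(gain * math.sin(phase))
    return out

fs = 32000          # sample rate in Hz (assumed; above twice the 7.32 kHz end point)
duration_s = 1.0    # sweep length (assumed)

s1 = linear_sweep(500.0, 6000.0, duration_s, fs)
s2 = linear_sweep(610.0, 7320.0, duration_s, fs, gain=10 ** (-10 / 20))  # 10 dB lower
stimulus = [a + b for a, b in zip(s1, s2)]

def dp_freq(t):
    """Instantaneous lower cubic distortion frequency, 2*f1(t) - f2(t)."""
    return (2 * inst_freq(500.0, 6000.0, duration_s, t)
            - inst_freq(610.0, 7320.0, duration_s, t))
```

`dp_freq` gives the center frequency that a tracking band-pass filter would need to follow, running from 390 Hz at the start of the sweep to 4680 Hz at the end.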

Identifying and Designing BPFs

In order to extract the DPOAE, I used a pseudo-filterbank approach. The model output signal is processed using the STFT, and each resulting frame is filtered using a different BPF. Before the filters could be applied, each filter's parameters had to be identified and used to realize its impulse response. The bandwidth of each frame's corresponding BPF was calculated using the instantaneous lower cubic distortion frequency and the instantaneous s1 frequency.

Applying the BPF Filterbank and OAE Reconstruction using OLA

When doing STFT processing, overlap-and-add (OLA) is required to avoid time-domain aliasing or truncation of the output signal. A Hamming window was applied to the model output to obtain the N-th frame. The N-th frame, or windowed segment, was filtered using the N-th BPF of the filterbank. The output of the N-th BPF was then accumulated in the resulting OAE vector. The animation below shows the OLA process.
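
The overlap-add loop described above can be sketched as follows. The frame length, the 50% hop, the periodic form of the Hamming window, and the `filter_for_frame` callback are assumptions for illustration. Here the filtering is done by direct time-domain convolution, which gives the same result as zero-padded frequency-domain filtering.

```python
import math

def hamming_periodic(n):
    """Periodic Hamming window; at 50% overlap its shifted copies sum to 1.08."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / n) for i in range(n)]

def convolve(x, h):
    """Full linear convolution, length len(x) + len(h) - 1."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            y[i + j] += xv * hv
    return y

def ola_filter(signal, filter_for_frame, frame_len, hop):
    """Window each frame, filter it with its own FIR, and overlap-add the results.

    filter_for_frame(k) is a hypothetical callback returning the taps for
    frame k; all filters are assumed to have the same length.
    """
    window = hamming_periodic(frame_len)
    n_taps = len(filter_for_frame(0))
    out = [0.0] * (len(signal) + n_taps - 1)   # room for the filter tail
    k, start = 0, 0
    while start + frame_len <= len(signal):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        filtered = convolve(frame, filter_for_frame(k))
        for i, v in enumerate(filtered):
            out[start + i] += v                # accumulate, including the tail
        start += hop
        k += 1
    return out

# Sanity check with a pass-through filter: the output is the input scaled by 1.08.
signal = [1.0] * 64
out = ola_filter(signal, lambda k: [1.0], frame_len=16, hop=8)
```

Keeping the full convolution tail of each frame is what prevents truncation. Discarding it, or filtering by circular convolution without zero padding, would produce the time-domain aliasing mentioned above.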

DPOAE Spectrogram

Looking at the spectrogram of the resulting swept DPOAE, we can see that the s1 and s2 frequencies are removed; however, the sweep appears more spread out over time. This can be explained by two factors. First, the post-ringing of the BPF smears the output over time. Second, the scaling of the LTFAT spectrogram is terrible -- time-frequency coordinates that are at -100 dB(!!) are still a yellow-ish color.

Otoacoustic Emissions

Evan Shenkman
