Otoacoustic emissions (OAEs) are sounds produced by the ear when the basilar membrane of the cochlea is stimulated by a sound. When the brain detects a sound, the "tension" of the basilar membrane decreases to make the detection even easier. This phenomenon is analogous to an active amplifier. Thus, when excited, the outer hair cells vibrate and produce a nearly inaudible sound that propagates out of the cochlea and into the middle ear. In addition to the vibrating hair cells, intrinsic mechanical irregularities cause additional reflections off the basilar membrane that propagate into the middle ear.
This report simulates and analyzes three different types of OAEs.
The Click Evoked OAE (CEOAE), also known as the Transient Evoked OAE (TEOAE), is measured with an impulse-like, broadband pulse as the stimulus. Since the stimulus is broadband, it contains all frequencies of interest. Additionally, it is possible to extract frequency-band specific OAEs due to the temporal dispersion of frequencies. This dispersion occurs because of the distances different frequency components have to travel to reach their characteristic place along the basilar membrane. It takes low frequencies longer to travel towards the apex and back again than it takes high frequencies to reach their basal location.
There is, however, an inherent challenge to measuring CEOAEs. Since the stimulus is broadband, it cannot simply be removed by filtering the measurement. The stimulus also rings for several milliseconds before decaying, which masks the high-frequency components of the OAE that return first because of their short travel time. Because the OAE results from a non-linear, compressive basilar membrane response, it can be extracted using the derived non-linear residue (DNLR) technique.
The DNLR technique is useful for separating the non-linear component of a system's response from its linear component. The theory is simple: for a linear system H with input x, H(x) + H(x) - H(2x) = 0 by linearity. The cochlea, however, is non-linear, so the result is non-zero. This result, or residue, is the OAE!
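The cancellation argument can be checked numerically. The following is a minimal Python sketch, not the code used for this report: `linear_system` and `compressive_system` are hypothetical toy systems (a plain gain, and a gain plus a saturating `tanh` path) chosen only to show that the residue vanishes for a linear system and survives for a compressive one.

```python
import math

def linear_system(x, gain=0.5):
    """Hypothetical linear system: the output is a scaled copy of the input."""
    return [gain * v for v in x]

def compressive_system(x, gain=0.5):
    """Hypothetical non-linear system: a linear path plus a saturating path."""
    return [gain * v + 0.1 * math.tanh(v) for v in x]

def dnlr(system, x):
    """Derived non-linear residue: H(x) + H(x) - H(2x)."""
    y_single = system(x)
    y_double = system([2.0 * v for v in x])
    return [2.0 * a - b for a, b in zip(y_single, y_double)]

# One cycle of a sine wave as an arbitrary test input.
x = [math.sin(2 * math.pi * n / 16) for n in range(16)]

linear_residue = dnlr(linear_system, x)          # cancels completely
nonlinear_residue = dnlr(compressive_system, x)  # only the tanh path remains

print(max(abs(v) for v in linear_residue))
print(max(abs(v) for v in nonlinear_residue))
```

The linear path cancels in both cases, so whatever is left in `nonlinear_residue` comes entirely from the compressive term.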
The Tone-burst OAE (TBOAE), also known as the Stimulus-Frequency OAE (SFOAE), is measured with a single tone as the stimulus. This method is useful for measuring frequency-specific OAEs because it concentrates the energy on the characteristic place of interest. The temporal dispersion of frequencies still affects the measured OAE; however, since the OAE contains mostly just the tone frequency, the dispersion appears as latency.
The same issues with measuring the CEOAE arise when measuring the TBOAE. The DNLR technique can be used again to remove the stimulus artifact. Kalluri and Shera (2007) describe two more methods for separating the true TBOAE from the stimulus: one is two-tone suppression, the other is a signal processing approach based on spectral smoothing. Two-tone suppression takes advantage of the brain's unilateral control of the cochlear amplifier. Playing a higher tone at a higher level in one ear and the target tone in the other suppresses the OAE generated for the target tone. This suppressed measurement can then be compared against the normal measurement to isolate the OAE. The spectral smoothing approach attempts to extract the magnitude and phase of the OAE by convolving the measurement with a smoothing function.
References:
Kalluri, R., & Shera, C.A. (2007). Comparing stimulus-frequency otoacoustic emissions measured by compression, suppression and spectral smoothing. Journal of the Acoustical Society of America, 122, 3562-3575.
The Distortion Product OAE (DPOAE) makes use of the non-linear frequency distortions in the ear to measure the OAE at a frequency not present in the stimulus. The stimulus is made up of two tones played at the same time, for example, a 1 kHz tone (f1) and a 1.2 kHz tone (f2). When the inner ear processes the two tones, its non-linearity generates distortion products at new frequencies. One of the most prominent is the lower cubic distortion product. Since this OAE frequency is not present in the stimulus, the OAE can be extracted by applying either a high-order low-pass or band-pass filter. The lower cubic distortion product frequency is 2f1 - f2, which is 800 Hz for the example tones.
Typically, DPOAEs are measured by sweeping two tones while maintaining the ratio of their instantaneous frequencies. The recording is then filtered using a sliding bandpass filter to extract the broadband OAE over time.
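To make the fixed-ratio sweep concrete, here is a minimal Python sketch. It assumes an exponential (logarithmic) sweep, equal tone amplitudes, and a 16 kHz sample rate; none of these are stated in this report, and the function names are hypothetical. Because the second tone's phase is `ratio` times the first tone's phase, the instantaneous frequencies keep the ratio f2/f1 at every instant, and `dp_track` gives the frequency a sliding bandpass filter would need to follow.

```python
import math

def sweep_freq(t_s, f_start_hz, f_end_hz, duration_s):
    """Instantaneous frequency of an exponential sweep at time t_s."""
    return f_start_hz * (f_end_hz / f_start_hz) ** (t_s / duration_s)

def sweep_phase(t_s, f_start_hz, f_end_hz, duration_s):
    """Phase in radians: 2*pi times the integral of sweep_freq from 0 to t_s."""
    k = f_end_hz / f_start_hz
    scale = 2 * math.pi * f_start_hz * duration_s / math.log(k)
    return scale * (k ** (t_s / duration_s) - 1.0)

def swept_two_tone(fs_hz, duration_s, f1_start_hz, f1_end_hz, ratio):
    """Sum of two swept tones whose instantaneous frequencies satisfy f2 = ratio * f1."""
    n_samples = int(round(fs_hz * duration_s))
    stimulus = []
    for i in range(n_samples):
        phi1 = sweep_phase(i / fs_hz, f1_start_hz, f1_end_hz, duration_s)
        stimulus.append(math.sin(phi1) + math.sin(ratio * phi1))
    return stimulus

def dp_track(t_s, f1_start_hz, f1_end_hz, duration_s, ratio):
    """Lower cubic distortion frequency 2*f1 - f2 at time t_s."""
    f1 = sweep_freq(t_s, f1_start_hz, f1_end_hz, duration_s)
    return 2.0 * f1 - ratio * f1

# Assumed example values: f1 sweeps 500 Hz to 6000 Hz in 1 s with f2/f1 = 1.22.
stimulus = swept_two_tone(16000, 1.0, 500.0, 6000.0, 1.22)
dp_start_hz = dp_track(0.0, 500.0, 6000.0, 1.0, 1.22)
dp_end_hz = dp_track(1.0, 500.0, 6000.0, 1.0, 1.22)
```

With these values the distortion product sweeps from about 390 Hz to about 4680 Hz, so the bandpass center frequency would be updated along `dp_track` as the recording is processed.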
To simulate the OAEs, I used the Auditory Modeling Toolbox (AMToolbox). The AMToolbox is an open-source Matlab toolbox that contains several psychoacoustic models developed and introduced in highly-cited research papers. For simulating OAEs, I used the Verhulst transmission line model. This model captures the non-linear components that characterize the cochlea's compressive growth (the active amplifier) as well as the reflective response of the basilar membrane, and it simulates the resulting OAE response.
References:
Verhulst, S., Dau, T., & Shera, C.A. (2012). Nonlinear time-domain cochlear model for transient stimulation and human otoacoustic emission. Journal of the Acoustical Society of America, 132(6), 3842-3848.
Altoe, A., Verhulst, S., & Pulkki, V. (2014). Transmission line cochlear models: improved accuracy and efficiency. Journal of the Acoustical Society of America, 136, EL302.
Auditory Modeling Toolbox
The Matlab code written to run these simulations and generate these plots can be found here...
To simulate the CEOAEs, I created 100 microsecond pulses and used them as the input to the Verhulst model. In order to successfully capture the OAE, the stimulus itself must be removed from the recording. There are several ways to do this.
Change the model parameters
The Verhulst model is programmed so that the irregularities and non-linearities of the basilar membrane can be programmatically turned on or off during a simulation. If we run the model twice with the same pulse signal -- one with the irregularities on, and the other with the irregularities off -- subtracting the second output from the first output yields the OAE. This is referred to and plotted as the "Model Reference OAE".
Window the model output
Windowing the output signal is another approach to removing the stimulus artifact. Here, windowing simply discards the recorded output before the OAE is expected to arrive. I applied a rectangular window that zeroed out everything from t = 0 ms to t = 5 ms.
This method is easily the worst method for a couple of reasons. First, windowing does not remove the residual effects of the stimulus artifact beyond t = 5 ms. Second, the act of windowing induces spectral leakage (convolution in the frequency domain) that is evident in the frequency response of the Windowed OAE. The effect of leakage can be mitigated by selecting a different window, but some leakage will always occur.
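The rectangular window described above amounts to zeroing every sample before the cutoff time. A minimal Python sketch follows; the 48 kHz sample rate and the placeholder recording are assumptions for illustration, not values taken from the simulations.

```python
def zero_before(signal, fs_hz, cutoff_s):
    """Zero every sample earlier than cutoff_s and keep the rest unchanged."""
    n_cut = min(len(signal), int(round(cutoff_s * fs_hz)))
    return [0.0] * n_cut + list(signal[n_cut:])

fs_hz = 48000            # assumed sample rate
recording = [1.0] * 480  # placeholder 10 ms recording (all ones)

# Discard the first 5 ms, where the stimulus artifact dominates.
windowed = zero_before(recording, fs_hz, 0.005)
```

The abrupt step from zero to the retained samples at 5 ms is what produces the leakage discussed above.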
Implement the DNLR technique
For the DNLR technique, three simulations were run. The first two used identical clicks. The third used a click of opposite polarity and twice the amplitude (approximately +6 dB) of the other two stimuli. The OAE was then calculated by summing the three simulation outputs, which cancels the linear response and leaves the non-linear OAE component.
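The three-stimulus procedure can be sketched in Python as follows. This is only an illustration: `toy_model` is a hypothetical stand-in with a linear path and a compressive path, not the Verhulst model, and the 100 kHz sample rate is an assumption made so that a 100-microsecond click spans a whole number of samples.

```python
import math

def make_click(fs_hz, width_s, n_samples, amplitude=1.0):
    """Rectangular click of the given width followed by silence."""
    n_on = max(1, int(round(width_s * fs_hz)))
    return [amplitude if n < n_on else 0.0 for n in range(n_samples)]

def toy_model(x):
    """Hypothetical stand-in for the cochlear model (NOT the Verhulst model)."""
    return [0.5 * v + 0.1 * math.tanh(v) for v in x]

fs_hz = 100000                          # assumed sample rate
click = make_click(fs_hz, 100e-6, 50)   # 100 microsecond click
inverted_double = [-2.0 * v for v in click]

# Doubling the amplitude corresponds to a level step of about 6.02 dB.
level_step_db = 20 * math.log10(2.0)

# Two runs with the click, one run with the inverted, doubled click.
outputs = [toy_model(click), toy_model(click), toy_model(inverted_double)]

# Summing the three outputs cancels the linear path and keeps the residue.
residue = [a + b + c for a, b, c in zip(*outputs)]
```

In the toy model the residue is non-zero only while the click is on, because the model has no memory; a real cochlear model would produce a delayed, dispersed residue.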
To simulate TBOAEs, I created several tone bursts at the following frequencies: 500 Hz, 1000 Hz, 1500 Hz, 2000 Hz, 3000 Hz, 4000 Hz, and 6000 Hz. Each tone burst was made 10 ms long so that each stimulus contains the same amount of energy. The TBOAE can then be calculated in two different ways: by changing the model parameters or by using the DNLR technique.
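A minimal Python sketch of the tone-burst stimuli is given below. The 48 kHz sample rate, unit amplitude, and rectangular envelope (no onset or offset ramp) are assumptions for illustration. Each listed frequency completes a whole number of cycles in 10 ms, so the summed squared samples come out the same for every burst.

```python
import math

def tone_burst(freq_hz, duration_s, fs_hz, amplitude=1.0):
    """Sine tone burst with a rectangular envelope (no ramps, by assumption)."""
    n_samples = int(round(duration_s * fs_hz))
    return [amplitude * math.sin(2 * math.pi * freq_hz * k / fs_hz)
            for k in range(n_samples)]

fs_hz = 48000  # assumed sample rate
freqs_hz = [500, 1000, 1500, 2000, 3000, 4000, 6000]

# One 10 ms burst per test frequency.
bursts = {f: tone_burst(f, 0.010, fs_hz) for f in freqs_hz}

# Energy as the sum of squared samples; equal across all bursts.
energies = {f: sum(v * v for v in burst) for f, burst in bursts.items()}

for f in freqs_hz:
    print(f, round(energies[f], 6))
```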
To simulate DPOAEs, the stimulus signal is generated by combining two tones of different frequencies. It's important to note that the combination is a weighted sum of the two tones: the lower frequency tone is played at 70 dB SPL while the higher frequency tone is played at 60 dB SPL. The DPOAE was then extracted from the model output by applying a high-order FIR bandpass filter centered on the cubic distortion frequency. The filter was designed using the window method. The table below shows the stimulus frequencies used and the resulting lower cubic distortion frequency.
f1 | f2 | 2f1 - f2 |
---|---|---|
500Hz | 610Hz | 390Hz |
1000Hz | 1220Hz | 780Hz |
1500Hz | 1830Hz | 1170Hz |
2000Hz | 2440Hz | 1560Hz |
3000Hz | 3660Hz | 2340Hz |
4000Hz | 4880Hz | 3120Hz |
6000Hz | 7320Hz | 4680Hz |
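The table and the weighted sum can be reproduced with a short Python sketch. The ratio f2/f1 = 1.22 is inferred from the table rather than stated in the text, and the conversion from dB SPL to pascals (20 micropascal reference, sine peak equal to the RMS value times the square root of 2) is a standard convention that this report does not spell out. The sample rate and duration in the example call are likewise assumptions.

```python
import math

P_REF_PA = 20e-6  # standard reference pressure for dB SPL

def lower_cubic_dp(f1_hz, f2_hz):
    """Lower cubic distortion product frequency: 2*f1 - f2."""
    return 2 * f1_hz - f2_hz

def spl_to_peak_pa(level_db_spl):
    """Peak pressure in Pa of a sine tone at the given level in dB SPL."""
    rms_pa = P_REF_PA * 10 ** (level_db_spl / 20)
    return math.sqrt(2) * rms_pa

def two_tone(f1_hz, f2_hz, l1_db, l2_db, fs_hz, duration_s):
    """Weighted sum of two tones, each scaled to its own level."""
    a1, a2 = spl_to_peak_pa(l1_db), spl_to_peak_pa(l2_db)
    n_samples = int(round(duration_s * fs_hz))
    return [a1 * math.sin(2 * math.pi * f1_hz * k / fs_hz)
            + a2 * math.sin(2 * math.pi * f2_hz * k / fs_hz)
            for k in range(n_samples)]

f1_list_hz = [500, 1000, 1500, 2000, 3000, 4000, 6000]
table = []
for f1 in f1_list_hz:
    f2 = round(1.22 * f1)  # ratio inferred from the table above
    table.append((f1, f2, lower_cubic_dp(f1, f2)))

# Example stimulus: first table row, 70 dB SPL and 60 dB SPL, 50 ms at 48 kHz.
stimulus = two_tone(500, 610, 70, 60, 48000, 0.050)
```

The 10 dB level difference means the lower tone's amplitude is about 3.16 times that of the higher tone.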
Like Click Evoked OAEs, Distortion Product OAEs can be used to measure a broadband otoacoustic emission. However, unlike the CEOAE stimulus, the DPOAE stimulus does not act instantaneously. For a broadband DPOAE, a swept stimulus consisting of two tones at different frequencies is used. To extract the resulting OAE, the recording must be filtered using a high-order bandpass filter that tracks the cubic distortion frequency over the duration of the sweep. Below is my attempt at implementing this approach for measuring a swept DPOAE.