【BI&AI】Lecture 5 - Auditory system

Terminology

auditory system
pinna
auditory canal
tympanic membrane
cochlea
ossicles
auditory-vestibular nerve
oval window
attenuation reflex
tensor tympani muscle
stapedius muscle
perilymph
endolymph
basilar membrane
organ of Corti
inner hair cells
outer hair cells
stereocilia
spiral ganglion cells
frequency tuning
tonotopy
phase locking
sound localization

Course Outline

[Image]

Properties of sound

Q: How is sound produced?
A: Sound is produced by the vibration of an object. When an object vibrates, it creates mechanical waves in the surrounding medium, such as air, water, or a solid. These waves are called sound waves. Sound waves travel through interactions between the molecules or particles of the medium, eventually reaching our ears, where we perceive them through hearing.
[Image]

Frequency / Pitch

The frequency of a sound describes the periodic character of the sound wave's vibration: the number of vibrations per second. It is usually expressed in hertz (Hz). The higher the frequency, the higher the pitch; the lower the frequency, the lower the pitch.
[Image]
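As a quick illustration of "vibrations per second", here is a minimal Python sketch (numpy only; the 440 Hz and 880 Hz tones are arbitrary example values, not from the lecture) that synthesizes pure tones and counts their cycles:

```python
import numpy as np

SAMPLE_RATE = 44100  # samples per second

def pure_tone(freq_hz, duration_s=1.0):
    """Synthesize a sine wave that vibrates freq_hz times per second."""
    t = np.arange(int(duration_s * SAMPLE_RATE)) / SAMPLE_RATE
    return np.sin(2 * np.pi * freq_hz * t)

def cycles_per_second(signal, duration_s=1.0):
    """Estimate frequency by counting upward zero crossings per second."""
    crossings = np.sum((signal[:-1] < 0) & (signal[1:] >= 0))
    return crossings / duration_s

low = pure_tone(440)    # 440 Hz: the musical note A4
high = pure_tone(880)   # 880 Hz: one octave higher, i.e. double the frequency
print(cycles_per_second(low))   # ~440 -> perceived as the lower pitch
print(cycles_per_second(high))  # ~880 -> perceived as the higher pitch
```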

Intensity / Loudness

Intensity is a property that describes the energy or power of a sound; it is also referred to as volume or loudness. It represents the relative strength of a sound and is usually measured in decibels (dB). Higher decibel values correspond to greater sound intensity or volume, while lower decibel values correspond to lower intensity or volume.
[Image]
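For concreteness, the decibel scale is logarithmic: sound pressure level in dB is 20·log10(p/p0), with p0 = 20 µPa as the conventional reference near the threshold of hearing. Below is a minimal sketch; the example pressures are illustrative assumptions, not values from the lecture.

```python
import math

P_REF = 20e-6  # reference pressure (20 micropascals), roughly the threshold of hearing

def sound_pressure_level_db(pressure_pa):
    """Convert a sound pressure in pascals to sound pressure level in dB SPL."""
    return 20 * math.log10(pressure_pa / P_REF)

# Each factor of 10 in pressure adds 20 dB; doubling the pressure adds about 6 dB.
print(round(sound_pressure_level_db(20e-6)))  # 0 dB: threshold of hearing
print(round(sound_pressure_level_db(2e-2)))   # 60 dB: pressure 1000x the threshold
print(round(sound_pressure_level_db(2.0)))    # 100 dB: pressure 100000x the threshold
```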

Hearing range

The picture below shows the hearing ranges of several species. The human hearing range is roughly 20 Hz to 20 kHz, but it varies between individuals, and with age humans may lose the ability to perceive some high-frequency sounds.
[Image]
[Image]

Structure of the human ear

In the human ear, sound waves first enter the outer ear, traveling from the pinna into the auditory canal, and are conducted through the external auditory canal to the middle ear. The middle ear consists of the tympanic membrane (eardrum) and a chain of three small bones (the ossicles): the malleus, incus, and stapes. When sound waves set the eardrum in motion, the ossicular chain vibrates and transmits the vibration to the oval window. As the vibration enters the inner ear through the oval window, it sets the fluid inside the cochlea in motion and activates the hair cells of the inner ear, which generate electrical signals. These signals travel along nerve fibers into the auditory nerve and on to the auditory cortex of the brain. In the auditory cortex, the signals are decoded and interpreted, allowing us to perceive and understand sound.
[Image]

The attenuation reflex

Q: Our own voice is transmitted to the ear through bone and tissue, making the eardrum vibrate strongly. Why doesn't this damage our hearing?
A: Our auditory system has a mechanism called the attenuation reflex, which automatically adjusts the ear's sensitivity to avoid overstimulation and damage. The attenuation reflex reduces the intensity of transmitted sound by contracting muscles in the middle ear. When we speak or make sounds, it automatically damps the vibration of the eardrum and ossicles, reducing the intensity of our own voice.
Two muscles are involved: the tensor tympani muscle and the stapedius muscle.
A loud sound triggers a neural response that causes these muscles to contract. This response is called the attenuation reflex.
Muscle contraction -> the ossicular chain becomes stiffer, so less vibration is passed on to the inner ear.

[Image]

The cochlea

How does the cochlea convert vibrations into electrical signals? The image below is a magnified cross-section of the cochlea.
[Image]
When sound vibrations are transmitted to the endolymph in the cochlea, they cause the tiny hairs (stereocilia) on the inner and outer hair cells to bend. These hairs are attached to the hair cells' membranes and carry ion channels. When the hairs bend, the ion channels open, allowing charged ions (mainly potassium ions) to enter or leave the hair cells. The opening and closing of these channels changes the potential across the hair cell membrane, producing an electrical signal. These signals are then carried by nerve fibers to the auditory nerve and ultimately to the auditory cortex of the brain for processing and interpretation.

The basilar membrane is located inside the cochlea and runs along its entire length, from the base (the narrow end, near the middle ear) to the apex (the wide end, at the tip of the cochlea). It is made of elastic connective tissue whose thickness and stiffness vary along its length. Its main function in hearing is to separate and analyze sound frequencies. When sound waves enter the cochlea, they create pressure waves in the inner-ear fluid. These pressure waves make the basilar membrane oscillate, and the amplitude and location of the oscillation depend on the frequency of the sound.

Because its width and stiffness vary along its length, the basilar membrane has a tapered, trapezoid-like shape. Different frequencies therefore produce maximal vibration at different locations: high-frequency sounds make the narrow basal end vibrate most, while low-frequency sounds make the wide apical end vibrate most.

This frequency specificity allows the basilar membrane to separate and resolve the different frequency components of sound. When specific areas of the basilar membrane vibrate, the hair cells connected to it are stimulated and generate electrical signals, which are then passed to the brain for further processing and interpretation.
[Image]
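A commonly used quantitative description of this frequency-to-place mapping is the Greenwood function. The sketch below uses frequently cited human parameters (A ≈ 165.4, a ≈ 2.1, k ≈ 0.88) purely as an illustrative assumption; the lecture itself does not give these numbers.

```python
import math

# Greenwood place-frequency function for the human cochlea.
# Assumed parameters (commonly cited values): A = 165.4, a = 2.1, k = 0.88.
A, a, k = 165.4, 2.1, 0.88

def place_to_frequency(x):
    """Best frequency (Hz) at relative position x on the basilar membrane (0 = apex, 1 = base)."""
    return A * (10 ** (a * x) - k)

def frequency_to_place(freq_hz):
    """Relative position on the basilar membrane that vibrates most strongly for freq_hz."""
    return math.log10(freq_hz / A + k) / a

print(f"apex (x=0): ~{place_to_frequency(0.0):.0f} Hz")   # ~20 Hz   -> low frequencies
print(f"base (x=1): ~{place_to_frequency(1.0):.0f} Hz")   # ~20.7 kHz -> high frequencies
for f in (100, 1000, 10000):
    print(f"{f:>5} Hz peaks at x = {frequency_to_place(f):.2f}")
# Higher frequencies map to positions near the narrow, stiff base;
# lower frequencies map to positions near the wide, flexible apex.
```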

Q: What is the relationship between the lymph fluids and the basilar membrane?
A: Inside the cochlea, the basilar membrane lies between the endolymph and the perilymph. Endolymph fills the middle chamber of the cochlea (the scala media), while perilymph fills the chambers above and below it (the scala vestibuli and scala tympani). The basilar membrane is an elastic membrane in contact with endolymph on its upper surface and perilymph on its lower surface.
When sound enters the cochlea and sets up pressure waves in the fluid, these pressure waves are transmitted across the basilar membrane. Because of its special structure and elastic properties, sounds of different frequencies cause vibration at different locations along the basilar membrane.
The vibrating basilar membrane bends the tiny hairs on the hair cells, activating them to produce electrical signals. These signals are then transmitted through the auditory nerve to the brain for processing and interpretation, resulting in our perception of sound.

organ of Corti

The inner hair cells and outer hair cells mentioned above are in fact components of the organ of Corti, which sits on top of the basilar membrane. These hair cells all carry tiny hairs (stereocilia) and are connected to rod-shaped supporting cells on the basilar membrane.
The organ of Corti is mainly composed of the following parts:
[Image]
The rods of Corti are elongated supporting-cell structures. They extend from the basilar membrane and brace the inner and outer hair cells between them. The rods of Corti help maintain the structural stability of the organ of Corti and provide support and protection for the inner and outer hair cells.

Audition & Vision

The comparison between the auditory system and the visual system is summarized as follows:
Regarding the visual system, you can refer to this blog: Lecture 2 - Visual System

  • Auditory system: converts physical (sound) signals into electrical signals. When sound enters the cochlea in the inner ear, it triggers pressure waves in the inner-ear fluid (the lymph). These pressure waves travel along the basilar membrane and bend the inner and outer hair cells, generating electrical signals. The signals are processed at the spiral ganglion cells and then transmitted to the brain through the auditory nerve. After complex processing and interpretation, we finally perceive and understand the sound.
  • Visual system: converts light signals into electrical signals to achieve visual perception. When light enters the eye, it is focused onto the retina by the eye's transparent structures, such as the cornea and lens. The retina is the light-sensitive layer at the back of the eye and consists of multiple cell layers. The first cells to receive the light signal are the photoreceptors, the rods and cones. Their electrical signals are passed to the next layer, the bipolar cells, and then to the retinal ganglion cells. The axons of the ganglion cells converge to form the optic nerve, which carries the electrical signals from the retina to the visual cortex of the brain.
[Image]

The primary auditory pathway

As shown in the brain slice below, electrical signals travel from the spiral ganglion cells through the auditory nerve to the ventral cochlear nucleus and dorsal cochlear nucleus in the medulla; these two nuclei are on the same side as the ear (ipsilateral). The signal is then passed to the superior olive, then through the lateral lemniscus to the inferior colliculus in the midbrain, and finally to the MGN (medial geniculate nucleus) of the thalamus, which relays it to the auditory cortex.
[Image]

Encoding sound intensity and frequency

A neuron's response to sound is strongest at one specific frequency, called the neuron's characteristic frequency, and weaker at neighboring frequencies. This frequency tuning property is seen in many relay neurons along the pathway from the cochlea to the cortex.
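As a minimal sketch of frequency tuning, the hypothetical neuron below fires most at its characteristic frequency and progressively less for neighboring frequencies. The Gaussian-on-log-frequency shape and all parameter values are illustrative assumptions, not the lecture's model.

```python
import numpy as np

def tuning_curve(freq_hz, cf_hz=2000.0, bandwidth_octaves=0.5, max_rate=100.0):
    """Firing rate (spikes/s) of a hypothetical neuron with characteristic frequency cf_hz.
    The response falls off as a Gaussian in octave distance from the characteristic frequency."""
    octave_distance = np.log2(freq_hz / cf_hz)
    return max_rate * np.exp(-0.5 * (octave_distance / bandwidth_octaves) ** 2)

for f in (500, 1000, 2000, 4000, 8000):
    print(f"{f:>5} Hz -> {tuning_curve(f):6.1f} spikes/s")
# Maximum response at the characteristic frequency (2000 Hz here);
# the response drops quickly for frequencies above or below it.
```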

As in the visual pathway, the response properties of cells become more diverse and complex as one progresses up the auditory pathway from the brainstem to the cerebral cortex.

  • For example, some cells in the cochlear nucleus are sensitive to frequencies that change over time (think of a trombone sound sliding from low to high).
  • In the medial geniculate nucleus, some cells respond to more complex sounds, such as speech, and others show simple frequency selectivity similar to that of the auditory nerve.

Information about sound intensity & frequency

Information about sound intensity is encoded in two interrelated ways:

  • the firing rates of neurons
  • the number of active neurons (both are illustrated in the sketch below)
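A minimal sketch of these two intensity codes (all thresholds and rates are made-up illustrative values): each model neuron has its own threshold, fires faster as the level rises above it, and louder sounds recruit more neurons.

```python
import numpy as np

# Hypothetical population: each neuron starts responding at a different level (dB).
thresholds_db = np.array([10, 20, 30, 40, 50, 60, 70])
MAX_RATE = 200.0       # spikes/s at saturation
DYNAMIC_RANGE = 40.0   # dB over which a neuron's rate climbs from 0 to MAX_RATE

def firing_rates(level_db):
    """Firing rate of every neuron: silent below threshold, then rising to saturation."""
    drive = np.clip((level_db - thresholds_db) / DYNAMIC_RANGE, 0.0, 1.0)
    return MAX_RATE * drive

for level in (15, 45, 75):
    rates = firing_rates(level)
    print(f"{level} dB: {np.sum(rates > 0)} active neurons, "
          f"mean rate {rates.mean():.0f} spikes/s")
# Louder sounds raise the firing rate of already-active neurons
# and at the same time recruit additional neurons into activity.
```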

Information about sound frequencies is encoded in two ways:

  • Tonotopy: an orderly mapping of frequency to position in the basilar membrane, spiral ganglion, cochlear nucleus, medial geniculate nucleus, and auditory cortex.
  • Phase locking: the timing of a neuron's firing is aligned with the phase of the sound wave, i.e., the neuron fires consistently at the same phase of the wave.

Q: What is tonotopy?
A: In the auditory system, tonotopy refers to the spatial arrangement of sound frequencies in the nervous system. Specifically, sounds of different frequencies are encoded and represented at specific locations within the different regions and structures of the auditory pathway.
The first manifestation of tonotopy can be observed on the basilar membrane of the inner ear, a structure that carries thousands of tiny sensory cells called hair cells. The basilar membrane is narrow and stiff at the base and becomes wider and more flexible toward the apex. Sounds of different frequencies produce maximal activation at different locations along it: high-frequency sounds activate the base and low-frequency sounds activate the apex. This frequency-based spatial encoding creates tonotopy on the basilar membrane. Tonotopy is also present in other structures of the auditory pathway: in the spiral ganglion, cochlear nucleus, medial geniculate nucleus, and auditory cortex, the spatial arrangement of neurons is likewise related to sound frequency. A sound of a given frequency activates the group of neurons with the corresponding frequency preference, and sounds of different frequencies activate groups of neurons at different spatial locations.
Through tonotopy, the auditory system is able to encode and process sound frequency at the neural level, allowing us to perceive and distinguish sounds of different frequencies. This spatial arrangement helps keep sound information organized as it is transmitted and processed.

For example: the figure below shows part of the auditory processing chain. The basilar membrane widens gradually from base to apex; high-frequency sounds activate the base and low-frequency sounds activate the apex, so the basilar membrane encodes frequency by position. This place code is passed to the spiral ganglion, which is also position-coded, and then to the neurons of the cochlear nucleus in the brainstem, which preserve the position code: the anterior part represents low frequencies and the posterior part high frequencies. The more neurons activated in a given part, the greater the intensity at that frequency. In this way both frequency and intensity can be encoded through tonotopy.
[Image]

Q: What is phase locking?
A: Phase locking is a phenomenon in which the firing of neurons remains synchronized and consistent at a specific phase of a sound wave . When the sound waveform changes in a periodic manner, certain neurons generate action potentials at specific points in the sound waveform, forming phase locking.
This phase-locked phenomenon keeps the firing of neurons highly synchronized with the periodic changes in sound. The neuron fires when the sound waveform reaches certain phases , but does not generate action potentials at other phases. This allows the neurons to accurately encode the frequency information of the sound. Phase locking is particularly prominent in the processing of low-frequency sounds , because the periodic changes in low-frequency sounds are more likely to match the firing phase of neurons. However, the effect of phase locking may be diminished for high-frequency sounds because the periodicity of high-frequency sounds changes faster and neurons may not be able to keep up with changes in their phase.
By phase-locking, neurons are able to encode the frequency information of a sound in a precise temporal manner , which is crucial for us to perceive and distinguish sounds of different frequencies.

For example: in the low-frequency waveform in the middle of the figure below, the neuron responds only at one particular phase. It may fire at that phase on every cycle, or only on some of the cycles; for high-frequency signals the cycles come too quickly, so many cycles are missed and produce no spike.
[Image]
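A minimal sketch of phase locking under simplifying assumptions (here the only limit is a ~1 ms refractory period; the firing probability and the 200 Hz vs 4 kHz comparison are illustrative choices): the model neuron can only fire near one phase of the wave, and at high frequencies it skips many cycles.

```python
import numpy as np

rng = np.random.default_rng(0)
REFRACTORY_S = 0.001   # ~1 ms: the neuron cannot fire again immediately after a spike
FIRE_PROB = 0.7        # chance of firing on a cycle when the neuron is able to fire

def phase_locked_spikes(freq_hz, duration_s=0.05):
    """Spike times of a model neuron that can fire only at one fixed phase of the wave."""
    n_cycles = int(round(duration_s * freq_hz))
    preferred_times = np.arange(n_cycles) / freq_hz   # the preferred phase of each cycle
    spikes, last_spike = [], -np.inf
    for t in preferred_times:
        if t - last_spike >= REFRACTORY_S and rng.random() < FIRE_PROB:
            spikes.append(t)
            last_spike = t
    return np.array(spikes)

for f in (200, 4000):
    spikes = phase_locked_spikes(f)
    # Distance of each spike from the preferred phase, in fractions of a cycle:
    deviation = np.abs(((spikes * f + 0.5) % 1.0) - 0.5)
    print(f"{f} Hz: {len(spikes)} spikes out of {int(round(0.05 * f))} cycles, "
          f"max phase deviation {deviation.max():.3f} cycles")
# At 200 Hz the neuron follows most cycles; at 4000 Hz the period is shorter than
# the refractory period, so many cycles are skipped -- but every spike that does
# occur still lands at the same phase of the wave.
```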

Encoding sound intensity and frequency

Multiple neurons working together can provide a temporal code for frequency: even if no single neuron fires on every cycle, the pooled activity of the population marks every cycle of the sound wave.
[Image]
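A minimal sketch of this population temporal code (all parameters are illustrative assumptions): each model neuron phase-locks but fires on only a random subset of cycles, yet the pooled spike train of the group marks nearly every cycle, so its inter-spike intervals still reveal the stimulus period.

```python
import numpy as np

rng = np.random.default_rng(1)
FREQ_HZ, DURATION_S, N_NEURONS = 1000.0, 0.1, 8   # a 1 kHz tone, 100 ms, 8 neurons

def sparse_phase_locked_spikes(fire_prob=0.3):
    """One neuron: fires at the preferred phase, but only on a random 30% of cycles."""
    n_cycles = int(round(DURATION_S * FREQ_HZ))
    preferred_times = np.arange(n_cycles) / FREQ_HZ
    return preferred_times[rng.random(n_cycles) < fire_prob]

# Merge the spike trains of the whole population.
population = np.sort(np.concatenate(
    [sparse_phase_locked_spikes() for _ in range(N_NEURONS)]))

n_cycles = int(round(DURATION_S * FREQ_HZ))
covered_cycles = np.unique(np.round(population * FREQ_HZ).astype(int))
print(f"population marks {covered_cycles.size} of {n_cycles} cycles "
      f"(each neuron alone marks only ~30%)")

# Non-zero inter-spike intervals of the pooled train cluster at one stimulus period (1 ms),
# so the population as a whole carries the frequency in its spike timing.
intervals = np.diff(population)
typical = np.median(intervals[intervals > 1e-9])
print(f"typical pooled inter-spike interval: {typical * 1000:.2f} ms")
```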

Mechanisms of sound localization

Sound localization is essential for survival. We can use the time difference between the left and right ears to localize a sound to the left or right, and we can also use differences in loudness between the two ears as a localization cue.
Different techniques are used to locate the source in the horizontal plane (left and right) and the vertical plane (up and down).
[Image]
The time difference between the sound reaching the left ear and the right ear is at most only about 0.6 ms. This interval cannot be represented by a neuron's firing rate, because a single action potential already takes roughly 1 ms to complete.
How do neurons do this?

The first structure to harbor binaural neurons is the superior olive. As mentioned above, the neural signal is transmitted from the cochlear nucleus to the superior olive, and both structures exist on both sides of the brainstem.
Suppose the sound comes from the left. The axons of the left cochlear nucleus fire first; their signal enters the superior olive at neuron 1 and then propagates along the axon toward the other end. Shortly afterwards, the sound reaches the right ear and the right cochlear nucleus fires; its signal enters from the opposite side and quickly reaches neuron 3, while the signal from the left, which has travelled a longer distance along the axon, arrives at neuron 3 at the same moment. The superior olive can therefore detect the time difference between the two ears from which neuron receives coincident inputs. Through this spatial arrangement of axonal delays, the interaural time difference is converted into a place code, and the coincidence of the left and right inputs signals the direction of the sound.
[Image]
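A minimal sketch of both ideas under simple assumptions (an ear separation of about 0.2 m and a speed of sound of 343 m/s give a maximum ITD of roughly 0.58 ms, matching the ~0.6 ms above; the delay-line values are illustrative): the interaural time difference is computed from the source azimuth, and a Jeffress-style array of coincidence detectors converts it into a place code.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s in air
EAR_SEPARATION = 0.20    # m between the two ears (an assumed, rounded value)

def interaural_time_difference(azimuth_deg):
    """Approximate ITD (s) for a source at the given azimuth (0 = straight ahead, +/-90 = to one side)."""
    return EAR_SEPARATION * np.sin(np.radians(azimuth_deg)) / SPEED_OF_SOUND

print(f"maximum ITD (source at 90 deg): {interaural_time_difference(90) * 1e3:.2f} ms")  # ~0.58 ms

# Jeffress-style delay line: each coincidence detector adds a different internal delay
# to one input; the detector whose internal delay exactly cancels the external ITD
# receives both inputs at the same moment and therefore responds most strongly.
internal_delays_s = np.linspace(-0.6e-3, 0.6e-3, 13)   # one value per detector

def best_detector(itd_s):
    """Index of the coincidence detector whose internal delay best cancels the ITD."""
    return int(np.argmin(np.abs(internal_delays_s + itd_s)))

for az in (-90, -30, 0, 30, 90):
    itd = interaural_time_difference(az)
    print(f"azimuth {az:>4} deg -> ITD {itd * 1e3:+.2f} ms -> detector #{best_detector(itd)}")
# The sound's direction is read out from *which* detector responds most: a place code for ITD.
```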
Vertical plane: Reflection produced by the pinna
• The curve of the outer ear is critical in assessing the height of a sound source.
• Bumps and ridges create reflections of incoming sound.
• The delay between the direct path and the reflected path changes as the sound source moves vertically.
• There are subtle differences in the combined sound, both direct and reflected, coming from above or below.
[Image]

Auditory cortex

In the figure below, A1 denotes the primary auditory cortex. Like other cortical areas, the auditory cortex has six layers. Neurons of the primary auditory cortex are mainly distributed in layers II and III. These neurons perform preliminary decoding and encoding of basic sound features such as frequency, timing, and intensity. In addition, neurons in the primary auditory cortex are involved in binaural sound localization and the spatial encoding of sound.

  • Layer I: Also called the molecular layer, it is mainly composed of nerve fibers and synaptic connections. This layer plays a role in regulating and modulating information transmission.
  • Layer II: Also called the outer granular layer, it mainly contains neurons that receive incoming information from other brain areas. These neurons perform preliminary analysis and encoding of the frequency, intensity, and timing of sounds.
  • Layer III: also called the outer pyramidal layer, contains many neurons of different shapes. This layer is involved in more advanced feature extraction and encoding, such as spectral analysis of sounds and pattern recognition of complex sounds.
  • Layer IV: Also called the inner granular layer, it receives incoming information from the thalamic auditory relay (the medial geniculate nucleus). This layer further analyzes and encodes the frequency, intensity, and timing of sounds.
  • Layer V: Also called the inner pyramidal layer, it contains many projection neurons that transmit processed auditory information to other brain areas, such as other areas of the cerebral cortex .
  • Layer VI: Also known as the multiform layer, it contains various types of neurons that respond to feedback from other brain areas and participate in integrating and regulating the processing of auditory information.

[Image]

Auditory Cortex: Complex patterns

An experiment was conducted on a marmoset: its own call was recorded and then played back both normally and in reverse while the responses of its A1 area were recorded. As shown below, the two sounds contain the same frequency content, but the spiking responses in A1 differ markedly.
[Image]

Auditory cortex: What & Where pathways

Like vision, hearing has two pathways. (For an introduction to the dorsal pathway and ventral pathway, you can refer to this blog BCI-Two-streams hypothesis )
[Image]

Auditory Cortex: Speech areas

Hearing is also closely related to speech.
[Image]

Classical division on basis of aphasia following lesions:
Broca's area: the patient understands language but is unable to speak or write.
Wernicke's area: the patient speaks fluently but cannot understand language.

Current understanding is that areas of the primary auditory cortex are not homogeneous, but are category-specific, and that the strongest activation is located near the sensory or motor area associated with that category .
For example:

  • Words for manipulable objects (tools) activate brain regions associated with reaching/grasping motor areas.
  • Moving words activate brain areas located near visual motion areas.
  • Words for complex objects, such as faces, activate visual recognition areas.

One study found that different Chinese action verbs, whose radicals indicate different body-related movements, activate different brain areas. This result supports the universality of body-topographic representations of action verbs in the motor system.
[Image]
Speech feature encoding of human superior temporal gyrus
As shown in the figure below, in one experiment speech was played to listeners while neural activity in the superior temporal gyrus was recorded, and the responses were represented per phoneme: for each phoneme, the firing recorded at different electrodes was plotted as a function of time from phoneme onset.
[Image]
The hierarchical clustering results for single-electrode and population responses are shown below; the horizontal axis represents electrodes and the vertical axis represents phonemes. Panel B shows that phonemes of the same type produce responses on a fixed subset of electrodes (for example, voiced consonants in the blue region); likewise, panel C shows that electrodes in different areas respond to fixed categories of phonemes.
[Image]
