
Speech Production Mechanism

Speech Production Mechanism and organs of speech



Understanding the speech production mechanism is essential when studying phonetics. The voice is, and has been throughout history, a fundamental communication tool for human beings.

Today, the voice has added value. Modern society has countless competitive professions that require communication skills to make an impact in the workplace. Constant use of the voice, the risk factors present in each context and inappropriate vocal behaviors can cause imbalances in vocal function, increasing the incidence and prevalence of structural and functional disorders.

The voice organ includes the lungs, larynx and mouth. Voice is produced when contraction of the chest muscles expels air from the lungs, generating excess pressure and giving rise to an air current. This current can be regarded as an energy carrier whose speed, and therefore pressure, is modulated to produce sounds, vowels and consonants. The air current passes through the glottis, located in the larynx, which is the first modulating element: the vocal cords form a membrane that, by opening and closing, modulates the current of air passing through it.

The shape of this sound instrument is determined by the positions of the lips, jaw, tongue and larynx. The movements of these elements narrow or widen the vocal tract, which allows a large number of vocal sounds to be produced.

The voice can also originate with the involvement of other organs, for example the abdomen.

This work aims to raise awareness of important aspects of the mechanisms of voice and speech production.


The basic process of voice production is the same for speaking and singing. The brain sends signals through the nervous system to the muscles of the larynx, neck and thorax; together with a flow of air through the phonatory tract, these finally produce the voice.

The ability to speak requires different skills. First, the person must have something to talk about: something that is happening now or something that has already happened. In the first case they are talking about perceptions, and in the second about memories. Both perceptions and memories of past events involve cerebral mechanisms in the posterior part of the cerebral hemispheres (the occipital, parietal and temporal lobes).

For perceptions, memories and procedures to become speech, the participation of nerve mechanisms located in the frontal lobes is required.

The mechanism of speech production, briefly summarized, is as follows:

The diaphragm and chest muscles press on the lungs, causing air to be expelled.

The air travels through the trachea and larynx, passing between the vocal cords and causing them to vibrate with a fundamental tone.

The fundamental tone produced by the vocal cords passes from the larynx to the resonance chamber formed by the nasal and oral cavities.

Some frequencies resonate in the nasal and oral cavities and are radiated outward, carrying the most important speech information.

The fundamental principle of voice production is the vibration of the vocal cords, which results from their coupling with the air flow passing between them: the flow sets them in motion and is in turn modulated by them.

The efficiency of this energy transformation depends on vocal cord tension and glottal configuration. Speech can thus be defined as the result of the sound generated in the larynx and modified by the resonance of the supraglottic structures.


When a speaker utters an oral message, he carries out a series of operations that bring all linguistic levels into play. This produces a series of commands that travel from the central nervous system to the muscles, via the peripheral nervous system, and control the changing shape of the vocal tract.

Simplifying, it could be said that speech is the result of the excitation of the supraglottic cavities (nasal or oral) by one or two acoustic sources.

The first, which is essential, is the laryngeal source. It generates the output wave and can be considered quasi-periodic.

The second can be added to the first or can replace it. It consists of burst or friction noises that arise inside the vocal tract (from the glottis to the lips).

These sources excite the vocal tract, whose configuration depends on the articulation.

The actors in this “staging” are the articulators. An articulator is conveniently described as a unit that is both anatomical and functional. In the anatomical description, muscles play an important part, since they initiate and carry out the movements during phonation. Approximately twenty muscles, acting in a coordinated manner, are involved in operating an articulatory organ.
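The two-source description above can be illustrated with a minimal source-filter sketch. It is only an illustration under stated assumptions, not a model taken from this article: the laryngeal source is reduced to a train of unit impulses, the vocal tract to two cascaded resonators, and the function names, the 120 Hz fundamental and the resonance values (700 Hz and 1200 Hz) are hypothetical choices made for the example.

```python
import math


def glottal_source(f0_hz, n_samples, sample_rate):
    """Quasi-periodic laryngeal source, simplified to one unit impulse per cycle."""
    period = max(1, round(sample_rate / f0_hz))  # samples per glottal cycle
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Two-pole filter that reinforces energy near freq_hz (one cavity resonance)."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, below 1
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0  # the two previous output samples
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(f0_hz=120.0,
                     resonances=((700.0, 80.0), (1200.0, 90.0)),
                     n_samples=800,
                     sample_rate=16000):
    """Excite the (assumed) vocal tract resonances with the laryngeal source."""
    signal = glottal_source(f0_hz, n_samples, sample_rate)
    for freq_hz, bandwidth_hz in resonances:
        signal = resonator(signal, freq_hz, bandwidth_hz, sample_rate)
    return signal


samples = synthesize_vowel()
print(len(samples))
```

Changing the resonance values stands in for moving the articulators: the source stays the same while the filter, and therefore the resulting sound, changes. A noise source could be substituted for the impulse train to sketch the second kind of excitation.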

Articulation is normally produced by the approach or contact of a mobile (active) articulator and a fixed (passive) one. Depending on the area where this contact takes place, sounds are classified by place of articulation as follows:


bilabial, upper lip and lower lip.

labiovelar, doubly articulated consonant at the velum and at the lower and upper lips.

labioalveolar, doubly articulated consonant at the alveolar ridge and at the lower and upper lips.

labiodental, lower lip and upper incisors.


dental, apex or dorsum of the tongue and the back of the upper incisors.

interdental, apex of the tongue placed between the lower and upper incisors.

retroflex, apex of the tongue raised towards the back of the alveolar ridge.

alveolar, anterior part of the tongue and alveolar ridge.


palatal, anterior part of the dorsum of the tongue and hard palate.

velar, back of the tongue and velum (soft palate).

uvular, back of the tongue and uvula.


pharyngeal, root of the tongue and pharyngeal wall.

glottal, closure of the glottis.
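The classification above is essentially a lookup from a pair of articulators to a place name. The sketch below shows one way to encode it. The table covers only the singly articulated places from the list, the articulator labels are simplified paraphrases, and the names `PLACES` and `place_of` are hypothetical, introduced only for this example.

```python
# Simplified (active articulator, passive articulator) pairs, paraphrased
# from the list above. Doubly articulated places (labiovelar, labioalveolar)
# and the glottal place are omitted to keep the table one-to-one.
PLACES = {
    "bilabial": ("lower lip", "upper lip"),
    "labiodental": ("lower lip", "upper incisors"),
    "dental": ("tongue apex", "upper incisors"),
    "alveolar": ("front of tongue", "alveolar ridge"),
    "palatal": ("front of tongue dorsum", "hard palate"),
    "velar": ("back of tongue", "velum"),
    "uvular": ("back of tongue", "uvula"),
    "pharyngeal": ("tongue root", "pharyngeal wall"),
}


def place_of(active, passive):
    """Return the place of articulation for an articulator pair, or None."""
    for place, pair in PLACES.items():
        if pair == (active, passive):
            return place
    return None


print(place_of("lower lip", "upper incisors"))
print(place_of("back of tongue", "uvula"))
```

A pair that is not in the table, such as the lower lip with the uvula, returns `None`, which reflects the fact that not every combination of articulators is used.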


The points of articulation (active and passive)

  1. Exolabial
  2. Endolabial
  3. Dental
  4. Alveolar
  5. Postalveolar
  6. Prepalatal
  7. Palatal
  8. Velar
  9. Uvular
  10. Pharyngeal
  11. Glottal
  12. Epiglottal
  13. Radical
  14. Postdorsal
  15. Predorsal
  16. Laminal
  17. Apical
  18. Subapical.


Articulation is the modification of the sound coming from the vocal cords by changing the shape of the chamber through which it passes. This chamber is F-shaped and is formed by the pharynx, the oral cavity and the nasal cavity.

Articulation is the final phase of voice production. This stage involves changing the shape and dimensions of the oral cavity, which shapes the sounds produced when speaking or singing. Correct pronunciation of syllables, formed by consonants and vowels, brings about these changes automatically. Teachers generally use syllables to get students to form correct mouth positions before moving on to placement details, such as consciously moving the jaw up and down, raising and lowering the tongue, or moving the corners of the lips away from and towards the center of the mouth to produce the desired results.

When the sensorimotor mechanisms of articulation are altered, the disorder classically known as phonetic disintegration syndrome occurs. In this syndrome there is a series of phoneme distortions that vaguely resemble the deformations of children’s language, hence the label of phonetic disintegration given to these articulation disorders. From a neurological point of view, the etiopathogenic basis is a variable combination of paralytic, dystonic and apraxic disorders that interfere with the correct production of language sounds. As a consequence, the phonetic quality of the sounds emitted is altered by a disorganization of articulatory motor harmony.

In general, sounds tend to be produced under conditions of more elementary motility, so that all kinds of deformations (elisions, assimilations, substitutions, metathesis, epenthesis) can be combined in varied ways. Articulation or sound production exercises involve the therapist modeling the correct pronunciation of sounds and syllables, usually during play activities. The therapist physically demonstrates to the child how to produce certain sounds, such as the “r” sound, and how to move the tongue to do so.

1.1. Fluency

Fluency is the quality of smoothness, rhythm and continuous flow, without undue pauses or repetitions, with which sounds, words and phrases come together in oral language. It consists of the ability to express ideas with agility, by associating and relating words in a clear and understandable way within the linguistic environment that gives sense and meaning to the statement. Verbal fluency is necessary to communicate. It is acquired, so it is directly related to culture. Through culture, a person takes part in activities that develop this ability, which can also be trained, so reading, watching television, browsing the Internet and taking part in social events are essential. This skill is always present in speeches, conversations, interviews and at work.

Reading fluency is like speaking fluency. Both require precision, prosody (phrasing, intonation and expression) and adequate speed. In both speech and reading, fluency and comprehension are interrelated: it is necessary to grasp the meaning of a sentence to be able to say it or read it with expression. In addition, because fluent readers are able to decipher words accurately and automatically, they can focus their attention on constructing meaning from text instead of deciphering words one by one.

Fluency allows you to express yourself and make yourself understood in conversation in an agile way. It serves to present ideas and to produce, associate and relate words. People with verbal fluency find it easier to establish interpersonal relationships.

Fluency is given in three areas:

Creative area: Ability to create or reproduce ideas.

Linguistic area: Ability to produce, express and match words.

Semantic area: Ability to know the meaning of words.

The process of fluency can be affected if the areas of the brain related to language (Broca’s area and Wernicke’s area) are injured by extrinsic or intrinsic causes; such an injury would directly affect fluency in all its dimensions.

Voice disturbances can also affect verbal fluency.

1.2. Voice

The voice is produced by the whole person. For this to happen, the participation of different organs and systems of the human body is necessary. The emission of sound is a consequence of the interaction of complex movements, of different motor, sensory and hormonal systems controlled and regulated by the central and peripheral nervous system.

The production of sound depends, among other things, on the emotional state and situation of the person emitting the voice. In addition to depending on a biological structure, the voice is a physical-acoustic phenomenon: it is a sound, a disturbance of the air, in which characteristic features can be distinguished. The sensorimotor and metabolic-hormonal processes constitute the anatomical-physiological base on which the voice rests; the voice is produced by the laryngeal effector and is thus also determined as a biological function.

Perhaps the best way to express the multi-potentiality of the voice is through the comment of Leon Botstein, one of the directors of the American Symphony Orchestra:

“Among all the gifts of nature that human beings have had to adapt to transform them into instruments (…), none has proven to be more versatile than the most common of them all: the voice (…), since it uses the same medium as speech to allow us to escape the limits of language.”

To communicate, it is necessary to have the intention of doing so. This intention brings into play the activity of the entire central nervous system (CNS) and peripheral nervous system, including emotional elements. Sound production and communicative language require synchronous functioning of the muscles involved in phonation, which is achieved to the extent that central and peripheral neurological structures control the function. The complexity of speech mechanisms requires, in effect, a central and peripheral nervous system of integration and coordination that acts especially on the audiomotor mechanisms of sound emission and of reception of that emission. This gives a glimpse of the mechanism of synthesis that involves all brain areas, both sensory and motor, in the production of articulated language. This complex mechanism can be summarized in three major phases or moments.

First, the subject thinks about what he wants to say or express (concepts, intentions, emotions, etc.): this is the moment of Ideation. Then he turns to his mental file, or memory, in search of the words that represent the ideas or translate his inner state in its various circumstances, and forms the Verbal Image. Finally, the area of the cerebral cortex that governs the movement of the phonatory system gives the corresponding order so that the muscles contract appropriately and emit the selected words to transmit the message: this is the Motor Order.

In other words, to achieve a meaningful vocal emission, the person or “sender” must carry out the phases indicated. All this takes place in a situational and affective context in which other factors also come into play between the speakers: the physical and emotional distance between them, the shared codes and the objective of that particular communicative act. This set of physical, affective and biological phenomena, with shared codes, constitutes communication through speech.

The voice does not have its own apparatus; to produce it, humans use systems of the organism primarily intended for other functions (respiratory, digestive, sphincteric, etc.). Body and brain form an inseparable unit, a human being. The neurological system covers not only the invisible processes of thought, but also visible physiological reactions to ideas and events. The structures of the nervous system that influence vocal production are the nerve centers, the conduction pathways, the specific action of the nervous system in the phonatory process and the hormonal regulation that intervenes in it.

Vocal production is a motor activity carried out by muscles that belong to different areas of the human body and whose primary functions correspond to other vital systems. Humans have the ability to bring them together in a joint, functional action to produce noises and sounds and give them a conventional meaning, thus elaborating language for communication.

A detailed description of the many muscles that act in the phonatory process helps clarify the complexity of vocal production mechanics. Phonatory muscular physiology encompasses respiratory, laryngeal, resonator, facial and lingual muscles. The laryngeal muscles in general have a special structure that enables them to contract quickly with good fatigue resistance. Voice production is a motor activity that depends on the synchronized regulation of actions over time. It is muscular activity set in motion, and the function of muscle is contraction. This contraction displaces body segments when skeletal muscles pull on their tendons and bony insertions. The physiological trigger of this activity is the nervous system.

The contraction of the thoracic muscles (respiratory muscular physiology) begins a fraction of a second before that of the larynx, to support the air column that will be used in vocal production (with a certain tone, intensity and emphasis). This thoracic contraction anticipates the power needs of the larynx (laryngeal muscular physiology), which in turn must anticipate the sound requirements for the emission of vowels and consonants, produced within a few thousandths of a second by the organs of articulation (facial and lingual muscular physiology).

It has been shown that contraction of the elevators of the palate (resonator muscle physiology) and of some facial muscles (facial muscle physiology) occurs fractions of a second before the laryngeal sound is produced. Each of these components is ready to change almost instantly to produce the next phoneme, and they interact in coordination with one another in timing, strength and sequence. An alteration at any of the nervous levels mentioned, whether in the higher ideomotor plane, in the mechanical part of vocal production or in the emotional balance of the subject, results in an alteration of the final product: the voice. The voice-speech relationship is intimate, and it is essential for understanding the impact of the voice on the quality of communication, both at the level of the message and of the relationship between the interlocutors.

In the act of producing the voice, carried out to communicate a thought, idea, feeling, etc., the psychological activity of the one who emits the sound is implied, as well as that of the one who listens to the message. It is possible to observe how a certain emotional state modifies the strength and mode of the speaker’s expression, while producing effects on the behavior and mood of the listener.

By repeatedly performing acts of communication, mental mechanisms that characterize human social nature develop. Without language and other communication tools, processes that are used to coordinate social activities and lead life in society could not be carried out.


Receptive language is the reception of language. It can be oral, written or symbolic communication that is processed by the brain of the listener. In receptive language one person communicates something while the other receives the language and, in a way determined by age and ability, learns something. Receptive language is therefore the half of communication that is based on listening.

Receptive language refers to how we capture and understand the spoken signal. Speech, as noted earlier, consists of a vibration of the surrounding air. This vibration sets the eardrum in motion, which in turn moves the ossicles of the middle ear; these transmit the message to the inner ear and set its fluids in motion. This signal displaces several membranes as well as the hair cells. It is at this stage that the information becomes neuronal. Subsequently, after processing in Wernicke’s area, the message sent by the interlocutor is understood.

Receptive language allows us to understand and acquire the meaning of words, that is, what the child stores, and it forms the basis for the development of semantics in oral language.

Indicators of receptive language:

Auditory perception and discrimination of words, phrases and sentences.

Auditory memory.

Execution of orders.

Following instructions.

Understanding the meaning of the language heard and giving appropriate answers.

A child has difficulties with receptive language when difficulty understanding spoken language is observed, and may present some of the following characteristics:

Constantly asks “Huh?” or “What?”

Cannot understand the meaning of long sentences.

Has difficulty following both complex and simple instructions.

Usually imitates or follows the communication behaviors of classmates.

2.1- Expressive.

The motor part of language, in turn, starts in the cerebral cortex at the level of the motor speech area (Broca’s area). Once the order is given, the sound emitted by the vocal cords on exhalation is characterized by intensity, timbre and pitch. The mouth and pharynx act as resonance chambers and allow the formation of phonemes. This is what constitutes expressive language.

Expressive language is the ability to convey ideas through logical patterns with forceful pronunciation and with appropriate melody, timbre, rhythm and cadence, or through the whole body, including the gaze, to convey the intended message. It is also considered a complex process that includes pronunciation and involves precise motor activity and well-established serial organization, as well as retention of a general outline of the phrase or sentence. Several areas of the brain therefore take part.

The main objective of working on this type of language is to give all children equal opportunities for expression in a setting of socialization, in meeting places that bring together art, play, the spoken word, image, imagination and children’s creative activity, so that they can face the future. It also seeks to provide technological means and materials for sensory and bodily experimentation.

It also facilitates activities on different topics related to the integral development of the person.

Expressive language is what allows the child to express themselves through gestures, signs or words. Indicators:

Adequate and accurate vocabulary.

Grammatical construction of sentences.

Logical and sequential ordering of the message.

Combination of words in phrases and sentences.

Under normal conditions, expressive language develops “in parallel” with receptive language (comprehension). During the developmental period, the learning of expressive language skills suffers when receptive language processing is slowed. When a person has trouble understanding others (receptive language), or expressing thoughts, emotions and ideas (expressive language), that person has a language disorder.

An expressive language disorder can be evident before the age of 3. It causes concern in the parents of children who seem intelligent but still do not speak, or who have a small vocabulary or poor comprehension. It is a condition in which a child’s vocabulary, ability to compose complex sentences and ability to recall words are below the normal level for their age. The period from 4 to 7 years is crucial. Normally, by the age of 8, one of two developmental paths is established. In the first, the child progresses towards practically normal language, with only subtle defects and perhaps symptoms of other learning disorders remaining.

Alternatively, the child may remain impaired, show slow progress and later lose some previously acquired abilities. In this case there may be a decrease in nonverbal IQ, possibly due to a failure in the development of sequencing, categorization and the related higher cortical functions.

Complications of expressive language disorder include shyness, withdrawal and emotional lability.

Expressive language disorder is a communication disorder in which there are difficulties in verbal and written expression. It is a specific language disorder characterized by an ability to use expressive spoken language that is well below the level appropriate for mental age, but with an understanding of language that is within normal limits. There can be problems with vocabulary, the production of complex sentences and remembering words, and there may or may not be abnormalities in articulation. Disorders of expressive language are classified as a specific language impairment (SLI), in which a child has not been able to acquire normal expressive language even though they have been properly exposed to the language and there are no notable medical or genetic causes.

This disorder affects work and schooling in many ways. It is usually treated with specific language therapy and usually cannot be expected to go away on its own.

Care should be taken to distinguish expressive language disorder from other communication disorders, sensory and motor disturbances, intellectual disability and/or environmental deprivation (DSM-IV-TR Criterion D). These factors affect a person’s speech and writing to certain predictable extents, and with certain differences.
