Speech and the speech production process are central to phonetics, a branch of linguistics. This article covers the main dimensions of the topic.

Speech and Speech Production Process

Speech is an extremely complex function that involves processes at several levels:

  • Psychological: on the one hand, emotional, affective, volitional, and psychic aspects that regulate features of speech production such as speed and prosodic inflection; on the other hand, basic psychological processes such as expressive and receptive oral language, thought, and intelligence.
  • Neurological: at the neuromotor level, processes that allow swallowing, sucking, and chewing (orofacial motor acts); at the neurosensory level, processes that allow the integration of auditory information.
  • Phonological: the presence of Phonological Simplification Processes (PSF), in which substitution errors (e.g., "moon" for "dune") and omission errors (e.g., "epant" for "elephant") are frequently evident.

In this sense, speech cannot be reduced to a motor act, nor can it be assessed or treated through non-verbal activities such as feeding, homeostatic breathing, non-verbal orofacial movements (praxis), or laryngeal movements.
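The substitution and omission errors mentioned above can be told apart with a very simple comparison of the target form and the form actually produced. The following is a minimal sketch, not a clinical tool: it assumes that letters stand in for phonemes, the function names are hypothetical, and the "tat" for "cat" pair is an invented illustration rather than an example from the text.

```python
def is_subsequence(short, long):
    """Return True if every symbol of `short` appears in `long` in the same order."""
    remaining = iter(long)
    return all(symbol in remaining for symbol in short)


def classify_simplification(target, produced):
    """Label a production relative to its target form.

    Assumption: each letter stands in for one phoneme.
    """
    if produced == target:
        return "correct"
    # Omission: sounds are missing, but the ones kept are in the original order.
    if len(produced) < len(target) and is_subsequence(produced, target):
        return "omission"
    # Substitution: same number of sounds, but at least one was replaced.
    if len(produced) == len(target):
        return "substitution"
    return "other"


print(classify_simplification("elephant", "epant"))  # omission
print(classify_simplification("cat", "tat"))         # substitution
```

A real analysis would work on phonemic transcriptions rather than spelling, since spelling and sound often diverge.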

Difference between language production and speech production:

The production of speech refers to the execution of the motor plan or sequence of movements of the muscular structures involved in the articulation of sounds, while the production of language implies the production of speech, plus the conceptualization and formulation phases.

The speech production process is virtually unobservable, and therefore the most difficult to describe. Many models are built on the analysis of slips of the tongue and pauses in speech. N. Chomsky's transformational-generative grammar assumes that speakers operate with rules that allow them to unfold a deep structure into a surface structure.


Speech production is the process by which thoughts are translated into speech. This involves choosing words, organizing the appropriate grammatical forms, and then articulating the resulting sounds through the motor system using the vocal apparatus. Speech production can be spontaneous, as when a person produces the words of a conversation; reactive, as when they name a picture or read a written word aloud; or imitative, as when they repeat speech. Speech production is not the same as language production, since language can also be produced manually using signs.

In normal fluent conversation, people pronounce about four syllables, ten to twelve phonemes, and two or three words from their vocabulary (which can contain 10,000 to 100,000 words) every second. Errors in speech production are relatively rare, occurring approximately once in every 900 words of spontaneous speech. Words that are spoken frequently, learned at an early age, or easy to imagine are spoken faster than words that are rarely spoken, learned later in life, or abstract.

Typically, speech is created with pulmonary pressure from the lungs, which generates sound by phonation through the glottis in the larynx; the vocal tract then shapes this sound into the various vowels and consonants. However, speech production can occur without the use of the lungs and glottis, in alaryngeal speech that uses the upper parts of the vocal tract. An example of such alaryngeal speech is Donald Duck talk. The vocal production of speech can be accompanied by hand gestures that enhance the comprehensibility of what is being said.
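The rates quoted above can be combined into a rough estimate of how often a speech error occurs in time rather than in words. The short calculation below is only a back-of-the-envelope sketch: it assumes continuous speech at a constant rate and evenly spaced errors, neither of which holds in real conversation, and the function name is hypothetical.

```python
# Figures quoted in the text for normal fluent conversation.
WORDS_PER_SECOND = (2, 3)       # lower and upper estimate
PHONEMES_PER_SECOND = (10, 12)
SYLLABLES_PER_SECOND = 4
WORDS_PER_ERROR = 900           # about one error per 900 words


def seconds_between_errors(words_per_second, words_per_error=WORDS_PER_ERROR):
    """Time needed to speak `words_per_error` words at a constant rate."""
    return words_per_error / words_per_second


slow, fast = WORDS_PER_SECOND
longest = seconds_between_errors(slow)   # 450.0 seconds
shortest = seconds_between_errors(fast)  # 300.0 seconds
print(f"One error roughly every {shortest / 60:.1f} to {longest / 60:.1f} minutes")
# prints: One error roughly every 5.0 to 7.5 minutes

# The same figures imply about 2.5 to 3 phonemes per syllable.
phonemes_per_syllable = tuple(p / SYLLABLES_PER_SECOND for p in PHONEMES_PER_SECOND)
print(phonemes_per_syllable)  # (2.5, 3.0)
```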

Speech Production Process

To pronounce speech sounds, you need:

1) a stream of air, the energy of which is needed to excite sound,

2) a sound vibrator,

3) resonators for the formation of speech timbres

A stream of air is supplied from the lungs through the airways. The main sound vibrator is the vocal cords of the larynx. In addition, noise can be generated by air friction as it passes through a narrow gap between the lips, between the tongue and the teeth, or between the tongue and the hard palate, or when a closure between these organs is released. The resonators of speech sounds are the oral cavity and the pharynx, which together form the so-called extension tube. It is here that speech timbres are formed.

The air leaving the lungs under pressure passes through the narrow slit between the vocal cords, which vibrate under the control of nerve impulses from the brain. As a result, sound is produced; its pitch depends on the frequency at which the vocal cords vibrate.

The sound generated by the vibration of the vocal cords is not loud enough and lacks speech timbres. It resembles a soft squeak or hum. Amplification of the sound and the formation of speech timbres occur in two resonators: the oral and the pharyngeal.
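The three ingredients listed above (air stream, sound vibrator, resonators) can be sketched as a tiny synthesizer in which a pulse train plays the role of the vibrating vocal cords and two simple filters play the role of the oral and pharyngeal resonators. This is a rough illustration under stated assumptions, not a model of real speech: the sample rate, the resonance frequencies and bandwidths, and all function names are invented for the example.

```python
import math

SAMPLE_RATE = 16000  # samples per second (assumed)


def glottal_pulses(f0, duration, sample_rate=SAMPLE_RATE):
    """Sound vibrator: one unit impulse per vocal-cord cycle at f0 Hz."""
    period = round(sample_rate / f0)
    n_samples = int(duration * sample_rate)
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def resonator(signal, center_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """Two-pole resonator that emphasizes energy near center_hz."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, below 1
    theta = 2 * math.pi * center_hz / sample_rate        # pole angle
    a1, a2 = 2 * r * math.cos(theta), -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out


def synthesize(f0, resonances, duration=0.5):
    """Pass the pulse train through each resonator in turn."""
    signal = glottal_pulses(f0, duration)
    for center_hz, bandwidth_hz in resonances:
        signal = resonator(signal, center_hz, bandwidth_hz)
    return signal


# Illustrative (center, bandwidth) pairs in Hz; not taken from the text.
RESONANCES = [(700, 100), (1200, 120)]
low_voice = synthesize(100, RESONANCES)   # 100 pulses per second
high_voice = synthesize(200, RESONANCES)  # 200 pulses per second
print(len(low_voice), len(high_voice))    # 8000 8000
```

Doubling `f0` changes only how often the pulses occur, which corresponds to pitch, while the resonator settings, which correspond to timbre, stay the same.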

Analysis of the Process of Speech Production

The speech production process is divided into three stages:

  • Conceptualization
  • Formulation
  • Encoding

Conceptualization

This is the highest level and involves determining what to say. During conceptualization, speakers conceive an intention and select the relevant information from memory or from the environment in order to prepare the intended utterance. The product of conceptualization is a preverbal message. Levelt divided the process of conceptualization into macro-planning and micro-planning.

Formulation processes

They involve translating the conceptual representation into linguistic form.

Formulation has two components:

Lexicalization:

in speech production, going from semantics to sound.

Syntactic planning:

putting the words together to form the sentence.

Encoding processes

Phonological encoding involves turning words into sounds in the correct order, at the correct speed, and with the proper prosody (intonation, pitch, volume, and rhythm). The sounds must be produced in the correct sequence, and the movements of the muscles of the articulatory system must be specified.
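The three stages described above can be pictured as a pipeline in which each stage hands its product to the next. The toy sketch below only illustrates that flow of information; the lexicon, the concept labels, the rough sound sequences, and the function names are all hypothetical, and real speakers certainly do not work from a three-entry dictionary.

```python
from dataclasses import dataclass


@dataclass
class PreverbalMessage:
    """Product of conceptualization: the concepts to express, in order."""
    concepts: list


# Hypothetical toy lexicon: concept label -> (word, rough sound sequence).
LEXICON = {
    "DEF": ("the", ["dh", "e"]),
    "DOG": ("dog", ["d", "o", "g"]),
    "BARK": ("barks", ["b", "a", "r", "k", "s"]),
}


def conceptualize(intention):
    """Decide what to say; here the intention is already a list of concepts."""
    return PreverbalMessage(concepts=list(intention))


def formulate(message):
    """Lexicalization plus a trivial syntactic plan: pick words, keep their order."""
    return [LEXICON[concept][0] for concept in message.concepts]


def encode(words):
    """Phonological encoding: turn each word into its sounds, in sequence."""
    sounds_by_word = {word: sounds for word, sounds in LEXICON.values()}
    return [sound for word in words for sound in sounds_by_word[word]]


words = formulate(conceptualize(["DEF", "DOG", "BARK"]))
print(words)          # ['the', 'dog', 'barks']
print(encode(words))
```

Keeping the stages as separate functions mirrors the separation drawn in the text: a slip could in principle arise at any one stage while the others work correctly.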

The techniques used in the study of speech production include analysis of transcripts showing how speakers choose what to say and how to say it, computer simulations, and analysis of pauses, errors, and slips of the tongue. In recent years, experimental picture-naming studies have also been used.

