Human speech is not just the emission of noises that carry specific meanings. Its sound side has a complex, hierarchically arranged organization. Individual sounds (phonemes) form syllables; syllables form phonetic words (which may not coincide with grammatical words: for example, Russian v lesu, "in the forest", is one phonetic word but two grammatical ones); phonetic words form phonetic phrases (sequences of phonetic words with no pauses between them); and these in turn form phonetic sentences, or periods. The sounds themselves are not simply constructed either. By moving the tongue, lips, jaw, soft palate and epiglottis, a person changes the resonant properties of the vocal tract, damping some frequencies of the resulting sound and amplifying others. Each vowel is characterized by its own "pattern" of amplified frequencies (formants). Consonants also have their frequency maxima and minima, but they are recognized largely by the influence they exert on the formants of the adjacent vowels. For example, after a velar consonant (g, k), the starting points of the second and third formants of the following vowel are drawn closer together.
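To make the idea of a formant "pattern" concrete, here is a minimal source-filter sketch (our illustration, not part of the original article; the formant frequencies are approximate textbook averages for a male voice): a pulse train standing in for the vocal-cord source is passed through a cascade of resonators, and changing nothing but the resonator frequencies turns the same source into different vowels.

```python
import numpy as np
from scipy.signal import lfilter

FS = 16000  # sampling rate, Hz

def resonator(freq, bw, fs=FS):
    """Coefficients of a second-order IIR filter with one resonance peak."""
    r = np.exp(-np.pi * bw / fs)                # pole radius set by bandwidth
    theta = 2 * np.pi * freq / fs               # pole angle set by center frequency
    b = [1.0 - r]                               # rough gain normalization
    a = [1.0, -2.0 * r * np.cos(theta), r * r]  # complex-conjugate pole pair
    return b, a

def vowel(formants, f0=120.0, dur=0.4, fs=FS):
    """Crude source-filter synthesis: pulse train -> cascade of formant filters."""
    n = int(dur * fs)
    source = np.zeros(n)
    source[::int(fs / f0)] = 1.0                # glottal-like impulse train at pitch f0
    out = source
    for freq, bw in formants:
        b, a = resonator(freq, bw, fs)
        out = lfilter(b, a, out)                # amplify frequencies near each formant
    return out / np.max(np.abs(out))

# Approximate average formant values for a male voice (Peterson & Barney, 1952):
a_like = vowel([(730, 90), (1090, 110), (2440, 170)])  # roughly the vowel "a"
i_like = vowel([(270, 60), (2290, 200), (3010, 400)])  # roughly the vowel "i"
```

Played back at 16 kHz, the two arrays yield recognizably different vowel-like timbres even though the source signal is identical — exactly the sense in which vowels are defined by which frequencies are amplified.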
Generally speaking, directly identifying human language with articulate speech is not quite correct, because the sign languages of the deaf are in no way "less human" than oral languages. Contrary to popular belief, the gestures of these languages convey not individual letters (although a finger alphabet — dactylology — also exists, primarily for transmitting proper names) but whole words (or morphemes, the meaningful parts of words). Each gesture-word is composed of non-meaningful elements, cheremes, and, as in oral language, words make up phrases and sentences. Sign languages have grammar (for example, expression of the plural, various aspectual distinctions, and so on) and different registers of speech; they can be used for dialogue and monologue, and for telling stories on any topic (for example, you can recount a surrealistic cartoon to a friend).
The role of the position of the larynx in the formation of human language

And yet, apparently, human language formed primarily as spoken language: this is indicated by the numerous adaptations for the production and perception of articulate sound that Homo sapiens possesses. The most important of them is the larynx, which is positioned lower than in modern great apes. The low position of the larynx opens the possibility of clearly pronouncing the sounds of human language, but it creates a risk of choking.
Note that in human infants the larynx, as in the chimpanzee, is positioned high (this makes it possible to suck and breathe at the same time). At about three years of age the larynx descends, and this roughly coincides with the time when the sound side of language is fully mastered. In fairness it should be said that the position of the larynx changes over the lifespan not only in humans: according to a group of Japanese scientists, a lowering of the larynx is also observed in chimpanzees.
There are at least two hypotheses about what the low position of the larynx is needed for. According to one view, it is needed precisely for articulate speech, since it allows the tongue to move inside the vocal tract both in the horizontal and in the vertical plane. This makes it possible to shape the configuration of the oral cavity and the pharynx independently of each other, and thus greatly expands the set of possible phonemes, which differ in which frequencies of the sound are amplified and which, on the contrary, are damped.
According to another view, the main role of the lowered larynx is to make it possible to produce lower sounds and thereby give listeners the impression that the speaker is larger than he really is. Apparently, this view is fundamentally wrong. It is not only that exaggerating one's own size is too small a gain for such a huge price as the risk of choking. Most important, it seems, is that primates (and therefore, presumably, early hominids) are a group of animals with a rather high level of intelligence. They live together for years and know one another "by face"; as observations show, the role of personal contacts in a monkey community is very high. In such a situation, trying to create a false impression of one's size (which is visible to the naked eye and has long been known to the whole group) is simply useless (tellingly, the authors of this view refer to frogs and birds, which produce their communicative signals at such distances and in such environments that the listener cannot see the one producing the sound). The suggestion that lowering the voice was important in intergroup conflicts (to frighten members of a neighboring group from afar) can also hardly claim validity: first, such a task would have caused lowering of the larynx in adult males but not in females and three-year-olds, for whom intimidating the neighbors is irrelevant; second, the human ear is tuned to preferential perception of frequencies that are too high for the distances at which intergroup communication takes place (see below). Thus, only one possibility remains: the low position of the larynx as a species-specific trait is one of the adaptations for articulate speech.
The difficulty is that the larynx contains no bones, and soft tissues are not preserved, so all the scientific data available on the position of the larynx in any particular hominid species are reconstructions.
The hyoid bone is also involved in producing articulate speech. In humans it is located lower than in other primates, which greatly expands the range of possible movements of the pharynx, larynx and tongue relative to one another. If our hyoid bone were positioned differently, we would be able to produce no more distinct sounds than, for example, a chimpanzee. How this bone was positioned in most other members of the human lineage is unknown: this small bone is not attached to the rest of the skeleton and is usually not preserved. To date researchers have only a very few specimens of the hyoid bone. The best known are the hyoid bone of a Neanderthal found in Israel (Kebara cave) and the hyoid bone of Heidelberg man from Spain (Atapuerca, the Sima de los Huesos site); in addition, a partially preserved hyoid bone of a hominid belonging to the Neanderthals (or Heidelbergs) was found in the El Sidrón cave in Asturias, near Piloña. All these bones, although somewhat different in structure, are extremely similar to those of modern humans (in particular, they lack the openings for the throat air sacs typical of modern chimpanzees), and this makes it possible to hypothesize that the vocal apparatus of Neanderthal, Heidelberg and modern humans was anatomically very similar. A recent, more detailed study of the Neanderthal hyoid bone from Kebara showed that it resembles the hyoid of anatomically modern humans not only in appearance but also in internal structure, which suggests that it experienced similar loads — that is, it raises the probability that Neanderthals could use sounding speech.
By contrast, the recently found hyoid bone of Australopithecus afarensis turned out to be just like that of a chimpanzee.
The role of respiration during speech

No less important for speech is fine control of breathing. The point is that in speech, unlike in an inarticulate cry, the air must be delivered to the vocal cords not all at once but in small portions — syllables. This allows us to produce long utterances on a single exhalation, punctuating them with short inhalations at pauses important for the meaning and/or the syntax. A great many syllables can be pronounced within one such utterance; hence the evolutionary task arises of endowing these syllables with enough differences to give the utterance a high information content. If the air were delivered to the vocal cords all at once, the possibilities of varying the sound within a single exhalation-utterance would be very restricted (the reader can verify this by trying to produce clear changes of sound within, say, a cry of horror). As a consequence, such a language would have very few words: too limited a capacity for varying the sound would not allow a large number of distinctions. In addition, when phonemes are pronounced, the approaching organs of articulation weaken the acoustic power differently in different cases, so that with the same force of airflow onto the vocal cords some sounds would be so much louder than others as to drown them out (in perception there is a "masking" effect: a quiet sound immediately preceded or followed by a loud one is not recognized). So speech breathing must not only quantize the exhalation into syllables but also adjust the force of exhalation within a single syllable so that adjacent sounds do not drown each other out. As N. I. Zhinkin showed using X-ray filming, this is provided by movements of the diaphragm: "in a speech utterance the diaphragm makes sharp and clearly visible inspiratory and expiratory movements during the exhalation. It modulates with a certain amplitude on every speech sound, rising and falling, while the exhalation does not stop." For example, in the word skala ("rock"), in the syllable ska the diaphragm first makes two upward movements (on s and k), then descends on a; after that come a brief standstill of the lowered diaphragm and a new syllable, which begins with a small rise of the diaphragm on l and a second, larger rise on a, while "at the moment the diaphragm falls no inhalation occurs." The diaphragm is innervated by the phrenic nerves, which leave the cervical spinal cord at the level of the third, fourth and fifth cervical vertebrae. Speech breathing also involves the intercostal muscles, which are innervated by the thoracic spinal nerves. Thus, effective control of breathing during speech requires a relatively wide spinal canal. According to the available data, in Neanderthal and Heidelberg man this canal was about as wide as in neoanthropes, while in archanthropes it was much narrower.
One sometimes encounters the claim that the chin projection plays a significant role in articulate speech. This is not quite true. The chin projection is simply the result of the uneven reduction of the jaws that occurred in the course of human evolution. It is another matter that, as speech developed, the muscles of the tongue performed ever more finely differentiated movements, and the need for attachment points for these muscles may have saved this part of the jaw from reduction. What mattered for the development of articulate speech, however, was not the chin projection as such but the change in the way the genioglossus muscle is attached: by its fleshy part instead of by a tendon. For conclusions about the communication system, the structure of the inner surface of the lower jaw is in any case more indicative: in the middle (at the symphysis) humans have a mental spine (the attachment point of the genioglossus muscles), whereas apes have a pit in this place (because in them this muscle is attached to the bone by a tendon rather than by its fleshy part). The jaws of fossil hominids show a whole range of transitional forms.
The role of the auditory analyzer

The anatomical changes associated with the development of articulate speech affected not only the vocal apparatus. In humans, the auditory analyzer is arranged differently than, for example, in the chimpanzee. Our region of best hearing lies in the range from 2 to 4 kHz — exactly the frequencies where important characteristics of phonemes are concentrated. The chimpanzee hears best sounds with a frequency of about 1 kHz, which is very important for it, since its "long calls" (one type of communicative signal) have approximately this frequency. The tuning of hearing sensitivity to higher frequencies occurred in the ancestors of modern humans, Homo heidelbergensis. I. Martínez and his colleagues studied the auditory ossicles of Homo heidelbergensis found in Spain (the Sima de los Huesos site) and reconstructed the hearing of representatives of this species. It turned out that the region of best hearing in the 2-4 kHz range had already begun to form by that time, but was not yet fully established, since different individuals show considerable variability in its degree of development. The analyzer of speech sounds in humans works extremely fast (faster than non-speech sounds are recognized): up to 20-30 phonemes per second, and with artificially accelerated speech up to 40-50 phonemes per second. At the same time, people can make quite subtle phonetic distinctions; for example, we are able not to confuse such similar sounds as b and p (physically they differ only in whether the vibration of the vocal cords begins at the moment the lips open or shortly after).
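The b/p example comes down to what phoneticians call voice onset time (VOT): the delay between the release of the lips and the onset of vocal-cord vibration. A toy sketch of the distinction (ours, not the article's; the 25 ms boundary is an illustrative value for English-like stops, and real perception also depends on context):

```python
def classify_bilabial_stop(vot_ms: float) -> str:
    """Toy illustration of the b/p contrast via voice onset time (VOT).

    VOT is the delay between the release of the lips and the onset of
    vocal-cord vibration. The ~25 ms boundary is an illustrative value,
    not a universal constant.
    """
    VOT_BOUNDARY_MS = 25.0
    return "b (voiced)" if vot_ms < VOT_BOUNDARY_MS else "p (voiceless)"

# Listeners perceive the contrast categorically: small changes in VOT
# near the boundary flip the percept, while changes far from it do not.
for vot in (0, 10, 20, 30, 60):
    print(vot, "ms ->", classify_bilabial_stop(vot))
```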
An important property of human communication is that it is controlled by the will rather than by the emotions (that is, by structures of the cerebral cortex): in order to speak, we do not need to work ourselves into a state of great excitement (that rather gets in the way); it is enough for us to want to say something.
The work of the cerebral hemispheres during speech

The main role in the functioning of language is played by two areas of the left (in typical right-handers) hemisphere: Broca's area and Wernicke's area. Wernicke's area, which neighbors the zone of auditory recognition, stores the images of individual language elements; Broca's area, adjacent to the premotor cortex, stores the programs for handling them. But other parts of the brain are no less important, above all the frontal lobes: they provide the ability to suppress unnecessary emotions and to concentrate on the essential, abstracting away from irrelevant details. Without this capacity a person would never be able to recognize, for example, which elements of the phonetic realization of a sound carry a meaning-distinguishing load and which do not. With lesions of the frontal lobes a person does not lose the power of speech, but loses the ability to organize behavior according to verbal instructions. If divisions of the prefrontal cortex are affected, the patient may repeat words and whole phrases, but is unable to express a thought or ask a question. It should be noted that in the cerebral cortex the most diverse aspects of the perception of one and the same object are linked together: its appearance, its smell and taste (if it has them), the sounds naming the object, the sounds produced by the object (if it makes any), the feel of the object in the hand (if it can be taken in the hand), the representation of manipulations with it, and so on — in short, everything that allows us, on seeing (hearing, smelling, tasting) an object, to understand what can be expected of it, what can (or even must) be done with it, and what cannot. The storage of our knowledge about different objects involves those parts of the brain that regulate the behavior associated with these objects: for example, the recognition of images of tools involves the premotor cortex, which governs working movements, while "in the categorization and naming of pictures of animals, by contrast, the occipito-temporal region is primarily activated, which is responsible for the processing of complex visual form and the perception of motion".
Speech recognition is organized in a similarly specialized way. In the brain, as brain-mapping data show, there are special areas intended for processing speech sounds (distinct from those used for recognizing non-speech sounds). These areas contain detectors of various simple characteristics of acoustic events: the presence of sound at a certain frequency, an increase of sound energy, a decrease of sound energy, the rate of change of energy, a rise in frequency, a fall in frequency, and some others; different combinations of detector readings form the meaning-distinguishing features of phonemes (and the set of such features is unique for each phoneme). But a person perceiving and understanding speech does not recognize phoneme after phoneme and then assemble them into words (the way a computer recognizes a scanned text letter by letter); speech recognition works in a far more complicated way. First, between sound and sound in speech there are acoustically quite noticeable transitions (this is why, if the sounds of a syllable are played in reverse order, people hear not the syllable pronounced backwards but meaningless gibberish: the familiar rules of transition from sound to sound are not observed). The formant transitions between adjacent sounds often allow people to "hear" the needed sound even if it was not actually pronounced; listeners may not even notice that what was actually uttered was a distorted version of the phrase they recognized. Second, speech sounds occur — experiments aside — in words, and "the information sufficient to identify a word by its sound form includes its overall length, its prosodic contour, and several vowel and consonant sounds following one another in a certain order".
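The claim that detector readings combine into a bundle of features unique to each phoneme can be pictured as a simple lookup. The sketch below is a deliberately tiny illustration of ours (the feature inventory is simplified and is not the article's):

```python
# Each phoneme is identified by a unique combination of binary
# distinctive features; the inventory here is deliberately minimal.
FEATURES = ("voiced", "nasal", "labial")

PHONEMES = {
    (True,  False, True):  "b",
    (False, False, True):  "p",
    (True,  True,  True):  "m",
    (True,  False, False): "d",
    (False, False, False): "t",
    (True,  True,  False): "n",
}

def recognize(voiced: bool, nasal: bool, labial: bool) -> str:
    """Map a bundle of detector readings to the phoneme it uniquely picks out."""
    return PHONEMES.get((voiced, nasal, labial), "?")

print(recognize(voiced=True, nasal=True, labial=True))    # -> "m"
print(recognize(voiced=False, nasal=False, labial=True))  # -> "p"
```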
In addition, words are used in utterances, and utterances in particular situations, so the amount of "context" (both linguistic and extralinguistic) increases still further. The visual analyzer can also join in the recognition of speech sounds, as the famous "McGurk effect" testifies: if a person is played the syllable ba while being shown lips pronouncing ga, then, automatically making the appropriate correction, he will perceive what he heard as the syllable da (open lips cannot pronounce a b, and noise at the frequencies characteristic of b can with some strain be taken for a d, but not for a g). All this allows people to understand one another even when errors occur.
All these anatomical and physiological adaptations are accompanied by cognitive ones: children come into the world with the urge to detect words, that is, to interpret the sounds uttered by others as signs. The urge to hear and understand speech is so strong that it sometimes makes a person detect speech in the sounds of nature (for example, the song of the common rosefinch is traditionally rendered in Russian as "Vityu videl?" — "Seen Vitya?"). Of great importance for the development of speech is the capacity for sound imitation: although people are, for the most part, rather poor imitators, they imitate speech sounds much better — by the age of three to five, children learn to pronounce correctly all the consonants and vowels of their native language, to reproduce the tones (in languages that have them), the intonational structure of different types of sentences, and so on. Importantly, this imitation is self-rewarding: children learning a language need no special encouragement for each element of the communication system they master.
"The gene for speech," in Homo sapiens all of the complex features related to the formation of speech, enshrined in the genome, and because the emergence of such a complex set of symptoms in the result of a single macromutation absolutely impossible, this means before type-ancestor task of adjusting to articulate the speech was already (and the basis of the child put those who could succeed).
As a "speech gene" is often considered the FOXP2 gene, located on the seventh chromosome. People with a defective version of this gene suffer from a specific speech disorder (eng. SLI) that affects as phonetic and grammatical components of language. In addition, they have a few broken volitional motor control over muscles of the mouth (in the area of both individual movements and sequences) — it is difficult, for example, the command to stick out his tongue or several times in a row to close and open the front teeth. Studies have shown that the FOXP2 gene is a gene regulator, a high-level (ie, regulates the activity of other genes-regulators); it is expressed in different parts of the brain, in particular, affects the nature of neural connections between the cerebral cortex and basal nuclei, increasing synaptic plasticity in them, which is extremely significant to be able to learn sequences of actions.
The FOXP2 gene was apparently a target of selection in the hominid line: since the separation of the ancestors of humans and chimpanzees, two substitutions have occurred in this gene, and both of them are nonsynonymous. As recent studies have shown, the Neanderthals possessed the same version of this gene as we do. It is therefore likely that the human mutations in this gene arose in the common ancestor of the Neanderthals and Homo sapiens, that is, in Homo heidelbergensis, expanding its capabilities both in motor control over the organs of articulation and in the automation of motor routines — which is especially important for those who pronounce long, complex utterances and who therefore face the task of endowing these utterances with the necessary number of differences between neighboring elements.
Important for determining the time when articulate sounding speech formed is a recent study by B. de Boer aimed at clarifying the function of the throat air sacs. Modern chimpanzees have these structures (and, judging by the structure of its hyoid bone, so did Australopithecus afarensis), but humans lack them (as did the Neanderthals and Homo heidelbergensis). Having built a model of the speech resonator with air sacs and without them, de Boer showed that in the presence of air sacs, first, the resonances of the vocal tract shift closer to one another and, second, additional resonances and antiresonances appear — and they appear independently of the articulation being produced. The negative role of throat sacs for articulate speech is immediately visible from this. First, if all the regions of sound amplification lie close together, the sounds are more similar to one another, whereas articulate speech requires, on the contrary, that the sounds differ as much as possible. Increasing the number of sounds distinguishable by ear makes it possible to have a communication system with a larger number of signs (and therefore with greater expressive possibilities). Second, the presence of resonances and antiresonances that do not depend on the articulation being produced greatly reduces the possibility of deliberately varying the sound. For monkeys, however, such articulation-independent resonances are useful: having a high-positioned larynx, a monkey can eat and vocalize at the same time, and if there are throat sacs, food in the mouth does not prevent it from producing the necessary sounds. But articulate speech poses the opposite task: to provide, by means of articulation accessible to voluntary control, the greatest possible number of differences in sound. Another function of the throat sacs is to lower the pitch of the voice. This task, too, is relevant for monkeys, which use sound communication to communicate with neighbors that are relatively far away and hidden by the dense foliage of tropical forest (in communication at close range a greater role is played by facial expressions, gestures, postures and various kinds of touching), and whose auditory analyzer is accordingly tuned to the preferential perception of low-frequency sounds. But for hominids living in open and semi-open landscapes this task gradually lost its relevance. The auditory analyzer of Homo heidelbergensis, with its emerging additional region of best hearing at high frequencies, indicates that close-range sound communication was gradually coming to the fore in hominids.
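De Boer's result can be given a back-of-the-envelope intuition with elementary tube acoustics (a simplified sketch of ours, not his actual model): a vocal tract idealized as a uniform tube closed at the glottis and open at the lips resonates at odd multiples of c/4L, while a closed side branch — a crude stand-in for an air sac, with an assumed 10 cm length — adds antiresonances at odd multiples of c/4L_branch that sit in the spectrum no matter what the articulators do.

```python
# Back-of-the-envelope tube acoustics (not de Boer's actual model).
C = 350.0  # speed of sound in warm, humid air, m/s (approximate)

def tube_resonances(length_m: float, n: int = 4) -> list[float]:
    """First n quarter-wavelength resonances of a closed-open tube, in Hz."""
    return [(2 * k - 1) * C / (4 * length_m) for k in range(1, n + 1)]

vocal_tract = tube_resonances(0.17)       # ~17 cm adult male vocal tract
sac_antires = tube_resonances(0.10, n=2)  # hypothetical 10 cm air sac as a side branch

print("vocal tract resonances, Hz:", [round(f) for f in vocal_tract])
# -> roughly [515, 1544, 2574, 3603]: the familiar neutral-vowel formants
print("air-sac antiresonances, Hz:", [round(f) for f in sac_antires])
# -> extra spectral notches that stay put regardless of articulation
```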
All this allows us to imagine, in general terms, how speech formed in the course of evolution. Initially the main carriers of intentionally transmitted sign information in hominids, as in modern great apes, were probably gestures: they are subject to voluntary control and can be used to create signs ad hoc (signs whose form and meaning are not innate but produced on the spot). Sounds could serve only as an emotional accompaniment. But when the amount of manipulative activity of hominids increased, in particular owing to the growing production and use of tools, combining practical and communicative activity became difficult: the hands could not simultaneously make tools and produce signs, and the brain had to choose which signal to send to the hands and which information to process, the practical or the sign-related (the problem is easy to model by trying to speak while chewing gum). This apparently led to a substitution effect: the signal from the brain structures controlling communication began to be sent not only to the hands but also to the organs of sound production (note that Broca's area, adjacent to the premotor cortex, has the control of vocalization at its disposal). Such a substitution could have been facilitated by the fact that in primates the control of the mouth organs and the control of the hands are connected, since mouth and hands jointly participate in feeding, grooming, and the like.
The influence of tool-making technology on the emergence of speech

As technology developed, the production of tools demanded more and more time. If communication was necessary at the same time, the winner was the one who could guess (at least to some extent), even before the actual transmission of information (that is, before the use of a meaningful, will-controlled gesture), what was about to be communicated — from an excited or attention-attracting sound. In principle this is not impossible: for example, a person who hears his name called can often predict from the tone part of the meaning of the coming communication — whether the speaker intends to ask him for something, threaten him, shame him, beckon him over, tell him about some spectacular event, and so on; and in the formation of speech in the child, the mastery of intonation precedes the mastery of words.
Accordingly, selection would encourage an increasingly variable sound signal and an increasingly accurate "guessing" of the meaning of the intended message from this signal by other individuals. In this case the information load shifts more and more to the sound channel, while the gestural channel is reduced. This increases the importance of high frequencies for communication, since communication takes place at close range.
In Homo heidelbergensis the situation changed dramatically: the widened vertebral canal suggests that they were able to pronounce utterances of several syllables, combining different articulate sounds. This makes sense only when it is possible to invest the sound with the maximum number of differences (thereby maximizing the information transmitted). And indeed, the absence of openings for throat sacs on the hyoid bone, as well as the tuning of hearing to high frequencies, shows that the adaptations to better differentiation of sounds by means of articulation concerned both sound production and hearing. This probably had a genetic basis, fixed by mutations in the FOXP2 gene. Thus it can be argued that the communication of Heidelberg man was based on sounding speech in which the differences between sounds were provided by articulation. But apparently this speech was not yet a real human language. For language it is very important to be able to draw conclusions from several premises at once, to concentrate on the essential while abstracting away from the inessential (in particular, where purely phonetic differences between sounds are concerned), and to hold enough units in memory to be able to generalize syntactic rules that are defined over long sentences. And the frontal lobes of the cerebral cortex, which provide precisely these abilities, were much smaller in Heidelberg man than in Homo sapiens.
So anatomically modern man, who appeared no less than 195 ± 5 thousand years ago, apparently possessed a real human language from the start. But the basis for it — the mastery of articulate sounding speech — was laid by the preceding species, Homo heidelbergensis, hundreds of thousands of years earlier.
The full article was published in Vestnik MGU, "Anthropology" series, No. 3, 2012.