Studying human language is not a simple task, but it is a very important one. Much of the complexity comes from the many different approaches that linguistics takes.
Language facts can be explained by looking at the individual, at society, or at the relationship between them, among other perspectives. Uniting these approaches with technology gives us an important field of work: computational linguistics.
All these relationships have goals, successes, reasons, and also problems of their own. In this sense, it is up to the computational linguist to bridge the gap between natural language processing and technological innovations.
This requires understanding how language works and knowing how to handle the large amounts of data now available, together with increasingly powerful and inexpensive processing.
With that in mind, we prepared this article to help you understand more about computational linguistics and learn how to start studying the field. Happy reading!
What is computational linguistics?
Computational linguistics is a multidisciplinary field that brings together linguistics, artificial intelligence and computer science, using computational processes to handle human language. The objective is to develop, through logical-formal modeling, systems capable of producing and recognizing information expressed in natural language.
The field began in the 1950s, driven largely by the United States, which used computers to translate documents written in other languages automatically and quickly. Although the translations were still not perfect, it became possible to achieve reasonable quality.
This marked the consolidation of the field, which is dedicated to developing algorithms, methods and software that enable computers to handle natural language in a way that is sensible and useful for our needs.
In addition to translation, applications include speech synthesis, speech recognition, search systems, information extraction from texts, spell checkers, word processors and automatic summarization. The range of applications is wide and keeps growing.
To begin with, computational linguistics requires familiarity with concepts from mathematics and logic. It is then important to master the basics of a programming language such as Python, with an emphasis on the Natural Language Toolkit (NLTK) library.
This is a very powerful working tool. In addition to saving time, it makes complex tasks much more reliable, since it reduces human error. For this reason, more and more people from different areas are turning to language as an object of study.
It is also worth noting that companies have already recognized the importance of this market and are investing in research, expanding the range of opportunities for those with training in the field.
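To give a concrete sense of the kind of first program a beginner might write in Python, here is a minimal sketch that splits a text into word tokens and counts them. It uses only the standard library; the function names and the sample sentence are assumptions made for this illustration, and a library such as NLTK offers more complete tokenizers for real work.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (illustrative helper)."""
    # \w+ matches a run of word characters; the optional group keeps
    # simple contractions such as "don't" together as one token.
    return re.findall(r"\w+(?:'\w+)?", text.lower())


def word_frequencies(text):
    """Count how many times each token occurs in the text."""
    return Counter(tokenize(text))


# Sample sentence invented for this example.
sample = "The cat sat on the mat. The mat was warm."
freqs = word_frequencies(sample)
print(freqs.most_common(2))  # [('the', 3), ('mat', 2)]
```

Counting words is about the simplest statistical description of a text, but the same two steps (tokenize, then count) sit underneath many of the applications listed above.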
Subfields and related areas
Who are computational linguists?
Computational linguists are able to develop, through statistical or logical-formal models of natural languages, tools that recognize and produce information presented in natural language.
These professionals often work in multidisciplinary teams that include computer scientists, linguists, artificial intelligence specialists, logicians, philosophers, mathematicians, cognitive scientists, anthropologists, psycholinguists, cognitive psychologists and neuroscientists, among others.
Among the main competencies of the computational linguist are:
- create tools that help in producing and writing text, such as grammar and spelling checkers, electronic dictionaries and thesauri, syntactic and morphological analyzers, and systems specialized in composing business letters;
- develop tools that assist in reading and browsing electronic pages, such as systems that read e-mails, specialized database search tools and resources for searching spoken material using voice recognition techniques;
- simplify automatic translation between different languages;
- develop voice interfaces that let users query the computer through speech recognition and obtain answers, applicable in intelligent systems for the home or car and in recreational and educational games;
- facilitate daily tasks, such as e-commerce purchases;
- create computational tools that facilitate and stimulate teaching and learning in general (science, mathematics, geography, etc.) and, in particular, the teaching and learning of foreign and native languages, such as interactive educational games and programs that help improve pronunciation;
- give people with speech or visual impairments access to the information society, through recognition and synthesis programs that help them perform reading and writing tasks;
- index databases used in digital libraries and collections of diverse materials;
- promote user security through systems that recognize individual voice characteristics and respond only to the commands of an authorized user.
Objectives
Among the theoretical objectives of computational linguistics are the following:
- Formulation of grammatical and semantic frameworks for characterizing languages, in ways that allow computationally tractable implementations of syntactic and semantic analysis.
- Discovery of processing techniques and learning principles that exploit the structural and distributional (statistical) properties of language.
- Development of cognitively and neuroscientifically plausible computational models of how language processing and learning could occur in the brain.
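As one small illustration of what exploiting the statistical properties of language can mean, the sketch below estimates how likely one word is to follow another from raw counts (a bigram model with maximum-likelihood estimates). This is only one simple technique among many; the function names, the `<s>` and `</s>` boundary markers and the toy corpus are assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows another in tokenized sentences."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        # Hypothetical boundary markers for sentence start and end.
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, curr in zip(padded, padded[1:]):
            counts[prev][curr] += 1
    return counts


def bigram_probability(counts, prev, curr):
    """Maximum-likelihood estimate of P(curr | prev); 0.0 if prev is unseen."""
    following = counts.get(prev)
    if not following:
        return 0.0
    return following[curr] / sum(following.values())


# Toy corpus invented for this example.
corpus = [
    ["the", "dog", "barks"],
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
]
model = train_bigrams(corpus)
print(bigram_probability(model, "dog", "barks"))  # 0.5
```

A real system would need far more data and some form of smoothing so that unseen word pairs do not receive a probability of zero.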
Applications of computational linguistics
Given how quickly technology advances, the number and variety of applications of computational linguistics that exist today is not surprising.
Among the best known are the following:
- Automatic translation.
- Document retrieval and clustering applications.
- Knowledge extraction and summarization.
- Sentiment analysis.
- Chatbots and other friendly dialogue agents.
- Applications of computational linguistics within virtual worlds, games and interactive fiction.
- User interfaces in natural language, such as the following:
  - Text-based question answering.
  - Inferential (knowledge-based) question answering.
  - Voice-based web services and assistants.
  - Collaborative problem solvers and intelligent tutors.
- Different kinds of language-enabled robots.
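To make one of the applications above more tangible, here is a minimal sketch of sentiment analysis using a word list: each positive word adds one point and each negative word subtracts one. This is a deliberately simple approach, not how production systems work; the tiny lexicon, the function names and the example sentences are all assumptions invented for this illustration.

```python
import re

# Tiny hand-made lexicon, invented for this example only.
POSITIVE = {"good", "great", "excellent", "love", "helpful"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "slow"}


def sentiment_score(text):
    """Return the number of positive words minus the number of negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # A bool minus a bool gives +1, 0 or -1 for each token.
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)


def classify(text):
    """Map the score to a coarse label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("The service was great and the staff were helpful"))  # positive
print(classify("Terrible battery and a slow screen"))                # negative
```

A word list like this ignores negation ("not good") and context, which is part of why the field moved toward statistical and learned models.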
Some more functions
Work in computational linguistics has technological applications that are increasingly necessary and in demand in industry:
- human-machine interfaces (conversational agents), in which a natural language is used instead of an artificial one or a restricted menu of options
- speech and text recognition and synthesis (response systems, voice synthesizers and transcription systems), which require syntactic knowledge to process prosodic aspects (intonation)
- automatic translation into other languages from textual or oral productions
- search engines and information retrieval, which require understanding the search conditions to recognize which documents are relevant
- information extraction, which needs to recognize the relevant information in a database and transfer it to predetermined formats (such as tables or graphs)
- textual entailment, which requires natural language understanding to recognize inferences and verify hypotheses across diverse texts
- grammar and style checkers, which need syntactic knowledge to detect mismatches and “incomplete” or “incorrect” sentences
- spell checkers, which must have at least some knowledge of morphological analysis and syllabic structure
- computer-assisted language teaching, which must be capable of syntactic analysis in order to propose and correct grammar and composition exercises
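As a rough illustration of the spell-checker item above, the sketch below suggests a correction by picking the vocabulary word with the smallest edit distance (Levenshtein distance) to the misspelled word. Note that this swaps in a purely string-based technique: it uses no morphological or syllabic knowledge, which a fuller spell checker would need. The vocabulary, the function names and the `max_distance` threshold are assumptions made for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


# Toy vocabulary invented for this example.
VOCABULARY = ["language", "linguistics", "translation", "grammar", "syntax"]


def suggest(word, vocabulary=VOCABULARY, max_distance=2):
    """Return the closest vocabulary word within max_distance, or None."""
    best = min(vocabulary, key=lambda v: edit_distance(word, v))
    return best if edit_distance(word, best) <= max_distance else None


print(suggest("grammer"))     # grammar
print(suggest("lingustics"))  # linguistics
```

Comparing against every vocabulary word is fine for a toy list but too slow for a full dictionary, where indexing or candidate generation is normally used.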
How to start studying computational linguistics?
There are many avenues for anyone who wants to learn about computational linguistics. The first tip is to team up with other people who want to study it. Because the area unites several fields of study, your learning will be much more fruitful if shared with people who bring their own knowledge, experiences and perspectives.
In this sense, the internet, social networks and forums are excellent tools. They make it easier to find materials, courses and people who study the area. At first, you can look for introductory books on language, computing, statistics and computational linguistics.
A good option is the book “To know computational linguistics”, by Marcelo Ferreira and Marcos Lopes. The work is aimed especially at students and provides a good introduction to the subject. The language is clear and accessible, and the book includes practical exercises.
The introductory book presents some concepts and practical foundations of tasks related to the universe of computational linguistics, combining elements of programming, machine learning and computational models.
Those who already have a degree in linguistics have greater knowledge of the analysis and description of languages. Linguists study, for example, speech, phonetics and language acquisition. In these cases, it is worth investing in a postgraduate course in the areas of artificial intelligence and computing.
Computational linguistics is concerned with modeling language and, from these models, developing technologies and applications. Among the solutions are automatic correction of typing errors, instant translation, and speech recognition and synthesis. The possibilities are many, which explains the prominence the field has gained in recent years.