What is Computational Linguistics (CL)?
Computational linguistics is the scientific study of human (natural) language from a computational perspective. It is a developing interdisciplinary field that draws on theoretical linguistics, natural language processing, computer science, artificial intelligence, psychology, philosophy, mathematics and statistics, among others. Computational linguists are interested in providing computational models of various kinds of linguistic phenomena. These models can be knowledge-based (built on world knowledge and linguistic competence) or stochastic (built on probabilities and statistics derived from data). Research in computational linguistics is in some cases motivated from a scientific perspective, where the aim is to offer a computational explanation for a particular linguistic or psycholinguistic phenomenon; in other cases the motivation is more technological, where the aim is to provide a working component for a speech or natural language system.
In fact, the work of computational linguists is incorporated into many working systems, including speech recognition systems, text-to-speech synthesizers, automated voice response systems, web search engines, text editors and language instruction materials, among others.
The practical objectives of the field are wide and varied. Some of the most prominent are retrieving texts on a desired topic; effective machine translation; question answering (ranging from simple factual questions to questions that require inference and descriptive answers); text summarization; analysis of texts or spoken language by topic, sentiment or other psychological attributes; dialogue agents that perform particular tasks (purchases, technical troubleshooting, travel planning, schedule maintenance, medical advice, etc.); and, ultimately, the creation of computer systems with human-like competence in dialogue, in language acquisition and in obtaining knowledge from text.
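The contrast between knowledge-based and stochastic models can be sketched with a toy sentiment scorer. This is only an illustration under stated assumptions: the lexicon, the training sentences and the function names (`lexicon_score`, `train_counts`, `learned_score`) are all hypothetical, and real systems are far richer.

```python
from collections import Counter

# Knowledge-based: a hand-written (hypothetical) polarity lexicon.
LEXICON = {"good": 1, "great": 1, "bad": -1, "awful": -1}

def lexicon_score(text):
    """Sum the hand-assigned polarities of the words in the text."""
    return sum(LEXICON.get(word, 0) for word in text.lower().split())

# Stochastic: word scores estimated from labeled examples (toy data).
TRAINING = [
    ("good service and great food", "pos"),
    ("great staff", "pos"),
    ("bad food and awful service", "neg"),
    ("slow and bad", "neg"),
]

def train_counts(examples):
    """Count how often each word appears under each label."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def learned_score(text, counts):
    """Positive minus negative training frequency, summed over the words."""
    return sum(counts["pos"][w] - counts["neg"][w] for w in text.lower().split())

counts = train_counts(TRAINING)
print(lexicon_score("slow service"))          # "slow" is not in the lexicon
print(learned_score("slow service", counts))  # "slow" was seen in negative data
```

The hand-written lexicon knows nothing about "slow", while the counts pick up its negative use from the data; conversely, the counts are only as good as the examples they were estimated from.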
Objectives of computational linguistics
Among the theoretical objectives of computational linguistics are the following:
- Formulation of grammatical and semantic frameworks for characterizing languages in ways that allow computationally tractable implementations of syntactic and semantic analysis.
- Discovery of processing techniques and learning principles that exploit the structural and distributional (statistical) properties of language.
- Development of cognitively and neuroscientifically plausible computational models of how language processing and learning could occur in the brain.
Applications of computational linguistics
Given the speed of technological progress, the number and variety of computational linguistics applications in use today is not surprising.
Among the best known are the following:
- Machine translation.
- Document retrieval and clustering applications.
- Knowledge extraction and summarization.
- Sentiment analysis.
- Chatbots and other kinds of companionable dialogue agents.
- Applications of computational linguistics within virtual worlds, games and interactive fiction.
- Natural language user interfaces, such as the following:
  - Text-based question answering.
  - Inferential (knowledge-based) question answering.
  - Voice-based web services and assistants.
  - Collaborative problem solvers and intelligent tutors.
- Different kinds of language-enabled robots.
Some more functions of computational linguistics
Work in computational linguistics has technological applications that are increasingly necessary and in demand in industry:
- human-machine interfaces (conversational agents), in which a natural language is used instead of an artificial one or a restricted menu of options
- speech and text recognition and synthesis (voice response systems, voice synthesizers and transcription tools), which require syntactic knowledge to process prosodic aspects (intonation)
- machine translation of written or spoken material into other languages
- search engines and information retrieval, which require understanding the search conditions in order to recognize which documents are relevant
- information extraction, which needs to recognize the relevant information in a database in order to transfer it to predetermined formats (such as tables or graphs)
- textual entailment, which requires natural language understanding to recognize inferences and verify hypotheses across diverse texts
- grammar and style checkers, which need syntactic knowledge to detect agreement errors and "incomplete" or "incorrect" sentences
- spell checkers, which must have at least some knowledge of morphological analysis and syllable structure
- computer-assisted language teaching, which needs syntactic analysis capabilities in order to propose and correct grammar and composition exercises
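As one illustration of the search and information retrieval item above, the following minimal sketch ranks documents by relevance to a query using TF-IDF weighting. TF-IDF is a standard technique chosen here for illustration, not something the text prescribes; the document collection and the names `build_index` and `rank` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy document collection.
DOCS = {
    "d1": "machine translation of spoken language",
    "d2": "speech recognition and speech synthesis",
    "d3": "translation memory for legal documents",
}

def tokenize(text):
    """Lowercase and split on whitespace (no stemming or stop words)."""
    return text.lower().split()

def build_index(docs):
    """Return per-document term counts and inverse document frequencies."""
    term_counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())  # each document counts once per term
    n_docs = len(docs)
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}
    return term_counts, idf

def rank(query, term_counts, idf):
    """Score each document by the summed TF-IDF of the query terms, best first."""
    scores = {}
    for doc_id, counts in term_counts.items():
        scores[doc_id] = sum(counts[t] * idf.get(t, 0.0) for t in tokenize(query))
    return sorted(scores, key=lambda d: scores[d], reverse=True)

term_counts, idf = build_index(DOCS)
print(rank("speech translation", term_counts, idf))
```

Here "speech" is rare in the collection and occurs twice in `d2`, so `d2` ranks first. A production search engine would add linguistic processing such as lemmatization and query understanding on top of this kind of weighting.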
Curriculum Sequence in Computational Linguistics
This curriculum sequence offers competitive training in the essential areas of the theory and application of computational linguistics (CL) and natural language processing (NLP).
Knowledge of linguistic theory, particularly syntax and semantics, is essential for understanding the universal principles and parameters of variation in natural languages, the properties and features of language as a distinctive faculty of the human species, and grammars as mental representations of a computational cognitive system. The study of the formal foundations of computational linguistics provides the logical and mathematical tools needed to analyze and evaluate computational models of linguistic learning, knowledge and processing based on deterministic and non-deterministic, symbolic and probabilistic systems. It also offers familiarity with online tools for processing natural languages, such as annotated corpora, parsers, and semantic networks and ontologies. The ability to program in procedural and declarative programming languages, and to handle different techniques and formats for representing, structuring, storing and retrieving information, is essential for developing computational NLP models.
Computational linguistics, the field where science and the humanities meet
Behind artificial intelligence is a large number of researchers who work every day to keep advancing the field and to add new functions and uses. When we think of AI and Big Data, we imagine that behind all these advances are engineers, mathematicians, scientists, computer scientists or programmers. And they are. But in reality, other professionals such as linguists, psychologists and even philosophers are also necessary.
These profiles, which seem to have little to do with one another, make up the multidisciplinary teams that come to light when one looks closely at the day-to-day work and research of artificial intelligence.
To create intelligent instruments and tools it is essential that they can communicate, and this is where the computational linguist comes in, a key figure in language technology research. According to Wikipedia, computational linguistics is an interdisciplinary field that deals with the development of formalisms describing how natural language works, such that they can be turned into programs executable by a computer. In this way, linguists and specialist engineers must transform the existing information, both speech and text, into a structured form that artificial intelligence can understand and process, and to which it can generate a response. This is a job that requires not only professionals from the sciences but also experts in language and behavior.
Converting all this unstructured information into data that can be processed is the great challenge of natural language processing (NLP), one of the most developed areas of AI. NLP is currently among the applications most in demand by companies that need to process and exploit all the information they handle day to day or keep in their historical archives. Tasks such as machine translation, entity detection, information retrieval, automatic sentiment analysis, extraction of the main ideas of a text, trend detection and chatbot development are vital for many companies, because they allow them to listen to and learn from their users and their behavior.
Once these needs have been identified, the linguist, along with the rest of the team, begins the transformation process. The starting point of any NLP project is the corpus, a set of texts, ordered or not, that serves as the basis for any linguistic or statistical analysis. One of the linguist's main tasks is systematic and exhaustive annotation, which turns the set of texts into an annotated corpus. To do this, the linguist must label each term in the text precisely. It is a costly task, but essential for AI to start acting on that information.
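What annotation adds to a corpus can be sketched as a data structure. The example below is a minimal illustration, assuming part-of-speech labels; the sentences, the tag names (DET, NOUN, VERB) and the helper functions are hypothetical and do not reflect any particular project's annotation scheme.

```python
from collections import Counter

# A raw corpus: plain sentences with no linguistic information attached.
RAW_CORPUS = ["The linguist annotates the corpus", "Chatbots answer questions"]

# The same corpus after manual annotation: each token carries a label.
# Tag names (DET, NOUN, VERB) are illustrative, not a prescribed tagset.
ANNOTATED_CORPUS = [
    [("The", "DET"), ("linguist", "NOUN"), ("annotates", "VERB"),
     ("the", "DET"), ("corpus", "NOUN")],
    [("Chatbots", "NOUN"), ("answer", "VERB"), ("questions", "NOUN")],
]

def check_alignment(raw, annotated):
    """Verify that every token in the raw text received exactly one label."""
    if len(raw) != len(annotated):
        return False
    for sentence, tagged in zip(raw, annotated):
        if sentence.split() != [token for token, _ in tagged]:
            return False
    return True

def tag_distribution(annotated):
    """Count how often each label occurs across the corpus."""
    return Counter(tag for sentence in annotated for _, tag in sentence)

print(check_alignment(RAW_CORPUS, ANNOTATED_CORPUS))
print(tag_distribution(ANNOTATED_CORPUS))
```

A completeness check of this kind reflects the "systematic and exhaustive" requirement: a token left without a label is a gap that later analysis would silently inherit.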
Next, this corpus is fed into linguistic engines, where it is analyzed at the morphological, syntactic and semantic levels by means of linguistic rules at each level. Finally, in a more advanced phase, machine learning models are applied, which yield texts automatically enriched with the correct labels. These procedures make it possible to perform all the NLP tasks that offer a multitude of possibilities to companies, institutions and public administrations, depending on their needs and characteristics.
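The two phases described above, hand-written rules followed by a model learned from annotated data, can be sketched with a toy part-of-speech tagger. This is a deliberately simple baseline under stated assumptions: the training sentences, the suffix rules and the function names are hypothetical, and the "model" is only a most-frequent-tag lookup, not the machine learning used in real linguistic engines.

```python
from collections import Counter, defaultdict

# Hypothetical annotated training data: sentences of (token, tag) pairs.
TRAIN = [
    [("the", "DET"), ("engine", "NOUN"), ("analyzes", "VERB"), ("texts", "NOUN")],
    [("the", "DET"), ("model", "NOUN"), ("labels", "VERB"),
     ("the", "DET"), ("texts", "NOUN")],
]

def rule_tag(token):
    """Phase one: hand-written morphological rules."""
    if token in {"the", "a", "an"}:
        return "DET"
    if token.endswith("ly"):
        return "ADV"
    if token.endswith("ing") or token.endswith("ed"):
        return "VERB"
    return "NOUN"  # default guess when no rule fires

def train_most_frequent(sentences):
    """Phase two: learn the most frequent tag for each token seen in training."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for token, tag in sentence:
            counts[token][tag] += 1
    return {token: c.most_common(1)[0][0] for token, c in counts.items()}

def tag(tokens, model):
    """Use the learned tag for known tokens; fall back on the rules otherwise."""
    return [(t, model.get(t, rule_tag(t))) for t in tokens]

model = train_most_frequent(TRAIN)
print(tag("the model analyzes texts quickly".split(), model))
```

On their own, the rules would label "analyzes" as a noun; the annotated data corrects this, while the rules still cover "quickly", which never appears in training. That division of labor is the reason the costly annotation step matters.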
The huge variety of clients allows linguists to take on NLP projects that differ greatly from one another. These range from creating algorithms to train chatbots that resolve questions and incidents to detecting neologisms in a language, as in the project to locate anglicisms in the Spanish used on social networks in the US, carried out by the Cervantes Institute at Harvard University in collaboration with the Institute of Knowledge Engineering (IIC).
Science and the humanities, despite the widespread belief that they are opposites, advance much faster when they work as a team. Computational linguistics is the field that best exemplifies this combination of seemingly opposed profiles. AI is an unstoppable technology that constantly reinvents itself and brings great advances in every field. One of the keys to this success is that it relies on multidisciplinary teams in which all branches contribute and complement one another.