Natural Language Processing (NLP) is a branch of artificial intelligence that deals with the interaction between computers and humans through natural language. The ultimate objective of NLP is to read, decipher, understand, and make sense of human languages in a way that is useful.
Human vs Computer understanding of language
Think about it this way. Every day, humans say thousands of words that other humans interpret to do countless things. At its core, it’s simple communication, but we all know words run much deeper than that. There’s context that we derive from everything someone says, whether they imply something with their body language or through how often they mention something. While NLP doesn’t focus on voice inflection, it does draw on contextual patterns.
NLP is focused on enabling computers to understand and process human languages, bringing computers closer to a human-level understanding of language. Computers don’t yet have the same intuitive understanding of natural language that humans do. They can’t really grasp what the language is trying to say. In a nutshell, a computer can’t read between the lines.
Challenge of Natural Language
Working with natural language data is not a solved problem.
It has been studied for half a century, and it is really hard.
It is hard from the standpoint of the child, who must spend many years acquiring a language … it is hard for the adult language learner, it is hard for the scientist who attempts to model the relevant phenomena, and it is hard for the engineer who attempts to build systems that deal with natural language input or output. These tasks are so hard that Turing could rightly make fluent conversation in natural language the centerpiece of his test for intelligence.
— Mathematical Linguistics, 2010.
Natural language is primarily hard because it is messy. There are very few rules.
And yet we can easily understand each other most of the time.
Human language is highly ambiguous … It is also ever changing and evolving. People are great at producing language and understanding language, and are capable of expressing, perceiving, and interpreting very elaborate and nuanced meanings. At the same time, while we humans are great users of language, we are also very poor at formally understanding and describing the rules that govern language.
— Neural Network Methods in Natural Language Processing, 2017.
Why Is NLP So Hard?
NLP is hard for two primary reasons: humans don’t always express intent through semantically accurate language, and there are numerous ambiguities in language. Some examples include:
- Semantics: “Mark invited me to his school ball”. What is “ball” in this context?
- Morphology: the parts of a word that can be deconstructed to create different meanings.
- Ambiguity of intent: “I just got back from the City”. What does the speaker want?
- Situational ambiguity: “John was found by the river head”. This could mean by the head of the river (a place) or by the executive of the river (a person).
- Unknown words: a computer is unable to deduce the meaning of unknown words from context like humans can.
- Disambiguation: “jaguar” can refer to a car or to an animal.
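To make the ambiguity of words such as “ball” and “jaguar” concrete, here is a minimal Python sketch of one classic idea: pick the sense whose description shares the most words with the surrounding sentence (a simplified, Lesk-style overlap count). Everything in it is an assumption for illustration: the names SENSES, STOPWORDS, tokenize, sense_scores and disambiguate are hypothetical, and the sense descriptions are hand-written rather than taken from a real dictionary. Real systems rely on far richer context models.

```python
import re

# Toy sense inventory. Every description here is a hand-written
# assumption for illustration, not data from a real lexicon.
SENSES = {
    "jaguar": {
        "animal": "large wild cat that hunts prey in the jungle",
        "car": "luxury car brand with a fast engine you drive on the road",
    },
    "ball": {
        "dance": "formal social event with dancing music and gowns often held at a school",
        "toy": "round object that players kick throw or bounce in a game",
    },
}

# Small, hypothetical list of function words to ignore when counting overlap.
STOPWORDS = {
    "a", "an", "the", "in", "on", "at", "to", "of", "with",
    "that", "you", "or", "and", "his", "me", "is", "was",
}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def sense_scores(word, sentence):
    """Count, per sense, how many context words also appear in its description."""
    context = tokenize(sentence) - {word.lower()}
    return {
        sense: len(context & tokenize(description))
        for sense, description in SENSES[word.lower()].items()
    }


def disambiguate(word, sentence):
    """Return the highest-scoring sense; ties go to the first sense listed."""
    scores = sense_scores(word, sentence)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    for word, sentence in [
        ("jaguar", "The jaguar chased its prey through the jungle"),
        ("jaguar", "I drive my jaguar on the road every day"),
        ("ball", "Mark invited me to his school ball"),
    ]:
        print(f"{sentence!r}: {word} -> {disambiguate(word, sentence)}")
```

With a sentence that offers no helpful context, such as “I just saw a jaguar”, every sense scores zero and the sketch can only fall back to the first sense listed. That is exactly the kind of gap that humans fill effortlessly and computers struggle with.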
As NLP advances, we can expect to see even better human-to-AI interaction. Devices like Google Assistant and Amazon’s Alexa, which are now making their way into our homes and even cars, are showing that AI is here to stay.