A Non-Technical Intro to Natural Language Processing (NLP) and its Applications

--

They don’t like to type, are not good spellers, don’t read your help tutorials, and don’t want to figure out your app — your users want to just tell your app what they want and have your app deliver on the request. And since switching costs are near-zero, if your app doesn’t make it easy for them to get what they want, your users will leave you for another one.

Fortunately, technology has been rapidly advancing and there are tools available that make this a solvable problem. If you are on the business side of the house, you do not need to know how to perform Natural Language Processing (NLP) yourself, but in my experience you will add much more value to your product and users, and work more effectively with your teams, if you have a sense of what is involved.

In this post, I am going to give a quick primer on NLP to provide the information needed to better identify opportunities where NLP can add value and to help you better work with teams applying NLP techniques. I wish I knew this stuff when I started working with people focused on NLP a few years ago — hopefully, it will help you ramp up more quickly.

Where Might Natural Language Processing Add Value to Your Business?

Natural Language Processing is a type of Artificial Intelligence focused on helping machines understand unstructured human language. NLP tools take unstructured data — whether spoken by a person or written in a document or message — and derive enough structure for a machine to know what to do with it.

Natural Language Processing techniques do not create value on their own, but they can help your team add value to your business and improve your user experience. To help get you thinking about whether NLP is right for your business, here are some example use cases where NLP adds value:

Enhancing Human Navigation

  • IVR systems: helping consumers get routed to the person or information they need without going through a brutal phone tree (Liberty Mutual, I’m looking at you!).
  • Mobile apps: don’t make users type on that little screen or figure out your navigation, just let them tell you what they want and then take care of it.
  • Maps: searching for “hamburgers” in your favorite map application requires a fair amount of NLP in order to show you restaurants around you.
  • Cars: many people spend a lot of time in the car where they do not have the ability to interact with a screen. I spend a lot of time in the car and anything I can take care of on the drive home means more time with my family.

Transforming and Extracting Information from Text/Speech

  • Fact Extraction — automatically extracting structured information from unstructured documents. Highly valuable for search, chatbots, and Q&A applications.
  • Content Categorization — assigning a piece of text or document to one or more categories that can make it easier to manage or sort. For example, categorizing web pages or search terms into categories to assist marketers in identifying where they want to advertise.
  • Document Summarization — creating a short document from a longer document without a person having to be involved.
  • Machine Translation — allowing people to communicate with one another, across languages.

Helping Humans Get Answers

  • Web search and document search: interpreting what users are looking for in queries and identifying documents that contain the content users are looking for. This applies at web-scale of course, but also can be powerful for intra-organization search as well.
  • Call center agent support tools: helping agents find the information necessary to help consumers.
  • Consumer self-service help tools: helping consumers find the information they need without having to wait on hold for 35 minutes until one of your call center agents can take their call (Wells Fargo, I’m looking at you!).
  • Chatbots: provide chat support to your users and prospects with minimal staffing cost increases.
  • Smart speakers and voice assistants: yep, lots of this being used by Amazon Echo, Google Home, etc. The barrier to entry to these speakers is not insurmountable — look for more entrants in this space in the coming months.

Deriving New Information from Large Volumes of Data

  • Sentiment Analysis (aka, Opinion Mining) — evaluating a set of text to identify the writer’s attitude toward a topic as being positive, neutral, or negative. The idea here is to understand how strongly someone feels about the topic. Sentiment analysis is often used to review survey results from customer-service interactions, product reviews, etc. One typical application of sentiment analysis is to evaluate people’s feelings in the aggregate about a brand or product (e.g., from social media posts or survey responses) — in this case the benefit is that you don’t have to understand every single post since the machine processes thousands of them.
  • Twitter analysis: categorizing tweets and identifying trends in sentiment, facts expressed, etc. related to people, companies, teams, etc. Note that getting access to the Twitter firehose for commercial purposes can be expensive, but there is power in staying up to the minute with what journalists, government and corporate officials, and reality stars are talking about.
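To make the idea of sentiment analysis concrete, here is a toy sketch in Python. Everything in it — the word lists and the scoring rule — is invented for illustration; production systems use trained models and far richer lexicons:

```python
# Toy lexicon-based sentiment scorer (illustrative only; real systems
# use trained machine-learning models, not hand-picked word lists).
POSITIVE = {"love", "great", "excellent", "good", "happy"}
NEGATIVE = {"hate", "terrible", "awful", "bad", "angry"}

def sentiment(text):
    """Label text positive/negative/neutral by counting lexicon hits."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this great product"))    # positive
print(sentiment("terrible service awful app"))   # negative
```

The value at scale is exactly what the bullet above describes: a machine can apply this kind of scoring (in a much more sophisticated form) to thousands of posts or survey responses so no human has to read every one.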

Who uses it?

NLP tools and techniques are leveraged by data scientists, ontologists, computational linguists, data engineers, etc. One great thing about NLP is that many academics work in this area, so there are a lot of open source tools, datasets, and libraries available to build on. Gone are the days when your technology team needs 12 months to build an NLP platform.

Don’t let your teams reinvent the wheel, make sure they are building on the work of others and benefiting from the latest advancements in the field.

What are some of the key techniques involved?

Here is some reference material that may be useful. The following are some key techniques your teams may employ, listed in the order in which they are typically performed (again, there are off-the-shelf tools for all of these):

  • Tokenization — it all starts with tokenization, breaking down text that includes words and punctuation into a set of pieces (tokens, aka terms or words) that are prepared for processing.
  • Transformation — the process of rewriting tokens to make later steps easier. For example, “don’t” may have been broken down into the three tokens (don, ‘, t) in the previous step. Transformation may glue them back together or rewrite them as (do, not).
  • Part of Speech Tagging — identifying whether a word is a noun, verb, adjective, etc. based on both its definition and its context. The standard tag set commonly used for English is the Penn Treebank tag set.
  • Stop Word Removal — discards extremely common words (such as “the”, “and”, “a”, “of”) since their prevalence in text makes them meaningless individually. For example, “Newton thought of an apple” may become “Newton thought apple”. This process is more of a stop-gap measure, used when downstream processes have a hard time with extremely frequent words.
  • Word Normalization — the process of reducing words to a base form. There are two popular approaches: 1) Stemming — strips the suffixes and prefixes from a word to leave its root stem. Different algorithms perform stemming, but in general they apply a sequence of rewrite rules — for example, “organize”, “organizes”, and “organizing” will all be reduced to the same stem “organiz”. Stemming is similar to lemmatization (below), but is generally faster and less computationally demanding. A popular choice is the Porter stemming algorithm, supported by many open source libraries. 2) Lemmatization — the process of identifying the base form of a word (the form you might look up in a dictionary). Lemmatization is similar to stemming, except that it uses a dictionary to identify the base form (or “lemma”) and takes the word’s part of speech into account (e.g., for words that can be both nouns and verbs). For example, for the sentence “I organize the activities”, a stemmer will produce “I organ the activ” where a lemmatizer will produce the more understandable “I organize the activity”.
  • Sentence Splitting — breaks down the text into sentences. It is also referred to as Sentence Boundary Detection and facilitates more sophisticated analysis of the sentence structure (see Parsing below).
  • Named Entity Extraction — identifying people, places, companies, and other entities from the text based on the format of the text (e.g., phone numbers and addresses) and comparison with a dictionary or knowledge repository of known entities. This is closely related to Named Entity Resolution which is the process of relating a particular name in a text to an actual person in a repository (called an onomasticon). For example, a mention of “John Adams” refers to a specific person, but there are many, many famous people with this name. The context is often necessary to pick the right individual.
  • Pattern-Based Entity Extraction — refers to the process of recognizing various quantities and measurements such as time, date, monies, weights, distances, etc. This is useful when you intend to extract facts from unstructured data.
  • Parsing — determining the grammatical structure of a sentence. It helps identify the subject, verb, and complement of a sentence, which is required in order to extract facts. The result of parsing is generally a tree structure that connects sentence parts to one another. Think of how a subject and verb relate to one another (e.g., “John reads the book”), and how adjectives relate to the nouns they modify (e.g., “yellow cat”). There are two main categories of parsers, known as shallow and deep parsers, depending on how detailed the resulting tree is.
  • Semantic Disambiguation: the process of attaching meanings to words. This is achieved by relating words in the text to words in a structure that organizes the concepts related to your specific industry (called an ontology). This is especially valuable in search use cases where semantic analysis can provide the user with much better results vs. keyword searches alone. For example, if a user is searching for “what do jaguars eat”, semantic analysis can identify that this use of “jaguar” is referring to the animal (and not the car or football team), and can then retrieve the appropriate content.
  • Co-reference Resolution — the process of relating the various mentions of an entity (person, place, etc.) to one another. For example, in the sentence “John reads the book. He likes it a lot”, this process identifies that “John” and “he” refer to the same individual.
  • Lexicon — the language-specific expressions for concepts in the ontology (aka, a list of words and expressions). For example, a lexicon may have an entry for high blood pressure with two synonym expressions: “high blood pressure” and “hypertension”.
  • Ontology — a language independent identification of concepts and how different concepts relate to one another. The ontology is often connected to a lexicon in order to support relating words in a text to actual concepts in the ontology (see Semantic Disambiguation above).
  • Measure Text Similarity — the process of determining how similar two pieces of text (sentences or documents) are to one another. This is useful for grouping similar texts together and can be used for near-duplicate detection.
  • Text Classification — using algorithms to label a set of documents or texts. Once the documents are classified, you can organize them (e.g. into news topics).
  • Spelling correction — if you are dealing with text entered by a human (i.e., not spoken to the machine), then it will likely contain some misspellings. Spelling correction algorithms can help figure out what the user meant to type.
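To give a feel for the first few steps in the list above, here is a toy Python sketch of tokenization, stop word removal, and stemming. The stop word list and suffix rules are invented for illustration — real pipelines use mature tools such as the Porter stemmer in an open source NLP library, not hand-written rules like these:

```python
import re

# Illustrative-only pipeline: tokenize, drop stop words, and apply a
# naive suffix-stripping "stemmer". Real systems use library-provided
# tokenizers and stemmers; the rules below are toy assumptions.
STOP_WORDS = {"the", "and", "a", "an", "of", "to", "in"}
SUFFIXES = ("ing", "es", "ed", "e", "s")  # checked in this order

def tokenize(text):
    """Lowercase the text and split it on anything that is not a letter."""
    return re.findall(r"[a-z]+", text.lower())

def remove_stop_words(tokens):
    """Drop extremely common words that carry little meaning alone."""
    return [t for t in tokens if t not in STOP_WORDS]

def stem(token):
    """Strip the first matching suffix, leaving a (possibly non-word) stem."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

print(remove_stop_words(tokenize("Newton thought of an apple")))
# ['newton', 'thought', 'apple']
print([stem(t) for t in tokenize("He organizes the organizing")])
# ['he', 'organiz', 'the', 'organiz']
```

Note how “organizes” and “organizing” both collapse to the stem “organiz” — not a dictionary word, which is exactly the stemming behavior described above.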
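The text similarity technique above can also be sketched simply. One classic measure is Jaccard similarity — the overlap between the two texts’ word sets. This is a toy version; production systems more often compare TF-IDF or embedding vectors:

```python
# Toy text-similarity measure: Jaccard similarity over word sets
# (size of the intersection divided by size of the union).
def jaccard(a, b):
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not (sa | sb):
        return 0.0
    return len(sa & sb) / len(sa | sb)

print(jaccard("john reads the book", "john likes the book"))  # 0.6
```

A score near 1.0 flags likely near-duplicates, which is the grouping use case described above.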
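Finally, a minimal sketch of spelling correction: pick the dictionary word with the smallest edit (Levenshtein) distance to what the user typed. The four-word dictionary is invented for illustration; real correctors use large dictionaries plus word-frequency and context signals:

```python
# Toy spelling corrector: choose the known word with the smallest
# Levenshtein (edit) distance to the input. Dictionary is illustrative.
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

DICTIONARY = ["language", "natural", "processing", "tokenize"]

def correct(word):
    return min(DICTIONARY, key=lambda w: edit_distance(word, w))

print(correct("langage"))  # language
```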

Until about 10 years ago, many of the above steps were based on sets of rules laboriously tweaked by humans (think 20% general rules and 80% handling exceptions). Machine learning has come to the rescue, and most of these steps are now performed by complex neural networks trained on millions of documents.

Reputable resources to check out for more information:

One word of caution if you are thinking of leveraging open source software: most technologies will provide basic capabilities out of the box, but are likely to require adjustments depending on your industry and specific needs. Do not assume that open source software will be “good enough” as is.

Conclusion

Natural Language Processing is a set of powerful tools that can help your company exceed customer expectations and differentiate from your competitors. These tools are readily available from both software vendors and open source projects, and leveraging them doesn’t necessarily require hiring an army of PhDs. NLP can be intimidating when you are first exposed to it, and my hope is that this post can serve as a reference to help you climb the learning curve more quickly and feel more comfortable interacting with teams employing NLP techniques. Being more comfortable with NLP should help you better identify opportunities to add value with this technology and should also make you a better partner/customer of your technology team.

Thanks to Gerald Burnand and Brian Elmi for their feedback and suggestions on this post.

The opinions expressed in this post are solely my own and do not express the views or opinions of my employer.

#NLP #NaturalLanguageProcessing #Innovation #ProductManagement

--

Jim O'Leary
Product Management Lessons from the Startup Trenches

VP Product Management | Mobile, Web Search, Monetization, Machine Learning/AI, Data/Analytics | www.linkedin.com/in/oleary