Natural language processing (NLP) refers to the branch of computer science—and more specifically, the branch of artificial intelligence or AI—concerned with giving computers the ability to understand text and spoken words in much the same way human beings can.
The technology can then accurately extract information and insights contained in documents as well as categorize and organize the documents themselves. NLP procedures aim to turn text into machine-readable structured data, which can then be analyzed by machine learning (ML) techniques.
People started communicating with machines through constructed languages (e.g., programming languages) but are increasingly using natural language to do so (e.g., chatbots, virtual assistants). Other NLP applications include automatic text correction, content translation, website chatbots, conversion of sign language to text, speech recognition, and even identification of malicious emails such as spam and phishing.
Natural language processing (NLP) methods extract information from unstructured data such as clinical notes and medical journals to supplement and enrich structured medical data. AI can use sophisticated algorithms to ‘learn’ features from a large volume of healthcare data and then use the obtained insights to assist clinical practice. It can also be equipped with learning and self-correcting abilities to improve its accuracy based on feedback. Moreover, an AI system can extract useful information from a large patient population to assist in making real-time inferences for health risk alerts and health outcome prediction.
In 2020, digital acceleration went into overdrive as the global pandemic pushed us faster, and further, into the Data Age. In response to the COVID-19 pandemic, the White House and a coalition of leading research groups have prepared the COVID-19 Open Research Dataset (CORD-19). CORD-19 is a resource of over 200,000 scholarly articles, including over 100,000 with full text, about COVID-19, SARS-CoV-2, and related coronaviruses. This freely available dataset is provided to the global research community to apply recent advances in natural language processing and other AI techniques to generate new insights in support of the ongoing fight against this infectious disease.
Moreover, large-scale social media platforms also utilize text analytics and NLP technologies to monitor and track social media activity, such as political commentary and hate speech. Platforms like Facebook and Twitter manage published content with the help of these tools.
NLP for Military Multi-Domain Operations
With all the complex, high-speed data coming at military intelligence analysts, decision-makers and warfighters, the Army wants to use artificial intelligence and machine learning to decrease the cognitive burden on analysts, provide timely information to decision-makers and identify specific targets with high confidence in real or near-real time. NLP can be applied to efficiently handle tasks such as analyzing large quantities of intelligence reports and other documents written in different languages.
Natural language processing techniques and machine learning can automatically classify and match incoming data to indicators and warnings being monitored. They can provide alerts on trending topics, keywords or themes that may indicate emerging tactics, techniques and procedures. Text summarization is another aspect of NLP that holds great potential for the intelligence space. Reporting formats are often standardized and summary-like in nature. AI can automatically summarize content in a corpus of documents by extracting key themes, as the sketch below illustrates.
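As a minimal illustration of the keyword alerting and extractive summarization described above, the Python sketch below flags documents that contain watch-list terms and ranks sentences by their average TF-IDF weight. The sample reports, the watch list, and the scoring scheme are illustrative assumptions, not part of any operational system.

```python
# Illustrative sketch only: keyword alerting plus simple extractive
# summarization via TF-IDF sentence scoring (not an operational system).
from sklearn.feature_extraction.text import TfidfVectorizer
import numpy as np

reports = [
    "Convoy movement observed near the northern checkpoint at dawn. "
    "Local sources report increased radio chatter and new roadblocks.",
    "Market activity normal. No unusual gatherings reported by patrols.",
]
watchlist = {"roadblock", "checkpoint", "radio"}   # hypothetical indicators

# 1) Alerting: flag reports that mention monitored keywords.
for i, text in enumerate(reports):
    hits = {w for w in watchlist if w in text.lower()}
    if hits:
        print(f"ALERT report {i}: matched indicators {sorted(hits)}")

# 2) Extractive summarization: keep the sentences with the highest
#    mean TF-IDF weight as a crude "key theme" summary.
def summarize(text, n_sentences=1):
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(sentences)
    scores = np.asarray(tfidf.mean(axis=1)).ravel()
    top = np.argsort(scores)[::-1][:n_sentences]
    return ". ".join(sentences[i] for i in sorted(top)) + "."

print(summarize(reports[0]))
```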
The military area of operations has changed from a battlefield to an operating environment that includes complex civil-military relationships and nation-building efforts. Sentiment models can automatically extract positive, negative and neutral sentiment to provide input on the local populace's perspective toward ongoing military presence and operations in a given area. Automated sentiment analysis can add considerable value for understanding populace opinions within a civil-military environment.
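As a small, hedged example of automated sentiment scoring of this kind, the sketch below uses NLTK's VADER analyzer to label short texts as positive, negative, or neutral. The sample texts and the ±0.05 thresholds are illustrative assumptions, not a fielded analysis pipeline.

```python
# Minimal sentiment-analysis sketch using NLTK's VADER lexicon
# (illustrative only; not a fielded analysis pipeline).
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
analyzer = SentimentIntensityAnalyzer()

posts = [
    "The new water project has really helped our village.",
    "Checkpoints make the commute to the market much longer.",
    "Troops were seen on the main road this morning.",
]

for text in posts:
    compound = analyzer.polarity_scores(text)["compound"]  # range [-1, 1]
    label = ("positive" if compound > 0.05
             else "negative" if compound < -0.05
             else "neutral")
    print(f"{label:8s} {compound:+.2f}  {text}")
```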
The Army wants soldiers to talk with robots, and for the robots to talk back. Soldiers currently need to be “heads down, hands full,” operating clunky joysticks or working with other robotic hardware that doesn’t allow for the seamless integration into operations. Researchers are developing a voice-controlled “dialog system” called Joint Understanding and Dialogue Interface (JUDI), which uses natural language processing to allow two-way, hands-free communication between a soldier and robotic technology. The system, once scaled, could be a critical part of realizing the military’s multi-domain operations concept by having robots receive and deliver integrated information about an operating environment in easy-to-understand formats.
At the core of JUDI is a machine learning-based natural language processing system that is paired with other forms of AI to derive intent and turn commands into action. That action, through more AI-enabled systems, can then voice a robot’s actions back to the user’s headset so a soldier does not have to look at a screen to know where the robot is. It is trained on a data set of voice samples collected from a series of experiments conducted to simulate how service members would talk in real-world scenarios. Marge said that collecting training data that is as close as possible to “natural” language is critically important for the system’s overall accuracy in the field. “We shouldn’t make assumptions about how people might speak,” he said.
JUDI is being designed to run completely at the edge, with no need to access cloud-based data. But as the system grows more accurate, it could offer new capabilities for leveraging the Army’s new multi-domain operations concept of warfare, Marge said. The idea is to simultaneously handle the complexity of air, land, sea, space and cyberspace. A fully functional, scaled dialog system could support command centers, too. Robots that can sense their surroundings and turn them into coherent data will be critical for the Army, Marge said.
However, the use of speech recognition technology in high noise environments remains a challenge. Normally, for speech recognition systems to function properly, clean speech signals are required, with a high signal-to-noise ratio and wide frequency response. A microphone system is critical in providing the speech signal required for recognition accuracy. In high noise environments, providing a clean speech signal can be difficult. Interference, changes in the user’s voice, and additive noise – such as car engine noise, background chatter, and white noise – can reduce the accuracy of speech recognition systems. In military environments, additive noise and voice changes are common. For example, in military aviation, the stress resulting from low-level flying and other difficult missions can cause a speaker’s voice to change, reducing recognition accuracy.
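To make the signal-to-noise issue concrete, the sketch below mixes a synthetic stand-in for a clean speech signal with white noise at a chosen SNR and then verifies the achieved ratio. The sine-wave "speech" and the 10 dB target are assumptions for illustration only.

```python
# Illustrative sketch: mixing a clean signal with white noise at a target
# signal-to-noise ratio (SNR), then verifying the achieved SNR.
import numpy as np

rng = np.random.default_rng(0)
fs = 16_000                                   # 16 kHz sample rate
t = np.arange(fs) / fs                        # 1 second of audio
clean = 0.5 * np.sin(2 * np.pi * 220 * t)     # stand-in for a speech signal

target_snr_db = 10.0                          # hypothetical noisy environment
signal_power = np.mean(clean ** 2)
noise_power = signal_power / (10 ** (target_snr_db / 10))
noise = rng.normal(0.0, np.sqrt(noise_power), size=clean.shape)

noisy = clean + noise
achieved_snr_db = 10 * np.log10(signal_power / np.mean(noise ** 2))
print(f"achieved SNR: {achieved_snr_db:.1f} dB")
```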
Natural language processing (NLP) technology
Natural language refers to the way we, humans, communicate with each other that includes speech, writing, and signs. Linguistics is the scientific study of language, including its grammar, semantics, and phonetics. Classical linguistics involved devising and evaluating rules of language. Great progress was made on formal methods for syntax and semantics, but for the most part, the interesting problems in natural language understanding resist clean mathematical formalisms.
Computational linguistics is the modern study of linguistics using the tools of computer science. Large data sets and fast computers mean that new and different things can be discovered from large collections of text by writing and running software. Computational linguistics also became known as natural language processing, or NLP, to reflect the more engineering-based or empirical approach of statistical methods. Natural language processing is thus broadly defined as the automatic manipulation of natural language, like speech and text, by software.
NLP takes into account seven components that form the basis of a natural language, namely phonetics, phonology, morphology, lexicon, syntax, semantics, and pragmatics. In short, phonetics and phonology are about sound and its acoustic properties. Morphology concerns the structure of words. Lexicon and syntax relate to the use and structure of words and phrases. Finally, semantics and pragmatics analyze the meaning and context of sentences, paragraphs, and texts.
Natural language processing (NLP) is a collective term referring to the automatic computational processing of human languages. This includes both algorithms that take human-produced text as input and algorithms that produce natural-looking text as output. It has several sub-disciplines, including Natural Language Understanding (NLU), Natural Language Generation (NLG), and Natural Language Query (NLQ). Combining the power of artificial intelligence, computational linguistics, and computer science, NLP permits a machine to comprehend human language, something only humans could do until now.
Traditionally most natural language processing systems were based on complex sets of hand-written rules. Recently representation learning and deep neural network-style machine learning methods have become widespread in natural language processing. This was due to both the steady increase in computational power (see Moore’s law) and due in part to a flurry of results showing that such techniques can achieve state-of-the-art results in many natural language tasks, for example in language modeling, parsing, and many others. This is increasingly important in medicine and healthcare, where NLP is being used to analyze notes and text in electronic health records that would otherwise be inaccessible for study when seeking to improve care.
Machine-learning algorithms have many advantages. Automatic learning procedures can make use of statistical inference algorithms to produce models that are robust to unfamiliar input (e.g. containing words or structures that have not been seen before) and to erroneous input (e.g. with misspelled words or words accidentally omitted). Systems based on automatically learning the rules can be made more accurate simply by supplying more input data.
There are different techniques and algorithms used in NLP.
1. Bag of Words: Bag of Words is an algorithm used to vectorize text. It represents a document by counting how many times each word occurs in it, ignoring word order (techniques 1-4 are sketched in code after this list).
2. TFIDF: TF-IDF (term frequency-inverse document frequency) is an algorithm that takes into account not only the occurrence of words but also how frequently they appear across the texts in a collection. Terms that are distinctive of a particular document receive higher weight, while terms common to most documents receive lower weight.
3. Stemming: Stemming is a cruder technique used for text normalization or classification. It reduces words to their root by removing affixes (prefixes, infixes, and suffixes).
4. Lemmatization: Lemmatization is a technique used to convert words into their base form (lemma) and to group different inflected forms of the same term. It is also used for text normalization.
5. BERT: BERT (Bidirectional Encoder Representations from Transformers) is an NLP model with the ability to analyze and learn relationships between words based on context; in NLP, this mechanism is called attention. Another great advantage of BERT over earlier language models is that it was designed to analyze text in both directions, that is, from right to left and from left to right; this mechanism is called bidirectionality. The combination of the attention and bidirectionality mechanisms allows systems based on BERT to be extremely efficient at identifying and classifying texts.
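As a short, hedged sketch of techniques 1-4 above, the code below builds bag-of-words and TF-IDF vectors with scikit-learn and normalizes words with NLTK's Porter stemmer and WordNet lemmatizer. The toy corpus is invented for demonstration.

```python
# Sketch of bag-of-words, TF-IDF, stemming, and lemmatization
# (toy corpus; illustrative only).
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
import nltk
from nltk.stem import PorterStemmer, WordNetLemmatizer

corpus = [
    "The analysts are analyzing the reports",
    "New reports arrived from the field",
]

# 1) Bag of Words: raw word counts per document.
bow = CountVectorizer()
counts = bow.fit_transform(corpus)
print(bow.get_feature_names_out())
print(counts.toarray())

# 2) TF-IDF: counts reweighted so distinctive terms score higher.
tfidf = TfidfVectorizer()
print(tfidf.fit_transform(corpus).toarray().round(2))

# 3) Stemming: crude affix stripping down to a root form.
stemmer = PorterStemmer()
print([stemmer.stem(w) for w in ["analyzing", "analysts", "reports"]])

# 4) Lemmatization: dictionary-based reduction to the lemma.
nltk.download("wordnet", quiet=True)    # one-time resource downloads
nltk.download("omw-1.4", quiet=True)
lemmatizer = WordNetLemmatizer()
print([lemmatizer.lemmatize(w, pos="v") for w in ["analyzing", "arrived"]])
```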
Sequence mapping is a semi-supervised machine learning method that can be used to automatically identify and extract entities such as people, places, organizations and currencies from documents or messages. It’s also helpful in parsing, because it will detect different versions of a word without the analyst having to write specific rules about word variants. This means that entities identified by the machine as “PERSON” and any variations would be automatically labelled and extracted, because NLP provides a foundation for the machine to learn how a name is constructed in the human dialect it’s analyzing.
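As an illustration of this kind of entity extraction, the sketch below uses spaCy's pretrained English pipeline as a stand-in to pull person, organization, location, and money entities from an invented message; it is not the specific sequence-mapping system described above.

```python
# Named-entity extraction sketch with spaCy's small English model
# (stand-in illustration; requires `python -m spacy download en_core_web_sm`).
import spacy

nlp = spacy.load("en_core_web_sm")
message = ("John Smith of Acme Logistics wired 50,000 dollars "
           "to a contact in Kabul on Tuesday.")

doc = nlp(message)
for ent in doc.ents:
    print(f"{ent.label_:10s} {ent.text}")   # e.g. PERSON, ORG, GPE, MONEY, DATE
```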
Textless NLP
The NLP field has almost always used written text for training models. This works very well for languages like English, which have enormous text data sets suitable for training. But the majority of the world’s languages lack these extensive data sets, which means they have been largely unable to benefit from NLP technology.
Text-based language models such as BERT, RoBERTa, and GPT-3 have made huge strides in recent years. When given written words as input, they can generate extremely realistic text on virtually any topic. In addition, they also provide useful pretrained models that can be fine-tuned for a variety of difficult natural language processing (NLP) applications, including sentiment analysis, translation, information retrieval, inferences, and summarization, using only a few labels or examples (e.g., BART and XLM-R).
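As a brief example of reusing such a pretrained model for a downstream task, the sketch below loads a Hugging Face sentiment-analysis pipeline. The default checkpoint the library selects and the example sentence are incidental; the point is that a pretrained model can be applied (or fine-tuned) with little or no labeled data.

```python
# Sketch: applying a pretrained transformer to sentiment analysis via the
# Hugging Face `transformers` pipeline API (downloads a default checkpoint).
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
result = classifier("The translation quality has improved dramatically.")[0]
print(result["label"], round(result["score"], 3))
```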
There is an important limitation, however: These applications are mainly restricted to languages with very large text data sets suitable for training AI models.
Meta AI’s Generative Spoken Language Model (GSLM) is the first high-performance NLP model that breaks free of this dependence on text. GSLM leverages recent breakthroughs in representation learning, allowing it to work directly from raw audio signals alone, without any labels or text. It opens the door to a new era of textless NLP applications for potentially every language spoken on Earth, even those without significant text data sets. GSLM also enables the development of NLP models that incorporate the full range of expressivity of oral language.
Previously, connecting an NLP application to speech inputs meant that researchers had to first train an automatic speech recognition (ASR) system, a resource-intensive operation that introduced errors, did a poor job of encoding casual linguistic interactions, and was available for just a handful of languages. With textless NLP, our hope is to make ASR obsolete and to work in an end-to-end fashion, from the speech input to speech outputs. We think preschool children’s ability to learn about language solely from raw sensory inputs and audio interactions is an exciting template for the future advances this research may enable.
We are now sharing our baseline GSLM model, which has three components: an encoder that converts speech into discrete units that represent frequently recurring sounds in spoken language; an autoregressive, unit-based language model that’s trained to predict the next discrete unit based on what it’s seen before; and a decoder that converts units into speech.
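The following toy, runnable sketch mirrors that three-stage structure with deliberately simple stand-ins: a k-means codebook as the encoder, a bigram model over units as the unit language model, and centroid playback as the decoder. It is an assumption-laden illustration of the architecture, not Meta AI's GSLM implementation.

```python
# Toy illustration of the three-stage textless pipeline: encoder to discrete
# units, autoregressive unit language model, decoder back to audio.
# Deliberately simplified stand-ins; not Meta AI's GSLM implementation.
import numpy as np
from sklearn.cluster import KMeans
from collections import Counter, defaultdict

rng = np.random.default_rng(0)
audio = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 1600)) + 0.05 * rng.normal(size=1600)
frames = audio.reshape(-1, 20)               # 80 frames of 20 samples each

# 1) Encoder: quantize each frame into one of 8 discrete units (k-means codebook).
codebook = KMeans(n_clusters=8, n_init=10, random_state=0).fit(frames)
units = codebook.predict(frames).tolist()

# 2) Unit language model: bigram counts predict the next unit from the previous one.
bigrams = defaultdict(Counter)
for prev, nxt in zip(units, units[1:]):
    bigrams[prev][nxt] += 1

def predict_next(prev_unit):
    counts = bigrams[prev_unit]
    return counts.most_common(1)[0][0] if counts else prev_unit

generated = [units[-1]]
for _ in range(10):                          # autoregressive generation
    generated.append(predict_next(generated[-1]))

# 3) Decoder: map units back to audio by emitting the codebook centroids.
synthesized = np.concatenate([codebook.cluster_centers_[u] for u in generated])
print("generated units:", generated)
print("synthesized samples:", synthesized.shape)
```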
Natural Language Processing (NLP) Market
The global natural language processing (NLP) market was valued at USD 10.72 billion in 2020. The market is projected to grow from USD 20.98 billion in 2021 to USD 127.26 billion in 2028, at a CAGR of 29.4% over the forecast period.
The NLP market consists of major growth factors like the increase in smart device usage, growth in the adoption of cloud-based solutions and NLP-based applications to improve customer service, as well as the increase in technological investments in the healthcare industry.
Due to the ongoing COVID-19 pandemic, the market is witnessing growth in the healthcare sector.
Large organizations are among the primary drivers of and investors in the NLP market. As these organizations increasingly adopt deep learning, along with supervised and unsupervised machine learning technologies, for various applications, the adoption of NLP is likely to increase. NLP can also enhance customer experience programs with various added benefits, thereby attracting more consumers, which, in turn, is projected to have a positive impact on market growth.
The demand for information extraction product application is also anticipated to increase due to the growing importance of the web data for effective marketing and decision-making. Within the next few years, mobile chatbots are anticipated to revolutionize the marketing and commerce sectors.
Cost and risk are among the major factors shaping the adoption of these technologies. Most large end-user organizations across various industries mainly use them to enhance their internal and external operations. Moreover, the ROI of the technology is not always monetary; hence, many small organizations find investing in it risky.
Key players include Google Inc. and Microsoft Corporation, among others.
Recent Developments
February 2021 – Baidu Inc. presented Ernie-M, a multilingual model that can analyze 96 languages. It is a pre-training model with the capacity to improve cross-lingual transferability for languages that are data-sparse.
References and Resources also include:
https://www.fedscoop.com/army-robot-system-dialog-natural-language-processing/
https://blogs.sas.com/content/sascom/2019/02/13/nlp-for-military-intelligence/
https://ai.facebook.com/blog/textless-nlp-generating-expressive-speech-from-raw-audio/