Apache NLPCraft

Furkan KAMACI
Mar 4, 2020

NLPCraft is an open source library for adding a natural language interface to any application. Because it is based on semantic modeling, it requires no ML/DL model training or existing text corpora.

NLPCraft is simple to use: define a semantic model and intents to interpret user input, securely deploy this model, and use the REST API to explore the data using natural language from your applications.
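As a rough illustration of the last step, the sketch below builds (but does not send) a REST request that asks a deployed model a question. The endpoint path `/api/v1/ask/sync`, the JSON fields `acsTok`, `mdlId`, and `txt`, and the base URL `http://localhost:8081` are assumptions that should be checked against the NLPCraft REST documentation for your version; the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative only: builds a request, never sends it.
// Endpoint path and JSON field names are assumptions, not confirmed by this article.
final class AskRequestSketch {
    // Escapes backslashes and double quotes only; a real client should use a JSON library.
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the JSON body for a question addressed to one model.
    static String askBody(String accessToken, String modelId, String text) {
        return String.format(
            "{\"acsTok\":\"%s\",\"mdlId\":\"%s\",\"txt\":\"%s\"}",
            jsonEscape(accessToken), jsonEscape(modelId), jsonEscape(text));
    }

    // Wraps the body in an HTTP POST against the assumed endpoint.
    static HttpRequest askRequest(String baseUrl, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/v1/ask/sync"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        String body = askBody("TOKEN", "my.model.id", "How many orders today?");
        HttpRequest request = askRequest("http://localhost:8081", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

Sending the request would take one more call to `java.net.http.HttpClient`, plus a prior sign-in step to obtain the access token.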

Why NLI

Natural Language Interface (NLI) enables users to explore any type of data source using natural language, augmenting existing UI/UX with the fidelity and simplicity of conversational AI.

There is no learning curve, no special rules or applications to master, no syntax or terms to remember — just a natural language that your users already speak and the tools they already use.

Key Features

Semantic Modeling

Advanced semantic modeling and intent-based matching enable deterministic natural language understanding without requiring ML/DL training or text corpora.
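To make the idea of deterministic, intent-based matching concrete, here is a minimal sketch in plain Java. It is not the NLPCraft API: the element IDs, synonyms, intent names, and the `IntentSketch` class are all hypothetical, and real semantic models are far richer. The point is only that the same input always yields the same intent, with no training step.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: hypothetical names, not NLPCraft classes.
final class IntentSketch {
    // Element ID -> synonyms that denote it.
    static final Map<String, Set<String>> ELEMENTS = new LinkedHashMap<>();
    // Intent ID -> element IDs that must all be present (declaration order is kept).
    static final Map<String, Set<String>> INTENTS = new LinkedHashMap<>();

    static {
        ELEMENTS.put("act:on", Set.of("on", "enable", "start"));
        ELEMENTS.put("act:off", Set.of("off", "disable", "stop"));
        ELEMENTS.put("dev:light", Set.of("light", "lights", "lamp"));

        INTENTS.put("lightOn", Set.of("act:on", "dev:light"));
        INTENTS.put("lightOff", Set.of("act:off", "dev:light"));
    }

    // Returns the IDs of all model elements whose synonyms occur in the text.
    static Set<String> detect(String text) {
        Set<String> found = new TreeSet<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            for (Map.Entry<String, Set<String>> e : ELEMENTS.entrySet()) {
                if (e.getValue().contains(word)) {
                    found.add(e.getKey());
                }
            }
        }
        return found;
    }

    // Returns the first declared intent whose required elements are all present.
    static Optional<String> match(String text) {
        Set<String> found = detect(text);
        for (Map.Entry<String, Set<String>> e : INTENTS.entrySet()) {
            if (found.containsAll(e.getValue())) {
                return Optional.of(e.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(match("Please turn the lights on"));
        System.out.println(match("Stop the lamp"));
        System.out.println(match("What time is it?"));
    }
}
```

In this sketch an input that matches no intent is rejected outright rather than guessed at, which is the behavior a deterministic system needs.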

Strong Security

HTTPS, model deployment isolation, 256-bit encryption, and ingress-only connectivity are among the key security features in NLPCraft.

Any Data Source

Any data source, device, or service — public or private. From databases and SaaS systems to smart home devices, voice assistants and chatbots.

Model-As-A-Code

The model-as-a-code convention natively supports any system development life cycle tools and frameworks in the Java ecosystem.

Java-First

REST API and Java-based implementation natively support the world’s largest ecosystem of development tools, programming languages, and services.

Out-Of-The-Box Integration

NLPCraft natively integrates with OpenNLP, Google Cloud Natural Language API, CoreNLP, and spaCy for base NLP processing and named entity recognition.

How It Works

There are three main software components:

Data model specifies how to interpret user input, how to query a data source, and how to format the result back. Developers use a model-as-a-code approach to build models using any JVM language like Java or Scala.

Data probe is a DMZ-deployed application designed to securely deploy and manage data models. Each probe can manage multiple models and you can have many probes.

REST server provides a REST endpoint for user applications to securely query data sources using NLI via data models deployed in data probes.
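The relationship between the three components can be sketched as a small in-memory stand-in. All names below (`ComponentsSketch`, `DataModel`, `DataProbe`, `Server`, `demo.echo`) are hypothetical and are not NLPCraft classes; networking, security, and real language processing are left out so that only the routing is visible: a request names a model, the server finds the probe hosting it, and the model produces the answer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not NLPCraft classes.
final class ComponentsSketch {
    // Data model: interprets user input and formats the result.
    interface DataModel {
        String id();
        String answer(String text);
    }

    // Data probe: hosts one or more deployed models.
    static final class DataProbe {
        private final Map<String, DataModel> models = new HashMap<>();

        void deploy(DataModel model) { models.put(model.id(), model); }
        boolean hosts(String modelId) { return models.containsKey(modelId); }
        String ask(String modelId, String text) { return models.get(modelId).answer(text); }
    }

    // REST server stand-in: routes a request to whichever probe hosts the model.
    static final class Server {
        private final List<DataProbe> probes = new ArrayList<>();

        void register(DataProbe probe) { probes.add(probe); }

        String ask(String modelId, String text) {
            for (DataProbe probe : probes) {
                if (probe.hosts(modelId)) {
                    return probe.ask(modelId, text);
                }
            }
            throw new IllegalArgumentException("No probe hosts model: " + modelId);
        }
    }

    // Wires one trivial model into one probe behind one server.
    static Server demoServer() {
        DataModel echo = new DataModel() {
            public String id() { return "demo.echo"; }
            public String answer(String text) { return "You asked: " + text.trim(); }
        };
        DataProbe probe = new DataProbe();
        probe.deploy(echo);
        Server server = new Server();
        server.register(probe);
        return server;
    }

    public static void main(String[] args) {
        System.out.println(demoServer().ask("demo.echo", " how many orders today? "));
    }
}
```

Because a probe can host several models and a server can know several probes, the same lookup covers the "multiple models, many probes" arrangement described above.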

NLI Applications

Although it seems obvious that NLI (Natural Language Interface) applies broadly to many applications and software systems, there are specific areas where NLI is already used today and has demonstrated its unique capabilities.

NLI-Enhanced Search

NLI-enhanced search, filter, and sort is one area where NLI has been successful for a number of years already. Look at Google Analytics, Gmail, JIRA, or many other applications that allow you to search, filter or sort their content with natural language queries. This use case is a perfect application of NLI as it naturally augments the existing UI/UX by replacing often cumbersome and hard-to-use search/filter/sort UX with a simple text box.

As a matter of fact, all major general-purpose search platforms today (e.g., Google, Bing, or Siri) use the NLI-enhanced approach to process their search queries.

Chatbots

NLI is clearly at the heart of any chatbot implementation. Although most naive chatbot implementations have failed to gain significant traction, advances in NLI technology are allowing modern chatbots to become gradually more sophisticated and to outgrow the early “childhood” problems of parasitic dialogues, lack of contextual awareness, inability to comprehend spoken, free-form language, and primitive rule-based logic.

Data Reporting

Fully deterministic NLI systems like NLPCraft provide critical technology for NLI-based data reporting. Unlike data insights analytics or data exploration, data reporting typically cannot rely on the probabilistic nature of ML/DL-based approaches, as it must provide 100% correctness in all cases.

NLPCraft employs advanced semantic modeling that provides fully deterministic results and NL comprehension.

Ad-Hoc Data Exploration

One of the most exciting applications of NLI is ad-hoc data analytics, or data exploration. This is the area where a proper NLI application can bring about a fundamental change in how we explore our data and discover insights from it.

Today most data is walled off in the silos of individual, incompatible data systems, making it mostly inaccessible to all but a few “power” users. Very few people can gain access to all the different systems in a typical company, learn all the different ways to analyze the data, and master incompatible and drastically different user interfaces.

The NLI-based approach can democratize access to the sprawling siloed data with a single unified UX by allowing users to explore and analyze the data in natural language. Natural language is the only UX/UI that everyone already knows; it requires no training or learning and is universal regardless of the data source.

Device Control

With the popularization of consumer technologies like Amazon Alexa, Apple HomeKit, Mercedes MBUX, and similar products, NLI-based control of various devices and systems is becoming the norm.

While most of these systems today can only understand rudimentary two- or three-word commands, advances in NLI technology are rapidly leading to more sophisticated interfaces. The enterprise world is starting to catch up, and NLI-based systems appear today in various manufacturing, oil and gas, pharma, and medical applications.

NLPCraft has been accepted into the Apache Incubator, and I will be a mentor of the Apache NLPCraft project. I’ll supervise the NLPCraft community to help it align with the Apache Way.

[1] https://nlpcraft.org/index.html

[2] https://medium.com/@furkankamaci/open-source-software-development-and-apache-incubator-372cc90081ae

Furkan KAMACI

Software engineer who works on AI and distributed systems and a member of the Apache Software Foundation.