Artificial Intelligence and Big Data

Artificial Intelligence

Artificial Intelligence (AI) has been around for decades. Recently, however, the advent of “Big Data” has brought it more attention. Wikipedia says this about Artificial Intelligence:

“In computer science, the field of AI research defines itself as the study of ‘intelligent agents’: any device that perceives its environment and takes actions that maximize its chance of success at some goal.”

Big Data

Wikipedia defines Big Data as follows:

“Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them.”

Computers have become so powerful that we are now able to store millions of records per second. Unfortunately, our ability to analyze that data can be a bottleneck. It is challenging to keep up using traditional methods.

AI and Big Data: A Perfect Match

So why has Big Data brought attention to AI? The answer is simply that Artificial Intelligence can deal with large and complex data sets in ways that traditional data processing—or humans—cannot.

Let’s use a banking application as an example. The app streams millions of records a second, and we want it to send an alert if an anomalous activity occurs, like a fraud condition or theft. In this situation, people can’t possibly process or analyze more than a tiny fraction of this volume of data, second-by-second, to prevent or halt a crime. Even with hundreds of humans tasked with analyzing possible fraud conditions, the sheer volume of data simply overwhelms human decision-making capabilities.

Then how about traditional data processing systems? The problem is that they are bound to follow the same fixed logic over and over. Looking for anomalies, which by definition are things we do not expect, requires flexibility, and that is something traditional approaches are not good at.

Now enter AI. These systems work with fuzziness. They predict. They will consider a path but can abandon it if new data negates a line of reasoning, and then begin looking in a new direction. Because AI systems generally improve as they are given more data, they are well-suited for identifying anomalies over time.

Artificial Intelligence Technologies Being Used with Big Data

Let’s now look at some of the AI technologies employed with Big Data. Examples of practical business uses for each technology will also be provided.

  1. Extrapolation

Extrapolation is the process of estimating, beyond the original observation range, the value of a variable based on its relationship with other variables. As an example, let’s assume some data is exhibiting a trend. Executives at the company want to know: Where will the company be in three months if this trend continues? Extrapolation can estimate this. Keep in mind that not all trends are linear. Linear trends are simple; a straight line on a chart will suffice. Non-linear trends are much more involved, and that is where extrapolation functions help. These functions are typically based on polynomial, conic, or other curve equations.
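As a rough illustration of extrapolating a non-linear trend, the sketch below fits a second-degree polynomial to six months of made-up revenue figures by least squares and then evaluates the fitted curve three months past the last observation. The function names, the sample data, and the choice of a quadratic are all assumptions made for this example; real data would call for checking how well the chosen curve actually fits.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit. Returns coefficients, lowest power first."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y, where A[i][j] = xs[i] ** j.
    ata = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    aty = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(ata[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (aty[row] - known) / ata[row][row]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** power for power, c in enumerate(coeffs))


# Hypothetical monthly revenue; the values follow y = x**2 + 11 exactly.
months = [1, 2, 3, 4, 5, 6]
revenue = [12.0, 15.0, 20.0, 27.0, 36.0, 47.0]

coeffs = fit_polynomial(months, revenue, degree=2)
forecast = predict(coeffs, 9)  # three months beyond the last observation
print(round(forecast, 2))
```

Because the sample data is exactly quadratic, the forecast lands very close to 92. With noisy real-world data, the further the forecast reaches beyond the observed range, the less it should be trusted.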

  2. Anomaly Detection

Anomaly detection is also known as outlier detection. It consists of identifying items, events, or observations that do not conform to an expected pattern or to the other items in a dataset. Anomaly detection can identify events such as bank fraud (an application of AI previously mentioned). It is also applicable to several other domains including (but not limited to): fault detection, system health monitoring, sensor networks, and eco-system disturbances.
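One simple way to flag outliers, shown here only as a sketch, is the modified z-score, which measures how far each value sits from the median in units of the median absolute deviation (MAD). The function name, the 3.5 cutoff (a commonly used convention), and the transaction amounts are assumptions for this example; production fraud detection typically looks at many more signals than a single amount.

```python
from statistics import median


def find_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread at all, so nothing can be scored
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical transaction amounts for one account.
amounts = [42.0, 38.5, 45.1, 40.0, 39.9, 41.3, 980.0, 43.7]
suspicious = find_outliers(amounts)
print(suspicious)
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value inflates the latter and can hide itself.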

  3. Bayes' Theorem

In probability theory and statistics, Bayes' Theorem describes the probability of an event based on prior knowledge of conditions that might be related to the event. It’s a way of predicting future events based on previous ones. As an example, let’s assume a company wishes to know which customers it is at risk of losing (churn). Using Bayes, historical data on dissatisfied customers can be collected and used to predict which customers are likely to be lost in the future. This is a wonderful fit for Big Data because the more historical data is fed to a Bayes algorithm, the more accurate its predictions generally become.
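To make the churn example concrete, the sketch below applies Bayes' theorem, P(churn | complaint) = P(complaint | churn) · P(churn) / P(complaint), to a small set of invented counts. The customer numbers and variable names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def posterior(prior, likelihood, likelihood_if_not):
    """P(H | E) from P(H), P(E | H), and P(E | not H) via Bayes' theorem."""
    evidence = likelihood * prior + likelihood_if_not * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical history: 1,000 customers, 100 of whom churned.
total, churned = 1000, 100
complained_and_churned = 60   # churned customers who had filed a complaint
complained_and_stayed = 90    # retained customers who had filed a complaint

p_churn = churned / total
p_complaint_given_churn = complained_and_churned / churned
p_complaint_given_stay = complained_and_stayed / (total - churned)

p_churn_given_complaint = posterior(
    p_churn, p_complaint_given_churn, p_complaint_given_stay
)
print(round(p_churn_given_complaint, 3))
```

In this made-up history, only 10% of customers churn overall, but a customer who has complained has a 40% chance of churning, which is the kind of shift in risk the theorem is meant to surface.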

  4. Automating Computationally Intensive Human Behavior

In some situations, it may be possible for a human being to analyze large amounts of data, but the work proves exhausting over time. AI can help. Rule-based systems can be used to extract, store, and manipulate knowledge from humans for the purpose of interpreting data in useful ways. In practice, rules are derived from human experience and represented as a set of “if-then” statements: each rule tests a set of assertions (facts known to be true) and states how to act when those assertions hold. Rule-based systems can be used to create software that provides answers to a problem in lieu of a human expert. These systems may also be called expert systems. Consider a company that has a human expert capable of analyzing data for a specific objective. However, the task is monotonous and tedious. A rule-based system can capture and automate this expertise.
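A minimal sketch of such a system is shown below, using forward chaining: each rule pairs a set of required assertions with a conclusion, and the engine keeps applying rules until no new conclusions appear. The rule contents, assertion names, and function name are hypothetical, and real expert systems usually add features such as rule priorities and explanations.

```python
# Each rule: (assertions that must all hold, conclusion to add). Hypothetical rules.
RULES = [
    ({"disk_usage_high", "log_growth_fast"}, "log_rotation_failing"),
    ({"log_rotation_failing"}, "alert_operations_team"),
]


def forward_chain(assertions, rules):
    """Apply if-then rules repeatedly until no new conclusions are produced."""
    known = set(assertions)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all of its premises are already known.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


observed = {"disk_usage_high", "log_growth_fast"}
conclusions = forward_chain(observed, RULES)
print(sorted(conclusions - observed))
```

Here the second rule fires only because the first one added `log_rotation_failing`, which is how chains of expert reasoning can be captured as separate small rules.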

  5. Graph Theory

In mathematics, graph theory is the study of mathematical structures used to model pairwise relations between objects. A graph in this context is made up of vertices (also called nodes or points) connected by edges (also called arcs or lines), and can be quite large and complex. With graph theory, insights into relationships between data can be obtained. For example, consider a complex network of computers. Graph theory can provide insights into how a bottleneck in the network will cause other problems, as well as into the root cause of a particular bottleneck.
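The network example can be sketched with a directed graph stored as an adjacency list and a breadth-first search. Searching forward from a bottleneck lists everything it can affect; searching the reversed graph from a failing machine lists the candidate root causes. The topology, node names, and function names below are invented for illustration.

```python
from collections import deque

# Hypothetical network: an edge A -> B means B depends on traffic through A.
NETWORK = {
    "router": ["switch_a", "switch_b"],
    "switch_a": ["db_server"],
    "switch_b": ["web_1", "web_2"],
    "db_server": ["web_1"],
}


def reachable(graph, start):
    """Breadth-first search: every node reachable from start, excluding start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen - {start}


def reverse(graph):
    """Flip every edge so searches run from effect back toward cause."""
    reversed_graph = {}
    for node, neighbors in graph.items():
        for neighbor in neighbors:
            reversed_graph.setdefault(neighbor, []).append(node)
    return reversed_graph


affected = reachable(NETWORK, "switch_a")               # impact of a bottleneck
possible_causes = reachable(reverse(NETWORK), "web_1")  # upstream suspects
print(sorted(affected), sorted(possible_causes))
```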

  6. Pattern Recognition

As its name implies, pattern recognition is used to detect patterns and regularities in data, and is a form of machine learning. Pattern recognition systems can be taught with labeled training data, a process called supervised learning. They can also be used to discover previously unknown patterns in data, a process called unsupervised learning. Unlike anomaly detection, which often screens for anomalies in a single type of data, pattern recognition can discover previously unknown patterns across several pieces of data and take into consideration the patterns (or relationships) among them. A company in any industry may be interested in knowing when something out of the ordinary begins to happen, such as when consumers suddenly begin purchasing one item to go with another item. This pattern may be of interest to the business.
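The purchasing example can be illustrated with a very small unsupervised technique: counting how often each pair of items appears in the same basket and keeping the pairs that occur at least a minimum number of times. This is only a sketch of the idea behind frequent-itemset mining; the baskets, the function name, and the threshold are assumptions for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_count):
    """Count item pairs bought together and keep those seen at least min_count times."""
    counts = Counter()
    for basket in baskets:
        # Sort and deduplicate so each pair has one canonical ordering per basket.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical purchase history, one list per checkout.
BASKETS = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
    ["milk", "bread"],
]

patterns = frequent_pairs(BASKETS, min_count=3)
print(patterns)
```

No labels are supplied here: the bread-and-butter pairing emerges from the data itself, which is what distinguishes this from the supervised case.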


In summary, AI is a way to navigate and gather insights in the world of Big Data. To see the AI concepts mentioned above at work, visit this website: Register for a free account, and you’ll receive a free repository. Once you’re logged in, you can witness Big Data being analyzed with AI in some of the sample repositories.