2017-05-22

Cognitive Computing using Flow Analytics

Overview

In this post we will learn how to use the IBM Watson cognitive computing actions in Flow to analyze unstructured text data.

The post will introduce a no-code approach to interacting with the artificial intelligence framework.

What is Cognitive Computing?


Cognitive computing is defined as "the simulation of human thought processes in a computerized model. Cognitive computing involves self-learning systems that use data mining, pattern recognition and natural language processing to mimic the way the human brain works".

The term is commonly associated with Microsoft's Cortana and IBM's Watson artificial intelligence platforms. Both tech giants have invested considerable resources over the past decade training their artificial intelligence to perform advanced human-like tasks such as understanding text, analyzing sentiment, or recognizing faces.

These technologies are now being released as powerful open frameworks for performing certain types of data analysis.

What Is Unstructured Data?

Unstructured data refers to any type of data that does not have a predefined data model. This data is usually text-heavy such as natural language exchanges or conversations.

In 1998, Merrill Lynch cited a rule of thumb that somewhere around 80-90% of all potentially usable business information may originate in unstructured form.

A more recent analysis by Computerworld states that unstructured information might account for 70-80% or more of all data in organizations.

Although the actual number is not known, these analyses make one thing clear - unstructured data is everywhere.

Every business has access to unstructured data in some form and this data can be extremely valuable. The challenge is unlocking it.

Introducing Watson Cognitive

We will begin our exploration of cognitive computing frameworks with Watson Cognitive. IBM Watson was originally developed as a natural language engine. It is famous for beating a world-renowned Jeopardy! champion in early 2011. This was seen as a huge milestone for computing and artificial intelligence.

Since then, IBM has been hard at work training Watson and reinforcing its natural language core, with the objective of creating a framework that could make sense of all of the unstructured data on the internet.

In 2016, IBM released a set of cognitive APIs for interacting with Watson called the Alchemy framework. The Alchemy functionality emphasized natural language understanding and computer vision / object recognition. It also provided an advanced trade-off analytics engine and semantic news search.

Watson has since been consolidated into IBM's Bluemix platform, which provides a gateway for accessing and developing against the various Watson APIs.

In order to use the Watson functions in Flow, you will need to generate a key from IBM Bluemix.

Although IBM has made great strides with Watson, it is still hard to access without significant engineering effort. Custom code has to be written against the APIs to push and pull data. This is time-consuming, costly, and puts the power of the technology out of reach for most business users.

Flow solves this by providing no-code direct interfaces on all Watson cognitive functions.

With Flow, any business or user can unlock the value of unstructured text data by designing autonomous workflows which harness the Watson no-code cognitive computing actions.

Using Flow to Perform Cognitive Analysis

I will now provide a step-by-step walkthrough of how to design a cognitive computing workflow using Flow. The workflow will demonstrate how to load unstructured text data and transform it into meaningful analysis results.

The Sample Data

In this example we will use a dataset containing 4D-IQ's raw blog post text.

The dataset is available here: 4D-IQ Sample Blog Data

Each row in the dataset corresponds to a unique blog post and contains a data point which holds the full text for the post.

The blog post text data provides a good example of unstructured data. The techniques and methods we learn about in this post can be generalized to work on any kind of natural language data.

Examples of Unstructured Data

Twitter Tweets - Consumers are actively expressing their ideas and interests through tweets. Tweets provide powerful insights that can help identify new opportunities, understand customers, and analyze market trends.

Facebook Comments - Facebook comments can be used to extract information such as customer language, likes and dislikes, or product feedback.

Product or Business Reviews - Consumer review data can be found on just about any product or business. Review data can hold insights into how consumers rate your goods or services.

Customer Support Emails - Most businesses have electronic customer support: customers email about problems and questions. This data holds insights into what issues people are running into with your service, what customers would like to see from your service, what caused customers to churn, how successful your support efforts are, and more.

Websites (HTML) - Every page on the internet is composed of HTML. The web holds the entire wealth of human knowledge as unstructured data. This data can be used to compile information on just about anything.

News Articles - News articles log what is happening in the world. This data holds insights into what products are coming out, what events are going on, and what people find important on a local or larger scale.

PDF and Text Documents - PDFs and text documents often hold transactional information, invoices, shipping labels, contracts and more. These documents can contain critical business information that is normally difficult to access because of its unstructured form.

Company Profiles and Bios - The web is full of companies describing themselves and their offerings. These descriptions can be used to intelligently compile, classify, and enrich new or existing business data for tasks such as targeting marketing efforts.

Marketing Campaign Responses - Email marketing campaign responses can hold insights into data such as possible conversion rates. Text email responses can be clustered and grouped based on sentiment, response type or other cognitive attribute in order to maximize success.

Getting Started With Flow

If you already have an account, click here to log in; otherwise, you can sign up for an account here.

Walk-Through: A Step by Step Guide to Building a Watson Cognitive Workflow


Because of the way IBM and Microsoft price their cognitive services (a fixed number of free daily transactions, then a few cents per additional transaction at higher volumes), it is important to provide granular control over the number of calls Flow makes against the services.

To accommodate this, all cognitive functions take a single-value variable input. Essentially the process is as follows:

  • Load in the data
  • Use a Foreach to loop through each record one at a time
  • Pass the values one at a time into the cognitive functions
  • Accumulate the results
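
Flow configures all of this visually, but the pattern above can be sketched in plain Python. The dataset rows and the analyze_text stub below are hypothetical stand-ins for the Load Dataset step and a Watson cognitive action:

```python
# A minimal sketch of the Flow loop pattern described above. The dataset
# and the analyze_text() stub are hypothetical stand-ins for the Load
# Dataset step and a Watson cognitive action, which Flow configures visually.

def analyze_text(text):
    """Stand-in for a Watson cognitive call on a single text value."""
    return {"length": len(text), "words": len(text.split())}

def run_workflow(dataset):
    results = []                                         # accumulated result rows
    for row in dataset:                                  # Foreach: one record at a time
        variables = {"Blog Post Text": row["text"]}      # the variables vector
        result = analyze_text(variables["Blog Post Text"])  # single-value input
        results.append(result)                           # accumulate results
    return results

sample = [{"text": "Watson analyzes unstructured text."},
          {"text": "Flow provides no-code workflows."}]
rows = run_workflow(sample)
```

Note the one-record-at-a-time structure: because the services are billed per transaction, the loop gives granular control over exactly how many calls are made.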

To get started, you'll need to add the sample data to your Flow user account.

Add Sample Data


After logging into Flow, you'll first need to add the "Sample Blog Data" dataset to your Flow account. Click on the down arrow button in the top menu bar then click on the sample data icon to open the Add Sample Dataset dialog, as follows:

Click the ADD link next to the Sample Blog Data entry. The sample data will be added to your account.

Open Cloud Connect


Flow provides an integrated development environment for building analytics-oriented workflows called Cloud Connect.

Cloud Connect is a ClickOnce-deployed desktop application for building and managing workflows. To launch Cloud Connect, open the Workflows menu in the left sidebar of the Flow portal, then click the Launch Cloud Connect button, see below:

Depending upon your browser, you may need to add an extension to enable ClickOnce and launch Cloud Connect. Click the View Cloud Connect Requirements link beneath the Launch Cloud Connect button to learn the requirements for your particular browser.

If this is your first time using Cloud Connect, it may require a minute or so to install. Once the installation has completed, the application will open automatically.

Add a New Workflow


To get started, we'll need to add a new workflow. From the top menu, click the Add Workflow button. The Add New Workflow dialog will appear as shown below:

Enter the information as shown above then click OK. The new workflow will appear in the Workflows list on the left-side of your screen. To open the workflow for editing, simply double click its name. Alternatively, you can right-click the name and then select Open from the context menu.

Add Workflow Steps


From the Workflows list, open the workflow for editing by double-clicking its name or right-clicking the name then selecting Open.

Step 1 - Add a Load Data Workflow Step

Add a Load Dataset step by opening the Actions menu and selecting Load Dataset from the drop-down menu. The Load Dataset dialog will display as follows:

Click OK to add the Load Dataset workflow step.

The Cloud Connect development environment provides a run-time engine that allows users to run workflow steps and view results. To run the workflow now, click the Run button in the Workflow menu; the following prompt will appear:

Click Yes or press Enter to run the workflow. When the workflow completes, the Sample Blog Data dataset will display under the Working Data tab, see below:

Step 2 - Add a Foreach Step

With our working data loaded into memory, we can now add a Foreach step. From the Actions drop-down menu, select Workflow Control > Jump then Foreach. The Foreach dialog box will appear as shown below.

Target the data collection to loop through

  • From the Working Data drop down list, select the Sample Blog Data dataset.
  • Click OK to add the workflow step.

After clicking OK, a Foreach workflow step will be added to the workflow Action List. The Foreach action can be thought of as opening a loop.

Right-click the step and choose Run Selected to execute the Foreach action once. This begins the loop through each record in the target dataset, starting with the first row. The values for the current iteration of the loop are set into the variables vector.

Note - The Working Data tab displays a _Variables group. The cognitive functions can then pull their input values dynamically from the variables vector.

To recap, we have begun looping through our dataset. The first row is set to variables. This gives us a variable which contains our blog post text. We can then push the blog post text from the variable to our cognitive functions.

Step 3 - Add a Watson Cognitive Concept Analysis Action

The first function we will look at is the Concept Analysis cognitive action.

The power of the concept analysis function is that it can extract high-level concepts or ideas from text. The concepts that Watson extracts are not necessarily explicit in the text: Watson taps its enormous library of knowledge to infer higher meaning from text and returns those higher-level concepts as data points for further analysis.

From the Actions drop-down menu, select Artificial Intelligence > Watson > Concept Analysis. The Watson AI Concept Analysis dialog will load, see below:

Steps required to add the Concept Analysis action:

  • Select your Credentials from the drop-down list. This is your Watson API key.
  • Set the Text Variable input to the variable Blog Post Text. Remember, this will pull the text data from the variable and dynamically pass the value to Watson.

When you are done configuring the action, click OK. The Watson AI Concept Analysis workflow step will be added to the workflow.

Right-click the step and choose Run Selected to execute the concept analysis function once. The text data is passed to Watson for evaluation and a dataset containing the results is returned.

Our working data container now contains a result dataset called 'ConceptAnalysis.concepts'.

The result dataset returned from Watson contains the following data points:

  • Text - the name of the concept extracted by Watson.
  • KnowledgeGraph.typeHierarchy - the graph path used to identify the concept.
  • DBPedia - the DBpedia citation for the concept: entity type, description, and graph location.
  • Relevance - a 0-to-1 score of the relevance of the concept to the source text.
  • Freebase - the Google Freebase object citation for the concept.
  • Website - the website for the concept if it is an entity such as a business or agency.
The concept analysis function will also return various data points determined by the specific nature of the concepts extracted. For example, if an extracted concept is a location, Watson may return geographic coordinates. Watson will usually return additional data points containing source citations (aside from what's listed above), but this also depends on the nature of the concepts extracted from the text.
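
To make the shape of these results concrete, here is a sketch in Python of parsing a concept-analysis response. The JSON fragment is hypothetical, modeled on the data points listed above; the real Watson payload may differ in field names and nesting:

```python
import json

# Hypothetical concept-analysis response fragment, modeled on the data
# points listed above; the real Watson payload may differ.
raw = '''
{
  "concepts": [
    {"text": "Cognitive computing", "relevance": "0.93",
     "dbpedia": "http://dbpedia.org/resource/Cognitive_computing"},
    {"text": "Unstructured data", "relevance": "0.71",
     "dbpedia": "http://dbpedia.org/resource/Unstructured_data"}
  ]
}
'''

concepts = json.loads(raw)["concepts"]

# Keep concepts above a relevance threshold, strongest first.
strong = sorted((c for c in concepts if float(c["relevance"]) >= 0.5),
                key=lambda c: float(c["relevance"]), reverse=True)
```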

Here are the workflow steps we have added thus far:

In just three steps we've loaded our data, created a loop through each record, and passed our raw blog post text to Watson's concept analysis function to extract a dataset of ideas and topics from the unstructured data.

The concept analysis function provides powerful summary analysis capability over natural text. It is a good first step to use when evaluating natural text data.

The applications of this are far-reaching. For example, we could connect to Twitter data and get a high-level analysis of what our customers are talking about, or analyze our support staff emails to see what customers most often contact us about.

In the next section, we continue our exploration of IBM Watson Cognitive functions by taking a look at the Entity Extraction cognitive action.

Step 4 - Add a Watson Cognitive Entity Extraction Action

In the prior step, we configured and added a Watson Concept Analysis cognitive action. We learned the concept analysis function is used to extract high level ideas from natural text.

Now, we will add a Watson Cognitive Entity Extraction action. It is similar to the Watson Cognitive Concept Analysis action, except that instead of returning abstract concepts it extracts concrete entities, or named objects, from text.

Entities are usually nouns and contain data representing persons, places or things. The entity extraction function will identify all objects in a text value and score the sentiment towards each identified object.

This can be used for all kinds of powerful analysis such as identifying what objects customers are talking about and how they feel about those entities.

From the Actions drop-down menu, select Artificial Intelligence > Watson > Entity Extraction. The Watson AI Entity Extraction dialog will load, see below:

Steps required to add a Watson Cognitive Entity Extraction action:

  • Select your Credentials from the drop-down list. This is your Watson API key.
  • Set the Text Variable input to the variable Blog Post Text. Remember, this will pull the text data from the variable and dynamically pass the value to Watson.

When you are done configuring the action, click OK. The Watson AI Entity Extraction workflow step will be added to the workflow.

Right-click the step and choose Run Selected to execute the entity extraction function once. The text data is passed to Watson for evaluation and a dataset containing the results is returned.

Our working data container now contains a result dataset called 'NamedEntities.entities'.

The result dataset returned from Watson contains the following data points:

  • Text - the name of the entity extracted by Watson.
  • Count - the number of times the entity occurred in the text.
  • Type - the type of the entity; this is a classifier value assigned by Watson.
  • Relevance - a 0-to-1 score of the relevance of the entity to the source text.
  • Sentiment.type - the sentiment class for the object (Positive, Negative, Neutral).
  • Sentiment.score - a decimal score weighting the sentiment towards the object. A value of +1 is most positive and a value of -1 is most negative.
  • KnowledgeGraph.typeHierarchy - the graph path used to identify the entity.

Watson also may return other data points which provide additional descriptor fields for entities. These additional data points are determined by the type of entities extracted.
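
Once the loop has accumulated entity rows across many documents, they can be summarized downstream. The sketch below averages the sentiment toward each entity; the rows are hypothetical, shaped after the data points listed above:

```python
from collections import defaultdict

# Hypothetical entity rows accumulated over several blog posts, shaped
# after the data points listed above.
entity_rows = [
    {"Text": "IBM Watson", "Count": 3, "Sentiment.score": 0.6},
    {"Text": "IBM Watson", "Count": 1, "Sentiment.score": 0.2},
    {"Text": "Bluemix",    "Count": 2, "Sentiment.score": -0.1},
]

# Average the sentiment toward each entity across all documents.
totals = defaultdict(lambda: [0.0, 0])         # entity -> [score sum, row count]
for row in entity_rows:
    totals[row["Text"]][0] += row["Sentiment.score"]
    totals[row["Text"]][1] += 1
avg_sentiment = {name: s / n for name, (s, n) in totals.items()}
```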

The entity extraction function is a powerful tool. As we work through the remaining cognitive functions it is important to bear in mind that while each function individually provides great value - the real power of the artificial intelligence actions comes when they are all used together.

Combining the two functions we have explored so far - Concept Analysis and Entity Extraction - we are able to turn raw text into actionable insight.

We have identified high level concepts and ideas expressed in our unstructured data and combined that with a deeper analysis of the specific entities and sentiment present in the text.

Step 5 - Add a Watson Keyword Extraction Action

The third Watson cognitive function we will explore is the Keyword Extraction function. The keyword extraction function takes in an unstructured text value and returns a dataset containing all keywords or key phrases in the text.

The Keyword Extraction function tends to be more inclusive than the prior functions: entities and concepts have stricter definitions and are usually less frequent, whereas keywords and key phrases are more common in text.

From the Actions drop-down menu, select Artificial Intelligence > Watson > Keyword Extraction. The Watson AI Keyword Extraction dialog will load, see below:

Steps required to add the Keyword Extraction action:

  • Select your Credentials from the drop-down list. This is your Watson API key.
  • Set the Text Variable input to the variable Blog Post Text. Remember, this will pull the text data from the variable and dynamically pass the value to Watson.

When you are done configuring the action, click OK. The Watson AI Keyword Extraction workflow step will be added to the workflow.

Right-click the step and choose Run Selected to execute the keyword extraction function once. The text data is passed to Watson for evaluation and a dataset containing the results is returned.

Our working data container now contains a result dataset called 'KeywordExtraction.keywords'.

The result dataset returned from Watson contains the following data points:

  • Text - the value of the keyword extracted by Watson.
  • Relevance - a 0-to-1 score of the relevance of the keyword in the source text.
  • Sentiment.type - the overall sentiment class for the keyword (Positive, Negative, Neutral).
  • Sentiment.score - a decimal score weighting the sentiment towards the keyword. A value of +1 is most positive and a value of -1 is most negative.
  • KnowledgeGraph.typeHierarchy - the graph path used to identify the keyword.
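
Because keyword results carry a sentiment class, a natural downstream step is grouping them by that class. A short sketch with hypothetical rows shaped after the data points above:

```python
# Hypothetical keyword rows shaped after the data points listed above,
# grouped by their overall sentiment class.
keyword_rows = [
    {"Text": "no-code workflows", "Relevance": 0.91, "Sentiment.type": "Positive"},
    {"Text": "custom code",       "Relevance": 0.55, "Sentiment.type": "Negative"},
    {"Text": "blog post",         "Relevance": 0.42, "Sentiment.type": "Neutral"},
]

# Collect keyword text under each sentiment class.
by_sentiment = {}
for row in keyword_rows:
    by_sentiment.setdefault(row["Sentiment.type"], []).append(row["Text"])
```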

Combining what we have learned in the three steps above, we are beginning to shape a powerful arsenal for analyzing text data.

To recap, we have now seen three Watson cognitive functions:

  • Concept Analysis
  • Entity Extraction
  • Keyword Extraction

Each action takes in a raw text value and returns structured analysis results from the data.

  • The Concept Analysis action is used to extract high level ideas or concepts from text and rank them in order of relevance.
  • The Entity Extraction action is used to extract named entities or concrete objects from text and rank them in order of relevance and sentiment.
  • The Keyword Extraction action is used to extract all keywords and key phrases from text and rank them in order of relevance and sentiment.

Step 6 - Add a Watson Emotion Analysis Action

We will now add a Watson Cognitive Emotion Analysis action to our workflow. The Emotion Analysis function takes in an unstructured text value and returns a distribution of emotions for the text.

For each text value passed into the function, Watson returns a statistic for Fear, Disgust, Anger, Sadness, and Joy. These emotion indicators are powerful predictors when combined with machine learning techniques. They can be used for a wide range of analytical tasks, such as understanding overall customer emotions towards products or services, ranking employee attitudes, or providing indicators to forecast market trends.

From the Actions drop-down menu, select Artificial Intelligence > Watson > Emotion Analysis. The Watson AI Emotion Analysis dialog will load, see below:

Steps required to add the Emotion Analysis action:

  • Select your Credentials from the drop-down list. This is your Watson API key.
  • Set the Text Variable input to the variable Blog Post Text. Remember, this will pull the text data from the variable and dynamically pass the value to Watson.

When you are done configuring the action, click OK. The Watson AI Emotion Analysis workflow step will be added to the workflow.

Right-click the step and choose Run Selected to execute the emotion analysis function once. The text data is passed to Watson for evaluation and a dataset containing the results is returned.

Our working data container now contains a result dataset called 'EmotionAnalysis.docEmotions'.

The result dataset returned from Watson contains the following data points:

  • Fear - a density score measuring fear in text
  • Disgust - a density score measuring disgust in text
  • Anger - a density score measuring anger in text
  • Sadness - a density score measuring sadness in text
  • Joy - a density score measuring joy in text

Each text value passed to Watson returns one record of emotion scores. We can run the loop to completion to execute the emotion analysis for each row in our dataset. The results accumulate in the EmotionAnalysis.docEmotions data collection. When the loop finishes executing, we have a dataset containing an emotion matrix for all of the blog posts in our sample data.

This allows us to generate, analyze and summarize massive amounts of emotion data across large document stores or many text values.
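
As a sketch of that summarization step, the emotion matrix can be reduced to an average score per emotion. The rows below are hypothetical:

```python
# Hypothetical emotion matrix accumulated by running the loop over every
# blog post; each row holds one document's emotion scores.
emotions = [
    {"Fear": 0.10, "Disgust": 0.05, "Anger": 0.02, "Sadness": 0.20, "Joy": 0.70},
    {"Fear": 0.30, "Disgust": 0.15, "Anger": 0.12, "Sadness": 0.40, "Joy": 0.20},
]

# Average each emotion across all documents, then find the dominant one.
labels = ["Fear", "Disgust", "Anger", "Sadness", "Joy"]
summary = {e: sum(row[e] for row in emotions) / len(emotions) for e in labels}
dominant = max(summary, key=summary.get)
```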

Using this function we can do tasks such as analyzing the total emotions of customer interactions with email support staff or bulk aggregate customer emotions towards different products or services.

Step 7 - Add a Watson Sentiment Analysis Action

The Watson Cognitive Sentiment Analysis function is used to evaluate a single statistical sentiment score for a text value.

In our previous actions we have seen how Watson can return sentiment scores for recognized entities and keywords. In those examples the sentiment was targeted towards specific objects or keywords identified in the text.

The Sentiment Analysis function evaluates an overall sentiment score for a text value and does not focus on granularity or object level sentiment.

From the Actions drop-down menu, select Artificial Intelligence > Watson > Sentiment Analysis. The Watson AI Sentiment Analysis dialog will load, see below:

Steps required to add the Sentiment Analysis action:

  • Select your Credentials from the drop-down list. This is your Watson API key.
  • Set the Text Variable input to the variable Blog Post Text. Remember, this will pull the text data from the variable and dynamically pass the value to Watson.

When you are done configuring the action, click OK. The Watson AI Sentiment Analysis workflow step will be added to the workflow.

Right-click the step and choose Run Selected to execute the sentiment analysis function once. The text data is passed to Watson for evaluation and a dataset containing the results is returned.

Our working data container now contains a result dataset called 'Taxonomy.docSentiment'.

The result dataset returned from Watson contains the following data points:

  • Sentiment.type - the overall sentiment class for the text (Positive,Negative,Neutral).
  • Sentiment.score - a decimal score representing the sentiment of the text. A value of +1 is most positive and a value of -1 is most negative
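
One common downstream use of the overall score is bucketing documents into positive and negative feedback. A sketch with hypothetical scores on the -1 to +1 scale described above (the 0.25 cutoff is an arbitrary choice):

```python
# Hypothetical per-document sentiment scores on the -1 to +1 scale above.
docs = [
    {"id": 1, "Sentiment.score": 0.82},
    {"id": 2, "Sentiment.score": -0.45},
    {"id": 3, "Sentiment.score": 0.03},
]

# Bucket documents by overall sentiment; the 0.25 cutoff is arbitrary,
# and scores near zero fall into neither bucket.
positive = [d["id"] for d in docs if d["Sentiment.score"] > 0.25]
negative = [d["id"] for d in docs if d["Sentiment.score"] < -0.25]
```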

Our worked example shows how to use Watson's sentiment analysis function to identify and separate positive and negative feedback from customers: read post

Step 8 - Add a Watson Taxonomy Analysis Action

The Taxonomy Analysis cognitive function is used to return classifications and categories for text values.

This action is especially useful when it comes to grouping and clustering text values or documents based on categories or content.

We can use this function for tasks such as segmenting customers who share common attributes (e.g. customers who talk about business and technology vs. customers who talk about advertising and marketing).

From the Actions drop-down menu, select Artificial Intelligence > Watson > Taxonomy Analysis. The Watson AI Taxonomy Analysis dialog will load, see below:

Steps required to add the Taxonomy Analysis action:

  • Select your Credentials from the drop-down list. This is your Watson API key.
  • Set the Text Variable input to the variable Blog Post Text. Remember, this will pull the text data from the variable and dynamically pass the value to Watson.

When you are done configuring the action, click OK. The Watson AI Taxonomy Analysis workflow step will be added to the workflow.

Right-click the step and choose Run Selected to execute the taxonomy analysis function once. The text data is passed to Watson for evaluation and a dataset containing the results is returned.

Our working data container now contains a result dataset called 'Taxonomy.taxonomy'.

The result dataset returned from Watson contains the following data points:

  • Score - a weighted score for a particular label/class.
  • Label - the classification label evaluated for the text.
  • Confident - a yes/no value indicating whether Watson is confident beyond a degree of certainty in a given classification label.
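
A short sketch of using those data points for grouping: keep only the labels Watson flagged as confident. The rows below are hypothetical, shaped after the data points listed above:

```python
# Hypothetical taxonomy rows shaped after the data points listed above.
taxonomy_rows = [
    {"Label": "/technology and computing/artificial intelligence",
     "Score": 0.87, "Confident": "yes"},
    {"Label": "/business and industrial", "Score": 0.31, "Confident": "no"},
]

# Keep only the labels Watson marked as confident.
confident_labels = [r["Label"] for r in taxonomy_rows if r["Confident"] == "yes"]
```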

Step 9 - Add a Watson Relation Extraction Action

The Relation Extraction cognitive action is the final text analysis cognitive action we will introduce in this post. The relation extraction function takes in a text value and returns all entities / objects and their connections to one another.

The Relation Extraction action is almost an extension of the Entity Extraction function; the difference is that it examines the connections between entities or objects in the text.

Sentiment scores are returned which relate objects to objects or objects to a
