Flow Analytics Blog

A Powerful Way to Normalize / Flatten Any JSON Data for Analysis

Flattening JSON data is often a difficult task. In this blog post, I demonstrate how to consume any JSON-based data source into structured data for analysis. The post focuses on using the normalization adapter to automatically create relational tables from web- and file-based JSON sources. The technique outlined here will allow you to integrate tens of thousands of data sources on demand and with no code.
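Flow's normalization adapter does this automatically, but the core idea behind flattening nested JSON can be sketched in plain Python. This is an illustrative sketch only, not Flow's implementation; the function name and dotted-path key convention are my own assumptions.

```python
from typing import Any

def flatten_json(obj: Any, prefix: str = "") -> dict:
    """Recursively flatten nested JSON objects and arrays into a single
    flat dict keyed by dotted paths (arrays are indexed numerically)."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten_json(value, path))
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            flat.update(flatten_json(value, f"{prefix}[{i}]"))
    else:
        flat[prefix] = obj
    return flat

record = {"id": 1, "name": {"first": "Ada"}, "tags": ["a", "b"]}
print(flatten_json(record))
# {'id': 1, 'name.first': 'Ada', 'tags[0]': 'a', 'tags[1]': 'b'}
```

Each flattened record becomes one row of a table, which is the same shape a relational normalization step produces.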

A Quick Introduction to the Five Types of Filters in Flow

In this blog post, I provide an introduction to the five filter actions in Flow. Filter actions are functions that select a specific subset of records from a designated data collection based on target match criteria. The post introduces the different types of filters and provides a comprehensive worked video example demonstrating how to configure and implement them against a sample data collection. The filter actions are some of the most elementary and integral operations in the Flow computing framework. Mastering the different types of filters is key to data processing, data analytics, and business intelligence workflow design.
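Flow's filter actions are configured visually rather than in code, but the underlying operation, keeping only the records that satisfy match criteria, can be sketched with plain predicates. The function and field names below are hypothetical illustrations, not Flow's own.

```python
records = [
    {"region": "East", "sales": 120},
    {"region": "West", "sales": 80},
    {"region": "East", "sales": 45},
]

def filter_records(rows, predicate):
    """Return only the rows for which the predicate holds,
    leaving the source collection untouched."""
    return [row for row in rows if predicate(row)]

east = filter_records(records, lambda r: r["region"] == "East")
big = filter_records(records, lambda r: r["sales"] >= 100)
print(len(east), len(big))  # 2 1
```

Each of Flow's filter types can be thought of as a different pre-built predicate applied in this same select-a-subset pattern.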

How to Build Automated HyperCube-Based Business Intelligence Dashboards

In this blog post, I demonstrate how to build a hypercube-based autonomous BI dashboard. I explain the current-state landscape of BI reporting and data analytics technologies. I provide details on the current limitations of existing BI approaches to automated reporting. I then define the characteristics required for a next-generation BI analytics and reporting framework capable of meeting the current and emerging reporting requirements that most businesses face. I provide a worked example demonstrating how to develop a solution which answers these emerging challenges. In the worked example, I show how to compute hypercubes from raw data and use those hypercubes as the basis for n-dimensional drill through dashboards. I explore various transformations and aggregation techniques across hypercubes to demonstrate how to summarize data across multiple dimensions. I show the power of Flow's multidimensional visualization and pivot engine by creating visualizations which allow for 5+ levels of drill-down. I finish the example by designing an interactive dashboard and showing how to distribute the completed report across an organization. Finally, I cover how to deploy the developed workflow to Flow's agent framework to continuously and autonomously execute our reporting tasks on a schedule.

How to Import and Analyze Common File Data Sources

In this blog post, I provide a worked example demonstrating how to import and analyze different types of file-based data sources. File-based data sources are ubiquitous in business data analytics. This blog post focuses on how to work with data stored in delimited files, Excel workbooks, and XML documents. I provide an overview of each of these three data source types as well as a detailed explanation of the challenges of working with XML. I demonstrate how Flow is capable of automatically normalizing any XML irrespective of hierarchical complexity into structured data sets for analysis. I explain the advantages of the Flow approach to XML over traditional alternatives such as XPath. I finish the example by demonstrating basic analytics against the consolidated data by constructing and computing across various hypercubes.
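Flow normalizes XML into relational tables automatically; the generic idea of walking a hierarchy and emitting path/value pairs can be sketched with the standard library. This is a simplified illustration of the concept, not Flow's adapter, and the path notation is my own assumption.

```python
import xml.etree.ElementTree as ET

def flatten_xml(element, prefix=""):
    """Flatten an XML element tree into (path, value) pairs:
    one entry per attribute and per leaf element with text."""
    path = f"{prefix}/{element.tag}" if prefix else element.tag
    rows = []
    for name, value in element.attrib.items():
        rows.append((f"{path}@{name}", value))
    children = list(element)
    if children:
        for child in children:
            rows.extend(flatten_xml(child, path))
    elif element.text and element.text.strip():
        rows.append((path, element.text.strip()))
    return rows

doc = ET.fromstring('<order id="7"><item><sku>A1</sku><qty>2</qty></item></order>')
print(flatten_xml(doc))
# [('order@id', '7'), ('order/item/sku', 'A1'), ('order/item/qty', '2')]
```

Unlike XPath, which requires you to know the document structure up front, this walk discovers the hierarchy as it goes, which is the property the post attributes to Flow's approach.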

Import and Analyze MS Access Data with Flow Analytics

This blog post provides a worked example of how to import and analyze Microsoft Access data. We learn how to use the Access Database integration interface to consume the sample Northwind database into Flow. A step-by-step walkthrough details how to denormalize the various relational tables into a consolidated flattened set for analysis. We learn how to apply generic expressions to compute new data points on the fly. Finally, we learn how to leverage Flow's multidimensional analysis engine to compute hypercubes and summarize the data.

Import and Analyze JSON Data using Flow Analytics

In this blog post, I provide a worked example demonstrating how to import and analyze data from JSON-based sources. Flow allows for the consumption of JSON data into a tabular form for analysis without requiring any knowledge of structure or schema. I demonstrate how to leverage this functionality to read and flatten JSON from a web-based resource into a dataset. I then show how to apply transformations to the data by using the expression builder to calculate new data points on the fly. I show how to compute hypercubes against the flattened data and perform a simple language analysis, highlighting the ability to wrangle and analyze the data. Finally, I demonstrate how to export the transformed data to various file formats, allowing us to persist the flattened set for use elsewhere.

How to Analyze Blank / Missing Values in a Dataset

In this blog post, I provide a worked example demonstrating how to perform an analysis of blanks on a target dataset. When analyzing data, a typical first step is to understand where values are missing. Identifying missing values in your data helps you make more informed decisions about your analysis approach.
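As a rough sketch of what a blanks analysis produces, independent of Flow's actual action, the profile below counts blank values per column, treating `None` and whitespace-only strings as blank (that blank definition is my assumption).

```python
def blank_profile(rows, columns):
    """Count blank (None or empty/whitespace string) values per column."""
    counts = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            value = row.get(col)
            if value is None or (isinstance(value, str) and not value.strip()):
                counts[col] += 1
    return counts

data = [
    {"name": "Ada", "email": ""},
    {"name": None, "email": "x@y.z"},
    {"name": "Grace", "email": "  "},
]
print(blank_profile(data, ["name", "email"]))
# {'name': 1, 'email': 2}
```

A profile like this immediately shows which columns are reliable and which need cleansing before analysis.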

Perform a Word Count Analysis Using Flow

This article demonstrates how to perform a word count analysis in Flow. In this blog post, I provide a worked example showing how to take in unstructured natural language data and compute a unigram language model against that data. The language analysis returns a new profile dataset which holds each unique token present in the natural text and the number of times it occurs. This blog post teaches a quick one-step technique for doing an initial exploratory analysis of unstructured text data.
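The unigram model described above is equivalent to a token frequency table. As a hedged sketch of the same computation outside Flow (the tokenization rule here is a simplifying assumption):

```python
import re
from collections import Counter

def unigram_counts(text):
    """Lowercase the text, split it into word tokens, and count
    how many times each unique token occurs."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

counts = unigram_counts("To be, or not to be: that is the question.")
print(counts["to"], counts["be"])  # 2 2
```

The resulting counter is the "profile dataset" shape the post describes: one row per unique token plus its occurrence count.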

Denormalize - Join Datasets using Flow Analytics

This blog post demonstrates how to configure the denormalize function to join disconnected data sets together. I provide a worked example that shows how to first import then join a collection of delimited files. After denormalizing the data, I show how to build and use a hypercube to aggregate and summarize the data.
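In relational terms, denormalizing disconnected datasets is a join on a shared key. The sketch below illustrates that idea with an inner join over lists of dicts; it is a generic illustration with hypothetical names, not the Flow denormalize function itself.

```python
def denormalize(left, right, key):
    """Inner-join two lists of dicts on a shared key column,
    merging each matching pair into one flat record."""
    index = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = index.get(row[key])
        if match is not None:
            joined.append({**match, **row})
    return joined

orders = [{"cust_id": 1, "total": 50}, {"cust_id": 2, "total": 75}]
customers = [{"cust_id": 1, "name": "Ada"}, {"cust_id": 2, "name": "Grace"}]
print(denormalize(orders, customers, "cust_id"))
# [{'cust_id': 1, 'name': 'Ada', 'total': 50},
#  {'cust_id': 2, 'name': 'Grace', 'total': 75}]
```

Building the lookup index first keeps the join linear in the size of the inputs rather than quadratic.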

How to Deduplicate a Dataset

This blog post demonstrates how to identify and remove duplicate records from a dataset. I provide a worked example showing how to configure and implement the deduplicate function against some sample customer data. The deduplicate function is an important action which allows the workflow developer to create rich data validation and transformation rules.
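The essence of deduplication, keeping the first occurrence of each unique key and dropping later repeats, can be sketched in a few lines. This is a generic illustration under my own assumptions (first-wins policy, exact key match), not Flow's deduplicate action.

```python
def deduplicate(rows, key_columns):
    """Keep the first occurrence of each unique combination of
    key-column values and drop subsequent duplicates."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

customers = [
    {"email": "a@x.com", "name": "Ada"},
    {"email": "b@x.com", "name": "Bob"},
    {"email": "a@x.com", "name": "Ada L."},
]
print(deduplicate(customers, ["email"]))  # keeps the first two rows only
```

Choosing which columns form the key is the important design decision: deduplicating on `email` alone treats "Ada" and "Ada L." as the same customer.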

Building Grouped Reports with Flow Analytics

Here is another post focusing on building reports in Flow. In this post, I'll discuss building grouped reports. A grouped report organizes data into one or more nested groups, where each group is a collection of records sharing a common column value. There are two basic methods you can employ to create grouped reports in Flow. The first is to add a Grouped Report action to a new or existing workflow. The second is to open a hypercube within the Flow portal and click the Create Report button in the toolbar at the top of the hypercube view. This post will cover the first method.
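The grouping a grouped report performs can be sketched with the standard library; this is a conceptual illustration, not the Grouped Report action itself. Note that `itertools.groupby` only groups consecutive runs, so the rows must be sorted by the group column first.

```python
from itertools import groupby
from operator import itemgetter

def grouped_report(rows, group_col):
    """Bucket rows by a common column value, as a grouped report does.
    Sorting first is required because groupby only merges adjacent runs."""
    ordered = sorted(rows, key=itemgetter(group_col))
    return {value: list(group)
            for value, group in groupby(ordered, key=itemgetter(group_col))}

sales = [
    {"region": "East", "amount": 10},
    {"region": "West", "amount": 20},
    {"region": "East", "amount": 5},
]
report = grouped_report(sales, "region")
print(sorted(report), [r["amount"] for r in report["East"]])
# ['East', 'West'] [10, 5]
```

Nesting groups (group within group) is the same operation applied recursively to each bucket.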

How To Use Flow and Watson AI for SEO Keyword Research

This article demonstrates how to use AI (artificial intelligence) to perform keyword research for SEO (search engine optimization). Watson cognitive actions are leveraged to decompose keywords from competitor websites. I then compile the keywords into a dataset to provide better insight into potential SEO strategy.

Use Flow and Artificial Intelligence to Analyze the News

In this blog post, I provide a worked example demonstrating how to design a workflow which extracts and analyzes cryptocurrency news articles using artificial intelligence. I explain how to use the HTML integration interface to extract links for all top news stories from a target website into data. I show how to use generic expressions to transform and clean the raw links, preparing them for processing. Flow is used to loop through each of the structured links and invoke the built-in Watson artificial intelligence functions to perform advanced cognitive analytics against the text of each news article. Flow collects the results of the cognitive analysis and compiles an aggregate dataset of sentiments, emotions, concepts, topics, keywords, and named entities for all of the supplied articles. I finish the example by showing how to compute hypercubes against the cognitive output to summarize the results and generate various multidimensional views.

How to Perform a Cognitive Keyword Extraction Using Flow

This post demonstrates how to perform a cognitive keyword extraction against natural language text data in Flow. In this worked example, I show how to use the artificial intelligence actions to process unstructured text values. The AI actions are used to extract all important keywords, analyze sentiment toward those keywords, and compute emotion distribution scores for each string extracted from the unstructured text. The concepts examined in this post teach a powerful technique which can be used to develop advanced cognitive workflows against any data source.

Doing Data Quality with Flow Analytics

In this article, I provide an introduction to measuring and evaluating data quality using Flow. I briefly discuss data quality dimensions and data quality assessment. Then I examine how a schema-on-write approach increases the time and cost required to assess data quality along with a brief discussion of schema-on-read technology. I then introduce Flow's "Generic Data" technology as a solution to the deficiencies of schema-on-write and schema-on-read for data quality. Finally, I provide a hands-on working example of doing data quality in Flow using some sample name and address data.

Benford Analysis Using Flow Analytics

In this post, I show how to build a reusable eight-step Flow that performs a Benford's Law analysis on a sample dataset. The Flow loads the sample dataset, obtains the first digit from each observation, builds a hypercube and uses it to count the first digits, extracts a dataset containing the observed distribution, and finally computes the expected distribution and compares it to the observed distribution by taking the difference.
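The same comparison can be sketched outside Flow. Under Benford's Law, the expected frequency of leading digit d is log10(1 + 1/d); the function below computes observed versus expected and their difference, mirroring the steps listed above (names and first-digit extraction rule are my own assumptions).

```python
import math
from collections import Counter

def benford_comparison(values):
    """For each leading digit 1-9, return (digit, observed frequency,
    Benford expected frequency, observed - expected)."""
    # Strip any leading zeros and decimal points so the first
    # significant digit is found (e.g. 0.023 -> 2).
    digits = [int(str(abs(v)).lstrip("0.")[0]) for v in values if v]
    observed = Counter(digits)
    n = len(digits)
    rows = []
    for d in range(1, 10):
        expected = math.log10(1 + 1 / d)
        actual = observed[d] / n
        rows.append((d, actual, expected, actual - expected))
    return rows

rows = benford_comparison([1, 12, 19, 2, 23, 3, 104, 1500, 8, 95])
print(round(rows[0][1], 2), round(rows[0][2], 3))  # 0.5 0.301
```

Large positive or negative differences across the digits are the signal a Benford analysis looks for when screening data for anomalies.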

Building Tables and Pivot Tables with Flow Analytics

In this blog post, we'll build a six-step workflow that produces Pivot Table and Table results. I show how to load data, use expressions to derive time-period values from a date field, build a hypercube using those time-period values as dimensions, and finally create and view pivot table and table results using the hypercube.

Building Tabular Reports with Flow Analytics

Flow Analytics enables you to build many types of reports, such as tabular, grouped, pivot tables, tables, and data summaries, among others. A tabular report is the most basic type of report you can build in Flow Analytics. Tabular reports organize data into a multicolumn, multirow format, with each column corresponding to a column in a dataset. In this post, I show how to design a workflow that generates a tabular report in just a few steps.

An Introduction to Building Dashboards in Flow Analytics

Flow enables you to build dashboards containing a variety of elements including tables, charts, reports, and data summaries, among others. This post focuses on two methods you can use to build, populate, and update dashboards. I show how to add a new dashboard, then how to create and add a chart result using one of the sample datasets provided. Next, I provide an in-depth discussion of adding workflow-generated results to a dashboard.
