Storytelling with Data Analytics

This is the age of data, and there are few places where this can be seen more clearly than the process industry. The classic problem has always been extracting the useful information from this vast sea of numbers. Sabisu encapsulates a large range of advanced analytics tools that make this information extraction simple. However, as more and more data is generated by a growing range of assets, a new problem has arisen: from all of the extracted information, how do I determine the underlying story?

The story in the data is not simply the information it contains, but rather:

The subset of connected information that leads to the determination of a meaningful time line of events.

That is, the pieces of information that are related to each other can be extracted to describe a specific event, often in a coherent chronological order. Sabisu provides some important analytics tools to help elucidate this time line.

Example

A plant manager is concerned that the overall cost of operation has increased substantially over the past week. They ask an engineer to investigate.

The engineer could approach this by going directly to the anomaly detection system. However, there are a large number of assets, and it could take a long time to find one with a high anomaly rate. Furthermore, how would they know that this was the asset that caused the problem? It could be an entirely unrelated issue.

A sensible starting point for the engineer is Jidoka, which tells the engineer that steam usage is much higher than it should be, and is currently the highest contributor to the price of non-conformance (PONC) for the plant.

Knowing that something must be underlying the increased steam usage, the engineer turns to correlation analysis. Cross correlation will identify signals that are related. By correlating other process data to the increasing steam curve, it may be possible to identify the section of the process that is responsible for the rising operational cost.

Plant Cost

The cost of operation for the plant. Notice the significant rise towards the end.

It is found that an increase in product viscosity “predicted” the increasing steam usage. This means that it is highly probable that the higher steam usage is due to this increase in viscosity. This can be seen by the off-centre peak in the correlation plot.

The cross correlation of product viscosity with steam usage. Notice the off-centre peak, which suggests that the viscosity increase precedes, and is the likely cause of, the rising steam usage.
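
The idea behind the off-centre peak can be sketched in a few lines of Python. This is a minimal illustration using NumPy, not Sabisu's implementation: the function name `lag_of_peak_correlation` and the synthetic viscosity and steam signals are assumptions made purely for the example. It normalises two equally sampled, non-constant signals, computes their cross correlation at every lag, and reports the lag at which the correlation is strongest.

```python
import numpy as np


def lag_of_peak_correlation(leader, follower):
    """Return (lag, value) of the strongest normalised cross correlation.

    A positive lag means `leader` changes before `follower` by that many
    samples. Both inputs must have the same length and sampling interval.
    """
    a = np.asarray(leader, dtype=float)
    b = np.asarray(follower, dtype=float)
    if a.shape != b.shape:
        raise ValueError("signals must have the same length")

    # Remove the mean and scale to unit variance so the result is comparable
    # between signals with different units (e.g. viscosity and steam flow).
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()

    n = len(a)
    xcorr = np.correlate(b, a, mode="full") / n
    lags = np.arange(-(n - 1), n)  # lag associated with each xcorr entry

    peak = int(np.argmax(xcorr))
    return int(lags[peak]), float(xcorr[peak])


# Synthetic data (made up for illustration): steam follows viscosity
# five samples later, with a little measurement noise added.
rng = np.random.default_rng(0)
viscosity = rng.normal(size=300)
steam = np.concatenate([np.zeros(5), viscosity[:-5]]) + 0.1 * rng.normal(size=300)

lag, strength = lag_of_peak_correlation(viscosity, steam)
print(f"peak correlation {strength:.2f} at a lag of {lag} samples")
```

A peak at lag zero would sit in the centre of the plot. A peak at a positive lag is off-centre and means the first signal moves before the second. A lead in time is consistent with a causal link but does not prove one, so the result is best treated as a pointer to where to look next.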

The engineer knows that an increase in steam usage is due to the automatic process controller opening the steam valve, something it would do to compensate for a higher product viscosity.

Now that the engineer knows that the underlying problem is related to product viscosity, they know which section of the plant is likely to be the cause of the problem. By looking at the anomaly detection results for the assets in this section, the engineer notices an increasing number of anomalies on one of the centrifuges over the past few days. This increase in the number of anomalies precedes the increase in product viscosity. They order a deep clean of the centrifuge and set up automatic anomaly notifications for all of the centrifuges to alert them if a similar problem arises in the future.
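
As a rough illustration of what a notification on a rising anomaly rate might look like, the sketch below counts anomalies over a rolling window of days and reports the first day on which the count exceeds a threshold. It is not Sabisu's notification logic: the function name `first_alert_index`, the window length, the threshold, and the daily counts are all hypothetical values chosen for the example.

```python
def first_alert_index(daily_counts, window=3, threshold=5):
    """Return the index of the first day whose rolling anomaly count is too high.

    `daily_counts` holds the number of anomalies detected on each day.
    The rolling count covers the current day and the `window - 1` days
    before it. Returns None if the threshold is never exceeded.
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    for end in range(window, len(daily_counts) + 1):
        rolling_total = sum(daily_counts[end - window:end])
        if rolling_total > threshold:
            return end - 1  # index of the last day in the window
    return None


# Hypothetical anomaly counts for one centrifuge over a week.
centrifuge_counts = [0, 1, 0, 1, 2, 3, 4]
alert_day = first_alert_index(centrifuge_counts, window=3, threshold=5)
print("first alert on day index:", alert_day)
```

A rolling count is used rather than a single-day count so that one isolated anomaly does not trigger a notification, while a sustained rise does.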

Anomaly detection results for the faulty centrifuge. Notice the high anomaly arrival rate (indicated by the red lines), and the fall in output towards the end.

The Story

The story behind the data is that a problem with one of the centrifuges caused its output to fall. This led to an increase in product viscosity, which was compensated for automatically with an increase in steam. This rise in steam usage caused the operational costs to increase.

Anomaly detection has been configured to alert the engineers if a similar problem occurs, so that they can intervene before the same chain of events unfolds again.

The Order of Information

The order in which information is presented is key to effective communication. It is often tempting to work chronologically, summarising each step taken to reach the conclusion. However, this leads to the most important information being reached last. Instead, it is much more effective to present the key conclusion first, followed by supporting information.

In the case of our example, the engineer may report back to the plant manager. The key information is:

  • The cause of the increased costs was tracked to an under-performing centrifuge.
  • This asset has been cleaned to rectify the problem.
  • Anomaly detection has been configured to provide early warning of a similar event, which should help prevent a recurrence.

These three pieces of information may well be all the manager needs to know. Together they explain the root cause of the problem, show that it has been rectified, and show that steps have been taken to prevent it in the future.

Data Stories in Sabisu

Storytelling with data analytics is the most efficient way to extract the truly useful information from your large datasets. Sabisu provides a range of tools to simplify this process, and helps you display and share this information throughout your organisation.

Contact us

We’re always interested in hearing from you. If you have any comments or suggestions, feel free to get in touch.
