Enhancing Lean Six Sigma with advanced analytics

Within the Six Sigma framework, you may spend a large amount of your time gathering and analysing data, applying hypothesis testing, and looking for the root cause of an issue or improvement.  Your role is part detective, part data analyst.  However, with the emergence of the Industrial Internet of Things (IIoT), products, processes and issues are becoming more complex.  Consequently, the data to analyse is often too large (or too sparse), scattered across disparate systems, and rarely in a clean, ready-to-use format.

The challenge becomes not just defining the problem, but defining the dataset and then cleaning and manipulating it so that it can be analysed in Minitab or Excel.  The good news, however, is that IIoT trends are driving innovation in process efficiency across the continuous improvement lifecycle.  Industrial IoT and big data are compelling enablers for real-time improvements, as well as for enhancing and dramatically speeding up Lean Six Sigma-style improvement.  The impact on business operations is significant, and new technologies are bringing enhanced analytical capability to the typically data-intensive stages of Lean Six Sigma projects.

GE has publicly predicted $1 trillion in annual opportunity from improving how assets are used and how operations and maintenance are performed within industrial markets.  McKinsey estimates that the potential economic impact of Industrial IoT data analytics will be between $1.4 trillion and $4.6 trillion by 2025 for factories and remote worksites such as oil rigs (in 2015 dollars).  The main use cases are operations optimisation and predictive maintenance.


Applying Lean Six Sigma in the IIoT age

So the enabler is the data, but of course it is a case of finding the right analytical methods to make the most sense of that data and unlock insights that can spur action.  When using a structured problem-solving approach like Lean Six Sigma, problems are typically resolved by applying an analytical framework and harnessing domain knowledge.  Teams generate appropriate datasets, which are fed into statistical packages to identify root causes by confirming hypotheses.

This is a tried and tested approach and has resolved many issues since it was invented at Motorola several decades ago.  However, the landscape is changing, and there are challenges ahead with the growth of big data and the complexity of new products and processes.  It is becoming increasingly difficult to quickly identify, join, cleanse and transform datasets to test hypotheses, and the cleansing process itself can introduce biases.  For these new challenges, new technology is needed to support rapid problem solving.

In contrast, there are new technologies on the market which don’t need pre-determined hypotheses or the cleansing of data. They can work with any sized dataset and can rapidly pinpoint possible root causes and solutions, which can be deployed within the problem solving framework. They use ‘non-statistical’ algorithms and ‘machine-learning’ to simplify and speed up the analytical process. In essence, the root causes of issues are pinpointed far more accurately and faster.

An identified root cause might specify a combination of process tolerances that the problem-solving team can change to prevent future defects in the most economical way.  By being ‘non-statistical’, this approach avoids the bias that can be introduced with dirty datasets, where statistical methods break down and require cleansing, sampling and other assumptions.  This simplicity is coupled with speed, particularly thanks to distributed architectures and in-memory continual analysis, which can process vast datasets quickly and in linear time rather than the exponential time most predictive analytics techniques require.  Rather than taking days or weeks to identify faults, this new breed of analytics can work in quasi-real time, sometimes in just a few seconds.

This approach supports the Six Sigma framework in several ways.  It can accelerate, and improve confidence in, project selection: instead of relying on opinions and experience to select a problem area, the data itself identifies where in the process projects should be initiated.  With existing analytical software, users would have had to significantly reduce large datasets before they could be analysed.  Another advantage of software that isn’t hypothesis-based is that gathering and cleaning datasets from IT systems can be simplified.  The entire dataset can be incorporated into the analysis, and the team doesn’t have to pre-select which potential critical inputs to study, again avoiding bias, guesswork and lost time.

This adds an additional, powerful tool to the investigation process.  By asking the simple question, “Where did we see defects (or non-conformances, or bottlenecks)?” we can identify possible ‘fault regions’: parameters and bounds of values which correspond to root causes and failure modes.  These might not appear when asking questions like, “Is the output correlated to these inputs?”, where conclusions rest on whether variation in an input makes a strong contribution to variation in the output.
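To make the idea concrete, here is a minimal sketch of deriving a candidate ‘fault region’ by bounding the parameter values seen in defective units.  The records, parameter names and values are entirely hypothetical, and this is an illustrative simplification rather than any vendor’s actual algorithm:

```python
# Hypothetical records: one set of process parameters per unit, plus a defect flag.
records = [
    {"temp": 201, "pressure": 5.1, "defect": False},
    {"temp": 214, "pressure": 5.9, "defect": True},
    {"temp": 198, "pressure": 5.0, "defect": False},
    {"temp": 216, "pressure": 6.1, "defect": True},
    {"temp": 205, "pressure": 5.3, "defect": False},
]

def fault_region(records, params):
    """Bound each parameter by the min/max values seen among defective units."""
    bad = [r for r in records if r["defect"]]
    return {p: (min(r[p] for r in bad), max(r[p] for r in bad)) for p in params}

def inside(region, record):
    """True if every parameter of a record falls within the candidate fault region."""
    return all(lo <= record[p] <= hi for p, (lo, hi) in region.items())

region = fault_region(records, ["temp", "pressure"])
print(region)  # {'temp': (214, 216), 'pressure': (5.9, 6.1)}
```

The output is a set of bounds per parameter — exactly the kind of region a problem-solving team could then test, tighten, or design out.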

With new technology such as automated information retrieval, it is also possible simply to ask, “Show me anomalies and clusters”, without any supervision at all in terms of labelling which events were ‘good’ versus ‘bad’.  This technique can be extremely useful for dealing with rare events that lack statistical significance, or for classifying records from text and structured data (e.g. assigning and validating a warranty claim from the dealer’s repair notes).  Grouping the ‘symptoms’ together in this way is fundamental to the Six Sigma analysis that follows.
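As a toy illustration of grouping ‘symptoms’ without any good/bad labels, the sketch below clusters free-text repair notes greedily by shared informative words.  The notes, stop-word list and clustering rule are hypothetical and deliberately crude — real information-retrieval tools are far more sophisticated — but the principle of letting the records group themselves is the same:

```python
# Hypothetical dealer repair notes attached to warranty claims.
notes = [
    "battery drains overnight, unit otherwise ok",
    "no audio from speaker after update",
    "speaker crackles then no audio",
    "battery swollen, replaced under warranty",
]

# Words too generic to signal a symptom (illustrative list).
STOPWORDS = {"no", "from", "after", "then", "under", "ok",
             "otherwise", "unit", "replaced", "warranty"}

def tokens(note):
    """Informative words in a note, lower-cased and stripped of punctuation."""
    return {w.strip(",.") for w in note.lower().split()} - STOPWORDS

# Greedy clustering: a note joins the first cluster it shares a token with.
clusters = []  # each entry: (set of tokens, list of member notes)
for note in notes:
    t = tokens(note)
    for ctoks, members in clusters:
        if ctoks & t:
            members.append(note)
            ctoks.update(t)  # grow the cluster's vocabulary
            break
    else:
        clusters.append((t, [note]))

print([len(m) for _, m in clusters])  # [2, 2]: a battery cluster and an audio cluster
```

With the symptoms grouped, each cluster can then be taken forward as a candidate problem area for the Six Sigma analysis proper.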


Growing the ROI of problem solving projects

These new analytical approaches can add value to, and increase the ROI of, problem-solving projects.  The life cycle of specific projects can be shortened, particularly in the Define, Measure and Control phases of DMAIC, by preventing defective products from being produced and by predicting maintenance requirements.  The fault regions may correspond to ‘in-tolerance’ failure, i.e. failure that is not picked up by Statistical Process Control (SPC) because no single parameter is out of tolerance.  It is as if, within this region, the product were accidentally designed to fail.
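A small sketch makes the ‘in-tolerance failure’ point clear: a unit can pass every individual parameter check yet sit inside a joint region where failures concentrate.  The spec limits, parameter names and fault region below are hypothetical, chosen only to illustrate why univariate checks can miss a combination effect:

```python
# Hypothetical spec limits: each parameter is checked individually on the line.
SPEC = {"voltage": (3.6, 4.3), "impedance": (10.0, 14.0)}

# A hypothetical fault region found by analysis: a corner of the joint
# tolerance box where units fail even though every single check passes.
FAULT_REGION = {"voltage": (4.1, 4.3), "impedance": (12.5, 14.0)}

def passes_spc(unit):
    """Classic univariate check: each parameter inside its own limits."""
    return all(lo <= unit[p] <= hi for p, (lo, hi) in SPEC.items())

def in_fault_region(unit):
    """Combination check: every parameter inside the identified fault region."""
    return all(lo <= unit[p] <= hi for p, (lo, hi) in FAULT_REGION.items())

unit = {"voltage": 4.2, "impedance": 13.0}
print(passes_spc(unit), in_fault_region(unit))  # True True: in tolerance, yet latently faulty
```

Once such a region is quantified, the combination check can be added to end-of-line screening so that latently faulty units are caught before they ship.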

Once the regions which define the fault have been identified and the parameters quantified, engineers can work out corrective actions.  These could be to remanufacture, to redesign the product, or (in the case of a warranty issue) to specify precisely which products in the field require corrective action and the lowest-cost fix, without having to recall an entire fleet or batch or implement an expensive workaround.  Conversely, tolerances can be relaxed where they don’t cause quality issues or bottlenecks, saving cost and speeding throughput.

One major utilities provider recently used this new breed of analytics tool to significantly increase the insight they would typically get from Minitab.  They had a large volume of maintenance records which were mainly classifiable only by the free text written by the maintenance engineers.  A great deal of pre-processing was required even to get the data into a state where it could be analysed in the traditional fashion.  This was estimated to take six months, but by using automated information retrieval technology it was finished in a matter of days.  Predictive analytical software was then applied on top of these clusters, together with their existing Six Sigma approach, and they were able not only to speed up the process but also to benefit from additional insights gleaned from the data.

Motorola, the home of Six Sigma, has also used the latest automated technology to support its quality processes.  It was used to virtually eliminate two of its most prominent No Fault Found (NFF) quality issues for a particular mobile phone model.  These NFF issues related to audio and battery, and were costing a significant amount in returns, replacements and reputation – the typical costs of poor quality (COPQ).  The available data comprised warranty data highlighting the NFF issues, plus End of Line (EOL) testing data consisting of 170 parameters.  Although reasonably comprehensive, the warranty data was incomplete and “dirty”, making any statistical approach to finding the root cause extremely challenging.  Notably, all 170 testing parameters were within tolerance, meaning it was not possible to identify the root cause from this data alone.

The analysis picked out the key parameters which, albeit all individually within tolerance, were in combination causing the issue.  Furthermore, the ‘fault region’ was quantified, meaning the manufacturing engineers could easily identify and predict when the failure would occur again.  The faults were swiftly and almost entirely eliminated: Motorola adjusted its Statistical Process Control so that no latently faulty phones passed EOL testing.  The phone was also redesigned in the next generation to improve the yield inside the factory.  As a result of using the software, the two NFF issues, which had been the top two warranty failures, no longer featured in the top 50.

Organisations that successfully leverage industrial IoT with the appropriate big data analytics, and integrate other enabling technologies alongside their Six Sigma processes, end up with a much more advanced version of Lean and continuous improvement in general.

With the amount of ‘big data’ within most manufacturing organisations today, real value and insight are already being unlocked for many companies’ benefit.  Advanced analytics won’t ever replace the tried and tested quality techniques we use today, but it can complement them and make them more powerful.  By integrating the two, we are already able to answer questions we couldn’t answer before, and we will become better at identifying issues, performing analyses, and solving business problems.

About your guest bloggers:  Dan Somers is a founder of Warwick Analytics. He holds an MA from Cambridge University in Natural Sciences and a Diploma in Business Studies. Warwick Analytics’ algorithms automatically resolve product faults and process failures.

Dave Hauff is co-founder of GELRAD Europe, and holds a B.S. in Nuclear Engineering and an M.S. in Engineering Management. GELRAD is a global consulting company using Lean Six Sigma, Change Acceleration, and DFSS.



  1. You mention, “In contrast, there are new technologies on the market which don’t need pre-determined hypotheses or the cleansing of data. They can work with any sized dataset and can rapidly pinpoint possible root causes and solutions, which can be deployed within the problem solving framework. They use ‘non-statistical’ algorithms and ‘machine-learning’ to simplify and speed up the analytical process. In essence, the root causes of issues are pinpointed far more accurately and faster,” but don’t provide any examples of the types of new technologies.

    Can you provide an example?

    I agree the amount and availability of data is creating new problem solving opportunities, but I am not sure how to pursue using what you suggest in the article.

    • Thanks Ernest for your comment.

      One such technology is from our company, Warwick Analytics. It can generate predictive analytics automatically from dirty, unstructured and incomplete data without hypotheses and without cleansing.

      There are many examples of applications in automated root cause analysis, yield improvement, resolving bottlenecks and predictive maintenance. We have case studies in different manufacturing sectors: automotive, aerospace, pharma, process and energy production. See website: http://www.warwickanalytics.com. If you’d like to know more please complete the form and I’ll get back to you.