Analyzing Heart Disease risk using Key Influencers AI visual in Power BI

The Gartner Magic Quadrant for 2019, announced earlier this month, names Microsoft the leader in Analytics and Business Intelligence Platforms. Microsoft also coincidentally announced the public preview release of its first AI-driven visual for Power BI Key Influencers – this month, among a number of new features for Feb 2019. Inbuilt integration of Power BI with many Azure data products would catapult Power BI miles ahead of Tableau in the long run.

EN-CNTNT-GartnerMQ-BI2019.jpg

Key Influencers is the first of many AI visuals Microsoft would release I assume, in their efforts to democratize AI and make their customers look cool 🙂 In this article, we will go over the various features of this new visual using a publicly available dataset, and get familiar with interpreting the results. Download a copy of Power BI Desktop file for the example I am using in this article and try it out yourself using the free Power BI Desktop tool.

Key Influencers

Key Influencers is a powerful Power BI visual that lets us understand the factors that drive a metric. Power BI analyzes data, ranks the factors that matter, and displays them as key influencers. Under the hood, Power BI uses ML.NET to run logistic regression to calculate the key influencers. Logistic regression is a statistical model that compares different groups to each other, also taking into consideration the number of data points available for a factor.

As the visual is still in preview, there are a number of limitations. My first attempt to use Key Influencers using a survey responses dataset was rather unimpressive.

In my second attempt, I used the popular Heart Disease dataset from UCI to identify key influencers affecting heart disease, and achieved good results.

Heart Disease - Key Influencers Power BI.jpg

Limitations

Before we delve any further, let us take a look at the limitations that apply in the public preview phase of the visual. Pay attention here to avoid frustration as you explore the visual.

Following features are not supported:

  • Analyzing metrics that are aggregates/measures
  • Direct Query / Live Connection / Row Level Security – support
  • Consuming the visual in Power BI Embedded and Power BI mobile apps

Using the Key Influencers Visual

As a first time user, I found the Key Influencers visual intuitive and self-explanatory. It hardly takes a few minutes to set up the visual once you have clean data. Check out Microsoft documentation to understand all aspects of the visual. You could also download a copy of Power BI Desktop file for the example I am using in this article.

Note: Keep column names readable as this will help interpret the visual better

Getting Familiar

There are 2 tabs available within the visual – Key influencers and Top Segments.

The Key influencers tab displays the key factors affecting the metric value selected. In this case, the top factor that affects positive diagnosis of Heart Disease, based on our dataset, is Reversible Defect Thalassemia – increasing the risk of heart disease by 2.83 times when the value of Reversible Defect Thalassemia is 7.

On the right hand side, there is a column chart showing distribution of the selected factor. The check box at the bottom lets you display only influential factor values. We could click-select a different factor to see how it contributes to heart disease.

Heart Disease - Key Influencers Power BI - Getting Familiar.jpg


The Top segments tab displays different segments identified by Power BI within the population, for the metric value selected. Click-select a segment to view more details such as the factor values that define the segment, and how the segment compares against the average. We could also drill down further into the segment to split by additional fields.

Under the hood, Power BI uses ML.NET to run a decision tree to find interesting subgroups. The objective of the decision tree is to end up with a subgroup of datapoints that is relatively high in the metric we are interested in – in our case, the patients who  are suspected to have heart disease.

Heart Disease - Key Influencers Power BI - Top Segment.jpg

 

Heart Disease - Key Influencers Power BI - Top Segment Details.jpg

First Impression

Considering that it is still in preview and is only going to get better, Key Influencers ticks the right boxes. The rationale behind choosing a popular dataset, such as the Heart Disease dataset from UCI, for my example was to allow for comparison of results to Machine Learning models that are already publicly available. Power BI seems to identify influencers correctly and does a good job at presentation. I’m thoroughly impressed by this new feature.

Suggested Reading

If you enjoyed this article, consider reading my other articles on Azure data products.

https://sqlroadie.wordpress.com/2018/04/29/what-is-azure-cosmos-db/
https://sqlroadie.wordpress.com/2018/08/05/azure-cosmos-db-partition-and-throughput/
https://sqlroadie.wordpress.com/2019/02/17/azure-databricks-introduction-free-trial/

Resources:

Download the Power BI workbook used in the example – https://drive.google.com/open?id=13Pt25UPt7dOW3raZmavHHVl7gAStv5uy
Intro to Key Influencers by Microsoft: https://docs.microsoft.com/en-us/power-bi/visuals/power-bi-visualization-influencers
Power BI Feb 2019 feature summary – https://powerbi.microsoft.com/en-us/blog/power-bi-desktop-february-2019-feature-summary/

Heart Disease Data source
Donor:  David W. Aha (aha ‘@’ ics.uci.edu) (714) 856-8779
Creators:

  • Hungarian Institute of Cardiology. Budapest: Andras Janosi, M.D.
  • University Hospital, Zurich, Switzerland: William Steinbrunn, M.D.
  • University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D.
  • V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D.

Global AI Bootcamp – Developing AI, responsibly

Global AI Bootcamp, Brisbane 2018

Yesterday, I attended the Global AI Bootcamp Brisbane (at the Precinct, Valley) along with nearly 100 other technology enthusiasts. The event was well organized by David Alzamendi of Wardy IT Solutions and Thiago Passos of SSW Consulting. I rocked up to the event hoping to get an update on the rapidly evolving Data Platform offerings from Microsoft. While the event did meet most of my expectations, it planted one particular seed of thought in my head. As I walked away at end of the day, I was enthralled about the rigorous, almost paranoic, awareness and research of the social responsibility that AI developers and solution providers should exert.

IMG_20181215_093303

Role of Ethics in AI

The event started with playback of the recorded keynote address by distinguished researchers of Microsoft AI. It was probably the small shot of long black coffee I had just had, I sat there wide-eyed and amazed by the wise words of Hanna Wallach, Principal Researcher at Microsoft Research, NYC. Hanna’s research covers a broad range of topics; she was clearly passionate about the impact of AI on society – FATE (Fairness, Accountability, Transparency and Ethics in AI). I had never thought about ethics in AI the same way, but it made perfect sense.

The one reaction that the average Joe has to AI is the notion that it is almost magical, but always reliable and authentic. That’s a dangerous prejudice! AI, much like any other branch of science, can be used for good or bad. The elevated status that AI enjoys amongst the masses, thanks to Hollywood movies, and research companies pitching AI as the field of science that would shape 21st century, leads to the belief that AI = TRUTH! Those in the know, are aware that inherent biases in training data sets lead to biases in scoring. My heart skips a beat just to think how a technologically illiterate person may be led to believe utter lies, much like the predictions of this highly controversial Israeli company – Faception. They claim to be able to apply facial personality analytics technology to predict a person’s IQ, their personality – whether they are an academic researcher or a terrorist, for instance, just by looking at their face.

Utilizing advanced machine learning techniques we developed and continue to evolve an array of classifiers. These classifiers represent a certain persona, with a unique personality type, a collection of personality traits or behaviors. Our algorithms can score an individual according to their fit to these classifiers (sic).

When Vanessa Love, Assistant Director of Integration and DevOps at Australian Bureau of Statistics, talked about Faception during her session – I ain’t afraid of no terminator – at the Bootcamp, my initial impression was that the company was called out on its claims and was obviously identified as a scam. I could resonate with her frustration and anger as she went on to explain how Faception was working with governments, and clients in Fintech and Retail. There are numerous such shocking applications of AI. For instance, Stanford researchers built an AI solution that could predict a person’s sexuality from facial analysis. The only aspect that is more appalling than the intent of their research is the fact that the average Joe doesn’t read the T&Cs – in this case, their model was correct only 81% and 71% of the times in predictions for males and females respectively. So, what about the 48 wrong predictions for every 152 correct predictions? Vanessa also mentioned Amazon’s AI enabled recruiting tool that was stood down due to racial and sexist biases. In this case, AI helped to reveal the truth about inherent historical bias in recruitment practices at one of the biggest technology companies. So, sometimes AI = TRUTH. Tricky? Food for thought!

Will AI enslave human beings?

The age-old question! This is a recurring question I am asked when I discuss AI with less technologically-literate acquaintances. I usually go on to explain how Machine Learning works, and the differences between Supervised and Unsupervised learning. The key point I try to drive home is that AI is not a person or a thing, and more importantly, like all software solutions, it is error prone and not to be taken for granted. When we do take technology for granted, self-driving cars kill people and auto-pilot programs crash planes. Technology is meant to aid and assist, not render humanity obsolete!

Developing AI, responsibly

Luckily, researchers like Hanna Wallach and Yoshua Bengio are actively working on building a code of conduct for AI research and application. A result of that vigil is the Montreal Declaration for Responsible Development of Artificial Intelligence, inked earlier this month. At the time, I read about it and quickly slid that thought to the slow sectors of my brain. I signed the declaration a little while ago. As a technologist, I not only have the responsibility to develop AI responsibly, but also educate others about the pros and cons of  AI solutions.

Other interesting learnings from the Bootcamp

Jernej Kavka, Software Architect at SSW Consulting, presented his experiments with Real-Time Face Recognition using Microsoft Cognitive Services. He explained how his team successfully reduced costs by 99% by applying caching and pre-processing. I found his session remarkable.

Joseph Zhou, Data Scientist and Solution Development Consultant at Avanade – talked about drag-and-drop AI using Azure Machine Learning Services. I found his session crisp and relevant. Later, Yousry Mohamed, Consultant at Readify, explained how to apply DevOps practices in Azure Machine Learning and automating model-selection using “a bit of simple code”. As always, Yousry’s presentation was animated and wonderful.

A day well spent!

Overall, it was a day well spent. Thanks to all sponsors and volunteers for making the event happen! I could tell everyone was excited to be there, and we all went home with various thoughts in our little heads, a little wiser than we were at start of the day. The thought in my head was – what about the Tesla driver who relied on the self-driving capability, what about the black Facebook employee who the soap dispenser denied, what happens when AI goes wrong?