There is an epidemic in machine learning due largely to a few core misconceptions. As CEO,
you are getting a “false confidence” from your team that they have it under control. But when you
really dig into the misconceptions, you are going to find out that they don’t, and to compound the situation, they are big-timing the very folks who can help them.
Machine learning done correctly can be incredibly impactful to your business, and there is a
growing sense of urgency in the C-suite around how to capitalize on it, quickly. But to do this, CEOs must take the time to dig in and confront the misconceptions head-on. CEOs need to give machine learning the attention it deserves and get involved in the strategy and approach themselves. This is not an initiative they can blindly hand off to their team of “ML experts” if they want a good outcome.
As the Executive Chairman of an ML company, I have the privilege of working with a team of actual ML experts, led by one of the most prolific ML researchers in the world. What is amazing to me is how many times our experts get big-timed by teams in larger organizations that think they know ML. These “Big-Timers” lecture you that their organization is so big and established that of course they know what they are doing and have this problem solved. Time and again, I’ve learned that almost everybody thinks they know ML, and they are almost always mistaken.
Assuming you read the last article on decoding the alphabet soup of machine learning, please don’t stop there. Dig into the following three misconceptions so that you can ask your team the right hard questions and lead them to effectively cultivate and capitalize on the ML opportunity.
Misconception #1: I’ve got a team who knows ML.
What you probably have is a team who can work with simple neural networks, which are single-layered algorithms where both the input data and output data are labeled, e.g., translating English to French. Smart, technical people are usually able to produce these results. DO NOT mistake them for ML experts. In many cases these people have only just started working in machine learning. Seriously, ask yourself how many ML Experts you have met who were last year’s Cloud Experts and the prior year’s Security Experts. The answer is a lot.
Because most enterprises don’t realize that these are relatively easy projects to do, their
success leads them to think they’ve “cracked the code” on ML. Emboldened by this success,
enterprises start giving demonstrations in leadership and board meetings across the globe.
This is where the problems that give rise to the epidemic start. The more these demos are seen, the more CEOs and leadership teams think of questions they would like to see answered by the ML technology. These questions get harder and harder because they are complex. The data gets harder and harder because there is less of it (and it is not always labeled), and the answers get harder and harder because now you need real ML experts to solve these questions and deal with the data.
Trust me, the folks who created the first ML success that is being demoed in your company are not going to be able to harness the data and develop the ML technology necessary to deliver answers to the harder and harder questions. Why? Because unless you have a team of PhD-level folks who have studied a combination of mathematics, computer science, and statistics, and have trained in research labs for 15-plus years, you don’t have the team.
Most companies cannot hire these folks because they are busier and harder to come by, and
most of them are linked to top tech companies or have started their own companies. Because these top folks are now busy with business initiatives, they are not teaching as much as they did five years ago, so the supply of this type of talent is shrinking while the demand is exploding. If you are following the space, then you have read about how valuable these ML resources are and how much money they can command in the market now.
When I started implementing ML, I certainly didn’t have the right team either. My initial success quickly turned to disappointment, lost time, and lost money when I finally found out that the ML team I had could not make the jump to deep learning, which requires the ability to build multi-layered neural networks to handle these more complex questions and data. So review your ML team’s degrees, lab experience, number of years dealing with all types of questions and data, and programming experience, which includes coding and leveraging GPU power.
Misconception #2: The ML projects my team are performing will scale to achieve true business IMPACT!
Let’s assume you have the right ML team. The next step is to make sure they are working on projects that will deliver real business impact and scale across your enterprise. Real business impact is defined by only two things: increasing revenue or decreasing costs. If the project is not doing either, you are funding a very expensive science experiment. A well-known example is the AlphaGo algorithm, which beat a world champion at the ancient Chinese game “Go.” As cool as this was (there is a Netflix documentary on it), it did not create business impact (remember: increase revenue or decrease cost). As CEOs, this is what makes our blood boil: spending money doing things that can’t scale and won’t have any business impact. So before your team spends too much time on any ML project, ask yourself these questions:
1. Does this project solve a top three question you or your top customers want answered?
2. Does this project help the company to increase revenue or reduce costs?
3. Does this project create a unique data set for your company?
If you cannot answer “YES” to at least one of these questions, congratulations! Like I did, you are officially funding a science experiment.
Misconception #3: My Data is Ready to Go!
This is my favorite misconception because it is such an easy mistake to make, and I made it. I ran a transaction processing business where we made money off of processing the data
correctly. So of course my data was good. I could not have been more wrong. My data was not accessible, sizable, usable, understandable, or maintainable. This was a major problem,
because without data, there is no ML!
What I found was that my high-powered, high-cost ML team was having to clean data much like the janitor cleans the office each night. This is not glamorous work, but it has to be done (my prediction is that over the next five years, as much money as was spent installing source data entry systems like SAP, Oracle, or Cerner/Epic in the Healthcare space will be spent getting data out of those systems and making use of it).
I share this to make you all feel better because trust me, my situation is not unique. On the data, please dig in and have your team walk you through the following questions:
1. Is our data accessible? (How often have you tried to download 10 files from your
systems? Not very often, because it is frowned upon to take your data anywhere and
there are security systems built around making sure this does not happen. If this is going
to work, you need to be able to get a ton of data to the cloud so it can be used.)
2. Is our data sizable? (In ML, you need a lot of data to get the ML technology to learn. The
more data, the better. At the end of the day, you want the ML technology to perform at
human levels and this cannot be achieved without a sizable data set.)
3. Is our data usable? Is your data clean? Is there junk in the fields, or even if there is no
junk in the fields, is it quality data? (My team was examining a field called job
description and found 27 different versions of the job description of an accountant.
Most of them were real terms used to describe an accountant, such as CPA and ACT, so you
can see that making the data usable is not a no-brainer.)
4. Is our data understandable? (Developing ML technology with human-level performance is all
about training the technology with data. If your team does not know what the data
means in each field and cannot communicate it, then it is pretty hard to train a person
on the data, let alone a piece of technology.)
5. Is our data maintainable? Do we have a process and a designated team to maintain the
data? (Many companies spend a ton of money and time cleaning data but forget to
create a plan for dealing with new data and keeping their clean data current. Bottom
line: you want to reliably produce the data set in an ongoing manner.)
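To make the "usable" question concrete, here is a minimal sketch of the kind of check a team might run on a single free-text field such as job description. Everything in it is an assumption for illustration: the function names, the sample values, and the mapping table are hypothetical, and only the CPA and ACT variants come from the example above. It counts blanks, distinct raw spellings, and values that do not map to a known label.

```python
from collections import Counter

# Hypothetical lookup from raw field values to one canonical label.
# Only "CPA" and "ACT" come from the article; the rest is illustrative.
CANONICAL_TITLES = {
    "cpa": "accountant",
    "act": "accountant",
    "accountant": "accountant",
}


def is_blank(value):
    """True for None, empty strings, and whitespace-only strings."""
    return value is None or not value.strip()


def normalize_title(raw):
    """Return the canonical label, or None if blank or unrecognized."""
    if is_blank(raw):
        return None
    return CANONICAL_TITLES.get(raw.strip().lower())


def profile_field(values):
    """Summarize how usable one free-text field is."""
    filled = [v.strip() for v in values if not is_blank(v)]
    raw_variants = Counter(filled)
    unmapped = sorted({v for v in filled if normalize_title(v) is None})
    canonical = {normalize_title(v) for v in filled} - {None}
    return {
        "rows": len(values),
        "blank": len(values) - len(filled),
        "distinct_raw": len(raw_variants),
        "distinct_canonical": len(canonical),
        "unmapped": unmapped,
    }


# Made-up sample rows standing in for a real job-description column.
sample = ["CPA", "cpa ", "Accountant", "ACT", "", None, "Acct Mgr"]
report = profile_field(sample)
print(report)
```

A report like this gives the team a number to track over time: the gap between distinct raw values and distinct canonical values shows how much cleanup remains, and the unmapped list shows what someone who understands the data still has to decide.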
Companies that will win in ML will develop a Data Science Culture that tackles these five questions.
Learn from my mistakes. After early success, I had “false confidence” and ran headfirst into each of these three misconceptions. The experience almost made me a skeptic about ML because I blamed the ML technology for the issue rather than my own lack of leadership in not digging in.
Do yourself a favor and take a shortcut to ML business impact by confronting these
misconceptions head-on. If you do, your team will be able to substantiate their confidence
when they say they have ML covered. But if you don’t, welcome to the club of Fortune 500
“Big-Timers” who are victims of the ML epidemic!