Deep learning has been all the rage. From chatbots in customer service, through image recognition solutions in retail, to autonomous vehicles in transportation – artificial intelligence companies seem to be shaping the future of business. But, as with every tech craze, confusion and overblown expectations reign supreme, and, like moths to a flame, far too many businesses reach for deep learning solutions when they shouldn’t.
There are several factors that make relatively simple models more suitable than their deep learning counterparts, but first, let’s quickly address what deep learning really is. Unfortunately, the term is too often used interchangeably with AI and machine learning (ML), which is misleading. For a clearer understanding of AI and ML, check out our Everything You Need to Know About Key Differences Between AI, Data Science, Machine Learning, and Big Data post.
Deep learning is a subset of machine learning (which is a subset of AI) with several unique characteristics that can be a complete deal-breaker when choosing the right solution for your business. This post is about these characteristics.
What makes deep learning not the right fit for your business?
Costs
Yes, deep learning developments bring about monumental breakthroughs, but not every company lives on the cutting edge of innovation. The problems most businesses face, especially small ones, do not really require very complex and sophisticated methods, which only drive up costs and development time.
Developing ML solutions is not cheap to begin with, and the costs only skyrocket with increased complexity.
Why?
There are many more decisions to make and test: the right type of network architecture, the activation functions, the optimizer, the regularization strategy, and so on, not to mention the many hyperparameters that also need to be fine-tuned.
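To make that concrete, here is a minimal sketch, assuming PyTorch, with made-up layer sizes and values, of how many of these knobs even a tiny network involves:

```python
# Every value below is a decision that has to be made, tested, and paid for.
import torch
import torch.nn as nn

model = nn.Sequential(            # architecture: how many layers, how wide?
    nn.Linear(20, 64),
    nn.ReLU(),                    # activation function: ReLU? tanh? something newer?
    nn.Dropout(p=0.3),            # regularization strategy: dropout rate, weight decay...
    nn.Linear(64, 1),
)

optimizer = torch.optim.Adam(     # optimizer: Adam? plain SGD? with momentum?
    model.parameters(),
    lr=1e-3,                      # learning rate: just one of many hyperparameters to tune
    weight_decay=1e-4,
)
loss_fn = nn.MSELoss()            # loss function: depends on the problem

# A simple baseline, by contrast, has almost nothing to tune:
# from sklearn.linear_model import LinearRegression
# baseline = LinearRegression().fit(X, y)
```

Multiply each of those choices by the number of experiments needed to validate it, and the bill adds up quickly.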
Deep learning is also inherently slow. Several weeks of training on multiple GPUs, for example, is nothing extraordinary. Having to make all these decisions also means paying for them, as more employee time and more powerful machines are required.
Not enough good-quality data
Many businesses are only now catching up to the information revolution and beginning to understand the value of storing data. That means that, despite their good intentions and newfound enlightenment, their data sets are simply not big enough for deep learning.
Let’s remember that deep learning demands huge sample sizes to train on. Deep neural networks often need hundreds of thousands of samples, or more, to achieve high performance.
Of course, there are some areas of application where complex models can be used even without huge datasets. For example, it is sometimes possible to use already pre-trained models as the basis of our own solution, but these areas are very limited. Currently, such techniques are only applicable to some types of image classification (like detection and identification of animals, cars, boats, etc.) and some limited NLP (Natural Language Processing).
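For illustration, here is a rough sketch of the pre-trained-model idea, assuming a recent version of torchvision and an ImageNet-trained ResNet-18; the number of output classes is made up:

```python
# Reuse features learned on a huge dataset, and train only a small head on our own data.
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights="IMAGENET1K_V1")   # pre-trained on ImageNet

for param in backbone.parameters():                   # freeze the pre-trained layers...
    param.requires_grad = False

backbone.fc = nn.Linear(backbone.fc.in_features, 5)   # ...and learn only the final layer
                                                      # on our (much smaller) dataset
```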
Also, many problems require so-called labeled datasets (meaning that each sample is annotated with an expected value, e.g., a classification result or an output value to be predicted). Such labeling is time-consuming and often requires hiring people to do it manually – read: more costs.
Yes, good and plentiful data are key. But that doesn’t mean that, in its absence, a good AI company shouldn’t get creative. Check out two of our cases where we made up for the deficiency in quality data with a bit of ingenuity.
In the first case, we were tasked with creating an algorithm that would detect brand logos in pictures on social media. We chose deep learning as the best and most efficient way to get the job done, but we faced some challenges.
We needed a sizable, well-labeled dataset for deep learning to work its magic, but where were we going to get that for the less prominent and recognizable brands? So, we decided to fill that gap with artificially generated data to make deep learning possible. Our initial results are very promising. Academic research reports about 60% accuracy in recognizing logos based on artificial data. We’re on our way to improving on that number.
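In spirit, the data generation looked something like the simplified sketch below, assuming Pillow; the file paths and scale range are hypothetical. The idea is to paste a logo onto a background photo at a random position and size, so the bounding box becomes the label for free.

```python
import random
from PIL import Image

def make_synthetic_sample(background_path, logo_path):
    background = Image.open(background_path).convert("RGB")
    logo = Image.open(logo_path).convert("RGBA")

    # scale the logo to 10-30% of the background width, keeping its aspect ratio
    target_w = int(background.width * random.uniform(0.1, 0.3))
    target_h = max(1, int(logo.height * target_w / logo.width))
    logo = logo.resize((target_w, target_h))

    # drop it at a random position
    x = random.randint(0, background.width - target_w)
    y = random.randint(0, max(0, background.height - target_h))
    background.paste(logo, (x, y), logo)   # use the logo's alpha channel as a mask

    label = (x, y, x + target_w, y + target_h)   # the bounding box comes "for free"
    return background, label
```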
In another case, a client came to us with the challenge of segmenting Facebook users based on their psychological traits. Again, the bar was set quite high due to the range and amount of data to process, but we’ve never met a challenge we didn’t want to conquer, so we went to work.
We decided to build on the research of Michal Kosinski, who had tremendous success using linear regression to psychologically target audiences based on their digital footprint. Kosinski, however, was able to use huge amounts of data, since collecting it was then still legal. Because of new privacy protection laws and Facebook regulations, we couldn’t come close to that amount of data, so we went a step further in the methods used to build the algorithm: we employed more advanced machine learning techniques and still obtained a correlation between Facebook likes and a person’s psychological characteristics.
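The general shape of such an approach (not our production code) looks roughly like this, assuming scikit-learn and entirely made-up data: compress a sparse user-by-like matrix into a few dimensions, then fit a regularized linear model on top to predict a trait score.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
likes = rng.integers(0, 2, size=(1000, 500))   # 1,000 users x 500 pages (1 = liked)
trait = rng.normal(size=1000)                  # per-user trait score, e.g. extraversion

model = make_pipeline(
    TruncatedSVD(n_components=50),             # compress the like matrix into 50 dimensions
    Ridge(alpha=1.0),                          # regularized linear regression on top
)
model.fit(likes, trait)
print(model.predict(likes[:5]))                # predicted trait scores for five users
```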
Limited interpretability of deep learning
Deep neural networks are known for being black boxes whose inner operations are not really interpretable. It’s not for lack of trying; there is some work in this area, but as of yet, no general answers. The ability to explain a solution is inherent to many simpler methods, in particular linear ones, where the direct relationship between input variables and the prediction can be analyzed.
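To show what that means in practice, here is a minimal illustration with scikit-learn; the feature names and data are invented:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))                  # e.g. ad spend, price, season index
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)
for name, coef in zip(["ad_spend", "price", "season"], model.coef_):
    # each coefficient says how the prediction moves per unit change in that variable
    print(f"{name}: {coef:+.2f}")
```

Nothing comparable falls out of a deep network’s millions of weights.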
Why is this interpretability important?
From the business owner’s point of view, interpretability is important because it can give new insights into the relationships between numerous variables and expected outcomes. It also helps prove that the model does not operate by magic, which increases the trust of the people who use it in their daily business decisions.
There have been quite a few cases in recent years when the lack of such interpretability resulted in seriously skewed results. In one case, software used to assess the probability of recidivism in offenders gave mistaken results due to racial bias. In another, scientists at Carnegie Mellon University found that men were much more likely than women to be shown ads for well-paid jobs on Google. As you can see, in some applications this interpretability is necessary to judge the legitimacy of the results.
In extreme cases, interpretability may be the purpose in and of itself: ML analysis can be used to discover structure and order in, e.g., business processes or customer behavior.
For every success story, there are thousands of cases of companies bogged down in a deep AI solution they had no business getting into in the first place. The hype surrounding AI and deep learning is not unjustified – it’s a game-changer in many industries. Right on this blog, we’ve covered how deep neural networks are used in accounting (3 A.I. solutions for tax and accounting that will help you keep your business alive) and medicine (How technology can improve medicine and How AI and Data Science can help manage diabetes in everyday life). That does not mean that it’s the right choice for you. Make sure you’re fully aware of what deep learning solutions entail before you embark on that journey (really, it’s a journey!), and remember that most often the best solution is the simplest solution that works.