ViSenze Blog

Has deep learning come to a dead end?

Posted by ViSenze 28-August-2015

Deep learning and its small window of innovation

Just a few weeks ago, Li Deng, a lead deep learning expert at Microsoft Research, expressed his views on the narrowing innovation gap in deep learning.

Deng is very confident that the low-hanging fruit for progress in deep neural networks has been picked and that for the next couple of years, refinement of neural networks and deep learning algorithms will be isolated and incremental.

According to Deng, when it comes to image recognition, companies now have to work hard for even a small gain. This is unlike three or four years ago, when “you take just one GPU, dump a load of data there, do a little innovation in the training and get a huge amount of gain.”

Deng mentioned that most of the quantum leaps in machine learning have already been made - most notably on the error rate, which has dropped to less than 4.7% (almost on par with human capability) - meaning that there is less space for discovery.

While he is right in saying that there is not much improvement left to bring to the technology in the image recognition field (in terms of accuracy and latency), he has overlooked something very important - image recognition is more than just a technology, it is also an application.

In essence, the technology has made its improvements, but when it comes to problem solving and opportunities, there are still many areas to which we have yet to apply it.

And like most modern applications, there is more value in customizing and diving deep into developing specific solutions for different verticals than in purely increasing the accuracy score of the technology.


Focus on applications, not technology

Deep learning, especially in the realm of visual technology, is hardly a nascent field. But it is one that has become increasingly important as companies wrestle with massive amounts of text, image, and video data that cannot be parsed and organized at any kind of speed by human beings.

We are experiencing breakthrough after breakthrough in the field these days, most notably made by Google, Baidu, Yahoo, Facebook, and other titans of the web that have been pushing the deep learning envelope.

But really, this is all just the beginning. This new stage of visual technology and communications is like a newborn baby learning to walk.

Although the big leaps forward might be fewer and farther between, there is still quite a bit of work to be done.



Considerable gaps in deep learning application

For example, e-commerce companies that utilize deep learning visual technology have seen an uplift in their conversion rates of up to 50%. That is a quantum improvement too.

At ViSenze (an R&D company specializing in the area of visual technologies), we apply our deep learning image recognition and visual search technologies to highly-specific verticals.

One of our customers, Lazada - Southeast Asia's leading e-commerce marketplace with millions of products at competitive prices - applies our image recognition solution to improve tagging and categorization of their fashion products. They have also integrated our visual search technology to raise the bar for a seamless customer experience.

Our image recognition technology is able to fine-tune itself and dive deep into particular product categories within the fashion wear segment. For example, our algorithms are able to tell different jackets, bags, and shoes apart based on their individual colors, patterns, silhouettes, and more.

It is capable not just of telling jackets apart from hoodies, cardigans, and sweaters, but also of recognizing whether an item is a double-breasted, single-breasted, or zipper jacket. As for bags, even subtle differences in texture and material - such as canvas, normal leather, or crocodile leather - are apparent at a glance.

The precision and accuracy of our image recognition technology extends even to types of shoes - women's pumps, for instance - encompassing the classic pump, the slingback, the strappy, and the t-strap, and stretching to heel types, including chunky heels, kitten heels, mules, platforms, wedges, and so on.
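To make the idea of vertical-specific fine-grained tagging concrete, here is a minimal sketch of how a coarse category prediction might be refined into attribute tags using a per-vertical taxonomy. The taxonomy, attribute names, and scores below are hypothetical illustrations, not ViSenze's actual schema or API.

```python
# Hypothetical fine-grained fashion taxonomy: each category defines which
# attributes apply to it and which values are allowed for each attribute.
FASHION_TAXONOMY = {
    "jacket": {"closure": ["double-breasted", "single-breasted", "zipper"]},
    "bag": {"material": ["canvas", "leather", "crocodile leather"]},
    "pump": {
        "style": ["classic", "slingback", "strappy", "t-strap"],
        "heel": ["chunky", "kitten", "mule", "platform", "wedge"],
    },
}

def refine(category: str, attribute_scores: dict) -> dict:
    """Pick the highest-scoring allowed value for each attribute of a category.

    `attribute_scores` maps attribute names to {value: confidence} dicts, as a
    fine-grained classifier head for that vertical might produce.
    """
    schema = FASHION_TAXONOMY[category]
    refined = {}
    for attr, allowed in schema.items():
        scores = attribute_scores.get(attr, {})
        # Only consider values the taxonomy permits for this category.
        valid = {v: s for v, s in scores.items() if v in allowed}
        if valid:
            refined[attr] = max(valid, key=valid.get)
    return refined

# Example: a classifier is confident the image shows a zipper jacket.
tags = refine("jacket", {"closure": {"zipper": 0.91, "single-breasted": 0.06}})
print(tags)  # {'closure': 'zipper'}
```

Structuring attributes per vertical, rather than using one flat label set, is what lets a single pipeline distinguish a kitten heel from a wedge while ignoring heel types entirely when tagging bags.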

This verticalization and customization of our deep learning image recognition and visual search technologies for highly-specific verticals surfaces an important message:

Improvements and developments in the deep learning arena will come from the customization of applications rather than in-lab adjustments to the pure technology.


Accuracy can only be improved upon by so much, and the same goes for latency and search speed. What we should be concerned with is applications to industry verticals.

So what about applications in areas outside of e-commerce?

Well, much work remains to be done in the health and medical, security, and even education sectors. Use cases such as self-diagnosis using computer vision have the potential to revolutionize the medical industry, giving everyone access to intelligence that currently lies in the hands of a few.


The future of deep learning - full of exploration and potential

To round off, deep learning has come a long way in the last few years on the technological side, but the real key to pushing its performance and development lies in spreading its uses deep into verticals.

At this point, deep learning is still in its embryonic stage. The view that it has a narrowing innovation gap is a discouraging one that condemns it to a premature death.

The goal is to find efficient ways to maximize deep learning technology across a breadth of use cases while customizing it for specialized applications - connecting the technology with a wider market and a far greater ability to improve lives.

Finding applications for the technology, instead of purely improving the technology itself, will yield smarter systems, better services, and more time to solve the human problems that computers will never be able to fix.




