What are Computer Vision Applications?
The term gets tossed around frequently in tech circles whenever self-driving cars or other futuristic AI technologies are being discussed.
The concept – teaching machines to see – seems to fascinate most people. But can an average business owner, whose firm isn’t yet in the Fortune 500, find practical uses for it?
In this post, we’ll shed some light on how computer vision works, in broad terms, and provide examples of how companies across various industries can apply it.
What is Computer Vision?
As humans, we rarely give a second thought to the extraordinary gift of sight. However, capturing reality with one’s eyes and understanding, in a fraction of a second, what is being observed is a staggeringly complex and sophisticated process.
Let’s suppose you notice a car approaching, at a dangerous speed, the sidewalk you are strolling along. Light from the object passes through your eyes and hits the retinas. Next, after a brief initial analysis by the retinas, the data is sent to the visual cortex so that your brain can perform a more nuanced analysis. Finally, the image is perceived by the rest of the cortex and matched against your brain’s database – the object is thus classified, and its dimensions are established. What you get as an output is an impulse to move out of the way in a safe direction, which your mind has also worked out after analyzing the car’s speed and trajectory.
All of this happens in a blink.
Once you understand just how intricate our system of visual perception is, you get an idea of how difficult it is to recreate. Quite possibly, it is the toughest problem ever attempted by humankind.
Roughly speaking, we must complete three subtasks – emulating the eye, emulating the visual cortex, and copying the way the rest of our brain responds to visual information. Computer Vision is an interdisciplinary field that concerns itself with exactly that – teaching machines how to extract and interpret content from images.
What is the Current State of Computer Vision?
As far as mimicking the human eye goes, today’s cameras are pretty much on point. Equipped with high-quality optical lenses, they can closely match (and at times even exceed) our ability to record precisely the distribution of photons arriving from any given direction.
The real problems arise further along the road, when we endeavor to write software that can recognize and extract meaning from these clear pictures. A camera, no matter how advanced, can’t identify an apple, much less duck when one is flying at its expensive lens. Our brains use past experiences as context to categorize and define what we see. Machines do not have that option.
The discipline that allows us to recreate, at least partially, our brain’s system of using knowledge to make sense of objects is called Deep Learning. In the particular case of Computer Vision, we mostly use Convolutional Neural Networks (CNNs).
What CNNs do is slide small matrices of numbers, known as filters (or kernels), across each picture. At every position, they perform a calculation on the underlying group of pixels and measure how closely it matches the specific pattern the filter was trained to look for. In the first layers, much like our brain, CNNs pick out things like rough curves and edges within an image. A few convolutional layers later, however, they start piecing together surfaces, info about depth, layers, discontinuities in the visual space, and, finally, begin to make out objects such as faces, clothes, fish, cars, animals, etc.
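To make the idea of a filter more tangible, here is a minimal sketch in Python with NumPy. It is an illustration only, not how production CNN libraries are implemented: the `convolve2d` helper, the toy 6x6 image, and the hand-written `vertical_edge` filter are all assumptions made for this example. In a real network, the filter values would be learned rather than typed in.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like most CNN libraries, this computes a cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]    # group of pixels under the filter
            out[i, j] = np.sum(patch * kernel)   # how strongly the patch matches
    return out

# Toy 6x6 grayscale image: dark left half (0), bright right half (1).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-written vertical-edge filter (hypothetical values for illustration).
vertical_edge = np.array([[-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0]])

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # every row is [0. 3. 3. 0.]
```

The feature map is zero over the flat regions and lights up only where the filter straddles the dark-to-bright boundary, which is the kind of "edge" signal the first layers of a CNN tend to produce.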
In the beginning, the machine mostly fails because all its filters’ values are randomized. As it keeps comparing its outputs to the correct ones from the labeled dataset, and uses a loss function to measure the error and correct itself, its accuracy gradually increases. It is crucial that a neural network is trained on a large enough labeled dataset. Otherwise, it just won’t know what to look for.
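The training loop can be sketched in a few lines, again as a toy rather than a real CNN. The example below assumes a made-up labeled dataset of 200 flattened 3x3 patches whose "correct" outputs come from a known edge filter, and it uses plain gradient descent on a mean squared error loss. The names `true_filter`, `patches`, and `learning_rate` are hypothetical choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical labeled dataset: 200 flattened 3x3 patches, plus the response
# a "correct" vertical-edge filter would give for each of them.
true_filter = np.array([-1.0, 0.0, 1.0] * 3)
patches = rng.normal(size=(200, 9))
labels = patches @ true_filter

# The filter starts out random, so its first predictions are mostly wrong.
weights = rng.normal(size=9)

def mse_loss(w):
    """Mean squared error between the filter's outputs and the labels."""
    return np.mean((patches @ w - labels) ** 2)

initial_loss = mse_loss(weights)

learning_rate = 0.05
for step in range(500):
    error = patches @ weights - labels                 # compare outputs to labels
    gradient = 2.0 * patches.T @ error / len(patches)  # slope of the loss
    weights -= learning_rate * gradient                # nudge the filter to reduce it

final_loss = mse_loss(weights)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.2e}")
```

After training, `weights` ends up very close to `true_filter`. Real networks learn many filters across many layers at once, but the loop has the same shape: predict, measure the error, adjust.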
CNNs have proved great at detecting features in still images, but on their own they fall short when processing a series of picture frames, i.e. a video. They treat each frame in isolation, so they can’t track items that change over time or grasp the context of a progression of images, which is instrumental for proper video labeling.
So, to process videos, computer vision experts build upon the work of CNNs and then introduce another type of algorithm to the equation – Recurrent Neural Networks (RNNs). They feed the output from a convolutional network into an RNN, as the latter is equipped to address the temporal structure of a video clip.
The key difference between CNNs and RNNs is that the former deal with each matrix of pixels independently, while the latter are able to “memorize” data they’ve already processed and make decisions based on the knowledge they’ve accumulated.
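Here is a rough sketch of that difference, under heavy simplifying assumptions. The `frame_features` function is only a stand-in for a CNN (it reduces a frame to two brightness values), the recurrent update is a plain Elman-style step with random, untrained weights, and the four-frame clip is invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def frame_features(frame):
    """Stand-in for a CNN: summarise a frame as (left brightness, right brightness)."""
    half = frame.shape[1] // 2
    return np.array([frame[:, :half].mean(), frame[:, half:].mean()])

def run_rnn(features, w_in, w_hidden):
    """Simple recurrent update: the hidden state carries memory from frame to frame."""
    hidden = np.zeros(w_hidden.shape[0])
    for x in features:
        hidden = np.tanh(w_in @ x + w_hidden @ hidden)
    return hidden

# Toy clip: a bright vertical bar moving left to right across four 4x4 frames.
clip = np.zeros((4, 4, 4))
for t in range(4):
    clip[t, :, t] = 1.0

# Hypothetical, untrained weights; a real model would learn these.
w_in = rng.normal(size=(3, 2))
w_hidden = rng.normal(size=(3, 3))

features = [frame_features(frame) for frame in clip]

# Looking at frames independently loses the ordering: the average of the
# per-frame features is identical whether the clip plays forward or backward.
same_average = np.allclose(np.mean(features, axis=0),
                           np.mean(features[::-1], axis=0))

# The recurrent state, in contrast, depends on the order of the frames.
forward_state = run_rnn(features, w_in, w_hidden)
backward_state = run_rnn(features[::-1], w_in, w_hidden)
order_matters = not np.allclose(forward_state, backward_state)
```

Because the hidden state is updated frame by frame, the same set of frames played in reverse leaves the RNN in a different state, which is what lets such a model tell "moving right" from "moving left".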
It is also worth mentioning here that neural networks, which are prone to overfitting when labeled data is scarce, can benefit from transfer learning, i.e. using a model that has been trained for another purpose as the starting point for a model working on a new task. An algorithm that’s efficient at spotting animals, for instance, can be trained further to distinguish humans, and so on. This method lets companies get by with a much smaller training dataset than they would need to train a model from scratch.
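One common way to apply transfer learning is to freeze the already trained layers and train only a small new "head" on top of them. The sketch below illustrates that pattern with NumPy. Everything in it is an assumption for demonstration purposes: `pretrained_weights` are random stand-ins for weights that would normally come from a model trained on a large dataset, and the 40-example dataset is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "pretrained" layer. In practice these weights would be loaded
# from a model trained on another task; here they are random stand-ins.
pretrained_weights = rng.normal(scale=0.25, size=(16, 8))

def extract_features(images):
    """Frozen feature extractor: these weights are never updated below."""
    return np.maximum(images @ pretrained_weights, 0.0)  # ReLU activation

# Small labeled dataset for the new task: 40 flattened 4x4 "images".
images = rng.normal(size=(40, 16))
labels = (images[:, 0] > 0).astype(float)

features = extract_features(images)

# New task-specific head (logistic regression), trained from scratch.
head_weights = np.zeros(8)
head_bias = 0.0

def predict(w, b):
    return 1.0 / (1.0 + np.exp(-(features @ w + b)))

def log_loss(w, b):
    probs = predict(w, b)
    return -np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))

initial_loss = log_loss(head_weights, head_bias)

for step in range(200):
    error = predict(head_weights, head_bias) - labels
    head_weights -= 0.1 * features.T @ error / len(labels)  # only the head learns
    head_bias -= 0.1 * error.mean()

final_loss = log_loss(head_weights, head_bias)
```

Only nine numbers (eight weights and a bias) are trained here, which is why this approach can work with far less labeled data than training the whole network.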
What are the Most Popular Computer Vision Applications?
As we’ve established, Convolutional Neural Networks, if trained properly, can determine location-invariant features automatically, provided there’s a sufficient number of input-output pairs (aka labeled data) for them to train on. This opens up numerous opportunities for firms across various industries: Computer Vision is being adopted rapidly in healthcare, agriculture, insurance, the automotive industry, and so on.
Healthcare. Medical imaging has been on the rise for years, and multiple healthcare startups have been partnering with prominent hardware providers to build bleeding-edge computer vision tools. One of the most popular use cases, up until recently, was leveraging CNNs to detect diseases from MRI scans. But now things have taken an even more interesting turn – companies such as Arterys have been given clearance from the FDA to apply deep learning in clinical settings. Mass adoption might just be around the corner.
Agriculture. Drone technology has been booming too and, as a result of the advancements in the field, the cost of acquiring huge sets of aerial imagery has dropped dramatically compared to a few years ago. This, combined with recent breakthroughs in Machine Learning, holds a lot of promise for agricultural firms. Computer Vision can help farmers spot crop diseases, predict crop yields, and, overall, automate the time-consuming process of manual field inspection.
Insurance. Orbital Insight, among other startups, has been using satellite imagery to assist insurance and reinsurance companies (as well as companies in other fields) in various ways. In particular, the firm, as they themselves put it, looks closely at the lids of oil tanks, tracks the movements of tankers, and monitors oil drilling rigs to make accurate predictions about oil production. Besides that, the data they provide can help improve underwriting models and streamline renewals of insurers’ books of business through continuous “always-on” monitoring.
Automotive. Apart from self-driving cars, there’s a broad array of use cases for computer vision in the automotive industry. Some companies, for example, use the tech to have cars automatically adapt to speed limits, detect lanes, interpret signs, and perform overall scene analysis.
We don’t put a drop of effort into interpreting visual information. We just glance out of the window and our brains tell us which object on the street is a tree and which is a flower. On a deeper level, however, each frame our eyes perceive is just a set of pixels that our visual cortex has to properly integrate and process. When attempting to extend our visual ability to machines, we have to build not only precise light-capturing hardware (that part we’ve pretty much nailed) but also robust software that can make sense of the numbers that represent intensities on the color spectrum.
More parts of the human brain are dedicated to processing visual signals than to any other task. And despite all the AI hype, so far, we’re nowhere near being able to simulate the way our minds work; we, ourselves, can barely define it. That doesn’t mean, however, that the breakthroughs in the field of Computer Vision are in any way insignificant. Already we can train models to achieve accuracy that, at times, rivals human image recognition abilities, and multiple companies, big and small, have found ways to exploit these advancements to reduce operational costs and streamline business processes.
Want to learn more about computer vision and how it can benefit your business? Reach out for a free consultation with our expert!