How does the human brain think, sense, and learn? The fundamental underlying mechanism is composed of a vast set of interconnected neurons, also called nerve cells.
Neurons transmit and process the information received from our sensory organs (eyes, ears, nose, tongue, and skin) by releasing brain chemicals (neurotransmitters) that pass electrical impulses from one neuron to another, leading to the thought process.
This maze of neurons and accompanying neurotransmitters — which empower the supreme force called human intelligence — is a neural network.
Classical computers cannot think — they can only read data from memory and execute instructions as programmed. The mantle of imparting the ability to think, sense, and learn to computers falls to the field of Artificial Intelligence, and more specifically to the domain of Deep Learning, which mimics the functioning of the human brain.
Deep Learning, a subset of Machine Learning, uses layers of algorithms that pass information via electronic signals from one layer to the next, proceeding from the input layer to the output layer with one or more hidden layers in-between.
Each layer consists of one or more nodes that process data. Such layers of algorithms, cumulatively referred to as an Artificial Neural Network (ANN), let computers analyze data, understand human speech, and visually recognize objects. Artificial Neural Networks start raw with no knowledge or perceptive ability, and need extensive training with large datasets before they can think, sense, and learn.
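To make the idea of layers concrete, here is a minimal sketch of such a network in NumPy: three input values flow through one hidden layer of four nodes to an output layer of two nodes. The weights are random and untrained, so the output is meaningless — it only illustrates how signals pass from layer to layer (the layer sizes here are arbitrary choices for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    # Squashes each node's signal into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

# A tiny untrained network: 3 inputs -> 4 hidden nodes -> 2 outputs.
W1 = rng.normal(size=(3, 4))   # input-to-hidden weights
W2 = rng.normal(size=(4, 2))   # hidden-to-output weights

def forward(x):
    hidden = sigmoid(x @ W1)       # hidden-layer activations
    output = sigmoid(hidden @ W2)  # output-layer activations
    return output

x = np.array([0.5, -1.0, 2.0])
print(forward(x))  # two values between 0 and 1
```

Training, covered below, is the process of adjusting the numbers in W1 and W2 so the outputs become useful.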
The Evolution of Neural Networks
In 1943, neurophysiologist Warren McCulloch and mathematician Walter Pitts laid the foundation of neural networks, modeling a simple neural network using electrical circuits.
In the 1950s, Nathaniel Rochester of IBM's research laboratories built on ideas from Donald Hebb's 1949 book 'The Organization of Behavior' and led the first effort to simulate a human brain's neural network at small scale. That attempt failed, although later attempts succeeded.
In 1958, Frank Rosenblatt, a psychologist at Cornell, developed the Perceptron, the simplest neural network, which could separate linearly separable data into one of two classes. The Perceptron is the oldest neural network still in use today.
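Rosenblatt's learning rule is simple enough to sketch in a few lines: whenever the Perceptron misclassifies a point, nudge the weights toward that point's correct side. This is a toy illustration on made-up data, not Rosenblatt's original hardware setup.

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=0.1):
    """Perceptron rule: on each misclassified sample, shift the
    weights toward the correct class. Labels are -1 or +1."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:  # misclassified (or on the boundary)
                w += lr * yi * xi
                b += lr * yi
    return w, b

# Linearly separable toy data: class +1 sits above the line x1 + x2 = 1.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.2, 0.1], [1.5, 0.8]])
y = np.array([-1, 1, -1, 1])
w, b = train_perceptron(X, y)
print(np.sign(X @ w + b))  # matches y once training converges
```

Because a single Perceptron draws one straight boundary, it can only handle linearly separable problems — a limitation that later multi-layer networks overcame.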
In 1959, Bernard Widrow and Marcian Hoff at Stanford developed two neural network models called ADALINE and MADALINE. ADALINE could predict the next bit when reading bits streamed from a phone line. MADALINE was the first neural network applied to a real-world problem. It used an adaptive filter to eliminate echoes on phone lines. MADALINE is still in commercial use.
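The learning rule behind ADALINE — now known as the Widrow–Hoff or least-mean-squares (LMS) rule — adjusts each weight in proportion to the error of a linear output, which is exactly what an adaptive filter needs. A minimal sketch on toy data (not the original phone-line application):

```python
import numpy as np

def adaline_lms(X, y, epochs=50, lr=0.01):
    """Widrow-Hoff LMS rule: w <- w + lr * (target - w.x) * x."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            error = yi - xi @ w       # how far the linear output is off
            w += lr * error * xi      # correct proportionally to the error
    return w

# Learn a simple linear relationship: y = 2*x1 - x2.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = 2 * X[:, 0] - X[:, 1]
w = adaline_lms(X, y)
print(w)  # approaches [2, -1]
```

Unlike the Perceptron, which updates only on misclassifications, LMS always moves the weights a little in the direction that reduces the squared error — a precursor of the gradient-descent training used today.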
The early successes of some neural networks led to exaggerated claims about their potential, creating expectations that the computing power available at the time could not meet. As a result, a long AI winter set in, during which neural network research saw little progress.
In 1989, Yann LeCun et al. at AT&T Bell Labs applied a technique called backpropagation to read handwritten zip codes.
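In essence, backpropagation carries the output error backwards through the layers, using the chain rule to work out how much each weight contributed to it. The sketch below trains a tiny two-layer network on the XOR problem — a deliberately simple stand-in, nothing like LeCun's zip-code setup — with the gradients written out by hand.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR: not linearly separable

W1 = rng.normal(size=(2, 8))  # input-to-hidden weights
W2 = rng.normal(size=(8, 1))  # hidden-to-output weights
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

initial_loss = None
for step in range(10000):
    # Forward pass: input -> hidden -> output.
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)
    if step == 0:
        initial_loss = np.mean((out - y) ** 2)
    # Backward pass: the chain rule carries the output error
    # back through each layer to give every weight's gradient.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out
    W1 -= 0.5 * X.T @ d_h

final_loss = np.mean((out - y) ** 2)
print(initial_loss, final_loss)  # the error shrinks as training proceeds
```

Modern frameworks compute these gradients automatically, but the underlying mechanism is the same chain-rule bookkeeping shown here.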
But the AI winter fully thawed only in 2006, when multiple technical breakthroughs — most significantly Geoffrey Hinton’s Greedy Layer-wise Training to train Deep Belief Networks — combined with tremendous gains in computing resources led to a rapid renewal in interest in neural networks.
Current and Future Applications of Neural Networks
Neural networks have made immense advances over the last couple of decades and now affect our daily lives in several ways, often without our awareness.
Governments use the technology for needs ranging from facial recognition and identity-theft detection to satellite-image analysis and weather forecasting. Law enforcement agencies also use neural networks to analyze transactional data and identify fraudulent or criminal activity.
Many industries already serve consumers with the silent power of ANN-driven Deep Learning models lurking backstage. With the recent advances, the role of neural networks in industrial applications is expected to shoot up in the future.
Energy companies can use neural networks to locate new energy sources and analyze minerals. Oil companies can use neural networks to streamline oil distribution for higher efficiency and cost savings. Electricity companies can use neural networks to predict demand loads.
Manufacturing companies can use neural networks for a variety of tasks: logistics, inventory management, robots that tackle hazard-prone tasks.
Neural networks can assist cybersecurity organizations in preventing and detecting intrusions and malware and in analyzing network traffic.
Banks and credit-card companies can use neural networks to prevent fraud. Banks can also use neural networks to assess risks related to loan failure and client bankruptcy. Predictive analytics can further identify investment opportunities and help investment firms know when to buy or sell stock.
Sensors attached to a patient can help neural networks track the patient’s health. Healthcare organizations can use neural networks to detect, diagnose and treat diseases.
Retailers can use neural networks to capture and analyze data to tailor your shopping experience, optimize prices, and manage inventory. Businesses can use neural networks to decide what online ads to show you.
One of the foremost applications of neural networks is in Natural Language Processing (NLP) — computers understanding human-delivered speech and text. NLP is what drives programs like Amazon’s Alexa.
A Silicon Valley company named OpenAI has blazed a trail in NLP with its language model GPT-3. In the coming years, GPT-3 is expected to revolutionize web search, text classification, question answering systems, automated generation of reports, resumes and summaries, speech parsing and recognition, and many other NLP goals.
Merging the Physical and Digital Worlds
Many exciting neural network applications of the future revolve around computer vision — self-driving cars, motion capture, 3D modeling. Such technologies naturally are heavily GPU-dependent. Nvidia’s CUDA-enabled GPUs provide a good platform to build such applications.
Nvidia has developed its own deep neural network (a network with multiple hidden layers) called PredictionNet, which uses a rasterized top-down view of the world provided by onboard perception systems and computes predictions from past observations. This form of machine learning is key to applications needing extensive and complex visual perception.
PyTorch, a Python deep-learning library, provides easy integration with CUDA-enabled GPUs. It can immensely speed up the neural network training process, often around 10x faster than training on a typical CPU.
PyTorch lets you easily transfer your Deep Learning model and your training dataset to the GPU. You might want to try scaling computer vision with PyTorch solutions yourself.
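The transfer takes only a couple of calls: pick a device, then move the model and each batch of data to it with `.to(device)`. A minimal sketch (the layer sizes and batch here are arbitrary illustrations, and the code falls back to the CPU when no CUDA GPU is present):

```python
import torch
import torch.nn as nn

# Use the GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A small model, moved to the chosen device in one call.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2)).to(device)

# The input batch must live on the same device as the model.
inputs = torch.randn(32, 8).to(device)
outputs = model(inputs)
print(outputs.shape, outputs.device)
```

In a real training loop, you would move each batch from your dataset to the device the same way before the forward pass.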
A special class of deep neural networks called Convolutional Neural Networks (CNNs) is particularly well suited to such visual workloads. CNNs are known to achieve great performance in perception and control tasks compared to other techniques, and they can learn millions of parameters from large amounts of labeled data.
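A CNN's convolutional layers learn small local visual features, pooling layers shrink the feature maps, and a final linear layer produces class scores. Here is a minimal sketch in PyTorch for hypothetical 28x28 grayscale images with 10 classes — an illustrative shape, not a tuned architecture:

```python
import torch
import torch.nn as nn

# A small CNN: convolutions learn local features, pooling downsamples,
# and a linear layer maps the flattened features to class scores.
cnn = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),   # 1x28x28 -> 8x28x28
    nn.ReLU(),
    nn.MaxPool2d(2),                             # -> 8x14x14
    nn.Conv2d(8, 16, kernel_size=3, padding=1),  # -> 16x14x14
    nn.ReLU(),
    nn.MaxPool2d(2),                             # -> 16x7x7
    nn.Flatten(),
    nn.Linear(16 * 7 * 7, 10),                   # 10 class scores per image
)

batch = torch.randn(4, 1, 28, 28)  # four fake grayscale images
print(cnn(batch).shape)            # one row of 10 scores per image
```

Because the same small convolution filters slide across the whole image, a CNN needs far fewer weights than a fully connected network of similar reach — one reason it scales to millions of labeled images.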
Some challenges will remain hard for neural network developers to overcome. Suppose you build a near-perfect self-driving car that makes no mistakes even on the most perilous driving routes. What happens when a drunken driver suddenly swerves head-on into your car's path?
With time and further development, the limits of neural networks will be identified, and the applications that use them will be designed with those limitations and challenges in mind.