Neural networks are powerful learning models that achieve state-of-the-art results in a wide range of supervised and unsupervised machine learning tasks. They are especially well suited to machine perception tasks, where the raw underlying features are not individually interpretable. Recurrent neural networks are often associated with deep learning and with the use of sequences to build models loosely inspired by neural activity in the human brain.

### Overview of Recurrent Neural Network (RNN)

The fundamental feature of a Recurrent Neural Network (RNN) is that the network contains at least one feedback connection, so activations can flow around a loop. This enables the network to do temporal processing and learn sequences, e.g., perform sequence recognition/reproduction or temporal association/prediction.

Recurrent Neural Networks (RNNs) are connectionist models with the ability to selectively pass information across sequence steps, while processing sequential data one element at a time. Thus they can model input and/or output consisting of sequences of elements that are not independent. Further, recurrent neural networks can simultaneously model sequential and time dependencies on multiple scales.

#### Figure 1: Recurrent Neural Network

In other words, the RNN is a function of two inputs: the input vector \( x_t \) and the previous state \( h_{t-1} \). It produces the new state \( h_t = f_W(h_{t-1}, x_t) \).

The recurrent function \( f_W \) is fixed after training and applied at every time step.
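The idea of one fixed function reused at every step can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: it assumes \( f_W \) has the common form \( \tanh(W_{hh} h_{t-1} + W_{xh} x_t) \), and the names `rnn_step`, `run_rnn`, `W_hh`, and `W_xh`, as well as the weight values, are hypothetical choices made for this example.

```python
import math


def rnn_step(h_prev, x_t, W_hh, W_xh):
    """One application of f_W, assumed here to be tanh(W_hh h_prev + W_xh x_t)."""
    h_t = []
    for i in range(len(h_prev)):
        total = sum(W_hh[i][j] * h_prev[j] for j in range(len(h_prev)))
        total += sum(W_xh[i][k] * x_t[k] for k in range(len(x_t)))
        h_t.append(math.tanh(total))
    return h_t


def run_rnn(inputs, h0, W_hh, W_xh):
    """Apply the same step function, with the same weights, at every time step."""
    h = h0
    states = []
    for x_t in inputs:
        h = rnn_step(h, x_t, W_hh, W_xh)
        states.append(h)
    return states


# Hypothetical "trained" weights: 2 hidden units, 1 input feature.
W_hh = [[0.5, -0.3], [0.2, 0.1]]
W_xh = [[1.0], [-1.0]]

# Only the first input is non-zero; later states are driven by the carried-over state.
sequence = [[1.0], [0.0], [0.0]]
states = run_rnn(sequence, [0.0, 0.0], W_hh, W_xh)
```

Because the weights never change inside the loop, the only thing that differs between time steps is the state being passed along. In this sketch the second and third inputs are zero, yet the corresponding states are not, since they depend on the state left behind by the first input.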

Recurrent Neural Networks are well suited to regression on sequential data, because they take past values into account. RNNs are also computationally universal in the sense of a Turing machine: in theory, with the correct set of weights an RNN can compute anything a Turing machine can. You can think of the weights as a program.

### Use cases of Recurrent Neural Networks

- Machine translation (e.g., English → French)
- Speech to text
- Market prediction
- Scene labelling (combined with a CNN)
- Car steering control (combined with a CNN)

### Defining Recurrent Neural Network

A recurrent neural network (RNN) is a class of neural networks that includes weighted connections within a layer (in contrast to traditional feed-forward networks, where connections feed only to subsequent layers). Because RNNs include loops, they can store information while processing new input. This memory makes them well suited to tasks where prior inputs must be considered (such as time-series data). For this reason, many deep learning systems for sequential data have been built on RNNs. This tutorial explores the ideas behind RNNs and implements one from scratch for series data prediction.

*“A recurrent neural network (RNN) is a type of advanced artificial neural network (ANN) that involves directed cycles in memory. One aspect of recurrent neural networks is the ability to build on earlier types of networks with fixed-size input vectors and output vectors.”*

The idea behind RNNs is to make use of sequential information. In a traditional neural network we assume that all inputs (and outputs) are independent of each other. But for many tasks that’s a very bad idea. If you want to predict the next word in a sentence, you had better know which words came before it. RNNs are called *recurrent* because they perform the same task for every element of a sequence, with the output depending on the previous computations. Another way to think about RNNs is that they have a “memory” which captures information about what has been calculated so far. In theory RNNs can make use of information in arbitrarily long sequences, but in practice they are limited to looking back only a few steps (more on this later). Here is what a typical RNN looks like:

#### Figure 2: A recurrent neural network and the unfolding in time of the computation involved in its forward computation

The above diagram shows an RNN being *unrolled* (or unfolded) into a full network. By unrolling we simply mean that we write out the network for the complete sequence. For example, if the sequence we care about is a sentence of 5 words, the network would be unrolled into a 5-layer neural network, one layer for each word. The formulas that govern the computation happening in an RNN are as follows:

- \( x_t \) is the input at time step \( t \). For example, \( x_1 \) could be a one-hot vector corresponding to the second word of a sentence (time steps are counted from zero).
- \( s_t \) is the hidden state at time step \( t \). It’s the “memory” of the network. \( s_t \) is calculated based on the previous hidden state and the input at the current step: \( s_t = f(U x_t + W s_{t-1}) \). The function \( f \) is usually a nonlinearity such as tanh or ReLU. \( s_{-1} \), which is required to calculate the first hidden state, is typically initialized to all zeros.
- \( o_t \) is the output at step \( t \). For example, if we wanted to predict the next word in a sentence, it would be a vector of probabilities across our vocabulary: \( o_t = \mathrm{softmax}(V s_t) \).
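The three formulas above can be put together into a small forward pass. The following is a rough sketch in plain Python, assuming tanh for \( f \), a toy vocabulary of 8 words, 4 hidden units, and randomly initialized (untrained) weights. The helper names `matvec`, `one_hot`, `forward`, and `random_matrix`, the sizes, and the example word indices are all assumptions made for illustration.

```python
import math
import random


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(index, size):
    """Vector of zeros with a single 1.0 at position `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def forward(word_ids, U, W, V):
    """Unroll the RNN over a sequence of word indices."""
    hidden_size = len(W)
    vocab_size = len(V)
    s_prev = [0.0] * hidden_size  # s_{-1}, initialized to all zeros
    states, outputs = [], []
    for word_id in word_ids:
        x_t = one_hot(word_id, vocab_size)
        # s_t = tanh(U x_t + W s_{t-1})
        pre = [a + b for a, b in zip(matvec(U, x_t), matvec(W, s_prev))]
        s_t = [math.tanh(p) for p in pre]
        # o_t = softmax(V s_t)
        o_t = softmax(matvec(V, s_t))
        states.append(s_t)
        outputs.append(o_t)
        s_prev = s_t
    return states, outputs


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
vocab_size, hidden_size = 8, 4
U = random_matrix(hidden_size, vocab_size, rng)   # input  -> hidden
W = random_matrix(hidden_size, hidden_size, rng)  # hidden -> hidden
V = random_matrix(vocab_size, hidden_size, rng)   # hidden -> output

sentence = [3, 1, 4, 1, 5]  # a hypothetical 5-word sentence as word indices
states, outputs = forward(sentence, U, W, V)
```

Each pass through the loop corresponds to one layer of the unrolled network, so a 5-word sentence produces 5 hidden states and 5 output distributions, all computed with the same `U`, `W`, and `V`. Word index 1 appears twice in the example sentence, but the hidden states at those two positions will generally differ because \( s_{t-1} \) differs, which is the “memory” at work.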

### References

[1] M. Tim Jones, “Recurrent neural networks deep dive”, published on August 17, 2017, available online at: https://www.ibm.com/developerworks/library/cc-cognitive-recurrent-neural-networks/index.html

[2] Denny Britz, “Recurrent Neural Networks Tutorial, Part 1 – Introduction to RNNs”, WILDML, Artificial Intelligence, Deep Learning, and NLP, available online at: http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/

[3] Salehinejad, Hojjat, Julianne Baarbe, Sharan Sankar, Joseph Barfett, Errol Colak, and Shahrokh Valaee. “Recent Advances in Recurrent Neural Networks.” arXiv preprint arXiv:1801.01078 (2017).

[4] “Introduction: Recurrent Neural Networks”, available online at: https://leonardoaraujosantos.gitbooks.io/artificialinteligence/content/recurrent_neural_networks.html
