Memory and Attention in Deep Learning

Authors

  • Yi Zhang
  • Ziying Fan

DOI:

https://doi.org/10.54097/k801wm68

Keywords:

LSTM, Deep Learning, Memory Mechanisms.

Abstract

This paper highlights advances in deep learning brought about by the evolution of attention and memory mechanisms. These mechanisms, which capture the temporal and sequential structure of data, have gradually replaced traditional approaches in deep learning. Recurrent neural networks (RNNs) that use them for sequence modelling achieve markedly better model performance. The paper reviews several such mechanisms, including gated recurrent units (GRUs), memory-augmented neural networks (MANNs), long short-term memory (LSTM) and self-attention. Memory mechanisms retain information about earlier elements of a sequence in a hidden state, which can then be analysed. Attention mechanisms focus on particular locations within the input, for example regions of a video, to identify patterns in the data. Applying attention mechanisms enables accurate recognition of human actions in such datasets. This in turn improves model performance by increasing prediction accuracy and visibility within a particular dataset.
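As a rough illustration of the two ideas contrasted in the abstract, the sketch below implements a toy gated memory update and scaled dot-product self-attention in plain Python. It is not taken from the paper. The function names (`gated_memory`, `self_attention`), the fixed gate value and the identity query/key/value projections are simplifying assumptions made for illustration; real GRU and LSTM cells compute their gates from the input and the previous hidden state, and real self-attention layers use learned projections.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_memory(seq, gate=0.5):
    """Toy recurrent memory: h_t = (1 - z) * h_{t-1} + z * x_t.

    Assumption: the gate z is a fixed constant here; in a GRU or LSTM
    it is computed from x_t and h_{t-1} with trained weights.
    """
    h = [0.0] * len(seq[0])
    for x in seq:
        h = [(1 - gate) * hi + gate * xi for hi, xi in zip(h, x)]
    return h


def self_attention(seq):
    """Scaled dot-product self-attention with identity projections.

    Assumption: queries, keys and values are the inputs themselves
    (no learned weight matrices).
    """
    d = len(seq[0])
    outputs, all_weights = [], []
    for q in seq:
        # Score the query against every position, then normalise.
        scores = [dot(q, k) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        # Output is the attention-weighted average of the values.
        context = [sum(w * v[i] for w, v in zip(weights, seq)) for i in range(d)]
        outputs.append(context)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical three-step sequence of two-dimensional feature vectors.
seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h = gated_memory(seq, gate=0.5)          # single summary of the whole sequence
outputs, weights = self_attention(seq)   # one context vector per position
```

The memory function compresses the whole sequence into one hidden state that is updated step by step, whereas self-attention lets every position weigh every other position directly; each row of `weights` sums to one and shows where that position is looking.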

Published

15-04-2024

Section

Articles

How to Cite

Zhang, Y., & Fan, Z. (2024). Memory and Attention in Deep Learning. Academic Journal of Science and Technology, 10(2), 109-113. https://doi.org/10.54097/k801wm68