Prediction architecture based on block matching statistics for mixed spatial-resolution multi-view video coding

Article


Said, Hany, Moniri, M. and Chibelushi, Claude C. 2017. Prediction architecture based on block matching statistics for mixed spatial-resolution multi-view video coding. EURASIP Journal on Image and Video Processing. 2017 (1). https://doi.org/10.1186/s13640-017-0164-7
Authors: Said, Hany, Moniri, M. and Chibelushi, Claude C.
Abstract

The use of mixed spatial resolutions in multi-view video coding is a promising approach for coding videos efficiently at low bitrates. According to the suppression theory of binocular vision, it can achieve a perceived quality close to that of the view with the highest quality. The aim of the work reported in this paper is to develop a new multi-view video coding technique that is suitable for low-bitrate applications, in terms of coding efficiency and of computational and memory complexity, when coding videos that contain either a single scene or multiple scenes. The paper proposes a new prediction architecture that addresses deficiencies of H.264/AVC-based prediction architectures for multi-view video coding. The prediction architectures used in mixed spatial-resolution multi-view video coding (MSR-MVC) incur significant computational complexity, reflected in coding time, and require significant memory, reflected in the minimum number of reference frames. The architecture proposed herein is based on a set of investigations that explore the effect of different inter-view prediction directions on the coding efficiency of multi-view video coding, compare different decimation and interpolation methods, and analyze block matching statistics. The proposed prediction architecture has been integrated with an adaptive reference frame ordering algorithm, to provide an efficient coding solution for multi-view videos with hard scene changes. The paper includes a comparative performance assessment of the proposed architecture against an extended architecture based on 3D digital multimedia broadcast (3D-DMB) and the hierarchical B-picture (HBP) architecture, which are the two most widely used architectures for MSR-MVC. The assessment experiments show that, when coding single-scene videos, the proposed architecture requires on average 13.1 kbps less bitrate, 14% less coding time, and 31.6% less memory than a corresponding codec that deploys the extended 3D-DMB architecture. Furthermore, the codec that deploys the proposed architecture accelerates coding by 57% on average and requires 52% less memory than a corresponding codec that uses the HBP architecture. On the other hand, multi-view video coding with the proposed architecture needs on average 24.9 kbps more bitrate than a corresponding codec that uses the HBP architecture. For coding a multi-view video with hard scene changes, the proposed architecture yields a lower bitrate (by, on average, 28.7 to 35.4 kbps) and a shorter coding time (by, on average, 64 and 33%), compared with the HBP and extended 3D-DMB architectures, respectively. The proposed architecture will thus be most beneficial in low-bitrate applications that require multi-view video coding of content depicting hard scene changes.
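
The abstract refers to block matching statistics gathered over temporal and inter-view reference frames. The sketch below is a purely illustrative, minimal Python/NumPy example of a generic full-search, SAD-based block matching routine and of how simple statistics (which reference frame wins per block, and the mean best-match cost) might be collected; the function names, parameters, and synthetic frames are hypothetical and do not reproduce the algorithm described in the paper.

import numpy as np

def block_match_sad(block, ref, top, left, search_range=8):
    """Full-search block matching: return (best_sad, dy, dx) for `block`,
    originally located at (top, left), against reference frame `ref`."""
    bh, bw = block.shape
    best = (np.inf, 0, 0)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + bh > ref.shape[0] or x + bw > ref.shape[1]:
                continue  # skip candidate positions that fall outside the frame
            sad = np.abs(block.astype(np.int32)
                         - ref[y:y + bh, x:x + bw].astype(np.int32)).sum()
            if sad < best[0]:
                best = (sad, dy, dx)
    return best

def matching_statistics(cur, refs, block_size=16, search_range=8):
    """For each block of `cur`, record which reference frame gives the lowest
    SAD (a crude indicator of whether temporal or inter-view prediction wins)
    and accumulate the mean best-match cost."""
    h, w = cur.shape
    wins = np.zeros(len(refs), dtype=int)
    best_costs = []
    for top in range(0, h - block_size + 1, block_size):
        for left in range(0, w - block_size + 1, block_size):
            block = cur[top:top + block_size, left:left + block_size]
            costs = [block_match_sad(block, ref, top, left, search_range)[0]
                     for ref in refs]
            wins[int(np.argmin(costs))] += 1
            best_costs.append(min(costs))
    return wins, float(np.mean(best_costs))

if __name__ == "__main__":
    # Synthetic frames only: a horizontally shifted copy stands in for a
    # temporal reference, and unrelated noise for an inter-view reference.
    rng = np.random.default_rng(0)
    cur = rng.integers(0, 256, (64, 64), dtype=np.uint8)
    temporal_ref = np.roll(cur, shift=2, axis=1)
    interview_ref = rng.integers(0, 256, (64, 64), dtype=np.uint8)
    wins, mean_sad = matching_statistics(cur, [temporal_ref, interview_ref])
    print("blocks best matched per reference:", wins, "| mean best SAD:", mean_sad)

In this toy setup, most blocks match the shifted (temporal) reference with near-zero SAD; statistics of this kind, tallied per reference frame, are the sort of evidence that can motivate favouring particular prediction directions in an architecture design.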

Keywords: H.264/AVC; Mixed spatial-resolution; Multi-view video coding; Prediction architecture
Journal: EURASIP Journal on Image and Video Processing
Journal citation: 2017 (1)
ISSN: 1687-5281; 1687-5176
Year: 2017
Publisher: SpringerOpen
Publisher's version
License: CC BY
Digital Object Identifier (DOI): https://doi.org/10.1186/s13640-017-0164-7
Publication dates
Print: 13 Feb 2017
Publication process dates
Deposited: 16 May 2017
Accepted: 23 Jan 2017
Funder: Staffordshire University
Copyright information: © The authors 2017. Open Access: this article is distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Permalink: https://repository.uel.ac.uk/item/84x1q

Related outputs

A Frequency Bin Analysis of Distinctive Ranges Between Human and Deepfake Generated Voices
Maltby, H., Wall, J., Glackin, C., Moniri, M., Cannings, N. and Salami, I. 2024. A Frequency Bin Analysis of Distinctive Ranges Between Human and Deepfake Generated Voices. 2024 International Joint Conference on Neural Networks (IJCNN) - Neural Networks Models. Yokohama, Japan 30 Jun - 05 Jul 2024 IEEE.
Enhancing Automatic Speech Recognition Quality with a Second-Stage Speech Enhancement Generative Adversarial Network
Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2023. Enhancing Automatic Speech Recognition Quality with a Second-Stage Speech Enhancement Generative Adversarial Network. The 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI). Atlanta, Georgia (USA) 06 - 08 Nov 2023 IEEE Computer Society. https://doi.org/10.1109/ICTAI59109.2023.00087
A Deep Learning Speech Enhancement Architecture Optimised for Speech Recognition and Hearing Aids
Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2023. A Deep Learning Speech Enhancement Architecture Optimised for Speech Recognition and Hearing Aids. The 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI). Atlanta, Georgia (USA) 06 - 08 Nov 2023 IEEE Computer Society. https://doi.org/10.1109/ICTAI59109.2023.00088
An Extended Reality Solution for Mitigating the Video Fatigue of Online Meetings
Glackin, C., Cannings, N., Poobalasingam, V., Wall, J., Sharif, S. and Moniri, M. 2023. An Extended Reality Solution for Mitigating the Video Fatigue of Online Meetings. in: Jung, T. and tom Dieck, M. C. (ed.) XR-Metaverse Cases: Business Application of AR, VR, XR and Metaverse. Springer. pp. 45-54
Short Utterance Dialogue Act Classification Using a Transformer Ensemble
Maltby, H., Wall, J., Goodluck Constance, T., Moniri, M., Glackin, C., Rajwadi, M. and Cannings, N. 2023. Short Utterance Dialogue Act Classification Using a Transformer Ensemble. UA-DIGITAL 2023: UA Digital Theme Research Twinning. Online virtual conference 27 - 31 Mar 2023
Deception Detection in Conversations using the Proximity of Linguistic Markers
Bajaj, N., Rajwadi, M., Goodluck Constance, T., Wall, J., Moniri, M., Laird, T., Woodruff, C., Laird, J., Glackin, C. and Cannings, N. 2023. Deception Detection in Conversations using the Proximity of Linguistic Markers. Knowledge-Based Systems. 23 (Art. 110422). https://doi.org/10.1016/j.knosys.2023.110422
Convolutional Recurrent Smart Speech Enhancement Architecture for Hearing Aids
Abdallah Abdelhafiz Nossier, S., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2022. Convolutional Recurrent Smart Speech Enhancement Architecture for Hearing Aids. INTERSPEECH 2022. Incheon, Korea 18 - 22 Sep 2022
Two-Stage Deep Learning Approach for Speech Enhancement and Reconstruction in The Frequency and Time Domains
Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2022. Two-Stage Deep Learning Approach for Speech Enhancement and Reconstruction in The Frequency and Time Domains. WCCI 2022: IEEE World Congress on Computational Intelligence. Padua, Italy 23 May - 18 Jul 2022 IEEE. https://doi.org/10.1109/IJCNN55064.2022.9892355
A Mixed Reality Approach for dealing with the Video Fatigue of Online Meetings
Wall, J., Poobalasingam, V., Sharif, S., Moniri, M., Glackin, C. and Cannings, N. 2022. A Mixed Reality Approach for dealing with the Video Fatigue of Online Meetings. 7th International XR Conference. Lisbon, Portugal 27 - 29 Apr 2022
Resolving Ambiguity in Hedge Detection by Automatic Generation of Linguistic Rules
Goodluck Constance, T., Bajaj, N., Rajwadi, M., Maltby, H., Wall, J., Moniri, M., Woodruff, C., Laird, T., Laird, J., Glackin, C. and Cannings, N. 2021. Resolving Ambiguity in Hedge Detection by Automatic Generation of Linguistic Rules. 30th International Conference on Artificial Neural Networks (ICANN). Online 14 - 17 Sep 2021 Springer. https://doi.org/10.1007/978-3-030-86383-8_30
An Experimental Analysis of Deep Learning Architectures for Supervised Speech Enhancement
Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2020. An Experimental Analysis of Deep Learning Architectures for Supervised Speech Enhancement. Electronics. 10 (Art. 17). https://doi.org/10.3390/electronics10010017
Mapping and Masking Targets Comparison using Different Deep Learning based Speech Enhancement Architectures
Abdallah Abdelhafiz Nossier, S., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2020. Mapping and Masking Targets Comparison using Different Deep Learning based Speech Enhancement Architectures. 2020 International Joint Conference on Neural Networks (IJCNN). Glasgow, UK 19 - 24 Jul 2020 IEEE. https://doi.org/10.1109/IJCNN48605.2020.9206623
A Comparative Study of Time and Frequency Domain Approaches to Deep Learning based Speech Enhancement
Abdallah Abdelhafiz Nossier, S., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2020. A Comparative Study of Time and Frequency Domain Approaches to Deep Learning based Speech Enhancement. 2020 International Joint Conference on Neural Networks (IJCNN). Glasgow, UK 19 - 24 Jul 2020 IEEE. https://doi.org/10.1109/IJCNN48605.2020.9206928
Fraud detection in telephone conversations for financial services using linguistic features
Bajaj, N., Goodluck Constance, T., Rajwadi, M., Wall, J., Moniri, M., Glackin, C., Cannings, N., Woodruff, C. and Laird, J. 2019. Fraud detection in telephone conversations for financial services using linguistic features. Neural Information Processing Systems - NeurIPS 2019. Vancouver, Canada 08 - 14 Dec 2019 NeurIPS.
A Framework for Augmented Reality Based Shared Experiences
Ali, A., Glackin, C., Cannings, N., Wall, J., Sharif, S. and Moniri, M. 2019. A Framework for Augmented Reality Based Shared Experiences. Immersive Learning Research Network - iLRN. London, UK 23 - 27 Jun 2019 Technischen Universität Graz. https://doi.org/10.3217/978-3-85125-657-4-24
Comparative Analysis on the Competitiveness of Conventional and Compressive Sensing-based Query Processing
Fayed, Salema, Youssef, Sherin, El-Helw, Amr, Akbari, Akbar Sheikh, Patwary, Mohammad and Moniri, M. 2014. Comparative Analysis on the Competitiveness of Conventional and Compressive Sensing-based Query Processing. in: Advances in Information Science and Applications, Volume 1: Proceedings of the 18th International Conference on Computers (part of CSCC '14) Institute for Natural Sciences and Engineering (INASE).
Evaluation of Performance Enhancement for Crash Constellation Prediction via Car-to-Car Communication
Kuehbeck, Thomas, Hakobyan, Gor, Sikora, Axel, Chibelushi, Claude C. and Moniri, M. 2014. Evaluation of Performance Enhancement for Crash Constellation Prediction via Car-to-Car Communication. in: Communication Technologies for Vehicles Springer.
Compressive Sensing-based Target Tracking for Wireless Visual Sensor Networks
Fayed, Salema, Youssef, Sherin, El-Helw, Amr, Patwary, Mohammad and Moniri, M. 2014. Compressive Sensing-based Target Tracking for Wireless Visual Sensor Networks. in: Advances in Information Science and Applications, Volume I: Proceedings of the 18th International Conference on Computers (part of CSCC '14) Institute for Natural Sciences and Engineering (INASE).
Analytical framework for Adaptive Compressive Sensing for Target Detection within Wireless Visual Sensor Networks
Fayed, Salema, Youssef, Sherin, El-Helw, Amr, Patwary, Mohammad and Moniri, M. 2017. Analytical framework for Adaptive Compressive Sensing for Target Detection within Wireless Visual Sensor Networks. Multimedia Tools and Applications. 77 (13), pp. 16533-16559. https://doi.org/10.1007/s11042-017-5227-3
A Hybrid Adaptive Compressive Sensing Model for Visual Tracking in Wireless Visual Sensor Networks
Fayed, Salema, Youssef, Sherin, El-Helw, Amr, Patwary, Mohammad and Moniri, M. 2015. A Hybrid Adaptive Compressive Sensing Model for Visual Tracking in Wireless Visual Sensor Networks. International Journal of Circuits, Systems, and Signal Processing. 9, pp. 134-144.
Adaptive compressive sensing for target tracking within wireless visual sensor networks-based surveillance applications
Fayed, Salema, Youssef, Sherin M., El-Helw, Amr, Patwary, Mohammad and Moniri, M. 2015. Adaptive compressive sensing for target tracking within wireless visual sensor networks-based surveillance applications. Multimedia Tools and Applications. 75 (11), pp. 6347-6371. https://doi.org/10.1007/s11042-015-2575-8
Image segmentation using adaptive video analytics, Image processing, US 9047677 B2
Sedky, Mohamed Hamed Ismail, Chibelushi, Claude Chilufya and Moniri, M. 2015. Image segmentation using adaptive video analytics, Image processing, US 9047677 B2. US 13/140,378
Towards a fully automated monitoring system for Manhole Cover: Smart cities and IOT applications
Aly, Hesham H., Soliman, Abdel Hamid and Moniri, M. 2015. Towards a fully automated monitoring system for Manhole Cover: Smart cities and IOT applications. in: 2015 IEEE First International Smart Cities Conference (ISC2) IEEE. pp. 24-30
Spectral-360: A Physics-Based Technique for Change Detection
Sedky, Mohamed, Moniri, M. and Chibelushi, Claude C. 2014. Spectral-360: A Physics-Based Technique for Change Detection. 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2014). Columbus, OH, USA 23 - 28 Jun 2014 IEEE. pp. 405-408 https://doi.org/10.1109/CVPRW.2014.65