Multi-temporal remote sensing image registration using deep neural networks and regions of interest

Soryani, Mohsen; Hosseini Panah, Seyed Mohammad; Khoshsima, Masoud

doi:10.22034/jssta.2024.462010.1173

Article type: Research paper

Authors

Mohsen Soryani ¹

Seyed Mohammad Hosseini Panah ²

Masoud Khoshsima ³

¹ Faculty member, Iran University of Science and Technology

² Student, Iran University of Science and Technology

³ Iranian Space Research Center

Abstract

Image registration aims to align two or more images of the same scene acquired at different times and/or from different viewpoints and/or with different sensors. With the steady improvement of Earth-observation capabilities in recent years, there is a growing need for new registration models that can cope with the heavy computational load of registering and processing such images while remaining highly accurate. In this paper, regions of interest are used to shrink the search area and increase accuracy. To this end, the regions that are common to the two images are first identified, and registration is then performed with respect to these matching regions. A transformer deep neural network is used to find the regions of interest. The network comprises several self-attention and cross-attention layers, which learn the importance of different positions within one image and between the two images. The proposed model is self-supervised and generates its training data with the "segment swapping" method. The training data were collected from Google Earth images and annotated by the authors. After the model is trained and the matching regions are obtained, the widely used SIFT method is applied to extract features and register the images. Sentinel-2 images are used for testing, and the root mean square error (RMSE) is used for quantitative evaluation. The quantitative and qualitative results show improved cost and accuracy compared with conventional methods for aerial image registration.
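
The RMSE criterion named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula or the transformation model, so the 2x3 affine matrix, the function names `apply_affine` and `registration_rmse`, and the check points below are all assumptions made for illustration only.

```python
import math

def apply_affine(matrix, point):
    """Map an (x, y) point through a 2x3 affine matrix [[a, b, tx], [c, d, ty]]."""
    (a, b, tx), (c, d, ty) = matrix
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)

def registration_rmse(matrix, sensed_points, reference_points):
    """RMSE (in pixels) between transformed sensed points and their reference matches."""
    if len(sensed_points) != len(reference_points) or not sensed_points:
        raise ValueError("need two equally sized, non-empty point lists")
    squared_error = 0.0
    for sensed, reference in zip(sensed_points, reference_points):
        mapped_x, mapped_y = apply_affine(matrix, sensed)
        squared_error += (mapped_x - reference[0]) ** 2 + (mapped_y - reference[1]) ** 2
    return math.sqrt(squared_error / len(sensed_points))

# Hypothetical check points: the sensed image is shifted by (+5, -3) pixels,
# and the estimated transform (assumed here) recovers that shift exactly.
translation = [[1.0, 0.0, 5.0], [0.0, 1.0, -3.0]]
sensed = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
reference = [(5.0, -3.0), (15.0, -3.0), (5.0, 7.0), (15.0, 7.0)]
print(registration_rmse(translation, sensed, reference))  # 0.0
```

A lower value means the estimated transform maps the matched points closer to their reference positions. In practice the check points would come from matched features rather than from hand-written coordinates.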

Keywords

Remote sensing; Image matching; Region of interest; Transformer deep neural network

Subjects

Remote sensing
