Mixer-NeRF: Research on 3D Reconstruction Methods of Neural Radiance Fields Based on Hybrid Spatial Feature Information

Authors

  • Linhua Jiang
  • Ruiye Che
  • Lingxi Hu
  • Xinfeng Chen
  • Yuanyuan Yang
  • Lei Chen
  • Wentong Yang
  • Peng Liu
  • Wei Long

DOI:

https://doi.org/10.54097/0ek2xd53

Keywords:

Neural Radiance Fields, Joint Learning, 3D Reconstruction, Feature Extraction

Abstract

Compared to traditional 3D reconstruction techniques, Neural Radiance Fields (NeRF) have demonstrated significant performance advantages in implicit 3D reconstruction. However, a plain multi-layer perceptron (MLP) model often produces blurry details in the reconstructed scene because it fails to capture the necessary local information during sampling. To address this issue, this paper proposes a joint learning method based on hybrid spatial feature information to enhance detail reconstruction. First, a hybrid spatial feature information learning module (MLCA) is inserted after the positional encoding output, both before the features enter the MLP network and before the viewing-direction input; it captures spatial feature information and mixes the feature channels so that each spatial location obtains features from the different channels. Then, a squeeze-and-excitation network module is introduced between the NeRF sampling and inference layers to select high-weight information. Experimental results show that the proposed 3D reconstruction method based on hybrid spatial feature information effectively improves NeRF's reconstruction performance. It is applicable to complex, real-world, multi-view scenes and performs best under high-texture and complex-lighting conditions. Compared to NeRF, the method improves the Learned Perceptual Image Patch Similarity (LPIPS) metric by more than 30%.
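The abstract describes the two modules only at a high level. The PyTorch sketch below is a rough illustration of where they could sit in the pipeline, not the authors' implementation: `ChannelMixer` is a hypothetical stand-in for the MLCA module (here a residual channel-mixing MLP over the encoded features), `SEBlock` follows the standard squeeze-and-excitation design of Hu et al. (CVPR 2018) adapted to per-ray sample features, and the layer sizes, reduction ratio `r`, and the choice to pool over samples along each ray are all assumptions.

```python
import torch
import torch.nn as nn


class ChannelMixer(nn.Module):
    """Hypothetical stand-in for the MLCA module: mixes the channels of the
    positionally encoded features so each sampled location obtains information
    from all channels. The residual MLP design and hidden width are assumptions."""

    def __init__(self, channels: int, hidden: int = 256):
        super().__init__()
        self.mix = nn.Sequential(
            nn.LayerNorm(channels),
            nn.Linear(channels, hidden),
            nn.GELU(),
            nn.Linear(hidden, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_rays, num_samples, channels) of encoded positions or directions
        return x + self.mix(x)  # residual channel mixing at each spatial location


class SEBlock(nn.Module):
    """Squeeze-and-excitation gate (Hu et al., CVPR 2018), adapted here to
    per-ray NeRF sample features; the reduction ratio r = 4 is an assumption."""

    def __init__(self, channels: int, r: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_rays, num_samples, channels)
        s = x.mean(dim=1)              # "squeeze": pool over the samples along each ray
        w = self.gate(s).unsqueeze(1)  # "excitation": per-channel weights in (0, 1)
        return x * w                   # emphasize high-weight channels, damp the rest


# Sketch of the placement described in the abstract, using the typical NeRF
# encoding sizes (63 channels for positions, 27 for viewing directions):
pos_enc = torch.randn(1024, 64, 63)    # (rays, samples per ray, channels)
dir_enc = torch.randn(1024, 64, 27)

mixed_pos = ChannelMixer(63)(pos_enc)  # before the features enter the MLP network
mixed_dir = ChannelMixer(27)(dir_enc)  # before the viewing-direction input
features = SEBlock(63)(mixed_pos)      # between the sampling and inference layers
```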

Published

26-12-2024

How to Cite

Jiang, L., Che, R., Hu, L., Chen, X., Yang, Y., Chen, L., Yang, W., Liu, P., & Long, W. (2024). Mixer-NeRF: Research on 3D Reconstruction Methods of Neural Radiance Fields Based on Hybrid Spatial Feature Information. Journal of Computing and Electronic Information Management, 15(3), 149-155. https://doi.org/10.54097/0ek2xd53