Enhancing Kitchen Independence: Deep Learning-Based Object Detection for Visually Impaired Assistance

Bo Dang; Danqing Ma; Shaojie Li; Xinqi Dong; Hengyi Zang; Rui Ding

doi:10.54097/hc3f1045

Keywords:

Machine Learning, Object Detection, MobileNet SSD, TensorFlow Lite, Deep Learning, Transfer Learning, Text-to-Speech.

Abstract

Visually impaired individuals face substantial challenges in kitchens, where accurately identifying objects is crucial yet difficult because the environment is complex and variable. Traditional object detection¹ methods fall short in these settings because they struggle with the wide assortment of items. This research highlights the need for kitchen-specific solutions that use deep learning to improve detection accuracy and offer real-time, interactive guidance through speech technologies. By focusing on the unique demands of kitchen environments, the proposed system aims to enhance the autonomy and safety of visually impaired users, presenting an advance in assistive technology. The effectiveness of this approach is assessed by its ability to accurately identify kitchen items for visually impaired individuals.
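
As a rough illustration of the detection-to-speech idea described in the abstract, the following minimal sketch turns a list of detections into a sentence that a text-to-speech engine could read aloud. It is not the authors' implementation: the `Detection` type, the `announce` function, the 0.5 confidence threshold, and the sample labels are all hypothetical, and no real detector or speech engine is called.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Detection:
    """One detector output (hypothetical structure for illustration)."""
    label: str
    score: float  # confidence in [0, 1]


def announce(detections: Iterable[Detection], min_score: float = 0.5) -> str:
    """Build a short sentence naming the confidently detected items.

    Detections below min_score are dropped, the rest are ordered from
    most to least confident, and repeated labels are mentioned once.
    """
    kept = sorted(
        (d for d in detections if d.score >= min_score),
        key=lambda d: d.score,
        reverse=True,
    )
    labels = []
    for d in kept:
        if d.label not in labels:
            labels.append(d.label)
    if not labels:
        return "No kitchen items detected."
    return "Detected: " + ", ".join(labels) + "."


if __name__ == "__main__":
    sample = [
        Detection("knife", 0.91),
        Detection("cup", 0.42),
        Detection("kettle", 0.77),
        Detection("knife", 0.60),
    ]
    # The returned string would be handed to a speech engine in a real system.
    print(announce(sample))
```

In a deployed system the detections would come from the trained model and the returned string would be passed to a speech engine; the threshold trades missed items against spurious announcements and would need tuning with users.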

References

Liu, Y., Yang, H. & Wu, C. Unveiling Patterns: A Study on Semi-Supervised Classification of Strip Surface Defects. IEEE Access 11, 119933–119946 (2023).

Qiao, Y., Ni, F., Xia, T., Chen, W. & Xiong, J. Automatic Recognition of Static Phenomena in Retouched Images: A Novel Approach. in The 1st International scientific and practical conference “Advanced technologies for the implementation of new ideas” (January 09-12, 2024) Brussels, Belgium. International Science Group. 2024. 349 p. 287 (2024).

Liang, Z. et al. Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation. Preprint at (2023).

Ni, F., Zang, H. & Qiao, Y. SMARTFIX: Leveraging Machine Learning for Proactive Equipment Maintenance in Industry 4.0. in The 2nd International scientific and practical conference “Innovations in education: prospects and challenges of today” (January 16-19, 2024) Sofia, Bulgaria. International Science Group. 2024. 389 p. 313 (2024).

Pan, Z. et al. Ising-Traffic: Using Ising Machine Learning to Predict Traffic Congestion under Uncertainty. Proceedings of the AAAI Conference on Artificial Intelligence 37, 9354–9363 (2023).

Liu, S., Wu, K., Jiang, C., Huang, B. & Ma, D. Financial Time-Series Forecasting: Towards Synergizing Performance And Interpretability Within a Hybrid Machine Learning Approach. Preprint at (2023).

Li, S. S. et al. A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors. Preprint at (2023).

Qiao, Y., Jin, J., Ni, F., Yu, J. & Chen, W. Application of Machine Learning in Financial Risk Early Warning and Regional Prevention and Control: A Systematic Analysis Based on SHAP. World Trends, Realities and Accompanying Problems of Development 331, (2023).

Wei, J., Zhang, Y., Zhou, Z., Li, Z. & Al Faruque, M. A. Leaky DNN: Stealing Deep-Learning Model Secret with GPU Context-Switching Side-Channel. in 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) 125–137 (2020). doi:10.1109/DSN48063.2020.00031.

Xiao, T., Zeng, L., Shi, X., Zhu, X. & Wu, G. Dual-Graph Learning Convolutional Networks for Interpretable Alzheimer’s Disease Diagnosis. in International Conference on Medical Image Computing and Computer-Assisted Intervention 406–415 (2022).

Zhang, X. et al. A Brief Survey of Machine Learning and Deep Learning Techniques for E-Commerce Research. Journal of Theoretical and Applied Electronic Commerce Research 18, 2188–2216 (2023).

Mittal, G. et al. HyperSTAR: Task-Aware Hyperparameters for Deep Networks. Preprint at (2020).

Su, J., Nair, S. & Popokh, L. Optimal Resource Allocation in SDN/NFV-Enabled Networks via Deep Reinforcement Learning. in 2022 IEEE Ninth International Conference on Communications and Networking (ComNet) 1–7 (2022). doi:10.1109/ComNet55492.2022.9998475.

Li, Y., Liu, T., Jiang, D. & Meng, T. Transfer-learning-based Network Traffic Automatic Generation Framework. in 2021 6th International Conference on Intelligent Computing and Signal Processing (ICSP) 851–854 (2021). doi:10.1109/ICSP51882.2021.9408767.

Liu, K., Han, Y., Gong, Z. & Xu, H. Low-Data Drug Design with Few-Shot Generative Domain Adaptation. Bioengineering 10, (2023).

Si, S. et al. SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents. in Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2023).

Jin, X., Larson, J., Yang, W. & Lin, Z. Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models. Preprint at (2023).

Gu, Y. et al. Mutual Correlation Attentive Factors in Dyadic Fusion Networks for Speech Emotion Recognition. in Proceedings of the 27th ACM International Conference on Multimedia 157–166 (Association for Computing Machinery, New York, NY, USA, 2019). doi:10.1145/3343031.3351039.

Jin, X. & Wang, Y. Understand Legal Documents with Contextualized Large Language Models. arXiv preprint arXiv:2303.12135 (2023).

Chen, K. et al. Chemist-X: Large Language Model-empowered Agent for Reaction Condition Recommendation in Chemical Synthesis. Preprint at (2024).

Chen, Y., Arkin, J., Zhang, Y., Roy, N. & Fan, C. Scalable Multi-Robot Collaboration with Large Language Models: Centralized or Decentralized Systems? Preprint at (2023).

Guo, Z. & Cao, Y. SA-CNN: Application to text categorization issues using simulated annealing-based convolutional neural network optimization. in Proceedings of the 2022 6th International Conference on Electronic Information Technology and Computer Engineering (ACM, 2022). doi:10.1145/3573428.3573788.

Li, Q., Hu, Y., Dong, Y., Zhang, D. & Chen, Y. Focus on Hiders: Exploring Hidden Threats for Enhancing Adversarial Training. arXiv preprint arXiv:2312.07067 (2023).

Xiao, Y. & Alam, F. Nexus at ArAIEval Shared Task: Fine-Tuning Arabic Language Models for Propaganda and Disinformation Detection. Preprint at (2023).

Ukey, N. et al. Survey on Exact kNN Queries over High-Dimensional Data Space. Sensors 23, (2023).

Wantlin, K. et al. BenchMD: A Benchmark for Modality-Agnostic Learning on Medical Images and Sensors. arXiv preprint arXiv:2304.08486 (2023).

Li, L. CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting. Preprint at (2023).

Popokh, L., Su, J., Nair, S. & Olinick, E. IllumiCore: Optimization Modeling and Implementation for Efficient VNF Placement. in 2021 International Conference on Software, Telecommunications and Computer Networks (SoftCOM) 1–7 (2021).

Wang, H. et al. QuantPipe: Applying Adaptive Post-Training Quantization For Distributed Transformer Pipelines In Dynamic Edge Environments. in ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 1–5 (2023).

Xu, C., Yu, J., Chen, W. & Xiong, J. Deep Learning in Photovoltaic Power Generation Forecasting: CNN-LSTM Hybrid Neural Network Exploration and Research. in The 3rd International scientific and practical conference “Technologies in education in schools and universities” (January 23-26, 2024) Athens, Greece. International Science Group. 2024. 363 p. 295 (2024).