Leveraging Vision Language Models for Automated NVH Analysis

Authors

BALAJI CHANDRASEKARAN

Abstract

Content: Optimizing Noise, Vibration, and Harshness (NVH) is crucial in the automotive industry. Traditional analysis methods, including time series analytics and frequency spectrum analysis, as well as advanced techniques like time-frequency spectrogram analysis, are often time-consuming and rely heavily on expert interpretation. This research explores the potential of Vision Language Models (VLMs) specifically LLaMA 3.2 Vision, GPT-4o, and Microsoft Phi-3 vision for automating noise spectrogram analysis. By employing techniques such as few-shot learning and fine-tuning these models on a curated dataset of spectrograms paired with expert interpretations, we aim to facilitate rapid and accurate NVH assessments and report generation. The proposed approach includes assembling a dataset derived from vibration sensors and expert analyses to inform model training. Furthermore, we will evaluate the accuracy and reliability of the trained models to ensure robust performance. The techniques discussed in this research for predicting NVH characteristics from new spectrograms contribute to the automation of NVH analysis and have the potential for adaptation across a wide range of applications.

Meta Tags

Topics: Analysis methodologies
Vibration
Noise
Harshness
Research and development
Noise, Vibration, and Harshness (NVH)
Education and training
Assembling

Details

Citation: CHANDRASEKARAN, B., and Cury, R., "Leveraging Vision Language Models for Automated NVH Analysis," SAE Technical Paper 2025-01-0127, 2023, .

Additional Details