GENERATIVE AI FOR AUTOMATED MACHINE LEARNING: A COMPARATIVE STUDY

Authors

  • Gaman Kumar Nayak, Nikhil Dayakar K, Rakesh Kumar B

DOI:

https://doi.org/10.25215/8194288770.31

Abstract

The growing capabilities of Large Language Models (LLMs) such as ChatGPT, Gemini, Qwen, and DeepSeek have redefined the boundaries of traditional data science workflows. This research presents a comparative study of how well LLMs can automate machine learning tasks through prompt-based interaction. Four LLMs (ChatGPT, Gemini, Qwen, and DeepSeek) were evaluated on their handling of end-to-end machine learning processes across multiple Kaggle datasets from diverse sectors such as finance and healthcare. A single, comprehensive prompt was developed after reviewing relevant literature on effective prompt engineering for data-driven tasks. The outputs were analyzed based on key metrics such as accuracy, precision, recall, F1-score, and confusion matrices. The results reveal notable differences in model interpretation, consistency, and analytical depth among the LLMs. This comparative analysis aims to identify which LLM executes machine learning workflows most efficiently and accurately, providing insight into the potential of generative AI for automated data science.
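
To make the evaluation metrics named in the abstract concrete, the following minimal Python sketch derives accuracy, precision, recall, and F1-score from the four cells of a binary confusion matrix. It is an illustration only, not the authors' evaluation code: the function names, the binary setting, the choice of `1` as the positive label, and the sample labels are all assumptions made for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count (tp, fp, fn, tn) for a binary task; `positive` is an assumed label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1          # predicted positive, actually positive
        elif pred == positive:
            fp += 1          # predicted positive, actually negative
        elif truth == positive:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Derive the standard metrics from the confusion-matrix counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels, purely for illustration (not data from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Computing the same dictionary for each LLM's generated pipeline on a shared held-out split would put the models' outputs on a common scale; multi-class datasets would additionally need an averaging scheme (macro or weighted), which this sketch does not cover.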

Published

2026-03-11