ChatGPT answers more than half of software engineering questions incorrectly

June Wan/ZDNET

ChatGPT’s ability to provide conversational answers to any question at any time makes the chatbot a handy resource for your information needs. Despite the convenience, a new study finds that you may not want to use ChatGPT for software engineering prompts.

Before the rise of AI chatbots, Stack Overflow was the go-to resource for programmers who needed advice for their projects, with a question-and-answer model similar to ChatGPT’s.

However, with Stack Overflow, you have to wait for someone to answer your question, whereas ChatGPT responds immediately.

As a result, many software engineers and programmers have turned to ChatGPT with their questions. Since there was no data showing how effective ChatGPT is at answering those types of prompts, a new Purdue University study investigated the question.

To find out how accurate ChatGPT is in answering software engineering prompts, the researchers gave ChatGPT 517 Stack Overflow questions and examined the accuracy and quality of its answers.

The results showed that out of the 517 questions, 259 (52%) of ChatGPT’s answers were incorrect and only 248 (48%) were correct. Moreover, a whopping 77% of the answers were verbose.

Despite the high error rate, the results did show that 65% of the answers were comprehensive, addressing all aspects of the question.
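Percentages like these are tallies over labeled answers. As a rough illustration only, here is a minimal Python sketch that assumes each answer has been hand-labeled for correctness, verbosity, and comprehensiveness. The `LabeledAnswer` class, the `summarize` function, and the four sample labels are hypothetical and are not taken from the Purdue study.

```python
from dataclasses import dataclass


@dataclass
class LabeledAnswer:
    """One answer with hypothetical reviewer labels."""
    correct: bool
    verbose: bool
    comprehensive: bool


def summarize(labels):
    """Return the percentage of answers that are incorrect, verbose, and comprehensive."""
    total = len(labels)
    if total == 0:
        raise ValueError("no labeled answers to summarize")

    def pct(count):
        # Share of all answers, as a percentage rounded to one decimal place.
        return round(100 * count / total, 1)

    return {
        "incorrect": pct(sum(not a.correct for a in labels)),
        "verbose": pct(sum(a.verbose for a in labels)),
        "comprehensive": pct(sum(a.comprehensive for a in labels)),
    }


# Made-up labels for four answers, for illustration only.
sample = [
    LabeledAnswer(correct=False, verbose=True, comprehensive=True),
    LabeledAnswer(correct=True, verbose=True, comprehensive=True),
    LabeledAnswer(correct=False, verbose=True, comprehensive=False),
    LabeledAnswer(correct=True, verbose=False, comprehensive=True),
]

print(summarize(sample))
# {'incorrect': 50.0, 'verbose': 75.0, 'comprehensive': 75.0}
```

Note that the categories are independent of one another: in this toy data, the first answer is both incorrect and comprehensive, which mirrors the article's point that a thorough-looking answer can still be wrong.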

To further analyze the quality of ChatGPT responses, the researchers asked 12 participants with different levels of programming expertise to give their insights on the answers.

Although the participants preferred Stack Overflow’s responses over ChatGPT’s across various categories, as shown in the graph, they failed to identify incorrect ChatGPT-generated answers 39.34% of the time.

Purdue University

According to the study, the well-articulated responses ChatGPT outputs caused the users to overlook incorrect information in the answers.

“Users overlook incorrect information in ChatGPT answers (39.34% of the time) due to the comprehensive, well-articulated, and humanoid insights in ChatGPT answers,” the authors wrote.

The generation of plausible-sounding answers that are incorrect is a significant issue across all chatbots because it enables the spread of misinformation. In addition to that risk, the low accuracy scores should be enough to make you reconsider using ChatGPT for these types of prompts.
