Get the latest installment of AI code quality research, which has been cited by industry leaders.
AI Copilot Code Quality: 2024 Data Shows 4x More Code Cloning
Stack Overflow's 2024 Developer Survey showed that 63% of professional developers are currently using AI in their development process, with another 14% saying they plan to begin using AI soon.
We examine five years' worth of data (2020-2024), using the largest dataset yet brought to bear on the question (211 million changed lines across repos like Chromium and Visual Studio Code), to assess how AI assistants influence the quality of code being written.
We observe a spike in the prevalence of duplicate code blocks, along with increases in short-term churn code, and the continued decline of moved lines (code reuse).
Start reading to learn:
- What code quality metrics are threatened by the proliferation of AI?
- What do Technical Leaders need to be on the lookout for in 2025?
- How can you measure the impact of AI on your team's code quality?
Abstract
Code assistants took on a far greater share of code-writing responsibility during 2024 than ever before. According to Stack Overflow’s 2024 Developer Survey, 63% of Professional Developers said they currently use AI in their development process. Another 14% said they plan to start soon.
The 36,894 developers who answered “most important benefit you’re hoping to achieve with AI” overwhelmingly picked “Increased productivity.” Developers seem to view AI as a means to write more code, faster. Through the lens of “does more code get written?”, common sense and research agree: a resounding yes. AI assistants do beget more lines.
But if you ask a Senior Developer “what would unlock your team’s full potential?” their answer won’t be “more code lines added.” To retain high project velocity over years, research suggests that a DRY (Don’t Repeat Yourself), modular approach to building is essential [4] [5]. Canonical systems are documented, well-tested, reused, and periodically upgraded. This research aims to evaluate how today’s profusion of LLM-authored code is influencing the maintainability of tomorrow’s systems.
To investigate, we will analyze the largest known database of highly structured code change data: 211 million changed lines of code, authored between January 2020 and December 2024, assessed on quantifiable “code quality” metrics.
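To make "quantifiable" concrete, here is a minimal sketch of how one such metric, duplicated code blocks, might be approximated on a single file. It is an illustration under stated assumptions, not the method used in this research: the function name `duplicate_blocks`, the `window` size, and the whitespace-only normalization are all hypothetical choices made for the example.

```python
from collections import defaultdict


def duplicate_blocks(lines, window=5):
    """Find runs of `window` consecutive non-blank lines that occur more than once.

    Hypothetical helper for illustration only. Lines are compared after
    stripping leading/trailing whitespace; `window` is an assumed threshold.
    Returns a dict mapping each repeated block (a tuple of lines) to the
    list of 0-based start indices where it appears.
    """
    normalized = [line.strip() for line in lines]
    seen = defaultdict(list)
    for start in range(len(normalized) - window + 1):
        block = tuple(normalized[start:start + window])
        # Skip windows that contain a blank line, so empty padding
        # between statements is never counted as a duplicate.
        if all(block):
            seen[block].append(start)
    return {block: starts for block, starts in seen.items() if len(starts) > 1}


if __name__ == "__main__":
    sample = [
        "a = 1",
        "b = 2",
        "c = a + b",
        "",
        "x = 0",
        "    a = 1",
        "b = 2",
        "c = a + b",
    ]
    for block, starts in duplicate_blocks(sample, window=3).items():
        print(f"block repeated at lines {starts}: {block}")
```

A sliding window like this reports overlapping matches for long repeated runs, and it ignores renamed variables, so a production-grade measurement would need more careful normalization and deduplication of overlapping hits.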