Part 37: AI in Software Development & Vibe Coding

Introduction — The AI Revolution in Software Development

Between 2022 and 2026, artificial intelligence went from a curiosity that could write mediocre fizzbuzz solutions to a daily companion that writes production code, reviews pull requests, generates documentation, and debugs complex systems. This transformation happened faster than any previous technology shift in software engineering.

But here is the critical insight that separates thoughtful engineers from hype followers: AI changes what developers do, not whether they are needed. The assembly line did not eliminate manufacturing jobs — it changed them from manual labour to machine supervision. AI coding tools are doing the same thing to software development.

                            
                            Key Insight: AI does not replace software engineers. It replaces the typing part of software engineering — which was always the smallest part of the job. Understanding requirements, designing systems, making tradeoffs, reviewing correctness, and maintaining code over years remain fundamentally human activities.
                        

The 2022–2026 Timeline

The pace of change has been extraordinary:

Year	Milestone	Impact
2021	GitHub Copilot Technical Preview	First mainstream AI code completion tool
2022	ChatGPT launch (GPT-3.5)	Developers discover conversational coding assistance
2023	GPT-4, Claude, Copilot Chat	Multimodal AI understands code context deeply
2024	Cursor, Windsurf, AI-native IDEs	AI moves from plugin to foundational IDE experience
2025	Agentic coding (Copilot Agent Mode, Cursor Composer)	AI autonomously edits multiple files, runs tests, iterates
2026	AI agents in CI/CD, automated PR creation	AI operates within delivery pipelines, not just editors

Understanding this trajectory matters because many of the tools you use today may be obsolete within 18 months, while the principles of working effectively with AI will compound over your entire career.

AI-Assisted Coding Tools

The market for AI coding assistants has exploded. Each tool takes a slightly different approach to the same fundamental problem: how do we help developers write correct code faster?

How They Work

Most modern AI coding tools share the same basic architecture:

AI Code Completion Architecture

flowchart LR
    A["Developer Types Code"] --> B["Context Gathering"]
    B --> C["LLM Inference"]
    C --> D["Suggestion Filtering"]
    D --> E["Display to Developer"]
    B --> F["Current File"]
    B --> G["Open Tabs"]
    B --> H["Repository Context"]
    B --> I["Language Server Info"]

The key components are:

Context window — The text sent to the LLM including surrounding code, imports, comments, and related files
Retrieval-Augmented Generation (RAG) — Searching the codebase for relevant snippets to include in context
Fill-in-the-Middle (FIM) — Specialised model training that predicts code given both prefix and suffix
Post-processing — Filtering suggestions for syntax validity, security issues, and style consistency
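To make the first three components concrete, here is a minimal sketch of how an editor plugin might assemble a completion request. It is an illustration under stated assumptions, not how any particular tool works: the sentinel strings `<PRE>`, `<SUF>`, and `<MID>` are placeholders (each FIM-trained model defines its own), and the keyword-overlap scoring in `retrieve_snippets` is a deliberately naive stand-in for real retrieval.

```python
import keyword
import re

# Placeholder sentinels (assumption): real FIM-trained models define their own.
FIM_PREFIX, FIM_SUFFIX, FIM_MIDDLE = "<PRE>", "<SUF>", "<MID>"
PY_KEYWORDS = {k.lower() for k in keyword.kwlist}


def tokenize(text: str) -> set[str]:
    """Return the lowercase identifiers in text, ignoring Python keywords."""
    words = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text.lower())
    return {w for w in words if w not in PY_KEYWORDS}


def retrieve_snippets(query: str, snippets: list[str], top_k: int = 2) -> list[str]:
    """Naive retrieval: rank snippets by identifiers shared with the query."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(s)), s) for s in snippets]
    # Drop snippets with no overlap, then keep the best-scoring ones
    ranked = sorted((item for item in scored if item[0] > 0), key=lambda item: -item[0])
    return [snippet for _, snippet in ranked[:top_k]]


def build_fim_prompt(prefix: str, suffix: str, snippets: list[str]) -> str:
    """Combine retrieved context, the code before the cursor, and the code after it."""
    context = "".join(f"# Related code:\n{s}\n" for s in snippets)
    return f"{FIM_PREFIX}{context}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


# Code before and after the cursor in the file being edited
prefix = "def total_price(items):\n    total = 0\n    for item in items:\n        total += "
suffix = "\n    return total\n"

# Candidate snippets from elsewhere in the repository (hypothetical)
repo_snippets = [
    "def item_price(item):\n    return item.qty * item.unit_price",
    "def send_email(to, body):\n    ...",
]

relevant = retrieve_snippets(prefix + suffix, repo_snippets)
prompt = build_fim_prompt(prefix, suffix, relevant)
print(prompt)
```

Only the `item_price` snippet shares an identifier with the code around the cursor, so it alone is added to the context. The model is then asked to produce the text that belongs between the prefix and the suffix.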

Tool Comparison

Tool	Model	IDE Support	Key Differentiator	Pricing
GitHub Copilot	GPT-4o, Claude, Gemini (multi-model)	VS Code, JetBrains, Neovim	Deepest GitHub integration, agent mode	$10–39/month
Cursor	Claude, GPT-4o, custom models	Custom VS Code fork	AI-native IDE, Composer multi-file editing	$20/month
Windsurf	Cascade (proprietary)	Custom VS Code fork	Flows — multi-step autonomous coding	$15/month
Amazon Q Developer (formerly CodeWhisperer)	Amazon proprietary	VS Code, JetBrains	AWS integration, security scanning	Free tier available
Tabnine	Custom models (on-prem option)	All major IDEs	Private deployment, no code leaves network	$12/month
Sourcegraph Cody	Claude, GPT-4o	VS Code, JetBrains	Entire codebase context via code graph	Free tier available

Vibe Coding

In early 2025, Andrej Karpathy coined the term "vibe coding" — a style of programming where you describe what you want in natural language, let the AI generate the code, and mostly accept the output without deeply reading every line. You "give in to the vibes" and trust the AI.

                            
                            Andrej Karpathy: "There's a new kind of coding I call 'vibe coding', where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. [...] it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works."
                        

When Vibe Coding Works

Vibe coding is genuinely effective for certain categories of work:

Rapid prototyping — Building a proof-of-concept to validate an idea
Boilerplate generation — CRUD endpoints, configuration files, data models
One-off scripts — Data migration scripts, file processing utilities
Learning and exploration — Trying a new framework or API
Personal projects — Where correctness standards are lower

# Vibe coding example: "Create a Flask API that accepts a JSON body
# with a 'text' field and returns the sentiment using TextBlob"

from flask import Flask, request, jsonify
from textblob import TextBlob

app = Flask(__name__)

@app.route('/sentiment', methods=['POST'])
def analyze_sentiment():
    data = request.get_json()
    if not data or 'text' not in data:
        return jsonify({'error': 'Missing text field'}), 400

    blob = TextBlob(data['text'])
    polarity = blob.sentiment.polarity

    if polarity > 0.1:
        label = 'positive'
    elif polarity < -0.1:
        label = 'negative'
    else:
        label = 'neutral'

    return jsonify({
        'text': data['text'],
        'polarity': round(polarity, 3),
        'label': label
    })

if __name__ == '__main__':
    app.run(debug=True, port=5000)

The code above was generated from a single-sentence description. It works, and for a prototype it is good enough. It is not production-ready, though: there is no limit on input size, no rate limiting, no authentication, and no error handling for TextBlob failures, and running with debug=True in production is a security vulnerability because it exposes the interactive Werkzeug debugger.
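As a rough illustration of closing two of those gaps, the sketch below adds a size-limited payload check and makes debug mode opt-in. It uses only the standard library so it can be read on its own. The 5,000-character limit, the `APP_DEBUG` environment variable, and the function names are assumptions made for this example.

```python
import os

MAX_TEXT_CHARS = 5_000  # assumed limit; tune for your workload


def validate_payload(data: object) -> tuple[bool, str]:
    """Return (ok, error_message) for a decoded JSON request body."""
    if not isinstance(data, dict) or "text" not in data:
        return False, "Missing text field"
    text = data["text"]
    if not isinstance(text, str):
        return False, "text must be a string"
    if not text.strip():
        return False, "text must not be empty"
    if len(text) > MAX_TEXT_CHARS:
        return False, f"text exceeds {MAX_TEXT_CHARS} characters"
    return True, ""


def debug_enabled() -> bool:
    """Debug mode stays off unless APP_DEBUG=1 is set explicitly (hypothetical variable)."""
    return os.environ.get("APP_DEBUG", "0") == "1"


print(validate_payload({"text": "I love this"}))    # (True, '')
print(validate_payload({"text": "x" * 10_000}))     # (False, 'text exceeds 5000 characters')
print(validate_payload(["not", "a", "dict"]))       # (False, 'Missing text field')
```

In the Flask handler, the endpoint would return a 400 response with the error message whenever `ok` is false, and `app.run(debug=debug_enabled())` would replace the hard-coded `debug=True`. Rate limiting and authentication still need separate treatment.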

When Vibe Coding Fails

Vibe coding breaks down for:

Security-critical code — Authentication, authorisation, cryptography, input sanitisation
Complex business logic — Domain-specific rules that require deep understanding
Performance-sensitive paths — Code that must handle millions of requests
Long-lived production systems — Code that must be maintained for years
Distributed systems — Concurrency, consistency, failure handling

Research

Stanford Study: AI-Generated Code Security (2023)

Researchers at Stanford found that developers using AI coding assistants produced significantly more security vulnerabilities than those coding without AI help. Worse, participants using AI were more confident their code was secure. The combination of more bugs and more confidence is dangerous — it means AI does not just introduce vulnerabilities, it reduces the vigilance that would normally catch them.


Prompt Engineering for Developers

The quality of AI-generated code depends heavily on the quality of your prompt. Prompt engineering for code generation follows specific patterns that differ from general LLM prompting.

The Five Pillars of Effective Code Prompts

Context — What language, framework, and existing patterns are you using?
Constraints — What are the limits? (no external dependencies, must run in Python 3.9, must be thread-safe)
Examples — Show the AI what the output should look like
Intent — What is the purpose, not just the mechanics?
Quality requirements — Should it have error handling? Type annotations? Tests?
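One way to keep all five pillars in view is to treat a prompt as structured data and render it to text. The sketch below is only an illustration: the `CodePrompt` class, its field names, and the example values (including `etl/cleaning.py` and `dedupe`) are hypothetical and do not come from any tool.

```python
from dataclasses import dataclass, field


@dataclass
class CodePrompt:
    """Container for the five parts of a code-generation prompt."""
    intent: str
    context: str
    constraints: list[str] = field(default_factory=list)
    examples: list[str] = field(default_factory=list)
    quality: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the non-empty sections as a plain-text prompt."""
        sections = [
            ("Context", [self.context]),
            ("Intent", [self.intent]),
            ("Constraints", self.constraints),
            ("Examples", self.examples),
            ("Quality requirements", self.quality),
        ]
        lines = []
        for title, items in sections:
            items = [item for item in items if item]
            if not items:
                continue  # skip sections the author left empty
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


prompt = CodePrompt(
    intent="Deduplicate customer records before the nightly export",
    context="Python 3.9, pandas, existing helpers live in etl/cleaning.py",
    constraints=["No new dependencies", "Must not mutate the input DataFrame"],
    examples=["dedupe(df, keys=['email']) returns one row per email"],
    quality=["Type annotations", "Docstring", "pytest unit tests"],
)
print(prompt.render())
```

Because every section is an explicit field, a missing pillar shows up as an empty list instead of going unnoticed in free-form text.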

Good vs Bad Prompts

# BAD PROMPT: "Write a function to process data"
# Result: Vague, generic function that probably doesn't do what you need

# GOOD PROMPT:
# "Write a Python function called `calculate_monthly_retention` that:
# - Takes a pandas DataFrame with columns: user_id (str), event_date (datetime), event_type (str)
# - Groups users by their first month (cohort month)
# - Calculates the percentage of users active in each subsequent month
# - Returns a DataFrame with cohort_month as index and months 0-12 as columns
# - Include type annotations and a docstring
# - Handle the case where the DataFrame is empty (return empty DataFrame)"

import pandas as pd

def calculate_monthly_retention(events: pd.DataFrame) -> pd.DataFrame:
    """Calculate monthly cohort retention rates.

    Args:
        events: DataFrame with columns user_id, event_date, event_type

    Returns:
        DataFrame with cohort months as index and retention rates (0-12 months)
    """
    if events.empty:
        return pd.DataFrame()

    # Determine each user's cohort (first activity month)
    events = events.copy()
    events['event_month'] = events['event_date'].dt.to_period('M')
    cohorts = events.groupby('user_id')['event_month'].min().rename('cohort_month')
    events = events.merge(cohorts, on='user_id')

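    # Calculate month offset from cohort using year/month arithmetic,
    # which avoids platform-dependent casts of Period values to int
    events['month_offset'] = (
        (events['event_month'].dt.year - events['cohort_month'].dt.year) * 12
        + (events['event_month'].dt.month - events['cohort_month'].dt.month)
    )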

    # Build retention matrix
    cohort_sizes = events.groupby('cohort_month')['user_id'].nunique()
    retention = events.groupby(['cohort_month', 'month_offset'])['user_id'].nunique()
    retention = retention.unstack(fill_value=0)

    # Convert to percentages
    retention = retention.div(cohort_sizes, axis=0) * 100

    # Keep only months 0-12
    retention = retention.loc[:, retention.columns.isin(range(13))]

    return retention.round(1)

The difference is stark. The good prompt produces code that handles edge cases, has proper types, and does exactly what you need. The bad prompt produces code you will spend 30 minutes rewriting anyway.

Chain-of-Thought Prompting for Complex Logic

// PROMPT TECHNIQUE: Break complex logic into steps
// "Implement a rate limiter with the following logic:
// Step 1: Check if the user's IP exists in the rate limit store
// Step 2: If not, create an entry with count=1 and timestamp=now
// Step 3: If yes, check if the time window (60 seconds) has expired
// Step 4: If expired, reset the counter to 1
// Step 5: If not expired and count < limit (100), increment counter
// Step 6: If not expired and count >= limit, return 429 Too Many Requests
// Use an in-memory Map with automatic cleanup every 5 minutes"

const rateLimitStore = new Map();
const WINDOW_MS = 60 * 1000; // 60 seconds
const MAX_REQUESTS = 100;
const CLEANUP_INTERVAL_MS = 5 * 60 * 1000; // 5 minutes

function rateLimit(ip) {
    const now = Date.now();
    const entry = rateLimitStore.get(ip);

    // Step 1-2: No existing entry
    if (!entry) {
        rateLimitStore.set(ip, { count: 1, windowStart: now });
        return { allowed: true, remaining: MAX_REQUESTS - 1 };
    }

    // Step 3-4: Window expired — reset
    if (now - entry.windowStart >= WINDOW_MS) {
        rateLimitStore.set(ip, { count: 1, windowStart: now });
        return { allowed: true, remaining: MAX_REQUESTS - 1 };
    }

    // Step 5: Within window and under limit
    if (entry.count < MAX_REQUESTS) {
        entry.count += 1;
        return { allowed: true, remaining: MAX_REQUESTS - entry.count };
    }

    // Step 6: Rate limited
    const retryAfter = Math.ceil((entry.windowStart + WINDOW_MS - now) / 1000);
    return { allowed: false, remaining: 0, retryAfter };
}

// Automatic cleanup of expired entries
setInterval(() => {
    const now = Date.now();
    for (const [ip, entry] of rateLimitStore) {
        if (now - entry.windowStart >= WINDOW_MS) {
            rateLimitStore.delete(ip);
        }
    }
}, CLEANUP_INTERVAL_MS);

// Usage example
const result = rateLimit('192.168.1.100');
console.log(result);
// { allowed: true, remaining: 99 }

AI in the Development Workflow

AI does not just live in your editor. It touches every stage of the development lifecycle:

AI Touchpoints in the Development Lifecycle

flowchart TD
    A["Plan"] -->|"AI: Story generation,\nestimation"| B["Code"]
    B -->|"AI: Completion,\ngeneration, refactoring"| C["Review"]
    C -->|"AI: Code review,\nsecurity scanning"| D["Test"]
    D -->|"AI: Test generation,\ncoverage analysis"| E["Document"]
    E -->|"AI: Docstrings,\nREADME, API docs"| F["Deploy"]
    F -->|"AI: Config generation,\nincident detection"| G["Monitor"]
    G -->|"AI: Anomaly detection,\nroot cause analysis"| A

Code Review by AI

AI-powered code review has matured rapidly. Tools like CodeRabbit, Sourcery, and GitHub Copilot Code Review can now provide meaningful feedback on pull requests within seconds of opening them.

What AI reviewers catch well:

Style inconsistencies and formatting issues
Common bug patterns (null dereference, off-by-one, resource leaks)
Security vulnerabilities (SQL injection, XSS, hardcoded secrets)
Performance anti-patterns (N+1 queries, unnecessary allocations)
Missing error handling

What AI reviewers miss:

Whether the code solves the right business problem
Architectural fitness — does this belong here?
Domain-specific correctness (financial calculations, regulatory compliance)
Whether the abstractions will hold up over time
Team conventions that are not in written guidelines

                            
                            Warning: AI code review should augment human review, never replace it. The most dangerous bugs are the ones that look correct — subtle logic errors, race conditions, and incorrect assumptions. AI is particularly bad at catching these because they require understanding intent, not just syntax.
                        

AI-Generated Documentation

AI can generate documentation at various levels:

# AI can generate inline documentation from code
# Example: Ask AI to document this function

def backoff_retry(func, max_retries=3, base_delay=1.0, max_delay=60.0):
    """Execute a function with exponential backoff retry logic.

    Retries the given function up to max_retries times with
    exponentially increasing delays between attempts. Uses jitter
    to prevent thundering herd problems.

    Args:
        func: Callable to execute. Should raise an exception on failure.
        max_retries: Maximum number of retry attempts (default: 3).
        base_delay: Initial delay in seconds between retries (default: 1.0).
        max_delay: Maximum delay cap in seconds (default: 60.0).

    Returns:
        The return value of func if successful.

    Raises:
        Exception: The last exception raised by func after all retries
            are exhausted.

    Example:
        >>> result = backoff_retry(lambda: requests.get(url), max_retries=5)
    """
    import time
    import random

    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            # Out of retries: re-raise the most recent exception unchanged
            if attempt == max_retries:
                raise
            # Exponential backoff capped at max_delay, plus up to 10% jitter
            delay = min(base_delay * (2 ** attempt), max_delay)
            jitter = random.uniform(0, delay * 0.1)
            time.sleep(delay + jitter)

The quality is often good for mechanical documentation (parameter descriptions, return types). It struggles with "why" documentation: the reasoning behind design decisions, the history of changes, and the gotchas that only experience reveals.

AI for Debugging

One of the most immediately valuable applications of AI in development is debugging assistance. LLMs excel at:

Error message interpretation — Translating cryptic stack traces into plain English
Pattern recognition — Identifying common causes of specific error types
Log analysis — Finding anomalies in verbose log output
Rubber duck debugging — Explaining your code back to you reveals flaws

# Example: Pasting an error into an AI assistant
# "I'm getting this error when deploying my Kubernetes pod:"

# Error:
# CrashLoopBackOff - Back-off restarting failed container
# Last State: Terminated - Reason: OOMKilled - Exit Code: 137

# AI Response (summarised):
# Your container is being killed by Kubernetes because it exceeds its
# memory limit. Exit code 137 = SIGKILL (out of memory).
#
# Solutions:
# 1. Increase memory limits in your pod spec
# 2. Profile your application for memory leaks
# 3. Check if your app allocates memory at startup that exceeds limits
#
# Quick fix - increase limits:
# resources:
#   limits:
#     memory: "512Mi"  # Increase from current value
#   requests:
#     memory: "256Mi"
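The "137 = SIGKILL" step in that response rests on a convention worth knowing: when a process is killed by a signal, shells and container runtimes report 128 plus the signal number. A small sketch of that arithmetic follows; the function name is hypothetical, and signal names resolve as shown only on Unix-like systems.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a process exit code, decoding the 128 + signal convention."""
    if code == 0:
        return "exited successfully"
    if code > 128:
        signum = code - 128  # e.g. 137 - 128 = 9
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not defined on this platform
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with application error code {code}"


print(describe_exit_code(137))  # terminated by SIGKILL (on Linux and macOS)
print(describe_exit_code(143))  # terminated by SIGTERM (on Linux and macOS)
print(describe_exit_code(1))    # exited with application error code 1
```

The exit code alone says only that the process received SIGKILL. It is the `OOMKilled` reason in the pod status that identifies memory as the cause.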

Risks & Limitations

AI coding tools introduce specific risks that every engineering team must understand and mitigate:

Hallucinated Code

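AI models generate plausible-looking code that references non-existent APIs, uses deprecated methods, or implements algorithms incorrectly. A call to a non-existent API fails at compile time in a statically typed language, but in a dynamic language it may surface only when that line runs. Incorrect algorithms are harder to spot: the code looks right, compiles, and might even pass superficial tests while still containing subtle bugs.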
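One cheap guard in a dynamic language is to confirm that a suggested function exists before building on it. The helper below is a hypothetical sketch, not a standard utility. It checks only that a name resolves, not that the function behaves as the AI described.

```python
import importlib


def api_exists(module_name: str, attr_path: str) -> bool:
    """Return True if module_name imports and exposes the dotted attr_path."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False
    # Walk the dotted path one attribute at a time
    for part in attr_path.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


print(api_exists("json", "loads"))      # True
print(api_exists("json", "parse"))      # False: the json module has no parse()
print(api_exists("os", "path.join"))    # True
```

A type checker, a linter, or simply running the code under tests gives the same signal with less ceremony. The point is that existence is easy to verify mechanically, while correctness is not.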

Outdated Patterns

Training data has a cutoff date. AI may suggest patterns that were best practice in 2022 but are now deprecated, insecure, or have better alternatives. Always verify against current documentation.
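A small Python example of this drift: `datetime.utcnow()` appears throughout older code and tutorials, but it returns a naive datetime and has been deprecated since Python 3.12. The variable name below is illustrative.

```python
from datetime import datetime, timezone

# Older pattern, still common in training data:
#   created_at = datetime.utcnow()   # naive datetime; deprecated since Python 3.12

# Current recommendation: a timezone-aware UTC timestamp
created_at = datetime.now(timezone.utc)

print(created_at.tzinfo)        # UTC
print(created_at.isoformat())   # ends with +00:00
```

Both versions run without error on most interpreters, which is why this kind of suggestion slips through review unless someone checks the current documentation.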

License Concerns

AI models trained on open-source code may reproduce snippets verbatim. This raises questions about GPL compliance, attribution requirements, and intellectual property ownership.

Security Vulnerabilities

AI-generated code frequently contains security issues: SQL injection via string concatenation, missing input validation, hardcoded credentials in examples, and unsafe deserialization patterns.
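The SQL injection case is easy to reproduce. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table, rows, and malicious input are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("alice@example.com",), ("bob@example.com",)],
)

malicious = "nobody@example.com' OR '1'='1"

# Vulnerable: user input is concatenated into the SQL text
unsafe_sql = "SELECT email FROM users WHERE email = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the driver binds the value, so it is never parsed as SQL
safe_rows = conn.execute(
    "SELECT email FROM users WHERE email = ?", (malicious,)
).fetchall()

print(len(unsafe_rows))  # 2: the injected OR clause matches every row
print(len(safe_rows))    # 0: no user has that literal email
```

The two queries look nearly identical in a diff, which is part of why the concatenated form keeps appearing in generated code and getting through review.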

The Skill Atrophy Problem

Cognitive Science

The Generation Effect & AI Assistance

Cognitive psychology research shows that actively generating information leads to stronger learning than passively receiving it (the "generation effect"). When AI writes your code, you skip the generative process that builds deep understanding. Junior developers who rely heavily on AI from day one may develop shallow pattern matching without genuine comprehension. The implication: use AI as a force multiplier for skills you already have, not as a substitute for learning fundamentals.


The Changing Developer Role

The role of the software developer is shifting from code writer to code curator. This means:

Reviewing — Reading and evaluating AI-generated code for correctness, security, and maintainability
Directing — Providing the right context, constraints, and requirements to get useful output
Testing — Verifying that AI-generated code actually solves the problem
Integrating — Combining AI-generated pieces into a coherent architecture
Understanding — Knowing why code works, not just that it works

This makes fundamentals more important, not less. You cannot review code you do not understand. You cannot test edge cases you cannot imagine. You cannot design systems if you do not know the tradeoffs.

                            
                            The New 10x Developer: The "10x developer" myth was always about productivity, not typing speed. In the AI era, the 10x developer is someone who can direct AI effectively, review output critically, and design systems that AI cannot. They use AI to amplify their existing expertise, not to substitute for understanding.
                        

Skills That Become More Valuable

Skill	Why More Valuable
System design	AI generates code, not architecture. Design decisions require human judgment.
Code review	More code generated faster means more review needed. Critical evaluation is essential.
Debugging complex systems	AI-generated code in production creates novel bugs that require deep understanding.
Requirements analysis	The better you specify, the better AI output. Understanding what to build is the bottleneck.
Testing strategy	Knowing what to test and why becomes critical when code is generated rapidly.
Security mindset	AI lacks adversarial thinking. Security requires assuming the worst.

Ethics & Intellectual Property

The legal and ethical landscape of AI-generated code remains unsettled:

Training Data & Licensing

GitHub Copilot was trained on public GitHub repositories. This includes code under various licenses — MIT, Apache, GPL, and proprietary code accidentally made public. The question of whether AI-generated code is a "derivative work" of its training data remains legally unresolved.

The Open-Source Debate

Supporters argue: AI learning from code is analogous to humans learning from reading code — no license violation
Critics argue: AI can reproduce substantial portions of copyrighted code — this is copying, not learning
Pragmatists argue: Use tools that offer indemnification (GitHub, Amazon) and configure filters to block exact matches

Practical Guidelines for Teams

Enable Copilot's duplicate detection filter to block verbatim reproductions
Run license scanners (FOSSA, Snyk) on AI-generated code
Document which code was AI-assisted in commit messages or PR descriptions
Never use AI for code that handles credentials, encryption keys, or security logic without expert review
Establish team guidelines for acceptable AI usage levels by code criticality
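The last guideline is easier to enforce when it is written down as data. The following sketch maps file paths to review requirements. The path patterns, level names, and first-match-wins rule are all hypothetical choices for illustration, and a real team would define its own.

```python
from fnmatch import fnmatchcase

DEFAULT_LEVEL = "ai-assisted-with-standard-review"

# Hypothetical policy table: the first matching pattern wins
AI_USAGE_POLICY = [
    ("src/auth/*", "expert-review-required"),
    ("src/payments/*", "expert-review-required"),
    ("scripts/*", "ai-generated-allowed"),
]


def policy_for(path: str) -> str:
    """Return the AI usage level that applies to a repository path."""
    for pattern, level in AI_USAGE_POLICY:
        # fnmatchcase is case-sensitive on every platform; "*" also matches "/"
        if fnmatchcase(path, pattern):
            return level
    return DEFAULT_LEVEL


print(policy_for("src/auth/login.py"))    # expert-review-required
print(policy_for("scripts/migrate.py"))   # ai-generated-allowed
print(policy_for("README.md"))            # ai-assisted-with-standard-review
```

A table like this could back a pre-merge check or simply live in the contributing guide. Either way, the criticality tiers become explicit and reviewable.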

Exercises

                            
                            Exercise 1 — Prompt Comparison: Take a moderately complex function in your codebase. Write three different prompts to regenerate it: (1) a vague one-liner, (2) a detailed specification, (3) a chain-of-thought breakdown. Compare the outputs. Which produces the most correct, maintainable code?
                        

                            
                            Exercise 2 — Security Audit: Ask an AI assistant to generate a user authentication endpoint (login with email/password, return JWT). Review the generated code for security vulnerabilities. How many issues can you find? (Hint: check for timing attacks, password storage, token expiration, input validation, rate limiting.)
                        

                            
                            Exercise 3 — Vibe Coding Boundaries: Attempt to build a small feature using pure vibe coding (accept all AI suggestions without editing). Then build the same feature manually. Compare: lines of code, bugs found in testing, time spent, and your confidence in correctness.
                        

                            
                            Exercise 4 — AI Review Calibration: Submit a pull request with three intentionally planted bugs (one subtle logic error, one security vulnerability, one performance issue) to an AI code review tool. Does it catch all three? Which does it miss? What does this tell you about AI review reliability?
                        

Conclusion & Next Steps

AI in software development is not a passing trend — it is a permanent shift in how code is written, reviewed, and maintained. The developers who thrive will be those who treat AI as a powerful tool that requires skilled operation, not a magic wand that eliminates the need for expertise.

The key principles: use AI for acceleration, not substitution. Review everything. Maintain your fundamentals. Understand that AI shifts the bottleneck from writing code to designing, reviewing, and maintaining systems.

Next in the Series

In Part 38: AI Agents for Testing, Review & Self-Healing Code, we explore the next frontier — autonomous AI agents that generate tests, review code, and maintain test suites without human intervention.

