MULTI-AGENT SYSTEM SERIES

Code Generation — How Agentic Workflows Transform Requirements into Code

Part 6 of 9 — Building AI-Powered SDLC — A Practical Guide

13 min readFeb 10, 2025

Evoluation of code creation — Fully Manual vs Using Automation Tools vs Using GenAI

One of the most critical steps in the software development life cycle (SDLC) is transitioning from design to code. Developers take Software Requirements Specifications (SRS) and High-Level Designs (HLD), do their magic and turn them into functional code to realize what has been defined and designed.

As projects scale, this step grows more complex, often slowed by repetitive tasks, managing dependencies, and debugging issues and fraught with risks like misalignment between design and the final product. Automation helps mitigate some of those issues and accelerates code generation and human oversight to ensure the job is done correctly. It should never be taken as a replacement of Developers, but rather something that empowers them to use their time effectively and gain stakeholder confidence by providing a more consistent and traceable output.

This blog explores the challenges of manual coding, the limitations of traditional automation tools, and how LangGraph can speed up the coding process while maintaining quality and consistency.

Let’s dive into a future where code writes itself and developers reclaim their creativity.

As always, you can directly jump to the implementation section if you want to skip the theory.

Challenges with Manual Coding

While coding manually has been the backbone of Software development for decades, it has become more than just typing. It’s become more complex and difficult with multiple steps like Requirement Analysis, Design and Architecture, Code Writing, Reviewing, Unit Testing, Debugging and Documentation bringing in systemic inefficiencies that ripple across the SDLC.

Human error and Inconsistency — Misinterpreting requirements, inconsistent code styles across multiple developers and missing or avoiding edge cases lead to costly rework, fragmented code base and accumulation of technical debt.
Time and Resource Drains — Developers spend up to 30% of their time writing repetitive code. Debugging syntax errors and re-aligning with HLD specs remove focus from innovation and performance tuning.
Design and Code Misalignment — Specification drift occurs because of lost traceability between design and code.
Communication Overload — A lot of time is spent on keeping everyone on the same page however Assumption Driven Development still becomes a problem.
Compliance and Security Risks — Manual oversights could lead to skipping compliance checks or security practices leading to system vulnerabilities.

Why These Challenges Matter?

These challenges indicate that manual coding is no longer sufficient. These are not just minor inconveniences — they compound over time inflating costs, delaying time to market and eroding team morale.

Automating code generation can address these challenges by streamlining tasks, reducing errors, and improving efficiency. The key advantages include:

Accelerating Repetitive Tasks — Automating routine coding activities frees developers to focus on solving complex problems.
Improving Consistency — Standardized templates and predefined styles ensure uniformity across the codebase.
Reducing Manual Effort — Automation tools handle tasks like error-checking and formatting, allowing developers to focus on critical areas and increase productivity.

Now, as illustrated in the above image, the question arises: several code automation tools are available in the market that speed up coding. Why not just use them? Let’s look at a few popular tools.

Popular Automation Tools

Several tools are available to automate aspects of code generation, including:

Swagger Codegen: It creates client libraries, server stubs, and API documentation from OpenAPI specifications. While powerful for API-driven projects, it requires a pre-existing API specification and lacks intelligent error handling.
JHipster: Combines Spring Boot backends with Angular or React frontends for rapid development. However, it has limitations in customizing code beyond templates.
Angular CLI: Streamlines frontend development by automating the generation of components and modules. It focuses on Angular applications but does not integrate backend logic.

While there is no doubt that these tools accelerate the initial skeleton, they leave developers handling the nuances of transforming business logic into working code. This is where we wanted to leverage LangGraph to generate code with business logic/rules not just a skeleton.

How LangGraph Helps in the Coding Process

LangGraph introduces an AI-driven, agentic workflow to automate and streamline the coding process. By handling each step of the development cycle, LangGraph delivers quality, consistency, and speed:

Agentic Workflow for Code Generation

Seek Clarifications: The process commences by gathering essential inputs such as the Software Requirement Specification (SRS), High-Level Design (HLD), and project structure for coding. If there are any uncertainties or missing details in the provided inputs, the agents seek clarifications from the user or stakeholders to ensure a full understanding of the requirements.
Code Generation: Once all inputs are sorted, the developer (AI agent) automatically generates the initial version of the code based on SRS, HLD and the predefined project structure. It generates code to realize business requirements.
Code Review: The generated code is reviewed by a code reviewer (AI agent). It evaluates the code for potential issues, improvements, and compliance with best practices, providing critique and feedback. Ensure it fulfils requirements (SRS) and aligns with design (HLD) and project structure.
Debugging: The debugger agent analyzes the critique from the reviewer and makes necessary fixes to address any issues or improvements suggested.
Continuous Refinement: The cycle continues iteratively: the reviewer provides further feedback, the debugger applies fixes, and the process repeats until the reviewer is satisfied with the code quality.
Final Approval: When the Reviewer agent is satisfied, the finalized code is stored for human review.

LangGraph leverages a State to manage data flow between agents, ensuring seamless integration and efficient workflows.

It increases the velocity by reducing development time by 40–60%, consistency by enforcing coding standards across the teams, scalability by providing the ability to handle large and complex systems effortlessly and enhanced collaboration by aligning stakeholders early.

But automation using agentic workflow isn’t without risks. There is always a catch, isn’t it?

Risks and Mitigation

Inaccurate Code Generation

Mitigation: Regular reviews and feedback loops ensure code meets requirements.

Oversimplification of Complex Logic

Mitigation: Human oversight ensures the accuracy and completeness of generated code.

Agent Errors or Biases

Mitigation: Continuous monitoring and quality checks minimize the risk of errors.

Trust issues

Mitigation: Introduce explainability in the process and have agents log their decisions for them to be reviewed and phased adoptions by starting with non-critical modules and later graduating to the core system.

Now let’s dive into the implementation and get our hands dirty.

Hands-On for Automating Code Generation Using LangGraph

Installing Dependencies

Before running the AI-powered agent, we must install the necessary dependencies.

pip install --upgrade pip
pip install -U langgraph langchain_openai

Importing Essential Modules

The project integrates multiple Python libraries for AI interactions, file management, and structured messaging.

import os
from typing import List
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, AIMessage, HumanMessage, ToolMessage
from pydantic import BaseModel, Field
from tempfile import TemporaryDirectory
from langchain_community.agent_toolkits import FileManagementToolkit

Defining the AI Model

model_name = "gpt-4o-mini"

Why GPT-4o-mini? It offers improved reasoning and efficiency compared to previous models, making it ideal for AI-powered SDLC automation.

Enabling File Management

The AI assistant needs to interact with files. We define three key operations: reading, writing, and listing files.

file_stores = FileManagementToolkit(
    selected_tools=["read_file", "write_file", "list_directory"], # use current folder
).get_tools()

read_file, write_file, list_file = file_stores

The AI agent can store user inputs, retrieve saved conversations, or generate files dynamically.

Defining the Consultant Agent (Seek Clarifications)

The AI agent is designed to clarify requirements before getting into writing code.

DevClarification = """
** Role: ** You are a Senior Software Consultant who is expert in consulting with user, and gather information.
** Goal: ** You are tasked to gather information from user that information helps software developer \
to build a codebase based on Requirement and HLD. You may leverage tool. 

** Inputs: **
You will receive following **pre-requisites** or **inputs**, \
1. Software Requirements Specification
2. High Level Design
"""

def get_messages_info(messages):
    return [SystemMessage(content=DevClarification)] + messages

class CodeReq(BaseModel):
    """Instructions on how to prompt the LLM."""
    tech_stack: str # Full Stack including UI, APIs, Middleware, Database
    microservice_list : str # list of microservice and functionality

llm = ChatOpenAI(model=model_name, temperature=0, max_retries=1)
llm_with_tool = llm.bind_tools([CodeReq])

def information_gathering(state):
    messages = get_messages_info(state["messages"])
    coding = state.get('coding')

    if coding:
        pass
    else:
        response = llm_with_tool.invoke(messages)
        return {"messages": [response]}

def conclude_conversation(state):
    return {
        "messages": [
            ToolMessage(
                content="Clarified and proceeding further",
                tool_call_id=state["messages"][-1].tool_calls[0]["id"],
            )
        ]
    }

def is_clarified(state):
    messages = state["messages"]
    coding = state.get('coding')
    if isinstance(messages[-1], AIMessage) and messages[-1].tool_calls:
        print("**** conclude_conversation *****")
        return "yes" #"conclude_conversation"
   else:
        print("**** continue_conversation *****")
        return "no" #"continue_conversation"

AI-driven requirement clarification reduces miscommunication, ensuring developers receive well-structured, actionable inputs.
The conclude_conversation method helps signal the end of the clarification stage, making it clear to the system or user that the conversation is transitioning to the next phase.
The is_clarified function helps determine whether the conversation has reached a point where it should be concluded, ensuring that all relevant information has been clarified.

Defining the Developer Agent (Generate Code)

class Code(BaseModel):
    """Plan to follow in future"""
    code: str = Field(
        description="Detailed optmized error-free and executable Python codebase with imports for provided requirements"
    )
    setup: str = Field(
        description="Detailed instructions to deploy the code, execute and test"
    )
    testscripts: str = Field(
        description="scripts to test the generated codebase"
    )

CodingPrompt = """Role: Software Developer
** Inputs: **
You will receive following **pre-requisites** or **inputs**, 
1. Requirements: {reqs}
2. Software Requirements Specification
3. High Level Design
4. Project Structure
"""
def get_prompt_messages(messages: list, state):
    tool_call = None
    other_msgs = []
    
    for m in messages:
        if isinstance(m, AIMessage) and m.tool_calls:
            tool_call = m.tool_calls[0]["args"]
        elif isinstance(m, ToolMessage):
            continue
        elif tool_call is not None:
            other_msgs.append(m)

    iteration = state['iteration']

    hld = state['hld'][-1].content
    srs = state['srs'][-1].content
    last_message = state['messages'][-1].content
    coding = state.get('coding')
    
    if coding:
        print("***** Revision Number ****", iteration)
        coding = state['coding'][-1].content
        return[
            SystemMessage(content=CodingPrompt),
            HumanMessage(content=last_message + hld + srs + coding)]
    else:
        return[
            SystemMessage(content=CodingPrompt.format(reqs=tool_call)),
            HumanMessage(content=last_message + hld + srs)] + other_msgs 

def generate_code(state):
    messages = get_prompt_messages(state["messages"], state)
    response = llm.invoke(messages)
    iteration = str(state['iteration'])
    file_name = "output/code v0." + str(iteration) +".md"
    write_file.invoke({"file_path": file_name, "text": response.content})

    return {
       "messages": [response.content] ,
        "coding": [response.content]
        
    }

It streamlines the code generation process by ensuring that developers receive optimized, error-free code, complete with setup instructions and test scripts. It reduces manual work and ensures consistency, making the development cycle faster and more reliable.

Defining the Code Reviewer Agent (Review Code)

class CodeReview(BaseModel):
    """Plan to follow in future"""
    review: str = Field(
        description="List of review comments based on code review"
    )

ReviewPrompt = """
** Role: ** Senior Software Developer
--Instructions --
You must send all review comments in one go.

-- Inputs --
1. Software Requirements Specification:  {srs} 
2. High Level Design: {hld} 
3. Coding: {coding}
4. Code Review

--Desired Output --
CodeReview = <Say "Satisfied" if there is no review comments, otherwise share entire list of code review comments. No obvious feedback>

"""

def get_review_info(coding, hld, srs):
    return [SystemMessage(content=ReviewPrompt),
           HumanMessage(content=coding + hld + srs)] 

llm = ChatOpenAI(model=model_name, temperature=0, max_retries=1)

def review_code(state):
    hld = state['hld'][-1].content
    srs = state['srs'][-1].content
    coding = state['coding'][-1].content

    messages = get_review_info(coding, hld, srs)
    response = llm.invoke(messages)
    iteration = state['iteration'] + 1
   
    return {
        "messages": [response.content],
        "iteration": iteration
    }

def is_reviewed(state):
    max_iteration = state['max_iteration']
    iteration = state['iteration']
    print(" ***** iteration *****", iteration)
    last_message = state['messages'][-1].content
    if 'satisfied' in last_message.lower():
        return 'satisfied'
    elif iteration > max_iteration:
        return 'satisfied'
    else:
        return 'enhance'

Ensures that the generated code is thoroughly reviewed for compliance and quality.
Provides clear, actionable feedback to developers, improving the overall code quality and reducing errors.
The is_reviewed function tracks the status of the review process, determining whether the code has been reviewed and whether further enhancements are needed. This ensures that the code quality review process is aligned with expectations

Defining the Code Debugger Agent (Debug Code)

class DebugCode(BaseModel):
    """Plan to follow in future"""
    code: str = Field(
        description="Revised code based on review"
    )
    setup: str = Field(
        description="Revised setup based on review, revised codebase"
    )
    testscripts: str = Field(
        description="Revised testscripts based on review, revised codebase"
    )

DebugPrompt = """
** Role: ** Senior Software Developer

** Inputs: **
You will receive following **pre-requisites** or **inputs**, 
1. Codebase
2. CodeReview
"""

def get_debug_messages(state):
   
    iteration = state['iteration']
    last_message = state['messages'][-1].content
    coding = state['coding'][-1].content
    hld = state['hld'][-1].content
    srs = state['srs'][-1].content
    
    return [SystemMessage(content=DebugPrompt),
           HumanMessage(content=last_message + coding + hld + srs )] 

def debug_code(state):
    messages = get_debug_messages(state)
    response = llm.invoke(messages)
    
    iteration = str(state['iteration'])
    
    file_name = "output/code v0." + str(iteration) +".md"
    #result = "Coding:\n" + response.code + "Setup Instructions:\n" + response.setup + "Test Scripts:\n" + response.testscripts
    write_file.invoke({"file_path": file_name, "text": response.content})
    return {
        "messages": [response.content] ,
        "coding": [response.content]
        
    }

Enhances the code by addressing the feedback from the code review while preserving working sections.
Helps developers improve code quality iteratively, ensuring each version is better than the last and complies with the review comments.

Defining the Store Agent (Store Code)


import re

def extract_and_write_files(content):

    # Regex patterns
    file_and_code_pattern = re.compile(
        r"(?<=\*\*)([^\n*]+)\*\*\s*```(?:\w+)?\n(.*?)```",
        re.DOTALL
    )

    # Find all matches for file paths and code blocks
    matches = file_and_code_pattern.findall(content)

    if not matches:
        print("No code blocks found in the content.")
        return

    for file_path, code in matches:
        # Validate the file path (skip if it's not valid)
        if not is_valid_file_path(file_path):
            print(f"Skipping invalid file path: {file_path}")
            continue
        file_path = os.path.join("code", file_path)
        #print("********** file path ******** \n\n\n", file_path)
        write_code_to_file(file_path.strip(), code.strip())

def is_valid_file_path(file_path):
    # Basic validation: file path should contain a directory and a file name
    return "/" in file_path and file_path.endswith(('.py', '.md'))

def write_code_to_file(file_path, code):
    # Create directories if they don't exist
    os.makedirs(os.path.dirname(file_path), exist_ok=True)

    # Write the code to the file
    with open(file_path, "w") as file:
        file.write(code)
    print(f"Written to {file_path}")

def store_code(state):
    coding = state.get('coding')
    extract_and_write_files(coding)

Automates Code Storage: This process helps automate the extraction of code snippets from content and stores them in the appropriate files, ensuring no manual intervention is required.
Prevents Errors: The code validates file paths to ensure that only valid paths are used, preventing the creation of invalid files or directories.
Facilitates Organization: By storing code in a directory structure, it helps maintain organization and ensures the extracted code is saved in the proper format.
Useful in CI/CD Pipelines: This function can be integrated into CI/CD pipelines where code needs to be extracted and saved dynamically from content or documentation, reducing manual effort and human error.

Defining and Compiling Workflow

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START
from langgraph.graph.message import add_messages
from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph import END

class State(TypedDict):
    messages: Annotated[list, add_messages]
    srs: Annotated[list, add_messages]
    hld: Annotated[list, add_messages]
    coding: Annotated[list, add_messages]
    max_iteration: int
    iteration: int

    
memory = MemorySaver()
workflow = StateGraph(State)

workflow.add_edge(START, "Seek Clarifications")

workflow.add_node("Seek Clarifications", information_gathering)
workflow.add_node("Generate Code", generate_code)
workflow.add_node("Clarified", conclude_conversation)
workflow.add_node("Review Code",review_code)
workflow.add_node("Debug Code",debug_code)
workflow.add_node("Store Code", store_code)

workflow.add_conditional_edges(
    "Seek Clarifications", 
    is_clarified, 
    {"yes": "Clarified",  "no": END}
    #{"yes": "conclude_conversation", "wip" : "generate_code", "no": END}
)

workflow.add_conditional_edges(
    "Review Code", 
    is_reviewed, 
    {"satisfied": "Store Code",  "enhance": "Debug Code"}
)

workflow.add_edge("Clarified", "Generate Code")
workflow.add_edge("Generate Code", "Review Code")
workflow.add_edge("Debug Code", "Review Code")
workflow.add_edge("Store Code", END)

graph = workflow.compile(checkpointer=memory)

The workflow defines a state graph that organizes and controls the flow of tasks in a process. The tasks are represented as nodes, including stages such as:

Seek Clarifications: Gathering further details or feedback.
Generate Code: Creating Python code based on gathered requirements.
Review Code: Review the generated code for compliance with requirements and quality standards.
Debug Code: Revising and correcting issues found during the review.
Store Code: Storing the final version of the code.
Memory Management: The system uses MemorySaver to store the state of the process.

Automation and Structuring: It automates and structures a multi-step, complex process like code generation, review, debugging, and storing.
Flow Control: By using conditional edges, the workflow can adapt to various situations, ensuring that the next step is based on real-time feedback, like whether clarification is needed or whether the review is complete.
State Persistence: The memory checkpointing ensures that all context (e.g., previous steps, decisions, code, and feedback) is preserved throughout the workflow. This allows for seamless transitions between stages and ensures that information is not lost, which is critical for iterative processes like debugging and refining code.

Visualize Workflow

from IPython.display import Image, display
display(Image(graph.get_graph().draw_mermaid_png()))

Executing Workflow

thread = {"configurable": {"thread_id": 22}}

SRS = read_file.invoke({"file_path": "input/srs.md"})
HLD = read_file.invoke({"file_path": "input/hld.md"})
FORMAT = read_file.invoke({"file_path": "input/ps.md"})

#print(readfile)

count = 1

while True:
    user = input("User (q/Q to quit): ")
    if user.lower() in ["quit", "q", "Q"]:
        print("AI: Byebye")
        break
    output = None
    
    #print(count)
    user_msg = user + " \n SRS:" + SRS + " \n HLD:" + HLD + " \n Project Structure:" + FORMAT 
    #print(count)    
    for output in graph.stream(
        {
            "messages": [HumanMessage(content=user_msg)],
            "srs": [HumanMessage(content=SRS)],
            "hld": [HumanMessage(content=HLD)],
            "iteration" : 1,
            "max_iteration": 2,
        }, 
        config=thread, 
        stream_mode="updates"):
        #print(output)
        for key, value in output.items():
            print("***** Result from Agent: ",key)
            print("***** value: ",value)

Input Handling: The code initiates a loop where the user interacts with the system by inputting queries or commands.
File Reading: It reads the contents of several files (e.g., SRS, HLD and Project Structure). These files provide important context such as the Software Requirements Specification (SRS), High-Level Design (HLD), and Project Structure.
Iteration Management: The loop manages the number of iterations (iteration and max_iteration), ensuring that the process progresses and potentially stops after a set number of steps or when the user chooses to quit.

You can find the code snippets here

Conclusion

The article highlights how agentic workflows facilitate the automation of code generation including business logic within the SDLC, allowing for improved efficiency and faster delivery. By leveraging AI Agents, developers can focus on higher-level tasks, leading to reduced errors and more streamlined development.

What’s Next?

In the next instalment of our series, we will delve into the Test Case Preparation phase of the software development life cycle. We will explore how Agentic workflow transforms the Software Requirement Specifications (Generate SRS), and High-Level Design (Generated HLD) into Test Cases and best practices that ensure quality and maintainability.

Stay tuned to discover how to leverage Agentic Workflow (LangGraph) to enhance test case creation, and make the development cycle even more efficient!

Disclaimer:
While automation offers substantial advantages in tasks like code generation using LangGraph, human supervision is necessary for final validation and approval. This ensures that the generated code aligns with project-specific requirements, organizational standards, and business goals. The code snippets are only examples to show the automation process. They are not production-ready and will need modifications for actual use.

If you missed the previous part, please check them out below,

K G Aravinda Kumar

Multi Agent System

View list

7 stories

This diagram illustrates a system in which application codebases are parsed, summarized using LLMs and stored in a Neo4j graph database. Users interact with a chat interface to query the codebase, while LLMs simplify and translate these queries into Cypher for execution.