Winning 2nd Place at IEEE CISOSE: Graph-Based LLM Prompting for Microservice API Testing
I'm excited to share that my research paper, "Graph-Based LLM Prompting for Scalable Microservice API Testing," won 2nd Place at the IEEE CISOSE 2025 Student Research Competition. This work tackles a fundamental challenge in modern software testing: how do we leverage LLMs to generate meaningful API tests for microservices without hitting context limits or drowning in noise?
Testing microservice APIs is notoriously difficult. With hundreds of interdependent services, complex data flows, and intricate business logic spread across multiple codebases, traditional testing approaches struggle to keep up. LLMs offer a promising solution—but feeding entire source code into an LLM quickly hits context limits and generates irrelevant test cases. This research presents a different approach.
Research Overview
| Aspect | Details |
|---|---|
| Problem | LLM-based test generation fails at scale—full source code exceeds context limits and introduces noise |
| Solution | Use Interprocedural Control Flow Graphs (ICFGs) to extract only path-specific context for each API endpoint |
| Venue | IEEE International Conference on Service-Oriented System Engineering (CISOSE) 2025 |
| Award | 2nd Place, Student Research Competition (July 23, 2025) |
The Problem: Why Current LLM-Based Testing Struggles
Large Language Models have demonstrated impressive code understanding capabilities. Naturally, researchers have explored using them for automated test generation. The idea is simple: feed the LLM your source code, and let it generate comprehensive test cases.
But for microservices, this approach breaks down:
Context Limit Problem
A typical microservice might have thousands of lines of code across dozens of files. Even a single API endpoint can involve multiple service classes, database repositories, utility functions, and configuration files. Modern LLMs have context windows of 128K-200K tokens—sounds like a lot, but it fills up fast when you include all relevant code.
Noise Problem
Even when code fits within context limits, most of it is irrelevant to any specific test case. An endpoint for user registration doesn't need to see code for order processing. Including irrelevant code confuses the LLM, leading to generic test cases that miss critical edge cases.
Scalability Problem
Microservice architectures constantly evolve. New endpoints appear, existing ones change, services get refactored. Any testing approach that requires full codebase analysis for every change becomes a bottleneck in CI/CD pipelines.
The Solution: ICFG-Based Prompting
Instead of feeding entire source code to the LLM, this approach uses Interprocedural Control Flow Graphs (ICFGs) to extract precisely the code paths relevant to each API endpoint.
What is an ICFG?
An ICFG represents program execution flow across function boundaries. Unlike a simple call graph, which only records which functions call which others, an ICFG captures:
- Control flow within functions (branches, loops, conditions)
- Data dependencies between functions
- Complete execution paths from entry point to exit
For API testing, this means we can trace exactly what code gets executed for any given endpoint request—and nothing more.
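To make this concrete, here is a minimal sketch of what an ICFG data structure might look like. The `Node`, `ICFG`, and edge-kind names are my own illustrative choices, not the paper's implementation: nodes belong to a function, and edges are tagged as intra-procedural (branches within a function) or inter-procedural (calls across function boundaries).

```python
from dataclasses import dataclass, field

# Hypothetical minimal ICFG representation (illustrative names, not the
# paper's actual data model): nodes live inside a function; edges are
# either intra-procedural or cross a function boundary ("call").
@dataclass(frozen=True)
class Node:
    func: str   # enclosing function
    label: str  # e.g. "entry", "validate(email)", "return 201"

@dataclass
class ICFG:
    edges: dict = field(default_factory=dict)  # Node -> list of (Node, kind)

    def add_edge(self, src, dst, kind="intra"):
        self.edges.setdefault(src, []).append((dst, kind))

# A tiny ICFG fragment for a user-registration handler calling a service
g = ICFG()
entry = Node("register_user", "entry")
check = Node("register_user", "validate(email)")
svc = Node("UserService.create", "entry")
g.add_edge(entry, check)                 # branch inside the handler
g.add_edge(check, svc, kind="call")      # crosses a function boundary
```

The `kind` tag on each edge is what distinguishes an ICFG from a per-function CFG: a traversal can follow `call` edges into callees and thus recover complete cross-function execution paths.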
Key Insight
For any API endpoint, only a fraction of the codebase is actually relevant. By using ICFGs to isolate execution paths, we can provide the LLM with focused, relevant context that fits within token limits and produces targeted test cases.
How It Works
The pipeline works in four stages:
1. Static Analysis
The system analyzes the microservice codebase to identify all API endpoints and their handler functions. This creates a mapping between routes (e.g., POST /users) and their corresponding code entry points.
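As a rough illustration of this stage, the sketch below scans Spring-style annotations with a regular expression to recover a route-to-handler map. The source snippet, annotation style, and pattern are assumptions for the example; a real implementation would use a proper parser rather than regex.

```python
import re

# Hypothetical route extraction: scan Java source for Spring-style
# @PostMapping/@GetMapping annotations and map (method, path) -> handler.
SOURCE = """
@PostMapping("/users")
public ResponseEntity<User> createUser(@RequestBody UserDto dto) { ... }

@GetMapping("/users/{id}")
public ResponseEntity<User> getUser(@PathVariable Long id) { ... }
"""

PATTERN = re.compile(
    r'@(Post|Get|Put|Delete)Mapping\("([^"]+)"\)\s*\n'
    r'\s*public\s+\S+\s+(\w+)\('
)

endpoints = {
    (m.group(1).upper(), m.group(2)): m.group(3)
    for m in PATTERN.finditer(SOURCE)
}
```

Each entry, such as `("POST", "/users") -> createUser`, gives the ICFG construction stage its entry point.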
2. ICFG Construction
For each endpoint, the system builds an Interprocedural Control Flow Graph starting from the handler function. This graph traces all possible execution paths, including:
- Service layer method calls
- Repository/database operations
- External service invocations
- Validation and error handling branches
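The construction step can be sketched as a worklist traversal that starts at the handler and splices callee CFGs in wherever a call site appears. The per-function CFGs and the `call:` node convention below are invented for illustration (and return edges are ignored for brevity); they are not the paper's algorithm.

```python
from collections import deque

# Hypothetical per-function CFGs: each function maps a node label to its
# successor labels; a "call:<fn>" label marks an inter-procedural call site.
CFGS = {
    "createUser": {
        "entry": ["validate"],
        "validate": ["call:UserService.create", "raise_400"],
        "call:UserService.create": ["return_201"],
    },
    "UserService.create": {
        "entry": ["check_duplicate"],
        "check_duplicate": ["call:UserRepo.save", "raise_409"],
        "call:UserRepo.save": ["exit"],
    },
    "UserRepo.save": {"entry": ["exit"]},
}

def build_icfg(handler):
    """Breadth-first expansion from the handler, stepping into callees."""
    icfg, seen = {}, set()
    work = deque([(handler, "entry")])
    while work:
        fn, node = work.popleft()
        if (fn, node) in seen:
            continue
        seen.add((fn, node))
        succs = []
        if node.startswith("call:"):           # add the inter-procedural edge
            succs.append((node.split(":", 1)[1], "entry"))
        succs += [(fn, s) for s in CFGS.get(fn, {}).get(node, [])]
        icfg[(fn, node)] = succs
        work.extend(succs)
    return icfg

icfg = build_icfg("createUser")
```

Starting from `createUser`, the traversal reaches the service and repository layers automatically, which is how validation branches, database operations, and error handling all end up in the same graph.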
3. Path Isolation
The ICFG is traversed to extract distinct execution paths. Each path represents a specific scenario the endpoint can handle—successful creation, validation failure, duplicate detection, etc. The code along each path is extracted as a focused snippet.
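Path isolation amounts to enumerating root-to-leaf paths in the graph. The depth-first sketch below uses a toy acyclic graph for a registration endpoint (node names invented for the example); real ICFGs contain loops, which would need bounding or summarization.

```python
# Hypothetical path enumeration over a small, acyclic control-flow graph
# for a user-registration endpoint. Leaves are terminal outcomes.
GRAPH = {
    "entry": ["validate"],
    "validate": ["check_duplicate", "return_400"],
    "check_duplicate": ["save_user", "return_409"],
    "save_user": ["return_201"],
    "return_400": [], "return_409": [], "return_201": [],
}

def enumerate_paths(graph, node="entry", prefix=()):
    prefix = prefix + (node,)
    succs = graph[node]
    if not succs:                  # terminal node: one complete scenario
        return [prefix]
    paths = []
    for s in succs:
        paths.extend(enumerate_paths(graph, s, prefix))
    return paths

paths = enumerate_paths(GRAPH)
```

The three resulting paths correspond exactly to the scenarios named above: successful creation (201), validation failure (400), and duplicate detection (409). Each becomes one focused snippet for the LLM.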
4. LLM Prompting
Each isolated path is sent to the LLM along with:
- The extracted code snippet (typically 100-500 lines vs. 10,000+)
- The API endpoint signature
- Any relevant data models
- Instructions to generate test cases for that specific path
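Assembling these pieces into a prompt might look like the sketch below. The template wording, field names, and snippet contents are illustrative assumptions, not the paper's actual prompt.

```python
# Hypothetical prompt assembly for one isolated path; the template and
# field names are illustrative, not the paper's exact prompt format.
def build_prompt(endpoint, path_name, snippet, models):
    return "\n".join([
        f"Endpoint: {endpoint}",
        f"Execution path under test: {path_name}",
        "Relevant data models:",
        models,
        "Code executed along this path:",
        snippet,
        "Generate API test cases (request, expected status, assertions) "
        "that exercise exactly this path.",
    ])

prompt = build_prompt(
    endpoint="POST /users",
    path_name="duplicate email -> 409 Conflict",
    snippet="if (repo.existsByEmail(dto.email)) "
            "throw new DuplicateUserException();",
    models="class UserDto { String email; String password; }",
)
```

Because each prompt carries only one path's code plus its data models, it stays small and unambiguous even when the surrounding codebase is large.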
Key Benefits
Scalability
Context stays bounded regardless of codebase size. Adding new services doesn't increase prompt size for existing endpoints.
Precision
Tests target specific code paths rather than generic endpoint behavior. Edge cases and error conditions get proper coverage.
Incremental Updates
When code changes, only affected paths need re-analysis. The ICFG approach supports efficient CI/CD integration.
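One way to realize this, sketched below with invented path and function names, is to index which execution paths touch which functions; a code change then triggers re-analysis of only the paths whose function sets intersect the changed functions.

```python
# Hypothetical change-impact index: which execution paths touch which
# functions (names invented for illustration).
PATH_FUNCS = {
    "POST /users :: success":   {"createUser", "UserService.create", "UserRepo.save"},
    "POST /users :: duplicate": {"createUser", "UserService.create"},
    "GET /users/{id} :: found": {"getUser", "UserRepo.findById"},
}

def affected_paths(changed_funcs):
    """Return only the paths whose code overlaps the changed functions."""
    changed = set(changed_funcs)
    return sorted(p for p, fns in PATH_FUNCS.items() if fns & changed)

# A change to UserService.create leaves GET /users/{id} untouched
affected = affected_paths({"UserService.create"})
```

In a CI/CD pipeline, only the paths returned here would be re-extracted and re-prompted, keeping test regeneration proportional to the size of the change rather than the codebase.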
Better Coverage
By enumerating all execution paths, the system systematically generates tests for scenarios that manual testing often misses.
Privacy Preservation
Organizations can use this approach with hosted LLMs while minimizing code exposure. Only path-specific snippets are sent to the LLM, not the entire codebase.
Future Directions
This work opens several research directions:
Async Behavior Handling — Current ICFG construction focuses on synchronous execution paths. Microservices heavily use async patterns (message queues, event-driven architectures) that require extended graph models.
Cross-Service Integration Testing — The current approach tests individual microservices. Extending ICFGs across service boundaries could enable integration test generation that covers end-to-end workflows.
API Contract Enrichment — Combining ICFG-derived paths with OpenAPI specifications could provide even richer context for test generation, including request/response schema validation.
Award Recognition
IEEE CISOSE 2025
The IEEE International Conference on Service-Oriented System Engineering (CISOSE) is a premier venue for research on service-oriented architectures, microservices, and cloud-native systems. The Student Research Competition showcases innovative work from graduate students worldwide.
I'm grateful to my advisor Dr. Tomas Cerny and the research group at the University of Arizona for their guidance and support throughout this project.
Citation
If you find this work useful, please cite:
```bibtex
@inproceedings{uddin2025graph,
  title={Graph-Based LLM Prompting for Scalable Microservice API Testing},
  author={Uddin, Md Arfan},
  booktitle={2025 IEEE International Conference on Service-Oriented System Engineering (SOSE)},
  year={2025},
  organization={IEEE},
  doi={10.1109/SOSE67019.2025.00034}
}
```
About This Research
This research was conducted at the University of Arizona as part of my graduate studies in software engineering. The work focuses on improving developer productivity through intelligent tooling that leverages modern AI capabilities.
The code is available on GitHub. Feel free to reach out if you have questions or want to collaborate on related research.