SyscaLLM is a system call error injection testing framework that combines Large Language Models (LLMs) with error injection techniques to assess software robustness. It provides a comprehensive testbed for evaluating application robustness by emulating a wide range of operating system error conditions via system call manipulation using strace.
The general workflow of SyscaLLM is as follows:
Manual Pages → LLM Generation → JSON Tests → Strace Commands → Configurations → Error Injection → Analysis
SyscaLLM is structured around four main components, each aligned with key stages of this workflow:
- LLM-based System Call Test Generation (syscallm-generation/): Covers the Manual Pages → LLM Generation → JSON Tests stages. This module uses LLMs to generate system call error injection tests based on Linux man pages. This is added as a subdirectory syscallm-generation.
- Errorload Processing Pipeline (src/process-json/): Implements the JSON Tests → Strace Commands → Configurations stages. It processes the LLM-generated JSON tests into executable error injection configurations that can be used in the error injection environment.
- Error Injection Testbed (syscallm-injection/): Corresponds to the Error Injection stage. This Docker-based testbed simulates faulty Linux OS behavior by injecting system call errors. This is added as a subdirectory syscallm-injection.
- Visualization of Experiment Results (src/plot/): Supports the Analysis stage. This component provides scripts and tools to visualize and interpret application behavior in response to injected system call errors.
- Automated Test Generation: Uses GPT-5.2 to generate realistic system call failure scenarios
- Comprehensive Error Injection: Supports tampering with both the success return values and the error codes of system calls
- Real-world Application Testing: Pre-configured support for Redis, memcached, Python, Nginx and other applications
- Configurable Error Distribution: Supports uniform and logarithmic error distribution patterns
- Error Injection Environment: Isolated Docker-based testing with system call monitoring
- Detailed Analysis Tools: Visualization and analysis scripts for test results and coverage
First, clone this repository and its submodules:
# clone the main repository
git clone https://github.com/boschresearch/syscallm.git
cd syscallm
# initialize and update submodules
git submodule update --init --recursive
- Install dependencies for syscallm-generation
pip install -r syscallm-generation/requirements.txt
- Set up the environment for syscallm-injection
- Extract Linux manual pages
cd ./syscallm-generation
bash ./scripts/extract_syscall_man_pages.sh
- Generate JSON files with LLM
bsub < gpt5.2.bsub
- Configure environment variables
cd ../syscallm-injection
source ./config/configure
- Process JSON files to strace command options
bash ../scripts/process_json.sh
- Build the syscall monitor image
bash ./scripts/build_monitor_image.sh
- Build wrapped application image
bash ./scripts/build_test_image.sh
- Run injection
bash ./scripts/batch.sh
SyscaLLM orchestrates a multi-stage workflow to test application robustness against OS-level errors by injecting errors into system calls. Here, we break down each stage of this pipeline:
The LLM generates test cases based on the following components:
- Prompt (syscallm-generation/src/prompt.py): Guides the LLM to generate erroneous values for system call success return values and error codes. The current prompt targets valid return value errors (i.e., return values that fall within a valid range but are incorrect in context) and invalid return value errors (i.e., clearly invalid values, such as numbers that exceed the data type range or violate syscall specifications).
- Manual Pages (syscallm-generation/scripts/extract_syscall_man_pages.sh): Serve as prior knowledge for each system call and are dynamically inserted into the prompt.
- JSON Schema (syscallm-generation/src/output_json_schema.py): Defines the expected structure of the LLM-generated test cases, which improves the quality of the output. See JSON schema.
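How the three components above fit together can be illustrated with a minimal sketch. It is not the project's implementation: `TEST_VALUES_SCHEMA`, `PROMPT_TEMPLATE`, `build_prompt`, and `parse_response` are hypothetical names, the prompt wording is invented for illustration, and the schema only mirrors the shape of the sample output shown in this README. The actual prompt and schema live in `src/prompt.py` and `src/output_json_schema.py`.

```python
import json

# Hypothetical schema: mirrors the shape of the sample output in this README,
# not necessarily the contents of src/output_json_schema.py.
TEST_VALUES_SCHEMA = {
    "type": "object",
    "properties": {
        "test_values": {"type": "array", "items": {"type": "integer"}},
    },
    "required": ["test_values"],
}

# Hypothetical prompt template; the real wording lives in src/prompt.py.
PROMPT_TEMPLATE = (
    "You are testing the Linux system call `{syscall}`.\n"
    "Manual page:\n{man_page}\n\n"
    "Propose erroneous success return values: in-range values that are "
    "incorrect in context, and clearly invalid values."
)


def build_prompt(syscall: str, man_page: str) -> str:
    """Insert the man page text for one system call into the prompt."""
    return PROMPT_TEMPLATE.format(syscall=syscall, man_page=man_page)


def parse_response(raw: str) -> list[int]:
    """Parse an LLM response and check it against the expected shape."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    missing = [k for k in TEST_VALUES_SCHEMA["required"] if k not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    values = data["test_values"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(values, list) or any(
        isinstance(v, bool) or not isinstance(v, int) for v in values
    ):
        raise ValueError("test_values must be a list of integers")
    return values
```

Checking the response shape before any further processing keeps malformed LLM output out of the later pipeline stages, which is one way a schema can improve output quality.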
To understand the details of LLM-based test generation, see syscallm-generation/README.md for information on:
- How to extract manual pages for each system call
- Setting up OPENAI_ENDPOINT and OPENAI_API_KEY and configuring model parameters
A sample LLM-generated test case for the accept4 system call's success return values:
{
"test_values": [
0,
1,
2,
1024,
65536,
2147483647,
4294967295,
9223372036854775807,
18446744073709551615
]
}
Run the complete errorload processing pipeline:
bash ./scripts/process_json.sh
This script will run src/process_json/main.py for the following steps:
- Processing JSON files that have out-of-bounds values
  - For success return values, it filters out values that are below 0 or over 18446744073709551615.
  - For error codes, it filters out non-existent error codes (e.g., EACCESS, a misspelling of EACCES).
- Converting JSON files to strace commands
  - Translates the filtered JSON test cases into corresponding strace tampering parameters.
- Filtering strace commands
  - Uses src/utils/app_syscalls.py to identify the system calls actually invoked by the application-under-test, based on its error-free execution logs, and removes commands that are irrelevant for error injection.
- Adding the when parameter to the strace commands
  - For system calls that are invoked multiple times, the error values to inject are assigned to individual invocations via the when parameter (e.g., when=14..14 injects only at the 14th invocation).
- Converting strace commands to error injection config files
  - Converts each strace command to the specific config file format that syscallm-injection uses for error injection.
- Sampling
  - A large number of config files can be generated for one application-under-test. Therefore, config files are sampled randomly for each run.
- Generating random config files
  - Produces a separate set of random configurations to serve as a baseline in the associated scientific publication.
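The filtering and conversion steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `src/process_json/main.py`: the function names are hypothetical, error code names are checked against Python's `errno` module on the host (which may differ from the target system), and the config layout simply copies the sample config shown in this README. The `inject=...:retval=...`, `error=...`, and `when=...` fields follow strace's tampering syntax.

```python
import errno

U64_MAX = 2**64 - 1  # 18446744073709551615, the upper bound named above


def filter_retvals(values):
    """Keep success return values within [0, 2**64 - 1]."""
    return [v for v in values if 0 <= v <= U64_MAX]


def filter_error_codes(names):
    """Keep only error code names known to the host's errno module."""
    known = set(errno.errorcode.values())
    return [n for n in names if n in known]


def fault_option(syscall, *, retval=None, error=None, when=None):
    """Build one strace tampering expression (hypothetical helper)."""
    if (retval is None) == (error is None):
        raise ValueError("pass exactly one of retval or error")
    parts = [f"inject={syscall}"]
    parts.append(f"retval={retval}" if retval is not None else f"error={error}")
    if when is not None:
        # first..last range with a single element: only this invocation
        parts.append(f"when={when}..{when}")
    return ":".join(parts)


def to_config(config_id, fault):
    """Wrap one fault in the config layout of the sample in this README."""
    return {
        "syslog_monitor_config": {
            "id": config_id,
            "strace_output": "/export/strace.output.{id}",
            "output": [{"format": "csv", "target": "/export/output.{id}.csv"}],
            "faults": [fault],
        }
    }


values = filter_retvals([-1, 0, 4294967295, 2**64])
fault = fault_option("accept4", retval=values[-1], when=14)
print(fault)  # inject=accept4:retval=4294967295:when=14..14
```

Keeping each stage as a small pure function makes it straightforward to test the filters separately from config generation.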
A sample config file for the accept4 system call, produced by the pipeline:
{
"syslog_monitor_config": {
"id": "accept4_98",
"strace_output": "/export/strace.output.{id}",
"output": [
{
"format": "csv",
"target": "/export/output.{id}.csv"
}
],
"faults": [
"inject=accept4:retval=4294967295:when=14..14"
]
}
}
First, make sure you have set up your test environment, as specified in Error Injection - Setup. For a quick start, follow Error Injection - Quick Start.
Detailed documentation is provided in syscallm-injection/README.md, which covers:
- How to configure the experiment environment
- How to test your own application
- How to build a monitor component
- How to build a test image
- How to run the experiments
- How to extract the experiment results
After running the experiments, extract the results with:
python3 ./syscallm-injection/src/failure_analysis/main.py --output result.csv
Generate coverage and failure analysis plots with:
# analyze failure patterns
python3 src/plot/plot_failure.py
This project relies on open-source Python libraries. Please see syscallm-generation/README.md and syscallm-injection/README.md.
For any questions or issues, please contact Min Hee Jo.
SyscaLLM is open-sourced under the AGPL-3.0 license. See the LICENSE file for details.