Evaluate A-MEM on LongMemEval

This example walks through a complete evaluation of the A-MEM memory layer on the LongMemEval dataset using a three-stage pipeline: memory construction, memory retrieval, and question answering with evaluation.

You can replace A-MEM with any other supported memory layer by swapping in that layer's config file and changing the --memory-type argument.

Step 1: Download the Dataset

Download the LongMemEval dataset from HuggingFace:

https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned

Save it to a local path, e.g., /path/to/longmemeval.json.
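
Before starting the pipeline, it can help to confirm that the downloaded file parses as JSON. The following is a minimal sketch, not part of this repository: the helper name summarize_dataset is hypothetical, and it assumes the top-level JSON value is a list or an object, which is not stated above.

```python
import json
from pathlib import Path


def summarize_dataset(path):
    """Load a JSON file and return (top-level type name, number of entries).

    Assumption: the top-level value is a list or an object, so len() applies.
    """
    with Path(path).open("r", encoding="utf-8") as f:
        data = json.load(f)
    # len() counts records for a list and keys for an object.
    return type(data).__name__, len(data)
```

For example, calling summarize_dataset("/path/to/longmemeval.json") reports the container type and entry count, which is a quick way to catch a truncated or wrongly saved download.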

Step 2: Prepare Configuration Files

Memory Configuration

Each memory layer requires its own configuration JSON file. This example uses A-MEM. See amem_config.json.

Note: The user_id field is a placeholder that will be overwritten during execution. API keys and base URLs are read from the environment variables OPENAI_API_KEY and OPENAI_API_BASE by default. You can also set them explicitly via llm_api_key / llm_base_url and embedding_api_key / embedding_base_url in the config if needed.

The full list of configuration fields can be found in membase/configs/amem.py.
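
The precedence described in the note above (explicit config fields first, environment variables as the default) can be illustrated as follows. This is only a sketch of that rule, not the code in membase: the function resolve_credentials is hypothetical, and only the field and variable names are taken from the note.

```python
import os


def resolve_credentials(config, prefix, env=None):
    """Return (api_key, base_url) for prefix "llm" or "embedding".

    Illustrative only: explicit config values win, otherwise fall back to
    OPENAI_API_KEY / OPENAI_API_BASE from the environment.
    """
    env = os.environ if env is None else env
    api_key = config.get(f"{prefix}_api_key") or env.get("OPENAI_API_KEY")
    base_url = config.get(f"{prefix}_base_url") or env.get("OPENAI_API_BASE")
    return api_key, base_url
```

With an empty config, both values come from the environment. Setting only embedding_base_url in the config overrides the base URL for embeddings while the key still comes from OPENAI_API_KEY.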

API Configuration (for Evaluation)

The evaluation stage requires an API config to call LLM-based QA and judge models. See api_config.json:

{
    "api_keys": ["sk-your-api-key-1", "sk-your-api-key-2"],
    "base_urls": ["https://api.openai.com/v1", "https://api.openai.com/v1"]
}

Alternatively, set environment variables instead of using --api-config-path:

export OPENAI_API_KEY="sk-your-api-key"
export OPENAI_API_BASE="https://api.openai.com/v1"
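
A quick sanity check of api_config.json before launching evaluation might look like the sketch below. It assumes that api_keys and base_urls are paired by position, which the example file suggests but this README does not state. The helper load_api_endpoints is hypothetical and not part of the repository.

```python
import json


def load_api_endpoints(path):
    """Return a list of (api_key, base_url) pairs from an API config file.

    Assumption: the i-th key belongs with the i-th base URL.
    """
    with open(path, "r", encoding="utf-8") as f:
        cfg = json.load(f)
    keys, urls = cfg["api_keys"], cfg["base_urls"]
    if len(keys) != len(urls):
        raise ValueError(
            f"api_keys has {len(keys)} entries but base_urls has {len(urls)}"
        )
    return list(zip(keys, urls))
```

Failing early on a length mismatch is cheaper than discovering a missing endpoint partway through an evaluation run.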

Step 3: Run Memory Construction (Stage 1)

Edit run_construction.sh to set your dataset path, API keys, and base URLs, then run:

bash examples/evaluate_amem_on_longmemeval/run_construction.sh

This example processes 4 trajectories split across 2 parallel processes (index ranges 0-2 and 2-4, so each process handles two trajectories), each with num_workers=2. Monitor progress in amem_logs/.
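
When scaling beyond 4 trajectories, the per-process index ranges can be computed rather than written by hand. The sketch below assumes ranges are half-open (start inclusive, end exclusive), which is consistent with 0-2 and 2-4 covering 4 trajectories. The helper split_ranges is hypothetical and not part of the repository.

```python
def split_ranges(start, end, num_processes):
    """Split [start, end) into contiguous half-open ranges, one per process.

    Any remainder is spread over the first ranges, so sizes differ by at most 1.
    """
    if num_processes < 1:
        raise ValueError("num_processes must be at least 1")
    base, extra = divmod(end - start, num_processes)
    ranges = []
    lo = start
    for i in range(num_processes):
        hi = lo + base + (1 if i < extra else 0)
        ranges.append((lo, hi))
        lo = hi
    return ranges


print(split_ranges(0, 4, 2))  # [(0, 2), (2, 4)]
```

Each resulting pair can then be used as the start and end index of one process in run_construction.sh.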

Step 4: Run Memory Retrieval (Stage 2)

After memory construction completes, edit run_search.sh and run:

bash examples/evaluate_amem_on_longmemeval/run_search.sh

The output will be saved to {save_dir}/{top_k}_{start_idx}_{end_idx}.json (e.g., amem_output/10_0_4.json).
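
Because the evaluation stage needs this path, it can be convenient to derive it from the same parameters used for the search stage. A small sketch of the naming pattern above follows. The helper search_output_path is hypothetical, and it uses POSIX-style separators to match the example.

```python
from pathlib import PurePosixPath


def search_output_path(save_dir, top_k, start_idx, end_idx):
    """Build {save_dir}/{top_k}_{start_idx}_{end_idx}.json as a string."""
    return str(PurePosixPath(save_dir) / f"{top_k}_{start_idx}_{end_idx}.json")


print(search_output_path("amem_output", 10, 0, 4))  # amem_output/10_0_4.json
```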

Step 5: Run Evaluation (Stage 3)

Edit run_evaluation.sh and run:

bash examples/evaluate_amem_on_longmemeval/run_evaluation.sh

The evaluation results will be saved as {search_results_path}_evaluation.json.

Tips

API Rate Limits: Set num_workers conservatively (e.g., 4-8) to avoid upstream API overload.
Resume Interrupted Runs: If the process is interrupted, simply re-run the same command. Completed trajectories will be skipped automatically.
Token Cost Tracking: Check the generated token_cost_*.json files for detailed token consumption statistics.
Log Files: Monitor {log_dir}/process_*.log files for real-time progress and debugging.
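
To get a single total across several token_cost_*.json files, a sketch like the following may be useful. The file schema is not documented in this README, so the code assumes each file holds a flat JSON object and sums only its numeric top-level fields. The helper sum_token_costs and any field names used with it are hypothetical.

```python
import glob
import json
import os
from collections import Counter


def sum_token_costs(directory):
    """Sum numeric top-level fields across token_cost_*.json files.

    Assumption: each file is a flat JSON object; other shapes are skipped.
    """
    totals = Counter()
    for path in sorted(glob.glob(os.path.join(directory, "token_cost_*.json"))):
        with open(path, "r", encoding="utf-8") as f:
            record = json.load(f)
        if not isinstance(record, dict):
            continue
        for key, value in record.items():
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                totals[key] += value
    return dict(totals)
```

If the real files nest their counters under other keys, the loop over record.items() would need to be adapted to that structure.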
