HiKE: Hierarchical Evaluation Framework for Korean-English Code-Switching Speech Recognition

Gio Paik*, Yongbeom Kim, Soungmin Lee, Sangmin Ahn†, and Chanwoo Kim†, EACL Findings 2026
* Corresponding Author, † Equal Contribution

🇰🇷 Korean documentation

✨ Code | 🤗 Dataset | 📖 Paper

Introduction

HiKE is the first Korean-English Code-Switching (CS) Automatic Speech Recognition (ASR) benchmark composed of high-quality, natural CS data across various topics. We use Mixed Error Rate (MER) and Point of Interest Error Rate (PIER) [1] to precisely evaluate the models' CS ASR capability.

Experimental results show that all multilingual ASR models we evaluated exhibit significantly higher error rates on code-switching data, and that their CS ASR capabilities can be improved through fine-tuning.

For further details, please refer to our paper.

[1] Ugan et al., “PIER: A Novel Metric for Evaluating What Matters in Code-Switching”, ICASSP 2025
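As a rough illustration of the MER metric mentioned above, the sketch below computes an edit-distance error rate over mixed tokens. It assumes the common convention of scoring Korean per Hangul syllable and English per word; the tokenization rule and the function names `mixed_tokens`, `edit_distance`, and `mer` are illustrative assumptions, not necessarily what this repository's evaluation code does.

```python
import re
from typing import List, Sequence


def mixed_tokens(text: str) -> List[str]:
    # Assumption: each Hangul syllable is one token; Latin/digit runs stay whole words.
    return re.findall(r"[\uac00-\ud7a3]|[A-Za-z0-9']+", text.lower())


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    # Standard Levenshtein distance, keeping only one previous row in memory.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def mer(reference: str, hypothesis: str) -> float:
    # Errors divided by the number of reference tokens.
    ref = mixed_tokens(reference)
    hyp = mixed_tokens(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref, hyp) / len(ref)
```

For instance, with the reference "오늘 meeting 있어" (five tokens) and the hypothesis "오늘 미팅 있어", this sketch counts one substitution and one insertion, giving 2 / 5 = 0.4.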

Hierarchical CS-Level Labels

To enable a more fine-grained comparison of model performance on different forms of code-switching, we labeled each utterance with one of the following levels:

Word-level CS: A single word, typically a noun or adjective, is replaced by its counterpart in the other language.
Phrase-level CS: A multi-word phrase within a sentence appears in the other language.
Sentence-level CS: The language alternates from one sentence to the next.
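The level labels above make it possible to report an error rate per CS level rather than a single aggregate. The following is a minimal sketch of that grouping; the record fields `cs_level`, `errors`, and `ref_tokens` are hypothetical names chosen for illustration and may not match the dataset schema or the output files of this repository.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def error_rate_by_level(records: Iterable[Mapping]) -> Dict[str, float]:
    # Pool error and token counts per level, then divide once per level,
    # so longer utterances weigh more than shorter ones.
    errors = defaultdict(int)
    tokens = defaultdict(int)
    for rec in records:
        errors[rec["cs_level"]] += rec["errors"]
        tokens[rec["cs_level"]] += rec["ref_tokens"]
    return {lvl: errors[lvl] / tokens[lvl] for lvl in errors if tokens[lvl] > 0}
```

Pooling counts before dividing is a design choice of this sketch; averaging per-utterance rates would give a different number when utterance lengths vary.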

Loanword Labels

Loanwords are words adopted from a foreign language and adapted to the phonology and orthography of the new language. For example, the Korean loanword '버스' [bəs] and the English word 'bus' [bʌs] are pronounced almost identically and can be used interchangeably in a CS context, which makes it ambiguous which form a transcript should contain. To account for this ambiguity, we meticulously labeled all loanwords contained in our dataset.
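One way loanword labels could be used during scoring is to map both written forms of a labeled loanword to a single canonical token before comparing transcripts. The sketch below shows that idea only; the mapping `LOANWORD_PAIRS` and the function `normalize_loanwords` are hypothetical, and this is not necessarily how HiKE applies its labels.

```python
from typing import Dict, List, Optional

# Hypothetical mapping from a Korean loanword to its English counterpart.
LOANWORD_PAIRS: Dict[str, str] = {"버스": "bus"}


def normalize_loanwords(tokens: List[str],
                        pairs: Optional[Dict[str, str]] = None) -> List[str]:
    # Replace each labeled Korean loanword with its English form so that
    # either spelling compares as equal; other tokens pass through unchanged.
    pairs = LOANWORD_PAIRS if pairs is None else pairs
    return [pairs.get(tok, tok) for tok in tokens]
```

Under this assumption, the word-level token lists ["버스", "타고", "가자"] and ["bus", "타고", "가자"] normalize to the same sequence.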

How To Use

Install Dependencies

git clone --recurse-submodules https://github.com/ThetaOne-AI/HiKE
cd HiKE
pip install -r requirements.txt
apt-get update && apt-get install -y ffmpeg  # install ffmpeg if needed

Run Evaluation

bash scripts/evaluate_whisper.sh
# or
python src/main.py --model whisper --model_name openai/whisper-large --batch_size 8

The results will be saved in ./outputs.
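To evaluate several checkpoints in a row, the command above can be wrapped in a small driver. This is a sketch that uses only the flags shown in this README (`--model`, `--model_name`, `--batch_size`); the helper names `build_command` and `run_all` are illustrative, and it assumes the script is launched from the repository root.

```python
import subprocess
import sys
from typing import Iterable, List


def build_command(model_name: str, batch_size: int = 8) -> List[str]:
    # Mirrors: python src/main.py --model whisper --model_name <name> --batch_size <n>
    return [sys.executable, "src/main.py",
            "--model", "whisper",
            "--model_name", model_name,
            "--batch_size", str(batch_size)]


def run_all(model_names: Iterable[str], batch_size: int = 8) -> None:
    # Run the evaluations one after another; check=True stops at the first failure.
    for name in model_names:
        subprocess.run(build_command(name, batch_size), check=True)
```

For example, `run_all(["openai/whisper-small", "openai/whisper-large"])` would launch two evaluations sequentially.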

Evaluate Your Model

Implement a class that follows the BaseASR interface in src/models/your_model.py, and register it in src/main.py.

Create src/models/your_model.py:

from typing import Any, Dict, List, Optional
from src.models import BaseASR


class YourModel(BaseASR):
    def __init__(self, model_name: str = "your/model-or-config"):
        self.model_name = model_name
        # TODO: load your model or client here

    def generate(self, input, batch_size: Optional[int] = None, **kwargs) -> List[Dict[str, Any]]:
        # Accept either a single audio item or a list of items.
        if not isinstance(input, list):
            input = [input]
        # your_transcribe_fn is a placeholder: replace it with your model's transcription call.
        return [{"text": your_transcribe_fn(x)} for x in input]

Register in src/main.py:

elif model == "your_model":
    from models.your_model import YourModel
    asr = YourModel(model_name)

Run:

python src/main.py --model your_model --model_name your/model-or-name
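Before running a full evaluation, it can help to check that a new model returns output in the shape used by the example above: a list with one dict per input, each holding a string under "text". The validator below is a sketch based only on that example; `check_asr_output` and `DummyASR` are hypothetical names, and the actual BaseASR contract in the repository may require more.

```python
from typing import Any, Dict, List


def check_asr_output(outputs: List[Dict[str, Any]], n_inputs: int) -> None:
    # Raise a descriptive error if the output shape differs from the README example.
    if not isinstance(outputs, list):
        raise TypeError("generate() should return a list")
    if len(outputs) != n_inputs:
        raise ValueError(f"expected {n_inputs} results, got {len(outputs)}")
    for i, item in enumerate(outputs):
        if not isinstance(item, dict) or not isinstance(item.get("text"), str):
            raise TypeError(f"result {i} should be a dict with a string 'text' field")


class DummyASR:
    """Stand-in with the same generate() shape as the YourModel example."""

    def generate(self, input, batch_size=None, **kwargs):
        if not isinstance(input, list):
            input = [input]
        return [{"text": ""} for _ in input]
```

Calling `check_asr_output(DummyASR().generate(["a.wav", "b.wav"]), 2)` passes silently; swapping in your own class surfaces shape problems early.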

Citation

@inproceedings{paik2026hike,
    title = "{H}i{KE}: Hierarchical Evaluation Framework for {K}orean-{E}nglish Code-Switching Speech Recognition",
    author = "Paik, Gio  and
      Kim, Yongbeom  and
      Lee, Soungmin  and
      Ahn, Sangmin  and
      Kim, Chan Woo",
    editor = "Demberg, Vera  and
      Inui, Kentaro  and
      Marquez, Llu{\'i}s",
    booktitle = "Findings of the {A}ssociation for {C}omputational {L}inguistics: {EACL} 2026",
    month = mar,
    year = "2026",
    address = "Rabat, Morocco",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2026.findings-eacl.33/",
    doi = "10.18653/v1/2026.findings-eacl.33",
    pages = "673--681",
    ISBN = "979-8-89176-386-9"
}
