Skip to content

[WebAssembly] WASIP3 and component model threading support#175800

Open
TartanLlama wants to merge 126 commits intollvm:mainfrom
TartanLlama:sy/wasip3
Open

[WebAssembly] WASIP3 and component model threading support#175800
TartanLlama wants to merge 126 commits intollvm:mainfrom
TartanLlama:sy/wasip3

Conversation

@TartanLlama
Copy link
Copy Markdown
Contributor

@TartanLlama TartanLlama commented Jan 13, 2026

The WebAssembly Component Model has added support for cooperative multithreading. This has been implemented in the Wasmtime engine and is part of the wider project of WASI preview 3, which is currently tracked here.

These changes will require updating the way that __stack_pointer and __tls_base work purely for a new wasm32-wasip3 target; other targets will not be touched. Specifically, rather than using a Wasm global for tracking the stack pointer and TLS base, the new context.get/set component model builtin functions will be used (the intention being that runtimes will need to aggressively optimize these calls into single load/stores). For justification on this choice rather than switching out the global at context-switch boundaries, see this comment and this comment.

I guard the new behaviour behind a new component-model-thread-context feature, which is automatically enabled for WASIP3 targets. The linker determines what thread context ABI to use based on the presence of this feature.

Includes #182896

@github-actions
Copy link
Copy Markdown

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

Copy link
Copy Markdown
Contributor

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two higher-ish level thoughts on this:

  • Would it be possible to decouple the selection of how things work internally from the target name and "wasip3" suffix? These options will, I believe, be useful for experimenting on other wasm targets (e.g. even wasm32-wasip2) and it would be useful to have knobs to turn without faking/forcing a target. My thinking is that the default behavior for wasm32-wasip3 is still the same, exactly as-is in this PR, but the knobs could be further refined if so desired for power users. Effectively there'd be per-target defaults for the knobs, but the knobs could be manually overridden if needed.
  • What happens if objects of one ABI are mixed with objects of another ABI? For example if I were to link code compiled for wasm32-wasip2 with code for wasm32-wasip3, what would happen? Ideally I'd expect a linker-level error to be emitted with some long enough string that could be googled but probably wouldn't be descriptive in its own right. My main worry is less mixing targets and more mixing versions of LLVM by accident and ensuring that things don't silently link and then get weirdly corrupted at runtime.

@TartanLlama TartanLlama marked this pull request as ready for review February 17, 2026 08:06
@llvmbot llvmbot added lld backend:WebAssembly clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" llvm:mc Machine (object) code lld:wasm labels Feb 17, 2026
@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Feb 17, 2026

@llvm/pr-subscribers-lld-wasm
@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-lld

Author: Sy Brand (TartanLlama)

Changes

(Currently in draft, as this will evolve alongside other toolchain component updates)

The WebAssembly Component Model has added support for cooperative multithreading. This has been implemented in the Wasmtime engine and is part of the wider project of WASI preview 3, which is currently tracked here.

These changes will require updating the way that __stack_pointer and __tls_base work purely for a new wasm32-wasip3 target; other targets will not be touched. Specifically, rather than using a Wasm global for tracking the stack pointer and TLS base, the new context.get/set component model builtin functions will be used (the intention being that runtimes will need to aggressively optimize these calls into single load/stores). For justification on this choice rather than switching out the global at context-switch boundaries, see this comment and this comment.


Patch is 55.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/175800.diff

22 Files Affected:

  • (modified) clang/lib/Basic/Targets/WebAssembly.cpp (+2-1)
  • (modified) clang/lib/Driver/ToolChains/WebAssembly.cpp (+22-11)
  • (modified) lld/wasm/Config.h (+12)
  • (modified) lld/wasm/Driver.cpp (+51-16)
  • (modified) lld/wasm/Relocations.cpp (+2-2)
  • (modified) lld/wasm/Symbols.cpp (+3-5)
  • (modified) lld/wasm/SyntheticSections.cpp (+11-11)
  • (modified) lld/wasm/Writer.cpp (+22-15)
  • (modified) lld/wasm/WriterUtils.cpp (+23-1)
  • (modified) lld/wasm/WriterUtils.h (+4)
  • (modified) llvm/include/llvm/MC/MCSymbolWasm.h (+2-6)
  • (modified) llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp (+20-2)
  • (modified) llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h (+121-121)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp (+26-5)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyFrameLowering.cpp (+33-20)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyFrameLowering.h (+3-3)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp (+2-4)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp (+4-17)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyLateEHPrepare.cpp (+1-1)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp (+21-12)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyUtilities.cpp (+18)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyUtilities.h (+9)
diff --git a/clang/lib/Basic/Targets/WebAssembly.cpp b/clang/lib/Basic/Targets/WebAssembly.cpp
index daaefd9a1267c..1905b838e52a1 100644
--- a/clang/lib/Basic/Targets/WebAssembly.cpp
+++ b/clang/lib/Basic/Targets/WebAssembly.cpp
@@ -410,7 +410,8 @@ void WebAssemblyTargetInfo::adjust(DiagnosticsEngine &Diags, LangOptions &Opts,
   // Turn off POSIXThreads and ThreadModel so that we don't predefine _REENTRANT
   // or __STDCPP_THREADS__ if we will eventually end up stripping atomics
   // because they are unsupported.
-  if (!HasAtomics || !HasBulkMemory) {
+  if (getTriple().getOSName() != "wasip3" &&
+      (!HasAtomics || !HasBulkMemory)) {
     Opts.POSIXThreads = false;
     Opts.setThreadModel(LangOptions::ThreadModelKind::Single);
     Opts.ThreadsafeStatics = false;
diff --git a/clang/lib/Driver/ToolChains/WebAssembly.cpp b/clang/lib/Driver/ToolChains/WebAssembly.cpp
index b5fa5760a46a0..efeadcc6556de 100644
--- a/clang/lib/Driver/ToolChains/WebAssembly.cpp
+++ b/clang/lib/Driver/ToolChains/WebAssembly.cpp
@@ -30,13 +30,14 @@ using namespace llvm::opt;
 std::string WebAssembly::getMultiarchTriple(const Driver &D,
                                             const llvm::Triple &TargetTriple,
                                             StringRef SysRoot) const {
-    return (TargetTriple.getArchName() + "-" +
-            TargetTriple.getOSAndEnvironmentName()).str();
+  return (TargetTriple.getArchName() + "-" +
+          TargetTriple.getOSAndEnvironmentName())
+      .str();
 }
 
 std::string wasm::Linker::getLinkerPath(const ArgList &Args) const {
   const ToolChain &ToolChain = getToolChain();
-  if (const Arg* A = Args.getLastArg(options::OPT_fuse_ld_EQ)) {
+  if (const Arg *A = Args.getLastArg(options::OPT_fuse_ld_EQ)) {
     StringRef UseLinker = A->getValue();
     if (!UseLinker.empty()) {
       if (llvm::sys::path::is_absolute(UseLinker) &&
@@ -79,6 +80,10 @@ static bool WantsPthread(const llvm::Triple &Triple, const ArgList &Args) {
   return WantsPthread;
 }
 
+static bool WantsSharedMemory(const llvm::Triple &Triple, const ArgList &Args) {
+  return WantsPthread(Triple, Args) && !TargetBuildsComponents(Triple);
+}
+
 void wasm::Linker::ConstructJob(Compilation &C, const JobAction &JA,
                                 const InputInfo &Output,
                                 const InputInfoList &Inputs,
@@ -90,10 +95,14 @@ void wasm::Linker::ConstructJob(Compilation &C, const JobAction &JA,
   ArgStringList CmdArgs;
 
   CmdArgs.push_back("-m");
+  std::string arch;
   if (ToolChain.getTriple().isArch64Bit())
-    CmdArgs.push_back("wasm64");
+    arch = "wasm64";
   else
-    CmdArgs.push_back("wasm32");
+    arch = "wasm32";
+  if (ToolChain.getTriple().getOSName() == "wasip3")
+    arch += "-wasip3";
+  CmdArgs.push_back(Args.MakeArgString(arch));
 
   if (Args.hasArg(options::OPT_s))
     CmdArgs.push_back("--strip-all");
@@ -160,7 +169,7 @@ void wasm::Linker::ConstructJob(Compilation &C, const JobAction &JA,
 
   AddLinkerInputs(ToolChain, Inputs, Args, CmdArgs, JA);
 
-  if (WantsPthread(ToolChain.getTriple(), Args))
+  if (WantsSharedMemory(ToolChain.getTriple(), Args))
     CmdArgs.push_back("--shared-memory");
 
   if (!Args.hasArg(options::OPT_nostdlib, options::OPT_nodefaultlibs)) {
@@ -233,9 +242,9 @@ void wasm::Linker::ConstructJob(Compilation &C, const JobAction &JA,
 /// Given a base library directory, append path components to form the
 /// LTO directory.
 static std::string AppendLTOLibDir(const std::string &Dir) {
-    // The version allows the path to be keyed to the specific version of
-    // LLVM in used, as the bitcode format is not stable.
-    return Dir + "/llvm-lto/" LLVM_VERSION_STRING;
+  // The version allows the path to be keyed to the specific version of
+  // LLVM in used, as the bitcode format is not stable.
+  return Dir + "/llvm-lto/" LLVM_VERSION_STRING;
 }
 
 WebAssembly::WebAssembly(const Driver &D, const llvm::Triple &Triple,
@@ -508,7 +517,8 @@ void WebAssembly::AddClangSystemIncludeArgs(const ArgList &DriverArgs,
   if (getTriple().getOS() != llvm::Triple::UnknownOS) {
     const std::string MultiarchTriple =
         getMultiarchTriple(D, getTriple(), D.SysRoot);
-    addSystemInclude(DriverArgs, CC1Args, D.SysRoot + "/include/" + MultiarchTriple);
+    addSystemInclude(DriverArgs, CC1Args,
+                     D.SysRoot + "/include/" + MultiarchTriple);
   }
   addSystemInclude(DriverArgs, CC1Args, D.SysRoot + "/include");
 }
@@ -637,5 +647,6 @@ void WebAssembly::addLibStdCXXIncludePaths(
   // Second add the generic one.
   addSystemInclude(DriverArgs, CC1Args, LibPath + "/c++/" + Version);
   // Third the backward one.
-  addSystemInclude(DriverArgs, CC1Args, LibPath + "/c++/" + Version + "/backward");
+  addSystemInclude(DriverArgs, CC1Args,
+                   LibPath + "/c++/" + Version + "/backward");
 }
diff --git a/lld/wasm/Config.h b/lld/wasm/Config.h
index 31e08e4e248a4..d291a42da200f 100644
--- a/lld/wasm/Config.h
+++ b/lld/wasm/Config.h
@@ -35,6 +35,7 @@ class Symbol;
 class DefinedData;
 class GlobalSymbol;
 class DefinedFunction;
+class UndefinedFunction;
 class DefinedGlobal;
 class UndefinedGlobal;
 class TableSymbol;
@@ -50,6 +51,8 @@ enum class BuildIdKind { None, Fast, Sha1, Hexstring, Uuid };
 // and such fields have the same name as the corresponding options.
 // Most fields are initialized by the driver.
 struct Config {
+  bool isMultithreaded() const { return sharedMemory || isWasip3; }
+
   bool allowMultipleDefinition;
   bool bsymbolic;
   bool checkFeatures;
@@ -71,6 +74,7 @@ struct Config {
   bool importTable;
   bool importUndefined;
   std::optional<bool> is64;
+  bool isWasip3;
   bool mergeDataSegments;
   bool noinhibitExec;
   bool pie;
@@ -252,6 +256,14 @@ struct Ctx {
     // Used as an address space for function pointers, with each function that
     // is used as a function pointer being allocated a slot.
     TableSymbol *indirectFunctionTable;
+
+    // __wasm_component_model_builtin_context_set_1
+    // Function used to set TLS base in component model modules.
+    UndefinedFunction *contextSet1;
+
+    // __wasm_component_model_builtin_context_get_1
+    // Function used to get TLS base in component model modules.
+    UndefinedFunction *contextGet1;
   };
   WasmSym sym;
 
diff --git a/lld/wasm/Driver.cpp b/lld/wasm/Driver.cpp
index b1e36f2ecff74..6eaacd7288f22 100644
--- a/lld/wasm/Driver.cpp
+++ b/lld/wasm/Driver.cpp
@@ -656,15 +656,16 @@ static void readConfigs(opt::InputArgList &args) {
   ctx.arg.exportDynamic =
       args.hasFlag(OPT_export_dynamic, OPT_no_export_dynamic, ctx.arg.shared);
 
-  // Parse wasm32/64.
+  // Parse wasm32/64 and maybe -wasip3.
   if (auto *arg = args.getLastArg(OPT_m)) {
     StringRef s = arg->getValue();
-    if (s == "wasm32")
+    if (s.starts_with("wasm32"))
       ctx.arg.is64 = false;
-    else if (s == "wasm64")
+    else if (s.starts_with("wasm64"))
       ctx.arg.is64 = true;
     else
       error("invalid target architecture: " + s);
+    ctx.arg.isWasip3 = s.ends_with("-wasip3");
   }
 
   // --threads= takes a positive integer and provides the default value for
@@ -827,6 +828,10 @@ static void checkOptions(opt::InputArgList &args) {
     if (ctx.arg.tableBase)
       error("--table-base may not be used with -shared/-pie");
   }
+
+  if (ctx.arg.sharedMemory && ctx.arg.isWasip3) {
+    error("--shared-memory is incompatible with the wasip3 target");
+  }
 }
 
 static const char *getReproduceOption(opt::InputArgList &args) {
@@ -885,7 +890,7 @@ static void writeWhyExtract() {
 // Equivalent of demote demoteSharedAndLazySymbols() in the ELF linker
 static void demoteLazySymbols() {
   for (Symbol *sym : symtab->symbols()) {
-    if (auto* s = dyn_cast<LazySymbol>(sym)) {
+    if (auto *s = dyn_cast<LazySymbol>(sym)) {
       if (s->signature) {
         LLVM_DEBUG(llvm::dbgs()
                    << "demoting lazy func: " << s->getName() << "\n");
@@ -906,6 +911,18 @@ createUndefinedGlobal(StringRef name, llvm::wasm::WasmGlobalType *type) {
   return sym;
 }
 
+static UndefinedFunction *
+createUndefinedFunction(StringRef name, std::optional<StringRef> importName,
+                        std::optional<StringRef> importModule,
+                        WasmSignature *signature) {
+  auto *sym = cast<UndefinedFunction>(symtab->addUndefinedFunction(
+      name, importName, importModule, WASM_SYMBOL_UNDEFINED, nullptr, signature,
+      true));
+  ctx.arg.allowUndefinedSymbols.insert(sym->getName());
+  sym->isUsedInRegularObj = true;
+  return sym;
+}
+
 static InputGlobal *createGlobal(StringRef name, bool isMutable) {
   llvm::wasm::WasmGlobal wasmGlobal;
   bool is64 = ctx.arg.is64.value_or(false);
@@ -946,11 +963,13 @@ static void createSyntheticSymbols() {
 
   bool is64 = ctx.arg.is64.value_or(false);
 
+  auto stack_pointer_name =
+      ctx.arg.isWasip3 ? "__init_stack_pointer" : "__stack_pointer";
   if (ctx.isPic) {
     ctx.sym.stackPointer =
-        createUndefinedGlobal("__stack_pointer", ctx.arg.is64.value_or(false)
-                                                     ? &mutableGlobalTypeI64
-                                                     : &mutableGlobalTypeI32);
+        createUndefinedGlobal(stack_pointer_name, ctx.arg.is64.value_or(false)
+                                                      ? &mutableGlobalTypeI64
+                                                      : &mutableGlobalTypeI32);
     // For PIC code, we import two global variables (__memory_base and
     // __table_base) from the environment and use these as the offset at
     // which to load our static data and function table.
@@ -963,14 +982,15 @@ static void createSyntheticSymbols() {
     ctx.sym.tableBase->markLive();
   } else {
     // For non-PIC code
-    ctx.sym.stackPointer = createGlobalVariable("__stack_pointer", true);
+    ctx.sym.stackPointer = createGlobalVariable(stack_pointer_name, true);
     ctx.sym.stackPointer->markLive();
   }
 
-  if (ctx.arg.sharedMemory) {
+  if (ctx.arg.isMultithreaded()) {
     // TLS symbols are all hidden/dso-local
-    ctx.sym.tlsBase =
-        createGlobalVariable("__tls_base", true, WASM_SYMBOL_VISIBILITY_HIDDEN);
+    auto tls_base_name = ctx.arg.isWasip3 ? "__init_tls_base" : "__tls_base";
+    ctx.sym.tlsBase = createGlobalVariable(tls_base_name, true,
+                                           WASM_SYMBOL_VISIBILITY_HIDDEN);
     ctx.sym.tlsSize = createGlobalVariable("__tls_size", false,
                                            WASM_SYMBOL_VISIBILITY_HIDDEN);
     ctx.sym.tlsAlign = createGlobalVariable("__tls_align", false,
@@ -979,6 +999,21 @@ static void createSyntheticSymbols() {
         "__wasm_init_tls", WASM_SYMBOL_VISIBILITY_HIDDEN,
         make<SyntheticFunction>(is64 ? i64ArgSignature : i32ArgSignature,
                                 "__wasm_init_tls"));
+    if (ctx.arg.isWasip3) {
+      ctx.sym.tlsBase->markLive();
+      ctx.sym.tlsSize->markLive();
+      ctx.sym.tlsAlign->markLive();
+      static WasmSignature contextSet1Signature{{}, {ValType::I32}};
+      ctx.sym.contextSet1 = createUndefinedFunction(
+          "__wasm_component_model_builtin_context_set_1", "[context-set-1]",
+          "$root", &contextSet1Signature);
+      ctx.sym.contextSet1->markLive();
+      static WasmSignature contextGet1Signature{{ValType::I32}, {}};
+      ctx.sym.contextGet1 = createUndefinedFunction(
+          "__wasm_component_model_builtin_context_get_1", "[context-get-1]",
+          "$root", &contextGet1Signature);
+      ctx.sym.contextGet1->markLive();
+    }
   }
 }
 
@@ -1017,7 +1052,7 @@ static void createOptionalSymbols() {
   //
   // __tls_size and __tls_align are not needed in this case since they are only
   // needed for __wasm_init_tls (which we do not create in this case).
-  if (!ctx.arg.sharedMemory)
+  if (!ctx.arg.sharedMemory && !ctx.arg.isWasip3)
     ctx.sym.tlsBase = createOptionalGlobal("__tls_base", false);
 }
 
@@ -1026,15 +1061,15 @@ static void processStubLibrariesPreLTO() {
   for (auto &stub_file : ctx.stubFiles) {
     LLVM_DEBUG(llvm::dbgs()
                << "processing stub file: " << stub_file->getName() << "\n");
-    for (auto [name, deps]: stub_file->symbolDependencies) {
-      auto* sym = symtab->find(name);
+    for (auto [name, deps] : stub_file->symbolDependencies) {
+      auto *sym = symtab->find(name);
       // If the symbol is not present at all (yet), or if it is present but
       // undefined, then mark the dependent symbols as used by a regular
       // object so they will be preserved and exported by the LTO process.
       if (!sym || sym->isUndefined()) {
         for (const auto dep : deps) {
-          auto* needed = symtab->find(dep);
-          if (needed ) {
+          auto *needed = symtab->find(dep);
+          if (needed) {
             needed->isUsedInRegularObj = true;
             // Like with handleLibcall we have to extract any LTO archive
             // members that might need to be exported due to stub library
diff --git a/lld/wasm/Relocations.cpp b/lld/wasm/Relocations.cpp
index a3f87ea3d69c0..cb597fdeffcf3 100644
--- a/lld/wasm/Relocations.cpp
+++ b/lld/wasm/Relocations.cpp
@@ -33,7 +33,7 @@ static bool requiresGOTAccess(const Symbol *sym) {
   return true;
 }
 
-static bool allowUndefined(const Symbol* sym) {
+static bool allowUndefined(const Symbol *sym) {
   // Symbols that are explicitly imported are always allowed to be undefined at
   // link time.
   if (sym->isImported())
@@ -125,7 +125,7 @@ void scanRelocations(InputChunk *chunk) {
       // In single-threaded builds TLS is lowered away and TLS data can be
       // merged with normal data and allowing TLS relocation in non-TLS
       // segments.
-      if (ctx.arg.sharedMemory) {
+      if (ctx.arg.isMultithreaded()) {
         if (!sym->isTLS()) {
           error(toString(file) + ": relocation " +
                 relocTypeToString(reloc.Type) +
diff --git a/lld/wasm/Symbols.cpp b/lld/wasm/Symbols.cpp
index f2040441e6257..97a9871a06308 100644
--- a/lld/wasm/Symbols.cpp
+++ b/lld/wasm/Symbols.cpp
@@ -95,7 +95,7 @@ WasmSymbolType Symbol::getWasmType() const {
 }
 
 const WasmSignature *Symbol::getSignature() const {
-  if (auto* f = dyn_cast<FunctionSymbol>(this))
+  if (auto *f = dyn_cast<FunctionSymbol>(this))
     return f->signature;
   if (auto *t = dyn_cast<TagSymbol>(this))
     return t->signature;
@@ -223,9 +223,7 @@ bool Symbol::isExportedExplicit() const {
   return forceExport || flags & WASM_SYMBOL_EXPORTED;
 }
 
-bool Symbol::isNoStrip() const {
-  return flags & WASM_SYMBOL_NO_STRIP;
-}
+bool Symbol::isNoStrip() const { return flags & WASM_SYMBOL_NO_STRIP; }
 
 uint32_t FunctionSymbol::getFunctionIndex() const {
   if (const auto *u = dyn_cast<UndefinedFunction>(this))
@@ -413,7 +411,7 @@ void LazySymbol::setWeak() {
   flags |= (flags & ~WASM_SYMBOL_BINDING_MASK) | WASM_SYMBOL_BINDING_WEAK;
 }
 
-void printTraceSymbolUndefined(StringRef name, const InputFile* file) {
+void printTraceSymbolUndefined(StringRef name, const InputFile *file) {
   message(toString(file) + ": reference to " + name);
 }
 
diff --git a/lld/wasm/SyntheticSections.cpp b/lld/wasm/SyntheticSections.cpp
index ede6ac4da77b3..023c690c14354 100644
--- a/lld/wasm/SyntheticSections.cpp
+++ b/lld/wasm/SyntheticSections.cpp
@@ -466,8 +466,7 @@ void GlobalSection::addInternalGOTEntry(Symbol *sym) {
 void GlobalSection::generateRelocationCode(raw_ostream &os, bool TLS) const {
   assert(!ctx.arg.extendedConst);
   bool is64 = ctx.arg.is64.value_or(false);
-  unsigned opcode_ptr_add = is64 ? WASM_OPCODE_I64_ADD
-                                 : WASM_OPCODE_I32_ADD;
+  unsigned opcode_ptr_add = is64 ? WASM_OPCODE_I64_ADD : WASM_OPCODE_I32_ADD;
 
   for (const Symbol *sym : internalGotSymbols) {
     if (TLS != sym->isTLS())
@@ -477,7 +476,7 @@ void GlobalSection::generateRelocationCode(raw_ostream &os, bool TLS) const {
       // Get __memory_base
       writeU8(os, WASM_OPCODE_GLOBAL_GET, "GLOBAL_GET");
       if (sym->isTLS())
-        writeUleb128(os, ctx.sym.tlsBase->getGlobalIndex(), "__tls_base");
+        writeGetTLSBase(ctx, os);
       else
         writeUleb128(os, ctx.sym.memoryBase->getGlobalIndex(), "__memory_base");
 
@@ -520,9 +519,9 @@ void GlobalSection::writeBody() {
       // the correct runtime value during `__wasm_apply_global_relocs`.
       if (!ctx.arg.extendedConst && ctx.isPic && !sym->isTLS())
         mutable_ = true;
-      // With multi-theadeding any TLS globals must be mutable since they get
+      // With multi-threading any TLS globals must be mutable since they get
       // set during `__wasm_apply_global_tls_relocs`
-      if (ctx.arg.sharedMemory && sym->isTLS())
+      if (ctx.arg.isMultithreaded() && sym->isTLS())
         mutable_ = true;
     }
     WasmGlobalType type{itype, mutable_};
@@ -559,10 +558,11 @@ void GlobalSection::writeBody() {
     } else {
       WasmInitExpr initExpr;
       if (auto *d = dyn_cast<DefinedData>(sym))
-        // In the sharedMemory case TLS globals are set during
-        // `__wasm_apply_global_tls_relocs`, but in the non-shared case
+        // In the multi-threaded case, TLS globals are set during
+        // `__wasm_apply_global_tls_relocs`, but in the non-multi-threaded case
         // we know the absolute value at link time.
-        initExpr = intConst(d->getVA(/*absolute=*/!ctx.arg.sharedMemory), is64);
+        initExpr =
+            intConst(d->getVA(/*absolute=*/!ctx.arg.isMultithreaded()), is64);
       else if (auto *f = dyn_cast<FunctionSymbol>(sym))
         initExpr = intConst(f->isStub ? 0 : f->getTableIndex(), is64);
       else {
@@ -646,7 +646,7 @@ void ElemSection::writeBody() {
   uint32_t tableIndex = ctx.arg.tableBase;
   for (const FunctionSymbol *sym : indirectFunctions) {
     assert(sym->getTableIndex() == tableIndex);
-    (void) tableIndex;
+    (void)tableIndex;
     writeUleb128(os, sym->getFunctionIndex(), "function index");
     ++tableIndex;
   }
@@ -663,7 +663,7 @@ void DataCountSection::writeBody() {
 }
 
 bool DataCountSection::isNeeded() const {
-  return numSegments && ctx.arg.sharedMemory;
+  return numSegments && ctx.arg.isMultithreaded();
 }
 
 void LinkingSection::writeBody() {
@@ -992,4 +992,4 @@ void BuildIdSection::writeBuildId(llvm::ArrayRef<uint8_t> buf) {
   memcpy(hashPlaceholderPtr, buf.data(), hashSize);
 }
 
-} // namespace wasm::lld
+} // namespace lld::wasm
diff --git a/lld/wasm/Writer.cpp b/lld/wasm/Writer.cpp
index dfd856f2faee6..50d6449ca79a9 100644
--- a/lld/wasm/Writer.cpp
+++ b/lld/wasm/Writer.cpp
@@ -311,7 +311,8 @@ void Writer::writeBuildId() {
 }
 
 static void setGlobalPtr(DefinedGlobal *g, uint64_t memoryPtr) {
-  LLVM_DEBUG(dbgs() << "setGlobalPtr " << g->getName() << " -> " << memoryPtr << "\n");
+  LLVM_DEBUG(dbgs() << "setGlobalPtr " << g->getName() << " -> " << memoryPtr
+                    << "\n");
   g->global->setPointerValue(memoryPtr);
 }
 
@@ -358,7 +359,8 @@ void Writer::layoutMemory() {
     placeStack();
     if (ctx.arg.globalBase) {
       if (ctx.arg.globalBase < memoryPtr) {
-        error("--global-base cannot be less than stack size when --stack-first is used");
+        error("--global-base cannot be less than stack size when --stack-first "
+              "is used");
         return;
       }
       memoryPtr = ctx.arg.globalBase;
@@ -382,6 +384,7 @@ void Writer::layoutMemory() {
   for (OutputSegment *seg : segments) {
     out.dylinkSec->memAlign = std::max(out.dylinkSec->memAlign, seg->alignment);
     memoryPtr = alignTo(memoryPtr, 1ULL << seg->alignment);
+
     seg->startVA = memoryPtr;
     log(formatv("mem: {0,-15} offset={1,-8} size={2,-8} align={3}", seg->name,
                 memoryPtr, seg->size, seg->alignment));
@@ -1029,7 +1032,7 @@ static StringRef getOutputDataSegmentName(const InputChunk &seg) {
 OutputSegment *Writer::createOutputSegment(StringRef name) {
   LLVM_DEBUG(dbgs() << "new segment: " << name << "\n");
   OutputSegment *s = make<OutputSegment>(name);
-  if (ctx.arg.sharedMemory)
+  if (ctx.arg.isMultithreaded())
     s->initFlags = WASM_DATA_SEGMENT_IS_PASSIVE;
   if (!ctx.arg.relocatable && name.starts_with(".bss"))
     s->isBss = true;
@@ -1163,14 +1166,14 @@ void Writer::createSyntheticInitFunctions() {
         "__wasm_init_memory", WASM_SYMBOL_VISIBILITY_HIDDEN,
         make<SyntheticFunction>(nullSignature, "__wasm_init_memory"));
     ctx.sym.initMemory->mark...
[truncated]

@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Feb 17, 2026

@llvm/pr-subscribers-llvm-mc

Author: Sy Brand (TartanLlama)

Changes

(Currently in draft, as this will evolve alongside other toolchain component updates)

The WebAssembly Component Model has added support for cooperative multithreading. This has been implemented in the Wasmtime engine and is part of the wider project of WASI preview 3, which is currently tracked here.

These changes will require updating the way that __stack_pointer and __tls_base work purely for a new wasm32-wasip3 target; other targets will not be touched. Specifically, rather than using a Wasm global for tracking the stack pointer and TLS base, the new context.get/set component model builtin functions will be used (the intention being that runtimes will need to aggressively optimize these calls into single load/stores). For justification on this choice rather than switching out the global at context-switch boundaries, see this comment and this comment.


Patch is 55.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/175800.diff

22 Files Affected:

  • (modified) clang/lib/Basic/Targets/WebAssembly.cpp (+2-1)
  • (modified) clang/lib/Driver/ToolChains/WebAssembly.cpp (+22-11)
  • (modified) lld/wasm/Config.h (+12)
  • (modified) lld/wasm/Driver.cpp (+51-16)
  • (modified) lld/wasm/Relocations.cpp (+2-2)
  • (modified) lld/wasm/Symbols.cpp (+3-5)
  • (modified) lld/wasm/SyntheticSections.cpp (+11-11)
  • (modified) lld/wasm/Writer.cpp (+22-15)
  • (modified) lld/wasm/WriterUtils.cpp (+23-1)
  • (modified) lld/wasm/WriterUtils.h (+4)
  • (modified) llvm/include/llvm/MC/MCSymbolWasm.h (+2-6)
  • (modified) llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp (+20-2)
  • (modified) llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h (+121-121)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp (+26-5)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyFrameLowering.cpp (+33-20)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyFrameLowering.h (+3-3)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp (+2-4)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp (+4-17)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyLateEHPrepare.cpp (+1-1)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp (+21-12)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyUtilities.cpp (+18)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyUtilities.h (+9)
diff --git a/clang/lib/Basic/Targets/WebAssembly.cpp b/clang/lib/Basic/Targets/WebAssembly.cpp
index daaefd9a1267c..1905b838e52a1 100644
--- a/clang/lib/Basic/Targets/WebAssembly.cpp
+++ b/clang/lib/Basic/Targets/WebAssembly.cpp
@@ -410,7 +410,8 @@ void WebAssemblyTargetInfo::adjust(DiagnosticsEngine &Diags, LangOptions &Opts,
   // Turn off POSIXThreads and ThreadModel so that we don't predefine _REENTRANT
   // or __STDCPP_THREADS__ if we will eventually end up stripping atomics
   // because they are unsupported.
-  if (!HasAtomics || !HasBulkMemory) {
+  if (getTriple().getOSName() != "wasip3" &&
+      (!HasAtomics || !HasBulkMemory)) {
     Opts.POSIXThreads = false;
     Opts.setThreadModel(LangOptions::ThreadModelKind::Single);
     Opts.ThreadsafeStatics = false;
diff --git a/clang/lib/Driver/ToolChains/WebAssembly.cpp b/clang/lib/Driver/ToolChains/WebAssembly.cpp
index b5fa5760a46a0..efeadcc6556de 100644
--- a/clang/lib/Driver/ToolChains/WebAssembly.cpp
+++ b/clang/lib/Driver/ToolChains/WebAssembly.cpp
@@ -30,13 +30,14 @@ using namespace llvm::opt;
 std::string WebAssembly::getMultiarchTriple(const Driver &D,
                                             const llvm::Triple &TargetTriple,
                                             StringRef SysRoot) const {
-    return (TargetTriple.getArchName() + "-" +
-            TargetTriple.getOSAndEnvironmentName()).str();
+  return (TargetTriple.getArchName() + "-" +
+          TargetTriple.getOSAndEnvironmentName())
+      .str();
 }
 
 std::string wasm::Linker::getLinkerPath(const ArgList &Args) const {
   const ToolChain &ToolChain = getToolChain();
-  if (const Arg* A = Args.getLastArg(options::OPT_fuse_ld_EQ)) {
+  if (const Arg *A = Args.getLastArg(options::OPT_fuse_ld_EQ)) {
     StringRef UseLinker = A->getValue();
     if (!UseLinker.empty()) {
       if (llvm::sys::path::is_absolute(UseLinker) &&
@@ -79,6 +80,10 @@ static bool WantsPthread(const llvm::Triple &Triple, const ArgList &Args) {
   return WantsPthread;
 }
 
+static bool WantsSharedMemory(const llvm::Triple &Triple, const ArgList &Args) {
+  return WantsPthread(Triple, Args) && !TargetBuildsComponents(Triple);
+}
+
 void wasm::Linker::ConstructJob(Compilation &C, const JobAction &JA,
                                 const InputInfo &Output,
                                 const InputInfoList &Inputs,
@@ -90,10 +95,14 @@ void wasm::Linker::ConstructJob(Compilation &C, const JobAction &JA,
   ArgStringList CmdArgs;
 
   CmdArgs.push_back("-m");
+  std::string arch;
   if (ToolChain.getTriple().isArch64Bit())
-    CmdArgs.push_back("wasm64");
+    arch = "wasm64";
   else
-    CmdArgs.push_back("wasm32");
+    arch = "wasm32";
+  if (ToolChain.getTriple().getOSName() == "wasip3")
+    arch += "-wasip3";
+  CmdArgs.push_back(Args.MakeArgString(arch));
 
   if (Args.hasArg(options::OPT_s))
     CmdArgs.push_back("--strip-all");
@@ -160,7 +169,7 @@ void wasm::Linker::ConstructJob(Compilation &C, const JobAction &JA,
 
   AddLinkerInputs(ToolChain, Inputs, Args, CmdArgs, JA);
 
-  if (WantsPthread(ToolChain.getTriple(), Args))
+  if (WantsSharedMemory(ToolChain.getTriple(), Args))
     CmdArgs.push_back("--shared-memory");
 
   if (!Args.hasArg(options::OPT_nostdlib, options::OPT_nodefaultlibs)) {
@@ -233,9 +242,9 @@ void wasm::Linker::ConstructJob(Compilation &C, const JobAction &JA,
 /// Given a base library directory, append path components to form the
 /// LTO directory.
 static std::string AppendLTOLibDir(const std::string &Dir) {
-    // The version allows the path to be keyed to the specific version of
-    // LLVM in used, as the bitcode format is not stable.
-    return Dir + "/llvm-lto/" LLVM_VERSION_STRING;
+  // The version allows the path to be keyed to the specific version of
+  // LLVM in used, as the bitcode format is not stable.
+  return Dir + "/llvm-lto/" LLVM_VERSION_STRING;
 }
 
 WebAssembly::WebAssembly(const Driver &D, const llvm::Triple &Triple,
@@ -508,7 +517,8 @@ void WebAssembly::AddClangSystemIncludeArgs(const ArgList &DriverArgs,
   if (getTriple().getOS() != llvm::Triple::UnknownOS) {
     const std::string MultiarchTriple =
         getMultiarchTriple(D, getTriple(), D.SysRoot);
-    addSystemInclude(DriverArgs, CC1Args, D.SysRoot + "/include/" + MultiarchTriple);
+    addSystemInclude(DriverArgs, CC1Args,
+                     D.SysRoot + "/include/" + MultiarchTriple);
   }
   addSystemInclude(DriverArgs, CC1Args, D.SysRoot + "/include");
 }
@@ -637,5 +647,6 @@ void WebAssembly::addLibStdCXXIncludePaths(
   // Second add the generic one.
   addSystemInclude(DriverArgs, CC1Args, LibPath + "/c++/" + Version);
   // Third the backward one.
-  addSystemInclude(DriverArgs, CC1Args, LibPath + "/c++/" + Version + "/backward");
+  addSystemInclude(DriverArgs, CC1Args,
+                   LibPath + "/c++/" + Version + "/backward");
 }
diff --git a/lld/wasm/Config.h b/lld/wasm/Config.h
index 31e08e4e248a4..d291a42da200f 100644
--- a/lld/wasm/Config.h
+++ b/lld/wasm/Config.h
@@ -35,6 +35,7 @@ class Symbol;
 class DefinedData;
 class GlobalSymbol;
 class DefinedFunction;
+class UndefinedFunction;
 class DefinedGlobal;
 class UndefinedGlobal;
 class TableSymbol;
@@ -50,6 +51,8 @@ enum class BuildIdKind { None, Fast, Sha1, Hexstring, Uuid };
 // and such fields have the same name as the corresponding options.
 // Most fields are initialized by the driver.
 struct Config {
+  bool isMultithreaded() const { return sharedMemory || isWasip3; }
+
   bool allowMultipleDefinition;
   bool bsymbolic;
   bool checkFeatures;
@@ -71,6 +74,7 @@ struct Config {
   bool importTable;
   bool importUndefined;
   std::optional<bool> is64;
+  bool isWasip3;
   bool mergeDataSegments;
   bool noinhibitExec;
   bool pie;
@@ -252,6 +256,14 @@ struct Ctx {
     // Used as an address space for function pointers, with each function that
     // is used as a function pointer being allocated a slot.
     TableSymbol *indirectFunctionTable;
+
+    // __wasm_component_model_builtin_context_set_1
+    // Function used to set TLS base in component model modules.
+    UndefinedFunction *contextSet1;
+
+    // __wasm_component_model_builtin_context_get_1
+    // Function used to get TLS base in component model modules.
+    UndefinedFunction *contextGet1;
   };
   WasmSym sym;
 
diff --git a/lld/wasm/Driver.cpp b/lld/wasm/Driver.cpp
index b1e36f2ecff74..6eaacd7288f22 100644
--- a/lld/wasm/Driver.cpp
+++ b/lld/wasm/Driver.cpp
@@ -656,15 +656,16 @@ static void readConfigs(opt::InputArgList &args) {
   ctx.arg.exportDynamic =
       args.hasFlag(OPT_export_dynamic, OPT_no_export_dynamic, ctx.arg.shared);
 
-  // Parse wasm32/64.
+  // Parse wasm32/64 and maybe -wasip3.
   if (auto *arg = args.getLastArg(OPT_m)) {
     StringRef s = arg->getValue();
-    if (s == "wasm32")
+    if (s.starts_with("wasm32"))
       ctx.arg.is64 = false;
-    else if (s == "wasm64")
+    else if (s.starts_with("wasm64"))
       ctx.arg.is64 = true;
     else
       error("invalid target architecture: " + s);
+    ctx.arg.isWasip3 = s.ends_with("-wasip3");
   }
 
   // --threads= takes a positive integer and provides the default value for
@@ -827,6 +828,10 @@ static void checkOptions(opt::InputArgList &args) {
     if (ctx.arg.tableBase)
       error("--table-base may not be used with -shared/-pie");
   }
+
+  if (ctx.arg.sharedMemory && ctx.arg.isWasip3) {
+    error("--shared-memory is incompatible with the wasip3 target");
+  }
 }
 
 static const char *getReproduceOption(opt::InputArgList &args) {
@@ -885,7 +890,7 @@ static void writeWhyExtract() {
 // Equivalent of demote demoteSharedAndLazySymbols() in the ELF linker
 static void demoteLazySymbols() {
   for (Symbol *sym : symtab->symbols()) {
-    if (auto* s = dyn_cast<LazySymbol>(sym)) {
+    if (auto *s = dyn_cast<LazySymbol>(sym)) {
       if (s->signature) {
         LLVM_DEBUG(llvm::dbgs()
                    << "demoting lazy func: " << s->getName() << "\n");
@@ -906,6 +911,18 @@ createUndefinedGlobal(StringRef name, llvm::wasm::WasmGlobalType *type) {
   return sym;
 }
 
+static UndefinedFunction *
+createUndefinedFunction(StringRef name, std::optional<StringRef> importName,
+                        std::optional<StringRef> importModule,
+                        WasmSignature *signature) {
+  auto *sym = cast<UndefinedFunction>(symtab->addUndefinedFunction(
+      name, importName, importModule, WASM_SYMBOL_UNDEFINED, nullptr, signature,
+      true));
+  ctx.arg.allowUndefinedSymbols.insert(sym->getName());
+  sym->isUsedInRegularObj = true;
+  return sym;
+}
+
 static InputGlobal *createGlobal(StringRef name, bool isMutable) {
   llvm::wasm::WasmGlobal wasmGlobal;
   bool is64 = ctx.arg.is64.value_or(false);
@@ -946,11 +963,13 @@ static void createSyntheticSymbols() {
 
   bool is64 = ctx.arg.is64.value_or(false);
 
+  auto stack_pointer_name =
+      ctx.arg.isWasip3 ? "__init_stack_pointer" : "__stack_pointer";
   if (ctx.isPic) {
     ctx.sym.stackPointer =
-        createUndefinedGlobal("__stack_pointer", ctx.arg.is64.value_or(false)
-                                                     ? &mutableGlobalTypeI64
-                                                     : &mutableGlobalTypeI32);
+        createUndefinedGlobal(stack_pointer_name, ctx.arg.is64.value_or(false)
+                                                      ? &mutableGlobalTypeI64
+                                                      : &mutableGlobalTypeI32);
     // For PIC code, we import two global variables (__memory_base and
     // __table_base) from the environment and use these as the offset at
     // which to load our static data and function table.
@@ -963,14 +982,15 @@ static void createSyntheticSymbols() {
     ctx.sym.tableBase->markLive();
   } else {
     // For non-PIC code
-    ctx.sym.stackPointer = createGlobalVariable("__stack_pointer", true);
+    ctx.sym.stackPointer = createGlobalVariable(stack_pointer_name, true);
     ctx.sym.stackPointer->markLive();
   }
 
-  if (ctx.arg.sharedMemory) {
+  if (ctx.arg.isMultithreaded()) {
     // TLS symbols are all hidden/dso-local
-    ctx.sym.tlsBase =
-        createGlobalVariable("__tls_base", true, WASM_SYMBOL_VISIBILITY_HIDDEN);
+    auto tls_base_name = ctx.arg.isWasip3 ? "__init_tls_base" : "__tls_base";
+    ctx.sym.tlsBase = createGlobalVariable(tls_base_name, true,
+                                           WASM_SYMBOL_VISIBILITY_HIDDEN);
     ctx.sym.tlsSize = createGlobalVariable("__tls_size", false,
                                            WASM_SYMBOL_VISIBILITY_HIDDEN);
     ctx.sym.tlsAlign = createGlobalVariable("__tls_align", false,
@@ -979,6 +999,21 @@ static void createSyntheticSymbols() {
         "__wasm_init_tls", WASM_SYMBOL_VISIBILITY_HIDDEN,
         make<SyntheticFunction>(is64 ? i64ArgSignature : i32ArgSignature,
                                 "__wasm_init_tls"));
+    if (ctx.arg.isWasip3) {
+      ctx.sym.tlsBase->markLive();
+      ctx.sym.tlsSize->markLive();
+      ctx.sym.tlsAlign->markLive();
+      static WasmSignature contextSet1Signature{{}, {ValType::I32}};
+      ctx.sym.contextSet1 = createUndefinedFunction(
+          "__wasm_component_model_builtin_context_set_1", "[context-set-1]",
+          "$root", &contextSet1Signature);
+      ctx.sym.contextSet1->markLive();
+      static WasmSignature contextGet1Signature{{ValType::I32}, {}};
+      ctx.sym.contextGet1 = createUndefinedFunction(
+          "__wasm_component_model_builtin_context_get_1", "[context-get-1]",
+          "$root", &contextGet1Signature);
+      ctx.sym.contextGet1->markLive();
+    }
   }
 }
 
@@ -1017,7 +1052,7 @@ static void createOptionalSymbols() {
   //
   // __tls_size and __tls_align are not needed in this case since they are only
   // needed for __wasm_init_tls (which we do not create in this case).
-  if (!ctx.arg.sharedMemory)
+  if (!ctx.arg.sharedMemory && !ctx.arg.isWasip3)
     ctx.sym.tlsBase = createOptionalGlobal("__tls_base", false);
 }
 
@@ -1026,15 +1061,15 @@ static void processStubLibrariesPreLTO() {
   for (auto &stub_file : ctx.stubFiles) {
     LLVM_DEBUG(llvm::dbgs()
                << "processing stub file: " << stub_file->getName() << "\n");
-    for (auto [name, deps]: stub_file->symbolDependencies) {
-      auto* sym = symtab->find(name);
+    for (auto [name, deps] : stub_file->symbolDependencies) {
+      auto *sym = symtab->find(name);
       // If the symbol is not present at all (yet), or if it is present but
       // undefined, then mark the dependent symbols as used by a regular
       // object so they will be preserved and exported by the LTO process.
       if (!sym || sym->isUndefined()) {
         for (const auto dep : deps) {
-          auto* needed = symtab->find(dep);
-          if (needed ) {
+          auto *needed = symtab->find(dep);
+          if (needed) {
             needed->isUsedInRegularObj = true;
             // Like with handleLibcall we have to extract any LTO archive
             // members that might need to be exported due to stub library
diff --git a/lld/wasm/Relocations.cpp b/lld/wasm/Relocations.cpp
index a3f87ea3d69c0..cb597fdeffcf3 100644
--- a/lld/wasm/Relocations.cpp
+++ b/lld/wasm/Relocations.cpp
@@ -33,7 +33,7 @@ static bool requiresGOTAccess(const Symbol *sym) {
   return true;
 }
 
-static bool allowUndefined(const Symbol* sym) {
+static bool allowUndefined(const Symbol *sym) {
   // Symbols that are explicitly imported are always allowed to be undefined at
   // link time.
   if (sym->isImported())
@@ -125,7 +125,7 @@ void scanRelocations(InputChunk *chunk) {
       // In single-threaded builds TLS is lowered away and TLS data can be
       // merged with normal data and allowing TLS relocation in non-TLS
       // segments.
-      if (ctx.arg.sharedMemory) {
+      if (ctx.arg.isMultithreaded()) {
         if (!sym->isTLS()) {
           error(toString(file) + ": relocation " +
                 relocTypeToString(reloc.Type) +
diff --git a/lld/wasm/Symbols.cpp b/lld/wasm/Symbols.cpp
index f2040441e6257..97a9871a06308 100644
--- a/lld/wasm/Symbols.cpp
+++ b/lld/wasm/Symbols.cpp
@@ -95,7 +95,7 @@ WasmSymbolType Symbol::getWasmType() const {
 }
 
 const WasmSignature *Symbol::getSignature() const {
-  if (auto* f = dyn_cast<FunctionSymbol>(this))
+  if (auto *f = dyn_cast<FunctionSymbol>(this))
     return f->signature;
   if (auto *t = dyn_cast<TagSymbol>(this))
     return t->signature;
@@ -223,9 +223,7 @@ bool Symbol::isExportedExplicit() const {
   return forceExport || flags & WASM_SYMBOL_EXPORTED;
 }
 
-bool Symbol::isNoStrip() const {
-  return flags & WASM_SYMBOL_NO_STRIP;
-}
+bool Symbol::isNoStrip() const { return flags & WASM_SYMBOL_NO_STRIP; }
 
 uint32_t FunctionSymbol::getFunctionIndex() const {
   if (const auto *u = dyn_cast<UndefinedFunction>(this))
@@ -413,7 +411,7 @@ void LazySymbol::setWeak() {
   flags |= (flags & ~WASM_SYMBOL_BINDING_MASK) | WASM_SYMBOL_BINDING_WEAK;
 }
 
-void printTraceSymbolUndefined(StringRef name, const InputFile* file) {
+void printTraceSymbolUndefined(StringRef name, const InputFile *file) {
   message(toString(file) + ": reference to " + name);
 }
 
diff --git a/lld/wasm/SyntheticSections.cpp b/lld/wasm/SyntheticSections.cpp
index ede6ac4da77b3..023c690c14354 100644
--- a/lld/wasm/SyntheticSections.cpp
+++ b/lld/wasm/SyntheticSections.cpp
@@ -466,8 +466,7 @@ void GlobalSection::addInternalGOTEntry(Symbol *sym) {
 void GlobalSection::generateRelocationCode(raw_ostream &os, bool TLS) const {
   assert(!ctx.arg.extendedConst);
   bool is64 = ctx.arg.is64.value_or(false);
-  unsigned opcode_ptr_add = is64 ? WASM_OPCODE_I64_ADD
-                                 : WASM_OPCODE_I32_ADD;
+  unsigned opcode_ptr_add = is64 ? WASM_OPCODE_I64_ADD : WASM_OPCODE_I32_ADD;
 
   for (const Symbol *sym : internalGotSymbols) {
     if (TLS != sym->isTLS())
@@ -477,7 +476,7 @@ void GlobalSection::generateRelocationCode(raw_ostream &os, bool TLS) const {
       // Get __memory_base
       writeU8(os, WASM_OPCODE_GLOBAL_GET, "GLOBAL_GET");
       if (sym->isTLS())
-        writeUleb128(os, ctx.sym.tlsBase->getGlobalIndex(), "__tls_base");
+        writeGetTLSBase(ctx, os);
       else
         writeUleb128(os, ctx.sym.memoryBase->getGlobalIndex(), "__memory_base");
 
@@ -520,9 +519,9 @@ void GlobalSection::writeBody() {
       // the correct runtime value during `__wasm_apply_global_relocs`.
       if (!ctx.arg.extendedConst && ctx.isPic && !sym->isTLS())
         mutable_ = true;
-      // With multi-theadeding any TLS globals must be mutable since they get
+      // With multi-threading any TLS globals must be mutable since they get
       // set during `__wasm_apply_global_tls_relocs`
-      if (ctx.arg.sharedMemory && sym->isTLS())
+      if (ctx.arg.isMultithreaded() && sym->isTLS())
         mutable_ = true;
     }
     WasmGlobalType type{itype, mutable_};
@@ -559,10 +558,11 @@ void GlobalSection::writeBody() {
     } else {
       WasmInitExpr initExpr;
       if (auto *d = dyn_cast<DefinedData>(sym))
-        // In the sharedMemory case TLS globals are set during
-        // `__wasm_apply_global_tls_relocs`, but in the non-shared case
+        // In the multi-threaded case, TLS globals are set during
+        // `__wasm_apply_global_tls_relocs`, but in the non-multi-threaded case
         // we know the absolute value at link time.
-        initExpr = intConst(d->getVA(/*absolute=*/!ctx.arg.sharedMemory), is64);
+        initExpr =
+            intConst(d->getVA(/*absolute=*/!ctx.arg.isMultithreaded()), is64);
       else if (auto *f = dyn_cast<FunctionSymbol>(sym))
         initExpr = intConst(f->isStub ? 0 : f->getTableIndex(), is64);
       else {
@@ -646,7 +646,7 @@ void ElemSection::writeBody() {
   uint32_t tableIndex = ctx.arg.tableBase;
   for (const FunctionSymbol *sym : indirectFunctions) {
     assert(sym->getTableIndex() == tableIndex);
-    (void) tableIndex;
+    (void)tableIndex;
     writeUleb128(os, sym->getFunctionIndex(), "function index");
     ++tableIndex;
   }
@@ -663,7 +663,7 @@ void DataCountSection::writeBody() {
 }
 
 bool DataCountSection::isNeeded() const {
-  return numSegments && ctx.arg.sharedMemory;
+  return numSegments && ctx.arg.isMultithreaded();
 }
 
 void LinkingSection::writeBody() {
@@ -992,4 +992,4 @@ void BuildIdSection::writeBuildId(llvm::ArrayRef<uint8_t> buf) {
   memcpy(hashPlaceholderPtr, buf.data(), hashSize);
 }
 
-} // namespace wasm::lld
+} // namespace lld::wasm
diff --git a/lld/wasm/Writer.cpp b/lld/wasm/Writer.cpp
index dfd856f2faee6..50d6449ca79a9 100644
--- a/lld/wasm/Writer.cpp
+++ b/lld/wasm/Writer.cpp
@@ -311,7 +311,8 @@ void Writer::writeBuildId() {
 }
 
 static void setGlobalPtr(DefinedGlobal *g, uint64_t memoryPtr) {
-  LLVM_DEBUG(dbgs() << "setGlobalPtr " << g->getName() << " -> " << memoryPtr << "\n");
+  LLVM_DEBUG(dbgs() << "setGlobalPtr " << g->getName() << " -> " << memoryPtr
+                    << "\n");
   g->global->setPointerValue(memoryPtr);
 }
 
@@ -358,7 +359,8 @@ void Writer::layoutMemory() {
     placeStack();
     if (ctx.arg.globalBase) {
       if (ctx.arg.globalBase < memoryPtr) {
-        error("--global-base cannot be less than stack size when --stack-first is used");
+        error("--global-base cannot be less than stack size when --stack-first "
+              "is used");
         return;
       }
       memoryPtr = ctx.arg.globalBase;
@@ -382,6 +384,7 @@ void Writer::layoutMemory() {
   for (OutputSegment *seg : segments) {
     out.dylinkSec->memAlign = std::max(out.dylinkSec->memAlign, seg->alignment);
     memoryPtr = alignTo(memoryPtr, 1ULL << seg->alignment);
+
     seg->startVA = memoryPtr;
     log(formatv("mem: {0,-15} offset={1,-8} size={2,-8} align={3}", seg->name,
                 memoryPtr, seg->size, seg->alignment));
@@ -1029,7 +1032,7 @@ static StringRef getOutputDataSegmentName(const InputChunk &seg) {
 OutputSegment *Writer::createOutputSegment(StringRef name) {
   LLVM_DEBUG(dbgs() << "new segment: " << name << "\n");
   OutputSegment *s = make<OutputSegment>(name);
-  if (ctx.arg.sharedMemory)
+  if (ctx.arg.isMultithreaded())
     s->initFlags = WASM_DATA_SEGMENT_IS_PASSIVE;
   if (!ctx.arg.relocatable && name.starts_with(".bss"))
     s->isBss = true;
@@ -1163,14 +1166,14 @@ void Writer::createSyntheticInitFunctions() {
         "__wasm_init_memory", WASM_SYMBOL_VISIBILITY_HIDDEN,
         make<SyntheticFunction>(nullSignature, "__wasm_init_memory"));
     ctx.sym.initMemory->mark...
[truncated]

@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Feb 17, 2026

@llvm/pr-subscribers-backend-webassembly

Author: Sy Brand (TartanLlama)

Changes

(Currently in draft, as this will evolve alongside other toolchain component updates)

The WebAssembly Component Model has added support for cooperative multithreading. This has been implemented in the Wasmtime engine and is part of the wider project of WASI preview 3, which is currently tracked here.

These changes will require updating the way that __stack_pointer and __tls_base work purely for a new wasm32-wasip3 target; other targets will not be touched. Specifically, rather than using a Wasm global for tracking the stack pointer and TLS base, the new context.get/set component model builtin functions will be used (the intention being that runtimes will need to aggressively optimize these calls into single load/stores). For justification on this choice rather than switching out the global at context-switch boundaries, see this comment and this comment.


Patch is 55.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/175800.diff

22 Files Affected:

  • (modified) clang/lib/Basic/Targets/WebAssembly.cpp (+2-1)
  • (modified) clang/lib/Driver/ToolChains/WebAssembly.cpp (+22-11)
  • (modified) lld/wasm/Config.h (+12)
  • (modified) lld/wasm/Driver.cpp (+51-16)
  • (modified) lld/wasm/Relocations.cpp (+2-2)
  • (modified) lld/wasm/Symbols.cpp (+3-5)
  • (modified) lld/wasm/SyntheticSections.cpp (+11-11)
  • (modified) lld/wasm/Writer.cpp (+22-15)
  • (modified) lld/wasm/WriterUtils.cpp (+23-1)
  • (modified) lld/wasm/WriterUtils.h (+4)
  • (modified) llvm/include/llvm/MC/MCSymbolWasm.h (+2-6)
  • (modified) llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp (+20-2)
  • (modified) llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h (+121-121)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp (+26-5)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyFrameLowering.cpp (+33-20)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyFrameLowering.h (+3-3)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp (+2-4)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp (+4-17)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyLateEHPrepare.cpp (+1-1)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp (+21-12)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyUtilities.cpp (+18)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyUtilities.h (+9)
diff --git a/clang/lib/Basic/Targets/WebAssembly.cpp b/clang/lib/Basic/Targets/WebAssembly.cpp
index daaefd9a1267c..1905b838e52a1 100644
--- a/clang/lib/Basic/Targets/WebAssembly.cpp
+++ b/clang/lib/Basic/Targets/WebAssembly.cpp
@@ -410,7 +410,8 @@ void WebAssemblyTargetInfo::adjust(DiagnosticsEngine &Diags, LangOptions &Opts,
   // Turn off POSIXThreads and ThreadModel so that we don't predefine _REENTRANT
   // or __STDCPP_THREADS__ if we will eventually end up stripping atomics
   // because they are unsupported.
-  if (!HasAtomics || !HasBulkMemory) {
+  if (getTriple().getOSName() != "wasip3" &&
+      (!HasAtomics || !HasBulkMemory)) {
     Opts.POSIXThreads = false;
     Opts.setThreadModel(LangOptions::ThreadModelKind::Single);
     Opts.ThreadsafeStatics = false;
diff --git a/clang/lib/Driver/ToolChains/WebAssembly.cpp b/clang/lib/Driver/ToolChains/WebAssembly.cpp
index b5fa5760a46a0..efeadcc6556de 100644
--- a/clang/lib/Driver/ToolChains/WebAssembly.cpp
+++ b/clang/lib/Driver/ToolChains/WebAssembly.cpp
@@ -30,13 +30,14 @@ using namespace llvm::opt;
 std::string WebAssembly::getMultiarchTriple(const Driver &D,
                                             const llvm::Triple &TargetTriple,
                                             StringRef SysRoot) const {
-    return (TargetTriple.getArchName() + "-" +
-            TargetTriple.getOSAndEnvironmentName()).str();
+  return (TargetTriple.getArchName() + "-" +
+          TargetTriple.getOSAndEnvironmentName())
+      .str();
 }
 
 std::string wasm::Linker::getLinkerPath(const ArgList &Args) const {
   const ToolChain &ToolChain = getToolChain();
-  if (const Arg* A = Args.getLastArg(options::OPT_fuse_ld_EQ)) {
+  if (const Arg *A = Args.getLastArg(options::OPT_fuse_ld_EQ)) {
     StringRef UseLinker = A->getValue();
     if (!UseLinker.empty()) {
       if (llvm::sys::path::is_absolute(UseLinker) &&
@@ -79,6 +80,10 @@ static bool WantsPthread(const llvm::Triple &Triple, const ArgList &Args) {
   return WantsPthread;
 }
 
+static bool WantsSharedMemory(const llvm::Triple &Triple, const ArgList &Args) {
+  return WantsPthread(Triple, Args) && !TargetBuildsComponents(Triple);
+}
+
 void wasm::Linker::ConstructJob(Compilation &C, const JobAction &JA,
                                 const InputInfo &Output,
                                 const InputInfoList &Inputs,
@@ -90,10 +95,14 @@ void wasm::Linker::ConstructJob(Compilation &C, const JobAction &JA,
   ArgStringList CmdArgs;
 
   CmdArgs.push_back("-m");
+  std::string arch;
   if (ToolChain.getTriple().isArch64Bit())
-    CmdArgs.push_back("wasm64");
+    arch = "wasm64";
   else
-    CmdArgs.push_back("wasm32");
+    arch = "wasm32";
+  if (ToolChain.getTriple().getOSName() == "wasip3")
+    arch += "-wasip3";
+  CmdArgs.push_back(Args.MakeArgString(arch));
 
   if (Args.hasArg(options::OPT_s))
     CmdArgs.push_back("--strip-all");
@@ -160,7 +169,7 @@ void wasm::Linker::ConstructJob(Compilation &C, const JobAction &JA,
 
   AddLinkerInputs(ToolChain, Inputs, Args, CmdArgs, JA);
 
-  if (WantsPthread(ToolChain.getTriple(), Args))
+  if (WantsSharedMemory(ToolChain.getTriple(), Args))
     CmdArgs.push_back("--shared-memory");
 
   if (!Args.hasArg(options::OPT_nostdlib, options::OPT_nodefaultlibs)) {
@@ -233,9 +242,9 @@ void wasm::Linker::ConstructJob(Compilation &C, const JobAction &JA,
 /// Given a base library directory, append path components to form the
 /// LTO directory.
 static std::string AppendLTOLibDir(const std::string &Dir) {
-    // The version allows the path to be keyed to the specific version of
-    // LLVM in used, as the bitcode format is not stable.
-    return Dir + "/llvm-lto/" LLVM_VERSION_STRING;
+  // The version allows the path to be keyed to the specific version of
+  // LLVM in used, as the bitcode format is not stable.
+  return Dir + "/llvm-lto/" LLVM_VERSION_STRING;
 }
 
 WebAssembly::WebAssembly(const Driver &D, const llvm::Triple &Triple,
@@ -508,7 +517,8 @@ void WebAssembly::AddClangSystemIncludeArgs(const ArgList &DriverArgs,
   if (getTriple().getOS() != llvm::Triple::UnknownOS) {
     const std::string MultiarchTriple =
         getMultiarchTriple(D, getTriple(), D.SysRoot);
-    addSystemInclude(DriverArgs, CC1Args, D.SysRoot + "/include/" + MultiarchTriple);
+    addSystemInclude(DriverArgs, CC1Args,
+                     D.SysRoot + "/include/" + MultiarchTriple);
   }
   addSystemInclude(DriverArgs, CC1Args, D.SysRoot + "/include");
 }
@@ -637,5 +647,6 @@ void WebAssembly::addLibStdCXXIncludePaths(
   // Second add the generic one.
   addSystemInclude(DriverArgs, CC1Args, LibPath + "/c++/" + Version);
   // Third the backward one.
-  addSystemInclude(DriverArgs, CC1Args, LibPath + "/c++/" + Version + "/backward");
+  addSystemInclude(DriverArgs, CC1Args,
+                   LibPath + "/c++/" + Version + "/backward");
 }
diff --git a/lld/wasm/Config.h b/lld/wasm/Config.h
index 31e08e4e248a4..d291a42da200f 100644
--- a/lld/wasm/Config.h
+++ b/lld/wasm/Config.h
@@ -35,6 +35,7 @@ class Symbol;
 class DefinedData;
 class GlobalSymbol;
 class DefinedFunction;
+class UndefinedFunction;
 class DefinedGlobal;
 class UndefinedGlobal;
 class TableSymbol;
@@ -50,6 +51,8 @@ enum class BuildIdKind { None, Fast, Sha1, Hexstring, Uuid };
 // and such fields have the same name as the corresponding options.
 // Most fields are initialized by the driver.
 struct Config {
+  bool isMultithreaded() const { return sharedMemory || isWasip3; }
+
   bool allowMultipleDefinition;
   bool bsymbolic;
   bool checkFeatures;
@@ -71,6 +74,7 @@ struct Config {
   bool importTable;
   bool importUndefined;
   std::optional<bool> is64;
+  bool isWasip3;
   bool mergeDataSegments;
   bool noinhibitExec;
   bool pie;
@@ -252,6 +256,14 @@ struct Ctx {
     // Used as an address space for function pointers, with each function that
     // is used as a function pointer being allocated a slot.
     TableSymbol *indirectFunctionTable;
+
+    // __wasm_component_model_builtin_context_set_1
+    // Function used to set TLS base in component model modules.
+    UndefinedFunction *contextSet1;
+
+    // __wasm_component_model_builtin_context_get_1
+    // Function used to get TLS base in component model modules.
+    UndefinedFunction *contextGet1;
   };
   WasmSym sym;
 
diff --git a/lld/wasm/Driver.cpp b/lld/wasm/Driver.cpp
index b1e36f2ecff74..6eaacd7288f22 100644
--- a/lld/wasm/Driver.cpp
+++ b/lld/wasm/Driver.cpp
@@ -656,15 +656,16 @@ static void readConfigs(opt::InputArgList &args) {
   ctx.arg.exportDynamic =
       args.hasFlag(OPT_export_dynamic, OPT_no_export_dynamic, ctx.arg.shared);
 
-  // Parse wasm32/64.
+  // Parse wasm32/64 and maybe -wasip3.
   if (auto *arg = args.getLastArg(OPT_m)) {
     StringRef s = arg->getValue();
-    if (s == "wasm32")
+    if (s.starts_with("wasm32"))
       ctx.arg.is64 = false;
-    else if (s == "wasm64")
+    else if (s.starts_with("wasm64"))
       ctx.arg.is64 = true;
     else
       error("invalid target architecture: " + s);
+    ctx.arg.isWasip3 = s.ends_with("-wasip3");
   }
 
   // --threads= takes a positive integer and provides the default value for
@@ -827,6 +828,10 @@ static void checkOptions(opt::InputArgList &args) {
     if (ctx.arg.tableBase)
       error("--table-base may not be used with -shared/-pie");
   }
+
+  if (ctx.arg.sharedMemory && ctx.arg.isWasip3) {
+    error("--shared-memory is incompatible with the wasip3 target");
+  }
 }
 
 static const char *getReproduceOption(opt::InputArgList &args) {
@@ -885,7 +890,7 @@ static void writeWhyExtract() {
 // Equivalent of demote demoteSharedAndLazySymbols() in the ELF linker
 static void demoteLazySymbols() {
   for (Symbol *sym : symtab->symbols()) {
-    if (auto* s = dyn_cast<LazySymbol>(sym)) {
+    if (auto *s = dyn_cast<LazySymbol>(sym)) {
       if (s->signature) {
         LLVM_DEBUG(llvm::dbgs()
                    << "demoting lazy func: " << s->getName() << "\n");
@@ -906,6 +911,18 @@ createUndefinedGlobal(StringRef name, llvm::wasm::WasmGlobalType *type) {
   return sym;
 }
 
+static UndefinedFunction *
+createUndefinedFunction(StringRef name, std::optional<StringRef> importName,
+                        std::optional<StringRef> importModule,
+                        WasmSignature *signature) {
+  auto *sym = cast<UndefinedFunction>(symtab->addUndefinedFunction(
+      name, importName, importModule, WASM_SYMBOL_UNDEFINED, nullptr, signature,
+      true));
+  ctx.arg.allowUndefinedSymbols.insert(sym->getName());
+  sym->isUsedInRegularObj = true;
+  return sym;
+}
+
 static InputGlobal *createGlobal(StringRef name, bool isMutable) {
   llvm::wasm::WasmGlobal wasmGlobal;
   bool is64 = ctx.arg.is64.value_or(false);
@@ -946,11 +963,13 @@ static void createSyntheticSymbols() {
 
   bool is64 = ctx.arg.is64.value_or(false);
 
+  auto stack_pointer_name =
+      ctx.arg.isWasip3 ? "__init_stack_pointer" : "__stack_pointer";
   if (ctx.isPic) {
     ctx.sym.stackPointer =
-        createUndefinedGlobal("__stack_pointer", ctx.arg.is64.value_or(false)
-                                                     ? &mutableGlobalTypeI64
-                                                     : &mutableGlobalTypeI32);
+        createUndefinedGlobal(stack_pointer_name, ctx.arg.is64.value_or(false)
+                                                      ? &mutableGlobalTypeI64
+                                                      : &mutableGlobalTypeI32);
     // For PIC code, we import two global variables (__memory_base and
     // __table_base) from the environment and use these as the offset at
     // which to load our static data and function table.
@@ -963,14 +982,15 @@ static void createSyntheticSymbols() {
     ctx.sym.tableBase->markLive();
   } else {
     // For non-PIC code
-    ctx.sym.stackPointer = createGlobalVariable("__stack_pointer", true);
+    ctx.sym.stackPointer = createGlobalVariable(stack_pointer_name, true);
     ctx.sym.stackPointer->markLive();
   }
 
-  if (ctx.arg.sharedMemory) {
+  if (ctx.arg.isMultithreaded()) {
     // TLS symbols are all hidden/dso-local
-    ctx.sym.tlsBase =
-        createGlobalVariable("__tls_base", true, WASM_SYMBOL_VISIBILITY_HIDDEN);
+    auto tls_base_name = ctx.arg.isWasip3 ? "__init_tls_base" : "__tls_base";
+    ctx.sym.tlsBase = createGlobalVariable(tls_base_name, true,
+                                           WASM_SYMBOL_VISIBILITY_HIDDEN);
     ctx.sym.tlsSize = createGlobalVariable("__tls_size", false,
                                            WASM_SYMBOL_VISIBILITY_HIDDEN);
     ctx.sym.tlsAlign = createGlobalVariable("__tls_align", false,
@@ -979,6 +999,21 @@ static void createSyntheticSymbols() {
         "__wasm_init_tls", WASM_SYMBOL_VISIBILITY_HIDDEN,
         make<SyntheticFunction>(is64 ? i64ArgSignature : i32ArgSignature,
                                 "__wasm_init_tls"));
+    if (ctx.arg.isWasip3) {
+      ctx.sym.tlsBase->markLive();
+      ctx.sym.tlsSize->markLive();
+      ctx.sym.tlsAlign->markLive();
+      static WasmSignature contextSet1Signature{{}, {ValType::I32}};
+      ctx.sym.contextSet1 = createUndefinedFunction(
+          "__wasm_component_model_builtin_context_set_1", "[context-set-1]",
+          "$root", &contextSet1Signature);
+      ctx.sym.contextSet1->markLive();
+      static WasmSignature contextGet1Signature{{ValType::I32}, {}};
+      ctx.sym.contextGet1 = createUndefinedFunction(
+          "__wasm_component_model_builtin_context_get_1", "[context-get-1]",
+          "$root", &contextGet1Signature);
+      ctx.sym.contextGet1->markLive();
+    }
   }
 }
 
@@ -1017,7 +1052,7 @@ static void createOptionalSymbols() {
   //
   // __tls_size and __tls_align are not needed in this case since they are only
   // needed for __wasm_init_tls (which we do not create in this case).
-  if (!ctx.arg.sharedMemory)
+  if (!ctx.arg.sharedMemory && !ctx.arg.isWasip3)
     ctx.sym.tlsBase = createOptionalGlobal("__tls_base", false);
 }
 
@@ -1026,15 +1061,15 @@ static void processStubLibrariesPreLTO() {
   for (auto &stub_file : ctx.stubFiles) {
     LLVM_DEBUG(llvm::dbgs()
                << "processing stub file: " << stub_file->getName() << "\n");
-    for (auto [name, deps]: stub_file->symbolDependencies) {
-      auto* sym = symtab->find(name);
+    for (auto [name, deps] : stub_file->symbolDependencies) {
+      auto *sym = symtab->find(name);
       // If the symbol is not present at all (yet), or if it is present but
       // undefined, then mark the dependent symbols as used by a regular
       // object so they will be preserved and exported by the LTO process.
       if (!sym || sym->isUndefined()) {
         for (const auto dep : deps) {
-          auto* needed = symtab->find(dep);
-          if (needed ) {
+          auto *needed = symtab->find(dep);
+          if (needed) {
             needed->isUsedInRegularObj = true;
             // Like with handleLibcall we have to extract any LTO archive
             // members that might need to be exported due to stub library
diff --git a/lld/wasm/Relocations.cpp b/lld/wasm/Relocations.cpp
index a3f87ea3d69c0..cb597fdeffcf3 100644
--- a/lld/wasm/Relocations.cpp
+++ b/lld/wasm/Relocations.cpp
@@ -33,7 +33,7 @@ static bool requiresGOTAccess(const Symbol *sym) {
   return true;
 }
 
-static bool allowUndefined(const Symbol* sym) {
+static bool allowUndefined(const Symbol *sym) {
   // Symbols that are explicitly imported are always allowed to be undefined at
   // link time.
   if (sym->isImported())
@@ -125,7 +125,7 @@ void scanRelocations(InputChunk *chunk) {
       // In single-threaded builds TLS is lowered away and TLS data can be
       // merged with normal data and allowing TLS relocation in non-TLS
       // segments.
-      if (ctx.arg.sharedMemory) {
+      if (ctx.arg.isMultithreaded()) {
         if (!sym->isTLS()) {
           error(toString(file) + ": relocation " +
                 relocTypeToString(reloc.Type) +
diff --git a/lld/wasm/Symbols.cpp b/lld/wasm/Symbols.cpp
index f2040441e6257..97a9871a06308 100644
--- a/lld/wasm/Symbols.cpp
+++ b/lld/wasm/Symbols.cpp
@@ -95,7 +95,7 @@ WasmSymbolType Symbol::getWasmType() const {
 }
 
 const WasmSignature *Symbol::getSignature() const {
-  if (auto* f = dyn_cast<FunctionSymbol>(this))
+  if (auto *f = dyn_cast<FunctionSymbol>(this))
     return f->signature;
   if (auto *t = dyn_cast<TagSymbol>(this))
     return t->signature;
@@ -223,9 +223,7 @@ bool Symbol::isExportedExplicit() const {
   return forceExport || flags & WASM_SYMBOL_EXPORTED;
 }
 
-bool Symbol::isNoStrip() const {
-  return flags & WASM_SYMBOL_NO_STRIP;
-}
+bool Symbol::isNoStrip() const { return flags & WASM_SYMBOL_NO_STRIP; }
 
 uint32_t FunctionSymbol::getFunctionIndex() const {
   if (const auto *u = dyn_cast<UndefinedFunction>(this))
@@ -413,7 +411,7 @@ void LazySymbol::setWeak() {
   flags |= (flags & ~WASM_SYMBOL_BINDING_MASK) | WASM_SYMBOL_BINDING_WEAK;
 }
 
-void printTraceSymbolUndefined(StringRef name, const InputFile* file) {
+void printTraceSymbolUndefined(StringRef name, const InputFile *file) {
   message(toString(file) + ": reference to " + name);
 }
 
diff --git a/lld/wasm/SyntheticSections.cpp b/lld/wasm/SyntheticSections.cpp
index ede6ac4da77b3..023c690c14354 100644
--- a/lld/wasm/SyntheticSections.cpp
+++ b/lld/wasm/SyntheticSections.cpp
@@ -466,8 +466,7 @@ void GlobalSection::addInternalGOTEntry(Symbol *sym) {
 void GlobalSection::generateRelocationCode(raw_ostream &os, bool TLS) const {
   assert(!ctx.arg.extendedConst);
   bool is64 = ctx.arg.is64.value_or(false);
-  unsigned opcode_ptr_add = is64 ? WASM_OPCODE_I64_ADD
-                                 : WASM_OPCODE_I32_ADD;
+  unsigned opcode_ptr_add = is64 ? WASM_OPCODE_I64_ADD : WASM_OPCODE_I32_ADD;
 
   for (const Symbol *sym : internalGotSymbols) {
     if (TLS != sym->isTLS())
@@ -477,7 +476,7 @@ void GlobalSection::generateRelocationCode(raw_ostream &os, bool TLS) const {
       // Get __memory_base
       writeU8(os, WASM_OPCODE_GLOBAL_GET, "GLOBAL_GET");
       if (sym->isTLS())
-        writeUleb128(os, ctx.sym.tlsBase->getGlobalIndex(), "__tls_base");
+        writeGetTLSBase(ctx, os);
       else
         writeUleb128(os, ctx.sym.memoryBase->getGlobalIndex(), "__memory_base");
 
@@ -520,9 +519,9 @@ void GlobalSection::writeBody() {
       // the correct runtime value during `__wasm_apply_global_relocs`.
       if (!ctx.arg.extendedConst && ctx.isPic && !sym->isTLS())
         mutable_ = true;
-      // With multi-theadeding any TLS globals must be mutable since they get
+      // With multi-threading any TLS globals must be mutable since they get
       // set during `__wasm_apply_global_tls_relocs`
-      if (ctx.arg.sharedMemory && sym->isTLS())
+      if (ctx.arg.isMultithreaded() && sym->isTLS())
         mutable_ = true;
     }
     WasmGlobalType type{itype, mutable_};
@@ -559,10 +558,11 @@ void GlobalSection::writeBody() {
     } else {
       WasmInitExpr initExpr;
       if (auto *d = dyn_cast<DefinedData>(sym))
-        // In the sharedMemory case TLS globals are set during
-        // `__wasm_apply_global_tls_relocs`, but in the non-shared case
+        // In the multi-threaded case, TLS globals are set during
+        // `__wasm_apply_global_tls_relocs`, but in the non-multi-threaded case
         // we know the absolute value at link time.
-        initExpr = intConst(d->getVA(/*absolute=*/!ctx.arg.sharedMemory), is64);
+        initExpr =
+            intConst(d->getVA(/*absolute=*/!ctx.arg.isMultithreaded()), is64);
       else if (auto *f = dyn_cast<FunctionSymbol>(sym))
         initExpr = intConst(f->isStub ? 0 : f->getTableIndex(), is64);
       else {
@@ -646,7 +646,7 @@ void ElemSection::writeBody() {
   uint32_t tableIndex = ctx.arg.tableBase;
   for (const FunctionSymbol *sym : indirectFunctions) {
     assert(sym->getTableIndex() == tableIndex);
-    (void) tableIndex;
+    (void)tableIndex;
     writeUleb128(os, sym->getFunctionIndex(), "function index");
     ++tableIndex;
   }
@@ -663,7 +663,7 @@ void DataCountSection::writeBody() {
 }
 
 bool DataCountSection::isNeeded() const {
-  return numSegments && ctx.arg.sharedMemory;
+  return numSegments && ctx.arg.isMultithreaded();
 }
 
 void LinkingSection::writeBody() {
@@ -992,4 +992,4 @@ void BuildIdSection::writeBuildId(llvm::ArrayRef<uint8_t> buf) {
   memcpy(hashPlaceholderPtr, buf.data(), hashSize);
 }
 
-} // namespace wasm::lld
+} // namespace lld::wasm
diff --git a/lld/wasm/Writer.cpp b/lld/wasm/Writer.cpp
index dfd856f2faee6..50d6449ca79a9 100644
--- a/lld/wasm/Writer.cpp
+++ b/lld/wasm/Writer.cpp
@@ -311,7 +311,8 @@ void Writer::writeBuildId() {
 }
 
 static void setGlobalPtr(DefinedGlobal *g, uint64_t memoryPtr) {
-  LLVM_DEBUG(dbgs() << "setGlobalPtr " << g->getName() << " -> " << memoryPtr << "\n");
+  LLVM_DEBUG(dbgs() << "setGlobalPtr " << g->getName() << " -> " << memoryPtr
+                    << "\n");
   g->global->setPointerValue(memoryPtr);
 }
 
@@ -358,7 +359,8 @@ void Writer::layoutMemory() {
     placeStack();
     if (ctx.arg.globalBase) {
       if (ctx.arg.globalBase < memoryPtr) {
-        error("--global-base cannot be less than stack size when --stack-first is used");
+        error("--global-base cannot be less than stack size when --stack-first "
+              "is used");
         return;
       }
       memoryPtr = ctx.arg.globalBase;
@@ -382,6 +384,7 @@ void Writer::layoutMemory() {
   for (OutputSegment *seg : segments) {
     out.dylinkSec->memAlign = std::max(out.dylinkSec->memAlign, seg->alignment);
     memoryPtr = alignTo(memoryPtr, 1ULL << seg->alignment);
+
     seg->startVA = memoryPtr;
     log(formatv("mem: {0,-15} offset={1,-8} size={2,-8} align={3}", seg->name,
                 memoryPtr, seg->size, seg->alignment));
@@ -1029,7 +1032,7 @@ static StringRef getOutputDataSegmentName(const InputChunk &seg) {
 OutputSegment *Writer::createOutputSegment(StringRef name) {
   LLVM_DEBUG(dbgs() << "new segment: " << name << "\n");
   OutputSegment *s = make<OutputSegment>(name);
-  if (ctx.arg.sharedMemory)
+  if (ctx.arg.isMultithreaded())
     s->initFlags = WASM_DATA_SEGMENT_IS_PASSIVE;
   if (!ctx.arg.relocatable && name.starts_with(".bss"))
     s->isBss = true;
@@ -1163,14 +1166,14 @@ void Writer::createSyntheticInitFunctions() {
         "__wasm_init_memory", WASM_SYMBOL_VISIBILITY_HIDDEN,
         make<SyntheticFunction>(nullSignature, "__wasm_init_memory"));
     ctx.sym.initMemory->mark...
[truncated]

@TartanLlama
Copy link
Copy Markdown
Contributor Author

@alexcrichton I've factored out the WASIP3 changes into feature flags and linker options like --component-model-thread-context. I'll check what happens on the second point and try and get a decent error story in place.

@TartanLlama
Copy link
Copy Markdown
Contributor Author

@alexcrichton I've made it so that if you try to link an object file compiled with/without the component-model-thread-context feature and specify/omit the --component-model-thread-context linker flag incorrectly, you get one of the following messages:

// if you link a WASIP3 ABI object file without --component-model-thread-context
component-model-thread-context feature used by <file> but --component-model-thread-context not specified.

// if you link a pre-WASIP3 ABI object file with --component-model-thread-context
--component-model-thread-context is disallowed by <file> because it was not compiled with the 'component-model-thread-context' feature.

@TartanLlama
Copy link
Copy Markdown
Contributor Author

I'll progressively add tests to this, but the core functionality is ready for review

@TartanLlama
Copy link
Copy Markdown
Contributor Author

@sbc100 any more thoughts on this? Maybe we can set some time to chat about this synchronously?

@sbc100
Copy link
Copy Markdown
Collaborator

sbc100 commented Mar 23, 2026

@sbc100 any more thoughts on this? Maybe we can set some time to chat about this synchronously?

Yes I think it would be great if you could join the Wasm Tools char we have with @sunfishcode on Mondays. @dschuff can invite you I think?

Copy link
Copy Markdown
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love that the wasm-ld patch is now smaller.

I'm still not totally sure about the new --component-model-thread-context command line flag or the internal componentModelThreadContext naming, especially now that we don't have compontent model in the names of the new functions we are using.

}

// Create synthetic symbols that rely on information that is only available
// after LTO, e.g. __stack_pointer, __tls_base.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this need split needed now but not before?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the thread context ABI must have been determined prior to creating these symbols so that we know which ones to create, and we could have a case where every object file is an LTO bitcode file, in which case we'd need to wait until after LTO to determine the ABI. However, the __wasm_create_ctors function is needed earlier in the process, so I split that out

}

static bool WantsSharedMemory(const llvm::Triple &Triple, const ArgList &Args) {
return WantsPthread(Triple, Args) &&
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this means the the component model green threads is passing -pthread? I would hope that in this mode we could make WantsPthread return false?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The WASIP3 green threads for wasi-libc is implemented as pthreads

"\"[context-set-1]\"\n"
"local.get 0\n"
"call __wasm_component_model_builtin_context_set_1");
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this whole file is just asm perhaps it would be more readable if you just make it into a .S file? (i.e. assembly with pre-processor enabled).

Copy link
Copy Markdown
Contributor Author

@TartanLlama TartanLlama Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I did that originally but hit the issue that the build system for compiler-rt doesn't set --target for .S files, which means the #ifdef doesn't fire. There's a comment above here that tries to explain.

It may be that the changes required for the build system would be minimal, but there's a lot of CMake that deals with setting the CFLAGS correctly, so I figured I'd take the simpler path for now to minimise changes (especially since this could affect other targets). We could address it in a follow-up PR.

continue;
}
// Treat this as using the globals ABI
usesComponentModelThreadContext = false;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just assume that all objects files without the component-model-thread-context feature don't use this API?

i.e. why do we need to check for the __stack_pointer import here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can assume that all object files without that feature don't use this ABI, but we can't assume that they are compatible.

The categories I see are:

  1. Files that do not use a guest stack at all, and will not disallow component-model-thread-context. These will work with either ABI.
  2. Files that use a guest stack and disallow component-model-thread-context. These are incompatible with the new ABI and will be diagnosed based on the feature mismatch.
  3. Files that use a guest stack but do not disallow component-model-thread-context (e.g. -mcpu mvp, files produced for wasm32-wasip3 with an older toolchain). These are incompatible with the new ABI, but will not be diagnosed based on features, but we really want a linker error with a message that can be searched if, for example, a user tries to link object files produced by a version of rustc using an older LLVM with ones produced by a newer wasi-sdk. This is the case for which we do this check: the presence of __stack_pointer is used as a proxy for "this object file is using the global ABI".

ThreadContextABI threadContextABI = ThreadContextABI::Undetermined;

for (ObjFile *obj : files) {
auto targetFeatures = obj->getWasmObj()->getTargetFeatures();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general we process target features of the input object in Writer::populateTargetFeatures. We also generate errors about incompatible object files here.

is there some way to integrate this code in to that existing location? (or maybe have it called from there?) Or is it too late the game?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's where I originally put this, but it's too late unfortunately, as we need this code to decide which symbols to create.

@TartanLlama
Copy link
Copy Markdown
Contributor Author

TartanLlama commented Mar 24, 2026

I'm still not totally sure about the new --component-model-thread-context command line flag or the internal componentModelThreadContext naming, especially now that we don't have compontent model in the names of the new functions we are using.

To clarify, do you mean this just for wasm-ld or also for the Clang/LLVM patches?

For the linker changes, I think there are two separate types of modification:

  1. Changes for using compiler-rt functions instead of __stack_pointer and __tls_base globals. These form the bulk of the changes.
  2. Changes related to the threaded nature of WASIP3, such as here where there is some customisation of the passive segments or here where I allow allow TLS without atomics.

Since the changes in 2 are unrelated to those in 1, I think if we're going to generalise them to remove the component-model-ness, they should be under separate configuration names.

How does this proposal sound:

  • Remove the Ctx::componentModelThreadContext member.
  • Introduce a Ctx::externThreadBuiltins member (we can bikeshed the name) and refactor all changes in 1 to use that.
  • Introduce a Ctx::threadModel member, with the type enum class ThreadModel { Single, SharedMemory, Cooperative }; (again, we can bikeshed these names) and refactor all changes in 2 to use that.
  • Rename ComponentModelThreadContext across the Clang/LLVM changes to ComponentModelThreading to better reflect that it's not just the context storage mechanism that is affected.
  • Compute the values of Ctx::externThreadBuiltins and Ctx::threadModel based on the presence of the component-model-threading feature and -shared-memory flag.

This way, everything becomes more generic, the isMultithreaded check becomes just Ctx::threadModel != ThreadModel::single, and the component-model-specificity disappears after the feature is checked.

It would also be possible to remove all component-model-specificity from the linker by changing the WASM feature name to extern-thread-builtins and adding a -thread-model <model> option to wasm-ld that could be set to cooperative. With this change, Ctx::externThreadBuiltins would be set entirely based on the extern-thread-builtins WASM feature, and Ctx::threadModel entirely based on the -thread-model flag. If desired, we could also make this change for Clang/LLVM, so that two flags (-mextern-thread-builtins and... I'm not sure, -mcooperative-threading?) would control everything, would be set by default on WASIp3, and for now would be required to either both be set or neither be set.

I think some of this is complicated by the -shared-memory flag doing double-duty as "memories should be marked as shared" and "I'm doing shared memory multi-threading". Ideally these would be teased out, but I think that could be the subject of a future PR.

@TartanLlama
Copy link
Copy Markdown
Contributor Author

@sbc100 I've stripped out all of the changes related to cooperative threading from this PR and left only the parts related to making library calls for thread context, which I've put under a new -mlibcall-thread-context clang flag and --libcall-thread-context linker flag. I also moved the implementation of the library functions to wasi-libc. As such, the patch is now significantly smaller and more general.

Having a linker flag actually simplifies the implementation, because it no longer has to look thread the object files to determine which synthetic symbols to create. This lets me do the feature checks in the usual place in Writer.cpp rather than in Driver.cpp and avoids splitting the synthetic symbol creation into pre- and post-LTO stages. This also minimises changes needed to existing tests, which were otherwise seeing a bunch of definitions shuffled around.

With these refactors, there is no longer anything component-model specific in this PR. I haven't quite got my pthreads test suite passing with this yet, and I'll need a small follow up PR for some cooperative-threading-specific changes, but hopefully this version of the PR feels more maintainable to you.

Copy link
Copy Markdown
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! This PR seems far less invasive now, especially the wasm-ld parts.

InputFunction *function) {
LLVM_DEBUG(dbgs() << "addSyntheticFunction: " << name << "\n");
assert(!find(name));
assert(!find(name) || find(name)->isUndefined());
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is needed because we now have calls to addSynthetixXXX later in the linker process?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it was, but since I've now removed the need for that, I can also revert this change

writeU8(os, WASM_OPCODE_GLOBAL_GET, "GLOBAL_SET");
writeUleb128(os, ctx.sym.tlsBase->getGlobalIndex(), "__tls_base");
}
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep the empty line here?

sym->kind() == Symbol::UndefinedGlobalKind &&
sym->importModule && sym->importModule == "env";
})) {
disallowed.insert({"libcall-thread-context", std::string(fileName)});
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like this. This way the error message about the incompatibility will still like to the offending file.

} else {
writeU8(os, WASM_OPCODE_GLOBAL_SET, "global.set");
writeUleb128(os, ctx.sym.tlsBase->getGlobalIndex(), "global index");
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this just be writeSetTLSBase?

}
writeU8(os, WASM_OPCODE_GLOBAL_SET, "GLOBAL_SET");
writeUleb128(os, ctx.sym.tlsBase->getGlobalIndex(), "__tls_base");
writeSetTLSBase(ctx, os);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused about that is the point of the __init_tls_base global if we are calling out to the libcall here?

HelpText<"Initial size of the linear memory">;

def libcall_thread_context: FF<"libcall-thread-context">,
HelpText<"Use library calls for thread context access instead of globals.">;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we still need this flag if the object files feature section contains this flag?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:WebAssembly clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" lld:wasm lld llvm:mc Machine (object) code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants