Skip to content

Add ACL ownership augmentation and AND-slice semantics to C++ runtime#704

Open
simonhollis wants to merge 2 commits intofacebookincubator:mainfrom
simonhollis:export-D97111684
Open

Add ACL ownership augmentation and AND-slice semantics to C++ runtime#704
simonhollis wants to merge 2 commits intofacebookincubator:mainfrom
simonhollis:export-D97111684

Conversation

@simonhollis
Copy link
Copy Markdown
Contributor

Summary:

What has changed

Core C++ runtime changes to support ACL enforcement via the ownership infrastructure:

  1. New acl.h/acl.cpp with augmentOwnershipWithACL() — walks computed ownership
    facts and ANDs each fact's existing owner Uset with a CNF Uset built from the
    unit's ACL group assignments (OR within levels, AND across levels).
  2. slice.h visibility semantics changed from first-match-returns to AND-all:
    when multiple slices cover the same UsetId, ALL must agree the fact is visible.
  3. setu32.h moves append(const_iterator, const_iterator) from private to public,
    adds from(std::set<uint32_t>) factory, and changes foreach iterator to const.
  4. serialize.h adds StringList, StringMap variant types to support the new thrift
    acl_config map<string, list> serialization.

Why the change?

The ownership system already tracks which units own which facts via Usets. ACL
enforcement piggybacks on this: each unit gets an ACL CNF constraint (AND of OR-groups),
which is ANDed with the unit's existing ownership. At query time, the user's groups
form a slice, and the AND-slice semantics ensure a fact is only visible if BOTH
the ownership slice AND the ACL slice agree. This avoids a separate ACL-filtering
pass and reuses the existing ownership visibility infrastructure.

Most important changes

  • acl.cpp:56-187 - augmentOwnershipWithACL() implementation: builds UnitId→CNF map,
    resolves promoted Usets to leaf UnitIds, walks facts with memoization cache
  • acl.cpp:24-47 - makeOrSet/makeAndSet helpers for CNF construction
  • acl.h:23-51 - New ACLConfigEntry, UnitACLAssignment structs
  • slice.h:83-97 - CRITICAL: Changed from first-match return to AND-all semantics.
    Old code returned immediately on first slice match; new code requires ALL covering
    slices to return visible=true. Non-overlapping slices behave identically to before.
  • setu32.h:474 - append(const_iterator, const_iterator) moved to public interface
  • serialize.h:213-214 - New StringList/StringMap types added to Field variant
  • serialize.h:307-315 - String and StringMap serialization branches added

Differential Revision: D97111684

Summary:
# What has changed
Thrift interfaces gain new fields for ACL support across three areas:
1. Server config gets a new ACLPolicy struct and optional acl_policy field
2. Write path: Batch and SendJsonBatch structs gain acl_config maps (path → group name lists)
3. Query path: UserQueryClientInfo gains acl_group_names field for string-based group names with server-side resolution
4. New ACLConfigConflict exception for detecting conflicting ACL mappings during writes

# Why the change?
ACL enforcement requires passing ACL configuration data between indexers, the write
server, and the query server via thrift. Indexers must attach per-path ACL group
assignments to batches during writes, and clients must pass their ACL group
memberships at query time so the server can filter results. Group names are
resolved server-side to numeric IDs using the DB's name-to-ID mapping.

# Most important changes
- `server_config.thrift` - New ACLPolicy struct with enabled_repos list and acl_policy field on Config
- `glean.thrift` - acl_config field on Batch struct (map<string, list<string>>)
- `glean.thrift` - New ACLConfigConflict exception with conflicting_key/value diagnostics
- `glean.thrift` - acl_group_names field on UserQueryClientInfo for server-side name resolution
- `glean.thrift` - acl_config field on SendJsonBatch struct

Differential Revision: D97111677
Summary:
# What has changed
Core C++ runtime changes to support ACL enforcement via the ownership infrastructure:
1. New `acl.h/acl.cpp` with `augmentOwnershipWithACL()` — walks computed ownership
   facts and ANDs each fact's existing owner Uset with a CNF Uset built from the
   unit's ACL group assignments (OR within levels, AND across levels).
2. `slice.h` visibility semantics changed from first-match-returns to AND-all:
   when multiple slices cover the same UsetId, ALL must agree the fact is visible.
3. `setu32.h` moves `append(const_iterator, const_iterator)` from private to public,
   adds `from(std::set<uint32_t>)` factory, and changes `foreach` iterator to const.
4. `serialize.h` adds StringList, StringMap variant types to support the new thrift
   acl_config map<string, list<string>> serialization.

# Why the change?
The ownership system already tracks which units own which facts via Usets. ACL
enforcement piggybacks on this: each unit gets an ACL CNF constraint (AND of OR-groups),
which is ANDed with the unit's existing ownership. At query time, the user's groups
form a slice, and the AND-slice semantics ensure a fact is only visible if BOTH
the ownership slice AND the ACL slice agree. This avoids a separate ACL-filtering
pass and reuses the existing ownership visibility infrastructure.

# Most important changes
- `acl.cpp:56-187` - `augmentOwnershipWithACL()` implementation: builds UnitId→CNF map,
  resolves promoted Usets to leaf UnitIds, walks facts with memoization cache
- `acl.cpp:24-47` - `makeOrSet`/`makeAndSet` helpers for CNF construction
- `acl.h:23-51` - New ACLConfigEntry, UnitACLAssignment structs
- `slice.h:83-97` - **CRITICAL**: Changed from first-match return to AND-all semantics.
  Old code returned immediately on first slice match; new code requires ALL covering
  slices to return visible=true. Non-overlapping slices behave identically to before.
- `setu32.h:474` - `append(const_iterator, const_iterator)` moved to public interface
- `serialize.h:213-214` - New StringList/StringMap types added to Field variant
- `serialize.h:307-315` - String and StringMap serialization branches added

Differential Revision: D97111684
@netlify
Copy link
Copy Markdown

netlify bot commented Mar 18, 2026

Deploy Preview for fb-oss-glean canceled.

Name Link
🔨 Latest commit 5dfb22a
🔍 Latest deploy log https://app.netlify.com/projects/fb-oss-glean/deploys/69bae09ea2060a000971d37e

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 18, 2026
@meta-codesync
Copy link
Copy Markdown
Contributor

meta-codesync bot commented Mar 18, 2026

@simonhollis has exported this pull request. If you are a Meta employee, you can view the originating Diff in D97111684.

@simonhollis
Copy link
Copy Markdown
Contributor Author

Putting this PR up to solicit feedback on a new ACL functionality for glean that records directories files are in when indexed and guards visibility of facts on query based on ACL group identities.

The purpose of this PR is not to ask to merge code, but to expose current ideas in slice.h for how to combine ownership and ACL information at query time. Also, feedback on the approach in new acl.cpp and acl.h to construct new ACL-related expressions for combination with the existing ownership logic appreciated.

@simonhollis simonhollis requested a review from simonmar March 18, 2026 17:32
Copy link
Copy Markdown
Collaborator

@simonmar simonmar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the slice changes

OK - but see comment

On the overall approach of augmentOwnershipWithACL

What you have looks basically sound (modulo a few comments I made inline).

If it were me, I would try to move as much of this out of the server and into the indexer though - ideally Glean should know nothing about ACLs, and it should provide generic functionality that could be used to support multiple different ACL policies. Here we're baking in a policy that is file/directory prefix-based, which seems too specific. However, this is not a hill I will die on, just a matter of trying to design something as generic as possible.

41: optional string db_create_storage;

// ACL filtering configuration. When enabled, query results are filtered
// based on user's group membership. Requires both JustKnobs
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that JustKnobs is not open-source

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out. The plan here is to use #ifdef to have different paths for OSS (no JustKnobs check) and internal (with JustKnobs check)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general point, we try to avoid #ifdef in the codebase as far as possible because it's a maintenance headache. There are a few alternatives, e.g. make an API that has a stub implementation in the OSS build.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will use the facebook/ vs github/ directory pattern with facebook/Glean/Query/ACL.hs having the internal implementation and github/Glean/Query/ACL.hs being a stub

1: DatabaseStatus status;
}

// Conflict when merging ACL configurations
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty suspicious of this, as we discussed: the ACL config should be fixed for a given revision of the repo. Also I don't think the server has any business enforcing this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We can remove this conflict resolution logic.

/// The outer vector represents levels that are ANDed together.
struct UnitACLAssignment {
UnitId unitId;
std::vector<std::vector<UsetId>> levels; // outer=AND, inner=OR
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above, why not construct the UsetId for the expression and put it here instead of the vector-of-vectors?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, I think your suggestion makes sense. There's no inheritance needed per-file.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe doing something else is more complex. Here's an attempt: #705

/// All groups at the same prefix level are ORed together.
struct ACLConfigEntry {
std::string prefix;
std::vector<UsetId> groupUsetIds;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not construct the UsetId for A || B || ... and put it here instead of the vector? That would save repeatedly constructing it, no?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows caching of the recursive construction. example:

dir: foo/ -> Groups [a,b]
dir: foo/bar/ -> Groups [c, d, e]

foo/bar/ also needs to inherit [a, b]
Both directories need assigned UsetIds. So, the Group->UsetID map is:

[a,b] -> UsetID(1)
[c, d, e] -> UsetID(2)

and dir to usetIDs:
foo/ -> (1)
foo/bar/ -> (1, 2), which we construct as UsetID('foo/') Union UsetID("'foo/bar").

Not too bad in this example, but as directories get deeper, the efficiency of creation is more significant.

// and promoted. The old code checked (ownerUsetId < firstSetId) which
// was NEVER true, so no facts were ever augmented. We must resolve
// promoted sets to their leaf UnitIds to find ACL matches.
folly::F14FastMap<UsetId, Uset*> idToUset;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you only need this because ComputedOwnership has an Id to UsetId mapping and you need to turn the UsetId into a Uset*, right?

Isn't there a problem with stacked DBs here? The UsetId might be in the base DB, so we don't have a Uset* for it. But in that case we should have already added ACLs to it, so we should be OK? Need to handle this case anyway.

Building this mapping seems pretty ugly and potentially expensive. You could imagine either storing the mapping in the ComputedOwnership, or keeping it in the Usets itself (depending on how expensive that is).

Copy link
Copy Markdown
Contributor Author

@simonhollis simonhollis Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the incremental case, we ensure that stacked UsetIds are > any used ID in the base, so they don't conflict. Then, we reverse-walk the stack to lookup USets, so we eventually cover all the mappings from the stack.

Specifically here, a UsetId could refer to a Uset from the base DB that isn't in this ComputedOwnership container. However, since the stacked DB only has stacked DB facts in it, we should be OK.

I'm thinking over the performance question.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can have a fact with an owner UsetId that refers to a set in the base DB, so you won't find it in the mapping, I think this ends up working out OK with the code as it is but I'm not sure - at the least I would suggest testing it and adding a comment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Adding testcases and some logging to expose when a computed ownership UsetId belongs to a base DB's Usets container not the stacked one.

UsetId originalOwner = ownerUsetId;

// Collect unique ACL CNF IDs for leaf UnitIds in this owner's set.
std::set<uint32_t> aclCnfIds;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this unit32_t? Shouldn't it be UsetId?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're type aliases, but agree UsetID is more descriptive; will change.

std::set<uint32_t> aclCnfIds;

if (ownerUsetId < firstSetId) {
// Direct UnitId (rare after computeOwnership but handle it)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect this to be quite common because most facts are owned by a single unit, so I'm surprised. Why do you say it's rare?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it should be an error case. Since we should by now have created an OR set, there shouldn't be any bare UnitIds here. However, you're right that in many cases, the size of the OR set will be length=1

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if ownerUsetId < firstSetId and this is a stacked DB, then the owner could be a set in the base DB.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a test case, logging and assertion for this too.

// Name of program making the query.
4: optional list<string> acl_group_names;
// ACL group names for query-time filtering.
// Names are resolved to IDs server-side using the DB's name-to-ID mapping.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this line is redundant, IDs are not a concept the user of this API needs to care about.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed since remote Glean clients need a way of passing this information to queries.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for not being clear, I was referring to just the part of the comment beginning "Names are resolved..", it's redundant detail from the user's point of view

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

if (usetid == INVALID_USET) {
return false;
}
// AND semantics: when multiple slices cover the same UsetId (e.g.,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - but you could optimise this. If the slices are sorted, then we can stop as soon as we find an out of range slice. Might be worth it since this is in the inner loop for queries.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I'll have a look at that.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some analysis here, and I think this is cheap enough already, since the loop is already small and tight. Maintaining the sort may outweigh the lookup improvement in the average case. Performance may degrade once we get to higher stacks, and could make this worthwhile, but we don't currently construct those. Putting this in the won't fix bucket.

@simonhollis
Copy link
Copy Markdown
Contributor Author

If it were me, I would try to move as much of this out of the server and into the indexer though - ideally Glean should know nothing about ACLs, and it should provide generic functionality that could be used to support multiple different ACL policies. Here we're baking in a policy that is file/directory prefix-based, which seems too specific. However, this is not a hill I will die on, just a matter of trying to design something as generic as possible.

I understand the idea here to abstract out of the core and treat ACLs as just more generic ownership units supplied by indexers. It could be done, but would still need server changes, as well as some implicit contracts between indexers and write server; and query server and query agent. I dislike implicit contracts from a safety point of view.

More fundamentally, server changes wouldn't be eliminated entirely due to the fundamental mismatch of how we currently use all-OR semantics for ownership propagation and merging, compared with the OR-AND semantics needed for ACLs with hierarchy. The expression computation is just as important when we think about derived facts, and relying on the indexer to perform derivation propagation of ACL names just adds a place where things can go wrong.

There's an additional wrinkle when dealing with both ACL'd and stacked DBs where a unit has changed ACL permissions across the base + stacked DB. I think this could be worked around, but to me just highlights the number of corner cases that are much more easily addressed by riding on the existing server ownership logic, rather than putting the code indexer-side.

So, I think the right thing to do here is to keep the code in the server, but as modular as possible. If we really want a version where the core is kept clean, it could probably be conditionally compiled in, but I'd start by including it.

In terms of policy question, the code being added here is in new files, which makes it modular enough that a new policy could add a new calculation module or override the default functionality.

This is already a long response, but I may as well list some other minor advantages of having ACL-specific code in the server:

  • Query time, we have to distinguish them from conventional ownership units.
    • Explicitness gives semantics to the concept and makes it more understandable and robust.
    • An added benefit is knowing at query-time why filtering of a fact has occurred - what did it not match on, and what did that represent w.r.t. the original code base.
  • Having the server allocate IDs gives a single point of consistency that avoids any potential issues with multi-batch writes, derived fact assignments
    • Assigning at finalise or complete steps also prevents any re-computations during write.
  • It minimises indexer-side changes, which scale with the number of indexers.
  • If we accept that we have something that's explicitly ACLs in Glean, we can be explicit about Glean DBs and whether or not they a) have ACL information; b) Whether we have an enforcement policy. These can be carried in the DB for consistency and safety.

@simonmar
Copy link
Copy Markdown
Collaborator

Hi @simonhollis - I do think it's possible to achieve a design here that embeds none of the ACL policy knowledge in the server and has no implicit contracts. It's easy to see that the current design is not quite right, because it breaks abstraction boundaries: (1) it adds meaning to units (they should be arbitrary strings from the server's point of view) and (2) if the client wants to use a different ACL policy they have to change Glean itself. Using a different ACL policy should be a matter of changing the indexer and query-time ACLs, both things that the client controls.

But anyway, we do need to resolve some technical things:

More fundamentally, server changes wouldn't be eliminated entirely due to the fundamental mismatch of how we currently use all-OR semantics for ownership propagation and merging, compared with the OR-AND semantics needed for ACLs with hierarchy. The expression computation is just as important when we think about derived facts, and relying on the indexer to perform derivation propagation of ACL names just adds a place where things can go wrong.

Could you give an example here? I thought we had established that since the ownership system supports arbitrary expressions involving AND/OR that it could express whatever ACL logic you wanted.

The clue is that ownership propagation and derivation are policy-free: they do only what is required to ensure the integrity of the DB. The ownership of a fact A that is referenced by B and C must be A|B|C, it can't be anything else or the DB would be invalid under some choice of ownership units. Similarly the ownership of a fact A that is derived from B and C must be B&C, again it can't be anything else or the DB would have incorrect or incomplete derivations under some choice of ownership units. So if we're making changes here for ACLs I'm deeply suspicious.

(as I mentioned earlier, whether to put the ACL logic in the server is not a hill I'm going to die on. But to that I will add that DB integrity is a hill I will die on!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants