dns: Add is_hostname() for RFC 1123 §2.1 Internet hostname validation #2346
Open
vtushar06 wants to merge 2 commits into sourcemeta:main
Pull request overview
Adds a new DNS core module that provides RFC 1123 §2.1 (JSON Schema hostname) Internet hostname validation, plus a dedicated unit test suite and CMake wiring consistent with existing core/ip and core/time modules.
Changes:
- Introduce `sourcemeta::core::is_hostname(std::string_view) -> bool`, implemented as a no-allocation ASCII state machine.
- Add comprehensive `hostname` format unit tests (valid/invalid cases including length limits and ASCII-only enforcement).
- Wire the new `core/dns` library and its tests into the top-level CMake options/build.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/core/dns/include/sourcemeta/core/dns.h` | Public API for `is_hostname()` with Doxygen docs. |
| `src/core/dns/hostname.cc` | Implements RFC 1123 §2.1 hostname validation via a state machine. |
| `src/core/dns/CMakeLists.txt` | Defines the new `sourcemeta::core::dns` library target. |
| `test/dns/hostname_test.cc` | Adds unit tests covering valid/invalid hostname inputs and edge cases. |
| `test/dns/CMakeLists.txt` | Adds a `dns` unit test target linked to `sourcemeta::core::dns`. |
| `CMakeLists.txt` | Adds `SOURCEMETA_CORE_DNS` option and includes module/tests subdirectories. |
Force-pushed from 2f8cab5 to f5747ca
Adds sourcemeta::core::is_hostname() in a new src/core/dns module
following the same pattern as src/core/ip and src/core/time.

- Pure string_view state machine, no heap allocations
- Validates hostname per RFC 1123 §2.1 + RFC 952 grammar
- First char: letter or digit (RFC 1123 §2.1 relaxation of RFC 952)
- Label length: 1-63 chars (RFC 1123 §2.1 MUST)
- Total length: 1-255 chars (RFC 1123 §2.1 SHOULD)
- Rejects trailing dot, leading dot, consecutive dots
- Rejects labels starting or ending with hyphen
- Rejects underscore and non-ASCII bytes
- Accepts XN--aa---o47jg78q (RFC 1123 has no positions-3-4 rule;
  test suite cites RFC 5891 which is IDNA2008, not RFC 1123)
- 44 unit tests (21 valid, 23 invalid)

draft4 and draft6: expected 27/27 pass (no A-label group)
draft7+: expected 23/61 pass (Group 2 is IDNA2008, out of scope)

Relates to format-assertion support for Draft 4 and Draft 6.

Signed-off-by: Tushar Verma <tusharmyself06@gmail.com>
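For readers who want to see how the rules in this commit message fit together, here is a minimal sketch of a single-pass validator. It is not the code from this PR: the names `is_hostname_sketch`, `is_letter_or_digit`, and `sketch_examples` are hypothetical, and the real implementation in `src/core/dns/hostname.cc` may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: ASCII letters and digits only
constexpr auto is_letter_or_digit(const char c) noexcept -> bool {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
         (c >= '0' && c <= '9');
}

// Hypothetical sketch of the rules listed in the commit message.
// It walks the input once and never allocates.
constexpr auto is_hostname_sketch(const std::string_view value) noexcept
    -> bool {
  // Total length: 1-255 characters
  if (value.empty() || value.size() > 255) {
    return false;
  }

  std::size_t label_length{0};
  char previous{'.'}; // Sentinel: the start of input behaves like "after a dot"
  for (const char current : value) {
    if (current == '.') {
      // An empty label means a leading dot or consecutive dots.
      // A label must not end with a hyphen either.
      if (label_length == 0 || previous == '-') {
        return false;
      }
      label_length = 0;
    } else if (is_letter_or_digit(current)) {
      label_length += 1;
    } else if (current == '-') {
      // A label must not start with a hyphen
      if (label_length == 0) {
        return false;
      }
      label_length += 1;
    } else {
      // Underscore, NUL, high-bit (non-ASCII) bytes, and anything else
      return false;
    }

    // Label length: at most 63 characters
    if (label_length > 63) {
      return false;
    }
    previous = current;
  }

  // A trailing dot leaves an empty final label; a trailing hyphen is invalid
  return label_length > 0 && previous != '-';
}

// A few of the cases named in the commit message
inline auto sketch_examples() -> void {
  assert(is_hostname_sketch("example.com"));
  assert(is_hostname_sketch("XN--aa---o47jg78q"));
  assert(is_hostname_sketch(std::string(63, 'a')));
  assert(!is_hostname_sketch(std::string(64, 'a')));
  assert(!is_hostname_sketch("example."));
}
```

Tracking only the current label length and the previous byte is enough to cover every rule in the list, which is why no buffer or regex is needed in a sketch like this.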
- Apply clang-format (LLVM style) to four code blocks that violated
line-length / alignment rules under --dry-run -Werror:
valid_label_exactly_63: collapse two-line EXPECT_TRUE to one line
invalid_label_64: collapse two-line EXPECT_FALSE to one line
invalid_fullwidth_dot: split adjacent string literal across two lines
invalid_high_bit_byte / invalid_nul_byte: align string_view
initialiser list per LLVM column rules
- Update comment on invalid_empty: replace "<name> requires at least
one <let>" with "<hname> requires at least one <name> / label"
(Copilot review: original phrasing was inaccurate given the RFC 1123
§2.1 relaxation that allows digit-first labels)
All 44 tests still pass.
Signed-off-by: Tushar Verma <tusharmyself06@gmail.com>
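The `invalid_high_bit_byte` and `invalid_nul_byte` tests mentioned above build their inputs with an explicit `string_view` length. The sketch below illustrates why that matters, on the assumption that the tests need the NUL byte to reach the validator. The helper `only_hostname_bytes` and the function `embedded_byte_examples` are hypothetical and not part of this PR.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: true when every byte is an ASCII letter, digit,
// hyphen, or dot. It says nothing about label structure or lengths.
constexpr auto only_hostname_bytes(const std::string_view value) noexcept
    -> bool {
  for (const char c : value) {
    const bool allowed = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                         (c >= '0' && c <= '9') || c == '-' || c == '.';
    if (!allowed) {
      return false;
    }
  }
  return true;
}

inline auto embedded_byte_examples() -> void {
  // Without an explicit length, the view stops at the first NUL byte,
  // so the validator would only ever see "exa"
  const std::string_view truncated{"exa\0mple"};
  assert(truncated.size() == 3);
  assert(only_hostname_bytes(truncated));

  // With an explicit length, all 8 bytes (including the NUL) are visible
  const std::string_view full{"exa\0mple", 8};
  assert(full.size() == 8);
  assert(!only_hostname_bytes(full));

  // A high-bit byte (0xE9) is outside ASCII and is rejected
  const std::string_view high_bit{"caf\xE9", 4};
  assert(!only_hostname_bytes(high_bit));
}
```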
Force-pushed from f5747ca to d217b3b
Author:
hey @jviotti, the PR is ready for review. Let me know once you are done with the review.
Description:
Adds `sourcemeta::core::is_hostname()` in a new `src/core/dns` module following the same pattern as `src/core/ip` and `src/core/time`.

This is a building block for format-assertion support in Blaze for Draft 4 and Draft 6.

Function signature:
`auto is_hostname(std::string_view value) -> bool;`

Pure `string_view` state machine. No heap allocations. No regex. No external deps.
RFC 1123 §2.1 compliance:
One deliberate test suite divergence:
Test #20 in Group 1 (draft7+) marks `XN--aa---o47jg78q` as invalid, citing RFC 5891 §4.2.3.1 (IDNA2008). RFC 1123 has no such rule, and Draft 4's test suite marks `xn--4gbwdl.xn--wgbh1c` (same structural pattern) as valid. Our implementation accepts `XN--aa---o47jg78q` (spec-faithful per RFC 1123 §2.1).

One bug fixed vs ajv-formats and python-jsonschema:
Both accept `example.` (trailing dot) via regex `\\.?$` matching. Our state machine rejects it by construction.

Expected test suite results:
draft4: 27/27 pass (no A-label group)
draft6: 27/27 pass (no A-label group)
draft7/2019-09/2020-12/v1: 23/61 pass (Group 2 is IDNA2008 - out of scope for hostname format per JSON Schema spec)
Tests: 44 cases (21 valid, 23 invalid)
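The trailing-dot difference described above can be reproduced with a small regex experiment. The two patterns below are illustrative only and are not copied from ajv-formats or python-jsonschema. The function names `matches_lenient` and `matches_strict` are hypothetical. The only difference between them is the optional `\.?` before the end anchor.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Illustrative pattern with an optional trailing dot before the end anchor
inline auto matches_lenient(const std::string &value) -> bool {
  static const std::regex pattern{
      R"re(^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?(\.[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?)*\.?$)re",
      std::regex::ECMAScript | std::regex::icase};
  return std::regex_match(value, pattern);
}

// The same illustrative pattern without the optional trailing dot
inline auto matches_strict(const std::string &value) -> bool {
  static const std::regex pattern{
      R"re(^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?(\.[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?)*$)re",
      std::regex::ECMAScript | std::regex::icase};
  return std::regex_match(value, pattern);
}

inline auto trailing_dot_examples() -> void {
  // Both patterns agree on an ordinary hostname
  assert(matches_lenient("example.com"));
  assert(matches_strict("example.com"));

  // Only the lenient pattern accepts a trailing dot
  assert(matches_lenient("example."));
  assert(!matches_strict("example."));
}
```

A validator that requires every dot to be followed by a non-empty label, as this PR describes, has no way to express the lenient behaviour, so the trailing dot is rejected without a dedicated check.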
Out of scope for this PR: