-
Notifications
You must be signed in to change notification settings - Fork 143
IGNITE-25502 fail node on index greater than current majority #7694
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 3 commits
64f8d31
3ba9fdb
cdf9139
a8e7b71
c5d9461
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -96,4 +96,14 @@ default void onConfigurationCommitted( | |
| * Invoked once after a raft node has been shut down. | ||
| */ | ||
| void onShutdown(); | ||
|
|
||
| /** | ||
| * Returns the last applied index persisted by the state machine. | ||
| * Called during {@code NodeImpl.init()} to prevent truncation of already-applied log entries. | ||
ibessonov marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| * | ||
| * @return persisted applied index, or 0 if unknown. | ||
| */ | ||
| default long getPersistedAppliedIndex() { | ||
| return 0; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't we implement it for all listeners, especially for MS/CMG?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should, the same situation is possible there, maybe worth a separate ticket though
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe that the real answer is "no, we should not". MG and CMG only use one data storage per FSM, which always guarantees that
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, thanks for clarification @ibessonov |
||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1066,6 +1066,19 @@ public boolean init(final NodeOptions opts) { | |
| return false; | ||
| } | ||
|
|
||
| // Restore appliedId so that unsafeTruncateSuffix() can reject truncation of applied entries. | ||
|
||
| long persistedApplied = this.options.getFsm().getPersistedAppliedIndex(); | ||
| if (persistedApplied > 0) { | ||
| long term = this.logManager.getTerm(persistedApplied); | ||
| if (term > 0) { | ||
| // Term is 0 when the index is outside the log (covered by a snapshot) — skip in that case. | ||
ibessonov marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| this.logManager.setAppliedId(new LogId(persistedApplied, term)); | ||
| } else { | ||
| LOG.warn("Node {} persisted applied index {} is not in the raft log, expecting snapshot to cover it.", | ||
ibessonov marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| getNodeId(), persistedApplied); | ||
| } | ||
| } | ||
|
|
||
| final Status st = this.logManager.checkConsistency(); | ||
| if (!st.isOk()) { | ||
| LOG.error("Node {} is initialized with inconsistent log, status={}.", getNodeId(), st); | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1049,11 +1049,25 @@ private boolean reset(final long nextLogIndex) { | |
| } | ||
| } | ||
|
|
||
| private void unsafeTruncateSuffix(final long lastIndexKept, final Lock lock) { | ||
| /** | ||
| * Truncates log entries after {@code lastIndexKept}. | ||
| * | ||
| * @return {@code true} on success, {@code false} if truncation would discard applied entries (node moves to error state). | ||
| */ | ||
| private boolean unsafeTruncateSuffix(final long lastIndexKept, final Lock lock) { | ||
| if (lastIndexKept < this.appliedId.getIndex()) { | ||
| LOG.error("FATAL ERROR: Can't truncate logs before appliedId={}, lastIndexKept={}", this.appliedId, | ||
| lastIndexKept); | ||
| return; | ||
| LOG.error("Raft log suffix conflict: cannot truncate entries that have been applied to the state machine. " | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's keep log format as everywhere else. nodeId={}, appliedId={}, lastIndexKept={} should be at the end surrounded by [].
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done!
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that there should be no dot after |
||
| + "The partition will be moved to error state [nodeId={}, appliedId={}, lastIndexKept={}].", | ||
| this.nodeId, this.appliedId, lastIndexKept); | ||
| lock.unlock(); | ||
| try { | ||
| reportError(RaftError.EINVAL.getNumber(), | ||
| "Raft log suffix conflict: attempted to truncate applied entries, appliedId=%s, lastIndexKept=%d", | ||
| this.appliedId, lastIndexKept); | ||
| } finally { | ||
| lock.lock(); | ||
| } | ||
| return false; | ||
| } | ||
|
|
||
| this.logsInMemory.removeFromLastWhen(entry -> entry.getId().getIndex() > lastIndexKept); | ||
|
|
@@ -1068,6 +1082,7 @@ private void unsafeTruncateSuffix(final long lastIndexKept, final Lock lock) { | |
| final TruncateSuffixClosure c = new TruncateSuffixClosure(lastIndexKept, lastTermKept); | ||
| offerEvent(c, EventType.TRUNCATE_SUFFIX); | ||
| lock.lock(); | ||
| return true; | ||
| } | ||
|
|
||
| @SuppressWarnings("NonAtomicOperationOnVolatileField") | ||
|
|
@@ -1121,7 +1136,11 @@ private boolean checkAndResolveConflict(final List<LogEntry> entries, final Stab | |
| if (entries.get(conflictingIndex).getId().getIndex() <= this.lastLogIndex) { | ||
| // Truncate all the conflicting entries to make local logs | ||
| // consensus with the leader. | ||
| unsafeTruncateSuffix(entries.get(conflictingIndex).getId().getIndex() - 1, lock); | ||
| if (!unsafeTruncateSuffix(entries.get(conflictingIndex).getId().getIndex() - 1, lock)) { | ||
| Utils.runClosureInThread(nodeOptions.getCommonExecutor(), done, | ||
| new Status(RaftError.EINVAL, "Raft log suffix conflict with applied entries")); | ||
| return false; | ||
| } | ||
| } | ||
| this.lastLogIndex = lastLogEntry.getId().getIndex(); | ||
| } // else this is a duplicated AppendEntriesRequest, we have | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, I had a great discussion with @rpuch and it seems that we have already fixed the possibility of having gap between persisted and applied index (see https://issues.apache.org/jira/browse/IGNITE-28216), so there is a chance that all this PR could be unnecessary
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IGNITE-28216 fixes one of the known root causes: specific missed lastApplied() bumps in partition processing. This PR adds a detection mechanism: if truncation below applied index ever happens (from this bug or any future bug), we fail the node instead of silently corrupting data.
As a result these are complementary. IGNITE-28216 prevents the gap; our PR catches it if it ever occurs anyway, due to some other unknown bug or nuance -- a similar future bug would again cause silent corruption with no indication of something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot resolve the comment, but I'm satisfied with your explanation and seems that this PR is really needeed