Introduce optimized CRC-32 implementation#4917
Conversation
| namespace Svc { | ||
|
|
||
| CmdSequencerComponentImpl::FPrimeSequence::CRC ::CRC() : m_computed(INITIAL_COMPUTED_VALUE), m_stored(0) {} | ||
| CmdSequencerComponentImpl::FPrimeSequence::CRC ::CRC() : m_computed(), m_stored(0) {} |
Check notice
Code scanning / CodeQL
More than one statement per line Note
| 0xCCB0A91F, 0x740CCE7A, 0x66B96194, 0xDE0506F1, | ||
| }; | ||
|
|
||
| U32 crc32_ieee802_3_update(const U8* data, FwSizeType length, U32 crc) { |
Check notice
Code scanning / CodeQL
Long function without assertion Note
|
|
||
| static int crc_tab16_init = CRC_FALSE; | ||
| // F PRIME CHANGE (see header) | ||
| #if 0 |
Check notice
Code scanning / CodeQL
Conditional compilation Note
|
|
||
| static unsigned short crc_tab16[256]; | ||
| // F PRIME CHANGE (see header) | ||
| #if 0 |
Check notice
Code scanning / CodeQL
Conditional compilation Note
|
|
||
| static void init_crc16_tab( void ); | ||
| // F PRIME CHANGE (see header) | ||
| #if 0 |
Check notice
Code scanning / CodeQL
Conditional compilation Note
| \*******************************************************************/ | ||
|
|
||
| // F PRIME CHANGE (see header) | ||
| #if 0 |
Check notice
Code scanning / CodeQL
Conditional compilation Note
| \*******************************************************************/ | ||
|
|
||
| // F PRIME CHANGE (see header) | ||
| #if 0 |
Check notice
Code scanning / CodeQL
Conditional compilation Note
|
|
||
| void CmdSequencerComponentImpl::FPrimeSequence::CRC ::finalize() { | ||
| this->m_computed = ~this->m_computed; | ||
| this->m_computed.update(buffer, bufferSize); |
Check warning
Code scanning / CodeQL
Unchecked function argument Warning
| }; | ||
|
|
||
| U32 crc32_ieee802_3_update(const U8* data, FwSizeType length, U32 crc) { | ||
| FW_ASSERT(data != nullptr, FwAssertArgType(length)); |
Check warning
Code scanning / CodeQL
Unchecked function argument Warning
| PlatformPointerCastType data_address = reinterpret_cast<PlatformPointerCastType>(data); | ||
| U32 correction = (step_size - (data_address & (step_size - 1))) & (step_size - 1); | ||
| for (; i < correction; i++) { | ||
| crc = (crc >> 8) ^ crc32_ieee802_3_lookup0[(crc & 0xFF) ^ data[i]]; |
Check warning
Code scanning / CodeQL
Unchecked function argument Warning
| // we'd need the complexity of considering endianness.) | ||
| FwSizeType stop_at = length - 3; | ||
| for (; i < stop_at; i += 4) { | ||
| crc ^= U32(data[i + 0] | (data[i + 1] << 8) | (data[i + 2] << 16) | (data[i + 3] << 24)); |
Check warning
Code scanning / CodeQL
Unchecked function argument Warning
|
|
||
| // Handle the last few bytes one-by-one. | ||
| for (; i < length; i++) { | ||
| crc = (crc >> 8) ^ crc32_ieee802_3_lookup0[(crc & 0xFF) ^ data[i]]; |
Check warning
Code scanning / CodeQL
Unchecked function argument Warning
| for (; i < length; i++) { | ||
| crc = (crc >> 8) ^ crc32_ieee802_3_lookup0[(crc & 0xFF) ^ data[i]]; | ||
| } | ||
| return crc; |
Check warning
Code scanning / CodeQL
Unchecked function argument Warning
| c = static_cast<const char*>(data)[index]; | ||
| this->hash_handle = static_cast<HASH_HANDLE_TYPE>(update_crc_32(this->hash_handle, c)); | ||
| } | ||
| this->hash_handle = crc32_ieee802_3_update(static_cast<const U8*>(data), len, this->hash_handle); |
Check warning
Code scanning / CodeQL
Unchecked function argument Warning
LeStarch
left a comment
There was a problem hiding this comment.
Overall, I am happy with this PR. The one issue that might be fixed is the FIXME is file.
| Utils::Hash crc; | ||
| FwSizeType writeSize = static_cast<FwSizeType>(HASH_DIGEST_LENGTH); | ||
| U32 crcInitial = U32(~0); | ||
| stat = paramFile.write(reinterpret_cast<const U8*>(&crcInitial), writeSize, Os::File::WaitType::WAIT); |
There was a problem hiding this comment.
I know this bug pre-exists this PR, but if I am reading this right, the CRC stored here is endianness-dependent.....
There was a problem hiding this comment.
Yes. And, I think the CRC-32 is not correctly calculated because it is supposed to have a one's complement at the end. But the way it is used, there is no one's complement performed.
| this->m_crc = static_cast<U32>(update_crc_32(this->m_crc, static_cast<CHAR>(this->m_crc_buffer[i]))); | ||
| } | ||
| FW_ASSERT(size <= FW_FILE_CHUNK_SIZE, FwAssertArgType(size)); | ||
| // FIXME: Utils::Hash should be integrated more carefully into File |
There was a problem hiding this comment.
There is a FIXME here. Seems that hash should be stored rather than m_crc.
There was a problem hiding this comment.
I added the FIXME because I was of the opinion that this was a larger change that felt outside the scope of my PR. Although, I can take another look if that's going to be a blocker.
There was a problem hiding this comment.
We can take a look, and wrap this up.
|
Note: I spun the endianness problem into a new ticket: #4931 |
Change Description
Introduce a more optimized (but still platform-agnostic) CRC-32 implementation. Introduce implementation-agnostic unit tests for the CRC-32 implementation. Update code that broke the abstractions and used libcrc directly to use Utils::Hash instead.
Rationale
The existing CRC-32 implementation is not performant. On the VA41630 microcontroller, the new algorithm performs at around ~11.7 MB/s, which is much better than the existing algorithm's 3.8 MB/s. The use of a set of hard-coded CRC-32 lookup tables also reduces the amount of data memory required.
Testing/Review Recommendations
Existing code did not always properly use the Utils::Hash interface. Where possible, I've changed other code to use that interface. However, CCSDS code relies on a different CRC algorithm that has not been reimplemented as part of this PR, so I had to keep around the majority of libcrc for the time being. Additionally, Os::File uses CRCs in a way that is too complex for me to easily refactor, and I recommend that someone with more familiarity with this code goes in and addresses the FIXMEs I've left.
Future Work
I would like there to be a way for individual platforms to provide even more optimized versions of this algorithm. The performance on the VA41630 can be raised to at least 13.1 MB/s through the use of inline assembly to pipeline LDR instructions more efficiently.
AI Usage (see policy)
Claude Code was consulted to provide advice on optimizing the resulting assembly for the CRC-32 implementation. The advice was not useful and was disregarded.