Recently, I had to discover the parameters for a cycling redundancy check (CRC) calculation. Because it was the second time within six months that there was a need for determining a CRC polynomial, I implemented a brute-forcing tool. This tool reveals the (truncated) polynomial, the initial value, input and output reflection and a finally applied XOR-mask. Obviously, brute-forcing fails for large polynomials like for a CRC-32, because you have to try different polynomials and initial values. But for smaller CRCs this approach works well. Usually, you do not have truncated polynomials larger than 16, especially not in simple telegrams with some bytes.
What is a truncated polynomial? In general, a polynomial has a grade. E.g., CRC-16-CCITT has a polynomial of grade 16. This is x^16 + x^12+x^5+1. To describe the polynomial by its coefficients the leading digit is often skipped, e.g., 1.0001.0000.0010.0001 becomes 1.0000.0010.0001. Thus, the term truncated polynomial is used to indicate this truncation.
Let’s see how this tool works. In a first step, we generate some test data. Therefore, the tool generate-test-data might be used. It generates some random bit strings and calculates a CRC, which is appended to the string.
$ ./generate-test-data --final-xor 0 --messages 10 | tee testmessage_1 # width : 10 bits # CRC's offset : 49 # calc CRC for bit offsets : 0 .. 49 (not included) # final XOR : 0 # reflect in : true # reflect out : false # # truncated polynom : 189 (MSB not shown) # initial value : 149 0110110010000011101000011000110101111000000000100 1111111010 0001000000001100101100100110011011111100000110100 0101000101 1101011100111000110110110010111011110110101001001 0011100111 0100000011100011110000100111010001101100010010011 0110100011 0110110000010111010110011001010011100101110000101 1100000001 0110101000000111010111101101001000110010100111110 1011001001 1000101011110110010100001110001111101110000011000 1000010010 0110111010111000011100100111110101110101111011010 1001111001 1001101001111000010110001000011101110000111001001 0101110000 1101110110101001101001110010011101010001110000110 0000110111
Here, the bit string starts at offset 0 and ends at offset 49. Please note, that these are not the common eight bits per byte. I avoided this byte-centric view, because in low-level reverse engineering messages might be odd-formed, especially when analyzing radio transmission.
In a second step, these test messages are reformatted to fit the brute-forcing tool’s input format. Therefore, I wrote a script in Perl. It reads bit strings (“101001…”), hex strings (“0xdeadbeef…”) and comma-separated lists of hex values (“0x23, 0x42, 0x5”) and reformats them to either a hex or a bit string.
$ perl ../rewrite-as.pl bits testmessage_1 | tee testmessage_1.bits 01101100100000111010000110001101011110000000001001111111010 00010000000011001011001001100110111111000001101000101000101 11010111001110001101101100101110111101101010010010011100111 01000000111000111100001001110100011011000100100110110100011 01101100000101110101100110010100111001011100001011100000001 01101010000001110101111011010010001100101001111101011001001 10001010111101100101000011100011111011100000110001000010010 01101110101110000111001001111101011101011110110101001111001 10011010011110000101100010000111011100001110010010101110000 11011101101010011010011100100111010100011100001100000110111
Finally, we launch the CRC brute-forcing tool, which requires specifying several offsets: The tool needs to know, where the message’s CRC relevant portion starts and stops and where the checksum is placed. When reverse engineering a telegram format, one might determine the checksum part without problems. This is caused by the CRC algorithm’s property that it is a Galois type shift register, which produces pseudo-random numbers. But it loops not on its own. Instead the algorithm is fed with message bits. Thus, the checksum looks more random than the rest of the message.
$ ./bruteforce-crc --width 10 --start 0 --end 49 --offs-crc 49 \ --file testmessage_1.bits --probe-reflections true number of threads : 4 width : 10 bits CRC's offset : 49 calc CRC for bit offsets : 0 .. 49 (not included) final XOR : 0 reflect in : false reflect out : false truncated polynom : from 0 to 1023 (MSB not shown) initial value : from 0 to 1023 probe reflections : true probe final xor : false extracted crc 03fa extracted crc 0145 extracted crc 00e7 extracted crc 01a3 extracted crc 0301 extracted crc 02c9 extracted crc 0212 extracted crc 0279 extracted crc 0170 extracted crc 0037 ----------------------[ MATCH ]-------------------------------- Found a model for the CRC calculation: Truncated polynom : 0xbd (189) Initial value : 0x95 (149) Final XOR : 0x0 (0) Reflected input : false Reflected output : false Message offset : from bit 0 .. 49 (end not included)
The brute-forcing tool extracts the checksum field at the specified offset and compares the calculated CRCs with the extracted CRCs. If a CRC matches the extracted one, the tool continues with testing the other messages. And finally, if a CRC parameter set satisfies all messages a CRC model is found.
If you have to reverse engineer an undocumented bus-protocol or some radio messages and don’t want to reconstruct the CRC parameters by putting on one's thinking cap, you might need such a tool, too. Therefore, I published the code on github.com.