The “incorrect hash in some cases” fixed in js-sha512 v0.9.0


The (i.e. by virtue of having that name on npm) js-sha512 library released version 0.9.0 in January 2024, the first update in over five years.

The changelog indicates there was a bug fix:

And the commit for it contains this diff:

         blocks[0] = this.block;
-        blocks[1] = blocks[2] = blocks[3] = blocks[4] =
+        this.block = blocks[1] = blocks[2] = blocks[3] = blocks[4] =
         blocks[29] = blocks[30] = blocks[31] = blocks[32] = 0;

There’s added code to zero out this.block.

For context, to compute SHA-512, you process the input in 128-byte chunks. For each chunk, you calculate an update to a 512-bit state. The specification describes this calculation as starting with loading the chunk into the beginning of an array of 80 64-bit integers, i.e. indices 0 through 15. Arrays of 64-bit integers, namely BigUint64Array, are a relatively recent addition to JavaScript, so the library instead uses an array of 160 32-bit-clean JavaScript Numbers, so it would load the chunk into indices 0 through 31.

So here’s this code that runs at the beginning of a chunk, which mostly zeros out an area of the array to be populated with bits from the input (zeroing it out because it uses bitwise OR to put a byte at a time into a 32-bit number). But it’s not exactly that. There’s this this.block that initializes index 0, and it also touches index 32. This is because the library can take a little more than 128 bytes at a time, which if it does, the extra spills into index 32. The rest of the per-chunk calculation will clobber index 32, so the code first saves any bits that spilled into it by copying index 32 into the this.block field before performing that calculation. Then, at the beginning of the next chunk, we get to this part of the code again, and we initialize index 0 with the spilled bits.

And about that spillage: it’s because the library can take string inputs and does its own encoding into UTF-8. It reads a code point at a time, so for higher code points, it will process multiple bytes, potentially spilling into the next chunk.

So far, all of that is fine. As we go through chunks this way in the middle of the hash, we may save some bits past the end of a chunk into this.block and restore them at the beginning of the next chunk.

The problem comes at the end of the input.

When finalizing the hash, there’s a certain way you have to append some padding and the input length to the end of the last chunk. If there’s no room at the end of the last chunk for that, you have to add an extra chunk.

Without the new code to zero out the already-restored this.block value, the code path for making that extra chunk could incorrectly re-use the saved this.block, putting it into the beginning of this extra chunk when it had already appeared at the beginning of the chunk before.

To trigger this issue, we would need to provide an input that

  1. is the right length to trigger the need-extra-chunk codepath, 112 <= byteLength(input) mod 128 < 128, and
  2. has a chunk right before that extra chunk that has a multi-byte code point spill into it

(Detail: We need byteLength(input) mod 128 < 128 because the library treats ending the input exactly at the chunk boundary more like having an empty chunk after the end rather than having a full chunk at the end. The author may be a glass-half-empty kind of person. Which, come to think of it, is perhaps the more optimistic view if free space is meant to be a good thing.)

Here’s a minimal test case meeting those conditions:

const {sha512} = require('js-sha512');
const test =
  // fill up most of first chunk
  '\u0021'.repeat(127) +
  // two-byte C2 A1 as last byte of first chunk
  // and as first byte of second chunk
  '\u00a1' +
  // fill second chunk up to 112 bytes
Library Output (hex)
js-sha512@0.8.0 a04469756440fd376932dde7799b286bb06864b59055360155cfac09e4f3e9a8283c4bb201e63f29df38f1267c9531ee71db03feb66f6c79cfe054fb1d0aec24
js-sha512@0.9.0 d6a53a8194ef6a2e3ebe7882be8b1a5485cf27b970d682e2c42366f6634336cbdd35dd5f9c153c3d90cda7b4e3ba612cced269a36ca981ec551d298430e1e617
sha512sum (GNU coreutils 9.1) d6a53a8194ef6a2e3ebe7882be8b1a5485cf27b970d682e2c42366f6634336cbdd35dd5f9c153c3d90cda7b4e3ba612cced269a36ca981ec551d298430e1e617

If you use the library with only binary inputs, I believe this won’t affect you. This has worked fine, for example:

const testU8 = new TextEncoder().encode(test);

The library always takes input from a byte array one byte at a time, so nothing ever spills into index 32 or this.block.

The author also added their own test for this behavior, which looks like an LLM prompt for upselling a prenuptial agreement, in Finnish.

The changelog for this release also mentions a change to unsigned right shift. My reading of the code is that the values shifted in the changed lines are all small, no larger than 0xFFFF at the most. So this shouldn’t change any behavior.

There repo also contains a minified copy of the library, and I didn’t look at that.

Oh, and there are 3367 lines changed in package-lock.json. You’re on your own for those.

My last post was about either Having Someone in your Replit editor or List of washing machine cycles that would work normally in zero gravity. Find out which.