cat articles/numpy-cast

NumPy cast overflow behavior can vary by environment and array size

created 2021-03-31

I got caught by exactly what the title says: overflow behavior when casting in NumPy differed depending on the environment and the size of the data. It seems to take a different code path depending on the array length, and it took me a while to identify the cause. The correct answer is probably "do not pass overflowing data into a cast", but if behavior changes like this, I would at least appreciate a warning that an overflow happened.
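As a minimal sketch of what "do not pass overflowing data into a cast" could look like in practice, the helper below checks the range first, emits a warning, and then applies an explicitly chosen behavior before casting. The name `checked_astype_uint` and its `mode` parameter are hypothetical and not part of NumPy. It assumes finite inputs (no NaN or inf) and a small target type such as uint8, because the range check is done in float64.

```python
import warnings

import numpy as np


def checked_astype_uint(values, dtype="uint8", mode="clip"):
    """Hypothetical helper: cast floats to an unsigned integer dtype
    with explicitly defined out-of-range behavior.

    mode="clip" saturates at the dtype limits (257 -> 255 for uint8).
    mode="wrap" wraps modulo 2**bits (257 -> 1 for uint8).
    """
    info = np.iinfo(np.dtype(dtype))
    arr = np.asarray(values, dtype="float64")

    # Report out-of-range values instead of letting the cast decide silently.
    out_of_range = (arr < info.min) | (arr > info.max)
    if out_of_range.any():
        warnings.warn(
            f"{int(out_of_range.sum())} value(s) outside "
            f"[{info.min}, {info.max}] before cast to {info.dtype}",
            RuntimeWarning,
        )

    if mode == "clip":
        arr = np.clip(arr, info.min, info.max)
    elif mode == "wrap":
        # Drop the fractional part, then wrap into [0, max].
        arr = np.mod(np.trunc(arr), info.max + 1)
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    # Every value is now in range, so the cast itself is well defined.
    return arr.astype(dtype)
```

With this, `checked_astype_uint([257.0, 0, 0, 0, 0, 0, 0, 0])` should give 255 in the first element regardless of array length, and `mode="wrap"` should give 1, because the out-of-range value never reaches `astype`.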

This is the kind of bug that looks fine during development on a Mac, even though the result is already wrong, and then behaves differently in production. I did not dig far enough to know whether it is specific to the Mac environment or whether it depends on something else in the build, such as the BLAS implementation (Intel MKL, for example).

import platform

import numpy as np

print(platform.system())
# Overflows and wraps around to 1
print(np.array([257.0], dtype="float32").astype("uint8"))
# Overflows and wraps around to 1
print(np.array([257.0, 0, 0, 0, 0, 0, 0], dtype="float32").astype("uint8"))
# You would expect this to wrap around to 1 as well, but in some environments it is clamped to 255
print(np.array([257.0, 0, 0, 0, 0, 0, 0, 0], dtype="float32").astype("uint8"))

Linux
[1]
[1 0 0 0 0 0 0]
[1 0 0 0 0 0 0 0]

Windows
[1]
[1 0 0 0 0 0 0]
[1 0 0 0 0 0 0 0]

On my Intel Mac, the value is clamped instead.

Darwin
[1]
[1 0 0 0 0 0 0]
[255   0   0   0   0   0   0   0]
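
To see at which array length the behavior changes on a given machine, a small probe like the following may help. This is only a sketch: the function name `probe_cast` and the default upper bound of 32 are my own choices, and the output is platform-dependent, so no expected values are shown. Newer NumPy versions may emit a RuntimeWarning for this kind of cast, which the probe silences with `np.errstate`.

```python
import platform

import numpy as np


def probe_cast(max_len=32, value=257.0):
    """Cast float32 arrays of length 1..max_len to uint8 and record
    what the first (out-of-range) element becomes for each length."""
    results = {}
    for n in range(1, max_len + 1):
        src = np.zeros(n, dtype="float32")
        src[0] = value
        # Newer NumPy may warn about invalid values in the cast; ignore it here.
        with np.errstate(invalid="ignore"):
            results[n] = int(src.astype("uint8")[0])
    return results


print(platform.system(), platform.machine(), np.__version__)
for length, first in probe_cast().items():
    print(length, first)
```

If the first element is the same for every length, the environment is at least consistent. A change at some length, as with length 8 on my Intel Mac, suggests that a different conversion path is taken for longer arrays.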