@regehr mucky side effects. If there had been a __builtin_bitpermute(...) kind of intrinsic lying around, we'd probably see a lot of clear-cut bit permutes, but we might instead have a lot of code out there of the form
for (u32 i = 0; i < 64; ++i) {
    if (val & (1ULL << perm[i])) {
        // do weird shit
    }
}
... that might have had a bit permute buried in there somewhere, but it may be hard to find with automated tools.
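To make the contrast concrete, here's a rough sketch of the two shapes. `bit_permute` is a made-up helper (not a real builtin) showing what a clear-cut permute might look like, and `bit_permute_buried` is the same permute tangled up with a side effect. The function names, the `set_count` counter, and the convention that output bit i comes from input bit `perm[i]` are all assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical helper, not a real compiler builtin: output bit i is taken
   from input bit perm[i]. Assumes every perm[i] is in the range 0..63. */
static uint64_t bit_permute(uint64_t val, const uint8_t perm[64])
{
    uint64_t out = 0;
    for (uint32_t i = 0; i < 64; ++i) {
        out |= ((val >> perm[i]) & 1ULL) << i;
    }
    return out;
}

/* The same permute buried in a loop with an unrelated side effect
   (a hypothetical counter), which is the harder shape to recognize. */
static uint64_t bit_permute_buried(uint64_t val, const uint8_t perm[64],
                                   uint32_t *set_count)
{
    uint64_t out = 0;
    for (uint32_t i = 0; i < 64; ++i) {
        if (val & (1ULL << perm[i])) {
            out |= 1ULL << i;
            ++*set_count; /* side effect mixed into the permute */
        }
    }
    return out;
}
```

Both return the same value for the same `val` and `perm`; the second one just also bumps the counter, which is the sort of thing that could hide the permute from a pattern matcher.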