Which open source license forbids training AI on the licensed work?
Like is there a "BSD 5-clause", "GPLv4", or "CC BY-SA-NAI"? Well I guess "CC NC" or "CC ND" but what if you want commercial non-AI use and/or non-AI derivatives?
Ramblings of a programmer and cryptography enthusiast. I do stuff… sometimes. Also creating hsmVault.com… eventually.
Oh shit the high steaks launch video came out: https://www.youtube.com/watch?v=9UX7NJLYyb4
@soatok Dhole mention in WAN show and they umm don't know how to pronounce it, but they figure it out https://youtu.be/TWb2P-GGBcU?t=2h02m40s
I was wondering what the new US administration's stance would be on the transition to post-quantum cryptography.
It seems that they are taking the threat of quantum computing very seriously, as the Secretary of Commerce highlighted in an interview:
“the only thing I **need** to do that has to do with regulatory, is post-quantum cryptography”
But then he goes on to say "your password is different than mine" and the “central hub has our key” to describe asymmetric cryptography 🤦♂️
@Bugcrowd Why do you not support U2F? Do you not like convenience and security? https://www.bugcrowd.com/blog/bugcrowd-security-update-password-reset-and-mfa-requirement/
@coreyspowell Hey umm stupid question, but why does that look like a photo of the moon blocking the sun?
Earth is 3.667x bigger than the moon (radius wise). If that's Earth then it should be much bigger vs the sun in that image and shouldn't show the sun's corona all around the Earth. Unless the ring loop is a lens flare that just so happens to look like the corona.
Hmm it was a micromoon (thus "microearth") and we're a little closer to the sun, but that's not enough. Earth should still be >3x the size of the sun (radius wise) from the moon at that time.
----
Oh is that the Earth's atmosphere reflecting the sun? Which means the appearance of the corona is actually clouds.
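A quick sanity check of the ">3x" claim, as a rough sketch. The radii are standard reference values; the two distances are my assumptions (roughly an apogee Earth-moon distance and roughly 1 au from the moon to the sun), not numbers from the photo.

```python
import math

EARTH_RADIUS_KM = 6371.0
MOON_RADIUS_KM = 1737.4
SUN_RADIUS_KM = 695_700.0

# Assumed distances (not from the post): near-apogee "micromoon" and ~1 au.
EARTH_MOON_KM = 406_000.0
MOON_SUN_KM = 149_600_000.0


def angular_radius_deg(radius_km: float, distance_km: float) -> float:
    """Apparent angular radius of a sphere seen from distance_km away."""
    return math.degrees(math.asin(radius_km / distance_km))


earth_deg = angular_radius_deg(EARTH_RADIUS_KM, EARTH_MOON_KM)
sun_deg = angular_radius_deg(SUN_RADIUS_KM, MOON_SUN_KM)
ratio = earth_deg / sun_deg

print(f"Earth/moon radius ratio: {EARTH_RADIUS_KM / MOON_RADIUS_KM:.3f}")
print(f"Earth from the moon: {earth_deg:.3f} deg, sun: {sun_deg:.3f} deg")
print(f"Earth looks {ratio:.2f}x the sun's radius from the moon")
```

Moving the sun a bit closer or the moon a bit farther changes the ratio only slightly, so it stays above 3 for any realistic pair of distances.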
@firstyear It's the format of the hash. Those will start with $krb5pa$18$ and the parameters will be in a specific format vs generic PBKDF2 which is hash:iterations:base64 encoded salt:base64 encoded output. Also I just looked at the code for $krb5pa$18$ and it uses AES. It appears to be something like: hash = AES-CBC-encrypt(key:PBKDF2-SHA1(...), data).
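To make the "generic PBKDF2" layout concrete, here is a minimal sketch that writes and checks strings in the hash:iterations:base64 salt:base64 output shape described above. The exact hash label ("sha256") and the helper names are my assumptions for illustration, and this does not cover the $krb5pa$18$ format at all.

```python
import base64
import hashlib
import hmac


def encode_pbkdf2(password: bytes, salt: bytes, iterations: int,
                  algo: str = "sha256") -> str:
    """Build 'hash:iterations:b64(salt):b64(output)' (label spelling assumed)."""
    dk = hashlib.pbkdf2_hmac(algo, password, salt, iterations)
    b64 = lambda raw: base64.b64encode(raw).decode("ascii")
    return f"{algo}:{iterations}:{b64(salt)}:{b64(dk)}"


def verify_pbkdf2(password: bytes, encoded: str) -> bool:
    """Re-derive the output from the stored parameters and compare."""
    # Standard base64 never contains ':', so a plain split is safe here.
    algo, iterations, salt_b64, dk_b64 = encoded.split(":")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(dk_b64)
    dk = hashlib.pbkdf2_hmac(algo, password, salt, int(iterations),
                             dklen=len(expected))
    return hmac.compare_digest(dk, expected)


encoded = encode_pbkdf2(b"hunter2", b"0123456789abcdef", 1000)
print(encoded.split(":")[:2])
print(verify_pbkdf2(b"hunter2", encoded), verify_pbkdf2(b"wrong", encoded))
```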
@firstyear Settings such that an attacker's speed is less than 10,000 guesses/second/"GPU". Historically a "GPU" was the best GPU for password cracking, but the RTX 4090 happened which had a better performance/cost ratio. So I considered it "1.5 GPUs". The RTX 5080 and RTX 5090 are 1 and 2 GPUs respectively. Basically a GPU is one that has an MSRP of $700 in 2015 USD, but things like power and size are considered.
Argon2 and bcrypt are better for defenders. PBKDF2 should be phased out due to being pro attacker and anti defender. Defenders have SIMD and multiple cores but PBKDF2 can only use 1 core and non-SIMD which is about 1/32 the computational ability of a common current CPU (depends on cores and AVX2 (8x) or AVX-512 (16x*)). Note GPUs have about 10x more compute than an equivalent CPU.
*Current AMD CPUs with AVX-512 (512-bit SIMD) can process about three 256-bit SIMD chunks per clock. It does increase speed but not 2x. The speedup it does get is because AVX-512 has 2x the registers, which causes less register pressure (moving between registers and memory).
😳 Sorry wrong link... should have been RTX 5090 benchmarks instead of RTX 4090.
Update: Based on this benchmark (https://gist.github.com/Chick3nman/09bac0775e6393468c2925c1e1363d5c) new minimums are:
PBKDF2-HMAC-SHA1: 1,400,000 iterations
PBKDF2-HMAC-SHA256: 600,000 iterations*
PBKDF2-HMAC-SHA512: 220,000 iterations
bcrypt: cost 9
* For some reason PBKDF2-HMAC-SHA256 on an RTX 5090 (as 2 GPUs) is slower than an RTX 4090 (as 1.5 GPUs) thus using RTX 4090 (as 1.5 GPUs) for the minimum.
(Edit: Wrong link (it was to RTX 4090 benchmarks instead of RTX 5090 benchmarks))
What is an ankle biter?... Oh I wish there was an option to randomize the order per account that sees this or that's the default/only way. I put "other" first. Since most just pick the first one. I flipped a coin for placement of the real answers. Hoping this makes it unbiased... I should probably remove everything besides the question to make it more unbiased ¯\_(ツ)_/¯
I found a better method and then shrunk the search space a bunch... yeah no it ain't happening. Like god has a higher chance of existing and we made her up.
P.S. That's a joke. This has a way higher chance of having a solution. I'm still going to run something eventually but only expecting it to be 30 watts of extra heating. I spent about 2 kWh on this ignoring testing.
Current status: Heating home with math. There's a prize for finding a valid solution but it's very unlikely there's a solution. Also it's like <100 W of extra heat so "heating". Oh my CPU has a TDP of 65 W. I thought it was more. I'd turn a light off to offset power usage but that's the base state.
@karlbode.com One reason for not deduplicating SSNs is some people are very stupid.
> Over time, the number that appeared (078-05-1120) has been claimed by a total of over 40,000 people as their own.
https://en.wikipedia.org/wiki/Social_Security_number#SSNs_used_in_advertising
I'm sure some of those are fraud but a vast majority are dumb dumbs: "I bought a new wallet and it assigned me a SSN. Now that's a time saver."
Anyone know where to submit a bug report/issue for VS2022? Is there a Github page for it?
_mm256_fnmadd_pd is generating inefficient code because it's only generating VFNMADD213PD instead of VFNMADD132PD or VFNMADD231PD.
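For anyone wondering why the 132/213/231 suffix matters: the digits say which operands get multiplied and which one is added, and the first operand is always overwritten with the result. Below is a scalar Python model of that, as a sketch only. The function names are made up, and plain Python floats are not fused (two roundings instead of one), so this only models operand roles, not rounding.

```python
# Each model returns the value that would be written back into `dst`.
def vfnmadd132(dst: float, src2: float, src3: float) -> float:
    return -(dst * src3) + src2


def vfnmadd213(dst: float, src2: float, src3: float) -> float:
    return -(src2 * dst) + src3


def vfnmadd231(dst: float, src2: float, src3: float) -> float:
    return -(src2 * src3) + dst


# _mm256_fnmadd_pd(a, b, c) computes -(a * b) + c per lane.
a, b, c = 2.0, 3.0, 10.0
want = -(a * b) + c

# Which input the result replaces depends on the form that is used:
r213 = vfnmadd213(a, b, c)  # result replaces a multiplicand (a)
r231 = vfnmadd231(c, a, b)  # result replaces the addend (c)
r132 = vfnmadd132(a, c, b)  # result replaces a multiplicand, addend in slot 2
print(want, r213, r231, r132)
```

So if only the 213 form is available, the result always lands on top of a multiplicand. When both multiplicands are still needed afterwards (say the addend is an accumulator in a loop), that means an extra register copy that the 231 form would avoid.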
Anyone have VPS suggestions? I'm looking at probably 1and1 (oh right, 1and1 got bought and their name changed to something stupid "ion something" ah "ionos") because I just found out they have cheap VPSes and I already use them for my domains.
Stupid answers: DigitalOcean (forced 0FA as "2FA" and no U2F support. Thus stupid AF--fuck that) and now obviously not Vultr (they also don't have U2F support but they haven't locked me out because of BS 0FA).
Oh right https://tobtu.com is down because the Vultr VPS it was hosted on broke and I lost more than I thought. Anyway funny story, about a week ago I was like I haven't made a backup in a while... I should do that. Nice, the last one was 2024-02-24 and it's not everything. I guess I should have paid that $1/mo for backups or $0.05/GB/mo for a snapshot.
Update: RTX 50 series is CUDA compute capability 12.0 and "Maximum amount of shared memory per thread block" is unchanged from the RTX 40 series (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#features-and-technical-specifications-technical-specifications-per-compute-capability).
** Note I'll need benchmarks to confirm but this will likely be the only change. Also unofficial preview **
The RTX 5080 should be ~2.3% faster, which I didn't think would be enough to change anything, but it's just enough to bump PBKDF2 minimums by the minimal amount:
PBKDF2-HMAC-SHA512: 220,000 iterations
PBKDF2-HMAC-SHA256: 610,000 iterations
PBKDF2-HMAC-SHA1: 1,400,000 iterations
If you're wondering, I round up to the next 2 significant digit number of iterations needed to get a GPU down to 10 kH/s.
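The rounding rule can be sketched like this. It assumes PBKDF2 speed scales linearly with the iteration count, and the benchmark numbers in the example are made up for illustration, not taken from any real benchmark. `gpu_units` is the "how many GPUs does this card count as" weight (e.g. 1.5 or 2).

```python
import math

TARGET_GUESSES_PER_SEC = 10_000  # per "GPU" (10 kH/s)


def round_up_sig(n: int, digits: int = 2) -> int:
    """Round n up to the next integer with at most `digits` significant digits."""
    if n <= 0:
        raise ValueError("n must be positive")
    scale = 10 ** max(len(str(n)) - digits, 0)
    return -(-n // scale) * scale  # ceiling division, then scale back up


def min_iterations(bench_hashes_per_sec: float, bench_iterations: int,
                   gpu_units: float) -> int:
    """Iterations needed so one "GPU" manages at most 10,000 guesses/second."""
    # Speed * iterations is (roughly) constant for PBKDF2.
    iterations_per_sec = bench_hashes_per_sec * bench_iterations / gpu_units
    return round_up_sig(math.ceil(iterations_per_sec / TARGET_GUESSES_PER_SEC))


# Hypothetical card: 1,234,567 H/s at 1,000 iterations, counted as 1 GPU.
example = min_iterations(1_234_567, 1_000, 1.0)
print(example)
```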
@soatok That livestream will be several hours and you'll find next to nothing of substance. Last time I looked it took like an hour to find the code I was looking for but once found it took less than a minute to verify a bug I knew existed. Another time I gave up after like 20 minutes and looked at Nadim's Signal implementation to see if a bug existed. It did and it took a minute to find and verify in his code.
This is all due to the coding philosophy of like small functions or whatever. To follow the code you jump around just to find out at the end it's just returning a constant. Maybe Github is better at this now. Before it was search for function name then the next function name... It's also across multiple repos and I'm always in the wrong one. It's spaghetti and the meatballs are hiding in a couch in another house. **That said I use and trust Signal.**
** THE CHALLENGE **
The two bugs were with the safety number and DH ratchet. If you can find and verify both bugs in the code in under an hour you are a god. No seriously start at https://github.com/signalapp and time yourself (or don't, it's a waste of time. I just tried and gave up after a few minutes. It's still bad. After a bit I looked up a special string I remembered was near the code I was looking for. Well, that code moved but the string was there):
1) The safety number is a hash that includes your phone number and public key. Thus if you posted your public key before (like you were encouraged to do) and safety digits (which people were doing when it came out), then one can make guesses at your phone number (I got a shout out in https://signal.org/blog/safety-number-updates/).
2) DH ratchet doesn't have backwards security because it doesn't include identity keys.
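The phone number guessing in 1) can be sketched with a toy model. This is NOT Signal's actual safety number algorithm: the hash (a single SHA-256), the input layout, and the candidate range are all assumptions for illustration. The point is only that phone numbers are a small search space once the public key is known.

```python
import hashlib
from typing import Iterable, Optional


def toy_fingerprint(public_key: bytes, phone: str) -> str:
    """Stand-in for a fingerprint that hashes the public key and phone number."""
    return hashlib.sha256(public_key + phone.encode("ascii")).hexdigest()


def guess_phone(public_key: bytes, fingerprint: str,
                candidates: Iterable[str]) -> Optional[str]:
    """Try each candidate number until the published fingerprint matches."""
    for phone in candidates:
        if toy_fingerprint(public_key, phone) == fingerprint:
            return phone
    return None


# Victim posted both of these publicly (key and number are made up).
victim_key = bytes(range(32))
posted = toy_fingerprint(victim_key, "+15550100123")

# Attacker enumerates a block of 10,000 numbers.
block = (f"+1555010{n:04d}" for n in range(10_000))
found = guess_phone(victim_key, posted, block)
print(found)
```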
----
If you (anyone) do this, I'd like to know how long it took to find or give up.
@forty Oh right the consensus was "I'll have to think about this." I asked about my use of HKDF. I was trying to make the code nice but it makes the security proof hard. Basically it just has PRF security vs KDF security. So I think I'll copy the use of HKDF from TLS 1.3, MLS, and OPAQUE.
I used the salt as domain separator, but they all use the info for that "tls13 [...]", "MLS 1.0 [...]", and "OPAQUE-[...]". I wanted to do it at the beginning and not worry about it. There's code that I reuse that needs domain separation. I now need multiple info strings for that. Oh right I saw this thing from https://www.latacora.com/blog/2024/07/29/crypto-right-answers-pq/ (see image) and was told a while ago that it was an understood use. I no longer think that is true because I believe they said TLS or MLS did this but looking at that now it's slightly different. They expand to a new key and use that as the salt for the next extract, rather than using the direct extract output as the salt for the next extract. Anyway it will need a rewrite.
Spoiler: It will have 3 PAKEs (balanced [CPace*], augmented [BS-SPEKE], and doubly augmented [Double BS-SPEKE]). I might add an identity PAKE too. CHIP but not CRISP from https://eprint.iacr.org/2020/529. The paper incorrectly claims CRISP to be an siPAKE (strong identity PAKE) but the salt is not protected because there is no OPRF. It's "protected" by a slow group operation that's like 10x or 100x slower than ECC (I forget exactly what this operation is). That operation needs to happen once per password guess, but also needs to happen when using CRISP. So you can't make it too slow. Also iPAKEs' main use case is for IoT devices and being slow would be very bad.
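The difference between the two chaining styles, as a small sketch. HKDF here follows RFC 5869 over HMAC-SHA256; the label strings and helper names are hypothetical, and this only mirrors the general "expand to a derived key, then use it as the next salt" shape rather than any protocol's exact key schedule.

```python
import hashlib
import hmac

HASH = hashlib.sha256
HASH_LEN = HASH().digest_size


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """RFC 5869 extract: PRK = HMAC(salt, IKM); empty salt means all zeros."""
    return hmac.new(salt or bytes(HASH_LEN), ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 expand: T(i) = HMAC(PRK, T(i-1) || info || i)."""
    if length > 255 * HASH_LEN:
        raise ValueError("length too large")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def chain_direct(prev_prk: bytes, new_ikm: bytes) -> bytes:
    """Previous extract output used directly as the next salt."""
    return hkdf_extract(prev_prk, new_ikm)


def chain_derived(prev_prk: bytes, new_ikm: bytes, label: bytes) -> bytes:
    """Expand to a fresh key first (domain separated via info), then extract."""
    derived = hkdf_expand(prev_prk, label, HASH_LEN)
    return hkdf_extract(derived, new_ikm)


prk0 = hkdf_extract(b"", b"first input keying material")
direct = chain_direct(prk0, b"second input")
derived = chain_derived(prk0, b"second input", b"example v1 derived")
other = chain_derived(prk0, b"second input", b"example v1 other use")
print(direct != derived, derived != other)
```

With the derived style, every use of a stage's key goes through expand with its own info string, so reused code gets domain separation from the label rather than from the salt.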
* CPace doesn't have a spec yet. So I might just call it S-SPEKE and give it a spec. S-SPEKE is "salted SPEKE" like BS-SPEKE is "blind salted SPEKE" (also there's B-SPEKE which is an augmented version of SPEKE but is suboptimal because it doesn't use Noise-KN). CPace has an option for only one salt and doesn't have a mode for figuring out who is the initiator. Think like NAT hole punching with UDP packets then doing a bPAKE. Also CPace is literally just SPEKE but one or both parties sends a salt to hash with the password. Oh and SPEKE was only defined on DH because SWU and Elligator weren't invented yet, but it's an obvious conversion from DH to ECC with those.
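A toy sketch of the "SPEKE plus a salt hashed with the password" idea, in the original finite-field DH flavor. Everything here is an assumption for illustration: the group is a tiny safe prime (1019 = 2*509 + 1) that is completely insecure, the hashing layout is made up, and this is neither CPace nor any specified protocol.

```python
import hashlib
import secrets

P = 1019  # toy safe prime, P = 2*Q + 1 (far too small for real use)
Q = 509


def password_generator(password: bytes, salt: bytes) -> int:
    """Hash salt and password to a generator of the order-Q subgroup."""
    counter = 0
    while True:
        digest = hashlib.sha256(salt + password + bytes([counter])).digest()
        g = pow(int.from_bytes(digest, "big") % P, 2, P)  # squaring lands in the subgroup
        if g > 1:  # reject the degenerate values 0 and 1
            return g
        counter += 1


def keypair(g: int) -> tuple[int, int]:
    x = secrets.randbelow(Q - 1) + 1  # private exponent in 1..Q-1
    return x, pow(g, x, P)


def shared_key(x: int, peer_public: int, salt: bytes) -> bytes:
    if not 1 < peer_public < P - 1:
        raise ValueError("bad public value")
    k = pow(peer_public, x, P)
    return hashlib.sha256(salt + k.to_bytes(2, "big")).digest()


salt = b"salt sent by one or both parties"
g = password_generator(b"correct horse", salt)
a_priv, a_pub = keypair(g)
b_priv, b_pub = keypair(g)
key_a = shared_key(a_priv, b_pub, salt)
key_b = shared_key(b_priv, a_pub, salt)
print(key_a == key_b)
```

Both sides only agree on a key if they hashed the same password and salt to the same generator. The ECC version swaps the squaring step for a hash-to-curve map like SWU or Elligator.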