Lmst

After more than 5 years of trustworthy use, I discarded my old hasher 'pwgn' and replaced it with a brand new 'pwg2'.

'Pwg2' is based on the widely implemented and well-regarded Argon2 hash function. The script became much shorter and simpler. Also, thanks to Argon2, the execution time is dramatically reduced. All in all, another productive day!

Please test the script if you like and report back any vulnerabilities. Thank you!

#linux #bash #script #argon2 #password #hash #function
https://codeberg.org/oxo/tool/src/branch/main/pwg2

#4 👥 Leverage built-in authentication with #Breeze, #Fortify or #Jetstream
🗝️ Store passwords securely using #Bcrypt or #Argon2 hashing algorithms
🔑 Secure environment variables and force #HTTPS in production environments

„The #bcrypt password hashing function should only be used for password storage in legacy systems where #Argon2 and scrypt are not available.“
https://cheatsheetseries.owasp.org/cheatsheets/Password_Storage_Cheat_Sheet.html#bcrypt #security #owasp

Key Transparency and the Right to be Forgotten

This post is the first in a new series covering some of the reasoning behind decisions made in my project to build end-to-end encryption for direct messages on the Fediverse.

(Collectively, Fedi-E2EE.)

Although the reasons for specific design decisions should be immediately obvious from reading the relevant specification (and if not, I consider that a bug in the specification), I believe writing about it less formally will improve the clarity behind the specific design decisions taken.

In the inaugural post for this series, I’d like to focus on how the Fedi-E2EE Public Key Directory specification aims to provide Key Transparency and an Authority-free PKI for the Fediverse without making GDPR compliance logically impossible.

CMYKat‘s art, edited by me.

Background

Key Transparency

For a clearer background, I recommend reading my blog post announcing the focused effort on a Public Key Directory, and then my update from August 2024.

If you’re in a hurry, I’ll be brief:

The goal of Key Transparency is to ensure everyone in a network sees the same view of who has which public key.

How it accomplishes this is a little complicated: It involves Merkle trees, digital signatures, and a higher-level protocol of distinct actions that affect the state machine.

If you’re thinking “blockchain”, you’re in the right ballpark, but we aren’t propping up a cryptocurrency. Instead, we’re using a centralized publisher model (per Public Key Directory instance) with decentralized verification.

Add a bit of cross-signing and replication, and you can stitch together a robust network of Public Key Directories that can be queried to obtain the currently-trusted list of public keys (or other auxiliary data) for a given Fediverse user. This can then be used to build application-layer protocols (e.g., end-to-end encryption with an identity key more robust than “trust on first use” due to the built-in audit trail to Merkle trees).

I’m handwaving a lot of details here. The Architecture and Specification documents are both worth a read if you’re curious to learn more.
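
To make the Merkle tree part a little less abstract, here is a minimal sketch of an inclusion proof. It is a toy, not the Sigsum or RFC 6962 tree shape: the odd-node duplication, the function names, and the sample records are all assumptions made up for illustration.

```python
import hashlib

def leaf_hash(data: bytes) -> bytes:
    # A 0x00 prefix keeps leaf hashes distinct from interior node hashes.
    return hashlib.sha256(b"\x00" + data).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # A 0x01 prefix marks interior nodes.
    return hashlib.sha256(b"\x01" + left + right).digest()

def merkle_root_and_proof(leaves: list, index: int):
    """Return (root, proof) where proof is the sibling path for leaves[index]."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [leaf_hash(x) for x in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # toy simplification: duplicate the odd node
        sibling = index ^ 1
        side = "L" if sibling < index else "R"
        proof.append((side, level[sibling]))
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof

def verify_inclusion(leaf: bytes, proof: list, root: bytes) -> bool:
    """Recompute the root from a leaf and its sibling path."""
    h = leaf_hash(leaf)
    for side, sibling in proof:
        h = node_hash(sibling, h) if side == "L" else node_hash(h, sibling)
    return h == root

# Hypothetical records; a real directory commits serialized protocol messages.
records = [b"alice:key1", b"bob:key2", b"carol:key3"]
root, proof = merkle_root_and_proof(records, 2)
print(verify_inclusion(b"carol:key3", proof, root))
```

Anyone holding the root can check a single record with a logarithmic number of hashes, which is what lets verification be decentralized even though publishing is not.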

Right To Be Forgotten

I am not a lawyer, nor do I play one on TV. This is not legal advice. Other standard disclaimers go here.

Okay, now that we’ve got that out of the way, Article 17 of the GDPR establishes a “Right to erasure” for Personal Data.

What this actually means in practice has not been consistently decided by the courts yet. However, a publicly readable, immutable ledger that maps public keys (which may be considered Personal Data) to Actor IDs (which include usernames, which are definitely Personal Data) goes against the grain when it comes to GDPR.

It remains an open question whether there is a public interest in this data persisting in a read-only ledger ad infinitum, which could override the right to be forgotten. If there is, that’s for the courts to decide, not furry tech bloggers.

I know it can be tempting, especially as an American with no presence in the European Union, to shrug and say, “That seems like a them problem.” However, if other folks want to be able to use my designs within the EU, I would be remiss not to at least consider this potential pitfall and try to mitigate it in my designs.

So that’s exactly what I did.

Almost Contradictory

At first glance, the privacy goals of both Key Transparency and the GDPR’s Right To Erasure are at odds.

One creates an immutable, append-only history.
The other establishes a right for EU citizens’ history to be selectively censored, which means history has to be mutable.

However, they’re not totally impossible to reconcile.

An untested legal theory circulating around large American tech companies is that “crypto shredding” is legally equivalent to erasure.

Crypto shredding is the act of storing encrypted data, and then when given a legal takedown request from an EU citizen, deleting the key instead of the data.

This works from a purely technical perspective: If the data is encrypted, and you don’t know the key, to you it’s indistinguishable from someone who encrypted the same number of NUL bytes.

In fact, many security proofs for encryption schemes are satisfied by reaching this conclusion, so this isn’t a crazy notion.
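
As a rough illustration of the idea (not code from the specification), crypto shredding can be sketched as a store that keeps ciphertexts forever and only ever deletes keys. The `ShreddableStore` class is hypothetical, and the SHAKE-256 keystream is a stand-in cipher chosen so the example needs nothing beyond the standard library; a real system would use a vetted cipher.

```python
import hashlib
import secrets

def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Stand-in stream cipher for illustration only: XOR with a SHAKE-256 keystream.
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

class ShreddableStore:
    """Hypothetical store: ciphertexts are kept, keys can be wiped."""

    def __init__(self):
        self.ciphertexts = {}  # record id -> (nonce, ciphertext); never deleted
        self.keys = {}         # record id -> key; deleted on an erasure request

    def put(self, record_id: str, plaintext: bytes) -> None:
        key = secrets.token_bytes(32)
        nonce = secrets.token_bytes(16)
        self.keys[record_id] = key
        self.ciphertexts[record_id] = (nonce, keystream_xor(key, nonce, plaintext))

    def get(self, record_id: str):
        key = self.keys.get(record_id)
        if key is None:
            return None  # key was shredded, so the plaintext is gone
        nonce, ciphertext = self.ciphertexts[record_id]
        return keystream_xor(key, nonce, ciphertext)

    def shred(self, record_id: str) -> None:
        # "Erasure" deletes only the key; the ciphertext stays where it is.
        self.keys.pop(record_id, None)

store = ShreddableStore()
store.put("record-1", b"alice@example.social")
store.shred("record-1")
print(store.get("record-1"))
```

After `shred`, the stored bytes are still there, but nobody holding only the store can turn them back into the plaintext.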

Is Crypto Shredding Plausible?

In 2019, the European Parliamentary Research Service published a lengthy report titled Blockchain and the General Data Protection Regulation which states the following:

Before any examination of whether blockchain technology is capable of complying with Article 17 GDPR; it must be underscored that the precise meaning of the term ‘erasure’ remains unclear.
Article 17 GDPR does not define erasure, and the Regulation’s recitals are equally mum on how this term should be understood. It might be assumed that a common-sense understanding of this terminology ought to be embraced. According to the Oxford English Dictionary, erasure means ‘the removal or writing, recorded material, or data’ or ‘the removal of all traces of something: obliteration’.
From this perspective, erasure could be taken to equal destruction. It has, however, already been stressed that the destruction of data on blockchains, particularly these of a public and permissionless nature, is far from straightforward.
There are, however, indications that the obligation inherent to Article 17 GDPR does not have to be interpreted as requiring the outright destruction of data. In Google Spain, the delisting of information from research results was considered to amount to erasure. It is important to note, however, that in this case, this is all that was requested of Google by the claimant, who did not have control over the original data source (an online newspaper publication). Had the claimant wished to obtain the outright destruction of the relevant data it would have had to address the newspaper, not Google. This may be taken as an indication that what the GDPR requires is that the obligation resting on data controllers is to do all they can to secure a result as close as possible to the destruction of their data within the limits of [their] own factual possibilities.
Dr Michèle Finck, Blockchain and the General Data Protection Regulation, pp. 75-76

From this, we can kind of intuit that the courts aren’t pedantic: The cited Google Spain case was satisfied by merely delisting the content, not the erasure of the newspaper’s archives.

The report goes on to say:

As awareness regarding the tricky reconciliation between Article 17 GDPR and distributed ledgers grows, a number of technical alternatives to the outright destruction of data have been considered by various actors. An often-mentioned solution is that of the destruction of the private key, which would have the effect of making data encrypted with a public key inaccessible. This is indeed the solution that has been put forward by the French data protection authority CNIL in its guidance on blockchains and the GDPR. The CNIL has suggested that erasure could be obtained where the keyed hash function’s secret key is deleted together with information from other systems where it was stored for processing.
Dr Michèle Finck, Blockchain and the General Data Protection Regulation, pp. 76-77

That said, I cannot locate a specific court decision that affirms that crypto erasure is legally sufficient for complying with data erasure requests (nor any that affirm that it’s necessary).

I don’t have a crystal ball that can read the future on what government compliance will decide, nor am I an expert in legal matters.

Given the absence of a clear legal framework, I do think it’s totally reasonable to consider crypto-shredding equivalent to data erasure. Most experts would probably agree with this. But it’s also possible that the courts could rule totally stupidly on this one day.

Therefore, I must caution anyone that follows a similar path: Do not claim GDPR compliance just because you implement crypto-shredding in a distributed ledger. All you can realistically promise is that you’re not going out of your way to make compliance logically impossible. All we have to go by are untested legal hypotheses, and very little clarity (even if the technologists are near-unanimous on the topic!).

Towards A Solution

With all that in mind, let’s start with “crypto shredding” as the answer to the GDPR + transparency log conundrum.

This is only the start of our complications.

Protocol Risks Introduced by Crypto Shredding

Before the introduction of crypto shredding, the job of the Public Key Directory was simple:

Receive a protocol message.
Validate the protocol message.
Commit the protocol message to a transparency log (in this case, Sigsum).
Retrieve the protocol message whenever someone requests it to independently verify its inclusion.
Miscellaneous other protocol things (cross-directory checkpoint commitment, replication, etc.).

Point being: there was very little that the directory could do to be dishonest. If they lied about the contents of a record, it would invalidate the inclusion proofs of every successive record in the ledger.

In order to make a given record crypto-shreddable without breaking the inclusion proofs for every record that follows, we need to commit to the ciphertext, not the plaintext. (And then, when a takedown request comes in, wipe the key.)
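
Here is a minimal sketch of why committing to the ciphertext helps. It uses a plain hash chain as a simplified stand-in for the real transparency log, and the entry layout is an assumption for illustration. The log never sees a key, so wiping a key later cannot disturb it, while rewriting an old entry is detectable.

```python
import hashlib

GENESIS = b"\x00" * 32  # placeholder "previous hash" for the first entry

def chain_append(log: list, ciphertext: bytes) -> None:
    """Append an entry that commits to the ciphertext, never the plaintext."""
    prev = log[-1]["entry_hash"] if log else GENESIS
    ct_hash = hashlib.sha256(ciphertext).digest()
    entry_hash = hashlib.sha256(prev + ct_hash).digest()
    log.append({"ct_hash": ct_hash, "entry_hash": entry_hash})

def chain_verify(log: list) -> bool:
    """Check that every entry hash follows from its predecessor."""
    prev = GENESIS
    for entry in log:
        if hashlib.sha256(prev + entry["ct_hash"]).digest() != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True

log = []
for ciphertext in (b"ciphertext-1", b"ciphertext-2", b"ciphertext-3"):
    chain_append(log, ciphertext)
print(chain_verify(log))

# A directory that rewrites an old record breaks verification.
log[0]["ct_hash"] = hashlib.sha256(b"forged").digest()
print(chain_verify(log))
```

Because only ciphertext hashes are chained, shredding a key changes nothing the verifier looks at.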

Now things get more interesting.

Do you…

…Distribute the encryption key alongside the ciphertext and let independent third parties decrypt it on demand?
…OR…
Decrypt the ciphertext and serve plaintext through the public API, keeping the encryption key private so that it may be shredded later?

The first option seems simple, but runs into governance issues: How do you claim the data was crypto-shredded if countless individuals have a copy of the encryption key, and can therefore recover the plaintext from the ciphertext?

I don’t think that would stand up in court.

Clearly, your best option is the second one.

Okay, so how does an end user know that the ciphertext that was committed to the transparency ledger decrypts to the specific plaintext value served by the Public Key Directory? How do users know it’s not lying?

Quick aside: This question is also relevant if you went with the first option and used a non-committing AEAD mode for the actual encryption scheme.
In that scenario, a hostile nation state adversary could pressure a Public Key Directory to selectively give one decryption key to targeted users, and another to the rest of the Internet, in order to perform a targeted attack against citizens they’d rather didn’t have civil rights.
My entire goal with introducing key transparency to my end-to-end encryption proposal is to prevent these sorts of attacks, not enable them.

There are a lot of avenues we could explore here, but it’s always worth outlining the specific assumptions and security goals of any design before you start perusing the literature.

Assumptions

This is just a list of things we assume are true, and do not need to prove for the sake of our discussion here today. The first two are legal assumptions; the remainder are cryptographic.

Ask your lawyer if you want advice about the first two assumptions. Ask your cryptographer if you suspect any of the remaining assumptions are false.

Crypto-shredding is a legally valid way to provide data erasure (as discussed above).
EU courts will consider public keys to be Personal Data.
The SHA-2 family of hash functions is secure (ignoring length-extension attacks, which won’t matter for how we’re using them).
HMAC is a secure way to build a MAC algorithm out of a secure hash function.
HKDF is a secure KDF if used correctly.
AES is a secure 128-bit block cipher.
Counter Mode (CTR) is a secure way to turn a block cipher into a stream cipher.
AES-CTR + HMAC-SHA2 can be turned into a secure AEAD mode, if done carefully.
Ed25519 is a digital signature algorithm that provides strong security against existential forgery under a chosen-message attack (SUF-CMA).
Argon2id is a secure, memory-hard password KDF, when used with reasonable parameters. (You’ll see why in a moment.)
Sigsum is a secure mechanism for building a transparency log.

This list isn’t exhaustive or formal, but should be sufficient for our purposes.

Security Goals

The protocol messages stored in the Public Key Directory are accompanied by a Merkle tree proof of inclusion. This makes it append-only with an immutable history.
The Public Key Directory cannot behave dishonestly about the decrypted plaintext for a given ciphertext without clients detecting the deception.
Whatever strategy we use to solve this should be resistant to economic precomputation and brute-force attacks.

Can We Use Zero-Knowledge Proofs?

At first, this seems like an ideal situation for a succinct, non-interactive zero-knowledge proof.

After all, you’ve got some secret data that you hold, and you want to prove that a calculation is correct without revealing the data to the end user. This seems like the ideal setup for Schnorr’s identification protocol.

Unfortunately, the second assumption (public keys being considered Personal Data by courts, even though they’re derived from random secret keys) makes implementing a Zero-Knowledge Proof here very challenging.

First, if you look at Ed25519 carefully, you’ll realize that it’s just a digital signature algorithm built atop a Schnorr proof, which requires some sort of public key (even an ephemeral one) to be managed.

Worse, if you try to derive this value solely from public inputs (rather than creating a key management catch-22), the secret scalar your system arrives at will have been calculated from the user’s Personal Data–which only strengthens a court’s argument that the public key is therefore personally identifiable.

There may be a more exotic zero-knowledge proof scheme that might be appropriate for our needs, but I’m generally wary of fancy new cryptography.

Here are two rules I live by in this context:

If I can’t get the algorithms out of the crypto module for whatever programming language I find myself working with, it may as well not even exist.
- Corollary: If libsodium bindings are available, that counts as “the crypto module” too.
If a developer needs to reach for a generic Big Integer library (e.g., GMP) for any reason in the course of implementing a protocol, I do not trust their implementation.

Unfortunately, a lot of zero-knowledge proof designs fail one or both of these rules in practice.

(Sorry not sorry, homomorphic encryption enthusiasts! The real world hasn’t caught up to your ideas yet.)

What About Verifiable Random Functions (VRFs)?

It may be tempting to use VRFs (i.e., RFC 9381), but this runs into the same problem as zero-knowledge proofs: we’re assuming that an EU court would deem public keys Personal Data.

But even if that assumption turns out false, the lifecycle of a protocol message looks like this:

User wants to perform an action (e.g., AddKey).
Their client software creates a plaintext protocol message.
Their client software generates a random 256-bit key for each potentially-sensitive attribute, so it can be shredded later.
Their client software encrypts each attribute of the protocol message.
The ciphertext and keys are sent to the Public Key Directory.
For each attribute, the Public Key Directory decrypts the ciphertext with the key, verifies the contents, and then stores both. The ciphertext is used to generate a commitment on Sigsum (signed by the Public Key Directory’s keypair).
The Public Key Directory serves plaintext to requestors, but does not disclose the key.
In the future, the end user can demand a legal takedown, which just wipes the key.

Let’s assume I wanted to build a VRF out of Ed25519 (similar to what Signal does with VXEdDSA). Now I have a key management problem, which is pretty much what this project was meant to address in the first place.

VRFs are really cool, and more projects should use them, but I don’t think they will help me.

Soatok’s Proposed Solution

If you want to fully understand the nitty-gritty implementation details, I encourage you to read the current draft specification, plus the section describing the encryption algorithm, and finally the plaintext commitment algorithm.

Now that we’ve established all that, I can begin to describe my approach to solving this problem.

First, we will encrypt each attribute of a protocol message, as follows:

For subkey derivation, we use HKDF-HMAC-SHA512.
For encrypting the actual plaintext, we use AES-256-CTR.
For message authentication, we use HMAC-SHA512.
Additional associated data (AAD) is accepted and handled securely; i.e., we don’t use YOLO as a hash construction.

This prevents an Invisible Salamander attack from being possible.
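
The shape of that construction (derive subkeys, encrypt, then MAC over length-prefixed fields including the AAD) can be sketched roughly as follows. This is not the specification's wire format: the blob layout, the `info` strings, and the little-endian length prefixes are assumptions, and `stream_xor` is a standard-library stand-in for AES-256-CTR so the sketch runs without third-party packages.

```python
import hashlib
import hmac
import secrets
import struct

def hkdf_sha512(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) instantiated with HMAC-SHA512."""
    prk = hmac.new(salt, ikm, hashlib.sha512).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand step
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha512).digest()
        okm += block
        counter += 1
    return okm[:length]

def lp(data: bytes) -> bytes:
    """Length-prefix a field so field boundaries are unambiguous in the MAC input."""
    return struct.pack("<Q", len(data)) + data

def stream_xor(key: bytes, data: bytes) -> bytes:
    """STAND-IN for AES-256-CTR: XOR with an HMAC-SHA512 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 64):
        pad = hmac.new(key, struct.pack("<Q", offset // 64), hashlib.sha512).digest()
        out += bytes(a ^ b for a, b in zip(data[offset:offset + 64], pad))
    return bytes(out)

def _subkeys(key: bytes, salt: bytes):
    # Separate keys for encryption and authentication (hypothetical info strings).
    return (hkdf_sha512(key, salt, b"example-encryption-key"),
            hkdf_sha512(key, salt, b"example-authentication-key"))

def encrypt_attribute(key: bytes, plaintext: bytes, aad: bytes) -> bytes:
    salt = secrets.token_bytes(32)  # random value stored alongside the ciphertext
    enc_key, mac_key = _subkeys(key, salt)
    ciphertext = stream_xor(enc_key, plaintext)
    tag = hmac.new(mac_key, lp(aad) + lp(salt) + lp(ciphertext), hashlib.sha512).digest()
    return salt + ciphertext + tag

def decrypt_attribute(key: bytes, blob: bytes, aad: bytes) -> bytes:
    if len(blob) < 96:
        raise ValueError("blob too short")
    salt, ciphertext, tag = blob[:32], blob[32:-64], blob[-64:]
    enc_key, mac_key = _subkeys(key, salt)
    expected = hmac.new(mac_key, lp(aad) + lp(salt) + lp(ciphertext), hashlib.sha512).digest()
    if not hmac.compare_digest(tag, expected):  # verify before decrypting
        raise ValueError("authentication failed")
    return stream_xor(enc_key, ciphertext)
```

The point of the length prefixes is that the AAD and the ciphertext can never be confused with each other when the tag is computed, and the tag is checked in constant time before any decryption happens.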

This encryption is performed client-side, by each user, and the symmetric key for each attribute is shared with the Public Key Directory when publishing protocol messages.
If they later issue a legal request for erasure, they can be sure that the key used to encrypt the data they previously published isn’t secretly the same key used by every other user’s records.
They always know this because they selected the key, not the server. Furthermore, everyone can verify that the hash published to the Merkle tree matches a locally generated hash of the ciphertext they just emitted.
This provides a mechanism to keep everyone honest. If anything goes wrong, it will be detected.

Next, to prevent the server from being dishonest, we include a plaintext commitment hash, which is included as part of the AAD (alongside the attribute name).

(Implementing crypto-shredding is straightforward: simply wipe the encryption keys for the attributes of the records in scope for the request.)

If you’ve read this far, you’re probably wondering, “What exactly do you mean by plaintext commitment?”

Art by Scruff.

Plaintext Commitments

The security of a plaintext commitment is attained by the Argon2id password hashing function.

By using the Argon2id KDF, you can make an effective trapdoor that is easy to calculate if you know the plaintext, but economically infeasible to brute-force attack if you do not.

However, you need to do a little more work to make it safe.

The details here matter a lot, so this section is unavoidably going to be a little dense.

Pass the Salt?

Argon2id expects both a password and a salt.

If you eschew the salt (i.e., zero it out), you open the door to precomputation attacks (see also: rainbow tables) that would greatly weaken the security of this plaintext commitment scheme.

You need a salt.

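If you generate the salt randomly, this commitment property isn’t guaranteed by the algorithm. It would be difficult, but probably not impossible, to find two salts ($S_1$, $S_2$) such that $\mathrm{Argon2id}(P_1, S_1) = \mathrm{Argon2id}(P_2, S_2)$ for two different plaintexts $P_1 \neq P_2$.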

Deriving the salt from public inputs eliminates this flexibility.

By itself, this reintroduces the risk of making salts totally deterministic, which reintroduces the risk of precomputation attacks (which motivated the salt in the first place).

If you include the plaintext in this calculation, it could also create a crib that gives attackers a shortcut for bypassing the cost of password hashing.

Furthermore, any two encryption operations that act over the same plaintext would, without any additional design considerations, produce an identical value for the plaintext commitment.

Public Inputs for Salt Derivation

The initial proposal included the plaintext value for Argon2 salt derivation, and published the salt and Argon2 output next to each other.

Hacker News user comex pointed out a flaw with this technique, so I’ve since revised how salts are selected to make them independent of the plaintext.

The public inputs for the Argon2 salt are now:

The version identifier prefix for the ciphertext blob.
The 256-bit random value used as a KDF salt (also stored in the ciphertext blob).
A recent Merkle tree root.
The attribute name (prefixed by its length).

These values are all hashed together with SHA-512, and then truncated to 128 bits (the length required by libsodium for Argon2 salts).

This salt is not stored, but can deterministically be calculated from public information.
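
A minimal sketch of that salt derivation, assuming little-endian 64-bit length prefixes and taking the rightmost 16 bytes (as the algorithm listing later in this post does). The function name and the sample inputs are hypothetical, and the domain separation constant mentioned later is omitted here.

```python
import hashlib
import struct

def derive_argon2_salt(version_prefix: bytes, kdf_salt: bytes,
                       recent_merkle_root: bytes, attribute_name: bytes) -> bytes:
    """Hash the public inputs with SHA-512 and truncate to 128 bits."""
    h = hashlib.sha512()
    h.update(version_prefix)                             # ciphertext version identifier
    h.update(kdf_salt)                                   # 256-bit random value from the blob
    h.update(recent_merkle_root)
    h.update(struct.pack("<Q", len(attribute_name)))     # length prefix (assumed encoding)
    h.update(attribute_name)
    return h.digest()[-16:]

# Hypothetical inputs: two records that share a Merkle root and attribute name.
root = bytes(32)
salt_a = derive_argon2_salt(b"v1", b"\x01" * 32, root, b"actor-id")
salt_b = derive_argon2_salt(b"v1", b"\x02" * 32, root, b"actor-id")
print(salt_a != salt_b)
```

Every input is public, so anyone can recompute the salt, yet none of the inputs depends on the plaintext.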

Crisis Averted?

This sure sounds like we’ve arrived at a solution, but let’s also consider another situation before we declare our job done.

High-traffic Public Key Directories may have multiple users push a protocol message with the same recent Merkle root.

This may happen if two or more users query the directory to obtain the latest Merkle root before either of them publish their updates.

Later, if both of these users issue a legal takedown, someone might observe that the recent-merkle-root is the same for two messages, but their commitments differ.

Is this enough leakage to distinguish plaintext records?

In my earlier design, we needed to truncate the salt and rely on understanding the birthday bound to reason about its security. This is no longer the case, since each salt is randomized by the same random value used in key derivation.

Choosing Other Parameters

As mentioned a second ago, we set the output length of the Argon2id KDF to 32 bytes (256 bits). We expect the security of this KDF to exceed $2^{128}$, which to most users might as well be infinity.

With apologies to Filippo.

The other Argon2id parameters are a bit hand-wavey. Although the general recommendation for Argon2id is to use as much memory as possible, this code will inevitably run in some low-memory environments, so asking for several gigabytes isn’t reasonable.

For the first draft, I settled on 16 MiB of memory, 3 iterations, and a parallelism degree of 1 (for widespread platform support).

Plaintext Commitment Algorithm

With all that figured out, our plaintext commitment algorithm looks something like this:

1. Calculate the SHA512 hash of:
   - A domain separation constant
   - The header prefix (stored in the ciphertext)
   - The randomness used for key-splitting in encryption (stored in the ciphertext)
   - Recent Merkle Root
   - Attribute Name Length (64-bit unsigned integer)
   - Attribute Name
2. Truncate this hash to the rightmost 16 bytes (128 bits). This is the salt.
3. Calculate Argon2id over the following inputs concatenated in this order, with an output length of 32 bytes (256 bits), using the salt from step 2:
   - Recent Merkle Root Length (64-bit unsigned integer)
   - Recent Merkle Root
   - Attribute Name Length (64-bit unsigned integer)
   - Attribute Name
   - Plaintext Length (64-bit unsigned integer)
   - Plaintext

The output (step 3) is included as the AAD in the attribute encryption step, so the authentication tag is calculated over both the randomness and the commitment.

To verify a commitment (which is extractable from the ciphertext), simply recalculate the commitment you expect (using the recent Merkle root specified by the record), and compare the two in constant time.

If they match, then you know the plaintext you’re seeing is the correct value for the ciphertext value that was committed to the Merkle tree.
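
Put together, the commit and verify steps might look roughly like this. Treat it as a sketch under assumptions, not the reference implementation: the domain separation constant, the little-endian length encoding, and the function names are made up here. It uses `hash_secret_raw` from the argon2-cffi package when that is installed; otherwise it falls back to scrypt from the standard library, which is memory-hard but is not Argon2id and will produce different outputs.

```python
import hashlib
import hmac
import struct

try:
    from argon2.low_level import Type, hash_secret_raw  # argon2-cffi, optional
except ImportError:
    hash_secret_raw = None

DOMAIN = b"example-plaintext-commitment"  # hypothetical domain separation constant

def u64(n: int) -> bytes:
    return struct.pack("<Q", n)  # assumed encoding for 64-bit lengths

def memory_hard_kdf(password: bytes, salt: bytes) -> bytes:
    if hash_secret_raw is not None:
        # 16 MiB of memory, 3 iterations, parallelism 1, 32-byte output.
        return hash_secret_raw(secret=password, salt=salt, time_cost=3,
                               memory_cost=16 * 1024, parallelism=1,
                               hash_len=32, type=Type.ID)
    # Fallback STAND-IN (not Argon2id): scrypt using roughly 16 MiB.
    return hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)

def commit(plaintext: bytes, attr_name: bytes, merkle_root: bytes,
           header: bytes, key_salt: bytes) -> bytes:
    # Steps 1 and 2: the salt comes from public inputs only.
    salt = hashlib.sha512(
        DOMAIN + header + key_salt + merkle_root + u64(len(attr_name)) + attr_name
    ).digest()[-16:]
    # Step 3: the password-hash input binds the root, attribute name, and plaintext.
    password = (u64(len(merkle_root)) + merkle_root
                + u64(len(attr_name)) + attr_name
                + u64(len(plaintext)) + plaintext)
    return memory_hard_kdf(password, salt)

def verify(commitment: bytes, plaintext: bytes, attr_name: bytes,
           merkle_root: bytes, header: bytes, key_salt: bytes) -> bool:
    expected = commit(plaintext, attr_name, merkle_root, header, key_salt)
    return hmac.compare_digest(commitment, expected)  # constant-time comparison
```

A client that is handed a plaintext by the directory recomputes the commitment itself and compares; a directory that serves a different plaintext than the one it committed to fails that check.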

If the encryption key is shredded in the future, an attacker without knowledge of the plaintext will have an enormous uphill battle recovering it from the KDF output (and the salt will prove to be somewhat useless as a crib).

Caveats and Limitations

Although this design does satisfy the specific criteria we’ve established, an attacker that already knows the correct plaintext can confirm that a specific record matches it via the plaintext commitment.

This cannot be avoided: If we are to publish a commitment of the plaintext, someone with the plaintext can always confirm the commitment after the fact.

Whether this matters at all to the courts is a question for which I cannot offer any insight.

Remember, we don’t even know if any of this is actually necessary, or if “moderation and platform safety” is a sufficient reason to sidestep the right to erasure.
If the courts ever clarify this adequately, we can simply publish the mapping of Actor IDs to public keys and auxiliary data without any crypto-shredding at all.

Trying to attack it from the other direction (download a crypto-shredded record and try to recover the plaintext without knowing it ahead of time) is the attack angle we’re interested in.

Herd Immunity for the Forgotten

Another interesting implication that might not be obvious: The more Fediverse servers and users publish to a single Public Key Directory, the greater the anonymity pool available to each of them.

Consider the case where a user has erased their previous Fediverse account and used the GDPR to also crypto-shred the Public Key Directory entries containing their old Actor ID.

To guess the correct plaintext, you must not only brute-force guessing possible usernames, but also permute your guesses across all of the instances in scope.

The more instances there are, the higher the cost of the attack.
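
As a back-of-the-envelope illustration, the search space grows multiplicatively with the number of instances. Every number below is hypothetical (the size of the username list, the instance counts, the time per guess), and a real attacker would parallelize, so read this as a ratio rather than a forecast.

```python
def search_space(candidate_usernames: int, instances: int) -> int:
    """Number of username@instance pairs an attacker may have to try."""
    return candidate_usernames * instances

def expected_attack_seconds(candidate_usernames: int, instances: int,
                            seconds_per_guess: float) -> float:
    # On average the right guess turns up halfway through the search space.
    return search_space(candidate_usernames, instances) * seconds_per_guess / 2

# Hypothetical figures: 10 million candidate usernames, 50 ms per KDF evaluation.
single = expected_attack_seconds(10_000_000, 1, 0.05)
many = expected_attack_seconds(10_000_000, 20_000, 0.05)
print(f"one instance: {single / 86_400:.1f} days")
print(f"20,000 instances: {many / (86_400 * 365):.0f} years")
```

Each guess costs one full memory-hard KDF evaluation, so the per-guess time is set by the Argon2id parameters rather than by a fast hash.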

Recap

I tasked myself with designing a Key Transparency solution that doesn’t make complying with Article 17 of the GDPR nigh-impossible. To that end, crypto-shredding seemed like the only viable way forward.

A serialized record containing ciphertext for each sensitive attribute would be committed to the Merkle tree. The directory would store the key locally and serve plaintext until a legal takedown was requested by the user who owns the data. Afterwards, the stored ciphertext committed to the Merkle tree is indistinguishable from random for any party that doesn’t already know the plaintext value.

I didn’t want to allow Public Key Directories to lie about the plaintext for a given ciphertext, given that they know the key and the requestor doesn’t.

After considering zero-knowledge proofs and finding them to not be a perfect fit, I settled on designing a plaintext commitment scheme based on the Argon2id password KDF. The KDF salts can be calculated from public inputs.

Altogether, this meets the requirements of enabling crypto-shredding while keeping the Public Key Directory honest. All known attacks for this design are prohibitively expensive for any terrestrial threat actors.

As an added bonus, I didn’t introduce anything fancy. You can build all of this with the cryptography available to your favorite programming language today.

Closing Thoughts

If you’ve made it this far without being horribly confused, you’ve successfully followed my thought process for developing message attribute shreddability in my Public Key Directory specification.

This is just one component of the overall design proposal, but one that I thought my readers would enjoy exploring in greater detail than the specification needed to capture.

Header art: Harubaki, CMYKat.

(This post was updated on 2024-11-22 to replace the incorrect term “PII” with “personal data”. Apologies for the confusion!)

#Argon2 #crypto #cryptography #E2EE #encryption #FederatedPKI #fediverse #passwordHashing #symmetricCryptography

You could have forgotten about #GRUB several years ago already: it is evidently under someone's influence and still refuses to implement support for #LUKS version 2 where the use of #Argon2 / #Argon2id is concerned.

This matters because the world is full of cryptocurrency mining farms that have, one way or another, been seized by law enforcement. These are #ASIC devices purpose-built for brute-forcing hash function values. As a result, it has become possible to brute-force almost every disk encryption setup in which the password is protected by an ordinary #PBKDF2 / #PBKDF that is not hardened against crypto-mining farms. An example of a proper modern #PBKDF is #Argon2 and its variants.

The alternative is what #systemd-boot allows. For example, for full-disk encryption via LUKS, take an SSD/NVMe partitioned with #GPT, with an EFI System Partition set aside to hold the #initrd / #initramfs images and the systemd-boot loader binaries, which are EFI applications.

Everything on the EFI System Partition can be verified by Secure Boot, that is, signed with your own certificate in the tree. Not only the binaries, but also the text *.conf files in /boot/loader/entries/ that describe each boot option, since they contain things like:

title    ... (what the entry is called in the boot menu)
linux    /vmlinuz-6.6-x86_64 (which OS kernel to use)
initrd    /intel-ucode.img (which CPU microcode to load)
initrd    /initramfs-6.6-x86_64.img (the boot image itself)
...
options quiet (these can also all go on a single line)
options splash
options rd.udev.log_level=3
options systemd.show_status=auto
options sysrq_always_enabled=1
options intel_iommu=on
options iommu=pt
...

In other words, the *.conf files can contain all sorts of kernel options/parameters that affect the security of the running system. For example, #ФСТЭК (FSTEC) recently amused itself by handing out such recommendations (here is the original official publication).

On the differences between the #Argon2 variants, and why RFC 9106 recommends using #Argon2id:

#Argon2d maximizes resistance to GPU cracking attacks. It accesses the memory array in a password dependent order, which reduces the possibility of time–memory trade-off (TMTO) attacks, but introduces possible side-channel attacks.
#Argon2i is optimized to resist side-channel attacks. It accesses the memory array in a password independent order.
#Argon2id is a hybrid version. It follows the #Argon2i approach for the first half pass over memory and the #Argon2d approach for subsequent passes.

#linux #crypto #lang_ru

It's been, I've lost count, 6 years? And still GRUB2 does not support #argon2 while #pbkdf2 absolutely gets stomped.

Effectively rendering #fde with encrypted boot insecure on Linux (x86). While with u-boot on ARM you can use argon2.

I do not understand why this isn't a high-priority issue for GNU.

Thinking about FIPS and Argon2 leads to some weird places.

https://scottarc.blog/2024/06/17/the-quest-for-the-gargon/

#FIPS #cryptography #argon2

Musing about Password-Based Cryptography for the Government

What would a modern NIST standard for password-based cryptography look like?

Obviously, we have PBKDF2–which, if used with a FIPS-approved hash function, gives you a way to derive encryption keys and/or password validators from human-memorable secrets.

However, PBKDF2 isn’t memory-hard.

In 2012, several cryptographers initiated the Password Hashing Competition (PHC) to study the state-of-the-art for password-based cryptography at the time. Part of this motivation was that memory-hard hashing (first developed by Colin Percival in scrypt a few years prior) provided greater defense against the increasing parallelism of modern password cracking techniques.

After a few years of cryptanalysis, the PHC selected an algorithm called Argon2, and gave special recognition to four other finalists.

And, to quote NIST SP 800-63B:

A memory-hard function SHOULD be used because it increases the cost of an attack.
If you were expecting, “Nevermore,” you’re currently reading the wrong literary genre.

“So, we’re done, right? Just use Argon2 and call it a day.”

We did it! Yayyyyyyyy~

…

Of course, it’s not that simple.

(Artist source unknown, meme generated from imgflip)

What is Argon2?

Argon2 is defined in IETF RFC 9106. There are several variants of Argon2 that have subtly different security properties (Argon2d, Argon2i, Argon2id, and Argon2ds; the last provides a property called cache-hardness, which Steve Thomas’s slide deck from BSidesLV 2022 explores in depth).

Argon2id is the variant most of us settled on in 2024.

Regardless of the variant used, the same underpinnings are used. From RFC 9106, section 3.2:

Argon2 uses an internal compression function G with two 1024-byte inputs, a 1024-byte output, and an internal hash function H^x(), with x being its output length in bytes. Here, H^x() applied to string A is the BLAKE2b ([BLAKE2], Section 3.3) function, which takes (d,ll,kk=0,nn=x) as parameters, where d is A padded to a multiple of 128 bytes and ll is the length of d in bytes. The compression function G is based on its internal permutation. A variable-length hash function H’ built upon H is also used. G is described in Section 3.5, and H’ is described in Section 3.3.
Bold text for emphasis.
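
To make the dependency on BLAKE2b concrete, here is a minimal Python sketch of the variable-length hash H' as I read it in RFC 9106, Section 3.3, built on hashlib's BLAKE2b. It is an illustration rather than a vetted implementation; the function name h_prime is my own.

```python
import hashlib
import struct


def h_prime(data: bytes, out_len: int) -> bytes:
    """Variable-length hash H'^T(A) built on BLAKE2b (sketch of RFC 9106, 3.3)."""
    if out_len < 1:
        raise ValueError("out_len must be at least 1 byte")
    prefixed = struct.pack("<I", out_len) + data  # LE32(T) || A
    if out_len <= 64:
        # Short outputs are a single BLAKE2b call with digest length T.
        return hashlib.blake2b(prefixed, digest_size=out_len).digest()
    r = -(-out_len // 32) - 2  # ceil(T / 32) - 2
    v = hashlib.blake2b(prefixed, digest_size=64).digest()  # V_1
    parts = [v[:32]]  # W_1 = first 32 bytes of V_1
    for _ in range(r - 1):
        v = hashlib.blake2b(v, digest_size=64).digest()  # V_2 .. V_r
        parts.append(v[:32])
    # V_{r+1} supplies the remaining T - 32*r bytes (between 33 and 64).
    parts.append(hashlib.blake2b(v, digest_size=out_len - 32 * r).digest())
    return b"".join(parts)


if __name__ == "__main__":
    print(len(h_prime(b"example", 1024)))
```

Every call site here is BLAKE2b, which is why the FIPS question applies to the whole construction and not only to one step.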

If you weren’t adept at playing Crypto Algorithm Bingo, it might be easy to miss the fact that BLAKE2b is NOT a cryptographic algorithm approved for use in FIPS validated modules.

So, full stop, unless NIST and the US Department of Commerce turn over a new leaf and add BLAKE2 to the approved algorithms list for FIPS, this is a non-starter.

Well, why not use yescrypt? Or scrypt for that matter?

Yescrypt (and scrypt before it) is based on Salsa20/8. In fact, most of the time spent computing a KDF output with either algorithm goes to Salsa20-encrypting regions of memory.

After all the computing resources are spent on Salsa20/8 and memory management, PBKDF2-SHA256 is used to compress the output to a fixed length. This is arguably complying with NIST’s requirements to use PBKDF2–albeit with an iteration count of 1 (so it’s just artificially sweetened HMAC, if we’re being honest with ourselves).
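
The "artificially sweetened HMAC" remark can be checked directly. The sketch below uses Python's standard hashlib and hmac modules and placeholder inputs: with an iteration count of 1 and a 32-byte output, PBKDF2-HMAC-SHA256 is just HMAC-SHA256 over the salt followed by the 4-byte big-endian block counter 1.

```python
import hashlib
import hmac

password = b"correct horse battery staple"  # placeholder input
salt = b"example-salt"                      # placeholder input

# PBKDF2 with a single iteration and one 32-byte output block.
pbkdf2_out = hashlib.pbkdf2_hmac("sha256", password, salt, 1, dklen=32)

# The same value computed as plain HMAC: key = password,
# message = salt || INT_32_BE(1), the counter of the first block.
hmac_out = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()

print(pbkdf2_out == hmac_out)
```

Longer outputs only repeat this with block counters 2, 3, and so on, so no extra work factor is added.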

How are systems complying today?

I’ve heard a few conflicting stories over the years from folks that care a lot about FIPS (presumably because the US government is a significant chunk of their annual recurring revenue). It’s possible I’m misremembering what they said, so please take these secondhand anecdotes with an appropriate amount of salt.

One person claimed that scrypt is fine since “the last step is PBKDF2”, and if an auditor blinks, you allegedly just need to document all the Salsa20 stuff as “obfuscation” and PBKDF2 is what you’re really doing to comply.

Another approach I heard was to run a memory-hard KDF in parallel with PBKDF2, then use HKDF to combine the two outputs.
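
A rough Python sketch of that second approach, under loud assumptions: scrypt stands in for "a memory-hard KDF", the parameters and the info label are illustrative, and nothing here is claimed to be what any vendor actually ships or what an auditor would accept. HKDF is written out by hand following RFC 5869 because Python's standard library does not expose it.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) extract-then-expand using HMAC-SHA256."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand step: T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def combined_kdf(password: bytes, salt: bytes, pbkdf2_iterations: int = 600_000) -> bytes:
    """Derive a key from two branches and combine them with HKDF (sketch only)."""
    # Memory-hard branch (assumption: scrypt with illustrative parameters).
    hard = hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1,
                          maxmem=64 * 1024 * 1024, dklen=32)
    # PBKDF2 branch; the iteration count is illustrative, not a recommendation.
    approved = hashlib.pbkdf2_hmac("sha256", password, salt, pbkdf2_iterations, dklen=32)
    # The info string is a hypothetical domain-separation label.
    return hkdf_sha256(approved + hard, salt, b"combined-kdf-example")


if __name__ == "__main__":
    print(combined_kdf(b"hunter2", b"0123456789abcdef").hex())
```

An attacker has to compute both branches for every guess, so the combined cost is at least that of the memory-hard branch.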

Between the two, I’m more likely to believe that an auditor would approve the latter HKDF-based design, but I’ve never worked at a NIST CMVP lab, so who knows?

Unfortunately, NIST SP 800-63B has little to say about the specifics:

Examples of suitable key derivation functions include Password-based Key Derivation Function 2 (PBKDF2) [SP 800-132] and Balloon [BALLOON]. A memory-hard function SHOULD be used because it increases the cost of an attack.

I already said that PBKDF2 isn’t memory hard, so that’s useless here.

The other example they gave, Balloon Hashing, is frankly a weird recommendation to make, given the lack of a stable reference implementation and how poorly specified it is.

This is starting to look like a catch-22. Maybe we would be better off not supporting passwords anymore.

But what if you can’t make that decision?

What would a modern NIST standard for password-based cryptography even look like?

Towards Gargon: Government-flavored Argon2

Is that last question even answerable?

I argue, “Probably yes.” From the introduction to RFC 9106:

Argon2 is also a mode of operation over a fixed-input-length compression function G and a variable-input-length hash function H. Even though Argon2 can be potentially used with an arbitrary function H, as long as it provides outputs up to 64 bytes, the BLAKE2b function [BLAKE2] is used in this document.

Clearly, the Argon2 RFC authors intended to allow the hash function to be swapped out for another one.

So can we just str_replace() BLAKE2b with SHA512 (or SHA3-512) and call our job done?

No, that would be too easy.

The internal compression function, G

Argon2’s design involves computing the internal compression function, G, over regions of memory. The linked section of that version of RFC 9106 provides a good overview of the construction.

G is defined in terms of the permutation, P.
P is based on the round function of BLAKE2b.
The BLAKE2b round function is based on ChaCha, which is similar to Salsa20 (and designed by the same author), which we already established isn’t approved for FIPS.

So if we’re going to invent a Government-tolerable variant of Argon2, we’ll need to be a bit more creative about our choice for G as well.

More precisely, even if we keep the overall structure of G intact, we’ll need to define a FIPS-able permutation, P.

The permutation, P, for building the internal compression function, G

A reasonable person would assume we would need to pick a component from the hash function we’re building atop which has an increased circuit depth. After all, that’s what the Argon2 designers did:

The modular additions in GB are combined with 64-bit multiplications. Multiplications are the only difference from the original BLAKE2b design. This choice is done to increase the circuit depth and thus the running time of ASIC implementations, while having roughly the same running time on CPUs thanks to parallelism and pipelining.
RFC 9106
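
For reference, the quarter-round GB that the quote describes can be written out in a few lines of Python. This follows my reading of RFC 9106, Section 3.6; the helper names are my own, and the sketch is meant to show where the 64-bit multiplications enter, not to be used as a real implementation.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def rotr64(x: int, n: int) -> int:
    """Rotate a 64-bit word right by n bits."""
    return ((x >> n) | (x << (64 - n))) & MASK64


def mul_add(x: int, y: int) -> int:
    """x + y + 2 * trunc(x) * trunc(y) mod 2^64, where trunc keeps the low 32 bits.

    BLAKE2b uses a plain modular addition at this position; the extra
    multiplication term is the Argon2 change described in the quote.
    """
    return (x + y + 2 * (x & MASK32) * (y & MASK32)) & MASK64


def gb(a: int, b: int, c: int, d: int):
    """One GB quarter-round on four 64-bit words."""
    a = mul_add(a, b)
    d = rotr64(d ^ a, 32)
    c = mul_add(c, d)
    b = rotr64(b ^ c, 24)
    a = mul_add(a, b)
    d = rotr64(d ^ a, 16)
    c = mul_add(c, d)
    b = rotr64(b ^ c, 63)
    return a, b, c, d


if __name__ == "__main__":
    print([hex(w) for w in gb(1, 2, 3, 4)])
```

Any FIPS-able replacement for P would need a component playing the same role as mul_add: cheap on CPUs, deep in hardware.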

And this is where reasonableness hits a wall. There are several directions that one could go to invent Government-tolerable Argon2.

The SHA-2 family compression function (i.e., Ch, Maj, Σ0, and Σ1).
The basic block permutation function from SHA3 (i.e., θ, ρ, π, χ, and ι).
Look elsewhere in the FIPS algorithm suite, such as AES (e.g., in Counter Mode, to exploit the hardware acceleration of AES in modern CPUs).

Each of these ideas is terrible in its own way.

The cryptanalysis results showing that the best attack against a full hash function costs 2 to some power queries don’t imply the security of each constituent component. So you’re really rolling the dice if you pursue this.

AES might be okay, depending on how it’s constructed and used. But the devil’s always in the details.

It’s starting to seem like Gargon’s possibility is fleeting, after all.

Wouldn’t life be simpler if NIST just approved BLAKE2b and/or Argon2 for use in FIPS validated modules?

Yes, life would be much simpler. NIST should do that.

Unfortunately, until that day comes, there are yet more windmills that need tilting.

https://scottarc.blog/2024/06/17/the-quest-for-the-gargon/

#Argon2 #crypto #Cryptography #CryptographyStandards #cybersecurity #encryption #FIPS #NIST #passwordBasedCryptography #passwords #PBKDF2 #security

Is anyone else missing the "ad" parameter (optional secret data) in the Argon2 command line tool?
🤔 AFAIK it's also missing from all JavaScript bindings. Not the case for the active Rust binding, though, which does use this secret parameter.

#argon2 #hash #keyed_hashing #hashing #password_hashing #linux

🎊 We are starting #2024 off right! 🎉

With the latest update all Tuta accounts are now utilizing #Argon2 and #AES256 encryption by default.🔒💪

This security improvement is the next step towards full #postquantum encryption!
👉 https://tuta.com/blog/aes-256-encryption


This is a #PBKDF2 hate account ! Vive la #Argon2 🎉 !

#CryptoAsInCryptography

"Since the time PBKDF2 was designed, we’ve seen the rise of powerful GPUs become common place. To defend against this rising onslaught of GPU hashing power is a relatively new algorithm, argon2."

https://blog.dataparty.xyz/blog/wtf-is-a-kdf/#how-does-argon2-work

#cryptography #argon2 #pbkdf2

How Does Argon2 Work?
The cryptographic power of argon2 is subtle but brilliant. Instead of focusing on CPU time by requiring large numbers of hash iterations, argon2 wages war on your GPU's memory capacity. When hashing a password with argon2, an application developer can dial up the amount of RAM that is required to complete the computation. In so doing, it starves the globs of highly parallel computation cores in a GPU, reducing the total processing power the GPU can bring to bear.
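
A back-of-the-envelope Python sketch of that effect. The 24 GiB device and the memory settings are hypothetical numbers picked for illustration, and the model ignores memory bandwidth and any time-memory trade-offs; it only shows how raising the memory cost caps the number of guesses a device can run at once.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def max_parallel_guesses(device_memory_bytes: int, memory_per_hash_bytes: int) -> int:
    """Upper bound on password guesses that fit in device memory at once."""
    return device_memory_bytes // memory_per_hash_bytes


if __name__ == "__main__":
    gpu_memory = 24 * GIB  # hypothetical GPU
    for per_hash in (64 * MIB, 256 * MIB, 1 * GIB):
        guesses = max_parallel_guesses(gpu_memory, per_hash)
        print(f"{per_hash // MIB} MiB per hash: at most {guesses} guesses in flight")
```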

🔒 Password security is crucial for developers & system administrators. OpenLDAP does offer MD5 & SHA1, but these are insecure. Learn in the latest blog post how to compile OpenLDAP with the secure argon2 algorithm and set it as the default.

https://www.puzzle.ch/de/blog/articles/2023/08/08/enhancing-openldap-security-with-argon2

#Passwortsicherheit #OpenLDAP #argon2 #Cybersicherheit

mini-project announcement!! (but finally something new)

https://codeberg.org/valpackett/argon2ian is #argon2 built as #wasm #webassembly for evergreen browsers and #deno, but like, size-optimized for real. Only 8.5 KB for the whole async (web workers powered) JS wrapper, and that's with everything inline, no external file loads at all – completely bundle-able like a normal JS module. No text encoding for the hashes though, just the raw stuff.

p.s. if anyone is interested in cronching some other library like this, you could maybe hire me for such a project :)

Hi @sc00bz and @epixoip, I recently came across your recommendations not to (blindly) use #Argon2 as a #PHF (though it's a good #KDF), because it requires runtimes that make it (usually) inapplicable for password hashing. Or, phrased differently, staying performant would require lowering the security parameters so far that the security of the hashing would be compromised.

The #Bcrypt article on Wikipedia put forth a similar claim, but without any citations and phrased a bit misleadingly (IMO). I've adjusted the article and added two citations. If you have time, I'd be glad if you could give some feedback on this, as there are only a few citable sources on this and I'm far from an expert on the matter:

https://en.wikipedia.org/w/index.php?title=Bcrypt&diff=prev&oldid=1157855165

Thank you!

@niconiconi @Moon @PeterCxy @a1ba Doesn't it still lack support for a bunch of LUKS settings & modes that have been around for a while?

Or is this warning (https://wiki.archlinux.org/title/GRUB#Encrypted_/boot) finally obsolete and wrong somewhere?

At least going by (info "(grub) cryptomount") (https://www.gnu.org/software/grub/manual/grub/html_node/cryptomount.html) on Debian argon2id still isn't supported despite PBKDF2 being recommended against nowadays.

#GRUB #LUKS #PBKDF2 #PBKDF #Argon2 #EncryptedBoot

*Edit*: I am being ridiculous here: I forgot to run with --release flag. 🤦‍♂️ So while the performance differences are really there, it’s more like factor four from fastest (just-argon2) to slowest (argon2) implementation.

Interesting conclusion on the state of the #rustlang ecosystem: the only #Argon2 implementation still under active development (argon2) is also by far the slowest one. Unless I mixed up some numbers, it is six times slower than rust-argon2 and four times slower than argon2rs.

The really fast implementations are the ones wrapping the argon2 C library; these haven’t been updated in years, however, and often provide a really awkward API. While argonautica has non-trivial system dependencies, just-argon2 works without.

Well, guess just-argon2 it is…

@mjg59

Thank you for sounding the alert!

I identified a minor issue with your otherwise nice explanation: According to my sources (man cryptsetup, #rfc9106), all #argon2 varieties are memory-hard. RFC 9106 is even titled “Argon2 Memory-Hard Function for Password Hashing and Proof-of-Work Applications”.

However, given that there are known attacks against #argon2i, it seems wise to use #argon2id instead. It is also what is recommended in the RFC.

As a #QubesOS user, I just checked the state of affairs there:

The cryptsetup that comes with QubesOS 3.x used #luks1, and those who did an in-place upgrade to 4.x still have that unless they converted to #luks2 manually (as detailed in the migration guide).

The cryptsetup in QubesOS 4.x uses #luks2, but it still defaults to #argon2i unfortunately.

Means of storing credential hashes/salts in public, viability check

https://security.stackexchange.com/questions/269316/means-of-storing-credential-hashes-salts-in-public-viability-check

#passwordcracking #credentials #dictionary #argon2 #hash

OpenSSL merged the Argon2 KDF

#cryptography #OpenSSL #Argon2

https://github.com/openssl/openssl/pull/12256

#Argon2
