The microblog

Microblog entries before 2022.11.15 appeared on Twitter (@hashbreaker; see history of the name), and were archived here on 2022.11.18. Newer microblog entries are appearing on Mastodon, on Twitter, and here (with timestamps from Mastodon).

2024.02.15 18:17:54: 2009: "Not covered in this talk: other types of DoS attacks. e.g. DNSSEC advertising says zero server-CPU-time cost. How much server CPU time can we actually consume?" Also posed the question in some later talks. Most recent answer:

2024.02.01 00:18:36: Columbia Accident Investigation Board, final report, 2003, volume 1, page 191: "The Board views the endemic use of PowerPoint briefing slides instead of technical papers as an illustration of the problematic methods of technical communication at NASA."

2024.01.19 21:03:54: Recent claims of exponents for supposedly well-studied lattice attacks considering memory-access costs: 2023.11, 0.396! Oops, wait, 0.349! 2023.12, 0.349, or 0.329 in 3D! 2024.01, 0.311, or 0.292 in 3D!
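
Rough arithmetic sketch for the entry above, assuming each exponent c is the usual leading-term constant in a cost of 2^(c*d + o(d)) for dimension d. The dimension 400 is a hypothetical value chosen only for illustration; it does not come from the cited claims, and the o(d) terms are ignored.

```python
# Exponents quoted in the entry above, in the order they were claimed.
CLAIMED_EXPONENTS = [0.396, 0.349, 0.329, 0.311, 0.292]

# Hypothetical dimension, used only to make the arithmetic concrete.
ILLUSTRATIVE_DIMENSION = 400


def leading_bits(exponent: float, dimension: int) -> float:
    """log2 of the leading-term cost 2^(exponent * dimension), ignoring o(d)."""
    return exponent * dimension


if __name__ == "__main__":
    first = leading_bits(CLAIMED_EXPONENTS[0], ILLUSTRATIVE_DIMENSION)
    for c in CLAIMED_EXPONENTS:
        bits = leading_bits(c, ILLUSTRATIVE_DIMENSION)
        print(f"exponent {c:.3f}: 2^{bits:.1f}, drop of {first - bits:.1f} bits")
```

The point of the sketch is only that small changes in the exponent are multiplied by the dimension, so a drop from 0.396 to 0.292 is a large change in claimed cost at any cryptographic size.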

2024.01.18 23:46:16: Is there a name for the following failure pattern? (1) "Don't worry about flaws in defense X: we have Y as another layer of defense." (2) "Don't worry about flaws in Y: we also have X." (3) "This real-world attack exploited flaws in X _and_ in Y? Nobody could have expected that!"

2024.01.18 17:59:00: Puzzled by AMD manuals saying that VPMASKMOV might fault on unused addresses. Does it actually do that on any AMD chips? Intel manuals guarantee it won't, making it useful for 256-bit processing of array lengths that aren't multiples of 256 bits, but could such code crash on AMD?

2024.01.17 00:55:48: Updated sortbench (int32 arrays, AVX2) to add Intel's x86-simd-sort, add the "fast-and-robust" library, upgrade to latest version of Google's vqsort, support current vxsort, and include baseline std::sort: Let me know if I've missed a competitive library.

2024.01.03 07:30:58: A recent preprint "The Planck Constant and Quantum Fourier Transformation" suggests that Shor is unimplementable since it involves tiny rotations. But Coppersmith pointed out in 1994 that Shor works _without_ the tiny rotations.

2024.01.02 17:13:05: New blog post "Double encryption: Analyzing the NSA/GCHQ arguments against hybrids." #nsa #quantification #risks #complexity #costs

2023.12.19 09:04:24: New KyberSlash resource page about Kyber libraries using divisions on secret inputs and thus leaking secret information into timings in various environments: Tracks various Kyber libraries to see which ones have this division and which ones have patched.

2023.12.17 20:29:35: GCHQ claims cryptosystem X has "significantly more complexity" than cryptosystem Y. Do you find yourself asking for quantification? New paper "Analyzing the complexity of reference post-quantum software": Spinoff: found timing leaks in most Kyber software.

2023.12.08 20:00:24: Cost exponents for lattice attacks continue their relentless march downwards. New paper "Asymptotics of hybrid primal lattice attacks": Was tempted to add a picture of the mascot for lattice-based cryptography: a slow-boiled frog.

2023.12.06 14:14:34: Have you run into people hyping the costs of post-quantum encryption? Do you find yourself wondering what the actual costs are for various cryptographic applications? I have a new paper "Predicting performance for post-quantum encrypted-file systems":

2023.11.26 21:51:34: In case it's useful for more people, posted a small script built on top of pypdf to magnify giant-margin conference/book PDFs into reasonable-margin PDFs, while preserving hyperlinks (unlike pdfjam) and anchors. More information:

2023.11.25 20:56:29: New blog post "Another way to botch the security analysis of Kyber-512": #nist #uncertainty #errorbars #quantification

2023.11.23 21:56:17: New paper "Quantifying risks in cryptographic selection processes": of the 69 round-1 post-quantum submissions, 48% are broken by now (smallest parameters known easier to break than AES-128); of those unbroken in round 1, 25%; of those picked by NIST, 36%.

2023.10.23 18:33:51: New blog post: "Reducing 'gate' counts for Kyber-512: Two algorithm analyses, from first principles, contradicting NIST's calculation." #xor #popcount #gates #memory #clumping Also via Cloudflare given the frequent DoS attacks:

2023.10.13 14:38:24: Regarding the ad-hominem attacks that have just been posted by @matthew_d_green on Mastodon and on Twitter, I've replied directly to each of his Mastodon postings with easily verifiable facts to the contrary, and will be happy to engage in followup discussion there.

2023.10.06 11:37:32: For obsolete attacks from 2020, estimates "2^151 gates" in "RAM model" break Kyber-512. NIST says memory adds "40 bits": 2^191. NIST waves at, but Table 2 there estimates that if you scale up to sntrup653 then memory costs 2^169.

2023.10.06 11:02:01: For people who haven't had time to read the full analysis, here's something much shorter: I sent a message to NIST's mailing list spelling out how NIST's "40 bits of security more than would be suggested by the RAM model" claim fails a simple sanity check.

2023.10.05 20:16:54: Saw a Micciancio slide in an Intel event today advertising Kyber's sizes specifically for Kyber-512. Glanced at a NIST draft standard that includes Kyber-512. Then heard online that overestimating Kyber's security level is ok since everyone has moved to Kyber-768. Wait, what?

2023.10.03 19:33:12: New blog post "The inability to count correctly: Debunking NIST's calculation of the Kyber-512 security level." On a related note, announces a followup FOIA lawsuit filed today. #nist #addition #multiplication #ntru #kyber #fiasco

2023.09.17 16:23:08: Plugging AES-256 into "beyond-birthday-bound security" has a lower security level and is easier to screw up than "bigger-birthday-bound security". This makes "beyond" attractive for academic cryptographers writing papers, and, as illustrates, for NSA etc.

2023.09.17 10:39:23: Fact check: Lattice KEMs and signatures do _not_ have the security proofs claimed by @sweis ( in ("if you could solve this average case [crypto problem] then you could solve this, worst case [SVP] which is known to be NP-hard").

2023.09.10 10:29:16: For comparison: Dustin Moody, NIST, public talk slides: "Engagement with community and stakeholders. This includes feedback we received from many, including the NSA. We keep everyone out of our internal standardization meetings and the decision process."

2023.09.10 10:12:29: NSA's secret members of team: Nick Gajcowski; David Hubbard; Daniel Kirkwood; Brad Lackey; Laurie Law; John McVey; Scott Simon; Jerry Solinas; David Tuller; later Rich Davis. Jacob Farinholt was Naval Surface Warfare Center, US Navy. Not sure about Evan Bullock.

2023.09.10 09:39:16: says author is "Post Quantum Cryptography Team, National Institute of Standards and Technology (NIST),". FOIA results have revealed secret team members in early Sep 2016, after draft NISTPQC call: more NSA people than NIST people.

2023.09.06 17:58:26: My new report "Papers with computer-checked proofs" gives "case studies supporting the hypothesis that it is often affordable for a paper presenting theorems to also include proofs that have been checked with today's proof-checking software":

2023.08.18 16:29:04: Just uploaded new version of my paper "Understanding binary-Goppa decoding": Includes detailed comparisons of all ten theorems to formalizations of the theorems in HOL Light and in Lean. Also explains how to run HOL Light and Lean to verify the proofs.

2023.08.11 20:36:50: Now posted HOL Light formalizations of the same theorems about decoding Goppa codes: ( via Cloudflare). More verbose than the Lean versions, but took me less time to write. Which will be a better foundation for software verification?

2023.07.26 13:46:28: Formally verified theorems about decoding Goppa codes: This is using the Lean theorem prover; I'll try formalizing the same theorems in HOL Light for comparison. This is a step towards full verification of fast software for the McEliece cryptosystem.

2023.06.30 10:43:22: lib25519-20230630 with new code from Kaushik Nath for batch nP (e.g., Tiger Lake: 22 kcycles!) and multi-scalar mult: Also CLI, infrastructure improvements, Alpine/musl support. Still needs more auditing and formal verification.

2023.06.27 14:14:28: "We operate transparently." FOIA lawsuit docs so far: first evasion (copies of public docs), then much more interesting secret slides, and now a month of secret email, including scheduling 12 Jan 2016 meeting with NSA, 26 Jan 2016 meeting with NSA, 2 Feb 2016 meeting with GCHQ.

2023.06.26 02:30:49: Here's a cryptosystem as an attack challenge: implementable PQ ECDH NIKE! Use n = 2^32-5; Koblitz curve over F_{2^n}; type-2 ONB; T = Frob; secret exponent (1+T^r1)...(1+T^r64)T^r0+T^r65+...+T^r96. Basic Shor is too slow even for 2^r1+...+2^r64, group F_{2^n}^*. Note 2n+1 factor.

2023.06.19 10:57:42: Wow, finally an honest version of FrodoKEM! New paper from Joel Gärtner proves that 2^128 QROM IND-CCA2 security for dimension 79510 with 37-bit modulus follows from a reasonably conjectured quantitative hardness assumption for worst-case approximate SIVP.

2023.06.16 00:35:10: now redistributes the web pages via Cloudflare. For example, latest NSA/NIST FOIA results: libmceliece: "Turbo Boost: How to perpetuate security problems":

2023.06.16 00:28:22: Given yesterday's 18-hour DoS attack against the network, I'm reposting recent links. Latest NSA/NIST FOIA results: libmceliece: "Turbo Boost: How to perpetuate security problems":

2023.06.14 12:26:39: Pleased to announce CryptAttackTester, a framework to systematically catch bugs in cryptanalysis. CAT tests analyses of completely specified algorithms in a simple, fully defined cost metric. Includes many ISD attacks + AES-128. Joint work with @TungChou1.

2023.06.12 09:48:22: libmceliece now available from for post-quantum encryption with Classic McEliece, wrapping fast code from @tungchou1 + me. Simple stateless wire-format API and CLI. Automatic run-time selection of implementations: unified binary works with or without AVX2.

2023.06.09 12:10:47: New blog post "Turbo Boost: How to perpetuate security problems." with special guest appearances from Shark, Fluffy, and Turbo Boost Max Ultra Hyper Performance Extreme. #overclocking #performancehype #power #timing #hertzbleed #riskmanagement #environment

2023.05.29 16:57:36: Exercise in systems engineering: What's the best fix for Change the Kyber and FrodoKEM software? Change the RNG to a simpler randombytes() API that guarantees callers won't see this failure case? Crypto students aren't taught how to think this through.

2023.05.19 03:36:11: Called AT&T to check on promised tech visit to restore home Internet. Turns out they silently cancelled visit. More calls. Finally they turn Internet back on. AT&T social-media manager wants me to change name "AT&T Victim #31415926". Sorry, no name-change techs available today.

2023.05.16 00:47:15: Fifth day of no Internet at home. Starting to work on new slogans for AT&T: "Immobilizing your world." "Reach out, reach out, and, sorry, still no Internet." "Not technically a monopoly since 1984." "Rethinking theft." "Being offline is good for you." "Internet is the I in AT&T."

2023.05.12 15:09:48: Update for everybody asking: @ATThelp has been useless. I've been paying AT&T $100/month for Internet service; two days ago AT&T deliberately turned it off with no prior notice, and AT&T continues refusing to turn it back on. @TMobile has been a lifesaver but wireless has limits.

2023.05.10 15:40:22: AT&T turns off my Internet service without warning and claims "Per your request, AT&T Internet Service has been placed on Voluntary Suspend". Over phone, @atthelp admits this is "involuntary fiber migration" and refuses to turn service back on until a tech visit (>1 week wait).

2023.04.16 21:11:27: New formally verified proof of #safegcd iteration bound: (script for full run+extras: Advantages over previous formally verified proofs: (1) covers all input sizes; (2) finishes verifying in 10 minutes; (3) smaller TCB (HOL Light).

2023.03.17 15:40:37: Major update of "Multi-ciphertext security degradation for lattices" paper: Main optimization is integrated into the central theorem statement, backed by a proof verified by the HOL Light theorem prover.

2023.03.06 18:06:08: After 40 messages, past the "Springer asking me for money in violation of the open-access contract" stage. Now facing the big boss: The Springer Paper Mangler. 48 hours to put Humpty Dumpty back together OR ELSE. Original: Mangled:

2023.02.26 14:39:54: Mini-McEliece challenges 1223, 1284 from were solved in a Eurocrypt 2022 paper. Have now solved the 1347 challenge... by simply running our PQCrypto 2008 software! Perfect example of how minor the algorithmic speedups have been.

2023.01.27 20:28:30: It's fascinating to see how the historical data in the bottom-left corner of the graph in (from @sejaques, aka leads readers to guess the number of years to the top right without realizing that the top right is a moving target.

2023.01.27 19:38:17: Given today's 5-hour DoS attack against my servers, here's an Internet Archive link for the updated "NSA, NIST, and post-quantum cryptography" FOIA results that I had just announced: The (formerly) secret govt documents seem to be properly archived+linked.

2023.01.26 16:00:30: NIST rules say free usability is "critical". NIST could have said in 2021: NTRU is patent-free, use it! Instead secret NIST slides said, basically, "What if we ignore patents?": NIST delayed, then took option where patent license won't activate until 2024.

2023.01.26 15:39:01: New librandombytes: This is designed to shield applications from having to worry about random() not being very random, RAND_bytes() maybe failing, older machines not having getrandom(), /dev/urandom maybe not being initialized, /dev/random being slow, etc.

2023.01.26 00:00:14: Documents delivered in my FOIA lawsuit don't include any NSA material yet (hmmm) but are very interesting. Particularly horrifying is NIST's secret 2021 Kyber-Saber-NTRU comparison chart: e.g., credits Kyber for being used in "Cloudflare/CEPQ2" experiment.

2023.01.10 05:50:11: Updated libcpucycles-20230110 now available: Documentation improvements: documented cpucycles_version(), current default frequency, how PERF_FORMAT_TOTAL_TIME_RUNNING is handled. Improved uname handling. Added s390-stckf cycle counter, s390 cpuinfo parser.

2023.01.05 11:03:15: Releasing #libcpucycles library to count CPU cycles: Supports counters for amd64 (both PMC and TSC), arm32, arm64 (both PMC and VCT), mips64, ppc32, ppc64, riscv32, riscv64, sparc64, and x86, plus automatic fallbacks to various OS-level timing mechanisms.

2022.12.28 18:39:09: No CCC this year but various decentralized events are happening: The purely online event is FireShonks. @hyperelliptic and I are giving a talk there 24 hours from now: "Post-quantum cryptography: Detours, delays, and disasters"

2022.12.23 11:00:51: Still needs auditing and formal verification, but happy to announce availability of lib25519-20221222. Includes extensive new speed work from Kaushik Nath: e.g., on Skylake, 30 kcycles for DH keygen, sig keygen, signing; 90 for DH shared secret; 110 verif.

2022.11.14 16:12:34: New paper "Multi-ciphertext security degradation for lattices" identifies several gaps in provable-security claims for lattice systems, and drives attacks through those gaps. The easy part is disproving FrodoKEM's still-not-withdrawn "large margin" claim.

2022.11.14 16:24:28: The hard part is showing that, under the (shaky!) heuristics used today to claim lattice security levels, the error distributions in New Hope and Kyber allow an asymptotically faster attack breaking one out of many ciphertexts, contrary to a (flawed!) proof claim at ACM CCS 2021.

2022.11.14 16:36:48: Quantifying the impact on Kyber-512 would be even harder than quantifying the cost of single-target attacks against Kyber-512, which in turn is an unstable, challenging research topic. NIST is grossly misleading users when it labels Kyber's 2020 security analysis as "thorough".

2022.11.04 05:18:32, replying to "René Mayrhofer 🇺🇦 🇹🇼 (@rene_mobile)": There's an edge of the software ecosystem that has to parse UTF-8 for display etc in any case, but there's also a split between networking contexts that use UTF-8 and networking contexts that use Punycode, forcing every piece of software at the boundary to convert back and forth.

2022.11.03 12:46:15: 20 years ago, when the IETF was building Punycode instead of mandating UTF-8, I thought they were being remarkably stupid, and said so publicly. Later I started understanding the basic incentives. Simple, boring, working systems mean less money for standardization organizations.

2022.10.31 21:49:20: FrodoKEM documentation claims that "the FrodoKEM parameter sets comfortably match their target security levels with a large margin". Warning: That's not true. Send 2^40 ciphertexts to a frodokem640 public key; one of them will be decrypted by a large-scale attack feasible today.

2022.10.31 21:54:38: This attack does _not_ rely on a subsequent protocol exposing AES-128 ciphertexts for a common plaintext, a typical way that AES-128 keys are exposed to multi-target attacks. The attack is directly against the FrodoKEM ciphertexts. Randomizing AES modes doesn't help at all here.

2022.10.31 22:09:58: NIST discarded FrodoKEM for performance reasons, but praised its security at length. Various other organizations are continuing to consider FrodoKEM because of its reputation as the most conservative lattice system. So it's worrisome to see FrodoKEM making false security claims.

2022.10.19 19:00:29: More document releases forced by the "NSA, NIST, and post-quantum cryptography" lawsuit: These include internal NIST slides marked "not for public distribution". Meanwhile NIST repeatedly claimed in public that this was an "open and transparent" project.

2022.10.06 13:57:08: Index of records received so far in response to "NSA, NIST, and post-quantum cryptography" FOIA filing: Zero records were delivered between the FOIA filing (March 2022) and the lawsuit filing (August 2022). Obviously many more records are still to come.

2022.10.06 14:25:49: It's striking that various slide sets (BIKE summary, BIKE vs. HQC summary, summary of the BDGL algorithm, etc.) list just _one_ author. Does this mean that NIST assigned each input to just _one_ employee to read and summarize, with no protection against errors and possible abuse?

2022.10.06 14:31:43: From an error-correction perspective, having multiple readers of each input would have still fallen short of having summaries promptly posted for public review (what one would expect for an "open and transparent" project), but would have been a big step up from just one reader.

2022.09.03 13:22:00: Last year I had a paper rejected on a non-isogeny-based proposal (not announced yet; have been prioritizing other things) for non-interactive post-quantum key exchange. Here are some review quotes illustrating how incompetent the cryptographic community is at risk management.

2022.09.03 13:22:02: "Jao and Urbanik in Mathcrypt 2019 proposed a post-quantum NIKE based on SIDH" allegedly much faster than this new proposal X. "And when it comes to NIKE, it seems vanishingly unlikely that ... attacks against isogenies will improve to the point where they become slower" than X.

2022.09.03 13:22:04: "Vanishingly unlikely"? A year later, almost the entire mountain, or maybe I should say volcano, of SIDH/SIKE proposals has exploded into ashes. CRS/CSIDH is qualitatively different and still doing fine, but would it really be _that_ surprising if there's a devastating attack?

2022.09.03 13:22:06: "Cryptographers and practitioners care about performance, and not just a little, we care a whole lot": Indeed, to the extent of advocating focusing _all_ efforts on the most efficient proposals, which experience shows is _not_ the same as minimizing risk within the user's budget.

2022.09.03 13:22:07: Here's the really disturbing part to contemplate. Is this actually incompetence? Or has the cryptographic community spent decades optimizing its practices to create frequent failures, which it then points to in its requests for funding? See Section 3.8 of

2022.08.30 03:56:14, replying to "covorigin (@covorigin)": There's a parameter to tune of how many pages the process is allowed to read without having the pages checked first. For those pages, I agree that it might be good for errors caught in subsequent scrubbing to terminate the process, but I'm not sure people will be ok with this.

2022.08.30 04:00:22: What's attractive about zero access, with a page fault (in the OS sense) checking the page for faults (in the hardware sense) and only then allowing a read, is that there's a pure reduction in the error rate; nobody saying "you terminated my movie player because a pixel flipped".

2022.08.30 03:44:24, replying to "BenBE (@BenBE1987)": Would be interesting to look at speed of known techniques for RS decoding in this context. I think ~1 cycle/byte for extended Hamming (with current portable software) is fast enough to slip in unnoticed for tons of data, but something slower is probably broadly affordable too.

2022.08.30 03:35:55, replying to "Dominic White ❌ (@singe)" = "Dominic White 🎄🎅 (@singe)": Some combination of hammer detection and ECC _might_ work, but this is awfully difficult to evaluate, and papers keep showing attacks. It's much more convincing (and seems implementable: see ZebRAM etc.) to keep a physical moat, at least 1 row, between different security domains.

2022.08.30 03:15:28, replying to "Thái "thaidn" Dương (@XorNinja)": By "released", you mean "suppressed until they saw that the public had the quantum core of the attack (Eisentraeger--Hallgren--Kitaev--Song) and the applicability to lattice-based cryptography, so the only piece missing in public was the note that cyclotomic units are short"?

2022.08.30 03:20:38: That was a critical note, and the public _could_ easily have missed it for many years. But the timeline, according to GCHQ, was _not_ that GCHQ was issuing a prompt public warning. There was never an explanation of what triggered them to publish the attack at the moment they did.

2022.08.29 13:52:00, replying to "BenBE (@BenBE1987)": 4096+14 is what libsecded (from does for 4096 bytes. Can de-interleave into original page + checksums. It's SECDED on the bottom bit of each byte, SECDED on the next bit of each byte, etc. Mixing bits can give better error correction for the same space.
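
Minimal sketch of where 4096+14 can come from, assuming a textbook extended Hamming code applied independently to each of the 8 bit positions ("bit planes") of the bytes. This is not libsecded's actual code or API: the names sketch_encode and sketch_decode and the position layout are hypothetical, chosen for illustration. With 13 index bits there are 8191 nonzero positions, 13 of them reserved for check bits, which is enough for 4096 data bits per plane; one more overall-parity bit per plane gives 14 check bits per plane, i.e. 14 bytes.

```python
DATA_LEN = 4096      # bytes per page
INDEX_BITS = 13      # 2**13 - 1 - 13 = 8178 usable positions >= 4096

# Hamming positions: powers of two hold check bits; data uses the others.
DATA_POS = [p for p in range(1, 1 << INDEX_BITS) if p & (p - 1)][:DATA_LEN]
POS_TO_DATA = {p: i for i, p in enumerate(DATA_POS)}


def sketch_encode(data: bytes) -> bytes:
    """Return 14 check bytes: 13 Hamming bytes plus 1 overall-parity byte."""
    assert len(data) == DATA_LEN
    checks = [0] * INDEX_BITS
    overall = 0
    for pos, byte in zip(DATA_POS, data):
        overall ^= byte
        for j in range(INDEX_BITS):
            if (pos >> j) & 1:
                checks[j] ^= byte   # XOR acts on all 8 bit planes at once
    for c in checks:
        overall ^= c
    return bytes(checks + [overall])


def sketch_decode(data: bytes, checks: bytes) -> bytes:
    """Fix up to one flipped bit per bit plane; raise on a detected double flip."""
    assert len(data) == DATA_LEN and len(checks) == INDEX_BITS + 1
    fresh = sketch_encode(data)
    syndrome = [checks[j] ^ fresh[j] for j in range(INDEX_BITS)]
    parity = checks[INDEX_BITS]
    for byte in data:
        parity ^= byte
    for c in checks[:INDEX_BITS]:
        parity ^= c
    fixed = bytearray(data)
    for plane in range(8):
        pos = sum(((syndrome[j] >> plane) & 1) << j for j in range(INDEX_BITS))
        odd = (parity >> plane) & 1
        if pos == 0:
            continue  # clean plane, or only the overall-parity bit flipped
        if not odd:
            raise ValueError("double bit flip detected in one bit plane")
        if pos in POS_TO_DATA:
            fixed[POS_TO_DATA[pos]] ^= 1 << plane   # flip the data bit back
        elif pos & (pos - 1):
            raise ValueError("syndrome points at an unused position")
        # Otherwise pos is a power of two: a check bit flipped, data is intact.
    return bytes(fixed)
```

In this sketch the check bytes are kept separately from the page, so the original page is untouched unless a correction is needed. Flips in different bit planes are corrected independently; two flips in the same plane are detected but not corrected.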

2022.08.29 13:25:45: Given the current reality of desktops/laptops/smartphones almost never having ECC RAM, I'd love to see more operating-system support for periodically sweeping through pages to detect and correct errors, storing (say) 14 bytes of error-correction data for each 4096-byte page.

2022.08.29 13:44:13: It's hard for the OS to do anything useful to correct errors in pages being actively written, but that's not most pages at most times. The OS can try marking a page as read-only (or, more robust, zero access); compute the error-correction bits; periodically check for errors.

2022.08.29 10:29:51: New paper "A one-time single-bit fault leaks all previous NTRU-HRSS session keys to a chosen-ciphertext attack": Attack was enabled by a change to NTRU-HRSS in 2019. Attack software (using a simulated DRAM fault): "attackntrw" from

2022.08.29 03:49:02, replying to "Dominic White ❌ (@singe)" = "Dominic White 🎄🎅 (@singe)": The portable code in libsecded is roughly 1 cycle/byte on current Intel CPUs (depending on array size), which is the sort of cost most applications don't notice even if it's applied to all data. Certainly interesting to try larger-distance codes. But need isolation vs Rowhammer.

2022.08.29 03:21:52, replying to "John Carlos Baez (@johncarlosbaez)": Define T = cost(brute-force search for all AES-128 keys). There are finitely many algorithms A with cost(A) <= T; cost includes len(A). Compute exact success probability of each A by running A on all sequences of coin flips of length fitting into cost T. Reflect into a proof. QED

2022.08.29 03:09:53, replying to "John Carlos Baez (@johncarlosbaez)": Your parenthetical sentence here is false. For example, there exists a proof of the minimal cost of an AES-128 attack, with standard formalizations of "cost"+"attack"+"proof". We don't know if there's a proof of length <2^L for any useful value of L; that's a different question.

2022.08.28 08:33:42: Bits in DRAM sometimes flip. Typical servers have SECDED ECC DRAM to protect against this, but typical desktops/laptops/smartphones don't. Have released a "libsecded" micro-library with secded_encode() to protect an array and secded_decode() to recover it:

2022.08.28 08:39:02: Of course, have to worry about the possibility of bugs in libsecded doing more damage than bit flips. The software passes many tests (and is also checked against a simpler Python implementation), but those aren't comprehensive. Planned formal verification is still future work.

2022.08.22 19:17:26, replying to "nikita borisov (@nikitab)": No, "specific data values may delay instruction retirement by, at most, one cycle" in is a pipeline effect. Also says Skylake "may" do this for "at least" one insn in a list of (basically) vector mul. CacheBleed showed exploitability of 1-cycle variations.

2022.08.22 19:22:24: This is reminiscent of the FPU on the IBM PowerPC RS64 IV taking an extra cycle to multiply by 0; see warning at the bottom of page 10 of Figuring out values that trigger a Skylake slowdown could enable attacks along the lines of

2022.08.22 19:31:09: It's easy to see how cutting corners in hardware for floating-point normalization would explain the slowdown on that PowerPC. Intel seems to say that its vector fp mul _is_ constant-time; but maybe the way that the vector int mul reuses the vector fp mul is creating a slowdown.

2022.08.22 11:28:03: The documentation actually suggests, but doesn't quite say, that, already on Skylake, vector multiplications (used in many crypto implementations) _aren't_ constant-time. Since then I've been doing various scans to try to find inputs triggering variations; nothing to report yet.

2022.08.18 21:01:22: Hey, math profs, in case you missed it, there are exciting opportunities available to take some time working with NSA on secret problems: "We established a sabbatical program to allow mathematicians to visit us while retaining their academic affiliation"

2022.08.18 21:28:28: According to the information about how your work will be used is "cleansed" so you'll be free to imagine that it's legal, ethical, etc. Sure, the same government does some horrifying things, but surely _your_ work will only be used for the good stuff.

2022.08.18 21:33:03: Also, whatever you've heard about a lifetime post-employment obligation, upheld by the courts, to show NSA anything you're thinking about publishing so they can censor the parts they want to keep secret: Stop worrying. You're smart enough to find non-crypto problems to work on.

2022.08.18 16:49:27, replying to "Damien Robert (@GondoPloum)": Can you lift this to a computation on ideal classes, so as to be able to quickly handle (e.g.) any given supersingular curve over F_p after curve-independent precomputation?

2022.08.18 04:14:33, replying to "Anton Tutoveanu (@AntonTutoveanu)": The three NIST reports have 14 authors. Not all worked on this full-time for six years, but presumably total work is in the ballpark of 10000 days, i.e., 60 days per report page. There are many NIST references to further information the public hasn't seen, such as NSA "feedback".

2022.08.18 03:33:46, replying to "Probabilita ( (@dakoraa)": Sure, some public comments sound like that. But many others are directly on topic, expressing concern about what NSA is doing, based on what NSA is known to have done before. "covertly influence and/or overtly leverage" designs to make them "exploitable".

2022.08.07 20:51:19, replying to "Nadim Kobeissi ( (@kaepora)" = "Nadim Kobeissi (@kaepora)": An attack agency that hires all the cryptanalysts in advance doesn't need to worry about trying to suppress public attack knowledge in other ways. In reality, it isn't _all_ the cryptanalysts, but hiring Coppersmith 20 years ago was a big win for IDA, and there are many others.

2022.08.07 20:58:39: NSA advertises itself as the largest employer of mathematicians in the country. They also offer summer jobs for US university mathematicians excited by the idea of working on secret problems. Don't underestimate the resources of a multi-billion-dollar-a-year government agency.

2022.08.07 20:15:37, replying to "Nadim Kobeissi ( (@kaepora)" = "Nadim Kobeissi (@kaepora)": Certain people are falsely attributing to the blog post an inflammatory bribery claim. I never made that claim, in the blog post or anywhere else. The claim is totally out of whack with what the blog post explicitly says. Read for yourself; don't get suckered by disinformation.

2022.08.07 20:25:18: People starting from wanting to believe NISTPQC can't have been sabotaged were already making the these-are-top-experts-who-can't-have-been-bribed argument. The blog post notes this argument and then states verifiable facts trumping it, such as IDA hiring Coppersmith years ago.

2022.08.07 19:49:04, replying to "Nadim Kobeissi ( (@kaepora)" = "Nadim Kobeissi (@kaepora)": "At the risk of belaboring the obvious: An attacker won't have to say 'Oops, researcher X is working in public and has just found an attack; can we suppress this somehow?' if the attacker had the common sense to hire X years earlier, meaning that X isn't working in public." 1/2

2022.08.07 19:51:00: Quote continued: "People arguing that there can't be sabotage because submission teams can't be bribed are completely missing the point. ... It's not hard to imagine that [NSA] has been pushing NISTPQC to select algorithms that NSA secretly knows how to break." 2/2

2022.08.07 06:06:11, replying to "Peter Todd (@peterktodd)": Keeping the ECC layer is critical for trustworthy protection today. But the objective of rolling out post-quantum crypto is to _also_ protect user data against future quantum computers. The ECC layer will then be broken by Shor's algorithm, and we need to get the pq layer right.

2022.08.07 05:59:14, replying to "Ben (@bytesofben)": Typical choices of 256-bit ciphers are fine; no threats on the horizon. (If you put your key on a quantum laptop and encrypt quantum data then it's likely broken, so don't do that.) 256 is overkill (looks like each qubit op will cost roughly 2^40 bit ops) but also very low cost.
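
Back-of-envelope sketch of the "overkill" remark, under two assumptions: Grover search over a k-bit key takes about 2^(k/2) iterations, and each iteration is counted as just one qubit operation (a deliberate undercount), priced at the rough 2^40 bit operations mentioned above. The function name is hypothetical.

```python
def grover_log2_bit_ops(key_bits: int, log2_bit_ops_per_qubit_op: int = 40) -> float:
    """Lower-bound log2 cost in bit operations of Grover key search.

    Counts 2^(key_bits/2) iterations at one qubit operation each, so the
    real cost in this model would be higher.
    """
    return key_bits / 2 + log2_bit_ops_per_qubit_op


if __name__ == "__main__":
    for k in (128, 256):
        print(f"{k}-bit key: at least 2^{grover_log2_bit_ops(k):.0f} bit operations")
```

Under these assumptions a 256-bit key sits 64 bits above a 128-bit key in this cost estimate.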

2022.08.07 05:29:31: It's great to see the progress on rolling out post-quantum crypto, assuming big quantum computers are coming. The _risks_ of Kyber problems (patents, attacks) aren't a reason to incur the _definite failure_ of doing nothing. But the bleeding-edge Kyber-512 option is a bad idea.

2022.08.07 05:34:37: If there's a Kyber-512 attack that scales as well as the recent SIKE attack, then, sure, Kyber-1024 is dead too. But if there's an attack that scales like core RSA attacks (NFS for integer factorization), then moving from Kyber-512 and Kyber-768 to Kyber-1024 could save the day.
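
Sketch of the scaling contrast in the entry above, using RSA modulus sizes rather than Kyber parameters. The NFS figure is the standard heuristic L_N[1/3, (64/9)^(1/3)] with the o(1) term dropped, so it is a rough guide rather than a security estimate. The polynomial-cost function is a hypothetical stand-in for "an attack that scales well"; its degree 3 is an arbitrary illustration, not the cost of the SIKE attack.

```python
import math


def nfs_log2_cost(modulus_bits: int) -> float:
    """log2 of exp(c * (ln N)^(1/3) * (ln ln N)^(2/3)), c = (64/9)^(1/3)."""
    ln_n = modulus_bits * math.log(2)
    c = (64 / 9) ** (1 / 3)
    return c * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3) / math.log(2)


def poly_log2_cost(size_bits: int, degree: int = 3) -> float:
    """log2 of size_bits^degree: a hypothetical polynomial-time attack."""
    return degree * math.log2(size_bits)


if __name__ == "__main__":
    for bits in (1024, 2048, 3072):
        print(f"{bits} bits: NFS heuristic 2^{nfs_log2_cost(bits):.1f}, "
              f"hypothetical polynomial attack 2^{poly_log2_cost(bits):.1f}")
```

Against subexponential scaling, tripling the size buys tens of bits; against polynomial scaling it buys only a few, which is the difference between larger parameters saving the day and not.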

2022.08.07 05:46:57: Some people say "We'll move to larger key sizes if an attack is published"; does this mean we don't care about tons of user data we're feeding into attacker databases _before_ the attack is published? Once we've sent a ciphertext, we can't retroactively add stronger protections.

2022.08.06 11:45:39, replying to "Luca De Feo (@luca_defeo)": I already mentioned the possibility of having the secret generated by (say) ANSSI. The spec could easily have required new secrets; probably this would have evolved to MPC. Would a "you're hiding an attack!" accusation deter you from adding a potentially useful extra defense?

2022.08.06 08:56:20, replying to "Anton Tutoveanu (@AntonTutoveanu)": Sure, the traditional view is that the evaluation of (proposed) cryptographic standards should assume perfect implementations, blaming the implementor for any deviations. Unfortunately, this allows a saboteur to select designs that predictably produce implementation errors.

2022.08.06 08:58:28: Rivest's 1992 critique of DSA is worth reading. In particular, regarding DSA nonces, he wrote "The poor user is given enough rope with which to hang himself---something a standard should not do"; this is a useful counterpoint to the traditional view.

2022.08.06 08:39:46, replying to "Jared Flatow 🪩 (@jmflatow)": Use hybrids. Fight against NSA's "turn off ECC" pressure. Use the highest supported security levels. Measure + publicly document the percentage of your application's costs spent on cryptography. Don't condone speed being taken out of context to argue for bleeding-edge key sizes.

2022.08.05 20:43:34: New blog post "NSA, NIST, and post-quantum cryptography: Announcing my second lawsuit against the U.S. government." Case filed in federal court today by @LoevyAndLoevy. #nsa #nist #des #dsa #dualec #sigintenablingproject #nistpqc #foia

2022.08.02 15:08:23: You know the classic game of finding a very short Google search term that, at least for a brief moment, produces exactly one hit, for example for an amusing misspelling from people who obviously didn't bother to check their work? Try searching for "Orschoot and Weiner" in quotes.

2022.08.02 12:03:14, replying to "Luca De Feo (@luca_defeo)": No, it's not theoretical. As I said in the first message, you had the option of following a different path (ahem), generating the standard A at random by applying and then throwing away a secret isogeny. What's interesting about this is that then SIKE wouldn't (yet?) be broken.

2022.08.02 12:13:44: In other words, if current attacks are the end of the story, then pushing for elimination of back doors created a SIKE weakness that could have been avoided otherwise. Now think about this situation from the perspective of attackers who secretly knew the weakness from the outset.

2022.08.02 08:00:20, replying to "Luca De Feo (@luca_defeo)": Of course A=0 doesn't sound like a secret number. But think about the SIKE design from the perspective of an attacker whose secret knowledge was this 2022 attack. That attacker knows how to exploit A=0, and doesn't (yet?) know how to exploit an A chosen randomly by (say) ANSSI.

2022.08.01 01:28:34: Here's a funny aspect of the new SIDH/SIKE attack to think about: It seems that SIDH/SIKE wouldn't have been broken (yet?) if the proposers had applied a secret isogeny to build a standard starting curve. The attack would instead have been showing that the secret is a back door.

2022.08.01 01:33:01: See Section 5 of for previous approaches to constructing SIDH/SIKE back doors. The new attack gives a back door for many more parameters, including parameters that look just like current SIDH/SIKE plus a defensible "we added this extra protection" tweak.

2022.08.01 01:39:22: Compare to NIST's submission criteria: "To help rule out the existence of possible back-doors in an algorithm, the submitter shall explain the provenance of any constants or tables used in the algorithm." Is it true that explaining the SIDH/SIKE constants rules out back doors?

2022.07.31 18:21:35: New paper "Fast norm computation in smooth-degree Abelian number fields": Task amply studied for general number fields is much faster for cyclotomics. Critical subroutine in traditional class-group/unit-group computation, and in filtered-S-unit attacks.

2022.07.31 13:06:04: It's easy for people to issue security warnings _after_ systems are broken. What takes much more work is _proactively_ analyzing risks deeply enough to issue meaningful warnings _before_ systems are broken. Here's a 2021 example, disputing the "case for SIKE" security analysis.

2022.07.31 13:12:05: How is it possible that in 2021 there were "recent advances in torsion-point attacks" vs SIKE, while in 2022 one can find claims that there was "no attack progress" against SIDH/SIKE for a decade? There's an important lesson here about metrics for attack advances. Let me explain.

2022.07.31 13:19:45: There are some famous long-standing algorithmic problems for which there have been many attack advances and extensive evidence regarding how the advances developed. Let's take (non-quantum!) factorization as an example highlighted and studied by Gauss and many other researchers.

2022.07.31 13:30:37: How do we measure whether a factorization algorithm is better than previous algorithms? This is an important question. We don't want useless ideas to produce excitement for the attacker or fear for the defender; at the same time, we need to recognize and encourage useful ideas.

2022.07.31 13:35:30: So, okay, let's pull out the computer and try factoring random n-bit RSA moduli pq with many different factorization algorithms for various sizes of n. This immediately gives interesting information: one can easily see Pollard rho beating trial division, MPQS beating QS, etc.
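
A minimal sketch of the kind of experiment described above, comparing trial division against Pollard rho on a small modulus pq. The function names, the step counters, and the toy 27-bit modulus are illustrative assumptions, not taken from any particular benchmark; real experiments would use many random moduli at many sizes and measure time rather than loop iterations.

```python
import math
import random

def trial_division(n):
    """Return (smallest prime factor of n, number of trial divisors tested)."""
    steps = 0
    d = 2
    while d * d <= n:
        steps += 1
        if n % d == 0:
            return d, steps
        d += 1
    return n, steps  # no divisor up to sqrt(n), so n has no smaller factor

def pollard_rho(n, seed=1):
    """Return (nontrivial factor of an odd composite n, total iterations used)."""
    rng = random.Random(seed)
    steps = 0
    while True:
        c = rng.randrange(1, n)
        x = y = rng.randrange(n)
        g = 1
        while g == 1:
            x = (x * x + c) % n      # slow walk: one step
            y = (y * y + c) % n      # fast walk: two steps
            y = (y * y + c) % n
            g = math.gcd(x - y, n)
            steps += 1
        if g != n:
            return g, steps
        # g == n: both factors collided at once; retry with a new walk

if __name__ == "__main__":
    n = 10007 * 10009  # toy modulus, far too small for real RSA
    print("trial division:", trial_division(n))
    print("Pollard rho:   ", pollard_rho(n))
```

Trial division needs on the order of sqrt(n) steps for a balanced pq, while rho is expected to need on the order of the fourth root of n, so the gap widens quickly as the modulus grows.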

2022.07.31 13:44:13: If algorithm X factors random RSA moduli faster than all previous algorithms then certainly X is an advance. This is a useful metric. But let's look at what goes wrong if we take this as the _only_ metric, dismissing any factoring algorithms that don't reduce RSA security levels.

2022.07.31 13:48:32: Pollard's p-1 algorithm established speed records for factoring _occasional_ integers. Basically, pq is vulnerable when p-1 or q-1 is a product of very small primes. It's easy to show that n-bit RSA moduli pq are extremely unlikely to be vulnerable to this unless n is very small.

2022.07.31 13:59:01: In other words, Pollard's p-1 algorithm doesn't apply to worst-case factorization. But it inspired followups replacing p-1 (from the multiplicative group) with other functions of p (from other groups). In particular, it inspired Lenstra's ECM (using elliptic-curve groups).
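
To make the p-1 method concrete, here is a minimal one-stage sketch. The function name, the base 2, and the use of k! as the exponent are illustrative choices; production implementations use prime-power exponents and a second stage.

```python
import math

def pollard_p_minus_1(n, bound):
    """Try to split n using exponents up to bound!.

    Returns a nontrivial factor of n, or None if this bound fails.
    """
    a = 2
    for k in range(2, bound + 1):
        a = pow(a, k, n)             # now a = 2^(k!) mod n
        g = math.gcd(a - 1, n)
        if 1 < g < n:
            return g                 # p-1 is smooth enough for one prime p
        if g == n:
            return None              # every prime factor was caught at once
    return None

if __name__ == "__main__":
    # 8191 - 1 = 2 * 3^2 * 5 * 7 * 13 is a product of very small primes;
    # 10007 - 1 = 2 * 5003 is not, since 5003 is prime.
    n = 8191 * 10007
    print(pollard_p_minus_1(n, 20))
```

The example modulus is deliberately weak: one prime p has p-1 built from primes up to 13, so a bound of 20 is enough, whereas the other prime would need a bound of at least 5003.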

2022.07.31 14:03:56: ECM _does_ apply (under conjectures backed by extensive evidence) to the worst case. It's a tremendously powerful attack against RSA moduli. For (say) RSA-1024, NFS is (conjecturally) even faster, but modern versions of NFS save time using ECM as a "cofactorization" subroutine.

2022.07.31 14:13:27: Should Pollard's p-1 algorithm have been dismissed because it was useful only for occasional numbers? Of course not. The algorithm was setting speed records for some factorization problems. Algorithms for general problems are often outgrowths of algorithms for special cases.

2022.07.31 14:19:26: Is it interesting that the selected SIKE parameters maintained their security level (aside from small VoW inner-loop improvements) for a decade? Definitely. But using that as the only metric is a mistake. Torsion-point attacks were ripping more and more SIKE parameters to shreds.

2022.07.31 14:44:57: At first glance the new SIKE attack looks like a spectacular advance. But the previous work was important too! People leaping from "SIKE is maintaining its security level" to "there is no progress in SIKE attacks" were making a mistake, misextrapolating from a limited metric.

2022.07.31 15:10:52: There's more to say about how to measure special cases and use this for cryptographic risk assessment. But the starting point is to recognize that _ignoring_ special-case attacks, such as Pollard's p-1 method or previous torsion-point attacks, is a dangerous oversimplification.

2022.07.29 11:22:23, replying to "Mathias (@matthegap)":

2022.07.21 23:44:34: uses a square-root ECDL-in-intervals algorithm (baby-step-giant-step) for reconstructing truncated signatures. Better tradeoff between cost and truncation: (with @hyperelliptic) presents a cube-root ECDL-in-intervals algorithm.
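
As a rough illustration of a square-root discrete-log-in-an-interval search, here is a baby-step-giant-step sketch. It is an assumption-laden stand-in: it works in the multiplicative group modulo a prime rather than on an elliptic curve, the function name and parameters are hypothetical, and it is not the cube-root algorithm mentioned above.

```python
import math

def bsgs_interval(g, h, p, lo, hi):
    """Find x with lo <= x <= hi and g^x = h (mod p), or return None.

    Uses about sqrt(hi - lo + 1) group operations and table entries.
    Requires gcd(g, p) = 1 and Python 3.8+ (for negative exponents in pow).
    """
    width = hi - lo + 1
    m = math.isqrt(width - 1) + 1        # m = ceil(sqrt(width))
    # Baby steps: remember the smallest j < m for each value g^j.
    table = {}
    e = 1
    for j in range(m):
        table.setdefault(e, j)
        e = e * g % p
    # Giant steps: compare h * g^(-lo) * g^(-i*m) against the table.
    step = pow(g, -m, p)
    gamma = h * pow(g, -lo, p) % p
    for i in range(m):
        if gamma in table:
            x = lo + i * m + table[gamma]
            return x if x <= hi else None
        gamma = gamma * step % p
    return None

if __name__ == "__main__":
    p, g = 1000003, 2
    secret = 123456
    h = pow(g, secret, p)
    print(bsgs_interval(g, h, p, 123000, 124000))
```

The cost depends on the width of the interval, not on the size of the group, which is why truncating more bits makes reconstruction more expensive.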

2022.07.20 08:53:07, replying to "Jethro Beekman (@JethroGB)": People often create streaming APIs, but we've seen again and again how dangerous those APIs are: applications act on streams straight from the attacker. It's much safer to have a signature on each packet. Rough analogy: put handwritten signatures on each page of a legal document.

2022.07.20 08:47:15, replying to "Ruben Kelevra (@RubenKelevra)": Yes, that's how a signed-message API works, protecting against the very common failure mode of simply skipping (or ignoring the results of) a "check a signature" call. The more advanced question is how to make it harder for people to look at sm, see where m is, and remove the s.

2022.07.20 08:22:39: If signed messages look like message+signature (as opposed to "message recovery") then it's too easy for people to grab the message and skip checking the signature. To fight against this, transform sm to obscure m: xor 1,2,3,...; better, apply any of the AONTs from Rivest et al.
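
A minimal sketch of the "xor 1,2,3,..." idea, assuming the signed message is laid out as signature followed by message. The helper names are hypothetical, and the hash-based `toy_sign`/`toy_verify` pair is only a placeholder so the example runs; it is not a signature scheme. The xor transform provides no cryptographic protection; it only stops the message from sitting verbatim inside the signed blob.

```python
import hashlib

def xor_counter(data: bytes) -> bytes:
    """XOR byte i with (i+1) mod 256. Applying it twice restores the input."""
    return bytes(b ^ ((i + 1) & 0xFF) for i, b in enumerate(data))

def wrap(signature: bytes, message: bytes) -> bytes:
    """Build the obscured signed message from signature + message."""
    return xor_counter(signature + message)

def open_signed(blob: bytes, sig_len: int, verify) -> bytes:
    """Undo the transform, check the signature, and only then return m."""
    sm = xor_counter(blob)
    sig, msg = sm[:sig_len], sm[sig_len:]
    if not verify(sig, msg):
        raise ValueError("signature check failed")
    return msg

# Placeholder "signature" so the sketch is runnable: NOT a real signature.
def toy_sign(message: bytes) -> bytes:
    return hashlib.sha256(b"toy-key" + message).digest()

def toy_verify(sig: bytes, message: bytes) -> bool:
    return sig == toy_sign(message)

if __name__ == "__main__":
    m = b"attack at dawn"
    blob = wrap(toy_sign(m), m)
    print(blob[32:])                          # no longer the plain message
    print(open_signed(blob, 32, toy_verify))  # recovered after verification
```

The point of the design is that the easiest way to get m out of the blob is to call the function that also checks the signature.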

2022.07.16 19:30:35: NIST's latest report (1) says NIST is confident in the security of Kyber; (2) says Kyber-512 >= AES-128; (3) says Kyber-768 >= AES-192. But attack advances keep reducing lattice security levels! It will be completely unsurprising if the next round of attacks falsifies #2 and #3.

2022.07.16 19:41:58: Do large-scale attackers (think: years of secret work by Coppersmith et al.) have _feasible_ attacks against Kyber-512? Maybe, maybe not. This is safer than the 100% security failure (assuming big quantum computers are built) of not rolling out _anything_. But "confident"? Yikes.

2022.07.16 19:45:59: _Public_ lattice attacks are super-complicated and keep getting more complicated. The 17 bullet items on pages 3-4 of are surveying attack advances between 2018 and 2021, and we've seen more in 2022. This is completely different from the stability of ECDL.

2022.07.16 19:50:55: Here's the really weird part: quotes NIST's Dustin Moody as now saying "Because this is a new research field, we don’t want to put all our eggs in one basket and only have lattice algorithms, and then an attack comes along and we don’t have anything else."

2022.07.16 19:57:26: It seems that NIST _does_ see at least some of the risks in these bleeding-edge structured-lattice systems. But NIST says that "NIST is confident in the security that each provides." Confident? NIST keeps using that word. I do not think that word means what NIST thinks it means.

2022.07.07 19:36:58: Would be interested in hearing perspectives from more crypto engineers on whether the current Kyber patent status looks clear enough to proceed with deployment. Is it clear that the Ding and CNRS patents are dealt with? Is it ok that NIST hasn't commented on, e.g., CN107566121A?

2022.07.06 01:01:31: Looks like NIST didn't actually nail down the patent buyouts before announcing Kyber's selection, so now the patent holders have even more power. But, wait, NIST's expert negotiators say that they "may consider" switching to NTRU if agreements aren't signed "by the end of 2022".

2022.07.06 01:13:53: Can someone point me to where NIST's new report explains why they didn't simply select NTRU back in 2021? Is it the part where NIST says it finds the MLWE problem "marginally more convincing" than the NTRU problem? "Marginally" justifies leaping straight into a patent minefield?

2022.07.06 01:18:36: Aha, clearly this is the explanation: "A significant factor in the decision to choose KYBER over NTRU was NTRU’s performance". But wait: the same report says "KYBER, NTRU, and Saber ... Most applications would be able to use any of them without significant performance penalties."

2022.07.06 01:36:35: "Issues relating to patents were a factor in NIST’s decision during the third round as NIST became aware of various third-party patents." Actually, the CNRS patent and the Ding patent and several other patent threats were on NIST's web site in 2018, long before the third round.

2022.07.06 01:42:15: "NIST negotiated with several third parties to enter into various agreements to overcome potential adoption challenges posed by third-party patents." Where does the report evaluate the delay involved in (maybe) getting this done, and the security damage caused by this delay?

2022.07.06 01:45:11: "An evaluation factor is whether a patent might hinder adoption of the cryptographic standard." Compare to the original call (emphasis added): "it is CRITICAL that this process leads to cryptographic standards that can be freely implemented in security technologies and products."

2022.07.06 01:53:45: While all of this is going on: SLURRRRRRRRRRRRRRRRRRRRRRRRRP [that's the actual sound, amazingly the same everywhere around the world, of month after month of user data being systematically intercepted and recorded by the espionage agencies for various out-of-control governments]

2022.07.06 00:08:19: For people wondering why the Kyber team has suddenly expanded to include Jintai Ding: There's a fascinating back story. For details see my January blog post "Plagiarism as a patent amplifier: Understanding the delayed rollout of post-quantum cryptography":

2022.07.06 00:29:49: For some reason NIST left this interesting change in team composition out of its so-called "history" file, and has also now broken many links. We were watching and saved the change shortly after it happened:

2022.07.02 06:00:05, replying to "Greg Slepak (@taoeffect)": Getting crypto to widespread deployment involves many stages (decades for ECC!). Google already tried rolling out post-quantum crypto 6 years ago and then retreated, for interesting reasons; see OpenSSH has pq now but much more is waiting for NIST to act.

2022.07.02 02:49:44: NIST now says it plans to announce its selections of post-quantum algorithms on "Tuesday, July 5th" (I presume 2022, not 2033). Given the extent to which waiting for NIST has stalled pq deployment, this announcement is an important step forward no matter what the details are.

2022.07.02 02:52:04: Regarding details, I _hope_ that whatever NIST picked turns out to be safe, and I _hope_ that their handling of patents turns out to be adequate. If so, great: this announcement will set many more wheels in motion towards deployment of high-security post-quantum cryptography.

2022.07.02 02:54:43: But say NIST selects X, and later X turns out to be a disaster. (I question the competence of anyone who ignores this risk.) Are people then going to go back to waiting for NIST? Surely not. The announcement is getting rid of NIST's primary impact here as a deployment bottleneck.

2022.07.01 09:00:04, replying to "Erik Tews (@e_tews)": This is the multicore era. The baseline is a system with, say, 8 cores. Instead invest in 12 cores, and then recoup the investment through the power savings of running at lower speed. This generally produces better speeds for lower cost. Also, the hardware tends to last longer.

2022.06.30 18:23:03, replying to "loganaden velvindron (@loganaden_42)": There's some deployment already, yes. That's an exception to NIST's power to give away user data to attackers via delaying standardization. But at the moment most wheels in the broader ecosystem are sitting idle, waiting to spin up until NIST takes action.

2022.06.30 03:47:31: We're now up to a solid half year of delay in post-quantum standardization, apparently because NIST picked a new design in the middle of a patent minefield and was somehow confident it could instantly buy its way out of the minefield. Half a year of data given away to attackers.

2022.06.30 03:55:31: No, this isn't extra time taken for security review of submissions. NIST has repeatedly said it made its decisions long ago. Public cryptanalysts, not wanting to waste time, have a strong incentive to work on other topics until NIST reveals which submissions have been selected.

2022.06.30 04:08:59: At this point it's totally unclear what methodology, if any, NIST used to assess security risks in making its decisions. Could the differences in risks outweigh the now-guaranteed security failure of giving away half a year of user data? Did NIST's analysis include patent risks?

2022.06.30 04:17:35: NIST discouraged public patent analysis; was forced by rules to post IP statements but promptly undermined this by saying round 1 should analyze only "technical merits". And post-round 1: "we hope everyone will focus on the technical issues, rather than on the patents right now".

2022.06.30 04:26:03: Outright lie from NIST in October 2021: "For example, as Chris noted, we have not been discouraging public discussion on patent issues that may be relevant to the PQC standardization process." This was after pressure built enough that NIST had to pretend it was on top of patents.

2022.06.30 04:37:47: I sounded the alarm about post-quantum patents in 2018. NIST should have _encouraged_ public analysis of patents from the outset as an important component of decisions, instead of trying to quietly deal with patents as an afterthought to a holier-than-thou "technical" process.

2022.06.30 04:57:03: I hope we'll hear soon what the selections are, and that the buyouts have succeeded, and that the buyouts cover all the patents that matter. But this won't retroactively fix the past half year of delay, and the corresponding half year of user data that we've failed to protect.

2022.06.21 23:15:17, replying to "Jacob Christian Munch-Andersen (@NoHatCoder)": already has an entry discussing masking and linking to some examples of attacks. This is a nightmare to audit, and isn't a substitute for plugging the underlying leak. Similarly, yes, some types of secrets can be erased quickly, but faster than the attack?

2022.06.21 23:19:42: It's easy to fall into the trap of thinking "This demo took 89 hours, so if this secret can be changed every day then it's safe." But we've seen again and again that initial demos are publicly superseded by much faster attacks. Large-scale attackers are probably many years ahead.

2022.06.21 07:49:22: New resource page available on timing attacks, including recommendations for action to take regarding overclocking attacks such as #HertzBleed: Don't wait for the next public overclocking attack; take proactive steps to defend your data against compromise.

2022.06.16 03:31:57, replying to "Ruben Kelevra (@RubenKelevra)": I agree that the user doesn't want to wait for the computer. I don't agree that spinning up threads on all cores needs a user-perceptible slowdown. I'm not surprised at the limited attention to super-fast browser startup: don't users normally have a browser running continuously?

2022.06.15 23:42:36, replying to "Tom (@TomInfosec)": This particular attack demo succeeded with toy models and toy signal processing, so I'd expect state-of-the-art models and state-of-the-art signal processing to extract secrets from many more programs, _except_ when users protect themselves by setting constant CPU frequencies.

2022.06.15 23:28:18, replying to "Ruben Kelevra (@RubenKelevra)": Are you waiting for your computer during these unspecified workloads? If so, shouldn't you be asking the software providers for multithreading and vectorization to make the code an order of magnitude faster, even if this makes Turbo Boost drop to 1%, as in your x265 example?

2022.06.15 23:33:54: Given how many major applications that users care about are already multithreaded and vectorized, it's wrong to cherry-pick unoptimized single-threaded applications as pictures of total system performance. This error will increase as more and more applications add optimizations.

2022.06.15 23:06:44, replying to "Ruben Kelevra (@RubenKelevra)": No, I've run various types of heavily optimized code for long periods on all cores on various Intel and AMD boxes _at base frequency_ without coming anywhere close to the thermal limits. Running at higher frequency would mean much more frequent hassle of replacing dead hardware.

2022.06.15 23:20:25: CPU manufacturers set the thermal limits on consumer CPUs to avoid obvious short-term failures. They set safer limits and frequencies on the CPUs marketed as server CPUs. It's not a coincidence that server operators frequently publish reports on observed long-term failure rates.

2022.06.15 22:49:08, replying to "Ruben Kelevra (@RubenKelevra)": Three good reasons for users to turn off overclocking: 1. Overclocking reduces the hardware lifetime; dead hardware is a hassle. 2. Overclocking is a security risk. 3. The advertised benefit of overclocking is increasingly out of whack with the real-world benefit of overclocking.

2022.06.15 10:05:22, replying to "Ruben Kelevra (@RubenKelevra)": If the user is waiting then the time is long enough that the unoptimized code with (say) 2x Turbo Boost should be replaced by optimized code with (say) 4x vectorization, 4x multithreading, even though this limits Turbo Boost. The most important bottlenecks have done this already.

2022.06.15 09:57:32, replying to "Christian Wissel (@Gnarfoz)": The claim I'm disputing isn't that turning off Turbo Boost makes something run slower. The claim I'm disputing is that turning off Turbo Boost "has an extreme system-wide performance impact".

2022.06.15 09:50:47, replying to "Gok (@Gok)": Most Intel and AMD chips at base frequency are way below thermal limits with good fans + server-room temperatures. ARM development boards tend to be harder to cool and often set their nominal frequencies too high; running the boards at lower frequencies helps them last longer.

2022.06.15 09:21:26, replying to "Ruben Kelevra (@RubenKelevra)": Your numbers seem to indicate that, on your 4-core machine, Firefox startup wasn't even managing to keep 2 cores active on average. Could this limited attention to Firefox startup optimization perhaps be a hint that most users aren't spending all day waiting for Firefox to start?

2022.06.15 09:27:50: The code that users spend the most time waiting for has much bigger incentives to be vectorized and multithreaded, even though this limits the Turbo Boost speedup. I'd expect a claim of an "extreme system-wide performance impact" to be backed by numbers for common bottlenecks.

2022.06.15 09:13:44, replying to "Gok (@Gok)": Seems to me that CPUs are generally acquiring more and more cores (even on laptops), and more and more performance-critical code is switching to using those cores, so it's becoming increasingly obsolete to declare that system performance is judged by CPUs running just one thread.

2022.06.15 09:06:35, replying to "Robert Merget (@ic0nz1)": The claim at issue of an "extreme system-wide performance impact" is from outside researchers, not from Intel and AMD. Of course CPU manufacturers collect detailed performance evaluations regarding hundreds of ideas to see which ones will save percentages here and there.

2022.06.15 08:59:46, replying to "Ruben Kelevra (@RubenKelevra)": The claim at hand isn't that there's a measurable performance difference. The claim at hand is that there is an "extreme system-wide performance impact".

2022.06.15 08:57:44, replying to "Gok (@Gok)": 1. Idle cores draw much less power even at full speed. 2. Almost all of my power consumption is for servers that I'm buying to get computations done (as opposed to Raspberry Pi etc for benchmarking). 3. The claim I'm disputing is about Turbo Boost, not about slow-down-when-idle.

2022.06.15 08:44:22, replying to "Ruben Kelevra (@RubenKelevra)": "Obviously"? Have you measured the difference? Would you call it "extreme"? If it turns out that you're waiting primarily for the CPU to finish web-page computations (and not the network), wouldn't the best way to reduce latency be to split those computations across _all_ cores?

2022.06.15 08:10:26: As someone who happily runs servers and laptops at constant clock frequencies (see for Linux advice) rather than heat-the-hardware random frequencies, I dispute the claim in that this has an "extreme system-wide performance impact".

2022.06.15 08:19:36: Using all server cores _while keeping the hardware alive for a long time_ is what gets the most computation done per dollar. My experience running >100 servers of many different types is that the best clock frequencies for this are at or below base frequency, no Turbo Boost.

2022.06.15 08:26:46: Meanwhile I'm rarely waiting for my laptop, even with it running at very low speed. I'm happy with the laptop staying cool and quiet. Yes, I know there are some people using monster "laptops" where I'd use a server, but are they really getting "extreme" benefits from Turbo Boost?

2022.06.15 08:32:25: It's easy to find Intel laptops where the nominal top Turbo Boost frequency is more than twice the base frequency. These laptops can't run at anywhere near that top frequency for optimized computations running on all cores. Where's the "extreme system-wide performance impact"?

2022.06.15 08:38:21: What I find particularly concerning about these unquantified claims of an "extreme" impact is that, in context, these claims are trying to stop people from considering a straightforward solution to a security problem. If the costs are supposedly unacceptable, let's hear numbers.

2022.06.08 01:05:01: Posted an AVX2-vectorized-sorting benchmarking script covering djbsort, vxsort, vqsort. (vxsort and vqsort also support AVX-512.) The middle part of this Skylake graph is the part that matters for crypto, and also the base case for larger quicksort etc. [media]

2022.06.08 01:20:21: The graph says a lot about performance. For example, the blue increases on the right are from many L1 cache misses. Maybe there has to be a tradeoff between constant-time and fast; but should try optimizing the sorting-network array-access pattern, as noted in the documentation.

2022.06.08 01:25:11: Assuming I haven't misunderstood how to use vqsort: The graph shows vqsort often losing badly to vxsort-cpp, and never beating it by much. The green mountain in the middle of the graph is a striking performance regression. Fixing it would also improve vqsort for larger sizes.

2022.06.08 01:54:22: In current vqsort, the base case is (for AVX2) a size-128 sorting network, where vqsort looks slightly better than previous work... because it fully unrolls the code for that size. Can't fit many such sizes in insn caches. djbsort and vxsort put more effort into code compression.

2022.06.08 02:05:05: Since vqsort manages to catch up to vxsort for large sizes despite being worse in the base case, the vqsort splits must be faster than the vxsort splits, so vxsort could gain speed by adopting those. For djbsort, this is ruled out by the requirement of being constant-time.
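
To illustrate why a sorting network fits a constant-time requirement while data-dependent quicksort splits do not, here is a small sketch. It uses an odd-even transposition network purely for brevity; this is an assumption for illustration and is not the network used by djbsort, vxsort, or vqsort. Python's `min`/`max` are also not constant-time primitives; the point is only that the sequence of positions compared depends on the array length and never on the data.

```python
def transposition_network(n):
    """Comparator list (i, i+1) for odd-even transposition sort on n inputs.

    The list depends only on n, never on the values being sorted.
    """
    return [(i, i + 1)
            for r in range(n)                  # n rounds are enough to sort
            for i in range(r % 2, n - 1, 2)]   # alternate even and odd pairs

def apply_network(network, values):
    """Run every compare-exchange in order and return the sorted copy."""
    x = list(values)
    for i, j in network:
        lo, hi = min(x[i], x[j]), max(x[i], x[j])
        x[i], x[j] = lo, hi                    # same accesses for any data
    return x

if __name__ == "__main__":
    net = transposition_network(8)
    print(len(net), "comparators")
    print(apply_network(net, [5, 3, 8, 1, 9, 2, 7, 4]))
```

This network uses roughly n^2/2 comparators, far more than the networks used in practice, but it has the same key property: the memory-access pattern and instruction sequence are fixed in advance.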

2022.06.08 02:32:08: Regarding benchmarks, it's important to realize that measuring only size 128 misses opportunities to speed up other quicksort base-case sizes, and measuring only size 1M adds unnecessary fog by adding many levels of splits to the base cases. Many sizes appear; measure them all!

2022.06.07 20:12:44, replying to "Richard Startin (@richardstartin)": When I started collecting vectorized-sorting references years ago, I quickly noticed a pattern of previous work not citing other clearly relevant previous work ... so to compensate I did _more_ searches, finding many interesting implementations and papers:

2022.06.07 20:20:11: Smaller searches today (on duckduckgo to avoid filter bubbles) find vxsort without trouble. It's puzzling to see Google claiming state-of-the-art speeds without a direct comparison. Also, days after being pointed to djbsort and vxsort, Google still can't manage to run benchmarks?

2022.06.07 20:01:14, replying to "Cat (@eigma)": The 16x8 clarification is useful, but the lack of timings is surprising, as is the "some use cases may be interested in smaller arrays" comment. Handling smaller arrays faster would make vqsort faster for _all_ sizes. The vqsort paper+code spend serious effort on the base case.

2022.06.05 22:07:25, replying to "Jacob Christian Munch-Andersen (@NoHatCoder)": Um, no, that's not how Intel CPUs work. Intel prioritizes speed, and then tries to reduce power without noticeable slowdowns. Agner Fog's example is AVX2 ramping up to full power in 56000 cycles and staying there unless there's _no_ 256-bit instruction for _millions_ of cycles.

2022.06.05 21:27:27, replying to "Jacob Christian Munch-Andersen (@NoHatCoder)": Um, page 155 of reports full-power AVX2 after 56000 cycles on almost the same CPU. I measure first vqsort int32[256] call above 50000 cycles, then ~10000 for the next three runs, then rapidly settling down to around 8000. (djbsort: 4615, 2026, 1361, etc.)

2022.06.05 21:38:27: First-call performance in this type of benchmark isn't interesting for applications that keep their main-loop code size under control; that's why I reported the stable ~8000-cycle figure. For people familiar with the Skylake performance characteristics, >30 runs are ample data.

2022.06.05 21:49:48: I understand that many people aren't immersed in CPU microarchitecture, so I've now run a 3-second sequence of 1048576 calls to rdtsc+vqsort int32[256] on the same Skylake. An average call takes 8292 cycles, 6x slower than djbsort. (rdtsc and other loop overheads use <30 cycles.)

2022.06.05 22:25:29: Tweaking the bench_sort from vqsort to use M = 256 reports 364 MB/s, i.e., 8.24 cycles/byte at 3GHz, which is around 8400 cycles. M = 1024 gives 645 MB/s, i.e., 4.68 cycles/byte, above 19000 cycles. Looks like a bit more timing overhead than my vqsort test, but basically matches.

2022.06.05 18:15:39, replying to "Jacob Christian Munch-Andersen (@NoHatCoder)": Ran a loop of 33 rdtsc+vqsort, each >8000 cycles for the smaller size that I mentioned. One always expects initial calls to be outliers (not just for AVX2 ramp-up; the big starting issue is code caching); djbsort's int32-speed reports medians and quartiles.

2022.06.05 18:19:57: AVX2 usage has also become so pervasive in typical code that it's not surprising for the CPU to always have the AVX2 unit warmed up; cooldown is triggered after millions of non-AVX2 cycles. But the more important point is to always check for variations across many measurements.
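
A minimal sketch of checking for variation across many measurements, reporting the first call separately along with the median and quartiles. Everything here is an illustrative assumption: Python's `time.perf_counter_ns` is a far coarser tool than rdtsc, the built-in `sorted` stands in for the routine being measured, and the helper names are hypothetical.

```python
import random
import statistics
import time

def time_calls(fn, make_input, calls=33):
    """Time each call separately; return the list of durations in ns."""
    samples = []
    for _ in range(calls):
        data = make_input()                  # fresh input, built outside the timer
        t0 = time.perf_counter_ns()
        fn(data)
        samples.append(time.perf_counter_ns() - t0)
    return samples

def summarize(samples):
    """Report the first call (often an outlier) plus quartiles of all calls."""
    q1, median, q3 = statistics.quantiles(samples, n=4)
    return {"first": samples[0], "q1": q1, "median": median, "q3": q3}

if __name__ == "__main__":
    rng = random.Random(0)
    make_input = lambda: [rng.randrange(-2**31, 2**31) for _ in range(256)]
    print(summarize(time_calls(sorted, make_input)))
```

Reporting the median and quartiles rather than a single number makes warm-up outliers and run-to-run noise visible instead of hiding them in an average.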

2022.06.05 07:57:39, replying to "Danilo (@oak_doak)": Ryzen 5 3600 is a Zen 2 chip, so (unlike Zen 3 and most Intel CPUs) it has very slow pext. Looks like vqsort's 64-bit AVX2 code blindly uses pext.

2022.06.05 07:03:08, replying to "0b0000000000000 (@0b0000000000000)": Sorting in L1 cache is the most important use case in post-quantum crypto and many other applications. It's also the base case inside vqsort, and something the vqsort paper and code put considerable effort into. The vqsort claim was "fastest", not just "fastest for large sizes".

2022.06.05 07:08:03: So far I haven't been able to verify these vqsort speed claims. On the contrary, it seems that, for 32-bit data types on AVX2, vqsort would be faster if its base-case code were replaced by a call to the 2018 djbsort code. Similarly, vqsort should reuse vxsort-cpp for AVX-512.

2022.06.05 06:51:30, replying to "Danilo (@oak_doak)": The growing corner of CPUs with AVX-512 can definitely do better with that than using similar AVX2 code, but the paper says "fastest sort for individual (non-tuple) keys on AVX2 and AVX-512", which I understand to mean fastest on CPUs with AVX-512 _and_ on CPUs with just AVX2.

2022.06.05 03:39:55, replying to "Ruben Kelevra (@RubenKelevra)": Intel Xeon E3-1220 v5, pinned at 3GHz. Turbo Boost (which would be 3.5GHz) disabled. No evidence of any AVX2 throttling. Reasonable cooling, no evidence of thermal throttling, plus these were very short single-core runs. Both of the pieces of code being benchmarked were AVX2.

2022.06.04 23:39:13: Tried Google's new vectorized quicksort code vqsort on Skylake, and timed Sorter() as ~8000 cycles for int32[256] (big chunk of code for a size-specific sorting network), ~19000 cycles for int32[1024] (non-constant-time). djbsort is 1230, 6286 (ct). Did I misuse vqsort somehow?

2022.06.04 23:44:01: I included algo-inl.h and vqsort.h, did auto s = Sorter() and SortAscending order, and then did s(x,N,order) on an array x of N int32_t values. Size seems ok; I checked that the output is sorted correctly and (in a separate run to not slow things down) that asan doesn't complain.

2022.06.04 23:50:39: I also checked in gdb that the dispatch is calling the AVX2 code (presumably the fastest vqsort option for Skylake, as opposed to machines with AVX-512 such as Skylake-X). I didn't notice an API with lower per-call overhead, and per-call can't explain the jump from 8000 to 19000.

2022.06.05 00:04:21: I also looked briefly at the vqsort paper, which says "outperforms state of the art architecture-specific algorithms". Section 3 describes vqsort's size-256 sorting network, but is missing microbenchmarks and comparisons to, e.g., sid1607, djbsort, vxsort.

2022.06.05 00:08:37: Oops, sorry, the "outperforms" quote is from the accompanying blog post. The paper says that vqsort is "the fastest sorting implementation known to us for commercially available shared-memory machines". Should I maybe have added some extra cmake options?

2022.06.05 00:12:23: Another theory is that the speeds depend on a newer compiler version than what I tried, but this theory doesn't seem compatible with the "performance portability" and "production-readiness" parts of the advertising. I also tried a Broadwell with gcc 10.2.1, which isn't so old.

2022.06.02 17:02:50: Eurocrypt talk today presented cryptosystems using lattices secretly isometric to a public easy-to-decode lattice, and portrayed this as analogous to McEliece using codes secretly isometric to a public easy-to-decode code. That's not what McEliece does!

2022.06.02 17:13:49: Beyond isometry, there are many ways to hide codes; see for a survey. McEliece takes a secret scaling (from the secret polynomial g), plus a subfield subcode (the scaling isn't an isometry on the resulting code), plus a permutation (the isometry part).

2022.06.02 17:23:39: This is important because if McEliece relied _just_ on the secret isometry (the permutation) then it would be broken by Sendrier's 2000 support-splitting algorithm. Now a new proposal relies purely on secret isometries, misrepresents McEliece, and downplays Sendrier? Alarm bells!

2022.06.02 17:35:17: Meanwhile the trend in code-based cryptography is to add _more_ defenses against potential attacks. For example, Classic McEliece describes secret puncturing, taking the code length n below a power of 2, as an "extra defense", and uses this for most proposed parameter sets.

2022.06.02 17:58:49: Secret puncturing can't hurt security. The 2012 challenge to break a secretly punctured secretly permuted symmetric public code (BCH) is designed to shed light on whether secret puncturing helps security. Secret puncturing is then layered _on top_ of McEliece's original defenses.

2022.06.01 19:16:00: Yet another paper appears claiming to chop a further percentage out of lattice security levels against quantum attacks. But we keep hearing that we're not supposed to worry about continual lattice security degradation. Let's look at the logic behind this.

2022.06.01 19:22:57: First of all, we're told to ignore the (im)maturity of the security analysis, as reflected by the (in)stability of quantitative security levels. Quantification is dumbed down into a yes/no question: is this attack as expensive as a brute-force attack against a single AES-128 key?

2022.06.01 19:28:13: Max cost of an AES-128 key search is 2^128 AES evaluations, about 2^143 bit ops. Quantum attacks sound much cheaper, about 2^64 quantum AES evaluations, but we're not supposed to worry: a qubit op probably costs roughly 2^40 bit ops; also, P-way parallel attacks gain only 2^64/P.
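
The exponents in this entry can be combined by simple addition of base-2 logarithms. The sketch below does that. The figures 2^128, 2^143, 2^64 and 2^40 come from the entry; the step that prices one quantum AES evaluation at the same number of operations as a non-quantum one is a crude assumption made here purely for illustration, and the variable and function names are hypothetical.

```python
# All quantities are base-2 logarithms, taken from the entry above.
AES_EVALS_LOG2 = 128          # max AES evaluations in a key search
KEY_SEARCH_BIT_OPS_LOG2 = 143 # the same search counted in bit ops
QUANTUM_EVALS_LOG2 = 64       # quantum AES evaluations
QUBIT_OP_COST_LOG2 = 40       # rough cost of one qubit op, in bit ops

# Implied cost of a single AES evaluation: 2^143 / 2^128 bit ops.
ops_per_eval_log2 = KEY_SEARCH_BIT_OPS_LOG2 - AES_EVALS_LOG2


def quantum_search_cost_log2(qubit_ops_per_eval_log2=ops_per_eval_log2):
    """Bit-op-equivalent cost of the serial quantum search.

    ASSUMPTION (not from the entry): one quantum AES evaluation needs
    about as many qubit ops as a non-quantum evaluation needs bit ops.
    """
    return QUANTUM_EVALS_LOG2 + qubit_ops_per_eval_log2 + QUBIT_OP_COST_LOG2


print("bit ops per AES evaluation: 2^%d" % ops_per_eval_log2)
print("serial quantum search: about 2^%d bit-op equivalents"
      % quantum_search_cost_log2())
```

Under these assumptions the serial quantum search comes out far more expensive than the raw count of 2^64 evaluations suggests, which is the shape of the "not supposed to worry" argument being described.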

2022.06.01 19:32:37: Whenever there's an improved quantum attack against lattices, we're told to ignore it because (1) the speedup is still smaller than the quantum AES speedup; (2) the lattice parameters are chosen to be as hard to break as AES-128 pre-quantum; (3) ergo they're also ok post-quantum.

2022.06.01 19:45:36: But is it true that parameters are chosen this way? We already have ECC for pre-quantum security. Consider Lyubashevsky saying that the lattice quantum speedup was "just a dozen or so bits" and, on this basis, recommending smaller lattice parameters.

2022.06.01 19:52:27: Furthermore, it's not just quantum attacks getting better. During NISTPQC, Kyber's pre-quantum AES comparison has degraded from (1) "conservative" to (2) bleeding edge in bit ops to (3) apparently broken in bit ops and bleeding edge in AT, despite tweaks to try to add security.

2022.06.01 20:23:17: The latest everything-is-fine narrative emphasizes that attacks aren't feasible yet. Cryptanalysts aren't even acknowledged for successfully breaking the original as-strong-as-AES claim. This is a big regression from the traditional emphasis on quantitative algorithm speedups.

2022.06.01 20:30:29: Systematically encouraging publication of algorithm speedups is by far the community's best way of finding out whether proposed cryptosystems are breakable. This means measuring and acknowledging the speedups, not making one excuse after another to downplay or deny the speedups.

2022.05.31 00:28:51: Very small software release showing how easy it is to beat NTL (which is faster than PARI, Sage, etc.) by a factor >100 for typical input sizes in an important subroutine, algebraic norm computation, _if_ the number field is a power-of-2 cyclotomic field:

2022.05.31 00:36:08: The basic algorithm here was already known, but previous software wasn't showing what the algorithm can accomplish. More complicated algorithms handle more cyclotomics and their subfields (number theorists say "Abelian fields") as long as the degree is smooth. Paper coming soon.
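
For concreteness, here is a sketch of one well-known way to compute norms in a power-of-2 cyclotomic field by descending a tower of subfields; it is an illustration under stated assumptions, not a description of the released software. An element of Z[x]/(x^n+1), n a power of 2, is written f(x) = f0(x^2) + x*f1(x^2); its relative norm to the subring Z[y]/(y^(n/2)+1), y = x^2, is f(x)*f(-x) = f0(y)^2 - y*f1(y)^2, and repeating this halves the degree until an integer remains. The function names are hypothetical, and the schoolbook multiplication is chosen for readability, not speed.

```python
def negacyclic_mul(a, b):
    """Multiply two coefficient lists modulo y^m + 1 (schoolbook)."""
    m = len(a)
    c = [0] * m
    for i in range(m):
        for j in range(m):
            k = i + j
            if k < m:
                c[k] += a[i] * b[j]
            else:
                c[k - m] -= a[i] * b[j]   # y^m = -1
    return c


def cyclotomic_norm(f):
    """Norm to Z of f in Z[x]/(x^n+1); f lists n coefficients, low
    degree first, and n must be a power of 2."""
    n = len(f)
    assert n >= 1 and n & (n - 1) == 0, "length must be a power of 2"
    f = list(f)
    while len(f) > 1:
        f0, f1 = f[0::2], f[1::2]          # even and odd coefficients
        sq0 = negacyclic_mul(f0, f0)
        sq1 = negacyclic_mul(f1, f1)
        y_sq1 = [-sq1[-1]] + sq1[:-1]      # multiply by y mod y^m + 1
        f = [u - v for u, v in zip(sq0, y_sq1)]
    return f[0]


print(cyclotomic_norm([3, 4]))        # 3 + 4i in Z[i]
print(cyclotomic_norm([1, 1, 0, 0]))  # 1 + x in Z[x]/(x^4+1)
```

A faster version would replace the schoolbook products with fast multiplication; the recursion structure stays the same.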

2022.05.26 22:54:16: Now posted a collection of evidence that the publication of DH was driven by the usual academic publication incentives, not by patents: As part of digging into the history, freed up various related litigation documents via RECAP:

2022.05.26 23:09:08: Most patents that I've studied are on "inventions" that were published by other people independently, giving simpler examples of the damage done by the patent system. DH is unusual in that it wasn't published independently; that's why one has to look more closely at the history.

2022.05.26 23:18:48: Meanwhile pro-patent articles such as (1) say X was patented, (2) say the disclosure+deployment of X are of societal value, (3) leap to the conclusion that the patent on X was of societal value, and (4) never ask whether X would have been published anyway.

2022.05.26 20:26:24: Updated software to parse the China NHC case announcements (now more fields in output data): Shanghai is down to 1% of peak from 6 weeks ago, again raising question of what the most tolerable measures would be if the goal were instead set at ensuring R<1. [media]

2022.05.25 00:51:51: "I'm not happy with the field of candidates for post-quantum public-key signature systems." 2003, in the posting that introduced the "post-quantum" phrase: Then surveyed these sig systems; almost all now broken, except hash-based.

2022.05.25 01:12:29: Will today's post-quantum proposals turn out to have a better track record when we look back at them 18 years from now? Some people sound awfully confident in supposed dividing lines between new structured-lattice systems and old broken structured-lattice/structured-code systems.

2022.05.25 01:25:43: When there's a long history of cryptographic failures, is this because there are a dozen pitfalls surrounding a safe core idea? Or is the core idea fundamentally flawed? It's deeply disturbing to see cryptographic decisions being made by people who think these are easy questions.

2022.05.21 19:17:21, replying to "Shawn Willden 🇺🇸 🇺🇦 (@shawnwillden)": Would you say that "any organization that runs high-value edge- and cloud-computing applications that require large volumes of data to flow quickly between local nodes and decentralized sources of computing power" is facing the performance challenges of crypto on small devices?

2022.05.21 18:58:20: Management consultant Dogbert says that post-quantum cryptography is "impractical" for "high-value edge- and cloud-computing applications that require large volumes of data to flow quickly between local nodes and decentralized sources of computing power":

2022.05.21 19:04:54: After this performance BS and generic emerging-market FUD (unspecified "cost" and the risk of "having to switch to higher-performance PQC solutions that come to market in the future"), Dogbert concludes "most organizations should take a wait-and-see approach to PQC solutions".

2022.05.09 18:33:26: 2018, e.g. in Many cryptographers are blaming subtle overflow bugs on ECC carry chains. 2022, Subtle overflow bug in some Kyber code. Zero impact today, but we're about to see an explosion of lattice deployment and many more bugs.

2022.05.09 14:31:27, replying to "Rich Felker (@RichFelker)": I should emphasize that my question is about the population-level numbers. Examples of boosted people dying don't say that vaccines are useless; similarly, "here are examples of reinfections" isn't the same as "here are millions of new #LongCOVID cases after you said it's over".

2022.05.09 13:53:30, replying to "M:\arc -de B (@marcdeb)": The question about whether there will be millions of new #LongCOVID cases each year is a question about US numbers. The analogous worldwide question would say tens of millions but has less supporting data: studies of #LongCOVID prevalence typically focus on specific countries.

2022.05.09 14:09:42: COVID deaths should be relatively easy to track, but recent reports claim that some big countries have been undercounting deaths by an order of magnitude, versus ~1.2x in the US. Less severe infections are seriously undercounted in the US but wastewater data is readily available.

2022.05.09 03:52:19, replying to "Gordon Mohr | gojomo.eth ꧁ꙮꙮ꧂ (@gojomo)": The vaccination numbers and past-infection numbers are of course relevant, and were already taken into account in the question at the top of the thread.

2022.05.09 03:48:50, replying to "Gordon Mohr | gojomo.eth ꧁ꙮꙮ꧂ (@gojomo)": Another random example claiming "The Omicron wave will leave most people with potent and durable protection against Covid": This once-and-done belief is driving COVID (in)action today in the US. But what happens if it fails for millions of people per year?

2022.05.09 03:34:23, replying to "Gordon Mohr | gojomo.eth ꧁ꙮꙮ꧂ (@gojomo)": Example of a popular Twitter account stating a policy rationale where (1) the resulting actions are indistinguishable from today's mainstream COVID actions and (2) the rationale focuses exclusively on the percentage of people who have caught COVID so far:

2022.05.09 01:48:04: Is there any scientific basis for today's "Catch COVID once and that's the end of it" narrative? Why shouldn't COVID mutations keep escaping immunity, creating big new waves every year that, in the US, kill hundreds of thousands of people and inflict #LongCOVID on millions more?

2022.05.09 02:31:29: There's ample evidence of vaccination reducing typical severity + reducing #LongCOVID rates, but reducing isn't eliminating. Mutation is now much faster than vaccine updates. Improved treatments have been reducing short-term death rates but so far doing little against #LongCOVID.

2022.05.09 02:45:39: If another big wave brings millions of new #LongCOVID cases, will the it's-over narrative disintegrate? Or will we see more downplaying and denial (there aren't many of these people, they're imagining their symptoms, etc.), and vociferous insistence that nothing else can be done?

2022.05.07 14:42:53: "Math doesn’t seem to care very much about the energy budget of an earth sized planet or how many atoms are in its crust, so if your claim to have a large security margin relies crucially on exceeding some such physical limit, you m