Cryptography innovations in hardware processors
Article by: Wajdi Feghali, Intel
Industry will need to develop innovative hardware and optimized software solutions to accelerate the creation of tomorrow’s cryptography.
It is highly likely that in the future everything will be encrypted, from your shopping list to your medical file. It’s an exciting notion, but the field of cryptography is particularly volatile, and a lot of work is going on right now to ensure that data can be secure in the future.
Multiple cryptographic operations can be applied to each byte of data because the data is cryptographically protected across multiple layers of software, network, and storage stacks. These processes support very critical business functions that require increased security, but at the hardware level they are some of the most compute-intensive operations in existence. And the demand for cryptographic calculations continues to grow, with the amount of data generated each year increasing exponentially and companies using larger keys, as well as multiple concurrent cryptographic algorithms, to bolster security. All the while, these IT demands continue to swell.
To combat the problem of cryptographic computation costs, the hardware industry has strived to produce new guidelines, microarchitectural improvements, and innovative software optimization methods. Strong examples of these advancements over the years include the introduction of next-generation fixed-function processor instructions that have reduced the computational requirements of Advanced Encryption Standard (AES) symmetric encryption and more recently FIPS algorithms. As a result, organizations have increasingly committed to implementing strong cryptographic ciphers to better secure data and communications over the past 10 years.
But as advances in quantum computing continue to accelerate, the security effectiveness of symmetric and asymmetric ciphers may be threatened. Increasing the size of the keys (from 128 to 256 bits) can help make symmetric algorithms (such as AES) more resistant to quantum attacks, but again, this solution results in higher computational costs. Asymmetric ciphers (such as RSA and ECDSA) will also most likely be insufficient. Many have said that the raw power of quantum computers will be the death of encryption, but we don’t think it will.
The encryption schemes in place today, mentioned above, will likely be supplanted by new post-quantum cryptographic approaches. The industry is actively working to transition to new cryptography standards tailored to address these looming post-quantum security challenges. In fact, many proposals have already been submitted to the NIST Post-Quantum Cryptography (PQC) competition, and they vary in their key size, storage, and compute requirements.
As the age of quantum computing approaches, the industry will need to rally together to move towards new methods and standards.
What will this change look like? The transition will be long, and the existing crypto will remain in place until the industry is able to fully embrace emerging quantum-resistant algorithms. We expect this will result in high computational load, and organizations will not embrace stronger encryption until the underlying post-quantum algorithms are economically viable from a compute performance perspective.
To accelerate the creation of tomorrow’s cryptography, industry will need to develop inventive hardware improvements and optimized software solutions that work together to reduce compute requirements. The good news is that we are not starting from scratch.
Here are six key examples of crypto performance improvements and innovations underway today:
1. TLS (Transport Layer Security) cryptographic algorithms – TLS protocols operate in two phases. The first is session initiation. When a session is initiated, the client must communicate private messages to the server using a public key encryption method (often RSA) before the protocol generates a shared secret key. RSA is based on modular exponentiation, a computationally expensive operation that accounts for most of the processor cycles spent on TLS session initiation. Combining RSA with an algorithm such as Elliptic Curve Cryptography (ECC), using techniques such as perfect forward secrecy, can provide even greater security.
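To make the cost of modular exponentiation concrete, here is a minimal Python sketch of the textbook square-and-multiply method. The function name `mod_exp` and the toy RSA parameters are illustrative assumptions, not taken from any TLS library, and the key is far too small for real use. Production code relies on constant-time, hardware-optimized implementations rather than a loop like this.

```python
def mod_exp(base: int, exponent: int, modulus: int) -> int:
    """Compute (base ** exponent) % modulus with right-to-left square-and-multiply."""
    if modulus == 1:
        return 0
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:
            # Current exponent bit is set: fold the running power into the result.
            result = (result * base) % modulus
        # Square the running power for the next exponent bit.
        base = (base * base) % modulus
        exponent >>= 1
    return result


# Toy RSA key (assumed values for illustration only; real keys are 2048+ bits).
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+

message = 65
ciphertext = mod_exp(message, e, n)    # "encrypt" with the public exponent
recovered = mod_exp(ciphertext, d, n)  # "decrypt" with the private exponent
```

Each exponent bit costs one modular squaring and, when the bit is set, one extra modular multiplication. With 2048-bit or larger operands that adds up to thousands of big-number multiplications per private-key operation.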
In the second phase, the bulk data is transferred. The protocols encrypt data packets to ensure confidentiality and use a cryptographic hash-based Message Authentication Code (MAC) over the data to protect against attempts to modify it in transit. Encryption and authentication algorithms protect TLS bulk data transfers, and in many cases combining the two can increase overall performance. Some cipher suites such as AES-GCM even define combined “encryption + authentication” modes.
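The authentication half of bulk data protection can be sketched with Python's standard `hmac` and `hashlib` modules. This is only an illustration of how a hash-based MAC detects modification in transit. The helper names `make_tag` and `verify_tag` and the sample packet are assumptions, and the sketch leaves out encryption and the TLS record format entirely.

```python
import hashlib
import hmac
import os


def make_tag(key: bytes, packet: bytes) -> bytes:
    """Return an HMAC-SHA256 tag over the packet."""
    return hmac.new(key, packet, hashlib.sha256).digest()


def verify_tag(key: bytes, packet: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, packet), tag)


key = os.urandom(32)            # shared secret agreed during session setup
packet = b"bulk data record"    # hypothetical payload
tag = make_tag(key, packet)

tampered = b"bulk data recorD"  # one byte changed in transit
accepted = verify_tag(key, packet, tag)
rejected = verify_tag(key, tampered, tag)
```

Here `accepted` is true for the untouched packet and `rejected` is false for the modified one, which is the property the receiver depends on.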
2. Public key cryptography – To improve the performance of the “large number” multiplication common in public key ciphers, some vendors are creating new instruction sets. For example, Intel Ice Lake processors introduced support for the AVX512 Integer Fused Multiply Add (AVX512_IFMA) instruction set architecture (ISA). The instructions multiply eight 52-bit unsigned integers held in 512-bit wide registers (ZMMs), produce the high and low halves of each result, and add them to 64-bit accumulators. Combined with software optimization techniques (such as multi-buffer processing), these instructions can provide significant performance improvements not only for RSA, but also for ECC.
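The arithmetic behind these instructions can be emulated in plain Python. The sketch below models a single 64-bit lane of the low-half and high-half multiply-add operations and uses them for a schoolbook big-number multiplication in radix 2^52. All function names are illustrative assumptions. Real code would use the vector instructions through intrinsics or assembly and process eight lanes at once.

```python
MASK52 = (1 << 52) - 1
MASK64 = (1 << 64) - 1


def madd52_lo(acc: int, a: int, b: int) -> int:
    """One lane: add the low 52 bits of the 104-bit product a*b to a 64-bit accumulator."""
    product = (a & MASK52) * (b & MASK52)
    return (acc + (product & MASK52)) & MASK64


def madd52_hi(acc: int, a: int, b: int) -> int:
    """One lane: add the high 52 bits of the 104-bit product a*b to a 64-bit accumulator."""
    product = (a & MASK52) * (b & MASK52)
    return (acc + (product >> 52)) & MASK64


def to_limbs(value: int, count: int) -> list:
    """Split an integer into `count` 52-bit limbs, least significant first."""
    return [(value >> (52 * i)) & MASK52 for i in range(count)]


def multiply_52bit_limbs(x: int, y: int, count: int) -> int:
    """Schoolbook multiplication built only from the two lane operations above."""
    xs, ys = to_limbs(x, count), to_limbs(y, count)
    acc = [0] * (2 * count)
    for i in range(count):
        for j in range(count):
            # The low half lands in column i+j, the high half one column up.
            acc[i + j] = madd52_lo(acc[i + j], xs[i], ys[j])
            acc[i + j + 1] = madd52_hi(acc[i + j + 1], xs[i], ys[j])
    # Each accumulator holds a sum of 52-bit terms; the 12 spare bits absorb the
    # carries, so they only need resolving once, here.
    return sum(limb << (52 * k) for k, limb in enumerate(acc))


x = (1 << 400) - 12345
y = (1 << 390) + 6789
product = multiply_52bit_limbs(x, y, 8)
```

Using 52-bit limbs inside 64-bit accumulators leaves 12 bits of headroom, so many partial products can be summed before any carry handling is needed. That headroom is a large part of what makes this layout attractive for RSA-sized and ECC-sized operands.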
3. Symmetric encryption – Two instruction enhancements increase the performance of AES symmetric encryption: vectorized AES (VAES) and vectorized carry-less multiplication. VAES instructions have been extended to support vector processing of up to four AES (128-bit) blocks at a time using 512-bit wide registers (ZMMs), and when properly used, they provide a performance advantage to all AES modes of operation. Some vendors have also extended vector processing support for up to four carry-less multiplication operations at a time using 512-bit wide registers (ZMMs) to provide additional performance for the Galois hash and the widely used AES-GCM cipher.
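The multiplication used by the Galois hash is carry-less: operands are treated as polynomials over GF(2), so partial products are combined with XOR instead of addition. A minimal Python sketch follows. The function name `clmul` is an assumption, and the sketch omits the modular reduction step that GHASH performs afterwards.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less multiply: shift-and-XOR instead of shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # XOR in the shifted operand; no carries propagate
        a <<= 1
        b >>= 1
    return result


# (x + 1) * (x + 1) = x^2 + 1 over GF(2), so 0b11 times 0b11 gives 0b101.
carry_less = clmul(0b11, 0b11)
ordinary = 0b11 * 0b11
```

In this example `carry_less` is 5 while `ordinary` is 9. The difference comes entirely from the carries that ordinary addition propagates and XOR does not. A hardware instruction performs the whole loop in a single operation on 64-bit operands, and the vectorized form does several of those at once.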
4. Hash – Computational performance can be improved by creating new extensions for the Secure Hash Algorithm (SHA). SHA-256, for example, digests arbitrarily sized data into a fixed-size 256-bit value. These extensions include instructions that dramatically improve SHA-256 performance, allowing cryptographic hashing to be used more widely.
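The fixed-size property is easy to see with Python's standard `hashlib` module. This short sketch hashes inputs of very different lengths. The sample inputs are arbitrary, and whether the underlying library uses hardware SHA extensions depends on the platform and build.

```python
import hashlib

inputs = [b"", b"abc", b"x" * 1_000_000]  # 0 bytes, 3 bytes, 1 MB

# Every digest is 32 bytes (256 bits), whatever the input length.
digest_sizes = [len(hashlib.sha256(data).digest()) for data in inputs]

# Large inputs can also be fed incrementally, chunk by chunk.
hasher = hashlib.sha256()
for offset in range(0, len(inputs[2]), 65536):
    hasher.update(inputs[2][offset:offset + 65536])
incremental_digest = hasher.digest()
```

The incremental form produces the same digest as hashing the whole buffer in one call, which is how streaming data is normally hashed.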
5. Function stitching – Function stitching was introduced in 2010. It is a technique for optimizing two algorithms that usually run in combination, but sequentially, such as AES-CBC and SHA256, by forming them into a single optimized algorithm focused on maximizing processor resource use and throughput. The result is a fine-grained interleaving of the instructions of each algorithm so that the two algorithms execute simultaneously. This allows processor execution units that would otherwise sit idle while running a single algorithm, because of either data dependencies or instruction latencies, to execute instructions from the other algorithm, and vice versa. This matters because algorithms still have strict dependencies that modern microprocessors cannot fully parallelize.
6. Multi-buffer – Multi-buffer is an innovative and efficient technique for processing multiple independent data buffers in parallel for cryptographic algorithms. Vendors have already implemented this technique for algorithms such as hashing and symmetric encryption. Processing multiple buffers simultaneously can yield significant performance improvements, both where the code can take advantage of single instruction, multiple data (SIMD) instructions (AVX / AVX2 / AVX512) and even where it cannot. This is important because more and more data requires cryptographic processing, and the availability of wider processor data paths will allow the industry to keep pace.
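The scheduling idea behind multi-buffer processing can be sketched in Python, with loud caveats. The example uses the non-cryptographic FNV-1a checksum purely as a stand-in for a real hash, and a plain loop over "lanes" stands in for SIMD registers. The function names are assumptions, and the sketch shows the data layout rather than any performance gain.

```python
FNV_OFFSET = 0x811C9DC5  # FNV-1a 32-bit offset basis
FNV_PRIME = 0x01000193   # FNV-1a 32-bit prime
MASK32 = 0xFFFFFFFF


def fnv1a(data: bytes) -> int:
    """Reference single-buffer FNV-1a (a stand-in, NOT a cryptographic hash)."""
    state = FNV_OFFSET
    for byte in data:
        state = ((state ^ byte) * FNV_PRIME) & MASK32
    return state


def fnv1a_multi_buffer(buffers: list) -> list:
    """Advance all buffers in lockstep, one byte per lane per step."""
    states = [FNV_OFFSET] * len(buffers)
    longest = max((len(buf) for buf in buffers), default=0)
    for position in range(longest):
        # One "vector step": with SIMD, this inner loop would be a single
        # instruction sequence operating on every lane at once.
        for lane, buf in enumerate(buffers):
            if position < len(buf):  # lanes with no data left sit idle
                states[lane] = ((states[lane] ^ buf[position]) * FNV_PRIME) & MASK32
    return states


buffers = [b"first packet", b"second, longer packet", b"3rd", b""]
lockstep_results = fnv1a_multi_buffer(buffers)
reference_results = [fnv1a(buf) for buf in buffers]
```

Each buffer's state depends only on its own bytes, so the lanes never interact and the lockstep results match the one-at-a-time results. That independence lets a single-stream algorithm with serial dependencies fill wide vector registers.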
True quantum computing will be here before we know it, and the industry mindset has already started to shift from “Does this data need to be encrypted?” to “Why is this data not encrypted?” As a community, we need to focus on implementing advanced cryptography at the hardware level, along with associated algorithmic and software innovations, to meet the challenges presented by a post-quantum world. This will lead to more performance and security breakthroughs across a host of important encryption algorithms and help accelerate the transition to the next-generation crypto schemes that the industry will need to navigate the coming decade.
This article was originally published on EE Times.
Wajdi Feghali is an Intel Fellow.