CHERI-RISC-V and CHERI-x86-64, CHERI-extended ISAs for capability

Palli · January 16, 2024, 11:12am

CHERI is a research project more than a decade old, and it seems hardware is already available for RISC-V. I find this intriguing, and Julia may want to support it eventually. Note that the ISA extension(s) have changed recently, and not just since the now-dropped CHERI-MIPS. The project has produced lots of papers, including PhD theses. Capability operating systems are actually decades old, but went out of favor, I think because of (software) overhead. That seems to be changing, so security may no longer be costly.

A variety of programming-language and code-generation models can be used with a CHERI-extended ISA. As integer virtual addresses continue to be supported, C or C++ compilers might choose to always implement pointers via integers, selectively implement certain pointers as capabilities based on annotations or type information (i.e., a hybrid C interpretation), or alternatively always implement pointers as capabilities except where explicitly annotated (i.e., a pure-capability interpretation). Programming languages may also employ capabilities internal to their implementation: for example, to protect return addresses, vtable pointers, and other virtual addresses for which capability protection can provide enhanced vulnerability mitigation.
[…]
Capability instructions allow executing code to create, constrain (e.g., by reducing bounds or permissions), manage, and inspect capability register values. Both unsealed (memory) and sealed (object) capabilities can be loaded and stored via memory capability registers (i.e., dereferencing).
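To make the create/constrain/inspect vocabulary concrete, here is a toy Python model. It is not real CHERI code: the `Capability` class, its field names, and the `Perm` flags are all made up for illustration. What it tries to show is monotonicity, i.e. a derived capability can only get narrower bounds and fewer permissions than the one it came from.

```python
from dataclasses import dataclass
from enum import Flag


class Perm(Flag):
    """Hypothetical permission bits; real CHERI defines its own set."""
    LOAD = 1
    STORE = 2
    EXECUTE = 4


@dataclass(frozen=True)
class Capability:
    """Toy stand-in for a capability register value (illustrative only)."""
    base: int         # lowest address the capability may touch
    length: int       # number of bytes covered, starting at base
    perms: Perm       # what the holder may do within the bounds
    tag: bool = True  # validity tag; False means "not a usable capability"

    def set_bounds(self, new_base: int, new_length: int) -> "Capability":
        # Constrain: the new range must lie inside the current one.
        inside = (self.base <= new_base
                  and new_base + new_length <= self.base + self.length)
        if not (self.tag and inside and new_length >= 0):
            raise PermissionError("bounds can only shrink")
        return Capability(new_base, new_length, self.perms)

    def and_perms(self, mask: Perm) -> "Capability":
        # Constrain: intersecting with a mask can only drop permissions.
        return Capability(self.base, self.length, self.perms & mask, self.tag)

    def check(self, addr: int, size: int, needed: Perm) -> None:
        # Inspect/enforce: what a load or store would verify before access.
        in_bounds = (self.base <= addr
                     and addr + size <= self.base + self.length)
        if not (self.tag and in_bounds and needed in self.perms):
            raise PermissionError("capability check failed")


root = Capability(base=0x1000, length=0x100, perms=Perm.LOAD | Perm.STORE)
buf = root.set_bounds(0x1010, 16).and_perms(Perm.LOAD)  # read-only 16-byte view
buf.check(0x1010, 16, Perm.LOAD)  # passes: in bounds and LOAD is permitted
```

Asking `buf` for `Perm.STORE`, or trying `buf.set_bounds(0x1000, 32)` to widen it again, raises `PermissionError` in this model.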
[…]
Tagged memory associates a 1-bit tag with each capability-aligned and capability-sized word in physical memory, which allows capabilities to be safely loaded and stored in memory without loss of integrity.
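A rough way to picture the tag bit, again as a Python toy rather than anything from the spec: the `TaggedMemory` class and the 16-byte `CAP_SIZE` (i.e. 128-bit capabilities) are my assumptions. The idea is that only a capability store sets the tag, and any ordinary data write to the same word clears it, so bytes patched by hand can never be loaded back as a valid capability.

```python
from typing import Tuple

CAP_SIZE = 16  # bytes per capability-sized word (assumed: 128-bit capabilities)


class TaggedMemory:
    """Toy physical memory with one tag bit per capability-sized word."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.tags = [False] * (size // CAP_SIZE)

    def _check_range(self, addr: int, length: int) -> None:
        if addr < 0 or addr + length > len(self.data):
            raise IndexError("access outside memory")

    def store_cap(self, addr: int, raw: bytes) -> None:
        # Capability store: must be aligned and full-width; sets the tag.
        if addr % CAP_SIZE or len(raw) != CAP_SIZE:
            raise ValueError("capability stores must be aligned and full-width")
        self._check_range(addr, CAP_SIZE)
        self.data[addr:addr + CAP_SIZE] = raw
        self.tags[addr // CAP_SIZE] = True

    def store_bytes(self, addr: int, raw: bytes) -> None:
        # Ordinary data store: clears the tag of every word it touches.
        self._check_range(addr, len(raw))
        self.data[addr:addr + len(raw)] = raw
        first, last = addr // CAP_SIZE, (addr + len(raw) - 1) // CAP_SIZE
        for word in range(first, last + 1):
            self.tags[word] = False

    def load_cap(self, addr: int) -> Tuple[bytes, bool]:
        # Capability load: returns the raw bits plus the tag for that word.
        if addr % CAP_SIZE:
            raise ValueError("capability loads must be aligned")
        self._check_range(addr, CAP_SIZE)
        return bytes(self.data[addr:addr + CAP_SIZE]), self.tags[addr // CAP_SIZE]


mem = TaggedMemory(64)
mem.store_cap(16, bytes(range(16)))
_, tag_before = mem.load_cap(16)  # tag is set after a capability store
mem.store_bytes(20, b"\xff")      # forge attempt: patch a single byte
_, tag_after = mem.load_cap(16)   # tag has been cleared by the data store
```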
[…]
In keeping with the RISC philosophy, CHERI instructions are intended for use primarily by the operating system and compiler rather than directly by the programmer, and consist of relatively simple instructions that avoid (for example) combining memory access and register value manipulation in a single instruction.
[…]
CHERI-RISC-V applies CHERI protections in different ways compared to our initial CHERI-MIPS architecture. Some of these differences arise from differences in the base ISA; we anticipate that adaptations of CHERI to ISAs will adopt conventions such as instruction-encoding in keeping with their specific flavor and design.

Other design decisions reflect maturity of the CHERI model and lessons learned from CHERI-MIPS. In our initial work on CHERI, we utilized an uncompressed capability format in which each capability was 256 bits in size.

1.6 Experimental Features
[…]

A system for mixing 64-bit and 128-bit capabilities

Chapter 7 provides a detailed description of each CHERI-RISC-V instruction.
Chapter 8 provides a detailed description of each CHERI-x86-64 instruction.

Accurate garbage collection: Traditional implementations of C are not amenable to accurate garbage collection because unions and types such as intptr_t allow a register or memory location to contain either an integer value or a pointer. CHERI-C does not have this limitation: The tag bit makes it possible to accurately identify all memory locations that contain data that can be interpreted as a pointer. Garbage collection is the logical dual of revocation: garbage collection extends the lifetime of objects as long as they have valid references, whereas revocation curtails the lifetime of references once the objects to which they refer are no longer valid. A simple stop-the-world mark-and-sweep collector for C can perform both tasks, scanning all reachable memory, invalidating all references to revoked objects, and recycling unreachable memory.
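The "both tasks in one pass" idea can be sketched in a few lines of Python. This is only my reading of the paragraph, with a made-up heap layout: every slot is a `(tag, value)` pair, object ids stand in for addresses, and `collect` is a hypothetical name. Tagged slots are references; untagged slots are plain integers that are never followed, even if they happen to equal an object id.

```python
from typing import Dict, List, Set, Tuple

Slot = Tuple[bool, int]       # (tag, value): tag=True means value is a reference
Heap = Dict[int, List[Slot]]  # object id -> its slots (hypothetical layout)


def collect(heap: Heap, roots: List[Slot], revoked: Set[int]) -> Set[int]:
    """Stop-the-world pass: revoke stale references, then free unreachable objects.

    Returns the set of object ids that were recycled.
    """
    def scrub(slots: List[Slot]) -> None:
        # Revocation: clear the tag on any reference to a revoked object.
        for i, (tag, value) in enumerate(slots):
            if tag and value in revoked:
                slots[i] = (False, value)

    scrub(roots)
    marked: Set[int] = set()
    worklist = [value for tag, value in roots if tag]
    while worklist:
        obj = worklist.pop()
        if obj in marked or obj not in heap:
            continue
        marked.add(obj)
        scrub(heap[obj])
        # Only tagged slots are followed, so integers are never mistaken
        # for pointers; that is what makes the scan "accurate".
        worklist.extend(value for tag, value in heap[obj] if tag)

    dead = set(heap) - marked
    for obj in dead:
        del heap[obj]
    return dead


heap: Heap = {1: [(True, 2), (False, 3)], 2: [], 3: [], 4: [(True, 1)]}
roots: List[Slot] = [(True, 1), (True, 4)]
freed = collect(heap, roots, revoked={4})
```

Here object 3 is only "referenced" by an untagged integer, so it is recycled where a conservative collector would have had to keep it, and object 4 is recycled because the only reference to it was revoked.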

More complex garbage collectors typically rely on read or write barriers (i.e., mechanisms for notifying the collector that a reference has been read or written). These are typically inserted by the compiler; however, in the context of revocation the compiler-generated code must be treated as untrusted. It may be possible to use the permission bits – either in capabilities themselves or in page-table entries – to introduce traps that can be used as barriers.
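The spec only says this "may be possible", so take the following as a guess at the shape of such a scheme, not a description of anything CHERI ships. It models the page-table variant in Python: the collector clears a store-permission bit, the first store to the page faults, and the fault handler does the barrier work before the store is retried. `Page`, `Collector`, and `StoreTrap` are all hypothetical names.

```python
class StoreTrap(Exception):
    """Stands in for the hardware fault raised by a store to a protected page."""


class Page:
    """Toy page: a dict of words plus a store-permission bit (hypothetical)."""

    def __init__(self) -> None:
        self.words: dict = {}
        self.store_allowed = False  # cleared by the collector at cycle start

    def store(self, offset: int, value: int) -> None:
        if not self.store_allowed:
            raise StoreTrap()       # the check is not in compiler-generated code
        self.words[offset] = value


class Collector:
    def __init__(self) -> None:
        self.dirty: list = []       # pages written since the last scan

    def run_store(self, page: Page, offset: int, value: int) -> None:
        # Models "mutator stores, trap handler runs, store is retried".
        try:
            page.store(offset, value)
        except StoreTrap:
            self.dirty.append(page)    # barrier work: remember the page
            page.store_allowed = True  # later stores to it no longer trap
            page.store(offset, value)


collector = Collector()
page = Page()
collector.run_store(page, 0, 42)  # first store traps and is recorded
collector.run_store(page, 8, 43)  # second store goes straight through
```

The point of doing it this way is that the barrier does not depend on the untrusted code cooperating: skipping it would require a permission the code does not hold.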

Palli · March 3, 2024, 1:25pm

FYI: CHERI is mentioned in the White House report (as is Rust, but not Ada or Julia, though Julia would also be better than C and C++):

BACK TO THE BUILDING BLOCKS: A PATH TOWARD SECURE AND MEASURABLE SOFTWARE

taken major action, starting with Executive Order 14028 on Improving the Nation’s Cybersecurity, to drive the ecosystem to patch known classes of vulnerabilities through secure software development practices across the supply chain. Continuing to encourage both the government and the private sector to do this can have an outsized impact on improving the Nation’s cybersecurity. […]

Programmers writing lines of code do not do so without consequence; […] There are no “silver bullets” in cybersecurity […]

According to experts, both memory safe and memory unsafe programming languages meet these requirements. At this time, the most widely used languages that meet all three properties are C and C++, which are not memory safe programming languages. Rust, one example of a memory safe programming language, has the three requisite properties above, but has not yet been proven in space systems. […]

Therefore, to reduce memory safety vulnerabilities in space or other embedded systems that face similar constraints, a complementary approach to implement memory safety through hardware can be explored.

The chip, in particular, is an important hardware building block to consider. There are several promising efforts currently underway to support memory protections through hardware. For example, a group of manufacturers have developed a new memory-tagging extension (MTE) to cross-check the validity of pointers to memory locations before using them. If they are invalid, the CPU produces an error.[xvii] This technique is an effective method to detect memory safety bugs, but this approach should not be considered a comprehensive solution to prevent all memory safety exploits.[xviii] Another example of a hardware method is the Capability Hardware Enhanced RISC Instructions (CHERI).[xix] This architecture changes how software accesses memory, with the aim of removing vulnerabilities present in historically memory unsafe languages.[xx]
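For contrast with CHERI, the MTE "cross-check" can be pictured roughly like this. The report gives no details, so the numbers here are my assumptions, loosely following how Arm's MTE is usually described: a 4-bit tag kept in the upper bits of the pointer, compared against a 4-bit tag on each 16-byte granule of memory. `TaggedHeap` and its methods are a toy, not the real instruction set or any allocator API.

```python
GRANULE = 16    # bytes per tagged granule (assumed)
TAG_BITS = 4    # tag width in bits (assumed)
TAG_SHIFT = 56  # where this toy pointer keeps its tag (assumed)
ADDR_MASK = (1 << TAG_SHIFT) - 1


class TaggedHeap:
    """Toy memory-tagging model; not the real MTE instruction set."""

    def __init__(self, size: int) -> None:
        self.tags = [0] * (size // GRANULE)

    def _retag(self, addr: int, size: int) -> int:
        # Choose a nonzero tag that differs from the first granule's current one.
        new_tag = self.tags[addr // GRANULE] % (2 ** TAG_BITS - 1) + 1
        for g in range(addr // GRANULE, (addr + size - 1) // GRANULE + 1):
            self.tags[g] = new_tag
        return new_tag

    def alloc(self, addr: int, size: int) -> int:
        # Return a pointer whose upper bits carry the granules' new tag.
        return (self._retag(addr, size) << TAG_SHIFT) | addr

    def free(self, ptr: int, size: int) -> None:
        # Retag on free so stale pointers stop matching.
        self._retag(ptr & ADDR_MASK, size)

    def check(self, ptr: int) -> bool:
        # The cross-check: the pointer tag must equal the memory tag.
        return (ptr >> TAG_SHIFT) == self.tags[(ptr & ADDR_MASK) // GRANULE]


heap = TaggedHeap(256)
p = heap.alloc(32, 16)
ok_while_live = heap.check(p)             # tags match while the block is live
overflow_caught = not heap.check(p + 16)  # neighbouring granule has another tag
heap.free(p, 16)
uaf_caught = not heap.check(p)            # stale pointer no longer matches
```

With only 2**4 = 16 possible tag values, two unrelated granules can end up with the same tag, and then the check passes when it should not. That fits the report's remark that this detects bugs but is not a comprehensive defence.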

This is not just about space (and the GC can be avoided in Julia… though avoiding it is not needed for most other (non-embedded) software):

In the case of Apollo 13 the near disaster was inadvertently caused by the laws of physics, but today there are adversaries actively trying to sabotage space systems.[xv] Now, as cyberspace continues to be introduced to outer space, the spacecraft must also be secure by design. A catastrophe should not be the catalyst for action.

The space ecosystem is not immune to memory safety vulnerabilities, however there are several constraints in space systems with regards to language use. First, the language must allow the code to be close to the kernel so that it can tightly interact with both software and hardware; second, the language must support determinism so the timing of the outputs are consistent; and third, the language must not have – or be able to override – the “garbage collector,” a function that automatically reclaims memory allocated by the computer program that is no longer in use.[xvi] These requirements help ensure the reliable and predictable outcomes necessary for space systems.
