Inside Fil-C: A Simplified Model for Memory-Safe C/C++
Decoding Fil-C: Memory Safety for Legacy Code
In the ongoing quest for secure systems programming, a new contender named Fil-C has emerged, pitching itself as a memory-safe implementation of C and C++. While the production system involves complex LLVM IR rewrites, a recently published simplified model demystifies its core concepts. This model reveals an automated source code transformation that fundamentally changes how pointers operate, introducing a garbage collector to a famously manual memory management environment.
The Core Transformation: An AllocationRecord for Every Pointer
At its heart, Fil-C's simplified model adds a shadow layer of metadata to every pointer. Within each function, every local pointer variable gains a companion AllocationRecord* variable. This record, a data structure containing bounds and metadata, becomes the guardian for memory accesses.
Simple pointer operations are rewritten to propagate this metadata. An assignment like p1 = p2; becomes p1 = p2, p1ar = p2ar;. Crucially, casting a non-pointer value to a pointer type yields a NULL AllocationRecord, so the resulting pointer cannot pass any later access check. This foundational rewrite sets the stage for comprehensive safety checks.
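As a rough illustration of this rewrite, the sketch below applies it by hand to two small operations. The AllocationRecord layout, the TrackedIntPtr helper, and the function names are assumptions made for this example rather than Fil-C's actual definitions, and the bounds check is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; field names follow the terms used in this article. */
typedef struct AllocationRecord {
    char  *visible_bytes;
    char  *invisible_bytes;
    size_t length;
} AllocationRecord;

/* Illustration-only pair, used to hand a pointer and its record back together. */
typedef struct {
    int *p;
    AllocationRecord *par;
} TrackedIntPtr;

/* Original:  p1 = p2;
   Rewritten: the companion record is assigned alongside the pointer. */
TrackedIntPtr copy_pointer(int *p2, AllocationRecord *p2ar) {
    int *p1;
    AllocationRecord *p1ar;
    p1 = p2, p1ar = p2ar;
    return (TrackedIntPtr){ p1, p1ar };
}

/* Original:  p = (int *)addr;
   Rewritten: an integer carries no record, so the companion is NULL. */
TrackedIntPtr pointer_from_integer(uintptr_t addr) {
    int *p = (int *)addr;
    AllocationRecord *par = NULL;
    return (TrackedIntPtr){ p, par };
}

/* Assumed consequence: a dereference requires a record first, so a pointer
   forged from an integer traps here (assert stands in for the trap). */
int read_through(TrackedIntPtr t) {
    assert(t.par != NULL);
    return *t.p;
}
```

Calling read_through on the result of pointer_from_integer would abort, while the result of copy_pointer reads normally.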
Memory Allocation Gets a Safety Overhaul
The model intercepts standard library calls. A call to malloc(size) is rewritten to call filc_malloc(size). The simplified filc_malloc performs a surprising action: it makes three distinct allocations instead of one.
- One for the requested user memory (visible_bytes).
- One for a corresponding invisible_bytes array, which will store the AllocationRecord* metadata for any pointers stored within the user memory.
- One for the AllocationRecord struct itself, which links to the other two allocations.
This tripartite structure is the engine of Fil-C's safety. Every heap allocation now carries its own provenance and bounds information.
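The three allocations can be sketched as follows. This is a minimal version under stated assumptions: the record layout, the length field, and the out-parameter used to return the record are choices made for this example, and out-of-memory handling is reduced to an assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical layout; field names follow the terms used in this article. */
typedef struct AllocationRecord {
    char  *visible_bytes;    /* memory the program reads and writes */
    char  *invisible_bytes;  /* shadow storage for AllocationRecord* values */
    size_t length;           /* size in bytes of each of the two arrays */
} AllocationRecord;

/* Sketch of the rewritten malloc: returns the user pointer and passes the
   record back through out_ar, which becomes the pointer's companion. */
void *filc_malloc(size_t length, AllocationRecord **out_ar) {
    AllocationRecord *ar = malloc(sizeof *ar);   /* allocation 1: the record */
    assert(ar != NULL);                          /* error handling omitted */
    ar->visible_bytes = malloc(length);          /* allocation 2: user memory */
    ar->invisible_bytes = calloc(length, 1);     /* allocation 3: zeroed shadow */
    ar->length = length;
    *out_ar = ar;
    return ar->visible_bytes;
}
```

Zeroing the shadow array means every slot starts out holding a NULL record, so bytes that were never written as a pointer cannot later be loaded as a usable one.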
Bounds Checking and Pointer-Load Safety
Dereferencing a pointer is no longer a simple memory fetch. The transform inserts bounds checks using the accompanying AllocationRecord*. For example, reading an integer becomes a check that the offset is within the visible_bytes region.
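A bounds-checked integer load might look like the sketch below. The helper name filc_in_bounds, the record layout, and the use of assert as the trap are assumptions for illustration, not the names the real system uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; field names follow the terms used in this article. */
typedef struct AllocationRecord {
    char  *visible_bytes;
    char  *invisible_bytes;
    size_t length;
} AllocationRecord;

/* True if the size bytes starting at p lie inside par's visible_bytes. */
bool filc_in_bounds(const void *p, const AllocationRecord *par, size_t size) {
    if (par == NULL) return false;               /* no record: no access */
    uintptr_t addr = (uintptr_t)p;
    uintptr_t base = (uintptr_t)par->visible_bytes;
    if (addr < base) return false;               /* before the allocation */
    size_t offset = (size_t)(addr - base);
    /* Written to avoid overflow in offset + size. */
    return offset <= par->length && size <= par->length - offset;
}

/* Original:  x = *p;
   Rewritten: check against the companion record, then load. */
int load_int(const int *p, const AllocationRecord *par) {
    assert(filc_in_bounds(p, par, sizeof(int))); /* assert stands in for a trap */
    return *p;
}
```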
The system gets more sophisticated when the value being loaded or stored is itself a pointer. Since the compiler cannot directly see into heap memory, it uses the invisible_bytes array. If a pointer resides at visible_bytes + i, its accompanying AllocationRecord* is stored at invisible_bytes + i. This ensures pointer metadata flows correctly through data structures.
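The pairing of visible_bytes + i with invisible_bytes + i can be sketched as a store and a load of a pointer-typed value. The function names, the alignment requirement, and the assumption that a data pointer and an AllocationRecord* have the same size are all choices made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; field names follow the terms used in this article. */
typedef struct AllocationRecord {
    char  *visible_bytes;
    char  *invisible_bytes;
    size_t length;
} AllocationRecord;

/* Offset i of slot inside its allocation; traps (via assert) unless a whole,
   aligned pointer fits there. */
static size_t checked_ptr_offset(const void *slot, const AllocationRecord *ar) {
    assert(ar != NULL);
    uintptr_t addr = (uintptr_t)slot;
    uintptr_t base = (uintptr_t)ar->visible_bytes;
    assert(addr >= base);
    size_t i = (size_t)(addr - base);
    assert(i <= ar->length && sizeof(void *) <= ar->length - i);
    assert(i % sizeof(void *) == 0);
    return i;
}

/* Original:  *slot = value;
   Rewritten: the address goes to visible_bytes + i, its record to
   invisible_bytes + i. */
void store_ptr(void **slot, AllocationRecord *slot_ar,
               void *value, AllocationRecord *value_ar) {
    size_t i = checked_ptr_offset(slot, slot_ar);
    memcpy(slot_ar->visible_bytes + i, &value, sizeof value);
    memcpy(slot_ar->invisible_bytes + i, &value_ar, sizeof value_ar);
}

/* Original:  value = *slot;
   Rewritten: the loaded pointer gets its record back from the shadow array. */
void *load_ptr(void **slot, AllocationRecord *slot_ar,
               AllocationRecord **out_ar) {
    size_t i = checked_ptr_offset(slot, slot_ar);
    void *value;
    memcpy(&value, slot_ar->visible_bytes + i, sizeof value);
    memcpy(out_ar, slot_ar->invisible_bytes + i, sizeof *out_ar);
    return value;
}
```

A slot that was never written through store_ptr yields a NULL record on load, which matches the behavior described for pointers cast from non-pointer values.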
The Unavoidable Garbage Collector
A pivotal revelation of the model is that Fil-C introduces a garbage collector to C/C++. The simplified filc_free function only frees the visible_bytes and invisible_bytes allocations, leaving the AllocationRecord itself in place. A GC, which could be a simple stop-the-world collector, later traces through these records; for each unreachable record it calls filc_free and then frees the record itself.
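The split between filc_free and the collector can be sketched as below. The marked field, the gc_mark function, and the choice to reset the freed record's fields are assumptions for illustration; the sweep phase, which would need a list of all records and a set of roots, is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; marked is an illustrative GC mark bit. */
typedef struct AllocationRecord {
    char  *visible_bytes;
    char  *invisible_bytes;
    size_t length;
    bool   marked;
} AllocationRecord;

/* Releases the two byte arrays but keeps the record, so stale pointers still
   have a record to fail their bounds checks against (length is now 0). */
void filc_free(void *p, AllocationRecord *par) {
    if (p == NULL) return;
    assert(par != NULL);
    assert(p == par->visible_bytes);   /* only the start of an allocation */
    free(par->visible_bytes);
    free(par->invisible_bytes);
    par->visible_bytes = NULL;
    par->invisible_bytes = NULL;
    par->length = 0;
}

/* Mark phase: every pointer-sized slot of invisible_bytes holds either NULL
   or a record, so the shadow array tells the collector where to trace. */
void gc_mark(AllocationRecord *ar) {
    if (ar == NULL || ar->marked) return;
    ar->marked = true;
    for (size_t i = 0; i + sizeof(AllocationRecord *) <= ar->length;
         i += sizeof(AllocationRecord *)) {
        AllocationRecord *child;
        memcpy(&child, ar->invisible_bytes + i, sizeof child);
        gc_mark(child);
    }
}
```

After marking from the roots, a collector along these lines would call filc_free on each unmarked record and then release the record itself.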
This has profound implications: forgetting to call free no longer causes a memory leak. However, explicit free calls remain valuable for timely resource release. The GC also enables safe address-of operations on local variables by promoting them to the heap when their address might escape.
Handling the Unknowable: Memmove and Provenance
Functions like memmove pose a unique challenge, as they operate on arbitrary byte ranges with no type information. Fil-C's solution is a heuristic: it only moves pointer metadata (invisible_bytes) for ranges that are correctly aligned for pointers. This leads to the curious result that moving eight aligned bytes behaves differently from eight separate single-byte moves.
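One way to express such a heuristic is sketched below. The exact policy here is an assumption for this example: metadata moves only for whole pointer-sized slots that sit at aligned offsets in both source and destination, and nothing is done to clear stale metadata in the destination.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; field names follow the terms used in this article. */
typedef struct AllocationRecord {
    char  *visible_bytes;
    char  *invisible_bytes;
    size_t length;
} AllocationRecord;

/* Offset of p in its allocation; traps (via assert) unless n bytes fit. */
static size_t checked_offset(const void *p, const AllocationRecord *par,
                             size_t n) {
    assert(par != NULL);
    uintptr_t addr = (uintptr_t)p;
    uintptr_t base = (uintptr_t)par->visible_bytes;
    assert(addr >= base);
    size_t offset = (size_t)(addr - base);
    assert(offset <= par->length && n <= par->length - offset);
    return offset;
}

void filc_memmove(void *dst, AllocationRecord *dst_ar,
                  const void *src, const AllocationRecord *src_ar, size_t n) {
    size_t d = checked_offset(dst, dst_ar, n);
    size_t s = checked_offset(src, src_ar, n);
    memmove(dst_ar->visible_bytes + d, src_ar->visible_bytes + s, n);

    const size_t slot = sizeof(AllocationRecord *);
    if (d % slot != s % slot) return;        /* slots do not line up */
    size_t skip = (slot - d % slot) % slot;  /* bytes before first whole slot */
    if (skip > n) return;
    size_t whole = (n - skip) / slot * slot; /* bytes covered by whole slots */
    memmove(dst_ar->invisible_bytes + d + skip,
            src_ar->invisible_bytes + s + skip, whole);
}
```

Under this policy a single aligned move of sizeof(void *) bytes carries the record for that slot along, while the same bytes moved one at a time never cover a whole slot and so carry no metadata.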
This strict tracking of pointer metadata makes Fil-C a concrete system with explicit pointer provenance. It invalidates compiler optimizations that assume p1 and p2 are interchangeable if they hold the same address, as they may carry different AllocationRecord* metadata.
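To make the provenance point concrete, the sketch below pairs each pointer with its record and shows how two pointers holding the same address can differ in what they may access. TrackedPtr, can_access, and read_byte are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; field names follow the terms used in this article. */
typedef struct AllocationRecord {
    char  *visible_bytes;
    char  *invisible_bytes;
    size_t length;
} AllocationRecord;

/* Illustration-only pairing of an address with its provenance. */
typedef struct {
    char *p;
    AllocationRecord *par;
} TrackedPtr;

/* True if size bytes at t.p lie inside the allocation t was derived from. */
bool can_access(TrackedPtr t, size_t size) {
    if (t.par == NULL) return false;
    uintptr_t addr = (uintptr_t)t.p;
    uintptr_t base = (uintptr_t)t.par->visible_bytes;
    if (addr < base) return false;
    size_t offset = (size_t)(addr - base);
    return offset <= t.par->length && size <= t.par->length - offset;
}

/* Checked one-byte read; assert stands in for the trap. */
char read_byte(TrackedPtr t) {
    assert(can_access(t, 1));
    return *t.p;
}
```

If allocation A ends exactly where allocation B begins, a pointer one past the end of A and a pointer to the first byte of B compare equal, yet only the second passes can_access. Replacing one with the other on the basis of address equality would change program behavior.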
Production Complexities and Real-World Use Cases
The simplified model elides significant complexities required for a production system. Concurrency demands a more sophisticated GC and careful handling of atomic operations. Function pointers require extra metadata and potentially a uniform calling ABI to prevent type confusion attacks.
Performance and memory overhead are obvious concerns, prompting potential optimizations like lazy allocation of invisible_bytes or colocating metadata. So when would one use such a system? The author suggests several scenarios:
- Securing large, existing C/C++ codebases where a GC and a performance cost are an acceptable price for memory safety, perhaps as a stepping stone to a rewrite.
- As a bug-finding tool, similar to running code under AddressSanitizer (ASan).
- Enabling safe compile-time evaluation in languages like Zig, where the compile-time and runtime language are the same.
- As a practical model for studying pointer provenance and compiler semantics.
A Strategic Trade-off for a Persistent Problem
Fil-C represents a radical, engineered approach to a decades-old problem. It doesn't change the C/C++ languages but transforms their execution model, trading manual control for automated safety through metadata and garbage collection. While the performance cost is substantial, the model provides a clear, mechanical blueprint for retrofitting memory safety onto unsafe code—a compelling option in a world where legacy C and C++ codebases remain critical and vulnerable.