File Copy using Memory Mapping

Van C. Ngo · May 6, 2026

Concepts

  • mmap offers a powerful way to access files: the kernel maps the file’s contents directly into the process’s virtual address space
  • Instead of moving data through a user-space buffer with read() and write(), you copy between the two mappings with memcpy()

Implementation Example

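  • To copy a file in 1MB chunks using mmap, you use the offset parameter to select which part of the file each mapping covers. This is useful when the file is too large to map in one piece (for example, when virtual address space is limited) or when you want more granular control over the transfer.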

  • The offset passed to mmap must be a multiple of the system’s page size, which is typically 4KB and can be queried with sysconf(_SC_PAGESIZE). Since 1MB is a multiple of 4KB, every chunk offset in this example is valid.
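
The alignment rule can be checked at run time rather than assumed. The sketch below queries the page size with sysconf(_SC_PAGESIZE); the helper names page_size, is_page_aligned, and align_down are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <unistd.h>

// Returns the system page size in bytes, queried at run time.
inline size_t page_size() {
    return static_cast<size_t>(sysconf(_SC_PAGESIZE));
}

// True if `offset` is a multiple of the page size and can therefore
// be passed to mmap as the file offset.
inline bool is_page_aligned(size_t offset) {
    return offset % page_size() == 0;
}

// Rounds `offset` down to the nearest page boundary.
inline size_t align_down(size_t offset) {
    return offset - (offset % page_size());
}
```

If a chunk ever needs to start at an arbitrary byte, one option is to map from align_down(offset) and skip the first offset - align_down(offset) bytes of the mapping.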

#include <iostream>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstring>
#include <algorithm>
#include <cstdio>   // perror


bool mmap_copy(const char *src_path, const char *dest_path) {
    bool ret = true;

    const size_t CHUNK_SIZE = 1024 * 1024; // 1MB

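    int src_fd = open(src_path, O_RDONLY);
    if (src_fd == -1) {
        perror("open source failed");
        return false;
    }

    // O_RDWR (not O_WRONLY) is required for a writable MAP_SHARED mapping
    int dest_fd = open(dest_path, O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (dest_fd == -1) {
        perror("open destination failed");
        close(src_fd);
        return false;
    }

    struct stat st;
    if (fstat(src_fd, &st) == -1) {
        perror("fstat failed");
        close(src_fd);
        close(dest_fd);
        return false;
    }
    size_t total_size = static_cast<size_t>(st.st_size);

    // Pre-allocate the destination file size; writing to a mapped page
    // that lies past the end of the file would raise SIGBUS
    if (ftruncate(dest_fd, st.st_size) == -1) {
        perror("ftruncate failed");
        close(src_fd);
        close(dest_fd);
        return false;
    }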

    size_t offset = 0;
    while (offset < total_size) {
        // Calculate remaining bytes for the last chunk
        size_t current_chunk = std::min(CHUNK_SIZE, total_size - offset);

        // Map up to 1MB of the source (read-only)
        void* src_ptr = mmap(NULL, current_chunk, PROT_READ, MAP_PRIVATE, src_fd, offset);
        if (src_ptr == MAP_FAILED) {
            perror("mmap source failed");
            ret = false;
            break;
        }

        // Map the same range of the destination (shared, so changes reach the file)
        void* dest_ptr = mmap(NULL, current_chunk, PROT_READ | PROT_WRITE, MAP_SHARED, dest_fd, offset);
        if (dest_ptr == MAP_FAILED) {
            perror("mmap destination failed");
            munmap(src_ptr, current_chunk); // do not leak the source mapping
            ret = false;
            break;
        }

        // Perform the copy in memory
        std::memcpy(dest_ptr, src_ptr, current_chunk);

        // Optional: force this chunk out to flash before continuing
        if (msync(dest_ptr, current_chunk, MS_SYNC) == -1) {
            perror("msync failed");
            ret = false;
        }

        // Unmap immediately to free up virtual address space
        munmap(src_ptr, current_chunk);
        munmap(dest_ptr, current_chunk);

        if (!ret) {
            break;
        }

        offset += current_chunk;
        std::cout << "Copied: " << offset << "/" << total_size << " bytes\r" << std::flush;
    }

    close(src_fd);
    close(dest_fd);
    if (ret) {
        std::cout << "\nTransfer complete." << std::endl;
    }
    
    return ret;
}

Benefits

  • Low Memory Footprint: At most 2MB of file data is mapped into your application’s virtual address space at any given time (1MB for the source, 1MB for the destination), regardless of whether the file is 10GB or 100GB.

  • Flash Wear/Control: Calling msync with MS_SYNC inside the loop blocks until each chunk has been written back, so data is committed to the flash hardware progressively rather than all at once at the end.
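
The choice of msync flag controls how strictly each chunk is committed. The sketch below is a hypothetical helper (write_via_mmap and its wait_for_disk parameter are assumptions made for illustration) that writes a small buffer through a shared mapping and flushes it either synchronously or asynchronously.

```cpp
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Writes `len` bytes to `path` through a shared mapping.
// wait_for_disk == true  -> MS_SYNC:  msync returns after write-back completes
// wait_for_disk == false -> MS_ASYNC: write-back is requested, not awaited
bool write_via_mmap(const char *path, const char *data, size_t len, bool wait_for_disk) {
    if (len == 0) {
        return false;  // mmap rejects zero-length mappings
    }
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (fd == -1) {
        return false;
    }
    // The file must be at least `len` bytes long before it is written through the mapping
    if (ftruncate(fd, static_cast<off_t>(len)) == -1) {
        close(fd);
        return false;
    }

    void *ptr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED) {
        close(fd);
        return false;
    }

    std::memcpy(ptr, data, len);
    bool ok = msync(ptr, len, wait_for_disk ? MS_SYNC : MS_ASYNC) == 0;

    munmap(ptr, len);
    close(fd);
    return ok;
}
```

MS_SYNC trades throughput for a stronger guarantee after each chunk; dropping the msync call entirely leaves the timing of write-back to the kernel.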