Concepts
- Using mmap offers a powerful way to access files: the kernel maps the file directly into the process’s virtual memory space.
- Instead of copying data by calling read() and write(), you just call memcpy().
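As a minimal sketch of the idea above, the following reads a whole file through a mapping instead of a read() loop. The helper name read_file_mmap is hypothetical and used only for illustration. It assumes a POSIX system and a file small enough to map in one piece.

```cpp
#include <cstring>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: returns the file's contents, or an empty string on
// failure (or if the file is empty, since mmap rejects zero-length mappings).
std::string read_file_mmap(const char *path) {
    std::string out;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return out;

    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        size_t size = static_cast<size_t>(st.st_size);
        // The kernel maps the file; no read() call is needed
        void *ptr = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (ptr != MAP_FAILED) {
            out.resize(size);
            std::memcpy(&out[0], ptr, size); // plain memory copy out of the mapping
            munmap(ptr, size);
        }
    }
    close(fd);
    return out;
}
```

The file's bytes are reachable through an ordinary pointer, so memcpy() is enough to get them out.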
Implementation Example
- To copy a file in 1MB chunks using mmap, you use the offset parameter. This is useful when the file is too large to map comfortably in one piece (for example, on a 32-bit address space) or when you want more granular control over the transfer.
- The offset passed to mmap must be a multiple of the system’s page size, e.g., 4KB. Since 1MB is a multiple of 4KB, it works perfectly.
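The page size is not always 4KB, so it can be worth querying it at runtime rather than assuming it. The sketch below uses sysconf(_SC_PAGESIZE) for that. The helper names is_page_aligned and align_down_to_page are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>
#include <unistd.h>

// Hypothetical helper: true if `value` is a multiple of the runtime page size.
bool is_page_aligned(size_t value) {
    long page = sysconf(_SC_PAGESIZE);
    return page > 0 && value % static_cast<size_t>(page) == 0;
}

// Hypothetical helper: rounds `offset` down to the nearest page boundary,
// which is the kind of value mmap accepts as its offset argument.
size_t align_down_to_page(size_t offset) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return offset; // page size unavailable: leave unchanged
    return offset - (offset % static_cast<size_t>(page));
}
```

A chunk size could be checked once with is_page_aligned(1024 * 1024) before starting the copy loop. If every chunk except the last has that size, every offset passed to mmap stays page-aligned.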
#include <algorithm>
#include <cstdio>
#include <cstring>
#include <iostream>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

bool mmap_copy(const char *src_path, const char *dest_path) {
    const size_t CHUNK_SIZE = 1024 * 1024; // 1MB, a multiple of the page size

    int src_fd = open(src_path, O_RDONLY);
    if (src_fd < 0) {
        perror("open source failed");
        return false;
    }
    // O_RDWR is required: a MAP_SHARED mapping with PROT_WRITE needs a read-write descriptor
    int dest_fd = open(dest_path, O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (dest_fd < 0) {
        perror("open destination failed");
        close(src_fd);
        return false;
    }

    bool ret = true;
    struct stat st;
    if (fstat(src_fd, &st) != 0) {
        perror("fstat failed");
        ret = false;
    }
    size_t total_size = ret ? static_cast<size_t>(st.st_size) : 0;

    // Pre-allocate the destination file size so the mapped range is backed by the file
    if (ret && ftruncate(dest_fd, static_cast<off_t>(total_size)) != 0) {
        perror("ftruncate failed");
        ret = false;
    }

    size_t offset = 0;
    while (ret && offset < total_size) {
        // The last chunk may be shorter than CHUNK_SIZE
        size_t current_chunk = std::min(CHUNK_SIZE, total_size - offset);
        off_t file_offset = static_cast<off_t>(offset);

        // Map up to 1MB of the source (read-only)
        void* src_ptr = mmap(NULL, current_chunk, PROT_READ, MAP_PRIVATE, src_fd, file_offset);
        // Map the same range of the destination (shared, so changes reach the file)
        void* dest_ptr = mmap(NULL, current_chunk, PROT_WRITE, MAP_SHARED, dest_fd, file_offset);

        if (src_ptr == MAP_FAILED || dest_ptr == MAP_FAILED) {
            perror("mmap failed");
            ret = false;
        } else {
            // Perform the copy in memory
            std::memcpy(dest_ptr, src_ptr, current_chunk);
            // Optional: block until this chunk has been written back to storage
            if (msync(dest_ptr, current_chunk, MS_SYNC) != 0) {
                perror("msync failed");
                ret = false;
            }
        }

        // Unmap whichever mappings succeeded to free up virtual address space
        if (src_ptr != MAP_FAILED) munmap(src_ptr, current_chunk);
        if (dest_ptr != MAP_FAILED) munmap(dest_ptr, current_chunk);
        if (!ret) break;

        offset += current_chunk;
        std::cout << "Copied: " << offset << "/" << total_size << " bytes\r" << std::flush;
    }

    close(src_fd);
    close(dest_fd);
    if (ret) std::cout << "\nTransfer complete." << std::endl;
    return ret;
}
Benefits
- Low Memory Footprint: Your application only maps about 2MB of virtual memory (1MB for source, 1MB for dest) at any given time, regardless of whether the file is 10GB or 100GB.
- Flash Wear/Control: By calling msync with MS_SYNC inside the loop, you ensure that chunks are written back to the flash hardware progressively rather than all at once at the end. This gives you control over when writes happen; it does not by itself reduce the amount of data written.
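The msync call accepts either MS_SYNC, which blocks until the data has been written back, or MS_ASYNC, which only schedules the write and returns. The sketch below shows the two side by side on a single small mapping. The helper name write_mapped and its wait_for_disk parameter are hypothetical, used only for illustration.

```cpp
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: writes `len` bytes to `path` through a shared mapping.
// `wait_for_disk` selects MS_SYNC (block until written back) or MS_ASYNC (schedule only).
bool write_mapped(const char *path, const char *data, size_t len, bool wait_for_disk) {
    if (len == 0) return false; // mmap rejects zero-length mappings
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return false;

    // Size the file first so the mapped range is backed by it
    bool ok = ftruncate(fd, static_cast<off_t>(len)) == 0;
    if (ok) {
        void *ptr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (ptr == MAP_FAILED) {
            ok = false;
        } else {
            std::memcpy(ptr, data, len);
            ok = msync(ptr, len, wait_for_disk ? MS_SYNC : MS_ASYNC) == 0;
            munmap(ptr, len);
        }
    }
    close(fd);
    return ok;
}
```

In a chunked copy, MS_SYNC on every chunk trades throughput for predictable, incremental write-back. MS_ASYNC, or skipping msync altogether, leaves the timing to the kernel.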
