From 3d496ff477fe956b1c988793b647d8fa563b4264 Mon Sep 17 00:00:00 2001
From: Jayden
Date: Fri, 27 Mar 2026 12:10:30 +0800
Subject: [PATCH] first commit

---
 specification_rfc_style_V6.txt | 768 +++++++++++++++++++++++++++++++++
 1 file changed, 768 insertions(+)
 create mode 100644 specification_rfc_style_V6.txt

diff --git a/specification_rfc_style_V6.txt b/specification_rfc_style_V6.txt
new file mode 100644
index 0000000..9df5f82
--- /dev/null
+++ b/specification_rfc_style_V6.txt
@@ -0,0 +1,768 @@
+Internet-Draft                                      MFPK-ENC-V6 Format
+Intended status: Informational                            Expires: TBD
+
+       Encrypted Multi-File Container Format (Streaming Binary V6)
+
+Abstract
+
+   This document specifies the MFPK-ENC-V6 streaming encrypted multi-
+   file container format. V6 is a complete redesign based on Rogaway
+   et al.'s STREAM construction for nonce-based online authenticated
+   encryption (nOAE), with per-entry key isolation via the SE3
+   construction (Hoang & Shen, CCS 2020). The format provides
+   segment-by-segment authenticated encryption, resistance to chunk
+   reordering and truncation, random-access decryption, and resistance
+   to randomness failure. The password-to-key step uses Argon2id
+   (RFC 9106) for memory-hard brute-force resistance. Per-entry subkey
+   and nonce mask derivation use BLAKE3 keyed hashing as a fast PRF
+   over the already-strong master key. V6 is NOT compatible with any
+   previous version.
+
+Status of This Memo
+
+   This document is not an IETF standard. It is published for
+   informational purposes.
+
+Copyright Notice
+
+   Copyright (c) 2025. All rights reserved.
+
+Table of Contents
+
+   1. Introduction
+   2. Notational Conventions
+   3. Cryptographic Primitives
+   4. Container Overview
+   5. Global Header
+   6. Nonce Structure and Generation
+   7. Key Derivation
+   8. Entry Record Structure
+   9. Segment-Based Encryption (STREAM / SE3)
+   10. File Content Segmentation
+   11. Paths and Metadata
+   12. Root Directory Entry
+   13. Indexing and Random-Access Decryption
+   14. Password Verification
+   15. Security Considerations
+   16. IANA Considerations
+   17. Versioning
+   18. Constants Summary
+   19. Migration from V5
+   20. Implementation Guidance
+   References
+
+1. Introduction
+
+   MFPK-ENC-V6 is a streaming authenticated encryption container
+   format for storing files and directories. Compared to V5, V6:
+
+   - Implements Rogaway et al.'s STREAM construction for nOAE security,
+     preventing chunk reordering, deletion, and truncation attacks.
+   - Uses the SE3 construction (Hoang & Shen) for per-entry subkey and
+     nonce mask derivation, providing misuse-resistance against CSPRNG
+     failure.
+   - Provides per-entry key isolation: each entry is encrypted under an
+     independent subkey derived from the master key and a fresh random R.
+   - Authenticates each segment independently with 128-bit Poly1305 tags.
+   - Supports random-access decryption of any individual segment.
+   - Uses ChaCha20-Poly1305 (RFC 8439) as the AEAD primitive.
+   - Uses Argon2id (RFC 9106) for password-to-master-key derivation,
+     providing strong resistance against GPU/ASIC brute-force attacks.
+   - Uses BLAKE3 (keyed-hash mode) as a fast PRF for per-entry subkey
+     and nonce mask derivation from the already-strong master key.
+   - Eliminates BASE_PATH (redundant with FULL_PATH) present in V5.
+   - Eliminates need for full-file buffering during decryption.
+
+   V6 is a breaking change and is NOT compatible with V5 or earlier.
+
+   The two-layer KDF design follows the security layering validated
+   by Hoang & Shen: Argon2id is the memory-hard password-hardening
+   step executed once at container open; BLAKE3 is the fast PRF
+   executed per entry over the secret master key, not over the password.
+   An attacker guessing passwords must run Argon2id for every guess.
+
+2. Notational Conventions
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
+   "OPTIONAL" in this document are to be interpreted as described in
+   RFC 2119.
+
+   All multi-byte integers are little-endian unless otherwise specified.
+
+   Notation:
+   - || denotes concatenation
+   - ⊕ denotes XOR
+   - [n] denotes an n-byte sequence
+   - AEAD.Encrypt(K, N, A, P) denotes authenticated encryption
+   - AEAD.Decrypt(K, N, A, C) denotes authenticated decryption
+   - ⌈x⌉ denotes the ceiling function
+
+3. Cryptographic Primitives
+
+   3.1. AEAD Cipher: ChaCha20-Poly1305 (RFC 8439)
+      * Nonce length: 12 bytes (96 bits)
+      * Key length: 32 bytes (256 bits)
+      * Tag length: 16 bytes (128 bits)
+      * AAD: per-segment metadata (see Section 9.2)
+
+   3.2. Password KDF: Argon2id (RFC 9106)
+      * Salt length: 32 bytes (256 bits)
+      * Output length: 32 bytes (256 bits)
+      * Iterations (t): stored in header as uint8
+      * Memory (m): stored in header as uint32 (KiB)
+      * Parallelism (p): stored in header as uint8
+      * Password encoding: UTF-8
+
+      KDF parameters are written into the global header at container
+      creation and read back verbatim on open. A reader MUST use
+      the parameters from the header, not any compiled-in defaults.
+      Recommended defaults for new containers: t=3, m=65536, p=4.
+
+   3.3. Per-Entry KDF: BLAKE3 (keyed-hash mode)
+      * Key: 32-byte master_key (output of Argon2id)
+      * Input: "MFPK-V6-SUBKEY-MASK" || R (domain-separated)
+      * Output: 39 bytes (32-byte subkey || 7-byte mask_X)
+
+      BLAKE3 is used here as a fast PRF over a secret key, not over a
+      password. Its speed is appropriate because brute-force resistance
+      is already provided by Argon2id at the master key layer.
+
+   3.4. Randomness
+      All salts, master nonces R, and nonce prefixes P MUST be
+      generated using a cryptographically secure random number
+      generator (CSPRNG).
+
+4. Container Overview
+
+   A V6 container consists of:
+
+   1. Global Header (82 bytes, fixed)
+      - Magic and version identifier
+      - 32-byte Argon2id salt
+      - Argon2id parameters (t, m, p)
+      - Password verification structure
+
+   2. Entry Records (variable, sequential)
+      - Each entry begins with SYNC_WORD for resynchronization
+      - Entry header with per-entry nonce material and length fields
+      - Segmented encrypted content (files only)
+
+   The format supports:
+   - Streaming encryption with O(1) memory (one segment buffer)
+   - Random-access decryption of any segment
+   - Segment-by-segment authentication
+   - Append-only writes
+   - Removal via rewrite-to-new-file
+
+5. Global Header (fixed size: 82 bytes)
+
+   Offset  Size  Description
+   ------  ----  ---------------------------------------------------------
+   0       4     MAGIC_VERSION = 0x89 'M' 'F' 0x06 (bytes: 89 4D 46 06)
+   4       32    MASTER_SALT (32 bytes, CSPRNG, input to Argon2id)
+   36      1     ARGON2_T (uint8, Argon2id time cost / iterations)
+   37      4     ARGON2_M (uint32 LE, Argon2id memory cost in KiB)
+   41      1     ARGON2_P (uint8, Argon2id parallelism)
+   42      12    PWV_NONCE (ChaCha20-Poly1305 12-byte nonce)
+   54      28    PWV_CIPHERTEXT (12-byte plaintext + 16-byte tag)
+
+   Password Verification:
+   - PWV_MARKER = "MFPKV6-MARK!" (12 bytes ASCII)
+   - Encrypted with the derived master_key and empty AAD.
+   - Successful decryption verifies password correctness.
+   - The marker is NEVER stored in plaintext.
+
+   Total header size: 4 + 32 + 1 + 4 + 1 + 12 + 28 = 82 bytes.
+
+   Note on MASTER_SALT size: 32 bytes (256 bits) exceeds the 16-byte
+   salt length that RFC 9106 recommends for password hashing.
+
+   Note on KDF parameters: writers MUST record the exact parameters
+   used at creation time. Readers MUST use the stored values and MUST
+   NOT substitute compiled-in defaults. This allows parameters to be
+   upgraded in future containers without a format version bump.
+
+6. Nonce Structure and Generation (SE3 Construction)
+
+   V6 uses the SE3 two-level nonce structure from Hoang & Shen to
+   provide misuse-resistance against CSPRNG failure.
+
+   6.1. Master Nonce (R): 16 bytes
+      Generated once per entry using CSPRNG. Input to per-entry KDF.
+      R uniquely identifies each entry's cryptographic context.
+
+   6.2. Nonce Prefix (P): 7 bytes
+      Generated once per entry using CSPRNG.
+      Combined with derived mask X to produce effective prefix P*.
+
+   6.3. Effective Nonce Prefix (P*): 7 bytes
+      P* = P ⊕ X
+      where X is the 7-byte mask from BLAKE3 keyed-hash derivation.
+      Even if P is constant (the P generator fails), P* still varies
+      per entry because X is derived from that entry's unique R value.
+
+   6.4. Segment Nonce: 12 bytes
+      nonce_i = P*[0:4] || LE64(segment_index)
+      - Bytes 0–3: first 4 bytes of P* (entry-unique prefix)
+      - Bytes 4–11: 8-byte little-endian uint64 segment_index
+
+      The last_flag is NOT encoded in the nonce; it is authenticated
+      via the AAD (see Section 9.2). This is consistent with SE3
+      as analyzed in Theorem 4 of Hoang & Shen: the flag need only
+      be authenticated, not encoded in the nonce itself.
+
+   6.5. Nonce Uniqueness Guarantee
+      - Each entry has a unique (R, P) pair from CSPRNG.
+      - Each segment within an entry has a unique segment_index.
+      - P* stays entry-specific under complete P failure (given unique R).
+      - Per-entry subkeys ensure cross-entry isolation.
+
+7. Key Derivation
+
+   7.1. Master Key Derivation (Argon2id: slow, once per open)
+
+      Input:
+      - password: UTF-8 encoded user password
+      - MASTER_SALT: 32-byte salt from global header
+      - ARGON2_T: time cost from global header
+      - ARGON2_M: memory cost (KiB) from global header
+      - ARGON2_P: parallelism from global header
+
+      Process:
+         master_key = Argon2id(
+            password = password_utf8,
+            salt     = MASTER_SALT,
+            t        = ARGON2_T,
+            m        = ARGON2_M,
+            p        = ARGON2_P,
+            length   = 32
+         )
+
+      Output: 32-byte master_key
+
+      This step is intentionally slow to resist offline dictionary
+      attacks. An attacker must execute this step for every guess,
+      at a cost determined by the stored parameters.
+
+   7.2. Per-Entry Subkey and Mask Derivation (BLAKE3: fast, per entry)
+
+      Input:
+      - master_key: 32-byte key from step 7.1
+      - R: 16-byte master nonce from entry header
+
+      Process:
+         derived = BLAKE3.keyed_hash(
+            key  = master_key,
+            data = "MFPK-V6-SUBKEY-MASK" || R
+         )
+         subkey = derived[0:32]    // 32 bytes
+         mask_X = derived[32:39]   // 7 bytes
+
+      Output:
+      - subkey: 32-byte ChaCha20-Poly1305 key for this entry
+      - mask_X: 7-byte nonce mask for this entry
+
+      BLAKE3 operates as a PRF keyed by master_key. Speed is
+      acceptable here because master_key has full 256-bit entropy
+      (an attacker cannot brute-force BLAKE3 without master_key,
+      and master_key requires Argon2id to obtain).
+
+   7.3. Effective Nonce Prefix
+
+      P* = P ⊕ mask_X
+
+      where P is the 7-byte CSPRNG-generated nonce prefix from the
+      entry header, and mask_X is from step 7.2.
+
+   7.4. Security Layer Summary
+
+      [password] --Argon2id--> [master_key]           (brute-force wall)
+      [master_key, R] --BLAKE3--> [subkey, mask_X]    (fast PRF)
+      [subkey, P*, counter, last_flag] --ChaCha20-Poly1305--> [ciphertext]
+
+8. Entry Record Structure
+
+   Each entry begins with SYNC_WORD for resilience and recovery.
+
+   Constants:
+   - SYNC_WORD = 0xA6 'S' 'T' 'R' (bytes: A6 53 54 52)
+   - ENTRY_TYPE_FILE = 0x00
+   - ENTRY_TYPE_DIR = 0x01
+   - SEGMENT_SIZE = 65536 bytes (64 KiB)
+
+   8.1. Entry Header Layout (44 bytes fixed, then variable fields)
+
+      Offset    Size  Field
+      ------    ----  -------------------------------------------------------
+      0         4     SYNC_WORD (A6 53 54 52)
+      4         1     ENTRY_TYPE (0x00=file, 0x01=directory)
+      5         16    MASTER_NONCE_R (random, input to per-entry KDF)
+      21        7     NONCE_PREFIX_P (random, XORed with mask_X to form P*)
+      28        8     FILE_SIZE (uint64 LE, logical plaintext bytes)
+                      Directories MUST have FILE_SIZE = 0.
+      36        4     NUM_SEGMENTS (uint32 LE, number of content segments)
+                      Directories MUST have NUM_SEGMENTS = 0.
+                      NUM_SEGMENTS = ⌈FILE_SIZE / SEGMENT_SIZE⌉ for files.
+                      Empty files (FILE_SIZE = 0) have NUM_SEGMENTS = 0.
+      40        2     FULLPATH_LEN (uint16 LE, length of ENCRYPTED_FULLPATH)
+      42        2     TIMESTAMP_LEN (uint16 LE, length of ENCRYPTED_TIMESTAMP)
+                      Directories MUST have TIMESTAMP_LEN = 0.
+      44        N1    ENCRYPTED_FULLPATH
+      44+N1     N2    ENCRYPTED_TIMESTAMP (absent for directories, N2=0)
+      44+N1+N2  ...   SEGMENTS (files only)
+
+      Notes:
+      - FILE_SIZE is stored in plaintext. This is a known metadata leak
+        (see Section 15.5). It is required for streaming skip support.
+      - NUM_SEGMENTS is derivable from FILE_SIZE and SEGMENT_SIZE.
+        It is stored explicitly for reader convenience and for skip
+        calculations that do not require recomputing the ceiling.
+
+   8.2. Metadata Encryption (FULLPATH and TIMESTAMP)
+
+      FULLPATH and TIMESTAMP are encrypted as special segments with
+      segment_index = 0, distinguished by different last_flag values:
+
+      - FULLPATH: segment_index = 0, last_flag = 0x02
+      - TIMESTAMP: segment_index = 0, last_flag = 0x03
+
+      Both use FILE_SIZE = 0 in their AAD (see Section 9.2).
+
+      Format: nonce (12 bytes) || ciphertext || tag (16 bytes)
+      The nonce is deterministic: P*[0:4] || LE64(0)
+      Both FULLPATH and TIMESTAMP share segment_index=0 but differ
+      in last_flag, which is authenticated in AAD. Combined with
+      per-entry subkeys, cross-entry transplantation is prevented.
+
+9. Segment-Based Encryption (STREAM / SE3)
+
+   V6 implements the STREAM construction where each segment is
+   independently encrypted and authenticated, preventing reordering,
+   deletion, and truncation.
+
+   9.1. Segment Nonce Construction
+
+      For segment i (1-indexed for content, 0 for metadata):
+
+      nonce_i = P*[0:4] || LE64(i)
+
+      The nonce encodes the segment counter directly, ensuring that
+      any reordering is detected by nonce verification.
+
+   9.2. Per-Segment AAD
+
+      AAD_i = entry_type           (1 byte)
+           || LE64(segment_index)  (8 bytes)
+           || last_flag            (1 byte)
+           || LE64(file_size)      (8 bytes)
+
+      Total AAD: 18 bytes per segment.
+
+      Fields:
+      - entry_type: 0x00 for file, 0x01 for directory metadata
+      - segment_index: uint64 LE (matches counter in nonce)
+      - last_flag: 0x00 non-final, 0x01 final content segment,
+        0x02 FULLPATH metadata, 0x03 TIMESTAMP metadata
+      - file_size: logical plaintext file size (0 for metadata)
+
+      The AAD binds each segment to:
+      - Its position (segment_index prevents reordering)
+      - Whether it is the last segment (last_flag prevents truncation)
+      - The total file size (prevents cross-file segment transplant)
+      - The entry type (prevents type confusion)
+
+      Cross-entry transplantation is further prevented by per-entry
+      subkeys derived from unique R values.
+
+   9.3. Encryption Process
+
+      For segment i with plaintext P_i:
+
+      nonce_i = P*[0:4] || LE64(i)
+      AAD_i   = entry_type || LE64(i) || last_flag || LE64(file_size)
+      C_i     = ChaCha20Poly1305.Encrypt(subkey, nonce_i, AAD_i, P_i)
+
+      Stored on disk as: nonce_i (12 bytes) || C_i (plaintext_len + 16)
+
+   9.4. Decryption and Verification
+
+      For each segment during decryption:
+      1. Read nonce (12 bytes) from disk.
+      2. Reconstruct expected nonce from P* and segment_index.
+      3. Verify stored nonce == expected nonce; abort on mismatch.
+      4. Reconstruct AAD from known fields.
+      5. Decrypt; abort immediately on authentication failure.
+      6. Release plaintext only after tag verification.
+
+   9.5. Segment Layout on Disk
+
+      SEGMENT_RECORD = NONCE (12) || CIPHERTEXT_WITH_TAG (≤ 65536 + 16)
+      Per-segment overhead: 28 bytes (12 nonce + 16 tag)
+
+10. File Content Segmentation
+
+   10.1. Segment Size
+
+      SEGMENT_SIZE = 65536 bytes (64 KiB)
+
+   10.2. Segmentation
+
+      NUM_SEGMENTS = ⌈FILE_SIZE / SEGMENT_SIZE⌉ (0 for empty files)
+
+      For i = 1 to NUM_SEGMENTS:
+         if i < NUM_SEGMENTS: segment_plaintext_size = SEGMENT_SIZE
+         else: segment_plaintext_size = FILE_SIZE mod SEGMENT_SIZE
+               (or SEGMENT_SIZE if FILE_SIZE is a multiple)
+
+      Segment i is FINAL (last_flag = 0x01) iff i == NUM_SEGMENTS.
+      All other content segments use last_flag = 0x00.
+
+   10.3. Encrypted Segment Size
+
+      encrypted_size(segment) = 12 + plaintext_size + 16
+
+   10.4. Total Encrypted Content Size
+
+      For file of size S with N = NUM_SEGMENTS:
+      total_encrypted = N * 28 + S
+
+   10.5. Skip Calculation (no decryption required)
+
+      To skip past entry content without decryption:
+      content_end = content_start + NUM_SEGMENTS * 28 + FILE_SIZE
+
+      Where content_start is the byte offset immediately after
+      ENCRYPTED_TIMESTAMP (or ENCRYPTED_FULLPATH for directories),
+      i.e. entry_offset + 44 + FULLPATH_LEN + TIMESTAMP_LEN.
+
+11. Paths and Metadata
+
+   11.1. FULLPATH Format
+
+      - POSIX-style absolute path, e.g., "/dir/subdir/file.txt"
+      - UTF-8 encoded
+      - Maximum 4096 bytes (enforced: FULLPATH_LEN is uint16, and
+        implementations MUST reject paths exceeding 4096 bytes)
+      - BASE_PATH is NOT stored; it is always derivable as
+        dirname(FULLPATH). Implementations SHOULD verify consistency.
+
+   11.2. Timestamp Format (files only)
+
+      Plaintext: 8-byte little-endian IEEE-754 float64, Unix mtime.
+
+      Encrypted using:
+      - subkey derived from (master_key, R)
+      - segment_index = 0, last_flag = 0x03
+      - AAD = 0x00 || LE64(0) || 0x03 || LE64(0)
+
+   11.3. Directory Entries
+
+      - ENTRY_TYPE = 0x01
+      - FILE_SIZE = 0
+      - NUM_SEGMENTS = 0
+      - TIMESTAMP_LEN = 0
+      - No segment data after ENCRYPTED_FULLPATH
+
+12. Root Directory Entry
+
+   A well-formed container MUST include an explicit root directory
+   entry as the first entry after the global header:
+
+   - ENTRY_TYPE = 0x01 (directory)
+   - FULLPATH = "/"
+   - FILE_SIZE = 0
+   - NUM_SEGMENTS = 0
+   - TIMESTAMP_LEN = 0
+
+13. Indexing and Random-Access Decryption
+
+   13.1. Sequential Indexing
+
+      To build a complete index:
+      1. Read global header (82 bytes), verify MAGIC_VERSION.
+      2. Read ARGON2_T, ARGON2_M, ARGON2_P from header.
+      3. Derive master_key via Argon2id with stored parameters (Section 7.1).
+      4. Verify password via PWV_NONCE and PWV_CIPHERTEXT (Section 14).
+      5. Position at offset 82.
+      6. For each entry:
+         a. Read and verify SYNC_WORD.
+         b. Read the remaining 40 bytes of the fixed entry header.
+         c. Derive subkey and mask_X from master_key and R (Section 7.2).
+         d. Compute P* = P ⊕ mask_X.
+         e. Decrypt and verify ENCRYPTED_FULLPATH.
+         f. Decrypt and verify ENCRYPTED_TIMESTAMP (files only).
+         g. Record entry offset and metadata.
+         h. Advance position:
+            offset += 44 + FULLPATH_LEN + TIMESTAMP_LEN
+            if FILE: offset += NUM_SEGMENTS * 28 + FILE_SIZE
+
+   13.2. Random-Access Decryption
+
+      To decrypt segment i of a known file entry:
+      1. Retrieve entry metadata from index.
+      2. Derive subkey and mask_X from master_key and R.
+      3. Compute P* = P ⊕ mask_X.
+      4. Compute segment file offset:
+         content_start = entry_offset + 44 + FULLPATH_LEN + TIMESTAMP_LEN
+         offset = content_start + (i - 1) * (28 + SEGMENT_SIZE)
+         Note: the final segment may be smaller than SEGMENT_SIZE.
+      5. Read nonce (12), ciphertext + tag from offset.
+      6. Verify stored nonce == P*[0:4] || LE64(i).
+      7. Construct AAD_i.
+      8. Decrypt: plaintext = ChaCha20Poly1305.Decrypt(subkey, nonce, AAD_i, C_i)
+
+   13.3. Resynchronization
+
+      If entry parsing fails (bad ENTRY_TYPE, decryption failure,
+      length overflow):
+      - Scan forward for next SYNC_WORD occurrence.
+      - Resume parsing from that position.
+      - Mark skipped region as corrupted in index.
+
+14. Password Verification
+
+   14.1. Verification Process
+
+      1. Read MASTER_SALT from header (offset 4, 32 bytes).
+      2. Read ARGON2_T (offset 36, 1 byte), ARGON2_M (offset 37,
+         4 bytes LE), ARGON2_P (offset 41, 1 byte).
+      3. Derive master_key using Argon2id with stored parameters
+         (Section 7.1).
+      4. Read PWV_NONCE (offset 42, 12 bytes).
+      5. Read PWV_CIPHERTEXT (offset 54, 28 bytes).
+      6. Attempt:
+         marker = ChaCha20Poly1305.Decrypt(master_key, PWV_NONCE, b"", PWV_CIPHERTEXT)
+      7. If decryption succeeds and marker == "MFPKV6-MARK!": correct password.
+         If decryption fails or marker mismatch: incorrect password.
+
+   14.2. Security Note
+
+      The 82-byte global header alone is sufficient to mount an offline
+      dictionary attack on the password. With the recommended defaults
+      (m = 64 MiB, t = 3), each guess costs on the order of 100 ms on
+      typical hardware; actual cost varies by platform. The salt makes
+      precomputation (rainbow table) attacks infeasible.
+
+15. Security Considerations
+
+   15.1. Nonce Uniqueness and SE3 Construction
+
+      V6 employs the SE3 construction proven secure by Hoang & Shen:
+      - Master nonce R provides per-entry uniqueness and key isolation.
+      - Nonce prefix P with derived mask X prevents P collisions.
+      - Even if P generation fails (constant P), security is maintained
+        because X is derived from unique R via BLAKE3.
+      - Each entry uses an independent subkey derived from unique R,
+        preventing cross-entry ciphertext transplantation.
+
+   15.2. Segment Authentication (STREAM)
+
+      - Each segment is independently authenticated with a 128-bit tag.
+      - Segment reordering is detected via segment_index in both
+        the nonce and the AAD.
+      - Segment deletion is detected because the final segment is
+        distinguished by last_flag = 0x01 in AAD.
+      - Truncation attacks are prevented because the final segment
+        must authenticate with last_flag = 0x01.
+      - Early authentication failure aborts decryption immediately,
+        releasing no plaintext from unauthenticated segments.
+
+   15.3. Key Derivation Security
+
+      - Argon2id provides memory-hard brute-force resistance at the
+        password layer (64 MiB per attempt with the recommended defaults).
+      - BLAKE3 provides 256-bit PRF security at the per-entry layer,
+        operating over the secret master_key (not the password).
+      - Per-entry subkeys provide domain separation between entries.
+      - A 32-byte salt prevents precomputation attacks against Argon2id.
+
+   15.4. Metadata Confidentiality
+
+      - All paths are encrypted; no plaintext path appears in the
+        container. ENTRY_TYPE (1 byte) is stored in plaintext per entry.
+      - Timestamps are encrypted for files.
+      - FILE_SIZE and NUM_SEGMENTS are stored in plaintext. This is
+        a known and accepted metadata leak required for streaming
+        skip support (see Section 15.5).
+
+   15.5. Known Metadata Leaks
+
+      The following fields are stored in plaintext and visible to
+      an attacker with read access to the container file:
+      - ENTRY_TYPE: reveals whether an entry is a file or directory.
+      - FILE_SIZE: reveals the exact plaintext size of each file.
+      - NUM_SEGMENTS: conveys the same information as FILE_SIZE.
+
+      These leaks are inherent in the streaming/skip design and are
+      considered acceptable for this format. Applications requiring
+      full size confidentiality should pad plaintexts before
+      encryption.
+
+   15.6. Implementation Requirements
+
+      Implementations MUST:
+      - Use a CSPRNG for all nonce and salt generation.
+      - Zeroize sensitive key material (master_key, subkey) after use.
+      - Verify authentication tags before releasing any plaintext.
+      - Verify segment nonces match expected values during decryption.
+      - Validate all length fields before allocation or reads.
+      - Enforce the 4096-byte maximum path length.
+      - Verify sequential segment indices during decryption.
+
+16. IANA Considerations
+
+   - File extension: ".mfpk" (unchanged)
+   - MIME type: application/x-mfpk-v6
+   - Magic bytes: 89 4D 46 06
+
+17. Versioning
+
+   - MAGIC_VERSION: 89 4D 46 06 (last byte = 0x06 for V6)
+   - V6 is NOT backward compatible with V5 or earlier.
+   - Implementations MUST reject containers with unexpected magic bytes.
+
+18. Constants Summary
+
+   MAGIC_VERSION:        89 4D 46 06
+   SYNC_WORD:            A6 53 54 52
+   HEADER_SIZE:          82 bytes
+
+   MASTER_SALT_SIZE:     32 bytes (Argon2id salt)
+   ARGON2_T_SIZE:        1 byte (uint8)
+   ARGON2_M_SIZE:        4 bytes (uint32 LE, KiB)
+   ARGON2_P_SIZE:        1 byte (uint8)
+   MASTER_NONCE_SIZE:    16 bytes (R, per-entry)
+   NONCE_PREFIX_SIZE:    7 bytes (P, per-entry)
+   NONCE_MASK_SIZE:      7 bytes (X, derived)
+   SEGMENT_NONCE_SIZE:   12 bytes (ChaCha20-Poly1305)
+   TAG_SIZE:             16 bytes
+   KEY_SIZE:             32 bytes
+
+   PWV_MARKER:           "MFPKV6-MARK!" (12 bytes ASCII)
+   SEGMENT_SIZE:         65536 bytes (64 KiB)
+   SEGMENT_OVERHEAD:     28 bytes (12 nonce + 16 tag)
+
+   ENTRY_HEADER_FIXED:   44 bytes
+   ENTRY_TYPE_FILE:      0x00
+   ENTRY_TYPE_DIRECTORY: 0x01
+
+   LAST_FLAG_NON_FINAL:  0x00 (non-final content segment)
+   LAST_FLAG_FINAL:      0x01 (final content segment)
+   LAST_FLAG_FULLPATH:   0x02 (FULLPATH metadata segment)
+   LAST_FLAG_TIMESTAMP:  0x03 (TIMESTAMP metadata segment)
+
+   Recommended Argon2id defaults for new containers:
+   t (iterations):  3
+   m (memory KiB):  65536 (64 MiB)
+   p (parallelism): 4
+   output length:   32 bytes (not stored; always 32)
+
+19. Migration from V5
+
+   V5 and V6 are completely incompatible. Migration requires:
+   1. Decrypt entire V5 container using the V5 Argon2id-derived key.
+   2. Extract all files and metadata to secure temporary storage.
+   3. Create new V6 container with same or new password.
+   4. Re-encrypt all content using V6 format.
+   5. Verify integrity of migrated container.
+   6. Securely delete V5 container and temporary files.
+
+   Key changes from V5:
+   - AES-256-GCM replaced by ChaCha20-Poly1305.
+   - Argon2id retained with an unchanged 32-byte salt. V6 applies
+     Argon2id at the master key layer rather than a fast hash, and
+     stores its parameters in the header.
+   - Per-file encryption replaced by per-segment STREAM construction.
+   - BASE_PATH field eliminated (derivable from FULLPATH).
+   - V5 used no AAD; V6 authenticates full positional AAD.
+   - Header size changed from 72 bytes (V5) to 82 bytes (V6).
+
+20. Implementation Guidance
+
+   20.1. Memory Requirements
+
+      Minimum for encryption/decryption (after key derivation):
+      - 64 KiB plaintext segment buffer
+      - 64 KiB + 28 bytes ciphertext segment buffer
+      - ~256 bytes cryptographic state
+      - Total: ~130 KiB per concurrent operation
+
+      Argon2id key derivation additionally requires ARGON2_M KiB of RAM
+      (64 MiB with the recommended defaults) during container open.
+
+   20.2. Streaming Encryption Algorithm
+
+      1. Choose Argon2id parameters (t, m, p); write to header.
+      2. Derive master_key via Argon2id(password, MASTER_SALT, t, m, p).
+      3. For each entry:
+         a. Generate R (16 bytes) and P (7 bytes) from CSPRNG.
+         b. Derive subkey and mask_X: BLAKE3.keyed_hash(master_key, "MFPK-V6-SUBKEY-MASK" || R).
+         c. Compute P* = P ⊕ mask_X.
+         d. Encrypt FULLPATH: segment_index=0, last_flag=0x02.
+         e. Encrypt TIMESTAMP (files): segment_index=0, last_flag=0x03.
+         f. Write entry header with R, P, FILE_SIZE, NUM_SEGMENTS,
+            FULLPATH_LEN, TIMESTAMP_LEN.
+         g. Write ENCRYPTED_FULLPATH and ENCRYPTED_TIMESTAMP.
+         h. For i = 1 to NUM_SEGMENTS:
+            - Read up to SEGMENT_SIZE plaintext bytes.
+            - last_flag = 0x01 if i == NUM_SEGMENTS else 0x00.
+            - nonce_i = P*[0:4] || LE64(i).
+            - AAD_i = 0x00 || LE64(i) || last_flag || LE64(FILE_SIZE).
+            - C_i = ChaCha20Poly1305.Encrypt(subkey, nonce_i, AAD_i, plaintext).
+            - Write nonce_i || C_i.
+
+   20.3. Streaming Decryption Algorithm
+
+      1. Read Argon2id parameters from header.
+      2. Derive master_key via Argon2id(password, MASTER_SALT, t, m, p).
+      3. Verify PWV.
+      4. Read entry header, extract R, P, FILE_SIZE, NUM_SEGMENTS.
+      5. Derive subkey and mask_X.
+      6. Compute P* = P ⊕ mask_X.
+      7. Decrypt and verify FULLPATH and TIMESTAMP.
+      8. For i = 1 to NUM_SEGMENTS:
+         a. Read nonce (12 bytes), then ciphertext + tag.
+         b. Verify nonce == P*[0:4] || LE64(i); abort on mismatch.
+         c. Reconstruct AAD_i.
+         d. Decrypt; abort and discard output on auth failure.
+         e. Write plaintext to output.
+
+   20.4. Atomic Operations
+
+      - Write entries to a temporary file, then rename atomically.
+      - fsync/fdatasync before rename for durability.
+      - Truncate container on cancelled add operations.
+
+   20.5. Error Handling
+
+      - Validate all length fields before allocation.
+      - Check for integer overflow in size calculations.
+      - Fail securely on authentication errors (no partial plaintext).
+      - Log errors without leaking key material or plaintext.
+
+References
+
+   [STREAM]   Hoang, V.T., Reyhanitabar, R., Rogaway, P., Vizár, D.,
+              "Online Authenticated-Encryption and its Nonce-Reuse
+              Misuse-Resistance", CRYPTO 2015.
+
+   [SE3]      Hoang, V.T., Shen, Y., "Security of Streaming Encryption
+              in Google's Tink Library", CCS 2020.
+              https://eprint.iacr.org/2020/1019
+
+   [ChaCha20] Nir, Y., Langley, A., "ChaCha20 and Poly1305 for IETF
+              Protocols", RFC 8439, 2018.
+
+   [Argon2id] Biryukov, A., Dinu, D., Khovratovich, D., Josefsson, S.,
+              "Argon2 Memory-Hard Function for Password Hashing and
+              Proof-of-Work Applications", RFC 9106, 2021.
+
+   [BLAKE3]   O'Connor, J., Aumasson, J-P., Neves, S.,
+              Wilcox-O'Hearn, Z., "BLAKE3: one function, fast
+              everywhere", 2020.
+
+Author's Address
+
+   MFPK Maintainer
+   Email: mfpk-spec@example.com
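
Non-normative sketch. The byte-level rules in Sections 6.3, 6.4, 9.2, 10.5, and 13.2 of the specification can be illustrated with a few lines of Python that use only the standard library. The function names below are hypothetical helpers chosen for illustration and are not defined by the format; no key derivation or AEAD operation is performed here, so this shows only how nonces, AAD, and offsets would be assembled under the stated layout.

```python
import struct

SEGMENT_SIZE = 65536       # Section 10.1
SEGMENT_OVERHEAD = 28      # 12-byte nonce + 16-byte tag (Section 9.5)
ENTRY_HEADER_FIXED = 44    # Section 8.1


def num_segments(file_size: int) -> int:
    """Ceiling of FILE_SIZE / SEGMENT_SIZE; 0 for empty files."""
    return (file_size + SEGMENT_SIZE - 1) // SEGMENT_SIZE


def effective_prefix(p: bytes, mask_x: bytes) -> bytes:
    """P* = P XOR mask_X, both 7 bytes (Section 7.3)."""
    if len(p) != 7 or len(mask_x) != 7:
        raise ValueError("P and mask_X must each be 7 bytes")
    return bytes(a ^ b for a, b in zip(p, mask_x))


def segment_nonce(p_star: bytes, index: int) -> bytes:
    """nonce_i = P*[0:4] || LE64(index), 12 bytes (Section 6.4)."""
    return p_star[:4] + struct.pack("<Q", index)


def segment_aad(entry_type: int, index: int, last_flag: int,
                file_size: int) -> bytes:
    """entry_type || LE64(index) || last_flag || LE64(file_size)."""
    # "<" disables padding, so the result is exactly 18 bytes.
    return struct.pack("<BQBQ", entry_type, index, last_flag, file_size)


def content_start(entry_offset: int, fullpath_len: int,
                  timestamp_len: int) -> int:
    """Offset of the first content segment of an entry."""
    return entry_offset + ENTRY_HEADER_FIXED + fullpath_len + timestamp_len


def segment_offset(entry_offset: int, fullpath_len: int,
                   timestamp_len: int, i: int) -> int:
    """Offset of content segment i (1-indexed), per Section 13.2."""
    start = content_start(entry_offset, fullpath_len, timestamp_len)
    return start + (i - 1) * (SEGMENT_OVERHEAD + SEGMENT_SIZE)


def content_end(entry_offset: int, fullpath_len: int,
                timestamp_len: int, file_size: int) -> int:
    """Offset just past the entry's content, per Section 10.5."""
    start = content_start(entry_offset, fullpath_len, timestamp_len)
    return start + num_segments(file_size) * SEGMENT_OVERHEAD + file_size
```

Because every non-final segment is exactly SEGMENT_SIZE bytes, the random-access offset is a simple multiplication, and the skip calculation needs only the plaintext header fields.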