Character Set Support

All character data is encoded using UTF-8, ensuring consistent representation across containers, primary storage, and secondary storage systems.

UTF-8 is a variable-length encoding standard defined by RFC 3629, capable of representing all 1,112,064 valid Unicode code points using one to four 8-bit bytes:

Code points in the ASCII range (U+0000 to U+007F) are encoded using a single byte, maintaining direct binary compatibility with ASCII.
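Code points in higher ranges are encoded using two to four bytes: two bytes for U+0080 to U+07FF, three bytes for U+0800 to U+FFFF, and four bytes for U+10000 to U+10FFFF, so lower code points take fewer bytes. Every byte of a multi-byte sequence has its high bit set, so multi-byte characters can never be mistaken for ASCII bytes.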

This encoding strategy ensures full Unicode support, efficient storage, and reliable interoperability with modern file systems and network protocols.
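
The byte lengths described above can be checked with a short Python sketch. The sample characters and the helper name utf8_length are chosen here for illustration only and are not part of the product.

```python
# One sample character from each UTF-8 length class (illustrative choices).
SAMPLES = ["A", "é", "€", "😀"]


def utf8_length(ch: str) -> int:
    """Return the number of bytes used to encode ch in UTF-8."""
    return len(ch.encode("utf-8"))


for ch in SAMPLES:
    encoded = ch.encode("utf-8")
    # Show the code point, the byte count, and the raw bytes in hex.
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s), hex {encoded.hex(' ')}")
```

For ASCII characters the UTF-8 bytes are identical to the ASCII bytes, which is the binary compatibility mentioned above.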

UTF-8 Restrictions

Certain characters are restricted when used in file or folder names:

The characters / and | are not permitted in file or folder names, as they are reserved by the filesystem and shell.
The characters % and : cannot be used as the sole character in a file or folder name.
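
As a rough illustration, the two rules above could be checked before a file or folder is created. The function name is_valid_name and the constant names are hypothetical, and the sketch covers only these two rules, not any other limits the system may enforce.

```python
# Characters that may not appear anywhere in a file or folder name.
FORBIDDEN_CHARS = {"/", "|"}

# Characters that may not be used as the entire name.
FORBIDDEN_SOLE_NAMES = {"%", ":"}


def is_valid_name(name: str) -> bool:
    """Check a name against the two documented rules only (hypothetical helper)."""
    if name in FORBIDDEN_SOLE_NAMES:
        return False
    return not any(ch in FORBIDDEN_CHARS for ch in name)


print(is_valid_name("report.txt"))  # allowed
print(is_valid_name("a/b"))         # rejected: contains /
print(is_valid_name("%"))           # rejected: % as the sole character
print(is_valid_name("100%"))        # allowed: % is not the sole character
```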

If the name of a file or folder is encoded using a non-UTF-8 character set, the name is automatically re-encoded as the prefix %MNGA182 followed by the hexadecimal representation of the original name. This keeps files with unsupported or legacy encodings usable and allows the original name to be recovered from the stored one.
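
The renaming pattern can be sketched as follows. This is only an approximation under stated assumptions: the hexadecimal part is assumed to be the lowercase hex of the raw name bytes, and the function name reencode_name is hypothetical. The actual hex format used by the system is not specified here.

```python
PREFIX = "%MNGA182"  # prefix given in the text above


def reencode_name(raw_name: bytes) -> str:
    """Return the name unchanged if it is valid UTF-8, else the prefixed hex form.

    Assumption: the hex part is the lowercase hex of the raw name bytes.
    """
    try:
        return raw_name.decode("utf-8")
    except UnicodeDecodeError:
        return PREFIX + raw_name.hex()


# A Latin-1 encoded name: the byte 0xE9 is not valid UTF-8 on its own.
legacy = "café".encode("latin-1")
print(reencode_name(legacy))

# The same name encoded as UTF-8 passes through unchanged.
print(reencode_name("café".encode("utf-8")))
```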

