NKit/NKitFormat
Latest revision as of 10:53, 20 September 2019
This page gives the technical detail of the NKit Format. It's concise and intended to be used alongside the code.
The NKit Format is a non-lossy format for shrinking and restoring Wii and GameCube images. It supports clean / Redump images as well as scrubbed and hacked images. Some corrupt and bad images are supported too, although these can fail to convert because of invalid fst.bin modifications.
There are 2 NKit output formats, raw (iso) and compressed (gcz). Both were designed with 1:1 preservation, size and playability in mind. Hardware support for Wii was sacrificed for smaller image sizes. If hardware support is required then WBFS is currently the best available option, and WBFS images are recoverable to Redump where they are not hacked or corrupt.
System | Format | Hardware Supported | Dolphin Supported | Restorable 1:1 | Notes |
---|---|---|---|---|---|
GameCube | nkit.iso | Yes | Yes | Yes | Same as compacted GameCube iso |
GameCube | nkit.gcz | No | Yes | Yes | GCZ is Dolphin's own block seekable compression format |
Wii | nkit.iso | No | Yes | Yes | RVT-H format only playable in Dolphin |
Wii | nkit.gcz | No | Yes | Yes | RVT-H in GCZ playable in Dolphin only |
NKit Format images contain the bare minimum of data. All junk and scrubbing is removed. Non-uniform (NonJunk) data is preserved in 256 byte blocks with a 4 byte header. Wii encryption and hashes are fully recreatable, so they are removed. This leaves the remaining data as compressible as possible.
Header
NKit places its header at 0x200 in the Wii and GameCube disc headers.
Offset | Length | Name |
---|---|---|
0x200 | 0x4 | NKit Header 'NKIT' |
0x204 | 0x4 | NKit Version ' v01' |
0x208 | 0x4 | Source image original CRC32 |
0x20C | 0x4 | NKit CRC - makes the NKit file CRC32 equal the source CRC at 0x208 (at 0x4 in GCZ) |
0x210 | 0x4 | Source image length |
0x214 | 0x4 | Forced Junk ID (When Disc ID differs - rare - GameCube only) |
0x218 | 0x4 | Wii Update partition CRC32 if removed when converting |
NKit also modifies the Wii header to set the bytes at 0x60 and 0x61 to 0x1. This indicates to a Dev Kit and Dolphin that the encryption and hashes aren't present.
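The header layout above can be read with a few fixed-offset unpacks. The following is a minimal sketch, not NKit's own code: the names `NKitHeader`, `parse_nkit_header` and `encryption_and_hashes_removed` are hypothetical, and big-endian byte order is an assumption (it is the usual order on GameCube and Wii discs, but this page doesn't state it). The source length is returned as the raw 4 byte value because this page doesn't say whether it is scaled.

```python
import struct
from typing import NamedTuple


class NKitHeader(NamedTuple):
    """Fields of the NKit header stored at 0x200 (names are illustrative)."""
    version: str
    source_crc32: int
    nkit_crc: int
    source_length_raw: int      # raw 4 byte value; any scaling is not documented here
    forced_junk_id: bytes
    update_partition_crc32: int


def parse_nkit_header(disc_header: bytes) -> NKitHeader:
    """Parse the NKit header from the start of an image (assumes big-endian)."""
    if len(disc_header) < 0x21C:
        raise ValueError("need at least 0x21C bytes of disc header")
    magic, version = struct.unpack_from(">4s4s", disc_header, 0x200)
    if magic != b"NKIT":
        raise ValueError("no NKit header at 0x200")
    src_crc, nkit_crc, src_len = struct.unpack_from(">III", disc_header, 0x208)
    (junk_id,) = struct.unpack_from(">4s", disc_header, 0x214)
    (update_crc,) = struct.unpack_from(">I", disc_header, 0x218)
    return NKitHeader(version.decode("ascii"), src_crc, nkit_crc,
                      src_len, junk_id, update_crc)


def encryption_and_hashes_removed(disc_header: bytes) -> bool:
    """True when the bytes at 0x60 and 0x61 are both 0x1 (Wii only)."""
    return disc_header[0x60] == 1 and disc_header[0x61] == 1
```

Only the first 0x21C bytes of the image are needed, so a caller can read that much from the file and pass it straight in.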
GameCube
The disc header (Boot.bin) length is 0x440. It's modified as detailed above. The other items remain untouched.
The filesystem and fst.bin are modified when converting to NKit.
If the fst.bin is invalid then the whole filesystem (post fst.bin) is encoded as a gap. This is known as a Bad Conversion. It typically happens with corrupt images, and some customs may have this issue too. Bad Conversions often error as well, because not all possibilities are convertible.
Wii
The disc header length is 0x50000. The partition tables are modified to point to new partition offsets.
Removed Update Partition
The update partition can be removed in order to save space. In this case offset 0x218 in the disc header will hold the CRC32 of the removed partition. The removed partition will have been backed up in the 'Redump' recovery folder, or in the 'NkitExtracted' recovery folder if it's not a known partition. The file name will end with this CRC.
The original partition table is backed up at 0x50000. The next partition (Data or Channel) is located at 0x58000.
Partitions
The partition header length is always 0x20000. The NKit Format only modifies one 4 byte value - the partition length (offset 0x2bc). The length is the new size of the compacted partition data and is used to restore the partition.
The partition data is unencrypted and hashes have been removed.
The new partition data length is calculated without the hashes, whereas the original includes them. The original value is preserved in the partition data at 0x210. The real RVT-H format stores 0 for the length in the header. It's unconfirmed if a Dev Kit uses this value.
Partition Processing
NKit partition data information:
- Decrypted groups
- Hashes are verified and removed - if any hash sector fails in a group then all the group's hashes are cached and stored at the end of the partition
- System items are copied as is up to the fst.bin. The fst.bin will be modified to reflect new file offsets
- Immediately after the fst.bin follows a bit mask of flags indicating preserved hashes that didn't verify correctly. The length of this mask is 1 byte for every 8 groups in the partition. Add 1 byte if the number of groups isn't divisible by 8
- The filesystem is compacted with the NKit Format, by encoding the gaps between files
- If the fst.bin is invalid then the whole filesystem (post fst) is encoded as a gap. This is known as a Bad Conversion. It typically happens with corrupt images, and some customs may have this issue too. Bad Conversions often error as well, because not all possibilities are convertible
- Hashes that failed to verify are stored at the end of the NKit Format partition data. The length can be calculated from the hash mask. Care must be taken to calculate the last group. Only read as many hashes as there are blocks
- If the partition data ID is 4 null bytes then 0x440 bytes for the header are written, followed by 4 bytes indicating the original partition length. The remainder of the partition is encoded as a gap. The header is not modified. This is to cater for 2 system discs that have missing update partition data that's filled with sequential values. Decryption and hashes are valid and removed
- Junk following the partition is encoded and stored in the NKit Format partition
- Partitions are then padded to 32KiB with 0x00 where the next partition follows (if any)
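The size arithmetic in the list above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the bit order of the flags inside each mask byte isn't stated on this page, so the example only counts flagged groups rather than mapping bits to group numbers.

```python
def hash_mask_length(group_count: int) -> int:
    """1 byte per 8 groups, plus 1 byte if the count isn't divisible by 8."""
    return (group_count + 7) // 8


def count_flagged_groups(mask: bytes) -> int:
    """Number of groups whose hashes were preserved (set bits in the mask)."""
    return sum(bin(byte).count("1") for byte in mask)


def pad_to_32k(length: int) -> int:
    """Round a partition length up to the next 32KiB boundary."""
    return (length + 0x7FFF) & ~0x7FFF
```

For example, a partition with 17 groups needs a 3 byte mask, and a partition of 0x8001 bytes is padded with 0x00 up to 0x10000.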
A lot of the complexity in this format is to cater for non-Redump images (scrubbed, hacks etc). Redump images don't require hash preservation, scrubbing preservation or invalid fst.bin checks. This format caters for them anyway.
FileSystem
Both the GameCube and Wii use the same code to encode the filesystem. The only real difference is that Wii fst.bin offsets and lengths must be multiplied by 4 when mapping to files. The NKit Format uses a custom Run Length Encoding (RLE) to remove the gaps between files and shrink gaps to 4 bytes in most cases.
Gaps are processed on the 4 byte boundary. Any data from a non aligned file remains intact (it should be 1-3 bytes of 0x00). The gaps also contain trailing nulls that follow the previous file. See the filesystem rules for more information.
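As a rough illustration of the 4 byte boundary rule, the sketch below computes where a gap starts and how long it is. The helper names are hypothetical, and it assumes the next file's offset is itself 4 byte aligned, which is what makes the gap length a multiple of 4.

```python
def align_up4(offset: int) -> int:
    """Round an offset up to the next 4 byte boundary."""
    return (offset + 3) & ~3


def gap_range(prev_offset: int, prev_length: int, next_offset: int):
    """Return (gap_start, gap_length) between two files.

    The 1-3 bytes between the real end of a non aligned file and the
    boundary are left with the file rather than counted as gap.
    """
    start = align_up4(prev_offset + prev_length)
    return start, max(0, next_offset - start)
```

A file at 0x1000 with length 0x101 ends at 0x1101, so the gap before a file at 0x2000 starts at 0x1104 and is 0xEFC bytes long.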
Data in a gap is categorised as:
- Junk: a gap with the correct nulls and junk
- Block: Scrubbed or filled with the same byte. Wii scrubbed data when decrypted is a repeating 16 byte sequence
- NonJunk: Data that is not junk or scrubbed data
The Gap / 0x00 Scrubbed / Preserved Junk File RLE format is:
Offset | Mask | Value |
---|---|---|
0x0 | 0xFFFFFFFC | Size of gap without the last 2 bits set. Gaps are always to 4 byte boundaries so the last 2 bits will be 0 anyway |
0x0 | 0x00000003 | 0x0:AllJunk, 0x1:All 0x00 Scrubbed, 0x2:Mixed, 0x3:JunkFile |
0x4 | 0xFFFFFFFF | Optional: Caters for Gaps larger than 32 bit (Only seen in one image). Present if the value for the above mask is 0xFFFFFFFC (all bits on). Add this value to the 0xFFFFFFFC for the full gap size. |
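Decoding the gap header in the table above could look like the sketch below. It is an assumption-laden example rather than NKit's implementation: `GapKind` and `decode_gap_header` are hypothetical names, and the values are assumed to be stored big-endian.

```python
import struct
from enum import IntEnum


class GapKind(IntEnum):
    ALL_JUNK = 0x0
    ALL_ZERO_SCRUBBED = 0x1
    MIXED = 0x2
    JUNK_FILE = 0x3


def decode_gap_header(data: bytes, pos: int = 0):
    """Return (gap_size, kind, header_bytes_used) for the header at pos."""
    (word,) = struct.unpack_from(">I", data, pos)
    kind = GapKind(word & 0x00000003)
    size = word & 0xFFFFFFFC
    used = 4
    if size == 0xFFFFFFFC:
        # All size bits on: an extra 4 byte value follows and is added on.
        (extra,) = struct.unpack_from(">I", data, pos + 4)
        size += extra
        used = 8
    return size, kind, used
```

The returned byte count lets the caller skip past either the 4 byte or the 8 byte form of the header before reading any mixed blocks.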
If the gap is set to mixed there will be multiple encoded blocks, each with a 4 byte header. The header is built from the masks detailed in the items below, combining either items 1 & 2 or items 1 & 3.
Item | Mask | Value |
---|---|---|
1 | 0xC0000000 | 0x00000000:Junk, 0x40000000:NonJunk, 0x80000000:ByteFill, 0xC0000000:Repeat |
2 | 0x3FFFFFFF: Only for Junk, NonJunk, Repeat | Count of 256 byte blocks |
3 | 0x3FFFFF00: Only for ByteFill (Count) & 0xFF (Byte) | Count of 256 byte blocks & Byte - NKit GameCube supports all bytes, Wii only supports 0x00 and 0xFF |
The above items store 256 byte block counts rather than length. This is to allow them to use as few bytes as possible. When restoring the data the gap length from the first table must be used to determine if the last block is complete or partial. The stored gap length must not be exceeded.
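The mixed block header and the block count rule can be sketched like this. The names are hypothetical, and shifting the ByteFill count right by 8 bits is an assumption inferred from the 0x3FFFFF00 mask; the page doesn't spell it out.

```python
JUNK = 0x00000000
NON_JUNK = 0x40000000
BYTE_FILL = 0x80000000
REPEAT = 0xC0000000


def decode_block_header(word: int):
    """Split a 4 byte mixed block header into (kind, block_count, fill_byte)."""
    kind = word & 0xC0000000
    if kind == BYTE_FILL:
        # Item 3: count in the middle 22 bits, fill byte in the low 8 bits.
        return kind, (word & 0x3FFFFF00) >> 8, word & 0xFF
    # Item 2: the remaining 30 bits are the count of 256 byte blocks.
    return kind, word & 0x3FFFFFFF, None


def block_byte_length(block_count: int, remaining_gap: int) -> int:
    """Bytes to restore for a section, never exceeding the stored gap length."""
    return min(block_count * 256, remaining_gap)
```

When the last block of a gap is partial, `block_byte_length` clips it: 2 blocks with only 300 bytes of gap left restore 300 bytes, not 512.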
Repeat is used if the size or block count does not fit in the value mask specified. It has not yet occurred in tests.
For NonJunk sections the preserved blocks are stored immediately after the 4 byte header detailed above. The NonJunk blocks are cached and written out in 50MiB blocks, when the block being processed takes the cache size over 50MiB. There is only one disc in Redump that triggers this (a GameCube service disc).
Preserving Wii Scrubbing
Scrubbing replaces the encrypted data with a repeating character - normally 0x00. Therefore all underlying junk and hashes are removed. NKit still decrypts it and processes the data. When analysing decrypted scrubbed partition data, the result is a repeating 16 byte pattern. NKit recognises this and preserves it as ByteFill (0x00 and 0xFF scrubbing is supported).
NKit preserves gaps from the offset of the end of the previous file (4 byte boundary) to the next file. Gaps are not aligned on the 256 byte boundary that NKit uses to detect the blocks it's preserving. So when an image goes from junk to scrubbed there's often a preserved NonJunk block that's half junk and half scrubbed, or vice versa. When restoring an image, this works out 1:1 to the original. If a full group is scrubbed then the scrubbing must be copied into the hash sector. When encrypted, the repeating bytes go back to 0x00 or 0xFF scrubbing.
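Recognising the repeating 16 byte pattern in a decrypted block can be sketched as below. This only illustrates the check described above and is not NKit's detection code; `repeating_pattern` is a hypothetical name, and mapping the pattern back to 0x00 or 0xFF scrubbing (which depends on the partition's encryption) is left out.

```python
from typing import Optional


def repeating_pattern(block: bytes, period: int = 16) -> Optional[bytes]:
    """Return the repeated sequence if block is one pattern repeated, else None."""
    if len(block) < period or len(block) % period:
        return None
    pattern = block[:period]
    if block == pattern * (len(block) // period):
        return pattern
    return None
```

A 256 byte block made of one 16 byte sequence repeated 16 times returns that sequence, while a block that is half junk and half scrubbed returns None and would be kept as NonJunk.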