@@ -214,6 +214,73 @@ TODO: documentation contents, what should and should no be included
214 214
215 215 TODO
216 216
217+ [[encoding-name]]
218+ == Encoding name (`encoding`)
219+
220+ The `encoding` key (used in `meta/encoding` for the default encoding and
221+ in individual attributes to override it) specifies the character
222+ encoding for string values. The KS compiler recognizes a set of
223+ canonical encoding names and common aliases, and will emit warnings if a
224+ non-canonical form is used.
225+
226+ Encoding names MUST be written in their canonical (exact) spelling.
227+ Matching is *case-sensitive*: for example, `UTF-8` is canonical, whereas
228+ `utf-8` or `Utf-8` triggers a warning.
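+
+ As a minimal sketch of both placements of the key, assume a hypothetical
+ format with two fixed-size string fields (the `id` values and sizes
+ below are illustrative only and are not taken from any real spec):
+
+ [source,yaml]
+ ----
+ meta:
+   id: example_strings   # hypothetical spec name
+   encoding: UTF-8       # default for all strings, canonical spelling
+ seq:
+   - id: title           # hypothetical field, uses the default (UTF-8)
+     type: str
+     size: 16
+   - id: legacy_name     # hypothetical field, overrides the default
+     type: str
+     size: 8
+     encoding: ISO-8859-1
+ ----
+
+ Here `title` inherits `UTF-8` from `meta/encoding`, while `legacy_name`
+ overrides it with `ISO-8859-1`. Writing `utf-8` or `latin1` in either
+ place would be expected to produce the warning described above.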
229+
230+ The following canonical names are recognized by the compiler:
231+
232+ [cols="1,3", options="header"]
233+ |====
234+ | Canonical name | Common aliases (accepted but produce a warning)
235+
236+ | `ASCII`
237+ | US-ASCII, US_ASCII, IBM367, cp367
238+
239+ | `UTF-8`
240+ | UTF8, UTF_8, cp65001
241+
242+ | `UTF-16BE`
243+ | UTF16BE, UTF16-BE, UTF-16-BE, UTF_16BE
244+
245+ | `UTF-16LE`
246+ | UTF16LE, UTF16-LE, UTF-16-LE, UTF_16LE
247+
248+ | `UTF-32BE`
249+ | UTF32BE, UTF32-BE, UTF-32-BE, UTF_32BE
250+
251+ | `UTF-32LE`
252+ | UTF32LE, UTF32-LE, UTF-32-LE, UTF_32LE
253+
254+ | `ISO-8859-1`
255+ | ISO8859-1, ISO_8859_1, latin1, cp819, windows-28591
256+
257+ | `ISO-8859-2` ... `ISO-8859-16`
258+ | Same pattern of aliases (e.g. latin2, latin3, ..., latin10)
259+
260+ | `windows-1250` ... `windows-1258`
261+ | cp1250 ... cp1258
262+
263+ | `IBM437`
264+ | cp437, 437
265+
266+ | `IBM866`
267+ | cp866, 866
268+
269+ | `Shift_JIS`
270+ | Shift-JIS, ShiftJIS, S-JIS, SJIS, PCK
271+
272+ | `Big5`
273+ | csBig5
274+
275+ | `EUC-KR`
276+ | EUCKR, EUC_KR, korean
277+ |====
278+
279+ The list above is illustrative only. The authoritative list is
280+ maintained in the compiler source code (see
281+ https://github.com/kaitai-io/kaitai_struct_compiler/blob/master/shared/src/main/scala/io/kaitai/struct/EncodingList.scala[EncodingList.scala]);
282+ if in doubt, follow the list in the source code.
283+
217 284 [[seq-attr]]
218 285 == Sequence attributes
219 286