If you're interested in helping improve The List, then this is the document for you. It will help you understand how the project is structured and maintained, along with any rules or guidelines that you will need to know.
There are two main directories in the root of the project called utilities and source.
- utilities - contains the source text files for the programs we use to manage The List.
- source - contains the many bits and pieces of The List broken down for editing using git.
Under the source path, there are a couple of configuration files and some group directories. The _Release.txt file simply contains the version of the release we are working towards. The _Mapping.txt file contains settings and other information needed for makelist to compile The List into its release format.
With the exception of the Miscellaneous directory, the other sub-directories are groups which contain the files that become a LST file as defined in the _Mapping.txt file when The List is compiled. The files in the Miscellaneous directory are not processed and are simply copied into a release.
Every group sub-directory contains at least one Comment Section file. The filenames of these files always start with an underscore character. Generally, these files are included as-is into the appropriate LST file based on the order as defined in the _Mapping.txt file.
Most groups (like Interrupt List and Ports List) can contain numerous sub-directories and files used for the items in a LST file. The directory and file names for those items are not important to the program that compiles The List. Their names are only for our convenience when finding and editing an entry for The List. Those items can be broken down into any level of sub-paths which we find suitable to our needs in maintaining the project. (When compiling, the unique IDs or sort-as IDs (described below) are used to determine the order of entries.)
Note: All files are required to have a .txt filename extension and must be encoded in standard ASCII using code page 437. UTF-8 and other text encoding schemes are not supported at this time.
The individual list item entry files each contain a header with important information that is needed when compiling the list files. The header starts with one horizontal bar and ends with another, each made up of at least 75 dashes (minus characters). While there are several possible fields, only the Unique ID field is absolutely mandatory.
- Unique ID - Mandatory field. It contains the ID which will be used for the section break in the compiled LST file. For example, Interrupt 21h function 4Ch has a Unique ID of `214C`. An entry has to have exactly one Unique ID.

  However, sometimes there is insufficient uniqueness for an ID to distinguish it from another entry. For such items, the ID can be suffixed with a -sort-as- ID to better identify the entry and guarantee the order in a LST file.

  For example, there are multiple entries for Interrupt 21h function 40h. These entries have Unique IDs like `2140-sort-as-2140++` and `2140-sort-as-2140PC/TCP` to always maintain the same order in The List. Additional text after the Unique ID is prohibited.
- Sort As - Optional field. This is an alternative method to the -sort-as- suffix for providing additional uniqueness. Using one of the previously mentioned entries as an example, you could enter `Unique ID: 2140` and `Sort As: 2140PC/TCP`. Additional text after the Sort As ID is prohibited.

  Either method of providing a Sort As ID is equally acceptable. However, do not use both the `-sort-as-` suffix and the `Sort As` field in the same entry.
- Category - Recommended field. This is a single character field as listed in the _Categories.txt file or `-` if no category is used. Note that the category identifier is case-sensitive. While not required, additional text is recommended after the category identifier in order to ease identification of the category. Entries are restricted to only a single category per file. The category identifier is entered into the section break line of the compiled LST file, after the first 8 dashes. After the category identifier another dash follows, then the list-as ID (from the Unique ID).
- Flag - Recommended field. Like the Category field, this is a single character field as listed in the _Flags.txt file or `n/a` if no flag is applicable. Like the category identifier, the flag identifier is case-sensitive. While not required, additional text is recommended after the flag identifier in order to ease identification of the flag. You may include several Flag fields in the header if needed.
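
The header rules above can be sketched as a small Python check. This is only an illustration, not the makelist implementation: the function names `parse_header` and `check_fields` are made up for this example, and it assumes that the opening bar is the first line of the file and that every field is written as `Name: value`, as in the `Unique ID: 2140` example.

```python
import re

# A bar is a line made up of at least 75 dashes.
BAR = re.compile(r"^-{75,}\s*$")


def check_fields(fields):
    """Apply the Unique ID and Sort As rules to the parsed fields."""
    unique_ids = fields.get("Unique ID", [])
    if len(unique_ids) != 1:
        raise ValueError("exactly one Unique ID is required")
    if "-sort-as-" in unique_ids[0] and "Sort As" in fields:
        raise ValueError("use the -sort-as- suffix or the Sort As field, not both")


def parse_header(lines):
    """Return (fields, content_lines) for one entry file given as a list of lines.

    Assumption: the first line of the file is the opening bar.
    """
    if not lines or not BAR.match(lines[0]):
        raise ValueError("header must start with a bar of at least 75 dashes")
    fields = {}
    for index, line in enumerate(lines[1:], start=1):
        if BAR.match(line):
            # Second bar: the header is complete, the rest is content.
            check_fields(fields)
            return fields, lines[index + 1:]
        if not line.strip():
            continue
        name, separator, value = line.partition(":")
        if not separator:
            raise ValueError(f"not a field line: {line!r}")
        # Several Flag fields are allowed, so values are collected in lists.
        fields.setdefault(name.strip(), []).append(value.strip())
    raise ValueError("header is missing its closing bar")
```

Collecting every field into a list keeps repeated Flag fields intact while still making it easy to reject a file with zero or several Unique ID fields.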
After the second all-dash separator line that ends the header, the contents of the list item follow. All contents are copied exactly into the LST file during compilation, except that leading and trailing empty lines are skipped and linebreaks are normalised to CR LF. Content lines should always fit in fewer than 80 columns. Tab stops are expected at every 8th column.
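
As a rough illustration of these content rules, the following Python sketch trims leading and trailing empty lines, normalises line endings to CR LF, and reports lines that do not fit in fewer than 80 columns once tabs are expanded to 8-column stops. The function name `normalise_content` and its parameters are hypothetical and not part of the project's tools.

```python
def normalise_content(text, tab_width=8, max_columns=79):
    """Return (crlf_text, too_wide) for the content part of an entry.

    too_wide lists the 1-based numbers of lines wider than max_columns.
    """
    # Accept any incoming line-ending style before splitting.
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    # Leading and trailing empty lines are skipped.
    while lines and lines[0] == "":
        lines.pop(0)
    while lines and lines[-1] == "":
        lines.pop()
    # Measure display width with tab stops at every 8th column.
    too_wide = [
        number
        for number, line in enumerate(lines, start=1)
        if len(line.expandtabs(tab_width)) > max_columns
    ]
    return "".join(line + "\r\n" for line in lines), too_wide
```

Running a check like this before committing an entry can catch over-long lines early; whether the real compiler rejects or merely passes through such lines is not stated here.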
The first non-empty line is known as the summary base.
For an entry like INT 21/AH=4Ch,
the summary starts with INT 21.
After this there may be a blank-separated field listing
all flag letters, if any.
(If multiple flags are used, there are no blanks
between the individual flag letters.)
Note that all used flags must be listed here again
because the header Flag fields do not affect the compiled LST file.
After this there's a blank and a dash and another blank. The remainder of the line is the Name of the entry. It may start with one or multiple component or system identifiers separated from the remaining Name by another dash each. Words other than proper nouns or identifiers are typically in all-caps.
The summary base does not include register values for the entry.
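
To make the layout of the summary base line concrete, here is a small Python sketch that assembles one from its parts. The helper `summary_base` is hypothetical, the flag letters and names in the usage lines are placeholders, and it assumes that component identifiers are separated by the same blank-dash-blank sequence as the Name.

```python
def summary_base(base, name, flags=(), components=()):
    """Build a summary base line such as 'INT 21 - NAME'."""
    line = base
    if flags:
        # All flag letters form one blank-separated field without inner blanks.
        line += " " + "".join(flags)
    # Components, if any, come before the Name, each introduced by ' - '.
    for part in (*components, name):
        line += " - " + part
    return line


print(summary_base("INT 21", "EXAMPLE FUNCTION"))
print(summary_base("INT 21", "EXAMPLE FUNCTION", flags=["U", "P"],
                   components=["EXAMPLE SYSTEM"]))
```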
After the summary base line, input conditions may follow, mostly listing register values to set up to call this entry's function. All input conditions are indented with a tab at the very beginning of the line. ecm's IntList expects input registers to be in one of the following formats to pick them up for its dynamic summary and as register states for hyperlink destinations:
- `<TAB><all-caps register name> = <hexadecimal value>` for a single value
- `<TAB><reg name> = <hex> subfn <hex>` for register value and subfunction value
- `<TAB><reg name> = <hex> ('<letters>')` for magic values, where the letters match the ASCII interpretation of the hexadecimal value in NASM or MASM order
- `<TAB><reg name> = <hex> / <hex>` for multiple values
- `<TAB><reg name> = <hex>..| to |-<hex>` for a value range
- `<TAB><reg name> = subfunction|type of load|what to return in BH|origin of move` for a type register
- Line starting with `---<NONDASH>`: Skipped
- Line starting with `<NONTAB>`: End of dynamic summary buildup
Type registers enter a sub-mode where matches occur as follows:
- `<TAB><4 BLANKs><hex>...` (trail ignored) gives one value for the type register
- Line starting with `<TAB>` followed by different whitespace than exactly `<4 BLANKs>`: Skipped
- Line starting with `<TAB><NONWHITESPACE>`: End of type register sub-mode
- Line starting with `<NONTAB>`: End of dynamic summary buildup
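
The input register formats can be approximated with regular expressions. The Python sketch below is not IntList's actual matcher: the names `HEX`, `REG`, `PATTERNS`, and `classify` are invented for this example, hexadecimal values are assumed to be upper-case digits with a trailing `h` (as in `4Ch`), and the type register sub-mode is not covered.

```python
import re

HEX = r"[0-9A-F]+h"   # assumed notation, e.g. 4Ch
REG = r"[A-Z]+"       # all-caps register name

# More specific formats come first; "single" is a prefix of most others.
PATTERNS = [
    ("subfunction", re.compile(rf"^\t({REG}) = ({HEX}) subfn ({HEX})")),
    ("magic", re.compile(rf"^\t({REG}) = ({HEX}) \('([^']+)'\)")),
    ("multiple", re.compile(rf"^\t({REG}) = ({HEX}) / ({HEX})")),
    ("range", re.compile(rf"^\t({REG}) = ({HEX})(?:\.\.| to |-)({HEX})")),
    ("type", re.compile(
        rf"^\t({REG}) = "
        r"(subfunction|type of load|what to return in BH|origin of move)")),
    ("single", re.compile(rf"^\t({REG}) = ({HEX})")),
]


def classify(line):
    """Return (kind, captured_groups) for one line after the summary base."""
    if re.match(r"^---[^-]", line):
        return ("skip", ())
    if line and not line.startswith("\t"):
        # Any other non-tab first character ends the dynamic summary buildup.
        return ("end", ())
    for kind, pattern in PATTERNS:
        match = pattern.match(line)
        if match:
            return (kind, match.groups())
    return ("other", ())
```

For instance, `classify("\tAH = 4Ch")` yields `("single", ("AH", "4Ch"))`, while an unindented `Return:` line yields `("end", ())`.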
After the input conditions, indented sections may follow.
These start with section names that are not indented,
with contents indented by one tab or more.
Typically, sentences or paragraphs start with a 1-tab indentation
and continued paragraph parts use <TAB><2 BLANKs> indentation.
The section names are:
- Return:
- Desc:
- Notes:
- BUGS:
- Program:
- InstallCheck:
- Range:
There is a special section name SeeAlso:.
Every line belonging to this section type starts with SeeAlso: followed
by one blank.
This section lists only references to other entries or tables.
After the named sections, tables may be listed.
Tables are always started by an empty line then an unindented line.
Near the beginning of a table definition,
the text (Table Annnn) is always used,
in which A is alphanumeric (all-caps) and n are decimal digits.
IntList uses the following patterns to match references as hyperlinks within entries after the summary base line:
- Name matches are matched as `"<N NONDOUBLEQUOTEs>"`, name matches are allowed after all references except for tables. Note that there must not be any whitespace before the opening doublequote.
- Register values are matched as `<all-caps register name>=<hex>` without any whitespace
- Multiple register values are chained with slashes `/`
The following patterns are turned into references:
- Interrupts are matched as `INT<BLANK><hex>`
- Interrupts with registers are matched as `INT<BLANK><hex>/<regvalues>`
- Register-only references (only valid in INTERRUP.LST) are matched as `<regvalues>` where the first register must be one of AH, AL, or AX
- In SeeAlso: lines, register-only references may occur for other first registers too
- Tables are matched as `#<1 all-caps ALPHANUM><4 DECIMAL DIGITs>`
- Memory is matched as `MEM<BLANK><hex>:<hex>` or `MEM<BLANK><hex>`
- Far calls are matched as `@<hex>:<hex>`
- Port ranges are matched as `PORT<BLANK><hex>-<hex>`
- Single ports are matched as `PORT<BLANK><hex>`
After an interrupt with registers or register-only reference match,
an immediately following ,<hex>
(with the comma not preceded or followed by any whitespace)
is matched as a reference to the same call as the prior match
except the last register's value is replaced by the specified number.
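
The reference patterns can be tried out with a single regular expression. The Python sketch below is an approximation for experimenting with entry text, not IntList's implementation: `HEX`, `REGVALUES`, `REFERENCE`, and `find_references` are names invented here, hexadecimal numbers are assumed to be upper-case digits with an optional trailing `h`, and neither name matches nor the `,<hex>` continuation are handled.

```python
import re

HEX = r"[0-9A-F]+h?"  # assumed notation: 21, 4Ch, 03F8h
REGVALUES = rf"[A-Z]+={HEX}(?:/[A-Z]+={HEX})*"

# Alternatives are tried in order, so port ranges come before single ports.
REFERENCE = re.compile(
    rf"(?P<interrupt>\bINT {HEX}(?:/{REGVALUES})?)"
    rf"|(?P<registers>\bA[HLX]={HEX}(?:/[A-Z]+={HEX})*)"
    rf"|(?P<table>#[0-9A-Z][0-9]{{4}})"
    rf"|(?P<memory>\bMEM {HEX}(?::{HEX})?)"
    rf"|(?P<farcall>@{HEX}:{HEX})"
    rf"|(?P<ports>\bPORT {HEX}-{HEX})"
    rf"|(?P<port>\bPORT {HEX})"
)


def find_references(text):
    """Return (kind, matched_text) pairs in order of appearance."""
    return [(m.lastgroup, m.group()) for m in REFERENCE.finditer(text)]


for kind, text in find_references("SeeAlso: INT 21/AH=4Ch,#01234,PORT 03F8h-03FFh"):
    print(kind, text)
```

The `registers` alternative only accepts AH, AL, or AX as the first register, which mirrors the rule for register-only references outside SeeAlso: lines.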