Compress parser codepaths for repetitive code & Update Algo docs #65

Open
@Ed94

Description

The parser needs a review, both to update the algorithm's documentation and to reduce repetitive code.

As of 868b93c, much of the parser is littered with manual look-ahead loops used to resolve contextual ambiguities:

// Check three tokens ahead to make sure that we're not dealing with a constructor initialization...
// ( 350.0f , <--- Could be the scenario
// Example : <Capture_Start> <Value> <Comma>
// idx +1 +2
bool detected_comma = _ctx->parser.Tokens.Arr[ _ctx->parser.Tokens.Idx + 2 ].Type == Tok_Comma;
b32 detected_non_varadic_unpaired_param = detected_comma && nexttok.Type != Tok_Varadic_Argument;
if (! detected_non_varadic_unpaired_param && nexttok.Type == Tok_Preprocess_Macro_Expr) for( s32 break_scope = 0; break_scope == 0; ++ break_scope)
{
Macro* macro = lookup_macro( nexttok.Text );
if (macro == nullptr || ! macro_is_functional(* macro))
break;
// ( <Macro_Expr> (
// Idx +1 +2
s32 idx = _ctx->parser.Tokens.Idx + 1;
s32 level = 0;
// Find end of the token expression
for ( ; idx < array_num(_ctx->parser.Tokens.Arr); idx++ )
{
Token tok = _ctx->parser.Tokens.Arr[ idx ];
if ( tok.Type == Tok_Capture_Start )
level++;
else if ( tok.Type == Tok_Capture_End && level > 0 )
level--;
if (level == 0 && tok.Type == Tok_Capture_End)
break;
}
++ idx; // Will increment to the possible comma position
if ( _ctx->parser.Tokens.Arr[ idx ].Type != Tok_Comma )
break;
detected_non_varadic_unpaired_param = true;
}

For example, the above uses raw iteration through the lexed tokens to resolve whether the macro's argument is followed by a comma.

We need to set up a slice type for a set of look-ahead tokens that will behave as a sub-slice of the full lexed slice:

struct LexSlice
{
	Token* Ptr;
	s32    Len;
	s32    Idx;
};

Like with the regular tokens array, it needs a simple interface for navigation (we could probably just recycle the current TokArray interface and change it to take a slice instead).
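A minimal sketch of what that navigation interface could look like, mirroring the shape of the existing TokArray helpers. The `slice_*` names, the cut-down `Token`/`TokType` definitions, and the helper signatures are all assumptions for illustration, not the repo's actual API:

```cpp
#include <cstdint>

typedef int32_t s32;

// Hypothetical minimal token model; names mirror the parser's Tok_* constants.
enum TokType { Tok_Invalid, Tok_Capture_Start, Tok_Capture_End, Tok_Comma, Tok_Identifier };
struct Token { TokType Type; };

// Proposed sub-slice over the lexed token array.
struct LexSlice
{
	Token* Ptr;
	s32    Len;
	s32    Idx;
};

// Navigation helpers (assumed shape, recycling the TokArray-style interface).
inline bool  slice_left   ( LexSlice const& s )            { return s.Idx < s.Len; }
inline Token slice_current( LexSlice const& s )            { return s.Ptr[ s.Idx ]; }
inline Token slice_peek   ( LexSlice const& s, s32 ahead ) { return s.Ptr[ s.Idx + ahead ]; }
inline void  slice_advance( LexSlice& s )                  { ++ s.Idx; }

// Derive a fresh sub-slice starting at the parent's cursor,
// so look-ahead can run without disturbing the parent's position.
inline LexSlice slice_from_cursor( LexSlice const& parent )
{
	LexSlice sub = { parent.Ptr + parent.Idx, parent.Len - parent.Idx, 0 };
	return sub;
}
```

With this, the comma check from the example above becomes a `slice_peek` over a sub-slice instead of raw indexing into `_ctx->parser.Tokens.Arr`.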

This sort of iteration is used throughout the parser to aggregate tokens it cannot otherwise parse:

eat( Tok_Capture_Start );

s32 level = 0;
while ( left && ( currtok.Type != Tok_Capture_End || level > 0 ) )
{
	if ( currtok.Type == Tok_Capture_Start )
		level++;
	else if ( currtok.Type == Tok_Capture_End && level > 0 )
		level--;

	eat( currtok.Type );
}
eat( Tok_Capture_End );

It can be generalized for both consumption and look-ahead.
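One possible generalization, sketched against the hypothetical `LexSlice` above: a single balanced-capture scan where a `consume` flag decides whether the cursor moves (consumption) or stays put (look-ahead). The function name and the cut-down `Token`/`TokType` definitions are assumptions for illustration:

```cpp
#include <cstdint>

typedef int32_t s32;

// Hypothetical minimal token model; names mirror the parser's Tok_* constants.
enum TokType { Tok_Invalid, Tok_Capture_Start, Tok_Capture_End, Tok_Comma, Tok_Identifier };
struct Token { TokType Type; };

struct LexSlice { Token* Ptr; s32 Len; s32 Idx; };

// Scan from the slice cursor (assumed to sit on Tok_Capture_Start) to the
// index just past the matching Tok_Capture_End, tracking nesting depth.
// When consume is true the cursor moves; otherwise the slice is untouched,
// so the same routine serves both consumption and look-ahead.
s32 scan_past_capture( LexSlice& s, bool consume )
{
	s32 idx   = s.Idx;
	s32 level = 0;
	for ( ; idx < s.Len; idx++ )
	{
		TokType type = s.Ptr[ idx ].Type;
		if ( type == Tok_Capture_Start )
			level++;
		else if ( type == Tok_Capture_End && -- level == 0 )
		{
			++ idx; // Step past the closing capture.
			break;
		}
	}
	if ( consume )
		s.Idx = idx;
	return idx;
}
```

The macro look-ahead in the first example and the aggregation loop in the second would then both reduce to calls of this one routine, with `consume` set false and true respectively.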
