WIP: Improvements to parser, updated docs

Trying to get support for typename keyword soon
Edward R. Gonzalez 2023-11-21 21:27:33 -05:00
parent 772db608be
commit f67f9547df
7 changed files with 500 additions and 271 deletions

View File

@ -1,11 +1,3 @@
# Forward
I was never satisfied with how I wrapped the management of the AST.
For C++, the current design may be as good as it gets given the limitations of the language.
In this issue I'll at least try to brainstorm something simpler without losing ergonomics.
This will also be a good place to document the current design.
## Current Design
`AST` is the actual managed node object for the library.
@ -22,10 +14,8 @@ The simpliest being just a type alias.
using Code = AST*;
```
This is what the genc library would have to use due to the constraints of its language.
Anything else would be an unergonomic mess of struct wrapping, with a pile of macros & procedures to interface with it.
Further, to provide intuitive filters on the AST, there are AST types (covered in [AST_Types.md](AST_Types.md)).
This is what the genc library would have to use due to the constraints of its language.
The actual content per type of AST is covered within [AST_Types.md](AST_Types.md).
These are pure PODs that just have the members relevant to the type of AST node they represent.
Each of them has a Code type alias specific to it.
@ -35,3 +25,20 @@ Again, the simpliest case for these would be a type alias.
```cpp
using CodeTypedef = struct AST_Typedef*;
```
As of November 21st, 2023, the AST has had a strict layout for how its content is laid out.
This will be abandoned during its redesign, which will start with support for statements & expressions for both execution and type declarations.
Having a strict layout is too restrictive compared to allowing each AST type to have maximum control over its layout.
The redesign will occur after the following todos are addressed:
* [Improvements Lexer & Token struct#27](https://github.com/Ed94/gencpp/issues/27)
* [Generalize AST Flags to a single 4-byte flag#42](https://github.com/Ed94/gencpp/issues/42)
* [AST-Code Object Redesign.#38](https://github.com/Ed94/gencpp/issues/38)
* [Code-AST Documentation#40](https://github.com/Ed94/gencpp/issues/40)
* [AST::debug_str() improvements#33](https://github.com/Ed94/gencpp/issues/33)
* [AST::is_equal implemented and works with singleheader-test#31](https://github.com/Ed94/gencpp/issues/31)
* [Parser : Add ability to have a parse failure and continue with errors recorded.#35](https://github.com/Ed94/gencpp/issues/35)
* [Scanner : Add CodeFile#29](https://github.com/Ed94/gencpp/issues/29)
* [Auxiliary : AST visual debugger#36](https://github.com/Ed94/gencpp/issues/36)

View File

@ -21,6 +21,7 @@ Fields:
```cpp
Code Front;
Code Back;
Token* Token;
Code Parent;
StringCached Name;
CodeT Type;
@ -582,20 +583,21 @@ Serialization:
## Typedef
Behaves as usual, except for function or macro typedefs.
Those don't use the underlying type field as everything was serialized under the Name field.
Those (macros) don't use the underlying type field as everything was serialized under the Name field.
Fields:
```cpp
CodeComment InlineCmt;
Code UnderlyingType;
Code Prev;
Code Next;
Code Parent;
StringCached Name;
CodeT Type;
ModuleFlag ModuleFlags;
b32 IsFunction;
CodeComment InlineCmt;
Code UnderlyingType;
Code Prev;
Code Next;
parser::Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
ModuleFlag ModuleFlags;
b32 IsFunction;
```
Serialization:
@ -617,6 +619,7 @@ CodeAttributes Attributes;
CodeBody Body;
Code Prev;
Code Next;
parser::Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
@ -642,6 +645,7 @@ CodeAttributes Attributes;
CodeType UnderlyingType;
Code Prev;
Code Next;
parser::Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
@ -660,6 +664,8 @@ Serialization:
## Variable
[Algo](./Parser_Algo.md)
Fields:
```cpp
@ -669,8 +675,10 @@ CodeSpecifiers Specs;
CodeType ValueType;
Code BitfieldSize;
Code Value;
CodeVar NextVar;
Code Prev;
Code Next;
parser::Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
@ -681,8 +689,8 @@ Serialization:
```cpp
// Regular
<ModuleFlags> <Attributes> <Specs> <ValueType> <Name> = <Value>; <InlineCmt>
<ModuleFlags> <Attributes> <Specs> <ValueType> <Name> = <Value>, NextVar ...; <InlineCmt>
// Bitfield
<ModuleFlags> <Attributes> <Specs> <ValueType> <Name> : <BitfieldSize> = <Value>; <InlineCmt>
<ModuleFlags> <Attributes> <Specs> <ValueType> <Name> : <BitfieldSize> = <Value>, NextVar ...; <InlineCmt>
```
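The `NextVar` chain in the serialization format above can be pictured with a small sketch. This is not the library's serializer: `Var` and `serialize_var` are hypothetical stand-ins that only model `ValueType`, `Name`, `Value`, and `NextVar`, and they ignore module flags, attributes, specifiers, bitfields, and inline comments.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for a variable node (hypothetical, not gencpp's AST_Var).
struct Var
{
	std::string ValueType;  // Only read from the first variable in the chain
	std::string Name;
	std::string Value;      // Empty when there is no initializer
	Var const*  NextVar;    // Next comma-separated variable, or nullptr
};

// Emits: <ValueType> <Name> = <Value>, <NextVar Name> = <Value> ...;
inline std::string serialize_var( Var const& first )
{
	assert( ! first.Name.empty() );

	std::string result = first.ValueType + " ";
	for ( Var const* var = & first; var != nullptr; var = var->NextVar )
	{
		if ( var != & first )
			result += ", ";

		result += var->Name;
		if ( ! var->Value.empty() )
			result += " = " + var->Value;
	}
	result += ";";
	return result;
}
```

Only the head of the chain contributes its type here, which is one way to read `<ValueType> <Name> = <Value>, NextVar ...;`.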

View File

@ -1,15 +1,16 @@
# Parser's Algorithm
gencpp uses a hand-written recursive descent parser. Both the lexer and parser handle a full C/C++ file in a single pass.
gencpp uses a hand-written recursive descent parser. Both the lexer and parser currently handle a full C/C++ file in a single pass.
## Notable implementation background
### Lexer
The lex procedure does the lexical pass of content provided as a `StrC` type.
The tokens are stored (for now) in `gen::Parser::Tokens`.
The tokens are stored (for now) in `gen::parser::Tokens`.
Fields:
```cpp
Array<Token> Arr;
s32 Idx;
@ -18,23 +19,34 @@ s32 Idx;
The supported token types can be found in [ETokType.csv](../project/enums/ETokType.csv). They can also be found in [ETokType.h](../project/components/gen/etoktype.cpp), which is the enum generated from the csv file.
Tokens are defined with the struct `gen::Parser::Token`:
Tokens are defined with the struct `gen::parser::Token`:
Fields:
```cpp
char const* Text;
sptr Length;
TokType Type;
s32 Line;
s32 Column;
bool IsAssign;
u32 Flags;
```
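Since a token only stores a pointer into the source text plus a length, the text covered by a run of tokens can be recovered from the first and last token alone. The parser does this kind of pointer arithmetic when it gathers attribute or array expression content. Below is a minimal sketch assuming C++17; `Token` is reduced to two fields, and `span_of` and `make_token` are hypothetical helpers, not part of gencpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Simplified stand-in for gen::parser::Token, only the fields needed here.
struct Token
{
	char const*    Text;
	std::ptrdiff_t Length;
};

// Hypothetical helper: view of the source from the start of `first` to the end of `last`.
// Both tokens must point into the same buffer, with `first` not starting after the end of `last`.
inline std::string_view span_of( Token first, Token last )
{
	char const* end = last.Text + last.Length;
	assert( first.Text <= end );
	return std::string_view( first.Text, static_cast<std::size_t>( end - first.Text ) );
}

// Hypothetical helper for building tokens by hand in this example.
inline Token make_token( std::string_view source, std::size_t offset, std::size_t length )
{
	assert( offset + length <= source.size() );
	return Token { source.data() + offset, static_cast<std::ptrdiff_t>( length ) };
}
```

Because nothing is copied, the resulting view is only valid while the lexed source buffer is alive.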
`IsAssign` is a flag that is set when the token is an assignment operator, which is used for various purposes:
Flags is a bitfield made up of TokFlags (Token Flags):
* Using statement assignment
* Parameter argument default value assignment
* Variable declaration initialization assignment
* `TF_Operator` : Any operator token used in expressions
* `TF_Assign`
    * Using statement assignment
    * Parameter argument default value assignment
    * Variable declaration initialization assignment
* `TF_Preprocess` : Related to a preprocessing directive
* `TF_Preprocess_Cond` : A preprocess conditional
* `TF_Attribute` : An attribute token
* `TF_AccessSpecifier` : An accessor operation token
* `TF_Specifier` : One of the specifier tokens
* `TF_EndDefinition` : Can be interpreted as an end definition for a scope.
* `TF_Formatting` : Considered a part of the formatting
* `TF_Literal` : Anything considered a literal by C++.
I plan to replace IsAssign with a general flags field and properly keep track of all operator types instead of abstracting it away to `ETokType::Operator`.
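A flags field like this is usually queried with a mask test, which is what the parser's `bitfield_is_equal( u32, currtok.Flags, TF_Assign )` check amounts to. The sketch below is illustrative only: the bit positions, the reduced set of flags, and the `has_flags` / `operator_flags` helpers are assumptions, not the library's definitions.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

using u32 = std::uint32_t;

// Hypothetical bit assignments, the actual values in gencpp may differ.
enum TokFlags : u32
{
	TF_Operator = 1u << 0,
	TF_Assign   = 1u << 1,
	TF_Literal  = 1u << 2,
};

// True only when every bit in `mask` is set in `flags`.
constexpr bool has_flags( u32 flags, u32 mask )
{
	return ( flags & mask ) == mask;
}

// Hypothetical classifier covering a few operator spellings.
inline u32 operator_flags( std::string_view op )
{
	assert( ! op.empty() );

	u32 flags = TF_Operator;
	if ( op == "=" || op == "+=" || op == "-=" )
		flags |= TF_Assign;
	return flags;
}
```

Compared to a lone `IsAssign` bool, one `u32` can carry several independent facts about a token, so an assignment operator can be both `TF_Operator` and `TF_Assign` at once.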
@ -58,7 +70,7 @@ The parser has a limited user interface, only specific types of definitions or s
Each public user interface procedure has the following format:
```cpp
CodeStruct parse_<definition type>( StrC def )
<code type> parse_<definition type>( StrC def )
{
check_parse_args( def );
using namespace Parser;
@ -76,14 +88,168 @@ The most top-level parsing procedure used for C/C++ file parsing is `parse_globa
It uses a helper procedure called `parse_global_nspace`.
Each internal procedure will be
Each internal procedure will have the following format:
## parse_global_nspace
```cpp
internal
<code type> parse_<definition_type>( <empty or contextual params> )
{
push_scope();
...
<code type> result = (<code type>) make_code();
...
Context.pop();
return result;
}
```
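The push and pop pairing in that format keeps a chain of the procedures currently being parsed, which is what makes messages like `Context.to_string()` useful on failure. Here is a reduced sketch of the idea. `ParseContext`, this two-field `StackNode`, and the `parse_inner` / `parse_outer` procedures are hypothetical simplifications, and the real context tracks more state (such as tokens).

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the parser's StackNode.
struct StackNode
{
	StackNode*  Prev;
	char const* ProcName;
};

struct ParseContext
{
	StackNode* Scope = nullptr;

	void push( StackNode* node )
	{
		node->Prev = Scope;
		Scope      = node;
	}

	void pop()
	{
		assert( Scope != nullptr );
		Scope = Scope->Prev;
	}

	// Outermost procedure first, for example "parse_outer > parse_inner".
	std::string to_string() const
	{
		std::string result;
		for ( StackNode* node = Scope; node != nullptr; node = node->Prev )
		{
			result = result.empty()
				? std::string( node->ProcName )
				: std::string( node->ProcName ) + " > " + result;
		}
		return result;
	}
};

// Hypothetical parse procedures that follow the push / work / pop shape.
inline std::string parse_inner( ParseContext& ctx )
{
	StackNode scope { nullptr, __func__ };
	ctx.push( & scope );
	std::string trace = ctx.to_string();
	ctx.pop();
	return trace;
}

inline std::string parse_outer( ParseContext& ctx )
{
	StackNode scope { nullptr, __func__ };
	ctx.push( & scope );
	std::string trace = parse_inner( ctx );
	ctx.pop();
	return trace;
}
```

The nodes live on each procedure's own stack frame, so every early return has to pop before leaving, which is why the parser source repeats `Context.pop()` ahead of each `return`.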
Below is an outline of the general algorithm used for these internal procedures. The intention is to provide a basic briefing to aid the user in traversing the actual code definitions. These appear in the same order as they are in the `parser.cpp` file.
## `parse_array_decl`
1. Check if it's an array declaration with no expression.
    1. Consume and return empty array declaration
2. Opening square bracket
3. Consume expression
4. Closing square bracket
5. If adjacent opening bracket
    1. Repeat array declaration parse until no brackets remain
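The steps above can be sketched on plain text instead of tokens. This is a rough illustration assuming C++17, not the parser's implementation: `parse_array_dims` is a hypothetical helper, it loops where the parser recurses, and it does not handle nested brackets inside an expression.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: reads consecutive array declarators starting at `pos`.
// Returns one entry per dimension (empty for "[]") and stops at the first
// character that is not '['. Sets `ok` to false when a ']' is missing.
inline std::vector<std::string> parse_array_dims( std::string_view text, std::size_t& pos, bool& ok )
{
	assert( pos <= text.size() );

	std::vector<std::string> dims;
	ok = true;
	while ( pos < text.size() && text[ pos ] == '[' )
	{
		++pos;                                      // Opening square bracket
		std::size_t close = text.find( ']', pos );  // Consume the expression
		if ( close == std::string_view::npos )
		{
			ok = false;
			return dims;
		}
		dims.emplace_back( text.substr( pos, close - pos ) );
		pos = close + 1;                            // Closing square bracket
	}
	return dims;
}
```

Each adjacent `[ ... ]` becomes one more entry, which mirrors how the parser links a following array expression through `Next`.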
## `parse_attributes`
1. Check for standard attribute
2. Check for GNU attribute
3. Check for MSVC attribute
4. Check for a token registered as an attribute
## `parse_class_struct`
## `parse_class_struct_body`
## `parse_comment`
## `parse_complicated_definition`
## `parse_define`
## `parse_forward_or_definition`
## `parse_function_after_name`
## `parse_function_body`
## `parse_global_nspace`
1. Make sure the type provided to the helper function is a `Namespace_Body`, `Global_Body`, `Export_Body`, or `Extern_Linkage_Body`.
2. If it's not a `Global_Body`, eat the opening brace for the scope.
3.
3. `
## `parse_identifier`
## `parse_type`
## `parse_include`
## `parse_operator_after_ret_type`
## `parse_operator_function_or_variable`
## `parse_pragma`
## `parse_params`
## `parse_preprocess_cond`
## `parse_simple_preprocess`
## `parse_static_assert`
## `parse_template_args`
## `parse_variable_after_name`
## `parse_variable_declaration_list`
## `parse_class`
## `parse_constructor`
## `parse_destructor`
## `parse_enum`
## `parse_export_body`
## `parse_extern_link_body`
## `parse_extern_link`
## `parse_friend`
## `parse_function`
## `parse_namespace`
## `parse_operator`
## `parse_operator_cast`
## `parse_struct`
## `parse_template`
## `parse_type`
## `parse_typedef`
1. Check for export module specifier
2. typedef keyword
3. If it's a preprocess macro: Get the macro name
4.
## `parse_union`
1. Check for export module specifier
2. union keyword
3. `parse_attributes`
4. Check for identifier
5. Parse the body (Possible options):
    1. Newline
    2. Comment
    3. Decl_Class
    4. Decl_Enum
    5. Decl_Struct
    6. Decl_Union
    7. Preprocess_Define
    8. Preprocess_Conditional
    9. Preprocess_Macro
    10. Preprocess_Pragma
    11. Unsupported preprocess directive
    12. Variable
6. If it's not an inplace definition: End Statement
## `parse_using`
1. Check for export module specifier
2. using keyword
3. Check to see if it's a using namespace
4. Get the identifier
5. If it's a regular using declaration:
    1. `parse_attributes`
    2. `parse_type`
    3. `parse_array_decl`
6. End statement
7. Check for inline comment
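As a rough illustration of the branching in the steps above, here is a sketch over whitespace-separated text. It is an assumption-heavy simplification: `UsingDecl` and `parse_using_sketch` are hypothetical names, the input must have `=` and `;` as separate words, and module flags, attributes, array declarations, and inline comments are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct UsingDecl
{
	bool        IsNamespace;
	std::string Name;
	std::string UnderlyingType;  // Empty unless this is an alias
	bool        Valid;
};

// Hypothetical sketch: expects whitespace-separated words, e.g. "using A = int ;"
inline UsingDecl parse_using_sketch( std::string const& text )
{
	std::istringstream       stream( text );
	std::vector<std::string> toks;
	for ( std::string tok; stream >> tok; )
		toks.push_back( tok );

	UsingDecl   result { false, "", "", false };
	std::size_t idx  = 0;
	auto        peek = [&]( char const* expected ) { return idx < toks.size() && toks[ idx ] == expected; };

	if ( ! peek( "using" ) )         // using keyword
		return result;
	++idx;

	if ( peek( "namespace" ) )       // Is it a using namespace?
	{
		result.IsNamespace = true;
		++idx;
	}

	if ( idx >= toks.size() )
		return result;
	result.Name = toks[ idx++ ];     // Get the identifier

	if ( ! result.IsNamespace && peek( "=" ) )
	{
		++idx;                       // Regular using declaration: gather the type
		while ( idx < toks.size() && toks[ idx ] != ";" )
		{
			if ( ! result.UnderlyingType.empty() )
				result.UnderlyingType += " ";
			result.UnderlyingType += toks[ idx++ ];
		}
	}

	assert( idx <= toks.size() );
	result.Valid = peek( ";" ) && idx + 1 == toks.size();  // End statement
	return result;
}
```

The namespace check comes before the identifier so that the alias-only work (type, array expression) can be skipped entirely for `using namespace`.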
## `parse_variable`
1. Check for export module specifier
2. `parse_attributes`
3. Parse specifiers
4. `parse_type`
5. `parse_identifier`
6. `parse_variable_after_name`

View File

@ -65,7 +65,7 @@ As mentioned in root readme, the user is provided Code objects by calling the co
The AST is managed by the library and provided to the user via its interface.
However, the user may specifiy memory configuration.
Data layout of AST struct:
Data layout of AST struct (subject to heavy change with the upcoming redesign):
```cpp
union {

View File

@ -34,6 +34,7 @@ namespace ESpecifier
NoExceptions,
Override,
Pure,
Volatile,
NumSpecifiers
};
@ -64,13 +65,13 @@ namespace ESpecifier
{ sizeof( "&&" ), "&&" },
{ sizeof( "static" ), "static" },
{ sizeof( "thread_local" ), "thread_local" },
{ sizeof( "volatile" ), "volatile" },
{ sizeof( "virtual" ), "virtual" },
{ sizeof( "const" ), "const" },
{ sizeof( "final" ), "final" },
{ sizeof( "noexcept" ), "noexcept" },
{ sizeof( "override" ), "override" },
{ sizeof( "= 0" ), "= 0" },
{ sizeof( "volatile" ), "volatile" },
};
return lookup[ type ];
}

View File

@ -92,6 +92,7 @@ namespace parser
Statement_End,
StaticAssert,
String,
Type_Typename,
Type_Unsigned,
Type_Signed,
Type_Short,
@ -194,6 +195,7 @@ namespace parser
{ sizeof( ";" ), ";" },
{ sizeof( "static_assert" ), "static_assert" },
{ sizeof( "__string__" ), "__string__" },
{ sizeof( "typename" ), "typename" },
{ sizeof( "unsigned" ), "unsigned" },
{ sizeof( "signed" ), "signed" },
{ sizeof( "short" ), "short" },

View File

@ -133,6 +133,9 @@ void init()
Tokens = Array<Token>::init_reserve( LexArena
, ( LexAllocator_Size - sizeof( Array<Token>::Header ) ) / sizeof(Token)
);
defines_map_arena = Arena_64KB::init();
defines = HashTable<StrC>::init( defines_map_arena );
}
internal
@ -178,7 +181,7 @@ if ( def.Ptr == nullptr ) \
internal Code parse_array_decl ();
internal CodeAttributes parse_attributes ();
internal CodeComment parse_comment ();
internal Code parse_compilcated_definition ();
internal Code parse_complicated_definition ( TokType which );
internal CodeBody parse_class_struct_body ( TokType which, Token name = NullToken );
internal Code parse_class_struct ( TokType which, bool inplace_def );
internal CodeDefine parse_define ();
@ -476,6 +479,7 @@ Code parse_array_decl()
{
Code array_expr = untyped_str( currtok );
eat( TokType::Operator );
// []
Context.pop();
return array_expr;
@ -484,6 +488,7 @@ Code parse_array_decl()
if ( check( TokType::BraceSquare_Open ) )
{
eat( TokType::BraceSquare_Open );
// [
if ( left == 0 )
{
@ -509,6 +514,7 @@ Code parse_array_decl()
untyped_tok.Length = ( (sptr)prevtok.Text + prevtok.Length ) - (sptr)untyped_tok.Text;
Code array_expr = untyped_str( untyped_tok );
// [ <Content>
if ( left == 0 )
{
@ -525,11 +531,13 @@ Code parse_array_decl()
}
eat( TokType::BraceSquare_Close );
// [ <Content> ]
// It's a multi-dimensional array
if ( check( TokType::BraceSquare_Open ))
{
Code adjacent_arr_expr = parse_array_decl();
// [ <Content> ][ <Content> ]...
array_expr->Next = adjacent_arr_expr.ast;
}
@ -553,31 +561,38 @@ CodeAttributes parse_attributes()
if ( check(TokType::Attribute_Open) )
{
eat( TokType::Attribute_Open);
// [[
start = currtok;
while ( left && currtok.Type != TokType::Attribute_Close )
{
eat( currtok.Type );
}
// [[ <Content>
eat( TokType::Attribute_Close );
// [[ <Content> ]]
s32 len = ( (sptr)prevtok.Text + prevtok.Length ) - (sptr)start.Text;
}
else if ( check(TokType::Decl_GNU_Attribute) )
{
eat(TokType::Decl_GNU_Attribute);
eat(TokType::Capture_Start);
eat(TokType::Capture_Start);
// __attribute__((
start = currtok;
while ( left && currtok.Type != TokType::Capture_End )
{
eat(currtok.Type);
}
// __attribute__(( <Content>
eat(TokType::Capture_End);
eat(TokType::Capture_End);
// __attribute__(( <Content> ))
s32 len = ( (sptr)prevtok.Text + prevtok.Length ) - (sptr)start.Text;
}
@ -586,14 +601,17 @@ CodeAttributes parse_attributes()
{
eat( TokType::Decl_MSVC_Attribute );
eat( TokType::Capture_Start);
// __declspec(
start = currtok;
while ( left && currtok.Type != TokType::Capture_End )
{
eat(currtok.Type);
}
// __declspec( <Content>
eat(TokType::Capture_End);
// __declspec( <Content> )
s32 len = ( (sptr)prevtok.Text + prevtok.Length ) - (sptr)start.Text;
}
@ -602,6 +620,7 @@ CodeAttributes parse_attributes()
{
eat(currtok.Type);
s32 len = start.Length;
// <Attribute>
}
if ( len > 0 )
@ -626,117 +645,97 @@ CodeAttributes parse_attributes()
}
internal
CodeComment parse_comment()
Code parse_class_struct( TokType which, bool inplace_def = false )
{
StackNode scope { nullptr, currtok_noskip, NullToken, txt( __func__ ) };
Context.push( & scope );
CodeComment
result = (CodeComment) make_code();
result->Type = ECode::Comment;
result->Content = get_cached_string( currtok_noskip );
result->Name = result->Content;
// result->Token = currtok_noskip;
eat( TokType::Comment );
Context.pop();
return result;
}
internal
Code parse_complicated_definition( TokType which )
{
push_scope();
bool is_inplace = false;
TokArray tokens = Context.Tokens;
s32 idx = tokens.Idx;
s32 level = 0;
for ( ; idx < tokens.Arr.num(); idx ++ )
if ( which != TokType::Decl_Class && which != TokType::Decl_Struct )
{
if ( tokens[idx].Type == TokType::BraceCurly_Open )
level++;
if ( tokens[idx].Type == TokType::BraceCurly_Close )
level--;
if ( level == 0 && tokens[idx].Type == TokType::Statement_End )
break;
}
if ( (idx - 2 ) == tokens.Idx )
{
// Its a forward declaration only
Code result = parse_forward_or_definition( which, is_inplace );
Context.pop();
return result;
}
Token tok = tokens[ idx - 1 ];
if ( tok.Type == TokType::Identifier )
{
tok = tokens[ idx - 2 ];
bool is_indirection = tok.Type == TokType::Ampersand
|| tok.Type == TokType::Star;
bool ok_to_parse = false;
if ( tok.Type == TokType::BraceCurly_Close )
{
// Its an inplace definition
// <which> <type_identifier> { ... } <identifier>;
ok_to_parse = true;
is_inplace = true;
}
else if ( tok.Type == TokType::Identifier && tokens[ idx - 3 ].Type == TokType::Decl_Struct )
{
// Its a variable with type ID using struct namespace.
// <which> <type_identifier> <identifier>;
ok_to_parse = true;
}
else if ( is_indirection )
{
// Its a indirection type with type ID using struct namespace.
// <which> <type_identifier>* <identifier>;
ok_to_parse = true;
}
if ( ! ok_to_parse )
{
log_failure( "Unsupported or bad member definition after struct declaration\n%s", Context.to_string() );
Context.pop();
return CodeInvalid;
}
Code result = parse_operator_function_or_variable( false, { nullptr }, { nullptr } );
Context.pop();
return result;
}
else if ( tok.Type == TokType::BraceCurly_Close )
{
// Its a definition
// <which> { ... };
Code result = parse_forward_or_definition( which, is_inplace );
Context.pop();
return result;
}
else if ( tok.Type == TokType::BraceSquare_Close)
{
// Its an array definition
// <which> <type_identifier> <identifier> [ ... ];
Code result = parse_operator_function_or_variable( false, { nullptr }, { nullptr } );
Context.pop();
return result;
}
else
{
log_failure( "Unsupported or bad member definition after struct declaration\n%s", Context.to_string() );
Context.pop();
log_failure( "Error, expected class or struct, not %s\n%s", ETokType::to_str( which ), Context.to_string() );
return CodeInvalid;
}
Token name { nullptr, 0, TokType::Invalid };
AccessSpec access = AccessSpec::Default;
CodeType parent = { nullptr };
CodeBody body = { nullptr };
CodeAttributes attributes = { nullptr };
ModuleFlag mflags = ModuleFlag::None;
CodeClass result = CodeInvalid;
if ( check(TokType::Module_Export) )
{
mflags = ModuleFlag::Export;
eat( TokType::Module_Export );
}
eat( which );
attributes = parse_attributes();
if ( check( TokType::Identifier ) )
{
name = parse_identifier();
Context.Scope->Name = name;
}
local_persist
char interface_arr_mem[ kilobytes(4) ] {0};
Array<CodeType> interfaces = Array<CodeType>::init_reserve( Arena::init_from_memory(interface_arr_mem, kilobytes(4) ), 4 );
if ( check( TokType::Assign_Classifer ) )
{
eat( TokType::Assign_Classifer );
if ( currtok.is_access_specifier() )
{
access = currtok.to_access_specifier();
}
Token parent_tok = parse_identifier();
parent = def_type( parent_tok );
while ( check(TokType::Comma) )
{
eat(TokType::Comma);
if ( currtok.is_access_specifier() )
{
eat(currtok.Type);
}
Token interface_tok = parse_identifier();
interfaces.append( def_type( interface_tok ) );
}
}
if ( check( TokType::BraceCurly_Open ) )
{
body = parse_class_struct_body( which, name );
}
CodeComment inline_cmt = NoCode;
if ( ! inplace_def )
{
Token stmt_end = currtok;
eat( TokType::Statement_End );
if ( currtok_noskip.Type == TokType::Comment && currtok_noskip.Line == stmt_end.Line )
inline_cmt = parse_comment();
}
if ( which == TokType::Decl_Class )
result = def_class( name, body, parent, access, attributes, mflags );
else
result = def_struct( name, body, (CodeType)parent, access, attributes, mflags );
if ( inline_cmt )
result->InlineCmt = inline_cmt;
interfaces.free();
return result;
}
internal neverinline
@ -963,6 +962,7 @@ CodeBody parse_class_struct_body( TokType which, Token name )
case TokType::Type_Signed:
case TokType::Type_Short:
case TokType::Type_Long:
case TokType::Type_bool:
case TokType::Type_char:
case TokType::Type_int:
case TokType::Type_double:
@ -1009,97 +1009,117 @@ CodeBody parse_class_struct_body( TokType which, Token name )
}
internal
Code parse_class_struct( TokType which, bool inplace_def = false )
CodeComment parse_comment()
{
if ( which != TokType::Decl_Class && which != TokType::Decl_Struct )
StackNode scope { nullptr, currtok_noskip, NullToken, txt( __func__ ) };
Context.push( & scope );
CodeComment
result = (CodeComment) make_code();
result->Type = ECode::Comment;
result->Content = get_cached_string( currtok_noskip );
result->Name = result->Content;
// result->Token = currtok_noskip;
eat( TokType::Comment );
Context.pop();
return result;
}
internal
Code parse_complicated_definition( TokType which )
{
push_scope();
bool is_inplace = false;
TokArray tokens = Context.Tokens;
s32 idx = tokens.Idx;
s32 level = 0;
for ( ; idx < tokens.Arr.num(); idx ++ )
{
log_failure( "Error, expected class or struct, not %s\n%s", ETokType::to_str( which ), Context.to_string() );
if ( tokens[idx].Type == TokType::BraceCurly_Open )
level++;
if ( tokens[idx].Type == TokType::BraceCurly_Close )
level--;
if ( level == 0 && tokens[idx].Type == TokType::Statement_End )
break;
}
if ( (idx - 2 ) == tokens.Idx )
{
// Its a forward declaration only
Code result = parse_forward_or_definition( which, is_inplace );
Context.pop();
return result;
}
Token tok = tokens[ idx - 1 ];
if ( tok.Type == TokType::Identifier )
{
tok = tokens[ idx - 2 ];
bool is_indirection = tok.Type == TokType::Ampersand
|| tok.Type == TokType::Star;
bool ok_to_parse = false;
if ( tok.Type == TokType::BraceCurly_Close )
{
// Its an inplace definition
// <which> <type_identifier> { ... } <identifier>;
ok_to_parse = true;
is_inplace = true;
}
else if ( tok.Type == TokType::Identifier && tokens[ idx - 3 ].Type == TokType::Decl_Struct )
{
// Its a variable with type ID using struct namespace.
// <which> <type_identifier> <identifier>;
ok_to_parse = true;
}
else if ( is_indirection )
{
// Its a indirection type with type ID using struct namespace.
// <which> <type_identifier>* <identifier>;
ok_to_parse = true;
}
if ( ! ok_to_parse )
{
log_failure( "Unsupported or bad member definition after struct declaration\n%s", Context.to_string() );
Context.pop();
return CodeInvalid;
}
Code result = parse_operator_function_or_variable( false, { nullptr }, { nullptr } );
Context.pop();
return result;
}
else if ( tok.Type == TokType::BraceCurly_Close )
{
// Its a definition
// <which> { ... };
Code result = parse_forward_or_definition( which, is_inplace );
Context.pop();
return result;
}
else if ( tok.Type == TokType::BraceSquare_Close)
{
// Its an array definition
// <which> <type_identifier> <identifier> [ ... ];
Code result = parse_operator_function_or_variable( false, { nullptr }, { nullptr } );
Context.pop();
return result;
}
else
{
log_failure( "Unsupported or bad member definition after struct declaration\n%s", Context.to_string() );
Context.pop();
return CodeInvalid;
}
Token name { nullptr, 0, TokType::Invalid };
AccessSpec access = AccessSpec::Default;
CodeType parent = { nullptr };
CodeBody body = { nullptr };
CodeAttributes attributes = { nullptr };
ModuleFlag mflags = ModuleFlag::None;
CodeClass result = CodeInvalid;
if ( check(TokType::Module_Export) )
{
mflags = ModuleFlag::Export;
eat( TokType::Module_Export );
}
eat( which );
attributes = parse_attributes();
if ( check( TokType::Identifier ) )
{
name = parse_identifier();
Context.Scope->Name = name;
}
local_persist
char interface_arr_mem[ kilobytes(4) ] {0};
Array<CodeType> interfaces = Array<CodeType>::init_reserve( Arena::init_from_memory(interface_arr_mem, kilobytes(4) ), 4 );
if ( check( TokType::Assign_Classifer ) )
{
eat( TokType::Assign_Classifer );
if ( currtok.is_access_specifier() )
{
access = currtok.to_access_specifier();
}
Token parent_tok = parse_identifier();
parent = def_type( parent_tok );
while ( check(TokType::Comma) )
{
eat(TokType::Comma);
if ( currtok.is_access_specifier() )
{
eat(currtok.Type);
}
Token interface_tok = parse_identifier();
interfaces.append( def_type( interface_tok ) );
}
}
if ( check( TokType::BraceCurly_Open ) )
{
body = parse_class_struct_body( which, name );
}
CodeComment inline_cmt = NoCode;
if ( ! inplace_def )
{
Token stmt_end = currtok;
eat( TokType::Statement_End );
if ( currtok_noskip.Type == TokType::Comment && currtok_noskip.Line == stmt_end.Line )
inline_cmt = parse_comment();
}
if ( which == TokType::Decl_Class )
result = def_class( name, body, parent, access, attributes, mflags );
else
result = def_struct( name, body, (CodeType)parent, access, attributes, mflags );
if ( inline_cmt )
result->InlineCmt = inline_cmt;
interfaces.free();
return result;
}
internal inline
@ -1530,6 +1550,7 @@ CodeBody parse_global_nspace( CodeT which )
case TokType::Type_Short:
case TokType::Type_Signed:
case TokType::Type_Unsigned:
case TokType::Type_bool:
case TokType::Type_char:
case TokType::Type_double:
case TokType::Type_int:
@ -1583,6 +1604,10 @@ CodeBody parse_global_nspace( CodeT which )
return result;
}
// TODO(Ed): I want to eventually change the identifier to its own AST type.
// This would allow distinction of the qualifier for a symbol <qualifier>::<nested symbol>
// This would also allow
internal
Token parse_identifier( bool* possible_member_function )
{
@ -2620,7 +2645,6 @@ internal CodeVar parse_variable_declaration_list()
CodeVar var = parse_variable_after_name( ModuleFlag::None, NoCode, specifiers, NoCode, name );
// TODO(Ed) : CodeVar is going to need a procedure to append comma-defined vars to itself.
if ( ! result )
{
result.ast = var.ast;
@ -2641,7 +2665,6 @@ internal CodeVar parse_variable_declaration_list()
internal
CodeClass parse_class( bool inplace_def )
{
using namespace Parser;
push_scope();
CodeClass result = (CodeClass) parse_class_struct( TokType::Decl_Class, inplace_def );
Context.pop();
@ -2651,7 +2674,6 @@ CodeClass parse_class( bool inplace_def )
internal
CodeConstructor parse_constructor()
{
using namespace Parser;
push_scope();
Token identifier = parse_identifier();
@ -2721,7 +2743,6 @@ CodeConstructor parse_constructor()
internal
CodeDestructor parse_destructor( CodeSpecifiers specifiers )
{
using namespace Parser;
push_scope();
if ( check( TokType::Spec_Virtual ) )
@ -2800,7 +2821,6 @@ CodeDestructor parse_destructor( CodeSpecifiers specifiers )
internal
CodeEnum parse_enum( bool inplace_def )
{
using namespace Parser;
using namespace ECode;
push_scope();
@ -2991,30 +3011,15 @@ CodeEnum parse_enum( bool inplace_def )
internal inline
CodeBody parse_export_body()
{
using namespace Parser;
push_scope();
CodeBody result = parse_global_nspace( ECode::Export_Body );
Context.pop();
return result;
}
CodeBody parse_export_body( StrC def )
{
check_parse_args( def );
using namespace Parser;
TokArray toks = lex( def );
if ( toks.Arr == nullptr )
return CodeInvalid;
Context.Tokens = toks;
return parse_export_body();
}
internal inline
CodeBody parse_extern_link_body()
{
using namespace Parser;
push_scope();
CodeBody result = parse_global_nspace( ECode::Extern_Linkage_Body );
Context.pop();
@ -3024,7 +3029,6 @@ CodeBody parse_extern_link_body()
internal
CodeExtern parse_extern_link()
{
using namespace Parser;
push_scope();
eat( TokType::Decl_Extern_Linkage );
@ -3057,7 +3061,6 @@ CodeExtern parse_extern_link()
internal
CodeFriend parse_friend()
{
using namespace Parser;
using namespace ECode;
push_scope();
@ -3119,7 +3122,6 @@ CodeFriend parse_friend()
internal
CodeFn parse_function()
{
using namespace Parser;
push_scope();
SpecifierT specs_found[16] { ESpecifier::NumSpecifiers };
@ -3196,7 +3198,6 @@ CodeFn parse_function()
internal
CodeNS parse_namespace()
{
using namespace Parser;
push_scope();
eat( TokType::Decl_Namespace );
@ -3287,7 +3288,6 @@ CodeOperator parse_operator()
internal
CodeOpCast parse_operator_cast( CodeSpecifiers specifiers )
{
using namespace parser;
push_scope();
// TODO : Specifiers attributed to the cast
@ -3540,9 +3540,11 @@ CodeType parse_type( bool* typedef_is_function )
Token name = { nullptr, 0, TokType::Invalid };
// Attributes are assumed to be before the type signature
// <attributes> ...
CodeAttributes attributes = parse_attributes();
// Prefix specifiers
// <attributes> <specifiers> ...
while ( left && currtok.is_specifier() )
{
SpecifierT spec = ESpecifier::to_type( currtok );
@ -3566,7 +3568,9 @@ CodeType parse_type( bool* typedef_is_function )
return CodeInvalid;
}
// All kinds of nonsense can make up a type signature, first we check for an in-place definition of a class, enum, or struct
/* All kinds of nonsense can make up a type signature, first we check for an in-place definition of a class, enum, or struct
<attributes> <specifiers> <class, enum, struct, union> ...
*/
if ( currtok.Type == TokType::Decl_Class
|| currtok.Type == TokType::Decl_Enum
|| currtok.Type == TokType::Decl_Struct
@ -3580,6 +3584,7 @@ CodeType parse_type( bool* typedef_is_function )
Context.Scope->Name = name;
}
// Decltype draft implementation
#if 0
else if ( currtok.Type == TokType::DeclType )
{
@ -3606,6 +3611,7 @@ CodeType parse_type( bool* typedef_is_function )
#endif
// Check if native type keywords are used, eat them for the signature.
// <attributes> <specifiers> <native types ...> ...
else if ( currtok.Type >= TokType::Type_Unsigned && currtok.Type <= TokType::Type_MS_W64 )
{
// TODO(Ed) : Review this... It's necessary for parsing, however the algo's path to this is lost...
@ -3621,6 +3627,7 @@ CodeType parse_type( bool* typedef_is_function )
}
// The usual Identifier type signature that may have namespace qualifiers
// <attributes> <specifiers> <identifier> ...
else
{
name = parse_identifier();
@ -3937,8 +3944,10 @@ CodeTypedef parse_typedef()
mflags = ModuleFlag::Export;
eat( TokType::Module_Export );
}
// <ModuleFlags>
eat( TokType::Decl_Typedef );
// <ModuleFlags> typedef
constexpr bool from_typedef = true;
@ -3952,6 +3961,7 @@ CodeTypedef parse_typedef()
name = currtok;
Context.Scope->Name = name;
eat( TokType::Preprocess_Macro );
// <ModuleFlags> typedef <Preprocessed_Macro>
}
else
{
@ -3961,7 +3971,7 @@ CodeTypedef parse_typedef()
|| currtok.Type == TokType::Decl_Struct
|| currtok.Type == TokType::Decl_Union;
// This code is highly correlated with parse_compilcated_definition
// This code is highly correlated with parse_complicated_definition
if ( is_complicated )
{
TokArray tokens = Context.Tokens;
@ -3984,6 +3994,7 @@ CodeTypedef parse_typedef()
{
// Its a forward declaration only
type = parse_forward_or_definition( currtok.Type, from_typedef );
// <ModuleFlags> typedef <UnderlyingType: Forward Decl>
}
Token tok = tokens[ idx - 1 ];
@ -4025,18 +4036,21 @@ CodeTypedef parse_typedef()
// TODO(Ed) : I'm not sure if I have to use parse_type here, I'd rather not as that would complicate parse_type.
// type = parse_type();
type = parse_forward_or_definition( currtok.Type, from_typedef );
// <ModuleFlags> typedef <UnderlyingType>
}
else if ( tok.Type == TokType::BraceCurly_Close )
{
// Its a definition
// <which> { ... };
type = parse_forward_or_definition( currtok.Type, from_typedef );
// <ModuleFlags> typedef <UnderlyingType>
}
else if ( tok.Type == TokType::BraceSquare_Close)
{
// Its an array definition
// <which> <type_identifier> <identifier> [ ... ];
type = parse_type();
// <ModuleFlags> typedef <UnderlyingType>
}
else
{
@ -4047,11 +4061,13 @@ CodeTypedef parse_typedef()
}
else
type = parse_type( & is_function );
// <ModuleFlags> typedef <UnderlyingType>
if ( check( TokType::Identifier ) )
{
name = currtok;
eat( TokType::Identifier );
// <ModuleFlags> typedef <UnderlyingType> <Name>
}
else if ( ! is_function )
{
@ -4059,16 +4075,19 @@ CodeTypedef parse_typedef()
Context.pop();
return CodeInvalid;
}
}
array_expr = parse_array_decl();
array_expr = parse_array_decl();
// <UnderlyingType> + <ArrayExpr>
}
Token stmt_end = currtok;
eat( TokType::Statement_End );
// <ModuleFlags> typedef <UnderlyingType> <Name>;
CodeComment inline_cmt = NoCode;
if ( currtok_noskip.Type == TokType::Comment && currtok_noskip.Line == stmt_end.Line )
inline_cmt = parse_comment();
// <ModuleFlags> typedef <UnderlyingType> <Name> <ArrayExpr>; <InlineCmt>
using namespace ECode;
@ -4117,23 +4136,27 @@ CodeUnion parse_union( bool inplace_def )
mflags = ModuleFlag::Export;
eat( TokType::Module_Export );
}
// <ModuleFlags>
eat( TokType::Decl_Union );
// <ModuleFlags> union
CodeAttributes attributes = parse_attributes();
// <ModuleFlags> union <Attributes>
StrC name = { 0, nullptr };
if ( check( TokType::Identifier ) )
{
name = currtok;
Context.Scope->Name = currtok;
eat( TokType::Identifier );
}
// <ModuleFlags> union <Attributes> <Name>
CodeBody body = { nullptr };
eat( TokType::BraceCurly_Open );
// <ModuleFlags> union <Attributes> <Name> {
body = make_code();
body->Type = ECode::Union_Body;
@ -4147,7 +4170,6 @@ CodeUnion parse_union( bool inplace_def )
switch ( currtok_noskip.Type )
{
case TokType::NewLine:
// Empty lines are auto skipped by Tokens.current()
member = fmt_newline;
eat( TokType::NewLine );
break;
@ -4213,11 +4235,14 @@ CodeUnion parse_union( bool inplace_def )
if ( member )
body.append( member );
}
// <ModuleFlags> union <Attributes> <Name> { <Body>
eat( TokType::BraceCurly_Close );
// <ModuleFlags> union <Attributes> <Name> { <Body> }
if ( ! inplace_def )
eat( TokType::Statement_End );
// <ModuleFlags> union <Attributes> <Name> { <Body> };
CodeUnion
result = (CodeUnion) make_code();
@ -4259,38 +4284,51 @@ CodeUsing parse_using()
mflags = ModuleFlag::Export;
eat( TokType::Module_Export );
}
// <ModuleFlags>
eat( TokType::Decl_Using );
// <ModuleFlags> using
if ( currtok.Type == TokType::Decl_Namespace )
{
is_namespace = true;
eat( TokType::Decl_Namespace );
// <ModuleFlags> using namespace
}
name = currtok;
Context.Scope->Name = name;
eat( TokType::Identifier );
// <ModuleFlags> using <namespace> <Name>
if ( bitfield_is_equal( u32, currtok.Flags, TF_Assign ) )
if ( ! is_namespace )
{
attributes = parse_attributes();
if ( bitfield_is_equal( u32, currtok.Flags, TF_Assign ) )
{
attributes = parse_attributes();
// <ModuleFlags> using <Name> <Attributes>
eat( TokType::Operator );
eat( TokType::Operator );
// <ModuleFlags> using <Name> <Attributes> =
type = parse_type();
type = parse_type();
// <ModuleFlags> using <Name> <Attributes> = <UnderlyingType>
array_expr = parse_array_decl();
// <UnderlyingType> + <ArrExpr>
}
}
array_expr = parse_array_decl();
Token stmt_end = currtok;
eat( TokType::Statement_End );
// <ModuleFlags> using <namespace> <Attributes> <Name> = <UnderlyingType>;
CodeComment inline_cmt = NoCode;
if ( currtok_noskip.Type == TokType::Comment && currtok_noskip.Line == stmt_end.Line )
{
inline_cmt = parse_comment();
}
// <ModuleFlags> using <namespace> <Attributes> <Name> = <UnderlyingType>; <InlineCmt>
using namespace ECode;
@ -4341,8 +4379,10 @@ CodeVar parse_variable()
mflags = ModuleFlag::Export;
eat( TokType::Module_Export );
}
// <ModuleFlags>
attributes = parse_attributes();
// <ModuleFlags> <Attributes>
while ( left && currtok.is_specifier() )
{
@ -4382,15 +4422,20 @@ CodeVar parse_variable()
{
specifiers = def_specifiers( NumSpecifiers, specs_found );
}
// <ModuleFlags> <Attributes> <Specifiers>
CodeType type = parse_type();
// <ModuleFlags> <Attributes> <Specifiers> <ValueType>
if ( type == Code::Invalid )
return CodeInvalid;
Context.Scope->Name = parse_identifier();
// <ModuleFlags> <Attributes> <Specifiers> <ValueType> <Name>
CodeVar result = parse_variable_after_name( mflags, attributes, specifiers, type, Context.Scope->Name );
// Regular : <ModuleFlags> <Attributes> <Specifiers> <ValueType> <Name> = <Value>; <InlineCmt>
// Bitfield : <ModuleFlags> <Attributes> <Specifiers> <ValueType> <Name> : <BitfieldSize> = <Value>; <InlineCmt>
Context.pop();
return result;