Made the parse definitions for variable, typedef, using more complete.

Fixed issues seen with expression token parsing.
Moved array expression parsing outside of type parsing. It's now only done with variable, typedef, and using declarations.
Added parsing of attributes; I'm going to keep them separate from the regular specifiers, as they are complicated.
Edward R. Gonzalez 2023-04-23 14:33:37 -04:00
parent 09491be375
commit e50e9e094e
6 changed files with 321 additions and 284 deletions
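The expression-token fix mentioned in the message shows up in the hunks below as a change in how a multi-token span is measured: the removed array-expression code summed token lengths (`untyped_tok.Length += currtok.Length`), while the new code takes the distance from the first token's start to the current token's end. A minimal sketch of that idea follows. It assumes tokens are views into one contiguous source buffer; `Tok`, `extend_span`, and `span_text` are hypothetical names for illustration and are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, pared-down token: a view into the source buffer.
struct Tok
{
    char const*    Text;
    std::ptrdiff_t Length;
};

// Grow `span` so that it ends where `current` ends. Both tokens point into the
// same buffer, so the result also covers any whitespace between them, which
// summing the individual token lengths would miss.
inline void extend_span( Tok& span, Tok const& current )
{
    span.Length = ( current.Text + current.Length ) - span.Text;
}

// Collect tokens [first, last) into one string, the way an untyped expression
// would be captured as raw text.
inline std::string span_text( Tok const* toks, int first, int last )
{
    Tok span = toks[first];
    for ( int idx = first; idx < last; idx++ )
        extend_span( span, toks[idx] );
    return std::string( span.Text, (std::size_t)span.Length );
}
```

For the source text `Width * 2`, lexed as three tokens of lengths 5, 1, and 1, summing the lengths yields 7 characters and truncates the expression, whereas the pointer-difference form yields all 9.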


@ -19,7 +19,6 @@ Intended for small-to midsized projects.
* [Code generation and modification](#code-generation-and-modification)
* [On multithreading](#on-multi-threading)
* [Extending the library](#extending-the-library)
* [Why](#why)
* [TODO](#todo)
## Notes
@ -276,6 +275,7 @@ u8 _Align_Pad[6];
*`CodeT` is a typedef for `ECode::Type` which has an underlying type of `u32`*
*`OperatorT` is a typedef for `EOperator::Type` which has an underlying type of `u32`*
*`StringCahced` is a typedef for `char const*` to denote it is an interned string*
AST widths are set up to be AST_POD_Size.
The width dictates how much the static array can hold before it must give way to using an allocated array:
@ -305,7 +305,6 @@ Data Notes:
* ASTs are wrapped for the user in a Code struct which is essentially a wrapper for an AST* type.
* Both AST and Code have member symbols but their data layout is enforced to be POD types.
* This library treats memory failures as fatal.
* Entries start as a static array, however if it goes over capacity a dynamic array is allocated for the entries.
* Strings are stored in their own set of arenas. AST constructors use cached strings for names, and content.
## There are four sets of interfaces for Code AST generation the library provides
@ -581,34 +580,6 @@ Code parse_defines() can emit a custom code AST with Macro_Constant type.
Another would be getting preprocessor or template metaprogramming Codes from Unreal Engine definitions, etc.
## Why
Macros in C/C++ are usually painful to debug, and templates can be too unless you're on a monstrous IDE (and even then they fail often).
Templates also carry a heavy compile-time cost due to the recursive nature of their expansion when complex code is generated, or when a heavy type-checking system is used (assertions require expansion, etc.).
Unfortunately, most programming languages opt to process the generated code internally, immediately within the AST, or do not expose it to the user in a form that can even be inspected as a text file.
Stage metaprogramming doesn't have this problem, since its entire purpose is to create those generated files that the final program will reference instead.
This is technically what the macro preprocessor does in a basic form, however a proper metaprogram for generation is easier to deal with for more complex generation.
The drawback naturally is generation functions, at face value, are harder to grasp than something following a template pattern (for simple generation). This drawback becomes less valid the more complex the code generation becomes.
Thus a rule of thumb: if it's a simple definition you can get away with using just the preprocessor `#define`, or if the templates being used don't break the debugger or your compile times, this library is most likely not needed.
However, if:
* Compile time complexity becomes large.
* You enjoy actually *seeing* the generated code instead of just the error symbols or the pdb symbols.
* You value your debugging experience, and would like to debug your metaprogram without having to step through the debug version of the compiler (if you even can)
* Want to roll your own reflection system
* Want to maintain a series of libraries for internal use, but don't want to deal with manual merging as often when they update.
* Want to create tailored headers for your code or for your libraries since you usually don't need the majority of the code within them.
* You just dislike metaprogramming with template expansion
Then this might help you bootstrap a toolset to do so.
# TODO
* May be in need of a better name, I found a few repos with this same one...


@ -747,14 +747,18 @@ namespace gen
break;
case Typedef:
// TODO: Check for array expression
result = string_append_fmt( result, "typedef %s %s", Entries[0]->to_string(), Name );
break;
case Typename:
result = string_append_fmt( result, "%s", Name );
result = string_append_fmt( result, "%s %s", Name, Entries[0]->to_string() );
break;
case Using:
// TODO: Check for array expression
if ( Entries[0] )
result = string_append_fmt( result, "using %s = %s", Name, Entries[0] );
@ -767,7 +771,6 @@ namespace gen
break;
case Class_Body:
case Enum_Body:
case Function_Body:
@ -1056,7 +1059,7 @@ namespace gen
{
using namespace EOperator;
if ( op == Invalid )
if ( op == EOperator::Invalid )
{
log_failure("gen::def_operator: op cannot be invalid");
return OpValidateResult::Fail;
@ -1903,7 +1906,7 @@ namespace gen
return result;
}
Code def_type( u32 length, char const* name, Code specifiers )
Code def_type( u32 length, char const* name, Code specifiers, Code ArrayExpr )
{
name_check( def_type, length, name );
@ -1913,8 +1916,6 @@ namespace gen
return Code::Invalid;
}
Code
result = make_code();
result->Name = get_cached_string( name, length );
@ -1923,6 +1924,9 @@ namespace gen
if ( specifiers )
result->add_entry( specifiers );
if ( ArrayExpr )
result->add_entry( ArrayExpr );
result.lock();
return result;
}
@ -2739,6 +2743,7 @@ namespace gen
char const* Text;
sptr Length;
TokType Type;
bool IsAssign;
};
TokType get_tok_type( char const* word, s32 length )
@ -2760,7 +2765,7 @@ namespace gen
return TokType::Invalid;
}
char const* str_tok_ype( TokType type )
char const* str_tok_type( TokType type )
{
local_persist
char const* lookup[(u32)TokType::Num] =
@ -2773,6 +2778,8 @@ namespace gen
return lookup[(u32)type];
}
# undef Define_TokType
inline
bool tok_is_specifier( Token const& tok )
{
@ -2782,8 +2789,6 @@ namespace gen
;
}
# undef Define_TokType
Arena LexAllocator;
struct TokArray
@ -2794,10 +2799,18 @@ namespace gen
inline
bool __eat( TokType type, char const* context )
{
if ( array_count(Arr) - Idx <= 0 )
{
log_failure( "gen::%s: No tokens left", context );
return Code::Invalid;
}
if ( Arr[0].Type != type )
{
String token_str = string_make_length( g_allocator, Arr[Idx].Text, Arr[Idx].Length );
log_failure( "gen::%s: expected %s, got %s", context, str_tok_ype(type), str_tok_ype(Arr[Idx].Type) );
log_failure( "gen::%s: expected %s, got %s", context, str_tok_type(type), str_tok_type(Arr[Idx].Type) );
return Code::Invalid;
}
@ -2863,7 +2876,7 @@ namespace gen
while (left )
{
Token token = { nullptr, 0, TokType::Invalid };
Token token = { nullptr, 0, TokType::Invalid, false };
switch ( current )
{
@ -3040,12 +3053,22 @@ namespace gen
goto FoundToken;
// All other operators we just label as an operator and move forward.
case '=':
token.Text = scanner;
token.Length = 1;
token.Type = TokType::Operator;
token.IsAssign = true;
if (left)
move_forward();
goto FoundToken;
case '+':
case '%':
case '^':
case '~':
case '!':
case '=':
case '<':
case '>':
case '|':
@ -3059,6 +3082,7 @@ namespace gen
if ( current == '=' )
{
token.Length++;
token.IsAssign = true;
if (left)
move_forward();
@ -3095,6 +3119,7 @@ namespace gen
else if ( current == '=' )
{
token.Length++;
token.IsAssign = true;
if (left)
move_forward();
@ -3207,6 +3232,12 @@ namespace gen
}
}
if ( array_count(Tokens) == 0 )
{
log_failure( "Failed to lex any tokens" );
return { 0, nullptr };
}
return { 0, Tokens };
# undef current
# undef move_forward
@ -3231,19 +3262,19 @@ namespace gen
# define currtok toks.Arr[toks.Idx]
# define eat( Type_ ) toks.__eat( Type_, txt(context) )
# define left array_count(toks.Arr) - toks.Idx
# define check( Type_ ) left && currtok.Type == Type_
#pragma endregion Helper Macros
Code parse_class( s32 length, char const* def )
{
# define context parse_class
using namespace Parser;
# define context parse_class
TokArray toks = lex( length, def );
if ( array_count( toks.Arr ) == 0 )
{
log_failure( "gen::" txt(context) ": failed to lex tokens" );
if ( toks.Arr == nullptr )
return Code::Invalid;
}
Token name { nullptr, 0, TokType::Invalid };
@ -3263,16 +3294,14 @@ namespace gen
Code parse_enum( s32 length, char const* def )
{
# define context parse_enum
check_parse_args( parse_enum, length, def );
using namespace Parser;
# define context parse_enum
check_parse_args( parse_enum, length, def );
TokArray toks = lex( length, def );
if ( array_count( toks.Arr ) == 0 )
{
log_failure( "gen::" txt(context) ": failed to lex tokens" );
if ( toks.Arr == nullptr )
return Code::Invalid;
}
SpecifierT specs_found[16] { ESpecifier::Num_Specifiers };
s32 num_specifiers = 0;
@ -3375,11 +3404,8 @@ namespace gen
check_parse_args( parse_friend, length, def );
TokArray toks = lex( length, def );
if ( array_count( toks.Arr ) == 0 )
{
log_failure( "gen::" txt(context) ": failed to lex tokens" );
if ( toks.Arr == nullptr )
return Code::Invalid;
}
eat( TokType::Decl_Friend );
@ -3495,18 +3521,86 @@ namespace gen
Code parse_variable( s32 length, char const* def )
{
# define context parse_variable
check_parse_args( parse_variable, length, def );
using namespace Parser;
# define context parse_variable
check_parse_args( parse_variable, length, def );
TokArray toks = lex( length, def );
if ( array_count( toks.Arr ) == 0 )
if ( toks.Arr == nullptr )
return Code::Invalid;
Token* name = nullptr;
SpecifierT specs_found[16] { ESpecifier::Num_Specifiers };
s32 num_specifiers = 0;
Code lang_linkage;
Code attributes;
Code array_expr = { nullptr };
if ( check( TokType::BraceSquare_Open ) )
{
log_failure( "gen::" txt(context) ": failed to lex tokens" );
eat( TokType::BraceSquare_Open );
// TODO: Need to have a parser for attributes, it gets complicated...
#if 0
Token attris_tok = currtok;
while ( left && currtok.Type != TokType::BraceSquare_Close )
{
eat( TokType::Identifier )
}
#endif
attributes = untyped_str( currtok.Length, currtok.Text );
eat( TokType::BraceSquare_Close );
}
while ( left && tok_is_specifier( currtok ) )
{
SpecifierT spec = ESpecifier::to_type( currtok.Text, currtok.Length );
switch ( spec )
{
case ESpecifier::Constexpr:
case ESpecifier::Constinit:
case ESpecifier::Export:
case ESpecifier::External_Linkage:
case ESpecifier::Import:
case ESpecifier::Local_Persist:
case ESpecifier::Mutable:
case ESpecifier::Static_Member:
case ESpecifier::Thread_Local:
case ESpecifier::Volatile:
break;
default:
log_failure( "gen::parse_variable: invalid specifier " txt(spec) " for variable" );
return Code::Invalid;
}
Token* name = nullptr;
if ( spec == ESpecifier::External_Linkage )
{
specs_found[num_specifiers] = spec;
num_specifiers++;
eat( TokType::Spec_Extern );
if ( currtok.Type == TokType::String )
{
lang_linkage = untyped_str( currtok.Length, currtok.Text );
eat( TokType::String );
}
continue;
}
specs_found[num_specifiers] = spec;
num_specifiers++;
eat( currtok.Type );
}
Code type = parse_type( toks, txt(parse_variable) );
@ -3522,20 +3616,103 @@ namespace gen
name = & currtok;
eat( TokType::Identifier );
Code expr = { nullptr };
if ( currtok.IsAssign )
{
eat( TokType::Operator );
Token expr_tok = currtok;
if ( currtok.Type == TokType::Statement_End )
{
log_failure( "gen::parse_variable: expected expression after assignment operator" );
return Code::Invalid;
}
while ( left && currtok.Type != TokType::Statement_End )
{
expr_tok.Length = ( (sptr)currtok.Text + currtok.Length ) - (sptr)expr_tok.Text;
eat( currtok.Type );
}
expr = untyped_str( expr_tok.Length, expr_tok.Text );
}
if ( check( TokType::BraceSquare_Open ) )
{
eat( TokType::BraceSquare_Open );
if ( left == 0 )
{
log_failure( "%s: Error, unexpected end of typedef definition ( '[]' scope started )", txt(parse_typedef) );
return Code::Invalid;
}
if ( currtok.Type == TokType::BraceSquare_Close )
{
log_failure( "%s: Error, empty array expression in typedef definition", txt(parse_typedef) );
return Code::Invalid;
}
Token
untyped_tok = currtok;
while ( left && currtok.Type != TokType::BraceSquare_Close )
{
untyped_tok.Length = ( (sptr)currtok.Text + currtok.Length ) - (sptr)untyped_tok.Text;
}
array_expr = untyped_str( untyped_tok.Length, untyped_tok.Text );
if ( left == 0 )
{
log_failure( "%s: Error, unexpected end of type definition, expected ]", txt(parse_typedef) );
return Code::Invalid;
}
if ( currtok.Type != TokType::BraceSquare_Close )
{
log_failure( "%s: Error, expected ] in type definition, not %s", txt(parse_typedef), str_tok_type( currtok.Type ) );
return Code::Invalid;
}
eat( TokType::BraceSquare_Close );
}
eat( TokType::Statement_End );
using namespace ECode;
Code result = make_code();
result->Type = Variable;
result->Name = get_cached_string( name->Text, name->Length );
result->add_entry( type );
if (array_expr)
type->add_entry( array_expr );
if ( attributes )
result->add_entry( attributes );
if ( expr )
result->add_entry( expr );
return Code::Invalid;
# undef context
}
Code parse_type( Parser::TokArray& toks, char const* func_name )
{
# define context parse_type
using namespace Parser;
# define context parse_type
SpecifierT specs_found[16] { ESpecifier::Num_Specifiers };
s32 num_specifiers = 0;
Token name = { nullptr, 0, TokType::Invalid };
Code array_expr = { nullptr };
while ( left && tok_is_specifier( currtok ) )
{
@ -3593,49 +3770,10 @@ namespace gen
eat( currtok.Type );
}
if ( left && currtok.Type == TokType::BraceSquare_Open )
{
eat( TokType::BraceSquare_Open );
if ( left == 0 )
{
log_failure( "%s: Error, unexpected end of type definition", func_name );
return Code::Invalid;
}
if ( currtok.Type == TokType::BraceSquare_Close )
{
eat( TokType::BraceSquare_Close );
return Code::Invalid;
}
Token
untyped_tok = currtok;
while ( left && currtok.Type != TokType::BraceSquare_Close )
{
untyped_tok.Length += currtok.Length;
}
array_expr = untyped_str( untyped_tok.Length, untyped_tok.Text );
if ( left == 0 )
{
log_failure( "%s: Error, unexpected end of type definition", func_name );
return Code::Invalid;
}
if ( currtok.Type != TokType::BraceSquare_Close )
{
log_failure( "%s: Error, expected ] in type definition", func_name );
return Code::Invalid;
}
eat( TokType::BraceSquare_Close );
}
using namespace ECode;
// TODO: Need to figure out type code caching with the type table.
Code
result = make_code();
result->Type = Typename;
@ -3648,57 +3786,48 @@ namespace gen
result->add_entry( specifiers );
}
if ( array_expr )
result->add_entry( array_expr );
result.lock();
return result;
# undef context
}
Code parse_type( s32 length, char const* def )
{
using namespace Parser;
# define context parse_type
check_parse_args( parse_type, length, def );
using namespace Parser;
TokArray toks = lex( length, def );
if ( array_count( toks.Arr ) == 0 )
{
log_failure( "gen::" txt(context) ": failed to lex tokens" );
if ( toks.Arr == nullptr )
return Code::Invalid;
}
Code result = parse_type( toks, txt(parse_type) );
result.lock();
return result;
# undef context
}
Code parse_typedef( s32 length, char const* def )
{
using namespace Parser;
# define context parse_typedef
check_parse_args( parse_typedef, length, def );
using namespace Parser;
TokArray toks = lex( length, def );
if ( array_count( toks.Arr ) == 0 )
{
log_failure( "gen::" txt(context) ": failed to lex tokens" );
if ( toks.Arr == nullptr )
return Code::Invalid;
}
Token name = { nullptr, 0, TokType::Invalid };
Code array_expr = { nullptr };
Code type = { nullptr };
SpecifierT specs_found[16] { ESpecifier::Num_Specifiers };
s32 num_specifiers = 0;
eat( TokType::Decl_Typedef );
type = parse_type( toks, txt(parse_typedef) );
if ( currtok.Type != TokType::Identifier )
if ( check( TokType::Identifier ) )
{
log_failure( "gen::parse_typedef: Error, expected identifier for typedef" );
return Code::Invalid;
@ -3707,6 +3836,46 @@ namespace gen
name = currtok;
eat( TokType::Identifier );
if ( check( TokType::BraceSquare_Open ) )
{
eat( TokType::BraceSquare_Open );
if ( left == 0 )
{
log_failure( "%s: Error, unexpected end of typedef definition ( '[]' scope started )", txt(parse_typedef) );
return Code::Invalid;
}
if ( currtok.Type == TokType::BraceSquare_Close )
{
log_failure( "%s: Error, empty array expression in typedef definition", txt(parse_typedef) );
return Code::Invalid;
}
Token untyped_tok = currtok;
while ( left && currtok.Type != TokType::BraceSquare_Close )
{
untyped_tok.Length = ( (sptr)currtok.Text + currtok.Length ) - (sptr)untyped_tok.Text;
}
array_expr = untyped_str( untyped_tok.Length, untyped_tok.Text );
if ( left == 0 )
{
log_failure( "%s: Error, unexpected end of type definition, expected ]", txt(parse_typedef) );
return Code::Invalid;
}
if ( currtok.Type != TokType::BraceSquare_Close )
{
log_failure( "%s: Error, expected ] in type definition, not %s", txt(parse_typedef), str_tok_type( currtok.Type ) );
return Code::Invalid;
}
eat( TokType::BraceSquare_Close );
}
eat( TokType::Statement_End );
using namespace ECode;
@ -3718,6 +3887,9 @@ namespace gen
result->add_entry( type );
if ( array_expr )
type->add_entry( array_expr );
result.lock();
return result;
# undef context
@ -3725,16 +3897,14 @@ namespace gen
Code parse_using( s32 length, char const* def )
{
using namespace Parser;
# define context parse_using
check_parse_args( parse_using, length, def );
using namespace Parser;
TokArray toks = lex( length, def );
if ( array_count( toks.Arr ) == 0 )
{
log_failure( "gen::" txt(context) ": failed to lex tokens" );
if ( toks.Arr == nullptr )
return Code::Invalid;
}
SpecifierT specs_found[16] { ESpecifier::Num_Specifiers };
s32 num_specifiers = 0;
@ -3755,15 +3925,52 @@ namespace gen
eat( TokType::Identifier );
if ( currtok.Type != TokType::Statement_End )
if ( currtok.IsAssign )
{
if ( is_namespace )
eat( TokType::Operator );
type = parse_type( toks, txt(parse_typedef) );
}
if ( ! is_namespace && check( TokType::BraceSquare_Open ) )
{
log_failure( "gen::parse_using: Error, expected ; after identifier for a using namespace declaration" );
eat( TokType::BraceSquare_Open );
if ( left == 0 )
{
log_failure( "%s: Error, unexpected end of typedef definition ( '[]' scope started )", txt(parse_typedef) );
return Code::Invalid;
}
type = parse_type( toks, txt(parse_using) );
if ( currtok.Type == TokType::BraceSquare_Close )
{
log_failure( "%s: Error, empty array expression in typedef definition", txt(parse_typedef) );
return Code::Invalid;
}
Token
untyped_tok = currtok;
while ( left && currtok.Type != TokType::BraceSquare_Close )
{
untyped_tok.Length = ( (sptr)currtok.Text + currtok.Length ) - (sptr)untyped_tok.Text;
}
array_expr = untyped_str( untyped_tok.Length, untyped_tok.Text );
if ( left == 0 )
{
log_failure( "%s: Error, unexpected end of type definition, expected ]", txt(parse_typedef) );
return Code::Invalid;
}
if ( currtok.Type != TokType::BraceSquare_Close )
{
log_failure( "%s: Error, expected ] in type definition, not %s", txt(parse_typedef), str_tok_type( currtok.Type ) );
return Code::Invalid;
}
eat( TokType::BraceSquare_Close );
}
eat( TokType::Statement_End );
@ -3777,6 +3984,9 @@ namespace gen
result->add_entry( type );
if ( array_expr )
type->add_entry( array_expr );
result.lock();
return result;
# undef context


@ -212,7 +212,6 @@ namespace gen
#define Define_Specifiers \
Entry( API_Import, API_Export_Code ) \
Entry( API_Export, API_Import_Code ) \
Entry( Attribute, "You cannot stringize an attribute this way" ) \
Entry( Alignas, alignas ) \
Entry( Array_Decl, "You cannot stringize an array declare this way" ) \
Entry( C_Linkage, extern "C" ) \
@ -724,7 +723,7 @@ namespace gen
Code def_struct ( s32 length, char const* name, Code parent = NoCode, Code specifiers = NoCode, Code body = NoCode );
Code def_typedef ( s32 length, char const* name, Code type );
Code def_type ( s32 length, char const* name, Code specifiers = NoCode );
Code def_type ( s32 length, char const* name, Code specifiers = NoCode, Code ArrayExpr = NoCode );
Code def_using ( s32 length, char const* name, Code type = NoCode, UsingT specifier = UsingRegular );
Code def_variable ( Code type, s32 length, char const* name, Code value = NoCode, Code specifiers = NoCode );


@ -1,24 +0,0 @@
project( 'test', 'c', default_options : ['buildtype=debug'] )
# add_global_arguments('-E', language : 'cpp')
includes = include_directories(
[
'../gen',
'../../singleheader'
])
# get_sources = files('./get_sources.ps1')
# sources = files(run_command('powershell', get_sources, check: true).stdout().strip().split('\n'))
sources = [ 'test.c99.c' ]
if get_option('buildtype').startswith('debug')
add_project_arguments('-DBuild_Debug', language : ['c' ])
endif
add_project_arguments('-Dgentime', language : ['c', 'cpp'])
executable( 'test_c99', sources, include_directories : includes )


@ -1,79 +0,0 @@
#include "gen.h"
#define Table( Type_ ) Table_##Type_
typedef u64(*)(void*) HashingFn;
#if gen_time
# define gen_table( Type_, HashingFn_ ) gen_request_table( #Type_, sizeof(Type_), HashingFn_ )
u64 table_default_hash_fn( void* address )
{
return crc32( address, 4 );
}
Code gen_table_code( char const* type_str, sw type_size, HashingFn hash_fn )
{
Code table;
return table;
}
struct TableRequest
{
char const* Type;
sw Size;
HashingFn HashFn;
};
array(TableRequest) TableRequests;
void gen_request_table( const char* type_str, sw type_size, HashingFn hash_fn )
{
TableRequest request = { type_str, type_size, hash_fn };
array_append( TableRequests, request );
}
u32 gen_table_file()
{
gen_table( u32 );
gen_table( char const* );
array(Code) array_asts;
array_init( array_asts, g_allocator );
sw left = array_count( TableRequests );
sw index = 0;
while( left -- )
{
ArrayRequest request = TableRequests[index];
Code result = gen_table_code( request.Name, request.Size, request.HashFn );
array_append( array_asts, result );
}
Builder
arraygen;
arraygen.open( "table.gen.h" );
left = array_count( array_asts );
index = 0;
while( left-- )
{
Code code = array_asts[index];
arraygen.print( code );
}
arraygen.write();
return 0;
}
#endif
#ifndef gen_time
# include "table.gen.h"
#endif


@ -1,40 +0,0 @@
#define GENC_IMPLEMENTATION
#include "genc.h"
#include "table.h"
struct Test
{
u64 A;
u64 B;
};
#if gen_time
u64 hash_struct( void* test )
{
return crc32( ((Test)test).A, sizeof(u64) );
}
int gen_main()
{
gen_table( Test, & hash_struct )
gen_table_file();
}
#endif
#if runtime
int main()
{
Table(Test) test_table;
}
#endif