Setup codegen metaprogram skeleton

Going use it to swap macro implementation usage in the compiler.
This commit is contained in:
2024-05-04 15:23:33 -04:00
parent fa82547705
commit 1c633f7306
20 changed files with 46544 additions and 0 deletions
+28
View File
@@ -0,0 +1,28 @@
BSD 3-Clause License
Copyright (c) 2023, Edward R. Gonzalez
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+81
View File
@@ -0,0 +1,81 @@
# Parsing
The library features a naive parser tailored for only what the library needs to construct the supported syntax of C++ into its AST.
This parser does not, and should not do the compiler's job. By only supporting this minimal set of features, the parser is kept (so far) around ~5600 loc. I hope to keep it under 10k loc worst case.
You can think of this parser of a frontend parser vs a semantic parser. Its intuitively similar to WYSIWYG. What you precerive as the syntax from the user-side before the compiler gets a hold of it, is what you get.
User exposed interface:
```cpp
CodeClass parse_class ( StrC class_def );
CodeConstructor parse_constructor ( StrC constructor_def );
CodeDestructor parse_destructor ( StrC destructor_def );
CodeEnum parse_enum ( StrC enum_def );
CodeBody parse_export_body ( StrC export_def );
CodeExtern parse_extern_link ( StrC exten_link_def );
CodeFriend parse_friend ( StrC friend_def );
CodeFn parse_function ( StrC fn_def );
CodeBody parse_global_body ( StrC body_def );
CodeNS parse_namespace ( StrC namespace_def );
CodeOperator parse_operator ( StrC operator_def );
CodeOpCast parse_operator_cast( StrC operator_def );
CodeStruct parse_struct ( StrC struct_def );
CodeTemplate parse_template ( StrC template_def );
CodeType parse_type ( StrC type_def );
CodeTypedef parse_typedef ( StrC typedef_def );
CodeUnion parse_union ( StrC union_def );
CodeUsing parse_using ( StrC using_def );
CodeVar parse_variable ( StrC var_def );
```
To parse file buffers, use the `parse_global_body` function.
***Parsing will aggregate any tokens within a function body or expression statement to an untyped Code AST.***
Everything is done in one pass for both the preprocessor directives and the rest of the language.
The parser performs no macro expansion as the scope of gencpp feature-set is to only support the preprocessor for the goal of having rudimentary awareness of preprocessor ***conditionals***, ***defines***, ***includes***, and ***pragmas***.
The keywords supported for the preprocessor are:
* include
* define
* if
* ifdef
* elif
* endif
* pragma
Each directive `#` line is considered one preproecessor unit, and will be treated as one Preprocessor AST. *These ASTs will be considered members or entries of braced scope they reside within*.
If a directive is used with an unsupported keyword its will be processed as an untyped AST.
The preprocessor lines are stored as members of their associated scope they are parsed within. ( Global, Namespace, Class/Struct )
Any preprocessor definition abuse that changes the syntax of the core language is unsupported and will fail to parse if not kept within an execution scope (function body, or expression assignment).
Exceptions:
* function signatures are allowed for a preprocessed macro: `neverinline MACRO() { ... }`
* Disable with: `#define GEN_PARSER_DISABLE_MACRO_FUNCTION_SIGNATURES`
* typedefs allow for a preprocessed macro: `typedef MACRO();`
* Disable with: `#define GEN_PARSER_DISABLE_MACRO_TYPEDEF`
*(Exceptions are added on an on-demand basis)*
*(See functions `parse_operator_function_or_variable` and `parse_typedef` )*
Adding your own exceptions is possible by simply modifying the parser to allow for the syntax you need.
*Note: You could interpret this strictness as a feature. This would allow the user to see if their codebase or a third-party's codebase some some egregious preprocessor abuse.*
The lexing and parsing takes shortcuts from whats expected in the standard.
* Numeric literals are not checked for validity.
* The parse API treats any execution scope definitions with no validation and are turned into untyped Code ASTs.
* *This includes the assignment of variables.*
* Attributes ( `[[]]` (standard), `__declspec` (Microsoft), or `__attribute__` (GNU) )
* Assumed to *come before specifiers* (`const`, `constexpr`, `extern`, `static`, etc) for a function or right afterthe return type.
* Or in the usual spot for class, structs, (*right after the declaration keyword*)
* typedefs have attributes with the type (`parse_type`)
* Parsing attributes can be extended to support user defined macros by defining `GEN_DEFINE_ATTRIBUTE_TOKENS` (see `gen.hpp` for the formatting)
Empty lines used throughout the file are preserved for formatting purposes during ast serialization.
+135
View File
@@ -0,0 +1,135 @@
# gencpp
An attempt at simple staged metaprogramming for c/c++.
The library API is a composition of code element constructors.
These build up a code AST to then serialize with a file builder.
This code base attempts follow the [handmade philosophy](https://handmade.network/manifesto).
Its not meant to be a black box metaprogramming utility, it should be easy to intergrate into a user's project domain.
## Notes
**On Partial Hiatus: Working on handmade hero for now. Only fixes will be pushed as I come across them until I get what I want done from the series**
This project is still in development (very much an alpha state), so expect bugs and missing features.
See [issues](https://github.com/Ed94/gencpp/issues) for a list of known bugs or todos.
The library can already be used to generate code just fine, but the parser is where the most work is needed. If your C++ isn't "down to earth" expect issues.
A `natvis` and `natstepfilter` are provided in the scripts directory (its outdated, I'll update this readme when its not).
***The editor and scanner have not been implemented yet. The scanner will come first, then the editor.***
A C variant is hosted [here](https://github.com/Ed94/genc); I will complete it when this library is feature complete, it should be easier to make than this...
## Usage
A metaprogram is built to generate files before the main program is built. We'll term runtime for this program as `GEN_TIME`. The metaprogram's core implementation are within `gen.hpp` and `gen.cpp` in the project directory.
`gen.cpp` \`s `main()` is defined as `gen_main()` which the user will have to define once for their program. There they will dictate everything that should be generated.
In order to keep the locality of this code within the same files the following pattern may be used (although this pattern isn't required at all):
Within `program.cpp` :
```cpp
#ifdef GEN_TIME
#include "gen.hpp"
...
u32 gen_main()
{
...
}
#endif
// "Stage" agnostic code.
#ifndef GEN_TIME
#include "program.gen.cpp"
// Regular runtime dependent on the generated code here.
#endif
```
The design uses a constructive builder API for the code to generate.
The user is provided `Code` objects that are used to build up the AST.
Example using each construction interface:
### Upfront
Validation and construction through a functional interface.
```cpp
Code t_uw = def_type( name(uw) );
Code t_allocator = def_type( name(allocator) );
Code t_string_const = def_type( name(char), def_specifiers( args( ESpecifier::Const, ESpecifier::Ptr ) ));
Code header;
{
Code num = def_variable( t_uw, name(Num) );
Code cap = def_variable( t_uw, name(Capacity) );
Code mem_alloc = def_variable( t_allocator, name(Allocator) );
Code body = def_struct_body( args( num, cap, mem_alloc ) );
header = def_struct( name(ArrayHeader), __, __, body );
}
```
### Parse
Validation through ast construction.
```cpp
Code header = parse_struct( code(
struct ArrayHeader
{
uw Num;
uw Capacity;
allocator Allocator;
};
));
```
### Untyped
No validation, just glorified text injection.
```cpp
Code header = code_str(
struct ArrayHeader
{
uw Num;
uw Capacity;
allocator Allocator;
};
);
```
`name` is a helper macro for providing a string literal with its size, intended for the name parameter of functions.
`code` is a helper macro for providing a string literal with its size, but intended for code string parameters.
`args` is a helper macro for providing the number of arguments to varadic constructors.
`code_str` is a helper macro for writting `untyped_str( code( <content> ))`
All three constrcuton interfaces will generate the following C code:
```cpp
struct ArrayHeader
{
uw Num;
uw Capacity;
allocator Allocator;
};
```
**Note: The formatting shown here is not how it will look. For your desired formatting its recommended to run a pass through the files with an auto-formatter.**
*(The library currently uses clang-format for formatting, beware its pretty slow...)*
## Building
See the [scripts directory](scripts/).
+489
View File
@@ -0,0 +1,489 @@
## Documentation
The project has no external dependencies beyond:
* `errno.h`
* `stat.h`
* `stdarg.h`
* `stddef.h`
* `stdio.h`
* `copyfile.h` (Mac)
* `types.h` (Linux)
* `unistd.h` (Linux/Mac)
* `intrin.h` (Windows)
* `io.h` (Windows with gcc)
* `windows.h` (Windows)
Dependencies for the project are wrapped within `GENCPP_ROLL_OWN_DEPENDENCIES` (Defining it will disable them).
The majority of the dependency's implementation was derived from the [c-zpl library](https://github.com/zpl-c/zpl).
This library was written in a subset of C++ where the following are not used at all:
* RAII (Constructors/Destructors), lifetimes are managed using named static or regular functions.
* Language provide dynamic dispatch, RTTI
* Object-Oriented Inheritance
* Exceptions
Polymorphic & Member-functions are used as an ergonomic choice, along with a conserative use of operator overloads.
There are only 4 template definitions in the entire library. (`Array<Type>`, `Hashtable<Type>`, `swap<Type>`, and `AST/Code::cast<Type>`)
Two generic templated containers are used throughout the library:
* `template< class Type> struct Array`
* `template< class Type> struct HashTable`
Both Code and AST definitions have a `template< class Type> Code/AST :: cast()`. Its just an alternative way to explicitly cast to each other.
`template< class Type> swap( Type& a, Type& b)` is used over a macro.
Otherwise the library is free of any templates.
### *WHAT IS NOT PROVIDED*
**There is no support for validating expressions.**
Its difficult to parse without enough benefits (At the metaprogramming level).
I plan to add this only at the tail of the project parsing milestone.
**Only trivial template support is provided.**
The intention is for only simple, non-recursive substitution.
The parameters of the template are treated like regular parameter AST entries.
This means that the typename entry for the parameter AST would be either:
* `class`
* `typename`
* A fundamental type, function, or pointer type.
Anything beyond this usage is not supported by parse_template for arguments (at least not intentionally).
Use at your own mental peril.
*Concepts and Constraints are not supported, its usage is non-trivial substitution.*
### The Data & Interface
As mentioned in root readme, the user is provided Code objects by calling the constructor's functions to generate them or find existing matches.
The AST is managed by the library and provided to the user via its interface.
However, the user may specifiy memory configuration.
Data layout of AST struct (Subject to heavily change with upcoming redesign):
```cpp
union {
struct
{
AST* InlineCmt; // Class, Constructor, Destructor, Enum, Friend, Functon, Operator, OpCast, Struct, Typedef, Using, Variable
AST* Attributes; // Class, Enum, Function, Struct, Typedef, Union, Using, Variable
AST* Specs; // Destructor, Function, Operator, Typename, Variable
union {
AST* InitializerList; // Constructor
AST* ParentType; // Class, Struct, ParentType->Next has a possible list of interfaces.
AST* ReturnType; // Function, Operator, Typename
AST* UnderlyingType; // Enum, Typedef
AST* ValueType; // Parameter, Variable
};
union {
AST* Macro; // Parameters
AST* BitfieldSize; // Variable (Class/Struct Data Member)
AST* Params; // Constructor, Function, Operator, Template, Typename
};
union {
AST* ArrExpr; // Typename
AST* Body; // Class, Constructr, Destructor, Enum, Function, Namespace, Struct, Union
AST* Declaration; // Friend, Template
AST* Value; // Parameter, Variable
};
union {
AST* NextVar; // Variable; Possible way to handle comma separated variables declarations. ( , NextVar->Specs NextVar->Name NextVar->ArrExpr = NextVar->Value )
AST* SpecsFuncSuffix; // Only used with typenames, to store the function suffix if typename is function signature.
};
};
StringCached Content; // Attributes, Comment, Execution, Include
struct {
SpecifierT ArrSpecs[AST::ArrSpecs_Cap]; // Specifiers
AST* NextSpecs; // Specifiers
};
};
union {
AST* Prev;
AST* Front;
AST* Last;
};
union {
AST* Next;
AST* Back;
};
AST* Parent;
StringCached Name;
CodeT Type;
ModuleFlag ModuleFlags;
union {
b32 IsFunction; // Used by typedef to not serialize the name field.
b32 IsParamPack; // Used by typename to know if type should be considered a parameter pack.
OperatorT Op;
AccessSpec ParentAccess;
s32 NumEntries;
};
s32 Token; // Handle to the token, stored in the CodeFile (Otherwise unretrivable)
```
*`CodeT` is a typedef for `ECode::Type` which has an underlying type of `u32`*
*`OperatorT` is a typedef for `EOperator::Type` which has an underlying type of `u32`*
*`StringCahced` is a typedef for `String const`, to denote it is an interned string*
*`String` is the dynamically allocated string type for the library*
AST widths are setup to be AST_POD_Size.
The width dictates how much the static array can hold before it must give way to using an allocated array:
```cpp
constexpr static
uw ArrSpecs_Cap =
(
AST_POD_Size
- sizeof(AST*) * 3
- sizeof(StringCached)
- sizeof(CodeT)
- sizeof(ModuleFlag)
- sizeof(u32)
)
/ sizeof(SpecifierT) -1; // -1 for 4 extra bytes (Odd num of AST*)
```
*Ex: If the AST_POD_Size is 128 the capacity of the static array is 20.*
Data Notes:
* The allocator definitions used are exposed to the user incase they want to dictate memory usage
* You'll find the memory handling in `init`, `deinit`, `reset`, `gen_string_allocator`, `get_cached_string`, `make_code`.
* Allocators are defined with the `AllocatorInfo` structure found in `dependencies\memory.hpp`
* Most of the work is just defining the allocation procedure:
```cpp
void* ( void* allocator_data, AllocType type, sw size, sw alignment, void* old_memory, sw old_size, u64 flags );
```
* ASTs are wrapped for the user in a Code struct which is a wrapper for a AST* type.
* Both AST and Code have member symbols but their data layout is enforced to be POD types.
* This library treats memory failures as fatal.
* Cached Strings are stored in their own set of arenas. AST constructors use cached strings for names, and content.
* `StringArenas`, `StringCache`, `Allocator_StringArena`, and `Allocator_StringTable` are the associated containers or allocators.
* Strings used for serialization and file buffers are not contained by those used for cached strings.
* They are currently using `GlobalAllocator`, which are tracked array of arenas that grows as needed (adds buckets when one runs out).
* Memory within the buckets is not reused, so its inherently wasteful.
* I will be augmenting the single arena with a simple slag allocator.
* Linked lists used children nodes on bodies, and parameters.
* Its intended to generate the AST in one go and serialize after. The constructors and serializer are designed to be a "one pass, front to back" setup.
* Allocations can be tuned by defining the folloiwng macros:
* `GEN_GLOBAL_BUCKET_SIZE` : Size of each bucket area for the global allocator
* `GEN_CODEPOOL_NUM_BLOCKS` : Number of blocks per code pool in the code allocator
* `GEN_SIZE_PER_STRING_ARENA` : Size per arena used with string caching.
* `GEN_MAX_COMMENT_LINE_LENGTH` : Longest length a comment can have per line.
* `GEN_MAX_NAME_LENGTH` : Max length of any identifier.
* `GEN_MAX_UNTYPED_STR_LENGTH` : Max content length for any untyped code.
* `GEN_TOKEN_FMT_TOKEN_MAP_MEM_SIZE` : token_fmt_va uses local_persit memory of this size for the hashtable.
* `GEN_LEX_ALLOCATOR_SIZE`
* `GEN_BUILDER_STR_BUFFER_RESERVE`
The following CodeTypes are used which the user may optionally use strong typing with if they enable: `GEN_ENFORCE_STRONG_CODE_TYPES`
* CodeBody : Has support for `for-range` iterating across Code objects.
* CodeAttributes
* CodeComment
* CodeClass
* CodeConstructor
* CodeDefine
* CodeDestructor
* CodeEnum
* CodeExec
* CodeExtern
* CodeInclude
* CodeFriend
* CodeFn
* CodeModule
* CodeNS
* CodeOperator
* CodeOpCast
* CodeParam : Has support for `for-range` iterating across parameters.
* CodePreprocessCond
* CodePragma
* CodeSpecifiers : Has support for `for-range` iterating across specifiers.
* CodeStruct
* CodeTemplate
* CodeType
* CodeTypedef
* CodeUnion
* CodeUsing
* CodeVar
Each Code boy has an associated "filtered AST" with the naming convention: `AST_<CodeName>`
Unrelated fields of the AST for that node type are omitted and only necessary padding members are defined otherwise.
Retrieving a raw version of the ast can be done using the `raw()` function defined in each AST.
## There are three sets of interfaces for Code AST generation the library provides
* Upfront
* Parsing
* Untyped
### Upfront Construction
All component ASTs must be previously constructed, and provided on creation of the code AST.
The construction will fail and return CodeInvalid otherwise.
Interface :``
* def_attributes
* *This is pre-appended right before the function symbol, or placed after the class or struct keyword for any flavor of attributes used.*
* *Its up to the user to use the desired attribute formatting: `[[]]` (standard), `__declspec` (Microsoft), or `__attribute__` (GNU).*
* def_comment
* def_class
* def_constructor
* def_define
* def_destructor
* def_enum
* def_execution
* *This is equivalent to untyped_str, except that its intended for use only in execution scopes.*
* def_extern_link
* def_friend
* def_function
* def_include
* def_module
* def_namespace
* def_operator
* def_operator_cast
* def_param
* def_params
* def_pragma
* def_preprocess_cond
* def_specifier
* def_specifiers
* def_struct
* def_template
* def_type
* def_typedef
* def_union
* def_using
* def_using_namespace
* def_variable
Bodies:
* def_body
* def_class_body
* def_enum_body
* def_export_body
* def_extern_link_body
* def_function_body
* *Use this for operator bodies as well*
* def_global_body
* def_namespace_body
* def_struct_body
* def_union_body
Usage:
```cpp
<name> = def_<function type>( ... );
Code <name>
{
...
<name> = def_<function name>( ... );
}
```
When using the body functions, its recommended to use the args macro to auto determine the number of arguments for the varadic:
```cpp
def_global_body( args( ht_entry, array_ht_entry, hashtable ));
// instead of:
def_global_body( 3, ht_entry, array_ht_entry, hashtable );
```
If a more incremental approach is desired for the body ASTs, `Code def_body( CodeT type )` can be used to create an empty body.
When the members have been populated use: `AST::validate_body` to verify that the members are valid entires for that type.
### Parse construction
A string provided to the API is parsed for the intended language construct.
Interface :
* parse_class
* parse_constructor
* parse_destructor
* parse_enum
* parse_export_body
* parse_extern_link
* parse_friend
* Purposefully are only support forward declares with this constructor.
* parse_function
* parse_global_body
* parse_namespace
* parse_operator
* parse_operator_cast
* parse_struct
* parse_template
* parse_type
* parse_typedef
* parse_union
* parse_using
* parse_variable
Usage:
```cpp
Code <name> = parse_<function name>( string with code );
Code <name> = def_<function name>( ..., parse_<function name>(
<string with code>
));
```
### Untyped constructions
Code ASTs are constructed using unvalidated strings.
Interface :
* token_fmt_va
* token_fmt
* untyped_str
* untyped_fmt
* untyped_token_fmt
During serialization any untyped Code AST has its string value directly injected inline of whatever context the content existed as an entry within.
Even though these are not validated from somewhat correct c/c++ syntax or components, it doesn't mean that Untyped code can be added as any component of a Code AST:
* Untyped code cannot have children, thus there cannot be recursive injection this way.
* Untyped code can only be a child of a parent of body AST, or for values of an assignment (ex: variable assignment).
These restrictions help prevent abuse of untyped code to some extent.
Usage Conventions:
```cpp
Code <name> = def_variable( <type>, <name>, untyped_<function name>(
<string with code>
));
Code <name> = untyped_str( code(
<some code without "" quotes>
));
```
Optionally, `code_str`, and `code_fmt` macros can be used so that the code macro doesn't have to be used:
```cpp
Code <name> = code_str( <some code without "" quotes > )
```
Template metaprogramming in the traditional sense becomes possible with the use of `token_fmt` and parse constructors:
```cpp
StrC value = txt("Something");
char const* template_str = txt(
Code with <key> to replace with token_values
...
);
char const* gen_code_str = token_fmt( "key", value, template_str );
Code <name> = parse_<function name>( gen_code_str );
```
## Predefined Codes
The following are provided predefined by the library as they are commonly used:
* `access_public`
* `access_protected`
* `access_private`
* `attrib_api_export`
* `attrib_api_import`
* `module_global_fragment`
* `module_private_fragment`
* `fmt_newline`
* `param_varaidc` (Used for varadic definitions)
* `pragma_once`
* `preprocess_else`
* `preprocess_endif`
* `spec_const`
* `spec_consteval`
* `spec_constexpr`
* `spec_constinit`
* `spec_extern_linkage` (extern)
* `spec_final`
* `spec_forceinline`
* `spec_global` (global macro)
* `spec_inline`
* `spec_internal_linkage` (internal macro)
* `spec_local_persist` (local_persist macro)
* `spec_mutable`
* `spec_neverinline`
* `spec_override`
* `spec_ptr`
* `spec_pure`
* `spec_ref`
* `spec_register`
* `spec_rvalue`
* `spec_static_member` (static)
* `spec_thread_local`
* `spec_virtual`
* `spec_volatile`
* `t_empty` (Used for varaidc macros)
* `t_auto`
* `t_void`
* `t_int`
* `t_bool`
* `t_char`
* `t_wchar_t`
* `t_class`
* `t_typename`
Optionally the following may be defined if `GEN_DEFINE_LIBRARY_CODE_CONSTANTS` is defined
* `t_b32`
* `t_s8`
* `t_s16`
* `t_s32`
* `t_s64`
* `t_u8`
* `t_u16`
* `t_u32`
* `t_u64`
* `t_sw` (ssize_t)
* `t_uw` (size_t)
* `t_f32`
* `t_f64`
## Extent of operator overload validation
The AST and constructors will be able to validate that the arguments provided for the operator type match the expected form:
* If return type must match a parameter
* If number of parameters is correct
* If added as a member symbol to a class or struct, that operator matches the requirements for the class (types match up)
* There is no support for validating new & delete operations (yet)
The user is responsible for making sure the code types provided are correct
and have the desired specifiers assigned to them beforehand.
## Code generation and modification
There are three provided auxillary interfaces:
* Builder
* Editor
* Scanner
Editor and Scanner are disabled by default, use `GEN_FEATURE_EDITOR` and `GEN_FEATURE_SCANNER` to enable them.
### Builder is a similar object to the jai language's string_builder
* The purpose of it is to generate a file.
* A file is specified and opened for writing using the open( file_path) function.
* The code is provided via print( code ) function will be serialized to its buffer.
* When all serialization is finished, use the write() command to write the buffer to the file.
### Scanner Auxillary Interface
Provides *(eventually)* `scan_file` to automatically populate a CodeFile which contains a parsed AST (`Code`) of the file, with any contextual failures that are reported from the parser.
+62
View File
@@ -0,0 +1,62 @@
// This file was generated automatially by gencpp's bootstrap.cpp (See: https://github.com/Ed94/gencpp)
#include "gen.builder.hpp"
GEN_NS_BEGIN
Builder Builder::open( char const* path )
{
Builder result;
FileError error = file_open_mode( &result.File, EFileMode_WRITE, path );
if ( error != EFileError_NONE )
{
log_failure( "gen::File::open - Could not open file: %s", path );
return result;
}
result.Buffer = String::make_reserve( GlobalAllocator, Builder_StrBufferReserve );
// log_fmt("$Builder - Opened file: %s\n", result.File.filename );
return result;
}
void Builder::pad_lines( s32 num )
{
Buffer.append( "\n" );
}
void Builder::print( Code code )
{
String str = code->to_string();
// const sw len = str.length();
// log_fmt( "%s - print: %.*s\n", File.filename, len > 80 ? 80 : len, str.Data );
Buffer.append( str );
}
void Builder::print_fmt( char const* fmt, ... )
{
sw res;
char buf[GEN_PRINTF_MAXLEN] = { 0 };
va_list va;
va_start( va, fmt );
res = str_fmt_va( buf, count_of( buf ) - 1, fmt, va ) - 1;
va_end( va );
// log_fmt( "$%s - print_fmt: %.*s\n", File.filename, res > 80 ? 80 : res, buf );
Buffer.append( buf, res );
}
void Builder::write()
{
bool result = file_write( &File, Buffer, Buffer.length() );
if ( result == false )
log_failure( "gen::File::write - Failed to write to file: %s\n", file_name( &File ) );
log_fmt( "Generated: %s\n", File.filename );
file_close( &File );
Buffer.free();
}
GEN_NS_END
+24
View File
@@ -0,0 +1,24 @@
// This file was generated automatially by gencpp's bootstrap.cpp (See: https://github.com/Ed94/gencpp)
#pragma once
#include "gen.hpp"
GEN_NS_BEGIN
struct Builder
{
FileInfo File;
String Buffer;
static Builder open( char const* path );
void pad_lines( s32 num );
void print( Code );
void print_fmt( char const* fmt, ... );
void write();
};
GEN_NS_END
+12837
View File
File diff suppressed because it is too large Load Diff
File diff suppressed because it is too large Load Diff
File diff suppressed because it is too large Load Diff
File diff suppressed because it is too large Load Diff
File diff suppressed because it is too large Load Diff
+593
View File
@@ -0,0 +1,593 @@
// This file was generated automatially by gencpp's bootstrap.cpp (See: https://github.com/Ed94/gencpp)
#pragma once
#include "gen.hpp"
GEN_NS_BEGIN
#pragma region ADT
enum ADT_Type : u32
{
EADT_TYPE_UNINITIALISED, /* node was not initialised, this is a programming error! */
EADT_TYPE_ARRAY,
EADT_TYPE_OBJECT,
EADT_TYPE_STRING,
EADT_TYPE_MULTISTRING,
EADT_TYPE_INTEGER,
EADT_TYPE_REAL,
};
enum ADT_Props : u32
{
EADT_PROPS_NONE,
EADT_PROPS_NAN,
EADT_PROPS_NAN_NEG,
EADT_PROPS_INFINITY,
EADT_PROPS_INFINITY_NEG,
EADT_PROPS_FALSE,
EADT_PROPS_TRUE,
EADT_PROPS_NULL,
EADT_PROPS_IS_EXP,
EADT_PROPS_IS_HEX,
// Used internally so that people can fill in real numbers they plan to write.
EADT_PROPS_IS_PARSED_REAL,
};
enum ADT_NamingStyle : u32
{
EADT_NAME_STYLE_DOUBLE_QUOTE,
EADT_NAME_STYLE_SINGLE_QUOTE,
EADT_NAME_STYLE_NO_QUOTES,
};
enum ADT_AssignStyle : u32
{
EADT_ASSIGN_STYLE_COLON,
EADT_ASSIGN_STYLE_EQUALS,
EADT_ASSIGN_STYLE_LINE,
};
enum ADT_DelimStyle : u32
{
EADT_DELIM_STYLE_COMMA,
EADT_DELIM_STYLE_LINE,
EADT_DELIM_STYLE_NEWLINE,
};
enum ADT_Error : u32
{
EADT_ERROR_NONE,
EADT_ERROR_INTERNAL,
EADT_ERROR_ALREADY_CONVERTED,
EADT_ERROR_INVALID_TYPE,
EADT_ERROR_OUT_OF_MEMORY,
};
struct ADT_Node
{
char const* name;
struct ADT_Node* parent;
/* properties */
ADT_Type type : 4;
u8 props : 4;
#ifndef GEN_PARSER_DISABLE_ANALYSIS
u8 cfg_mode : 1;
u8 name_style : 2;
u8 assign_style : 2;
u8 delim_style : 2;
u8 delim_line_width : 4;
u8 assign_line_width : 4;
#endif
/* adt data */
union
{
char const* string;
Array<ADT_Node> nodes; ///< zpl_array
struct
{
union
{
f64 real;
s64 integer;
};
#ifndef GEN_PARSER_DISABLE_ANALYSIS
/* number analysis */
s32 base;
s32 base2;
u8 base2_offset : 4;
s8 exp : 4;
u8 neg_zero : 1;
u8 lead_digit : 1;
#endif
};
};
};
/* ADT NODE LIMITS
* delimiter and assignment segment width is limited to 128 whitespace symbols each.
* real number limits decimal position to 128 places.
* real number exponent is limited to 64 digits.
*/
/**
* @brief Initialise an ADT object or array
*
* @param node
* @param backing Memory allocator used for descendants
* @param name Node's name
* @param is_array
* @return error code
*/
u8 adt_make_branch( ADT_Node* node, AllocatorInfo backing, char const* name, b32 is_array );
/**
* @brief Destroy an ADT branch and its descendants
*
* @param node
* @return error code
*/
u8 adt_destroy_branch( ADT_Node* node );
/**
* @brief Initialise an ADT leaf
*
* @param node
* @param name Node's name
* @param type Node's type (use zpl_adt_make_branch for container nodes)
* @return error code
*/
u8 adt_make_leaf( ADT_Node* node, char const* name, ADT_Type type );
/**
* @brief Fetch a node using provided URI string.
*
* This method uses a basic syntax to fetch a node from the ADT. The following features are available
* to retrieve the data:
*
* - "a/b/c" navigates through objects "a" and "b" to get to "c"
* - "arr/[foo=123]/bar" iterates over "arr" to find any object with param "foo" that matches the value "123", then gets its field called "bar"
* - "arr/3" retrieves the 4th element in "arr"
* - "arr/[apple]" retrieves the first element of value "apple" in "arr"
*
* @param node ADT node
* @param uri Locator string as described above
* @return zpl_adt_node*
*
* @see code/apps/examples/json_get.c
*/
ADT_Node* adt_query( ADT_Node* node, char const* uri );
/**
* @brief Find a field node within an object by the given name.
*
* @param node
* @param name
* @param deep_search Perform search recursively
* @return zpl_adt_node * node
*/
ADT_Node* adt_find( ADT_Node* node, char const* name, b32 deep_search );
/**
* @brief Allocate an unitialised node within a container at a specified index.
*
* @param parent
* @param index
* @return zpl_adt_node * node
*/
ADT_Node* adt_alloc_at( ADT_Node* parent, sw index );
/**
* @brief Allocate an unitialised node within a container.
*
* @param parent
* @return zpl_adt_node * node
*/
ADT_Node* adt_alloc( ADT_Node* parent );
/**
* @brief Move an existing node to a new container at a specified index.
*
* @param node
* @param new_parent
* @param index
* @return zpl_adt_node * node
*/
ADT_Node* adt_move_node_at( ADT_Node* node, ADT_Node* new_parent, sw index );
/**
* @brief Move an existing node to a new container.
*
* @param node
* @param new_parent
* @return zpl_adt_node * node
*/
ADT_Node* adt_move_node( ADT_Node* node, ADT_Node* new_parent );
/**
* @brief Swap two nodes.
*
* @param node
* @param other_node
* @return
*/
void adt_swap_nodes( ADT_Node* node, ADT_Node* other_node );
/**
* @brief Remove node from container.
*
* @param node
* @return
*/
void adt_remove_node( ADT_Node* node );
/**
* @brief Initialise a node as an object
*
* @param obj
* @param name
* @param backing
* @return
*/
b8 adt_set_obj( ADT_Node* obj, char const* name, AllocatorInfo backing );
/**
* @brief Initialise a node as an array
*
* @param obj
* @param name
* @param backing
* @return
*/
b8 adt_set_arr( ADT_Node* obj, char const* name, AllocatorInfo backing );
/**
* @brief Initialise a node as a string
*
* @param obj
* @param name
* @param value
* @return
*/
b8 adt_set_str( ADT_Node* obj, char const* name, char const* value );
/**
* @brief Initialise a node as a float
*
* @param obj
* @param name
* @param value
* @return
*/
b8 adt_set_flt( ADT_Node* obj, char const* name, f64 value );
/**
* @brief Initialise a node as a signed integer
*
* @param obj
* @param name
* @param value
* @return
*/
b8 adt_set_int( ADT_Node* obj, char const* name, s64 value );
/**
* @brief Append a new node to a container as an object
*
* @param parent
* @param name
* @return*
*/
ADT_Node* adt_append_obj( ADT_Node* parent, char const* name );
/**
* @brief Append a new node to a container as an array
*
* @param parent
* @param name
* @return*
*/
ADT_Node* adt_append_arr( ADT_Node* parent, char const* name );
/**
* @brief Append a new node to a container as a string
*
* @param parent
* @param name
* @param value
* @return*
*/
ADT_Node* adt_append_str( ADT_Node* parent, char const* name, char const* value );
/**
* @brief Append a new node to a container as a float
*
* @param parent
* @param name
* @param value
* @return*
*/
ADT_Node* adt_append_flt( ADT_Node* parent, char const* name, f64 value );
/**
* @brief Append a new node to a container as a signed integer
*
* @param parent
* @param name
* @param value
* @return*
*/
ADT_Node* adt_append_int( ADT_Node* parent, char const* name, s64 value );
/* parser helpers */
/**
* @brief Parses a text and stores the result into an unitialised node.
*
* @param node
* @param base
* @return*
*/
char* adt_parse_number( ADT_Node* node, char* base );
/**
* @brief Parses a text and stores the result into an unitialised node.
* This function expects the entire input to be a number.
*
* @param node
* @param base
* @return*
*/
char* adt_parse_number_strict( ADT_Node* node, char* base_str );
/**
* @brief Parses and converts an existing string node into a number.
*
* @param node
* @return
*/
ADT_Error adt_str_to_number( ADT_Node* node );
/**
* @brief Parses and converts an existing string node into a number.
* This function expects the entire input to be a number.
*
* @param node
* @return
*/
ADT_Error adt_str_to_number_strict( ADT_Node* node );
/**
* @brief Prints a number into a file stream.
*
* The provided file handle can also be a memory mapped stream.
*
* @see zpl_file_stream_new
* @param file
* @param node
* @return
*/
ADT_Error adt_print_number( FileInfo* file, ADT_Node* node );
/**
* @brief Prints a string into a file stream.
*
* The provided file handle can also be a memory mapped stream.
*
* @see zpl_file_stream_new
* @param file
* @param node
* @param escaped_chars
* @param escape_symbol
* @return
*/
ADT_Error adt_print_string( FileInfo* file, ADT_Node* node, char const* escaped_chars, char const* escape_symbol );
#pragma endregion ADT
#pragma region CSV
enum CSV_Error : u32
{
ECSV_Error__NONE,
ECSV_Error__INTERNAL,
ECSV_Error__UNEXPECTED_END_OF_INPUT,
ECSV_Error__MISMATCHED_ROWS,
};
typedef ADT_Node CSV_Object;
GEN_DEF_INLINE u8 csv_parse( CSV_Object* root, char* text, AllocatorInfo allocator, b32 has_header );
u8 csv_parse_delimiter( CSV_Object* root, char* text, AllocatorInfo allocator, b32 has_header, char delim );
void csv_free( CSV_Object* obj );
GEN_DEF_INLINE void csv_write( FileInfo* file, CSV_Object* obj );
GEN_DEF_INLINE String csv_write_string( AllocatorInfo a, CSV_Object* obj );
void csv_write_delimiter( FileInfo* file, CSV_Object* obj, char delim );
String csv_write_string_delimiter( AllocatorInfo a, CSV_Object* obj, char delim );
/* inline */
GEN_IMPL_INLINE u8 csv_parse( CSV_Object* root, char* text, AllocatorInfo allocator, b32 has_header )
{
return csv_parse_delimiter( root, text, allocator, has_header, ',' );
}
GEN_IMPL_INLINE void csv_write( FileInfo* file, CSV_Object* obj )
{
csv_write_delimiter( file, obj, ',' );
}
GEN_IMPL_INLINE String csv_write_string( AllocatorInfo a, CSV_Object* obj )
{
return csv_write_string_delimiter( a, obj, ',' );
}
#pragma endregion CSV
// This is a simple file reader that reads the entire file into memory.
// It has an extra option to skip the first few lines for undesired includes.
// This is done so that includes can be kept in dependency and component files so that intellisense works.
Code scan_file( char const* path )
{
FileInfo file;
FileError error = file_open_mode( &file, EFileMode_READ, path );
if ( error != EFileError_NONE )
{
GEN_FATAL( "scan_file: Could not open: %s", path );
}
sw fsize = file_size( &file );
if ( fsize <= 0 )
{
GEN_FATAL( "scan_file: %s is empty", path );
}
String str = String::make_reserve( GlobalAllocator, fsize );
file_read( &file, str, fsize );
str.get_header().Length = fsize;
// Skip GEN_INTELLISENSE_DIRECTIVES preprocessor blocks
// Its designed so that the directive should be the first thing in the file.
// Anything that comes before it will also be omitted.
{
#define current ( *scanner )
#define matched 0
#define move_fwd() \
do \
{ \
++scanner; \
--left; \
} while ( 0 )
const StrC directive_start = txt( "ifdef" );
const StrC directive_end = txt( "endif" );
const StrC def_intellisense = txt( "GEN_INTELLISENSE_DIRECTIVES" );
bool found_directive = false;
char const* scanner = str.Data;
s32 left = fsize;
while ( left )
{
// Processing directive.
if ( current == '#' )
{
move_fwd();
while ( left && char_is_space( current ) )
move_fwd();
if ( ! found_directive )
{
if ( left && str_compare( scanner, directive_start.Ptr, directive_start.Len ) == matched )
{
scanner += directive_start.Len;
left -= directive_start.Len;
while ( left && char_is_space( current ) )
move_fwd();
if ( left && str_compare( scanner, def_intellisense.Ptr, def_intellisense.Len ) == matched )
{
scanner += def_intellisense.Len;
left -= def_intellisense.Len;
found_directive = true;
}
}
// Skip to end of line
while ( left && current != '\r' && current != '\n' )
move_fwd();
move_fwd();
if ( left && current == '\n' )
move_fwd();
continue;
}
if ( left && str_compare( scanner, directive_end.Ptr, directive_end.Len ) == matched )
{
scanner += directive_end.Len;
left -= directive_end.Len;
// Skip to end of line
while ( left && current != '\r' && current != '\n' )
move_fwd();
move_fwd();
if ( left && current == '\n' )
move_fwd();
// sptr skip_size = fsize - left;
if ( ( scanner + 2 ) >= ( str.Data + fsize ) )
{
mem_move( str, scanner, left );
str.get_header().Length = left;
break;
}
mem_move( str, scanner, left );
str.get_header().Length = left;
break;
}
}
move_fwd();
}
#undef move_fwd
#undef matched
#undef current
}
file_close( &file );
return untyped_str( str );
}
#if 0
struct CodeFile
{
using namespace Parser;
String FilePath;
TokArray Tokens;
Array<ParseFailure> ParseFailures;
Code CodeRoot;
};
namespace Parser
{
struct ParseFailure
{
String Reason;
Code Node;
};
}
CodeFile scan_file( char const* path )
{
using namespace Parser;
CodeFile
result = {};
result.FilePath = String::make( GlobalAllocator, path );
Code code = scan_file( path );
result.CodeRoot = code;
ParseContext context = parser_get_last_context();
result.Tokens = context.Tokens;
result.ParseFailures = context.Failures;
return result;
}
#endif
GEN_NS_END