Large updates to docs

This commit is contained in:
Edward R. Gonzalez 2024-12-10 19:31:50 -05:00
parent a18b5b97aa
commit 8891657eb1
20 changed files with 344 additions and 196 deletions

View File

@ -1,14 +0,0 @@
{
"files": [
{
"path": "project/auxillary/vis_ast/dependencies/temp/raylib-master/src/rcamera.h",
"bookmarks": [
{
"line": 140,
"column": 14,
"label": ""
}
]
}
]
}

View File

@ -20,7 +20,7 @@
"windowsSdkVersion": "10.0.19041.0",
"compilerPath": "C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.29.30133/bin/HostX64/x64/cl.exe",
"intelliSenseMode": "msvc-x64",
"compileCommands": "C:\\projects\\gencpp\\.vscode\\tasks.json",
"compileCommands": "${workspaceFolder}/.vscode/tasks.json",
"compilerArgs": [
"/EHsc-",
"/GR-",
@ -47,12 +47,12 @@
"windowsSdkVersion": "10.0.19041.0",
"compilerPath": "C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.29.30133/bin/HostX64/x64/cl.exe",
"intelliSenseMode": "msvc-x64",
"compileCommands": "C:\\projects\\gencpp\\.vscode\\tasks.json"
"compileCommands": "${workspaceFolder}/.vscode/tasks.json"
},
{
"name": "Win32 clang",
"includePath": [
"${workspaceFolder}/**"
"${workspaceFolder}/base/**"
],
"defines": [
"_DEBUG",
@ -65,9 +65,9 @@
"INTELLISENSE_DIRECTIVES"
],
"windowsSdkVersion": "10.0.19041.0",
"compilerPath": "C:/Users/Ed/scoop/apps/llvm/current/bin/clang++.exe",
"compilerPath": "clang++.exe",
"intelliSenseMode": "windows-clang-x64",
"compileCommands": ".vscode/tasks.json"
"compileCommands": "${workspaceFolder}/.vscode/tasks.json"
}
],
"version": 4

45
.vscode/launch.json vendored
View File

@ -4,39 +4,20 @@
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"type": "lldb",
"request": "launch",
"name": "Debug gentime lldb",
"program": "${workspaceFolder}/test/test.exe",
"args": [],
"cwd": "${workspaceFolder}/test/",
"postRunCommands": [
]
},
{
"type": "cppvsdbg",
"request": "launch",
"name": "Debug gentime vsdbg",
"program": "${workspaceFolder}/test/build/test.exe",
"name": "Debug base vsdbg",
"program": "${workspaceFolder}/base/build/base.exe",
"args": [],
"cwd": "${workspaceFolder}/test/",
"visualizerFile": "${workspaceFolder}/scripts/gencpp.natvis"
},
{
"type": "cppvsdbg",
"request": "launch",
"name": "Debug bootstrap vsdbg",
"program": "${workspaceFolder}/project/build/bootstrap.exe",
"args": [],
"cwd": "${workspaceFolder}/project/",
"cwd": "${workspaceFolder}/base/",
"visualizerFile": "${workspaceFolder}/scripts/gencpp.natvis"
},
{
"type": "cppvsdbg",
"request": "launch",
"name": "Debug singleheader vsdbg",
"program": "${workspaceFolder}/singleheader/build/gencpp_singleheader.exe",
"program": "${workspaceFolder}/singleheader/build/singleheader.exe",
"args": [],
"cwd": "${workspaceFolder}/singleheader/",
"visualizerFile": "${workspaceFolder}/scripts/gencpp.natvis"
@ -49,24 +30,6 @@
"args": [],
"cwd": "${workspaceFolder}/unreal_engine/",
"visualizerFile": "${workspaceFolder}/scripts/gencpp.natvis"
},
{
"type": "cppvsdbg",
"request": "launch",
"name": "Debug raylib refactor vsdbg",
"program": "${workspaceFolder}/project/auxillary/vis_ast/dependencies/raylib/build/raylib_refactor.exe",
"args": [],
"cwd": "${workspaceFolder}/project/auxillary/vis_ast/dependencies/temp/raylib-master/src/",
"visualizerFile": "${workspaceFolder}/scripts/gencpp.natvis"
},
{
"type": "cppvsdbg",
"request": "launch",
"name": "Debug VIS AST",
"program": "${workspaceFolder}/project/auxillary/vis_ast/binaries/vis_ast.exe",
"args": [],
"cwd": "${workspaceFolder}/project/auxillary/vis_ast/binaries/",
"visualizerFile": "${workspaceFolder}/scripts/gencpp.natvis"
}
]
}

View File

@ -2,23 +2,34 @@
An attempt at simple staged metaprogramming for C/C++.
The library API is a composition of code element constructors, and a non-standards-compliant single-pass C/C++ parser.
The library API is a composition of code element constructors, and a non-standards-compliant single-pass C/C++ parser.
These build up a code AST to then serialize with a file builder, or can be traversed for staged-reflection of C/C++ code.
This code base attempts follow the [handmade philosophy](https://handmade.network/manifesto).
Its not meant to be a black box metaprogramming utility, it should be easy to intergrate into a user's project domain.
# Documentation
## Documentation
[]
If your going to metaprogram, you can never have enough docs on your tooling...
* [docs - General](./docs/Readme.md): Overview and additional docs
* [AST_Design](./docs/AST_Design.md): Overvie of ASTs
* [AST Types](./docs/AST_Types.md): Listing of all AST types along with their Code type interface.
* [Parsing](./docs/Parsing.md): Overview of the parsing interface.
* [Parser Algo](./docs/Parser_Algo.md): In-depth breakdown of the parser's implementation.
* [base](./base/Readme.md): Essential (base) library.
* [gen_c_library](./gen_c_library/Readme.md): C11 library variant generation (single header and segmeented).
* [gen_segmented](./gen_segemented/Readme.md): Segemented C++ (`gen.<hpp/cpp>`, `gen.dep.<hpp/cpp>`) generation
* [gen_singleheader](./gen_singleheader/Readme.md): Singlehader C++ generation `gen.hpp`
* [gen_unreal_engine](./gen_unreal_engine/Readme.md): Unreal Engine thirdparty code generation.
## Notes
**On Partial Hiatus: Life has got me tackling other issues..**
I will be passively updating the library with bug fixes and minor improvements as I use it for my personal projects.
I will be passively updating the library with bug fixes and minor improvements as I use it for my personal projects.
There won't be any major reworks or features to this thing for a while.
This project is still in development (very much an alpha state), so expect bugs and missing features.
This project is still in development (very much an alpha state), so expect bugs and missing features.
See [issues](https://github.com/Ed94/gencpp/issues) for a list of known bugs or todos.
The library can already be used to generate code just fine, but the parser is where the most work is needed. If your C++ isn't "down to earth" expect issues.
@ -56,7 +67,7 @@ u32 gen_main()
#endif
```
The design uses a constructive builder API for the code to generate.
The design uses a constructive builder API for the code to generate.
The user is provided `Code` objects that are used to build up the AST.
Example using each construction interface:
@ -112,9 +123,9 @@ Code header = code_str(
);
```
`name` is a helper macro for providing a string literal with its size, intended for the name parameter of functions.
`code` is a helper macro for providing a string literal with its size, but intended for code string parameters.
`args` is a helper macro for providing the number of arguments to varadic constructors.
`name` is a helper macro for providing a string literal with its size, intended for the name parameter of functions.
`code` is a helper macro for providing a string literal with its size, but intended for code string parameters.
`args` is a helper macro for providing the number of arguments to varadic constructors.
`code_str` is a helper macro for writting `untyped_str( code( <content> ))`
All three constrcuton interfaces will generate the following C code:
@ -128,7 +139,7 @@ struct ArrayHeader
};
```
**Note: The formatting shown here is not how it will look. For your desired formatting its recommended to run a pass through the files with an auto-formatter.**
**Note: The formatting shown here is not how it will look. For your desired formatting its recommended to run a pass through the files with an auto-formatter.**
*(The library currently uses clang-format for formatting, beware its pretty slow...)*
## Building

View File

@ -1,4 +1,10 @@
# Documentation
## Navigation
# base
[Top](../Readme.md)
* [docs](../docs/Readme.md)
The library is fragmented into a series of headers and source files meant to be scanned in and then generated to a standard target format, or a user's desires.
@ -17,17 +23,108 @@ Standard formats:
* **`Scanner`**: Interface to load up `Code` from files two basic funcctions are currently provided.
* `scan_file`: Used mainly by the library format generators to directly scan files into untyped `Code` (raw string content, pre-formatted no AST parsed).
* `parse_file`: Used to read file and then parsed to populate a `CodeBody` AST.
* CSV parsing via one or two columns simplified.
* **gen_segemetned**: Dependencies go into gen.dep.{hpp/cpp} and components into gen.{hpp/cpp}
* **gen_singleheader**: Everything into a single file: gen.hpp
* **gen_unreal_engine**: Like gen_segemented but the library is modified slightly to compile as a thirdparty library within an Unreal Engine plugin or module.
* **gen_c_library**: The library is heavily modifed into C11 compliant code. A segemented and single-header set of variants are generatd.
Code not making up the core library is located in `auxiliary/<auxiliary_name>.<hpp/cpp>`. These are optional extensions or tools for the library.
*Note: A variant of the C++ library could be generated where those additonal support features are removed (see gen_c_library implementation for an idea of how)*
## Dependencies
The project has no external dependencies beyond:
* `errno.h`
* `stat.h`
* `stdarg.h`
* `stddef.h`
* `stdio.h`
* `copyfile.h` (Mac)
* `types.h` (Linux)
* `sys/man.h` (Linux)
* `fcntl.h` (POSXIX Filesystem)
* `unistd.h` (Linux/Mac)
* `intrin.h` (Windows)
* `io.h` (Windows with gcc)
* `windows.h` (Windows)
Dependencies for the project are wrapped within `GENCPP_ROLL_OWN_DEPENDENCIES` (Defining it will disable them).
The majority of the dependency's implementation was derived from the [c-zpl library](https://github.com/zpl-c/zpl).
See the following files for any updates:
* [`platform.hpp`](./dependencies/platform.hpp)
* [`src_start.cpp`](./dependencies/src_start.cpp)
* [`filesystem.cpp`](./dependencies/filesystem.cpp)
* [`memory.cpp`](./dependencies/memory.cpp)
## Conventions
This library was written in a subset of C++ where the following are not used at all:
* RAII (Constructors/Destructors), lifetimes are managed using named static or regular functions.
* Language provide dynamic dispatch, RTTI
* Object-Oriented Inheritance
* Exceptions
Polymorphic & Member-functions are used as an ergonomic choice, along with a conserative use of operator overloads.
The base library itself does not use anything but C-like features to allow for generating a derviative compatiable with C (WIP).
## C++ template usage
There are only 4 template definitions in the entire library (C++ versions). (`Array<Type>`, `Hashtable<Type>`, `swap<Type>`, and `tmpl_cast<CodeT>(CodeT code)`)
Two generic templated containers are used throughout the library:
* `template< class Type> struct Array`
* `template< class Type> struct HashTable`
`tmpl_cast<CodeT>(CodeT code)` is just an alternative way to explicitly cast to code. Its usage is wrapped in a macro called `cast` for the base library (needed for interoperability with C).
`template< class Type> swap( Type& a, Type& b)` is used over a macro.
Otherwise the library is free of any templates.
## Macro usage
Since this is a meta-programming library, it was desired to keep both templates and macros (especially macros) usage very limited.
Most macros are defined within [macros.hpp](./dependencies/macros.hpp).
The most advanced macro usage is `num_args` which is a helper for counting the number of arguments of another macro.
Any large macros used implementing the gen interface or parser are going to be phased out in favor of just forcinlined functions.
*(Unless there is a hot-path that requires them)*
The vast majority of macros should be single-line subsitutions that either add:
* Improvements to searching
* Inteniality of keyword usage
* A feature that only the preprocessor has (ex: function name reflection or stringifying)
* Compatibility of statements or expressions bewteen C & C++ that cannot be parsed by gencpp itself.
* Masking highly verbose syntax (the latter is getting phased out).
[gen_c_library](../gen_c_library/) has the most advanced set of macros for c11's generic selection.
* A significant amount of explicit code geneeration is utilized to keep abuse of the preprocessor to the absolute minimum.
* There is a heavy set of documentation inlined wth them; their naming is also highly verbose and explicit.
* See its documentation for more information.
## On base code generation
There are ***five*** header files which are automatically generated by [base_codegen.hpp](./helpers/base_codegen.hpp). They are all located in [components/gen](./components/gen/).
* [`ecode.hpp`](./components/gen/ecode.hpp): `CodeType` enum definition and related implementaiton. Generation is based off of [`ECodeType.csv](./enums/ECodeTypes.csv).
* [`especifier.hpp`](./components/gen/especifier.hpp): `Specifier` enum definition, etc. Generated using [`ESpecifier.csv`](./enums/ESpecifier.csv).
* [`eoperator.hpp`](./components/gen/eoperator.hpp): `Operator` enum definition, etc. Generated using [`EOperator.hpp`](./enums/EOperator.csv).
* [`etoktype.cpp`](./components/gen/etoktype.cpp): `TokType` enum defininition, etc. Used by the lexer and parser backend. Uses two csvs:
* [`ETokType.csv`](./enums/ETokType.csv): Provides the enum entries and their strinng ids.
* [`AttributeTokens.csv`](./enums/AttributeTokens.csv): Provides tokens entries that should be considered as attributes by the lexer and parser. Sspecfiically macro attributes such as those use for exporting symbols.
* [`ast_inlines.hpp`](./components/gen/ast_inlines.hpp): Member trivial `operator` definitions for C++ code types. Does not use a csv.
## On multi-threading
Currently unsupported. I want the library to be *stable* and *correct*, with the addition of exhausting all basic single-threaded optimizations before I consider multi-threading.

View File

@ -240,22 +240,22 @@ void init()
}
if (Allocator_DataArrays.Proc == nullptr) {
Allocator_DataArrays = heap();
Allocator_DataArrays = GlobalAllocator;
}
if (Allocator_CodePool.Proc == nullptr ) {
Allocator_CodePool = heap();
Allocator_CodePool = GlobalAllocator;
}
if (Allocator_Lexer.Proc == nullptr) {
Allocator_Lexer = heap();
Allocator_Lexer = GlobalAllocator;
}
if (Allocator_StringArena.Proc == nullptr) {
Allocator_StringArena = heap();
Allocator_StringArena = GlobalAllocator;
}
if (Allocator_StringTable.Proc == nullptr) {
Allocator_StringTable = heap();
Allocator_StringTable = GlobalAllocator;
}
if (Allocator_TypeTable.Proc == nullptr) {
Allocator_TypeTable = heap();
Allocator_TypeTable = GlobalAllocator;
}
// Setup the arrays

View File

@ -60,7 +60,7 @@ struct Opts_def_struct {
s32 num_interfaces;
ModuleFlag mflags;
};
CodeClass def_class( StrC name, Opts_def_struct otps GEN_PARAM_DEFAULT );
CodeClass def_class( StrC name, Opts_def_struct opts GEN_PARAM_DEFAULT );
struct Opts_def_constructor {
CodeParam params;
@ -69,7 +69,10 @@ struct Opts_def_constructor {
};
CodeConstructor def_constructor( Opts_def_constructor opts GEN_PARAM_DEFAULT );
CodeDefine def_define( StrC name, StrC content );
struct Opts_def_define {
b32 dont_append_preprocess_defines;
};
CodeDefine def_define( StrC name, StrC content, Opts_def_define opts GEN_PARAM_DEFAULT );
struct Opts_def_destructor {
Code body;

View File

@ -8,6 +8,8 @@
// Publically Exposed Interface
void parser_define_macro( StrC )
CodeClass parse_class( StrC def )
{
GEN_USING_NS_PARSER;

View File

@ -417,7 +417,6 @@ OpValidateResult operator__validate( Operator op, CodeParam params_code, CodeTyp
return InvalidCode;
#pragma endregion Helper Marcos
/*
The implementaiton of the upfront constructors involves doing three things:
* Validate the arguments given to construct the intended type of AST is valid.
@ -616,7 +615,7 @@ CodeClass def_class( StrC name, Opts_def_struct p )
return result;
}
CodeDefine def_define( StrC name, StrC content )
CodeDefine def_define( StrC name, StrC content, Opts_def_define p )
{
name_check( def_define, name );
@ -640,6 +639,17 @@ CodeDefine def_define( StrC name, StrC content )
else
result->Content = get_cached_string( string_to_strc(string_fmt_buf(GlobalAllocator, "%SC\n", content)) );
b32 append_preprocess_defines = ! p.dont_append_preprocess_defines;
if ( append_preprocess_defines ) {
// Add the define to PreprocessorDefines for usage in parsing
s32 lex_id_len = 0;
for (; lex_id_len < result->Name.Len; ++ lex_id_len ) {
if ( reuslt->Name.Ptr[lex_id_len] == '(' )
break;
}
StrC lex_id = { lex_id_len, result->Name.Ptr };
array_append(PreprocessorDefines, lex_id );
}
return result;
}

View File

@ -1,6 +1,6 @@
#ifdef GEN_INTELLISENSE_DIRECTIVES
#pragma once
#include "gen.hpp"
#include "../gen.hpp"
#endif
#pragma region StaticData

View File

@ -47,6 +47,7 @@ void clang_format_file( char const* path, char const* style_path )
system( command );
log_fmt("\tclang-format finished formatting.\n");
}
// Will refactor a file with the given script at the provided path.
// Assumes refactor is defined in an user-exposed or system enviornment PATH.
// (See: ./gencpp/scripts/build.ci.ps1 for how)
@ -65,6 +66,8 @@ void refactor_file( char const* path, char const* refactor_script )
#undef refactor
}
// Does either of the above or both to the provided code.
// Code returned will be untyped content (its be serialized)
Code code_refactor_and_format( Code code, char const* scratch_path, char const* refactor_script, char_const* clang_format_sytle_path )
{
GEN_ASSERT_NOT_NULL(code);

View File

@ -1,44 +1,58 @@
## Navigation
[Top](../Readme.md)
<- [docs - General](Readme.md)
## Current Design
`AST` is the actual managed node object for the library.
Its raw and really not meant to be used directly.
All user interaction must be with its pointer so the type they deal with is `AST*`.
For user-facing code, they should never be giveen a nullptr. Instead, they should be given a designated `Invalid` AST node.
In order to abstract away constant use of `AST*` its wrapped in a Code type which can be either:
In order to abstract away constant use of `AST*`, I wanted to provide a wrapper for it.
The simpliest being just a type alias.
```cpp
using Code = AST*;
When its the [C generated variant of the library](../gen_c_library/)
```c
typedef AST* Code;
tyepdef AST_<name>* Code<name>;
...
```
This is what the genc library would have to use due to its constraints of a langauge.
The actual content per type of AST is covered within [AST_Types.md](AST_Types.md).
These are pure PODS that just have the lay members relevant to the type of AST node they represent.
Each of them has a Code type alias specific to it.
Again, the simpliest case for these would be a type alias.
**or**
For C++:
```cpp
using struct AST_Typedef CodeTypedef;
struct Code {
AST* ast;
};
struct Code<name> {
...
AST_<name>* ast;
};
```
As of November 21st, 2023, the AST has had a strict layout for how its content is laid out.
This will be abandoned during its redesign that will occur starting with support for statments & expressions for either execution and type declarations.
Having a strict layout is too resctrictive vs allowing each AST type to have maximum control over the layout.
The full definitions of all asts are within:
The redesign will occur after the following todos are addressed:
* [`ast.hpp`](../base/components/ast.hpp)
* [`ast_types.hpp`](../base/components/ast_types.hpp)
* [`code_types.hpp`](../base/components/ast_types.hpp)
* [Improvements Lexer & Token struct#27](https://github.com/Ed94/gencpp/issues/27)
* [Generalize AST Flags to a single 4-byte flag#42](https://github.com/Ed94/gencpp/issues/42)
* [AST-Code Object Redesign.#38](https://github.com/Ed94/gencpp/issues/38)
* [Code-AST Documentation#40](https://github.com/Ed94/gencpp/issues/40)
* [AST::debug_str() improvements#33](https://github.com/Ed94/gencpp/issues/33)
* [AST::is_equal implemented and works with singleheader-test#31](https://github.com/Ed94/gencpp/issues/31)
* [Parser : Add ability to have a parse failure and continue with errors recorded.#35](https://github.com/Ed94/gencpp/issues/35)
* [Scanner : Add CodeFile#29](https://github.com/Ed94/gencpp/issues/29)
* [Auxiliary : AST visual debugger#36](https://github.com/Ed94/gencpp/issues/36)
## Serialization
All code types can either serialize using a function of the pattern:
```c
String <prefix>_to_string(Code code);
// or
<prefix>_to_string(Code code, String& result);
```
Where the first generates strings allocated using Allocator_StringArena and the other appends an existing strings with their backed allocator.
Serialization of for the AST is defined for `Code` in [`ast.chpp`](../base/components/ast.cpp) with `code_to_string_ptr` & `code_to_string`.
Serializtion for the rest of the code types is within [`code_serialization.cpp`](../base/components/code_serialization.cpp).
gencpp's serialization does not provide coherent formatting of the code. The user should use a formatter after.

View File

@ -1,10 +1,16 @@
## Navigation
[Top](../Readme.md)
<- [docs - General](Readme.md)
# AST Types Documentation
While the Readme for docs covers the data layout per AST, this will focus on the AST types avaialble, and their nuances.
## Body
These are containers representing a scope body of a definition that can be of the following `ECode` type:
These are containers representing a scope body of a definition that can be of the following `CodeType` type:
* Class_Body
* Enum_Body

View File

@ -1,3 +1,9 @@
## Navigation
[Top](../Readme.md)
<- [docs - General](Readme.md)
# Parser's Algorithim
gencpp uses a hand-written recursive descent parser. Both the lexer and parser currently handle a full C/C++ file in a single pass.
@ -7,7 +13,7 @@ gencpp uses a hand-written recursive descent parser. Both the lexer and parser c
### Lexer
The lex procedure does the lexical pass of content provided as a `StrC` type.
The tokens are stored (for now) in `gen::parser::Tokens`.
The tokens are stored (for now) in `gen::parser::Lexer_Tokens`.
Fields:
@ -16,7 +22,7 @@ Array<Token> Arr;
s32 Idx;
```
What token types are supported can be found in [ETokType.csv](../project/enums/ETokType.csv) you can also find the token types in [ETokType.h](../project/components/gen/etoktype.cpp) , which is the generated enum from the csv file.
What token types are supported can be found in [ETokType.csv](../base/enums/ETokType.csv) you can also find the token types in [ETokType.h](../base/components/gen/etoktype.cpp) , which is the generated enum from the csv file.
Tokens are defined with the struct `gen::parser::Token`:

View File

@ -1,10 +1,16 @@
## Navigation
[Top](../Readme.md)
<- [docs - General](Readme.md)
# Parsing
The library features a naive parser tailored for only what the library needs to construct the supported syntax of C++ into its AST.
The library features a naive single-pass parser tailored for only what the library needs to construct the supported syntax of C++ into its AST for *"front-end"* meta-programming purposes.
This parser does not, and should not do the compiler's job. By only supporting this minimal set of features, the parser is kept (so far) around ~5600 loc. I hope to keep it under 10k loc worst case.
You can think of this parser of a frontend parser vs a semantic parser. Its intuitively similar to WYSIWYG. What you precerive as the syntax from the user-side before the compiler gets a hold of it, is what you get.
You can think of this parser as *frontend parser* vs a *semantic parser*. Its intuitively similar to WYSIWYG. What you ***precerive*** as the syntax from the user-side before the compiler gets a hold of it, is what you get.
User exposed interface:
@ -47,10 +53,11 @@ The keywords supported for the preprocessor are:
* endif
* pragma
Each directive `#` line is considered one preproecessor unit, and will be treated as one Preprocessor AST. *These ASTs will be considered members or entries of braced scope they reside within*.
Each directive `#` line is considered one preproecessor unit, and will be treated as one Preprocessor AST.
If a directive is used with an unsupported keyword its will be processed as an untyped AST.
The preprocessor lines are stored as members of their associated scope they are parsed within. ( Global, Namespace, Class/Struct )
The preprocessor lines are stored as members of their associated scope they are parsed within. ( Global, Namespace, Class/Struct )
***Again (Its not standard): These ASTs will be considered members or entries of braced scope they reside within***
Any preprocessor definition abuse that changes the syntax of the core language is unsupported and will fail to parse if not kept within an execution scope (function body, or expression assignment).
Exceptions:
@ -59,6 +66,8 @@ Exceptions:
* Disable with: `#define GEN_PARSER_DISABLE_MACRO_FUNCTION_SIGNATURES`
* typedefs allow for a preprocessed macro: `typedef MACRO();`
* Disable with: `#define GEN_PARSER_DISABLE_MACRO_TYPEDEF`
* Macros can behave as typenames
* There is some macro support in paramters for functions or templates *(Specifically added to support parsing Unreal Engine source)*.
*(Exceptions are added on an on-demand basis)*
*(See functions `parse_operator_function_or_variable` and `parse_typedef` )*
@ -67,15 +76,23 @@ Adding your own exceptions is possible by simply modifying the parser to allow f
*Note: You could interpret this strictness as a feature. This would allow the user to see if their codebase or a third-party's codebase some some egregious preprocessor abuse.*
If a macro is not defined withint e scope of parsing a set of files, it can be defined beforehand by:
* Appending the [`PreprocessorDefines`](https://github.com/Ed94/gencpp/blob/a18b5b97aa5cfd20242065cbf53462a623cd18fa/base/components/header_end.hpp#L137) array.
* For functional macros a "(" just needs to be added after the name like: `<name>(` so that it will tokenize its arguments as part of the token during lexing.
* Defining a CodeDefine using `def_define`. The definition will be processed by the interface for user into `PreprocessorDefines`.
* This can be prevented by setting the optional prameter `dont_append_preprocess_defines`.
The lexing and parsing takes shortcuts from whats expected in the standard.
* Numeric literals are not checked for validity.
* The parse API treats any execution scope definitions with no validation and are turned into untyped Code ASTs.
* The parse API treats any execution scope definitions with no validation and are turned into untyped Code ASTs. (There is a [todo](https://github.com/Ed94/gencpp/issues/49) to add support)
* *This includes the assignment of variables.*
* Attributes ( `[[]]` (standard), `__declspec` (Microsoft), or `__attribute__` (GNU) )
* Assumed to *come before specifiers* (`const`, `constexpr`, `extern`, `static`, etc) for a function or right afterthe return type.
* Or in the usual spot for class, structs, (*right after the declaration keyword*)
* typedefs have attributes with the type (`parse_type`)
* Parsing attributes can be extended to support user defined macros by defining `GEN_DEFINE_ATTRIBUTE_TOKENS` (see `gen.hpp` for the formatting)
* This is useful for example: parsing Unreal `Module_API` macros.
Empty lines used throughout the file are preserved for formatting purposes during ast serialization.

View File

@ -1,46 +1,15 @@
## Documentation
# General Docs
The project has no external dependencies beyond:
[Top](../Readme.md)
* `errno.h`
* `stat.h`
* `stdarg.h`
* `stddef.h`
* `stdio.h`
* `copyfile.h` (Mac)
* `types.h` (Linux)
* `unistd.h` (Linux/Mac)
* `intrin.h` (Windows)
* `io.h` (Windows with gcc)
* `windows.h` (Windows)
Contains:
Dependencies for the project are wrapped within `GENCPP_ROLL_OWN_DEPENDENCIES` (Defining it will disable them).
The majority of the dependency's implementation was derived from the [c-zpl library](https://github.com/zpl-c/zpl).
* [AST_Design](./AST_Design.md): Overvie of ASTs
* [AST Types](./AST_Types.md): Listing of all AST types along with their Code type interface.
* [Parsing](./Parsing.md): Overview of the parsing interface.
* [Parser Algo](./Parser_Algo.md): In-depth breakdown of the parser's implementation.
This library was written in a subset of C++ where the following are not used at all:
* RAII (Constructors/Destructors), lifetimes are managed using named static or regular functions.
* Language provide dynamic dispatch, RTTI
* Object-Oriented Inheritance
* Exceptions
Polymorphic & Member-functions are used as an ergonomic choice, along with a conserative use of operator overloads.
The base library itself does not use anything but C-like features to allow for generating a derviative compatiable with C (WIP).
There are only 4 template definitions in the entire library (C++ versions). (`Array<Type>`, `Hashtable<Type>`, `swap<Type>`, and `AST/Code::cast<Type>`)
Two generic templated containers are used throughout the library:
* `template< class Type> struct Array`
* `template< class Type> struct HashTable`
Both Code and AST definitions have a `template< class Type> Code/AST :: cast()`. Its just an alternative way to explicitly cast to each other.
`template< class Type> swap( Type& a, Type& b)` is used over a macro.
Otherwise the library is free of any templates.
### *WHAT IS NOT PROVIDED*
### *CURRENTLY UNSUPPORTED*
**There is no support for validating expressions.**
Its a [todo](https://github.com/Ed94/gencpp/issues/49)
@ -60,48 +29,49 @@ Its a [todo](https://github.com/Ed94/gencpp/issues/21)
### Feature Macros:
* `GEN_DEFINE_ATTRIBUTE_TOKENS` : Allows user to define their own attribute macros for use in parsing.
* This is auto-generated if using the bootstrap or single-header generation
* *Note: The user will use the `AttributeTokens.csv` when the library is fully self-hosting.*
* This can be generated using base.cpp.
* `GEN_DEFINE_LIBRARY_CORE_CONSTANTS` : Optional typename codes as they are non-standard to C/C++ and not necessary to library usage
* `GEN_DONT_ENFORCE_GEN_TIME_GUARD` : By default, the library ( gen.hpp/ gen.cpp ) expects the macro `GEN_TIME` to be defined, this disables that.
* `GEN_ENFORCE_STRONG_CODE_TYPES` : Enforces casts to filtered code types.
* `GEN_EXPOSE_BACKEND` : Will expose symbols meant for internal use only.
* `GEN_ROLL_OWN_DEPENDENCIES` : Optional override so that user may define the dependencies themselves.
* `GEN_DONT_ALLOW_INVALID_CODE` (Not implemented yet) : Will fail when an invalid code is constructed, parsed, or serialized.
* `GEN_C_LIKE_PP` : Will prevent usage of function defnitions using references and structs with member functions.
Structs will still have user-defined operator conversions, for-range support, and other operator overloads
* `GEN_C_LIKE_PP` : Setting to `<true or 1>` Will prevent usage of function defnitions using references and structs with member functions. Structs will still have user-defined operator conversions, for-range support, and other operator overloads
### The Data & Interface
As mentioned in root readme, the user is provided Code objects by calling the constructor's functions to generate them or find existing matches.
The AST is managed by the library and provided to the user via its interface.
The AST is managed by the library and provided to the user via its interface.
However, the user may specifiy memory configuration.
[Data layout of AST struct (Subject to heavily change with upcoming todos)](../base/components/ast.hpp#L396-461)
https://github.com/Ed94/gencpp/blob/eea4ebf5c40d5d87baa465abfb1be30845b2377e/base/components/ast.hpp#L396-L461
*`CodeT` is a typedef for `ECode::Type` which has an underlying type of `u32`*
*`OperatorT` is a typedef for `EOperator::Type` which has an underlying type of `u32`*
*`StringCahced` is a typedef for `String const`, to denote it is an interned string*
*`String` is the dynamically allocated string type for the library*
*`CodeType` is enum taggin the type of code. Has an underlying type of `u32`*
*`OperatorT` is a typedef for `EOperator::Type` which has an underlying type of `u32`*
*`StringCahced` is a typedef for `String const`, to denote it is an interned string*
*`String` is the dynamically allocated string type for the library*
AST widths are setup to be AST_POD_Size.
AST widths are setup to be AST_POD_Size.
The width dictates how much the static array can hold before it must give way to using an allocated array:
```cpp
constexpr static
usize ArrSpecs_Cap =
int AST_ArrSpecs_Cap =
(
AST_POD_Size
- sizeof(AST*) * 3
- sizeof(Code)
- sizeof(StringCached)
- sizeof(CodeT)
- sizeof(Code) * 2
- sizeof(Token*)
- sizeof(Code)
- sizeof(CodeType)
- sizeof(ModuleFlag)
- sizeof(u32)
)
/ sizeof(SpecifierT) -1; // -1 for 4 extra bytes (Odd num of AST*)
/ sizeof(Specifier) - 1;
```
*Ex: If the AST_POD_Size is 128 the capacity of the static array is 20.*
@ -110,7 +80,7 @@ Data Notes:
* The allocator definitions used are exposed to the user incase they want to dictate memory usage
* You'll find the memory handling in `init`, `deinit`, `reset`, `gen_string_allocator`, `get_cached_string`, `make_code`.
* Allocators are defined with the `AllocatorInfo` structure found in `dependencies\memory.hpp`
* Allocators are defined with the `AllocatorInfo` structure found in [`memory.hpp`](../base/dependencies/memory.hpp)
* Most of the work is just defining the allocation procedure:
```cpp
@ -118,30 +88,30 @@ Data Notes:
```
* ASTs are wrapped for the user in a Code struct which is a wrapper for a AST* type.
* Both AST and Code have member symbols but their data layout is enforced to be POD types.
* Code types have member symbols but their data layout is enforced to be POD types.
* This library treats memory failures as fatal.
* Cached Strings are stored in their own set of arenas. AST constructors use cached strings for names, and content.
* `StringArenas`, `StringCache`, `Allocator_StringArena`, and `Allocator_StringTable` are the associated containers or allocators.
* Strings used for serialization and file buffers are not contained by those used for cached strings.
* They are currently using `GlobalAllocator`, which are tracked array of arenas that grows as needed (adds buckets when one runs out).
* Memory within the buckets is not reused, so its inherently wasteful.
* I will be augmenting the single arena with a simple slag allocator.
* Linked lists used children nodes on bodies, and parameters.
* I will be augmenting the default allocator with virtual memory & a slab allocator in the [future](https://github.com/Ed94/gencpp/issues/12)
* Intrusive linked lists used children nodes on bodies, and parameters.
* Its intended to generate the AST in one go and serialize after. The constructors and serializer are designed to be a "one pass, front to back" setup.
* Allocations can be tuned by defining the folloiwng macros:
* Allocations can be tuned by defining the folloiwng macros (will be moved to runtime configuration in the future):
* `GEN_GLOBAL_BUCKET_SIZE` : Size of each bucket area for the global allocator
* `GEN_CODEPOOL_NUM_BLOCKS` : Number of blocks per code pool in the code allocator
* `GEN_SIZE_PER_STRING_ARENA` : Size per arena used with string caching.
* `GEN_MAX_COMMENT_LINE_LENGTH` : Longest length a comment can have per line.
* `GEN_MAX_NAME_LENGTH` : Max length of any identifier.
* `GEN_MAX_UNTYPED_STR_LENGTH` : Max content length for any untyped code.
* `GEN_TOKEN_FMT_TOKEN_MAP_MEM_SIZE` : token_fmt_va uses local_persit memory of this size for the hashtable.
* `TokenMap_FixedArena` : token_fmt_va uses local_persit memory of this arena type for the hashtable.
* `GEN_LEX_ALLOCATOR_SIZE`
* `GEN_BUILDER_STR_BUFFER_RESERVE`
The following CodeTypes are used which the user may optionally use strong typing with if they enable: `GEN_ENFORCE_STRONG_CODE_TYPES`
* CodeBody : Has support for `for-range` iterating across Code objects.
* CodeBody : Has support for `for : range` iterating across Code objects.
* CodeAttributes
* CodeComment
* CodeClass
@ -158,13 +128,13 @@ The following CodeTypes are used which the user may optionally use strong typing
* CodeNS
* CodeOperator
* CodeOpCast
* CodeParam : Has support for `for-range` iterating across parameters.
* CodeParam : Has support for `for : range` iterating across parameters.
* CodePreprocessCond
* CodePragma
* CodeSpecifiers : Has support for `for-range` iterating across specifiers.
* CodeSpecifiers : Has support for `for : range` iterating across specifiers.
* CodeStruct
* CodeTemplate
* CodeType
* CodeTypename
* CodeTypedef
* CodeUnion
* CodeUsing
@ -249,6 +219,7 @@ Code <name>
```
When using the body functions, its recommended to use the args macro to auto determine the number of arguments for the varadic:
```cpp
def_global_body( args( ht_entry, array_ht_entry, hashtable ));
@ -329,6 +300,7 @@ Code <name> = untyped_str( code(
```
Optionally, `code_str`, and `code_fmt` macros can be used so that the code macro doesn't have to be used:
```cpp
Code <name> = code_str( <some code without "" quotes > )
```
@ -358,8 +330,8 @@ The following are provided predefined by the library as they are commonly used:
* `module_global_fragment`
* `module_private_fragment`
* `fmt_newline`
* `param_varaidc` (Used for varadic definitions)
* `pragma_once`
* `param_varaidc` (Used for varadic definitions)
* `preprocess_else`
* `preprocess_endif`
* `spec_const`
@ -375,6 +347,7 @@ The following are provided predefined by the library as they are commonly used:
* `spec_local_persist` (local_persist macro)
* `spec_mutable`
* `spec_neverinline`
* `spec_noexcept`
* `spec_override`
* `spec_ptr`
* `spec_pure`
@ -406,8 +379,8 @@ Optionally the following may be defined if `GEN_DEFINE_LIBRARY_CODE_CONSTANTS` i
* `t_u16`
* `t_u32`
* `t_u64`
* `t_sw` (ssize_t)
* `t_uw` (size_t)
* `t_ssize` (ssize_t)
* `t_usize` (size_t)
* `t_f32`
* `t_f64`
@ -425,7 +398,7 @@ and have the desired specifiers assigned to them beforehand.
## Code generation and modification
There are three provided auxillary interfaces:
There are two provided auxillary interfaces:
* Builder
* Scanner
@ -439,4 +412,4 @@ There are three provided auxillary interfaces:
### Scanner Auxillary Interface
Provides *(eventually)* `scan_file` to automatically populate a CodeFile which contains a parsed AST (`Code`) of the file, with any contextual failures that are reported from the parser.

23
gen_c_library/Readme.md Normal file
View File

@ -0,0 +1,23 @@
## Navigation
# base
[Top](../Readme.md)
* [docs](../docs/Readme.md)
# C Library Generation
`c_library.cpp` generates both *segemnted* and *singleheader* variants of the library compliant with C11.
The output will be in the `gen_segmented/gen` directory (if the directory does not exist, it will create it).
If using the library's provided build scripts:
```ps1
.\build.ps1 <compiler> <debug or omit> c_library
```
All free from tag identifiers will be prefixed with `gen_` or `GEN_` as the namespace. This can either be changed after generation with a `.refactor` script (or your preferred subst method), OR by modifying c_library.refactor.
**If c_library.refactor is modified you may need to modify c_library.cpp and its [components](./components/). As some of the container generation relies on that prefix.**

View File

@ -1,4 +1,12 @@
# Generate Segemented Library
## Navigation
# base
[Top](../Readme.md)
* [docs](../docs/Readme.md)
# Segemented Library Generation
The principal (user) files are `gen.hpp` and `gen.cpp`.
They contain includes for its various components: `components/<component_name>.<hpp/cpp>`
@ -6,10 +14,8 @@ They contain includes for its various components: `components/<component_name>.<
Dependencies are bundled into `gen.dep.<hpp/cpp>`. They are included in `gen.<hpp/cpp>` before component includes.
Just like the `gen.<hpp/cpp>` they include their components: `dependencies/<dependency_name>.<hpp/cpp>`
If using the library's provided build scripts:
Both libraries use *pre-generated* (self-hosting I guess) version of the library to then generate the latest version of itself.
They dedicated header and source files. Dependencies included at the top of the file and each header starting with a pragma once.
The output will be in the `project/gen` directory (if the directory does not exist, it will create it).
Use those to get a general idea of how to make your own tailored version.
```ps1
.\build.ps1 <compiler> <debug or omit> segmented
```

View File

@ -1,4 +1,18 @@
## Navigation
# base
[Top](../Readme.md)
* [docs](../docs/Readme.md)
# Singleheader
Creates a single header file version of the library using `singleheader.cpp`.
Follows the same convention seen in the gb, stb, and zpl libraries.
If using the library's provided build scripts:
```ps1
.\build.ps1 <compiler> <debug or omit> singleheader
```

View File

@ -1,3 +1,17 @@
## Navigation
# base
[Top](../Readme.md)
* [docs](../docs/Readme.md)
# Unreal Engine Version Generator
This generates a variant of gencpp thats compatiable with use as a thirdparty module within a plugin or module of an Unreal Project or the Engine itself.
This generates a variant of gencpp thats compatiable with use as a thirdparty module within a plugin or module of an Unreal Project or the Engine itself.
If using the library's provided build scripts:
```ps1
.\build.ps1 <compiler> <debug or omit> unreal
```