Updates to docs and various changes to project from working on them.

- CodeParam -> CodeParams
- interfaces array wasn't being used in parse_class_struct
- added enum_underlying_sig
2024-12-11 13:33:35 -05:00
parent 9b86d84dd6
commit cb898595b0
31 changed files with 308 additions and 249 deletions


@@ -39,6 +39,8 @@ The full definitions of all asts are within:
* [`ast_types.hpp`](../base/components/ast_types.hpp)
* [`code_types.hpp`](../base/components/ast_types.hpp)
The C/C++ interface procedures are located within `ast.hpp` (for the Code type) and `code_types.hpp` (for all others).
## Serialization
All code types can either serialize using a function of the pattern:
@@ -51,8 +53,6 @@ String <prefix>_to_string(Code code);
Where the first generates a string allocated using `Allocator_StringArena` and the other appends to an existing string using its backing allocator.
Serialization for the AST is defined for `Code` in [`ast.cpp`](../base/components/ast.cpp) with `code_to_string_ptr` & `code_to_string`.
Serialization for the rest of the code types is within [`code_serialization.cpp`](../base/components/code_serialization.cpp).
gencpp's serialization does not provide coherent formatting of the code. The user should use a formatter after.
Serialization for the AST is defined for `Code` in [`ast.cpp`](../base/components/ast.cpp) with `code_to_string_ptr` & `code_to_string`.
Serialization for the rest of the code types is within [`code_serialization.cpp`](../base/components/code_serialization.cpp).
Gencpp's serialization does not provide coherent formatting of the code. The user should use a formatter after serializing.
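
As a rough illustration of the two serialization patterns described above (one returning a fresh string, one appending to an existing string), here is a self-contained sketch. `Node`, `node_to_string`, and `node_to_string_ref` are hypothetical stand-ins that use `std::string` rather than gencpp's `String` and allocators.

```cpp
#include <string>

// Hypothetical stand-in for a code node; not gencpp's AST type.
struct Node {
    std::string keyword;
    std::string name;
};

// Appending form: writes into a string the caller already owns.
void node_to_string_ref(Node const& node, std::string* result) {
    result->append(node.keyword);
    result->append(" ");
    result->append(node.name);
    result->append(";");
}

// Allocating form: builds a new string and returns it.
std::string node_to_string(Node const& node) {
    std::string result;
    node_to_string_ref(node, &result);
    return result;
}
```

Neither form attempts to indent or wrap the output, which matches the note that a formatter should be run afterwards.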


@@ -25,11 +25,11 @@ These are containers representing a scope body of a definition that can be of th
Fields:
```cpp
StringCached Name;
Code Front;
Code Back;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
s32 NumEntries;
```
@@ -37,14 +37,15 @@ s32 NumEntries;
The `Front` member represents the start of the linked list and `Back` the end.
`NumEntries` is the number of entries in the body.
Parent should have a compatible ECode type for the type of definition used.
Parent should have a compatible CodeType type for the type of definition used.
Serialization:
Will output only the entries, the braces are handled by the parent.
```cpp
<Front>...
<Front>
...
<Back>
```
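
To make the body layout concrete, the sketch below walks a linked list from `Front` to `Back` and emits only the entries, leaving the braces to the caller. `Entry`, `Body`, `body_append`, and `body_to_string` are hypothetical names for illustration, not gencpp's types.

```cpp
#include <string>

// Hypothetical entry in a body; gencpp's real nodes carry far more data.
struct Entry {
    std::string text;
    Entry*      next = nullptr;
};

// Hypothetical container mirroring the Front / Back / NumEntries fields.
struct Body {
    Entry* front       = nullptr;
    Entry* back        = nullptr;
    int    num_entries = 0;
};

// Append an entry to the end of the linked list.
void body_append(Body& body, Entry* entry) {
    if (body.front == nullptr) {
        body.front = entry;
    } else {
        body.back->next = entry;
    }
    body.back = entry;
    ++body.num_entries;
}

// Serialize only the entries; the parent is expected to add the braces.
std::string body_to_string(Body const& body) {
    std::string result;
    for (Entry const* e = body.front; e != nullptr; e = e->next) {
        result += e->text;
        result += "\n";
    }
    return result;
}
```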
@@ -56,11 +57,11 @@ Fields:
```cpp
StringCached Content;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
```
@@ -80,11 +81,11 @@ Fields:
```cpp
StringCached Content;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
```
@@ -108,11 +109,11 @@ CodeComment InlineCmt; // Only supported by forward declarations
CodeAttributes Attributes;
CodeType ParentType;
CodeBody Body;
CodeType Prev; // Used to store references to interfaces
CodeType Next; // Used to store references to interfaces
parser::Token* Tok;
Code Parent;
StringCached Name;
CodeType Prev;
CodeType Next;
Token* Tok;
Code Parent;
CodeT Type;
ModuleFlag ModuleFlags;
AccessSpec ParentAccess;
@@ -131,7 +132,7 @@ Serialization:
};
```
You'll notice that only one parent type is supported, along with its parent access. This library only supports single inheritance, the rest must be done through interfaces.
You'll notice that only one parent type is supported, along with its parent access. This library only supports single inheritance; the rest are assumed to be interfaces and are given public access specifiers.
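
A small sketch of how a single parent plus a list of interfaces could be flattened into an inheritance list, with every interface given a public access specifier as described above. `ClassDesc` and `class_header` are hypothetical and only model the idea with `std::string`; the sketch also assumes interfaces only appear when a parent type is present.

```cpp
#include <string>
#include <vector>

// Hypothetical description of a class definition; not gencpp's AST layout.
struct ClassDesc {
    std::string              name;
    std::string              parent_access; // e.g. "public" or "private"
    std::string              parent_type;   // empty when there is no parent
    std::vector<std::string> interfaces;    // always emitted as public
};

// Build the "class <Name> : <ParentAccess> <ParentType>, public <Interface>..." header.
std::string class_header(ClassDesc const& desc) {
    std::string result = "class " + desc.name;
    if (desc.parent_type.empty())
        return result; // assumption: no parent means no interfaces either
    result += " : " + desc.parent_access + " " + desc.parent_type;
    for (std::string const& iface : desc.interfaces)
        result += ", public " + iface;
    return result;
}
```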
## Constructor
@@ -140,13 +141,13 @@ Fields:
```cpp
CodeComment InlineCmt; // Only supported by forward declarations
Code InitializerList;
CodeParam Params;
CodeParams Params;
Code Body;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
```
@@ -178,11 +179,11 @@ Fields:
```cpp
StringCached Content;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
```
@@ -200,11 +201,11 @@ Fields:
CodeComment InlineCmt;
CodeSpecifiers Specs;
Code Body;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
```
@@ -235,24 +236,28 @@ Fields:
CodeComment InlineCmt;
CodeAttributes Attributes;
CodeType UnderlyingType;
Code UnderlyingTypeMacro;
CodeBody Body;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
ModuleFlag ModuleFlags;
```
`UnderlyingTypeMacro` holds a macro the library natively supports, `enum_underlying(type)`, which is meant to behave as a wrapper for underlying type assignment.
`enum_underlying_sig` is a `StrC` global variable that can be set; it will be defined within `PreprocessorDefines` and is used in `parser_parse_enum` to identify a valid macro.
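
For illustration, here is one way such a wrapper macro can be written so that the same enum declaration gets an underlying type in C++ and none in C. The definition below is an assumption about the macro's shape, not copied from the library, and `Color` is a hypothetical enum.

```cpp
#include <type_traits>

// Assumed shape of the wrapper: expands to an underlying type in C++
// and to nothing when compiled as C.
#ifdef __cplusplus
#   define enum_underlying(type) : type
#else
#   define enum_underlying(type)
#endif

// Hypothetical enum declared through the wrapper.
enum Color enum_underlying(unsigned char)
{
    Color_Red,
    Color_Green,
    Color_Blue,
};

// In C++ the enum is backed by the requested type.
static_assert(std::is_same<std::underlying_type<Color>::type, unsigned char>::value,
              "Color should be backed by unsigned char");
```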
Serialization:
```cpp
// Enum_Fwd
<ModuleFlags> enum class <Name> : <UnderlyingType>; <InlineCmt>
<ModuleFlags> enum class <Name> : <UnderlyingType> or <UnderlyingTypeMacro> ; <InlineCmt>
// Enum
<ModuleFlags> <enum or enum class> <Name> : <UnderlyingType>
<ModuleFlags> <enum or enum class> <Name> : <UnderlyingType> or <UnderlyingTypeMacro>
{
<Body>
};
@@ -267,11 +272,11 @@ Fields:
```cpp
StringCached Content;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
```
@@ -287,11 +292,11 @@ Fields:
```cpp
CodeBody Body;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
```
@@ -310,11 +315,11 @@ Fields:
```cpp
StringCached Content;
StringCached Name;
Code Prev;
Code Next;
Code Parent;
parser::Token* Tok;
StringCached Name;
Token* Tok;
CodeT Type;
```
@@ -333,11 +338,11 @@ Fields:
```cpp
CodeComment InlineCmt;
Code Declaration;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
```
@@ -356,13 +361,13 @@ CodeComment InlineCmt;
CodeAttributes Attributes;
CodeSpecifiers Specs;
CodeType ReturnType;
CodeParam Params;
CodeParams Params;
CodeBody Body;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
ModuleFlag ModuleFlags;
```
@@ -385,11 +390,11 @@ Serialization:
Fields:
```cpp
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
ModuleFlag ModuleFlags;
```
@@ -406,11 +411,11 @@ Fields:
```cpp
CodeBody Body;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
ModuleFlag ModuleFlags;
```
@@ -424,7 +429,7 @@ Serialization:
}
```
## Operator Overload
## Operator Overload (Operator)
Fields:
@@ -433,13 +438,13 @@ CodeComment InlineCmt;
CodeAttributes Attributes;
CodeSpecifiers Specs;
CodeType ReturnType;
CodeParam Params;
CodeParams Params;
CodeBody Body;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
ModuleFlag ModuleFlags;
OperatorT Op;
@@ -458,7 +463,7 @@ Serialization:
}
```
## Operator Cast Overload ( User-Defined Type Conversion )
## Operator Cast Overload ( User-Defined Type Conversion, OpCast )
Fields:
@@ -467,11 +472,11 @@ CodeComment InlineCmt;
CodeSpecifiers Specs;
CodeType ValueType;
CodeBody Body;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
```
@@ -488,7 +493,7 @@ Serialization:
}
```
## Parameters (AST_Param)
## Parameters (AST_Params)
Fields:
@@ -496,11 +501,12 @@ Fields:
CodeType ValueType;
Code Macro;
Code Value;
CodeParam Last;
CodeParam Next;
parser::Token* Tok;
Code Parent;
Code PostNameMacro;
StringCached Name;
CodeParams Last;
CodeParams Next;
Token* Tok;
Code Parent;
CodeT Type;
s32 NumEntries;
```
@@ -510,7 +516,7 @@ Serialization:
```cpp
<Macro>, <Next> ... <Last>
<Macro> <ValueType> <Name>, <Next>... <Last>
<Macro> <ValueType> <Name> <PostNameMacro> = <Value>, <Next>... <Last>
```
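
The serialization shapes above can be approximated with a linked list of parameters. This is a sketch only: `Param` and `params_to_string` are hypothetical, use `std::string` fields, and ignore the `Macro` and `PostNameMacro` cases.

```cpp
#include <string>

// Hypothetical parameter node; gencpp's AST_Params holds Code handles instead.
struct Param {
    std::string value_type;
    std::string name;
    std::string value;          // default value, empty if none
    Param*      next = nullptr;
};

// Emit "<ValueType> <Name> = <Value>, <Next>..." for the whole list.
std::string params_to_string(Param const* first) {
    std::string result;
    for (Param const* p = first; p != nullptr; p = p->next) {
        if (p != first)
            result += ", ";
        result += p->value_type + " " + p->name;
        if (!p->value.empty())
            result += " = " + p->value;
    }
    return result;
}
```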
## Pragma
@@ -519,11 +525,11 @@ Fields:
```cpp
StringCached Content;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
```
@@ -539,11 +545,11 @@ Fields:
```cpp
StringCached Content;
StringCached Name;
Code Prev;
Code Next;
paser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
```
@@ -560,11 +566,11 @@ Fields:
```cpp
SpecifierT ArrSpecs[ AST_ArrSpecs_Cap ];
CodeSpecifiers NextSpecs;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
s32 NumEntries;
```
@@ -580,13 +586,13 @@ Serialization:
Fields:
```cpp
CodeParam Params;
CodeParams Params;
Code Declaration;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
ModuleFlag ModuleFlags;
```
@@ -609,23 +615,30 @@ Fields:
CodeAttributes Attributes;
CodeSpecifiers Specs;
CodeReturnType ReturnType;
CodeParam Params;
CodeParams Params;
Code ArrExpr;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
b32 IsParamPack;
ETypenameTag TypeTag;
```
Serialization:
```cpp
<Attributes> <Name> <Specs> <IsParamPack ?: ...>
<Attributes> <TypeTag> <Name> <Specs> <IsParamPack ?: ...>
// Function
<Attributes> <ReturnType> <Name> <Params> <Specs>
```
`<Name>` currently has the full serialization of anything with
*Note: `ArrExpr` is not used in serialization by `typename_to_string_ref`; it's instead handled by a parent AST's serialization (variable, typedef, using).*
## Typedef
Behaves as usual except for function or macro typedefs.
@@ -636,11 +649,11 @@ Fields:
```cpp
CodeComment InlineCmt;
Code UnderlyingType;
StringCached Name;
Code Prev;
Code Next;
parse::Token* Tok
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
ModuleFlag ModuleFlags;
b32 IsFunction;
@@ -650,11 +663,16 @@ Serialization:
```cpp
// Regular
<ModuleFlags> typedef <UnderlyingType> <Name>; <InlineCmt>
<ModuleFlags> typedef <UnderlyingType> <Name> <UnderlyingType-ArrExpr>; <InlineCmt>
// Functions
<ModuleFlags> typedef <ReturnType> <Name>( <Parameters> ); <InlineCmt>
<ModuleFlags> typedef <ReturnType> ( <Expression that yields an Identifier signature> )( <Parameters> ); <InlineCmt>
// Currently:
<ModuleFlags> typedef <UnderlyingType (Serialized expression)>; <InlineCmt>
// Desired: Not handled yet
<ModuleFlags> typedef <UnderlyingType->ReturnType> <UnderlyingType->Name> <UnderlyingType-ArrExpr> ( <UnderlyingType->Parameters> ); <InlineCmt>
<ModuleFlags> typedef <UnderlyingType->ReturnType> ( <Name->Namespace> for<Specs->has(Spec_Ptr) ?: *> <UnderlyingType->Name> <UnderlyingType-ArrExpr> ) ( <UnderlyingType->Parameters> ); <InlineCmt>
```
## Union
@@ -664,11 +682,11 @@ Fields:
```cpp
CodeAttributes Attributes;
CodeBody Body;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
ModuleFlag ModuleFlags;
```
@@ -690,11 +708,11 @@ Fields:
CodeComment InlineCmt;
CodeAttributes Attributes;
CodeType UnderlyingType;
StringCached Name;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
ModuleFlag ModuleFlags;
```
@@ -722,14 +740,15 @@ CodeSpecifiers Specs;
CodeType ValueType;
Code BitfieldSize;
Code Value;
StringCached Name;
CodeVar NextVar;
Code Prev;
Code Next;
parser::Token* Tok;
Token* Tok;
Code Parent;
StringCached Name;
CodeT Type;
ModuleFlag ModuleFlags;
s32 VarParenthesizedInit;
```
Serialization:
@@ -740,4 +759,7 @@ Serialization:
// Bitfield
<ModuleFlags> <Attributes> <Specs> <ValueType> <Name> : <BitfieldSize> = <Value>, NextVar ...; <InlineCmt>
// VarParenthesizedInit
<Attributes> <Specs> <ValueType> <Name>( <Value>, NextVar ... ); <InlineCmt>
```


@@ -8,7 +8,7 @@
The library features a naive single-pass parser tailored for only what the library needs to construct the supported syntax of C++ into its AST for *"front-end"* meta-programming purposes.
This parser does not, and should not do the compiler's job. By only supporting this minimal set of features, the parser is kept (so far) around ~5600 loc. I hope to keep it under 10k loc worst case.
This parser does not, and should not do the compiler's job. By only supporting this minimal set of features, the parser is kept (so far) around ~7000 loc. I hope to keep it under 10k loc worst case.
You can think of this parser as a *frontend parser* vs a *semantic parser*. It's intuitively similar to WYSIWYG. What you ***perceive*** as the syntax from the user-side, before the compiler gets a hold of it, is what you get.


@@ -128,7 +128,7 @@ The following CodeTypes are used which the user may optionally use strong typing
* CodeNS
* CodeOperator
* CodeOpCast
* CodeParam : Has support for `for : range` iterating across parameters.
* CodeParams : Has support for `for : range` iterating across parameters.
* CodePreprocessCond
* CodePragma
* CodeSpecifiers : Has support for `for : range` iterating across specifiers.
@@ -412,4 +412,8 @@ There are two provided auxillary interfaces:
### Scanner Auxiliary Interface
* The purpose is to scan or parse files
* Comes with two basic functions to convert a file to code: `scan_file` and `parse_file`
* `scan_file`: Merely grabs the file and stores it in an untyped Code.
* `parse_file`: Will parse the file using `parse_global_body` and return a `CodeBody`.
* Two basic functions for grabbing columns from a CSV: `parse_csv_one_column` and `parse_csv_two_columns`
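
As a rough idea of what grabbing one column from a CSV involves, here is a self-contained sketch. `csv_column` is a hypothetical helper, not the signature of `parse_csv_one_column`, and it assumes simple comma-separated input without quoted fields.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Collect the cell at the given column index from every row that has one.
std::vector<std::string> csv_column(std::string const& csv, std::size_t column) {
    std::vector<std::string> result;
    std::istringstream lines(csv);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream cells(line);
        std::string cell;
        std::size_t index = 0;
        while (std::getline(cells, cell, ',')) {
            if (index == column) {
                result.push_back(cell);
                break;
            }
            ++index;
        }
    }
    return result;
}
```

Calling it once per column index gives the equivalent of a two-column grab under the same assumptions.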