gencpp/docs/ASTs.md
Ed_ 3e249d9bc5 Reorganization of parser, refactor of parse_type( bool* ) and progression of parser docs
Wanted to make parser implementation easier to sift through, so I emphasized alphabetical order more.

Since I couldn't just strip whitespace from typenames I decided to make the parse_type more aware of the typename's components if it was a function signature.
This ofc lead to the dark & damp hell that is parsing typenames.

Also made initial implementation to support parsing decltype within a typename signature..

The test failure for the singleheader is still a thing, these changes have not addressed that.
2023-09-05 01:48:11 -04:00

ASTs Documentation

While the Readme for docs covers the data layout per AST, this document focuses on the AST types available and their nuances.

Body

These are containers representing the scope body of a definition. A body can have one of the following ECode types:

  • Class_Body
  • Enum_Body
  • Export_Body
  • Extern_Linkage_Body
  • Function_Body
  • Global_Body
  • Namespace_Body
  • Struct_Body
  • Union_Body

Fields:

Code         Front;
Code         Back;
Code         Parent;
StringCached Name;
CodeT        Type;
s32          NumEntries;

The Front member represents the start of the linked list and Back the end. NumEntries is the number of entries in the body.

Parent should have a compatible ECode type for the type of definition used.

Serialization:

Outputs only the entries; the braces are handled by the parent.

<Front>...
<Back>
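
The layout and serialization rule above can be sketched as follows. This is a minimal illustration, not gencpp's implementation: `Entry` and `BodySketch` are hypothetical stand-ins that keep only the Front, Back, and NumEntries bookkeeping, and entry content is assumed to be plain text.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an AST entry; only the fields this sketch needs.
struct Entry
{
    std::string Content;
    Entry*      Next = nullptr;
};

// Hypothetical stand-in for a body AST: a singly linked list of entries.
struct BodySketch
{
    Entry* Front      = nullptr;
    Entry* Back       = nullptr;
    int    NumEntries = 0;

    // Appends to the end of the list and keeps Back and NumEntries in sync.
    void append( Entry* entry )
    {
        assert( entry != nullptr );
        if ( Front == nullptr )
            Front = entry;
        else
            Back->Next = entry;
        Back = entry;
        ++NumEntries;
    }

    // Emits only the entries, one per line; braces are left to the parent.
    std::string to_string() const
    {
        std::string result;
        for ( Entry const* it = Front; it != nullptr; it = it->Next )
        {
            result += it->Content;
            result += '\n';
        }
        return result;
    }
};
```

Appending two entries leaves Front on the first, Back on the second, and NumEntries at 2. A parent such as a struct or namespace would wrap the returned text in its own braces.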

Attributes

Represents standard or vendor-specific C/C++ attributes.

Fields:

StringCached  Content;
Code          Prev;
Code          Next;
Code          Parent;
StringCached  Name;
CodeT         Type;

Serialization:

<Content>

While the parser supports the __declspec and __attribute__ syntax, the upfront constructor ( def_attributes ) requires the user to specify the entire attribute, including the [[]], __declspec or __attribute__ parts.
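
To make the requirement concrete, here is a small sketch of the kind of text the caller is expected to pass. Both functions are hypothetical and not part of gencpp. The prefix check is an assumption used only to show that the content is emitted verbatim, so the caller has to supply the delimiters.

```cpp
#include <cassert>
#include <string>

// Hypothetical check: true when the text already starts with one of the
// attribute syntaxes named above.
inline bool has_attribute_syntax( std::string const& content )
{
    auto starts_with = [ &content ]( char const* prefix )
    {
        return content.rfind( prefix, 0 ) == 0;
    };
    return starts_with( "[[" ) || starts_with( "__declspec" ) || starts_with( "__attribute__" );
}

// Hypothetical serializer: the stored content is returned unchanged.
inline std::string attributes_to_string( std::string const& content )
{
    assert( has_attribute_syntax( content ) );
    return content;
}
```

Under this sketch, `[[nodiscard]]` and `__declspec(dllexport)` are acceptable inputs, while a bare `nodiscard` is not, since nothing would add the brackets for it.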

Comment

Stores a comment.

Fields:

StringCached  Content;
Code          Prev;
Code          Next;
Code          Parent;
StringCached  Name;
CodeT         Type;

Serialization:

<Content>

The parser will preserve comments found within a body or in accepted inline-to-definition locations. Otherwise they will be skipped by the TokArray::__eat and TokArray::current( skip formatting enabled ) functions.

The upfront constructor def_comment expects to receive a comment without the // or /* */ parts. It will add them during construction.
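
As an illustration of that contract, the sketch below takes bare text and adds the markers itself. `make_line_comment` is a hypothetical helper, not gencpp's def_comment. It assumes the `//` form with one prefix per line, which may differ from what the library actually emits.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: prefixes every line of `text` with "// ".
inline std::string make_line_comment( std::string const& text )
{
    // Callers pass bare text; the comment markers are added here.
    assert( text.rfind( "//", 0 ) != 0 && text.rfind( "/*", 0 ) != 0 );

    std::string result = "// ";
    for ( std::string::size_type idx = 0; idx < text.size(); ++idx )
    {
        result += text[ idx ];
        // Start a new comment line unless the newline ends the text.
        if ( text[ idx ] == '\n' && idx + 1 < text.size() )
            result += "// ";
    }
    return result;
}
```

Passing text that already contains `//` would produce doubled markers under this scheme, which is why the constructor asks for the bare comment.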

Class & Struct

Fields:

CodeComment    InlineCmt; // Only supported by forward declarations
CodeAttributes Attributes;
CodeType       ParentType;
CodeBody       Body;
CodeType       Last; // Used to store references to interfaces
CodeType       Next; // Used to store references to interfaces
Code           Parent;
StringCached   Name;
CodeT          Type;
ModuleFlag     ModuleFlags;
AccessSpec     ParentAccess;

Serialization:

// Class_Fwd
<ModuleFlags> <class/struct> <Name>; <InlineCmt>

// Class
<ModuleFlags> <class/struct> <Attributes> <Name> : <ParentAccess> <ParentType>, public <Next>, ...<Last>
{
    <Body>
};

You'll notice that only one parent type is supported, along with its parent access. This library only supports single inheritance; the rest must be done through interfaces.
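
The single-parent-plus-interfaces shape can be sketched like this. `ClassSketch` and `header_to_string` are hypothetical names. The sketch stores interfaces in a vector rather than the Next/Last chain described above, and it assumes interfaces are only emitted when a parent type is present.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, flattened view of a class or struct definition.
struct ClassSketch
{
    std::string              Keyword;      // "class" or "struct"
    std::string              Name;
    std::string              ParentAccess; // e.g. "public"
    std::string              ParentType;   // empty when there is no parent
    std::vector<std::string> Interfaces;   // chained via Next/Last in the real AST
};

// Builds the line that precedes the opening brace.
inline std::string header_to_string( ClassSketch const& def )
{
    assert( ! def.Name.empty() );

    std::string result = def.Keyword + " " + def.Name;
    if ( def.ParentType.empty() )
        return result;

    // Exactly one parent carries an access specifier of its own.
    result += " : " + def.ParentAccess + " " + def.ParentType;

    // Every interface is emitted with public access.
    for ( std::string const& iface : def.Interfaces )
        result += ", public " + iface;

    return result;
}
```

For a struct `Widget` with parent `Base` and interfaces `IDrawable` and `IUpdatable`, this yields `struct Widget : public Base, public IDrawable, public IUpdatable`.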

Constructor

Fields:

CodeComment InlineCmt;  // Only supported by forward declarations
Code        InitializerList;
CodeParam   Params;
Code        Body;
Code        Prev;
Code        Next;
Code        Parent;
CodeT       Type;

Serialization:

// Constructor_Fwd
<Specs> <Parent->Name>( <Params> ); <InlineCmt>

// Constructor
<Specs> <Parent->Name>( <Params> ): <InitializerList>
{
    <Body>
}

Define

Represents a preprocessor define.

Fields:

StringCached  Content;
Code          Prev;
Code          Next;
Code          Parent;
StringCached  Name;
CodeT         Type;

Serialization:

#define <Name> <Content>

Destructor

Fields:

CodeComment    InlineCmt;
CodeSpecifiers Specs;
Code           Body;
Code           Prev;
Code           Next;
Code           Parent;
CodeT          Type;

Serialization:

// Destructor_Fwd
<Specs> ~<Parent->Name>() <Specs>; <InlineCmt>

// Destructor
<Specs> ~<Parent->Name>() <Specs>
{
    <Body>
}

Enum

Fields:

CodeComment    InlineCmt;
CodeAttributes Attributes;
CodeType       UnderlyingType;
CodeBody       Body;
Code           Prev;
Code           Next;
Code           Parent;
StringCached   Name;
CodeT          Type;
ModuleFlag     ModuleFlags;

Serialization:

// Enum_Fwd
<ModuleFlags> enum class <Name> : <UnderlyingType>; <InlineCmt>

// Enum
<ModuleFlags> <enum or enum class> <Name> : <UnderlyingType>
{
    <Body>
};

Execution

Just represents an execution body. Equivalent to an untyped body. Will be obsolete when function body parsing is implemented.

Fields:

StringCached Content;
Code         Prev;
Code         Next;
Code         Parent;
StringCached Name;
CodeT        Type;

Serialization:

<Content>

External Linkage

Fields:

CodeBody     Body;
Code         Prev;
Code         Next;
Code         Parent;
StringCached Name;
CodeT        Type;

Serialization:

extern "<Name>"
{
    <Body>
}

Include

Fields:

StringCached Content;
Code         Prev;
Code         Next;
Code         Parent;
StringCached Name;
CodeT        Type;

Serialization:

#include <Content>

Friend

This library (until it becomes necessary because some third-party library requires otherwise) does not support friend declarations with in-statement function definitions.

Fields:

CodeComment  InlineCmt;
Code         Declaration;
Code         Prev;
Code         Next;
Code         Parent;
StringCached Name;
CodeT        Type;

Serialization:

friend <Declaration>; <InlineCmt>

Function

Fields:

CodeComment    InlineCmt;
CodeAttributes Attributes;
CodeSpecifiers Specs;
CodeType       ReturnType;
CodeParam      Params;
CodeBody       Body;
Code           Prev;
Code           Parent;
Code           Next;
StringCached   Name;
CodeT          Type;
ModuleFlag     ModuleFlags;

Serialization:

// Function_Fwd
<ModuleFlags> <Attributes> <Specs> <ReturnType> <Name>( <Params> ) <Specs>; <InlineCmt>

// Function
<ModuleFlags> <Attributes> <Specs> <ReturnType> <Name>( <Params> ) <Specs>
{
    <Body>
}
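
`<Specs>` appears twice in the templates above because some specifiers are written before the return type and others after the parameter list. The sketch below shows one way that split could work for a forward declaration. The function names and the set of trailing specifiers are assumptions for illustration, not gencpp's actual classification.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumption for this sketch: these specifiers follow the parameter list;
// everything else is written in front of the return type.
inline bool is_trailing( std::string const& spec )
{
    return spec == "const" || spec == "noexcept" || spec == "override" || spec == "final";
}

// Hypothetical serializer for the Function_Fwd shape.
inline std::string function_fwd_to_string(
    std::vector<std::string> const& specs,
    std::string const&              return_type,
    std::string const&              name,
    std::string const&              params )
{
    assert( ! name.empty() );

    std::string result;
    for ( std::string const& spec : specs )
        if ( ! is_trailing( spec ) )
            result += spec + " ";

    result += return_type + " " + name;
    result += params.empty() ? std::string( "()" ) : "( " + params + " )";

    for ( std::string const& spec : specs )
        if ( is_trailing( spec ) )
            result += " " + spec;

    return result + ";";
}
```

With specifiers `inline`, `const`, `noexcept` on a member function `size` returning `int`, the result is `inline int size() const noexcept;`.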

Module

Fields:

Code         Prev;
Code         Next;
Code         Parent;
StringCached Name;
CodeT        Type;
ModuleFlag   ModuleFlags;

Serialization:

<ModuleFlags> module <Name>;

Namespace

Fields:

CodeBody     Body;
Code         Prev;
Code         Next;
Code         Parent;
StringCached Name;
CodeT        Type;
ModuleFlag   ModuleFlags;

Serialization:

<ModuleFlags> namespace <Name>
{
    <Body>
}

Operator Overload

Fields:

CodeComment    InlineCmt;
CodeAttributes Attributes;
CodeSpecifiers Specs;
CodeType       ReturnType;
CodeParam      Params;
CodeBody       Body;
Code           Prev;
Code           Next;
Code           Parent;
StringCached   Name;
CodeT          Type;
ModuleFlag     ModuleFlags;
OperatorT      Op;

Serialization:

// Operator_Fwd
<ModuleFlags> <Attributes> <Specs> <ReturnType> operator <Op>( <Params> ) <Specs>; <InlineCmt>

// Operator
<ModuleFlags> <Attributes> <Specs> <ReturnType> <Name>operator <Op>( <Params> ) <Specs>
{
    <Body>
}

Operator Cast Overload ( User-Defined Type Conversion )

Fields:

CodeComment    InlineCmt;
CodeSpecifiers Specs;
CodeType       ValueType;
CodeBody       Body;
Code           Prev;
Code           Next;
Code           Parent;
StringCached   Name;
CodeT          Type;

Serialization:

// Operator_Cast_Fwd
<Specs> operator <ValueType>() <Specs>; <InlineCmt>

// Operator_Cast
<Specs> <Name>operator <ValueType>() <Specs>
{
    <Body>
}

Parameters

Fields:

CodeType     ValueType;
Code         Value;
CodeParam    Last;
CodeParam    Next;
Code         Parent;
StringCached Name;
CodeT        Type;
s32          NumEntries;

Serialization:

<ValueType> <Name>, <Next>... <Last>
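
The chain behind that template can be sketched as below. `ParamSketch` is a hypothetical stand-in that keeps only the ValueType, Name, Next, and Last fields. It assumes that the head node is itself the first parameter and that Last is only tracked on the head.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a parameter AST node.
struct ParamSketch
{
    std::string  ValueType;
    std::string  Name;
    ParamSketch* Next = nullptr; // following parameter, if any
    ParamSketch* Last = nullptr; // end of the chain; tracked on the head only
};

// Links `entry` after the current end of the chain that starts at `head`.
inline void append_param( ParamSketch& head, ParamSketch& entry )
{
    assert( &head != &entry );
    if ( head.Last == nullptr )
        head.Next = &entry;
    else
        head.Last->Next = &entry;
    head.Last = &entry;
}

// Walks from the head through Next and joins the entries with ", ".
inline std::string params_to_string( ParamSketch const& head )
{
    std::string result;
    for ( ParamSketch const* it = &head; it != nullptr; it = it->Next )
    {
        if ( it != &head )
            result += ", ";
        result += it->ValueType + " " + it->Name;
    }
    return result;
}
```

Keeping Last on the head makes appends constant time, which is the usual reason for storing both ends of a singly linked list.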

Pragma

Fields:

StringCached Content;
Code         Prev;
Code         Next;
Code         Parent;
StringCached Name;
CodeT        Type;

Serialization:

#pragma <Content>

Preprocessor Conditional

Fields:

StringCached Content;
Code         Prev;
Code         Next;
Code         Parent;
StringCached Name;
CodeT        Type;

Serialization:

#<based off Type> <Content>
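
"Based off Type" means the directive keyword is chosen from the node's type. A minimal sketch of such a mapping follows. The `CondKind` enum is hypothetical and stands in for the corresponding ECode values, whose exact names are not listed in this document.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the preprocessor conditional ECode values.
enum class CondKind { If, IfDef, IfNotDef, ElIf, Else, EndIf };

// Picks the directive from the kind, then appends the content if present.
inline std::string conditional_to_string( CondKind kind, std::string const& content )
{
    std::string directive;
    switch ( kind )
    {
        case CondKind::If:       directive = "#if";     break;
        case CondKind::IfDef:    directive = "#ifdef";  break;
        case CondKind::IfNotDef: directive = "#ifndef"; break;
        case CondKind::ElIf:     directive = "#elif";   break;
        case CondKind::Else:     directive = "#else";   break;
        case CondKind::EndIf:    directive = "#endif";  break;
    }
    assert( ! directive.empty() );

    // #else and #endif normally carry no content.
    if ( content.empty() )
        return directive;
    return directive + " " + content;
}
```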

Specifiers

Fields:

SpecifierT        ArrSpecs[ AST::ArrSpecs_Cap ];
Code              Prev;
Code              Next;
Code              Parent;
StringCached      Name;
CodeT             Type;
s32               NumEntries;

Serialization:

<Spec>, ...

Template

Fields:

CodeParam    Params;
Code         Declaration;
Code         Prev;
Code         Next;
Code         Parent;
StringCached Name;
CodeT        Type;
ModuleFlag   ModuleFlags;

Serialization:

<ModuleFlags>
template< <Params> >
<Declaration>

Typename

Typenames represent the type "symbol".

Fields:

CodeAttributes Attributes;
CodeSpecifiers Specs;
CodeReturnType ReturnType;
CodeParam      Params;
Code           ArrExpr;
Code           Prev;
Code           Next;
Code           Parent;
StringCached   Name;
CodeT          Type;
b32            IsParamPack;

Serialization:

<Attributes> <Name> <Specs> <IsParamPack ?: ...>

Typedef

Typedefs behave as usual, except for function or macro typedefs.
Those do not use the UnderlyingType field, as everything is serialized under the Name field.

Fields:

CodeComment  InlineCmt;
Code         UnderlyingType;
Code         Prev;
Code         Next;
Code         Parent;
StringCached Name;
CodeT        Type;
ModuleFlag   ModuleFlags;
b32          IsFunction;

Serialization:

// Regular
<ModuleFlags> typedef <UnderlyingType> <Name>; <InlineCmt>

// Functions
<ModuleFlags> typedef <Name>; <InlineCmt>
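
The difference between the two forms can be sketched as follows. `TypedefSketch` is a hypothetical flattened view in which both fields are plain strings. For the function form the whole declarator is assumed to already be in Name.

```cpp
#include <cassert>
#include <string>

// Hypothetical, flattened view of a typedef node.
struct TypedefSketch
{
    std::string UnderlyingType; // unused when IsFunction is true
    std::string Name;
    bool        IsFunction;
};

inline std::string typedef_to_string( TypedefSketch const& def )
{
    assert( ! def.Name.empty() );

    // Function typedefs carry the full signature inside Name.
    if ( def.IsFunction )
        return "typedef " + def.Name + ";";

    return "typedef " + def.UnderlyingType + " " + def.Name + ";";
}
```

A regular typedef with UnderlyingType `unsigned int` and Name `u32` gives `typedef unsigned int u32;`. A function typedef whose Name is `void (*Callback)( int )` gives `typedef void (*Callback)( int );`, since the new name sits in the middle of the signature and cannot be split out cleanly.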

Union

Fields:

CodeAttributes Attributes;
CodeBody       Body;
Code           Prev;
Code           Next;
Code           Parent;
StringCached   Name;
CodeT          Type;
ModuleFlag     ModuleFlags;

Serialization:

<ModuleFlags> union <Attributes> <Name>
{
    <Body>
}

Using

Fields:

CodeComment    InlineCmt;
CodeAttributes Attributes;
CodeType       UnderlyingType;
Code           Prev;
Code           Next;
Code           Parent;
StringCached   Name;
CodeT          Type;
ModuleFlag     ModuleFlags;

Serialization:

// Regular
<ModuleFlags> using <Attributes> <Name> = <UnderlyingType>; <InlineCmt>

// Namespace
<ModuleFlags> using namespace <Name>; <InlineCmt>

Variable

Fields:

CodeComment    InlineCmt;
CodeAttributes Attributes;
CodeSpecifiers Specs;
CodeType       ValueType;
Code           BitfieldSize;
Code           Value;
Code           Prev;
Code           Next;
Code           Parent;
StringCached   Name;
CodeT          Type;
ModuleFlag     ModuleFlags;

Serialization:

// Regular
<ModuleFlags> <Attributes> <Specs> <ValueType> <Name> = <Value>; <InlineCmt>

// Bitfield
<ModuleFlags> <Attributes> <Specs> <ValueType> <Name> : <BitfieldSize> = <Value>; <InlineCmt>
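
The two variable forms differ only in the optional bitfield width. A minimal sketch, with every field flattened to a string and module flags, attributes, and inline comments left out, might look like this. `VariableSketch` and `variable_to_string` are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical, flattened view of a variable node; empty means "not set".
struct VariableSketch
{
    std::string Specs;
    std::string ValueType;
    std::string Name;
    std::string BitfieldSize;
    std::string Value;
};

inline std::string variable_to_string( VariableSketch const& var )
{
    assert( ! var.ValueType.empty() && ! var.Name.empty() );

    std::string result;
    if ( ! var.Specs.empty() )
        result += var.Specs + " ";

    result += var.ValueType + " " + var.Name;

    // The width goes between the name and any initializer.
    if ( ! var.BitfieldSize.empty() )
        result += " : " + var.BitfieldSize;

    if ( ! var.Value.empty() )
        result += " = " + var.Value;

    return result + ";";
}
```

This produces `static int counter = 0;` for a regular variable and `unsigned flags : 3;` for a bitfield without an initializer.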