Rework of project implementation

For include and multi-file support. I still need to debug it.
Tests will be adjusted as well; I want to get all the files, not just zpl, refactored using a PowerShell script.

I dropped the idea of semantically identifying macros. While it may be possible, I don't see the utility versus the regular identifier distinction.
I want to keep the refactoring as simple as possible, where it takes just one pass to go through a file without any context from other files.

So far the ignores behave as a good guard filter for unwanted refactors, and the only truly weak area was the includes (which should be alleviated by the coming support for them).
Edward R. Gonzalez 2023-03-17 02:09:19 -04:00
parent d44f7ed6fa
commit 231c893c6b
11 changed files with 1074 additions and 587 deletions


@ -1,9 +1,10 @@
# refactor
A code identifier refactoring app. Intended for c/c++ like identifiers.
Refactor c/c++ files (and probably others) with ease.
Parameters :
* `-num` : Used if more than one source file is provided (if used, number of destination files provided MUST MATCH).
* `-src` : Source file to refactor
* `-dst` : Destination file after the refactor (omit to use the same as source)
* `-spec` : Specification containing rules to use for the refactor.
@ -11,18 +12,33 @@ Parameters :
Syntax :
* `not` Omit word or namespace.
* `include` Preprocessor include <file> related identifiers.
* `word` Fixed sized identifier.
* `namespace` Variable sized identifiers, mainly intended to redefine c-namespace of an identifier.
* `,` is used to delimit arguments to word or namespace.
* `L-Value` is the signature to modify.
* `R-Value` is the substitute ( only available if rule does not use `not` keyword )
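To make the rule syntax concrete, here is a minimal sketch of how a single rule line could be broken into its parts. It uses the standard library rather than ZPL, and the names `Rule`, `RuleKind`, and `parse_rule` are hypothetical, chosen only for illustration. Like the syntax described above, it only accepts alphanumeric/underscore tokens, so it is an approximation of the rule format, not the tool's actual parser.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical types for illustration only.
enum class RuleKind { Word, Namespace, Include };

struct Rule
{
    bool        ignore = false;           // true when the line starts with `not`
    RuleKind    kind   = RuleKind::Word;
    std::string sig;                      // L-Value: signature to match
    std::string sub;                      // R-Value: substitute (empty if none given)
};

static void skip_spaces( std::string_view& s )
{
    while ( ! s.empty() && std::isspace( static_cast<unsigned char>( s.front() ) ) )
        s.remove_prefix( 1 );
}

// Consumes a run of [A-Za-z0-9_] from the front of `s`.
static std::string_view next_token( std::string_view& s )
{
    std::size_t len = 0;
    while ( len < s.size() && ( std::isalnum( static_cast<unsigned char>( s[len] ) ) || s[len] == '_' ) )
        len++;
    std::string_view tok = s.substr( 0, len );
    s.remove_prefix( len );
    return tok;
}

// Parses one rule line; returns nullopt for blank lines, `//` comments, or malformed rules.
std::optional<Rule> parse_rule( std::string_view line )
{
    skip_spaces( line );
    if ( line.empty() || line.substr( 0, 2 ) == "//" )
        return std::nullopt;

    Rule rule;
    std::string_view tok = next_token( line );
    if ( tok == "not" )
    {
        rule.ignore = true;
        skip_spaces( line );
        tok = next_token( line );
    }

    if      ( tok == "word" )      rule.kind = RuleKind::Word;
    else if ( tok == "namespace" ) rule.kind = RuleKind::Namespace;
    else if ( tok == "include" )   rule.kind = RuleKind::Include;
    else return std::nullopt;

    skip_spaces( line );
    rule.sig = std::string( next_token( line ) );
    if ( rule.sig.empty() )
        return std::nullopt;

    // A substitute is only read when the rule is not an ignore (`not`) rule.
    skip_spaces( line );
    if ( ! rule.ignore && ! line.empty() && line.front() == ',' )
    {
        line.remove_prefix( 1 );
        skip_spaces( line );
        rule.sub = std::string( next_token( line ) );
    }
    return rule;
}
```

For example, `parse_rule( "word zpl_size, uw" )` yields a `Word` rule with signature `zpl_size` and substitute `uw`, while `parse_rule( "not namespace zpl_" )` yields an ignore rule with no substitute.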
The only keyword here exclusive to c/c++ is `include`, as it searches specifically for `#include <L-Value>`.
However, the rest of the categorical keywords (word, namespace) can be used for any language.
There is no semantic awareness; this is truly just a simple find and replace, but with some specifiable filters, and
words/namespaces restricted to the rules for C/C++ identifiers (alphanumeric or underscores only).
The main benefit of using this over other tools is that it's faster and more ergonomic for large refactors on libraries that
you may want to have automated in a script.
There are other, more robust programs for doing that sort of thing, but I was not able to find something this simple.
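As a rough illustration of the "find and replace restricted to identifier rules" idea, the sketch below replaces a signature only when it matches a complete identifier. It assumes that a `word` rule should never match part of a longer identifier; the function name `refactor_word` is hypothetical and this is not the tool's actual implementation (it uses the standard library and requires C++17).

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

static bool is_ident_char( char c )
{
    return std::isalnum( static_cast<unsigned char>( c ) ) || c == '_';
}

// Replaces every whole-identifier occurrence of `sig` in `src` with `sub`.
// Single pass, no semantic awareness: it only looks at runs of [A-Za-z0-9_].
std::string refactor_word( std::string_view src, std::string_view sig, std::string_view sub )
{
    std::string out;
    out.reserve( src.size() );

    std::size_t pos = 0;
    while ( pos < src.size() )
    {
        if ( is_ident_char( src[pos] ) )
        {
            // Consume the full identifier so partial matches are impossible.
            std::size_t end = pos;
            while ( end < src.size() && is_ident_char( src[end] ) )
                end++;

            std::string_view ident = src.substr( pos, end - pos );
            out += ( ident == sig ) ? sub : ident;
            pos = end;
        }
        else
        {
            // Anything that is not part of an identifier is copied through untouched.
            out += src[pos++];
        }
    }
    return out;
}
```

With this approach, `refactor_word( "zpl_size n = zpl_size_of(x);", "zpl_size", "uw" )` rewrites the first identifier but leaves `zpl_size_of` alone, since it is a different identifier.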
**Note**
* Building for debug provides some nice output with context on a per-line basis.
* Release will only show errors for asserts (that will kill the refactor early).
* If the refactor crashes, the files previously written to will retain their changes.
Make sure to have the code backed up on a VCS or in some other way.
* This was compiled using meson with ninja and clang on Windows 11. The ZPL library used, however, should work fine on the other major OS platforms and compiler vendors.
* The scripts used for building and otherwise are in the scripts directory and are all in PowerShell (with the exception of meson.build). Technically there should be a PowerShell package available on other platforms, but worst case it should be pretty easy to port these scripts to whichever shell script you'd prefer.
TODO:
* Possibly come up with a better name.
* Cleanup memory usage (it hogs quite a bit for what it does..)
* Split lines of file and refactor it that way instead (better debug, probably negligible performance loss, worst case can have both depending on build type)
* Accept multiple files at once `-files`
* Add support for `macro` keyword (single out macro identifiers)
* Add support for `include` keyword (single out include definitions)
* Add support for auto-translating a namespace in a macro to a cpp namespace
* Test to see how much needs to be ported for other platforms (if at all)
* Provide binaries in the release page for github. (debug and release builds)


@ -85,7 +85,7 @@ namespace Memory
void setup()
{
arena_init_from_allocator( & Global_Arena, heap(), megabytes(10) );
arena_init_from_allocator( & Global_Arena, heap(), megabytes(1) );
if ( Global_Arena.total_size == 0 )
{

37
project/Bloat.cpp Normal file

@ -0,0 +1,37 @@
#define BLOAT_IMPL
#include "bloat.hpp"
namespace Memory
{
static zpl_arena Global_Arena {};
void setup()
{
zpl_arena_init_from_allocator( & Global_Arena, zpl_heap(), zpl_megabytes(2) );
if ( Global_Arena.total_size == 0 )
{
zpl_assert_crash( "Failed to reserve memory for Tests:: Global_Arena" );
}
}
void resize( uw new_size )
{
void* new_memory = zpl_resize( zpl_heap(), Global_Arena.physical_start, Global_Arena.total_size, new_size );
if ( new_memory == nullptr )
{
fatal("Failed to resize global arena!");
}
Global_Arena.physical_start = new_memory;
Global_Arena.total_size = new_size;
}
void cleanup()
{
zpl_arena_free( & Global_Arena);
}
}

159
project/IO.cpp Normal file

@ -0,0 +1,159 @@
#include "IO.hpp"
namespace IO
{
using array_string = zpl_array( zpl_string );
namespace StaticData
{
array_string Sources = nullptr;
array_string Destinations = nullptr;
zpl_string Specification = nullptr;
// Current source and destination index.
// Used to keep track of which file get_next_source or write refer to.
uw Current = 0;
char* Current_Content = nullptr;
uw Current_Size = 0;
uw Largest_Src_Size = 0;
/*
Will persist throughout loading different file content.
Should hold a bit more than the largest source file's content,
As an array of lines.
*/
zpl_arena MemPerist;
/*
Temporary memory held while processing files to get their content.
zpl_files are stored here
*/
// zpl_arena MemTransient;
}
using namespace StaticData;
void prepare()
{
const sw num_srcs = zpl_array_count( Sources );
// Determine the largest content size.
sw left = num_srcs;
zpl_string* path = Sources;
do
{
zpl_file src = {};
zpl_file_error error = zpl_file_open( & src, *path );
if ( error != ZPL_FILE_ERROR_NONE )
{
fatal("Could not open source file: %s", *path );
}
const sw fsize = zpl_file_size( & src );
if ( fsize > Largest_Src_Size )
{
Largest_Src_Size = fsize;
}
zpl_file_close( & src );
// Advance to the next source path; otherwise only the first file is ever measured.
path++;
}
while ( --left );
uw persist_size = ZPL_ARRAY_GROW_FORMULA( Largest_Src_Size );
zpl_arena_init_from_allocator( & MemPerist, zpl_heap(), persist_size );
// zpl_arena_init_from_allocator( & MemTransient, zpl_heap(), Largest_Src_Size );
}
void cleanup()
{
zpl_arena_free( & MemPerist );
// zpl_arena_free( & MemTransient );
}
Array_Line get_specification()
{
zpl_file file {};
zpl_file_error error = zpl_file_open( & file, Specification);
if ( error != ZPL_FILE_ERROR_NONE )
{
fatal("Could not open the specification file: %s", Specification);
}
sw fsize = scast( sw, zpl_file_size( & file ) );
if ( fsize <= 0 )
{
fatal("No content in specification to process");
}
char* content = rcast( char*, zpl_alloc( zpl_arena_allocator( & MemPerist), fsize + 1) );
zpl_file_read( & file, content, fsize);
zpl_file_close( & file );
content[fsize] = 0;
Array_Line lines = zpl_str_split_lines( zpl_arena_allocator( & MemPerist ), content, false );
return lines;
}
Array_Line get_next_source()
{
// zpl_memset( MemTransient.physical_start, 0, MemTransient.total_allocated);
// MemTransient.total_allocated = 0;
// MemTransient.temp_count = 0;
zpl_memset( MemPerist.physical_start, 0, MemPerist.total_allocated);
MemPerist.total_allocated = 0;
MemPerist.temp_count = 0;
zpl_file file {};
// Open the current source file (not the specification).
zpl_file_error error = zpl_file_open( & file, Sources[Current] );
if ( error != ZPL_FILE_ERROR_NONE )
{
fatal("Could not open the source file: %s", Sources[Current]);
}
Current_Size = scast( sw, zpl_file_size( & file ) );
if ( Current_Size <= 0 )
return nullptr;
Current_Content = rcast( char* , zpl_alloc( zpl_arena_allocator( & MemPerist), Current_Size + 1) );
zpl_file_read( & file, Current_Content, Current_Size );
zpl_file_close( & file );
Current_Content[Current_Size] = 0;
Current_Size++;
Array_Line lines = zpl_str_split_lines( zpl_arena_allocator( & MemPerist), Current_Content, ' ' );
return lines;
}
void write( zpl_string refactored )
{
if ( refactored == nullptr)
return;
zpl_string dst = Destinations[Current];
zpl_file file_dest {};
zpl_file_error error = zpl_file_create( & file_dest, dst );
if ( error != ZPL_FILE_ERROR_NONE )
{
fatal( "Unable to open destination file: %s\n", dst );
}
zpl_file_write( & file_dest, refactored, zpl_string_length(refactored) );
zpl_file_close( & file_dest );
}
}

25
project/IO.hpp Normal file

@ -0,0 +1,25 @@
#pragma once
#include "bloat.hpp"
namespace IO
{
ct uw Path_Size_Largest = zpl_kilobytes(1);
// Preps the IO by loading all the files and checking to see what the largest size is.
// The file with the largest size is used to determine the size of the persistent memory.
void prepare();
// Frees the persistent and transient memory arenas.
void cleanup();
// Provides the content of the specification.
Array_Line get_specification();
// Provides the content of the next source, broken up as a series of lines.
Array_Line get_next_source();
// Writes the refactored content to the current corresponding destination.
void write( zpl_string refactored );
}

315
project/Spec.cpp Normal file

@ -0,0 +1,315 @@
#include "Spec.hpp"
#include "IO.hpp"
namespace Spec
{
ct uw Array_Reserve_Num = zpl_kilobytes(4);
ct uw Token_Max_Length = zpl_kilobytes(1);
namespace StaticData
{
Array_Entry Ignore_Includes;
Array_Entry Ignore_Words;
Array_Entry Ignore_Regexes;
Array_Entry Ignore_Namespaces;
Array_Entry Includes;
Array_Entry Words;
Array_Entry Regexes;
Array_Entry Namespaces;
u32 Sig_Smallest = Token_Max_Length;
}
using namespace StaticData;
void cleanup()
{
zpl_array_free( Ignore_Includes );
zpl_array_free( Ignore_Words );
zpl_array_free( Ignore_Namespaces );
zpl_array_free( Includes );
zpl_array_free( Words );
zpl_array_free( Namespaces );
}
// Helper function for process().
forceinline
void find_next_token( zpl_string& token, char*& line, u32& length )
{
zpl_string_clear( token );
length = 0;
while ( zpl_char_is_alphanumeric( line[length] ) || line[length] == '_' )
{
length++;
}
if ( length == 0 )
{
fatal("Failed to find valid initial token");
}
token = zpl_string_append_length( token, line, length );
line += length;
}
void parse()
{
static zpl_string token = zpl_string_make_reserve( g_allocator, zpl_kilobytes(1));
static bool Done = false;
if (Done)
{
zpl_array_clear( Ignore_Includes );
zpl_array_clear( Ignore_Words );
zpl_array_clear( Ignore_Namespaces );
zpl_array_clear( Includes );
zpl_array_clear( Words );
zpl_array_clear( Namespaces );
}
else
{
Done = true;
zpl_array_init_reserve( Ignore_Includes, zpl_heap(), Array_Reserve_Num );
zpl_array_init_reserve( Ignore_Words, zpl_heap(), Array_Reserve_Num );
zpl_array_init_reserve( Ignore_Namespaces, zpl_heap(), Array_Reserve_Num );
zpl_array_init_reserve( Includes, zpl_heap(), Array_Reserve_Num );
zpl_array_init_reserve( Words, zpl_heap(), Array_Reserve_Num );
zpl_array_init_reserve( Namespaces, zpl_heap(), Array_Reserve_Num );
}
Array_Line lines = IO::get_specification();
sw left = zpl_array_count( lines );
if ( left == 0 )
{
fatal("Spec::parse: lines array improperly setup");
}
// Skip the first line as it's the version number and we only support __VERSION 1.
left--;
lines++;
// char token[ Token_Max_Length ]; // Redeclares `token`; the static zpl_string above is what find_next_token expects.
do
{
char* line = * lines;
// Ignore line if it's a comment
if ( line[0] == '/' && line[1] == '/')
{
lines++;
continue;
}
// Remove indent
{
while ( zpl_char_is_space( line[0] ) )
line++;
if ( line[0] == '\0' )
{
lines++;
continue;
}
}
u32 length = 0;
// Find a valid token
find_next_token( token, line, length );
Tok type = Tok::Num_Tok;
bool ignore = false;
Entry entry {};
// Will be regarded as an ignore.
if ( is_tok( Tok::Not, token, length ))
{
ignore = true;
while ( zpl_char_is_space( line[0] ) )
line++;
if ( line[0] == '\0' )
{
lines++;
continue;
}
// Find the category token
find_next_token( token, line, length );
}
if ( is_tok( Tok::Word, token, length ) )
{
type = Tok::Word;
}
else if ( is_tok( Tok::Namespace, token, length ) )
{
type = Tok::Namespace;
}
else if ( is_tok( Tok::Include, token, length ))
{
type = Tok::Include;
}
else
{
log_fmt( "Spec::parse - Unsupported keyword: %s on line: %d", token, zpl_array_count(lines) - left );
lines++;
continue;
}
// Find the first argument
while ( zpl_char_is_space( line[0] ) )
line++;
if ( line[0] == '\0' )
{
lines++;
continue;
}
find_next_token( token, line, length );
// First argument is signature.
entry.Sig = zpl_string_make_length( g_allocator, token, length );
if ( length < StaticData::Sig_Smallest )
StaticData::Sig_Smallest = length;
if ( line[0] == '\0' || ignore )
{
switch ( type )
{
case Tok::Word:
if ( ignore)
zpl_array_append( Ignore_Words, entry );
else
zpl_array_append( Words, entry );
break;
case Tok::Namespace:
if ( ignore)
zpl_array_append( Ignore_Namespaces, entry );
else
zpl_array_append( Namespaces, entry );
break;
case Tok::Include:
if ( ignore)
zpl_array_append( Ignore_Includes, entry );
else
zpl_array_append( Includes, entry );
break;
}
lines++;
continue;
}
// Look for second argument indicator
{
bool bSkip = false;
while ( line[0] != ',' )
{
if ( line[0] == '\0' )
{
switch ( type )
{
case Tok::Word:
zpl_array_append( Words, entry );
break;
case Tok::Namespace:
zpl_array_append( Namespaces, entry );
break;
case Tok::Include:
zpl_array_append( Includes, entry );
break;
}
bSkip = true;
break;
}
line++;
}
if ( bSkip )
{
lines++;
continue;
}
}
// Eat the argument delimiter.
line++;
// Remove spacing
{
bool bSkip = true;
while ( zpl_char_is_space( line[0] ) )
line++;
if ( line[0] == '\0' )
{
switch ( type )
{
case Tok::Word:
zpl_array_append( Words, entry );
break;
case Tok::Namespace:
zpl_array_append( Namespaces, entry );
break;
case Tok::Include:
zpl_array_append( Includes, entry );
break;
}
lines++;
continue;
}
}
find_next_token( token, line, length );
// Second argument is substitute.
entry.Sub = zpl_string_make_length( g_allocator, token, length );
switch ( type )
{
case Tok::Word:
zpl_array_append( Words, entry );
lines++;
continue;
case Tok::Namespace:
zpl_array_append( Namespaces, entry );
lines++;
continue;
case Tok::Include:
zpl_array_append( Includes, entry );
lines++;
continue;
}
}
while ( --left );
}
}

73
project/Spec.hpp Normal file

@ -0,0 +1,73 @@
#pragma once
#include "Bloat.hpp"
namespace Spec
{
enum Tok
{
Not,
Include,
Namespace,
Word,
Num_Tok
};
forceinline
char const* str_tok( Tok tok )
{
static
char const* tok_to_str[ Tok::Num_Tok ] =
{
"not",
"include",
"namespace",
"word",
};
return tok_to_str[ tok ];
}
forceinline
char strlen_tok( Tok tok )
{
static
const u8 tok_to_len[ Tok::Num_Tok ] =
{
3,
7,
9,
4,
};
return tok_to_len[ tok ];
}
forceinline
bool is_tok( Tok tok, zpl_string str, u32 length )
{
char const* tok_str = str_tok(tok);
const u8 tok_len = strlen_tok(tok);
if ( tok_len != length)
return false;
s32 result = zpl_strncmp( tok_str, str, tok_len );
return result == 0;
}
struct Entry
{
zpl_string Sig = nullptr; // Signature
zpl_string Sub = nullptr; // Substitute
};
using Array_Entry = zpl_array( Entry );
void cleanup();
// Extract the specification from the provided file.
void parse();
}

41
project/_Docs.md Normal file

@ -0,0 +1,41 @@
# Documentation
The current implementation is divided into 4 parts:
* Bloat : General library provider.
* IO : File I/O processing.
* Spec : Specification parsing.
* Refactor : Entrypoint, argument parsing, and refactoring process.
The files are set up to compile as one unit. As such, the source files for Bloat, IO, and Spec are located within `refactor.cpp`.
Bloat contains some aliasing of some C++ keywords and does not use the standard library. Instead, a library called ZPL is used (a single-header replacement).
The program has pretty much no optimizations made to it; it's just regular loops with no threading.
I just tried to keep the memory at a reasonable size for what it does.
The program execution is outlined quite clearly in `int main()`:
1. Setup initial reserve of global memory in an arena.
2. Parse the arguments provided.
3. Prepare IO's memory for retrieving content.
4. Reserve memory for the refactor buffer.
5. Parse the specification file
6. Iterate through all provided files to refactor and write the refactored content to the specified destination files.
7. Cleanup all reserves of memory`*`
`*` This technically can be skipped on Windows, which may be worth doing to reduce the latency of process shutdown.
There are constraints on the sizes of specific variables:
* `Path_Size_Largest` : Longest path size is set to 1 KB of characters.
* `Token_Max_Length` : Set to 1 KB characters as well.
* `Array_Reserve_Num` : Is set to 4 KB.
* Initial Global arena size : Set to 2 megabytes.
The `Path_Size_Largest` and `Token_Max_Length` are compile-time constraints that the runtime has no fallback for; if 1 KB is not enough, they will need to be changed for your use case.
`Array_Reserve_Num` dictates the assumed number of tokens held in total by any of Spec's arrays of ignores and refactor entries. If any of the arrays exceeds 4 KB it will grow, triggering a resize that will bog down the speed of the refactor. Raise or lower it to suit your use case.
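The cost of an undersized reserve can be illustrated with a small sketch. It uses `std::vector` as a stand-in for ZPL's arrays (an assumption made purely for illustration; ZPL's growth formula differs), and the helper name `count_reallocations` is hypothetical. It counts how many times the backing storage has to be reallocated while entries are appended.

```cpp
#include <cstddef>
#include <vector>

// Counts how many times a vector reallocates while `count` items are appended,
// given an initial reservation of `reserve_num` elements.
std::size_t count_reallocations( std::size_t reserve_num, std::size_t count )
{
    std::vector<int> entries;
    entries.reserve( reserve_num );

    std::size_t reallocations = 0;
    std::size_t capacity      = entries.capacity();
    for ( std::size_t i = 0; i < count; i++ )
    {
        entries.push_back( static_cast<int>( i ) );
        if ( entries.capacity() != capacity )
        {
            // The capacity changed, so the storage was reallocated and its contents copied or moved.
            reallocations++;
            capacity = entries.capacity();
        }
    }
    return reallocations;
}
```

Reserving 4096 slots up front and appending 4096 entries causes no reallocation, whereas starting from an empty reservation forces repeated growth. The same trade-off applies when choosing a value for `Array_Reserve_Num`.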
The initial global arena size is a finicky thing; it will most likely become a custom allocator at some point so that it can handle growing properly. Right now it just grows if the amount of memory the source file paths need is greater than 1 MB.


@ -1,13 +1,13 @@
/*
BLOAT.
ZPL requires ZPL_IMPLEMENTATION wherever this library is included.
This file assumes it will be included in one compilation unit.
*/
#pragma once
#ifdef BLOAT_IMPL
# define ZPL_IMPLEMENTATION
#endif
#if __clang__
# pragma clang diagnostic ignored "-Wunused-const-variable"
# pragma clang diagnostic ignored "-Wswitch"
@ -33,14 +33,13 @@
// # define ZPL_MODULE_REGEX
// # define ZPL_MODULE_EVENT
// # define ZPL_MODULE_DLL
# define ZPL_MODULE_OPTS
# define ZPL_MODULE_OPTS
// # define ZPL_MODULE_PROCESS
// # define ZPL_MODULE_MAT
// # define ZPL_MODULE_THREADING
// # define ZPL_MODULE_JOBS
// # define ZPL_MODULE_PARSER
#include "zpl.h"
// }
#if __clang__
# pragma clang diagnostic pop
@ -59,16 +58,16 @@
#define rcast( Type_, Value_ ) reinterpret_cast< Type_ >( Value_ )
#define pcast( Type_, Value_ ) ( * (Type_*)( & Value_ ) )
#define do_once() \
do \
{ \
static \
bool Done = true; \
if ( Done ) \
return; \
Done = false; \
} \
while(0) \
#define do_once() \
do \
{ \
static \
bool Done = false; \
if ( Done ) \
return; \
Done = true; \
} \
while(0) \
using s8 = zpl_i8;
@ -81,39 +80,34 @@ using f64 = zpl_f64;
using uw = zpl_usize;
using sw = zpl_isize;
using Line = char*;
using Array_Line = zpl_array( Line );
ct char const* Msg_Invalid_Value = "INVALID VALUE PROVIDED";
namespace Memory
{
zpl_arena Global_Arena {};
extern zpl_arena Global_Arena;
#define g_allocator zpl_arena_allocator( & Memory::Global_Arena)
void setup()
{
zpl_arena_init_from_allocator( & Global_Arena, zpl_heap(), zpl_megabytes(10) );
if ( Global_Arena.total_size == 0 )
{
zpl_assert_crash( "Failed to reserve memory for Tests:: Global_Arena" );
}
}
void cleanup()
{
zpl_arena_free( & Global_Arena);
}
void setup();
void resize( uw new_size );
void cleanup();
}
sw log_fmt(char const *fmt, ...)
{
inline
sw log_fmt(char const *fmt, ...)
{
#if Build_Debug
sw res;
va_list va;
va_start(va, fmt);
res = zpl_printf_va(fmt, va);
va_end(va);
return res;
#else
@ -121,6 +115,7 @@ namespace Memory
#endif
}
inline
void fatal(char const *fmt, ...)
{
zpl_local_persist zpl_thread_local
@ -139,6 +134,6 @@ void fatal(char const *fmt, ...)
zpl_printf_err_va( fmt, va);
va_end(va);
exit(1);
zpl_exit(1);
#endif
}

File diff suppressed because it is too large

63
refactor.10x Normal file
View File

@ -0,0 +1,63 @@
<?xml version="1.0"?>
<N10X>
<Workspace>
<IncludeFilter>*.*,</IncludeFilter>
<ExcludeFilter>*.obj,*.lib,*.pch,*.dll,*.pdb,.vs,Debug,Release,x64,obj,*.user,Intermediate,</ExcludeFilter>
<SyncFiles>true</SyncFiles>
<Recursive>true</Recursive>
<ShowEmptyFolders>true</ShowEmptyFolders>
<IsVirtual>false</IsVirtual>
<IsFolder>false</IsFolder>
<BuildCommand></BuildCommand>
<RebuildCommand></RebuildCommand>
<BuildFileCommand></BuildFileCommand>
<CleanCommand></CleanCommand>
<BuildWorkingDirectory></BuildWorkingDirectory>
<CancelBuild></CancelBuild>
<RunCommand></RunCommand>
<DebugCommand></DebugCommand>
<ExePathCommand></ExePathCommand>
<DebugSln></DebugSln>
<UseVisualStudioEnvBat>false</UseVisualStudioEnvBat>
<Configurations>
<Configuration>Debug</Configuration>
<Configuration>Release</Configuration>
</Configurations>
<Platforms>
<Platform>x64</Platform>
</Platforms>
<AdditionalIncludePaths>
<AdditionalIncludePath>C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.35.32215\include</AdditionalIncludePath>
<AdditionalIncludePath>C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.35.32215\ATLMFC\include</AdditionalIncludePath>
<AdditionalIncludePath>C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Auxiliary\VS\include</AdditionalIncludePath>
<AdditionalIncludePath>C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt</AdditionalIncludePath>
<AdditionalIncludePath>C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um</AdditionalIncludePath>
<AdditionalIncludePath>C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared</AdditionalIncludePath>
<AdditionalIncludePath>C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt</AdditionalIncludePath>
<AdditionalIncludePath>C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt</AdditionalIncludePath>
<AdditionalIncludePath>C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um</AdditionalIncludePath>
<AdditionalIncludePath>.\thirdparty</AdditionalIncludePath>
</AdditionalIncludePaths>
<Defines>
<Define>ZPL_IMPLEMENTATION</Define>
</Defines>
<ConfigProperties>
<ConfigAndPlatform>
<Name>Debug:x64</Name>
<Defines></Defines>
<ForceIncludes>
<ForceInclude>C:\projects\refactor\thirdparty</ForceInclude>
</ForceIncludes>
</ConfigAndPlatform>
<Config>
<Name>Debug</Name>
<Defines></Defines>
</Config>
<Platform>
<Name>x64</Name>
<Defines></Defines>
</Platform>
</ConfigProperties>
<Children></Children>
</Workspace>
</N10X>