timesynk/test/vm_compile.txt


/*
================================================================
This file is the test development area for precompiling and compiling C-syntax VM source files in "code/".
It provides, globally, a Table of vm_Function pointers, as well as two additional sets of Tables of vm_Function pointers for tile-type (group) and tile-specific (individual) functions:
Table *vm_T_global_func
Table *vm_T_group_func[max_tile_types]
Table *vm_T_individual_func[max_tile_types][max_tile_ids]
Using Tables is a temporary solution, as it is quite inefficient. I would eventually like a vm_Function pointer list to be generated, with each function referred to solely through a vm_Function pointer (or, failing that, an index into an array).
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Program Structure - file input and initial parsing
````````````````````````````````
The program reads in each timesynk-VM C-style syntax file found in the "code/" directory. These files contain the function declarations for all Tiles, Tile types, and specific Tiles, e.g.:
return_type functionName(param_type param_name, param_type param_name, . . .) {
    type result = param_name;
    return(param_name);
}
TID {
    return_type typeFunct(. . .) {
        . . .
    }
}
TID:ID {
    return_type tileFunct(. . .) {
        . . .
    }
}
Functions are categorized as global, group, or individual as follows. When a line is read while depth is equal to 0 (i.e., not inside a function declaration): if the first word is solely numeric, the line opens a group declaration; if the first word is numeric save for a single colon ':', it opens an individual declaration; if the first word is non-numeric, it is a global declaration.
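The first-word rule above can be sketched as a small classifier. This is only an illustration: the enum and function names (ScopeKind, classify_first_word) are hypothetical and do not come from the timesynk source, and the sketch assumes that an individual declaration has digits on both sides of the colon.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

// Hypothetical names; the real parser may represent scopes differently.
typedef enum { SCOPE_GLOBAL, SCOPE_GROUP, SCOPE_INDIVIDUAL } ScopeKind;

// Classify the first word of a line read at depth 0:
//   digits only          -> group declaration       ("12")
//   digits ':' digits    -> individual declaration  ("12:3")
//   anything else        -> global declaration      ("int")
ScopeKind classify_first_word(const char *word) {
    assert(word != NULL);
    size_t digits = 0, colons = 0, others = 0, length = 0;
    for (const char *p = word; *p != '\0'; p++, length++) {
        if (isdigit((unsigned char)*p)) digits++;
        else if (*p == ':') colons++;
        else others++;
    }
    if (others > 0 || digits == 0) return SCOPE_GLOBAL;
    if (colons == 0) return SCOPE_GROUP;
    // Assumption: exactly one colon, and not at either end of the word.
    if (colons == 1 && word[0] != ':' && word[length - 1] != ':')
        return SCOPE_INDIVIDUAL;
    return SCOPE_GLOBAL;
}
```

With this sketch, "3" classifies as a group declaration, "3:14" as an individual declaration, and "int" as a global declaration.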
Once the scope is found, the Parser begins to search for variable and function declarations. When a line is read, it is checked as a possible function declaration if it does not end with a ';'; otherwise, it is checked as a variable declaration. During this stage, spaces, tabs, and return characters are eaten, allowing for multiple function declaration styles.
Once a function section is identified, a mid-level function struct is created, populated with the function name, return data type, and variable structs for the parameters, and added to the appropriate scope's function Table. When a variable is encountered, it is converted into a mid-level variable struct and added to the appropriate scope's variable Table.
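A minimal sketch of the ';' check, assuming that only trailing whitespace needs to be skipped before the last character is inspected. LineKind and classify_line are hypothetical names used for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical result type for the declaration check.
typedef enum { LINE_EMPTY, LINE_VARIABLE, LINE_FUNCTION_CANDIDATE } LineKind;

// Characters that are "eaten": spaces, tabs, and return characters.
static int is_eaten(char c) {
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

// A line whose last significant character is ';' is checked as a variable
// declaration; any other non-empty line is a possible function declaration.
LineKind classify_line(const char *line) {
    assert(line != NULL);
    size_t end = strlen(line);
    while (end > 0 && is_eaten(line[end - 1])) end--;
    if (end == 0) return LINE_EMPTY;
    return line[end - 1] == ';' ? LINE_VARIABLE : LINE_FUNCTION_CANDIDATE;
}
```

Because trailing whitespace is ignored, both "int f(int a) {" and "int f(int a)" followed by a newline are treated as function candidates, which is one way to allow the opening bracket on either line.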
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Program Structure - function parsing and pre-compilation
````````````````````````````````
Once the parser is within a function, the source code is reduced to mid-level (precompiled) statements and expressions. At this stage, variables are still referred to by name (char arrays), with new variables added to the mid-level function's variable Table.
For example, the following variable assignment statement:
int x = 2;
Would be precompiled in the following manner:
add variable { name: "x", type: int }
add variable { name: "2", type: int }
add statement { type: assign, var_1: "x", var_2: "2" }
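The three steps above could be driven by code along these lines. This is a sketch under stated assumptions, not the actual implementation: MidFunction, MidStatement, add_variable, add_statement, and precompile_declaration are hypothetical names, and a fixed-size array stands in for the variable Table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_VARS 32  // fixed-size stand-in for the variable Table

typedef enum { ST_ASSIGN } StatementType;

typedef struct MidStatement {
    StatementType type;
    const char *var_1;
    const char *var_2;
    struct MidStatement *next;
} MidStatement;

typedef struct {
    const char *var_names[MAX_VARS];
    const char *var_types[MAX_VARS];
    int var_count;
    MidStatement *head;
    MidStatement *tail;
    int line_count;
} MidFunction;

// "add variable": register a name once; literals are stored under their text.
int add_variable(MidFunction *f, const char *name, const char *type) {
    for (int i = 0; i < f->var_count; i++)
        if (strcmp(f->var_names[i], name) == 0) return i;  // already existing
    assert(f->var_count < MAX_VARS);
    f->var_names[f->var_count] = name;
    f->var_types[f->var_count] = type;
    return f->var_count++;
}

// "add statement": append to the linked list and increment the line count.
MidStatement *add_statement(MidFunction *f, StatementType type,
                            const char *var_1, const char *var_2) {
    MidStatement *s = malloc(sizeof *s);
    assert(s != NULL);
    s->type = type;
    s->var_1 = var_1;
    s->var_2 = var_2;
    s->next = NULL;
    if (f->tail != NULL) f->tail->next = s; else f->head = s;
    f->tail = s;
    f->line_count++;
    return s;
}

// Precompile "<type> <name> = <literal>;", e.g. "int x = 2;".
void precompile_declaration(MidFunction *f, const char *type,
                            const char *name, const char *literal) {
    add_variable(f, name, type);
    add_variable(f, literal, type);
    add_statement(f, ST_ASSIGN, name, literal);
}

// Release the statement list built by add_statement.
void free_statements(MidFunction *f) {
    MidStatement *s = f->head;
    while (s != NULL) {
        MidStatement *next = s->next;
        free(s);
        s = next;
    }
    f->head = f->tail = NULL;
    f->line_count = 0;
}
```

Calling precompile_declaration(&f, "int", "x", "2") on a zeroed MidFunction leaves two variables ("x" and "2") and one assign statement, matching the listing above.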
"add statement" appends a statement to the linked list of statements AND increments the line count. "add variable" adds a variable to the variable Table.
Or, for a compound statement:
float y;
if (x > 2) {
    y = 0.0;
} else if (x < 1) {
    y = 0.5;
} else {
    y = 1.0;
}
Would precompile into:
add variable { name: "y", type: float, value: "0.0" }
add variable { name: "2", type: int, value: "2" } // if not existing
add statement { type: gt, var_1: "x", var_2: "2" }
add statement { type: jump, pos: undef } -------------------.
// jump[depth][pos++] = this jump                           |
add variable { name: "0.0", type: float, value: "0.0" }     |
add statement { type: assign, var_1: "y", var_2: "0.0" }    |
add statement { type: jump, pos: undef } ===================|===========.
// jump_to_end[depth][pos++] = this jump                    |           "
// set jump[depth][pos-1].pos = this line <-----------------'           "
add variable { name: "1", type: int, value: "1" }                       "
add statement { type: lt, var_1: "x", var_2: "1" }                      "
add statement { type: jump, pos: undef } -------------------------.     "
// jump[depth][pos++] = this jump                                 |     "
add variable { name: "0.5", type: float, value: "0.5" }           |     "
add statement { type: assign, var_1: "y", var_2: "0.5" }          |     "
add statement { type: jump, pos: undef } =========================|====="
// jump_to_end[depth][pos++] = this jump                          |     "
// set jump[depth][pos-1].pos = this line <-----------------------'     "
add variable { name: "1.0", type: float, value: "1.0" }                 "
add statement { type: assign, var_1: "y", var_2: "1.0" }                "
// for (int jpos = pos - 1; jpos >= 0; jpos--)                          "
//     jump_to_end[depth][jpos].pos = this line <<======================'
The basic logic for conditionals is that when a left curly bracket is encountered, a jump is created and added by reference to array[depth++][pos++]. When a right curly bracket is found, array[depth--][pos].pos is set to the current line. If pos >= 1, then another jump is created as the end-of-conditional-block jump and added to jump_to_end[depth][pos]. When a right curly bracket is found and pos >= 0, then every jump in jump_to_end is set to the current line.
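The backpatching described above can be sketched for a single nesting level as follows. This is a simplified illustration, not the actual parser: it drops the depth dimension, the caller keeps the pending skip jump instead of storing it in an array, and Emitter, emit, open_block, close_block_with_else, close_chain, and build_example are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STATEMENTS 64
#define MAX_JUMPS 16
#define POS_UNDEF (-1)

typedef enum { ST_GT, ST_LT, ST_ASSIGN, ST_JUMP } StatementType;

typedef struct {
    StatementType type;
    const char *var_1;
    const char *var_2;
    int pos;  // jump target line, POS_UNDEF until patched
} Statement;

typedef struct {
    Statement code[MAX_STATEMENTS];
    int line_count;
    int jump_to_end[MAX_JUMPS];  // lines of the pending end-of-chain jumps
    int end_count;
} Emitter;

// Append a statement and return its line number.
int emit(Emitter *e, StatementType type, const char *var_1, const char *var_2) {
    assert(e->line_count < MAX_STATEMENTS);
    Statement *s = &e->code[e->line_count];
    s->type = type;
    s->var_1 = var_1;
    s->var_2 = var_2;
    s->pos = POS_UNDEF;
    return e->line_count++;
}

// Left bracket after a condition: emit the jump that skips the block.
int open_block(Emitter *e) {
    return emit(e, ST_JUMP, NULL, NULL);
}

// Right bracket followed by "else": emit a jump to the end of the chain,
// then point the skip jump at the line that follows it.
void close_block_with_else(Emitter *e, int skip_jump) {
    assert(e->end_count < MAX_JUMPS);
    e->jump_to_end[e->end_count++] = emit(e, ST_JUMP, NULL, NULL);
    e->code[skip_jump].pos = e->line_count;
}

// Final right bracket: every pending end jump lands on the current line.
void close_chain(Emitter *e) {
    for (int jpos = e->end_count - 1; jpos >= 0; jpos--)
        e->code[e->jump_to_end[jpos]].pos = e->line_count;
    e->end_count = 0;
}

// Emits the if / else if / else example from the text.
void build_example(Emitter *e) {
    emit(e, ST_GT, "x", "2");         // line 0
    int first = open_block(e);        // line 1
    emit(e, ST_ASSIGN, "y", "0.0");   // line 2
    close_block_with_else(e, first);  // line 3 is the end jump
    emit(e, ST_LT, "x", "1");         // line 4
    int second = open_block(e);       // line 5
    emit(e, ST_ASSIGN, "y", "0.5");   // line 6
    close_block_with_else(e, second); // line 7 is the end jump
    emit(e, ST_ASSIGN, "y", "1.0");   // line 8
    close_chain(e);                   // end of chain is line 9
}
```

After build_example runs on a zeroed Emitter, the skip jumps on lines 1 and 5 target lines 4 and 8, and both end jumps (lines 3 and 7) target line 9. A full version would index the jump arrays by depth, as the text describes.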
Or, for a for loop:
for (int z = 0; z < x; z++) {
    y += 0.1;
}
Would precompile into:
add variable { name: "z", type: int, value: "0" }
add variable { name: "0", type: int, value: "0" }
add statement { type: assign, var_1: "z", var_2: "0" }
add statement { type: lt, var_1: "z", var_2: "x" } <--------------.
add statement { type: jump, pos: undef } -------------------.     |
// jump[depth][pos++] = this jump                           |     |
add variable { name: "0.1", type: float, value: "0.1" }     |     |
add statement { type: add, var_1: "y", var_2: "0.1" }       |     |
add variable { name: "1", type: int, value: "1" }           |     |
add statement { type: add, var_1: "z", var_2: "1" }         |     |
// loop back and re-test the condition                      |     |
add statement { type: jump, pos: line of "lt" } ------------|-----'
// set jump[depth][pos-1].pos = this line <-----------------'
Once parsing is complete, as marked by the depth reaching -1, precompilation is finished.
Functions reserve special locations in memory for the return value and parameters. For example:
int myFunction(int some_value) {
    int i = some_value - 1;
    return i;
}
Would precompile into:
add variable { name: "i", type: int, value: "0" }
add variable { name: "1", type: int, value: "1" } // if not existing
add statement { type: assign, var_1: "i", var_2: "some_value" }
add statement { type: min, var_1: "i", var_2: "1" }
add statement { type: return, var_1: "i" }
Then:
int i = myFunction(1);
precompiles into:
add variable { name: "i", type: int, value: "0" }
add variable { name: "1", type: int, value: "1" }
add statement { type: call, var_return: "i", var_1: "myFunction", var_2: "1" }
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Program Structure - compilation
````````````````````````````````
The compilation process begins by iterating over all precompiled variables and functions. First, memory is allocated for all variables, starting from global, by creating vm_Stacks of the byte size necessary to fit all variables. The values are then copied into their appropriate locations in memory, and each precompiled variable's target pointer is set to its location.
After this, precompiled statements are converted into bytecode operations. Variables are replaced with their memory addresses and statement types with OP codes.
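As a rough illustration of these two passes, the sketch below lays variables out back to back and then rewrites one mid-level statement into an operation that carries addresses instead of names. Everything here is an assumption made for the example: the Op record and the OP_ASSIGN and OP_ADD values are placeholders (the real OP code format is not described in this file), offsets stand in for pointers into a vm_Stack, and alignment is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { T_INT, T_FLOAT } VarType;
typedef enum { ST_ASSIGN, ST_ADD } StatementType;
enum { OP_ASSIGN = 1, OP_ADD = 2 };  // placeholder OP codes, not the real format

typedef struct { const char *name; VarType type; size_t address; } MidVariable;
typedef struct { StatementType type; const char *var_1; const char *var_2; } MidStatement;
typedef struct { unsigned char op; size_t a; size_t b; } Op;

static size_t type_size(VarType t) {
    return t == T_INT ? sizeof(int) : sizeof(float);
}

// Pass 1: give every variable an offset and return the total byte size needed.
size_t assign_addresses(MidVariable *vars, size_t count) {
    assert(vars != NULL || count == 0);
    size_t offset = 0;
    for (size_t i = 0; i < count; i++) {
        vars[i].address = offset;
        offset += type_size(vars[i].type);
    }
    return offset;
}

// Linear search by name; a real Table lookup would replace this.
const MidVariable *find_variable(const MidVariable *vars, size_t count,
                                 const char *name) {
    for (size_t i = 0; i < count; i++)
        if (strcmp(vars[i].name, name) == 0) return &vars[i];
    return NULL;
}

// Pass 2: replace names with addresses and the statement type with an OP code.
// Returns 1 on success, 0 if either variable name is unknown.
int compile_statement(const MidVariable *vars, size_t count,
                      const MidStatement *s, Op *out) {
    const MidVariable *a = find_variable(vars, count, s->var_1);
    const MidVariable *b = find_variable(vars, count, s->var_2);
    if (a == NULL || b == NULL) return 0;
    out->op = (s->type == ST_ASSIGN) ? OP_ASSIGN : OP_ADD;
    out->a = a->address;
    out->b = b->address;
    return 1;
}
```

For the variables "x" (int), "2" (int), and "y" (float), assign_addresses yields offsets 0, sizeof(int), and 2 * sizeof(int), and compiling the assign statement for "int x = 2;" produces an OP_ASSIGN operation on offsets 0 and sizeof(int).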
The OP code format is described in more detail in my (kts) .plan files for 2014/03 and, as such, is not repeated here.
================================================================
*/