Over the years of writing a lot of assembly in pawn I've developed a few tricks for things. These are just some of them.
Calling a function by variable.
Normally to call a function you use CALL, but that only takes a constant, not a variable. If you want to load the target address from a variable you instead need to use SCTRL to set the target. But CALL does some other bits as well, like pushing the return address, which must now be done manually using LCTRL. LCTRL gets the current instruction pointer, but the return address should be the instruction after the SCTRL. With this snippet it turns out that the return address is exactly nine cells later, so the code to call a function stored in a local variable called func is:
Pawn Wrote:#emit LCTRL      6
#emit ADD.C      36
#emit LCTRL      8
#emit PUSH.pri
#emit LOAD.S.pri func
#emit SCTRL      6
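The nine-cell figure can be checked by adding up the cells each instruction after the first LCTRL occupies. This is only a sketch of the arithmetic: the constant names are made up for illustration, it assumes 4-byte cells, and it assumes LCTRL 6 yields the address of the instruction following it, which is what the snippet above relies on.

```pawn
// Cells occupied by each instruction between LCTRL 6 and the return point
// (one cell for the opcode, plus one for the operand where there is one).
// All names here are hypothetical, chosen only for this illustration.
const SIZE_ADD_C      = 2; // ADD.C      36
const SIZE_LCTRL      = 2; // LCTRL      8
const SIZE_PUSH_PRI   = 1; // PUSH.pri   (no operand)
const SIZE_LOAD_S_PRI = 2; // LOAD.S.pri func
const SIZE_SCTRL      = 2; // SCTRL      6

// Distance from the address read by LCTRL 6 to the instruction after SCTRL.
const RETURN_DISTANCE_CELLS = SIZE_ADD_C + SIZE_LCTRL + SIZE_PUSH_PRI + SIZE_LOAD_S_PRI + SIZE_SCTRL; // 9

// Assumption: 4-byte (32-bit) cells, so this is the 36 passed to ADD.C.
const RETURN_DISTANCE_BYTES = RETURN_DISTANCE_CELLS * 4; // 36
```

If an instruction is added to or removed from the sequence, the value passed to ADD.C has to be recounted the same way.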
Control register 8
If you're using the JIT plugin the return address needs to be translated to a new address space. This is what LCTRL 8 does. It takes the AMX address in pri and converts it to a JIT address. However, if you're not using the JIT plugin there is no register eight and pri remains the same. So always using it is perfectly safe - either the address will be correctly updated or it won't change at all.
Named registers.
That code is already a bit unreadable. I said the return address is nine cells later, yet 9 doesn't appear anywhere in the code (36 is nine cells expressed in bytes). Also what are 6 and 8 doing? Fortunately const values work in #emit, so we can name some of these constants and make them more readable. The previous snippet thus becomes:
Pawn Wrote:#emit LCTRL      __cip
#emit ADD.C      __9_cells
#emit LCTRL      __jmp
#emit PUSH.pri
#emit LOAD.S.pri func
#emit SCTRL      __cip
[See this file]() for the full list of constant offsets defined in YSI, there are too many to list here. But a few highlighted ones are:
- __frame_offset - The offset of the previous frame's pointer in the current frame. Actually just 0, but still named to be clear.
- __return_offset - The offset of the return address in the current frame (the value pushed in the code above).
- __args_offset - The offset of the number of arguments passed to the current function (in bytes) in the current frame.
- __param0_offset - The offset of the first parameter passed to the function, for example playerid in GetPlayerName(playerid, name, size);.
- __minus1 - Just the number -1, since - is broken in #emit.
- __cip - The control register for the current instruction pointer (cip).
- __2_cells - The size in bytes of two cells. Useful when adjusting other values to make it clear that it is just two cells generally, not an offset such as __args_offset.
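As a rough sketch of how a named offset reads in practice, here is a hand-declared constant used to fetch the first parameter from the frame. The name demo_param0_offset and the function FirstParam are hypothetical, not YSI's own, and the value 12 assumes 4-byte cells (previous frame pointer at 0, return address at 4, argument byte count at 8).

```pawn
// Hypothetical stand-in for YSI's __param0_offset, assuming 4-byte cells:
// [frm + 0] previous frame, [frm + 4] return address,
// [frm + 8] argument bytes, [frm + 12] first parameter.
const demo_param0_offset = 12;

stock FirstParam(value)
{
    new result;
    // Load the cell at frm + 12, which is where "value" lives.
    #emit LOAD.S.pri demo_param0_offset
    #emit STOR.S.pri result
    // Returns the same thing as "return value;" would.
    return result;
}
```

Writing `#emit LOAD.S.pri value` would do the same here, the named offset only matters when the parameter has no name to refer to.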
So for example to load the number of parameters passed to the current function use:
Pawn Wrote:#emit LOAD.S.pri __args_offset // In bytes.
#emit SHR.C.pri 2 // In cells.
Or using another YSI define:
Pawn Wrote:#emit LOAD.S.pri __args_offset // In bytes.
#emit SHR.C.pri __COMPILER_CELL_SHIFT // In cells.
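Wrapped in a function, the same two instructions give a hand-rolled equivalent of numargs(). This is a minimal sketch assuming 4-byte cells, so it uses the raw values 8 and 2 in place of the YSI names; CountArgs is a made-up name.

```pawn
stock CountArgs(...)
{
    new count;
    // [frm + 8] holds the size of the passed arguments in bytes.
    #emit LOAD.S.pri 8
    // Bytes to cells: divide by 4 (assumes 4-byte cells).
    #emit SHR.C.pri  2
    #emit STOR.S.pri count
    return count;
}
```

For example CountArgs(1, 2, 3) gives 3 and CountArgs() gives 0.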
Current stack address.
This code will get the current address of the stack pointer (stk):
Pawn Wrote:#emit LCTRL __stk
But it will get the value in to the pri register while sometimes you want it in the alt register instead. You could load it and move it:
Pawn Wrote:#emit LCTRL __stk
#emit MOVE.alt
But that takes two instructions and clobbers pri. You could swap the registers about to preserve pri:
Pawn Wrote:#emit MOVE.alt
#emit LCTRL __stk
#emit XCHG
But that takes even more instructions. Or you could just load the value straight in to alt:
Pawn Wrote:#emit STACK 0
This adjusts the stack size by naught cells. It doesn't get any bigger or smaller. At first that seems pointless, but pawn-impl.pdf says this about STACK:
Quote:ALT = STK, STK = STK + value
So the instruction first saves the current value of stk in alt, then adjusts the size. We only want the first part so we NOP the second part by making the size adjustment naught, yet still get the register being saved.
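A small sketch of the trick in use, storing the stack address in a variable without touching pri. GetStackPointer is a hypothetical helper name, and the value it returns includes the space taken by its own local variable.

```pawn
stock GetStackPointer()
{
    new address;
    // alt = stk, then stk = stk + 0, so the stack itself is unchanged.
    #emit STACK      0
    // Save alt straight to the local, pri is never used.
    #emit STOR.S.alt address
    return address;
}
```

The result is an offset within the data segment, so it is only meaningful relative to other addresses in the same script.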
Masking with shifts.
There is no AND.C so to do:
Pawn Wrote:a = b & 0xFFFF0000;
In assembly is:
Pawn Wrote:#emit LOAD.S.pri b
#emit CONST.alt  0xFFFF0000
#emit AND
#emit STOR.S.pri a
That's usually fine, but what if there is some data we want to keep in alt? It could be saved to the stack:
Pawn Wrote:#emit LOAD.S.pri b
#emit PUSH.alt
#emit CONST.alt  0xFFFF0000
#emit AND
#emit POP.alt
#emit STOR.S.pri a
But that takes many extra instructions. However, the original expression is the same as:
Pawn Wrote:a = (b >>> 16) << 16;
>>> and << discard the bits they shift out, and there are SHR.C.pri and SHL.C.pri instructions:
Pawn Wrote:#emit LOAD.S.pri b
#emit SHR.C.pri  16
#emit SHL.C.pri  16
#emit STOR.S.pri a
This is one cell longer than the first method, but two instructions shorter than the second and doesn't clobber alt. However, a mask like 0x00FF0000 needs yet another shift, at which point it probably isn't worth the effort compared to AND, and the trick doesn't work at all for masks with gaps in them like 0xF0F0F0F0. The three-shift version for 0x00FF0000 is:
Pawn Wrote:#emit LOAD.S.pri b
#emit SHL.C.pri  8
#emit SHR.C.pri  24
#emit SHL.C.pri  16
#emit STOR.S.pri a
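The equivalence is easy to check in plain pawn before committing to the assembly version. A short sketch with hypothetical function names, assuming 32-bit cells:

```pawn
// Same as b & 0xFFFF0000: shift the low 16 bits out, then shift back.
stock MaskHigh16(b)
{
    return (b >>> 16) << 16;
}

// Same as b & 0x00FF0000: drop the top 8 bits, drop the bottom 16,
// then move the surviving byte back to bits 16-23.
stock MaskByte2(b)
{
    return ((b << 8) >>> 24) << 16;
}
```

With b = 0x12345678 the first gives 0x12340000 and the second gives 0x00340000, the same as the corresponding & expressions. >>> is used rather than >> so that the sign bit is not copied back in.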
Escaping DAT
The data segment is called dat, and is distinct from the code segment cod. The former is where all data is stored, including the stack and heap; the latter is where all the code is stored. But to write self-modifying code of the sort found in @emit we must write to cod instead of dat, so how is that done? The instruction STOR.pri writes the value of the register pri to the given variable, but the VM checks that these variables are valid, i.e. that they are in some active part of global memory or the stack. But there is an oversight in the indirection instructions - SREF.pri, LREF.S.alt, etc. These take a variable and check that this variable is within the memory segment, then use the value in that variable as a pointer to the location to read or write. However, crucially this second access is NOT checked for validity, so we can access anywhere. The code section comes immediately before the data section so we can reach it using negative addresses. The offset of dat is found in LCTRL 1, the offset of cod is in LCTRL 0, so the two together give the relative offset of cod from dat. For example to read the parameter of HALT at address 0 in cod:
Pawn Wrote:new ptr;
// ptr = cod - dat + 4
#emit LCTRL __dat
#emit MOVE.alt
#emit LCTRL __cod
#emit SUB
#emit ADD.C __1_cell
#emit STOR.S.pri ptr
// ptr = *ptr;
#emit LREF.S.pri ptr
#emit STOR.S.pri ptr
printf("Default HALT parameter: %d", ptr);
This trick is basically the core of amx_assembly, indirection, YSI, and more. Without it there would be no code re-writing at all of the sort needed for advanced techniques like hook and inline.
You can invert strings.
For some reason this code compiles:
Pawn Wrote:Func(const str[])
{
}
main()
{
    new str[] = "Hello";
    Func(~str);
}
This is likely an oversight in the compiler because the inversion of a string is gibberish; the code will probably just crash. But inverting an inversion gives the original value back so this is fine:
Pawn Wrote:Func(const str[])
{
}
main()
{
    new str[] = "Hello";
    Func(~~str);
}
It might seem pointless but in assembly you end up with:
asm Wrote:ADDR.pri str
INVERT
INVERT
PUSH.pri
Still pointless, except for the fact that it is a very interesting sequence of instructions that can be scanned for. This INVERT/INVERT pair is the core of how y_inline actually locates inline functions and their names in memory (searching through the cod segment with LREF as detailed above).
Labels reset the stack.
Pawn Wrote:Func()
{
    new a = 4;
label:
    return a;
}
Compiles as:
asm Wrote:PROC
PUSH.C 4
LCTRL 5
ADD.C -4
SCTRL 4
LOAD.S.pri -4
STACK 4
RETN
So we push something to the stack, then reset the stack. This is exploited in the decl keyword which declares a large variable without initialising it:
Pawn Wrote:decl a[128];
Becomes:
Pawn Wrote:goto after_a;
new a[128];
after_a:
Which compiles as:
asm Wrote:JUMP after_a
STACK -512
ZERO.pri
FILL 512
after_a:
LCTRL 5
ADD.C -512
SCTRL 4
Most importantly the FILL opcode is skipped, but because labels don't modify scope, a still exists.
There is at least one bit of assembly in YSI where the LCTRL/SCTRL pair generated by a label is important to functionality (Inline_NumArgs, to cancel out a PUSH.pri elsewhere in the function), but such uses are extremely rare for one reason - you can only jump backwards in assembly; future labels can't be used.
Starting a function just to end it.
Consider the following function:
Pawn Wrote:sprintf(const fmat[], {Float, _}:...)
{
    static target[144];
    format(target, sizeof (target), fmat, ???);
    return target;
}
There are four main ways to implement this function and forward all the parameters - 1) a macro, 2) y_va, 3) copy all the parameters as in the very common bit of assembly everyone copies, 4) the clever way. We already have most of the parameters for format on the stack, we just need to add two more. To call a function the number of parameters in bytes is pushed, then the function is called with CALL, which adds the return address to the stack and jumps to the address specified. The first thing in a function is then PROC, which saves the current frame pointer to the stack and sets up a new frame. So at the moment that format is called in this function the stack looks like:
- ???
- ???
- ???
- fmat
- arg_count
- return_address
- frame_pointer
target is declared static in this example to make the stack much simpler. So really to call format we need:
- ???
- ???
- ???
- fmat
- sizeof (target)
- target
- arg_count + 8
Most of that data is already there if we can just modify the rest:
Pawn Wrote:// Remove the frame pointer from the stack and use it.
#emit POP.pri
#emit SCTRL 5
// Remove the return address from the stack.
#emit POP.alt
// Remove the arg count from the stack.
#emit POP.pri
// Push the two extra parameters.
const size = sizeof (target); // format takes its length in cells.
#emit PUSH.C size
#emit PUSH.C target
// Update and push the parameter count.
#emit ADD.C __2_cells
#emit PUSH.pri
We have now modified the stack to look how we want, put the return address in alt, and set the frame pointer back to what it used to be. So call format:
Pawn Wrote:#emit SYSREQ.C format
After format returns we need to restore the stack to how it was without clobbering alt, which still holds the return address we need, since SYSREQ.C doesn't touch that register:
Pawn Wrote:// Remove and reset the count again.
#emit POP.pri
#emit ADD.C __m2_cells // -8
// Remove the next cell without altering any registers (tricky).
#emit SWAP.alt
#emit POP.alt
// Put the count back on the stack.
#emit SWAP.pri
// Put the return address back.
#emit PUSH.alt
Now the part this entire section has been building to - we need to get the frame pointer back out and back on to the stack. We used SCTRL to save it (we could have saved it to a global variable, but why waste one when the control register is right there, plus a global variable might not work if the native calls a callback), so the obvious code is:
Pawn Wrote:#emit LCTRL 5
#emit PUSH.pri
#emit LCTRL 4
#emit SCTRL 5
But there's an instruction that does all of this in one go:
Pawn Wrote:#emit PROC
So you may sometimes see PROC randomly in the middle of a function, and this is what it is doing. In fact it isn't unusual to see:
Pawn Wrote:#emit PROC
#emit RETN
Which is the name of the section - we call PROC just to set up the stack correctly for calling RETN.
The stack doesn't even need restoring.
Normally after calling a native you need to remove all the parameters pushed for it, but you don't after calling a normal function - RETN does that for you. In the above example we needed to restore the stack to how it was before format was called, but sometimes you don't need to because the native is the last thing done in the function. In that case we can just exploit RETN to remove all the extra parameters we added too:
Pawn Wrote:// As before.
#emit POP.pri
#emit SCTRL 5
#emit POP.alt
#emit POP.pri
const size = sizeof (target); // format takes its length in cells.
#emit PUSH.C size
#emit PUSH.C target
#emit ADD.C __2_cells
#emit PUSH.pri
#emit SYSREQ.C format
// Leave the two extra parameters on the stack and just pretend they were passed to us.
// Push the return address again.
#emit PUSH.alt
// Put the frame pointer back and end the function.
#emit PROC
#emit RETN