//go:noinline compiler-directive here... Don't get bitten.)
0x0000: Offset of the current instruction, relative to the start of the function.
TEXT "".add: The TEXT directive declares the "".add symbol as part of the .text section (i.e. runnable code) and indicates that the instructions that follow are the body of the function. The empty string "" will be replaced by the name of the current package at link-time: i.e., main.add once linked into our final binary.
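The link-time renaming can be observed from a running program. The sketch below is only an illustration: the signature of add is assumed, and linkedName is a helper invented for this example. It asks the runtime for the symbol name it recorded for add.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
)

// add stands in for the function being disassembled (signature assumed).
//
//go:noinline
func add(a, b int32) (int32, bool) {
	return a + b, true
}

// linkedName returns the symbol name the runtime recorded for add.
// It is a hypothetical helper, not part of the original program.
func linkedName() string {
	pc := reflect.ValueOf(add).Pointer()
	return runtime.FuncForPC(pc).Name()
}

func main() {
	fmt.Println(linkedName())
}
```

When the file is built as package main, the printed name should carry the package prefix in place of the empty string, i.e. main.add.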
SB is the virtual register that holds the "static-base" pointer, i.e. the address of the beginning of the address-space of our program. "".add(SB) declares that our symbol is located at some constant offset (computed by the linker) from the start of our address-space. Put differently, it has an absolute, direct address: it's a global function symbol. Good ol' objdump will confirm all of that for us:
NOSPLIT: Indicates to the compiler that it should not insert the stack-split preamble, which checks whether the current stack needs to be grown. In the case of our add function, the compiler has set the flag by itself: it is smart enough to figure out that, since add has no local variables and no stack-frame of its own, it simply cannot outgrow the current stack; thus it'd be a complete waste of CPU cycles to run these checks at each call site.
$0 denotes the size in bytes of the stack-frame that will be allocated; while $16 specifies the size of the arguments passed in by the caller.
"".b+12(SP) and "".a+8(SP) respectively refer to the addresses 12 bytes and 8 bytes below the top of the stack (remember: it grows downwards!).
.a and .b are arbitrary aliases given to the referred locations; although they have absolutely no semantic meaning whatsoever, they are mandatory when using relative addressing on virtual registers. The documentation about the virtual frame-pointer has something to say about this:
1. The first argument a is not located at 0(SP), but rather at 8(SP); that's because the caller stores its return-address in 0(SP) via the CALL pseudo-instruction.
2. Arguments are passed in reverse-order; i.e. the first argument is the closest to the top of the stack.
ADDL does the actual addition of the two Long-words (i.e. 4-byte values) stored in AX and CX, then stores the final result in AX. That result is then moved over to "".~r2+16(SP), where the caller had previously reserved some stack space and expects to find its return values. Once again, "".~r2 has no semantic meaning here.
true boolean value. The mechanics at play are exactly the same as for our first return value; only the offset relative to SP changes.
The RET pseudo-instruction tells the Go assembler to insert whatever instructions are required by the calling convention of the target platform in order to properly return from a subroutine call. Most likely this will cause the code to pop off the return-address stored at 0(SP) then jump back to it.
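To tie the walkthrough together, here is a rough Go rendition of what the body of add does, one statement per step. It is a sketch under assumptions: addSteps, ax and cx are names invented for this example, the int32/bool signature is inferred from the 4-byte operands and the boolean result, and the mnemonics in the comments are only an approximate reading of the listing.

```go
package main

import "fmt"

// addSteps mirrors the steps described above using plain variables in
// place of registers. It is a hypothetical illustration, not compiler output.
func addSteps(a, b int32) (r2 int32, r3 bool) {
	ax := b   // load b from 12(SP) into AX
	cx := a   // load a from 8(SP) into CX
	ax += cx  // ADDL: add the two 4-byte values, result in AX
	r2 = ax   // store AX into the first return slot at 16(SP)
	r3 = true // store the constant true into the second return slot
	return    // RET: hand control back to the caller
}

func main() {
	sum, ok := addSteps(10, 32)
	fmt.Println(sum, ok) // 42 true
}
```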
main.add has finished executing:
main function looks like:
main.main once linked) is a global function symbol in the .text section, whose address is some constant offset from the beginning of our address-space.
main grows its stack-frame by 24 bytes (remember that the stack grows downwards, so SUBQ here actually makes the stack-frame bigger) by decrementing the virtual stack-pointer. Of those 24 bytes:
8 bytes (16(SP)-24(SP)) are used to store the current value of the frame-pointer BP (the real one!) to allow for stack-unwinding and facilitate debugging
1+3 bytes (12(SP)-16(SP)) are reserved for the second return value (bool) plus 3 bytes of necessary alignment on amd64
4 bytes (8(SP)-12(SP)) are reserved for the first return value (int32)
4 bytes (4(SP)-8(SP)) are reserved for the value of argument b (int32)
4 bytes (0(SP)-4(SP)) are reserved for the value of argument a (int32)
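The 24-byte breakdown can be double-checked by modelling the frame as a Go struct, lowest address first. This is only a sketch: mainFrame and its field names are invented for the example, and the real frame is laid out by the compiler, not by a struct.

```go
package main

import (
	"fmt"
	"unsafe"
)

// mainFrame is a hypothetical model of main's stack-frame,
// ordered from 0(SP) upwards.
type mainFrame struct {
	a       int32   // argument a, starts at 0(SP)
	b       int32   // argument b, starts at 4(SP)
	ret1    int32   // first return value, starts at 8(SP)
	ret2    bool    // second return value, at 12(SP)
	_       [3]byte // alignment padding up to 16(SP)
	savedBP uint64  // saved frame-pointer, starts at 16(SP)
}

// frameLayout reports the offset of each named slot and the total size.
func frameLayout() (offsets [5]uintptr, size uintptr) {
	var f mainFrame
	offsets = [5]uintptr{
		unsafe.Offsetof(f.a),
		unsafe.Offsetof(f.b),
		unsafe.Offsetof(f.ret1),
		unsafe.Offsetof(f.ret2),
		unsafe.Offsetof(f.savedBP),
	}
	return offsets, unsafe.Sizeof(f)
}

func main() {
	offsets, size := frameLayout()
	fmt.Println(offsets, size) // [0 4 8 12 16] 24
}
```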
LEAQ computes the new address of the frame-pointer and stores it in BP.
137438953482 actually corresponds to the 10 and 32 4-byte values concatenated into one 8-byte value:
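A quick way to convince yourself is to split the constant back into its two halves in Go. The helper name splitWords is made up for this sketch.

```go
package main

import "fmt"

// splitWords splits an 8-byte value into its low and high 4-byte halves.
func splitWords(v uint64) (lo, hi uint32) {
	return uint32(v), uint32(v >> 32)
}

func main() {
	lo, hi := splitWords(137438953482)
	fmt.Println(lo, hi) // 10 32
}
```

The low half is the value that lands at the lower address, which is why a single 8-byte store can fill both 4-byte slots at once.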
add function as an offset relative to the static-base pointer: i.e. this is a straightforward jump to a direct address.
CALL also pushes the return-address (8-byte value) at the top of the stack; so every reference to SP made from within our add function ends up being offset by 8 bytes! E.g. "".a is not at 0(SP) anymore, but at 8(SP).
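The 8-byte shift is easy to reproduce with a toy model of a downward-growing stack. Everything here is hypothetical: toyStack, its methods, the starting SP of 40 and the return address are all made up for illustration, and only the push of an 8-byte value by CALL comes from the text.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// toyStack is a hypothetical model of a downward-growing stack:
// mem is the memory, sp the index of the current top of the stack.
type toyStack struct {
	mem []byte
	sp  int
}

// store4 writes a 4-byte value at off(SP).
func (s *toyStack) store4(off int, v uint32) {
	binary.LittleEndian.PutUint32(s.mem[s.sp+off:], v)
}

// load4 reads the 4-byte value at off(SP).
func (s *toyStack) load4(off int) uint32 {
	return binary.LittleEndian.Uint32(s.mem[s.sp+off:])
}

// call mimics the stack effect of CALL: it pushes an 8-byte
// return address, which moves SP down by 8.
func (s *toyStack) call(retAddr uint64) {
	s.sp -= 8
	binary.LittleEndian.PutUint64(s.mem[s.sp:], retAddr)
}

// argsSeenByCallee stores a and b the way the caller would, performs
// the call, and returns what the callee finds at 8(SP) and 12(SP).
func argsSeenByCallee(a, b uint32) (uint32, uint32) {
	s := &toyStack{mem: make([]byte, 64), sp: 40}
	s.store4(0, a)     // caller's view: a at 0(SP)
	s.store4(4, b)     // caller's view: b at 4(SP)
	s.call(0xdeadbeef) // made-up return address
	return s.load4(8), s.load4(12)
}

func main() {
	a, b := argsSeenByCallee(10, 32)
	fmt.Println(a, b) // 10 32
}
```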
NOSPLIT as a hint for the compiler not to insert these checks.
TLS is a virtual register maintained by the runtime that holds a pointer to the current g, i.e. the data-structure that keeps track of all the state of a goroutine.
g from the source code of the runtime:
g.stackguard0, which is the threshold value maintained by the runtime that, when compared to the stack-pointer, indicates whether or not a goroutine is about to run out of space. The prologue thus checks if the current SP value is less than or equal to the stackguard0 threshold (that is, the stack has grown down to or past the threshold), then jumps to the epilogue if it happens to be the case.
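Stripped of the assembly, the prologue's decision boils down to one comparison. The sketch below is a simplified model rather than runtime code: needsMoreStack and the addresses are invented for the example, and the real check is emitted by the compiler as machine instructions.

```go
package main

import "fmt"

// needsMoreStack models the prologue's test: because the stack grows
// downwards, an SP at or below the guard means space is running out.
// It is a hypothetical helper, not a runtime function.
func needsMoreStack(sp, stackguard0 uintptr) bool {
	return sp <= stackguard0
}

func main() {
	const guard = 0x1000 // made-up threshold address

	fmt.Println(needsMoreStack(0x2000, guard)) // false: plenty of room left
	fmt.Println(needsMoreStack(0x1000, guard)) // true: exactly at the guard
	fmt.Println(needsMoreStack(0x0800, guard)) // true: already past the guard
}
```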
NOP instruction just before the CALL exists so that the prologue doesn't jump directly onto a CALL instruction. On some platforms, doing so can lead to very dark places; it's a common practice to set up a no-op instruction right before the actual call and land on this NOP instead. [UPDATE: We've discussed this matter in issue #4: Clarify "nop before call" paragraph.]