Let's Code In Binary
2025-06-23
I remember first learning about binary in a computer science lesson at the age of 13ish. Even then I had a strong interest in computers and when the teacher told us we would get to write some binary in that lesson I was thrilled. I imagined finally getting to program the computer in the actual language it understood. Imagine my disappointment when that activity turned out to be drawing an image using 0s for white pixels and 1s for black.
Was this technically writing binary? Yes.
Did it satisfy my urge to commune with the computer on its own terms? Not even close.
Over a decade later this thought suddenly came back into my mind and I realised that I had spent a large chunk of the time since then studying computers, and that writing a program in binary was probably now finally within my reach.
And so, the challenge begins...
The rules
The objective is to write a hello world executable that I can run on my laptop, in binary.
Rule 1: No compilers
I'm including here almost anything that does a job that a compiler does. Assemblers and linkers are also out. Conversions between denary, hex, and binary must be done by hand. Jump destinations must be manually located. The process of producing the binary cannot be automated in any way.
Software that looks at the output binary is allowed (e.g. readelf, file). The only other bit of software I will use is a packer, to convert a text file of 0s and 1s into a binary file. This effectively lets me use a text editor to write individual bits, which they don't typically support for some reason.
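A packer of this kind can be tiny. The following is only a minimal sketch in Python, not the actual tool used for this challenge; the name `pack_bits` and the decision to ignore all whitespace are assumptions made for illustration.

```python
def pack_bits(text: str) -> bytes:
    """Convert a string of 0s and 1s (whitespace ignored) into raw bytes."""
    bits = "".join(text.split())
    if any(c not in "01" for c in bits):
        raise ValueError("input may only contain 0, 1 and whitespace")
    if len(bits) % 8 != 0:
        raise ValueError("number of bits must be a multiple of 8")
    # Each group of 8 characters becomes one byte, most significant bit first.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# Example: four bytes written out bit by bit.
packed = pack_bits("01111111 01000101 01001100 01000110")
print(packed)  # b'\x7fELF'
```

Writing the result to disk would then just be a matter of opening a file in binary mode and writing `packed` to it.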
Rule 2: No tutorials
Following a tutorial would completely trivialise the challenge. What I will use instead is specifications, official documentation, and kernel header files.
Rule 3: Binary must be hand typed
Every bit of the final executable must have been typed, manually, by me. Part of the fun of the challenge is to get a feel for the pain of early programmers using punch cards or physical switches to enter information into a computer. They did not use copy and paste, and neither will I.
Quick note
This post recounts how I went about completing this challenge. You should note then that if you read the post, you will violate rule 2 of your own challenge, should you choose to complete it. Obviously there is no reason for you to do this, nor is there any reason why you'd use the same rules as me, but just in case that's something you're interested in, you've been warned.
Let's write some binary
Step 1: What even is an executable?
My laptop currently runs the linux kernel. I already knew that an executable file for linux is an ELF file. Suspecting that my dubious knowledge of elves would not be enough, I looked for the documents that covered exactly what an ELF file actually was, and fairly quickly found this page. If you wanted to do this on Windows or MacOS, you'd need to work out what type of executable they use and find documentation for that (disappointingly, MacOS does have DWARF files but they are not equivalent to ELF files. I am not aware of any system with HOBBIT files).
My laptop also has an amd64 processor (a.k.a x86_64, the most common type of processor on modern laptops), so I went looking for a version with information specific to that and found this. Older devices typically use x86 (sometimes called 32-bit) and some devices such as recent macs use AArch64 (a completely different type of 64-bit processor). All these things make a big difference once you're dealing with binary!
Already I've made (at least) two implicit decisions. Firstly, ELF is not the only binary executable format which can run on linux, but it is by far the most common. Secondly, amd64 (64 bit) processors can also run x86 (32 bit) executables. It would've been simpler to write a 32 bit executable, but again, 64 bit is more common.
Step 2: File layout
The ABI document has a nice diagram here showing the general layout of an executable ELF file. It looks like this:
Get used to this diagram, it will be the thing that grounds us to reality as we are swept up into the dizzying world of bits and bytes.
Some parts aren't necessary apparently and the position doesn't actually have to follow the above, but it seems like the simplest option is just to stick with this structure. You might wonder why the blocks are weird shapes? Well at the time of writing I've actually already finished the executable (spoilers), so I know exactly how much space each section will need.
Step 3: ELF header
Might as well start at the top, right? The ELF Header is detailed here and a few 64 bit specifics are added in section 4.1 of the other document. It differs between 32 bit and 64 bit files, so we're using the 64 bit version, which, by my count, is (by pure coincidence I think) 64 bytes long.
In blue are the parts we can complete now. The parts which depend on the program header are in yellow, and those depending on the section header are in red. If you are colour blind, I am so sorry, these diagrams may become very difficult for you to see. I have attempted to write the accompanying paragraphs in ways that mean this can still be read without the diagrams. In reality it's probably incomprehensible madness with or without diagrams.
Starting from the top again, we immediately hit a weird value represented as 16 bytes.
Step 4: ELF identification
These first 16 bytes are totally hardware portable (the structure is the same for all processors). Each value is only a single byte (I suspect to allow portability between little and big endian systems, more on this later). It is also supposed to be completely future proof, so we have a whole chunk of redundant space that does nothing at all.
Finally, we can start filling out some bytes!
Every ELF file starts with the same 4 magic bytes. The first is `7f` and it is followed by `E`, `L`, and `F` in ASCII. Looking those up in an ASCII table gets us the first four bytes of our file: `7f 45 4c 46`.
As previously mentioned, we are writing a 64 bit executable, which corresponds to a value of `02` for the class.
My processor uses little endian encoding for multi-byte integers, yours probably does too (big endian is extremely uncommon in modern processors in my experience). We indicate little endian with a value of `01` in the data encoding field. I'll explain what endianness is later.
There is only one version of ELF currently, so we must use a value of `01` for the version field.
Some operating systems have bespoke extensions to ELF. If we are using any of those extensions then we need to specify the OS and version of the extensions. I don't know what any of these extensions are, and I want to keep things relatively simple, so we can use `00 00` for these two bytes to mean no extensions.
The spec says we must set the padding bytes to zeroes, so those will be `00 00 00 00 00 00 00`.
With that, we have the first 16 bytes of our file set!
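Purely as an after-the-fact check (not part of the hand-typed process), the identification bytes can be assembled in a few lines of Python to confirm that they add up to 16. The variable name `e_ident` is just an illustrative choice.

```python
e_ident = bytes([
    0x7F, ord("E"), ord("L"), ord("F"),  # magic bytes
    0x02,  # class: 64 bit
    0x01,  # data encoding: little endian
    0x01,  # version
    0x00,  # OS/ABI: no extensions
    0x00,  # ABI version
]) + bytes(7)  # seven zero padding bytes

print(len(e_ident))      # 16
print(e_ident.hex(" "))  # 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
```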
Step 5: More ELF header
We have the bytes for the identification field, but we can fill out some more now too.
The spec here lists the 5 possible values for the Type field. One of them is called "executable file". I want to be able to execute (i.e. run) this file as a program, so I picked this one, which calls for the value `2`.
The Type is also the first number that we need to encode using two bytes, which means we need to come back to the little endian choice we made earlier. A `2` in hex is written as `0002` (I've included some leading zeroes since we're about to use 2 bytes). When we convert this to a series of bytes, there are two ways of doing it (there are more but ELF only has two):
- `00 02` how you'd expect. The most significant byte is on the left. This is called big endian. In my experience it's most common in networking protocols.
- `02 00` kinda backwards. It's the same two bytes, but they are in reverse order. This is called little endian, and is what is most common for data in memory on modern architectures, and what I'm using for my executable.
We're going to see a lot more of little endian as we keep going. For the Type field, it means the two bytes we want are `02 00` in that order.
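If you want to convince yourself of the byte order without trusting your own hand conversion, Python's built-in `int.to_bytes` shows both encodings side by side. This is only an illustrative check; the value `0x1234` is an arbitrary example that does not appear in the executable.

```python
type_value = 2

print(type_value.to_bytes(2, "big").hex(" "))     # 00 02
print(type_value.to_bytes(2, "little").hex(" "))  # 02 00

# An arbitrary value where the two bytes differ makes the reversal obvious.
print((0x1234).to_bytes(2, "big").hex(" "))       # 12 34
print((0x1234).to_bytes(2, "little").hex(" "))    # 34 12
```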
The spec lists a ton of different Machine values. Fortunately, the x86_64 specific document here says that I have to use `EM_X86_64`, which corresponds to the value 62 in denary, or `3e` in hex. As 2 bytes in little endian that is `3e 00`.
Once again, there is only one allowed value for the Version field, `1`. This is a 4 byte field, and little endian again, so the bytes we want are `01 00 00 00`. It is insane to me that the designers of ELF decided they wanted to future proof it to the point of allowing it to have over four billion versions. The fact that it's been decades and we haven't even had two only makes it funnier.
There are a few fields we'll have to come back to, so let's skip to Flags. Apparently there are some processor specific flags I could set. I didn't go looking for them and just left this as `00 00 00 00`.
The last field we can fill out now is the ELF Header Size. As mentioned earlier, I manually counted up how many bytes were in the header to get `64` in denary, which in hex is `40`. We get 2 bytes little endian for this, which will be `40 00`.
And we're done with the ELF header for now!
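The manual byte count can be double-checked with Python's `struct` module. The format string below is my reading of the 64 bit ELF header layout (16 identification bytes, then a mix of 2, 4 and 8 byte fields), so treat it as a sketch rather than an authoritative definition.

```python
import struct

# "<" means little endian with no implicit padding.
# 16s = identification, H = 2 byte field, I = 4 byte field, Q = 8 byte field.
ELF64_HEADER_FORMAT = "<16s" + "HH" + "I" + "QQQ" + "I" + "HHHHHH"
#                      ident  type,  ver   entry,  flags  ehsize, phentsize, phnum,
#                             machine      phoff,         shentsize, shnum, shstrndx
#                                          shoff

header_size = struct.calcsize(ELF64_HEADER_FORMAT)
print(header_size)                                # 64
print(hex(header_size))                           # 0x40
print(header_size.to_bytes(2, "little").hex(" "))  # 40 00
```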
Step 6: Segment/s and Section/s
There is a lot we can't fill out in the program header table until we know more about the segments we want, so we'll skip ahead slightly for now.
The segments are the bits of the ELF file that become the actual memory of the process once execution starts. Memory is where instructions and data live, so how many segments we have and what we put in them will depend on what instructions and data our program needs. If we want bits of memory with different properties (e.g. make sure the CPU doesn't accidentally execute our data thinking they are instructions) or chunks dotted around memory rather than in one contiguous block, we'd use multiple segments to achieve this. It's much simpler and still works to just have all the data and instructions in a single segment with all the permissions though, so we'll do that.
The same part of the file that contains segments can also contain sections, which can overlap with the segments. This splitting of those bytes into sections is to help linkers combine multiple ELF files into one. Since we are just writing this one ELF file and will not be linking it to any others, we might not need any sections at all. I'm not really sure though, and to be on the safe side, let's just put our one segment in a single big section.
Great, we know we want one contiguous block of memory, but what to put in it? Welcome to the programming part! Are you ready for pain? Then have a look at this colossal manual. This manual details how 64 bit intel processors will interpret binary as instructions. Once our segment is loaded into memory somewhere, the operating system will execute our program starting with the instruction located at the memory address specified in the Entry Point ELF header field.
Our program is going to be a "Hello, world!" program, and so has two essential pieces.
- Print "Hello, world!"
- Exit
The `exit` is important, otherwise the processor will just keep running whatever bytes come after the print.
Step 7: Print "Hello, world!"
Both IO (input/output, which includes print) and exit are features provided to us by the linux kernel, so we need a way for our program to call a procedure that exists in the kernel. This is exactly what syscalls were invented for. They are a bit like procedures, but when you call them your program is paused and the kernel takes control, then once they are finished the kernel hands control back over to your program.
Linux inherited the "everything is a file" ethos from Unix, which means that printing to a terminal is actually writing to a file. We can do that with the perfectly named `write` syscall.
Section A.2.1 from the amd64 ABI has some details about how to actually do syscalls on an amd64 processor.
- Set certain registers to the arguments of the syscall
- Set the `%rax` register to the number of the syscall
- Run the `syscall` instruction
Step 7.1: Arguments to pass to write
Now it gets serious, time to look at the actual source code of the linux kernel. Look for the function signature of `sys_write`, and you'll see 3 parameters:
- `unsigned int fd` - the file descriptor of the file to write to. We're actually printing, not storing data in a file, so we will write to `stdout` (standard out) which has a file descriptor of 1.
- `const char __user *buf` - the address of the data to write. We haven't put this data anywhere yet, so we don't know this address. We'll come back to this one.
- `size_t count` - the number of bytes to write. We know that our eventual message should be `"Hello, world!\n"`. We will encode it in ASCII, which will take 14 bytes. Note that the string does not need to be zero terminated.
The amd64 ABI lists the registers used for the parameters of the kernel interface (syscalls). That gives us:
Parameter name | Type | Register | Value |
---|---|---|---|
`fd` | `int` | `%rdi` | 1 |
`buf` | `const char __user *` | `%rsi` | unknown |
`count` | `size_t` | `%rdx` | 14 |
That massive intel manual is largely made up of an alphabetical list of every single amd64 instruction. There are loads of different ways to store a value into a register, but the most intuitive is probably `mov`, which generally is for copying (moving) a value from one place to another.
`fd` is an `int` which means it should be 32 bits long, so we'll use `mov r32, imm32` to move a 32 bit immediate value (a constant value that we set) into a 32 bit register. The manual has an opcode for this instruction which is `b8 + rd id`. Section 3.1.1.1 of the manual explains what the `+ rd id` bit means. `+rd` means we use the lowest 3 bits of the first byte (the `b8`) to say which register to move into. `id` means we put a 4 byte (little endian) value after that first byte with the immediate value to move into the register.
The same section also shows which registers correspond to which numbers, `%edi` (the 32 bit version of `%rdi`) is `7`. We also know that 1 as a 4 byte little endian value is `01 00 00 00`. Finally we combine the first 5 bits of `b8` (`10111`) with the three bits in `7` (`111`) to get `bf` (`10111111`), and then add the 4 bytes of the value to get `bf 01 00 00 00`.
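The bit fiddling for `mov r32, imm32` can be written down as a short Python function, which is handy for checking a hand encoding afterwards. The function and constant names here are hypothetical, and the sketch only covers the eight registers that fit in the 3 bit field.

```python
def mov_r32_imm32(reg: int, value: int) -> bytes:
    """Encode `mov r32, imm32` as opcode b8+rd followed by a 4 byte immediate."""
    if not 0 <= reg <= 7:
        raise ValueError("register number must fit in 3 bits")
    opcode = 0xB8 | reg  # low 3 bits of b8 are zero, so OR drops the register in
    return bytes([opcode]) + value.to_bytes(4, "little")


EDI = 7  # register number for %edi, as given in the text above

print(mov_r32_imm32(EDI, 1).hex(" "))  # bf 01 00 00 00
```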
Setting the count to 14 is very similar. `size_t` is a 64 bit value, so we'll use `mov r64, imm64`. The opcode is `rex.w b8 + rd io`. Section 2.2.1 tells us that `rex.w` is the byte `48`. `io` is just like `id` but 8 bytes instead of 4. `%rdx` is register `2`, so the opcode byte becomes `ba`. With that, we have the encoding of the instruction to set the count argument: `48 ba 0e 00 00 00 00 00 00 00`.
We don't know the value we need for the data address yet, but we know it will be a 64 bit value and must be in `%rsi`, which is register `6`. Following the same process as with `count` we know we'll have `48 be ?? ?? ?? ?? ?? ?? ?? ??`. We can fill the gap with the little endian address later.
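The 64 bit variant follows the same pattern with a `rex.w` prefix and an 8 byte immediate. Again this is only a checking sketch with made-up names, limited to registers 0 to 7 (the extra registers %r8 to %r15 need a different prefix byte and are not handled here).

```python
REX_W = 0x48


def mov_r64_imm64(reg: int, value: int) -> bytes:
    """Encode `mov r64, imm64` as rex.w, opcode b8+rd, then an 8 byte immediate."""
    if not 0 <= reg <= 7:
        raise ValueError("this sketch only handles registers 0 to 7")
    return bytes([REX_W, 0xB8 | reg]) + value.to_bytes(8, "little")


RDX = 2  # register numbers as used in the encodings above
RSI = 6

print(mov_r64_imm64(RDX, 14).hex(" "))     # 48 ba 0e 00 00 00 00 00 00 00
print(mov_r64_imm64(RSI, 0)[:2].hex(" "))  # 48 be (the address bytes come later)
```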
Step 7.2: write syscall number
Every syscall in linux has a number given to it. To find the number of a syscall, look at this other linux kernel file. `write` is near the top with a number of `1` so the `%rax` register will need to be set to 1 which we can do with another `mov`. The `mov r64, imm64` for this is `48 b8 01 00 00 00 00 00 00 00`.
Step 7.3: Do a syscall
The intel manual has over two pages dedicated to the `syscall` instruction, but it looks like the right one so the only bit we care about is the opcode in the top left: `0f 05`.
Step 8: Exit
`exit` is another syscall, but is much simpler than `write` so I'll summarise. It takes one parameter, an integer with an exit code. We will use 0 which is usually used to indicate to the caller that the program ran successfully. It is a 4 byte integer and, being the first argument, must be stored in `%rdi`. We'll use `mov r32, imm32` for that which is encoded as `bf 00 00 00 00`.
The syscall number of `exit` is 60, or `3c` in hex. We'll use `mov r64, imm64` again to set `%rax` to that, which is encoded as `48 b8 3c 00 00 00 00 00 00 00`. Finally we do another `syscall` instruction to actually run it.
Just like that, we're done with instructions!
Step 9: Hello, world!
You may have noticed a suspicious gap in my diagrams. Well instructions aren't the only thing that we will need to load into memory to run the program, there's also the data.
For this tiny program, the only data is the string `"Hello, world!"`. Linux terminals expect standard out to be ASCII, so we will need to ASCII encode the string. Check an ASCII reference table (e.g. this one) with each character in the string (and a newline character at the end), and you should get the bytes `48 65 6c 6c 6f 2c 20 77 6f 72 6c 64 21 0a`.
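Looking each character up by hand is the point of the challenge, but the result is easy to verify afterwards with Python's built-in ASCII codec. This is just a check on the table lookups, with an illustrative variable name.

```python
message = "Hello, world!\n".encode("ascii")

print(len(message))      # 14
print(message.hex(" "))  # 48 65 6c 6c 6f 2c 20 77 6f 72 6c 64 21 0a
```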
Step 10: Program header table
The program header is detailed here, and when mapped onto a diagram looks like this:
The spec lists various different values for Type, but we want to load data into memory, so we will choose `PT_LOAD` which has the value 1, or `01 00 00 00` as 4 little endian bytes.
Flags determines the permissions of the memory that the segment will be loaded into. Our segment contains both instructions and data, but we only need to read the data, not write to it. That means we can get away with just giving the segment read and execute permissions, which calls for a flags value of `05 00 00 00`.
Offset is the location in the file of the segment. Now that we know the size of the ELF header (64 bytes) and the program header table (56 bytes), we can deduce that the segment is 120 bytes into the file, or `78 00 00 00 00 00 00 00` as a little endian 64 bit integer.
We'll jump to Align next, since it's a factor in our choice of Virtual Address. A couple of observations on this:
- Section 5.1 of the ABI document for amd64 states that if the offset and virtual address are congruent modulo the page size, then the loading into memory can be much more efficient.
- In the spec for the align field, it states that the offset and virtual address must be congruent modulo the alignment.
How very interesting, those are the same condition!
I take this to mean that the purpose of Align is to tell the operating system the page size for which the executable is designed, potentially allowing the OS to load it more efficiently. I want my executable to be designed for my OS (shocking I know), so I ran `getconf PAGESIZE` in the shell and found out my page size is 4096 (I bet yours is too). Based on that, we'll set Align to 4096, which is `1000` in hex, or `00 10 00 00 00 00 00 00` as 8 little endian bytes.
Now we have a restriction on Virtual Address. Section 3.3.4 of the ABI document also says that programs will typically load their segments into the range `0x400000` to `0x80000000000`. We'll (arbitrarily) pick the least possible address that meets the alignment criterion, which is `0x400078`. Encoded as a 64 bit little endian integer, that's `78 00 40 00 00 00 00 00`.
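The "least possible address that meets the alignment criterion" can be found mechanically, which makes a nice cross-check. The sketch below assumes the 4096 byte page size and the `0x400000` lower bound from the text; the variable names are illustrative.

```python
PAGE_SIZE = 4096
LOWEST_ADDRESS = 0x400000

offset = 64 + 56  # ELF header plus one program header

# Smallest address at or above the lower bound that is congruent to the
# file offset modulo the page size.
vaddr = next(
    addr
    for addr in range(LOWEST_ADDRESS, LOWEST_ADDRESS + PAGE_SIZE)
    if addr % PAGE_SIZE == offset % PAGE_SIZE
)

print(hex(offset))                            # 0x78
print(hex(vaddr))                             # 0x400078
print(vaddr.to_bytes(8, "little").hex(" "))   # 78 00 40 00 00 00 00 00
```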
The spec says that System V ignores physical addressing, so Physical Address is totally unneeded and can be set to anything. I'll just use `00 00 00 00 00 00 00 00` (I know, so original).
File Size is easy, our segment uses 68 bytes (54 bytes of instructions plus the 14 byte string), which encodes to `44 00 00 00 00 00 00 00`.
Memory Size is a separate field in case we want some of the memory not to be based on the file. We don't need that though, so the memory size will be the same as the file size: `44 00 00 00 00 00 00 00`.
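All eight fields can be packed in one go with Python's `struct` module to check the hand-assembled bytes. The field order in the format string is my reading of the 64 bit program header layout, and the constant names mirror the spec's names for the type and permission flags; treat it as a sketch for verification only.

```python
import struct

PT_LOAD = 1
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4  # execute, write, read

program_header = struct.pack(
    "<IIQQQQQQ",   # little endian: two 4 byte fields, then six 8 byte fields
    PT_LOAD,       # Type
    PF_R | PF_X,   # Flags: read + execute = 5
    0x78,          # Offset
    0x400078,      # Virtual Address
    0,             # Physical Address (ignored)
    68,            # File Size
    68,            # Memory Size
    4096,          # Align
)

print(len(program_header))  # 56
# Print the header in rows of 8 bytes for easy comparison.
for i in range(0, len(program_header), 8):
    print(program_header[i:i + 8].hex(" "))
```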
Here's how the program header table looks with all that put together:
Step 11: Section header table
The only part of the overall file we still need to look at is the section header table. This is an array of section headers which are described here.
Reading through the spec for the section header table, it mentions that a section index of 0 is reserved as a null section. Figure 4-10 in the section header spec gives more detail, listing that every field either must be 0 or is unspecified. The size and link fields need to be non-zero in cases where there are so many sections that the SH entity number field in the ELF header can't fit them into 2 bytes, but we won't have that many so these fields will just be zero too. Filling out the first section header is really easy then!
For the second section header, we have to actually do some work. The name field is set to the name of the section. But hang on, the name field is a 4 byte integer, not a string, how does that work? Well the integer stored in the name field acts as an index into a special section that stores all the strings. We index into that section to get a character which is the start of the name, and then continue reading subsequent characters in the section until we reach a null byte. The name is everything from the starting character we indexed up to but not including the null byte. We don't really need to give our section a name (I'm actually not sure if we need a section at all) and conveniently, the spec tells us that an index of 0 must always be the empty string, so we will just use that as our name.
There are a whole bunch of possible values for the type field. Since our section just contains data to be mapped into memory for the program to run, `SHT_PROGBITS` seems to make the most sense. It has a value of `1`.
There are also quite a few flags that we can set, so I'll just cover the ones we care about. Our section will be loaded into memory, and contains instructions that will be executed, so the two flags we need are `SHF_ALLOC` and `SHF_EXECINSTR`. This gives a flags value of `06 00 00 00 00 00 00 00`.
The address field is the same as the virtual address in the program header, as are offset and size, so we can just copy the same values over.
The use of the link field depends on the type of the section, but `SHT_PROGBITS` sections don't use it, so we'll set it to 0. The same applies to the info field.
The address align field is NOT the same as the align field in the program header. It specifies a value which the address must be divisible by. We don't have a requirement for that, so we'll just set it to 0.
Some sections store an array of elements, in which case the entity size field would store the size of a single element in that array. Our section is not an array, so this field must be set to 0.
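As with the other headers, the second section header can be checked by packing the same values with Python's `struct` module. The format string reflects my understanding of the 64 bit section header layout, and the flag constants use the spec's names; it's a verification sketch, not how the bytes were produced.

```python
import struct

SHT_PROGBITS = 1
SHF_ALLOC, SHF_EXECINSTR = 0x2, 0x4

section_header = struct.pack(
    "<IIQQQQIIQQ",
    0,                          # name: index 0, the empty string
    SHT_PROGBITS,               # type
    SHF_ALLOC | SHF_EXECINSTR,  # flags = 6
    0x400078,                   # address (same as the segment's virtual address)
    0x78,                       # offset
    68,                         # size
    0,                          # link
    0,                          # info
    0,                          # address align
    0,                          # entity size
)

print(len(section_header))                  # 64
print(section_header[8:16].hex(" "))        # 06 00 00 00 00 00 00 00
```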
Step 12: ELF header again
The only things left to fill out are the remaining parts of the ELF header that we started with.
Some of these are really easy now. We can count bytes to determine that the program header table is 64 bytes into the file, and contains one program header which is 56 bytes in size. Similarly our section header table is 188 bytes into the file and contains two entries which are each 64 bytes in size.
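The byte counting here is simple addition, but it's easy to slip when doing it by hand, so here is the arithmetic spelled out in Python. The variable names are purely illustrative.

```python
ELF_HEADER_SIZE = 64
PROGRAM_HEADER_SIZE = 56
SEGMENT_SIZE = 68  # instructions plus the string

program_header_offset = ELF_HEADER_SIZE
segment_offset = program_header_offset + PROGRAM_HEADER_SIZE
section_header_offset = segment_offset + SEGMENT_SIZE

print(program_header_offset)       # 64
print(segment_offset)              # 120
print(section_header_offset)       # 188
print(hex(section_header_offset))  # 0xbc
print(section_header_offset.to_bytes(8, "little").hex(" "))  # bc 00 00 00 00 00 00 00
```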
We hit a snag at the string index though, we don't have a string section! Fortunately, the spec specifies that we can use the null section as the string section, and we haven't actually used any strings (our only chance was the section name, but we set that to the empty string which is still valid even for an empty string table). That means we should be able to set the string index to 0.
Finally we have our entry point, which should be the virtual address of the first instruction we want to run in our program. We specified in our program header that the instructions should start at the address `0x400078`, so that should be our entry point.
There's also one other tiny thing that is not done yet. The address of the "Hello, world!" string is not encoded into the instruction that uses it yet. Knowing that our first instruction will have address `0x400078` and that the string starts 54 bytes after it, we get that the string has address `0x4000ae`. As a 64 bit little endian value that is `ae 00 40 00 00 00 00 00`, which fills the gap in the `mov` to `%rsi`.
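The 54 byte figure comes from adding up the lengths of every instruction that precedes the string. A short Python sketch makes that sum explicit; the mnemonics in the left column are informal labels of my own, and the address bytes of the second instruction are left as zeros because only its length matters here.

```python
ENTRY_POINT = 0x400078

instructions = [
    ("mov edi, 1",       "bf 01 00 00 00"),
    ("mov rsi, string",  "48 be 00 00 00 00 00 00 00 00"),  # address not filled in
    ("mov rdx, 14",      "48 ba 0e 00 00 00 00 00 00 00"),
    ("mov rax, 1",       "48 b8 01 00 00 00 00 00 00 00"),
    ("syscall",          "0f 05"),
    ("mov edi, 0",       "bf 00 00 00 00"),
    ("mov rax, 60",      "48 b8 3c 00 00 00 00 00 00 00"),
    ("syscall",          "0f 05"),
]

code_length = sum(len(bytes.fromhex(encoding)) for _, encoding in instructions)
string_address = ENTRY_POINT + code_length

print(code_length)                                    # 54
print(hex(string_address))                            # 0x4000ae
print(string_address.to_bytes(8, "little").hex(" "))  # ae 00 40 00 00 00 00 00
```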
Step 13: Putting it all together
We've done a really good job of sticking to the rules thus far (well, I did, you have broken rule 2 you filthy cheat), but if you cast your mind back to rule 3, the challenge is not complete until I have hand typed the binary of this file and checked that it functions as expected.
So that's what I did.
01111111 01000101 01001100 01000110 00000010 00000001 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000010 00000000 00111110 00000000 00000001 00000000 00000000 00000000 01111000 00000000 01000000 00000000 00000000 00000000 00000000 00000000 01000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 10111100 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 01000000 00000000 00111000 00000000 00000001 00000000 01000000 00000000 00000010 00000000 00000000 00000000 00000001 00000000 00000000 00000000 00000101 00000000 00000000 00000000 01111000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 01111000 00000000 01000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 01000100 00000000 00000000 00000000 00000000 00000000 00000000 00000000 01000100 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00010000 00000000 00000000 00000000 00000000 00000000 00000000 10111111 00000001 00000000 00000000 00000000 01001000 10111110 10101110 00000000 01000000 00000000 00000000 00000000 00000000 00000000 01001000 10111010 00001110 00000000 00000000 00000000 00000000 00000000 00000000 00000000 01001000 10111000 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00001111 00000101 10111111 00000000 00000000 00000000 00000000 01001000 10111000 00111100 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00001111 00000101 01001000 01100101 01101100 01101100 01101111 00101100 00100000 01110111 01101111 01110010 01101100 01100100 00100001 00001010 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000 00000110 00000000 00000000 00000000 00000000 00000000 00000000 00000000 01111000 00000000 01000000 00000000 00000000 00000000 00000000 00000000 01111000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 01000100 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
If you take that binary, pack it into a file, make it executable, and execute it on an amd64 linux system, you should see "Hello, world!" printed!
Debugging
I did actually do this. The first time, I did it properly and manually converted and typed out all the bits, but while doing it a second time for this post, I took a shortcut and used a program for the conversion. I also did things differently this time: I used some slightly different instructions the first time, and changed them so that fewer distinct instructions are used, to make it easier to understand. This means that on two separate occasions, I spent hours creating a binary file only to have it not work.
The first time I did this I executed my binary only to be immediately hit with a segmentation fault. I examined my binary using `file` and `readelf` and discovered that there were two typos. I fixed these, repacked the binary, reran the program, and... segmentation fault. I had tried loading the segment to address `0x78` instead of `0x400078`. I couldn't (and still can't) find anything in the spec that says this is not allowed. It does say that address 0 is strictly off limits, but I'm not using address 0 so what's the problem? Anyway, I eventually tried changing the virtual address to the more standard one greater than `0x400000` and finally it worked!
Quick aside: `file` and `readelf` are existing tools, but they only examine existing binary and do not do anything to create the binary, and are therefore allowed under my very arbitrary rules. Could you imagine looking for a typo in the above block of binary without using something to simplify it first?
When recreating the binary while writing this post, I somehow made more mistakes than I did the first time. First of all my system refused to run it because it was for the wrong machine. I looked at the hexadecimal and quickly realised that I'd only used 2 bytes for the version in the ELF header instead of 4. I fixed that and was met with the same error. I tried `readelf` once again and discovered that my file had an invalid type and was designed for some random niche architecture, not amd64. The type was just a typo so easily fixed. The architecture, however, was because I had converted 62 denary into hex and arrived at... 62. Facepalm.
Fixing both of those problems actually got some progress. The executable would now print hello world as intended, but would then segmentation fault instead of exiting cleanly. I scrutinised the instructions for the exit syscall and realised that I had thought the syscall number for exit was `59` when, in fact, it is `60`. With that I finally had a second executable written in binary!
Conclusion
Honestly, this was really fun and I'd highly recommend it. Even though you've read this article you could probably just wait a year and then try it and you'll have forgotten all the important details. It was tough to find and read all the relevant documentation, but it is fascinating to have a fuller understanding of how the software on my computer runs.
Plus, now that I've laid this groundwork, the sky's the limit babeee! I could keep making increasingly complex programs, eventually bootstrapping a full application. Equally I could go the other way and use hardware to bootstrap the file instead of a packing program and text editor. At this point, why not both?
If you've found this interesting, I'd love to hear from you. My email address is shtanton at shtanton dot xyz.