Smallest executable program (x86-64)























I recently came across this post describing the smallest possible ELF executable. However, the post was written for 32-bit, and I was unable to get the final version to compile on my machine. This brings me to the question: what is the smallest x86-64 ELF executable that runs without error?


































  • What machine do you have? Windows Subsystem for Linux (which doesn't support 32-bit executables at all)? Or a proper Linux kernel built without IA-32 compat? What do you mean, you couldn't get the final version to even compile? Surely you got a binary file, but couldn't run it? (Anyway, I know your question isn't about that, but if you couldn't even compile the 32-bit version, you probably won't be able to use NASM's flat-binary output to create a 64-bit executable with code packed into the ELF headers either.)
    – Peter Cordes
    Nov 19 at 21:10








  • Can you use 32-bit int 0x80 system calls in your 64-bit executable? If so, you probably don't need to change much. I know there's some overlap of ELF header fields being interpreted as part of the machine code, so some change might be needed for ELF64.
    – Peter Cordes
    Nov 19 at 21:13






  • For 64-bit mode, you basically need to recreate the entire program, as both the machine code and the layout of the ELF header are quite different. While this is a nice exercise for an experienced programmer, I'm not sure if you are going to get an answer to your question within the scope of this site.
    – fuz
    Nov 19 at 21:33






  • I'm voting to close this question as off-topic because code golf questions are off-topic on Stack Overflow.
    – Ross Ridge
    Nov 19 at 22:32






















assembly x86-64 elf














edited Nov 19 at 21:08









Jester











asked Nov 19 at 21:03









Antonio Perez

























1 Answer






























Starting from an answer of mine about the "real" entry point of an ELF executable on Linux and "raw" syscalls, we can strip it down to:



bits 64
global _start
_start:
    mov di,42       ; only the low byte of the exit code is kept,
                    ; so we can use di instead of the full edi/rdi
    xor eax,eax
    mov al,60       ; shorter than mov eax,60
    syscall         ; perform the syscall


I don't think it can get any smaller without going out of spec; in particular, the psABI doesn't guarantee anything about the initial state of eax. This assembles to exactly 10 bytes (as opposed to the 7 bytes of the 32-bit payload):



66 bf 2a 00 31 c0 b0 3c 0f 05
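
For reference, here is a small Python sketch (not part of the original answer) that splits those 10 bytes by instruction. The encodings are the standard x86-64 ones; the names PAYLOAD and payload_bytes are made up for this illustration.

```python
# Byte-for-byte breakdown of the 10-byte payload quoted above.
# PAYLOAD and payload_bytes are hypothetical names used only for this sketch.
PAYLOAD = [
    ("mov di, 42",   bytes.fromhex("66bf2a00")),  # 66 = operand-size prefix, then mov di, imm16
    ("xor eax, eax", bytes.fromhex("31c0")),
    ("mov al, 60",   bytes.fromhex("b03c")),      # 60 is the exit syscall number on x86-64 Linux
    ("syscall",      bytes.fromhex("0f05")),
]


def payload_bytes(parts=PAYLOAD):
    """Concatenate the instruction encodings into one byte string."""
    return b"".join(code for _, code in parts)


if __name__ == "__main__":
    for mnemonic, code in PAYLOAD:
        print(f"{len(code)} bytes  {mnemonic}")
    print("total:", len(payload_bytes()))
```

The 4-byte mov di,42 is the obvious place to look for savings, since it pays for both a prefix byte and a 16-bit immediate.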




The straightforward way (assemble with nasm, link with ld) produces a 352-byte executable for me.



The first "real" transformation in the linked post is building the ELF file "by hand"; doing the same here (with some modifications, as the ELF headers for x86-64 are a bit bigger)



bits 64
org 0x08048000

ehdr:                               ; Elf64_Ehdr
    db 0x7F, "ELF", 2, 1, 1, 0      ; e_ident
    times 8 db 0
    dw 2                            ; e_type
    dw 62                           ; e_machine
    dd 1                            ; e_version
    dq _start                       ; e_entry
    dq phdr - $$                    ; e_phoff
    dq 0                            ; e_shoff
    dd 0                            ; e_flags
    dw ehdrsize                     ; e_ehsize
    dw phdrsize                     ; e_phentsize
    dw 1                            ; e_phnum
    dw 0                            ; e_shentsize
    dw 0                            ; e_shnum
    dw 0                            ; e_shstrndx

ehdrsize equ $ - ehdr

phdr:                               ; Elf64_Phdr
    dd 1                            ; p_type
    dd 5                            ; p_flags
    dq 0                            ; p_offset
    dq $$                           ; p_vaddr
    dq $$                           ; p_paddr
    dq filesize                     ; p_filesz
    dq filesize                     ; p_memsz
    dq 0x1000                       ; p_align

phdrsize equ $ - phdr

_start:
    mov di,42       ; only the low byte of the exit code is kept,
                    ; so we can use di instead of the full edi/rdi
    xor eax,eax
    mov al,60       ; shorter than mov eax,60
    syscall         ; perform the syscall

filesize equ $ - $$


we get down to 130 bytes. This is a tad bigger than the 91-byte executable of the 32-bit version, but the difference comes from several fields being 64 bits wide instead of 32 (plus the 3 extra bytes of code).
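
As a sanity check on the 130 and 91 byte figures, the header sizes can be recomputed from the field widths in the ELF specification. This is a minimal Python sketch; the format strings and the file_size helper are my own shorthand for the struct layouts, not something from the answer.

```python
import struct

# Field widths of the ELF headers ("<" = little-endian, no implicit padding).
# e_ident is 16 bytes; H = 16-bit, I = 32-bit, Q = 64-bit field.
ELF32_EHDR = "<16sHHIIIIIHHHHHH"
ELF32_PHDR = "<IIIIIIII"
ELF64_EHDR = "<16sHHIQQQIHHHHHH"
ELF64_PHDR = "<IIQQQQQQ"


def file_size(ehdr_fmt, phdr_fmt, code_len):
    """Size of a file made of one ELF header, one program header and the code."""
    return struct.calcsize(ehdr_fmt) + struct.calcsize(phdr_fmt) + code_len


if __name__ == "__main__":
    print("32-bit:", file_size(ELF32_EHDR, ELF32_PHDR, 7))    # 7-byte payload
    print("64-bit:", file_size(ELF64_EHDR, ELF64_PHDR, 10))   # 10-byte payload
```

The 39-byte difference splits into 12 bytes of ELF header, 24 bytes of program header and 3 bytes of code.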





We can then apply tricks similar to those in the linked post. The partial overlap of phdr and ehdr still works, although the order of fields in Elf64_Phdr is different (p_flags comes right after p_type), so we have to overlap p_flags with e_shnum (which should be ignored anyway, since e_shentsize is 0).



Moving the code inside the header is slightly more difficult: the code is 3 bytes larger, but that part of the header (the tail of e_ident) is just as big as in the 32-bit case. We overcome this by starting the code 2 bytes earlier, at offset 6, overwriting byte 7 of e_ident (EI_OSABI, effectively padding here: OK) and byte 6 (EI_VERSION: not OK, but it still works).



So, we reach:



bits 64
org 0x08048000

ehdr:                               ; Elf64_Ehdr
    db 0x7F, "ELF", 2, 1            ; e_ident (first 6 bytes; the code overlays the rest)
_start:
    mov di,42       ; only the low byte of the exit code is kept,
                    ; so we can use di instead of the full edi/rdi
    xor eax,eax
    mov al,60       ; shorter than mov eax,60
    syscall         ; perform the syscall
    dw 2                            ; e_type
    dw 62                           ; e_machine
    dd 1                            ; e_version
    dq _start                       ; e_entry
    dq phdr - $$                    ; e_phoff
    dq 0                            ; e_shoff
    dd 0                            ; e_flags
    dw ehdrsize                     ; e_ehsize
    dw phdrsize                     ; e_phentsize
phdr:                               ; Elf64_Phdr
    dw 1                            ; e_phnum     / p_type  (low word)
    dw 0                            ; e_shentsize / p_type  (high word)
    dw 5                            ; e_shnum     / p_flags (low word)
    dw 0                            ; e_shstrndx  / p_flags (high word)
ehdrsize equ $ - ehdr
    dq 0                            ; p_offset
    dq $$                           ; p_vaddr
    dq $$                           ; p_paddr
    dq filesize                     ; p_filesz
    dq filesize                     ; p_memsz
    dq 0x1000                       ; p_align

phdrsize equ $ - phdr
filesize equ $ - $$


which is 112 bytes long.
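
To make the overlap easier to see, here is a rough Python sketch that lays out the same fields in the same order with struct.pack and then reads the shared bytes back under both interpretations. It is only meant to mirror the listing, assuming the usual little-endian encoding; build_image and the constants are hypothetical names, and I have not compared the output byte for byte against NASM's.

```python
import struct

BASE = 0x08048000                              # same address as the org directive
CODE = bytes.fromhex("66bf2a0031c0b03c0f05")   # the 10-byte payload
EHDR_SIZE = 64                                 # sizeof(Elf64_Ehdr)
PHDR_SIZE = 56                                 # sizeof(Elf64_Phdr)
PHDR_OFF = EHDR_SIZE - 8                       # phdr starts 8 bytes before ehdr ends


def build_image():
    """Lay out the overlapped ELF header, program header and code."""
    image = b"\x7fELF" + bytes([2, 1])         # magic, ELFCLASS64, little-endian
    entry = BASE + len(image)                  # code begins at file offset 6
    image += CODE                              # overlays the rest of e_ident
    image += struct.pack(
        "<HHIQQQIHH",
        2, 62, 1,                              # e_type, e_machine, e_version
        entry, PHDR_OFF, 0,                    # e_entry, e_phoff, e_shoff
        0, EHDR_SIZE, PHDR_SIZE,               # e_flags, e_ehsize, e_phentsize
    )
    image += struct.pack("<HHHH", 1, 0, 5, 0)  # e_phnum..e_shstrndx = p_type, p_flags
    size = len(image) + 6 * 8                  # six 64-bit fields still to come
    image += struct.pack("<QQQQQQ", 0, BASE, BASE, size, size, 0x1000)
    return image


if __name__ == "__main__":
    image = build_image()
    print("file size:", len(image))
    print("e_phnum, e_shentsize, e_shnum, e_shstrndx:",
          struct.unpack_from("<HHHH", image, PHDR_OFF))
    print("p_type, p_flags:", struct.unpack_from("<II", image, PHDR_OFF))
```

The same 8 bytes read as e_phnum = 1 and e_shnum = 5 through the ELF header, and as p_type = 1 (PT_LOAD) and p_flags = 5 (read + execute) through the program header.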



Here I stop for the moment, as I don't have much time for this right now. You now have the basic layout with the relevant modifications for 64-bit, so you just have to experiment with more audacious overlaps.



























  • If you're golfing for code size and you still want to _exit(42) instead of xor edi,edi like a normal person, you'd use push 42 / pop rdi (3 bytes) instead of a 4-byte mov di, imm16 with its 66 prefix. And then a 3-byte lea eax, [rdi - 42 + 60] or another push/pop; see "Tips for golfing in x86/x64 machine code". Of course, in practice Linux does zero all the registers before process startup. Depending on your golfing rules, you might take advantage of that. (codegolf.SE only requires that code work on at least one implementation, not necessarily all.)
    – Peter Cordes
    Nov 19 at 22:50












  • To set only the low byte, another option is mov al,42 (2 bytes) /xchg eax,edi (1 byte).
    – Peter Cordes
    Nov 19 at 22:54






  • @PeterCordes: argh, the usual push/pop trick, I keep forgetting it... probably it's because I usually golf in 16-bit x86, where they aren't as useful (except for segment registers). _exit(42) is there to match the original, otherwise I would have just made it exit with whatever happened to be in rdi :-D. Unfortunately, as this is not a "regular" code-golf, there aren't really well-defined rules...
    – Matteo Italia
    Nov 19 at 23:11












  • I am at 9 Bytes with use64; xor edi, edi; mov al, 42; xchg eax, edi; mov al, 60; syscall?
    – sivizius
    Nov 20 at 23:13










  • @sivizius: you can get to 8 (2+1+2+1+2) using the tricks from @PeterCordes (push 42; pop rdi; push 60; pop rax; syscall)
    – Matteo Italia
    Nov 20 at 23:46
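
For anyone who wants to check the count, here is a short Python sketch of the 8-byte push/pop variant. The encodings are the standard ones (push imm8 is 6A ib, pop rdi is 5F, pop rax is 58); the GOLFED list and golfed_bytes are just illustrative names.

```python
# Per-instruction encodings of the push/pop variant from the comments.
# GOLFED and golfed_bytes are hypothetical names used only in this sketch.
GOLFED = [
    ("push 42", bytes.fromhex("6a2a")),  # push imm8: opcode 6A plus one immediate byte
    ("pop rdi", bytes.fromhex("5f")),    # exit status in rdi
    ("push 60", bytes.fromhex("6a3c")),
    ("pop rax", bytes.fromhex("58")),    # syscall number 60 (exit) in rax
    ("syscall", bytes.fromhex("0f05")),
]


def golfed_bytes(parts=GOLFED):
    """Concatenate the encodings into the final payload."""
    return b"".join(code for _, code in parts)


if __name__ == "__main__":
    print([len(code) for _, code in GOLFED], "->", len(golfed_bytes()), "bytes")
```

Unlike the 9-byte version, this one sets the full rdi and rax, so it does not depend on the initial register contents.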






































1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
3
down vote













Starting from an answer of mine about the "real" entrypoint of an ELF executable on Linux and "raw" syscalls, we can strip it down to



bits 64
global _start
_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall


I don't think you can get it to be any smaller without going out of specs - in particular, the psABI doesn't guarantee anything about the state of eax. This gets assembled to precisely 10 bytes (as opposed to the 7 bytes of the 32 bit payload):



66 bf 2a 00 31 c0 b0 3c 0f 05




The straightforward way (assemble with nasm, link with ld) produces me a 352 bytes executable.



The first "real" transformation he does is building the ELF "by hand"; doing this (with some modifications, as the ELF header for x86_64 is a bit bigger)



bits 64
org 0x08048000

ehdr: ; Elf64_Ehdr
db 0x7F, "ELF", 2, 1, 1, 0 ; e_ident
times 8 db 0
dw 2 ; e_type
dw 62 ; e_machine
dd 1 ; e_version
dq _start ; e_entry
dq phdr - $$ ; e_phoff
dq 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 1 ; e_phnum
dw 0 ; e_shentsize
dw 0 ; e_shnum
dw 0 ; e_shstrndx

ehdrsize equ $ - ehdr

phdr: ; Elf64_Phdr
dd 1 ; p_type
dd 5 ; p_flags
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq filesize ; p_filesz
dq filesize ; p_memsz
dq 0x1000 ; p_align

phdrsize equ $ - phdr

_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall

filesize equ $ - $$


we get down to 130 bytes. This is a tad bigger than the 91 bytes executable, but it comes from the fact that several fields become 64 bits instead of 32.





We can then apply some tricks similar to his; the partial overlap of phdr and ehdr can be done, although the order of fields in phdr is different, and we have to overlap p_flags with e_shnum (which however should be ignored due to e_shentsize being 0).



Moving the code inside the header is slightly more difficult, as it's 3 bytes larger, but that part of header is just as big as in the 32 bit case. We overcome this by starting 2 bytes earlier, overwriting the padding byte (ok) and the ABI version field (not ok, but still works).



So, we reach:



bits 64
org 0x08048000

ehdr: ; Elf64_Ehdr
db 0x7F, "ELF", 2, 1, ; e_ident
_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall
dw 2 ; e_type
dw 62 ; e_machine
dd 1 ; e_version
dq _start ; e_entry
dq phdr - $$ ; e_phoff
dq 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
phdr: ; Elf64_Phdr
dw 1 ; e_phnum p_type
dw 0 ; e_shentsize
dw 5 ; e_shnum p_flags
dw 0 ; e_shstrndx
ehdrsize equ $ - ehdr
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq filesize ; p_filesz
dq filesize ; p_memsz
dq 0x1000 ; p_align

phdrsize equ $ - phdr
filesize equ $ - $$


which is 112 bytes long.



Here I stop for the moment, as I don't have much time for this right now. You now have the basic layout with the relevant modifications for 64 bit, so you just have to experiment with more audacious overlaps






share|improve this answer





















  • If you're golfing for code-size and you still want to _exit(42) instead of xor edi,edi like a normal person, you'd use push 42/pop rdi (3 bytes) instead of a 4-byte 66 mov-di imm16. And then a 3-byte lea eax, [rdi - 42 + 60] or another push/pop. Tips for golfing in x86/x64 machine code. Of course in practice Linux does zero all the registers before process startup. Depending on your golfing rules, you might take advantage. (codegolf.SE only requires that code work on at least one implementation, not necessarily all.)
    – Peter Cordes
    Nov 19 at 22:50












  • To set only the low byte, another option is mov al,42 (2 bytes) /xchg eax,edi (1 byte).
    – Peter Cordes
    Nov 19 at 22:54






  • 1




    @PeterCordes: argh the usual push/pop trick, I keep forgetting it... probably it's because I usually golf in 16 bit x86, where they aren't as useful (except for segment registers). _exit(42) is there to match the original, otherwise I would have just made it exit with whatever happened to be in rdi :-D. Unfortunately, as this is not a "regular" code-golf, there aren't really well-defined rules...
    – Matteo Italia
    Nov 19 at 23:11












  • I am at 9 Bytes with use64; xor edi, edi; mov al, 42; xchg eax, edi; mov al, 60; syscall?
    – sivizius
    Nov 20 at 23:13










  • @sivizius: you can get to 8 (3+1+3+1+2) using the tricks from @PeterCordes (push 42; pop rdi; push 60; pop rax; syscall)
    – Matteo Italia
    Nov 20 at 23:46

















up vote
3
down vote













Starting from an answer of mine about the "real" entrypoint of an ELF executable on Linux and "raw" syscalls, we can strip it down to



bits 64
global _start
_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall


I don't think you can get it to be any smaller without going out of specs - in particular, the psABI doesn't guarantee anything about the state of eax. This gets assembled to precisely 10 bytes (as opposed to the 7 bytes of the 32 bit payload):



66 bf 2a 00 31 c0 b0 3c 0f 05




The straightforward way (assemble with nasm, link with ld) produces me a 352 bytes executable.



The first "real" transformation he does is building the ELF "by hand"; doing this (with some modifications, as the ELF header for x86_64 is a bit bigger)



bits 64
org 0x08048000

ehdr: ; Elf64_Ehdr
db 0x7F, "ELF", 2, 1, 1, 0 ; e_ident
times 8 db 0
dw 2 ; e_type
dw 62 ; e_machine
dd 1 ; e_version
dq _start ; e_entry
dq phdr - $$ ; e_phoff
dq 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 1 ; e_phnum
dw 0 ; e_shentsize
dw 0 ; e_shnum
dw 0 ; e_shstrndx

ehdrsize equ $ - ehdr

phdr: ; Elf64_Phdr
dd 1 ; p_type
dd 5 ; p_flags
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq filesize ; p_filesz
dq filesize ; p_memsz
dq 0x1000 ; p_align

phdrsize equ $ - phdr

_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall

filesize equ $ - $$


we get down to 130 bytes. This is a tad bigger than the 91 bytes executable, but it comes from the fact that several fields become 64 bits instead of 32.





We can then apply some tricks similar to his; the partial overlap of phdr and ehdr can be done, although the order of fields in phdr is different, and we have to overlap p_flags with e_shnum (which however should be ignored due to e_shentsize being 0).



Moving the code inside the header is slightly more difficult, as it's 3 bytes larger, but that part of header is just as big as in the 32 bit case. We overcome this by starting 2 bytes earlier, overwriting the padding byte (ok) and the ABI version field (not ok, but still works).



So, we reach:



bits 64
org 0x08048000

ehdr: ; Elf64_Ehdr
db 0x7F, "ELF", 2, 1, ; e_ident
_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall
dw 2 ; e_type
dw 62 ; e_machine
dd 1 ; e_version
dq _start ; e_entry
dq phdr - $$ ; e_phoff
dq 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
phdr: ; Elf64_Phdr
dw 1 ; e_phnum p_type
dw 0 ; e_shentsize
dw 5 ; e_shnum p_flags
dw 0 ; e_shstrndx
ehdrsize equ $ - ehdr
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq filesize ; p_filesz
dq filesize ; p_memsz
dq 0x1000 ; p_align

phdrsize equ $ - phdr
filesize equ $ - $$


which is 112 bytes long.



Here I stop for the moment, as I don't have much time for this right now. You now have the basic layout with the relevant modifications for 64 bit, so you just have to experiment with more audacious overlaps






share|improve this answer





















  • If you're golfing for code-size and you still want to _exit(42) instead of xor edi,edi like a normal person, you'd use push 42/pop rdi (3 bytes) instead of a 4-byte 66 mov-di imm16. And then a 3-byte lea eax, [rdi - 42 + 60] or another push/pop. Tips for golfing in x86/x64 machine code. Of course in practice Linux does zero all the registers before process startup. Depending on your golfing rules, you might take advantage. (codegolf.SE only requires that code work on at least one implementation, not necessarily all.)
    – Peter Cordes
    Nov 19 at 22:50












  • To set only the low byte, another option is mov al,42 (2 bytes) /xchg eax,edi (1 byte).
    – Peter Cordes
    Nov 19 at 22:54






  • 1




    @PeterCordes: argh the usual push/pop trick, I keep forgetting it... probably it's because I usually golf in 16 bit x86, where they aren't as useful (except for segment registers). _exit(42) is there to match the original, otherwise I would have just made it exit with whatever happened to be in rdi :-D. Unfortunately, as this is not a "regular" code-golf, there aren't really well-defined rules...
    – Matteo Italia
    Nov 19 at 23:11












  • I am at 9 Bytes with use64; xor edi, edi; mov al, 42; xchg eax, edi; mov al, 60; syscall?
    – sivizius
    Nov 20 at 23:13










  • @sivizius: you can get to 8 (3+1+3+1+2) using the tricks from @PeterCordes (push 42; pop rdi; push 60; pop rax; syscall)
    – Matteo Italia
    Nov 20 at 23:46















up vote
3
down vote










up vote
3
down vote









Starting from an answer of mine about the "real" entrypoint of an ELF executable on Linux and "raw" syscalls, we can strip it down to



bits 64
global _start
_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall


I don't think you can get it to be any smaller without going out of specs - in particular, the psABI doesn't guarantee anything about the state of eax. This gets assembled to precisely 10 bytes (as opposed to the 7 bytes of the 32 bit payload):



66 bf 2a 00 31 c0 b0 3c 0f 05




The straightforward way (assemble with nasm, link with ld) produces me a 352 bytes executable.



The first "real" transformation he does is building the ELF "by hand"; doing this (with some modifications, as the ELF header for x86_64 is a bit bigger)



bits 64
org 0x08048000

ehdr: ; Elf64_Ehdr
db 0x7F, "ELF", 2, 1, 1, 0 ; e_ident
times 8 db 0
dw 2 ; e_type
dw 62 ; e_machine
dd 1 ; e_version
dq _start ; e_entry
dq phdr - $$ ; e_phoff
dq 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 1 ; e_phnum
dw 0 ; e_shentsize
dw 0 ; e_shnum
dw 0 ; e_shstrndx

ehdrsize equ $ - ehdr

phdr: ; Elf64_Phdr
dd 1 ; p_type
dd 5 ; p_flags
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq filesize ; p_filesz
dq filesize ; p_memsz
dq 0x1000 ; p_align

phdrsize equ $ - phdr

_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall

filesize equ $ - $$


we get down to 130 bytes. This is a tad bigger than the 91 bytes executable, but it comes from the fact that several fields become 64 bits instead of 32.





We can then apply some tricks similar to his; the partial overlap of phdr and ehdr can be done, although the order of fields in phdr is different, and we have to overlap p_flags with e_shnum (which however should be ignored due to e_shentsize being 0).



Moving the code inside the header is slightly more difficult, as it's 3 bytes larger, but that part of header is just as big as in the 32 bit case. We overcome this by starting 2 bytes earlier, overwriting the padding byte (ok) and the ABI version field (not ok, but still works).



So, we reach:



bits 64
org 0x08048000

ehdr: ; Elf64_Ehdr
db 0x7F, "ELF", 2, 1, ; e_ident
_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall
dw 2 ; e_type
dw 62 ; e_machine
dd 1 ; e_version
dq _start ; e_entry
dq phdr - $$ ; e_phoff
dq 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
phdr: ; Elf64_Phdr
dw 1 ; e_phnum p_type
dw 0 ; e_shentsize
dw 5 ; e_shnum p_flags
dw 0 ; e_shstrndx
ehdrsize equ $ - ehdr
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq filesize ; p_filesz
dq filesize ; p_memsz
dq 0x1000 ; p_align

phdrsize equ $ - phdr
filesize equ $ - $$


which is 112 bytes long.



Here I stop for the moment, as I don't have much time for this right now. You now have the basic layout with the relevant modifications for 64 bit, so you just have to experiment with more audacious overlaps






share|improve this answer












Starting from an answer of mine about the "real" entrypoint of an ELF executable on Linux and "raw" syscalls, we can strip it down to



bits 64
global _start
_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall


I don't think you can get it to be any smaller without going out of specs - in particular, the psABI doesn't guarantee anything about the state of eax. This gets assembled to precisely 10 bytes (as opposed to the 7 bytes of the 32 bit payload):



66 bf 2a 00 31 c0 b0 3c 0f 05




The straightforward way (assemble with nasm, link with ld) produces me a 352 bytes executable.



The first "real" transformation he does is building the ELF "by hand"; doing this (with some modifications, as the ELF header for x86_64 is a bit bigger)



bits 64
org 0x08048000

ehdr: ; Elf64_Ehdr
db 0x7F, "ELF", 2, 1, 1, 0 ; e_ident
times 8 db 0
dw 2 ; e_type
dw 62 ; e_machine
dd 1 ; e_version
dq _start ; e_entry
dq phdr - $$ ; e_phoff
dq 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 1 ; e_phnum
dw 0 ; e_shentsize
dw 0 ; e_shnum
dw 0 ; e_shstrndx

ehdrsize equ $ - ehdr

phdr: ; Elf64_Phdr
dd 1 ; p_type
dd 5 ; p_flags
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq filesize ; p_filesz
dq filesize ; p_memsz
dq 0x1000 ; p_align

phdrsize equ $ - phdr

_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall

filesize equ $ - $$


we get down to 130 bytes. This is a tad bigger than the 91 bytes executable, but it comes from the fact that several fields become 64 bits instead of 32.





We can then apply some tricks similar to those of the linked post. The partial overlap of phdr and ehdr can still be done, although the order of fields in Elf64_Phdr is different: p_flags now comes right after p_type, so it ends up overlapping e_shnum (which, however, should be ignored because e_shentsize is 0).



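Moving the code inside the header is slightly more difficult: the payload is 3 bytes larger than in the 32-bit case, while that part of the header (the tail of e_ident) is exactly as big. We overcome this by starting the code earlier, at offset 6 of e_ident, right after the EI_DATA byte. Besides the padding bytes (ok), this also overwrites the EI_VERSION, EI_OSABI and EI_ABIVERSION bytes (not ok, but it still works).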



So, we reach:



bits 64
org 0x08048000

ehdr:                               ; Elf64_Ehdr
    db 0x7F, "ELF", 2, 1            ; e_ident (first 6 bytes only)
_start:                             ; the code fills the remaining 10 bytes of e_ident
    mov di,42       ; only the low byte of the exit code is kept,
                    ; so we can use di instead of the full edi/rdi
    xor eax,eax
    mov al,60       ; shorter than mov eax,60
    syscall         ; perform the syscall
    dw 2                            ; e_type
    dw 62                           ; e_machine
    dd 1                            ; e_version
    dq _start                       ; e_entry
    dq phdr - $$                    ; e_phoff
    dq 0                            ; e_shoff
    dd 0                            ; e_flags
    dw ehdrsize                     ; e_ehsize
    dw phdrsize                     ; e_phentsize
phdr:                               ; Elf64_Phdr
    dw 1                            ; e_phnum       p_type
    dw 0                            ; e_shentsize
    dw 5                            ; e_shnum       p_flags
    dw 0                            ; e_shstrndx
ehdrsize equ $ - ehdr
    dq 0                            ; p_offset
    dq $$                           ; p_vaddr
    dq $$                           ; p_paddr
    dq filesize                     ; p_filesz
    dq filesize                     ; p_memsz
    dq 0x1000                       ; p_align

phdrsize equ $ - phdr
filesize equ $ - $$


which is 112 bytes long.
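As a sanity check of the overlap, the following Python sketch rebuilds the same 112 bytes with struct and then reads them back twice: once as an Elf64_Ehdr at offset 0 and once as an Elf64_Phdr at e_phoff. It is only meant to show that both views decode to sensible values; the format strings and the hard-coded sizes (64, 56, 112) are my own transcription, not something produced by nasm.

```python
import struct

BASE = 0x08048000                          # same load address as the org directive
FILESIZE = 112
code = bytes.fromhex("66bf2a0031c0b03c0f05")   # the 10-byte payload

image = b"\x7fELF" + bytes([2, 1])         # magic, EI_CLASS = 64 bit, EI_DATA = little-endian
entry = BASE + len(image)                  # _start is at file offset 6
image += code                              # overwrites the rest of e_ident
image += struct.pack("<HHIQQQIHH",
                     2, 62, 1,             # e_type, e_machine, e_version
                     entry, 56, 0,         # e_entry, e_phoff, e_shoff
                     0, 64, 56)            # e_flags, e_ehsize, e_phentsize
phoff = len(image)                         # the program header starts here
image += struct.pack("<HHHH", 1, 0, 5, 0)  # e_phnum, e_shentsize, e_shnum, e_shstrndx
image += struct.pack("<QQQQQQ", 0, BASE, BASE, FILESIZE, FILESIZE, 0x1000)

# Decode the same bytes through both structures.
ehdr = struct.unpack_from("<16sHHIQQQIHHHHHH", image, 0)
phdr = struct.unpack_from("<IIQQQQQQ", image, phoff)

print("file size:", len(image))
print("e_phoff:", ehdr[5], "e_phnum:", ehdr[10], "e_shentsize:", ehdr[11], "e_shnum:", ehdr[12])
print("p_type:", phdr[0], "p_flags:", phdr[1], "p_filesz:", phdr[5])
```

The two 16-bit pairs (1, 0) and (5, 0) read as the 32-bit values 1 and 5 because everything is little-endian, which is what makes p_type = PT_LOAD and p_flags = read + execute come out right.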



I'll stop here for the moment, as I don't have much time for this right now. You now have the basic layout with the relevant modifications for 64 bit, so you just have to experiment with more audacious overlaps.

















answered Nov 19 at 22:25









Matteo Italia













  • If you're golfing for code-size and you still want to _exit(42) instead of xor edi,edi like a normal person, you'd use push 42/pop rdi (3 bytes) instead of the 4-byte mov di, imm16 with its 66 prefix. And then a 3-byte lea eax, [rdi - 42 + 60] or another push/pop. See "Tips for golfing in x86/x64 machine code". Of course, in practice Linux does zero all the registers before process startup. Depending on your golfing rules, you might take advantage of that. (codegolf.SE only requires that code work on at least one implementation, not necessarily all.)
    – Peter Cordes
    Nov 19 at 22:50












  • To set only the low byte, another option is mov al,42 (2 bytes) /xchg eax,edi (1 byte).
    – Peter Cordes
    Nov 19 at 22:54






  • @PeterCordes: argh, the usual push/pop trick, I keep forgetting it... probably it's because I usually golf in 16 bit x86, where they aren't as useful (except for segment registers). _exit(42) is there to match the original, otherwise I would have just made it exit with whatever happened to be in rdi :-D. Unfortunately, as this is not a "regular" code-golf, there aren't really well-defined rules...
    – Matteo Italia
    Nov 19 at 23:11












  • I am at 9 Bytes with use64; xor edi, edi; mov al, 42; xchg eax, edi; mov al, 60; syscall?
    – sivizius
    Nov 20 at 23:13










  • @sivizius: you can get to 8 (2+1+2+1+2) using the tricks from @PeterCordes (push 42; pop rdi; push 60; pop rax; syscall)
    – Matteo Italia
    Nov 20 at 23:46







































