October 12, 2024
One reason I am excited about starting this blog is that I have a long backlog of “dumb stuff I have done with computers” that I want to share with the world. Here’s one of them.
To show you the idea, check out this PDF.
A little-known (I think) feature of Python is that you can run code directly from a .zip file containing a __main__.py. Check this out:
$ echo 'print("it works!")' > __main__.py
$ zip -q code.zip __main__.py
$ python code.zip
it works!
(See this part of the Python docs.)
Now consider the following two facts about the structure of ZIP and PDF files:
ZIP files are read from the back: the directory that says where each entry lives sits at the end of the file.
PDF files start with a header and use %%EOF to mark the end of the file.
But wait. This means that if we just concatenate a PDF and a ZIP file, the result will still start with the PDF header and be a PDF until %%EOF, and the last part of the file will be the ZIP directory, which describes in relative offsets where to find the ZIP data.
👀
So it’s a valid file, in both formats!
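Here is a small sketch of that argument in Python, done entirely in memory. The "PDF" is only a placeholder I made up for illustration (a header line and a %%EOF marker, not a real document), so it shows the ZIP half of the claim: the zipfile module finds the directory at the end and works out that the entry now sits after the prepended bytes.

```python
import io
import zipfile

# Placeholder standing in for a real PDF: just the header and the %%EOF marker.
# These bytes are NOT a valid PDF document; swap in real PDF bytes to try it properly.
fake_pdf = b"%PDF-1.4\n% placeholder body, not a real document\n%%EOF\n"

# Build a small ZIP archive in memory containing __main__.py.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("__main__.py", 'print("it works!")\n')
zip_bytes = zip_buffer.getvalue()

# Concatenate: PDF first, ZIP second.
polyglot = fake_pdf + zip_bytes

# The front of the file still looks like a PDF...
starts_like_pdf = polyglot.startswith(b"%PDF-")

# ...and zipfile still reads the archive via the directory at the end.
with zipfile.ZipFile(io.BytesIO(polyglot)) as archive:
    names = archive.namelist()
    source = archive.read("__main__.py").decode()
    # Where zipfile believes the entry starts, after adjusting for the prefix.
    entry_offset = archive.getinfo("__main__.py").header_offset

print(starts_like_pdf, names, entry_offset, len(fake_pdf))
```

If the reasoning above holds, entry_offset should come out equal to len(fake_pdf): the entry was written at offset 0 of the ZIP, and the reader shifts it by the size of whatever was put in front.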
In short, you can make a file that is both a PDF and a ZIP file by simply concatenating the two. And since Python can run ZIP files, that means you can create a file that is a valid PDF and also runs as Python!
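As a rough sketch, the whole trick fits in a few lines of Python. The names make_polyglot and run_with_python are made up for this example, and placeholder_pdf is a stand-in rather than a real document. To get a file that also opens in a PDF viewer, pass in the bytes of an actual PDF instead.

```python
import io
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path


def make_polyglot(pdf_bytes: bytes, source: str) -> bytes:
    """Return pdf_bytes followed by a ZIP archive that holds __main__.py."""
    zip_buffer = io.BytesIO()
    with zipfile.ZipFile(zip_buffer, "w") as archive:
        archive.writestr("__main__.py", source)
    # Plain concatenation: PDF first, ZIP second.
    return pdf_bytes + zip_buffer.getvalue()


def run_with_python(path: Path) -> str:
    """Run the file with the current interpreter and return what it printed."""
    result = subprocess.run(
        [sys.executable, str(path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Placeholder only: NOT a valid PDF. Replace with the bytes of a real PDF file.
placeholder_pdf = b"%PDF-1.4\n% placeholder body, not a real document\n%%EOF\n"

with tempfile.TemporaryDirectory() as tmp:
    polyglot_path = Path(tmp) / "polyglot.pdf"
    polyglot_path.write_bytes(make_polyglot(placeholder_pdf, 'print("it works!")\n'))
    output = run_with_python(polyglot_path)

print(output, end="")
```

The .pdf extension does not get in the way: Python decides whether it was handed a ZIP archive by looking at the file's contents, not its name.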
A surprising thing that somewhat undermines the above argument about file structure is that (at least on my machine) it still works if you concatenate them in the “wrong” order, with the ZIP first and then the PDF! I guess the libraries for reading PDFs and ZIP files are really robust to weird or corrupt file structures. I wonder what possibilities this opens up. Could we make a single file that is a PNG, a PDF, and a ZIP file all at once? That is left for the reader to explore…