Trust Issues Relative to Open Source
Two separate discussions have very recently opened my eyes to an issue I had not considered: how to confirm that the Open Source binary one uses was actually built from the published source code.
Zooko Wilcox-O'Hearn, founder and CEO of LeastAuthority.com, wrote an open letter on the subject to Phil Zimmermann and Jon Callas, two of the principals behind Silent Circle, which ran Silent Mail. There is a large discussion thread on cryptography-randombit based on the letter. Additionally, a Dr. Dobbs article published today entitled Putting Absolutely Everything in Version Control touched on it as well.
The issue of concern for this question is the ability to recompile Open Source code and get the same result as the published binary. In other words, if you rebuild the binary from the source code and hash it, the hash is unlikely to match that of the published binary, due to differences in tool chains and some randomization in the compilers themselves.
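To make the check concrete, here is a minimal sketch in Python of what "rebuild and compare" amounts to. The function names (`sha256_of_file`, `builds_match`) and the byte strings standing in for binaries are hypothetical and made up for illustration; a real comparison would point at the published download and at the output of your own build.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large binaries do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(published_path, rebuilt_path):
    """True only if both files are byte-for-byte identical (same digest)."""
    return sha256_of_file(published_path) == sha256_of_file(rebuilt_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        published = Path(tmp) / "published.bin"
        rebuilt = Path(tmp) / "rebuilt.bin"
        # Hypothetical stand-ins for two builds of the same source: the
        # "code" is identical, but each build embeds its own timestamp.
        published.write_bytes(b"SAME-CODE" + b"|built=day-1")
        rebuilt.write_bytes(b"SAME-CODE" + b"|built=day-2")
        print(builds_match(published, rebuilt))
```

Even though the "code" portion is identical in this toy case, a single embedded difference such as a build timestamp changes the digest, so the comparison fails. That is why a hash mismatch alone cannot distinguish a tampered binary from an honest build made with a slightly different environment.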
The Dr. Dobbs article suggests putting even the tool chain under version control for reasons of reproducibility. Jon Callas points out that in many cases it may be impossible to redistribute the tool chain for various reasons including licensing restrictions. Unless you compile the code yourself, you are adding a trust step to your assumption set as the binary cannot be recreated by others with the same results.
I now understand that this is an accepted risk, and understandably so. Are there other discussions or efforts aimed at making source code byte-for-byte reproducible when compiled, thus eliminating the need to trust the provider of even Open Source binaries? As referenced in Jon Callas’ discussion, Ken Thompson showed “You can't trust code that you did not totally create yourself.” What are the security implications of this issue?