bug-guix
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#22533: Python bytecode reproducibility


From: Gábor Boskovits
Subject: bug#22533: Python bytecode reproducibility
Date: Sun, 4 Mar 2018 16:30:59 +0100

2018-03-04 13:46 GMT+01:00 Ricardo Wurmus <address@hidden>:

Hi Gábor,

> Nix had this issue, it seems they have a python 3.5 solution, which
> should be easy to adopt: https://github.com/NixOS/nixpkgs/issues/22570.
> WDYT?

Here’s the patch for Nix:

  https://patch-diff.githubusercontent.com/raw/NixOS/nixpkgs/pull/22585.diff

Here are the relevant changes to the Python packages:

* Python 3.4

  substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
  substituteInPlace "Lib/importlib/_bootstrap.py" --replace "source_mtime = int(source_stats['mtime'])" "source_mtime = 1"

* Python 3.5

  substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
  substituteInPlace "Lib/importlib/_bootstrap_external.py" --replace "source_mtime = int(st['mtime'])" "source_mtime = 1"

* Python 3.6
  substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
  substituteInPlace "Lib/importlib/_bootstrap_external.py" --replace "source_mtime = int(st['mtime'])" "source_mtime = 1"



Nice, thanks for the summary.
Can we adopt this as is?
Do we need the 3.4 and 3.5 fix or the 3.6 one is enough? 
 
For all packages they set these environment variables:

  - set PYTHONHASHSEED=0 (for hashes of str, bytes and datetime objects)

  - set DETERMINISTIC_BUILD; for conditional patching of the timestamp
    for package builds.  The timestamp is not patched in ad-hoc
    environments, because that would mess with Python’s ability to
    determine whether to compile source files.


Should we set these in python-build-system? What about python booststrap?
I guess we use gnu-build-system there, so bootstrap packages might need to
set these explicitly?
 
They also rebuild all bytecode (with the exception of lib2to3 because it
is Python 2 code) three times, once for each optimization level.

--8<---------------cut here---------------start------------->8---
+    # Determinism: rebuild all bytecode
+    # We exclude lib2to3 because that's Python 2 code which fails
+    # We rebuild three times, once for each optimization level
+    find $out -name "*.py" | $out/bin/python -m compileall -q -f -x "lib2to3" -i -
+    find $out -name "*.py" | $out/bin/python -O -m compileall -q -f -x "lib2to3" -i -
+    find $out -name "*.py" | $out/bin/python -OO -m compileall -q -f -x "lib2to3" -i -
--8<---------------cut here---------------end--------------->8---


Do we also have to do this, or should we settle with one optimization level? Which one?
 
--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net




reply via email to

[Prev in Thread] Current Thread [Next in Thread]