bug#30820: Chunked store references in compiled code break grafting (again)
From: Mark H Weaver
Subject: bug#30820: Chunked store references in compiled code break grafting (again)
Date: Wed, 21 Mar 2018 00:17:53 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux)
Hi Ludovic,
address@hidden (Ludovic Courtès) writes:
> Mark H Weaver <address@hidden> skribis:
>
>> We would also need to find a solution to the problem described in the
>> thread "broken references in jar manifests" on guix-devel started by
>> Ricardo, which still has not found a satisfactory solution.
>>
>> https://lists.gnu.org/archive/html/guix-devel/2018-03/msg00006.html
Okay, do you have a proposed fix for the issue of jar manifests?
There's a specification for that file format which mandates that "No
line may be longer than 72 bytes (not characters), in its UTF8-encoded
form. If a value would make the initial line longer than this, it
should be continued on extra lines (each starting with a single SPACE)."
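
To make that constraint concrete, here is a minimal Python sketch of the
wrapping rule quoted above. It is an illustration only, not code from Guix
or from any jar tool: the function name, the hash value, and the Class-Path
entries are all hypothetical, and the input is assumed to be ASCII so that
splitting on byte offsets is safe.

```python
# Hypothetical 32-character store hash, used only for illustration.
HASH = "0123456789abcdfghijklmnpqrsvwxyz"


def wrap_manifest_line(line, limit=72):
    """Split one manifest header into lines of at most `limit` bytes.

    Continuation lines start with a single space, as the quoted
    specification requires.  ASCII input is assumed.
    """
    data = line.encode("ascii")
    chunks = [data[:limit]]
    rest = data[limit:]
    while rest:
        # One byte of each continuation line is taken by the leading space.
        chunks.append(b" " + rest[:limit - 1])
        rest = rest[limit - 1:]
    return [chunk.decode("ascii") for chunk in chunks]


# Hypothetical header whose store reference straddles the 72-byte limit.
header = ("Class-Path: lib/first.jar lib/second.jar lib/third.jar "
          "/gnu/store/" + HASH + "-example-1.0/share/java/example.jar")
lines = wrap_manifest_line(header)
for wrapped in lines:
    print(wrapped)
```

In this example the line break falls inside the hash, so no single line of
the manifest contains the full store reference, and a scanner that looks for
whole hashes would not find it.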
>> My opinion is that I consider Guix's current expectations for how
>> software must store its data on disk to be far too onerous, in cases
>> where that data might include a store reference. I don't see sufficient
>> justification for imposing such an onerous requirement on the software
>> in Guix.
>
> In practice Guix and Nix have been living fine under these constraints,
> and with almost no modifications to upstream software, so it’s not that
> bad. Nix doesn’t have grafts though, which is why this problem was less
> visible there.
>
>> Ultimately, I would prefer to see the scanning and grafting operations
>> completely generalized, so that in general each package can specify how
>> to scan and graft that particular package, making use of libraries in
>> (guix build ...) to cover the usual cases. In most cases, that code
>> would be within build-systems.
>
> That would be precise GC instead of conservative GC in a way, right?
> So in essence we’d have, say, a scanner for ELF files (like ‘dh_shdep’
> in Debian or whatever it’s called), a scanner for jars, and so on?
No, I wasn't thinking along those lines. While I'd very much prefer
precise GC, it seems wholly infeasible for us to write precise scanners
and grafters for every file format of every package in Guix.

My thought was that supporting scanning and grafting of 8-byte-or-longer
substrings of hashes would cover both GCC's inlined strings and jar
manifests, the two issues that we currently know about, and that it
would be nice if we could add further methods in the future. For
example, some software might store its data in UTF-16, or compressed.
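
As a rough illustration of the substring idea, the following Python sketch
searches a byte string for any chunk of a known hash that is at least 8
bytes long, and rewrites each chunk with the bytes at the same offsets of a
replacement hash. It is only a sketch under stated assumptions, not the Guix
grafting code: the function names and both hash values are hypothetical, and
it assumes the hash contains no repeated 8-byte runs that could make a match
ambiguous.

```python
MIN_CHUNK = 8  # minimum chunk length, taken from the proposal above

# Hypothetical old and new hashes of equal length, for illustration only.
OLD = b"0123456789abcdfghijklmnpqrsvwxyz"
NEW = OLD[::-1]


def find_hash_chunks(data, old_hash, min_len=MIN_CHUNK):
    """Return sorted (data_offset, hash_offset, length) triples, one per
    maximal run of `data` that matches a piece of `old_hash`."""
    found = set()
    for h in range(len(old_hash) - min_len + 1):
        window = old_hash[h:h + min_len]
        pos = data.find(window)
        while pos != -1:
            # Grow the match to the left as far as data and hash agree.
            d, s = pos, h
            while d > 0 and s > 0 and data[d - 1] == old_hash[s - 1]:
                d -= 1
                s -= 1
            # Grow the match to the right in the same way.
            end_d, end_s = pos + min_len, h + min_len
            while (end_d < len(data) and end_s < len(old_hash)
                   and data[end_d] == old_hash[end_s]):
                end_d += 1
                end_s += 1
            found.add((d, s, end_d - d))
            pos = data.find(window, pos + 1)
    return sorted(found)


def graft_chunks(data, old_hash, new_hash, min_len=MIN_CHUNK):
    """Replace every found chunk of old_hash with the bytes of new_hash
    at the same hash offsets, leaving the data length unchanged."""
    if len(old_hash) != len(new_hash):
        raise ValueError("hashes must have the same length")
    out = bytearray(data)
    for d, s, n in find_hash_chunks(data, old_hash, min_len):
        out[d:d + n] = new_hash[s:s + n]
    return bytes(out)


# A reference split into two 16-byte chunks by a manifest-style line break.
demo = (b"Class-Path: /gnu/store/" + OLD[:16] + b"\n " + OLD[16:]
        + b"-example-1.0\n")
grafted = graft_chunks(demo, OLD, NEW)
```

Note that a chunk shorter than the minimum length goes unnoticed in this
sketch, so a split that leaves fewer than 8 bytes of the hash on one side
would be rewritten only partially.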
> Still, how would we deal with strings embedded in the middle of
> binaries, as in this case? It seems to remain an open issue, no?
I believe that I addressed that case in my original proposal, no?
> I’m interested in experiments in that direction. I think that’s a
> longer-term goal, though, and there are open questions: we have no idea
> how well that would work in practice.
Thanks for discussing it. I'm willing to drop it and go with your
decision for now, but the "jar manifest" issue still needs a solution.

Regards,
Mark