[Users] ExternalLibraries without tarballs

Roland Haas roland.haas at physics.gatech.edu
Sun Aug 11 20:09:35 CDT 2013


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello all,

> Taking a set of source trees which are very similar and
> compressing each of them into a tarball makes it impossible to
> efficiently delta-compress them.
This is not quite true. gzip (at least version 1.6 on Debian) has an
option --rsyncable which claims to help in these situations, though most
likely not as much as actually checking the source code into the
repository.

> This is a valid concern.  Erik, with your testing of boost, could
> you measure the number of files in the working tree, and the number
> of files in the built tree, for just the boost library?  I imagine
> that the number of files in the built tree is a factor of a few
> larger than the number in the source tree.  However, as Frank said,
> if the user is not building this library, then they do have a much
> larger inode cost if we store the source tree.  We discussed
> skipping syncing of certain external libraries before; maybe that
> is a better solution here.
I would like to also lend my voice to this argument. Having the
tarballs extracted in the checkout creates very many small files, which
can be terribly slow to synchronize via rsync: much slower than
transferring a single medium-sized file (~30MB is the largest tarball,
excluding boost).
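
To put numbers on the file-count concern, something like the following
could be run on an extracted source tree and on a built tree. It is only
a minimal sketch: the function name tree_stats is made up for this
example and is not part of any ET tool.

```python
import os


def tree_stats(root):
    """Return (file_count, total_bytes) for all regular files under root."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so that linked files are not counted twice.
            if os.path.islink(path):
                continue
            file_count += 1
            total_bytes += os.path.getsize(path)
    return file_count, total_bytes
```

Calling tree_stats() on the unpacked source directory and again on the
build directory gives the two file counts to compare against the single
tarball.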

> Note that I am already opposed to using a large library such as
> Boost in the ET.  It requires a GB of space to uncompress, and more
> to build, for little benefit that I can see (though I have not
> looked). If Boost is licensed appropriately and written in a
> modular-enough fashion, maybe it is possible to extract just the
> bits that people find useful? I'm sure we are pulling in a huge
> amount of code that we don't need by including Boost.
I think the Boost ExternalLibrary in its current repository is not a
good example. We would not handle it this way. Git is not a good VCS
for ExternalLibraries since it keeps the whole history around, which we
don't need for the external libraries. Boost *can* be split, e.g. Debian
delivers it as ~13 packages. I am not sure though whether it is
worthwhile for us to split it up like this. I'd much rather have the
checkout commented out in the thornlist (same as OpenCL), and then if
someone needs it they can enable it. If a thorn then relies on Boost,
the thorn can include the corresponding source files if that is
sufficient (and once sufficiently many thorns do so we might as well
just include Boost itself...).

> A developer would not have to work with patches.  You would commit 
> your changes on the ET branch of the library, and use the version 
> control tool to see differences, merge in the new version, etc.
> This is much better than messing about with patches, which are just
> a hack used in the absence of proper version control.
I would expect we can already do so with the current setup, e.g. when
the maintainer of the ExternalLibrary keeps a (local) git repository
(e.g. via git-svn) and rebases the ET-specific patches after each
update. git format-patch then produces the patches that need applying.
This is more or less how I handle e.g. my own Carpet branch, where I
have local modifications that I like but which were rejected upstream
so I cannot push them.
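
The rebase-and-export step described above could be scripted roughly as
follows. This is a sketch under assumptions: the branch names "upstream"
and "et-patches" and the output directory "patches" are hypothetical
placeholders, and the upstream branch is assumed to have been updated
already (e.g. via git-svn).

```python
import subprocess


def patch_refresh_commands(upstream="upstream", patches="et-patches",
                           outdir="patches"):
    """Build the git commands that refresh the ET patch set.

    All three default names are hypothetical placeholders.
    """
    return [
        # Replay the ET-specific commits on top of the updated upstream.
        ["git", "rebase", upstream, patches],
        # Export one patch file per ET-specific commit into outdir.
        ["git", "format-patch", "-o", outdir, f"{upstream}..{patches}"],
    ]


def refresh_patches(repo_dir, **names):
    """Run the commands inside repo_dir, stopping at the first failure."""
    for cmd in patch_refresh_commands(**names):
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Keeping the command list separate from the code that runs it makes it
possible to inspect what would be executed before touching the
repository.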

> Most of the active ET developers prefer to work with git anyway,
> and since git stores all the history locally, the problem is worse
> for us.  Note that the boost external library thorn is in git,
> which is why Erik noticed this problem in the first place.
I believe git is not a good choice for ExternalLibraries. In my
understanding we do not want to modify the ExternalLibraries' source
code at all if possible, since otherwise we run the risk of not being
able to function with a plain vanilla copy already installed on the
system. If we want to provide extra functionality beyond what the
library offers, I'd rather use a second thorn to do so. From my point
of view ExternalLibraries are just packaged-up external components to
be used with the ET, not something that the ET develops. E.g. if I
were actually developing for LORENE, I would not use the ET tarball but
download it on my own.

Yours,
Roland

- -- 
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://keys.gnupg.net.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
Comment: Using GnuPG with Icedove - http://www.enigmail.net/

iEYEARECAAYFAlIINc8ACgkQTiFSTN7SboW8iwCgzhWFMeBJtnLL6oxa5zqJtGKg
ZJ0AniFeXpr5kURe/nYB+89C7i5u+pqj
=OrbF
-----END PGP SIGNATURE-----

