[ET Trac] [Einstein Toolkit] #349: pyc files when syncing
Einstein Toolkit
trac-noreply at einsteintoolkit.org
Sun Jun 19 06:13:32 CDT 2011
#349: pyc files when syncing
----------------------------+-----------------------------------------------
Reporter: barry.wardell | Owner: mthomas
Type: defect | Status: new
Priority: major | Milestone:
Component: SimFactory | Version:
Resolution: | Keywords:
----------------------------+-----------------------------------------------
Comment (by barry.wardell):
Replying to [comment:12 eschnett]:
> You are introducing a new "paths" syntax to the sync command. I thought
previously that one could just list multiple machines, and simfactory
would sync to all of them -- apparently that isn't the case, that was lost
in translation.
Yes, sorry for the confusion. This patch introduces three changes
(switching to the filter rules system, changing the behavior of sim sync
with multiple arguments, and removing the --sync-parfiles and
--sync-sourcetree options), which should ideally be split into separate
issues for consideration. The reason I didn't do so is that the three
changes naturally came at the same time in terms of the changes to the
code. The last two can be restored to their original behavior if desired.
However, I actually prefer the new behavior because:
* I am much more likely to want to sync specific paths than to sync to
multiple machines at once.
* The paths system provides more flexibility and control than the
--sync-parfiles and --sync-sourcetree options did and makes them somewhat
unnecessary. This flexibility is particularly useful on machines with
slower filesystems, where syncing only a specific path can save a lot of
time.
What are other people's opinions on this?
> However, it seems that filter.cactus.rules contains only a list of
top-level paths, and isn't supposed to contain any actual rules -- if it did
contain rules, then the result would be confusing, because "sim sync" and
"sim sync paths" would copy and/or delete different sets of files. Also,
people may want to change this default list of paths, so there should
probably also be a filter.cactus.local.rules... Should this list of paths
instead be stored in an ini file, where there is already a mechanism to
configure settings, and where simfactory could check that these are
actually only path names and not accidentally patterns?
The idea is that if specific paths are not given, then the file
filter.cactus.rules is read in and gives a default list of paths to be
included. I agree that this should not contain any actual rules for the
reason you give and have added a comment to this effect to the top of
filter.cactus.rules. Any filter rules to be applied to all paths should be
put in filter.rules.
If the user wants to modify this, they can add a .rsync.rules file in
their Cactus base directory which is read in first and so will override
anything in filter.cactus.rules. I don't really like storing these in ini
files because that would be moving away from using rsync's filter rules
system (unless simfactory parsed the ini file and generated the
appropriate .rsync.rules file).
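
To illustrate the ordering described above, here is a minimal sketch of how
the filter files could be handed to rsync. It is not the code from the
patch: the function name build_filter_args, the simfactory_etc directory
and the position of filter.rules at the end are assumptions made for
illustration. It relies only on rsync's "--filter=merge FILE" option and on
the fact that rsync acts on the first rule that matches, so a file merged
earlier takes precedence over one merged later.

```python
import os


def build_filter_args(cactus_dir, simfactory_etc):
    """Return rsync --filter arguments, earliest (highest priority) first.

    Hypothetical helper: names, locations and the placement of
    filter.rules are assumptions, not taken from the patch.
    """
    args = []

    # User overrides are merged first so that they win over the defaults.
    user_rules = os.path.join(cactus_dir, ".rsync.rules")
    if os.path.isfile(user_rules):
        args.append("--filter=merge " + user_rules)

    # Default list of paths to include when no paths are given.
    args.append("--filter=merge "
                + os.path.join(simfactory_etc, "filter.cactus.rules"))

    # Rules meant to apply to all paths (position here is an assumption).
    args.append("--filter=merge "
                + os.path.join(simfactory_etc, "filter.rules"))
    return args
```

The returned list could be appended to an rsync command line built as an
argument list, for example for subprocess.call.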
> Shouldn't "_darcs" always be excluded, similar to CVS .svn .git .hg etc?
Yes, "_darcs" is also excluded by a rule in the filter.rules file. I have
also added .hg in the latest patch.
> What does "C" do? Does it read a .cvsignore file? If so, shouldn't this
file be transferred as well, and be documented for simfactory? This would
be one more configuration file for people to understand; can we ignore
this file instead? cvs is not important any more these days.
"C" is designed to exclude many common paths which you often don't want to
transfer. These are:
"RCS SCCS CVS CVS.adm RCSLOG cvslog.* tags TAGS .make.state
.nse_depinfo *~ #* .#* ,* _$* *$ *.old *.bak *.BAK *.orig *.rej .del-* *.a
*.olb *.o *.obj *.so *.exe *.Z *.elc *.ln core .svn/ .git/ .bzr/"
It also appends any patterns listed in $HOME/.cvsignore and in
directory-local .cvsignore files. I don't think we should worry too much
about .cvsignore files though - I doubt many people have them any more.
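
As a rough way to check which file names the default list above would
catch, the patterns can be tried with Python's fnmatch module. This is only
an approximation of rsync's own matching (it looks at a single file name
and ignores .cvsignore files), and the helper name excluded_by_default is
made up for this sketch.

```python
from fnmatch import fnmatchcase

# Default patterns quoted above, split on whitespace.
CVS_DEFAULT_PATTERNS = (
    "RCS SCCS CVS CVS.adm RCSLOG cvslog.* tags TAGS .make.state "
    ".nse_depinfo *~ #* .#* ,* _$* *$ *.old *.bak *.BAK *.orig *.rej "
    ".del-* *.a *.olb *.o *.obj *.so *.exe *.Z *.elc *.ln core "
    ".svn/ .git/ .bzr/"
).split()


def excluded_by_default(name, is_dir=False):
    """Approximate check of one file name against the default patterns."""
    for pattern in CVS_DEFAULT_PATTERNS:
        if pattern.endswith("/"):
            # A trailing slash means the pattern only applies to directories.
            if is_dir and fnmatchcase(name, pattern[:-1]):
                return True
        elif fnmatchcase(name, pattern):
            return True
    return False
```

Under this approximation an object file such as foo.o is matched, while a
name like module.pyc is not matched by any of the listed patterns.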
> How do you expect people to use the "paths" mechanism? Can one give just
top-level paths, or also directly paths deep into the hierarchy? Would you
expect to do this regularly? If so, why? I find this somewhat dangerous,
because people may miss transferring an updated file. Instead of telling
simfactory what to do, the user currently tells simfactory his/her intent,
e.g. "copy source files" or "copy parameter files", which are
prerequisites to either building or submitting. Simfactory then deals with
the details, ensuring things are done in a safe way. Would you find it
inconvenient if you had to use an option to specify a pathname, e.g. "sim
sync damiana -p par"?
The idea is that there are three modes of operation:
* Without any paths specified, we sync all paths given in
filter.cactus.rules (and also include any modifications in the file
.rsync.rules). This is essentially the same as what happened before.
* With a list of paths given, only those paths are synchronized. Both
filter.cactus.rules and $CACTUSDIR/.rsync.rules are ignored (but any
.rsync.rules files in the specified paths are read). For consistency, only
top-level paths are accepted in this mode.
* With a single path given, only this path is synchronized. Both
filter.cactus.rules and $CACTUSDIR/.rsync.rules are ignored (but any
.rsync.rules files in the specified path are read). In this case, paths
below the top level are allowed and handled appropriately.
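
The three modes above can be summarized in a short sketch. This is not the
code from the patch: the function name sync_mode, the mode labels and the
choice to raise ValueError are assumptions made for illustration.

```python
def sync_mode(paths):
    """Classify a 'sim sync' call by the paths given (hypothetical helper).

    Returns "default", "single" or "multiple"; raises ValueError when
    several paths are given and one of them is not a top-level path.
    """
    if not paths:
        # No paths: fall back to the default list in filter.cactus.rules.
        return "default"

    if len(paths) == 1:
        # A single path may point anywhere in the hierarchy.
        return "single"

    for path in paths:
        # With several paths, only top-level entries are accepted.
        if "/" in path.strip("/"):
            raise ValueError("only top-level paths are accepted when "
                             "several paths are given: %r" % path)
    return "multiple"
```

For example, sync_mode([]) gives "default", sync_mode(["par"]) gives
"single", and sync_mode(["par", "repos"]) gives "multiple".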
The main case where I would expect to use this regularly is when syncing
to a machine with a slow filesystem (e.g. Kraken) where simply checking
which files need to be synced can sometimes take a long time. In fact,
before now I often used rsync manually instead of 'sim sync' when I was
syncing small changes often (e.g. when debugging a problem, setting up a
new parameter file, etc.). I quite like how things work with this patch
applied. We could add the --sync-sourcetree and --sync-parfiles convenience
options back, although I'm not sure if I would personally use them.
What is the advantage of using an option to specify a pathname?
--
Ticket URL: <https://trac.einsteintoolkit.org/ticket/349#comment:13>
Einstein Toolkit <http://einsteintoolkit.org>