[Users] Abort on unexpected process or thread configuration

Ian Hinder ian.hinder at aei.mpg.de
Thu Jun 11 03:58:39 CDT 2015


Hi all,

Carpet supports environment variables

	CACTUS_NUM_PROCS
	CACTUS_NUM_THREADS

which, if set, give the expected number of MPI processes and the expected number of threads per process.  Carpet uses them to check that the actual number of processes started and the actual number of threads are correct, since it is very easy to get these wrong due to interactions with scheduling systems, different MPI implementations, different command-line options, etc.  If the variables are unset, Carpet does nothing.  If they are set and there is a mismatch, Carpet currently only prints a warning.  I think, and Erik agrees, that this should instead be a fatal error which aborts the simulation on startup.
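
As a rough illustration of the proposed behaviour (this is not Carpet's actual code, which is written in C++), the check could be sketched in Python as below.  The names check_layout and ConfigMismatch are hypothetical, and the actual process and thread counts are passed in as plain integers rather than queried from MPI or OpenMP.

```python
import os


class ConfigMismatch(RuntimeError):
    """Raised when the actual layout differs from the expected one."""


def check_layout(actual_procs, actual_threads, env=None):
    """Abort (by raising) if a set variable disagrees with the actual value.

    Hypothetical sketch: unset variables are skipped, matching the
    "if unset, do nothing" behaviour described above.
    """
    env = os.environ if env is None else env
    actual_values = {
        "CACTUS_NUM_PROCS": actual_procs,
        "CACTUS_NUM_THREADS": actual_threads,
    }
    errors = []
    for name, actual in actual_values.items():
        value = env.get(name)
        if value is None:
            continue  # variable unset: nothing to check
        if int(value) != actual:
            errors.append(f"{name}={value} but actual value is {actual}")
    if errors:
        # Fatal error instead of a warning: stop at startup.
        raise ConfigMismatch("; ".join(errors))
```

Calling check_layout(4, 2) with CACTUS_NUM_PROCS=4 and CACTUS_NUM_THREADS=2 in the environment returns silently, while a mismatch in either variable raises ConfigMismatch before any work is done.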

If you use simfactory, it often (depending on the machine run script) sets these variables.  If the run script is getting the configuration wrong for a particular machine, the number of threads and processes actually used could differ from what was requested, which usually leads to a severe performance penalty that can go unnoticed unless you check carefully.

I will make this change in the development version of Carpet, so simulations may now abort on startup if the configuration is wrong.  In that case, the configuration, usually in the machine run script, should be fixed.  Hopefully this change will make things more robust and easier to debug.

-- 
Ian Hinder
http://members.aei.mpg.de/ianhin
