[Users] meeting minutes for 2017-02-27

Roland Haas rhaas at illinois.edu
Mon Feb 27 15:31:54 CST 2017


Hello all,

> comet file corruption:
> * CarpetIOASCII output files can contain corrupted lines of output in
>   the middle of a component
> * happens only on the scratch file system, not on $HOME, and only with
>   more than one MPI rank
> * happens for both the current release version of ET (ET_PayonGaposhkin
>   2016_11) as well as old (years old) versions of Carpet
> * workaround is to force flushing after each line by using std::endl
>   instead of "\n". This may be slow since flushing can be a very slow
>   operation (milliseconds if it flushes all the way to the physical
>   disk)
> * another possible workaround is to write output into a string stream
>   first, then dump its rdbuf() to file
> * will engage SDSC support to see if they have suggestions on how to
>   avoid this issue without having to first write to a local file, then
>   copy to scratch once the run finishes
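
The string stream workaround from the minutes could look roughly like the
sketch below. The name write_component and its arguments are made up for
illustration and are not Carpet code. It uses std::stringstream rather than
std::ostringstream, since an output-only string buffer cannot be read back
through rdbuf(). Whether this avoids the corruption on scratch is untested,
and the standard library may still split large buffers into several write()
calls.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not Carpet code): format all rows of one component
// in memory, then append the whole buffer to the file in one stream insert.
// Returns false if the file could not be opened or the write failed.
bool write_component(const std::string& path,
                     const std::vector<std::vector<double>>& rows) {
  if (rows.empty()) return true;  // nothing to write

  std::stringstream buf;
  for (const auto& row : rows) {
    for (std::size_t i = 0; i < row.size(); ++i) {
      if (i > 0) buf << ' ';
      buf << row[i];
    }
    buf << '\n';  // plain newline: does not force a flush, unlike std::endl
  }

  std::ofstream out(path, std::ios::app);
  if (!out) return false;
  out << buf.rdbuf();  // copy the buffered text into the file stream
  out.flush();         // one explicit flush per component, not per line
  return static_cast<bool>(out);
}
```

The trade-off compared to std::endl is one flush per component instead of
one per line, at the cost of holding the formatted text in memory.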

This seems to be a file system issue. I modified the RunScript to wrap
the call in strace (the 14 in the attached RunScript is the file
descriptor that ended up being used for ASCII output) and used that
information and gawk:

gawk -vFS='"' '/write.*\/grid-coordinates.xy.asc/{print "printf \""$2"\""}' ~/strace.18826.log >recreate.sh

to produce a shell script that, when run, replays all the write()
calls that the MPI code made. Running the script produces a file that
differs from what Cactus produced when run under MPI and that
agrees with the data in the testsuite.

So: Cactus calls the OS write() function with the correct data but
somehow the OS writes incorrect data to disk.
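
For anyone who wants to repeat the comparison without going through printf,
the replay step can also be sketched in C++. This is not what I ran (that
was the gawk one-liner above); extract_payload and decode_escapes are
hypothetical names, and the sketch assumes strace's default C-style string
escapes (\n, \t, \", \\, octal) with a large enough -s so that strings are
not truncated.

```cpp
#include <cstddef>
#include <string>

// Return the text between the first '"' and its matching unescaped '"'
// in one strace line such as:  write(14, "1 2\t3\n", 6) = 6
// Returns an empty string if no complete quoted payload is found.
std::string extract_payload(const std::string& line) {
  const std::size_t start = line.find('"');
  if (start == std::string::npos) return "";
  for (std::size_t i = start + 1; i < line.size(); ++i) {
    if (line[i] == '\\') { ++i; continue; }  // skip the escaped character
    if (line[i] == '"') return line.substr(start + 1, i - start - 1);
  }
  return "";
}

// Decode C-style escapes back into raw bytes: \n \t \r \v \f, octal
// escapes with one to three digits, and \\ or \" (backslash dropped).
std::string decode_escapes(const std::string& s) {
  std::string out;
  for (std::size_t i = 0; i < s.size(); ++i) {
    if (s[i] != '\\' || i + 1 == s.size()) { out += s[i]; continue; }
    const char c = s[++i];
    switch (c) {
      case 'n': out += '\n'; break;
      case 't': out += '\t'; break;
      case 'r': out += '\r'; break;
      case 'v': out += '\v'; break;
      case 'f': out += '\f'; break;
      default:
        if (c >= '0' && c <= '7') {
          int value = c - '0';
          int digits = 1;
          while (digits < 3 && i + 1 < s.size() &&
                 s[i + 1] >= '0' && s[i + 1] <= '7') {
            value = value * 8 + (s[++i] - '0');
            ++digits;
          }
          out += static_cast<char>(value);
        } else {
          out += c;  // \\ and \" land here, as do unknown escapes
        }
    }
  }
  return out;
}
```

Appending decode_escapes(extract_payload(line)) for every matching write()
line, in order, should reproduce the bytes passed to write(). Unlike
splitting on '"', this keeps payloads that contain an escaped quote intact,
and it does not treat a % in the data as a printf directive.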

I will file a ticket with SDSC support.

Yours,
Roland

-- 
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://keys.gnupg.net.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RunScript
Type: application/octet-stream
Size: 842 bytes
Desc: not available
Url : http://lists.einsteintoolkit.org/pipermail/users/attachments/20170227/303ebe6f/attachment.obj 

