[FLASH-USERS] Big Problem??
Tomasz Plewa
tplewa at fsu.edu
Wed Jul 30 06:57:59 EDT 2008
Seyit -
The "problem" is already present in a serial model.
Also, I noticed that you are using very aggressive compiler
optimizations. This frequently affects floating point accuracy. Can you
produce a single-processor, single-patch model with no optimizations?
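For the no-optimization build it should be enough to strip the aggressive
flags from FFLAGS_OPT in your Makefile.h and rebuild from scratch, e.g.
something along these lines on the ifort machines (only a suggestion; keep
your include path as it is):

  FFLAGS_OPT = -c -r8 -i4 -O0 -I $(MPI_PATH)/include

then "make clean; make", and run plain ./flash2 on one processor.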
Tomek
--
Seyit Hocuk wrote:
> BTW... The results also differ when I am using multiple CPUs instead
> of one.
>
> mpiexec -np 4 ./flash2 (on pc 3 instead of running just ./flash2)
> Note: k,kJ,c0 etc. are the same
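> To quantify the drift, the checkpoint files of the two runs can be
> compared directly, e.g. with sfocu from the FLASH tools directory or
> with h5diff (the file names below are just placeholders for a matching
> pair of checkpoints):
>
> sfocu jeans_serial_hdf5_chk_0010 jeans_np4_hdf5_chk_0010
> h5diff jeans_serial_hdf5_chk_0010 jeans_np4_hdf5_chk_0010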
>
>
> flash: initializing for jeans problem.
>
> kx = 7.8539816E-20 ky = 1.5707963E-19 kz = 0.0000000E+00
>
> k = 1.756203682760181E-019
> kJ = 6.508776532382562E-019
> c0 = 181807.294175918 omega^2 = -1.298354260861123E-026
> velamp = 64881.6252564676
> perturbation is unstable with growth time 8776137041619.21
> min_blocks 1 max_blocks 2 tot_blocks 5
> min_blocks 5 max_blocks 6 tot_blocks 21
> min_blocks 21 max_blocks 22 tot_blocks 85
> min_blocks 85 max_blocks 86 tot_blocks 341
> INITIAL TIMESTEP = 100000.000000000
> [CHECKPOINT_WR] NOTE: will send 520 blocks per message.
> [CHECKPOINT_WR] Writing checkpoint file jeans_r13vB_hdf5_chk_0000
> Progress: |...
> *** Wrote output to jeans_r13vB_hdf5_chk_0000 ( 341 blocks ) ***
> number of variables found for plotfile storage = 3
> dens pres temp
> *** Wrote output to jeans_r13vB_hdf5_plt_cnt_0000 ***
> n t dt | dt_hydro
> 1 2.0000E+05 2.0000E+05 | 4.197E+11
> 2 6.0000E+05 4.0000E+05 | 4.197E+11
> 3 1.4000E+06 8.0000E+05 | 4.197E+11
> 4 3.0000E+06 1.6000E+06 | 4.197E+11
> 5 6.2000E+06 3.2000E+06 | 4.197E+11
> 6 1.2600E+07 6.4000E+06 | 4.197E+11
> 7 2.5400E+07 1.2800E+07 | 4.197E+11
> 8 5.1000E+07 2.5600E+07 | 4.197E+11
> 9 1.0220E+08 5.1200E+07 | 4.197E+11
> 10 2.0460E+08 1.0240E+08 | 4.198E+11
> 11 4.0940E+08 2.0480E+08 | 4.198E+11
> 12 8.1900E+08 4.0960E+08 | 4.198E+11
> 13 1.6382E+09 8.1920E+08 | 4.199E+11
> 14 3.2766E+09 1.6384E+09 | 4.201E+11
> 15 6.5534E+09 3.0000E+09 | 4.205E+11
> 16 1.2553E+10 3.0000E+09 | 4.213E+11
> 17 1.8553E+10 3.0000E+09 | 4.221E+11
> 18 2.4553E+10 3.0000E+09 | 4.229E+11
> 19 3.0553E+10 3.0000E+09 | 4.237E+11
> 20 3.6553E+10 3.0000E+09 | 4.246E+11
> 21 4.2553E+10 3.0000E+09 | 4.254E+11
> 22 4.8553E+10 3.0000E+09 | 4.263E+11
> 23 5.4553E+10 3.0000E+09 | 4.272E+11
> 24 6.0553E+10 3.0000E+09 | 4.281E+11
> 25 6.6553E+10 3.0000E+09 | 4.290E+11
> 26 7.2553E+10 3.0000E+09 | 4.300E+11
> 27 7.8553E+10 3.0000E+09 | 4.309E+11
> 28 8.4553E+10 3.0000E+09 | 4.319E+11
> 29 9.0553E+10 3.0000E+09 | 4.328E+11
> 30 9.6553E+10 3.0000E+09 | 4.338E+11
> 31 1.0255E+11 3.0000E+09 | 4.348E+11
> 32 1.0855E+11 3.0000E+09 | 4.358E+11
> 33 1.1455E+11 3.0000E+09 | 4.369E+11
> 34 1.2055E+11 3.0000E+09 | 4.380E+11
> 35 1.2655E+11 3.0000E+09 | 4.393E+11
> 36 1.3255E+11 3.0000E+09 | 4.405E+11
> 37 1.3855E+11 3.0000E+09 | 4.416E+11
> 38 1.4455E+11 3.0000E+09 | 4.429E+11
> 39 1.5055E+11 3.0000E+09 | 4.441E+11
> 40 1.5655E+11 3.0000E+09 | 4.454E+11
> 41 1.6255E+11 3.0000E+09 | 4.465E+11
> 42 1.6855E+11 3.0000E+09 | 4.478E+11
> 43 1.7455E+11 3.0000E+09 | 4.491E+11
> 44 1.8055E+11 3.0000E+09 | 4.503E+11
> 45 1.8655E+11 3.0000E+09 | 4.516E+11
> 46 1.9255E+11 3.0000E+09 | 4.529E+11
> 47 1.9855E+11 3.0000E+09 | 4.541E+11
> 48 2.0455E+11 3.0000E+09 | 4.554E+11
> 49 2.1055E+11 3.0000E+09 | 4.567E+11
> 50 2.1655E+11 3.0000E+09 | 4.576E+11
>
>
>
>
> Seyit Hocuk wrote:
>> Hi Tomek good morning,
>>
>> Let me answer your first question.
>>
>> * /Any luck with single block/single level runs?/
>>
>> Nope. Even single-level (lref_min = lref_max) runs differ in
>> dt_hydro. That means refinement is not used at all. I guess it is
>> also affecting the end results; I have to do a complete simulation
>> again to be sure. But the initial files show small differences already.
>>
>> Here are the initial results.
>> Note 1: note the differences in precision of k, kJ, c0, etc.
>> Note 2: pc 1 and pc 3 are not that different at first, but grow more
>> different later on.
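>> As I understand it, dt_hydro is essentially the global minimum of a
>> CFL estimate taken over every zone, so any per-zone rounding difference
>> in dens/pres feeds straight into it. A minimal sketch of that kind of
>> computation (not the actual FLASH routine; the names are illustrative):
>>
>> subroutine compute_dt_hydro(dens, pres, velx, dx, n, gamma, cfl, dt_hydro)
>>   implicit none
>>   include 'mpif.h'
>>   integer, intent(in)  :: n
>>   real*8,  intent(in)  :: dens(n), pres(n), velx(n), dx, gamma, cfl
>>   real*8,  intent(out) :: dt_hydro
>>   real*8  :: dt_local, cs
>>   integer :: i, ierr
>>   dt_local = huge(1.d0)
>>   do i = 1, n
>>      cs = sqrt(gamma*pres(i)/dens(i))                 ! local sound speed
>>      dt_local = min(dt_local, cfl*dx/(abs(velx(i)) + cs))
>>   end do
>>   ! each processor contributes its own minimum; the global MIN is what
>>   ! gets printed as dt_hydro
>>   call MPI_ALLREDUCE(dt_local, dt_hydro, 1, MPI_DOUBLE_PRECISION, &
>>                      MPI_MIN, MPI_COMM_WORLD, ierr)
>> end subroutine compute_dt_hydro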
>>
>>
>> *pc 1*
>>
>> flash: initializing for jeans problem.
>>
>> kx = 7.8539816E-20 ky = 1.5707963E-19 kz = 0.0000000E+00
>>
>> k = 1.756203682760181E-019
>> kJ = 6.508776532382562E-019
>> c0 = 181807.294175918 omega^2 = -1.298354260861123E-026
>> velamp = 64881.6252564676
>> perturbation is unstable with growth time 8776137041619.21
>> min_blocks 5 max_blocks 5 tot_blocks 5
>> min_blocks 21 max_blocks 21 tot_blocks 21
>> min_blocks 85 max_blocks 85 tot_blocks 85
>> min_blocks 341 max_blocks 341 tot_blocks 341
>> INITIAL TIMESTEP = 100000.000000000
>> [CHECKPOINT_WR] NOTE: will send 520 blocks per message.
>> Warning! The HDF5 header files included by this application do not
>> match the
>> version used by the HDF5 library to which this application is linked.
>> Data
>> corruption or segmentation faults may occur if the application is
>> allowed to continue. You can, at your own risk, disable this check
>> by setting
>> the environment variable 'HDF5_DISABLE_VERSION_CHECK' to a value of '1'.
>> Setting it to 2 will suppress the warning totally.
>> Headers are 1.6.6, library is 1.6.7
>> Bye...
>> forrtl: info: Fortran error message number is 76.
>> forrtl: warning: Could not open message catalog: ifcore_msg.cat.
>> forrtl: info: Check environment variable NLSPATH and protection of
>> /usr/lib/ifcore_msg.cat.
>> Abort
>> [1] + Done emacs flash.par
>> [seyit at celsius objectT3]$ setenv HDF5_DISABLE_VERSION_CHECK 1
>> [seyit at celsius objectT3]$ ./flash2
>>
>> flash: initializing for jeans problem.
>>
>> kx = 7.8539816E-20 ky = 1.5707963E-19 kz = 0.0000000E+00
>>
>> k = 1.756203682760181E-019
>> kJ = 6.508776532382562E-019
>> c0 = 181807.294175918 omega^2 = -1.298354260861123E-026
>> velamp = 64881.6252564676
>> perturbation is unstable with growth time 8776137041619.21
>> min_blocks 5 max_blocks 5 tot_blocks 5
>> min_blocks 21 max_blocks 21 tot_blocks 21
>> min_blocks 85 max_blocks 85 tot_blocks 85
>> min_blocks 341 max_blocks 341 tot_blocks 341
>> INITIAL TIMESTEP = 100000.000000000
>> [CHECKPOINT_WR] NOTE: will send 520 blocks per message.
>> [CHECKPOINT_WR] Writing checkpoint file jeans_r13vA_hdf5_chk_0000
>> Progress: |
>> *** Wrote output to jeans_r13vA_hdf5_chk_0000 ( 341 blocks ) ***
>> number of variables found for plotfile storage = 3
>> dens pres temp
>> *** Wrote output to jeans_r13vA_hdf5_plt_cnt_0000 ***
>> n t dt | dt_hydro
>> 1 2.0000E+05 2.0000E+05 | 4.240E+11
>> 2 6.0000E+05 4.0000E+05 | 4.240E+11
>> 3 1.4000E+06 8.0000E+05 | 4.240E+11
>> 4 3.0000E+06 1.6000E+06 | 4.240E+11
>> 5 6.2000E+06 3.2000E+06 | 4.240E+11
>> 6 1.2600E+07 6.4000E+06 | 4.240E+11
>> 7 2.5400E+07 1.2800E+07 | 4.240E+11
>> 8 5.1000E+07 2.5600E+07 | 4.240E+11
>> 9 1.0220E+08 5.1200E+07 | 4.240E+11
>> 10 2.0460E+08 1.0240E+08 | 4.240E+11
>> 11 4.0940E+08 2.0480E+08 | 4.241E+11
>> 12 8.1900E+08 4.0960E+08 | 4.241E+11
>> 13 1.6382E+09 8.1920E+08 | 4.241E+11
>> 14 3.2766E+09 1.6384E+09 | 4.242E+11
>> 15 6.5534E+09 3.0000E+09 | 4.243E+11
>> 16 1.2553E+10 3.0000E+09 | 4.247E+11
>> 17 1.8553E+10 3.0000E+09 | 4.250E+11
>> 18 2.4553E+10 3.0000E+09 | 4.255E+11
>> 19 3.0553E+10 3.0000E+09 | 4.259E+11
>> 20 3.6553E+10 3.0000E+09 | 4.266E+11
>> 21 4.2553E+10 3.0000E+09 | 4.273E+11
>> 22 4.8553E+10 3.0000E+09 | 4.281E+11
>> 23 5.4553E+10 3.0000E+09 | 4.288E+11
>> 24 6.0553E+10 3.0000E+09 | 4.296E+11
>> 25 6.6553E+10 3.0000E+09 | 4.304E+11
>> 26 7.2553E+10 3.0000E+09 | 4.313E+11
>> 27 7.8553E+10 3.0000E+09 | 4.321E+11
>> 28 8.4553E+10 3.0000E+09 | 4.330E+11
>> 29 9.0553E+10 3.0000E+09 | 4.339E+11
>> 30 9.6553E+10 3.0000E+09 | 4.348E+11
>> 31 1.0255E+11 3.0000E+09 | 4.357E+11
>> 32 1.0855E+11 3.0000E+09 | 4.366E+11
>> 33 1.1455E+11 3.0000E+09 | 4.301E+11
>> 34 1.2055E+11 3.0000E+09 | 3.803E+11
>> 35 1.2655E+11 3.0000E+09 | 3.379E+11
>> 36 1.3255E+11 3.0000E+09 | 3.057E+11
>> 37 1.3855E+11 3.0000E+09 | 2.868E+11
>> 38 1.4455E+11 3.0000E+09 | 2.876E+11
>> 39 1.5055E+11 3.0000E+09 | 2.880E+11
>> 40 1.5655E+11 3.0000E+09 | 2.903E+11
>> 41 1.6255E+11 3.0000E+09 | 2.936E+11
>> 42 1.6855E+11 3.0000E+09 | 2.978E+11
>> 43 1.7455E+11 3.0000E+09 | 4.475E+11
>> 44 1.8055E+11 3.0000E+09 | 4.486E+11
>> 45 1.8655E+11 3.0000E+09 | 4.496E+11
>> 46 1.9255E+11 3.0000E+09 | 4.507E+11
>> 47 1.9855E+11 3.0000E+09 | 4.517E+11
>> 48 2.0455E+11 3.0000E+09 | 4.528E+11
>> 49 2.1055E+11 3.0000E+09 | 4.538E+11
>> 50 2.1655E+11 3.0000E+09 | 4.549E+11
>>
>>
>>
>> *pc 2*
>>
>> flash: initializing for jeans problem.
>>
>> kx = 7.8539816E-20 ky = 1.5707963E-19 kz = 0.0000000E+00
>>
>> k = 1.7562036827601810E-019
>> kJ = 6.5087765323825604E-019
>> c0 = 181807.2941759182 omega^2 = -1.2983542608611234E-026
>> velamp = 64881.62525646757
>> perturbation is unstable with growth time 8776137041619.213
>> min_blocks 5 max_blocks 5 tot_blocks 5
>> min_blocks 21 max_blocks 21 tot_blocks 21
>> min_blocks 85 max_blocks 85 tot_blocks 85
>> min_blocks 341 max_blocks 341 tot_blocks 341
>> INITIAL TIMESTEP = 100000.0000000000
>> [CHECKPOINT_WR] NOTE: will send 520 blocks per message.
>> [CHECKPOINT_WR] Writing checkpoint file jeans_r13vA_hdf5_chk_0000
>> Progress: |
>> *** Wrote output to jeans_r13vA_hdf5_chk_0000 ( 341 blocks ) ***
>> number of variables found for plotfile storage = 3
>> dens pres temp
>> *** Wrote output to jeans_r13vA_hdf5_plt_cnt_0000 ***
>> n t dt | dt_hydro
>> 1 2.0000E+05 2.0000E+05 | 4.406E+11
>> 2 6.0000E+05 4.0000E+05 | 4.406E+11
>> 3 1.4000E+06 8.0000E+05 | 4.406E+11
>> 4 3.0000E+06 1.6000E+06 | 4.406E+11
>> 5 6.2000E+06 3.2000E+06 | 4.406E+11
>> 6 1.2600E+07 6.4000E+06 | 4.406E+11
>> 7 2.5400E+07 1.2800E+07 | 4.406E+11
>> 8 5.1000E+07 2.5600E+07 | 4.406E+11
>> 9 1.0220E+08 5.1200E+07 | 4.406E+11
>> 10 2.0460E+08 1.0240E+08 | 4.405E+11
>> 11 4.0940E+08 2.0480E+08 | 4.404E+11
>> 12 8.1900E+08 4.0960E+08 | 4.403E+11
>> 13 1.6382E+09 8.1920E+08 | 4.399E+11
>> 14 3.2766E+09 1.6384E+09 | 4.393E+11
>> 15 6.5534E+09 3.0000E+09 | 4.381E+11
>> 16 1.2553E+10 3.0000E+09 | 4.365E+11
>> 17 1.8553E+10 3.0000E+09 | 4.353E+11
>> 18 2.4553E+10 3.0000E+09 | 4.345E+11
>> 19 3.0553E+10 3.0000E+09 | 4.340E+11
>> 20 3.6553E+10 3.0000E+09 | 4.337E+11
>> 21 4.2553E+10 3.0000E+09 | 4.337E+11
>> 22 4.8553E+10 3.0000E+09 | 4.339E+11
>> 23 5.4553E+10 3.0000E+09 | 4.343E+11
>> 24 6.0553E+10 3.0000E+09 | 4.348E+11
>> 25 6.6553E+10 3.0000E+09 | 4.355E+11
>> 26 7.2553E+10 3.0000E+09 | 4.363E+11
>> 27 7.8553E+10 3.0000E+09 | 4.371E+11
>> 28 8.4553E+10 3.0000E+09 | 4.382E+11
>> 29 9.0553E+10 3.0000E+09 | 4.393E+11
>> 30 9.6553E+10 3.0000E+09 | 4.405E+11
>> 31 1.0255E+11 3.0000E+09 | 4.418E+11
>> 32 1.0855E+11 3.0000E+09 | 4.431E+11
>> 33 1.1455E+11 3.0000E+09 | 4.447E+11
>> 34 1.2055E+11 3.0000E+09 | 4.463E+11
>> 35 1.2655E+11 3.0000E+09 | 4.479E+11
>> 36 1.3255E+11 3.0000E+09 | 4.496E+11
>> 37 1.3855E+11 3.0000E+09 | 4.513E+11
>> 38 1.4455E+11 3.0000E+09 | 4.531E+11
>> 39 1.5055E+11 3.0000E+09 | 4.550E+11
>> 40 1.5655E+11 3.0000E+09 | 4.570E+11
>> 41 1.6255E+11 3.0000E+09 | 4.590E+11
>> 42 1.6855E+11 3.0000E+09 | 4.611E+11
>> 43 1.7455E+11 3.0000E+09 | 4.632E+11
>> 44 1.8055E+11 3.0000E+09 | 4.653E+11
>> 45 1.8655E+11 3.0000E+09 | 4.675E+11
>> 46 1.9255E+11 3.0000E+09 | 4.697E+11
>> 47 1.9855E+11 3.0000E+09 | 4.719E+11
>> 48 2.0455E+11 3.0000E+09 | 4.733E+11
>> 49 2.1055E+11 3.0000E+09 | 4.734E+11
>> 50 2.1655E+11 3.0000E+09 | 4.736E+11
>>
>>
>>
>> *pc 3*
>>
>>
>> flash: initializing for jeans problem.
>>
>> kx = 7.8539816E-20 ky = 1.5707963E-19 kz = 0.0000000E+00
>>
>> k = 1.756203682760181E-019
>> kJ = 6.508776532382562E-019
>> c0 = 181807.294175918 omega^2 = -1.298354260861123E-026
>> velamp = 64881.6252564676
>> perturbation is unstable with growth time 8776137041619.21
>> min_blocks 5 max_blocks 5 tot_blocks 5
>> min_blocks 21 max_blocks 21 tot_blocks 21
>> min_blocks 85 max_blocks 85 tot_blocks 85
>> min_blocks 341 max_blocks 341 tot_blocks 341
>> INITIAL TIMESTEP = 100000.000000000
>> [CHECKPOINT_WR] NOTE: will send 520 blocks per message.
>> [CHECKPOINT_WR] Writing checkpoint file jeans_r13vA_hdf5_chk_0000
>> Progress: |
>> *** Wrote output to jeans_r13vA_hdf5_chk_0000 ( 341 blocks ) ***
>> number of variables found for plotfile storage = 3
>> dens pres temp
>> *** Wrote output to jeans_r13vA_hdf5_plt_cnt_0000 ***
>> n t dt | dt_hydro
>> 1 2.0000E+05 2.0000E+05 | 4.240E+11
>> 2 6.0000E+05 4.0000E+05 | 4.240E+11
>> 3 1.4000E+06 8.0000E+05 | 4.240E+11
>> 4 3.0000E+06 1.6000E+06 | 4.240E+11
>> 5 6.2000E+06 3.2000E+06 | 4.240E+11
>> 6 1.2600E+07 6.4000E+06 | 4.240E+11
>> 7 2.5400E+07 1.2800E+07 | 4.240E+11
>> 8 5.1000E+07 2.5600E+07 | 4.240E+11
>> 9 1.0220E+08 5.1200E+07 | 4.240E+11
>> 10 2.0460E+08 1.0240E+08 | 4.240E+11
>> 11 4.0940E+08 2.0480E+08 | 4.241E+11
>> 12 8.1900E+08 4.0960E+08 | 4.241E+11
>> 13 1.6382E+09 8.1920E+08 | 4.241E+11
>> 14 3.2766E+09 1.6384E+09 | 4.242E+11
>> 15 6.5534E+09 3.0000E+09 | 4.244E+11
>> 16 1.2553E+10 3.0000E+09 | 4.247E+11
>> 17 1.8553E+10 3.0000E+09 | 4.251E+11
>> 18 2.4553E+10 3.0000E+09 | 4.255E+11
>> 19 3.0553E+10 3.0000E+09 | 4.259E+11
>> 20 3.6553E+10 3.0000E+09 | 4.266E+11
>> 21 4.2553E+10 3.0000E+09 | 4.273E+11
>> 22 4.8553E+10 3.0000E+09 | 4.281E+11
>> 23 5.4553E+10 3.0000E+09 | 4.288E+11
>> 24 6.0553E+10 3.0000E+09 | 4.296E+11
>> 25 6.6553E+10 3.0000E+09 | 4.304E+11
>> 26 7.2553E+10 3.0000E+09 | 4.313E+11
>> 27 7.8553E+10 3.0000E+09 | 4.321E+11
>> 28 8.4553E+10 3.0000E+09 | 4.330E+11
>> 29 9.0553E+10 3.0000E+09 | 4.338E+11
>> 30 9.6553E+10 3.0000E+09 | 4.347E+11
>> 31 1.0255E+11 3.0000E+09 | 4.356E+11
>> 32 1.0855E+11 3.0000E+09 | 4.366E+11
>> 33 1.1455E+11 3.0000E+09 | 4.375E+11
>> 34 1.2055E+11 3.0000E+09 | 4.384E+11
>> 35 1.2655E+11 3.0000E+09 | 4.394E+11
>> 36 1.3255E+11 3.0000E+09 | 4.404E+11
>> 37 1.3855E+11 3.0000E+09 | 4.413E+11
>> 38 1.4455E+11 3.0000E+09 | 4.423E+11
>> 39 1.5055E+11 3.0000E+09 | 4.433E+11
>> 40 1.5655E+11 3.0000E+09 | 4.444E+11
>> 41 1.6255E+11 3.0000E+09 | 4.454E+11
>> 42 1.6855E+11 3.0000E+09 | 4.464E+11
>> 43 1.7455E+11 3.0000E+09 | 4.474E+11
>> 44 1.8055E+11 3.0000E+09 | 4.484E+11
>> 45 1.8655E+11 3.0000E+09 | 4.495E+11
>> 46 1.9255E+11 3.0000E+09 | 4.505E+11
>> 47 1.9855E+11 3.0000E+09 | 4.516E+11
>> 48 2.0455E+11 3.0000E+09 | 4.526E+11
>> 49 2.1055E+11 3.0000E+09 | 4.537E+11
>> 50 2.1655E+11 3.0000E+09 | 4.547E+11
>>
>>
>>
>>
>> *Here again are the three different PCs*
>>
>>
>> 1)
>> Centos5
>> Linux celsius 2.6.18-53.1.19.el5 #1 SMP Wed May 7 08:20:19 EDT 2008
>> i686 i686 i386 GNU/Linux
>> ifort intel compiler version 8.0
>> hdf5 1.6.6 (serial) *(updated recently to 1.6.7)*
>> mpich2 Version: 1.0.3
>> gcc version 4.1.2 20070626 (Red Hat 4.1.2-14)
>> FFLAGS_OPT = -c -r8 -i4 -fast -ipo -ipo_obj -I $(MPI_PATH)/include
>>
>> 2)
>> Debian (??)
>> Linux hpcibm1 2.6.8-11-amd64-k8-smp #1 SMP Sun Oct 2 20:03:22 UTC
>> 2005 x86_64 GNU/Linux
>> pgf90 6.0-5 32-bit target on x86-64 Linux
>> hdf5 1.4.5 (serial)
>> mpich2 version 1.0.5
>> gcc version 3.3.5 (Debian 1:3.3.5-13)
>> FFLAGS_OPT = -c -fast -r8 -i4
>>
>> 3)
>> Ubuntu
>> Linux si01 2.6.24.3 #1 SMP Fri Jul 4 11:02:09 CEST 2008 x86_64 GNU/Linux
>> ifort intel compiler version 10.1.012
>> hdf5 1.6.7 (serial)
>> mpich2 Version: 1.0.6
>> gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
>> FFLAGS_OPT = -c -r8 -i4 -xT -O3 -no-prec-div -static -I
>> $(MPI_PATH)/include
>> or (the only difference is -ipo; the result seemingly doesn't change)
>> FFLAGS_OPT = -c -r8 -i4 -fast -I $(MPI_PATH)/include
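>> For the low-optimization comparison I plan to try value-safe flags on
>> all three machines, something like the following (taken from the compiler
>> manuals, so treat them as suggestions; -fp-model precise needs ifort 10
>> or newer, older ifort has -mp instead, and pgf90 has -Kieee):
>>
>> ifort 8.0 : FFLAGS_OPT = -c -r8 -i4 -O0 -mp -I $(MPI_PATH)/include
>> ifort 10.1: FFLAGS_OPT = -c -r8 -i4 -O0 -fp-model precise -I $(MPI_PATH)/include
>> pgf90 6.0 : FFLAGS_OPT = -c -r8 -i4 -O0 -Kieee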
>>
>>
>>
>>
>>
>>
>> Tomasz Plewa wrote:
>>> Seyit -
>>>
>>> Avoiding aggressive optimization is certainly a good idea when it
>>> comes to debugging an implementation. Alas, given that you experience
>>> problems with three different compilers, it is rather unlikely that
>>> the compilers are at fault.
>>>
>>> If your grid does not resolve structures correctly (presumably due
>>> to lax refinement criteria), the code may quite easily produce an
>>> unphysical state. But that should not happen during the initial grid
>>> generation.
>>>
>>> Any luck with single block/single level runs?
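>>> For that test something like the following in flash.par should be
>>> enough (parameter names as in the FLASH2 users guide; everything else
>>> stays as in your setup):
>>>
>>> lrefine_min = 1
>>> lrefine_max = 1
>>> nblockx     = 1
>>> nblocky     = 1
>>> nblockz     = 1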
>>>
>>> Tomek
>>> --
>>> Seyit Hocuk wrote:
>>>> Hi,
>>>>
>>>> It has been some time now, but to continue the discussion, I have
>>>> tested the standard Jeans and Sedov problems, including changing the
>>>> refinement levels. The result is exactly the same on all three
>>>> computers. Unfortunately, my own setup shows differences, and it seems
>>>> the end result changes with the choice of computer/environment/compiler
>>>> or whatever. My own setup has a lot more physics than the simple test
>>>> problems.
>>>>
>>>> One thing I am thinking about is that I adjusted the refinement
>>>> criteria to use a reference density instead of looking at the
>>>> second-order derivative. I don't know if this could be the problem,
>>>> but dt_hydro should not depend on the refinement criteria, so I
>>>> guess there is more to it.
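>>>> In case it matters, the criterion is essentially of the following
>>>> form (a stripped-down sketch rather than my actual routine; the unk
>>>> layout follows the usual PARAMESH convention unk(var,i,j,k,block),
>>>> and idens stands for the density index):
>>>>
>>>> subroutine mark_by_density(unk, idens, nodetype, lnblocks, ref_dens, &
>>>>                            delta_ref, delta_deref, refine, derefine)
>>>>   implicit none
>>>>   integer, intent(in)    :: idens, lnblocks
>>>>   integer, intent(in)    :: nodetype(:)
>>>>   real*8,  intent(in)    :: unk(:,:,:,:,:)
>>>>   real*8,  intent(in)    :: ref_dens, delta_ref, delta_deref
>>>>   logical, intent(inout) :: refine(:), derefine(:)
>>>>   integer :: lb
>>>>   real*8  :: dens_max
>>>>   do lb = 1, lnblocks
>>>>      if (nodetype(lb) == 1) then                    ! leaf blocks only
>>>>         dens_max = maxval(unk(idens,:,:,:,lb))      ! peak density in block
>>>>         if (dens_max > delta_ref*ref_dens) then
>>>>            refine(lb)   = .true.
>>>>            derefine(lb) = .false.
>>>>         else if (dens_max < delta_deref*ref_dens) then
>>>>            derefine(lb) = .true.
>>>>         end if
>>>>      end if
>>>>   end do
>>>> end subroutine mark_by_density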
>>>>
>>>> Unfortunately it is not easy to change compilers and some other
>>>> things (like the CPU, kernel, etc.) on the different computers because
>>>> I have limited rights on those machines. I don't want to find out that
>>>> I have to choose one machine and stick with it; how would I know which
>>>> machine is the best? Anyway, I am going to try at least to get the
>>>> libraries (HDF5/MPI) the same on all machines and see if that helps.
>>>> By the way, the latest HDF5 (1.8.1) doesn't seem to work with FLASH 2.5.
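>>>> (According to the HDF5 1.8 documentation, building the library with
>>>> --with-default-api-version=v16, or compiling the I/O files with
>>>> -DH5_USE_16_API, is supposed to restore the 1.6 interface that
>>>> FLASH 2.5 expects, but I have not tried that yet.)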
>>>>
>>>> I will also play with the compiler options, trying to make them
>>>> exactly the same and avoiding aggressive optimizations. I hope this
>>>> is where the solution lies.
>>>>
>>>> Thanks,
>>>> Seyit
>>>>
>>>>
>>>>
>>>>
>>>> Tomasz Plewa wrote:
>>>>> Although I prefer a low-cost approach to debugging, which at first
>>>>> avoids dealing with setup/system/compiler information and attempts
>>>>> to eliminate 99% of the source code, here are some quick comments on
>>>>> your systems.
>>>>>
>>>>> (1) pgi 6.0 is quite an antique compiler. It is so old that I do
>>>>> not even remember what problems we may have had with it. I suspect
>>>>> quite a few, and none of them might be relevant...
>>>>>
>>>>> (2) As you might have noticed, Intel has a newer compiler version. Your
>>>>> CentOS5 system needs upgrades on both the mpich2 and compiler sides.
>>>>>
>>>>> (3) Your Ubuntu system seems relatively up to date, although HDF5
>>>>> 1.8.x is out there and seems to work fine.
>>>>>
>>>>> (4) Compilers are known to do strange things to perfect codes! So
>>>>> if you run into trouble on a new system, one of the first things
>>>>> to try is to lower the compiler optimization level. Avoiding -fast,
>>>>> -ip, or -inline is usually a good idea. A simple -O0 may suffice for
>>>>> a couple of steps. As a bonus, the code compiles fast. Then the
>>>>> trick is to get it to crash quickly...
>>>>>
>>>>> Again, I suggest simplifying your application, isolating the
>>>>> problem, and fixing it, then finding the best compiler, mpich,
>>>>> options, etc., for running your production code.
>>>>>
>>>>> Enjoy your weekend!
>>>>>
>>>>> Tomek
>>>>> --
>>>>> Seyit Hocuk wrote:
>>>>>> Hi Nathan,
>>>>>>
>>>>>> That's a good idea. I'll start simple and test those first thing
>>>>>> Monday. It's too late for me now.
>>>>>>
>>>>>> To answer your questions:
>>>>>>
>>>>>> The problem I am running is my own home-made setup; actually, it is
>>>>>> a modified version of Jeans. Basically it includes heat, cool, eos
>>>>>> gamma, compositions, adjusted refinement criteria similar to one of
>>>>>> the other problems (with delta_ref and delta_deref), and a range of
>>>>>> other small things.
>>>>>>
>>>>>> 1)
>>>>>> Centos5
>>>>>> Linux celsius 2.6.18-53.1.19.el5 #1 SMP Wed May 7 08:20:19 EDT
>>>>>> 2008 i686 i686 i386 GNU/Linux
>>>>>> ifort intel compiler version 8.0
>>>>>> hdf5 1.6.6 (serial)
>>>>>> mpich2 Version: 1.0.3
>>>>>> gcc version 4.1.2 20070626 (Red Hat 4.1.2-14)
>>>>>> FFLAGS_OPT = -c -r8 -i4 -fast -ipo -ipo_obj -I
>>>>>> $(MPI_PATH)/include
>>>>>>
>>>>>> 2)
>>>>>> Debian (??)
>>>>>> Linux hpcibm1 2.6.8-11-amd64-k8-smp #1 SMP Sun Oct 2 20:03:22 UTC
>>>>>> 2005 x86_64 GNU/Linux
>>>>>> pgf90 6.0-5 32-bit target on x86-64 Linux
>>>>>> hdf5 1.4.5 (serial)
>>>>>> mpich2 version 1.0.5
>>>>>> gcc version 3.3.5 (Debian 1:3.3.5-13)
>>>>>> FFLAGS_OPT = -c -fast -r8 -i4
>>>>>>
>>>>>> 3)
>>>>>> Ubuntu
>>>>>> Linux si01 2.6.24.3 #1 SMP Fri Jul 4 11:02:09 CEST 2008 x86_64
>>>>>> GNU/Linux
>>>>>> ifort intel compiler version 10.1.012
>>>>>> hdf5 1.6.7 (serial)
>>>>>> mpich2 Version: 1.0.6
>>>>>> gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
>>>>>> FFLAGS_OPT = -c -r8 -i4 -xT -O3 -no-prec-div -static -I
>>>>>> $(MPI_PATH)/include
>>>>>> or (the only difference is -ipo; the result seemingly doesn't change)
>>>>>> FFLAGS_OPT = -c -r8 -i4 -fast -I $(MPI_PATH)/include
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Nathan Hearn wrote:
>>>>>>> Hi Seyit,
>>>>>>>
>>>>>>> Are you seeing these differences in the standard test problems
>>>>>>> included in Flash (e.g., Noh, Sod, Sedov), or are they specific
>>>>>>> to the
>>>>>>> problem that you are running? (Really, this follows from Tomek's
>>>>>>> advice to look at the simplest configurations where the differences
>>>>>>> are seen.)
>>>>>>>
>>>>>>> Also, it would be good to include some more information in your
>>>>>>> messages, for instance: the problem that you are running, the
>>>>>>> specific
>>>>>>> architectures and compilers, the optimization flags, and the
>>>>>>> library
>>>>>>> versions for HDF5 and MPI that you are using. Details like
>>>>>>> these can
>>>>>>> be really helpful in evaluating and fixing the issues you are
>>>>>>> seeing.
>>>>>>>
>>>>>>>
>>>>>>> - Nathan
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jul 18, 2008 at 11:32 AM, Seyit Hocuk
>>>>>>> <seyit at astro.rug.nl> wrote:
>>>>>>>
>>>>>>>> Oh man,
>>>>>>>>
>>>>>>>> I am using 3 different computers (with different architectures,
>>>>>>>> compilers,
>>>>>>>> optimizations, environments). All three differ right from
>>>>>>>> the beginning (differences in dt_hydro and refinement).
>>>>>>>> Should I worry very, very, very much?
>>>>>>>> I'm now doing a complete simulation and am going to check if my
>>>>>>>> end results
>>>>>>>> also change.
>>>>>>>>
>>>>>>>> I'm trembling at the thought.....
>>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>
>