problems with cclm runs – in #9: CCLM

  @sinclairchinyoka in #7ace788

I have been trying to set up my COSMO-CLM model on our cluster. I am now carrying out a test run from 1980010100 to 1980030100.

INT2LM runs well, but the problem is CCLM. It runs up to the last day and completes successfully, but when I checked the outputs, they keep repeating the same dates.

The output only reaches 14Z11JAN1980 instead of 1980030100. Even the last output file, which is named lffd198003100, contains the data of 14Z11JAN1980. I am using the subchain script. The only file with proper contents is the first one (1980010100); all other files have wrong outputs and cannot be displayed.
I have attached the cclm script.
I need help.

  @burkhardtrockel in #c97b690

It is not possible to answer without knowing the content of the files in the joblogs, joboutputs and EXPID directories. Please tar and gzip these directories and upload them to the forum.
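
A minimal sketch of how the requested directories could be packed, assuming a POSIX shell with tar available. The function name pack_logs and the archive name are placeholders that do not come from the thread, and EXPID stands for your own experiment directory.

```shell
# Hypothetical helper: pack the given directories into one gzip-compressed tar archive.
# Usage: pack_logs ARCHIVE DIR [DIR ...]
pack_logs() {
    archive=$1
    shift
    tar -czf "$archive" "$@"
}

# Example call, run from the directory that contains the three directories
# (replace EXPID with the name of your experiment directory):
# pack_logs sp001_logs.tar.gz joblogs joboutputs EXPID
```

The resulting single .tar.gz file can then be attached to a forum post.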

  @sinclairchinyoka in #ad7303e

Attached are the job logs of the runs.

  @burkhardtrockel in #893254c

The joblogs directories are empty (except for the finish_joblist file). You probably made an error in the job submission part. Please tar and gzip your sp001 directory and upload it to the forum.

  @sinclairchinyoka in #469217c

It is also like that in my spool. It contains only one file.

  @sinclairchinyoka in #e64c898

Here is the sp001 directory from the work directory.

  @burkhardtrockel in #469ad90

I need the cclm-sp_2.4/chain/gcm_to_cclm/sp001/ directory, not the work directory.

  @sinclairchinyoka in #2fe62ca

Here it is.

  @burkhardtrockel in #47a7f89

It seems that the cclm job has not even been submitted.
Can you please:
in int2lm.job.tmpl replace

$RUNDIR/int2lm.log   by  {JOBLOGFILE}

and in cclm.job.tmpl replace

$RUNDIR/cosmo-clm.log   by  {JOBLOGFILE}

This writes the log files to where they are supposed to be found.
After this, please make a new clean start:

subchain clean
subchain start

After the end of the job, please upload the new work/sp001 and cclm-sp_2.4/chain/gcm_to_cclm/sp001/ to the forum.
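
The replacement described in this post can also be scripted. The following is only a sketch, assuming a POSIX shell with awk; the function name replace_logpath is made up for this example, and the old and new strings should be copied exactly as given in the post. It does a literal (non-regex) replacement and keeps a .bak copy of the template.

```shell
# Hypothetical helper: literally replace OLD by NEW in a template file.
# Usage: replace_logpath FILE OLD NEW
replace_logpath() {
    tmpl=$1
    old=$2
    new=$3
    cp "$tmpl" "$tmpl.bak"          # keep a backup of the original template
    awk -v old="$old" -v new="$new" '{
        out = ""
        # index() searches for a plain string, so "$" and "." need no escaping
        while ((i = index($0, old)) > 0) {
            out = out substr($0, 1, i - 1) new
            $0 = substr($0, i + length(old))
        }
        print out $0
    }' "$tmpl.bak" > "$tmpl"
}

# Example calls (single quotes keep $RUNDIR from being expanded by the shell):
# replace_logpath int2lm.job.tmpl '$RUNDIR/int2lm.log'    '{JOBLOGFILE}'
# replace_logpath cclm.job.tmpl   '$RUNDIR/cosmo-clm.log' '{JOBLOGFILE}'
```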

  @sinclairchinyoka in #ebed1e6

Thank you very much. Let me do that now.

  @sinclairchinyoka in #8eb0be7

Attached are the log files. I also noted that instead of ending at 1980020100, it stopped at 1980013118.

  @sinclairchinyoka in #5395c06

When processing the CCLM output, e.g. lffd1980010606, it opens lbfd1980010612.nc. It is as if it is 6 hours ahead, and because of that it stops at lffd1980013118 instead of lffd1980020100.

  @burkhardtrockel in #450d425

The CCLM run seems to be OK, but the next subchain command was not executed:


more sp001_wk/joblogs/cclm/cclm_sp001_1980_01.o%j
/opt/tsce/share/cu100/mom_priv/jobs/6911.mu01.SC: line 309: subchain: command not found

Can’t find file /lustre/home/uon_schinyoka/.vnc/mu01:81.pid
You’ll have to kill the Xvnc process manually

I guess you need to put a ./ before your subchain command.

Therefore, please change in cclm.job.tmpl “subchain arch” to “./subchain arch”
and
in arch.job.tmpl “subchain post” to “./subchain post” and “subchain cclm” to “./subchain cclm”.
In prep.job.tmpl and int2lm.job.tmpl this is already OK.

After that you may either try a

subchain arch

or start clean again with

subchain clean
subchain start
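
The "command not found" message above is what a shell prints when a bare command name is not found in any directory on PATH. The current directory is usually not on PATH, which is why the script has to be called as ./subchain. A small sketch for listing template lines that still call subchain without the prefix, assuming all *.job.tmpl files sit in one directory; the function name find_bare_subchain is made up for this example.

```shell
# Hypothetical helper: list lines in DIR/*.job.tmpl that mention "subchain "
# without a leading "./". Prints file:line:text for each hit.
find_bare_subchain() {
    # /dev/null is added so grep always prints the file name, even for one file
    grep -n 'subchain ' "$1"/*.job.tmpl /dev/null | grep -v '\./subchain'
}

# Example call from the directory holding the templates:
# find_bare_subchain .
```

Comment lines that mention subchain will also be listed, so the output is only a starting point for editing the templates by hand.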

  @sinclairchinyoka in #fd0db5f

Thank you very much.

But when I use gribapi2ctl on any CCLM output and open it in GrADS, the variables do not display, and the last time step is 14Z11JAN1980 instead of 00Z02FEB1980.

no2t, tmax_2m and tmin_2m are all not displaying.

  @sinclairchinyoka in #d2e4dd1

Please can you take a look at the following files. It seems that I can only display data from lffd1980010100.nc. From all other files I cannot: either I get a message that the field is a constant value of 0, or the data is missing, especially for tot_prec, Tmax_2m and tmin_2m. Also, the time steps are wrong.

  @sinclairchinyoka in #f7e6e4b

arch and post are running into some errors. Please take a look at the log files.

  @burkhardtrockel in #5f35152

The errors do not seem to result from the chain itself. Very likely there are incorrect namelist settings and/or compiler options.
You are downscaling from about 200 km (NCEP-RA) to 3 km, which is a factor of 67! You need to perform at least a double nesting!
Regarding the compiler options: please choose an option that stops the program when it produces NaN. Your job already produces NaNs after the first few time steps but keeps running for the whole month, producing nothing but NaNs.
It is very hard to help you remotely with so many mistakes. Do you plan to take part in the training course? I am sure that would be of valuable help for you.
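
Which compiler was used is not stated in the thread, so the following is only a sketch of the kind of option meant. The variable names are placeholders for wherever your build keeps its Fortran flags; the flags themselves are documented options of GNU Fortran and of the Intel Fortran compiler.

```shell
# GNU Fortran (gfortran): abort on invalid operations (the usual source of NaN),
# division by zero and overflow; -g and -fbacktrace help locate the failing line.
FFLAGS_GNU="-g -fbacktrace -ffpe-trap=invalid,zero,overflow"

# Intel Fortran (ifort): -fpe0 aborts on invalid, divide-by-zero and overflow;
# -traceback prints the call stack at the point of failure.
FFLAGS_INTEL="-g -traceback -fpe0"
```

With such an option the model stops at the first invalid operation instead of running on for the whole month with NaN fields.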

  @sinclairchinyoka in #dc14a92

Thank you very much for your help. I finally managed to display some interesting outputs. I changed the resolution (dlon) from 0.025 to 0.125.
The only remaining challenge is that the processing of the CCLM outputs finishes at 1980013118 instead of 198002100. Also, the file 1980013118 has the time step for 1980011114.
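
The factor of 67 quoted earlier in the thread is the grid spacing of the driving data divided by the grid spacing of the model. A rough sketch of that arithmetic for the old and new dlon, assuming about 111.2 km per degree (an approximation that is not from the thread) and using made-up helper names:

```shell
# Hypothetical helpers for a back-of-the-envelope check of the nesting ratio.

# Convert a grid spacing in degrees to km, assuming ~111.2 km per degree.
deg_to_km() {
    awk -v d="$1" 'BEGIN { printf "%.1f\n", d * 111.2 }'
}

# Ratio of coarse to fine grid spacing (both in km), rounded to an integer.
nest_ratio() {
    awk -v c="$1" -v f="$2" 'BEGIN { printf "%.0f\n", c / f }'
}

deg_to_km 0.025        # 2.8  (the "3 km" setup)
deg_to_km 0.125        # 13.9
nest_ratio 200 3       # 67
nest_ratio 200 13.9    # 14
```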

  @sinclairchinyoka in #c64be23

Also, regarding the training course: I missed the registration, so I won't be coming this year, but I will be available for the next training. I attended last year's training, but I was in the NWP group.