<p>
Hi everybody,
<br/>
I have finished a 5-year simulation and the output files in the
<span class="caps">
SCRATCH
</span>
directory have been removed. Now I want to continue the experiment for another year. Please let me know whether this is possible and what actions are required.
<br/>
Kind regards, Simon
</p>
<p>
If you use the latest subchain version you can create the directory structure with
<br/>
<pre>
subchain create
</pre>
<br/>
otherwise you have to do this by hand. In this case look at the section
<br/>
<code>
# create the job directory structure
</code>
<br/>
in the subchain script where the directories are created.
<br/>
I assume you want to perform a warm start, i.e. to prolong the run for another year. In this case do not change
<code>
YDATE_START
</code>
, but just adapt
<code>
YDATE_STOP
</code>
.
</p>
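<p>
As a rough illustration of the warm-start advice above, the sketch below leaves YDATE_START alone and only moves the stop date forward by one year. The helper shift_years and the variable OLD_STOP are hypothetical names introduced here, and the assumption that the finished run ended at 1994010100 is taken from the dates discussed in this thread; the way subchain actually stores these settings may differ.
</p>

```shell
#!/bin/sh
# Hypothetical helper (not part of subchain): shift a YYYYMMDDHH date by a
# whole number of years. Only the year digits change; month, day and hour are
# copied unchanged, so a 29 February date would need extra handling.
shift_years() {
    date_in=$1
    years=$2
    year=$(printf '%s' "$date_in" | cut -c1-4)
    rest=$(printf '%s' "$date_in" | cut -c5-10)
    printf '%04d%s\n' "$((year + years))" "$rest"
}

# Warm start: the start date stays exactly as in the original run.
YDATE_START=1989010100   # start date of the original experiment in this thread
OLD_STOP=1994010100      # assumed end of the finished 5-year run
YDATE_STOP=$(shift_years "$OLD_STOP" 1)

echo "YDATE_START=${YDATE_START}"
echo "YDATE_STOP=${YDATE_STOP}"
```

<p>
The two printed values are what one would then enter by hand in the subchain script; nothing here edits the script itself.
</p>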
<p>
Thanks much.
<br/>
I apparently did something wrong.
</p>
<p>
I use subchain version 1.3.4 (is it the latest?).
<br/>
I ran ./subchain create and now have the ….chain/scratch/… directory (which is empty).
<br/>
The date.log, however, contains a date that is equal to
<span class="caps">
YDATE
</span>
_START.
<br/>
After that I attempted to submit a restart job ./subchain cclm
<span class="caps">
DATE
</span>
<br/>
where the
<span class="caps">
DATE
</span>
is the one from the original date.log file, i.e. not the one created by ./subchain create. However, the job stopped: first cclm, and then prep and int2lm.
</p>
<p>
The subchain date.log, the log files, and the files from the jobs directory (tarred) are attached.
<br/>
Please have a look.
<br/>
Simon
</p>
<p>
1.3.4 is the latest released subchain version.
<br/>
You wrote:
<br/>
<em>
The date.log however contains the date which is equal to
<span class="caps">
YDATE
</span>
_START.
</em>
<br/>
but actually in your subchain
<span class="caps">
YDATE
</span>
_START=1989010100 and in date.log it is 1994010100, which is OK.
<br/>
I guess you have to create the input data for cclm first. In your case, please run
<br/>
<code>
./subchain prep 1994010100
</code>
<br/>
If everything goes well, this job should call int2lm and later cclm automatically.
<br/>
By the way, calling
<code>
./subchain cclm
</code>
always takes the date from date.log. A second argument will be ignored.
</p>
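<p>
Since the chain always works from the date stored in date.log, it can help to look at that date before submitting anything. The following is only a sketch: first_chain_date is a hypothetical helper, and the demo writes a throwaway file instead of touching the real date.log in the experiment directory.
</p>

```shell
#!/bin/sh
# Hypothetical helper: print the first date stored in a date.log file.
first_chain_date() {
    # date.log holds two dates on one line; only the first one is returned.
    read -r first _rest < "$1" || [ -n "$first" ]
    printf '%s\n' "$first"
}

# Demo on a throwaway copy with the content discussed in this thread.
demo_dir=$(mktemp -d)
echo "1994010100 1994010100" > "$demo_dir/date.log"

chain_date=$(first_chain_date "$demo_dir/date.log")
echo "date.log points at: $chain_date"
echo "command to create the cclm input: ./subchain prep $chain_date"

rm -rf "$demo_dir"
```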
<p>
The ./subchain prep 1994010100 job did call int2lm but stopped after it. I attach the resulting log file.
<br/>
The last thing it did was create two directories, 1994_01 and 1994_02, in ..scratch/output/int2lm.
<br/>
Any hints, please?
</p>
<p>
Beate just found an error in the subchain script. When you call
<code>
subchain create
</code>
the following command at around line 95 should not be executed:
<br/>
<pre>
echo ${YDATE_START} ${YDATE_START} > ${PFDIR}/${EXPID}/date.log
</pre>
<br/>
This command is only meant for a cold start. Please check whether you overwrote date.log when you ran
<code>
subchain create
</code>
<br/>
For a warm start, date.log should contain 1994010100 in your case.
</p>
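<p>
One possible way to avoid overwriting date.log on a warm start is to write it only when it does not exist yet. This is a sketch of that idea and not the fix that went into subchain; init_date_log is a hypothetical function, and the demo uses temporary files rather than ${PFDIR}/${EXPID}/date.log.
</p>

```shell
#!/bin/sh
# Hypothetical guard: write date.log only if it does not exist yet (cold start).
# An existing date.log (warm start) is left untouched.
init_date_log() {
    log_file=$1
    start_date=$2
    if [ -e "$log_file" ]; then
        echo "keeping existing $log_file: $(cat "$log_file")"
    else
        echo "$start_date $start_date" > "$log_file"
        echo "created $log_file for a cold start"
    fi
}

demo_dir=$(mktemp -d)

# Warm-start case: date.log already holds the restart date and must survive.
echo "1994010100 1994010100" > "$demo_dir/warm.log"
init_date_log "$demo_dir/warm.log" 1989010100

# Cold-start case: no date.log yet, so it is created from the start date.
init_date_log "$demo_dir/cold.log" 1989010100

rm -rf "$demo_dir"
```

<p>
Commenting out the echo line by hand, as suggested above, has the same effect for a single warm start; the guard merely avoids having to remember it.
</p>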
<p>
Do you mean that in my case (warm start) the line has to be commented out, but must be present for a cold start?
<br/>
With this correction I tried submitting ./subchain cclm 1994010100 (with 1994010100 1994010100 in date.log; why twice, in fact?), but this didn’t work.
<br/>
Should I perhaps try submitting subchain prep for an earlier date, like ./subchain prep 1994010100 or ./subchain prep 1993120100?
</p>
<p>
If date.log contains
<code>
1994010100 1994010100
</code>
then
<code>
./subchain prep 1994010100
</code>
should work and start the chain again. Otherwise something got mixed up in your chain.
<br/>
The two dates in date.log are only needed when running sub-monthly chunks. This is not the case in your run, so just leave the file as it is.
</p>
<p>
Thank you. It doesn’t work. It may be my mistake, of course, but I do not think I changed anything in the scripts except for what you suggested in subchain (I commented out the line echo ${YDATE_START} ${YDATE_START} > ${PFDIR}/${EXPID}/date.log). [By the way, I work with cclm-sp_1.4 and not with 1.3.4. Should I try restarting the job using 1.3.4?]
<br/>
To summarize:
<br/>
My date.log reads 1994010100 1994010100.
<br/>
The job ./subchain prep 1994010100 starts successfully and calls int2lm but doesn’t call cclm.
<br/>
I tried submitting ./subchain cclm 1994010100 after that (and also before), but it terminates with
<span class="caps">
ERROR
</span>
<span class="caps">
CODE
</span>
2014 in
<span class="caps">
ROUTINE
</span>
organize_input
<br/>
after attempting to open ncdf file lbff**000000.nc
<br/>
No such file or directory
</p>
<p>
================================
</p>
<p>
However, I have successfully restarted my jobs many times from the most recent time step (i.e. when the experiment was not yet finished, the data in the /scratch directory had not been removed, and the most recently created files were still there). Could it be that restarting is possible only from the most recent time step? Or should one, in principle, be able to restart a job from any point in time (and if so, where are the input data supposed to come from)?
<br/>
Please kindly clarify.
</p>
<p>
I just ran a test myself and it worked fine.
<br/>
Please run <code>./subchain prep 1994010100</code> again, and if it does not work, please attach the log files for prep, int2lm and cclm that the job produced.
</p>
<p>
Please see the attached log files (except for cclm, since it did not start). Also attached are my subchain script, all jobs, and the output of ls -l for the restarts directory. Many thanks indeed for your help.
</p>
<p>
The prep and int2lm jobs you provided have already created the data for 199402.
<br/>
Please check if the directory
<br/>
/Research/CLIMATE/Giora/COSMO-
<span class="caps">
CLM
</span>
/cclm-sp_1.4/chain/scratch/b3001/output/int2lm/1994_01/
<br/>
contains the laf1994010100.nc file and all necessary lbfd199401mmddhh.nc files.
<br/>
If these are available, run the command
<code>
./subchain cclm
</code>
and attach the resulting .job and joblog file for this to your reply.
</p>
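<p>
The file check described above could be scripted roughly as follows. This is a sketch under assumptions: check_int2lm_output is a hypothetical helper, it only tests that the laf file exists and counts the lbfd files for the month, and it cannot know how many boundary files are actually necessary. The demo runs against a temporary directory with made-up empty files, not against the real int2lm output.
</p>

```shell
#!/bin/sh
# Hypothetical helper: report whether the initial file and at least one
# boundary file exist for the month of the given YYYYMMDDHH stamp.
check_int2lm_output() {
    dir=$1
    stamp=$2
    month=$(printf '%s' "$stamp" | cut -c1-6)
    rc=0

    if [ -f "$dir/laf${stamp}.nc" ]; then
        echo "found   laf${stamp}.nc"
    else
        echo "missing laf${stamp}.nc"
        rc=1
    fi

    # Count boundary files for the month; the required number depends on the
    # boundary update interval, which this sketch does not know.
    count=0
    for f in "$dir"/lbfd"${month}"*.nc; do
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "lbfd${month}*.nc files: $count"
    if [ "$count" -eq 0 ]; then
        rc=1
    fi

    return "$rc"
}

# Demo with empty placeholder files in a temporary directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/laf1994010100.nc" "$demo_dir/lbfd1994010100.nc" "$demo_dir/lbfd1994010106.nc"
check_int2lm_output "$demo_dir" 1994010100
rm -rf "$demo_dir"
```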
<p>
I have submitted the job and it now runs without any problem, so the problem seems to be solved. I do not really understand how. Thanks very much anyway.
</p>
<p>
I finally understand how I managed to get my job running. I clearly made a mistake. As I now see, the cclm.job.tmpl file in the /templates directory contains ydirini=@{YDIRINI}/’ and not ydirini=’/Research/CLIMATE/Giora/COSMO-
<span class="caps">
CLM
</span>
/cclm-sp_1.4/work/b3001/restarts’‘,
<br/>
This means that by submitting ./subchain cclm 1994010100 I actually performed a cold start and not the warm start I wanted.
<br/>
Sorry for yesterday’s misleading information.
</p>
<p>
So my problem apparently remains unsolved. Following your earlier recommendation, I have repeated all my previous steps on another job, b2001. Please find attached a tar file with information on the files in /Research/CLIMATE/Giora/COSMO-
<span class="caps">
CLM
</span>
/cclm-sp_1.4/chain/scratch/b2001/output/int2lm/1994_01/ and /Research/CLIMATE/Giora/COSMO-
<span class="caps">
CLM
</span>
/cclm-sp_1.4/chain/scratch/b2001/output/int2lm/1994_02/
<br/>
as well as the resulting .job and joblog file.
</p>
<p>
You are still mixing something up in your subchain script.
<br/>
In cclmb2001.job one can read
<br/>
<pre>
ydirini='/Research/CLIMATE/Giora/COSMO-CLM/cclm-sp_1.4/chain/work/b2001/restarts'',
ydirbd='/Research/CLIMATE/Giora/COSMO-CLM/cclm-sp_1.4/chain/scratch/b2001/input/cclm/1994_01/',
</pre>
<br/>
There is one ' too many in ydirini.
<br/>
Maybe this causes the error in cclm-b2.o1032872:
<br/>
<pre>
OPEN: bina-file:
/Research/CLIMATE/Giora/COSMO-CLM/cclm-sp_1.4/chain/work/b2001/restarts/lrfd199
4010100o
*** Restart: A default set for refatm parameters is used: 2
CLOSING bina FILE
OPEN: ncdf-file: lbff**000000.nc
No such file or directory
</pre>
<br/>
Please attach the
<span class="caps">
YUSPECIF
</span>
and subchain files next time. They help in understanding the problem.
</p>
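<p>
A stray quote like the one in ydirini is easy to miss by eye. As a sketch, the hypothetical helper find_unbalanced_quotes below prints every line of a file that contains an odd number of single quotes. The demo file is a temporary copy of the two lines quoted above; an odd count is only a hint, since this does not parse the namelist syntax.
</p>

```shell
#!/bin/sh
# Hypothetical helper: print "line number: line" for every line that contains
# an odd number of single quotes.
find_unbalanced_quotes() {
    # With ' as the field separator, a line with n quotes has n + 1 fields.
    awk -F"'" 'NF > 1 && (NF - 1) % 2 == 1 { printf "%d: %s\n", NR, $0 }' "$1"
}

# Demo with the two lines quoted from cclmb2001.job.
demo_job=$(mktemp)
cat > "$demo_job" <<'EOF'
ydirini='/Research/CLIMATE/Giora/COSMO-CLM/cclm-sp_1.4/chain/work/b2001/restarts'',
ydirbd='/Research/CLIMATE/Giora/COSMO-CLM/cclm-sp_1.4/chain/scratch/b2001/input/cclm/1994_01/',
EOF

find_unbalanced_quotes "$demo_job"
rm -f "$demo_job"
```

<p>
On the demo file only the ydirini line is reported, because it carries three quotes while the ydirbd line carries two.
</p>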
Restarting finished job
Sorry, the subchain is attached here.