Tutorial 3: Cycle forecast and assimilation

In this tutorial, we run a cycled data assimilation (DA) experiment. Suppose we want to cycle DA and forecasts from 13z to 14z, assimilating every 15 minutes. We first define the assimilation times:

import datetime as dt
import pandas as pd

time_start = dt.datetime(2008, 7, 30, 13)
time_end = dt.datetime(2008, 7, 30, 14)

timedelta_btw_assim = dt.timedelta(minutes=15)
assim_times = pd.date_range(time_start, time_end, freq=timedelta_btw_assim)
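
To check which times the loop will visit, the range can be reproduced with the standard library alone. This is only a sketch: assimilation_times is a hypothetical helper, not part of DART-WRF. It mirrors the behaviour of pd.date_range here, which includes both endpoints because the 15-minute interval divides the one-hour span evenly.

```python
import datetime as dt

def assimilation_times(start, end, step):
    """Return all times from start to end (inclusive), spaced by step."""
    times = []
    t = start
    while t <= end:
        times.append(t)
        t += step
    return times

times = assimilation_times(
    dt.datetime(2008, 7, 30, 13),
    dt.datetime(2008, 7, 30, 14),
    dt.timedelta(minutes=15),
)
# five assimilation times: 13:00, 13:15, 13:30, 13:45, 14:00
print([t.strftime("%H:%M") for t in times])
```

Note that the end time itself is an assimilation time, so a forecast is also started at 14z.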

We create the configuration cfg as shown in tutorial 1 (omitted here). Then we loop over the assimilation times and run an assimilation and a forecast in each cycle. Parameters that change while the script is running must be updated with cfg.update(**kwargs). This applies, e.g., to time, prior_init_time, and prior_valid_time. The parameter prior_path_exp sets the path from which the initial conditions are taken. All scripts can access the parameters in the config object cfg.
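
The update pattern can be pictured with a small stand-in class. This is a sketch only: SketchConfig is hypothetical and is not the dartwrf.utils.Config implementation. It assumes that update(**kwargs) simply overwrites the named attributes and leaves all others untouched.

```python
import datetime as dt

class SketchConfig:
    """Hypothetical stand-in for a config object (not dartwrf.utils.Config)."""

    def __init__(self, **kwargs):
        self.update(**kwargs)

    def update(self, **kwargs):
        # assumption: each keyword overwrites (or creates) one attribute
        for name, value in kwargs.items():
            setattr(self, name, value)

cfg_sketch = SketchConfig(name="exp_sketch", time=dt.datetime(2008, 7, 30, 13))

# only `time` changes; `name` keeps its earlier value
cfg_sketch.update(time=dt.datetime(2008, 7, 30, 13, 15))

print(cfg_sketch.name, cfg_sketch.time)
```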

w = WorkFlows(cfg)
w.prepare_WRFrundir(cfg)

# no job to wait for before the first assimilation
id = None

# loop over assimilations
for i, t in enumerate(assim_times):

    if i == 0:
        # initialize forecast with a previous forecast from 8z, valid at time_start
        cfg.update(
            time = t,
            prior_init_time = dt.datetime(2008, 7, 30, 8),
            prior_valid_time = t,
            prior_path_exp = '/jetfs/scratch/username/sim_archive/exp_another/',)
    else:
        # initialize forecast with the forecast started at the previous assimilation time
        cfg.update(
            time = t,
            prior_init_time = t - timedelta_btw_assim,
            prior_valid_time = t,
            prior_path_exp = cfg.dir_archive,)

    # Run the assimilation (waits for the previous cycle's forecast, if any)
    id = w.assimilate(cfg, depends_on=id)

    # 1) Set posterior = prior
    id = w.prepare_IC_from_prior(cfg, depends_on=id)

    # 2) Update posterior += updates from assimilation
    id = w.update_IC_from_DA(cfg, depends_on=id)

    # forecast settings for this cycle
    cfg.update(
        WRF_start = t,
        WRF_end = t + dt.timedelta(hours=1),  # make 1h forecasts
        restart = True,
        hist_interval_s = 300,  # output interval in seconds
    )

    # 3) Run WRF ensemble
    id = w.run_WRF(cfg, depends_on=id)
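
The bookkeeping of the loop above can be checked without running WRF. The following sketch only computes, for each cycle, which prior forecast is used and which forecast window follows. cycle_schedule is a hypothetical helper, and the dictionary keys merely echo the parameter names used above.

```python
import datetime as dt

def cycle_schedule(assim_times, first_prior_init, forecast_length):
    """List prior and forecast times for each assimilation cycle."""
    schedule = []
    for i, t in enumerate(assim_times):
        # first cycle: prior comes from an earlier forecast;
        # later cycles: prior is the forecast started at the previous cycle
        prior_init = first_prior_init if i == 0 else assim_times[i - 1]
        schedule.append({
            "time": t,
            "prior_init_time": prior_init,
            "prior_valid_time": t,
            "WRF_start": t,
            "WRF_end": t + forecast_length,
        })
    return schedule

start = dt.datetime(2008, 7, 30, 13)
times = [start + i * dt.timedelta(minutes=15) for i in range(5)]
schedule = cycle_schedule(times, dt.datetime(2008, 7, 30, 8), dt.timedelta(hours=1))

for c in schedule:
    print(c["time"].strftime("%H:%M"),
          "prior from", c["prior_init_time"].strftime("%H:%M"),
          "forecast until", c["WRF_end"].strftime("%H:%M"))
```

Because each forecast runs for one hour while the cycles are 15 minutes apart, every forecast extends past the next assimilation time, so a prior valid at that time is available.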

Using SLURM

SLURM is a job scheduler. On many clusters, resource-intensive tasks must be submitted through it.

To use the SLURM job scheduler, set the parameter use_slurm = True of dartwrf.utils.Config and customize max_nproc (the total number of processors used at once) and max_nproc_for_each_ensemble_member (the number of processors used for each ensemble member). To configure the amount of resources that SLURM allocates to each job, modify the script DART-WRF/dartwrf/workflows.py for each workflow method you use.

Lastly, call workflow methods with the keyword argument depends_on=id. Each workflow method returns the id of the job it submits, and passing that id to the next call makes the next job wait until that job has completed. For example:

id = None
id = w.assimilate(cfg, depends_on=id)
id = w.prepare_IC_from_prior(cfg, depends_on=id)
id = w.update_IC_from_DA(cfg, depends_on=id)
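
Under SLURM, this kind of chaining is commonly expressed with the --dependency option of sbatch. The sketch below only builds the command lines and does not submit anything. sbatch_command, the script names, and the job id are hypothetical, and DART-WRF may construct its submissions differently.

```python
def sbatch_command(script, depends_on=None):
    """Build an sbatch command line, optionally waiting for another job."""
    # --parsable prints the job id (and cluster name, if any) instead of a message
    cmd = ["sbatch", "--parsable"]
    if depends_on is not None:
        # afterok: start only after the given job has finished successfully
        cmd.append(f"--dependency=afterok:{depends_on}")
    cmd.append(script)
    return cmd

# the made-up job id 12345 stands in for the id returned by a real submission
first = sbatch_command("assimilate.sh")
second = sbatch_command("run_wrf.sh", depends_on=12345)

print(" ".join(first))
print(" ".join(second))
```

With depends_on=None, as for the first call in the example above, the job has no dependency and can start as soon as resources are available.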