Smart Initialization Configuration and User's Guide

February 15, 2012

This document describes how smart initialization works, and how it can be extended and modified.

Table of Contents

How Smart Initialization Works
Where are the files that control smart initialization?
Basic File Structure
Working with SCALAR, VECTOR, WEATHER, and DISCRETE data
Modifying an Existing Algorithm
Adding New Algorithms
Adding New Models
Running Smart Initialization from the Command Line
Examples of Complete Smart Initialization Files
Exercises


How Smart Initialization Works

Smart initialization is closely coupled with EDEX. EDEX is aware of D2D model data changes and therefore knows when IFP data grids can be created. Whenever EDEX first starts, or when new D2D model data is detected (which occurs frequently, at intervals of less than 5 minutes), the server examines the serverConfig/localConfig configuration file for the INITMODULES definition. If it finds an entry that matches the updated D2D model, it spawns a process to run smart initialization for that model.

When a smart initialization process is started, it will run a particular class file, either one that has been supplied with the release, or one you have added. The software will examine all of the functions within that class looking for function names that begin with calc***. These functions define the output weather element, e.g., calcT will derive T, and also the dependencies. The dependencies are determined from the argument list for each function.

The smart initialization software then figures out the dependencies for all of the calc*** functions and will determine what must run first, second, third, etc. This is done automatically and the programmer need not be concerned about it. The software also examines the inventory for the dependent grids and the inventory for the output grids to determine if the output grid is already present and no calculations are needed, or whether the output grid needs to be created by running the algorithms.
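The discovery and ordering described above can be sketched in plain Python. This is only an illustration of the idea, not the code in Init.py: the class DemoForecaster, the element name ApparentT, and the helper functions find_calc_methods() and run_order() are all hypothetical.

```python
import inspect


class DemoForecaster:
    """Hypothetical stand-in for a smart init class (not the real Forecaster)."""

    def calcApparentT(self, T, RH):
        return "ApparentT from the output elements T and RH"

    def calcT(self, t_FHAG2, topo):
        return "T from model grids"

    def calcRH(self, rh_FHAG2):
        return "RH from model grids"


def find_calc_methods(obj):
    """Map each output element name to the argument names of its calc function."""
    methods = {}
    for name, func in inspect.getmembers(obj, inspect.ismethod):
        if name.startswith("calc"):
            # signature() of a bound method already omits self
            methods[name[len("calc"):]] = list(inspect.signature(func).parameters)
    return methods


def run_order(methods):
    """Order the elements so each one runs after the outputs it depends on."""
    order, done = [], set()

    def visit(element, stack=()):
        if element in done:
            return
        if element in stack:
            raise ValueError("circular dependency at " + element)
        for arg in methods[element]:
            if arg in methods:  # the argument is itself produced by a calc function
                visit(arg, stack + (element,))
        done.add(element)
        order.append(element)

    for element in sorted(methods):
        visit(element)
    return order


methods = find_calc_methods(DemoForecaster())
order = run_order(methods)
print(order)
```

In this sketch ApparentT is placed after T and RH because its argument list names them, which mirrors how the argument list of a calc*** function declares its dependencies.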

When all the possible dependencies and algorithms have run, the process exits until the next D2D model update.

Smart Initialization algorithms are written in Python, using its numerical extension, numpy. The following sections assume that you know Python, which is covered in the GFESuite Python Tutorial and Programming Guidelines.


Where are the files that control smart initialization?

The standard smart initialization files that are supplied with the release are installed into your release/edex/data/utility/edex_static/base/smartinit directory. No modifications should be made to any of the files in the BASE directory. User-customized files are installed into the release/edex/data/utility/edex_static/site/SITE/smartinit directory. These SITE files are not overwritten during installs and upgrades.

Basic File Structure

The basic structure of a smart initialization file contains a header that is similar to:

from Init import *

class modelForecaster(derivedFromForecaster):
    def __init__(self):
        derivedFromForecaster.__init__(self, "sourceDb", "destDb")

This defines a new class called "modelForecaster", which is derived from "derivedFromForecaster". The __init__ function is the constructor of the class; it calls the constructor of the base class (the class it is derived from) with three arguments: self, "sourceDb", and "destDb". The source db is the name of the D2D database, such as NAM80, NAM40, NAM12, RUC80, GFS40, or gfsLR. The destination database is the name of the output database, such as NAM12, GFS40, or NAM_V. If there is an underscore in the destination database name, then the format is modelname_optionaltype.

A complete example of the header is shown below:

from Init import *

class NAM12Forecaster(Forecaster):
    def __init__(self):
        Forecaster.__init__(self, "NAM12", "NAM12")
 

There is usually a function called levels(), which defines the set of vertical pressure levels that are used when accessing cube data from the models. The levels() function returns a list of level values to use. Make sure that the levels you specify are actually available in the model. A complete example is shown below:

    def levels(self):
        return ["MB950", "MB900","MB850","MB800","MB750",
                "MB700","MB650","MB600","MB550", "MB500",
                "MB450", "MB400", "MB350"]

Following the levels() function are the calculation functions. The names of the calculation functions MUST ALL BEGIN WITH calc. That is how smart initialization determines what to run: it runs all calc*** functions that are defined. The remainder of the function name after "calc" is the name of the parameter to create. For example, if RH is your weather element name, the function to calculate RH would be named calcRH(). The typical format for a calc function is shown below. This one calculates surface temperature from the eta model:

    def calcT(self, t_FHAG2, t_BL3060, p_SFC, stopo, topo):
        dpdz = 287.04 * t_FHAG2 / (p_SFC / 100 * 9.8) # meters / millibar
        # 45 millibars is halfway between 30 and 60
        dpdz = dpdz * 45 # meters between p_SFC and t_BL3060
        lapse = (t_FHAG2 - t_BL3060) / dpdz # degrees / meter
        lapse = clip(lapse, lapse, 0.012)
        t = t_FHAG2 + lapse * (stopo - topo)
        return self.KtoF(t)
 

Here is the same function with more details:

The following grids are passed into this routine: the 2 meter temperature, the boundary layer temperature from the 30-60 hPa layer, and the surface pressure in Pascals from the D2D model database, plus the model surface topography and the high-resolution topography.

    def calcT(self, t_FHAG2, t_BL3060, p_SFC, stopo, topo):

We begin to calculate the lapse rate, but first we need to determine the number of meters between the surface and the 30-60 hPa level.

        dpdz = 287.04 * t_FHAG2 / (p_SFC / 100 * 9.8) # meters / millibar
        # 45 millibars is halfway between 30 and 60
        dpdz = dpdz * 45 # meters between p_SFC and t_BL3060

The lapse rate can now be calculated:

        lapse = (t_FHAG2 - t_BL3060) / dpdz # degrees / meter

We ensure that the lapse rate can't get too steep:

        lapse = clip(lapse, lapse, 0.012)

We calculate the surface temperature, which is the model 2m temperature modified by the lapse rate and the difference between the model surface and the real surface elevation.

        t = t_FHAG2 + lapse * (stopo - topo)

We perform unit conversion and return.

        return self.KtoF(t)
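To see the effect of the terrain correction with numbers, here is a minimal standalone sketch of the same arithmetic using numpy directly. The function names calc_t() and k_to_f() and all sample values are made up for illustration; k_to_f() only stands in for self.KtoF, whose implementation is not shown in this guide.

```python
import numpy as np


def k_to_f(t_kelvin):
    """Kelvin to Fahrenheit; a stand-in for self.KtoF in this sketch."""
    return (t_kelvin - 273.15) * 9.0 / 5.0 + 32.0


def calc_t(t_fhag2, t_bl3060, p_sfc, stopo, topo):
    dpdz = 287.04 * t_fhag2 / (p_sfc / 100 * 9.8)  # meters / millibar
    dpdz = dpdz * 45                               # meters up to the layer midpoint
    lapse = (t_fhag2 - t_bl3060) / dpdz            # degrees / meter
    lapse = np.minimum(lapse, 0.012)               # upper bound only, as in the clip call
    t = t_fhag2 + lapse * (stopo - topo)
    return k_to_f(t)


# Sample values (assumed): 290 K at 2 m, 287 K in the 30-60 layer, 1000 mb surface
# pressure, model terrain at 500 m, and real terrain at 500, 300 and 800 m.
topo = np.array([500.0, 300.0, 800.0])
temps = calc_t(290.0, 287.0, 100000.0, 500.0, topo)
print(temps)
```

Where the real terrain matches the model terrain the 2 meter temperature is unchanged; the point below the model surface comes out warmer and the point above it comes out colder.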

The name of the function is always calc*** where *** is the weather element name and level. If you are creating weather elements for surface data then the weather element name by itself is sufficient. If you are creating weather elements for upper air or non-surface data, then the name of the calc function is:  calc***_***, such as calcWind_3K() for the Wind at 3000 feet.

Note the argument list for the calc*** functions. The first argument is always self. The remaining arguments represent gridded data. Each argument can be specified in one of the following formats:

Format: EditAreaName
Example: Colorado
Purpose: The name of an edit area. It is probably best to use polygon edit areas instead of queries (untested with queries). The value of the parameter, in this case Colorado, will be a boolean grid suitable for use as a mask for numeric functions.

Format: parmName_level
Example: t_FHAG2
Purpose: Refers to a single grid for the parmName and the level. The example accesses the temperature grid from the model that is at the FHAG2 (fixed height above ground, 2 m) level.

Format: parmName_c
Example: rh_c
Purpose: Refers to a cube of data for the parmName. The "_c" indicates the cube. The number of layers in the cube depends upon the contents of the levels() function. For example, if levels() contains:

    def levels(self):
        return ["MB950", "MB900","MB850","MB800","MB750",
                "MB700","MB650","MB600","MB550", "MB500",
                "MB450", "MB400", "MB350"]

then the cube will contain 13 levels. Individual levels can be accessed by indexing within the function.

Format: topo
Example: topo
Purpose: Refers to the high-resolution surface topography field, in units of meters above MSL.

Format: stopo
Example: stopo
Purpose: Refers to the model topography field, in units of meters above MSL.

Format: ctime
Example: ctime
Purpose: Time from the source database grid currently being calculated, as a time range tuple (startTime, endTime), in seconds since January 1, 1970 at 0000z.

Format: mtime
Example: mtime
Purpose: Time in the destination database grid currently being calculated, as a time range tuple (startTime, endTime), in seconds since January 1, 1970 at 0000z.

Format: stime
Example: stime
Purpose: Offset of the grid currently being calculated from the model base time, in seconds since the model analysis time.

Format: parmName
Example: FzLevel
Purpose: Refers to the weather element in the OUTPUT database, not the INPUT D2D database.
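As a rough illustration of working with a cube and an edit area mask, the sketch below builds fake data with numpy and picks one level out of it. It assumes that the layers of the cube are in the same order as the list returned by levels(); the grid size, the fake rh_c values, and the colorado mask are invented for the example.

```python
import numpy as np

LEVELS = ["MB950", "MB900", "MB850", "MB800", "MB750",
          "MB700", "MB650", "MB600", "MB550", "MB500",
          "MB450", "MB400", "MB350"]

# Fake rh_c cube: one small 2-d grid per level, filled with the level's index
rows, cols = 3, 4
rh_c = np.stack([np.full((rows, cols), float(i)) for i in range(len(LEVELS))])

# Pick the 850 mb layer by its position in the levels list (assumed ordering)
rh_850 = rh_c[LEVELS.index("MB850")]

# Fake edit area argument: a boolean grid that is True inside the area
colorado = np.zeros((rows, cols), dtype=bool)
colorado[0, :] = True

# Keep the 850 mb values inside the edit area and zero elsewhere
masked = np.where(colorado, rh_850, 0.0)
print(rh_c.shape, rh_850.shape)
```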

You can place additional functions (e.g., utility functions) anywhere in the file after the constructor (__init__) and before the tail end of the file. An example of a utility function could be one to calculate Td from T and RH as shown below:

    def getTd(self, t, rh):
        # input/output in degrees K.
        desat = clip(t, 0, 373.15)
        desat = where(less(t, 173.15), 3.777647e-05, t)
        desat = exp(26.660820 - desat * 0.0091379024 - 6106.3960 / desat)
        desat = desat * rh / 100.0
        desat = 26.66082 - log(desat)
        td = (desat - sqrt(desat*desat-223.1986))/0.0182758048
        td = where(greater(td, t), t, td)
        return td
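A utility function like this can be checked outside of smart initialization. The sketch below is a standalone numpy version of the same arithmetic, with the hypothetical name get_td(); it leaves out the initial clip call, whose result is overwritten by the next line in getTd(). The sample temperatures are arbitrary.

```python
import numpy as np


def get_td(t, rh):
    """Dew point (K) from temperature (K) and relative humidity (%)."""
    desat = np.where(np.less(t, 173.15), 3.777647e-05, t)
    # saturation vapor pressure, then the actual vapor pressure
    desat = np.exp(26.660820 - desat * 0.0091379024 - 6106.3960 / desat)
    desat = desat * rh / 100.0
    # invert the saturation formula to get the dew point
    desat = 26.66082 - np.log(desat)
    td = (desat - np.sqrt(desat * desat - 223.1986)) / 0.0182758048
    # the dew point can never exceed the temperature
    td = np.where(np.greater(td, t), t, td)
    return td


t = np.array([[280.0, 290.0], [300.0, 310.0]])
saturated = get_td(t, np.full_like(t, 100.0))
half = get_td(t, np.full_like(t, 50.0))
print(saturated - t)
print(half)
```

At 100% relative humidity the dew point should equal the temperature to within rounding, and at 50% it should be lower, which is a quick sanity check before using the function inside a calc*** function.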

The tail end of the file contains a definition of main() and must be similar to that below:

def main():
    modelForecaster().run()

Here is an example of a real tail to the file. The name of the class within the main() function must match the name of the class you have defined in the header:

def main():
    NAM12Forecaster().run()


Working with SCALAR, VECTOR, WEATHER, and DISCRETE Data

This section describes accessing scalar, vector, weather, and discrete data through numerical Python (numpy).

SCALAR


When passing in a weather element that is scalar, you will either get a grid, or a cube. The grid is a numeric 2-d grid (x,y), the cube is a numeric 3-d grid (z,x,y).

VECTOR

For a single vector grid (single level), you get a tuple. The first element is a grid of magnitude, the second is a grid of direction. To access the magnitude grid, use this syntax: wind_SFC[0], and for direction, use this syntax: wind_SFC[1]. Once you access either the magnitude or direction grid, they are treated like a scalar grid.

There are several "utility" functions in Init.py (located in your release/edex/data/utility/edex_static/base/smartinit directory) that can help you when working with vector data. The self._getUV(mag, dir) call converts magnitude and direction grids into a returned tuple of u and v. The u component is [0] and the v component is [1]. The self._getMD(u, v) call converts u and v component grids into a tuple of magnitude and direction. The magnitude component is [0] and the direction component is [1].
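The kind of conversion these utilities perform can be sketched with numpy. The functions to_uv() and to_md() below are hypothetical and are not the code in Init.py. In particular, the sign convention used here (direction is the direction the wind blows from, in degrees) is an assumption; check Init.py for the convention that _getUV and _getMD actually use.

```python
import numpy as np


def to_uv(mag, direction):
    """Magnitude/direction grids to (u, v), direction = where the wind comes from."""
    rad = np.radians(direction)
    return (-mag * np.sin(rad), -mag * np.cos(rad))


def to_md(u, v):
    """(u, v) grids back to (magnitude, direction in degrees, 0 to 360)."""
    mag = np.hypot(u, v)
    direction = np.degrees(np.arctan2(-u, -v)) % 360.0
    return (mag, direction)


# A vector element arrives as a tuple: [0] is magnitude, [1] is direction
wind_SFC = (np.array([10.0, 5.0]), np.array([270.0, 180.0]))

u, v = to_uv(wind_SFC[0], wind_SFC[1])
mag, direction = to_md(u, v)
print(u, v)
print(mag, direction)
```

Working in u and v is convenient when winds need to be averaged or interpolated, since directions near 0 and 360 degrees cannot be averaged directly.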

WEATHER

Weather is much more complicated and can be a big performance problem. A tuple is provided. The first element is a grid, which contains the indexes into the key. The second element is a sequence of all of the keys. The keys are the ugly strings associated with a WeatherKey. To access the grid:

Wx[0]

To access the sequence, use:

Wx[1]

To access a particular entry in the sequence, use:

Wx[1][3], which would give you the 4th key.

Normally you don't access the weather grid in your calculations, but if you need to, you have generally created a weather grid first. In smart initialization, we know all of the possible weather keys that can be created and set up a table with those keys, then we simply poke in the correct index for the key. This is much more efficient than searching the keys for each grid point.
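The "table of keys plus index grid" approach can be sketched as follows. The key strings, thresholds, and sample grids are illustrative assumptions, not the table used by any supplied smart initialization file.

```python
import numpy as np

# Table of every key this algorithm can produce (illustrative strings)
keys = ["<NoCov>:<NoWx>:<NoInten>:<NoVis>:",  # index 0: no weather
        "Wide:R:-:<NoVis>:",                  # index 1: rain
        "Wide:S:-:<NoVis>:"]                  # index 2: snow

# Sample input grids (assumed values): temperature in degrees F, QPF in inches
T = np.array([[25.0, 31.0], [36.0, 40.0]])
QPF = np.array([[0.10, 0.00], [0.05, 0.00]])

# Start with "no weather" everywhere, then poke in the index of the right key
wx_grid = np.zeros(T.shape, dtype=np.int8)
wet = QPF > 0.0
wx_grid[wet & (T > 32.0)] = 1
wx_grid[wet & (T <= 32.0)] = 2

# The weather element is the tuple (index grid, sequence of keys)
Wx = (wx_grid, keys)
print(Wx[0])
print(Wx[1][Wx[0][0, 0]])
```

Only whole-grid mask operations are used, so no key strings are compared or searched at individual grid points.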

DISCRETE

Discrete is much more complicated than the simple scalar and vector case, and like weather, can be a big performance problem. A tuple is provided. The first element is a grid, which contains the indexes into the key. The second element is a sequence of all of the keys. The keys are the discrete key values associated with the weather element. To access the grid:

DK[0]

To access the sequence, use:

DK[1]

To access a particular entry in the sequence, use:

DK[1][3], which would give you the 4th key.

Normally you don't access the discrete grid in your calculations, but if you need to, you have generally created a discrete grid first. In smart initialization, we know all of the possible discrete keys that can be created and set up a table with those keys, then we simply poke in the correct index for the key. This is much more efficient than searching the keys for each grid point.


Modifying an Existing Algorithm

The basic procedure to modify an existing algorithm is illustrated by the following example, which overrides the NAM12 derivation of Snow Amount. The original NAM12.py file contains the following information (only part of the file is shown). The snow amount function calculates the snow ratio, which varies depending upon temperature, and then assigns the snow amount based on the snow ratio and QPF where the snow level (freezing level - 1000) is at or below the topography, as tested by the where() call in calcSnowAmt():

from Init import *

class NAM12Forecaster(Forecaster):
    def __init__(self):
        Forecaster.__init__(self, "NAM12", "NAM12")

    def levels(self):
        return ["MB950", "MB900","MB850","MB800","MB750",
                "MB700","MB650","MB600","MB550", "MB500",
                "MB450", "MB400", "MB350"]

    def calcSnowAmt(self, T, FzLevel, QPF, topo):
        m1 = less(T, 9)
        m2 = greater_equal(T, 30)
        snowr = T * -0.5 + 22.5
        snowr = where(m1, 20, snowr)
        snowr = where(m2, 0, snowr)
        snowamt = where(less_equal(FzLevel - 1000, topo / 0.3048), snowr * QPF, 0)
        return snowamt

def main():
    NAM12Forecaster().run()

Here is the derived MyNAM12 file that overrides the calcSnowAmt() function:

from NAM12 import *

class MyNAM12Forecaster(NAM12Forecaster):
    def __init__(self):
        NAM12Forecaster.__init__(self)

    def calcSnowAmt(self, T, QPF):
        m2 = less_equal(T, 32)
        snowamt = where(m2, 10.0 * QPF, 0)
        return snowamt

def main():
    MyNAM12Forecaster().run()

The algorithm was changed to use a fixed 10:1 snow ratio whenever the temperature is at or below 32. The freezing level is no longer used in this revision. Of course you can completely rewrite the algorithm, use different arguments, etc. Note that the name of the function, calcSnowAmt() in this case, is identical to the name in the original file. This is important!
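Why the identical name matters can be shown with ordinary Python inheritance. In the sketch below all class names are hypothetical stand-ins (BaseForecaster plays the role of NAM12Forecaster), and calc_names() is just a helper for listing the calc*** functions a class ends up with.

```python
class BaseForecaster:
    """Hypothetical stand-in for the supplied forecaster class."""

    def calcSnowAmt(self, T, FzLevel, QPF, topo):
        return "original algorithm"


class SameName(BaseForecaster):
    # Same name as in the base class: this replaces the original function
    def calcSnowAmt(self, T, QPF):
        return "replacement algorithm"


class DifferentName(BaseForecaster):
    # Different name: the original calcSnowAmt is inherited unchanged
    def calcSnowAmount(self, T, QPF):
        return "replacement algorithm"


def calc_names(obj):
    """List the names of all calc functions visible on the object."""
    return sorted(name for name in dir(obj) if name.startswith("calc"))


print(calc_names(SameName()))
print(calc_names(DifferentName()))
```

With the same name there is a single calcSnowAmt, the new one. With a different name the original calcSnowAmt is still present and would still be found alongside the new function.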


Adding New Algorithms

Adding a new algorithm is much the same as Modifying an Existing Algorithm, and the steps are similar. Here is an example of a MyNAM12.py initialization file that creates a new weather element called RH for the NAM model. It does nothing more than take the model RH FHAG2 field and store it into the NAM12 RH weather element, after ensuring that the field ranges between 0 and 100%:

from NAM12 import *

class MyNAM12Forecaster(NAM12Forecaster):
    def __init__(self):
        NAM12Forecaster.__init__(self)

    def calcRH(self, rh_FHAG2):
        return clip(rh_FHAG2, 0, 100)

def main():
    MyNAM12Forecaster().run()


Adding New Models

Adding new models, and all of the algorithms to derive the elements, isn't much different from Adding New Algorithms or Modifying an Existing Algorithm. Here is a complete example of creating WaveHeight and Surface Wind from the GWW model. The wave height logic catches values that are very high, assumes that they are missing data, and resets them to zero height. There is also a conversion from meters to feet. The wind logic also catches values that are very high, assumes that they are missing data, and resets the winds to calm. There is also a conversion from meters/second to knots:

from Init import *
class GWWForecaster(Forecaster):
    def __init__(self):
        Forecaster.__init__(self, "GWW", "GWW")

    def calcWaveHeight(self, htsgw_SFC):
        grid = where(greater(htsgw_SFC, 50), 0.0, htsgw_SFC/0.3048)
        return clip(grid, 0, 100)

    def calcWind(self, wind_SFC):
        mag = where(greater(wind_SFC[0], 50), 0, wind_SFC[0]*1.94)
        dir = where(greater(wind_SFC[0], 50), 0, wind_SFC[1])
        dir = clip(dir, 0, 359.5)
        return (mag, dir)

def main():
    GWWForecaster().run()


Running Smart Initialization from the Command Line

Normally you don't need to run smart initialization from the command line. Smart initialization is automatically started by EDEX when new D2D model data arrives.

In the event that you do want to run smart initialization from the command line, which you probably would when you are developing new algorithms, here is the proper syntax:

ifpInit [-h host] [-p port] [-t modelTime] [-s site] [-u userid] [-a] algorithmFile

Switch: -h host
Optional? yes
Purpose: Defines the host upon which EDEX is running. Normally this switch is not needed and the software will determine where EDEX is running.

Switch: -p port
Optional? yes
Purpose: Defines the RPC port upon which EDEX is running. Normally this switch is not needed and the software will determine where EDEX is running.

Switch: -t modelTime
Optional? yes
Purpose: Specifies the model run time in the format of yyyymmdd_hhmm. If not specified, then run using the latest model data.

Switch: -s site
Optional? no
Purpose: Specifies the site id for whom to run the init.

Switch: -u userid
Optional? yes
Purpose: Specifies the user id who is executing the init.

Switch: -a
Optional? yes
Purpose: Specifies to create all of the possible data grids, which will overwrite existing previously calculated grids. Normally by default, only those grids that haven't yet been created will be attempted to be calculated.

Switch: algorithmFile
Optional? no
Purpose: Mandatory argument specifying the name of the smart initialization module, such as NAM12, or MyNAM12.

Note: The -h and -p switches are predefined to match your AWIPS installation, such that they will point to EDEX specified on installation. Thus the -h and -p switches are not necessary for normal running of this program. However, if you wish to connect to a different server, then you will need to specify the -h and -p switches.
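If you run ifpInit repeatedly while developing, a small helper can assemble the argument list from the switches described above. This is a sketch only: build_ifpinit_command() is a hypothetical helper, the site id "OAX" and the model time are made-up sample values, and the code only builds and prints the command without running it.

```python
from datetime import datetime


def build_ifpinit_command(algorithm_file, site, model_time=None,
                          host=None, port=None, user=None, create_all=False):
    """Build the ifpInit argument list in the order shown in the syntax line."""
    cmd = ["ifpInit"]
    if host is not None:
        cmd += ["-h", host]
    if port is not None:
        cmd += ["-p", str(port)]
    if model_time is not None:
        # -t expects the model run time as yyyymmdd_hhmm
        cmd += ["-t", model_time.strftime("%Y%m%d_%H%M")]
    cmd += ["-s", site]
    if user is not None:
        cmd += ["-u", user]
    if create_all:
        cmd.append("-a")  # recreate grids that already exist
    cmd.append(algorithm_file)
    return cmd


cmd = build_ifpinit_command("MyNAM12", "OAX",
                            model_time=datetime(2012, 2, 15, 12, 0),
                            create_all=True)
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run() on a system where ifpInit is installed and on the path.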


Examples of Complete Smart Initialization Files


Here is an example of the NAM12.py smart initialization file that is provided (or similar to what is provided).

Here is an example of a modification to the NAM12.py called MyNAM12.py which modifies the snow amount calculation, and adds the surface relative humidity field.

Here is an example of adding a new model, the GWW model, to calculate wave height and wind.


Exercises

Several smart initialization exercises are available here.