Shyft is a cross-platform, open-source toolbox developed at Statkraft.
The overall goal of the toolbox is to provide Python-enabled, high-performance components of operational quality.
This allows model experts in the business domain, scientists at institutes/universities, and professional programmers to cooperate efficiently, maximizing the IT support in the energy-market domain. Once improved functionality is implemented and tested, it can be released for use.
Shyft has rolling releases: improvements are shipped as soon as testing proves the improvement and its quality.
The Shyft software components are in active 24x7 operation at Statkraft, and are actively maintained and developed.
Some of our tools and libraries, like the time-series package, will work nicely for other domains as well.
Currently the toolbox includes the major components as described in the next sections.
If your primary interest is hydrology forecasting models and algorithms, skip to the Hydrology section, although the Time-series section might be useful while working with hydrology.
If your primary interest is in the generic high performance time-series engine, you would benefit from reading the intro presented here.
If you are interested in the energy-market model (power-market, pure energy and hydro-power), then the energy-market model section would be useful, as well as the time-series section.
It allows you to work with time-series easily, covering everything from storage to advanced distributed server-side evaluation of large expressions.
It allows you to write time-series expressions, as you would in numpy, using scalars, time-series, or vectors of time-series.
You can express yourself in natural Python, and get scalable, high-performance expressions like:
```python
a = TimeSeries(time_axis, values, point_interpretation)
b = TimeSeries(time_axis, values, point_interpretation)
c = a*2.5 + b.pow(a) - 1000.0  # this works; lazy eval, takes care of differing time-resolutions etc.
my_plot(c.values.to_numpy())   # you can extract numpy values from the expression
e = TimeSeries('shyft://prod/price/no_1')
p = TimeSeries('shyft://prod/total_mw')
ta_2018 = TimeAxis(time('2018-01-01T00:00:00Z'), deltahours(24), 365)
m = e*p.accumulate(ta_2018)    # this also works: a symbolic expression
dtsc = DtsClient('dtss_host:20000')
mr = dtsc.evaluate(TsVector([m, e, p]), ta_2018.total_period())  # vector eval; get back server-side evaluated results
```
You can create, store and update server-side time-series, and use those time-series in your expressions.
```python
# On the server side!
def start_the_dtss_server(port: int = 20000) -> DtsServer:
    """These few lines start a HPC ts server on the given port (it could be your laptop!)"""
    dtss = DtsServer()
    dtss.set_container('prod', '/mnt/tsdb/prod')
    dtss.set_port(port)
    dtss.start_async()
    return dtss

# Anywhere on your network (remember to open the firewall for port 20000 etc.)
dtsc = DtsClient('dtss_host:20000')
tsv_to_store = TsVector([
    TimeSeries(shyft_ts_url('prod', 'price/no1'), TimeSeries(ta, values, stair_case)),   # a ts with url and payload data
    TimeSeries(shyft_ts_url('prod', 'total_mw'), TimeSeries(ta, values, stair_case)),    # a ts with url and payload data
])
dtsc.store_ts(tsv_to_store)  # Done!
# Now have fun! As in the example above, you can use symbolic expressions
# referencing these time-series for server-side evaluation.
```
The DTSS is easily extensible, by Python!
On the server side, you can register your own methods for the read, write and find time-series operations, dispatched on the pattern of the Shyft time-series url. Urls that start with shyft://... are handled internally, using the local high-performance store; the other ts-urls are grouped together and forwarded to your Python code. Most likely you already have a legacy system with a Python API, so this is easy to do.
This allow you to integrate with any backend, legacy system or computational system that you might have.
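The url-dispatch idea can be illustrated in plain Python. This is a conceptual sketch only, not the actual DTSS callback API (consult the Shyft documentation for the real hook names):

```python
def split_ts_urls(ts_urls):
    """Group time-series urls the way the DTSS does: shyft:// urls are served
    from the internal high-performance store, everything else is forwarded
    to user-registered callbacks (e.g. a legacy-system adapter)."""
    internal = [u for u in ts_urls if u.startswith('shyft://')]
    external = [u for u in ts_urls if not u.startswith('shyft://')]
    return internal, external

internal, external = split_ts_urls([
    'shyft://prod/price/no1',   # handled by the internal store
    'legacy://db1/total_mw',    # forwarded to your python read-callback
])
```

The same prefix-dispatch applies to read, store and find requests, so one adapter covers the whole legacy backend.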
Do you have a slow-performing legacy time-series database?
Bring your time-series data to life using Python and the Shyft DTSS!
Typical read/write speeds on the server side are close to system performance, typically 100..1000 GBytes/sec for typical SSD and NVMe drives. The computational speed is comparable to that of multicore matrix libraries.
The DTSS supports caching of time-series, giving you in-memory speed for computations; production servers would typically keep 250 GB of cache (that's 25 Giga-points of time-series float data!). Time is valuable, memory is cheap!
In most scenarios, with a single writer and multiple readers, the Shyft DTSS supports cache-on-write, so your clients always get fresh data, evaluated at multicore in-memory performance.
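The cache-on-write idea can be sketched in plain Python (conceptual only; the real DTSS cache is implemented in C++):

```python
class CacheOnWriteStore:
    """Single-writer/multiple-reader sketch: every write refreshes the cache,
    so readers always see fresh data at in-memory speed."""

    def __init__(self):
        self._backend = {}   # stands in for the slow persistent store
        self._cache = {}     # stands in for the in-memory ts cache

    def store(self, ts_url, values):
        self._backend[ts_url] = values   # persist
        self._cache[ts_url] = values     # cache-on-write: cache is never stale

    def read(self, ts_url):
        if ts_url in self._cache:        # cache hit: in-memory speed
            return self._cache[ts_url]
        values = self._backend[ts_url]   # cache miss: fall back to the store
        self._cache[ts_url] = values
        return values
```

With a single writer, the cache can never serve stale values, which is why no invalidation protocol is needed in this scenario.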
One of the success stories at Statkraft is that we use a model-driven architecture, deriving the expressions from the models.
The hydrologic forecasting models follow the paradigm of distributed, lumped-parameter models, with recent developments introducing more physically based / process-level methods.
The energy-market model framework provides fundamental tools for building, storing and maintaining energy-market models.
As mentioned earlier, the model-driven approach, combining fundamental models with various algorithms, is a key factor for business-driven development, along with the Python-enabled architecture on both the server and the client side.
The energy-market model, at bird's-eye view, contains the electrical grid with consumers and producers within areas. The areas are typically partitioned according to power-grid transmission-line capacity, or political/country strategies.
At the more detailed level, within a model-area, there are details for each producer/consumer, the power-modules. Between the areas there are power-lines with capacities and regulations. For areas that have hydro-power, and maybe are even hydro-power dominated, there is a quite detailed description of each hydro-power system, with its reservoirs, tunnels/rivers, aggregates and power-stations.
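Purely as an illustration (all class names here are invented for this sketch; the real Shyft energy-market API differs), the hierarchy described above could look like:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Reservoir:               # part of a hydro-power system
    name: str

@dataclass
class HydroPowerSystem:        # reservoirs, tunnels/rivers, aggregates, stations
    name: str
    reservoirs: List[Reservoir] = field(default_factory=list)

@dataclass
class PowerModule:             # a producer or consumer inside an area
    name: str
    capacity_mw: float

@dataclass
class PowerLine:               # connects two areas, limited by transmission capacity
    from_area: str
    to_area: str
    capacity_mw: float

@dataclass
class Area:                    # a model-area in the partitioned grid
    name: str
    modules: List[PowerModule] = field(default_factory=list)
    hydro_power_systems: List[HydroPowerSystem] = field(default_factory=list)
```

The point is the containment structure: areas hold modules and hydro-power systems, while power-lines live between areas.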
For hydro-power systems, a detailed model, suitable for day-to-day planning, bid-process, optimization, and daily operation and balance follow up is available.
The detail level of this hydro-power model also allows for estimation of inflow from the catchments surrounding the hydro-power system.
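Such an inflow estimate is, in essence, a water-balance computation over a period: inflow equals the storage change per second plus the metered outflow. A minimal, illustrative sketch (not part of Shyft; names and units are assumptions):

```python
def estimate_inflow_m3s(storage_start_m3: float, storage_end_m3: float,
                        outflow_m3s: float, dt_s: float) -> float:
    """Estimate mean inflow over a period from the reservoir water balance:
    inflow = storage change per second + metered outflow
    (discharge, gate-flow, spill)."""
    return (storage_end_m3 - storage_start_m3) / dt_s + outflow_m3s

# A reservoir that rose 86_400 m3 over one day while releasing 10 m3/s
# implies a mean inflow of 11 m3/s over that day.
estimate_inflow_m3s(1_000_000.0, 1_086_400.0, 10.0, 86_400.0)  # → 11.0
```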
The energy-market model does not currently provide algorithms for optimization, simulation or historical inflow estimates based on metered production, gate-flow and reservoir levels.
Rather, it provides a high-performance, Python-enabled framework from which IT-vendors and IT-suppliers can collect their data, feeding it into their own proprietary algorithms to do the needed computations.
This way, we hope that highly competent and skilled companies and institutes can focus on the algorithms, and let the customers (companies that produce/use electrical power) handle, keep and provide their data.
We would like to cooperate closely with the vendors of algorithms, to ease integration so that we can provide the best possible product to the end users, researchers and analysts.
Contributions that allow end-users to test the algorithms, using the energy-market model to harvest data for the algorithms, are very welcome.
Other contributions and integrations, e.g. a presentation layer, are also welcome.
Shyft's primary end-user documentation is at Shyft readthedocs, where you will find instructions for installing Shyft and getting up and running with the tools it provides.
We also maintain this README file with basic instructions for building Shyft from a developer perspective.
Copyright (C) Sigbjørn Helset (SiH), John F. Burkhart (JFB), Ola Skavhaug (OS), Yisak Sultan Abdella (YAS), Statkraft AS
Contributors and current project participants include:
Shyft is released under LGPL v3; see LICENCE.
First-time users, and those interested in learning how to use Shyft for hydrologic simulation, are strongly encouraged to see Shyft at readthedocs.
Shyft is distributed in three separate code repositories. This repository, shyft, provides the main code base. A second repository (required for tests) is located at shyft-data. A third repository, shyft-doc, contains example notebooks and tutorials. The three repositories are assumed to be checked out in parallel into a common workspace directory:
```shell
mkdir shyft_workspace && cd shyft_workspace
export SHYFT_WORKSPACE=`pwd`
git clone https://gitlab.com/shyft-os/shyft.git
git clone https://gitlab.com/shyft-os/shyft-data.git
git clone https://gitlab.com/shyft-os/shyft-doc.git
```
For compiling and running Shyft, you will need:
In addition, a series of Python packages are needed mainly for running the tests. These can be easily installed via:
$ pip install -r requirements.txt
or, if you are using conda (see below):
$ cat requirements.txt | xargs conda install
Please refer to our Python Installation Guide.
NOTE: the build/compile instructions below have mainly been tested on Linux platforms. Shyft can also be compiled for Windows (and is actively maintained there), but those build instructions are not covered here (yet).
NOTE: the requirement for a modern compiler generally means gcc-7 is needed to build Shyft.
You can compile Shyft by using the typical procedure for Python packages.
Shyft currently uses boost, dlib, armadillo and doctest to build the Python extensions.
The dependencies can be provided as per standard on your Linux system, or built from source following the standard build recipes of the above-mentioned libraries.
We supply scripts to automate the build-from source strategy:
```shell
shyft/build_support/build_dependencies.sh      # linux
shyft/build_support/win_build_dependencies.sh  # windows
```
You should execute the build_dependencies.sh script just after the initial checkout or refresh, prior to building the Python extensions. The script will download and build the required dependencies into a shyft_dependencies directory in parallel with the shyft directory.
The Linux build will also download miniconda with the packages required for the shyft_env, in parallel with the shyft directory, effectively giving a complete sandboxed Shyft development setup.
You should then prepend miniconda/bin to your PATH prior to working with Shyft, to ensure that the correct Python interpreter is picked up.
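For example (the workspace path is illustrative, assuming the layout produced by the clone commands above):

```shell
# Prepend the sandboxed miniconda (installed next to the shyft repo by
# build_dependencies.sh) so its python interpreter wins over the system one.
export SHYFT_WORKSPACE="${SHYFT_WORKSPACE:-$HOME/shyft_workspace}"  # assumed workspace location
export PATH="$SHYFT_WORKSPACE/miniconda/bin:$PATH"
```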
When you call setup.py, the script will call cmake. If the dependencies exist in the aforementioned directory, they will be used; otherwise cmake will attempt to locate the libraries on the system.
```shell
pip install -r requirements.txt
python setup.py build_ext --inplace
```
NOTE: If you haven't set env_vars as part of your conda environment, then you need to do the following:
```shell
# assumes you are still in the shyft_workspace directory containing
# the git repositories
bash shyft/build_support/build_dependencies.sh
. $SHYFT_WORKSPACE/miniconda/etc/profile.d/conda.sh
conda activate base
export LD_LIBRARY_PATH=$SHYFT_WORKSPACE/shyft_dependencies/lib
cd shyft  # the shyft repository
python setup.py build_ext --inplace
```
It is recommended to at least run a few of the tests after building. This will ensure your paths and environment variables are set correctly.
The quickest and easiest test to run is:
```shell
python -c "from shyft import api"
```
To run further tests, see the TESTING section below.
If the tests above run, then you can simply install Shyft using:
cd $SHYFT_WORKSPACE/shyft python setup.py install
Although (at least on Linux) the setup.py method above uses the CMake building tool behind the scenes, you can also compile Shyft manually (in fact, if you plan to develop Shyft, this may be recommended, because you will be able to run the integrated C++ tests). The steps are the usual ones:
```shell
$ cd $SHYFT_WORKSPACE/shyft
$ mkdir build
$ cd build
$ cmake ..      # configuration step; or "ccmake .." for curses interface
$ make -j 4     # do the actual compilation of C++ sources (using 4 processes)
$ make install  # install python extensions into the shyft python source tree
```
We have the beast compiled by now. For testing:
```shell
$ export LD_LIBRARY_PATH=$SHYFT_WORKSPACE/shyft_dependencies/lib
$ make test                    # run the C++ tests
$ export PYTHONPATH=$SHYFT_WORKSPACE/shyft
$ pytest ../test_suites        # run the Python tests
```
If all the tests pass, then you have a fully functional instance of Shyft. In case this directory is going to act as a long-term installation, it is recommended to persist your $PYTHONPATH environment variable (in your shell profile, or using the conda env_vars described above).
The way to test Shyft is by running:
$ pytest test_suites
from the root shyft repository directory.
The test suite is comprehensive; in addition to unit tests covering the C++ and Python parts, it also covers integration tests with netcdf and geo-services.
Shyft tests are meant to be run from the source directory. As a start, you can run the Python api test suite by:
```shell
cd $SHYFT_WORKSPACE/shyft
pytest test_suites/api
```
To conduct further testing and to run direct C++ tests, you need to be sure you have the shyft-data repository as a sibling of the shyft repository directory.
To run some of the C++ core tests you can try the following:
```shell
cd $SHYFT_WORKSPACE/shyft/build/test
make test
```