python-synthetica

Generate synthetic time series data.


License
BSD-1-Clause
Install
pip install python-synthetica==0.1.1


Overview
Open Source BSD 3-clause
Table of Contents
  1. About The Project
  2. Installation
  3. Getting Started
  4. Contributing
  5. License

About The Project

Introduction

Synthetica is a tool for generating synthetic time series data. Whether you work on financial modeling, IoT data simulation, or any other project that needs realistic correlated or uncorrelated signals, Synthetica provides customizable synthetic datasets. It relies on statistical techniques and machine learning algorithms to produce data that closely replicates the characteristics and patterns of real-world series.

The project's latest version ships a wide range of models for generating synthetic time series data. It includes the following models:

  • GeometricBrownianMotion
  • AutoRegressive
  • NARMA
  • Heston
  • CIR
  • LevyStable
  • MeanReverting
  • Merton
  • Poisson
  • Seasonal

The SyntheticaAdvanced version extends these capabilities with more sophisticated, data-driven deep learning algorithms such as TimeGAN.

Built With

  • numpy = "^1.26.4"
  • pandas = "^2.2.2"
  • scipy = "^1.13.1"

Installation

$ pip install python-synthetica

Getting Started

Once you have installed the package, you can start using Synthetica to generate synthetic time series data. The steps below show a first example:

>>> import synthetica as sth

In this example, we are using the following parameters for illustration purposes:

  • length=252: The length of the time series
  • num_paths=5: The number of paths to generate
  • seed=123: Reseeds the numpy singleton RandomState instance for reproducibility

Initialize the model: This example uses the GeometricBrownianMotion (GBM) model, initialized with a specified path length, number of paths, and a fixed random seed:

>>> model = sth.GeometricBrownianMotion(length=252, num_paths=5, seed=123)

Generate random signals: The transform method then generates the random signals accordingly:

>>> model.transform() # Generate random signals

chart-1
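
To illustrate what a geometric Brownian motion generator does conceptually, here is a minimal NumPy sketch. It is not Synthetica's implementation: the function name `simulate_gbm`, the drift `mu`, the volatility `sigma`, the starting value `s0`, and the choice of one unit of time for the whole series are all assumptions made for this example.

```python
import numpy as np


def simulate_gbm(length=252, num_paths=5, mu=0.05, sigma=0.2, s0=1.0, seed=123):
    """Hypothetical helper: simulate GBM paths, shape (length, num_paths)."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / length  # assumption: the whole series spans one unit of time

    # Independent standard normal shocks, one column per path
    shocks = rng.standard_normal((length, num_paths))

    # Exact log-return of a GBM over a step of size dt
    log_returns = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * shocks

    # Cumulative sum of log-returns, exponentiated, gives the price paths
    return s0 * np.exp(np.cumsum(log_returns, axis=0))


paths = simulate_gbm()
print(paths.shape)
```

Because the paths are exponentials of cumulative sums, every simulated value stays strictly positive, which is the main reason GBM is a common choice for price-like series.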

Generate correlated paths: Passing a matrix to transform imposes the desired correlation structure on the generated paths by means of the Cholesky decomposition. In this example the matrix is chosen so that the resulting features are highly positively correlated:

>>> model.transform(matrix) # Produces highly positively correlated features

chart-2
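
The Cholesky step can be sketched with plain NumPy. This is an illustration of the general technique, not Synthetica's internal code, and the `target` matrix below is a hypothetical example. If `target` factors as L Lᵀ and z has identity covariance, then L z has covariance L I Lᵀ, which equals `target`.

```python
import numpy as np

rng = np.random.default_rng(123)

# Hypothetical target correlation matrix for three strongly correlated series
target = np.array([
    [1.00, 0.90, 0.80],
    [0.90, 1.00, 0.85],
    [0.80, 0.85, 1.00],
])

# Lower-triangular factor such that chol @ chol.T == target
chol = np.linalg.cholesky(target)

# Uncorrelated standard normal shocks, shape (n_observations, n_series)
shocks = rng.standard_normal((100_000, 3))

# Each row becomes L z, so the columns inherit the target correlation
correlated = shocks @ chol.T

# Empirical correlation of the transformed shocks, close to the target
empirical = np.corrcoef(correlated, rowvar=False)
print(np.round(empirical, 2))
```

`np.linalg.cholesky` raises `LinAlgError` when its input is not positive definite, which is why the properties discussed in the next section matter.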

Positive Definiteness

What positive definite means in a covariance matrix

A covariance matrix is considered positive definite if it satisfies the following key properties:

  1. It is symmetric, meaning the matrix is equal to its transpose.
  2. For any non-zero vector $x$, $x^T C x > 0$, where $C$ is the covariance matrix and $x^T$ is the transpose of $x$.
  3. All of its eigenvalues are strictly positive.
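
The properties above can be checked numerically. The following is a minimal sketch; the helper name `is_positive_definite` and the tolerance `tol` are assumptions for this example, not part of Synthetica's API.

```python
import numpy as np


def is_positive_definite(matrix, tol=1e-12):
    """Hypothetical helper: True if matrix is symmetric with eigenvalues > tol."""
    matrix = np.asarray(matrix, dtype=float)

    # Must be a square two-dimensional array
    if matrix.ndim != 2 or matrix.shape[0] != matrix.shape[1]:
        return False

    # Property 1: symmetry
    if not np.allclose(matrix, matrix.T):
        return False

    # Property 3: all eigenvalues strictly positive
    # (eigvalsh is the eigenvalue routine for symmetric matrices)
    eigenvalues = np.linalg.eigvalsh(matrix)
    return bool(eigenvalues.min() > tol)


valid = np.array([[1.0, 0.5], [0.5, 1.0]])    # eigenvalues 0.5 and 1.5
invalid = np.array([[1.0, 2.0], [2.0, 1.0]])  # eigenvalues -1 and 3

print(is_positive_definite(valid), is_positive_definite(invalid))
```

For symmetric matrices, properties 2 and 3 are equivalent, so testing the eigenvalues is sufficient once symmetry has been confirmed.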

Positive definiteness in a covariance matrix has important implications:

  1. It ensures the matrix is invertible, which is crucial for many statistical techniques.
  2. It guarantees that the matrix defines a valid, non-degenerate multivariate distribution (for example, a multivariate normal distribution with a well-defined density).
  3. It allows for unique solutions in optimization problems and ensures the stability of certain algorithms.
  4. It indicates that no linear combination of the variables has zero variance, meaning all variables contribute meaningful information.

A covariance matrix that is positive semi-definite (allowing for eigenvalues to be non-negative rather than strictly positive) is still valid, but may indicate linear dependencies among variables.

In practice, sample covariance matrices are often positive definite if the number of observations exceeds the number of variables and there are no perfect linear relationships among the variables.
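
A small experiment with random data illustrates the role of the number of observations. The array shapes and seed below are arbitrary choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# 10 variables observed 50 times: more observations than variables
tall = rng.standard_normal((50, 10))
cov_tall = np.cov(tall, rowvar=False)

# 10 variables observed only 5 times: fewer observations than variables
short = rng.standard_normal((5, 10))
cov_short = np.cov(short, rowvar=False)

# Full rank means positive definite; deficient rank means only semi-definite
rank_tall = np.linalg.matrix_rank(cov_tall)
rank_short = np.linalg.matrix_rank(cov_short)

print(rank_tall, rank_short)
```

With n observations, the sample covariance matrix has rank at most n - 1 because the sample mean is subtracted first, so the second matrix cannot be positive definite regardless of the data.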

Implementation

Synthetica automatically finds the nearest positive-definite matrix to the input using its nearest_positive_definite Python function. The method is taken directly from "Computing a nearest symmetric positive semidefinite matrix" by N. J. Higham (1988).
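
The idea behind such a repair can be sketched with an eigenvalue-clipping approach. This is a simplified illustration, not the code of nearest_positive_definite: the function name `nearest_pd_sketch` and the floor `min_eigenvalue` are assumptions. For a symmetric input, setting negative eigenvalues to zero gives the nearest positive semi-definite matrix in the Frobenius norm; the small positive floor used here makes the result strictly positive definite.

```python
import numpy as np


def nearest_pd_sketch(matrix, min_eigenvalue=1e-6):
    """Hypothetical helper: symmetrize, then floor the eigenvalues."""
    matrix = np.asarray(matrix, dtype=float)

    # Step 1: keep only the symmetric part of the input
    symmetric = (matrix + matrix.T) / 2.0

    # Step 2: eigendecomposition of the symmetric part
    eigenvalues, eigenvectors = np.linalg.eigh(symmetric)

    # Step 3: raise eigenvalues below the floor up to the floor
    clipped = np.clip(eigenvalues, min_eigenvalue, None)

    # Step 4: rebuild the matrix and remove rounding asymmetry
    repaired = (eigenvectors * clipped) @ eigenvectors.T
    return (repaired + repaired.T) / 2.0


# Hypothetical inconsistent "correlation" matrix with a negative eigenvalue
bad = np.array([
    [1.0, 0.9, -0.9],
    [0.9, 1.0, 0.9],
    [-0.9, 0.9, 1.0],
])

fixed = nearest_pd_sketch(bad)

# The Cholesky factorization now succeeds
factor = np.linalg.cholesky(fixed)
```

Note that clipping eigenvalues generally changes the diagonal, so the repaired matrix is no longer an exact correlation matrix with a unit diagonal; rescaling would be a separate step.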

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the BSD 3-Clause License. See LICENSE.txt for more information.