Data fusion in Python via statistical matching

pip install data-fusion-sm==0.5.1



data-fusion-sm is a generic framework for the fusion of multiple data sources using statistical matching techniques. The objective is usually to study the relationship between variables not jointly observed in a sample.
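To make the setting concrete, the sketch below fuses two small samples that share linking variables but each observe a different target. It is a minimal nearest-neighbour illustration in plain Python, not the data-fusion-sm API: the function `hot_deck_fuse`, the field names, and the toy records are all hypothetical.

```python
from typing import Dict, List, Sequence

Record = Dict[str, float]


def hot_deck_fuse(
    recipients: List[Record],
    donors: List[Record],
    linking: Sequence[str],
    target: str,
) -> List[Record]:
    """Copy `target` from the closest donor onto each recipient.

    Hypothetical helper: closeness is the squared Euclidean distance
    over the shared linking variables.
    """
    fused = []
    for rec in recipients:
        nearest = min(
            donors,
            key=lambda d: sum((rec[v] - d[v]) ** 2 for v in linking),
        )
        # Keep the recipient's own variables and add the donated value.
        fused.append({**rec, target: nearest[target]})
    return fused


# Recipients observe tv_hours, donors observe brand_spend;
# age and income are observed in both samples (toy data).
recipients = [
    {"age": 25, "income": 30, "tv_hours": 2.0},
    {"age": 60, "income": 55, "tv_hours": 4.5},
]
donors = [
    {"age": 24, "income": 32, "brand_spend": 120.0},
    {"age": 58, "income": 50, "brand_spend": 310.0},
    {"age": 40, "income": 45, "brand_spend": 200.0},
]

fused = hot_deck_fuse(recipients, donors, ["age", "income"], "brand_spend")
```

After fusion, each recipient record carries both `tv_hours` and `brand_spend`, so the two variables can be analysed together even though no respondent reported both. In practice the linking variables would usually be scaled before computing distances.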

It is geared towards use cases from the marketing industry and social sciences, where data fusion and statistical matching are synonymous and the data under study typically come from panels, surveys, or similar longitudinal studies. That said, these methods can be used to join any similar data or to impute missing data.

Core Offerings

  • Hot-Deck imputation
  • Predictive Mean Matching
  • Fusion result evaluation
    • Target variable distribution comparison
    • Linking variable accuracy
  • Feature analysis helpers
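As an illustration of the predictive mean matching item above, here is a simplified single-predictor sketch in plain Python. It is not the data-fusion-sm implementation: `fit_line`, `pmm_impute`, and the toy data are hypothetical, and the sketch skips the parameter-draw step that fuller PMM variants include.

```python
import random
from statistics import mean
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def pmm_impute(
    donor_x: Sequence[float],
    donor_y: Sequence[float],
    recipient_x: Sequence[float],
    k: int = 1,
    seed: int = 0,
) -> List[float]:
    """Impute y for recipients by matching on predicted means.

    Hypothetical helper: for each recipient, find the k donors whose
    predicted y is closest to the recipient's predicted y, then copy the
    observed y of one of them, chosen at random.
    """
    rng = random.Random(seed)
    slope, intercept = fit_line(donor_x, donor_y)
    donor_pred = [intercept + slope * x for x in donor_x]
    imputed = []
    for x in recipient_x:
        pred = intercept + slope * x
        ranked = sorted(
            range(len(donor_x)), key=lambda i: abs(donor_pred[i] - pred)
        )
        imputed.append(donor_y[rng.choice(ranked[:k])])
    return imputed


# Toy data: y is observed for donors only.
donor_x = [1.0, 2.0, 3.0, 4.0, 5.0]
donor_y = [2.1, 3.9, 6.2, 8.1, 9.8]
recipient_x = [1.2, 4.6]

imputed = pmm_impute(donor_x, donor_y, recipient_x, k=1)
```

Because every imputed value is copied from an observed donor value, the imputed target stays within the range of real data, which is the usual motivation for matching on predicted means rather than imputing the prediction itself.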

Distributed under the BSD 3-Clause license.