Coppersmith - Feature Generation, as Functions
a person who makes artifacts from copper.
data is malleable; fold and hammer it into various shapes that are more attractive to analysts and data scientists.
coppersmith is a library to enable the joining, aggregation, and synthesis of "features", streams of facts about entities derived from "analytical records".
This library was originally written by a squad within the Analytics & Information group at Commonwealth Bank, looking to improve the task of authoring and maintaining features for use in predictive analytics and machine learning.
Our working hypothesis was that, for all the complexity of the business domain and the size of the data sets involved, the logic used in feature generation can fundamentally be described as simple functions, and those functions should be composable. The framework now called coppersmith grew out of our efforts to improve the lives of feature authors.
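To make the "features as composable functions" idea concrete, here is a minimal sketch in plain Scala. It does not use the coppersmith API: `Transaction`, `FeatureValue`, `Feature`, `ratio`, and `generate` are hypothetical names invented for this illustration, and the real library's types and combinators may look quite different.

```scala
// Hypothetical types for illustration only; none of these names come from coppersmith.
final case class Transaction(customerId: String, amount: Double)
final case class FeatureValue(entity: String, name: String, value: Double)

object FeatureSketch {
  // In this sketch a "feature" is just a function from one entity's records to a value.
  type Feature = Seq[Transaction] => Double

  // Two simple features.
  val totalSpend: Feature = _.map(_.amount).sum
  val txnCount: Feature = _.size.toDouble

  // Composition: build a new feature out of two existing ones.
  // Returns 0.0 when the denominator is zero (an arbitrary choice for the sketch).
  def ratio(num: Feature, den: Feature): Feature = { txns =>
    val d = den(txns)
    if (d == 0.0) 0.0 else num(txns) / d
  }

  val avgSpend: Feature = ratio(totalSpend, txnCount)

  // Group records by entity and apply every named feature to each group.
  // Output is sorted by entity, then by feature name, so it is deterministic.
  def generate(features: Map[String, Feature], txns: Seq[Transaction]): Seq[FeatureValue] =
    txns.groupBy(_.customerId).toSeq.sortBy(_._1).flatMap { case (id, ts) =>
      features.toSeq.sortBy(_._1).map { case (name, f) => FeatureValue(id, name, f(ts)) }
    }
}
```

Because `avgSpend` is built from `totalSpend` and `txnCount` rather than written from scratch, each piece can be tested and reused on its own, which is the property the hypothesis above is after.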
Add the following dependency to your SBT build configuration:
libraryDependencies += "au.com.cba.omnia" %% "coppersmith-scalding" % "<coppersmith-version>"
where <coppersmith-version> is replaced with the version number of coppersmith you want to use
(click the preceding link to find the latest version).
We have a richly detailed user guide, which we consider a good introduction to coppersmith. PRs to the user guide as you become familiar with the library are especially encouraged!
Classes and objects from the
commbank.coppersmith.lift.generated packages are generated at
build time with
The generated files can be found under the
directory of the
test subprojects respectively.
The change log lists all backwards-incompatible changes to the library (i.e. changes which might break existing client code). Any such changes require bumping the second number in the version.