org.aksw.commons:aksw-commons-entity-codecs-parent

A library of commonly used classes in AKSW applications.


Keywords: aggregator, java, observable-collections, path, sql-encoding
License: Apache-2.0

AKSW Commons

A modular utility collection for solving recurring basic tasks in a productive and robust way.

Modules

  • Lambdas: Serializable: Variants of the Java 8 functional and collector interfaces that extend Serializable.

  • Lambdas: Throwing: Alternative versions of the Java 8 functional interfaces whose methods are declared to throw exceptions.

  • Beans: An API to wrap an entity and declare custom getters and setters (including type conversions such as Integer-to-Long), annotations, and constructors. For example, if a wrapped class does not expose a no-arg constructor, the model allows providing a lambda to be used instead. This module does not depend on spring-core directly, but e.g. the ConversionServiceAdapter is designed for Spring interoperability.

  • Collectors: A framework for composable, serializable aggregators suited to map/reduce scenarios. The central class is ParallelAggregator. The resulting aggregators can be viewed as Java 8 collectors for use with Java 8 streams and can also be serialized for parallel computation with e.g. Hadoop or Apache Spark. Depends on Lambdas: Serializable.

  • Collection utilities: Features mutable collection views with corresponding iterators and miscellaneous classes for special use cases.

  • Entity Codec: Core: A framework for composable encoding and decoding of entities of type "T" (in contrast to codecs, which are usually tied to byte[]). The main use case is quoting and escaping of strings.

  • Entity Codec: SQL: An adaptation of Entity Codec: Core for the SQL domain. Features the SqlCodec interface, which bundles codecs for the various SQL identifier types such as column names, table names and aliases.

  • RX: Additional operators and utilities for the Reactive Extensions for Java (RxJava). Most prominently features online aggregation of consecutive items belonging to the same group using FlowableOperatorSequentialGroupBy. Also includes an operator for measuring throughput.

  • XML: Static convenience methods for loading XML and evaluating XPath expressions.

  • io-utils: Various I/O utilities, including a Java NIO-based file merger (because FileUtil.copyMerge is missing in Hadoop 3) and URI-to-path conversion.

  • io-syscalls: Abstraction over Java's ProcessBuilder to efficiently pass streams and file arguments to system processes.

  • io-process-pipes: Abstraction to efficiently build pipes using both system calls and native Java implementations. The main use case is to enable codec implementations that make use of system calls, such as using lbzip2 instead of Commons Compress. Under development.

  • util: This package needs refactoring, as it contains utilities for unrelated domains such as JDBC metadata retrieval and health checking. The name 'util' is also too generic.
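
The Entity Codec modules listed above are described as composable encoders and decoders over entities of some type T, with quoting and escaping of strings as the main use case. The following is a minimal sketch of that idea only. The names EntityCodecSketch and SqlIdentifierCodecs are hypothetical and are not the library's API, and the quoting rule (double quotes, doubled when embedded) is an assumption based on standard SQL delimited identifiers.

```java
import java.util.function.UnaryOperator;

/** Hypothetical codec over entities of type T (NOT the library's actual interface). */
interface EntityCodecSketch<T> {
    T encode(T entity);
    T decode(T entity);

    /** Compose two codecs: encode applies this codec, then next; decode undoes them in reverse order. */
    default EntityCodecSketch<T> andThen(EntityCodecSketch<T> next) {
        return EntityCodecSketch.<T>of(
            e -> next.encode(encode(e)),
            e -> decode(next.decode(e)));
    }

    /** Build a codec from an encoder/decoder pair. */
    static <T> EntityCodecSketch<T> of(UnaryOperator<T> encoder, UnaryOperator<T> decoder) {
        return new EntityCodecSketch<T>() {
            @Override public T encode(T entity) { return encoder.apply(entity); }
            @Override public T decode(T entity) { return decoder.apply(entity); }
        };
    }
}

/** Hypothetical string codecs for SQL identifiers, assuming standard SQL double-quote rules. */
final class SqlIdentifierCodecs {
    private SqlIdentifierCodecs() {}

    /** Escaping step: double every embedded double quote. */
    static EntityCodecSketch<String> escapeDoubleQuotes() {
        return EntityCodecSketch.<String>of(
            s -> s.replace("\"", "\"\""),
            s -> s.replace("\"\"", "\""));
    }

    /** Quoting step: wrap in double quotes; decoding rejects input that is not wrapped. */
    static EntityCodecSketch<String> wrapInDoubleQuotes() {
        return EntityCodecSketch.<String>of(
            s -> "\"" + s + "\"",
            s -> {
                if (s.length() < 2 || !s.startsWith("\"") || !s.endsWith("\"")) {
                    throw new IllegalArgumentException("Not a quoted identifier: " + s);
                }
                return s.substring(1, s.length() - 1);
            });
    }

    /** Escape first, then quote; decoding unquotes first, then unescapes. */
    static EntityCodecSketch<String> quotedIdentifier() {
        return escapeDoubleQuotes().andThen(wrapInDoubleQuotes());
    }
}
```

With this sketch, encoding the identifier `my "col"` via `SqlIdentifierCodecs.quotedIdentifier()` yields `"my ""col"""`, and decoding that result returns the original string. Keeping escaping and quoting as separate codecs is one way to make each step reusable on its own.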

Where is it used?

AKSW Commons is essentially the Jena-independent code from our jena-sparql-api Semantic Web toolkit.

  • RDFUnit: RDF data quality assessment framework
  • DL-Learner: Framework for symbolic machine learning
  • Facete: Faceted Search framework
  • SANSA: Big Data RDF Processing and Analytics framework

License

This code is released under the Apache License Version 2.0.