mmlspark-v0.11: v0.11

@elibarzilay elibarzilay released this 18 Jul 02:18
· 1452 commits to master since this release

New functionality:

  • TuneHyperparameters: parallel distributed randomized grid search for
    SparkML and TrainClassifier/TrainRegressor parameters. A sample
    notebook and Python wrappers will be added in the near future.

  • Added PowerBIWriter for writing and streaming data frames to
    PowerBI.

  • Expanded image reading and writing capabilities, including using
    images with Spark Structured Streaming. Images can be read from and
    written to paths specified in a dataframe.

  • New functionality for convenient plotting in Python.

  • Added UDFTransformer and additional UDFs.

  • Expanded pipeline support for arbitrary user code and libraries such
    as NLTK through UDFTransformer.

  • Refactored fuzzing system and added test coverage.

  • GPU training supports multiple VMs.
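
The idea behind the randomized grid search mentioned for TuneHyperparameters can be sketched in plain Python. This is only an illustration of the general technique, not the MMLSpark API: the function names (`sample_params`, `randomized_search`, `toy_score`), the parameter space, and the use of a local thread pool in place of a Spark cluster are all assumptions made for the example.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def sample_params(space, rng):
    """Draw one random configuration from a dict of candidate value lists."""
    return {name: rng.choice(values) for name, values in space.items()}


def randomized_search(evaluate, space, num_runs, seed=0, parallelism=4):
    """Score num_runs random configurations in parallel.

    Returns the best (score, params) pair, where higher scores are better.
    """
    rng = random.Random(seed)
    candidates = [sample_params(space, rng) for _ in range(num_runs)]
    # A thread pool stands in for distributed workers in this sketch.
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        scores = list(pool.map(evaluate, candidates))
    best = max(range(num_runs), key=lambda i: scores[i])
    return scores[best], candidates[best]


def toy_score(params):
    """Hypothetical objective standing in for 'train a model, return its metric'."""
    return -(params["regParam"] - 0.1) ** 2 - (params["maxIter"] - 20) ** 2 / 1000.0


# Hypothetical search space; the parameter names are illustrative only.
space = {"regParam": [0.001, 0.01, 0.1, 1.0], "maxIter": [10, 20, 50]}
best_score, best_params = randomized_search(toy_score, space, num_runs=8)
print(best_score, best_params)
```

Sampling a fixed number of random configurations, rather than enumerating the full grid, keeps the cost bounded as the number of parameters grows, and each configuration can be evaluated independently of the others.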

Updates:

  • Updated to Conda 4.3.31, which comes with Python 3.6.3.

  • Also updated SBT and JVM.

Improvements:

  • Additional bug fixes, plus stability and notebook improvements.