mmlspark-v0.11: v0.11

elibarzilay released this 18 Jul 02:18

New functionality:

- TuneHyperparameters: parallel, distributed randomized grid search over
  SparkML and TrainClassifier/TrainRegressor parameters. A sample
  notebook and Python wrappers will be added in the near future.
- Added PowerBIWriter for writing and streaming data frames to
  PowerBI.
- Expanded image reading and writing capabilities, including using
  images with Spark Structured Streaming. Images can be read from and
  written to paths specified in a data frame.
- New functionality for convenient plotting in Python.
- UDFTransformer and additional UDFs.
- Expanded pipeline support for arbitrary user code and libraries such
  as NLTK through UDFTransformer.
- Refactored the fuzzing system and added test coverage.
- GPU training now supports multiple VMs.
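
For readers unfamiliar with the idea behind TuneHyperparameters, the following is a minimal plain-Python sketch of a parallel randomized grid search. It is not the MMLSpark API: the names `PARAM_GRID`, `sample_candidates`, `score`, and `randomized_search` are hypothetical, the scoring function is a toy stand-in for fitting and evaluating a model, and a local thread pool stands in for distributed execution on Spark.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search space: each key maps to the candidate values of one parameter.
PARAM_GRID = {
    "regParam": [0.001, 0.01, 0.1, 1.0],
    "maxIter": [10, 50, 100],
}


def sample_candidates(grid, n, seed=0):
    """Draw n random points from the grid (with replacement) instead of enumerating it."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in grid.items()}
        for _ in range(n)
    ]


def score(params):
    """Toy stand-in for training and evaluating a model; higher is better."""
    return -abs(params["regParam"] - 0.01) - abs(params["maxIter"] - 50) / 100.0


def randomized_search(grid, n, parallelism=4, seed=0):
    """Score n sampled candidates concurrently and return the best one with its score."""
    candidates = sample_candidates(grid, n, seed)
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        scores = list(pool.map(score, candidates))
    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index], scores[best_index]


if __name__ == "__main__":
    best_params, best_score = randomized_search(PARAM_GRID, n=8)
    print(best_params, best_score)
```

Sampling a fixed number of candidates keeps the cost bounded regardless of grid size, and because each candidate is scored independently, the evaluations can run in parallel.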

Updates:

- Updated to Conda 4.3.31, which comes with Python 3.6.3.
- Updated SBT and the JVM.

Improvements:

- Additional bug fixes, stability improvements, and notebook improvements.
