dfdx/Spark.jl

Spark.jl

A Julia interface to Apache Spark™

Spark.jl provides an interface to the Apache Spark™ platform, including SQL / DataFrame and Structured Streaming. It closely follows the PySpark API, making it easy to translate existing Python code to Julia.
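
As a rough illustration of how PySpark-style calls carry over to Julia, the sketch below assumes a recent Spark.jl release, a working JDK, and a `local` master. The application name and the sample data are made up for illustration, and method availability may differ between package versions.

```julia
using Spark

# Build a session with the same builder chain PySpark uses.
# "Main" and "local" are illustrative values, not requirements.
spark = SparkSession.builder.appName("Main").master("local").getOrCreate()

# Create a small DataFrame from in-memory rows with a DDL-style schema string.
df = spark.createDataFrame([["Alice", 19], ["Bob", 23]], "name string, age long")

# Select a derived column and bring the result back to the driver.
rows = df.select(Column("age") + 1).collect()

for row in rows
    println(row[1])
end
```

The equivalent PySpark code would differ mainly in syntax (for example `col("age") + 1` and zero-based row indexing), which is the kind of near one-to-one translation the API aims for.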

Spark.jl supports multiple cluster types (in client mode) and can be considered an analogue of PySpark or RSpark within the Julia ecosystem. It runs on on-premise installations as well as on hosted instances such as Amazon EMR and Azure HDInsight.

Documentation

Trademarks

Apache®, Apache Spark and Spark are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.