Welcome to Spark Python API Docs!
Contents:
- pyspark package
 - pyspark.sql module
 - pyspark.streaming module
 - pyspark.ml package
 - pyspark.mllib package
 - pyspark.mllib.classification module
 - pyspark.mllib.clustering module
 - pyspark.mllib.evaluation module
 - pyspark.mllib.feature module
 - pyspark.mllib.fpm module
 - pyspark.mllib.linalg module
 - pyspark.mllib.linalg.distributed module
 - pyspark.mllib.random module
 - pyspark.mllib.recommendation module
 - pyspark.mllib.regression module
 - pyspark.mllib.stat module
 - pyspark.mllib.tree module
 - pyspark.mllib.util module
 
 
Core classes:
pyspark.SparkContext: Main entry point for Spark functionality.
pyspark.RDD: A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
pyspark.streaming.StreamingContext: Main entry point for Spark Streaming functionality.
pyspark.streaming.DStream: A Discretized Stream (DStream), the basic abstraction in Spark Streaming.
pyspark.sql.SQLContext: Main entry point for DataFrame and SQL functionality.
pyspark.sql.DataFrame: A distributed collection of data grouped into named columns.