Capability
Pandas API on Spark with Automatic Distributed Execution
5 artifacts provide this capability.
Top Matches
Unified engine for large-scale data processing and ML.
Unique: Automatically translates pandas DataFrame operations into Spark SQL logical plans, so pandas-compatible code runs as distributed Spark jobs; preserves pandas Index semantics for groupby and join operations while retaining Spark's distributed execution
vs others: More accessible than the native Spark API for pandas users because the syntax closely mirrors pandas (most, though not all, of the pandas API is covered); can be more efficient than Dask on large datasets because operations pass through Spark's mature query optimizer
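
To make the "pandas syntax, Spark execution" claim concrete, here is a minimal sketch, assuming PySpark 3.2 or later (where `pyspark.pandas` ships with Spark) and a working local Spark install. The function and column names (`revenue_by_region`, `run_on_spark`, `region`, `revenue`) are hypothetical and chosen only for illustration. The same function body accepts either `pandas` or `pyspark.pandas` as its backend, which is the point of the capability described above.

```python
def revenue_by_region(backend):
    """Build a small frame and aggregate it with pandas-style calls.

    `backend` is any module exposing the pandas DataFrame API,
    for example `pandas` itself or `pyspark.pandas`.
    """
    df = backend.DataFrame(
        {
            "region": ["east", "west", "east", "west"],
            "revenue": [10.0, 20.0, 30.0, 40.0],
        }
    )
    # groupby follows pandas Index semantics: "region" becomes the index
    # of the resulting Series on both backends.
    return df.groupby("region")["revenue"].sum().sort_index()


def run_on_spark():
    """Run the same aggregation through the pandas API on Spark."""
    # Imported lazily so this sketch can be loaded without Spark installed.
    import pyspark.pandas as ps

    totals = revenue_by_region(ps)
    # to_pandas() collects the distributed result to the driver, so it is
    # only appropriate for small results like this one.
    return totals.to_pandas()
```

Calling `revenue_by_region(pandas)` runs in a single process, while `run_on_spark()` runs the identical code as a Spark job. On a pandas-on-Spark DataFrame, `df.spark.explain()` prints the Spark plan generated for the pandas-style operations, which is one way to inspect the translation.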