1 min read · Jun 22, 2019
In your select_features_to_scale function, you have a couple of .to_pandas() calls on the df. They copy the df to the driver, right? How would you write that code to just use pyspark? Could koalas help?
@jim_dowling CEO of Logical Clocks AB. Associate Prof at KTH Royal Institute of Technology Stockholm, and Senior Researcher at RISE SICS.