Trouble loading PySpark ALS model

I'm trying to load a model created with PySpark. I created the model with the following code:

import pandas as pd
from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.recommendation import ALS
from pyspark.ml.tuning import TrainValidationSplit, ParamGridBuilder
from pyspark.context import SparkContext
from pyspark.sql.session import SparkSession
sc = SparkContext('local')
spark = SparkSession(sc)

data = pd.read_csv('matrix-out-small.csv')
df = spark.createDataFrame(data)

(training, test) = df.randomSplit([0.8, 0.2])

als = ALS(userCol="CustomerID", itemCol="ProductID", ratingCol="Rating", coldStartStrategy="drop", nonnegative=True)

# Tune model using param grid builder
param_grid = ParamGridBuilder().addGrid(als.rank, [12, 13, 14]).addGrid(als.maxIter, [18, 19, 20]).addGrid(als.regParam, [.17, .18, .19]).build()

evaluator = RegressionEvaluator(metricName="rmse", labelCol="Rating", predictionCol="prediction")

tvs = TrainValidationSplit(estimator=als, estimatorParamMaps=param_grid, evaluator=evaluator)

# fit model to training data
model = tvs.fit(training)

# extract best
best_model = model.bestModel  

best_model.save("modelSaveOut")

This creates a directory called ‘modelSaveOut’ that contains ‘itemFactors’, ‘metadata’ and ‘userFactors’.

When I try to load the model using ALS.load I get the following:

model = ALS.load("modelSaveOut")

py4j.protocol.Py4JJavaError: An error occurred while calling o26.load. : java.lang.NoSuchMethodException: org.apache.spark.ml.recommendation.ALSModel.<init>(java.lang.String)

model = TrainValidationSplit.load("modelSaveOut")

py4j.protocol.Py4JJavaError: An error occurred while calling o26.load. : java.lang.IllegalArgumentException: requirement failed: Error loading metadata: Expected class name org.apache.spark.ml.tuning.TrainValidationSplit but found class name org.apache.spark.ml.recommendation.ALSModel

It seems I’m not loading the model with the correct object or method. Is it possible to save and reload just the ‘bestModel’, or do I need to save the whole model using a different method?

Answer

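The exception trace from your second attempt names the class that was actually saved:

but found class name org.apache.spark.ml.recommendation.ALSModel

bestModel is a fitted ALSModel, not an ALS estimator, so saving it works fine; it just has to be loaded with ALSModel.load rather than ALS.load: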

 from pyspark.ml.recommendation import ALS, ALSModel

 ALSModel.load("modelSaveOut")
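If the saved class is not obvious from an error message, it can usually be read straight from the save directory. The sketch below is not part of PySpark: saved_class_name and python_class_path are hypothetical helpers, and they assume the model was saved to the local filesystem and that Spark wrote its metadata as a single line of JSON with a "class" field in metadata/part-*. The second helper relies only on the naming convention that org.apache.spark.ml classes are mirrored under pyspark.ml, which holds for ALSModel and TrainValidationSplit but is not guaranteed for every class.

```python
import glob
import json
import os


def saved_class_name(model_dir):
    """Return the JVM class name recorded in a Spark ML save directory."""
    # Assumption: the metadata is one JSON object on the first line of metadata/part-*.
    part_files = sorted(glob.glob(os.path.join(model_dir, "metadata", "part-*")))
    if not part_files:
        raise FileNotFoundError(f"no metadata part files under {model_dir!r}")
    with open(part_files[0], encoding="utf-8") as fh:
        return json.loads(fh.readline())["class"]


def python_class_path(jvm_class_name):
    """Map a JVM class name to the likely pyspark path (naming convention only)."""
    return jvm_class_name.replace("org.apache.spark", "pyspark", 1)
```

Calling python_class_path(saved_class_name("modelSaveOut")) on the directory from the question should point at pyspark.ml.recommendation.ALSModel, matching the class named in the error message, and that is the class whose load method to call.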
