Identifying a minimum collection of baskets that covers the universe of products using PySpark

Cpak
Sep 1, 2020

Problem Statement

From a universe of hundreds or even thousands of items, your data analytics team may want to recommend a personalized basket of items to each customer. Customer A may be recommended the basket {A, B, C}, customer B the basket {B, C, D}, and so on. What is the smallest collection of baskets that together covers every item in the universe? Problems of this type are known as set cover problems.
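To make the problem concrete before bringing in PySpark, here is a minimal plain-Python sketch of the classic greedy heuristic for set cover: repeatedly pick the basket that covers the most still-uncovered items. This is an illustration under assumptions, not necessarily the approach used in the rest of this post. The function name `greedy_set_cover` and the baskets for customers C and D are made up for the example. Finding the true minimum cover is NP-hard, so the greedy result is an approximation and is not guaranteed to be the smallest possible cover.

```python
from typing import Dict, Hashable, List, Set


def greedy_set_cover(
    universe: Set[Hashable], baskets: Dict[str, Set[Hashable]]
) -> List[str]:
    """Return basket names chosen greedily until the universe is covered.

    Hypothetical helper for illustration; an approximation, not an exact solver.
    """
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        # Pick the basket that covers the most items still uncovered.
        # Ties go to the basket that appears first in the dict.
        best = max(baskets, key=lambda name: len(baskets[name] & uncovered))
        gained = baskets[best] & uncovered
        if not gained:
            # No basket adds anything new, so the universe cannot be covered.
            raise ValueError(f"Items not covered by any basket: {uncovered}")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Toy data: the first two baskets come from the problem statement,
# the last two are invented for this example.
baskets = {
    "customer_A": {"A", "B", "C"},
    "customer_B": {"B", "C", "D"},
    "customer_C": {"D", "E"},
    "customer_D": {"A", "E"},
}
universe = set().union(*baskets.values())

cover = greedy_set_cover(universe, baskets)
print(cover)  # ['customer_A', 'customer_C']
```

In this toy case two baskets are enough to cover all five items. On real data the same loop can become slow, which is one reason to distribute the per-basket counting step.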
