Max Serialized Task Size -- Serialized task A:B was X bytes, which exceeds max allowed: spark.rpc.message.maxSize (C bytes)

Some common causes of this include:

- Too large a task through inadvertent scope capture (think of a function on an object that ends up dragging along another object holding a large number of records); see the scope-capture sketch after this list.
  - Sometimes this can be fixed by making those fields transient.
  - Moving the function out to another class can also help.
- Too large a task due to intentional/needed data.
  - Using broadcast variables can be a way to work around this (see the broadcast example below).
- Intentional coalesce of all data into one task (generally not recommended, but sometimes needed).
  - You can increase spark.rpc.message.maxSize (see the configuration example below).
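
A minimal Scala sketch of the scope-capture case, assuming a hypothetical Pipeline class with a large field the executors never need; it illustrates both the transient-field fix and the move-the-function-out fix, not any specific real codebase.

```scala
import org.apache.spark.SparkContext

// Hypothetical sketch of inadvertent scope capture; the names (Pipeline,
// bigLookup, WordFilters) are illustrative only.
class Pipeline(@transient private val sc: SparkContext) extends Serializable {

  // A large field the executors never need. If a closure captures `this`,
  // every non-transient field is serialized into the task, so marking the
  // field @transient keeps it out of the shipped closure.
  @transient lazy val bigLookup: Map[Long, String] =
    (0L until 1000000L).map(i => i -> ("record-" + i)).toMap

  private val minLength = 4

  def countLongWords(words: Seq[String]): Long = {
    // Writing filter(_.length > minLength) would capture `this` (because
    // minLength is a field), dragging bigLookup along with it. Copying the
    // field into a local val keeps the closure down to a single Int.
    val threshold = minLength
    sc.parallelize(words).filter(_.length > threshold).count()
  }
}

// Alternative fix: move the function out to a small standalone object so the
// closure never references the heavy enclosing instance at all.
object WordFilters extends Serializable {
  def isLong(threshold: Int)(word: String): Boolean = word.length > threshold
}
```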
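
For the case where the task genuinely needs large data, a broadcast variable ships it to each executor once instead of embedding it in every serialized task. A sketch with a placeholder lookup map:

```scala
import org.apache.spark.sql.SparkSession

// Minimal broadcast sketch; the lookup table and app name are placeholders.
object BroadcastExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("broadcast-example").getOrCreate()
    val sc = spark.sparkContext

    // Imagine this map is hundreds of megabytes of reference data.
    val lookup: Map[String, Int] = Map("a" -> 1, "b" -> 2)
    val lookupBc = sc.broadcast(lookup)

    // The closure now captures only the lightweight Broadcast handle; the
    // data itself is distributed to executors outside the task RPC path.
    val counts = sc.parallelize(Seq("a", "b", "c"))
      .map(k => lookupBc.value.getOrElse(k, 0))
      .collect()

    counts.foreach(println)
    lookupBc.destroy()
    spark.stop()
  }
}
```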
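
If raising the limit really is the right call, spark.rpc.message.maxSize is specified in MiB (the default is 128) and can be set when building the session or at submit time. A minimal sketch:

```scala
import org.apache.spark.sql.SparkSession

object ConfigExample {
  def main(args: Array[String]): Unit = {
    // Raise the RPC message limit to 512 MiB. Treat this as a last resort
    // after ruling out accidental closure capture.
    val spark = SparkSession.builder()
      .appName("bigger-rpc-messages")
      .config("spark.rpc.message.maxSize", "512")
      .getOrCreate()

    // The same setting can be passed at submit time:
    //   spark-submit --conf spark.rpc.message.maxSize=512 ...

    spark.stop()
  }
}
```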