Spark Advanced Topics

Other errors

Failed to read non-parquet file

Executor Failure from large record

Class or method not found

Invalid/Missing Files

Too Big DAG


Copyright various authors 2022-2024, CC-BY-SA 4.0

Documentation built with MkDocs.
