pyspark.sql.Catalog.recoverPartitions

Catalog.recoverPartitions(tableName)

Recovers all the partitions of the given table and updates the catalog.

New in version 2.1.1.

Parameters
tableName : str

name of the partitioned table whose partitions should be recovered.

Notes

Only works with a partitioned table, and not a view.
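Because of the restriction above, it can be convenient to check for partition columns before calling the method. The following is a minimal sketch, not part of the PySpark API: `recover_if_partitioned` is a hypothetical helper name, and it assumes that `Catalog.listColumns` reports `isPartition=True` for each partition column and reports no partition columns for a view.

```python
def recover_if_partitioned(catalog, table_name):
    """Call recoverPartitions only if the table has partition columns.

    `catalog` is expected to behave like `spark.catalog`: it must provide
    listColumns(tableName) and recoverPartitions(tableName).
    Returns True if recovery was attempted, False if it was skipped.
    """
    columns = catalog.listColumns(table_name)
    # Assumption: each returned column exposes a boolean `isPartition` field.
    if not any(col.isPartition for col in columns):
        return False
    catalog.recoverPartitions(table_name)
    return True
```

With an active session this would be called as `recover_if_partitioned(spark.catalog, "tbl1")`. Passing the catalog in as an argument keeps the helper easy to exercise with a stand-in object when no Spark session is available.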

Examples

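The example below writes partitioned data to a temporary directory, creates a partitioned table over that existing directory, and then recovers the partitions. The first `show()` returns no rows because the catalog does not yet know about the partition directories on disk. After `recoverPartitions` registers them, the second `show()` returns the row.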

>>> import tempfile
>>> with tempfile.TemporaryDirectory(prefix="recoverPartitions") as d:
...     _ = spark.sql("DROP TABLE IF EXISTS tbl1")
...     spark.range(1).selectExpr(
...         "id as key", "id as value").write.partitionBy("key").mode("overwrite").save(d)
...     _ = spark.sql(
...          "CREATE TABLE tbl1 (key LONG, value LONG) "
...          "USING parquet OPTIONS (path '{}') PARTITIONED BY (key)".format(d))
...     spark.table("tbl1").show()
...     spark.catalog.recoverPartitions("tbl1")
...     spark.table("tbl1").show()
+-----+---+
|value|key|
+-----+---+
+-----+---+
+-----+---+
|value|key|
+-----+---+
|    0|  0|
+-----+---+
>>> _ = spark.sql("DROP TABLE tbl1")
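
In the example above, `partitionBy("key")` writes the data into Hive-style `key=<value>` subdirectories, and those directories are what partition recovery discovers. The sketch below only illustrates that layout using the standard library. It is not Spark's implementation, and `list_partition_dirs` is a hypothetical helper name introduced here for illustration.

```python
import os
import tempfile


def list_partition_dirs(root):
    """Return sorted relative paths of Hive-style `col=value` leaf directories."""
    found = []
    for dirpath, dirnames, _ in os.walk(root):
        # Skip metadata or hidden directories such as `_temporary`.
        dirnames[:] = [d for d in dirnames if not d.startswith(("_", "."))]
        rel = os.path.relpath(dirpath, root)
        if rel == ".":
            continue
        parts = rel.split(os.sep)
        # Keep a directory only if every path component looks like `col=value`
        # and it has no nested `col=value` level beneath it.
        if all("=" in p for p in parts) and not any("=" in d for d in dirnames):
            found.append(rel)
    return sorted(found)


# Build a small fake table directory and list its partition directories.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "key=0"))
    os.makedirs(os.path.join(root, "key=1"))
    os.makedirs(os.path.join(root, "_temporary"))
    print(list_partition_dirs(root))  # prints ['key=0', 'key=1']
```

Running a listing like this against the table's path before and after a write can help confirm that new partition directories exist on disk when a query unexpectedly returns no rows.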