pyspark.sql.DataFrame.mergeInto

DataFrame.mergeInto(table, condition)

Merges a set of updates, insertions, and deletions from this (source) DataFrame into a target table.

New in version 4.0.0.

Parameters
table : str

Target table name to merge into.

condition : Column

The condition that determines whether a row in the target table matches one in the source DataFrame.

Returns
MergeIntoWriter

A MergeIntoWriter used to specify how the source DataFrame is merged into the target table.

Notes

This method does not support streaming queries.

Examples

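>>> from pyspark.sql.functions import expr
>>> # Assumes a table named "target" with columns id and name already exists.
>>> source = spark.createDataFrame(
...     [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["id", "name"]).alias("source")
>>> # The condition must be a Column, so the join predicate is built with expr().
>>> (source.mergeInto("target", expr("source.id = target.id"))
...     .whenMatched().update({"name": source.name})
...     .whenNotMatched().insertAll()
...     .whenNotMatchedBySource().delete()
...     .merge())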
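
The effect of the three clauses in the example above can be modeled in plain Python. The sketch below is not Spark code and does not call any PySpark API. It only illustrates the intended row-level outcome, assuming the match condition is equality on id and that id is unique in both the source and the target. The function name merge_rows and the contents of the target rows are hypothetical and chosen for illustration.

```python
# Plain-Python model of the merge semantics; this is NOT Spark code.
# Assumptions: rows are dicts, the match condition is equality on "id",
# and "id" is unique on both sides.

def merge_rows(target, source, key="id"):
    """Return the rows the target would hold after the merge."""
    source_by_key = {row[key]: row for row in source}
    target_keys = {row[key] for row in target}
    merged = []
    for row in target:
        match = source_by_key.get(row[key])
        if match is not None:
            # Models whenMatched().update({"name": source.name})
            merged.append({**row, "name": match["name"]})
        # Otherwise models whenNotMatchedBySource().delete():
        # the unmatched target row is dropped.
    for row in source:
        if row[key] not in target_keys:
            # Models whenNotMatched().insertAll()
            merged.append(dict(row))
    return merged


# Hypothetical target contents, for illustration only.
target = [{"id": 14, "name": "Thomas"}, {"id": 99, "name": "Zed"}]
source = [
    {"id": 14, "name": "Tom"},
    {"id": 23, "name": "Alice"},
    {"id": 16, "name": "Bob"},
]
result = merge_rows(target, source)
```

In this model, the row with id 14 is updated with the source name, the rows with id 23 and 16 are inserted, and the row with id 99 is deleted because no source row matches it.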