pyspark.sql.streaming.DataStreamReader.format
- DataStreamReader.format(source)
Specifies the input data source format.
New in version 2.0.0.
Changed in version 3.5.0: Supports Spark Connect.
- Parameters
- source : str
name of the data source, e.g. ‘json’, ‘parquet’.
Notes
This API is evolving.
Examples
>>> spark.readStream.format("text")
<...streaming.readwriter.DataStreamReader object ...>
This API can also be used to configure other data sources to read from. The example below writes a small text file and reads it back as a stream via the text source.
>>> import tempfile
>>> import time
>>> with tempfile.TemporaryDirectory(prefix="format") as d:
...     # Write a temporary text file to read it.
...     spark.createDataFrame(
...         [("hello",), ("this",)]).write.mode("overwrite").format("text").save(d)
...
...     # Start a streaming query to read the text file.
...     q = spark.readStream.format("text").load(d).writeStream.format("console").start()
...     time.sleep(3)
...     q.stop()