InterviewSolution
1. Is it possible to create a PySpark DataFrame from external data sources?
Answer» Yes, it is. Real-time applications commonly read data from external storage systems such as the local file system, HDFS, HBase, MySQL tables, Amazon S3, and Azure storage. The following example creates a DataFrame by reading a CSV file from the local file system:

df = spark.read.csv("/path/to/file.csv")

PySpark supports CSV, text, Avro, Parquet, TSV, and many other file formats.