My Spark program that streams changes from a Kafka topic into a DB table works end-to-end. But when it is terminated manually while an insert/update/delete is in progress, the whole table's data gets dropped. How can I handle this safely?

  • 1
    It would be interesting to see your code. I can't think of a reason why the table would be dropped, even if the streaming job terminates.
  • 0
    @cmarshall10450 Basically, Spark's JDBC writer in overwrite mode drops the table (you can change this behavior with the truncate option), then creates a new table and populates it with the new data. When this process is interrupted, I am often left with an empty table.
  • 1
    @Anakata can you change it to use the truncate option instead?
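As the comments note, Spark's JDBC writer with `mode("overwrite")` drops and recreates the table by default, and `option("truncate", "true")` makes it truncate instead; even then, an interrupted truncate-and-load can leave a partially loaded table. A more robust pattern is to load the new data into a staging table and swap it for the live table in a single transaction. Below is a minimal sketch of that pattern using Python's `sqlite3` as a stand-in for the real JDBC target; the table names (`events`, `events_staging`) and schema are hypothetical.

```python
import sqlite3

def safe_overwrite(conn, rows):
    """Replace the contents of `events` without ever exposing an empty table.

    New data is loaded into `events_staging` first; the live table is only
    replaced at the very end, inside one transaction, so an interrupted job
    leaves the previous data intact. (In Python 3.6+, sqlite3 DDL statements
    participate in transactions, so the drop-and-rename is atomic.)
    """
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS events_staging")
    cur.execute("CREATE TABLE events_staging (id INTEGER, value TEXT)")
    cur.executemany("INSERT INTO events_staging VALUES (?, ?)", rows)
    # Atomic swap: either both statements apply or neither does.
    cur.execute("DROP TABLE IF EXISTS events")
    cur.execute("ALTER TABLE events_staging RENAME TO events")
    conn.commit()

conn = sqlite3.connect(":memory:")
safe_overwrite(conn, [(1, "a"), (2, "b")])
safe_overwrite(conn, [(3, "c")])  # replaces the previous contents safely
print(conn.execute("SELECT COUNT(*) FROM events").fetchone()[0])
```

With Spark you would write the DataFrame to the staging table via JDBC (in overwrite mode, which is harmless there) and perform the swap on the database side, e.g. from a small post-write step.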