
java.lang.NoSuchMethodError: 'org.apache.spark.sql.catalyst.encoders.ExpressionEncoder org.apache.spark.sql.catalyst.encoders.RowEncoder.apply(org.apache.spark.sql.types.StructType)' #1209

Open
CatalinMihaiIonescu opened this issue Apr 9, 2024 · 6 comments

CatalinMihaiIonescu commented Apr 9, 2024

Hello,

I am trying to update my application to Spark 3.5.1 (Scala 2.12.18, OpenJDK 64-Bit Server VM 17.0.10), but the Scala 2.12 connector keeps throwing this error.

While trying to stream data, I get the following:

Driver stacktrace:

java.lang.NoSuchMethodError: 'org.apache.spark.sql.catalyst.encoders.ExpressionEncoder org.apache.spark.sql.catalyst.encoders.RowEncoder.apply(org.apache.spark.sql.types.StructType)'
	at org.apache.spark.sql.PreScala213SparkSqlUtils.createExpressionEncoder(PreScala213SparkSqlUtils.java:53)
	at com.google.cloud.spark.bigquery.spark3.Spark3DataFrameToRDDConverter.convertToRDD(Spark3DataFrameToRDDConverter.java:40)
	at com.google.cloud.spark.bigquery.BigQueryStreamWriter$.writeBatch(BigQueryStreamWriter.scala:50)
	at com.google.cloud.spark.bigquery.BigQueryStreamingSink.addBatch(BigQueryStreamingSink.scala:53)
	at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runBatch$17(MicroBatchExecution.scala:732)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:125)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:201)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:108)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66)
	at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runBatch$16(MicroBatchExecution.scala:729)
	at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken(ProgressReporter.scala:427)
	at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken$(ProgressReporter.scala:425)
	at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:67)
	at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runBatch(MicroBatchExecution.scala:729)
	at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$2(MicroBatchExecution.scala:286)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken(ProgressReporter.scala:427)
	at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken$(ProgressReporter.scala:425)
	at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:67)
	at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$1(MicroBatchExecution.scala:249)
	at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:67)
	at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:239)
	at org.apache.spark.sql.execution.streaming.StreamExecution.$anonfun$runStream$1(StreamExecution.scala:311)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
	at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:289)
	at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.$anonfun$run$1(StreamExecution.scala:211)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.JobArtifactSet$.withActiveJobArtifactState(JobArtifactSet.scala:94)
	at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:211)

I am using Spark 3.5.1 with spark-bigquery-with-dependencies_2.12:0.37.0; everything works fine on Spark 3.4.2.
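
For context, the failing call corresponds to a plain streaming write through the connector's sink, per the stack trace above. A minimal sketch of that pattern (the rate source and all table/bucket/checkpoint values here are placeholders, not the actual job):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bq-stream-repro").getOrCreate()

# Placeholder input stream; the real job reads from its own source.
df = spark.readStream.format("rate").option("rowsPerSecond", 1).load()

# writeStream with format "bigquery" goes through BigQueryStreamingSink.addBatch,
# which is where the NoSuchMethodError above is thrown on Spark 3.5.x.
query = (
    df.writeStream
    .format("bigquery")
    .option("table", "my_project.my_dataset.my_table")       # placeholder
    .option("temporaryGcsBucket", "my-temp-bucket")          # placeholder
    .option("checkpointLocation", "gs://my-temp-bucket/cp")  # placeholder
    .start()
)
query.awaitTermination()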

davidrabinowitz (Member) commented

Please try the spark-3.5-bigquery:0.37.0 connector.
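
For anyone following along: one way to pull that artifact in, assuming the jar is fetched from Maven via spark.jars.packages rather than pre-installed (the app name is a placeholder):

from pyspark.sql import SparkSession

# spark-3.5-bigquery is the Spark-3.5-specific build, replacing
# spark-bigquery-with-dependencies_2.12 in the dependency list.
spark = (
    SparkSession.builder
    .appName("bq-app")  # placeholder
    .config("spark.jars.packages",
            "com.google.cloud.spark:spark-3.5-bigquery:0.37.0")
    .getOrCreate()
)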

CatalinMihaiIonescu (Author) commented Apr 10, 2024

I have tried it, and it gives me the following error:

org.apache.spark.SparkUnsupportedOperationException: Data source bigquery does not support streamed writing.
	at org.apache.spark.sql.errors.QueryExecutionErrors$.streamedOperatorUnsupportedByDataSourceError(QueryExecutionErrors.scala:696)
	at org.apache.spark.sql.execution.datasources.DataSource.createSink(DataSource.scala:326)
	at org.apache.spark.sql.streaming.DataStreamWriter.createV1Sink(DataStreamWriter.scala:442)
	at org.apache.spark.sql.streaming.DataStreamWriter.startInternal(DataStreamWriter.scala:404)
	at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:251)

I am using a standalone Spark cluster.

malhomaid commented

I'm using the same version (spark-3.5-bigquery:0.37.0) and I have the same issue.

malhomaid commented

I worked around it by using writeStream.foreachBatch:

# Write each micro-batch with the regular batch writer, which the
# connector does support. Option values were redacted here.
def write_to_bigquery(batch_df, batch_id):
    batch_df.write.format('bigquery') \
        .option("table", "") \
        .option("temporaryGcsBucket", "") \
        .option("checkpointLocation", "") \
        .option("writeMethod", "direct") \
        .save()

df.writeStream \
    .foreachBatch(write_to_bigquery) \
    .start() \
    .awaitTermination()
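
Two small notes on that sketch, both standard Structured Streaming behavior rather than anything connector-specific: checkpointLocation is read from the streaming query, not from the per-batch writer, and with writeMethod set to direct the temporaryGcsBucket staging option is not needed. A variant with placeholder names:

def write_to_bigquery(batch_df, batch_id):
    # The direct write method streams straight into BigQuery through the
    # Storage Write API, so no GCS staging bucket is required.
    batch_df.write.format('bigquery') \
        .option("table", "my_project.my_dataset.my_table") \
        .option("writeMethod", "direct") \
        .save()

df.writeStream \
    .foreachBatch(write_to_bigquery) \
    .option("checkpointLocation", "gs://my-bucket/checkpoints/my-query") \
    .start() \
    .awaitTermination()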

CatalinMihaiIonescu (Author) commented

It seems to me like you are using direct writing in that workaround; we need to do batch streaming.

CatalinMihaiIonescu (Author) commented Sep 11, 2024

This still happens on Spark 3.5.2 with version 0.41.0 of the connector.
