
allow custom type converters to be utilized for collection types #1204

Open · wants to merge 3 commits into base: master
Conversation

@aashishs101 (Author)

Currently, the connector limits users' ability to create custom type converters for collection types. It only allows the default collection converters, which apply in-scope converters to the elements of a collection while ignoring any user-defined converters for the collection type itself. This prevents use cases where a converter turns a non-collection type into a collection, or one collection into another (for instance, storing only the keys of a map as a list). When such a custom converter is specified today, it fails with the following exception:
[info] - should write to a table with HLLs *** FAILED ***
[info]   org.apache.spark.SparkException: Job aborted due to stage failure: Task 11 in stage 0.0 failed 1 times, most recent failure: Lost task 11.0 in stage 0.0 (TID 11, localhost, executor driver): com.datastax.spark.connector.types.TypeConversionException: Cannot convert object OurType(14,16384,0.7212525005219688) of type class com.us.commons.internal.util.fn.OurType to Map[AnyRef,AnyRef].
[info]   at com.datastax.spark.connector.types.TypeConverter$$anonfun$convert$1.apply(TypeConverter.scala:45)
[info]   at com.datastax.spark.connector.types.TypeConverter$CollectionConverter$$anonfun$convertPF$35.applyOrElse(TypeConverter.scala:694)
[info]   at com.datastax.spark.connector.types.TypeConverter$class.convert(TypeConverter.scala:43)
[info]   at com.datastax.spark.connector.types.TypeConverter$CollectionConverter.convert(TypeConverter.scala:682)
This happens because of lines like this one, which don't look for a matching custom type converter in scope. The proposed change keeps the current behavior of using the default collection converters, but only when a custom converter isn't also in scope (see the sketches below).
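To make the use case concrete, here is a minimal sketch of the kind of custom converter this change would enable, written against the connector's public TypeConverter API. The MapKeysConverter name and the map-keys-to-list conversion are illustrative, not part of the connector:

    import scala.reflect.runtime.universe._
    import com.datastax.spark.connector.types.TypeConverter

    // Illustrative converter: when writing a Scala Map to a Cassandra
    // list<text> column, keep only the keys of the map.
    object MapKeysConverter extends TypeConverter[List[String]] {
      def targetTypeTag = typeTag[List[String]]
      def convertPF: PartialFunction[Any, List[String]] = {
        case m: Map[_, _] => m.keys.map(_.toString).toList
      }
    }

    // Registration uses the connector's existing hook; today the default
    // collection converter is chosen anyway, which is what this PR fixes.
    TypeConverter.registerConverter(MapKeysConverter)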

JIRA Ticket: https://datastax-oss.atlassian.net/browse/SPARKC-556
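For reference, the shape of the proposed fallback is roughly the following, as a hedged sketch with hypothetical names (resolveConverter, registered, and default are illustrative; the actual change lives in TypeConverter.scala):

    import scala.reflect.runtime.universe._
    import com.datastax.spark.connector.types.TypeConverter

    // Prefer a user-registered converter whose target type matches the full
    // collection type; otherwise fall back to the default element-wise
    // collection converter, preserving the current behavior.
    def resolveConverter(
        registered: Seq[TypeConverter[_]],
        default: => TypeConverter[_],
        tpe: Type): TypeConverter[_] =
      registered.find(_.targetTypeTag.tpe =:= tpe).getOrElse(default)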

@aashishs101 (Author)

test this please

@aashishs101 (Author)

test this please

@aashishs101 (Author)

@RussellSpitzer I'm not sure why the automated testing isn't working on this branch

@aashishs101 (Author)

@RussellSpitzer what steps would I need to go through to get this merged?

@aashishs101 (Author)

@RussellSpitzer: sorry to bother you, but I'm wondering if it's possible to get this PR in?

@RussellSpitzer (Contributor)

@aashishs101 Sorry about that, we only check JIRA, so I miss it when people post PRs without an accompanying ticket. Please file one and we'll get to review and testing ASAP.

@RussellSpitzer (Contributor)

test this please

@RussellSpitzer (Contributor)

Only authorized users can start the tests :)

@ds-jenkins-builds

Build against Scala 2.11 finished with failure

@RussellSpitzer (Contributor)

Error Message
java.io.IOException: Couldn't find table test_table_converter5 in test_cassandra_rdd_spec - Found similar tables in that keyspace: test_cassandra_rdd_spec.test_table_converter5
Stacktrace
sbt.ForkMain$ForkError: java.io.IOException: Couldn't find table test_table_converter5 in test_cassandra_rdd_spec - Found similar tables in that keyspace:
test_cassandra_rdd_spec.test_table_converter5
	at com.datastax.spark.connector.cql.Schema$.tableFromCassandra(Schema.scala:358)
	at com.datastax.spark.connector.writer.TableWriter$.apply(TableWriter.scala:383)
	at com.datastax.spark.connector.RDDFunctions.saveToCassandra(RDDFunctions.scala:35)
	at com.datastax.spark.connector.rdd.CassandraRDDSpec$$anonfun$141$$anonfun$apply$54.apply(CassandraRDDSpec.scala:1581)
	at com.datastax.spark.connector.rdd.CassandraRDDSpec$$anonfun$141$$anonfun$apply$54.apply(CassandraRDDSpec.scala:1572)
	at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:112)
	at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:111)
	at com.datastax.spark.connector.cql.CassandraConnector.closeResourceAfterUse(CassandraConnector.scala:145)
	at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:111)
	at com.datastax.spark.connector.rdd.CassandraRDDSpec$$anonfun$141.apply(CassandraRDDSpec.scala:1572)
	at com.datastax.spark.connector.rdd.CassandraRDDSpec$$anonfun$141.apply(CassandraRDDSpec.scala:1572)
