CASSANALYTICS-120: Quote identifiers option must be set to true if ttl has mixed case column name#172
sarankk wants to merge 4 commits into apache:trunk
67a8ff9 to bf797b2
Thanks @frankgh for the review, good spot. I've addressed your comment. Instead of automatically setting the quote identifiers option to true, I added a validation that throws if users specify mixed case column names without enabling quote identifiers. With the current behavior, when the perRow TTL option is set with a mixed case TTL column name and quote identifiers is not enabled, the job succeeds but silently rewrites the existing TTLs of the rows to null. Throwing makes this failure explicit.
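For readers skimming the thread, the validation described above could look roughly like the sketch below. The class name, method name, and message text are hypothetical and are not taken from the patch; the only behavior assumed from the comment is that a mixed case column name without quote identifiers is rejected instead of being silently accepted.

```java
// Hypothetical sketch, not the code from this PR.
final class PerRowOptionValidation
{
    private PerRowOptionValidation()
    {
    }

    /**
     * Rejects a per-row TTL or timestamp column name that contains upper case
     * characters when the quote identifiers option is disabled.
     *
     * @param optionName       name of the writer option being validated (illustrative)
     * @param columnName       column name supplied by the user, may be null when unset
     * @param quoteIdentifiers whether the quote identifiers option is enabled
     */
    static void validateColumnName(String optionName, String columnName, boolean quoteIdentifiers)
    {
        if (columnName == null || quoteIdentifiers)
        {
            // Nothing to check: the option is unset, or the name will be quoted as-is
            return;
        }
        if (columnName.chars().anyMatch(Character::isUpperCase))
        {
            throw new IllegalArgumentException(String.format(
                "Column name '%s' in option %s has mixed case; enable quote identifiers to use it",
                columnName, optionName));
        }
    }
}
```

Failing fast at configuration time means the job stops before any rows are written, which is the point of replacing the silent null overwrite with an exception.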
                WriterOptions.QUOTE_IDENTIFIERS.name(), WriterOptions.QUOTE_IDENTIFIERS.name());
    }

    private boolean doesNotHaveUpperCaseChars(String name)
Not sure whether there is an existing Java utility function to do this check. Java experts may comment here.
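On the question of an existing utility: the JDK has no single method for this that I am aware of, but the check is a one-liner with standard calls. The sketch below shows two options; the class name is hypothetical and the method name simply mirrors the one in the diff. For plain ASCII names the two variants agree.

```java
import java.util.Locale;

// Hypothetical helper class for illustration only.
final class UpperCaseCheck
{
    private UpperCaseCheck()
    {
    }

    // Option 1: stream over the characters and test each one
    static boolean doesNotHaveUpperCaseChars(String name)
    {
        return name.chars().noneMatch(Character::isUpperCase);
    }

    // Option 2: compare against the lower-cased form, using a fixed locale
    // so the result does not depend on the JVM's default locale
    static boolean isAlreadyLowerCase(String name)
    {
        return name.equals(name.toLowerCase(Locale.ROOT));
    }
}
```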
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/BulkSparkConf.java
+1 Looks good
frankgh left a comment
The approach now seems better than before. When we don't quote, we validate that the TTL and Timestamp columns contain only lowercase characters. However, this doesn't cover the case of special characters. I suggest leveraging the bridge.maybeQuoteIdentifier(String) functionality to determine whether the input is valid or not.
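To illustrate the idea, here is a minimal self-contained sketch. It does not call the real bridge.maybeQuoteIdentifier(String); the maybeQuote method below is a simplified stand-in that assumes an unquoted identifier must match `[a-z][a-z0-9_]*`, and it ignores reserved keywords. The class and method names are hypothetical. A name is treated as safe to use without quoting only if the quoting step leaves it unchanged.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for a quoting helper; not the bridge implementation.
final class QuoteCheckSketch
{
    // Assumption: identifiers matching this pattern need no quoting
    private static final Pattern UNQUOTED_IDENTIFIER = Pattern.compile("[a-z][a-z0-9_]*");

    private QuoteCheckSketch()
    {
    }

    // Returns the identifier unchanged when it needs no quoting, otherwise
    // wraps it in double quotes and doubles any embedded double quotes
    static String maybeQuote(String identifier)
    {
        if (UNQUOTED_IDENTIFIER.matcher(identifier).matches())
        {
            return identifier;
        }
        return '"' + identifier.replace("\"", "\"\"") + '"';
    }

    // True when the name would be altered by quoting, which covers
    // upper case characters and special characters alike
    static boolean needsQuoting(String identifier)
    {
        return !maybeQuote(identifier).equals(identifier);
    }
}
```

With a check of this shape, the validation can reject any TTL or timestamp column name for which needsQuoting returns true while quote identifiers is disabled, rather than testing only for upper case characters.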
935df49 to c0e56d1
If users set the TTL column name or timestamp column name to a mixed case value, they have to set the quote identifiers option to true for Cassandra to resolve the column name correctly. Users sometimes forget to set the quote identifiers option, and the perRowTTL setting then incorrectly overrides their existing TTLs with null values when the bulk write job runs. This patch fixes this by checking for mixed case column names and setting the quote identifiers option when required.