MLE-12345 Merge release/2.1.1 into main #642
Merged
MLE-28021 Merge main to develop
Trying to minimize CVEs
Just pointing to releases page to avoid having to update the tag name in future releases.
MLE-12345 - Clarifying the use of connectionString with Basic authentication.
…om-develop PDP-1182: Remove per-repo pr-workflow.yaml
No downside to this, and it allows for multiple concurrent jobs using S3 in the same Spark context.
Possibly a temporary addition, going to experiment with this in another project.
Also added support for publishing to GH Packages
…e option is specified (#630)
* don't abort on batch write failures if flag is not specified for copy command
* address PR comments. Improve test by attempting copy to other db ensuring it is empty prior to copy.
* tweak role name
* bump copyright header
* add test assertion
* extract var

Co-authored-by: Jason Teh <jteh@cas.org>
Eliminating CVEs
Treating this as a bug fix now and including in 2.1.1.
As noted, this shouldn't be necessary, trying to make Black Duck happy / not dumb.
Trying to un-confuse Black Duck.
Pull request overview
This PR merges release 2.1.1 changes into main, updating project versions and adding API/CLI support around S3 anonymous access, basic-auth connection strings, write-option passthroughs, dependency CVE overrides, tests, and documentation.
Changes:
- Updates Flux and connector versions to 2.1.1 / 3.1.1 and refreshes related docs/examples.
- Adds S3 anonymous access and bucket-scoped S3 Hadoop configuration behavior.
- Adds basic-auth connection-string API support, hidden writer passthrough options, dependency overrides, and expanded tests.
Reviewed changes
Copilot reviewed 29 out of 32 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| gradle.properties | Bumps Flux and connector versions. |
| build.gradle | Updates dependency resolution overrides. |
| flux-cli/build.gradle | Adds CVE-related dependency constraints and GitHub Packages publishing config. |
| .github/workflows/pr-workflow.yaml | Removes PR validation workflow. |
| flux-cli/src/main/java/com/marklogic/flux/impl/S3Params.java | Adds anonymous S3 auth and bucket-scoped Hadoop config. |
| flux-cli/src/main/java/com/marklogic/flux/impl/AbstractCommand.java | Applies S3 config based on file paths and adds basic connection-string API method. |
| flux-cli/src/main/java/com/marklogic/flux/impl/importdata/WriteDocumentParams.java | Adds hidden arbitrary writer options. |
| flux-cli/src/main/java/com/marklogic/flux/impl/importdata/ReadFilesParams.java | Adds API support for anonymous S3 reads. |
| flux-cli/src/main/java/com/marklogic/flux/impl/export/WriteFilesParams.java | Adds API support for anonymous S3 writes. |
| flux-cli/src/main/java/com/marklogic/flux/impl/export/ExportFilesCommand.java | Adds anonymous S3 support to generic file export API options. |
| flux-cli/src/main/java/com/marklogic/flux/impl/custom/CustomImportCommand.java | Adds anonymous S3 support to custom import read options. |
| flux-cli/src/main/java/com/marklogic/flux/impl/custom/AbstractCustomExportCommand.java | Adds anonymous S3 support to custom export write options. |
| flux-cli/src/main/java/com/marklogic/flux/impl/copy/CopyCommand.java | Adds hidden output writer options and explicit abort-on-failure setting. |
| flux-cli/src/main/java/com/marklogic/flux/api/Executor.java | Adds connectionStringBasic to the public API. |
| flux-cli/src/main/java/com/marklogic/flux/api/ReadFilesOptions.java | Adds anonymous S3 API method for readers. |
| flux-cli/src/main/java/com/marklogic/flux/api/WriteFilesOptions.java | Adds anonymous S3 API method for writers. |
| flux-cli/src/main/java/com/marklogic/flux/api/GenericFilesExporter.java | Adds anonymous S3 method to generic export options. |
| flux-cli/src/main/java/com/marklogic/flux/api/CustomImporter.java | Adds anonymous S3 method to custom import options. |
| flux-cli/src/main/java/com/marklogic/flux/api/CustomExportWriteOptions.java | Adds anonymous S3 method to custom export options. |
| flux-cli/src/test/java/com/marklogic/flux/impl/S3ParamsTest.java | Adds S3 anonymous, bucket scoping, and bucket extraction tests. |
| flux-cli/src/test/java/com/marklogic/flux/impl/importdata/ImportFilesOptionsTest.java | Covers hidden write-property passthrough. |
| flux-cli/src/test/java/com/marklogic/flux/impl/copy/CopyTest.java | Adds copy write-failure behavior tests and helper reuse. |
| flux-cli/src/test/java/com/marklogic/flux/impl/copy/CopyOptionsTest.java | Covers output write-property passthrough. |
| flux-cli/src/test/java/com/marklogic/flux/api/ConnectionTest.java | Tests connectionStringBasic. |
| examples/client-project/build.gradle | Updates example dependency versions. |
| docs/spark-integration.md | Updates spark-submit jar references. |
| docs/import/import-files/selecting-files.md | Documents S3 anonymous import access. |
| docs/export/specifying-path.md | Documents S3 anonymous export access. |
| docs/import/embedder/embedder.md | Updates default embedding field documentation. |
| docs/common-options.md | Clarifies connection-string auth behavior. |
| docs/api.md | Updates API dependency version and connection API docs. |
| CONTRIBUTING.md | Updates shadow jar examples. |
Comment on lines +359 to +365

    if (project.hasProperty("ghToken")) {
        maven {
            name = "GitHubPackages"
            url = uri("https://maven.pkg.github.com/rjrudin/flux")
            credentials {
                username = ghActor
                password = ghToken

Comment on lines +92 to +99

        for (String path : paths) {
            bucket = extractBucket(path);
            if (bucket != null) {
                break;
            }
        }
    }
    addToHadoopConfiguration(config, bucket);

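The excerpt above picks the first path that yields a bucket name and passes it to the Hadoop configuration step. The body of `extractBucket` is not shown in this PR excerpt, so the following is only a minimal sketch of how such a helper might behave. The class name `S3BucketSketch`, the `firstBucket` helper, and the set of accepted URI schemes are assumptions for illustration, not the project's actual code.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; not the implementation in S3Params.java.
final class S3BucketSketch {

    // Returns the bucket for s3://, s3a://, or s3n:// paths, or null for anything else.
    // Assumption: paths that are not valid URIs (e.g. containing spaces) are treated as non-S3.
    static String extractBucket(String path) {
        if (path == null) {
            return null;
        }
        try {
            URI uri = new URI(path);
            String scheme = uri.getScheme();
            if (scheme == null) {
                return null; // local or relative path
            }
            String lower = scheme.toLowerCase(Locale.ROOT);
            boolean isS3 = lower.equals("s3") || lower.equals("s3a") || lower.equals("s3n");
            String authority = uri.getAuthority();
            return (isS3 && authority != null && !authority.isEmpty()) ? authority : null;
        } catch (URISyntaxException e) {
            return null;
        }
    }

    // Mirrors the loop in the excerpt: the first path that yields a bucket wins.
    static String firstBucket(List<String> paths) {
        for (String path : paths) {
            String bucket = extractBucket(path);
            if (bucket != null) {
                return bucket;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("/tmp/local.json", "s3a://my-bucket/data/a.json", "s3a://other/b.json");
        System.out.println(firstBucket(paths)); // prints my-bucket
    }
}
```

One consequence of a first-match rule like this is that a job mixing paths from several buckets would only scope its settings to the first one. Whether the real code handles that case differently is not visible in the excerpt.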
Comment on lines +53 to +56

        // Default implementation for backwards compatibility.
        return connectionString(connectionString);
    }

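The excerpt above suggests that `connectionStringBasic` was added to the public API as a default method that delegates to `connectionString`, so existing implementers keep compiling. A minimal sketch of that pattern follows. The interface name `ConnectionOptionsSketch`, the `LegacyExecutor` class, and the method signatures are assumptions for illustration; only the method names `connectionString` and `connectionStringBasic` come from the PR.

```java
// Hypothetical stand-in for the public API interface; the real signatures may differ.
interface ConnectionOptionsSketch {

    ConnectionOptionsSketch connectionString(String connectionString);

    // Default implementation for backwards compatibility: implementers written
    // before this method existed inherit the delegating behavior.
    default ConnectionOptionsSketch connectionStringBasic(String connectionString) {
        return connectionString(connectionString);
    }
}

// An implementer that predates connectionStringBasic and does not override it.
final class LegacyExecutor implements ConnectionOptionsSketch {

    String lastConnectionString;

    @Override
    public ConnectionOptionsSketch connectionString(String connectionString) {
        this.lastConnectionString = connectionString;
        return this;
    }

    public static void main(String[] args) {
        LegacyExecutor executor = new LegacyExecutor();
        executor.connectionStringBasic("user:password@localhost:8000");
        System.out.println(executor.lastConnectionString); // prints user:password@localhost:8000
    }
}
```

The trade-off with this approach is that an implementer inheriting the default gets no basic-auth-specific behavior until it overrides the method.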

    ./gradlew shadowJar

    This will produce an assembly jar at `./flux-cli/build/libs/marklogic-flux-2.1-SNAPSHOT-all.jar`.
    This will produce an assembly jar at `./flux-cli/build/libs/marklogic-flux-2.2-SNAPSHOT-all.jar`.

    ```
    $SPARK_HOME/bin/spark-submit --class com.marklogic.flux.spark.Submit \
    --master spark://NYWHYC3G0W:7077 flux-cli/build/libs/marklogic-flux-2.1-SNAPSHOT-all.jar \
    --master spark://NYWHYC3G0W:7077 flux-cli/build/libs/marklogic-flux-2.2-SNAPSHOT-all.jar \

    --packages org.apache.hadoop:hadoop-aws:3.3.4,org.apache.hadoop:hadoop-client:3.3.4 \
    --master spark://NYWHYC3G0W:7077 \
    flux-cli/build/libs/marklogic-flux-2.1-SNAPSHOT-all.jar \
    flux-cli/build/libs/marklogic-flux-2.2-SNAPSHOT-all.jar \

Comment on lines +111 to +119

    // useProfile and anonymousAccess have no bucket-scoped equivalent; always set globally.
    if (anonymousAccess) {
        config.set(S3A_CREDENTIALS_PROVIDER, "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider");
    } else if (useProfile) {
        config.set(S3A_CREDENTIALS_PROVIDER, "software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider");
    }

    String prefix = (bucket != null && !bucket.isEmpty()) ? "fs.s3a.bucket." + bucket + "." : "fs.s3a.";

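The prefix expression in the excerpt above chooses between Hadoop S3A's per-bucket keys (`fs.s3a.bucket.<bucket>.<option>`) and the global `fs.s3a.<option>` keys. The sketch below shows the effect of that choice on the credential keys. It uses a plain `Map` in place of Hadoop's `Configuration` so that it runs without Hadoop on the classpath. The class name `S3KeyPrefixSketch` and the `credentialSettings` helper are hypothetical, and the key values are placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; a Map stands in for org.apache.hadoop.conf.Configuration.
final class S3KeyPrefixSketch {

    // Same expression as in the excerpt: bucket-scoped prefix when a bucket is known, global otherwise.
    static String keyPrefix(String bucket) {
        return (bucket != null && !bucket.isEmpty()) ? "fs.s3a.bucket." + bucket + "." : "fs.s3a.";
    }

    // Builds the access/secret key settings under the chosen prefix.
    static Map<String, String> credentialSettings(String bucket, String accessKey, String secretKey) {
        String prefix = keyPrefix(bucket);
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put(prefix + "access.key", accessKey);
        settings.put(prefix + "secret.key", secretKey);
        return settings;
    }

    public static void main(String[] args) {
        // Placeholder credentials for illustration only.
        System.out.println(credentialSettings("my-bucket", "AKIA-EXAMPLE", "secret-example").keySet());
        System.out.println(credentialSettings(null, "AKIA-EXAMPLE", "secret-example").keySet());
    }
}
```

Scoping credentials per bucket is what allows several jobs in the same Spark context to target different buckets without overwriting each other's settings, which matches the motivation given in the commit notes.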
Comment on lines +183 to +186

    void addCredentialsWithBucketUsesBucketScopedKeys() {
        params.setAddCredentials(true);
        Configuration config = new Configuration();
        params.addToHadoopConfiguration(config, "my-bucket");

Comment on lines +185 to +199

        Configuration config = new Configuration();
        params.addToHadoopConfiguration(config, "my-bucket");
        assertNull(config.get("fs.s3a.access.key"), "Global access key should not be set when bucket is provided");
        assertNull(config.get("fs.s3a.secret.key"), "Global secret key should not be set when bucket is provided");
        assertNotNull(config.get("fs.s3a.bucket.my-bucket.access.key"), "Bucket-scoped access key should be set");
        assertNotNull(config.get("fs.s3a.bucket.my-bucket.secret.key"), "Bucket-scoped secret key should be set");
    }

    @Test
    void addCredentialsWithoutBucketUsesGlobalKeys() {
        params.setAddCredentials(true);
        Configuration config = new Configuration();
        params.addToHadoopConfiguration(config);
        assertNotNull(config.get("fs.s3a.access.key"), "Global access key should be set when no bucket");
        assertNotNull(config.get("fs.s3a.secret.key"), "Global secret key should be set when no bucket");

No description provided.