[SPARK-56960][SQL][TESTS] Support custom Ivy settings file in HiveVersionSuite via environment variable #56003
sarutak wants to merge 2 commits
Conversation
// Path to a custom Ivy settings file for testing in environments that require
// authenticated access to private repositories:
private val ivySettingsPath: Option[String] =
  sys.env.get("SPARK_VERSIONS_SUITE_IVY_SETTINGS")
Is the following not working?
$ mvn test -Dspark.jars.ivySettings=xxx
$ sbt -Dspark.jars.ivySettings=xxx test
System properties aren't propagated to forked test JVMs, right?
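As a minimal sketch of the distinction raised here: the lookup below prefers the environment variable, which a child process typically inherits, and falls back to a JVM system property, which a forked JVM only sees if the build forwards it explicitly. `IvySettingsLookup` and `resolve` are hypothetical names and not part of this PR; the two maps stand in for `sys.env` and `sys.props` so the logic can be exercised without forking.

```scala
// Hypothetical helper, not part of the PR: illustrates preferring an
// environment variable over a system property when tests run in a forked JVM.
object IvySettingsLookup {
  val EnvKey = "SPARK_VERSIONS_SUITE_IVY_SETTINGS"
  val PropKey = "spark.jars.ivySettings"

  // `env` and `props` stand in for sys.env and sys.props.toMap.
  // The environment variable wins; the system property is only a fallback.
  def resolve(env: Map[String, String], props: Map[String, String]): Option[String] =
    env.get(EnvKey).orElse(props.get(PropKey))
}

object IvySettingsLookupDemo {
  def main(args: Array[String]): Unit = {
    // Against the real process state of the current JVM:
    println(IvySettingsLookup.resolve(sys.env, sys.props.toMap))
  }
}
```

The fallback to the system property is an assumption added for illustration; the PR itself reads only the environment variable.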
    hadoopConf: Configuration,
    extraConf: Map[String, String] = Map.empty): HiveClient = {
  val sparkConf = new SparkConf()
  ivySettingsPath.foreach(sparkConf.set(MavenUtils.JAR_IVY_SETTING_PATH_KEY, _))
Maybe, set -> setIfMissing?
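To make the difference concrete, here is a small sketch comparing the two calls on `SparkConf`. It needs spark-core on the classpath. The key literal is assumed to be the value behind `MavenUtils.JAR_IVY_SETTING_PATH_KEY`, and the paths are placeholders.

```scala
import org.apache.spark.SparkConf

object SetVsSetIfMissing {
  // Assumed to match MavenUtils.JAR_IVY_SETTING_PATH_KEY.
  val key = "spark.jars.ivySettings"

  // loadDefaults = false keeps spark.* system properties out of this example.
  val explicit: SparkConf = new SparkConf(false).set(key, "/explicit/ivysettings.xml")

  // set replaces whatever value is already stored under the key.
  val overwritten: SparkConf = explicit.clone.set(key, "/from-env/ivysettings.xml")

  // setIfMissing leaves an existing value untouched and only fills a gap.
  val preserved: SparkConf = explicit.clone.setIfMissing(key, "/from-env/ivysettings.xml")

  def main(args: Array[String]): Unit = {
    println(overwritten.get(key))
    println(preserved.get(key))
  }
}
```

Because `new SparkConf()` in the diff above loads `spark.*` system properties by default, `setIfMissing` would let a value supplied that way take precedence over the environment variable, whereas `set` always overrides it. Which precedence is wanted is a design choice for the PR.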
dongjoon-hyun left a comment
Got it, @sarutak.
If we need this, please rename SPARK_VERSIONS_SUITE_IVY_SETTINGS so that it contains the _TEST_ substring. In general, we distinguish test environment variables from normal environment variables (e.g., SPARK_ANSI_SQL_MODE), as in the following examples.
- SPARK_TEST_CACHE_DIR_SYSTEM_PROPERTY
- SPARK_TEST_HADOOP_CONF_DIR
- SPARK_TEST_SPARK_BLOOM_FILTER_SUITE_ENABLED
- SPARK_TEST_SQL_SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD
- SPARK_TEST_SQL_RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD
- SPARK_TEST_HIVE_SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD
- SPARK_TEST_HIVE_RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD
- PYSPARK_TEST_TIMEOUT
The same applies to properties:
- spark.test.cache-dir
- spark.test.home
- spark.sql.test.master
- spark.sql.test.randomDataGenerator.maxStrLen
- spark.sql.test.randomDataGenerator.maxArraySize
- spark.sql.test.version.index
- spark.sql.test.expectedWarehouseDir
- spark.sql.test.expectedMetastoreURL
- spark.kubernetes.test.deployMode
- spark.kubernetes.test.kubeConfigContext
- spark.kubernetes.test.master
- spark.kubernetes.test.namespace
- spark.kubernetes.test.serviceAccountName
- spark.kubernetes.test.jvmImage
...
cc @peter-toth, too
What changes were proposed in this pull request?
This PR adds support for the SPARK_VERSIONS_SUITE_IVY_SETTINGS environment variable in HiveClientBuilder. When set, its value is used as the path to a custom Ivy settings file (spark.jars.ivySettings) for Hive metastore jar resolution during tests.
Why are the changes needed?
After SPARK-56867 (#55881) added support for spark.jars.ivySettings in IsolatedClientLoader, the test infrastructure should also be able to leverage this configuration. In closed environments where only authenticated private repositories are accessible, tests that download Hive metastore jars (e.g., HiveClientSuite, HivePartitionFilteringSuite) would fail without a way to specify credentials. This environment variable allows such environments to provide a custom Ivy settings file with the necessary authentication configuration.
This follows the same pattern as the existing SPARK_VERSIONS_SUITE_IVY_PATH environment variable, which allows specifying a custom Ivy cache path for the same set of tests.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Confirmed it works with a private repository which requires authentication.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude (via Kiro CLI, auto model selection)