After a major compaction, not all regions show a high locality ratio (some are 1, but some are still 0).
The user has already set the following in the cluster:
1. spark.dynamicAllocation.minExecutors = [2 x Number of Region Servers]
2. spark.locality.wait = 60s
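As a minimal sketch of how the two settings above could be derived for a given cluster, the helper below builds them from the region server count. The function name `locality_settings` and the example count of 5 region servers are hypothetical and only illustrate the "2 x Number of Region Servers" rule; how the values are actually applied to the cluster is not shown here.

```python
def locality_settings(num_region_servers: int, wait: str = "60s") -> dict:
    """Build the two Spark settings listed above (hypothetical helper)."""
    if num_region_servers < 1:
        raise ValueError("num_region_servers must be at least 1")
    return {
        # Rule from the ticket: 2 x number of region servers.
        "spark.dynamicAllocation.minExecutors": str(2 * num_region_servers),
        # How long Spark waits for a data-local slot before relaxing locality.
        "spark.locality.wait": wait,
    }


if __name__ == "__main__":
    # Assumed example: a cluster with 5 region servers.
    for key, value in locality_settings(5).items():
        print(f"{key}={value}")
```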
A higher locality ratio helps query performance. The user typically wants to run major compactions on a regular cadence, analyze tables, then query.
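When verifying the behavior after a compaction, it may help to list which regions are still below a target locality. The sketch below assumes the per-region locality ratios have already been collected into a dictionary; the region names, the sample values, and the 0.9 threshold are all hypothetical.

```python
def low_locality_regions(localities: dict, threshold: float = 0.9) -> list:
    """Return the names of regions whose locality ratio is below threshold."""
    return sorted(name for name, ratio in localities.items() if ratio < threshold)


if __name__ == "__main__":
    # Hypothetical sample mirroring the report: some regions at 1, some at 0.
    sample = {"region-a": 1.0, "region-b": 0.0, "region-c": 1.0, "region-d": 0.0}
    print(low_locality_regions(sample))
```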
[Template: please modify accordingly]
Testing considerations and plan:
1. User impact/behavior changes (e.g., error message change)
2. Variation consideration (e.g., Count/Sum/Max/Min/Avg, Parquet/ORC formats)
3. Platform consideration (Cloudera/HDP/MapR, Kerberos/LDAP, Spark versions)
4. Upgrade/downgrade compatibility (e.g., data dictionary changes)
5. Client connectivity impact (JDBC/ODBC)
6. Configuration changes needed