StarRocks version 4.0
Downgrade Notes
- After upgrading StarRocks to v4.0, DO NOT downgrade it directly to v3.5.0 or v3.5.1. Doing so causes metadata incompatibility and FE crashes. Downgrade the cluster to v3.5.2 or later to prevent these issues.
- Before downgrading clusters from v4.0.2 to v4.0.1, v4.0.0, or v3.5.2 through v3.5.10, execute the following statement:
SET GLOBAL enable_rewrite_simple_agg_to_meta_scan=false;
After upgrading the cluster back to v4.0.2 or later, execute the following statement:
SET GLOBAL enable_rewrite_simple_agg_to_meta_scan=true;
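The two downgrade rules above can be restated as a small decision helper. The Python sketch below is illustrative only: parse_version and downgrade_plan are hypothetical names, not part of StarRocks, and the sketch assumes the second rule applies to v4.0.2 and every later v4.0.x release, which these notes imply but do not state outright. It covers only the version ranges named above.

```python
def parse_version(text):
    """Turn a version string such as 'v3.5.10' into a comparable tuple (3, 5, 10)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def downgrade_plan(current, target):
    """Return (allowed, disable_meta_scan_rewrite_first) for a downgrade.

    Hypothetical helper: it encodes only the two rules in these notes.
    """
    cur, tgt = parse_version(current), parse_version(target)
    if tgt >= cur:
        raise ValueError("target must be older than the current version")

    # Rule 1: a v4.0 cluster must not go directly to v3.5.0 or v3.5.1.
    allowed = not (cur >= (4, 0, 0) and tgt in {(3, 5, 0), (3, 5, 1)})

    # Rule 2 (assumed to hold for v4.0.2 and later): turn off
    # enable_rewrite_simple_agg_to_meta_scan before moving to v4.0.1,
    # v4.0.0, or v3.5.2 through v3.5.10.
    older_v40 = tgt in {(4, 0, 0), (4, 0, 1)}
    affected_v35 = (3, 5, 2) <= tgt <= (3, 5, 10)
    disable_first = cur >= (4, 0, 2) and (older_v40 or affected_v35)

    return allowed, disable_first


print(downgrade_plan("v4.0.10", "v3.5.1"))  # (False, False)
print(downgrade_plan("v4.0.2", "v3.5.10"))  # (True, True)
```

When the second value is True, run the SET GLOBAL statement that disables the variable before starting the downgrade.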
4.0.10
Release Date: May 9, 2026
Behavior Changes
- Cloud storage credentials are now redacted in error messages produced by INSERT INTO FILES, preventing accidental exposure of secrets in error logs and SHOW LOAD output. #71245
- StarRocks no longer permits queries against insert-only ACID Hive tables in Hive catalogs. Previously such queries could silently return more rows than were actually visible because INSERT OVERWRITE operations were not recognized. Affected tables now return an explicit error instead of incorrect results. #71460
Improvements
- Added an Avro schema cache in Iceberg PartitionData construction to remove redundant Jackson ObjectMapper allocations during partition load on tables with many partitions. #72215
- Optimized CatalogRecycleBin.getAdjustedRecycleTimestamp to avoid rebuilding the table-id map on every call, reducing recycle-bin cleanup and tablet scheduling overhead. #72128
- OlapTableSink.createLocation now batches tablet-location lookups in shared-data mode, removing per-tablet StarOS RPCs that previously stalled the planner critical section. #72041
- Java UDAF instances are now loaded and initialized once per query and reused across pipeline driver instances, removing the linear driver-preparation overhead at high pipeline_dop. #72038
- Added BE metrics starrocks_be_staros_shard_info_fallback_total and starrocks_be_staros_shard_info_fallback_failed_total to track when the StarOS worker falls back to fetching shard info from starmgr because the local cache missed. #71620
- File-bundle writes now prefer a tablet-local aggregator so the bundled tablet metadata path does not require cross-node shard-info lookups. #71613
- Audit log entries now include the queried tables and views referenced by each query. #71596
- INSERT INTO FILES CSV export now supports csv.enclose and csv.escape properties for controlling field quoting and escaping. #71589
- Added LDAP direct bind authentication via DN pattern, removing the requirement for an admin search account in single-tenant LDAP setups. #71559
- Added the starrocks_fe_tablet_num metric for shared-data clusters to match the shared-nothing metric set. #71444
- star_mgr_meta_sync_interval_sec is now runtime-mutable via ADMIN SET FRONTEND CONFIG; the new interval takes effect on the next sync cycle without an FE restart. #71675
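The csv.enclose and csv.escape properties listed above control field quoting and escaping. As a rough illustration of what those two roles mean in CSV output, the sketch below uses Python's standard csv module, not StarRocks itself. Mapping csv.enclose to quotechar and csv.escape to escapechar is an assumption made for this analogy, render_csv_row is a hypothetical helper, and StarRocks may differ in detail (for example, in which fields it encloses).

```python
import csv
import io


def render_csv_row(fields, enclose='"', escape="\\"):
    """Render one CSV line, wrapping every field in `enclose` and
    prefixing embedded enclose characters with `escape`."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        quotechar=enclose,
        escapechar=escape,
        doublequote=False,      # use the escape character instead of doubling quotes
        quoting=csv.QUOTE_ALL,  # enclose every field
        lineterminator="\n",
    )
    writer.writerow(fields)
    return buffer.getvalue()


# A field containing the delimiter and a field containing the enclose character.
print(render_csv_row(["a,b", 'say "hi"']), end="")
```

The delimiter inside the first field is protected by the enclosing quotes, while the quote characters inside the second field are protected by the escape character.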
Bug Fixes
The following issues have been fixed:
- A race in shared-data combined txn log mode where INSERT into per-partition coordinator dispatch could classify legitimate txn logs as orphan and drop them, leaving the transaction stuck in non-VISIBLE state. #72237
- An issue where _incremental_open_node_channel channels in shared-data combined txn log mode silently dropped txn logs because the legacy "sender_id == 0 collects all logs" rule did not apply to incremental channels. #71992
- An issue where RuntimeProfile::to_thrift() could crash BE with std::bad_optional_access when another thread reset counter min/max values during profile serialization. #72904
- An inconsistency in flat JSON merge results when one side contributed empty values. #72973
- An issue where CREATE TABLE for an Iceberg table failed with "Multiple entries with same key: format-version" when the user explicitly specified format-version in PROPERTIES. #72828
- A CompactionScheduler.startCompaction lock scope that held a DB-wide READ lock across single-table critical work, blocking concurrent DDL on other tables in the same database. The scheduler now takes an IS lock on the DB plus a READ lock on the target table. #72178
- An issue where StarMgrMetaSyncer.syncTableMetaInternal and syncTableColocationInfo held DB READ/WRITE locks across external StarOS RPCs, freezing CREATE/DROP/ALTER/RENAME on every table in the database for the duration of each RPC. #72108
- An issue where StarMgrMetaSyncer.getAllPartitionShardGroupId held the DB READ lock for full iteration over all cloud-native tables and physical partitions, stalling FE threads waiting for the DB write lock on large catalogs. #71614
- A redundant DB READ lock in getTableNamesViewWithLock. The underlying nameToTable is a ConcurrentHashMap, so the enclosing lock added contention without correctness benefit. #72042
- A DB WRITE lock in the read-only /api/{db}/{table}/_count REST endpoint that was unnecessary for computing proximateRowCount(). #72053
- A batch publish deadlock caused by partition version gaps that operations like tablet split, schema change, and alter jobs reserved by advancing nextVersion without a matching publish. #71483
- A deadlock in shared-nothing mode when warming up the LRU cache for rowset metadata while the cache was full. #71459
- A PipelineTimerTask that could remain stuck in waitUtilFinished due to incorrect ordering between consumer registration and finished signaling. #72058
- A race condition in ConnectorSinkPassthroughExchanger::accept that crashed BE with SIGSEGV via out-of-bounds vector access on _writer_count. #71848
- A use-after-free in LoadChannel::get_load_replica_status caused by destruction of a temporary shared_ptr. #71843
- A use-after-free in the information schema sink due to a missing reference count increment in async RPC closure handling. #71513
- A BE crash in reverse(DecimalV3) caused by improper handling of decimal value width. #71834
- A BE crash when UNNEST produced columns whose define-expression carried an ARRAY type, which was incompatible with global dictionary generation downstream. #72027
- An NPE in FE when creating an Iceberg external table with invalid transform argument order such as bucket(4, region); FE now returns a normal analyzer error. #71917
- An issue where Iceberg manifest data file cache entries were missing column statistics when the first query against a table did not request stats (for example SELECT *). #71913
- An issue where the Iceberg min/max optimization was silently skipped when the table was partitioned by bucket(col, N) because PruneHDFSScanColumnRule injected a placeholder materialized column. #71863
- An issue where AggregateJoinPushDownRule failed to rewrite materialized views over Iceberg base tables because Table.getId() was compared instead of identity, and connector-table ids can shift across plan rebuilds. #71856
- An issue where INSERT OVERWRITE into Hive dynamic partitions failed when the metastore listed a partition whose location no longer existed on the file system; the missing partition directory is now created before commit. #71810
- A Parquet scanner failure (Illegal converting from arrow type(dictionary) ...) when Arrow returned dictionary-typed columns, including dictionaries nested inside arrays, structs, and maps. #71855
- An issue where stale scan ranges from earlier batches persisted across ColocatedBackendSelector.Assignment incremental batches, causing files to be re-deployed and re-scanned. #71789
- An issue where PruneShuffleColumnRule did not update the Join outputProperty after pruning Exchange shuffle columns, leading to incorrect downstream distribution. #72003
- Incorrect shuffle distribution caused by a missing project node when PushDownJoinOnExpressionToChildProject was disabled during the first stage of multi-stage MV rewrite. #71075
- Duplicate Apply attachments in ReplaceSubqueryRewriteRule when predicate normalization made the same scalar-subquery placeholder appear multiple times. #71155
- A short-circuit issue in EventScheduler where a finished join probe could prevent the pipeline from transitioning to the finished state. #71740
- An issue where AWS assume-role configured via aws.s3.iam_role_arn was not applied to JNI scanners (RCFile / Avro / SequenceFile / Hudi), causing S3 403 errors. #71422
- An issue where Oracle JDBC predicate pushdown produced invalid SQL because date literals did not match the Oracle NLS format; literals are now emitted as date '...'. #71412
- An issue in shared-data mode where a follower FE forwarded DDL to the leader and waited only for FE journal replay, missing the StarMgr journal and producing "no queryable replica" errors for queries that immediately followed table creation. #71263
- An issue where get_tablet_stats for Primary Key tablets repeatedly reloaded the entire TabletMetadata for every segment via get_del_vec_in_meta(). #71672
- An Arrow Flight issue where empty result sets returned column names of r because the placeholder name was emitted instead of the actual schema. #71534
- An issue where parallel_clone_task_per_path updates did not include the store-path count when resizing the CLONE thread pool. #71484
- An issue where the resource group user classifier rejected digit-leading usernames that CREATE USER allowed. The classifier now uses the same validation rule as CREATE USER. #71470
- An issue where HttpServerHandler.channelInactive skipped unregisterConnection when isRegistered() was false, leaking connection-map entries for early-failing requests. #72006
- An issue where Java UDF JNI calls (NewObject, NewArray, NewStringUTF, etc.) did not check for exceptions or null returns, leading to silent failures or undefined behavior. #71734
- An issue where be_tablets.DATA_SIZE reported total_disk_size (including rowset-embedded indexes and the persistent PK index for lake PK tablets) instead of rowset column data bytes. #70735
- A noisy "Failed to batch drop tablets" warning printed by StarMgrMetaSyncer even when there were no shards to delete. #72209
- CVE-2026-42198 (pgjdbc) and CVE-2026-5598 (BouncyCastle): bumped org.postgresql:postgresql to 42.7.11 and BouncyCastle to 1.84. #72797
- CVE in netty: upgraded netty to 4.1.133.Final. #72905
- Cleaned broker CVEs by upgrading netty / jetty / awssdk / jackson dependencies in the broker. #72184
- Upgraded jetty-http to 9.4.58.v20250814 to address known CVEs in the previous jetty-http version. #71762
- Temporarily masked CVE-2026-2332 to unblock the build, since jetty 9.x is EOL and no upstream fix is published. #71914
4.0.9
Release Date: April 16, 2026
Behavior Changes
- When VARBINARY columns appear inside nested types (ARRAY, MAP, or STRUCT), StarRocks now correctly encodes the values in binary format in MySQL result sets. Previously, raw bytes were emitted directly, which could break text-protocol parsing for null bytes or non-printable characters. This change may affect downstream clients or tools that process VARBINARY data inside nested types. #71346
- Routine Load jobs now automatically pause when a non-retryable error is encountered, such as a row causing the Primary Key size limit to be exceeded. Previously, the job would retry indefinitely because such errors were not recognized as non-retryable by the FE transaction status handler. #71161
- SHOW CREATE TABLE and DESC statements now display the Primary Key columns for Paimon external tables. #70535
- Cloud-native tablet metadata fetch operations (such as get_tablet_stats and get_tablet_metadatas) now use a dedicated thread pool instead of the shared UPDATE_TABLET_META_INFO pool. This prevents metadata fetch contention from impacting repair and other tasks. The new thread pool size is configurable via a new BE parameter. #70492
Improvements
- Added session variables to control the encoding behavior of VARBINARY values in MySQL protocol responses, providing fine-grained control over binary result encoding in client connections. #71415
- Added a snapshot_meta.json marker file to cluster snapshots to support integrity validation before snapshot restoration. #71209
- Added warning logs for silently swallowed exceptions in WarehouseManager to improve observability of silent failures. #71215
- Added metrics for Iceberg metadata table queries to support performance monitoring and diagnosis. #70825
- The regexp_replace() function now supports constant folding during FE query planning, reducing planning overhead for queries with constant string arguments. #70804
- Added categorized metrics for Iceberg time travel queries to improve monitoring and performance analysis. #70788
- Added log output when update compaction is suspended, improving visibility into compaction lifecycle. #70538
- SHOW COLUMNS now returns column comments for PostgreSQL external tables. #70520
- Added support for dumping query execution plans when a query encounters an exception, improving diagnosability of runtime failures. #70387
- Tablet deletion during DDL operations is now batched, reducing write lock contention on tablet metadata. #70052
- Added a Force Drop recovery mechanism for synchronous materialized views that are stuck in an error state and cannot be dropped through normal means. #70029
Bug Fixes
The following issues have been fixed:
- An issue where the profile START_TIME and END_TIME were not displayed in the session timezone. #71429
- A shared-object mutation bug in PushDownAggregateRewriter when processing CASE-WHEN/IF expressions, which could cause incorrect query results. #71309
- A use-after-free bug in ThreadPool::do_submit triggered when thread creation fails. #71276
- An issue where information_schema.tables did not properly escape special characters in equality predicates, causing incorrect results. #71273
- An issue where the materialized view scheduler continued to run after the materialized view became inactive. #71265
- A task signature collision in UpdateTabletSchemaTask across concurrent ALTER jobs that could cause schema update tasks to be skipped. #71242
- An issue where row count estimation produced NaN values for histograms that contained only MCV (Most Common Values) entries. #71241
- A missing dependency on the AWS S3 Transfer Manager in the AWS SDK integration. #71230
- An issue where TaskManager scheduler callbacks did not verify whether the current node is the leader, potentially causing duplicate task execution on follower nodes. #71156
- A thread-local context pollution issue where ConnectContext information was not cleared after a leader-forwarded request completed. #71141
- An issue where the partition predicate was missing in short-circuit point lookups, causing incorrect query results. #71124
- A NullPointerException when analyzing generated columns during Stream Load or Broker Load if a column referenced by the generated column expression was absent from the load schema. #71116
- A use-after-free bug in the error handling path of parallel segment and rowset loading. #71083
- An issue where delvec orphan entries were left behind when a write operation preceded compaction in the same publish batch. #71049
- An issue where queries appeared in the current_queries result via HTTP loopback when checking query progress internally. #71032
- CVE-2026-33870 and CVE-2026-33871. #71017
- A read lock leak in SharedDataStorageVolumeMgr. #70987
- An issue where the input and result columns of the locate() function shared the same NullColumn reference inside BinaryColumns, causing incorrect results. #70957
- An issue where safe tablet deletion checks were incorrectly applied during ALTER operations in shared-nothing mode. #70934
- A race condition in _all_global_rf_ready_or_timeout that could prevent global runtime filters from being applied correctly. #70920
- An int32 overflow in the ACCUMULATED metric macro that caused metric values to silently wrap around. #70889
- Incorrect aggregation results in dictionary-encoded merge GROUP BY queries. #70866
- CVE-2025-54920. #70862
- A potential data loss issue in aggregation spill caused by incorrect hash table state handling during set_finishing. #70851
- An issue where the content-length header was not reset when proxy_pass_request_body is disabled. #70821
- An issue where the spill directory for load operations was cleaned up in the object destructor rather than during DeltaWriter::close(), potentially causing premature deletion of spill data. #70778
- An issue where INSERT INTO ... BY NAME from FILES() did not correctly push down the schema for partial column sets. #70774
- An issue where connector scan nodes did not reset the scan range source on query retry, causing incorrect results upon retry. #70762
- A potential rowset metadata loss for Primary Key model tablets caused by a GC race during disk re-migration of the form A→B→A. #70727
- An issue where a query-scoped warehouse hint leaked the ComputeResource object in ConnectContext, potentially affecting subsequent queries on the same connection. #70706
- An issue where redundant conjuncts in MySqlScanNode and JDBCScanNode caused BE errors related to VectorizedInPredicate type mismatches. #70694
- A missing libssl-dev dependency in the Ubuntu runtime environment. #70688
- An issue where Iceberg manifest cache completeness was not validated on read, leading to incorrect scan results when the cache was partially populated. #70675
- A duplicate closure reference in _tablet_multi_get_rpc that could cause use-after-free. #70657
- Partial manifest cache writes in the Iceberg ManifestReader that could result in incomplete cache entries and incorrect scan behavior. #70652
- A crash in array_map() when processing arrays that contain null literal elements. #70629
- A stack overflow in the to_base64() function when processing large inputs. #70623
- An issue where INSERT INTO ... BY NAME from FILES() used positional column mapping instead of name-based mapping, causing data to be written to incorrect columns. #70622
- An issue where NOT NULL constraints were incorrectly pushed down into the schema inferred from FILES(), causing load failures for nullable columns. #70621
- An issue where precise external materialized view refresh did not fall back correctly for Iceberg-like connectors. #70589
- A num_short_key_columns mismatch when constructing a partial tablet schema, which could cause data read errors. #70586
- A BE crash that occurred when the child iterator was exhausted in MaskMergeIterator. #70539
- An issue where materialized view refresh jobs repeatedly refreshed partitions whose corresponding Iceberg snapshots had expired. #70523
- An issue where starlet configuration parameters could not be set. #70482
- An issue where the lock-free materialized view rewrite path incorrectly fell back to live metadata, causing inconsistent rewrite behavior. #70475
- An issue in JoinHashTable::merge_ht where dummy rows were not skipped for expression-based join key columns, causing incorrect join results. #70465
- An incorrect equality comparison in InformationFunction that could produce wrong results in certain queries. #70464
- A column type mismatch in the __iceberg_transform_bucket internal function. #70443
- An issue where Iceberg materialized view refresh failed when Iceberg snapshot timestamps were non-monotonic. #70382
- An issue where user authentication credentials were exposed in audit logs and SQL redaction output. #70360
- A CN crash that occurred when scanning an empty tablet with physical split enabled. #70281
- An issue where the VARCHAR column length was not preserved after a redundant CAST was eliminated during query optimization. #70269
- An issue where brpc connection retry logic did not correctly handle a wrapped NoSuchElementException, causing connection failures after the retry attempt. #70203
- An issue where null fractions for outer join columns were not preserved during statistics estimation, leading to suboptimal query plans. #70144
- A memory tracker leak in connector sink operations running on poller threads. #70121
4.0.8
Release Date: March 25, 2026
Behavior Changes
- Improved sql_mode handling: when DIVISION_BY_ZERO or FAIL_PARSE_DATE mode is set, division by zero and date parse failures in str_to_date/str2date now return an error instead of being silently ignored. #70004
- When sql_mode is set to FORBID_INVALID_DATE, invalid dates in INSERT VALUES clauses are now correctly rejected instead of being bypassed. #69803
- Expression partition generated columns are now hidden from DESC and SHOW CREATE TABLE output. #69793
- Client ID is no longer included in audit logs. #69383
Improvements
- Added a configuration item local_exchange_buffer_mem_limit_per_driver to limit the local exchange buffer size to dop * local_exchange_buffer_mem_limit_per_driver. #70393
- Cached file existence check results across versions in check_missing_files to reduce redundant storage I/O. #70364
- Allowed disabling split and reverse scan ranges for descending TopN runtime filters when desc_hint_split_range is set to a value ≤ 0. #70307
- Added EXPLAIN and EXPLAIN ANALYZE support for INSERT statements in the Trino dialect. #70174
- Optimized Iceberg read performance when position deletes are present. #69717
- Optimized materialized view best-selector strategy based on distributed keys to improve materialized view selection accuracy. #69679
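The cap introduced by local_exchange_buffer_mem_limit_per_driver is a simple product of the pipeline DOP and the per-driver limit. A minimal sketch of that arithmetic follows. The function name and the sample values are hypothetical, and the unit is assumed to be bytes because these notes do not state it.

```python
def local_exchange_buffer_cap(dop, mem_limit_per_driver):
    """Return dop * per-driver limit, the cap described in the note above."""
    if dop <= 0 or mem_limit_per_driver < 0:
        raise ValueError("dop must be positive and the limit non-negative")
    return dop * mem_limit_per_driver


# Hypothetical values: 8 drivers, 128 MiB allowed per driver.
cap = local_exchange_buffer_cap(8, 128 * 1024 * 1024)
print(cap // (1024 * 1024), "MiB")  # 1024 MiB
```

Because the cap scales linearly with DOP, doubling the degree of parallelism doubles the memory the local exchange buffer may use.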
Bug Fixes
The following issues have been fixed:
- JDBC MySQL pushdown failing for unsupported cast operations. #70415
- Type mismatch issues in materialized view refresh. Added mv_refresh_force_partition_type configuration to force partition type in materialized view refresh. #70381
- dataVersion not set correctly when restoring from backup. #70373
- Duplicated partition names in materialized view refresh tasks. #70354
- Incorrect SLF4J parameterized logging using string concatenation instead of placeholder arguments. #70330
- Comment not set when creating Hive tables. #70318
- FileSystemExpirationChecker blocking on slow HDFS close operations. #70311
- Distribute column validation not applied across different partitions in OlapTableSink. #70310
- Constant folding producing INF instead of an error when double addition overflows. #70309
- Typo in Iceberg table creation: field common was used instead of comment. #70267
- Root user not bypassing all Ranger permission checks in some scenarios. #70254
- query_pool memory tracker going negative during data ingestion. #70228
- AuditEventProcessor thread exiting due to OutOfMemoryException. #70206
- SplitTopNRule not applying partition pruning correctly. #70154
- Out-of-bounds access in cal_new_base_version during schema change publish. #70132
- Materialized view rewrite ignoring dropped partitions from the base table. #70130
- Unexpected partition predicate pruning due to type mismatch in boundary comparisons. #70097
- str_to_date losing microsecond precision in BE runtime. #70068
- Join spill process crashing in set_callback_function. #70030
- Broker Load failing GCS authentication after gcs-connector upgrade to version 3.0.13. #70012
- DCHECK failure in DeltaWriter::close() when called from a bthread context. #69960
- Use-after-free race condition in AsyncDeltaWriter close/finish lifecycle. #69940
- Race condition causing write transaction edit log entry to be missed. #69899
- Known CVE vulnerabilities. #69863
- Follower FE not waiting for journal replay in changeCatalogDb. #69834
- Incorrect LIKE pattern matching with backslash escape sequences. #69775
- Expression analysis failure after renaming a partition column. #69771
- Use-after-free crash in AsyncDeltaWriter::close. #69770
- Crash in local partition TopN execution. #69752
- Incorrect behavior in PartitionColumnMinMaxRewriteRule caused by Partition.hasStorageData. #69751
- Duplicated CSV compression suffix in file sink output filenames. #69749
- lake_capture_tablet_and_rowsets not gated behind an experimental configuration flag. #69748
- Incorrect partition min pruning with shadow partitions. #69641
- Java UDTF/UDAF crashing when method parameters use generic types. #69197
- Per-query metadata not released after query planning, causing FE OOM during concurrent query execution. #68444
- Query-scope warehouse hint leaking ComputeResource in ConnectContext. #70706
- Lock-free materialized view rewrite incorrectly falling back to live metadata. #70475
- Duplicate closure reference in _tablet_multi_get_rpc. #70657
- Infinite recursion in ReplaceColumnRefRewriter. #66974
- NOT NULL constraint incorrectly pushed down to FILES() table function schema. #70621
- num_short_key_columns mismatch in partial tablet schema. #70586
- COLUMN_UPSERT_MODE checksum error in shared-data clusters. #65320
- Column type mismatch for __iceberg_transform_bucket. #70443
- Starlet configuration items not taking effect. #70482
- DCG data not read correctly when switching from column mode to row mode in partial update. #61529
4.0.7
Release Date: March 12, 2026
Behavior Changes
- Disallowed creating materialized views based on Iceberg views. #69471
- Fixed inconsistencies in multi-statement Stream Load transaction behavior. #68542
Improvements
- Added fine-grained trace counters for LakePersistentIndex in the Publish phase. #69640
- Triggered early flush in LakePersistentIndex when rebuild row count exceeds the threshold. #69698
- Added dump_lake_persistent_index_sst operation to meta_tool. #69682
- Improved REPAIR TABLE functionality and SHOW TABLET status display. #69656
- Supports ADMIN SHOW TABLET STATUS for cloud-native tables. #69616
- Upgraded hadoop-client from 3.4.2 to 3.4.3. #69503
- Prevented crashes when deserialization mismatches occur. #69481
- Pushed down predicates to FE when querying information_schema.loads. #69472
- Optimized the SQL displayed for materialized view refresh TaskRuns. #69437
- Bypassed caching in CachingIcebergCatalog when vended credentials are enabled. #69434
- Uses tryLock with timeout in canTxnFinished to reduce lock contention. #69427
- Added a global readonly variable @@run_mode. #69247
- Uses Estimator to estimate cache entry weight in DeltaLakeMetastore. #69244
- Added resource share type support for AWS Glue GetDatabases API. #69056
- Extracted range predicates from scalar-subqueries containing convert_tz. #69055
- Added ByteBuffer Estimator. #69042
- Supports fast cancel for Lake DeltaWriter in shared-data clusters. #68877
- Supports an interface to add physical partitions for random distribution tables. #68503
- Gated SQL transactions behind the session variable enable_sql_transaction (default: true). #63535
- Added partition scan number limit when querying external tables. #68480
Bug Fixes
The following issues have been fixed:
- Incorrect value for the metric g_publish_version_failed_tasks when the resource is busy. #69526
- NPE in IcebergCatalog.getPartitionLastUpdatedTime when the snapshot has expired. #68925
- DCHECK failure in DeltaWriter::close() when called from bthread context. #70057
- Several use-after-free issues. #69968
- Use-after-free race in AsyncDeltaWriter close/finish lifecycle. #69961
- Corrupted cache for PK SST tables is not cleared. #69693
- AsyncFlushOutputStream use-after-free issue. #69688
- Retention clock reset issue and incomplete scan in disableRecoverPartitionWithSameName. #69677
- NPE in StreamLoadMultiStmtTask.cancelAfterRestart after deserialization. #69662
- Unnecessary RPCs and metadata queries caused by the incorrect logic of SchemaBeTabletsScanner. #69645
- Graceful exit caused different transactions to publish the same version. #69639
- TabletUpdates::get_column_values crashes with SIGSEGV when the Primary Key Index contains stale entries that point to rowsets that have been compacted. #69617
- KILL ANALYZE fails to stop ANALYZE TABLE tasks. #69592
- Unexpected behavior because not all exceptions of RowGroupWriter are caught. #69568
- TaskRun warehouse display issues after changing the warehouse for the materialized view. #69567
- Sort key does not include newly added key columns after schema change on aggregate and unique tables. #69529
- Issues caused by isInternalCancelError using equals. #69523
- TaskManager scheduling bugs after ALTER MATERIALIZED VIEW. #69504
- Pipeline will be blocked or crash because not all exceptions of ParquetFileWriter::close are caught. #69492
- Materialized view force refresh bugs for partitioned tables. #69488
- Incorrect status was returned when certain writers failed to flush data. #69473
- Rowset files were deleted when Primary Key tablets were moved to trash. #69438
- INSERT failure when the range of an automatic partition is enclosed by an existing merged partition. #69429
- Materialized view tablet meta inconsistency between FE leader and follower. #69428
- Concurrency bugs related to function fields. #69315
- Premature deletion of data because rollup handler's active transaction ID is not considered in computeMinActiveTxnId. #69285
- Lock leak in addPartitions caused by name-based table lookup after concurrent SWAP. #69284
- DROP FUNCTION IF EXISTS ignored the ifExists flag. #69216
- Inconsistent behavior between StarRocks and MySQL-compatible syntax when handling CAST(... AS SIGNED) in TypeParser. #69181
- Issue with case-insensitive partition lookup in query table copy. #69173
- Missing aggregate function when MIN/MAX stats rewrite failed. #69149
- CVE-2025-67721. #69138
- All-null value handling bug in synchronous materialized views. #69136
- Projection loss in materialized view rewrite due to shared mutable state. #69063
- Incorrect estimation of the Iceberg cache Weigher. #69058
- FULL OUTER JOIN USING issue with constant subqueries. #69028
- Issue with DISTINCT ORDER BY alias resolution for duplicated constants. #69014
- Issue when materialized view visits external catalog on reload. #68926
- NPE in Iceberg getPartitions. #68907
- The container was not properly included in the Azure ABFS/WASB FileSystem cache key. #68901
- Erroneous query results after modifying CHAR column length in shared-data clusters. #68808
- Case-insensitive issue with username in LDAP authentication. #67966
- Partitions could not be created after adding storage_cooldown_ttl to a table. #60290
4.0.6
Release Date: February 14, 2026
Improvements
- Supports Partition Transforms with parentheses when creating Iceberg tables (for example, PARTITION BY (bucket(k1, 3))). #68945
- Removed the restriction that partition columns in Iceberg tables must be at the end of the column list; they can now be defined at any position. #68340
- Introduced host-level sorting for Iceberg table sink, controlled by the system variable connector_sink_sort_scope (Default: FILE), to organize data layout for better read performance. #68121
- Improved error messages for Iceberg partition transform functions (for example, bucket, truncate) when the argument count is incorrect. #68349
- Refactored table property handling to improve support for different file formats (ORC/Parquet) and compression codecs in Iceberg tables. #68588
- Added table-level query timeout configuration table_query_timeout for fine-grained control (Priority: Session > Table > Cluster). #67547
- Supports the ADMIN SHOW AUTOMATED CLUSTER SNAPSHOT statement to view automated snapshot status and schedule. #68455
- Supports displaying the original user-defined SQL with comments in SHOW CREATE VIEW. #68040
- Exposed Merge Commit-enabled Stream Load tasks in information_schema.loads for better observability. #67879
- Introduced FE memory estimation utility API /api/memory_usage. #68287
- Reduced unnecessary logging in CatalogRecycleBin during partition recycling. #68533
- Triggered refresh of related asynchronous materialized views when the base table undergoes Swap/Drop/Replace Partition operations. #68430
- Supports VARBINARY type for count distinct-like aggregate functions. #68442
- Enhanced expression statistics to propagate histogram MCV for semantics-safe expressions (for example, cast(k as bigint) + 10) to improve skew detection. #68292
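The priority order given for table_query_timeout (Session > Table > Cluster) amounts to taking the first level that has a value. The sketch below illustrates that resolution order only. The function name is hypothetical, and representing "not configured" as None is an assumption; how StarRocks decides that a level is unset is not described in these notes.

```python
from typing import Optional


def effective_query_timeout(session: Optional[int],
                            table: Optional[int],
                            cluster: int) -> int:
    """Pick the first configured timeout in Session > Table > Cluster order."""
    for candidate in (session, table):
        if candidate is not None:
            return candidate
    # The cluster-level value is treated as always present.
    return cluster


# Hypothetical values in seconds.
print(effective_query_timeout(None, 120, 300))  # table-level value wins: 120
print(effective_query_timeout(60, 120, 300))    # session-level value wins: 60
```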
Bug Fixes
The following issues have been fixed:
- Potential crashes in Skew Join V2 runtime filters. #67611
- Join predicate type mismatch (for example, INT = VARCHAR) caused by low-cardinality rewriting. #68568
- Issues in query queue allocation time and pending timeout logic. #65802
- unique_id conflict for Flat JSON extended columns after schema changes. #68279
- Concurrent partition access issues in OlapTableSink.complete(). #68853
- Incorrect metadata tracking when restoring manually downloaded cluster snapshots. #68368
- Double slashes in backup paths when the repository location ends with /. #68764
- OBS AK/SK credentials in the SHOW CREATE CATALOG output were not masked. #65462
4.0.5
Release Date: February 3, 2026
Improvements
- Bumped Paimon version to 1.3.1. #67098
- Restored missing optimizations in DP statistics estimation to reduce redundant calculations. #67852
- Improved pruning in DP Join reorder to skip expensive candidate plans earlier. #67828
- Optimized JoinReorderDP partition enumeration to reduce object allocation and added an atom count cap (≤ 62). #67643
- Optimized DP join reorder pruning and added checks to BitSet to reduce stream operation overhead. #67644
- Skipped predicate column statistics collection during DP statistics estimation to reduce CPU overhead. #67663
- Optimized correlated Join row count estimation to avoid repeatedly building Statistics objects. #67773
- Reduced memory allocations in Statistics.getUsedColumns. #67786
- Avoided redundant Statistics map copies when only row counts are updated. #67777
- Skipped aggregate pushdown logic when no aggregation exists in the query to reduce overhead. #67603
- Improved COUNT DISTINCT over windows, added support for fused multi-distinct aggregations, and optimized CTE generation. #67453
- Supports map_agg function in the Trino dialect. #66673
- Supports batching retrieval of LakeTablet location information during physical planning to reduce RPC calls in shared-data clusters. #67325
- Added a thread pool for Publish Version transactions to shared-nothing clusters to improve concurrency. #67797
- Optimized LocalMetastore locking granularity by replacing database-level locks with table-level locks. #67658
- Refactored MergeCommitTask lifecycle management and added support for task cancellation. #67425
- Supports intervals for automated cluster snapshots. #67525
- Automatically cleaned up unused mem_pool entries in MemTrackerManager. #67347
- Ignored information_schema queries during warehouse idle checks. #67958
- Supports dynamically enabling global shuffle for Iceberg table sinks based on data distribution. #67442
- Added Profile metrics for connector sink modules. #67761
- Improved the collection and display of load spill metrics in Profiles, distinguishing between local and remote I/O. #67527
- Changed Async-Profiler log level to Error to avoid repeating warning logs. #67297
- Notified Starlet during BE shutdown to report SHUTDOWN status to StarMgr. #67461
Bug Fixes
The following issues have been fixed:
- Missing support for legal simple paths containing hyphens (-). #67988
- Runtime error when aggregate pushdown occurred on grouping keys involving JSON types. #68142
- Issue where JSON path rewrite rules incorrectly pruned partition columns referenced in partition predicates. #67986
- Type mismatch issue when rewriting simple aggregation using statistics. #67829
- Potential heap-buffer-overflow in partition Joins. #67435
- Duplicate slot_ids introduced when pushing down heavy expressions. #67477
- Division-by-zero error in ExecutionDAG fragment connection caused by missing precondition checks. #67918
- Potential issues caused by fragment parallel prepare for single BE. #67798
- RawValuesSourceOperator terminated incorrectly because it lacked a set_finished method. #67609
- BE crash caused by unsupported DECIMAL256 type (precision > 38) in column aggregators. #68134
- Shared-data clusters lack support for Fast Schema Evolution v2 over DELETE operations by carrying schema_key in requests. #67456
- Shared-data clusters lack support for Fast Schema Evolution v2 over synchronous materialized views and traditional schema changes. #67443
- Vacuum might accidentally delete files when file bundling is disabled during FE downgrade. #67849
- Incorrect graceful exit handling in MySQLReadListener. #67917
4.0.4
Release Date: January 16, 2026
Improvements
- Supports Parallel Prepare for Operators and Drivers, and single-node batch fragment deployment to improve query scheduling performance. #63956
- Optimized deltaRows calculation with lazy evaluation for large partition tables. #66381
- Optimized Flat JSON processing with sequential iteration and improved path derivation. #66941 #66850
- Supports releasing Spill Operator memory earlier to reduce memory usage in group execution. #66669
- Optimized the logic to reduce string comparison overhead. #66570
- Improved skew detection in GroupByCountDistinctDataSkewEliminateRule and SkewJoinOptimizeRule to support histogram and NULL-based strategies. #66640 #67100
- Enhanced Column ownership management in Chunk using Move semantics to reduce Copy-On-Write overhead. #66805
- For shared-data clusters, added FE TableSchemaService and updated MetaScanNode to support Fast Schema Evolution v2 schema retrieval. #66142 #66970
- Supports multi-warehouse Backend resource statistics and parallelism (DOP) calculation for better resource isolation. #66632
- Supports configuring Iceberg split size via StarRocks session variable connector_huge_file_size. #67044
- Supports label-formatted statistics in QueryDumpDeserializer. #66656
- Added an FE configuration lake_enable_fullvacuum (Default: false) to allow disabling Full Vacuum in shared-data clusters. #63859
- Upgraded lz4 dependency to v1.10.0. #67045
- Added fallback logic for sample-type cardinality estimation when row count is 0. #65599
- Validated Strict Weak Ordering property for lambda comparator in array_sort. #66951
- Optimized error messages when fetching external table metadata (Delta/Hive/Hudi/Iceberg) fails, showing root causes. #66916
- Supports dumping pipeline status on query timeout and cancelling with TIMEOUT state in FE. #66540
- Displays matched rule index in SQL blacklist error messages. #66618
- Added labels to column statistics in EXPLAIN output. #65899
- Filtered out "cancel fragment" logs for normal query completions (for example, LIMIT reached). #66506
- Reduced Backend heartbeat failure logs when the warehouse is suspended. #66733
- Supports IF EXISTS in the ALTER STORAGE VOLUME syntax. #66691
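The configuration items named in the list above could be applied roughly as follows. This is a sketch under assumptions: the split size value and its unit (bytes), the storage volume name my_s3_volume, the position of IF EXISTS, and the ability to change lake_enable_fullvacuum at runtime are not confirmed by these notes. If the FE configuration is not mutable at runtime, it would instead need to be set in fe.conf.

```sql
-- Session variable named in the notes; the value and its unit (bytes) are assumptions.
SET connector_huge_file_size = 536870912;

-- FE configuration named in the notes. Assumption: it can be changed dynamically.
ADMIN SET FRONTEND CONFIG ("lake_enable_fullvacuum" = "false");

-- IF EXISTS avoids an error when the storage volume is absent.
-- my_s3_volume is hypothetical, and the placement of IF EXISTS is assumed.
ALTER STORAGE VOLUME IF EXISTS my_s3_volume SET ("enabled" = "true");
```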
Bug Fixes
The following issues have been fixed:
- Incorrect DISTINCT and GROUP BY results under Low Cardinality optimization due to missing withLocalShuffle. #66768
- Rewrite error for JSON v2 functions with Lambda expressions. #66550
- Incorrect application of Partition Join in Null-aware Left Anti Join within correlated subqueries. #67038
- Incorrect row count calculation in the Meta Scan rewrite rule. #66852
- Nullable property mismatched in Union Node when rewriting Meta Scan by statistics. #67051
- BE crash caused by optimization logic for Ranking window functions when PARTITION BY and ORDER BY are missing. #67094
- Potential wrong results in Group Execution Join with window functions. #66441
- Incorrect results from PartitionColumnMinMaxRewriteRule under specific filter conditions. #66356
- Incorrect Nullable property deduction in Union operations after aggregation. #65429
- Crash in percentile_approx_weighted when handling compression parameters. #64838
- Crash when spilling with large string encoding. #61495
- Crash triggered by multiple calls to set_collector when pushing down local TopN. #66199
- Dependency deduction error in LowCardinality rewrite logic. #66795
- Rowset ID leak when rowset commit fails. #66301
- Metacache lock contention. #66637
- Ingestion failure when column-mode partial update is used with conditional update. #66139
- Concurrent import failure caused by Tablet deletion during the ALTER operation. #65396
- Tablet metadata load error due to RocksDB iteration timeout. #65146
- Compression settings were not applied during table creation and Schema Change in shared-data clusters. #65673
- Delete Vector CRC32 compatibility issue during upgrade. #65442
- Status check logic error in file cleanup after clone task failure. #65709
- Abnormal statistics collection logic after INSERT OVERWRITE. #65327 #65298 #65225
- Foreign Key constraints were lost after FE restart. #66474
- Metadata retrieval error after Warehouse deletion. #66436
- Inaccurate Audit Log scan statistics under high selectivity filters. #66280
- Incorrect query error rate metrics calculation logic. #65891
- Potential MySQL connection leaks when tasks exit. #66829
- BE status was not updated immediately on the SIGSEGV crash. #66212
- NPE during LDAP user login. #65843
- Inaccurate error log when switching users in HTTP SQL requests. #65371
- HTTP context leaks during TCP connection reuse. #65203
- Missing QueryDetail in Profile logs for queries forwarded from Follower. #64395
- Missing Prepare/Execute details in Audit logs. #65448
- Crash caused by HyperLogLog memory allocation failures. #66747
- Issue with the trim function memory reservation. #66477 #66428
- CVE-2025-66566 and CVE-2025-12183. #66453 #66362 #67053
- Race condition in Exec Group driver submission. #66099
- Use-after-free risk in Pipeline countdown. #65940
- MemoryScratchSinkOperator hangs when the queue closes. #66041
- Filesystem cache key collision issue. #65823
- Wrong subtask count in SHOW PROC '/compactions'. #67209
- A unified JSON format is not returned in the Query Profile API. #67077
- Improper getTable exception handling that affects the materialized view check. #67224
- Inconsistent output of the Extra column from the DESC statement for native and cloud-native tables. #67238
- Race condition in single-node deployments. #67215
- Log leakage from third-party libraries. #67129
- Incorrect REST Catalog authentication logic that causes authentication failures. #66861
4.0.3
Release Date: December 25, 2025
Improvements
- Supports ORDER BY clauses for STRUCT data types. #66035
- Supports creating Iceberg views with properties and displaying properties in the output of SHOW CREATE VIEW. #65938
- Supports altering Iceberg table partition specs using ALTER TABLE ADD/DROP PARTITION COLUMN. #65922
- Supports COUNT/SUM/AVG(DISTINCT) aggregation over framed windows (for example, ORDER BY/PARTITION BY) with optimization options. #65815
- Optimized CSV parsing performance by using memchr for single-character delimiters. #63715
- Added an optimizer rule to push down Partial TopN to the Pre-Aggregation phase to reduce network overhead. #61497
- Enhanced Data Cache monitoring
- Optimized Sort and Aggregation operators to support rapid memory release in OOM scenarios. #66157
- Added TableSchemaService in FE for shared-data clusters to allow CNs to fetch specific schemas on demand. #66142
- Optimized Fast Schema Evolution to retain history schemas until all dependent ingestion jobs are finished. #65799
- Enhanced filterPartitionsByTTL to properly handle NULL partition values to prevent all partitions from being filtered. #65923
- Optimized FusedMultiDistinctState to clear the associated MemPool upon reset. #66073
- Made ICEBERG_CATALOG_SECURITY property check case-insensitive in Iceberg REST Catalog. #66028
- Added HTTP endpoint GET /service_id to retrieve StarOS Service ID in shared-data clusters. #65816
- Replaced deprecated metadata.broker.list with bootstrap.servers in Kafka consumer configurations. #65437
- Added FE configuration lake_enable_fullvacuum (Default: false) to allow disabling the Full Vacuum Daemon. #66685
- Updated lz4 library to v1.10.0. #67080
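Two of the query-side additions in this list, distinct aggregation over framed windows and ORDER BY on STRUCT columns, can be sketched as follows. The tables events and profiles and their columns are hypothetical and exist only for illustration.

```sql
-- Hypothetical table: events(user_id BIGINT, ts DATETIME, item STRING).
-- Distinct aggregation over a framed window.
SELECT
    user_id,
    ts,
    COUNT(DISTINCT item) OVER (
        PARTITION BY user_id
        ORDER BY ts
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS recent_distinct_items
FROM events;

-- Hypothetical table: profiles(id BIGINT, info STRUCT<age INT, name STRING>).
-- Ordering directly by a STRUCT column.
SELECT id, info
FROM profiles
ORDER BY info;
```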
Bug Fixes
The following issues have been fixed:
- latest_cached_tablet_metadata could cause versions to be incorrectly skipped during batch Publish. #66558
- Potential issues caused by ClusterSnapshot relative checks in CatalogRecycleBin when running in shared-nothing clusters. #66501
- BE crash when writing complex data types (ARRAY/MAP/STRUCT) to Iceberg tables during Spill operations. #66209
- Potential hang in Connector Chunk Sink when the writer's initialization or initial write fails. #65951
- Connector Chunk Sink bug where PartitionChunkWriter initialization failure caused a null pointer dereference during close. #66097
- Setting a non-existent system variable would silently succeed instead of reporting an error. #66022
- Bundle metadata parsing failure when Data Cache is corrupted. #66021
- MetaScan returned NULL instead of 0 for count columns when the result is empty. #66010
- SHOW VERBOSE RESOURCE GROUP ALL displays NULL instead of default_mem_pool for resource groups created in earlier versions. #65982
- A RuntimeException during query execution after disabling the flat_json table configuration. #65921
- Type mismatch issue in shared-data clusters caused by rewriting min/max statistics to MetaScan after Schema Change. #65911
- BE crash caused by ranking window optimization when PARTITION BY and ORDER BY are missing. #67093
- Incorrect can_use_bf check when merging runtime filters, which could lead to wrong results or crashes. #67062
- Pushing down runtime bitset filters into nested OR predicates causes incorrect results. #67061
- Potential data race and data loss issues caused by write or flush operations after the DeltaWriter has finished. #66966
- Execution error caused by mismatched nullable properties when rewriting simple aggregation to MetaScan. #67068
- Incorrect row count calculation in the MetaScan rewrite rule. #66967
- Versions might be incorrectly skipped during batch Publish due to inconsistent cached tablet metadata. #66575
- Improper error handling for memory allocation failures in HyperLogLog operations. #66827
4.0.2
Release Date: December 4, 2025
New Features
- Introduced a new resource group attribute, mem_pool, allowing multiple resource groups to share the same memory pool and enforce a joint memory limit for the pool. This feature is backward compatible. default_mem_pool is used if mem_pool is not specified. #64112
Improvements
- Reduced remote storage access during Vacuum after File Bundling is enabled. #65793
- The File Bundling feature caches the latest tablet metadata. #65640
- Improved safety and stability for long-string scenarios. #65433 #65148
- Optimized the SplitTopNAggregateRule logic to avoid performance regression. #65478
- Applied the Iceberg/DeltaLake table statistics collection strategy to other external data sources to avoid collecting statistics when the table is a single table. #65430
- Added Page Cache metrics to the Data Cache HTTP API api/datacache/app_stat. #65341
- Supports ORC file splitting to enable parallel scanning of a single large ORC file. #65188
- Added selectivity estimation for IF predicates in the optimizer. #64962
- Supports constant evaluation of hour, minute, and second for DATE and DATETIME types in the FE. #64953
- Enabled rewrite of simple aggregation to MetaScan by default. #64698
- Improved multiple-replica assignment handling in shared-data clusters for enhanced reliability. #64245
- Exposes cache hit ratio in audit logs and metrics. #63964
- Estimates per-bucket distinct counts for histograms using HyperLogLog or sampling to provide more accurate NDV for predicates and joins. #58516
- Supports FULL OUTER JOIN USING with SQL-standard semantics. #65122
- Prints memory information when Optimizer times out for diagnostics. #65206
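The MetaScan rewrite that this release enables by default is controlled by the variable quoted in the Downgrade Notes at the top of this page. The first two statements below are taken from those notes. The FULL OUTER JOIN USING query is a minimal sketch on hypothetical tables a(id, v) and b(id, w).

```sql
-- From the Downgrade Notes: disable the rewrite before downgrading from v4.0.2,
-- and re-enable it after upgrading back to v4.0.2 or later.
SET GLOBAL enable_rewrite_simple_agg_to_meta_scan=false;
SET GLOBAL enable_rewrite_simple_agg_to_meta_scan=true;

-- FULL OUTER JOIN USING on hypothetical tables a(id, v) and b(id, w).
-- Under SQL-standard semantics, the USING column appears once in the output and
-- carries the non-NULL value from whichever side matched.
SELECT id, a.v, b.w
FROM a
FULL OUTER JOIN b USING (id);
```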
Bug Fixes
The following issues have been fixed:
- DECIMAL56 mod-related issue. #65795
- Issue related to Iceberg scan range handling. #65658
- MetaScan rewrite issues on temporary partitions and random buckets. #65617
- JsonPathRewriteRule uses the wrong table after transparent materialized view rewrite. #65597
- Materialized view refresh failures when partition_retention_condition referenced generated columns. #65575
- Iceberg min/max value typing issue. #65551
- Issue with queries against information_schema.tables and views across different databases when enable_evaluate_schema_scan_rule is set to true. #65533
- Integer overflow in JSON array comparison. #64981
- MySQL Reader does not support SSL. #65291
- ARM build issue caused by SVE build incompatibility. #65268
- Queries based on bucket-aware execution may get stuck for bucketed Iceberg tables. #65261
- Error propagation and memory safety issues caused by missing memory limit checks in OLAP table scans. #65131
Behavior Changes
- When a materialized view is inactivated, the system recursively inactivates its dependent materialized views. #65317
- Uses the original materialized view query SQL (including comments/formatting) when generating SHOW CREATE output. #64318
4.0.1
Release Date: November 17, 2025
Improvements
- Optimized TaskRun session variable handling to process known variables only. #64150
- Supports collecting statistics of Iceberg and Delta Lake tables from metadata by default. #64140
- Supports collecting statistics of Iceberg tables with bucket and truncate partition transform. #64122
- Supports inspecting FE /proc profile for debugging. #63954
- Enhanced OAuth2 and JWT authentication support for Iceberg REST catalogs. #63882
- Improved bundle tablet metadata validation and recovery handling. #63949
- Improved scan-range memory estimation logic. #64158
Bug Fixes
The following issues have been fixed:
- Transaction logs were deleted when publishing bundle tablets. #64030
- The join algorithm could not guarantee the sort property because the sort property was not reset after joining. #64086
- Issues related to transparent materialized view rewrite. #63962
Behavior Changes
- Added the property enable_iceberg_table_cache to Iceberg Catalogs to optionally disable the Iceberg table cache so that queries always read the latest data. #64082
- Ensured INSERT ... SELECT reads the freshest metadata by refreshing external tables before planning. #64026
- Increased lock table slots to 256 and added rid to slow-lock logs. #63945
- Temporarily disabled shared_scan due to incompatibility with event-based scheduling. #63543
- Changed the default Hive Catalog cache TTL to 24 hours and removed unused parameters. #63459
- Automatically determines the Partial Update mode based on the session variable and the number of inserted columns. #62091
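The enable_iceberg_table_cache property mentioned above would be supplied with the other catalog properties. The following is a sketch only: the catalog name and URI are placeholders, and accepting the property at catalog creation time alongside the standard REST catalog properties is an assumption.

```sql
-- Hypothetical Iceberg REST catalog; the name and URI are placeholders.
-- "enable_iceberg_table_cache" = "false" is intended to disable the table cache
-- so that reads see the latest data.
CREATE EXTERNAL CATALOG iceberg_rest_demo
PROPERTIES (
    "type" = "iceberg",
    "iceberg.catalog.type" = "rest",
    "iceberg.catalog.uri" = "http://rest-catalog.example.com:8181",
    "enable_iceberg_table_cache" = "false"
);
```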
4.0.0
Release Date: October 17, 2025
Data Lake Analytics
- Unified Page Cache and Data Cache for BE metadata, and adopted an adaptive strategy for scaling. #61640
- Optimized metadata file parsing for Iceberg statistics to avoid repetitive parsing. #59955
- Optimized COUNT/MIN/MAX queries against Iceberg metadata by efficiently skipping over data file scans, significantly improving aggregation query performance on large partitioned tables and reducing resource consumption. #60385
- Supports compaction for Iceberg tables via procedure rewrite_data_files.
- Supports Iceberg tables with hidden partitions, including creating, writing, and reading the tables. #58914
- Supports setting sort keys when creating Iceberg tables.
- Optimizes sink performance for Iceberg tables.
  - Iceberg Sink supports spilling large operators, global shuffle, and local sorting to optimize memory usage and address small file issues. #61963
  - Iceberg Sink optimizes local sorting based on Spill Partition Writer to improve write efficiency. #62096
  - Iceberg Sink supports global shuffle for partitions to further reduce small files. #62123
- Enhanced bucket-aware execution for Iceberg tables to improve concurrency and distribution capabilities of bucketed tables. #61756
- Supports the TIME data type in the Paimon catalog. #58292
- Upgraded Iceberg version to 1.10.0. #63667
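A hidden-partition table, as mentioned in the list above, might be declared along these lines. The catalog, database, table, and column names are hypothetical. The PARTITION BY (bucket(...)) form follows the fragment quoted in the v4.0.6 notes. The partition source column is placed last because releases before v4.0.6 required partition columns at the end of the column list.

```sql
-- Hypothetical Iceberg catalog, database, and table names.
-- The bucket transform hides the partitioning detail from readers and writers:
-- queries filter on id, not on a derived partition column.
CREATE TABLE iceberg_cat.sales_db.events (
    event_time DATETIME,
    payload STRING,
    id BIGINT
)
PARTITION BY (bucket(id, 3));
```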
Security and Authentication
- In scenarios where JWT authentication and the Iceberg REST Catalog are used, StarRocks supports the passthrough of user login information to Iceberg via the REST Session Catalog for subsequent data access authentication. #59611 #58850
- Supports vended credentials for the Iceberg catalog.
- Supports granting StarRocks internal roles to external groups obtained via Group Provider. #63385 #63258
- Added REFRESH privilege to external tables to control the permission to refresh them. #63385
Storage Optimization and Cluster Management
- Introduced the File Bundling optimization for the cloud-native table in shared-data clusters to automatically bundle the data files generated by loading, Compaction, or Publish operations, thereby reducing the API cost caused by high-frequency access to the external storage system. File Bundling is enabled by default for tables created in v4.0 or later. #58316
- Supports Multi-Table Write-Write Transaction to allow users to control the atomic submission of INSERT, UPDATE, and DELETE operations. The transaction supports Stream Load and INSERT INTO interfaces, effectively guaranteeing cross-table consistency in ETL and real-time write scenarios. #61362
- Supports Kafka 4.0 for Routine Load.
- Supports full-text inverted indexes on Primary Key tables in shared-nothing clusters.
- Supports modifying aggregate keys of Aggregate tables. #62253
- Supports enabling case-insensitive processing on names of catalogs, databases, tables, views, and materialized views. #61136
- Supports blacklisting Compute Nodes in shared-data clusters. #60830
- Supports global connection ID. #57256
- Added the recyclebin_catalogs metadata view to Information Schema to display recoverable deleted metadata. #51007
Query and Performance Improvement
- Supports DECIMAL256 data type, expanding the upper limit of precision from 38 to 76 digits. Its 256-bit storage provides better adaptability to high-precision financial and scientific computing scenarios, effectively mitigating DECIMAL128's precision overflow problem in very large aggregations and high-order operations. #59645
- Improved the performance for basic operators. #61691 #61632 #62585 #61405 #61429
- Optimized the performance of the JOIN and AGG operators. #61691
- [Preview] Introduced SQL Plan Manager to allow users to bind a query plan to a query, thereby preventing the query plan from changing due to system state changes (mainly data updates and statistics updates), thus stabilizing query performance. #56310
- Introduced Partition-wise Spillable Aggregate/Distinct operators to replace the original Spill implementation based on sorted aggregation, significantly improving aggregation performance and reducing read/write overhead in complex and high-cardinality GROUP BY scenarios. #60216
- Flat JSON V2:
  - Supports configuring Flat JSON on the table level. #57379
  - Enhanced JSON columnar storage by retaining the V1 mechanism while adding page- and segment-level indexes (ZoneMaps, Bloom filters), predicate pushdown with late materialization, dictionary encoding, and integration of a low-cardinality global dictionary to significantly boost execution efficiency. #60953
- Supports an adaptive ZoneMap index creation strategy for the STRING data type. #61960
- Enhanced query observability:
  - Optimized EXPLAIN ANALYZE output to display the execution metrics by group and by operator for better readability. #63326
  - QueryDetailActionV2 and QueryProfileActionV2 now support JSON format, enhancing cross-FE query capabilities. #63235
  - Supports retrieving Query Profile information across all FEs. #61345
  - SHOW PROCESSLIST statements display Catalog, Query ID, and other information. #62552
  - Enhanced query queue and process monitoring, supporting display of Running/Pending statuses. #62261
- Materialized view rewrites consider the distribution and sort keys of the original table, improving the selection of optimal materialized views. #62830
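To make the DECIMAL256 item above concrete, the query below casts a 50-digit value to a type whose precision exceeds the 38-digit DECIMAL128 limit. It is a minimal sketch: the literal is arbitrary, and casting from a string literal is assumed to work for DECIMAL256 in the same way it does for narrower decimal types.

```sql
-- A 50-digit integer does not fit in DECIMAL128 (maximum precision 38),
-- but fits in a DECIMAL256-backed type with precision up to 76.
SELECT CAST('12345678901234567890123456789012345678901234567890' AS DECIMAL(50, 0)) AS wide_value;
```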
Functions and SQL Syntax
- Added the following functions:
- Provides the following syntactic extensions:
Behavior Changes
- Adjust the logic of the materialized view parameter auto_partition_refresh_number to limit the number of partitions to refresh regardless of auto refresh or manual refresh. #62301
- Flat JSON is enabled by default. #62097
- The default value of the system variable enable_materialized_view_agg_pushdown_rewrite is set to true, indicating that aggregation pushdown for materialized view query rewrite is enabled by default. #60976
- Changed the type of some columns in information_schema.materialized_views to better align with the corresponding data. #60054
- The split_part function returns NULL when the delimiter is not matched. #56967
- Use STRING to replace fixed-length CHAR in CTAS/CREATE MATERIALIZED VIEW to avoid deducing the wrong column length, which may cause materialized view refresh failures. #63114 #62476
- Data Cache-related configurations are simplified. #61640
  - datacache_mem_size and datacache_disk_size are now effective.
  - storage_page_cache_limit, block_cache_mem_size, and block_cache_disk_size are deprecated.
- Added new catalog properties (remote_file_cache_memory_ratio for Hive, and iceberg_data_file_cache_memory_usage_ratio and iceberg_delete_file_cache_memory_usage_ratio for Iceberg) to limit the memory resources used for Hive and Iceberg metadata cache, and set the default values to 0.1 (10%). Adjusted the metadata cache TTL to 24 hours. #63459 #63373 #61966 #62288
- SHOW DATA DISTRIBUTION no longer merges the statistics of all materialized indexes with the same bucket sequence number. It only shows data distribution at the materialized index level. #59656
- The default bucket size for automatic bucket tables is changed from 4GB to 1GB to improve performance and resource utilization. #63168
- The system determines the Partial Update mode based on the corresponding session variable and the number of columns in the INSERT statement. #62091
- Optimized the fe_tablet_schedules view in the Information Schema. #62073 #59813
  - Renamed the TABLET_STATUS column to SCHEDULE_REASON, the CLONE_SRC column to SRC_BE_ID, and the CLONE_DEST column to DEST_BE_ID.
  - The data types of the CREATE_TIME, SCHEDULE_TIME and FINISH_TIME columns have been changed from DOUBLE to DATETIME.
- The is_leader label has been added to some FE metrics. #63004
- Shared-data clusters using Microsoft Azure Blob Storage and Data Lake Storage Gen 2 as object storage will experience Data Cache failure after being upgraded to v4.0. The system will automatically reload the cache.