Commit Graph

7241 Commits

Author SHA1 Message Date
srlch 441f126793
[Enhancement] VacuumFull Implementation (#61602)
Signed-off-by: srlch <linzichao@starrocks.com>
Co-authored-by: Connor Brennan <cbrennan@pinterest.com>
2025-08-18 10:37:20 +08:00
Murphy 3e64a479b4
[Enhancement] optimize GlobalDictCodeColumnIterator::decode_string_dict_codes (#62002)
2025-08-18 10:35:53 +08:00
yan zhang cb98f70a4e
[BugFix] fix parquet array write when split null string (#61999)
Signed-off-by: yan zhang <dirtysalt1987@gmail.com>
2025-08-18 10:21:58 +08:00
Murphy 413d6b9651
[Enhancement] support encode_sort_key function (#61781)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-08-15 15:26:22 +08:00
stdpain 9000728aa5
[BugFix] Correct add query context to context conditions (#61929)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-08-14 19:14:42 +08:00
zhangqiang bb99b62b44
[BugFix] Disable sync_publish for shadow tablet (#61887)
Signed-off-by: sevev <qiangzh95@gmail.com>
2025-08-14 17:21:34 +08:00
Gavin 55cb223971
[Refactor] Introduce a load chunk spiller and refactor the load spill memtable sink based on it. (#61866)
Why I'm doing:
StarRocks supports spilling some intermediate data to disk or object storage when writing to a native table. This avoids writing too many small files under memory pressure.

The same issue also exists when writing to an external table. However, the spill procedure is currently tightly coupled with native tables and cannot be reused directly by external tables.

So it is necessary to introduce a separate module that implements the spill function and can easily be used by both native and external tables.

What I'm doing:
Introduce a load chunk spiller to handle the load and merge functions.
Refactor the spill memtable sink of native table based on the load chunk spiller.

Signed-off-by: GavinMar <yangguansuo@starrocks.com>
2025-08-14 15:14:37 +08:00
Yixin Luo 39cf319bf9
[BugFix] avoid hold tablet shard lock to get compaction score (#61899)
Signed-off-by: luohaha <18810541851@163.com>
2025-08-14 10:36:17 +08:00
meegoo 84243343b8
[Feature] Support multi statement transaction (part1) - stream load (#61362)
Signed-off-by: meegoo <meegoo.sr@gmail.com>
2025-08-14 10:17:58 +08:00
zihe.liu 6b0fd1ee94
[BugFix] Fix NPE for JoinHashTable::mem_usage (#61872)
Signed-off-by: zihe.liu <ziheliu1024@gmail.com>
2025-08-14 10:11:15 +08:00
stdpain 0e12bcc9cf
[BugFix] Fix QueryContext cancel may cause use-after-free (#61897)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-08-14 09:37:11 +08:00
stdpain 5f6cdde3a0
[Enhancement] support group by compressed key (#61632)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-08-14 09:32:20 +08:00
xiangguangyxg 4ef246685f
[Enhancement] Separate path id from physical partition id (#61854)
Signed-off-by: xiangguangyxg <xiangguangyxg@gmail.com>
2025-08-13 17:01:36 +08:00
zhanghe 7b95d648bd
[BugFix] Fix the problem with the number of rebuild file counted. (#61859)
Signed-off-by: edwinhzhang <edwinhzhang@tencent.com>
2025-08-13 16:46:22 +08:00
yan zhang 46c2c0f5af
[BugFix] fix min/max optimization on iceberg on partition columns (#61858)
Signed-off-by: yan zhang <dirtysalt1987@gmail.com>
2025-08-13 14:11:16 +08:00
Murphy 561f99eeac
[Enhancement] Implement function json_contains (#61403)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-08-13 10:42:27 +08:00
zombee0 2a3e4bc8a7
[BugFix] sqlserver doesn't support timeout greater than 65535 (#61719)
Signed-off-by: zombee0 <ewang2027@gmail.com>
2025-08-12 15:46:46 +08:00
zombee0 b9dbceaa2b
[BugFix] set bucket_aware for shuffler (#61801)
Signed-off-by: zombee0 <ewang2027@gmail.com>
2025-08-12 14:13:15 +08:00
wyb 2a611f4a27
[Enhancement] Add tablet info collection time in tablet report (#61643)
Signed-off-by: wyb <wybb86@gmail.com>
2025-08-12 11:53:27 +08:00
srlch 16af4ce6ab
[BugFix] Fix auto increment value lost when partial update in COLUMN_UPSERT_MODE for share nothing (#61341)
Signed-off-by: srlch <linzichao@starrocks.com>
2025-08-12 11:13:00 +08:00
Murphy cc7c240861
[Enhancement] collect global dict for flatjson (#61680)
Signed-off-by: Murphy <mofei@starrocks.com>
2025-08-12 11:00:56 +08:00
trueeyu 1db0060022
[BugFix] Fix the bug in JDBC's processing of the TIME type (#61783)
Signed-off-by: trueeyu <lxhhust350@qq.com>
2025-08-12 10:18:52 +08:00
starrocks-xupeng 26ade294ed
[Enhancement] support write starlet file with tag (#61605)
Signed-off-by: starrocks-xupeng <xupeng@starrocks.com>
Signed-off-by: 絵空事スピリット <wanglichen@starrocks.com>
Co-authored-by: 絵空事スピリット <wanglichen@starrocks.com>
2025-08-11 14:32:43 +00:00
Hongkun Xu f8e7371c48
[Feature] Support MATCH_ANY operator (#60986)
Signed-off-by: Hongkun Xu <xuhongkun666@163.com>
2025-08-11 22:24:13 +08:00
starrocks-xupeng ac8a74a78e
[BugFix] support configure starcache inline cache count limit (#61724)
Signed-off-by: starrocks-xupeng <xupeng@starrocks.com>
2025-08-11 13:49:05 +08:00
Murphy bf792a6455
[BugFix] fix builtin_function fuzzy test (#61530)
What I'm doing:
bits_function: the implementation is wrong
change the static DCHECK to dynamic argument validation for some functions
fix some type mapping error in the logical_type.cpp
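
A minimal sketch of the DCHECK-to-validation change described above, assuming a hypothetical shift helper (the commit does not name the affected functions). `assert` stands in for `DCHECK`: both are compiled out of release builds, so fuzzed arguments reach the unchecked code path there.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Debug-only check (stand-in for DCHECK): under NDEBUG the assert vanishes
// and an out-of-range shift becomes undefined behavior.
uint64_t shift_left_debug_checked(uint64_t value, int64_t shift) {
    assert(shift >= 0 && shift < 64);
    return value << shift;
}

// Dynamic validation: the argument is checked in every build type and an
// empty optional is returned instead of invoking undefined behavior.
// Hypothetical helper; a real function would likely return an error status.
std::optional<uint64_t> shift_left_validated(uint64_t value, int64_t shift) {
    if (shift < 0 || shift >= 64) {
        return std::nullopt;
    }
    return value << shift;
}
```

With the validated form, a fuzz test can pass arbitrary arguments and only needs to check that invalid ones are rejected.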

Signed-off-by: Murphy <mofei@starrocks.com>
2025-08-11 13:39:06 +08:00
eyes_on_me 8b3e28ad43
[BugFix] fix mem alloc issue of AggHashSetOfSerializedKey (#61558)
Signed-off-by: silverbullet233 <3675229+silverbullet233@users.noreply.github.com>
2025-08-11 10:54:42 +08:00
Murphy 51426082f2
[BugFix] fix the heap-use-after-free issue of json_remove (#61714)
2025-08-10 15:00:43 +08:00
zhanghe c2e3ae1d5c
[BugFix] Fix java.time.LocalDate type check. (#61684)
Signed-off-by: edwinhzhang <edwinhzhang@tencent.com>
2025-08-08 15:17:54 +08:00
wyb d6088ff298
[Enhancement] Bump librdkafka to 2.11.0 for kafka 4.0 (#61698)
Signed-off-by: wyb <wybb86@gmail.com>
2025-08-08 06:31:54 +00:00
Murphy e270cf409b
[Enhancement] FlatJSON V2 for lake table (#61663)
Signed-off-by: Murphy <mofei@starrocks.com>
2025-08-08 10:21:05 +08:00
PengFei Li e8982e7797
[Enhancement] Add prepared_timeout configuration for transaction stream load (#61539)
## Why I'm doing:

Currently, users can only configure the timeout for prepared transactions through the global FE configuration `prepared_transaction_default_timeout_second`. This approach lacks flexibility as it requires all transactions to use the same timeout value. Users need the ability to specify different timeout values for different transactions based on their specific requirements, especially in production environments where precise control over transaction lifecycle is crucial.

## What I'm doing:

This PR adds support for the `prepared_timeout` configuration in transaction stream load, allowing users to specify a timeout period for transactions from PREPARED to COMMITTED state. The implementation includes:

**Backend Changes:**
- Added `HTTP_PREPARED_TIMEOUT` constant in `be/src/http/http_common.h`
- Extended `StreamLoadContext` with `prepared_timeout_second` field
- Modified `TransactionMgr` to parse `prepared_timeout` HTTP header
- Updated `StreamLoadExecutor::prepare_txn` to pass timeout to FE
- Enhanced `TransactionState` with `preparedTimeoutMs` field and timeout detection logic
- Updated Thrift interface `TLoadTxnCommitRequest` with `prepared_timeout_second` field

**Frontend Changes:**
- Modified `TransactionLoadAction` to parse `prepared_timeout` parameter
- Updated `TransactionState` with `setPreparedTimeAndTimeout` method
- Enhanced `DatabaseTransactionMgr` and `GlobalTransactionMgr` to handle prepared timeout
- Updated transaction timeout detection logic in `TransactionState::isTimeout`

**Usage Example:**
```bash
# Begin transaction
curl --location-trusted -u root: -H "label:test_txn" -H "timeout:300" -H "db:test_db" -H "table:test_table" \
    -XPOST http://fe_host:8030/api/transaction/begin

# Load data
curl --location-trusted -u root: -H "label:test_txn" -H "db:test_db" -H "table:test_table" \
    -d '1' -XPUT http://fe_host:8030/api/transaction/load

# Prepare transaction with custom timeout (60 seconds)
curl --location-trusted -u root: -H "label:test_txn" -H "db:test_db" \
    -H "prepared_timeout:60" -XPOST http://fe_host:8030/api/transaction/prepare

# Commit transaction
curl --location-trusted -u root: -H "label:test_txn" -H "db:test_db" \
    -XPOST http://fe_host:8030/api/transaction/commit

# View transaction details including PreparedTime and PreparedTimeoutMs
SHOW TRANSACTION WHERE id = <transaction_id>;
+---------------+--------+---------------+-------------------+-------------------+---------------------+---------------------+---------------------+---------------------+---------------------+--------+--------------------+------------+-----------+-------------------+--------+
| TransactionId | Label  | Coordinator   | TransactionStatus | LoadJobSourceType | PrepareTime         | PreparedTime        | CommitTime          | PublishTime         | FinishTime          | Reason | ErrorReplicasCount | ListenerId | TimeoutMs | PreparedTimeoutMs | ErrMsg |
+---------------+--------+---------------+-------------------+-------------------+---------------------+---------------------+---------------------+---------------------+---------------------+--------+--------------------+------------+-----------+-------------------+--------+
| 1633          | test_txn | BE: 127.0.0.1 | VISIBLE           | BACKEND_STREAMING | 2025-08-03 11:02:54 | 2025-08-03 11:03:10 | 2025-08-03 11:03:14 | 2025-08-03 11:03:14 | 2025-08-03 11:03:14 |        | 0                  | [12237]    | 300000    | 60000             |        |
+---------------+--------+---------------+-------------------+-------------------+---------------------+---------------------+---------------------+---------------------+---------------------+--------+--------------------+------------+-----------+-------------------+--------+
```

**Documentation:**
- Updated `Stream_Load_transaction_interface.md` with `prepared_timeout` usage instructions
- Modified `SHOW_TRANSACTION.md` to document new `PreparedTime` and `PreparedTimeoutMs` fields
- Added version information indicating support from 4.0.0 onwards

The feature provides backward compatibility by using the FE configuration `prepared_transaction_default_timeout_second` as the default value when `prepared_timeout` is not specified.
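
The fallback and timeout check described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the function names, the placeholder default of 3600 seconds, and the choice to treat a malformed header like a missing one are all assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Placeholder standing in for the FE config
// prepared_transaction_default_timeout_second (the real default may differ).
constexpr int64_t kDefaultPreparedTimeoutSecond = 3600;

// Parses the value of the prepared_timeout header. Returns std::nullopt when
// the header is absent, empty, not a positive integer, or implausibly large.
std::optional<int64_t> parse_prepared_timeout(const std::optional<std::string>& header) {
    if (!header || header->empty()) return std::nullopt;
    int64_t value = 0;
    for (char c : *header) {
        if (c < '0' || c > '9') return std::nullopt;
        value = value * 10 + (c - '0');
        if (value > 1000000000) return std::nullopt;  // overflow guard
    }
    if (value == 0) return std::nullopt;
    return value;
}

// Effective timeout in milliseconds: the header value if usable, else the default.
int64_t effective_prepared_timeout_ms(const std::optional<std::string>& header) {
    return parse_prepared_timeout(header).value_or(kDefaultPreparedTimeoutSecond) * 1000;
}

// A PREPARED transaction has timed out once more than prepared_timeout_ms
// has elapsed since it entered the PREPARED state.
bool is_prepared_timeout(int64_t prepared_time_ms, int64_t prepared_timeout_ms, int64_t now_ms) {
    assert(prepared_timeout_ms > 0);
    return now_ms > prepared_time_ms + prepared_timeout_ms;
}
```

With `prepared_timeout:60` this yields 60000 ms, matching the `PreparedTimeoutMs` column in the example output above.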

Signed-off-by: PengFei Li <lpengfei2016@gmail.com>
Signed-off-by: 絵空事スピリット <wanglichen@starrocks.com>
Co-authored-by: 絵空事スピリット <wanglichen@starrocks.com>
2025-08-08 09:55:08 +08:00
wyb 0bca11047d
[Enhancement] Reorder tablet_balanced column of partitions_meta system table for better compatibility (#61665)
Signed-off-by: wyb <wybb86@gmail.com>
2025-08-07 18:52:11 +08:00
yan zhang d70d3294f6
[UT] fix query id not found when cancelled (#61667)
Signed-off-by: yan zhang <dirtysalt1987@gmail.com>
2025-08-07 06:54:55 +00:00
Murphy 0a9618db4c
[Feature] implement json_remove (#61394)
Signed-off-by: Murphy <mofei@starrocks.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-08-07 14:50:13 +08:00
stdpain e338f7ac00
[BugFix] Fix min/max by crash when process literal inputs (#61651)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-08-07 14:43:05 +08:00
before-Sunrise 5d59303970
[BugFix] Revert 'Avoid brpc communication when using local pass through' (#61631)
Signed-off-by: before-Sunrise <unclejyj@gmail.com>
2025-08-07 14:26:33 +08:00
Murphy 7dc4d83de8
[Enhancement] FlatJSON-V2: part fe (#61598)
2025-08-07 14:22:43 +08:00
liubotao 44c52750a2
[Enhancement] Shared Data Mode Support Flat Json parameters table-level configs (#61160)
Signed-off-by: liubotao <316945435@qq.com>
2025-08-07 14:08:47 +08:00
zombee0 27b0ef7a22
[Enhancement] local-exchange support bucket-aware execution (#61592)
Signed-off-by: zombee0 <ewang2027@gmail.com>
2025-08-07 11:20:24 +08:00
wyb 3caf4c6f5d
[Enhancement] Show tablet distribution balance statistic (#61549)
Signed-off-by: wyb <wybb86@gmail.com>
2025-08-07 10:57:50 +08:00
srlch 3e1ff80062
[Enhancement] Support System table for dynamic tablet jobs (#61152)
Signed-off-by: srlch <linzichao@starrocks.com>
2025-08-07 09:54:32 +08:00
Xie Bofan 7557647309
[Enhancement] Add header setting support to http_client (#61621)
Signed-off-by: xiebofan <1814739992@qq.com>
2025-08-06 07:25:12 +00:00
Yixin Luo 35355c82cb
[Enhancement] Sort while constructing the mapping from source file rowids to update file rowids when partial update (#61488)
Signed-off-by: luohaha <18810541851@163.com>
2025-08-06 12:48:59 +08:00
stdpain da27736352
[BugFix] Fix maxmin_by window function primitive type output cause unmatched length chunk (#61580)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-08-06 09:49:00 +08:00
yan zhang d8ff13b53b
[BugFix] fix query hang because of incorrect scan range delivery (#61562)
Signed-off-by: yan zhang <dirtysalt1987@gmail.com>
2025-08-05 10:35:14 +00:00
stdpain a1921ff837
[BugFix] notify should be call after setting streaming_all_states (#61591)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-08-05 18:23:41 +08:00
zombee0 f9d77d4014
[Enhancement] exchange node support bucket aware execution (#61554)
Signed-off-by: zombee0 <ewang2027@gmail.com>
2025-08-05 15:39:57 +08:00
Mathieu Baurin a5dbbc17cd
[Feature] Add format_bytes function for human-readable byte formatting (#61535)
Signed-off-by: Mathieu Baurin <1mathieu.baurin@gmail.com>
Signed-off-by: stdpain <34912776+stdpain@users.noreply.github.com>
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
2025-08-05 12:07:38 +08:00
Murphy e88244ec37
[Enhancement] FlatJSON-V2 part 1: BE code (#61447)
Signed-off-by: Murphy <mofei@starrocks.com>
2025-08-04 19:21:22 +08:00
qingzhongli 7df950e7d8
[UT] Fix sse_memcmp UT compilation error on aarch64 (#61569)
Fix sse_memcmp UT compilation error on aarch64.

## Why I'm doing:
```
[ 96%] Building CXX object test/CMakeFiles/starrocks_test_objs.dir/util/monotime_test.cpp.o
[ 96%] Building CXX object test/CMakeFiles/starrocks_test_objs.dir/util/mysql_row_buffer_test.cpp.o
/root/starrocks/be/test/util/memcmp_test.cpp: In member function 'virtual void starrocks::sse_memcmp_Test_Test::TestBody()':
/root/starrocks/be/test/util/memcmp_test.cpp:38:20: error: 'sse_memcmp2' was not declared in this scope
   38 |         int res2 = sse_memcmp2(c1, c2, 3);
      |                    ^~~~~~~~~~~
/root/starrocks/be/test/util/memcmp_test.cpp:46:20: error: 'sse_memcmp2' was not declared in this scope
   46 |         int res2 = sse_memcmp2(c1, c2, 3);
      |                    ^~~~~~~~~~~
/root/starrocks/be/test/util/memcmp_test.cpp:54:20: error: 'sse_memcmp2' was not declared in this scope
   54 |         int res2 = sse_memcmp2(c1, c2, 3);
      |                    ^~~~~~~~~~~
/root/starrocks/be/test/util/memcmp_test.cpp:62:20: error: 'sse_memcmp2' was not declared in this scope
   62 |         int res2 = sse_memcmp2(c1, c2, 3);
      |                    ^~~~~~~~~~~
/root/starrocks/be/test/util/memcmp_test.cpp:71:20: error: 'sse_memcmp2' was not declared in this scope
   71 |         int res2 = sse_memcmp2(c1, c2, 3);
      |                    ^~~~~~~~~~~
/root/starrocks/be/test/util/memcmp_test.cpp:80:20: error: 'sse_memcmp2' was not declared in this scope
   80 |         int res2 = sse_memcmp2(c1, c2, 3);
      |                    ^~~~~~~~~~~
/root/starrocks/be/test/util/memcmp_test.cpp:89:20: error: 'sse_memcmp2' was not declared in this scope
   89 |         int res2 = sse_memcmp2(c1, c2, 3);
      |                    ^~~~~~~~~~~
/root/starrocks/be/test/util/memcmp_test.cpp:98:20: error: 'sse_memcmp2' was not declared in this scope
   98 |         int res2 = sse_memcmp2(c1, c2, 3);
      |                    ^~~~~~~~~~~
make[2]: *** [test/CMakeFiles/starrocks_test_objs.dir/util/memcmp_test.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [test/CMakeFiles/starrocks_test_objs.dir/all] Error 2
make: *** [all] Error 2
```

Signed-off-by: qingzhongli <qingzhongli2018@gmail.com>
2025-08-04 15:45:17 +08:00
Yixin Luo 7dac2090e1
[Enhancement] reuse I/O when reading bundled tablet meta (#61413)
Signed-off-by: luohaha <18810541851@163.com>
2025-08-04 10:47:34 +08:00
Yixin Luo 75854adf72
[Enhancement] optimize tablet meta copy when enable file bundling (#61410)
Signed-off-by: luohaha <18810541851@163.com>
2025-08-04 10:47:21 +08:00
Kevin Cai a3a0a01140
[BugFix] fix file-prefix-map, remove the build_XXX part (#61540)
* -ffile-prefix-map=/build/starrocks/be/build_RELEASE=. -ffile-prefix-map=/build/starrocks/be=be
* before this fix: source file lists 
```
  be/build_RELEASE/be/src/agent/agent_common.h
  be/build_RELEASE/be/src/agent/agent_server.cpp
  ...
  be/build_RELEASE/be/src/util/value_generator.h
  be/build_RELEASE/be/src/util/xxhash.h
``` 
  after this fix: 
```
 ./be/src/agent/agent_common.h
 ./be/src/agent/agent_server.cpp
 ...
 ./be/src/util/value_generator.h
 ./be/src/util/xxhash.h
```

Signed-off-by: Kevin Cai <kevin.cai@celerdata.com>
2025-08-04 09:11:24 +08:00
zhangqiang d89a2f64f4
[Refactor] Change the data type of data_size column in the partitions_meta table to bigint. (#61251)
Signed-off-by: sevev <qiangzh95@gmail.com>
2025-08-02 11:32:38 +08:00
Kevin Cai 45c2310372
[BugFix] don't try to build MFV versions for the instructions turned off (#61532)
The CPU instruction is turned off either because the target instruction set is not wanted or because the build machine does not support the instruction. Respect the instruction switch.

Signed-off-by: Kevin Cai <kevin.cai@celerdata.com>
2025-08-02 11:02:34 +08:00
stdpain 94726f0973
[BugFix] Fix UAF in local-partition preagg (#61524)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-08-01 09:37:18 +00:00
stdpain 14fca55647
[BugFix] Fix local-passthrough cancel dead lock (#61487)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-31 20:02:39 +08:00
Kevin Cai ef04362a2f
[BugFix] properly handle orc reader decompress error (#61464)
handle zstd decompress failure, throw runtime_error exception
fix orc_scanner tpch_10k.orc.zstd, it's corrupted. Replace it with correct test file and update the related test cases.

Signed-off-by: Kevin Cai <kevin.cai@celerdata.com>
2025-07-31 16:35:18 +08:00
SevenJ a362c009bc
[UT] Fix iceberg trans ut (#61459)
Signed-off-by: SevenJ <wenjun7j@gmail.com>
2025-07-31 14:38:18 +08:00
stdpain 65fd661601
[Enhancement] add vectorized implementation for assign_data_with_nulls (#61454)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-31 14:25:52 +08:00
zombee0 ee8bea1c33
[Enhancement] murmur3 hash to do bucket aware execution for iceberg (#61366)
Signed-off-by: zombee0 <ewang2027@gmail.com>
2025-07-31 10:05:17 +08:00
stdpain fce0346e97
[BugFix] Fix local-passthrough cause rpc to get stuck (#61427)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-30 16:21:23 +08:00
Fatih Çatalkaya 75b996b714
[BugFix] Set Content-Type to application/json when responding to stream load http requests (#61144)
Why I'm doing:
When sending a request to the /api/transaction/{begin,load,commit,...} endpoints, the content type is wrongly set to text/html instead of application/json.

What I'm doing:
Fixes #61130

Signed-off-by: Fatih Çatalkaya <fatih.catalkaya@yahoo.de>
Signed-off-by: Kevin Cai <kevin.cai@celerdata.com>
Co-authored-by: Kevin Cai <kevin.cai@celerdata.com>
2025-07-30 08:47:21 +08:00
Yixin Luo 768e03ec5e
[Enhancement] add idle time config for publish version thread pool (#61239)
Signed-off-by: luohaha <18810541851@163.com>
Signed-off-by: Yixin Luo <luoyixin6688@gmail.com>
Co-authored-by: 絵空事スピリット <wanglichen@starrocks.com>
2025-07-29 16:40:52 +00:00
Murphy af49488e6f
[UT] Fuzz test built-in functions with type coverage (#61303)
Signed-off-by: Murphy <mofei@starrocks.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-07-29 11:20:47 +00:00
Kevin Cai 942a77c5bb
[UT] fix incorrect use of evhttp in stream load unit test (#61390)
should not create a separate evhttp_request in test body
shall leverage the input_buffer created in the evhttp_request initialized by evhttp_request_new()

Signed-off-by: Kevin Cai <kevin.cai@celerdata.com>
2025-07-29 18:43:01 +08:00
stdpain 04001e8618
[Enhancement] some minor optimizations for reading parquet files (#60551)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-29 18:12:53 +08:00
zhangqiang 2defcf0579
[BugFix] Fix nullptr exception during clone. (#61359)
Why I'm doing:
When clone and drop table run concurrently, the new_tablet used during clone may be dropped, which causes a null pointer exception.


Signed-off-by: sevev <qiangzh95@gmail.com>
2025-07-29 16:26:44 +08:00
Yixin Luo 0f1deef421
[BugFix] fix missing partition id in combine txnlog (#61207)
Signed-off-by: luohaha <18810541851@163.com>
2025-07-29 16:25:12 +08:00
Murphy c9ea6464fe
[BugFix] compile failure in clang (#61351)
Signed-off-by: Murphy <mofei@starrocks.com>
2025-07-29 13:07:50 +08:00
stdpain 70a7f618d5
[Refactor] Refactor scalar function registration to speed up compilation (#61358)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-29 09:45:15 +08:00
yan zhang d46937ed5c
[UT] Fix compilation and be-ut (#61347)
Signed-off-by: yan zhang <dirtysalt1987@gmail.com>
2025-07-29 09:21:40 +08:00
Hechem Selmi b7c2561dc0
[Enhancement] Avoid brpc communication when using local pass through (#60538)
Signed-off-by: m-selmi <m.selmi@celonis.com>
Signed-off-by: stdpain <drfeng08@gmail.com>
Co-authored-by: stdpain <drfeng08@gmail.com>
2025-07-28 11:08:19 +00:00
shuming.li ac5fc3f681
[UT] [BugFix] Fix unstable JITCacheTest tests (#61331)
2025-07-28 18:47:39 +08:00
starrocks-xupeng b0f5cbbbb1
[Enhancement] add segment write time in lake compaction (#60891)
Signed-off-by: starrocks-xupeng <xupeng@starrocks.com>
2025-07-28 17:35:11 +08:00
yan zhang b84d2051e4
[UT] disable parquet asan long running ut (#61334)
Signed-off-by: yan zhang <dirtysalt1987@gmail.com>
2025-07-28 15:58:36 +08:00
stdpain 9521bd8266
[Enhancement] Introduce RETURN_IF_DCHECK_XX_FAILED (#61315)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-28 14:08:11 +08:00
Jun-Seok Heo 774b9d0de3
[BugFix] fix the pruned column size to be same with the unpruned one (#61271)
2025-07-28 12:06:20 +08:00
stdpain fc856ca330
[BugFix] Fix array_map crash when capture const array columns (#61309)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-26 15:54:39 +08:00
duanyyyyyyy 561b82cd93
[BugFix] Fix a bug that agg_state_if will not handle the streaming aggregation cases (#61084)
Signed-off-by: ‘duanyyyyyyy’ <yan.duan9759@gmail.com>
2025-07-26 12:50:47 +08:00
Kevin Cai b5cc684042
[UT] fix StarOSWorker AwsSDK cleanup issue (#61265)
Signed-off-by: Kevin Cai <kevin.cai@celerdata.com>
2025-07-25 13:49:44 +08:00
srlch cbb77d9883
[BugFix] Fix set null value for auto_increment column will reject the valid data if they are in the same chunk (#61255)
Signed-off-by: srlch <linzichao@starrocks.com>
2025-07-25 12:48:28 +08:00
Evgeniy Zuikin 81ff271a80
[BugFix] Fix array column cloning during array comparison (#61036)
Signed-off-by: SHaaD94 <eugenzuy@gmail.com>
Signed-off-by: stdpain <drfeng08@gmail.com>
Signed-off-by: stdpain <34912776+stdpain@users.noreply.github.com>
Co-authored-by: stdpain <drfeng08@gmail.com>
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
2025-07-25 11:06:42 +08:00
Murphy 2b69350d1b
[BugFix] fix hour_from_unixtime (#61206)
Signed-off-by: Murphy <mofei@starrocks.com>
2025-07-25 10:10:14 +08:00
Gavin 46601e16e4
[Enhancement] Disable the inline mode when writing data to datacache because it may cause a performance degradation. (#60530)
Signed-off-by: GavinMar <yangguansuo@starrocks.com>
2025-07-24 17:27:34 +08:00
eyes_on_me 4167aaf940
[BugFix] fix TableMetricsMgrTest (#61218)
Signed-off-by: silverbullet233 <3675229+silverbullet233@users.noreply.github.com>
2025-07-24 13:56:44 +08:00
zihe.liu 3107899823
[BugFix] Fix resource group cpu usage (#61177)
Signed-off-by: zihe.liu <ziheliu1024@gmail.com>
2025-07-23 19:37:10 +08:00
eyes_on_me d71cc3d2c7
[BugFix] reduce lock contention of TableMetricsManager (#58911)
Signed-off-by: silverbullet233 <3675229+silverbullet233@users.noreply.github.com>
2025-07-23 19:15:07 +08:00
eyes_on_me 6abb89573c
[BugFix] make scan behavior consistent on shared-data and shared-nothing (#61100)
Signed-off-by: silverbullet233 <3675229+silverbullet233@users.noreply.github.com>
2025-07-23 10:24:59 +08:00
satanson e91696fa1b
[BugFix] excluding some files involving JIT when STARROCKS_JIT_ENABLE=OFF (#61138)
Signed-off-by: satanson <ranpanf@gmail.com>
2025-07-22 16:13:37 +08:00
zihe.liu 2144db870c
[Enhancement] Use RangeDirectMapping to optimize hash join (#61124)
Signed-off-by: zihe.liu <ziheliu1024@gmail.com>
2025-07-22 15:55:54 +08:00
alexzorin 2dbfc1d516
[BugFix] set hit_count in vector index metrics (#61102)
Signed-off-by: Alex Zorin <alex@zorin.au>
2025-07-22 14:41:56 +08:00
starrocks-xupeng f3144b9a2e
[BugFix] fix cache might not be used when upgraded from 3.3 (#60973)
Signed-off-by: starrocks-xupeng <xupeng@starrocks.com>
2025-07-22 14:32:30 +08:00
stdpain 78558bcc07
[BugFix] Fix dictionary inconsistency in shared-data mode (#61006)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-22 14:21:27 +08:00
stdpain ebd73ed42c
[Enhancement] avoid reuse ByteBuffer when merge data in JAVA UDAF (#61054)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-21 15:39:22 +08:00
zihe.liu c2d4734377
[Refactor] Split join_hash_map into files (#61010)
Signed-off-by: zihe.liu <ziheliu1024@gmail.com>
2025-07-21 14:04:11 +08:00
srlch 4ac5ae833f
[Enhancement] Filter out keys using SstablePredicate for sstable after compaction (#60743)
Signed-off-by: srlch <linzichao@starrocks.com>
2025-07-21 09:33:53 +08:00
satanson f877782f08
[BugFix] Executable segments generated by JIT are not released when it is evicted from JIT cache (#61027)
Signed-off-by: satanson <ranpanf@gmail.com>
2025-07-18 16:43:09 +08:00
satanson b26637e0f5
[BugFix] disable jit in BE (#61060)
Signed-off-by: satanson <ranpanf@gmail.com>
2025-07-18 11:40:06 +08:00
yan zhang c72152f5a5
[Enhancement] support uuid type in postgres (#61021)
Signed-off-by: yan zhang <dirtysalt1987@gmail.com>
2025-07-18 10:11:25 +08:00
yan zhang f5f8e9bc2c
[Enhancement] support map type in UDAF (#60840)
Signed-off-by: yan zhang <dirtysalt1987@gmail.com>
2025-07-18 09:49:51 +08:00
stdpain 44e64daea0
[BugFix] Make Python UDF error reporting clearer (#61015)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-17 18:02:25 +08:00
zihe.liu 68e827d71a
[Refactor] Split JoinFunc to KeyConstructor and HashMapMethod (#60932)
Signed-off-by: zihe.liu <ziheliu1024@gmail.com>
2025-07-17 11:10:36 +08:00
zhangqiang c655814072
[BugFix] erase partition from partiton_map when partiton_ids is empty (#60842)
Signed-off-by: sevev <qiangzh95@gmail.com>
2025-07-17 10:31:44 +08:00
shuming.li 26f053fb8f
[UT] Fix broken stringCastBitmapFailed0 test (#60971)
Signed-off-by: shuming.li <ming.moriarty@gmail.com>
2025-07-16 14:57:56 +08:00
Seaven df099e035a
[Enhancement] compute unused column by be (#60462)
Signed-off-by: Seaven <seaven_7@qq.com>
2025-07-16 14:29:28 +08:00
Murphy a63d9820e5
[BugFix] fix the counter unit of OutputChunkBytes (#60940)
Signed-off-by: Murphy <mofei@starrocks.com>
2025-07-16 10:28:37 +08:00
stdpain d6a0b7413d
[BugFix] Fix extract wrong result from relative URL (#60926)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-16 09:36:48 +08:00
Murphy 2d2219ee42
[UT] mark as slow ut: testJsonColumnCompression (#60942)
Signed-off-by: Murphy <mofei@starrocks.com>
2025-07-15 18:55:19 +08:00
gengjun-git 5e5a9c972f
[BugFix] Change KEYWORD to WORD to comply with MySQL's standard definition (#60863)
Change KEYWORD to WORD to comply with MySQL's standard definition. https://dev.mysql.com/doc/refman/8.0/en/information-schema-keywords-table.html

Signed-off-by: gengjun-git <gengjun@starrocks.com>
2025-07-15 09:55:10 +08:00
wyb 61f12e7675
[Enhancement] Support parquet version in files unload (#60843)
Signed-off-by: wyb <wybb86@gmail.com>
2025-07-15 09:34:58 +08:00
Yixin Luo 951816f2d5
[Enhancement] add some metrics for aggregate publish version & compaction (#60747)
Signed-off-by: luohaha <18810541851@163.com>
Signed-off-by: Yixin Luo <luoyixin6688@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-14 19:32:12 +08:00
meegoo 9475e2abce
[BugFix] Fix large number of base compactions block other compaction tasks (#60711)
Signed-off-by: meegoo <meegoo.sr@gmail.com>
2025-07-14 18:46:54 +08:00
Kevin Cai c186dc288c
[BugFix] efficiently handle error string truncating (#60878)
* truncate the string before converting it to a std::string

## Why I'm doing:

```
W0713 17:45:07.945505 41413 mem_hook.cpp:90] large memory alloc, query_id:78472d5d-414e-ebb9-3edf-18733d316fb4 instance: 00000000-0000-0000-0000-000000000000 acquire:4294955590 bytes, stack:
    @          0x2f9d0f2  malloc
    @          0x915b765  operator new()
    @          0x373a583  std::__cxx11::basic_string<>::_M_construct<>()
    @          0x373b014  starrocks::stream_load::OlapTableSink::_print_varchar_error_msg()
    @          0x373fa75  starrocks::stream_load::OlapTableSink::_validate_data()
    @          0x374782a  starrocks::stream_load::OlapTableSink::send_chunk()
    @          0x2f5a50b  starrocks::PlanFragmentExecutor::_open_internal_vectorized()
    @          0x2f5cd81  starrocks::PlanFragmentExecutor::open()
    @          0x2ed4ceb  starrocks::FragmentExecState::execute()
    @          0x2edb5a8  starrocks::FragmentMgr::exec_actual()
    @          0x30542cc  starrocks::ThreadPool::dispatch_thread()
    @          0x304d0ea  starrocks::Thread::supervise_thread()
    @     0x2b3316651ea5  start_thread
    @     0x2b331728c96d  __clone
    @              (nil)  (unknown)
```
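
The trace suggests the whole offending value was copied into a `std::string` before being cut down for the error message. A minimal sketch of truncating first, assuming a hypothetical 64-byte cap and hypothetical helper names (the actual patch is not shown here):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical cap on how many bytes of the offending value are echoed back.
constexpr std::size_t kMaxErrorValueBytes = 64;

// Takes a non-owning view and copies at most max_bytes of it, so the
// allocation is bounded by the cap rather than by the size of the value.
std::string truncated_for_error(std::string_view value,
                                std::size_t max_bytes = kMaxErrorValueBytes) {
    return std::string(value.substr(0, max_bytes));
}

// Builds an error message for a value that exceeded max_len bytes.
std::string format_too_long_error(std::string_view value, std::size_t max_len) {
    assert(value.size() > max_len);  // only called for values over the limit
    std::string msg = "String '";
    msg += truncated_for_error(value);
    if (value.size() > kMaxErrorValueBytes) msg += "...";
    msg += "' (length=" + std::to_string(value.size()) +
           ") exceeds max length " + std::to_string(max_len);
    return msg;
}
```

Cutting at a byte offset can split a multi-byte UTF-8 character; a production version may want to back up to a character boundary.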

Signed-off-by: Kevin Cai <kevin.cai@celerdata.com>
2025-07-14 18:31:28 +08:00
stdpain 8a035148f0
[BugFix] Fix inconsistent implementations in url_extract_parameter (#60873)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-14 18:24:30 +08:00
stdpain a24412587e
[BugFix] Fix BE crash when loading a OOM partition (#60778)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-14 10:09:25 +08:00
Alexey eeb2900f64
[BugFix] incorrect expression in zonemap filters (#60845)
Regression introduced in #53967.

Available in 3.5, backported to 3.4; 3.3 does not have the bug.

`!(expr1 && expr2)` has to be transformed into `!expr1 || !expr2`,
not into `!expr1 && !expr2` as was done in that PR.

Otherwise an incorrect expression is used on top of a scalar column.

## Why I'm doing:

#53967 introduced zonemap filtering for struct columns,
but it contains a bug.

Failed query:
```
select x
from y
where x.field1[1].field2
```

```cpp
    // check subfield expr has only one child, and it's a SlotRef
    if (subfield_expr->children().size() != 1 && !subfield_expr->get_child(0)->is_slotref()) {
        return Status::InternalError("Invalid pattern for predicate");
    }
```

`subfield_expr->children().size() != 1`  - `false`
`!subfield_expr->get_child(0)->is_slotref()` - `true`, because it will be array access

The whole expression is therefore also `false` and execution continues,
although the code that follows expects the pattern described in this comment:

```
// Rewrite ColumnExprPredicate which contains subfield expr and put subfield path into subfield_output
// For example, WHERE col.a.b.c > 5, a.b.c is subfields, we will rewrite it to c > 5
```
but here the predicate still contains `[1].field2`.

## What I'm doing:

fix expression logic
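
The difference between the two guards can be checked in isolation. The sketch below models the condition with plain values (child count and an is-SlotRef flag); the function names are illustrative and not taken from the codebase.

```cpp
#include <cassert>
#include <initializer_list>

// Valid pattern per the code comment: exactly one child, and it is a SlotRef.
constexpr bool is_valid_pattern(int child_count, bool child_is_slotref) {
    return child_count == 1 && child_is_slotref;
}

// Guard as quoted above: rejects only when BOTH conditions fail.
constexpr bool rejects_with_and(int child_count, bool child_is_slotref) {
    return child_count != 1 && !child_is_slotref;
}

// Guard obtained from De Morgan's law: !(a && b) == !a || !b.
constexpr bool rejects_with_or(int child_count, bool child_is_slotref) {
    return child_count != 1 || !child_is_slotref;
}

void verify_guards() {
    // The `||` guard is the exact negation of the valid pattern.
    for (int n = 0; n <= 2; ++n) {
        for (bool s : {false, true}) {
            assert(rejects_with_or(n, s) == !is_valid_pattern(n, s));
        }
    }
    // Shape of the failing query: one child, but an array access, not a SlotRef.
    assert(!rejects_with_and(1, false));  // slips through the quoted guard
    assert(rejects_with_or(1, false));    // caught by the De Morgan form
}
```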

Signed-off-by: Aliaksei Dziomin <diominay@gmail.com>
2025-07-12 13:01:43 +08:00
Yakir Gibraltar 0fc909c8de
[Enhancement] Implement missing HDFS filesystem operations for spilling support (#59759)
Fixes #59757

Why I'm doing:
When using HDFS as a remote storage volume for spilling, StarRocks fails with NOT_IMPLEMENTED_ERROR because several critical filesystem operations were not implemented in the HDFS filesystem wrapper (HdfsFileSystem class). These operations are essential for the spilling workflow:

Creating directories for spill data organization
Checking if paths are directories vs files
Deleting files and directories during cleanup
Managing directory hierarchy for spill containers
Without these operations, queries that need to spill to HDFS storage volumes cannot function, severely limiting StarRocks' ability to handle large datasets when using HDFS as external storage.

What I'm doing:
This PR implements the missing HDFS filesystem operations required for spilling functionality:

Implemented Operations:
- delete_file(): Delete files from HDFS using hdfsDelete
- create_dir(): Create directories using hdfsCreateDirectory
- create_dir_if_missing(): Create directories if they don't exist (with existence check)
- create_dir_recursive(): Create directories recursively (leverages HDFS native recursive creation)
- delete_dir(): Delete empty directories using hdfsDelete
- delete_dir_recursive(): Delete directories and all contents recursively
- is_directory(): Check if a path is a directory using hdfsGetPathInfo

Additional Improvements:
- Added private helper method _is_directory() for internal directory type checking
- Fixed bug in hdfs_write_buffer_size assignment for upload options (was using __isset instead of actual value)
- Added comprehensive test coverage including realistic spilling workflow simulation

Implementation Details:
- All operations properly handle HDFS connections through existing HdfsFsCache infrastructure
- Robust error handling with meaningful error messages using get_hdfs_err_msg()
- Path existence validation before operations to provide clear error messages
- Directory vs file type validation to prevent incorrect operations
- Follows existing code patterns and error handling conventions in the codebase
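
The directory semantics listed above (recursive creation, deleting only empty directories, recursive deletion, directory vs file checks) can be illustrated with a small in-memory model. This is only a sketch: `DirModel` and `add_file()` are hypothetical, the class makes no libhdfs calls, and it assumes absolute paths without a trailing slash.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical in-memory model of the semantics above; NOT HdfsFileSystem.
class DirModel {
public:
    enum class Kind { File, Dir };

    // Create the directory and any missing parents; fail if a file is in the way.
    bool create_dir_recursive(const std::string& path) {
        for (std::size_t pos = 1; pos <= path.size(); ++pos) {
            if (pos == path.size() || path[pos] == '/') {
                const std::string prefix = path.substr(0, pos);
                auto it = _entries.find(prefix);
                if (it == _entries.end()) {
                    _entries[prefix] = Kind::Dir;
                } else if (it->second != Kind::Dir) {
                    return false;
                }
            }
        }
        return true;
    }

    // Register a file; fails if the path already exists.
    bool add_file(const std::string& path) { return _entries.emplace(path, Kind::File).second; }

    bool is_directory(const std::string& path) const {
        auto it = _entries.find(path);
        return it != _entries.end() && it->second == Kind::Dir;
    }

    // Delete a file; refuses to delete directories.
    bool delete_file(const std::string& path) {
        auto it = _entries.find(path);
        if (it == _entries.end() || it->second != Kind::File) return false;
        _entries.erase(it);
        return true;
    }

    // Delete a directory only if it is empty.
    bool delete_dir(const std::string& path) {
        if (!is_directory(path) || has_children(path)) return false;
        _entries.erase(path);
        return true;
    }

    // Delete a directory together with everything below it.
    bool delete_dir_recursive(const std::string& path) {
        if (!is_directory(path)) return false;
        const std::string prefix = path + "/";
        auto it = _entries.lower_bound(prefix);
        while (it != _entries.end() && it->first.compare(0, prefix.size(), prefix) == 0) {
            it = _entries.erase(it);
        }
        _entries.erase(path);
        return true;
    }

private:
    // Keys sharing the prefix "path/" are contiguous in the sorted map.
    bool has_children(const std::string& path) const {
        const std::string prefix = path + "/";
        auto it = _entries.lower_bound(prefix);
        return it != _entries.end() && it->first.compare(0, prefix.size(), prefix) == 0;
    }

    std::map<std::string, Kind> _entries;
};
```

The type check before each delete mirrors the "directory vs file type validation" point: `delete_file()` on a directory and `delete_dir()` on a non-empty directory both fail instead of silently removing data.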

Signed-off-by: Yakir Gibraltar <yakir.g@taboola.com>
Signed-off-by: Yakir Gibraltar <yakirgb@gmail.com>
Signed-off-by: Kevin Cai <kevin.cai@celerdata.com>
Co-authored-by: Yakir Gibraltar <yakir.g@taboola.com>
Co-authored-by: Kevin Cai <caixh.kevin@gmail.com>
Co-authored-by: Kevin Cai <kevin.cai@celerdata.com>
2025-07-12 09:58:26 +08:00
liubotao d34b14224e
[Feature] add new function to_datetime and to_datetime_ntz (#60637)
Signed-off-by: liubotao <316945435@qq.com>
2025-07-11 22:07:01 +08:00
zhangqiang ebeb2ac7fd
[Enhancement] Support different compaction strategy for different table in shared-data mode (#60366)
Signed-off-by: sevev <qiangzh95@gmail.com>
2025-07-11 11:18:14 +08:00
Xu Bai 6bf1a340dc
[Feature] Implement parquet variant decoding (#60189) 2025-07-10 20:16:19 +08:00
Murphy d9b23cffb4
[Enhancement] function hour_from_unixtime (#60331)
Signed-off-by: Murphy <mofei@starrocks.com>
2025-07-10 16:16:51 +08:00
stephen b8d2a70930
[Enhancement] support collection array column ndv (#60623)
Signed-off-by: stephen <stephen5217@163.com>
2025-07-10 14:44:31 +08:00
Yixin Luo 1863e49d09
[BugFix] Add compact to LakeService_RecoverableStub and optimize the error messages returned by aggregation compaction (#60715)
Signed-off-by: luohaha <18810541851@163.com>
2025-07-10 11:16:25 +08:00
Gavin a7c8f86e47
[BugFix] Release the cache engine instances before the datacache is freed to clean some related resources in advance (#60745)
Signed-off-by: GavinMar <yangguansuo@starrocks.com>
2025-07-10 09:56:52 +08:00
shuming.li a45f155e86
[BugFix] Remove unused output_scale variable in expr (#60731)
Signed-off-by: shuming.li <ming.moriarty@gmail.com>
2025-07-09 11:57:54 +00:00
ruyliu b0d96c5c52
[BugFix] Cherry-Pick ORC-1525 bugfix from apache-orc (#60722) 2025-07-09 11:05:27 +00:00
srlch a04ba3afca
[Feature] Introduce predicate for sstable (#60645)
Signed-off-by: srlch <linzichao@starrocks.com>
2025-07-09 16:40:43 +08:00
zhangqiang 457f0a9e2e
[Enhancement] Add lake service stub cache (#60517)
Why I'm doing:
After file_bundling support was added, each publish operation creates brpc channels between CN nodes, which may hurt publish performance.

What I'm doing:
Add lake service stub cache to avoid creating brpc channels on each publish
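
The idea of reusing one stub per endpoint can be sketched as a mutex-protected map. The names `FakeStub` and `StubCacheSketch` are hypothetical and stand in for the real brpc channel and lake service stub; no brpc API is used here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a brpc channel plus service stub.
struct FakeStub {
    explicit FakeStub(std::string ep) : endpoint(std::move(ep)) {}
    std::string endpoint;
};

// Returns the same stub for the same host:port instead of building a new one.
class StubCacheSketch {
public:
    std::shared_ptr<FakeStub> get(const std::string& host, int port) {
        const std::string key = host + ":" + std::to_string(port);
        std::lock_guard<std::mutex> guard(_mutex);
        auto it = _stubs.find(key);
        if (it != _stubs.end()) {
            return it->second;  // cache hit: no new channel is created
        }
        auto stub = std::make_shared<FakeStub>(key);
        ++_created;
        _stubs.emplace(key, stub);
        return stub;
    }

    // Number of stubs actually constructed so far.
    std::size_t created() const {
        std::lock_guard<std::mutex> guard(_mutex);
        return _created;
    }

private:
    mutable std::mutex _mutex;
    std::unordered_map<std::string, std::shared_ptr<FakeStub>> _stubs;
    std::size_t _created = 0;
};
```

Repeated publishes to the same node then pay the construction cost once; a real cache would also need a policy for dropping stubs of nodes that went away, which this sketch leaves out.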


Signed-off-by: sevev <qiangzh95@gmail.com>
2025-07-09 08:50:08 +08:00
Yixin Luo 1b855d34a0
[Enhancement] Add garbage file checking to facilitate future comprehensive vacuum testing (#60639)
Signed-off-by: luohaha <18810541851@163.com>
2025-07-08 16:29:48 +08:00
starrocks-xupeng 0249856554
[BugFix] fix compaction new segments does not clean by abort txn (#60673)
When partial compaction is not used, the correct new segment info still needs to be set so that abort txn can clean up the new segments.

Signed-off-by: starrocks-xupeng <xupeng@starrocks.com>
2025-07-08 11:37:19 +08:00
stdpain a3def612b0
[BugFix] Fix memory/row count inaccuracies that can cause aggregate stucked (#60612)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-07 17:52:05 +08:00
TsukiokaKogane c6843040d8
[BugFix] fix short circuit query core with out of order value column sql (#60466)
Signed-off-by: TsukiokaKogane <cby141994@gmail.com>
2025-07-07 08:51:01 +00:00
stdpain 8b3965a337
[Enhancement] Make UDF URLs not have to use a specific ending (#60622)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-07 14:48:01 +08:00
Yixin Luo 0b93fb7c2a
[BugFix] fix cloud native pk index memory statistic leak (#60566)
Signed-off-by: luohaha <18810541851@163.com>
2025-07-07 14:37:21 +08:00
before-Sunrise d8981a85be
[BugFix]fix ngram_search use after free (#60608)
Signed-off-by: before-Sunrise <unclejyj@gmail.com>
2025-07-07 13:54:43 +08:00
Mesut Döner c1e47b2d3d
[BugFix] split_part function should not return null when delimiter is not matched (#56967)
Signed-off-by: stdpain <34912776+stdpain@users.noreply.github.com>
Signed-off-by: stdpain <drfeng08@gmail.com>
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
Co-authored-by: stdpain <drfeng08@gmail.com>
2025-07-04 16:40:12 +08:00
Yixin Luo 3edbb4955c
[Tool] print bundle tablet meta proto as string (#60600)
Signed-off-by: luohaha <18810541851@163.com>
2025-07-04 06:56:02 +00:00
Murphy 74d21b3600
[Enhancement] Add expression filter counter to OLAP_SCAN for non-pushdown predicates (#60552) 2025-07-04 13:23:29 +08:00
Yixin Luo 74458bf52a
[Enhancement] datafile_gc support bundle tablet meta (#60507)
Signed-off-by: luohaha <18810541851@163.com>
2025-07-04 10:38:47 +08:00
Murphy 71793ad7b0
[Enhancement] improve crc64 for hashset performance (#60074)
Signed-off-by: Murphy <mofei@starrocks.com>
2025-07-04 10:24:21 +08:00
ThunderScar 58d34eb991
[Refactor] Remove class StringValue (#25002)
Signed-off-by: linyan <1870750355@qq.com>
Signed-off-by: stdpain <drfeng08@gmail.com>
Co-authored-by: linyan <1870750355@qq.com>
2025-07-04 09:33:43 +08:00
Wu Xueyang 7e662d4249
[BugFix] GC inverted index path after segment data deleted (#60390)
If rowset data is deleted by garbage collection, the inverted index will not be removed because path scanning ignores all of the directories under the tablet schema hash path.

What I'm doing:
Path scanning now also scans inverted index paths.

Signed-off-by: wuxueyang.wxy <wuxueyang.wxy@alibaba-inc.com>
2025-07-03 18:50:11 +08:00
yan zhang 7ff005aaac
[BugFix] fix min/max opt when all null values (#60545) 2025-07-03 17:55:47 +08:00
Murphy 46826ad18a
[Enhancement] test json compression (#60380)
Signed-off-by: Murphy <mofei@starrocks.com>
2025-07-03 17:42:48 +08:00
stdpain be1fc50ecd
[BugFix] Fix Exception when Java UDF output empty map (#60539)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-03 16:48:56 +08:00
srlch b34357a2cc
[BugFix] Let submitted tasks without execution can be awared in starrocks::LakeServiceImpl to set a correct response and status (#59814)
Signed-off-by: srlch <linzichao@starrocks.com>
2025-07-03 15:50:13 +08:00
Murphy 461eaf4862
[Enhancement] extend the date cache to 2050 (#60533) 2025-07-03 12:30:10 +08:00
shuming.li 073e49d13f
[Feature] (Part1) Enhance observability for Warehouse CNGroup (#60343)
Signed-off-by: shuming.li <ming.moriarty@gmail.com>
2025-07-02 21:05:09 +08:00
yan zhang d3e9134902
[Refactor] sort out count/min/max opt prerequisite (#60515)
Signed-off-by: yan zhang <dirtysalt1987@gmail.com>
2025-07-02 10:49:49 +00:00
srlch e66031fc62
[Feature] Make record predicate available in rowset/segment read path (#60423)
Signed-off-by: srlch <linzichao@starrocks.com>
2025-07-02 17:32:52 +08:00
SevenJ 771e8fd8d7
[BugFix] fix thread unsafe gmtime to gmtime_r (#60483)
Signed-off-by: SevenJ <wenjun7j@gmail.com>
2025-07-02 08:28:29 +00:00
JinYang 76f64e5ff6
[Enhancement] accelerate the crc32c calculation speed (#43433)
Signed-off-by: GoHalo <gohalo@163.com>
Signed-off-by: stdpain <34912776+stdpain@users.noreply.github.com>
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
2025-07-02 13:59:48 +08:00
stdpain 20d587f0a0
[BugFix] Fix arrow flight crash when fetch from a not exist query_id (#60497)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-02 13:43:24 +08:00
stdpain de1f77e27c
[Enhancement] Fixing reports on clang-tidy (#60480)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-07-02 09:40:36 +08:00
裸奔丶小馒头 31b8ff4251
[Feature] Support deleting all UDF jar caches at be startup (#41598)
Signed-off-by: changxin <streakxin@foxmail.com>
Signed-off-by: stdpain <34912776+stdpain@users.noreply.github.com>
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
2025-07-01 18:09:04 +08:00
yan zhang 952db2da5f
[Enhancement] use lower_bound/upper_bound to optimize min/max (#60385)
Signed-off-by: yan zhang <dirtysalt1987@gmail.com>
2025-07-01 17:42:42 +08:00
satanson 0f3b2661b4
[Enhancement] Spill PartitionWise aggregation (#60216)
Signed-off-by: satanson <ranpanf@gmail.com>
2025-07-01 11:28:11 +08:00
Hongkun Xu 12f66bf17e
[Refactor] Fix spelling errors in variable names (#60433)
Signed-off-by: Hongkun Xu <xuhongkun666@163.com>
2025-06-30 19:52:09 +08:00
yan zhang 244ed71f5b
[Refactor] refactor bit packing code (#60434)
Signed-off-by: yan zhang <dirtysalt1987@gmail.com>
2025-06-30 19:47:11 +08:00
shuming.li 74d99cff15
[BugFix] Ensure information_schema.task_runs more compatible with null values (#60426)
Signed-off-by: shuming.li <ming.moriarty@gmail.com>
2025-06-30 16:15:04 +08:00
srlch 4be45d5c45
[Feature] Introduce Column Hash Predicate as a Record Predicate for data filtering (#59993)
Signed-off-by: srlch <linzichao@starrocks.com>
2025-06-30 10:12:44 +08:00
trueeyu 0ae63e4c18
[Refactor] Unify the local cache engine (#60110)
Signed-off-by: trueeyu <lxhhust350@qq.com>
2025-06-30 10:06:45 +08:00
cutiechi 3a708b5c62
[BugFix] Prevent approx_cosine_similarity from Returning NaN When Input Vector Norm is Zero (#60297)
Signed-off-by: cutiechi <superchijinpeng@gmail.com>
2025-06-29 14:52:27 +08:00
Yixin Luo 37c7c51ee8
[Refactor] enable skip pk preload by default (#60368)
Signed-off-by: luohaha <18810541851@163.com>
2025-06-27 12:18:42 +00:00
shuming.li a4ebb6b582
[BugFix] Fix information_schema.materialized_views table compatible bugs (#60374)
Signed-off-by: shuming.li <ming.moriarty@gmail.com>
2025-06-27 17:00:09 +08:00
stephen f0bbb1bb53
[Feature] Add comprehensive decimal256 support (#60207)
Signed-off-by: stephen <stephen5217@163.com>
2025-06-27 11:20:25 +08:00
yan zhang 8d2648039c
[Refactor] refactor bitpacking code (#60320)
Why I'm doing:
refactor bitpacking code for further improvement.

What I'm doing:
This PR does:

- Merge bit_packing.h and bit_packing.inline.h into bit_packing_default.h. This implementation uses templates and loop unrolling for acceleration. It also uses namespace util::bitpacking_default instead of class BitPacking.
- Rename bit_packing_simd.h to bit_packing_avx2.h, because it only uses AVX2 instructions.
- Move the Arrow bit packing code to bit_packing_arrow.h.
- Rename bit_packing_adaptor.h to bit_packing.h. This is the entry file.

So right now we have the following files, and the entry file is bit_packing.h:

-rw-rw-r-- 1 zhangyan zhangyan  4861 Jun 26 14:09 bit_packing_arrow.h
-rw-rw-r-- 1 zhangyan zhangyan 19580 Jun 26 14:05 bit_packing_avx2.h
-rw-rw-r-- 1 zhangyan zhangyan 11541 Jun 26 14:03 bit_packing_default.h
-rw-rw-r-- 1 zhangyan zhangyan  1708 Jun 26 14:10 bit_packing.h
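
As a rough illustration of what "template and unroll" can mean for bit packing, the sketch below takes the bit width as a template parameter, so the mask and the loop bounds are compile-time constants that the compiler can unroll. `unpack8` is a hypothetical name and is not the code in bit_packing_default.h; the sketch assumes values are packed LSB-first into little-endian bytes.

```cpp
#include <cassert>
#include <cstdint>

// Unpack exactly 8 values of BIT_WIDTH bits each from BIT_WIDTH input bytes.
// Assumed layout: LSB-first within a little-endian byte stream.
template <int BIT_WIDTH>
void unpack8(const uint8_t* in, uint32_t* out) {
    static_assert(BIT_WIDTH >= 1 && BIT_WIDTH <= 32, "bit width must be in [1, 32]");
    constexpr uint64_t kMask = (uint64_t{1} << BIT_WIDTH) - 1;
    uint64_t buffer = 0;  // bits read but not yet consumed, lowest bits first
    int bits = 0;         // number of valid bits in buffer
    for (int i = 0; i < 8; ++i) {
        // Refill byte by byte until one full value is available.
        while (bits < BIT_WIDTH) {
            buffer |= static_cast<uint64_t>(*in++) << bits;
            bits += 8;
        }
        out[i] = static_cast<uint32_t>(buffer & kMask);
        buffer >>= BIT_WIDTH;
        bits -= BIT_WIDTH;
    }
}
```

For example, the values 0 through 7 packed with 3 bits each occupy the three bytes `0x88 0xC6 0xFA`, and `unpack8<3>` recovers them.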

Signed-off-by: yan zhang <dirtysalt1987@gmail.com>
2025-06-27 10:20:26 +08:00
Mesut Döner fada758ad4
[Feature] Add strpos function (#57287)
Why I'm doing:
Trying to implement functions from the Good First Issue list.

What I'm doing:
Trino reference: [image]

Fixes #52604

Signed-off-by: Mesut-Doner <mesutdonerng@gmail.com>
2025-06-27 10:01:10 +08:00
Kevin Cai fa25472968
[BugFix] fix incorrect message shown from spill dir configuration (#60339)
Report the correct configuration name when parse_conf_store_paths is used for a setting other than storage_root_path.

Signed-off-by: Kevin Cai <kevin.cai@celerdata.com>
2025-06-26 19:56:31 +08:00
xiangguangyxg f278793601
[Feature] Support marking data files as shared in tablet metadata and skipping to delete shared data files in vacuum, leaving them for full gc to clean up (#60140)
Signed-off-by: xiangguangyxg <xiangguangyxg@gmail.com>
2025-06-26 19:20:40 +08:00
Yixin Luo b395fdfd9d
[BugFix] Revert use file info instead of file path in bundle data file reader (#60220) (#60338)
Signed-off-by: luohaha <18810541851@163.com>
2025-06-26 11:07:35 +00:00
Seaven 88ac67846d
[BugFix] Fix prune unused predicate column bug (#60208)
Signed-off-by: Seaven <seaven_7@qq.com>
2025-06-26 17:16:46 +08:00
srlch 1cc2b266fc
[Enhancement] Introduce GC control by cluster snapshot info (#58909)
Signed-off-by: srlch <linzichao@starrocks.com>
2025-06-26 13:51:10 +08:00
shuming.li f85e05bcb8
[BugFix] Fix information_schema.task_runs schema scan bugs (#60296)
Signed-off-by: shuming.li <ming.moriarty@gmail.com>
2025-06-25 20:54:45 +08:00
Yixin Luo 74d6789f62
[Enhancement] support bundle file deletion for deleteTablets (#59966)
Signed-off-by: luohaha <18810541851@163.com>
2025-06-25 19:12:08 +08:00
Rohit Satardekar f3da9857c2
[Feature] support https for brpc connections (#53695)
Signed-off-by: Rohit Satardekar <rohitrs1983@gmail.com>
2025-06-25 18:27:25 +08:00
Yixin Luo 0fa0491f95
[Enhancement] remove useless dir create operations when load spill (#60282)
Signed-off-by: luohaha <18810541851@163.com>
2025-06-25 10:11:23 +00:00
Mesut-Doner da090d8c7c
[Feature] Add boolor function (#57414)
Signed-off-by: Mesut-Doner <mesutdonerng@gmail.com>
Signed-off-by: stdpain <drfeng08@gmail.com>
Co-authored-by: stdpain <drfeng08@gmail.com>
2025-06-25 18:03:20 +08:00
zhangqiang 9c9f55fe34
[BugFix] Fix some bugs in scenarios where file_bundling and alter operations intersect (#60091)
Signed-off-by: sevev <qiangzh95@gmail.com>
2025-06-25 14:06:36 +08:00
trueeyu 2ae06f73fb
[Refactor] Remove core arena mem allocator (#60221)
Branch-3.3 (pr: #51263) has already set the default value of config::chunk_reserved_bytes_limit to 0, and there is no performance issue, so we finally removed the core memory allocator in the main branch.

What I'm doing:
Remove core arena mem allocator

Signed-off-by: trueeyu <lxhhust350@qq.com>
2025-06-25 13:21:52 +08:00
Wu Xueyang 43f432565f
[BugFix] Return error asap while executing ShortCircuitHybridScanNode. (#53060)
Signed-off-by: 枢木 <wuxueyang.wxy@alibaba-inc.com>
2025-06-25 11:07:44 +08:00
SevenJ 32ccd0e9b7
[BugFix] fix partition key shuffle error (#60072)
Signed-off-by: SevenJ <wenjun7j@gmail.com>
2025-06-25 09:44:01 +08:00
shuming.li 4dc3df2106
[Enhancement] Add more informations for information_schema.task_runs and information_schema.materialized_views (#60054)
Signed-off-by: shuming.li <ming.moriarty@gmail.com>
2025-06-24 19:17:13 +08:00
Yixin Luo b81d60db99
[Enhancement] use file info instead of file path in bundle data file reader (#60220)
Why I'm doing:
When reading bundled data files, we should pass the file info instead of the file path, as the info may contain file size information. In some filesystem implementations, this avoids additional file size fetch requests.

What I'm doing:
This pull request modifies the FileSystem::new_random_access_file_w method in be/src/fs/fs.cpp to improve how RandomAccessFile objects are created by passing the entire FileInfo object instead of just its path attribute.

Changes to FileSystem::new_random_access_file_w:
Updated calls to new_random_access_file to use the full FileInfo object instead of only file_info.path. This ensures that all relevant file metadata is available during the creation of RandomAccessFile instances.
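
Why passing the whole file info can save a request is easy to see in a small model. Everything below is hypothetical (`FileInfoSketch`, `FakeRemoteFs`, and the fixed size of 4096 are illustration only, not the StarRocks `FileSystem` API): when the caller already knows the size, the open path does not need a separate size lookup.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical file descriptor: the size is optional because it is not always known.
struct FileInfoSketch {
    std::string path;
    std::optional<int64_t> size;
};

// Hypothetical remote filesystem that counts metadata round trips.
class FakeRemoteFs {
public:
    // Simulates a remote size lookup; every call is one extra request.
    int64_t fetch_size(const std::string& path) {
        (void)path;
        ++_size_requests;
        return 4096;
    }

    // Opening by path: the size is unknown, so it must always be fetched.
    // Returns the size the reader would work with.
    int64_t open_by_path(const std::string& path) { return fetch_size(path); }

    // Opening by info: reuse the size when the caller already has it.
    int64_t open_by_info(const FileInfoSketch& info) {
        return info.size.has_value() ? *info.size : fetch_size(info.path);
    }

    int size_requests() const { return _size_requests; }

private:
    int _size_requests = 0;
};
```

If the info carries no size, `open_by_info` falls back to the same lookup as `open_by_path`, so the change only removes requests and never adds one.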

Signed-off-by: luohaha <18810541851@163.com>
2025-06-24 19:11:32 +08:00
Siqi Ling 4b74e7d831
[Enhancement] Add transmitted bytes to FE Auditlog (#58346)
Signed-off-by: Siqi Ling <s.ling@celonis.com>
2025-06-24 14:48:06 +08:00
trueeyu e04a2c14df
[BugFix] Fix the bug in the PageHandle move assignment operator (#60206)
Signed-off-by: trueeyu <lxhhust350@qq.com>
2025-06-24 14:07:38 +08:00
shuming.li 310c23aa58
[Enhancement] Support last_day constant evaluation in FE and partition pruning (#59504)
Signed-off-by: shuming.li <ming.moriarty@gmail.com>
2025-06-24 13:39:06 +08:00
stdpain adedb6d506
[BugFix] Fix unsupported nestloop null-aware left anti join (#60119)
Signed-off-by: stdpain <drfeng08@gmail.com>
2025-06-24 09:35:52 +08:00
Yixin Luo 529dbb0b09
[BugFix] fix initial tablet meta read when only provide full path (#60132)
Signed-off-by: luohaha <18810541851@163.com>
2025-06-23 13:51:40 +08:00
Yixin Luo 882e816598
[Tool] add meta tool to print bundle tablet meta (#60093)
Signed-off-by: luohaha <18810541851@163.com>
2025-06-20 17:35:28 +08:00
Seaven ec24a2943c
[UT] fix case when expr ut error (#60107)
Signed-off-by: Seaven <seaven_7@qq.com>
2025-06-20 14:28:57 +08:00
zhangqiang 3a72bfe807
[Refactor] Refactor some BE log (#56928)
Signed-off-by: sevev <qiangzh95@gmail.com>
2025-06-20 13:49:35 +08:00
yan zhang 5943373fa9
[Enhancement] optimize count(1) for iceberg table (#60022)
Signed-off-by: yan zhang <dirtysalt1987@gmail.com>
2025-06-20 09:55:50 +08:00
trueeyu 54d4c7e365
[Refactor] Remove ObjectCache interface (#59942)
We have abstracted a local cache engine interface, so the object cache interface is no longer needed.

Signed-off-by: trueeyu <lxhhust350@qq.com>
2025-06-19 20:53:13 +08:00
Yixin Luo 3273d9ac10
[Refactor] rename aggregate/shared prefix to bundle (#60057)
Signed-off-by: luohaha <18810541851@163.com>
2025-06-19 10:30:36 +00:00
Rohit Satardekar d985de00f3
[BugFix] BE crash when invoke list_rowsets() in non shared mode (#57462)
Fixes #57461

mysql> select * from TABLE(list_rowsets(24015, 10));
ERROR 1064 (HY000): Only works for tablets in the cloud-native table: BE:11001

Signed-off-by: Rohit Satardekar <rohitrs1983@gmail.com>
2025-06-19 16:09:45 +08:00
wyb 96340a0cef
[BugFix] Fix BE crash caused by filesystem is_symlink exception (#60028)
Signed-off-by: wyb <wybb86@gmail.com>
2025-06-19 15:22:00 +08:00
stephen 52db81f79f
[Enhancement] Optimize int256 division implement (#59892)
Signed-off-by: stephen <stephen5217@163.com>
2025-06-19 15:14:03 +08:00