use all-uppercase values for CMake ON/OFF switches
update thirdparty tarball
reduce verbosity of wget when downloading archives
reduce verbosity when unzipping the AWS C++ SDK
reduce verbosity when extracting archives
Signed-off-by: Kevin Cai <kevin.cai@celerdata.com>
We are using an older version of dump_syms that doesn’t understand the .relr.dyn section
Fixes #60004
update breakpad to version breakpad-2024.02.16
$ bash thirdparty/minidump/gen_minidump_symbols.sh
starrocks_be'size (2153mb) reduced to (556mb)
symbol file is at /home/ubuntu/starrocks/output/be/symbols/starrocks_be/AC7FAB7F8B5BF82100000000000000000
Signed-off-by: Rohit Satardekar <rohitrs1983@gmail.com>
## Why I'm doing:
arm is slower than x86 in some cases
## What I'm doing:
1. Vectorize rf's insert_hash using Neon intrinsics.
2. streamvbyte's CMakeLists is wrong, which causes a performance regression on arm because vectorization cannot work properly.
3. arm's int128_mul_overflow is very slow because of the divide operation, while __builtin_mul_overflow(int128_t a, int128_t b, int128_t* c) is fast enough when compiled with gcc. On arm, gcc's __builtin_mul_overflow is at least 5 times faster than clang's; we have already reported this to the community: https://github.com/llvm/llvm-project/issues/123262. So we still use gcc as the default compiler and use __builtin_mul_overflow to replace the original int128_mul_overflow implementation.
4. Casting int128 to double is very slow on arm with gcc because of the poor implementation of __floattidf. clang compiler-rt's implementation is 20 times faster than gcc's, so I used clang compiler-rt's implementation to replace gcc's version.
After this PR, arm is faster than x86 in most cases.
```
| Query | arm-opt | x86 |
|---------|--------|--------|
| QUERY01 | 36 | 61 |
| QUERY02 | 39 | 62 |
| QUERY14 | 1510 | 1514 |
| QUERY15 | 1407 | 1496 |
| QUERY17 | 21 | 88 |
| QUERY20 | 151 | 279 |
| QUERY21 | 1526 | 1529 |
| QUERY24 | 1399 | 1504 |
| QUERY26 | 32 | 122 |
| QUERY27 | 1493 | 1519 |
| QUERY90 | 3399 | 4030 |
| QUERY97 | 3859 | 4776 |
| QUERY98 | 2763 | 3208 |
| QUERY99 | 868 | 1259 |
```
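For item 3, the difference between a division-based overflow check and the compiler builtin can be sketched as follows. This is a minimal illustration, not the StarRocks code: `mul_overflow_by_division`, `mul_overflow_builtin`, `kInt128Max` and `kInt128Min` are hypothetical names, and the division-based variant is only an assumed example of the kind of check that needs a 128-bit divide. `__builtin_mul_overflow` itself is a real gcc/clang builtin.

```cpp
#include <cassert>

using int128_t = __int128;

// Bounds computed by hand, since std::numeric_limits may not cover __int128
// in strict standard modes.
constexpr int128_t kInt128Max = (int128_t)(~(unsigned __int128)0 >> 1);
constexpr int128_t kInt128Min = -kInt128Max - 1;

// Hypothetical division-based check: every non-trivial call pays for a
// 128-bit division, which is the expensive operation mentioned above.
inline bool mul_overflow_by_division(int128_t a, int128_t b, int128_t* c) {
    // Compute the wrapped product in unsigned arithmetic to avoid signed UB.
    *c = (int128_t)((unsigned __int128)a * (unsigned __int128)b);
    if (a == 0 || b == 0) return false;
    if (a > 0) {
        return b > 0 ? a > kInt128Max / b : b < kInt128Min / a;
    }
    return b > 0 ? a < kInt128Min / b : b < kInt128Max / a;
}

// Builtin-based check: returns true on overflow and stores the wrapped
// product in *c, without any explicit division in the source.
inline bool mul_overflow_builtin(int128_t a, int128_t b, int128_t* c) {
    return __builtin_mul_overflow(a, b, c);
}
```

Both functions report the same overflow result for the same inputs; the numbers quoted above concern only how fast the builtin is on arm with each compiler.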
Signed-off-by: before-Sunrise <unclejyj@gmail.com>
fix build failure for clang-18+gcc13
fix thrift definition: define the message before referring to it
patch rocksdb to fix the forward declaration of an incomplete type
Signed-off-by: Kevin Xiaohua Cai <caixiaohua@starrocks.com>
Why I'm doing:
Some third-party packages are not compiled in PIC mode, so they cannot be used in a shared library.
What I'm doing:
Compile these packages in PIC mode.
Signed-off-by: dujunling <dujunling@bytedance.com>
Fixes #32772
If the parquet file is generated by hive, the key of a map may be optional, which is not allowed by arrow.
This PR adds a patch to arrow to remove this restriction.
Fixes #23606, where memory leaks are introduced by
`_chunk_writer->release()`
The modification to Arrow makes sure
`RowGroupSerializer::column_writers_` is cleaned up even if an exception
is thrown.
You may refer to https://github.com/apache/arrow/pull/35520 for more
details.
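
The general idea of cleaning up owned writers even when an exception propagates can be sketched with a small scope guard. This is not the Arrow patch itself (see the linked pull request for that): `SketchColumnWriter`, `SketchRowGroupSerializer` and `ClearOnExit` are hypothetical stand-ins introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a column writer whose Close() may throw.
struct SketchColumnWriter {
    bool fail_on_close = false;
    void Close() {
        if (fail_on_close) throw std::runtime_error("close failed");
    }
};

// Hypothetical stand-in for a serializer that owns its column writers.
class SketchRowGroupSerializer {
public:
    void AddWriter(bool fail_on_close) {
        auto writer = std::make_unique<SketchColumnWriter>();
        writer->fail_on_close = fail_on_close;
        column_writers_.push_back(std::move(writer));
    }

    void Close() {
        // The guard's destructor runs on normal return and during stack
        // unwinding, so the writers are released in both cases.
        struct ClearOnExit {
            std::vector<std::unique_ptr<SketchColumnWriter>>& writers;
            ~ClearOnExit() { writers.clear(); }
        } guard{column_writers_};

        for (auto& writer : column_writers_) {
            writer->Close();  // may throw
        }
    }

    std::size_t writer_count() const { return column_writers_.size(); }

private:
    std::vector<std::unique_ptr<SketchColumnWriter>> column_writers_;
};
```

In this sketch, a caller that catches the exception from `Close()` finds `writer_count()` back at zero, so nothing is left owned by the serializer.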
Signed-off-by: Letian Jiang <letian.jiang@outlook.com>
performance case:
```
SELECT max(REGEXP_REPLACE(Referer, '^https?://(?:www\.)?([^/]+)/.*$', '\1')) AS k, COUNT(*) AS c FROM hits where Referer<>'';
```
baseline: 3.55
after-upgrade-re2: 2.75
update-re2 and optimize allocate: 2.12
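
To make the benchmarked expression concrete, the sketch below applies the same pattern and replacement to a single string. It uses `std::regex` from the C++ standard library rather than RE2, so it says nothing about RE2 performance; `extract_host` is a hypothetical helper name, and `$1` is the `std::regex` spelling of the `\1` group reference in the query.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Mirrors REGEXP_REPLACE(Referer, '^https?://(?:www\.)?([^/]+)/.*$', '\1')
// for one input string, using std::regex (not RE2).
inline std::string extract_host(const std::string& referer) {
    static const std::regex kPattern(R"(^https?://(?:www\.)?([^/]+)/.*$)");
    // When the pattern does not match, the input is returned unchanged.
    return std::regex_replace(referer, kPattern, "$1");
}
```

For example, `extract_host("https://www.example.com/a/b")` yields `example.com`, while a string that does not look like a URL passes through untouched.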