Stream Load import tool (multi-threaded, parallel). Memory use is expected to stay within 500 MB regardless of file size.

Usage

First, install a JDK, then run the script:

./stream-load-import.sh --url=http://{fe_ip}:{fe_http_port}/api/{database_name}/{table_name}/_stream_load \
--source_file=/file/path/name.csv \
--H=column_separator:, \
--u=sr_username:sr_password

Params

Required:

  • --source_file: the absolute path of the file to import
  • --url: the FE URL; it must include the protocol (for example http://) and must not be a redirect URL
  • --u: the StarRocks database username and password, not the server's login credentials

Optional:

  • --enable_debug: default false; to enable it, pass --enable_debug=true
  • --max_threads: number of parallel threads; default min(server_core_number, 32); for example --max_threads=16
  • --timeout: HTTP connect timeout and processing timeout, in milliseconds; default 60*1000 ms; for example, 5 s is --timeout=5000
  • --queue_size: size limit of the in-memory queue; default 256; do not change it unless you have a lot of memory, in which case you also need to raise -Xmx
  • --H: HTTP request header; for example, in --H=column_separator:, the key is column_separator and the value is ,
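
As an illustration of how the required and optional parameters above fit together, here is a minimal sketch of one invocation that sets several of them. The FE address, port, database name, and table name are placeholder assumptions, and passing these options together in a single call is assumed to work since each is documented individually. The script only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Placeholder connection details (assumptions); substitute your own.
FE_IP="127.0.0.1"
FE_HTTP_PORT="8030"
DATABASE="example_db"
TABLE="example_table"

# Collect the arguments as positional parameters so each stays a single word.
# Required: --url, --source_file, --u. Optional: the remaining four.
set -- \
  "--url=http://${FE_IP}:${FE_HTTP_PORT}/api/${DATABASE}/${TABLE}/_stream_load" \
  "--source_file=/file/path/name.csv" \
  "--u=sr_username:sr_password" \
  "--H=column_separator:," \
  "--max_threads=16" \
  "--timeout=5000" \
  "--enable_debug=true"

# Dry run: print the command for review instead of running it.
preview="./stream-load-import.sh $*"
echo "$preview"
# To start the import, run: ./stream-load-import.sh "$@"
```

Here --max_threads=16 caps the parallelism at 16 threads and --timeout=5000 shortens the timeout from the default 60 s to 5 s; --queue_size is left at its default, as recommended above.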

Other:

See the StarRocks Stream Load documentation.

Warning

If an error occurs in one thread, the transactions of the other, healthy threads are not rolled back.

For now, make sure your file contains no bad data and that the target table can be cleared if the import fails.

We are implementing a union transaction, which will resolve this problem.
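
Because a failed thread does not roll back the others, it can help to screen the file before importing. The sketch below is not part of this tool; check_columns is a hypothetical helper that flags lines whose field count differs from the expected number. It assumes a single-character separator and splits naively, so quoted fields that contain the separator will be miscounted, and a clean result does not guarantee that every value is valid for the table.

```shell
#!/bin/sh
# check_columns FILE SEPARATOR EXPECTED_COUNT  (hypothetical helper)
# Prints every line whose field count differs from EXPECTED_COUNT and
# returns non-zero if any such line exists.
# Naive split: does not understand quoted fields containing the separator.
check_columns() {
  awk -F "$2" -v expected="$3" '
    NF != expected {
      printf "line %d: %d fields, expected %d\n", NR, NF, expected
      bad++
    }
    END { exit (bad > 0 ? 1 : 0) }
  ' "$1"
}

# Example:
#   check_columns /file/path/name.csv "," 3 && echo "column counts look consistent"
```

Running the check first and starting the import only when it returns zero reduces the chance of a partial load that then requires clearing the table.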