[BugFix] Fix the thread safety issue caused by concurrent initialization of column evaluator by multiple iceberg partition writers. (#63598)

Why I'm doing:
Now we initialize the column evaluator in the partition writer, which cause the same evaluator instance may be modified by different threads.

What I'm doing:
Remove the unnecessary evaluator initialization in the iceberg partition writer because it has been initialized in partition chunk writer factory.

Also, we check the file writer and create it if needed before writing file data, because it may has been reset by the previous commit operation.

Fixes #issue

Signed-off-by: GavinMar <yangguansuo@starrocks.com>
This commit is contained in:
Gavin 2025-09-26 20:36:35 -07:00 committed by GitHub
parent ee93eaef39
commit a9ed6d35e4
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
1 changed files with 2 additions and 5 deletions

View File

@ -73,10 +73,10 @@ Status BufferPartitionChunkWriter::init() {
}
Status BufferPartitionChunkWriter::write(Chunk* chunk) {
RETURN_IF_ERROR(create_file_writer_if_needed());
if (_file_writer->get_written_bytes() >= _max_file_size) {
if (_file_writer && _file_writer->get_written_bytes() >= _max_file_size) {
commit_file();
}
RETURN_IF_ERROR(create_file_writer_if_needed());
return _file_writer->write(chunk);
}
@ -120,9 +120,6 @@ Status SpillPartitionChunkWriter::init() {
RETURN_IF_ERROR(_load_spill_block_mgr->init());
_load_chunk_spiller = std::make_unique<LoadChunkSpiller>(_load_spill_block_mgr.get(),
_fragment_context->runtime_state()->runtime_profile());
if (_column_evaluators) {
RETURN_IF_ERROR(ColumnEvaluator::init(*_column_evaluators));
}
return Status::OK();
}