# StarRocks Project Cursor Rules

## Project Overview
StarRocks is an open-source, high-performance analytical database system designed for real-time analytics. This is a large-scale C++/Java project with a complex build system.

## ⚠️ IMPORTANT BUILD SYSTEM WARNING
**DO NOT attempt to build or run unit tests (UT) for this project unless explicitly requested by the user.**
The build system is extremely resource-intensive and time-consuming. Building the full project can take hours and requires significant system resources.

## Code Organization

### Backend (be/)
**Language**: C++
**Purpose**: Core analytical engine and storage layer
- `be/src/exec/` - Query execution engine components
- `be/src/storage/` - Storage engine and data persistence
- `be/src/exprs/` - Expression evaluation and JIT compilation
- `be/src/formats/` - Data format parsers and serializers
- `be/src/runtime/` - Runtime components (batch write, stream load, memory management, etc.)
- `be/src/connector/` - External data source connectors
- `be/src/service/` - Core backend services
- `be/src/common/` - Common utilities and shared code

**📋 See `be/.cursorrules` for detailed backend component breakdown**

### Frontend (fe/)
**Language**: Java
**Purpose**: SQL parsing, query planning, and metadata management
- `fe/fe-core/` - Core frontend services (SQL parser, planner, catalog)
- `fe/fe-testing/` - Common test utilities
- `fe/fe-utils/` - Common utilities and helpers
- `fe/spark-dpp/` - Spark data preprocessing integration
- `fe/hive-udf/` - Hive UDF compatibility layer

**📋 See `fe/.cursorrules` for detailed frontend component breakdown**

### Java Extensions (java-extensions/)
**Language**: Java
**Purpose**: External connectors and extensions
- `java-extensions/hive-reader/` - Hive data reader
- `java-extensions/iceberg-metadata-reader/` - Apache Iceberg metadata reader
- `java-extensions/hudi-reader/` - Apache Hudi integration
- `java-extensions/paimon-reader/` - Apache Paimon reader
- `java-extensions/jdbc-bridge/` - JDBC connectivity bridge
- `java-extensions/hadoop-ext/` - Hadoop ecosystem integration
- `java-extensions/udf-extensions/` - UDF extension framework
- `java-extensions/common-runtime/` - Common runtime for Java extensions

**📋 See `java-extensions/.cursorrules` for detailed extensions breakdown**

### Generated Sources (gensrc/)
**Purpose**: Auto-generated code from IDL definitions
- `gensrc/proto/` - Protocol buffer definitions
- `gensrc/thrift/` - Thrift interface definitions
- `gensrc/script/` - Code generation scripts

### Testing (test/)
**Language**: Python
**Purpose**: Integration and SQL testing framework
- `test/sql/` - SQL test cases organized by functionality
- `test/common/` - Common test utilities
- `test/lib/` - Test libraries and helpers

### Tools and Utilities
- `tools/` - Diagnostic tools, benchmarks, and utilities
- `bin/` - Binary executables and scripts
- `conf/` - Configuration files and templates
- `build-support/` - Build system support files
- `docker/` - Docker build configurations
- `docs/` - Project documentation

### Third-party Dependencies
- `thirdparty/` - External dependencies and patches
- `licenses/` - License files for dependencies

### Other Important Directories
- `fs_brokers/` - File system broker implementations
- `webroot/` - Web UI static files
- `format-sdk/` - Format SDK for data interchange

## Development Guidelines

1. **No Building**: Avoid running build commands (`build.sh`, `make`, etc.) unless specifically requested
2. **No Unit Tests**: Do not execute unit test scripts (`run-be-ut.sh`, `run-fe-ut.sh`, etc.)
3. **Focus on Code Analysis**: Prioritize code reading, analysis, and small targeted changes
4. **Language Awareness**:
   - Backend (be/) is C++ - focus on performance and memory management
   - Frontend (fe/) is Java - focus on SQL parsing and query planning
   - Tests are Python - focus on SQL correctness and integration testing

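The language-awareness rule can be read as a lookup from a file's top-level directory to a language and a review focus. The sketch below is illustrative only: the function name `review_focus` and the table `FOCUS_BY_DIR` are hypothetical and not part of the StarRocks repository; the three entries are taken from the list above.

```python
from pathlib import PurePosixPath

# Entries mirror the "Language Awareness" list; the table itself is hypothetical.
FOCUS_BY_DIR = {
    "be": ("C++", "performance and memory management"),
    "fe": ("Java", "SQL parsing and query planning"),
    "test": ("Python", "SQL correctness and integration testing"),
}


def review_focus(path: str):
    """Return (language, focus) for a repo-relative path, or None if unlisted."""
    parts = PurePosixPath(path).parts
    top_level = parts[0] if parts else ""
    return FOCUS_BY_DIR.get(top_level)
```

Paths outside `be/`, `fe/`, and `test/` return `None`, since the guidelines do not name a focus for them.
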
## Pull Request Guidelines

### PR Title Format
PR titles must include a prefix to categorize the change:

- **[BugFix]** - Bug fixes and error corrections
- **[Enhancement]** - Improvements to existing functionality
- **[Feature]** - New features and capabilities
- **[Refactor]** - Code refactoring without functional changes
- **[Test]** - Test-related changes
- **[Doc]** - Documentation updates
- **[Build]** - Build system and CI/CD changes
- **[Performance]** - Performance optimizations

**Examples:**
- `[BugFix] Fix memory leak in column batch processing`
- `[Feature] Add support for Apache Paimon connector`
- `[Enhancement] Improve query optimizer for materialized views`

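A minimal sketch of how the prefix rule could be checked mechanically, for instance before opening a PR. The function name `has_valid_prefix` and the regular expression are assumptions made for illustration, not part of the StarRocks tooling; only the eight prefixes come from the list above. The check is case-sensitive and expects a single space after the closing bracket, which is also an assumption.

```python
import re

# Prefixes copied from the PR title list.
PR_PREFIXES = ("BugFix", "Enhancement", "Feature", "Refactor",
               "Test", "Doc", "Build", "Performance")

# Hypothetical pattern: "[Prefix]", one space, then a non-empty description.
_TITLE_RE = re.compile(r"^\[(%s)\] \S.*$" % "|".join(PR_PREFIXES))


def has_valid_prefix(title: str) -> bool:
    """Return True if the title starts with one of the known prefixes."""
    return _TITLE_RE.match(title) is not None
```

All three example titles above pass this check; a title such as `Fix memory leak` without a prefix does not.
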
### Commit Message Template
Follow this structured format for all commit messages:

```
[Category] Brief description (50 chars or less)

Detailed explanation of what this commit does and why.
Wrap lines at 72 characters.

- Key change 1
- Key change 2
- Key change 3

Fixes: #issue_number (if applicable)
Closes: #issue_number (if applicable)
```

**Categories:** BugFix, Enhancement, Feature, Refactor, Test, Doc, Build, Performance

**Example:**
```
[Feature] Add Apache Iceberg table format support

Implement Iceberg connector to enable querying Iceberg tables
directly from StarRocks. This includes metadata reading,
partition pruning, and schema evolution support.

- Add IcebergConnector and IcebergMetadata classes
- Implement partition and file pruning optimizations
- Support for Iceberg v1 and v2 table formats
- Add comprehensive unit tests

Closes: #12345
```

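The template's rules (category prefix, short description, blank second line, 72-character wrap) can be sketched as a small linter. This is an illustration under stated assumptions, not an existing StarRocks hook: `lint_commit_message` is a hypothetical name, and the 50-character limit is assumed to apply to the description after the prefix rather than to the whole subject line. The type hints require Python 3.9 or later.

```python
import re

# Categories copied from the list above.
CATEGORIES = {"BugFix", "Enhancement", "Feature", "Refactor",
              "Test", "Doc", "Build", "Performance"}


def lint_commit_message(message: str) -> list[str]:
    """Return a list of problems; an empty list means the message fits."""
    problems = []
    lines = message.splitlines()
    subject = lines[0] if lines else ""

    # Subject: "[Category] description".
    match = re.match(r"^\[([A-Za-z]+)\] (\S.*)$", subject)
    if match is None or match.group(1) not in CATEGORIES:
        problems.append("subject needs a known [Category] prefix")
    elif len(match.group(2)) > 50:  # assumed to count the description only
        problems.append("description is longer than 50 characters")

    # A blank line separates the subject from the body.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")

    # Body lines wrap at 72 characters.
    for number, line in enumerate(lines[1:], start=2):
        if len(line) > 72:
            problems.append(f"line {number} is longer than 72 characters")
    return problems
```

Run against the Iceberg example above, the function returns an empty list.
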
## Common File Extensions
- `.cpp`, `.h`, `.cc` - C++ source and headers (backend)
- `.java` - Java source files (frontend and extensions)
- `.proto` - Protocol buffer definitions
- `.thrift` - Thrift interface definitions
- `.sql` - SQL test cases and queries
- `.py` - Python test scripts

## Build System Files to Avoid
- `build.sh` - Main build script (very resource intensive)
- `build-in-docker.sh` - Docker-based build
- `run-*-ut.sh` - Unit test runners
- `Makefile*` - Make build files
- `pom.xml` - Maven build files (for Java components)
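
The entries above are shell-style glob patterns, so a path can be tested against them with Python's standard `fnmatch` module. The sketch below is a hedged illustration: `is_build_file` and `AVOID_PATTERNS` are hypothetical names, and matching on the file name alone (ignoring the directory) is an assumption.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the list above.
AVOID_PATTERNS = ("build.sh", "build-in-docker.sh", "run-*-ut.sh",
                  "Makefile*", "pom.xml")


def is_build_file(path: str) -> bool:
    """Return True if the file name matches one of the patterns to avoid."""
    name = PurePosixPath(path).name  # compare the last path component only
    return any(fnmatchcase(name, pattern) for pattern in AVOID_PATTERNS)
```

For example, `run-be-ut.sh` and `fe/pom.xml` both match, while an ordinary source file such as a `.cpp` file under `be/src/` does not.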