first commit

j 2024-10-23 18:00:06 +08:00
commit fe17965342
77 changed files with 3207 additions and 0 deletions

161
.gitignore vendored Normal file

@ -0,0 +1,161 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
/.idea

261
README.md Normal file

@ -0,0 +1,261 @@
## Gudu SQLFlow Lite version for python
[Gudu SQLFlow](https://sqlflow.gudusoft.com) is a tool used to analyze SQL statements and stored procedures
of various databases to obtain complex [data lineage](https://en.wikipedia.org/wiki/Data_lineage) relationships and visualize them.
[Gudu SQLFlow Lite version for python](https://github.com/sqlparser/python_data_lineage) allows Python developers to quickly integrate data lineage analysis and
visualization capabilities into their own Python applications. It can also be used in daily work by data scientists to quickly discover
data lineage from the complex SQL scripts that are typically used in ETL jobs to transform data in a large data platform.
Gudu SQLFlow Lite version for python is free for non-commercial use and can handle any complex SQL statements
with a length of up to 10k, including support for stored procedures. It supports SQL dialect from more than
20 major database vendors such as Oracle, DB2, Snowflake, Redshift, Postgres and so on.
Gudu SQLFlow Lite version for python includes [a Java library](https://www.gudusoft.com/sqlflow-java-library-2/) for analyzing complex SQL statements and
stored procedures to retrieve data lineage relationships, [a Python file](https://github.com/sqlparser/python_data_lineage/blob/main/dlineage.py) that utilizes jpype to call the APIs
in the Java library, and [a JavaScript library](https://docs.gudusoft.com/4.-sqlflow-widget/get-started) for visualizing data lineage relationships.
Gudu SQLFlow Lite version for python can also automatically extract table and column constraints,
as well as relationships between tables and fields, from [DDL scripts exported from the database](https://docs.gudusoft.com/6.-sqlflow-ingester/introduction)
and generate an ER Diagram.
### Automatically visualize data lineage
By executing this command:
```
python dlineage.py /t oracle /f test.sql /graph
```
We can automatically obtain the data lineage relationships contained in the following Oracle SQL statement.
```sql
CREATE VIEW vsal
AS
SELECT a.deptno "Department",
a.num_emp / b.total_count "Employees",
a.sal_sum / b.total_sal "Salary"
FROM (SELECT deptno,
Count(*) num_emp,
SUM(sal) sal_sum
FROM scott.emp
WHERE city = 'NYC'
GROUP BY deptno) a,
(SELECT Count(*) total_count,
SUM(sal) total_sal
FROM scott.emp
WHERE city = 'NYC') b
;
INSERT ALL
WHEN ottl < 100000 THEN
INTO small_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 100000 and ottl < 200000 THEN
INTO medium_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 200000 THEN
into large_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 290000 THEN
INTO special_orders
SELECT o.order_id oid, o.customer_id cid, o.order_total ottl,
o.sales_rep_id sid, c.credit_limit cl, c.cust_email cem
FROM orders o, customers c
WHERE o.customer_id = c.customer_id;
```
And visualize it as:
![Oracle data lineage sample](samples/images/oracle_data_lineage.png)
### Oracle PL/SQL Data Lineage
```
python dlineage.py /t oracle /f samples/oracle_plsql.sql /graph
```
![Oracle PL/SQL data lineage sample](samples/images/oracle_plsql_data_lineage.png)
The [source code of this sample Oracle PL/SQL](samples/oracle_plsql.sql).
### Able to analyze dynamic SQL to get data lineage (Postgres stored procedure)
```sql
CREATE OR REPLACE FUNCTION t.mergemodel(_modelid integer)
RETURNS void
LANGUAGE plpgsql
AS $function$
BEGIN
EXECUTE format ('INSERT INTO InSelections
SELECT * FROM AddInSelections_%s', modelid);
END;
$function$
```
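An example command for analyzing this stored procedure; the file name samples/postgresql_plsql.sql is assumed here from the naming pattern of the other samples and may differ in the repository:
```
python dlineage.py /t postgresql /f samples/postgresql_plsql.sql /graph
```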
![Postgres stored procedure data lineage sample](samples/images/postgresql_plsql_data_lineage.png)
### Nested CTE with star columns (Snowflake SQL sample)
```
python dlineage.py /t snowflake /f samples/snowflake_nested_cte.sql /graph
```
![Snowflake nested CTE data lineage sample](samples/images/snowflake_nested_cte_data_lineage.png)
The [snowflake SQL source code of this sample](samples/snowflake_nested_cte.sql).
### Analyze DDL and automatically draw an ER Diagram.
By executing this command:
```
python dlineage.py /t sqlserver /f samples/sqlserver_er.sql /graph /er
```
We can automatically obtain the ER Diagram of the following SQL Server database:
![SQL Server ER Diagram sample](samples/images/sqlserver_er_diagram.png)
The [DDL script of the above ER diagram is here](samples/sqlserver_er.sql).
## Try your own SQL scripts
You may try more SQL scripts on your own computer, without any internet connection, by cloning [this python data lineage repo](https://github.com/sqlparser/python_data_lineage):
```shell
git clone https://github.com/sqlparser/python_data_lineage.git
```
- No database connection is needed.
- No internet connection is needed.
You only need a JDK and a python interpreter to run the Gudu SQLFlow lite version for python.
### Step 1: Prerequisites
* Install python3
* Install Java JDK 1.8 (OpenJDK 8 is recommended)
Command used to check java version:
`java -version`
If Java is not installed, execute this command:
`sudo apt install openjdk-8-jdk`
If this error occurs:
`Unable to locate package openjdk-8-jdk`
Please execute the following commands:
```
sudo add-apt-repository ppa:openjdk-r/ppa
apt-get update
sudo apt install openjdk-8-jdk
```
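Besides the JDK, dlineage.py also needs the Python dependency jpype (imported at the top of the script). A minimal install sketch, assuming the package is taken from PyPI under its usual name JPype1:
```
pip install JPype1
```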
### Step 2: Start the web service
Switch to the widget directory of this project and execute the following command to start the web service:
`python -m http.server 8000`
Open the following URL in a web browser to verify that the startup was successful: http://localhost:8000/
Note: If you want to modify the port 8000, you need to modify the widget_server_url in dlineage.py accordingly.
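For example, if you serve the widget on port 9000 instead (an arbitrary illustrative port), the corresponding assignment near the top of call_dataFlowAnalyzer in dlineage.py would become:
```python
# dlineage.py: point the script at the widget server started in Step 2
widget_server_url = "http://localhost:9000"
```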
### Step 3: Execute the Python script
Open a new command window, switch to the root directory of this project, where the dlineage.py file is located, and execute the following command:
`python dlineage.py /t oracle /f test.sql /graph`
This command performs data lineage analysis on test.sql and opens a web browser page that displays the analysis results graphically.
Explanations of the command-line parameters supported by dlineage.py:
/t: required, specify the database type.
The valid values are: access, bigquery, couchbase, dax, db2, greenplum, gaussdb, hana, hive, impala, informix, mdx, mssql,
sqlserver, mysql, netezza, odbc, openedge, oracle, postgresql, postgres, redshift, snowflake,
sybase, teradata, soql, vertica.
The default value is oracle.
/f: optional, the SQL file that needs to be processed; if this option is not specified, /d must be specified.
/d: optional, all SQL files under this directory will be processed.
/j: optional, the analyzed result will include the join relationships.
/s: optional, ignore the intermediate results in the output data lineage.
/topselectlist: optional, output the columns in the select list; this option is valid only when /s is specified.
/withTemporaryTable: optional, only valid when used with the /s option; includes the data lineage of temporary tables used in the SQL.
/i: optional, works almost the same as the /s option, but keeps the data lineage generated by function calls.
/if: optional, keep all the intermediate results in the output data lineage, but remove the results derived from function calls.
/ic: optional, ignore the coordinates in the output.
/lof: optional, if a column in the SQL is not qualified with a table name and multiple tables are used in the FROM clause, the column will be linked to the first table in the FROM clause.
/traceView: optional, only list source tables and views, ignoring all intermediate results.
/json: optional, output in JSON format.
/tableLineage [/csv /delimiter]: optional, only output data lineage at the table level.
/csv: optional, output the data lineage in CSV format.
/delimiter: optional, specify the separator character used in the CSV output.
/env: optional, specify a metadata.json to provide the metadata that can be used during SQL analysis.
/transform: optional, include the code that does the transform.
/coor: optional, include the coordinates in the output.
/defaultDatabase: optional, specify a default database.
/defaultSchema: optional, specify a default schema.
/showImplicitSchema: optional, display the schema information inferred from the SQL statements.
/showConstant: optional, show constants.
/treatArgumentsInCountFunctionAsDirectDataflow: optional, treat columns used in the count function as a direct dataflow.
/filterRelationTypes: optional, supported types: fdd, fdr, join, caller; separate multiple values with commas.
/graph: optional, automatically open a web browser to show the data lineage diagram.
/er: optional, automatically open a web browser to show the ER diagram.
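For example, a sketch that combines several of the options above to produce table-level lineage in CSV form for one of the bundled samples (the chosen sample file and comma delimiter are only illustrative):
```
python dlineage.py /t snowflake /f samples/snowflake_nested_cte.sql /tableLineage /csv /delimiter ,
```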
### Export metadata from various databases.
You can export metadata from the database using [SQLFlow ingester](https://github.com/sqlparser/sqlflow_public/releases)
and hand it over to Gudu SQLFlow for data lineage analysis.
[Document of the SQLFlow ingester](https://docs.gudusoft.com/6.-sqlflow-ingester/introduction)
## Troubleshooting
### 1. SystemError: java.lang.ClassNotFoundException: org.jpype.classloader.DynamicClassLoader
```
Traceback (most recent call last):
File "/home/grq/python_data_lineage/dlineage.py", line 231, in <module>
call_dataFlowAnalyzer(args)
File "/home/grq/python_data_lineage/dlineage.py", line 20, in call_dataFlowAnalyzer
jpype.startJVM(jvm, "-ea", jar)
File "/usr/lib/python3/dist-packages/jpype/_core.py", line 224, in startJVM
_jpype.startup(jvmpath, tuple(args),
SystemError: java.lang.ClassNotFoundException: org.jpype.classloader.DynamicClassLoader
```
This problem is related to the python3 jpype package on Ubuntu systems. It seems that the org.jpype.jar file is missing under /usr/lib/python3/dist-packages/.
Just copy org.jpype.jar to /usr/lib/python3/dist-packages/:
```
cp /usr/share/java/org.jpype.jar /usr/lib/python3/dist-packages/org.jpype.jar
```
## Contact
For further information, please contact support@gudusoft.com

247
README_cn.md Normal file

@ -0,0 +1,247 @@
## Gudu SQLFlow Lite version for python
[Gudu SQLFlow](https://sqlflow.gudusoft.com) is a tool that analyzes the SQL statements and stored procedures of various databases to obtain complex data lineage relationships and visualize them.
Gudu SQLFlow Lite version for python lets Python developers quickly integrate data lineage analysis and visualization capabilities into their own Python applications.
Gudu SQLFlow Lite version for python is free for non-commercial use; it can handle arbitrarily complex SQL statements up to 10k characters in length, including support for stored procedures.
Gudu SQLFlow Lite version for python includes a Java library that analyzes complex SQL statements and stored procedures to retrieve data lineage relationships, a Python file
that calls the APIs in the Java library via jpype, and a JavaScript library for visualizing data lineage relationships.
Gudu SQLFlow Lite version for python can also automatically extract table-to-table and column-to-column constraints and relationships from DDL scripts exported from the database, and draw an ER Diagram.
### Automatically visualize data lineage
By executing this command:
```
python dlineage.py /t oracle /f test.sql /graph
```
We can automatically obtain the data lineage relationships contained in the following Oracle SQL statement.
```sql
CREATE VIEW vsal
AS
SELECT a.deptno "Department",
a.num_emp / b.total_count "Employees",
a.sal_sum / b.total_sal "Salary"
FROM (SELECT deptno,
Count(*) num_emp,
SUM(sal) sal_sum
FROM scott.emp
WHERE city = 'NYC'
GROUP BY deptno) a,
(SELECT Count(*) total_count,
SUM(sal) total_sal
FROM scott.emp
WHERE city = 'NYC') b
;
INSERT ALL
WHEN ottl < 100000 THEN
INTO small_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 100000 and ottl < 200000 THEN
INTO medium_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 200000 THEN
into large_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 290000 THEN
INTO special_orders
SELECT o.order_id oid, o.customer_id cid, o.order_total ottl,
o.sales_rep_id sid, c.credit_limit cl, c.cust_email cem
FROM orders o, customers c
WHERE o.customer_id = c.customer_id;
```
And visualize it as:
![Oracle data lineage sample](samples/images/oracle_data_lineage.png)
### Oracle PL/SQL Data Lineage
```
python dlineage.py /t oracle /f samples/oracle_plsql.sql /graph
```
![Oracle PL/SQL data lineage sample](samples/images/oracle_plsql_data_lineage.png)
The [source code of this sample Oracle PL/SQL](samples/oracle_plsql.sql).
### Able to analyze dynamic SQL to get data lineage (Postgres stored procedure)
```sql
CREATE OR REPLACE FUNCTION t.mergemodel(_modelid integer)
RETURNS void
LANGUAGE plpgsql
AS $function$
BEGIN
EXECUTE format ('INSERT INTO InSelections
SELECT * FROM AddInSelections_%s', modelid);
END;
$function$
```
![Postgres stored procedure data lineage sample](samples/images/postgresql_plsql_data_lineage.png)
### Nested CTE with star columns (Snowflake SQL sample)
```
python dlineage.py /t snowflake /f samples/snowflake_nested_cte.sql /graph
```
![Snowflake nested CTE data lineage sample](samples/images/snowflake_nested_cte_data_lineage.png)
The [snowflake SQL source code of this sample](samples/snowflake_nested_cte.sql).
### Analyze DDL and automatically draw an ER Diagram
By executing this command:
```
python dlineage.py /t sqlserver /f samples/sqlserver_er.sql /graph /er
```
We can automatically obtain the ER Diagram of the following SQL Server database:
![SQL Server ER Diagram sample](samples/images/sqlserver_er_diagram.png)
The [DDL script of the above ER diagram is here](samples/sqlserver_er.sql).
## Try your own SQL scripts
You may try more SQL scripts on your own computer, without any internet connection, by cloning [this python data lineage repo](https://github.com/sqlparser/python_data_lineage):
```shell
git clone https://github.com/sqlparser/python_data_lineage.git
```
- No database connection is needed.
- No internet connection is needed.
You only need a JDK and a python interpreter to run the Gudu SQLFlow lite version for python.
### Step 1: Prerequisites
* Install python3
After installing python3, you also need to install the Python dependency jpype.
* Install the Java JDK; JDK 1.8 or above is required
Take installation on Ubuntu as an example.
Check the JDK version: `java -version`.
If the JDK is not installed, or its version is lower than 1.8, install JDK 1.8:
`sudo apt install openjdk-8-jdk`
If this error occurs:
`Unable to locate package openjdk-8-jdk`
execute the following commands to install it:
```
sudo add-apt-repository ppa:openjdk-r/ppa
apt-get update
sudo apt install openjdk-8-jdk
```
### Step 2: Start the web service
Switch to the widget directory of this project and execute the following command to start the web service:
`python -m http.server 8000`
Open the following URL in a browser to verify that the startup was successful: http://localhost:8000/
Note: if you want to change port 8000, you also need to modify widget_server_url in dlineage.py accordingly.
### Step 3: Execute the Python script
Switch to the root directory of this project (the directory where dlineage.py is located) and execute the following command:
`python dlineage.py /f test.sql /graph`
This command performs data lineage analysis on test.sql and opens a browser page that displays the analysis results graphically.
Explanations of the command-line parameters supported by dlineage.py:
/f: optional, the SQL file to be processed.
/d: optional, the path of a directory containing SQL files.
/j: optional, return the result including the join relationships.
/s: optional, simple output, ignoring intermediate results.
/topselectlist: optional, simple output, including the top-level output results.
/withTemporaryTable: optional, simple output, including temporary tables.
/i: optional, the same as the /s option, but keeps the result sets generated by SQL functions; this parameter has the same effect as /s /topselectlist plus keeping the result sets generated by SQL functions.
/showResultSetTypes: optional, simple output with the specified result set types, separated by commas; the result set types are: array, struct, result_of, cte, insert_select, update_select, merge_update, merge_insert, output, update_set, pivot_table, unpivot_table, alias, rs, function, case_when
/if: optional, keep all intermediate result sets, but remove the result sets generated by SQL functions.
/ic: optional, ignore the coordinates in the output.
/lof: optional, link orphan columns to the first table.
/traceView: optional, only output the names of source tables and views, ignoring all intermediate data.
/text: optional, if only the /s option is used, output the column dependencies in text mode.
/json: optional, print the output in JSON format.
/tableLineage [/csv /delimiter]: optional, output table-level lineage.
/csv: optional, output column-level lineage in CSV format.
/delimiter: optional, the delimiter used in the CSV output.
/t: required, specify the database type.
Supports access, bigquery, couchbase, dax, db2, greenplum, gaussdb, hana, hive, impala, informix, mdx, mssql,
sqlserver, mysql, netezza, odbc, openedge, oracle, postgresql, postgres, redshift, snowflake,
sybase, teradata, soql, vertica; the default value is oracle.
/env: optional, specify a metadata.json to provide database metadata information.
/transform: optional, output the relation transform code.
/coor: optional, output the relation transform coordinates, but not the code.
/defaultDatabase: optional, specify the default database.
/defaultSchema: optional, specify the default schema.
/showImplicitSchema: optional, show the implicit schema.
/showConstant: optional, show constants.
/treatArgumentsInCountFunctionAsDirectDataflow: optional, treat the arguments in the count function as a direct dataflow.
/filterRelationTypes: optional, filter relation types; supports fdd, fdr, join, caller; separate multiple relation types with commas.
/graph: optional, open a browser page that displays the lineage analysis results graphically.
/er: optional, open a browser page that displays the ER diagram graphically.
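For example, to print the column-level lineage of the bundled Oracle PL/SQL sample as JSON (the sample path is taken from the sections above):
```
python dlineage.py /t oracle /f samples/oracle_plsql.sql /json
```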
### Export metadata from various databases
[SQLFlow ingester](https://github.com/sqlparser/sqlflow_public/releases) can export metadata from the database and hand it over to Gudu SQLFlow for data lineage analysis.
[Documentation of the SQLFlow ingester](https://docs.gudusoft.com/6.-sqlflow-ingester/introduction)
### Troubleshooting
#### 1. The script fails with: SystemError: java.lang.ClassNotFoundException: org.jpype.classloader.DynamicClassLoader
```
Traceback (most recent call last):
File "/home/grq/python_data_lineage/dlineage.py", line 231, in <module>
call_dataFlowAnalyzer(args)
File "/home/grq/python_data_lineage/dlineage.py", line 20, in call_dataFlowAnalyzer
jpype.startJVM(jvm, "-ea", jar)
File "/usr/lib/python3/dist-packages/jpype/_core.py", line 224, in startJVM
_jpype.startup(jvmpath, tuple(args),
SystemError: java.lang.ClassNotFoundException: org.jpype.classloader.DynamicClassLoader
```
This problem is common with the python3 jpype environment preinstalled on Ubuntu systems; the cause is that org.jpype.jar is missing from the /usr/lib/python3/dist-packages/ directory.
Copy org.jpype.jar into the /usr/lib/python3/dist-packages/ directory:
```
cp /usr/share/java/org.jpype.jar /usr/lib/python3/dist-packages/org.jpype.jar
```

283
dlineage.py Normal file

@ -0,0 +1,283 @@
# python3
import os
import webbrowser
import jpype
import sys
def get_file_character_count(file_path):
character_count = 0
with open(file_path, "r") as file:
try:
content = file.read()
character_count = len(content)
except:
print(file_path + " is not a text file.")
return character_count
def get_all_files(folder_path):
all_files = []
for root, dirs, files in os.walk(folder_path):
for file in files:
file_path = os.path.join(root, file)
all_files.append(file_path)
return all_files
def get_text_files_character_count(folder_path):
text_files = get_all_files(folder_path)
total_character_count = 0
for file_path in text_files:
character_count = get_file_character_count(file_path)
total_character_count += character_count
return total_character_count
def indexOf(args, arg):
try:
return args.index(arg)
except:
return -1
def save_to_file(file_name, contents):
fh = open(file_name, 'w')
fh.write(contents)
fh.close()
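# Analyze the SQL specified by the command-line options in args with the Java DataFlowAnalyzer
# (loaded via jpype from jar/gudusoft.gsqlparser-2.8.5.8.jar), print the lineage result,
# and optionally open the widget pages for the /graph and /er options.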
def call_dataFlowAnalyzer(args):
# Start the Java Virtual Machine (JVM)
widget_server_url = "http://localhost:8000"
jvm = jpype.getDefaultJVMPath()
jar = "-Djava.class.path=jar/gudusoft.gsqlparser-2.8.5.8.jar"
jpype.startJVM(jvm, "-ea", jar)
try:
TGSqlParser = jpype.JClass("gudusoft.gsqlparser.TGSqlParser")
DataFlowAnalyzer = jpype.JClass("gudusoft.gsqlparser.dlineage.DataFlowAnalyzer")
ProcessUtility = jpype.JClass("gudusoft.gsqlparser.dlineage.util.ProcessUtility")
JSON = jpype.JClass("gudusoft.gsqlparser.util.json.JSON")
XML2Model = jpype.JClass("gudusoft.gsqlparser.dlineage.util.XML2Model")
RemoveDataflowFunction = jpype.JClass("gudusoft.gsqlparser.dlineage.util.RemoveDataflowFunction")
File = jpype.JClass("java.io.File")
EDbVendor = jpype.JClass("gudusoft.gsqlparser.EDbVendor")
vendor = EDbVendor.dbvoracle
index = indexOf(args, "/t")
if index != -1 and len(args) > index + 1:
vendor = TGSqlParser.getDBVendorByName(args[index + 1])
if indexOf(args, "/version") != -1:
print("Version: " + DataFlowAnalyzer.getVersion())
print("Release Date: " + DataFlowAnalyzer.getReleaseDate())
return
if indexOf(args, "/f") != -1 and len(args) > indexOf(args, "/f") + 1:
sqlFiles = File(args[indexOf(args, "/f") + 1])
if not sqlFiles.exists() or not sqlFiles.isFile():
print(args[indexOf(args, "/f") + 1] + " is not a valid file.")
return
character_count = get_file_character_count(args[indexOf(args, "/f") + 1])
if character_count > 10000:
print("SQLFlow lite version only supports processing SQL statements with a maximum of 10,"
"000 characters. If you need to process SQL statements without length restrictions, "
"please contact support@gudusoft.com for more information.")
return
elif indexOf(args, "/d") != -1 and len(args) > indexOf(args, "/d") + 1:
sqlFiles = File(args[indexOf(args, "/d") + 1])
if not sqlFiles.exists() or not sqlFiles.isDirectory():
print(args[indexOf(args, "/d") + 1] + " is not a valid directory.")
return
character_count = get_text_files_character_count(args[indexOf(args, "/d") + 1])
if character_count > 10000:
print("SQLFlow lite version only supports processing SQL statements with a maximum of 10,"
"000 characters. If you need to process SQL statements without length restrictions, "
"please contact support@gudusoft.com for more information.")
return
else:
print("Please specify a sql file path or directory path to analyze dlineage.")
return
simple = indexOf(args, "/s") != -1
ignoreTemporaryTable = indexOf(args, "/withTemporaryTable") == -1
ignoreResultSets = indexOf(args, "/i") != -1
showJoin = indexOf(args, "/j") != -1
transform = indexOf(args, "/transform") != -1
transformCoordinate = transform and (indexOf(args, "/coor") != -1)
textFormat = False
linkOrphanColumnToFirstTable = indexOf(args, "/lof") != -1
ignoreCoordinate = indexOf(args, "/ic") != -1
showImplicitSchema = indexOf(args, "/showImplicitSchema") != -1
if simple:
textFormat = indexOf(args, "/text") != -1
traceView = indexOf(args, "/traceView") != -1
if traceView:
simple = True
jsonFormat = indexOf(args, "/json") != -1
stat = indexOf(args, "/stat") != -1
ignoreFunction = indexOf(args, "/if") != -1
topselectlist = indexOf(args, "/topselectlist") != -1
if indexOf(args, "/s") != -1 and indexOf(args, "/topselectlist") != -1:
simple = True
topselectlist = True
tableLineage = indexOf(args, "/tableLineage") != -1
csv = indexOf(args, "/csv") != -1
delimiter = args[indexOf(args, "/delimiter") + 1] if indexOf(args, "/delimiter") != -1 and len(
args) > indexOf(args, "/delimiter") + 1 else ","
if tableLineage:
simple = False
ignoreResultSets = False
sqlenv = None
if indexOf(args, "/env") != -1 and len(args) > indexOf(args, "/env") + 1:
metadataFile = File(args[indexOf(args, "/env") + 1])
if metadataFile.exists():
TJSONSQLEnvParser = jpype.JClass("gudusoft.gsqlparser.sqlenv.parser.TJSONSQLEnvParser")
jsonSQLEnvParser = TJSONSQLEnvParser(None, None, None)
SQLUtil = jpype.JClass("gudusoft.gsqlparser.util.SQLUtil")
envs = jsonSQLEnvParser.parseSQLEnv(vendor, SQLUtil.getFileContent(metadataFile))
if envs != None and envs.length > 0:
sqlenv = envs[0]
dlineage = DataFlowAnalyzer(sqlFiles, vendor, simple)
if sqlenv != None:
dlineage.setSqlEnv(sqlenv)
dlineage.setTransform(transform)
dlineage.setTransformCoordinate(transformCoordinate)
dlineage.setShowJoin(showJoin)
dlineage.setIgnoreRecordSet(ignoreResultSets)
if ignoreResultSets and not ignoreFunction:
dlineage.setSimpleShowFunction(True)
dlineage.setLinkOrphanColumnToFirstTable(linkOrphanColumnToFirstTable)
dlineage.setIgnoreCoordinate(ignoreCoordinate)
dlineage.setSimpleShowTopSelectResultSet(topselectlist)
dlineage.setShowImplicitSchema(showImplicitSchema)
dlineage.setIgnoreTemporaryTable(ignoreTemporaryTable)
if simple:
dlineage.setShowCallRelation(True)
dlineage.setShowConstantTable(indexOf(args, "/showConstant") != -1)
dlineage.setShowCountTableColumn(indexOf(args, "/treatArgumentsInCountFunctionAsDirectDataflow") != -1)
if indexOf(args, "/defaultDatabase") != -1:
dlineage.getOption().setDefaultDatabase(args[indexOf(args, "/defaultDatabase") + 1])
if indexOf(args, "/defaultSchema") != -1:
dlineage.getOption().setDefaultSchema(args[indexOf(args, "/defaultSchema") + 1])
if indexOf(args, "/showResultSetTypes") != -1:
resultSetTypes = args[indexOf(args, "/showResultSetTypes") + 1]
if resultSetTypes is not None:
dlineage.getOption().showResultSetTypes(resultSetTypes.split(","))
if indexOf(args, "/filterRelationTypes") != -1:
dlineage.getOption().filterRelationTypes(args[indexOf(args, "/filterRelationTypes") + 1])
if simple and not jsonFormat:
dlineage.setTextFormat(textFormat)
if indexOf(args, "/er") != -1:
dlineage.getOption().setShowERDiagram(True)
dlineage.generateDataFlow()
dataflow = dlineage.getDataFlow()
DataFlowGraphGenerator = jpype.JClass("gudusoft.gsqlparser.dlineage.graph.DataFlowGraphGenerator")
generator = DataFlowGraphGenerator()
result = generator.genERGraph(vendor, dataflow)
save_to_file("widget/json/erGraph.json", str(result))
webbrowser.open_new(widget_server_url + "/er.html")
return
elif tableLineage:
dlineage.generateDataFlow()
originDataflow = dlineage.getDataFlow()
if csv:
dataflow = ProcessUtility.generateTableLevelLineage(dlineage, originDataflow)
result = ProcessUtility.generateTableLevelLineageCsv(dlineage, originDataflow, delimiter)
else:
dataflow = ProcessUtility.generateTableLevelLineage(dlineage, originDataflow)
if jsonFormat:
model = DataFlowAnalyzer.getSqlflowJSONModel(dataflow, vendor)
result = JSON.toJSONString(model)
else:
result = XML2Model.saveXML(dataflow)
else:
result = dlineage.generateDataFlow()
dataflow = dlineage.getDataFlow()
if csv:
dataflow = dlineage.getDataFlow()
result = ProcessUtility.generateColumnLevelLineageCsv(dlineage, dataflow, delimiter)
elif jsonFormat:
dataflow = dlineage.getDataFlow()
if ignoreFunction:
dataflow = RemoveDataflowFunction().removeFunction(dataflow, vendor)
model = DataFlowAnalyzer.getSqlflowJSONModel(dataflow, vendor)
result = JSON.toJSONString(model)
elif traceView:
dataflow = dlineage.getDataFlow()
result = dlineage.traceView()
elif ignoreFunction and result.trim().startsWith("<?xml"):
dataflow = dlineage.getDataFlow()
dataflow = RemoveDataflowFunction().removeFunction(dataflow, vendor)
result = XML2Model.saveXML(dataflow)
if result != None:
print(result)
if dataflow != None and indexOf(args, "/graph") != -1:
DataFlowGraphGenerator = jpype.JClass("gudusoft.gsqlparser.dlineage.graph.DataFlowGraphGenerator")
generator = DataFlowGraphGenerator()
result = generator.genDlineageGraph(vendor, False, dataflow)
save_to_file("widget/json/lineageGraph.json", str(result))
webbrowser.open_new(widget_server_url)
errors = dlineage.getErrorMessages()
if not errors.isEmpty():
print("Error log:\n")
for err in errors:
print(err.getErrorMessage())
finally:
# Shutdown the JVM when done
jpype.shutdownJVM()
if __name__ == "__main__":
args = sys.argv
if len(args) < 2:
print("Usage: python dlineage.py [/f <path_to_sql_file>] [/d <path_to_directory_includes_sql_files>] ["
"/stat] [/s [/topselectlist] [/text] [/withTemporaryTable]] [/i] [/showResultSetTypes "
"<resultset_types>] [/ic] [/lof] [/j] [/json] [/traceView] [/t <database type>] [/o <output file path>] "
"[/version] [/env <path_to_metadata.json>] [/tableLineage [/csv [/delimiter <delimiter>]]] [/transform "
"[/coor]] [/showConstant] [/treatArgumentsInCountFunctionAsDirectDataflow] [/filterRelationTypes "
"<relationTypes>]")
print("/f: Optional, the full path to SQL file.")
print("/d: Optional, the full path to the directory includes the SQL files.")
print("/j: Optional, return the result including the join relation.")
print("/s: Optional, simple output, ignore the intermediate results.")
print("/topselectlist: Optional, simple output with top select results.")
print("/withTemporaryTable: Optional, simple output with the temporary tables.")
print("/i: Optional, the same as /s option, but will keep the resultset generated by the SQL function, "
"this parameter will have the same effect as /s /topselectlist + keep resultset generated by the sql "
"function.")
print("/showResultSetTypes: Optional, simple output with specify resultset types, separate with commas, "
"resultset types contains array, struct, result_of, cte, insert_select, update_select, merge_update, "
"merge_insert, output, update_set,\r\n"
+ " pivot_table, unpivot_table, alias, rs, function, case_when")
print("/if: Optional, keep all the intermediate resultset, but remove the resultset generated by the SQL "
"function")
print("/ic: Optional, ignore the coordinates in the output.")
print("/lof: Optional, link orphan columns to the first table.")
print("/traceView: Optional, only output the name of source tables and views, ignore all intermediate data.")
print("/text: Optional, this option is valid only /s is used, output the column dependency in text mode.")
print("/json: Optional, print the json format output.")
print("/tableLineage [/csv /delimiter]: Optional, output table level lineage.")
print("/csv: Optional, output column level lineage in csv format.")
print("/delimiter: Optional, the delimiter of output column level lineage in csv format.")
print("/t: Optional, set the database type. "
+ "Support access,bigquery,couchbase,dax,db2,greenplum,hana,hive,impala,informix,mdx,mssql,\n"
+ "sqlserver,mysql,netezza,odbc,openedge,oracle,postgresql,postgres,redshift,snowflake,\n"
+ "sybase,teradata,soql,vertica\n, " + "the default value is oracle")
print("/env: Optional, specify a metadata.json to get the database metadata information.")
print("/transform: Optional, output the relation transform code.")
print("/coor: Optional, output the relation transform coordinate, but not the code.")
print("/defaultDatabase: Optional, specify the default database.")
print("/defaultSchema: Optional, specify the default schema.")
print("/showImplicitSchema: Optional, show implicit schema.")
print("/showConstant: Optional, show constant table.")
print("/treatArgumentsInCountFunctionAsDirectDataflow: Optional, treat arguments in count function as direct "
"dataflow.")
print("/filterRelationTypes: Optional, support fdd, fdr, join, call, er, multiple relation types separated by "
"commas")
print("/graph: Optional, Open a browser page to graphically display the results")
print("/er: Optional, Open a browser page and display the ER diagram graphically")
sys.exit(0)
call_dataFlowAnalyzer(args)

1
github/HEAD Normal file

@ -0,0 +1 @@
ref: refs/heads/main

13
github/config Normal file

@ -0,0 +1,13 @@
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
ignorecase = true
precomposeunicode = true
[remote "origin"]
url = https://github.com/sqlparser/python_data_lineage.git
fetch = +refs/heads/*:refs/remotes/origin/*
[branch "main"]
remote = origin
merge = refs/heads/main

1
github/description Normal file

@ -0,0 +1 @@
Unnamed repository; edit this file 'description' to name the repository.

15
github/hooks/applypatch-msg.sample Executable file

@ -0,0 +1,15 @@
#!/bin/sh
#
# An example hook script to check the commit log message taken by
# applypatch from an e-mail message.
#
# The hook should exit with non-zero status after issuing an
# appropriate message if it wants to stop the commit. The hook is
# allowed to edit the commit message file.
#
# To enable this hook, rename this file to "applypatch-msg".
. git-sh-setup
commitmsg="$(git rev-parse --git-path hooks/commit-msg)"
test -x "$commitmsg" && exec "$commitmsg" ${1+"$@"}
:

24
github/hooks/commit-msg.sample Executable file

@ -0,0 +1,24 @@
#!/bin/sh
#
# An example hook script to check the commit log message.
# Called by "git commit" with one argument, the name of the file
# that has the commit message. The hook should exit with non-zero
# status after issuing an appropriate message if it wants to stop the
# commit. The hook is allowed to edit the commit message file.
#
# To enable this hook, rename this file to "commit-msg".
# Uncomment the below to add a Signed-off-by line to the message.
# Doing this in a hook is a bad idea in general, but the prepare-commit-msg
# hook is more suited to it.
#
# SOB=$(git var GIT_AUTHOR_IDENT | sed -n 's/^\(.*>\).*$/Signed-off-by: \1/p')
# grep -qs "^$SOB" "$1" || echo "$SOB" >> "$1"
# This example catches duplicate Signed-off-by lines.
test "" = "$(grep '^Signed-off-by: ' "$1" |
sort | uniq -c | sed -e '/^[ ]*1[ ]/d')" || {
echo >&2 Duplicate Signed-off-by lines.
exit 1
}

174
github/hooks/fsmonitor-watchman.sample Executable file

@ -0,0 +1,174 @@
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open2;
# An example hook script to integrate Watchman
# (https://facebook.github.io/watchman/) with git to speed up detecting
# new and modified files.
#
# The hook is passed a version (currently 2) and last update token
# formatted as a string and outputs to stdout a new update token and
# all files that have been modified since the update token. Paths must
# be relative to the root of the working tree and separated by a single NUL.
#
# To enable this hook, rename this file to "query-watchman" and set
# 'git config core.fsmonitor .git/hooks/query-watchman'
#
my ($version, $last_update_token) = @ARGV;
# Uncomment for debugging
# print STDERR "$0 $version $last_update_token\n";
# Check the hook interface version
if ($version ne 2) {
die "Unsupported query-fsmonitor hook version '$version'.\n" .
"Falling back to scanning...\n";
}
my $git_work_tree = get_working_dir();
my $retry = 1;
my $json_pkg;
eval {
require JSON::XS;
$json_pkg = "JSON::XS";
1;
} or do {
require JSON::PP;
$json_pkg = "JSON::PP";
};
launch_watchman();
sub launch_watchman {
my $o = watchman_query();
if (is_work_tree_watched($o)) {
output_result($o->{clock}, @{$o->{files}});
}
}
sub output_result {
my ($clockid, @files) = @_;
# Uncomment for debugging watchman output
# open (my $fh, ">", ".git/watchman-output.out");
# binmode $fh, ":utf8";
# print $fh "$clockid\n@files\n";
# close $fh;
binmode STDOUT, ":utf8";
print $clockid;
print "\0";
local $, = "\0";
print @files;
}
sub watchman_clock {
my $response = qx/watchman clock "$git_work_tree"/;
die "Failed to get clock id on '$git_work_tree'.\n" .
"Falling back to scanning...\n" if $? != 0;
return $json_pkg->new->utf8->decode($response);
}
sub watchman_query {
my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j --no-pretty')
or die "open2() failed: $!\n" .
"Falling back to scanning...\n";
# In the query expression below we're asking for names of files that
# changed since $last_update_token but not from the .git folder.
#
# To accomplish this, we're using the "since" generator to use the
# recency index to select candidate nodes and "fields" to limit the
# output to file names only. Then we're using the "expression" term to
# further constrain the results.
my $last_update_line = "";
if (substr($last_update_token, 0, 1) eq "c") {
$last_update_token = "\"$last_update_token\"";
$last_update_line = qq[\n"since": $last_update_token,];
}
my $query = <<" END";
["query", "$git_work_tree", {$last_update_line
"fields": ["name"],
"expression": ["not", ["dirname", ".git"]]
}]
END
# Uncomment for debugging the watchman query
# open (my $fh, ">", ".git/watchman-query.json");
# print $fh $query;
# close $fh;
print CHLD_IN $query;
close CHLD_IN;
my $response = do {local $/; <CHLD_OUT>};
# Uncomment for debugging the watch response
# open ($fh, ">", ".git/watchman-response.json");
# print $fh $response;
# close $fh;
die "Watchman: command returned no output.\n" .
"Falling back to scanning...\n" if $response eq "";
die "Watchman: command returned invalid output: $response\n" .
"Falling back to scanning...\n" unless $response =~ /^\{/;
return $json_pkg->new->utf8->decode($response);
}
sub is_work_tree_watched {
my ($output) = @_;
my $error = $output->{error};
if ($retry > 0 and $error and $error =~ m/unable to resolve root .* directory (.*) is not watched/) {
$retry--;
my $response = qx/watchman watch "$git_work_tree"/;
die "Failed to make watchman watch '$git_work_tree'.\n" .
"Falling back to scanning...\n" if $? != 0;
$output = $json_pkg->new->utf8->decode($response);
$error = $output->{error};
die "Watchman: $error.\n" .
"Falling back to scanning...\n" if $error;
# Uncomment for debugging watchman output
# open (my $fh, ">", ".git/watchman-output.out");
# close $fh;
# Watchman will always return all files on the first query so
# return the fast "everything is dirty" flag to git and do the
# Watchman query just to get it over with now so we won't pay
# the cost in git to look up each individual file.
my $o = watchman_clock();
$error = $output->{error};
die "Watchman: $error.\n" .
"Falling back to scanning...\n" if $error;
output_result($o->{clock}, ("/"));
$last_update_token = $o->{clock};
eval { launch_watchman() };
return 0;
}
die "Watchman: $error.\n" .
"Falling back to scanning...\n" if $error;
return 1;
}
sub get_working_dir {
my $working_dir;
if ($^O =~ 'msys' || $^O =~ 'cygwin') {
$working_dir = Win32::GetCwd();
$working_dir =~ tr/\\/\//;
} else {
require Cwd;
$working_dir = Cwd::cwd();
}
return $working_dir;
}

8
github/hooks/post-update.sample Executable file

@ -0,0 +1,8 @@
#!/bin/sh
#
# An example hook script to prepare a packed repository for use over
# dumb transports.
#
# To enable this hook, rename this file to "post-update".
exec git update-server-info

14
github/hooks/pre-applypatch.sample Executable file

@ -0,0 +1,14 @@
#!/bin/sh
#
# An example hook script to verify what is about to be committed
# by applypatch from an e-mail message.
#
# The hook should exit with non-zero status after issuing an
# appropriate message if it wants to stop the commit.
#
# To enable this hook, rename this file to "pre-applypatch".
. git-sh-setup
precommit="$(git rev-parse --git-path hooks/pre-commit)"
test -x "$precommit" && exec "$precommit" ${1+"$@"}
:

49
github/hooks/pre-commit.sample Executable file

@ -0,0 +1,49 @@
#!/bin/sh
#
# An example hook script to verify what is about to be committed.
# Called by "git commit" with no arguments. The hook should
# exit with non-zero status after issuing an appropriate message if
# it wants to stop the commit.
#
# To enable this hook, rename this file to "pre-commit".
if git rev-parse --verify HEAD >/dev/null 2>&1
then
against=HEAD
else
# Initial commit: diff against an empty tree object
against=$(git hash-object -t tree /dev/null)
fi
# If you want to allow non-ASCII filenames set this variable to true.
allownonascii=$(git config --type=bool hooks.allownonascii)
# Redirect output to stderr.
exec 1>&2
# Cross platform projects tend to avoid non-ASCII filenames; prevent
# them from being added to the repository. We exploit the fact that the
# printable range starts at the space character and ends with tilde.
if [ "$allownonascii" != "true" ] &&
# Note that the use of brackets around a tr range is ok here, (it's
# even required, for portability to Solaris 10's /usr/bin/tr), since
# the square bracket bytes happen to fall in the designated range.
test $(git diff --cached --name-only --diff-filter=A -z $against |
LC_ALL=C tr -d '[ -~]\0' | wc -c) != 0
then
cat <<\EOF
Error: Attempt to add a non-ASCII file name.
This can cause problems if you want to work with people on other platforms.
To be portable it is advisable to rename the file.
If you know what you are doing you can disable this check using:
git config hooks.allownonascii true
EOF
exit 1
fi
# If there are whitespace errors, print the offending file names and fail.
exec git diff-index --check --cached $against --

13
github/hooks/pre-merge-commit.sample Executable file

@ -0,0 +1,13 @@
#!/bin/sh
#
# An example hook script to verify what is about to be committed.
# Called by "git merge" with no arguments. The hook should
# exit with non-zero status after issuing an appropriate message to
# stderr if it wants to stop the merge commit.
#
# To enable this hook, rename this file to "pre-merge-commit".
. git-sh-setup
test -x "$GIT_DIR/hooks/pre-commit" &&
exec "$GIT_DIR/hooks/pre-commit"
:

53
github/hooks/pre-push.sample Executable file

@ -0,0 +1,53 @@
#!/bin/sh
# An example hook script to verify what is about to be pushed. Called by "git
# push" after it has checked the remote status, but before anything has been
# pushed. If this script exits with a non-zero status nothing will be pushed.
#
# This hook is called with the following parameters:
#
# $1 -- Name of the remote to which the push is being done
# $2 -- URL to which the push is being done
#
# If pushing without using a named remote those arguments will be equal.
#
# Information about the commits which are being pushed is supplied as lines to
# the standard input in the form:
#
# <local ref> <local oid> <remote ref> <remote oid>
#
# This sample shows how to prevent push of commits where the log message starts
# with "WIP" (work in progress).
remote="$1"
url="$2"
zero=$(git hash-object --stdin </dev/null | tr '[0-9a-f]' '0')
while read local_ref local_oid remote_ref remote_oid
do
if test "$local_oid" = "$zero"
then
# Handle delete
:
else
if test "$remote_oid" = "$zero"
then
# New branch, examine all commits
range="$local_oid"
else
# Update to existing branch, examine new commits
range="$remote_oid..$local_oid"
fi
# Check for WIP commit
commit=$(git rev-list -n 1 --grep '^WIP' "$range")
if test -n "$commit"
then
echo >&2 "Found WIP commit in $local_ref, not pushing"
exit 1
fi
fi
done
exit 0

169
github/hooks/pre-rebase.sample Executable file

@ -0,0 +1,169 @@
#!/bin/sh
#
# Copyright (c) 2006, 2008 Junio C Hamano
#
# The "pre-rebase" hook is run just before "git rebase" starts doing
# its job, and can prevent the command from running by exiting with
# non-zero status.
#
# The hook is called with the following parameters:
#
# $1 -- the upstream the series was forked from.
# $2 -- the branch being rebased (or empty when rebasing the current branch).
#
# This sample shows how to prevent topic branches that are already
# merged to 'next' branch from getting rebased, because allowing it
# would result in rebasing already published history.
publish=next
basebranch="$1"
if test "$#" = 2
then
topic="refs/heads/$2"
else
topic=`git symbolic-ref HEAD` ||
exit 0 ;# we do not interrupt rebasing detached HEAD
fi
case "$topic" in
refs/heads/??/*)
;;
*)
exit 0 ;# we do not interrupt others.
;;
esac
# Now we are dealing with a topic branch being rebased
# on top of master. Is it OK to rebase it?
# Does the topic really exist?
git show-ref -q "$topic" || {
echo >&2 "No such branch $topic"
exit 1
}
# Is topic fully merged to master?
not_in_master=`git rev-list --pretty=oneline ^master "$topic"`
if test -z "$not_in_master"
then
echo >&2 "$topic is fully merged to master; better remove it."
exit 1 ;# we could allow it, but there is no point.
fi
# Is topic ever merged to next? If so you should not be rebasing it.
only_next_1=`git rev-list ^master "^$topic" ${publish} | sort`
only_next_2=`git rev-list ^master ${publish} | sort`
if test "$only_next_1" = "$only_next_2"
then
not_in_topic=`git rev-list "^$topic" master`
if test -z "$not_in_topic"
then
echo >&2 "$topic is already up to date with master"
exit 1 ;# we could allow it, but there is no point.
else
exit 0
fi
else
not_in_next=`git rev-list --pretty=oneline ^${publish} "$topic"`
/usr/bin/perl -e '
my $topic = $ARGV[0];
my $msg = "* $topic has commits already merged to public branch:\n";
my (%not_in_next) = map {
/^([0-9a-f]+) /;
($1 => 1);
} split(/\n/, $ARGV[1]);
for my $elem (map {
/^([0-9a-f]+) (.*)$/;
[$1 => $2];
} split(/\n/, $ARGV[2])) {
if (!exists $not_in_next{$elem->[0]}) {
if ($msg) {
print STDERR $msg;
undef $msg;
}
print STDERR " $elem->[1]\n";
}
}
' "$topic" "$not_in_next" "$not_in_master"
exit 1
fi
<<\DOC_END
This sample hook safeguards topic branches that have been
published from being rewound.
The workflow assumed here is:
* Once a topic branch forks from "master", "master" is never
merged into it again (either directly or indirectly).
* Once a topic branch is fully cooked and merged into "master",
it is deleted. If you need to build on top of it to correct
earlier mistakes, a new topic branch is created by forking at
the tip of the "master". This is not strictly necessary, but
it makes it easier to keep your history simple.
* Whenever you need to test or publish your changes to topic
branches, merge them into "next" branch.
The script, being an example, hardcodes the publish branch name
to be "next", but it is trivial to make it configurable via
$GIT_DIR/config mechanism.
With this workflow, you would want to know:
(1) ... if a topic branch has ever been merged to "next". Young
topic branches can have stupid mistakes you would rather
clean up before publishing, and things that have not been
merged into other branches can be easily rebased without
affecting other people. But once it is published, you would
not want to rewind it.
(2) ... if a topic branch has been fully merged to "master".
Then you can delete it. More importantly, you should not
build on top of it -- other people may already want to
change things related to the topic as patches against your
"master", so if you need further changes, it is better to
fork the topic (perhaps with the same name) afresh from the
tip of "master".
Let's look at this example:
o---o---o---o---o---o---o---o---o---o "next"
/ / / /
/ a---a---b A / /
/ / / /
/ / c---c---c---c B /
/ / / \ /
/ / / b---b C \ /
/ / / / \ /
---o---o---o---o---o---o---o---o---o---o---o "master"
A, B and C are topic branches.
* A has one fix since it was merged up to "next".
* B has finished. It has been fully merged up to "master" and "next",
and is ready to be deleted.
* C has not merged to "next" at all.
We would want to allow C to be rebased, refuse A, and encourage
B to be deleted.
To compute (1):
git rev-list ^master ^topic next
git rev-list ^master next
if these match, topic has not merged in next at all.
To compute (2):
git rev-list master..topic
if this is empty, it is fully merged to "master".
DOC_END

24
github/hooks/pre-receive.sample Executable file

@ -0,0 +1,24 @@
#!/bin/sh
#
# An example hook script to make use of push options.
# The example simply echoes all push options that start with 'echoback='
# and rejects all pushes when the "reject" push option is used.
#
# To enable this hook, rename this file to "pre-receive".
if test -n "$GIT_PUSH_OPTION_COUNT"
then
i=0
while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"
do
eval "value=\$GIT_PUSH_OPTION_$i"
case "$value" in
echoback=*)
echo "echo from the pre-receive-hook: ${value#*=}" >&2
;;
reject)
exit 1
esac
i=$((i + 1))
done
fi

42
github/hooks/prepare-commit-msg.sample Executable file

@ -0,0 +1,42 @@
#!/bin/sh
#
# An example hook script to prepare the commit log message.
# Called by "git commit" with the name of the file that has the
# commit message, followed by the description of the commit
# message's source. The hook's purpose is to edit the commit
# message file. If the hook fails with a non-zero status,
# the commit is aborted.
#
# To enable this hook, rename this file to "prepare-commit-msg".
# This hook includes three examples. The first one removes the
# "# Please enter the commit message..." help message.
#
# The second includes the output of "git diff --name-status -r"
# into the message, just before the "git status" output. It is
# commented because it doesn't cope with --amend or with squashed
# commits.
#
# The third example adds a Signed-off-by line to the message, that can
# still be edited. This is rarely a good idea.
COMMIT_MSG_FILE=$1
COMMIT_SOURCE=$2
SHA1=$3
/usr/bin/perl -i.bak -ne 'print unless(m/^. Please enter the commit message/..m/^#$/)' "$COMMIT_MSG_FILE"
# case "$COMMIT_SOURCE,$SHA1" in
# ,|template,)
# /usr/bin/perl -i.bak -pe '
# print "\n" . `git diff --cached --name-status -r`
# if /^#/ && $first++ == 0' "$COMMIT_MSG_FILE" ;;
# *) ;;
# esac
# SOB=$(git var GIT_COMMITTER_IDENT | sed -n 's/^\(.*>\).*$/Signed-off-by: \1/p')
# git interpret-trailers --in-place --trailer "$SOB" "$COMMIT_MSG_FILE"
# if test -z "$COMMIT_SOURCE"
# then
# /usr/bin/perl -i.bak -pe 'print "\n" if !$first_line++' "$COMMIT_MSG_FILE"
# fi

78
github/hooks/push-to-checkout.sample Executable file

@ -0,0 +1,78 @@
#!/bin/sh
# An example hook script to update a checked-out tree on a git push.
#
# This hook is invoked by git-receive-pack(1) when it reacts to git
# push and updates reference(s) in its repository, and when the push
# tries to update the branch that is currently checked out and the
# receive.denyCurrentBranch configuration variable is set to
# updateInstead.
#
# By default, such a push is refused if the working tree and the index
# of the remote repository has any difference from the currently
# checked out commit; when both the working tree and the index match
# the current commit, they are updated to match the newly pushed tip
# of the branch. This hook is to be used to override the default
# behaviour; however the code below reimplements the default behaviour
# as a starting point for convenient modification.
#
# The hook receives the commit with which the tip of the current
# branch is going to be updated:
commit=$1
# It can exit with a non-zero status to refuse the push (when it does
# so, it must not modify the index or the working tree).
die () {
echo >&2 "$*"
exit 1
}
# Or it can make any necessary changes to the working tree and to the
# index to bring them to the desired state when the tip of the current
# branch is updated to the new commit, and exit with a zero status.
#
# For example, the hook can simply run git read-tree -u -m HEAD "$1"
# in order to emulate git fetch that is run in the reverse direction
# with git push, as the two-tree form of git read-tree -u -m is
# essentially the same as git switch or git checkout that switches
# branches while keeping the local changes in the working tree that do
# not interfere with the difference between the branches.
# The below is a more-or-less exact translation to shell of the C code
# for the default behaviour for git's push-to-checkout hook defined in
# the push_to_deploy() function in builtin/receive-pack.c.
#
# Note that the hook will be executed from the repository directory,
# not from the working tree, so if you want to perform operations on
# the working tree, you will have to adapt your code accordingly, e.g.
# by adding "cd .." or using relative paths.
if ! git update-index -q --ignore-submodules --refresh
then
die "Up-to-date check failed"
fi
if ! git diff-files --quiet --ignore-submodules --
then
die "Working directory has unstaged changes"
fi
# This is a rough translation of:
#
# head_has_history() ? "HEAD" : EMPTY_TREE_SHA1_HEX
if git cat-file -e HEAD 2>/dev/null
then
head=HEAD
else
head=$(git hash-object -t tree --stdin </dev/null)
fi
if ! git diff-index --quiet --cached --ignore-submodules $head --
then
die "Working directory has staged changes"
fi
if ! git read-tree -u -m "$commit"
then
die "Could not update working tree to new HEAD"
fi

128
github/hooks/update.sample Executable file

@ -0,0 +1,128 @@
#!/bin/sh
#
# An example hook script to block unannotated tags from entering.
# Called by "git receive-pack" with arguments: refname sha1-old sha1-new
#
# To enable this hook, rename this file to "update".
#
# Config
# ------
# hooks.allowunannotated
# This boolean sets whether unannotated tags will be allowed into the
# repository. By default they won't be.
# hooks.allowdeletetag
# This boolean sets whether deleting tags will be allowed in the
# repository. By default they won't be.
# hooks.allowmodifytag
# This boolean sets whether a tag may be modified after creation. By default
# it won't be.
# hooks.allowdeletebranch
# This boolean sets whether deleting branches will be allowed in the
# repository. By default they won't be.
# hooks.denycreatebranch
# This boolean sets whether remotely creating branches will be denied
# in the repository. By default this is allowed.
#
# --- Command line
refname="$1"
oldrev="$2"
newrev="$3"
# --- Safety check
if [ -z "$GIT_DIR" ]; then
echo "Don't run this script from the command line." >&2
echo " (if you want, you could supply GIT_DIR then run" >&2
echo " $0 <ref> <oldrev> <newrev>)" >&2
exit 1
fi
if [ -z "$refname" -o -z "$oldrev" -o -z "$newrev" ]; then
echo "usage: $0 <ref> <oldrev> <newrev>" >&2
exit 1
fi
# --- Config
allowunannotated=$(git config --type=bool hooks.allowunannotated)
allowdeletebranch=$(git config --type=bool hooks.allowdeletebranch)
denycreatebranch=$(git config --type=bool hooks.denycreatebranch)
allowdeletetag=$(git config --type=bool hooks.allowdeletetag)
allowmodifytag=$(git config --type=bool hooks.allowmodifytag)
# check for no description
projectdesc=$(sed -e '1q' "$GIT_DIR/description")
case "$projectdesc" in
"Unnamed repository"* | "")
echo "*** Project description file hasn't been set" >&2
exit 1
;;
esac
# --- Check types
# if $newrev is 0000...0000, it's a commit to delete a ref.
zero=$(git hash-object --stdin </dev/null | tr '[0-9a-f]' '0')
if [ "$newrev" = "$zero" ]; then
newrev_type=delete
else
newrev_type=$(git cat-file -t $newrev)
fi
case "$refname","$newrev_type" in
refs/tags/*,commit)
# un-annotated tag
short_refname=${refname##refs/tags/}
if [ "$allowunannotated" != "true" ]; then
echo "*** The un-annotated tag, $short_refname, is not allowed in this repository" >&2
echo "*** Use 'git tag [ -a | -s ]' for tags you want to propagate." >&2
exit 1
fi
;;
refs/tags/*,delete)
# delete tag
if [ "$allowdeletetag" != "true" ]; then
echo "*** Deleting a tag is not allowed in this repository" >&2
exit 1
fi
;;
refs/tags/*,tag)
# annotated tag
if [ "$allowmodifytag" != "true" ] && git rev-parse $refname > /dev/null 2>&1
then
echo "*** Tag '$refname' already exists." >&2
echo "*** Modifying a tag is not allowed in this repository." >&2
exit 1
fi
;;
refs/heads/*,commit)
# branch
if [ "$oldrev" = "$zero" -a "$denycreatebranch" = "true" ]; then
echo "*** Creating a branch is not allowed in this repository" >&2
exit 1
fi
;;
refs/heads/*,delete)
# delete branch
if [ "$allowdeletebranch" != "true" ]; then
echo "*** Deleting a branch is not allowed in this repository" >&2
exit 1
fi
;;
refs/remotes/*,commit)
# tracking branch
;;
refs/remotes/*,delete)
# delete tracking branch
if [ "$allowdeletebranch" != "true" ]; then
echo "*** Deleting a tracking branch is not allowed in this repository" >&2
exit 1
fi
;;
*)
# Anything else (is there anything else?)
echo "*** Update hook: unknown type of update to ref $refname of type $newrev_type" >&2
exit 1
;;
esac
# --- Finished
exit 0

BIN
github/index Normal file

Binary file not shown.

6
github/info/exclude Normal file

@ -0,0 +1,6 @@
# git ls-files --others --exclude-from=.git/info/exclude
# Lines that start with '#' are comments.
# For a project mostly in C, the following would be a good set of
# exclude patterns (uncomment them if you want to use them):
# *.[oa]
# *~

1
github/logs/HEAD Normal file

@ -0,0 +1 @@
0000000000000000000000000000000000000000 b242b3132e5152a21ad6bd6681953648ed013924 j <j@j.com> 1729677558 +0800 clone: from https://github.com/sqlparser/python_data_lineage.git

1
github/logs/refs/heads/main Normal file

@ -0,0 +1 @@
0000000000000000000000000000000000000000 b242b3132e5152a21ad6bd6681953648ed013924 j <j@j.com> 1729677558 +0800 clone: from https://github.com/sqlparser/python_data_lineage.git

1
github/logs/refs/remotes/origin/HEAD Normal file

@ -0,0 +1 @@
0000000000000000000000000000000000000000 b242b3132e5152a21ad6bd6681953648ed013924 j <j@j.com> 1729677558 +0800 clone: from https://github.com/sqlparser/python_data_lineage.git

2
github/packed-refs Normal file

@ -0,0 +1,2 @@
# pack-refs with: peeled fully-peeled sorted
b242b3132e5152a21ad6bd6681953648ed013924 refs/remotes/origin/main

1
github/refs/heads/main Normal file

@ -0,0 +1 @@
b242b3132e5152a21ad6bd6681953648ed013924

View File

@ -0,0 +1 @@
ref: refs/remotes/origin/main

Binary file not shown.

2
samples/aws_athena.sql Normal file
View File

@ -0,0 +1,2 @@
INSERT INTO cities_usa (city,state)
SELECT city,state FROM cities_world WHERE country='usa';

18
samples/azure.sql Normal file
View File

@ -0,0 +1,18 @@
-- azure sql sample SQL
CREATE VIEW [SalesLT].[vProductAndDescription]
WITH SCHEMABINDING
AS
-- View (indexed or standard) to display products and product descriptions by language.
SELECT
p.[ProductID]
,p.[Name]
,pm.[Name] AS [ProductModel]
,pmx.[Culture]
,pd.[Description]
FROM [SalesLT].[Product] p
INNER JOIN [SalesLT].[ProductModel] pm
ON p.[ProductModelID] = pm.[ProductModelID]
INNER JOIN [SalesLT].[ProductModelProductDescription] pmx
ON pm.[ProductModelID] = pmx.[ProductModelID]
INNER JOIN [SalesLT].[ProductDescription] pd
ON pmx.[ProductDescriptionID] = pd.[ProductDescriptionID];

11
samples/bigquery.sql Normal file
View File

@ -0,0 +1,11 @@
-- bigquery sample SQL
MERGE dataset.DetailedInventory T
USING dataset.Inventory S
ON T.product = S.product
WHEN NOT MATCHED AND s.quantity < 20 THEN
INSERT(product, quantity, supply_constrained, comments)
VALUES(product, quantity, true, ARRAY<STRUCT<created DATE, comment STRING>>[(DATE('2016-01-01'), 'comment1')])
WHEN NOT MATCHED THEN
INSERT(product, quantity, supply_constrained)
VALUES(product, quantity, false)
;

9
samples/couchbase.sql Normal file
View File

@ -0,0 +1,9 @@
MERGE INTO all_empts a USING emps_deptb b ON KEY b.empId
WHEN MATCHED THEN
UPDATE SET a.depts = a.depts + 1,
a.title = b.title || ", " || b.title
WHEN NOT MATCHED THEN
INSERT { "name": b.name, "title": b.title, "depts": b.depts, "empId": b.empId, "dob": b.dob }
;

7
samples/databricks.sql Normal file
View File

@ -0,0 +1,7 @@
-- databricks
CREATE OR REPLACE VIEW experienced_employee
(id COMMENT 'Unique identification number', Name)
COMMENT 'View for experienced employees'
AS SELECT id, name
FROM all_employee
WHERE working_years > 5;

12
samples/db2.sql Normal file
View File

@ -0,0 +1,12 @@
CREATE VIEW DSN8B10.FIRSTQTR (SNO, CHARGES, DATE) AS
SELECT SNO, CHARGES, DATE
FROM MONTH1
WHERE DATE BETWEEN '01/01/2000' and '01/31/2000'
UNION All
SELECT SNO, CHARGES, DATE
FROM MONTH2
WHERE DATE BETWEEN '02/01/2000' and '02/29/2000'
UNION All
SELECT SNO, CHARGES, DATE
FROM MONTH3
WHERE DATE BETWEEN '03/01/2000' and '03/31/2000';

14
samples/greenplum.sql Normal file
View File

@ -0,0 +1,14 @@
create view v1 as
WITH RECURSIVE search_graph(id, link, data, depth, path, cycle) AS (
SELECT g.id, g.link, g.data, 1,
ARRAY[g.id],
false
FROM graph g
UNION ALL
SELECT g.id, g.link, g.data, sg.depth + 1,
path || g.id,
g.id = ANY(path)
FROM graph g, search_graph sg
WHERE g.id = sg.link AND NOT cycle
)
SELECT * FROM search_graph;

16
samples/guassdb.sql Normal file
View File

@ -0,0 +1,16 @@
CREATE OR REPLACE PROCEDURE dept_proc()
AS
declare
DEPT_NAME VARCHAR(100);
DEPT_LOC INTEGER;
CURSOR C1 IS
SELECT name, place_id FROM hr.section WHERE place_id <= 50;
BEGIN
OPEN C1;
LOOP
FETCH C1 INTO DEPT_NAME, DEPT_LOC;
EXIT WHEN C1%NOTFOUND;
DBMS_OUTPUT.PUT_LINE(DEPT_NAME||'---'||DEPT_LOC);
END LOOP;
CLOSE C1;
END;

20
samples/hive.sql Normal file
View File

@ -0,0 +1,20 @@
CREATE TABLE merge_data.transactions(
ID int,
TranValue string,
last_update_user string)
PARTITIONED BY (tran_date string)
CLUSTERED BY (ID) into 5 buckets
STORED AS ORC TBLPROPERTIES ('transactional'='true');
CREATE TABLE merge_data.merge_source(
ID int,
TranValue string,
tran_date string)
STORED AS ORC;
MERGE INTO merge_data.transactions AS T
USING merge_data.merge_source AS S
ON T.ID = S.ID and T.tran_date = S.tran_date
WHEN MATCHED AND (T.TranValue != S.TranValue AND S.TranValue IS NOT NULL) THEN UPDATE SET TranValue = S.TranValue, last_update_user = 'merge_update'
WHEN MATCHED AND S.TranValue IS NULL THEN DELETE
WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.TranValue, 'merge_insert', S.tran_date)

Binary image files not shown (five images added; sizes 95 KiB, 34 KiB, 5.0 KiB, 138 KiB, and 77 KiB).

6
samples/impala.sql Normal file
View File

@ -0,0 +1,6 @@
-- Impala sample sql
insert into t2 select c1, c2 from t1;
CREATE VIEW v5 AS SELECT c1, CAST(c3 AS STRING) c3, CONCAT(c4,c5) c5, TRIM(c6) c6, "Constant" c8 FROM t1;
CREATE VIEW v7 (c1 COMMENT 'Comment for c1', c2) COMMENT 'Comment for v7' AS SELECT t1.c1, t1.c2 FROM t1;

6
samples/informix.sql Normal file
View File

@ -0,0 +1,6 @@
UPDATE nmosdb@wnmserver1:test
SET name=(SELECT name FROM test
WHERE test.id = nmosdb@wnmserver1:test.id)
WHERE EXISTS(
SELECT 1 FROM test WHERE test.id = nmosdb@wnmserver1:test.id
);

12
samples/mysql.sql Normal file
View File

@ -0,0 +1,12 @@
create view v1 as
WITH RECURSIVE employee_paths (id, name, path) AS
(
SELECT id, name, CAST(id AS CHAR(200))
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name, CONCAT(ep.path, ',', e.id)
FROM employee_paths AS ep JOIN employees AS e
ON ep.id = e.manager_id
)
SELECT * FROM employee_paths ORDER BY path;

63
samples/mysql_er.sql Normal file
View File

@ -0,0 +1,63 @@
CREATE TABLE `permissions`(
`id` BIGINT NOT NULL,
`name` VARCHAR(255) NOT NULL,
`guard_name` VARCHAR(255) NOT NULL,
`created_at` DATETIME NOT NULL,
`updated_at` DATETIME NOT NULL
);
ALTER TABLE
`permissions` ADD PRIMARY KEY `permissions_id_primary`(`id`);
CREATE TABLE `roles`(
`id` BIGINT NOT NULL,
`name` VARCHAR(255) NOT NULL,
`guard_name` VARCHAR(255) NOT NULL,
`created_at` DATETIME NOT NULL,
`updated_at` DATETIME NOT NULL
);
ALTER TABLE
`roles` ADD PRIMARY KEY `roles_id_primary`(`id`);
CREATE TABLE `model_has_permissions`(
`permission_id` BIGINT NOT NULL,
`model_type` VARCHAR(255) NOT NULL,
`model_id` INT NOT NULL
);
ALTER TABLE
`model_has_permissions` ADD PRIMARY KEY `model_has_permissions_permission_id_model_id_model_type_primary`(
`permission_id`,
`model_id`,
`model_type`
);
ALTER TABLE
`model_has_permissions` ADD INDEX `model_has_permissions_model_id_model_type_index`(`model_id`, `model_type`);
ALTER TABLE
`model_has_permissions` ADD PRIMARY KEY `model_has_permissions_permission_id_primary`(`permission_id`);
ALTER TABLE
`model_has_permissions` ADD CONSTRAINT `model_has_permissions_permission_id_foreign` FOREIGN KEY(`permission_id`) REFERENCES `permissions`(`id`);
CREATE TABLE `model_has_roles`(
`role_id` BIGINT NOT NULL,
`model_type` VARCHAR(255) NOT NULL,
`model_id` INT NOT NULL
);
ALTER TABLE
`model_has_roles` ADD PRIMARY KEY `model_has_roles_role_id_model_id_model_type_primary`(`role_id`, `model_id`, `model_type`);
ALTER TABLE
`model_has_roles` ADD INDEX `model_has_roles_model_id_model_type_index`(`model_id`, `model_type`);
ALTER TABLE
`model_has_roles` ADD PRIMARY KEY `model_has_roles_role_id_primary`(`role_id`);
ALTER TABLE
`model_has_roles` ADD CONSTRAINT `model_has_roles_role_id_foreign` FOREIGN KEY(`role_id`) REFERENCES `roles`(`id`);
CREATE TABLE `role_has_permissions`(
`permission_id` BIGINT NOT NULL,
`role_id` BIGINT NOT NULL
);
ALTER TABLE
`role_has_permissions` ADD PRIMARY KEY `role_has_permissions_permission_id_role_id_primary`(`permission_id`, `role_id`);
ALTER TABLE
`role_has_permissions` ADD PRIMARY KEY `role_has_permissions_permission_id_primary`(`permission_id`);
ALTER TABLE
`role_has_permissions` ADD CONSTRAINT `role_has_permissions_role_id_foreign` FOREIGN KEY(`role_id`) REFERENCES `roles`(`id`);
ALTER TABLE
`role_has_permissions` ADD CONSTRAINT `role_has_permissions_permission_id_foreign` FOREIGN KEY(`permission_id`) REFERENCES `permissions`(`id`);

18
samples/netezza.sql Normal file
View File

@ -0,0 +1,18 @@
create table t as
WITH manager (mgr_id, mgr_name, mgr_dept) AS
(SELECT id, name, grp
FROM emp_copy
WHERE mgr = id AND grp != 'gone'),
employee (emp_id, emp_name, emp_mgr) AS
(SELECT id, name, mgr_id
FROM emp_copy JOIN manager ON grp = mgr_dept),
mgr_cnt (mgr_id, mgr_reports) AS
(SELECT mgr, COUNT (*)
FROM emp_copy
WHERE mgr != id
GROUP BY mgr)
SELECT *
FROM employee JOIN manager ON emp_mgr = mgr_id
JOIN mgr_cnt ON emp_mgr = mgr_id
WHERE emp_id != mgr_id
ORDER BY mgr_dept;

26
samples/openedge.sql Normal file
View File

@ -0,0 +1,26 @@
-- openedge sample sql
CREATE VIEW ne_customers AS
SELECT name, address, city, state
FROM customer
WHERE state IN ( 'NH', 'MA', 'ME', 'RI', 'CT', 'VT' )
WITH CHECK OPTION ;
INSERT INTO neworders (order_no, product, qty)
SELECT order_no, product, qty
FROM orders
WHERE order_date = SYSDATE ;
UPDATE OrderLine
SET (ItemNum, Price) =
(SELECT ItemNum, Price * 3
FROM Item
WHERE ItemName = 'gloves')
WHERE OrderNum = 21 ;
Update Orderline
SET (Itemnum) =
(Select Itemnum
FROM Item
WHERE Itemname = 'Tennis balls')
WHERE Ordernum = 20;

33
samples/oracle.sql Normal file
View File

@ -0,0 +1,33 @@
CREATE VIEW vsal
AS
SELECT a.deptno "Department",
a.num_emp / b.total_count "Employees",
a.sal_sum / b.total_sal "Salary"
FROM (SELECT deptno,
               Count(*) num_emp,
SUM(sal) sal_sum
FROM scott.emp
WHERE city = 'NYC'
GROUP BY deptno) a,
       (SELECT Count(*) total_count,
SUM(sal) total_sal
FROM scott.emp
WHERE city = 'NYC') b
;
INSERT ALL
WHEN ottl < 100000 THEN
INTO small_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 100000 and ottl < 200000 THEN
INTO medium_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 200000 THEN
into large_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 290000 THEN
INTO special_orders
SELECT o.order_id oid, o.customer_id cid, o.order_total ottl,
o.sales_rep_id sid, c.credit_limit cl, c.cust_email cem
FROM orders o, customers c
WHERE o.customer_id = c.customer_id;

14
samples/oracle_insert.sql Normal file
View File

@ -0,0 +1,14 @@
INSERT INTO deptsal
(dept_no,
dept_name,
salary)
SELECT d.deptno,
d.dname,
SUM(e.sal + Nvl(e.comm, 0)) AS sal
FROM dept d
left join (SELECT *
FROM emp
WHERE hiredate > DATE '1980-01-01') e
ON e.deptno = d.deptno
GROUP BY d.deptno,
d.dname;

40
samples/oracle_plsql.sql Normal file
View File

@ -0,0 +1,40 @@
DECLARE
z_empid employees.employee_id%TYPE;
z_depid employees.department_id%TYPE;
z_firstname employees.first_name%TYPE;
z_lastname employees.last_name%TYPE;
CURSOR cur_stclerk IS
SELECT employee_id,
department_id,
first_name,
last_name
FROM employees
WHERE job_id = 'ST_CLERK';
BEGIN
OPEN cur_stclerk;
LOOP
FETCH cur_stclerk INTO z_empid,z_depid,z_firstname,
z_lastname;
EXIT WHEN cur_stclerk%NOTFOUND;
INSERT INTO emp_temp
(employee_id,
department_id,
job_id)
VALUES (z_empid,
z_depid,
'ST_CLERK');
INSERT INTO emp_detls_temp
(employee_id,
empname)
VALUES (z_empid,
z_firstname
|| ' '
|| z_lastname);
END LOOP;
CLOSE cur_stclerk;
COMMIT;
END;

11
samples/postgresql.sql Normal file
View File

@ -0,0 +1,11 @@
create view v3 as
WITH RECURSIVE employee_recursive(distance, employee_name, manager_name) AS (
SELECT 1, employee_name, manager_name
FROM employee
WHERE manager_name = 'Mary'
UNION ALL
SELECT er.distance + 1, e.employee_name, e.manager_name
FROM employee_recursive er, employee e
WHERE er.employee_name = e.manager_name
)
SELECT distance, employee_name FROM employee_recursive;

125
samples/postgresql_er.sql Normal file
View File

@ -0,0 +1,125 @@
CREATE TABLE "ticketit_statuses"(
"id" INTEGER NOT NULL,
"name" VARCHAR(255) NOT NULL,
"color" BIGINT NOT NULL
);
ALTER TABLE
"ticketit_statuses" ADD PRIMARY KEY("id");
CREATE TABLE "ticketit_priorities"(
"id" INTEGER NOT NULL,
"name" VARCHAR(255) NOT NULL,
"color" BIGINT NOT NULL
);
ALTER TABLE
"ticketit_priorities" ADD PRIMARY KEY("id");
CREATE TABLE "ticketit_categories"(
"id" INTEGER NOT NULL,
"name" VARCHAR(255) NOT NULL,
"color" BIGINT NOT NULL
);
ALTER TABLE
"ticketit_categories" ADD PRIMARY KEY("id");
CREATE TABLE "ticketit_categories_users"(
"category_id" INTEGER NOT NULL,
"user_id" INTEGER NOT NULL
);
ALTER TABLE
"ticketit_categories_users" ADD PRIMARY KEY("category_id");
CREATE TABLE "ticketit"(
"id" INTEGER NOT NULL,
"subject" VARCHAR(255) NOT NULL,
"content" TEXT NOT NULL,
"html" TEXT NULL,
"status_id" INTEGER NOT NULL,
"priority_id" INTEGER NOT NULL,
"user_id" INTEGER NOT NULL,
"agent_id" INTEGER NOT NULL,
"category_id" INTEGER NOT NULL,
"created_at" TIMESTAMP(0) WITHOUT TIME ZONE NOT NULL,
"updated_at" BIGINT NOT NULL,
"completed_at" TIMESTAMP(0) WITHOUT TIME ZONE NULL
);
ALTER TABLE
"ticketit" ADD PRIMARY KEY("id");
CREATE INDEX "ticketit_subject_index" ON
"ticketit"("subject");
CREATE INDEX "ticketit_status_id_index" ON
"ticketit"("status_id");
CREATE INDEX "ticketit_priority_id_index" ON
"ticketit"("priority_id");
CREATE INDEX "ticketit_user_id_index" ON
"ticketit"("user_id");
CREATE INDEX "ticketit_agent_id_index" ON
"ticketit"("agent_id");
CREATE INDEX "ticketit_category_id_index" ON
"ticketit"("category_id");
CREATE INDEX "ticketit_completed_at_index" ON
"ticketit"("completed_at");
CREATE TABLE "ticketit_comments"(
"id" INTEGER NOT NULL,
"content" TEXT NOT NULL,
"user_id" INTEGER NOT NULL,
"ticket_id" INTEGER NOT NULL,
"created_at" BIGINT NOT NULL,
"updated_at" TIMESTAMP(0) WITHOUT TIME ZONE NOT NULL,
"content" TEXT NOT NULL,
"html" TEXT NULL
);
ALTER TABLE
"ticketit_comments" ADD PRIMARY KEY("id");
CREATE INDEX "ticketit_comments_user_id_index" ON
"ticketit_comments"("user_id");
CREATE INDEX "ticketit_comments_ticket_id_index" ON
"ticketit_comments"("ticket_id");
CREATE TABLE "ticketit_audits"(
"id" INTEGER NOT NULL,
"operation" TEXT NOT NULL,
"user_id" INTEGER NOT NULL,
"ticket_id" INTEGER NOT NULL,
"created_at" TIMESTAMP(0) WITHOUT TIME ZONE NOT NULL,
"updated_at" TIMESTAMP(0) WITHOUT TIME ZONE NOT NULL
);
ALTER TABLE
"ticketit_audits" ADD PRIMARY KEY("id");
CREATE TABLE "users"(
"id" INTEGER NOT NULL,
"ticketit_admin" BOOLEAN NOT NULL,
"ticketit_agent" BOOLEAN NOT NULL
);
ALTER TABLE
"users" ADD PRIMARY KEY("id");
CREATE TABLE "ticketit_settings"(
"id" INTEGER NOT NULL,
"lang" VARCHAR(255) NULL,
"slug" VARCHAR(255) NOT NULL,
"value" TEXT NOT NULL,
"default" TEXT NOT NULL,
"created_at" TIMESTAMP(0) WITHOUT TIME ZONE NOT NULL,
"updated_at" TIMESTAMP(0) WITHOUT TIME ZONE NOT NULL
);
ALTER TABLE
"ticketit_settings" ADD PRIMARY KEY("id");
ALTER TABLE
"ticketit_settings" ADD CONSTRAINT "ticketit_settings_lang_unique" UNIQUE("lang");
ALTER TABLE
"ticketit_settings" ADD CONSTRAINT "ticketit_settings_slug_unique" UNIQUE("slug");
ALTER TABLE
"ticketit" ADD CONSTRAINT "ticketit_priority_id_foreign" FOREIGN KEY("priority_id") REFERENCES "ticketit_priorities"("id");
ALTER TABLE
"ticketit" ADD CONSTRAINT "ticketit_status_id_foreign" FOREIGN KEY("status_id") REFERENCES "ticketit_statuses"("id");
ALTER TABLE
"ticketit" ADD CONSTRAINT "ticketit_agent_id_foreign" FOREIGN KEY("agent_id") REFERENCES "users"("id");
ALTER TABLE
"ticketit" ADD CONSTRAINT "ticketit_category_id_foreign" FOREIGN KEY("category_id") REFERENCES "ticketit_categories"("id");
ALTER TABLE
"ticketit_comments" ADD CONSTRAINT "ticketit_comments_ticket_id_foreign" FOREIGN KEY("ticket_id") REFERENCES "ticketit"("id");
ALTER TABLE
"ticketit_audits" ADD CONSTRAINT "ticketit_audits_ticket_id_foreign" FOREIGN KEY("ticket_id") REFERENCES "ticketit"("id");
ALTER TABLE
"ticketit" ADD CONSTRAINT "ticketit_user_id_foreign" FOREIGN KEY("user_id") REFERENCES "users"("id");
ALTER TABLE
"ticketit_categories_users" ADD CONSTRAINT "ticketit_categories_users_user_id_foreign" FOREIGN KEY("user_id") REFERENCES "users"("id");
ALTER TABLE
"ticketit_comments" ADD CONSTRAINT "ticketit_comments_user_id_foreign" FOREIGN KEY("user_id") REFERENCES "users"("id");
ALTER TABLE
"ticketit_audits" ADD CONSTRAINT "ticketit_audits_user_id_foreign" FOREIGN KEY("user_id") REFERENCES "users"("id");

View File

@ -0,0 +1,466 @@
--
-- PostgreSQL database dump
--
SET statement_timeout = 0;
SET lock_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET default_tablespace = '';
SET default_with_oids = false;
---
--- drop tables
---
DROP TABLE IF EXISTS customer_customer_demo;
DROP TABLE IF EXISTS customer_demographics;
DROP TABLE IF EXISTS employee_territories;
DROP TABLE IF EXISTS order_details;
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS customers;
DROP TABLE IF EXISTS products;
DROP TABLE IF EXISTS shippers;
DROP TABLE IF EXISTS suppliers;
DROP TABLE IF EXISTS territories;
DROP TABLE IF EXISTS us_states;
DROP TABLE IF EXISTS categories;
DROP TABLE IF EXISTS region;
DROP TABLE IF EXISTS employees;
--
-- Name: categories; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE categories (
category_id smallint NOT NULL,
category_name character varying(15) NOT NULL,
description text,
picture bytea
);
--
-- Name: customer_customer_demo; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE customer_customer_demo (
customer_id bpchar NOT NULL,
customer_type_id bpchar NOT NULL
);
--
-- Name: customer_demographics; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE customer_demographics (
customer_type_id bpchar NOT NULL,
customer_desc text
);
--
-- Name: customers; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE customers (
customer_id bpchar NOT NULL,
company_name character varying(40) NOT NULL,
contact_name character varying(30),
contact_title character varying(30),
address character varying(60),
city character varying(15),
region character varying(15),
postal_code character varying(10),
country character varying(15),
phone character varying(24),
fax character varying(24)
);
--
-- Name: employees; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE employees (
employee_id smallint NOT NULL,
last_name character varying(20) NOT NULL,
first_name character varying(10) NOT NULL,
title character varying(30),
title_of_courtesy character varying(25),
birth_date date,
hire_date date,
address character varying(60),
city character varying(15),
region character varying(15),
postal_code character varying(10),
country character varying(15),
home_phone character varying(24),
extension character varying(4),
photo bytea,
notes text,
reports_to smallint,
photo_path character varying(255)
);
--
-- Name: employee_territories; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE employee_territories (
employee_id smallint NOT NULL,
territory_id character varying(20) NOT NULL
);
--
-- Name: order_details; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE order_details (
order_id smallint NOT NULL,
product_id smallint NOT NULL,
unit_price real NOT NULL,
quantity smallint NOT NULL,
discount real NOT NULL
);
--
-- Name: orders; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE orders (
order_id smallint NOT NULL,
customer_id bpchar,
employee_id smallint,
order_date date,
required_date date,
shipped_date date,
ship_via smallint,
freight real,
ship_name character varying(40),
ship_address character varying(60),
ship_city character varying(15),
ship_region character varying(15),
ship_postal_code character varying(10),
ship_country character varying(15)
);
--
-- Name: products; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE products (
product_id smallint NOT NULL,
product_name character varying(40) NOT NULL,
supplier_id smallint,
category_id smallint,
quantity_per_unit character varying(20),
unit_price real,
units_in_stock smallint,
units_on_order smallint,
reorder_level smallint,
discontinued integer NOT NULL
);
--
-- Name: region; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE region (
region_id smallint NOT NULL,
region_description bpchar NOT NULL
);
--
-- Name: shippers; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE shippers (
shipper_id smallint NOT NULL,
company_name character varying(40) NOT NULL,
phone character varying(24)
);
--
-- Name: suppliers; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE suppliers (
supplier_id smallint NOT NULL,
company_name character varying(40) NOT NULL,
contact_name character varying(30),
contact_title character varying(30),
address character varying(60),
city character varying(15),
region character varying(15),
postal_code character varying(10),
country character varying(15),
phone character varying(24),
fax character varying(24),
homepage text
);
--
-- Name: territories; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE territories (
territory_id character varying(20) NOT NULL,
territory_description bpchar NOT NULL,
region_id smallint NOT NULL
);
--
-- Name: us_states; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
CREATE TABLE us_states (
state_id smallint NOT NULL,
state_name character varying(100),
state_abbr character varying(2),
state_region character varying(50)
);
--
-- Name: pk_categories; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY categories
ADD CONSTRAINT pk_categories PRIMARY KEY (category_id);
--
-- Name: pk_customer_customer_demo; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY customer_customer_demo
ADD CONSTRAINT pk_customer_customer_demo PRIMARY KEY (customer_id, customer_type_id);
--
-- Name: pk_customer_demographics; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY customer_demographics
ADD CONSTRAINT pk_customer_demographics PRIMARY KEY (customer_type_id);
--
-- Name: pk_customers; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY customers
ADD CONSTRAINT pk_customers PRIMARY KEY (customer_id);
--
-- Name: pk_employees; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY employees
ADD CONSTRAINT pk_employees PRIMARY KEY (employee_id);
--
-- Name: pk_employee_territories; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY employee_territories
ADD CONSTRAINT pk_employee_territories PRIMARY KEY (employee_id, territory_id);
--
-- Name: pk_order_details; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY order_details
ADD CONSTRAINT pk_order_details PRIMARY KEY (order_id, product_id);
--
-- Name: pk_orders; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY orders
ADD CONSTRAINT pk_orders PRIMARY KEY (order_id);
--
-- Name: pk_products; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY products
ADD CONSTRAINT pk_products PRIMARY KEY (product_id);
--
-- Name: pk_region; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY region
ADD CONSTRAINT pk_region PRIMARY KEY (region_id);
--
-- Name: pk_shippers; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY shippers
ADD CONSTRAINT pk_shippers PRIMARY KEY (shipper_id);
--
-- Name: pk_suppliers; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY suppliers
ADD CONSTRAINT pk_suppliers PRIMARY KEY (supplier_id);
--
-- Name: pk_territories; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY territories
ADD CONSTRAINT pk_territories PRIMARY KEY (territory_id);
--
-- Name: pk_usstates; Type: CONSTRAINT; Schema: public; Owner: -; Tablespace:
--
ALTER TABLE ONLY us_states
ADD CONSTRAINT pk_usstates PRIMARY KEY (state_id);
--
-- Name: fk_orders_customers; Type: Constraint; Schema: -; Owner: -
--
ALTER TABLE ONLY orders
ADD CONSTRAINT fk_orders_customers FOREIGN KEY (customer_id) REFERENCES customers;
--
-- Name: fk_orders_employees; Type: Constraint; Schema: -; Owner: -
--
ALTER TABLE ONLY orders
ADD CONSTRAINT fk_orders_employees FOREIGN KEY (employee_id) REFERENCES employees;
--
-- Name: fk_orders_shippers; Type: Constraint; Schema: -; Owner: -
--
ALTER TABLE ONLY orders
ADD CONSTRAINT fk_orders_shippers FOREIGN KEY (ship_via) REFERENCES shippers;
--
-- Name: fk_order_details_products; Type: Constraint; Schema: -; Owner: -
--
ALTER TABLE ONLY order_details
ADD CONSTRAINT fk_order_details_products FOREIGN KEY (product_id) REFERENCES products;
--
-- Name: fk_order_details_orders; Type: Constraint; Schema: -; Owner: -
--
ALTER TABLE ONLY order_details
ADD CONSTRAINT fk_order_details_orders FOREIGN KEY (order_id) REFERENCES orders;
--
-- Name: fk_products_categories; Type: Constraint; Schema: -; Owner: -
--
ALTER TABLE ONLY products
ADD CONSTRAINT fk_products_categories FOREIGN KEY (category_id) REFERENCES categories;
--
-- Name: fk_products_suppliers; Type: Constraint; Schema: -; Owner: -
--
ALTER TABLE ONLY products
ADD CONSTRAINT fk_products_suppliers FOREIGN KEY (supplier_id) REFERENCES suppliers;
--
-- Name: fk_territories_region; Type: Constraint; Schema: -; Owner: -
--
ALTER TABLE ONLY territories
ADD CONSTRAINT fk_territories_region FOREIGN KEY (region_id) REFERENCES region;
--
-- Name: fk_employee_territories_territories; Type: Constraint; Schema: -; Owner: -
--
ALTER TABLE ONLY employee_territories
ADD CONSTRAINT fk_employee_territories_territories FOREIGN KEY (territory_id) REFERENCES territories;
--
-- Name: fk_employee_territories_employees; Type: Constraint; Schema: -; Owner: -
--
ALTER TABLE ONLY employee_territories
ADD CONSTRAINT fk_employee_territories_employees FOREIGN KEY (employee_id) REFERENCES employees;
--
-- Name: fk_customer_customer_demo_customer_demographics; Type: Constraint; Schema: -; Owner: -
--
ALTER TABLE ONLY customer_customer_demo
ADD CONSTRAINT fk_customer_customer_demo_customer_demographics FOREIGN KEY (customer_type_id) REFERENCES customer_demographics;
--
-- Name: fk_customer_customer_demo_customers; Type: Constraint; Schema: -; Owner: -
--
ALTER TABLE ONLY customer_customer_demo
ADD CONSTRAINT fk_customer_customer_demo_customers FOREIGN KEY (customer_id) REFERENCES customers;
--
-- Name: fk_employees_employees; Type: Constraint; Schema: -; Owner: -
--
ALTER TABLE ONLY employees
ADD CONSTRAINT fk_employees_employees FOREIGN KEY (reports_to) REFERENCES employees;
--
-- PostgreSQL database dump complete
--

View File

@ -0,0 +1,10 @@
CREATE OR REPLACE FUNCTION t.mergemodel(_modelid integer)
RETURNS void
LANGUAGE plpgsql
AS $function$
BEGIN
EXECUTE format ('INSERT INTO InSelections
    SELECT * FROM AddInSelections_%s', _modelid);
END;
$function$

4
samples/presto.sql Normal file
View File

@ -0,0 +1,4 @@
CREATE TABLE orders_column_aliased (order_date, total_price)
AS
SELECT orderdate, totalprice
FROM orders;

31
samples/redshift.sql Normal file
View File

@ -0,0 +1,31 @@
-- redshift sample sql
Create table sales(
dateid int,
venuestate char(80),
venuecity char(40),
venuename char(100),
catname char(50),
Qtr int,
qtysold int,
pricepaid int,
Year date
);
insert into t2
SELECT qtr,
Sum(pricepaid) AS qtrsales,
(SELECT Sum(pricepaid)
FROM sales
JOIN date
ON sales.dateid = date.dateid
WHERE qtr = '1'
AND year = 2008) AS q1sales
FROM sales
JOIN date
ON sales.dateid = date.dateid
WHERE qtr IN( '2', '3' )
AND year = 2008
GROUP BY qtr
ORDER BY qtr;

5
samples/sap_hana.sql Normal file
View File

@ -0,0 +1,5 @@
create table "my_schema".t1(a int, b int);
MERGE INTO "my_schema".t1 USING "my_schema".t2 ON "my_schema".t1.a = "my_schema".t2.a
WHEN MATCHED THEN UPDATE SET "my_schema".t1.b = "my_schema".t2.b
WHEN NOT MATCHED THEN INSERT VALUES("my_schema".t2.a, "my_schema".t2.b);

6
samples/snowflake.sql Normal file
View File

@ -0,0 +1,6 @@
insert overwrite all
into t1
into t1 (c1, c2, c3) values (n2, n1, default)
into t2 (c1, c2, c3)
into t2 values (n3, n2, n1)
select n1, n2, n3 from src;

View File

@ -0,0 +1,65 @@
create or replace view CH_LATEST_JIRA_ISSUE(
JIRA_ISSUE_ID,
KEY,
PARENT_ID,
RESOLUTION_ID,
LAST_VIEWED,
_ORIGINAL_ESTIMATE,
ASSIGNEE_ID,
ISSUE_TYPE_ID,
ENVIRONMENT,
DUE_DATE,
REMAINING_ESTIMATE,
STATUS_ID,
_REMAINING_ESTIMATE,
CREATOR_ID,
TIME_SPENT,
_TIME_SPENT,
WORK_RATIO,
REPORTER_ID,
PROJECT_ID,
RESOLVED,
UPDATED_AT,
ORIGINAL_ESTIMATE,
ISSUE_DESCRIPTION,
ISSUE_SUMMARY,
STATUS_CATEGORY_CHANGED,
PRIORITY_ID,
ISSUE_CREATED_AT,
IS_DELETED,
SYNCED_AT,
FIRST_IN_AMT,
FIRST_OUT_AMT
) as (
WITH tran_in_base1 AS (
SELECT COMPANY_ID, min(CREATED_AT) CREATED_AT_MIN FROM tide.pres_core.cleared_transactions WHERE AMOUNT>0 GROUP BY COMPANY_ID
),
tran_out_base1 AS (
SELECT COMPANY_ID, min(CREATED_AT) CREATED_AT_MIN FROM tide.pres_core.cleared_transactions WHERE AMOUNT<0 GROUP BY COMPANY_ID
),
tran_in_base2 AS (
SELECT a.COMPANY_ID,MAX(a.AMOUNT) AS FIRST_IN_AMT FROM tide.pres_core.cleared_transactions a
INNER JOIN tran_in_base1 b
on a.COMPANY_ID=b.COMPANY_ID and a.CREATED_AT=b.CREATED_AT_MIN GROUP BY a.COMPANY_ID
),
tran_out_base2 AS (
SELECT a.COMPANY_ID,MAX(a.AMOUNT) AS FIRST_OUT_AMT FROM tide.pres_core.cleared_transactions a
INNER JOIN tran_out_base1 b
on a.COMPANY_ID=b.COMPANY_ID and a.CREATED_AT=b.CREATED_AT_MIN GROUP BY a.COMPANY_ID
),
jira_issues_tab AS (
SELECT *
FROM tide.intg_jira.latest_jira_issues
)
SELECT a.*, b.FIRST_IN_AMT, c.FIRST_OUT_AMT
FROM jira_issues_tab a
LEFT JOIN tran_in_base2 b ON cast(REGEXP_SUBSTR(a.issue_summary,'[0-9]+') AS bigint)=b.COMPANY_ID
LEFT JOIN tran_out_base2 c ON cast(REGEXP_SUBSTR(a.issue_summary,'[0-9]+') AS bigint)=c.COMPANY_ID
);

9
samples/sparksql.sql Normal file
View File

@ -0,0 +1,9 @@
-- sparksql sample sql
CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING);
SELECT * FROM person
PIVOT (
SUM(age) AS a, AVG(class) AS c
FOR name IN ('John' AS john, 'Mike' AS mike)
);

37
samples/sqlserver.sql Normal file
View File

@ -0,0 +1,37 @@
-- sql server sample sql
CREATE TABLE dbo.EmployeeSales
( DataSource varchar(20) NOT NULL,
BusinessEntityID varchar(11) NOT NULL,
LastName varchar(40) NOT NULL,
SalesDollars money NOT NULL
);
GO
CREATE PROCEDURE dbo.uspGetEmployeeSales
AS
SET NOCOUNT ON;
SELECT 'PROCEDURE', sp.BusinessEntityID, c.LastName,
sp.SalesYTD
FROM Sales.SalesPerson AS sp
INNER JOIN Person.Person AS c
ON sp.BusinessEntityID = c.BusinessEntityID
WHERE sp.BusinessEntityID LIKE '2%'
ORDER BY sp.BusinessEntityID, c.LastName;
GO
--INSERT...SELECT example
INSERT INTO dbo.EmployeeSales
SELECT 'SELECT', sp.BusinessEntityID, c.LastName, sp.SalesYTD
FROM Sales.SalesPerson AS sp
INNER JOIN Person.Person AS c
ON sp.BusinessEntityID = c.BusinessEntityID
WHERE sp.BusinessEntityID LIKE '2%'
ORDER BY sp.BusinessEntityID, c.LastName;
GO
CREATE VIEW hiredate_view
AS
SELECT p.FirstName, p.LastName, e.BusinessEntityID, e.HireDate
FROM HumanResources.Employee e
JOIN Person.Person AS p ON e.BusinessEntityID = p.BusinessEntityID ;
GO

83
samples/sqlserver_er.sql Normal file
View File

@ -0,0 +1,83 @@
CREATE TABLE "users"(
"id" INT NOT NULL,
"name" NVARCHAR(255) NOT NULL,
"email" NVARCHAR(255) NOT NULL,
"email_verified_at" DATETIME NULL,
"password" NVARCHAR(255) NOT NULL,
"remember_token" NVARCHAR(255) NULL,
"created_at" DATETIME NOT NULL,
"updated_at" DATETIME NOT NULL,
"phone_number" NVARCHAR(255) NOT NULL,
"description" NVARCHAR(255) NOT NULL,
"profile_image" NVARCHAR(255) NOT NULL
);
ALTER TABLE
"users" ADD CONSTRAINT "users_id_primary" PRIMARY KEY("id");
CREATE UNIQUE INDEX "users_email_unique" ON
"users"("email");
CREATE TABLE "rooms"(
"id" INT NOT NULL,
"home_type" NVARCHAR(255) NOT NULL,
"room_type" NVARCHAR(255) NOT NULL,
"total_occupancy" INT NOT NULL,
"total_bedrooms" INT NOT NULL,
"total_bathrooms" INT NOT NULL,
"summary" NVARCHAR(255) NOT NULL,
"address" NVARCHAR(255) NOT NULL,
"has_tv" BIT NOT NULL,
"has_kitchen" BIT NOT NULL,
"has_air_con" BIT NOT NULL,
"has_heating" BIT NOT NULL,
"has_internet" BIT NOT NULL,
"price" INT NOT NULL,
"published_at" DATETIME NOT NULL,
"owner_id" INT NOT NULL,
"created_at" DATETIME NOT NULL,
"updated_at" DATETIME NOT NULL,
"latitude" FLOAT NOT NULL,
"longitude" FLOAT NOT NULL
);
ALTER TABLE
"rooms" ADD CONSTRAINT "rooms_id_primary" PRIMARY KEY("id");
CREATE TABLE "reservations"(
"id" INT NOT NULL,
"user_id" INT NOT NULL,
"room_id" INT NOT NULL,
"start_date" DATETIME NOT NULL,
"end_date" DATETIME NOT NULL,
"price" INT NOT NULL,
"total" INT NOT NULL,
"created_at" DATETIME NOT NULL,
"updated_at" DATETIME NOT NULL
);
ALTER TABLE
"reservations" ADD CONSTRAINT "reservations_id_primary" PRIMARY KEY("id");
CREATE TABLE "media"(
"id" INT NOT NULL,
"model_id" INT NOT NULL,
"model_type" NVARCHAR(255) NOT NULL,
"file_name" NVARCHAR(255) NOT NULL,
"mime_type" NVARCHAR(255) NULL
);
ALTER TABLE
"media" ADD CONSTRAINT "media_id_primary" PRIMARY KEY("id");
CREATE TABLE "reviews"(
"id" INT NOT NULL,
"reservation_id" INT NOT NULL,
"rating" INT NOT NULL,
"comment" NVARCHAR(255) NOT NULL
);
ALTER TABLE
"reviews" ADD CONSTRAINT "reviews_id_primary" PRIMARY KEY("id");
ALTER TABLE
"rooms" ADD CONSTRAINT "rooms_published_at_foreign" FOREIGN KEY("published_at") REFERENCES "users"("id");
ALTER TABLE
"reservations" ADD CONSTRAINT "reservations_user_id_foreign" FOREIGN KEY("user_id") REFERENCES "users"("id");
ALTER TABLE
"reservations" ADD CONSTRAINT "reservations_room_id_foreign" FOREIGN KEY("room_id") REFERENCES "rooms"("id");
ALTER TABLE
"reviews" ADD CONSTRAINT "reviews_reservation_id_foreign" FOREIGN KEY("reservation_id") REFERENCES "reservations"("id");
ALTER TABLE
"media" ADD CONSTRAINT "media_model_id_foreign" FOREIGN KEY("model_id") REFERENCES "reviews"("id");
ALTER TABLE
"media" ADD CONSTRAINT "media_model_id_foreign" FOREIGN KEY("model_id") REFERENCES "rooms"("id");

5
samples/sybase.sql Normal file
View File

@ -0,0 +1,5 @@
create view psych_titles as
select *
from (select * from titles
where type = "psychology") dt_psych
;

11
samples/teradata.sql Normal file
View File

@ -0,0 +1,11 @@
USING (empno INTEGER,
salary INTEGER)
MERGE INTO employee AS t
USING (SELECT :empno, :salary, name
FROM names
WHERE empno=:empno) AS s(empno, salary, name)
ON t.empno=s.empno
WHEN MATCHED THEN UPDATE
SET salary=s.salary, name = s.name
WHEN NOT MATCHED THEN INSERT (empno, name, salary)
VALUES (s.empno, s.name, s.salary);

1
samples/vertica.sql Normal file
View File

@ -0,0 +1 @@
INSERT INTO t1 (col1, col2) (SELECT 'abc', mycolumn FROM mytable);

53
test.sql Normal file
View File

@ -0,0 +1,53 @@
CREATE VIEW vsal
AS
SELECT a.deptno "Department",
a.num_emp / b.total_count "Employees",
a.sal_sum / b.total_sal "Salary"
FROM (SELECT deptno,
               Count(*) num_emp,
SUM(sal) sal_sum
FROM scott.emp
WHERE city = 'NYC'
GROUP BY deptno) a,
       (SELECT Count(*) total_count,
SUM(sal) total_sal
FROM scott.emp
WHERE city = 'NYC') b
;
INSERT ALL
WHEN ottl < 100000 THEN
INTO small_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 100000 and ottl < 200000 THEN
INTO medium_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 200000 THEN
into large_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 290000 THEN
INTO special_orders
SELECT o.order_id oid, o.customer_id cid, o.order_total ottl,
o.sales_rep_id sid, c.credit_limit cl, c.cust_email cem
FROM orders o, customers c
WHERE o.customer_id = c.customer_id;
create table scott.dept(
deptno number(2,0),
dname varchar2(14),
loc varchar2(13),
constraint pk_dept primary key (deptno)
);
create table scott.emp(
empno number(4,0),
ename varchar2(10),
job varchar2(9),
mgr number(4,0),
hiredate date,
sal number(7,2),
comm number(7,2),
deptno number(2,0),
constraint pk_emp primary key (empno),
constraint fk_deptno foreign key (deptno) references dept (deptno)
);

45
widget/er.html Normal file
View File

@ -0,0 +1,45 @@
<!DOCTYPE html>
<html lang="en-us">
<head>
<meta charset="UTF-8" />
<title>Data lineage view</title>
<script src="sqlflow.widget.3.5.19.js?t=1704526657668"></script>
<script>
document.addEventListener('DOMContentLoaded', async () => {
const sqlflow = await SQLFlow.init({
container: document.getElementById('sqlflow'),
width: '100%',
height: '100%',
apiPrefix: '',
component: {
sqlEditor: false,
graphLocate: true,
minimap: true,
},
});
const json = await fetch('json/erGraph.json').then(res => res.json());
sqlflow.visualizeERJSON(json, { layout: true });
});
</script>
<style>
html, body {
width: 100%;
height: 100%;
}
div {
height: 100%;
width: 100%;
}
* {
margin: 0;
padding: 0;
}
</style>
</head>
<body>
<div class="block">
<div id="sqlflow"></div>
</div>
</body>
</html>

45
widget/index.html Normal file
View File

@ -0,0 +1,45 @@
<!DOCTYPE html>
<html lang="en-us">
<head>
<meta charset="UTF-8" />
<title>Data lineage view</title>
<script src="sqlflow.widget.3.5.19.js?t=1704526657668"></script>
<script>
document.addEventListener('DOMContentLoaded', async () => {
const sqlflow = await SQLFlow.init({
container: document.getElementById('sqlflow'),
width: '100%',
height: '100%',
apiPrefix: '',
component: {
sqlEditor: false,
graphLocate: true,
minimap: true,
},
});
const json = await fetch('json/lineageGraph.json').then(res => res.json());
sqlflow.visualizeJSON(json, { layout: true });
});
</script>
<style>
html, body {
width: 100%;
height: 100%;
}
div {
height: 100%;
width: 100%;
}
* {
margin: 0;
padding: 0;
}
</style>
</head>
<body>
<div class="block">
<div id="sqlflow"></div>
</div>
</body>
</html>
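
Both widget pages load their graph data with fetch() from relative paths (json/lineageGraph.json and json/erGraph.json), so they generally need to be served over HTTP rather than opened directly from the filesystem. A minimal way to preview them locally, assuming Python 3 is available (the port is arbitrary):

cd widget
python -m http.server 8000
# lineage graph: http://localhost:8000/index.html
# ER diagram:    http://localhost:8000/er.html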

1
widget/jquery.min.js vendored Normal file

File diff suppressed because one or more lines are too long

1
widget/json/erGraph.json Normal file

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long