commit a0bcfd3549
j 2024-10-23 17:45:25 +08:00
101 changed files with 2180 additions and 0 deletions

26
.gitignore vendored Normal file

@ -0,0 +1,26 @@
# Compiled class file
*.class
# Log file
*.log
# BlueJ files
*.ctxt
# Mobile Tools for Java (J2ME)
.mtj.tmp/
# Package Files #
# *.jar
*.war
*.nar
*.ear
*.zip
*.tar.gz
*.rar
# virtual machine crash logs, see http://www.java.com/en/download/help/error_hotspot.xml
hs_err_pid*
replay_pid*
/target
/.idea

54
README.md Normal file

@ -0,0 +1,54 @@
## Gudu SQLFlow Lite version for Java
[Gudu SQLFlow](https://sqlflow.gudusoft.com) is a tool used to analyze SQL statements and stored procedures
of various databases to obtain complex [data lineage](https://en.wikipedia.org/wiki/Data_lineage) relationships and visualize them.
[Gudu SQLFlow Lite version for Java](https://github.com/sqlparser/java_data_lineage) allows Java developers to quickly integrate data lineage analysis and
visualization capabilities into their own Java applications. Data scientists can also use it in their daily work to quickly discover
data lineage in the complex SQL scripts that ETL jobs typically use to transform data on large data platforms.
Gudu SQLFlow Lite version for Java is free for non-commercial use and can handle any complex SQL statement
with a length of up to 10k, including support for stored procedures. It supports the SQL dialects of more than
20 major database vendors, such as Oracle, DB2, Snowflake, Redshift, Postgres, and so on.
Gudu SQLFlow Lite version for Java includes [a Java library](https://www.gudusoft.com/sqlflow-java-library-2/) for analyzing complex SQL statements and
stored procedures to retrieve data lineage relationships, and [a JavaScript library](https://docs.gudusoft.com/4.-sqlflow-widget/get-started) for visualizing data lineage relationships.
Gudu SQLFlow Lite version for Java can also automatically extract table and column constraints,
as well as relationships between tables and fields, from [DDL scripts exported from the database](https://docs.gudusoft.com/6.-sqlflow-ingester/introduction)
and generate an ER Diagram.
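For a quick feel of the Java library, here is a minimal sketch adapted from the demo controller in this repository (it assumes libs/gudusoft.gsqlparser-2.8.5.8.jar is on the classpath; the EDbVendor constant name is an assumption):
```
// Minimal sketch, adapted from DataLineageDemoController in this repository.
import gudusoft.gsqlparser.EDbVendor;
import gudusoft.gsqlparser.dlineage.DataFlowAnalyzer;
import gudusoft.gsqlparser.dlineage.dataflow.model.Option;
import gudusoft.gsqlparser.dlineage.dataflow.model.xml.dataflow;

public class LineageSketch {
    public static void main(String[] args) {
        Option option = new Option();
        option.setVendor(EDbVendor.dbvoracle); // assumption: Oracle dialect constant
        DataFlowAnalyzer analyzer =
                new DataFlowAnalyzer("select round(salary) as sal from scott.emp", option);
        analyzer.generateDataFlow(true);
        dataflow model = analyzer.getDataFlow(); // the column-level lineage model
        System.out.println(model.getRelationships().size() + " lineage relationships found");
    }
}
```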
### Build and run the program
This is a Spring Boot web service that integrates the browser-side application with Thymeleaf, so you do not need to install an additional web container such as Nginx.
#### Prerequisites
* Install maven
* Install Java JDK 1.8
#### Build
Compile with the following Maven command. The compiled jar package is placed under the target folder.
```
mvn package
```
You can also skip this step, because a compiled executable is already provided in the bin folder.
#### Run the program
```
java -jar bin/java_data_lineage-1.0.0.jar
```
When startup is complete, open the program in your browser at the following URL:
http://localhost:9600
The default port is 9600. If you need to change the port, for example to 8000, start with the following command:
```
java -jar bin/java_data_lineage-1.0.0.jar --server.port=8000
```
![png](doc/images/home.png)
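The lineage analysis itself is exposed as a small REST API (POST /sqlflow/datalineage, implemented by DataLineageDemoController in this repository). A minimal client sketch, assuming the service is running locally and that dbvoracle is a valid EDbVendor constant name:
```
// Minimal sketch: POST a SQL statement to the demo's lineage endpoint.
// Field names follow the DataflowRequest DTO in this repository.
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class LineageClientSketch {
    public static void main(String[] args) throws Exception {
        String body = "{\"dbVendor\":\"dbvoracle\","
                + "\"sqlText\":\"select round(salary) as sal from scott.emp\"}";
        HttpURLConnection con = (HttpURLConnection)
                new URL("http://localhost:9600/sqlflow/datalineage").openConnection();
        con.setRequestMethod("POST");
        con.setRequestProperty("Content-Type", "application/json;charset=utf-8");
        con.setDoOutput(true);
        try (OutputStream out = con.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = con.getInputStream();
             Scanner s = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            System.out.println(s.hasNext() ? s.next() : ""); // graph JSON wrapped in Result
        }
    }
}
```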
### Export metadata from various databases
You can export metadata from the database using [SQLFlow ingester](https://github.com/sqlparser/sqlflow_public/releases)
and hand it over to Gudu SQLFlow for data lineage analysis.
[Documentation for the SQLFlow ingester](https://docs.gudusoft.com/6.-sqlflow-ingester/introduction)
## Contact
For further information, please contact support@gudusoft.com

80
README_cn.md Normal file

@ -0,0 +1,80 @@
## Gudu SQLFlow Lite version for Java
[Gudu SQLFlow](https://sqlflow.gudusoft.com) is a tool that analyzes the SQL statements and stored procedures of various databases to obtain complex data lineage relationships and visualize them.
[Gudu SQLFlow Lite version for Java](https://github.com/sqlparser/java_data_lineage) allows Java developers to quickly integrate data lineage analysis and visualization capabilities into their own Java applications. Data analysts can also use it in their daily work to quickly discover the data lineage in complex SQL scripts, which are typically used by ETL jobs that transform data on large data platforms.
[Gudu SQLFlow Lite version for Java](https://github.com/sqlparser/java_data_lineage) is free for non-commercial use and can handle any complex SQL statement with a length of up to 10k, including support for stored procedures. It supports the SQL dialects of more than 20 major database vendors, such as Oracle, DB2, Snowflake, Redshift, Postgres, and so on.
[Gudu SQLFlow Lite version for Java](https://github.com/sqlparser/java_data_lineage) includes a Java library for analyzing complex SQL statements and stored procedures to retrieve data lineage relationships, and a JavaScript library for visualizing those relationships.
[Gudu SQLFlow Lite version for Java](https://github.com/sqlparser/java_data_lineage) can also automatically extract table and column constraints, as well as relationships between tables and fields, from [DDL scripts exported from the database](https://docs.gudusoft.com/6.-sqlflow-ingester/introduction) and generate an ER diagram.
### Build and run the program
This is a Spring Boot web service that integrates the browser-side application with Thymeleaf, so you do not need to install an additional web container such as Nginx.
#### Prerequisites
* Install Maven
* Install Java JDK 1.8
#### Build
Compile with the following Maven command. The compiled jar package is placed under the target folder.
```
mvn package
```
You can also skip this step, because a compiled executable is already provided in the bin folder.
#### Run the program
```
java -jar bin/java_data_lineage-1.0.0.jar
```
When startup is complete, open the following URL in your browser to access the program: http://localhost:9600
The default port is 9600. If you need to change the port, for example to 8000, start with the following command:
```
java -jar bin/java_data_lineage-1.0.0.jar --server.port=8000
```
The UI looks like this:
![png](doc/images/home.png)
### UI parameter description
#### <a id="note1"></a> 1. dbvendor: specify the database type
> The default is oracle. Supported values: access, bigquery, couchbase, dax, db2, greenplum, gaussdb, hana, hive, impala, informix, mdx, mssql,
sqlserver, mysql, netezza, odbc, openedge, oracle, postgresql, postgres, redshift, snowflake,
sybase, teradata, soql, vertica
#### <a id="note2"></a> 2. Setting: common parameter settings
* [indirect: show indirect lineage relationships](doc/cn/set_indirect.md)
* [show function: show functions](doc/cn/set_function.md)
* [show constant: show constants](doc/cn/set_constant.md)
* [ignoreRecordSet: ignore intermediate result sets](doc/cn/set_ignoreRecordSet.md)
* [table level: show table-level lineage relationships](doc/cn/set_tablelevel.md)
* [show transform: show relationship transforms](doc/cn/set_transform.md)
#### <a id="note3"></a> 3. Show ResultSet Types: simplified output for the specified result set types
The available result set types are:
* [result_of](doc/cn/rt_result_of.md)
* [cte](doc/cn/rt_cte.md)
* [insert_select](doc/cn/rt_insert_select.md)
* [update_select](doc/cn/rt_update_select.md)
* [merge_update](doc/cn/rt_merge_update.md)
* [merge_insert](doc/cn/rt_merge_insert.md)
* [update_set](doc/cn/rt_update_set.md)
* [pivot_table](doc/cn/rt_pivot_table.md)
* [unpivot_table](doc/cn/rt_unpivot_table.md)
* [rs](doc/cn/rt_rs.md)
* [function](doc/cn/rt_function.md)
* [case_when](doc/cn/rt_case_when.md)
### Export metadata from various databases
You can export metadata from a database using the [SQLFlow ingester](https://github.com/sqlparser/sqlflow_public/releases) and hand it over to Gudu SQLFlow for data lineage analysis.
[Documentation for the SQLFlow ingester](https://docs.gudusoft.com/6.-sqlflow-ingester/introduction)
### Contact
For further information, please contact support@gudusoft.com.


17
doc/cn/rt_case_when.md Normal file

@ -0,0 +1,17 @@
## Show ResultSet Types: case_when
oracle
```
SELECT
COUNT( CASE WHEN AGE = 18 THEN 'countof18' END) EIGHTENN,
COUNT( CASE WHEN AGE = 19 THEN 'countof19' END) NINETEEN
FROM
PeopleInfo
```
Before:
![png](../images/rt_case_when_01.png)
After:
![png](../images/rt_case_when_02.png)

28
doc/cn/rt_cte.md Normal file

@ -0,0 +1,28 @@
## Show ResultSet Types: cte
oracle
```
with aa as
(
select country_id ,row_number() over(order by country_id ) as ida
from contry a
)
, bb as
(
select country_id ,row_number() over(order by country_id ) as idb
from contry
)
, cc as
(
select aa.ida, bb.idb from aa left join bb on aa.ida=bb.idb
)
select * from cc;
```
Before:
![png](../images/rt_cte_01.png)
After:
![png](../images/rt_cte_02.png)

14
doc/cn/rt_function.md Normal file

@ -0,0 +1,14 @@
## Show ResultSet Types: function
oracle
```
select round(salary) as sal from scott.emp
```
Before:
![png](../images/rt_function_01.png)
After:
![png](../images/rt_function_02.png)

30
doc/cn/rt_insert_select.md Normal file

@ -0,0 +1,30 @@
## Show ResultSet Types: insert_select
oracle
```
INSERT ALL
WHEN ottl < 100000 THEN
INTO small_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 100000 and ottl < 200000 THEN
INTO medium_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 200000 THEN
into large_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 290000 THEN
INTO special_orders
SELECT o.order_id oid, o.customer_id cid, o.order_total ottl,
o.sales_rep_id sid, c.credit_limit cl, c.cust_email cem
FROM orders o, customers c
WHERE o.customer_id = c.customer_id;
```
Before:
![png](../images/rt_insert_select_01.png)
After:
![png](../images/rt_insert_select_02.png)

22
doc/cn/rt_merge_insert.md Normal file

@ -0,0 +1,22 @@
## Show ResultSet Types: merge_insert
oracle
```
merge into t_B_info_bb b
using t_B_info_aa a
on (a.id = b.id and a.type = b.type)
when matched then
update set b.price = a.price
when not matched then
insert (id, type, price) values (a.id, a.type, a.price)
```
Before:
![png](../images/rt_merge_insert_01.png)
After:
![png](../images/rt_merge_insert_02.png)

20
doc/cn/rt_merge_update.md Normal file

@ -0,0 +1,20 @@
## Show ResultSet Types: merge_update
oracle
```
merge into t_B_info_bb b
using t_B_info_aa a
on (a.id = b.id and a.type = b.type)
when matched then
update set b.price = a.price
when not matched then
insert (id, type, price) values (a.id, a.type, a.price)
```
Before:
![png](../images/rt_merge_update_01.png)
After:
![png](../images/rt_merge_insert_02.png)

13
doc/cn/rt_pivot_table.md Normal file

@ -0,0 +1,13 @@
## Show ResultSet Types: pivot_table
oracle
```
select * from table2 pivot(max(value) as attr for(attr) in('age' as age,'sex' as sex));
```
Before:
![png](../images/rt_pivot_table_01.png)
After:
![png](../images/rt_pivot_table_02.png)

29
doc/cn/rt_result_of.md Normal file

@ -0,0 +1,29 @@
## Show ResultSet Types: result_of
oracle
```
CREATE VIEW vsal
AS
SELECT a.deptno "Department",
a.num_emp / b.total_count "Employees",
a.sal_sum / b.total_sal "Salary"
FROM (SELECT deptno,
Count(*) num_emp,
SUM(sal) sal_sum
FROM scott.emp
WHERE city = 'NYC'
GROUP BY deptno) a,
(SELECT Count(*) total_count,
SUM(sal) total_sal
FROM scott.emp
WHERE city = 'NYC') b
;
```
Before:
![png](../images/rt_result_of_01.png)
After:
![png](../images/rt_result_of_02.png)

13
doc/cn/rt_rs.md Normal file

@ -0,0 +1,13 @@
## Show ResultSet Types: rs
oracle
```
CREATE VIEW viewA AS SELECT a, b FROM tableA;
```
Before:
![png](../images/rt_rs_01.png)
After:
![png](../images/rt_rs_02.png)

17
doc/cn/rt_unpivot_table.md Normal file

@ -0,0 +1,17 @@
## Show ResultSet Types: unpivot_table
oracle
```
select * from pivot_sales_data
unpivot(
amount for month in (jan, feb, mar, apr)
)
order by prd_type_id;
```
Before:
![png](../images/rt_unpivot_table_01.png)
After:
![png](../images/rt_unpivot_table_02.png)

14
doc/cn/rt_update_select.md Normal file

@ -0,0 +1,14 @@
## Show ResultSet Types: update_select
oracle
```
UPDATE employees SET salary = (SELECT salary * 1.1 FROM employees WHERE department_id = 80) WHERE department_id = 80;
create view eview as select * from employees;
```
Before:
![png](../images/rt_update_select_01.png)
After:
![png](../images/rt_update_select_02.png)

14
doc/cn/rt_update_set.md Normal file

@ -0,0 +1,14 @@
## Show ResultSet Types: update_set
oracle
```
UPDATE employees SET salary =1 WHERE department_id = 80;
create view eview as select * from employees;
```
Before:
![png](../images/rt_update_set_01.png)
After:
![png](../images/rt_update_set_02.png)

12
doc/cn/set_constant.md Normal file

@ -0,0 +1,12 @@
## setting: show constant
oracle
```
SELECT deptno, '001' sortcode FROM scott.emp WHERE city = 'NYC'
```
The SQL above contains two constants. The data flow diagram showing the constants' lineage is as follows:
![png](../images/setting_constant_01.png)
If you turn off "show constant", the data flow result is as follows:
![png](../images/setting_constant_02.png)

19
doc/cn/set_function.md Normal file

@ -0,0 +1,19 @@
## setting: show function
In data flow analysis, functions play a key role: a function takes columns as arguments and produces a result that may be a scalar value or a set of values.
oracle
```
select round(salary) as sal from scott.emp
```
In the SQL above, a direct data flow is generated from the column salary through the round function:
> scott.emp.salary -> direct -> round(salary) -> direct -> sal
Data flow diagram:
![setting_function_01.png](../images/setting_function_01.png)
If you turn off "show function", the data flow result is as follows:
![setting_function_02.png](../images/setting_function_02.png)

29
doc/cn/set_ignoreRecordSet.md Normal file

@ -0,0 +1,29 @@
## setting: ignoreRecordSet
oracle
```
INSERT ALL
WHEN ottl < 100000 THEN
INTO small_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 100000 and ottl < 200000 THEN
INTO medium_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 200000 THEN
into large_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 290000 THEN
INTO special_orders
SELECT o.order_id oid, o.customer_id cid, o.order_total ottl,
o.sales_rep_id sid, c.credit_limit cl, c.cust_email cem
FROM orders o, customers c
WHERE o.customer_id = c.customer_id;
```
![png](../images/setting_ignoreRecordSet_01.png)
Without ignoring intermediate result sets, the SQL above produces an insert_select intermediate result set; its data flow diagram is as follows:
![png](../images/setting_ignoreRecordSet_02.png)
If you turn on "ignoreRecordSet", the data flow result is as follows:
![png](../images/setting_ignoreRecordSet_03.png)

73
doc/cn/set_indirect.md Normal file

@ -0,0 +1,73 @@
## setting: an introduction to indirect data flows and pseudo-columns
This article introduces the SQL elements that generate indirect data flows. Indirect data flows are typically generated by columns used in the where clause, the group by clause, aggregate functions, and so on.
To create indirect data flows between columns, we introduce a pseudo-column: RelationRows.
RelationRows is a pseudo-column of a relation that represents the number of rows in that relation. As the name suggests, RelationRows is not a real column of the relation (table, result set, etc.). It is usually used to represent data flows between columns and relations. The RelationRows pseudo-column can appear on both source and target relations.
### 1. RelationRows in the target relation
Take the following SQL as an example:
oracle
```
SELECT a.empName "eName" FROM scott.emp a Where sal > 1000
```
The total number of rows in the select list is affected by the value of the sal column in the where clause, so an indirect data flow is created:
> scott.emp.sal -> indirect -> RS-1.RelationRows
Data flow diagram:
![setting_indirect_01.png](../images/setting_indirect_01.png)
### 2. RelationRows in the source relation
Here is another sample SQL:
oracle
```
SELECT count(*) totalNum, sum(sal) totalSal FROM scott.emp
```
The values of the count(*) and sum(sal) functions are affected by the number of rows in the scott.emp source table:
> scott.emp.RelationRows -> indirect -> count(*)
> scott.emp.RelationRows -> indirect -> sum(sal)
Data flow diagram:
![setting_indirect_02.png](../images/setting_indirect_02.png)
### 3. RelationRows in table-level data flow relationships
RelationRows is also used to represent table-level data flows.
oracle
```
alter table t2 rename to t3;
```
A table-level data flow is built not on the table itself but on the RelationRows pseudo-column, as shown below:
> t2.RelationRows -> direct -> t3.RelationRows
![setting_indirect_03.png](../images/setting_indirect_03.png)
There are two reasons for building table-to-table data flows on the RelationRows pseudo-column:
1. If the user needs a table-level lineage model, this pseudo-column, which represents table-to-column data flows, is later used to generate the table-to-table data flows.
2. If other columns of the same table are used in column-to-column data flows while the table itself also takes part in a table-to-table data flow, the pseudo-column allows a single table to carry both column-to-column and table-to-table data flows at the same time.
Take this SQL as an example:
oracle
```
create view v1 as select f1 from t2;
alter table t2 rename to t3;
```
The first create view statement generates a column-level data flow between table t2 and view v1:
> t2.f1 -> direct -> RS-1.f1 -> direct -> v1.f1
>
The second alter table statement generates a table-level data flow between tables t2 and t3:
> t2.RelationRows -> direct -> t3.RelationRows
![setting_indirect_02.png](../images/setting_indirect_02.png)
As you can see, table t2 takes part in the column-to-column data flow generated by the create view statement as well as in the table-to-table data flow generated by the alter table statement; the single table t2 in the figure above carries both the column-to-column and the table-to-table data flows.

30
doc/cn/set_tablelevel.md Normal file

@ -0,0 +1,30 @@
## setting: table level
oracle
```
INSERT ALL
WHEN ottl < 100000 THEN
INTO small_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 100000 and ottl < 200000 THEN
INTO medium_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 200000 THEN
into large_orders
VALUES(oid, ottl, sid, cid)
WHEN ottl > 290000 THEN
INTO special_orders
SELECT o.order_id oid, o.customer_id cid, o.order_total ottl,
o.sales_rep_id sid, c.credit_limit cl, c.cust_email cem
FROM orders o, customers c
WHERE o.customer_id = c.customer_id;
```
With "table level" turned off, the SQL above shows the column-to-column lineage relationships; the data flow diagram is as follows:
![png](../images/setting_tablelevel_01.png)
If you turn on "table level", the data flow result is as follows:
![png](../images/setting_tablelevel_03.png)
![png](../images/setting_tablelevel_02.png)

13
doc/cn/set_transform.md Normal file

@ -0,0 +1,13 @@
## setting: show transform
The show transform setting displays the expressions in a SQL statement that perform data transformations, i.e. which source columns a target column's data is derived from and through which expression. For example:
oracle
```
select SUM(e.sal + Nvl(e.comm, 0)) AS sal from table1;
```
We can see that the data of the sal column is derived through the expression SUM(e.sal + Nvl(e.comm, 0)), and the source data columns are sal and comm.
By turning on the show transform setting, we can conveniently see the expression behind this transformation.
![png](../images/setting_transform_01.png)

BIN
doc/images/*.png Normal file (binary image files referenced by the pages above, e.g. home.png, rt_*.png, setting_*.png; content not shown)

1
github/HEAD Normal file

@ -0,0 +1 @@
ref: refs/heads/main

13
github/config Normal file

@ -0,0 +1,13 @@
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
ignorecase = true
precomposeunicode = true
[remote "origin"]
url = https://github.com/sqlparser/java_data_lineage.git
fetch = +refs/heads/*:refs/remotes/origin/*
[branch "main"]
remote = origin
merge = refs/heads/main

1
github/description Normal file

@ -0,0 +1 @@
Unnamed repository; edit this file 'description' to name the repository.

15
github/hooks/applypatch-msg.sample Executable file

@ -0,0 +1,15 @@
#!/bin/sh
#
# An example hook script to check the commit log message taken by
# applypatch from an e-mail message.
#
# The hook should exit with non-zero status after issuing an
# appropriate message if it wants to stop the commit. The hook is
# allowed to edit the commit message file.
#
# To enable this hook, rename this file to "applypatch-msg".
. git-sh-setup
commitmsg="$(git rev-parse --git-path hooks/commit-msg)"
test -x "$commitmsg" && exec "$commitmsg" ${1+"$@"}
:

24
github/hooks/commit-msg.sample Executable file

@ -0,0 +1,24 @@
#!/bin/sh
#
# An example hook script to check the commit log message.
# Called by "git commit" with one argument, the name of the file
# that has the commit message. The hook should exit with non-zero
# status after issuing an appropriate message if it wants to stop the
# commit. The hook is allowed to edit the commit message file.
#
# To enable this hook, rename this file to "commit-msg".
# Uncomment the below to add a Signed-off-by line to the message.
# Doing this in a hook is a bad idea in general, but the prepare-commit-msg
# hook is more suited to it.
#
# SOB=$(git var GIT_AUTHOR_IDENT | sed -n 's/^\(.*>\).*$/Signed-off-by: \1/p')
# grep -qs "^$SOB" "$1" || echo "$SOB" >> "$1"
# This example catches duplicate Signed-off-by lines.
test "" = "$(grep '^Signed-off-by: ' "$1" |
sort | uniq -c | sed -e '/^[ ]*1[ ]/d')" || {
echo >&2 Duplicate Signed-off-by lines.
exit 1
}

174
github/hooks/fsmonitor-watchman.sample Executable file

@ -0,0 +1,174 @@
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open2;
# An example hook script to integrate Watchman
# (https://facebook.github.io/watchman/) with git to speed up detecting
# new and modified files.
#
# The hook is passed a version (currently 2) and last update token
# formatted as a string and outputs to stdout a new update token and
# all files that have been modified since the update token. Paths must
# be relative to the root of the working tree and separated by a single NUL.
#
# To enable this hook, rename this file to "query-watchman" and set
# 'git config core.fsmonitor .git/hooks/query-watchman'
#
my ($version, $last_update_token) = @ARGV;
# Uncomment for debugging
# print STDERR "$0 $version $last_update_token\n";
# Check the hook interface version
if ($version ne 2) {
die "Unsupported query-fsmonitor hook version '$version'.\n" .
"Falling back to scanning...\n";
}
my $git_work_tree = get_working_dir();
my $retry = 1;
my $json_pkg;
eval {
require JSON::XS;
$json_pkg = "JSON::XS";
1;
} or do {
require JSON::PP;
$json_pkg = "JSON::PP";
};
launch_watchman();
sub launch_watchman {
my $o = watchman_query();
if (is_work_tree_watched($o)) {
output_result($o->{clock}, @{$o->{files}});
}
}
sub output_result {
my ($clockid, @files) = @_;
# Uncomment for debugging watchman output
# open (my $fh, ">", ".git/watchman-output.out");
# binmode $fh, ":utf8";
# print $fh "$clockid\n@files\n";
# close $fh;
binmode STDOUT, ":utf8";
print $clockid;
print "\0";
local $, = "\0";
print @files;
}
sub watchman_clock {
my $response = qx/watchman clock "$git_work_tree"/;
die "Failed to get clock id on '$git_work_tree'.\n" .
"Falling back to scanning...\n" if $? != 0;
return $json_pkg->new->utf8->decode($response);
}
sub watchman_query {
my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j --no-pretty')
or die "open2() failed: $!\n" .
"Falling back to scanning...\n";
# In the query expression below we're asking for names of files that
# changed since $last_update_token but not from the .git folder.
#
# To accomplish this, we're using the "since" generator to use the
# recency index to select candidate nodes and "fields" to limit the
# output to file names only. Then we're using the "expression" term to
# further constrain the results.
my $last_update_line = "";
if (substr($last_update_token, 0, 1) eq "c") {
$last_update_token = "\"$last_update_token\"";
$last_update_line = qq[\n"since": $last_update_token,];
}
my $query = <<" END";
["query", "$git_work_tree", {$last_update_line
"fields": ["name"],
"expression": ["not", ["dirname", ".git"]]
}]
END
# Uncomment for debugging the watchman query
# open (my $fh, ">", ".git/watchman-query.json");
# print $fh $query;
# close $fh;
print CHLD_IN $query;
close CHLD_IN;
my $response = do {local $/; <CHLD_OUT>};
# Uncomment for debugging the watch response
# open ($fh, ">", ".git/watchman-response.json");
# print $fh $response;
# close $fh;
die "Watchman: command returned no output.\n" .
"Falling back to scanning...\n" if $response eq "";
die "Watchman: command returned invalid output: $response\n" .
"Falling back to scanning...\n" unless $response =~ /^\{/;
return $json_pkg->new->utf8->decode($response);
}
sub is_work_tree_watched {
my ($output) = @_;
my $error = $output->{error};
if ($retry > 0 and $error and $error =~ m/unable to resolve root .* directory (.*) is not watched/) {
$retry--;
my $response = qx/watchman watch "$git_work_tree"/;
die "Failed to make watchman watch '$git_work_tree'.\n" .
"Falling back to scanning...\n" if $? != 0;
$output = $json_pkg->new->utf8->decode($response);
$error = $output->{error};
die "Watchman: $error.\n" .
"Falling back to scanning...\n" if $error;
# Uncomment for debugging watchman output
# open (my $fh, ">", ".git/watchman-output.out");
# close $fh;
# Watchman will always return all files on the first query so
# return the fast "everything is dirty" flag to git and do the
# Watchman query just to get it over with now so we won't pay
# the cost in git to look up each individual file.
my $o = watchman_clock();
$error = $output->{error};
die "Watchman: $error.\n" .
"Falling back to scanning...\n" if $error;
output_result($o->{clock}, ("/"));
$last_update_token = $o->{clock};
eval { launch_watchman() };
return 0;
}
die "Watchman: $error.\n" .
"Falling back to scanning...\n" if $error;
return 1;
}
sub get_working_dir {
my $working_dir;
if ($^O =~ 'msys' || $^O =~ 'cygwin') {
$working_dir = Win32::GetCwd();
$working_dir =~ tr/\\/\//;
} else {
require Cwd;
$working_dir = Cwd::cwd();
}
return $working_dir;
}

8
github/hooks/post-update.sample Executable file

@ -0,0 +1,8 @@
#!/bin/sh
#
# An example hook script to prepare a packed repository for use over
# dumb transports.
#
# To enable this hook, rename this file to "post-update".
exec git update-server-info

14
github/hooks/pre-applypatch.sample Executable file

@ -0,0 +1,14 @@
#!/bin/sh
#
# An example hook script to verify what is about to be committed
# by applypatch from an e-mail message.
#
# The hook should exit with non-zero status after issuing an
# appropriate message if it wants to stop the commit.
#
# To enable this hook, rename this file to "pre-applypatch".
. git-sh-setup
precommit="$(git rev-parse --git-path hooks/pre-commit)"
test -x "$precommit" && exec "$precommit" ${1+"$@"}
:

49
github/hooks/pre-commit.sample Executable file

@ -0,0 +1,49 @@
#!/bin/sh
#
# An example hook script to verify what is about to be committed.
# Called by "git commit" with no arguments. The hook should
# exit with non-zero status after issuing an appropriate message if
# it wants to stop the commit.
#
# To enable this hook, rename this file to "pre-commit".
if git rev-parse --verify HEAD >/dev/null 2>&1
then
against=HEAD
else
# Initial commit: diff against an empty tree object
against=$(git hash-object -t tree /dev/null)
fi
# If you want to allow non-ASCII filenames set this variable to true.
allownonascii=$(git config --type=bool hooks.allownonascii)
# Redirect output to stderr.
exec 1>&2
# Cross platform projects tend to avoid non-ASCII filenames; prevent
# them from being added to the repository. We exploit the fact that the
# printable range starts at the space character and ends with tilde.
if [ "$allownonascii" != "true" ] &&
# Note that the use of brackets around a tr range is ok here, (it's
# even required, for portability to Solaris 10's /usr/bin/tr), since
# the square bracket bytes happen to fall in the designated range.
test $(git diff --cached --name-only --diff-filter=A -z $against |
LC_ALL=C tr -d '[ -~]\0' | wc -c) != 0
then
cat <<\EOF
Error: Attempt to add a non-ASCII file name.
This can cause problems if you want to work with people on other platforms.
To be portable it is advisable to rename the file.
If you know what you are doing you can disable this check using:
git config hooks.allownonascii true
EOF
exit 1
fi
# If there are whitespace errors, print the offending file names and fail.
exec git diff-index --check --cached $against --

13
github/hooks/pre-merge-commit.sample Executable file

@ -0,0 +1,13 @@
#!/bin/sh
#
# An example hook script to verify what is about to be committed.
# Called by "git merge" with no arguments. The hook should
# exit with non-zero status after issuing an appropriate message to
# stderr if it wants to stop the merge commit.
#
# To enable this hook, rename this file to "pre-merge-commit".
. git-sh-setup
test -x "$GIT_DIR/hooks/pre-commit" &&
exec "$GIT_DIR/hooks/pre-commit"
:

53
github/hooks/pre-push.sample Executable file

@ -0,0 +1,53 @@
#!/bin/sh
# An example hook script to verify what is about to be pushed. Called by "git
# push" after it has checked the remote status, but before anything has been
# pushed. If this script exits with a non-zero status nothing will be pushed.
#
# This hook is called with the following parameters:
#
# $1 -- Name of the remote to which the push is being done
# $2 -- URL to which the push is being done
#
# If pushing without using a named remote those arguments will be equal.
#
# Information about the commits which are being pushed is supplied as lines to
# the standard input in the form:
#
# <local ref> <local oid> <remote ref> <remote oid>
#
# This sample shows how to prevent push of commits where the log message starts
# with "WIP" (work in progress).
remote="$1"
url="$2"
zero=$(git hash-object --stdin </dev/null | tr '[0-9a-f]' '0')
while read local_ref local_oid remote_ref remote_oid
do
if test "$local_oid" = "$zero"
then
# Handle delete
:
else
if test "$remote_oid" = "$zero"
then
# New branch, examine all commits
range="$local_oid"
else
# Update to existing branch, examine new commits
range="$remote_oid..$local_oid"
fi
# Check for WIP commit
commit=$(git rev-list -n 1 --grep '^WIP' "$range")
if test -n "$commit"
then
echo >&2 "Found WIP commit in $local_ref, not pushing"
exit 1
fi
fi
done
exit 0

169
github/hooks/pre-rebase.sample Executable file

@ -0,0 +1,169 @@
#!/bin/sh
#
# Copyright (c) 2006, 2008 Junio C Hamano
#
# The "pre-rebase" hook is run just before "git rebase" starts doing
# its job, and can prevent the command from running by exiting with
# non-zero status.
#
# The hook is called with the following parameters:
#
# $1 -- the upstream the series was forked from.
# $2 -- the branch being rebased (or empty when rebasing the current branch).
#
# This sample shows how to prevent topic branches that are already
# merged to 'next' branch from getting rebased, because allowing it
# would result in rebasing already published history.
publish=next
basebranch="$1"
if test "$#" = 2
then
topic="refs/heads/$2"
else
topic=`git symbolic-ref HEAD` ||
exit 0 ;# we do not interrupt rebasing detached HEAD
fi
case "$topic" in
refs/heads/??/*)
;;
*)
exit 0 ;# we do not interrupt others.
;;
esac
# Now we are dealing with a topic branch being rebased
# on top of master. Is it OK to rebase it?
# Does the topic really exist?
git show-ref -q "$topic" || {
echo >&2 "No such branch $topic"
exit 1
}
# Is topic fully merged to master?
not_in_master=`git rev-list --pretty=oneline ^master "$topic"`
if test -z "$not_in_master"
then
echo >&2 "$topic is fully merged to master; better remove it."
exit 1 ;# we could allow it, but there is no point.
fi
# Is topic ever merged to next? If so you should not be rebasing it.
only_next_1=`git rev-list ^master "^$topic" ${publish} | sort`
only_next_2=`git rev-list ^master ${publish} | sort`
if test "$only_next_1" = "$only_next_2"
then
not_in_topic=`git rev-list "^$topic" master`
if test -z "$not_in_topic"
then
echo >&2 "$topic is already up to date with master"
exit 1 ;# we could allow it, but there is no point.
else
exit 0
fi
else
not_in_next=`git rev-list --pretty=oneline ^${publish} "$topic"`
/usr/bin/perl -e '
my $topic = $ARGV[0];
my $msg = "* $topic has commits already merged to public branch:\n";
my (%not_in_next) = map {
/^([0-9a-f]+) /;
($1 => 1);
} split(/\n/, $ARGV[1]);
for my $elem (map {
/^([0-9a-f]+) (.*)$/;
[$1 => $2];
} split(/\n/, $ARGV[2])) {
if (!exists $not_in_next{$elem->[0]}) {
if ($msg) {
print STDERR $msg;
undef $msg;
}
print STDERR " $elem->[1]\n";
}
}
' "$topic" "$not_in_next" "$not_in_master"
exit 1
fi
<<\DOC_END
This sample hook safeguards topic branches that have been
published from being rewound.
The workflow assumed here is:
* Once a topic branch forks from "master", "master" is never
merged into it again (either directly or indirectly).
* Once a topic branch is fully cooked and merged into "master",
it is deleted. If you need to build on top of it to correct
earlier mistakes, a new topic branch is created by forking at
the tip of the "master". This is not strictly necessary, but
it makes it easier to keep your history simple.
* Whenever you need to test or publish your changes to topic
branches, merge them into "next" branch.
The script, being an example, hardcodes the publish branch name
to be "next", but it is trivial to make it configurable via
$GIT_DIR/config mechanism.
With this workflow, you would want to know:
(1) ... if a topic branch has ever been merged to "next". Young
topic branches can have stupid mistakes you would rather
clean up before publishing, and things that have not been
merged into other branches can be easily rebased without
affecting other people. But once it is published, you would
not want to rewind it.
(2) ... if a topic branch has been fully merged to "master".
Then you can delete it. More importantly, you should not
build on top of it -- other people may already want to
change things related to the topic as patches against your
"master", so if you need further changes, it is better to
fork the topic (perhaps with the same name) afresh from the
tip of "master".
Let's look at this example:
o---o---o---o---o---o---o---o---o---o "next"
/ / / /
/ a---a---b A / /
/ / / /
/ / c---c---c---c B /
/ / / \ /
/ / / b---b C \ /
/ / / / \ /
---o---o---o---o---o---o---o---o---o---o---o "master"
A, B and C are topic branches.
* A has one fix since it was merged up to "next".
* B has finished. It has been fully merged up to "master" and "next",
and is ready to be deleted.
* C has not merged to "next" at all.
We would want to allow C to be rebased, refuse A, and encourage
B to be deleted.
To compute (1):
git rev-list ^master ^topic next
git rev-list ^master next
if these match, topic has not merged in next at all.
To compute (2):
git rev-list master..topic
if this is empty, it is fully merged to "master".
DOC_END

24
github/hooks/pre-receive.sample Executable file

@ -0,0 +1,24 @@
#!/bin/sh
#
# An example hook script to make use of push options.
# The example simply echoes all push options that start with 'echoback='
# and rejects all pushes when the "reject" push option is used.
#
# To enable this hook, rename this file to "pre-receive".
if test -n "$GIT_PUSH_OPTION_COUNT"
then
i=0
while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"
do
eval "value=\$GIT_PUSH_OPTION_$i"
case "$value" in
echoback=*)
echo "echo from the pre-receive-hook: ${value#*=}" >&2
;;
reject)
exit 1
esac
i=$((i + 1))
done
fi

42
github/hooks/prepare-commit-msg.sample Executable file

@ -0,0 +1,42 @@
#!/bin/sh
#
# An example hook script to prepare the commit log message.
# Called by "git commit" with the name of the file that has the
# commit message, followed by the description of the commit
# message's source. The hook's purpose is to edit the commit
# message file. If the hook fails with a non-zero status,
# the commit is aborted.
#
# To enable this hook, rename this file to "prepare-commit-msg".
# This hook includes three examples. The first one removes the
# "# Please enter the commit message..." help message.
#
# The second includes the output of "git diff --name-status -r"
# into the message, just before the "git status" output. It is
# commented because it doesn't cope with --amend or with squashed
# commits.
#
# The third example adds a Signed-off-by line to the message, that can
# still be edited. This is rarely a good idea.
COMMIT_MSG_FILE=$1
COMMIT_SOURCE=$2
SHA1=$3
/usr/bin/perl -i.bak -ne 'print unless(m/^. Please enter the commit message/..m/^#$/)' "$COMMIT_MSG_FILE"
# case "$COMMIT_SOURCE,$SHA1" in
# ,|template,)
# /usr/bin/perl -i.bak -pe '
# print "\n" . `git diff --cached --name-status -r`
# if /^#/ && $first++ == 0' "$COMMIT_MSG_FILE" ;;
# *) ;;
# esac
# SOB=$(git var GIT_COMMITTER_IDENT | sed -n 's/^\(.*>\).*$/Signed-off-by: \1/p')
# git interpret-trailers --in-place --trailer "$SOB" "$COMMIT_MSG_FILE"
# if test -z "$COMMIT_SOURCE"
# then
# /usr/bin/perl -i.bak -pe 'print "\n" if !$first_line++' "$COMMIT_MSG_FILE"
# fi

78
github/hooks/push-to-checkout.sample Executable file

@ -0,0 +1,78 @@
#!/bin/sh
# An example hook script to update a checked-out tree on a git push.
#
# This hook is invoked by git-receive-pack(1) when it reacts to git
# push and updates reference(s) in its repository, and when the push
# tries to update the branch that is currently checked out and the
# receive.denyCurrentBranch configuration variable is set to
# updateInstead.
#
# By default, such a push is refused if the working tree and the index
# of the remote repository has any difference from the currently
# checked out commit; when both the working tree and the index match
# the current commit, they are updated to match the newly pushed tip
# of the branch. This hook is to be used to override the default
# behaviour; however the code below reimplements the default behaviour
# as a starting point for convenient modification.
#
# The hook receives the commit with which the tip of the current
# branch is going to be updated:
commit=$1
# It can exit with a non-zero status to refuse the push (when it does
# so, it must not modify the index or the working tree).
die () {
echo >&2 "$*"
exit 1
}
# Or it can make any necessary changes to the working tree and to the
# index to bring them to the desired state when the tip of the current
# branch is updated to the new commit, and exit with a zero status.
#
# For example, the hook can simply run git read-tree -u -m HEAD "$1"
# in order to emulate git fetch that is run in the reverse direction
# with git push, as the two-tree form of git read-tree -u -m is
# essentially the same as git switch or git checkout that switches
# branches while keeping the local changes in the working tree that do
# not interfere with the difference between the branches.
# The below is a more-or-less exact translation to shell of the C code
# for the default behaviour for git's push-to-checkout hook defined in
# the push_to_deploy() function in builtin/receive-pack.c.
#
# Note that the hook will be executed from the repository directory,
# not from the working tree, so if you want to perform operations on
# the working tree, you will have to adapt your code accordingly, e.g.
# by adding "cd .." or using relative paths.
if ! git update-index -q --ignore-submodules --refresh
then
die "Up-to-date check failed"
fi
if ! git diff-files --quiet --ignore-submodules --
then
die "Working directory has unstaged changes"
fi
# This is a rough translation of:
#
# head_has_history() ? "HEAD" : EMPTY_TREE_SHA1_HEX
if git cat-file -e HEAD 2>/dev/null
then
head=HEAD
else
head=$(git hash-object -t tree --stdin </dev/null)
fi
if ! git diff-index --quiet --cached --ignore-submodules $head --
then
die "Working directory has staged changes"
fi
if ! git read-tree -u -m "$commit"
then
die "Could not update working tree to new HEAD"
fi

128
github/hooks/update.sample Executable file

@ -0,0 +1,128 @@
#!/bin/sh
#
# An example hook script to block unannotated tags from entering.
# Called by "git receive-pack" with arguments: refname sha1-old sha1-new
#
# To enable this hook, rename this file to "update".
#
# Config
# ------
# hooks.allowunannotated
# This boolean sets whether unannotated tags will be allowed into the
# repository. By default they won't be.
# hooks.allowdeletetag
# This boolean sets whether deleting tags will be allowed in the
# repository. By default they won't be.
# hooks.allowmodifytag
# This boolean sets whether a tag may be modified after creation. By default
# it won't be.
# hooks.allowdeletebranch
# This boolean sets whether deleting branches will be allowed in the
# repository. By default they won't be.
# hooks.denycreatebranch
# This boolean sets whether remotely creating branches will be denied
# in the repository. By default this is allowed.
#
# --- Command line
refname="$1"
oldrev="$2"
newrev="$3"
# --- Safety check
if [ -z "$GIT_DIR" ]; then
echo "Don't run this script from the command line." >&2
echo " (if you want, you could supply GIT_DIR then run" >&2
echo " $0 <ref> <oldrev> <newrev>)" >&2
exit 1
fi
if [ -z "$refname" -o -z "$oldrev" -o -z "$newrev" ]; then
echo "usage: $0 <ref> <oldrev> <newrev>" >&2
exit 1
fi
# --- Config
allowunannotated=$(git config --type=bool hooks.allowunannotated)
allowdeletebranch=$(git config --type=bool hooks.allowdeletebranch)
denycreatebranch=$(git config --type=bool hooks.denycreatebranch)
allowdeletetag=$(git config --type=bool hooks.allowdeletetag)
allowmodifytag=$(git config --type=bool hooks.allowmodifytag)
# check for no description
projectdesc=$(sed -e '1q' "$GIT_DIR/description")
case "$projectdesc" in
"Unnamed repository"* | "")
echo "*** Project description file hasn't been set" >&2
exit 1
;;
esac
# --- Check types
# if $newrev is 0000...0000, it's a commit to delete a ref.
zero=$(git hash-object --stdin </dev/null | tr '[0-9a-f]' '0')
if [ "$newrev" = "$zero" ]; then
newrev_type=delete
else
newrev_type=$(git cat-file -t $newrev)
fi
case "$refname","$newrev_type" in
refs/tags/*,commit)
# un-annotated tag
short_refname=${refname##refs/tags/}
if [ "$allowunannotated" != "true" ]; then
echo "*** The un-annotated tag, $short_refname, is not allowed in this repository" >&2
echo "*** Use 'git tag [ -a | -s ]' for tags you want to propagate." >&2
exit 1
fi
;;
refs/tags/*,delete)
# delete tag
if [ "$allowdeletetag" != "true" ]; then
echo "*** Deleting a tag is not allowed in this repository" >&2
exit 1
fi
;;
refs/tags/*,tag)
# annotated tag
if [ "$allowmodifytag" != "true" ] && git rev-parse $refname > /dev/null 2>&1
then
echo "*** Tag '$refname' already exists." >&2
echo "*** Modifying a tag is not allowed in this repository." >&2
exit 1
fi
;;
refs/heads/*,commit)
# branch
if [ "$oldrev" = "$zero" -a "$denycreatebranch" = "true" ]; then
echo "*** Creating a branch is not allowed in this repository" >&2
exit 1
fi
;;
refs/heads/*,delete)
# delete branch
if [ "$allowdeletebranch" != "true" ]; then
echo "*** Deleting a branch is not allowed in this repository" >&2
exit 1
fi
;;
refs/remotes/*,commit)
# tracking branch
;;
refs/remotes/*,delete)
# delete tracking branch
if [ "$allowdeletebranch" != "true" ]; then
echo "*** Deleting a tracking branch is not allowed in this repository" >&2
exit 1
fi
;;
*)
# Anything else (is there anything else?)
echo "*** Update hook: unknown type of update to ref $refname of type $newrev_type" >&2
exit 1
;;
esac
# --- Finished
exit 0

BIN
github/index Normal file


6
github/info/exclude Normal file

@ -0,0 +1,6 @@
# git ls-files --others --exclude-from=.git/info/exclude
# Lines that start with '#' are comments.
# For a project mostly in C, the following would be a good set of
# exclude patterns (uncomment them if you want to use them):
# *.[oa]
# *~

1
github/logs/HEAD Normal file

@ -0,0 +1 @@
0000000000000000000000000000000000000000 b20efb697007b60cc6faef13bc06e05a78854b59 j <j@j.com> 1729676691 +0800 clone: from https://github.com/sqlparser/java_data_lineage.git

1
github/logs/refs/heads/main Normal file

@ -0,0 +1 @@
0000000000000000000000000000000000000000 b20efb697007b60cc6faef13bc06e05a78854b59 j <j@j.com> 1729676691 +0800 clone: from https://github.com/sqlparser/java_data_lineage.git

1
github/logs/refs/remotes/origin/HEAD Normal file

@ -0,0 +1 @@
0000000000000000000000000000000000000000 b20efb697007b60cc6faef13bc06e05a78854b59 j <j@j.com> 1729676691 +0800 clone: from https://github.com/sqlparser/java_data_lineage.git

2
github/packed-refs Normal file

@ -0,0 +1,2 @@
# pack-refs with: peeled fully-peeled sorted
b20efb697007b60cc6faef13bc06e05a78854b59 refs/remotes/origin/main

1
github/refs/heads/main Normal file

@ -0,0 +1 @@
b20efb697007b60cc6faef13bc06e05a78854b59

1
github/refs/remotes/origin/HEAD Normal file

@ -0,0 +1 @@
ref: refs/remotes/origin/main


63
pom.xml Normal file

@ -0,0 +1,63 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.7.16</version>
<relativePath/> <!-- lookup parent from repository -->
</parent>
<groupId>com.gudusoft</groupId>
<artifactId>java_data_lineage</artifactId>
<version>1.0.0</version>
<name>java_data_lineage</name>
<description>Demo project for GSP</description>
<properties>
<java.version>1.8</java.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-thymeleaf</artifactId>
</dependency>
<dependency>
<groupId>com.sqlparser</groupId>
<artifactId>gudusoft.gsqlparser</artifactId>
<version>2.8.5.8</version>
<scope>system</scope>
<systemPath>${project.basedir}/libs/gudusoft.gsqlparser-2.8.5.8.jar</systemPath>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.8.1</version>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.70</version>
</dependency>
<dependency>
<groupId>jakarta.validation</groupId>
<artifactId>jakarta.validation-api</artifactId>
<version>2.0.2</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
<configuration>
<includeSystemScope>true</includeSystemScope>
</configuration>
</plugin>
</plugins>
</build>
</project>

13
src/main/java/com/gudusoft/datalineage/demo/DataLineageDemoApplication.java Normal file

@ -0,0 +1,13 @@
package com.gudusoft.datalineage.demo;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
public class DataLineageDemoApplication {
public static void main(String[] args) {
SpringApplication.run(DataLineageDemoApplication.class, args);
}
}

79
src/main/java/com/gudusoft/datalineage/demo/controller/DataLineageDemoController.java Normal file

@ -0,0 +1,79 @@
package com.gudusoft.datalineage.demo.controller;
import com.gudusoft.datalineage.demo.dto.DataflowRequest;
import com.gudusoft.datalineage.demo.dto.Result;
import gudusoft.gsqlparser.EDbVendor;
import gudusoft.gsqlparser.dlineage.DataFlowAnalyzer;
import gudusoft.gsqlparser.dlineage.dataflow.model.Option;
import gudusoft.gsqlparser.dlineage.dataflow.model.RelationshipType;
import gudusoft.gsqlparser.dlineage.dataflow.model.xml.dataflow;
import gudusoft.gsqlparser.dlineage.graph.DataFlowGraphGenerator;
import gudusoft.gsqlparser.dlineage.util.DataflowUtility;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;
import javax.validation.Valid;
import java.util.ArrayList;
import java.util.Objects;
@RestController
@RequestMapping("/sqlflow")
public class DataLineageDemoController {
@RequestMapping(value = "/datalineage", method = {RequestMethod.POST}, produces = "application/json;charset=utf-8")
public Result<String> dataflow(@Valid @RequestBody DataflowRequest req) throws Exception {
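// Build the analyzer options from the request: SQL dialect, relationship-type filter, and display flags.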
Option option = new Option();
option.setVendor(EDbVendor.valueOf(req.getDbVendor()));
option.setSimpleOutput(false);
option.setIgnoreRecordSet(false);
option.filterRelationTypes("fdd,fddi,frd,fdr".split(","));
option.setLinkOrphanColumnToFirstTable(true);
option.setOutput(false);
option.setSimpleShowFunction(req.isSimpleShowFunction());
option.setShowConstantTable(req.isShowConstantTable());
option.setTransform(req.isShowTransform());
option.setTransformCoordinate(req.isShowTransform());
option.setShowCountTableColumn(true);
if(Objects.nonNull(req.getShowResultSetTypes())){
option.showResultSetTypes(req.getShowResultSetTypes().split(","));
option.setIgnoreRecordSet(true);
option.setSimpleOutput(false);
}
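// Analyze the submitted SQL and build the lineage model.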
DataFlowAnalyzer analyzer = new DataFlowAnalyzer(req.getSqlText(), option);
analyzer.generateDataFlow(true);
dataflow dataflow = analyzer.getDataFlow();
if(req.isIgnoreRecordSet() && Objects.isNull(req.getShowResultSetTypes())){
analyzer = new DataFlowAnalyzer("", option.getVendor(), false);
analyzer.setIgnoreRecordSet(req.isIgnoreRecordSet());
analyzer.getOption().setShowERDiagram(true);
ArrayList types = new ArrayList();
types.add(RelationshipType.fdd.name());
types.add(RelationshipType.fdr.name());
dataflow = analyzer.getSimpleDataflow(dataflow, false, types);
}
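// Optionally collapse the column-level lineage into table-level lineage.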
if(req.isTableLevel()){
dataflow = DataflowUtility.convertToTableLevelDataflow(dataflow);
}
if(dataflow.getRelationships().size() > 2000){
return Result.error(500, "More than 2,000 relationships, the front end can not be displayed!");
}
DataFlowGraphGenerator generator = new DataFlowGraphGenerator();
String result = generator.genDlineageGraph(option.getVendor(),req.isIndirect(), dataflow);
return Result.success(result);
}
@RequestMapping(value = "/erdiagram", method = {RequestMethod.POST}, produces = "application/json;charset=utf-8")
public Result<String> erflow(@Valid @RequestBody DataflowRequest req) throws Exception {
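// ER-diagram mode: extract tables, columns and their relationships (e.g. from exported DDL scripts).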
Option option = new Option();
option.setVendor(EDbVendor.valueOf(req.getDbVendor()));
option.setShowERDiagram(true);
DataFlowAnalyzer analyzer = new DataFlowAnalyzer(req.getSqlText(), option);
analyzer.generateDataFlow();
dataflow dataflow = analyzer.getDataFlow();
DataFlowGraphGenerator generator = new DataFlowGraphGenerator();
String result = generator.genERGraph(option.getVendor(), dataflow);
return Result.success(result);
}
}

12
src/main/java/com/gudusoft/datalineage/demo/controller/WebController.java Normal file

@ -0,0 +1,12 @@
package com.gudusoft.datalineage.demo.controller;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
@Controller
public class WebController {
@RequestMapping("/")
public String home(){
return "index";
}
}

44
src/main/java/com/gudusoft/datalineage/demo/dto/CodeMsg.java Normal file

@ -0,0 +1,44 @@
package com.gudusoft.datalineage.demo.dto;
public class CodeMsg {
private int code;
private String msg;
public static CodeMsg SUCCESS = new CodeMsg(200, "success");
public static CodeMsg SERVER_ERROR = new CodeMsg(500, "internal server error");
private CodeMsg( ) {
}
private CodeMsg( int code,String msg ) {
this.code = code;
this.msg = msg;
}
public int getCode() {
return code;
}
public void setCode(int code) {
this.code = code;
}
public String getMsg() {
return msg;
}
public void setMsg(String msg) {
this.msg = msg;
}
public CodeMsg fillArgs(Object... args) {
int code = this.code;
String message = String.format(this.msg, args);
return new CodeMsg(code, message);
}
@Override
public String toString() {
return "CodeMsg [code=" + code + ", msg=" + msg + "]";
}
}

90
src/main/java/com/gudusoft/datalineage/demo/dto/DataflowRequest.java Normal file

@ -0,0 +1,90 @@
package com.gudusoft.datalineage.demo.dto;
import javax.validation.constraints.NotEmpty;
public class DataflowRequest {
@NotEmpty(message = "dbVendor not empt")
private String dbVendor;
@NotEmpty(message = "sqlText not empt")
private String sqlText;
private boolean indirect = true;
private boolean showConstantTable = true;
private boolean simpleShowFunction = true;
private String showResultSetTypes;
private boolean ignoreRecordSet = true;
private boolean showTransform = false;
private boolean tableLevel = false;
public String getDbVendor() {
return dbVendor;
}
public void setDbVendor(String dbVendor) {
this.dbVendor = dbVendor;
}
public String getSqlText() {
return sqlText;
}
public void setSqlText(String sqlText) {
this.sqlText = sqlText;
}
public boolean isIndirect() {
return indirect;
}
public void setIndirect(boolean indirect) {
this.indirect = indirect;
}
public boolean isShowConstantTable() {
return showConstantTable;
}
public void setShowConstantTable(boolean showConstantTable) {
this.showConstantTable = showConstantTable;
}
public boolean isSimpleShowFunction() {
return simpleShowFunction;
}
public void setSimpleShowFunction(boolean simpleShowFunction) {
this.simpleShowFunction = simpleShowFunction;
}
public String getShowResultSetTypes() {
return showResultSetTypes;
}
public void setShowResultSetTypes(String showResultSetTypes) {
this.showResultSetTypes = showResultSetTypes;
}
public boolean isIgnoreRecordSet() {
return ignoreRecordSet;
}
public void setIgnoreRecordSet(boolean ignoreRecordSet) {
this.ignoreRecordSet = ignoreRecordSet;
}
public boolean isTableLevel() {
return tableLevel;
}
public void setTableLevel(boolean tableLevel) {
this.tableLevel = tableLevel;
}
public boolean isShowTransform() {
return showTransform;
}
public void setShowTransform(boolean showTransform) {
this.showTransform = showTransform;
}
}
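For reference, a minimal sketch of the JSON shape this request DTO maps to (serialized with the fastjson dependency declared in pom.xml; the dbvoracle value is an assumption):
```
// Minimal sketch: print the JSON request shape accepted by /sqlflow/datalineage.
import com.alibaba.fastjson.JSON;
import com.gudusoft.datalineage.demo.dto.DataflowRequest;

public class RequestShapeSketch {
    public static void main(String[] args) {
        DataflowRequest req = new DataflowRequest();
        req.setDbVendor("dbvoracle"); // assumption: an EDbVendor constant name
        req.setSqlText("select round(salary) as sal from scott.emp");
        // Also serializes the defaulted flags, e.g. indirect=true, ignoreRecordSet=true.
        System.out.println(JSON.toJSONString(req));
    }
}
```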

55
src/main/java/com/gudusoft/datalineage/demo/dto/Result.java Normal file

@ -0,0 +1,55 @@
package com.gudusoft.datalineage.demo.dto;
public class Result<T> {
private int code;
private String msg;
private T data;
private long srvTime = System.currentTimeMillis();
public static <T> Result<T> success(T data){
return new Result<T>(data);
}
public static <T> Result<T> error(CodeMsg codeMsg){
return new Result<T>(codeMsg);
}
public static <T> Result<T> error(int code, String msg){
return new Result<T>(code, msg);
}
private Result(T data) {
this.code = CodeMsg.SUCCESS.getCode();
this.data = data;
}
private Result(int code, String msg) {
this.code = code;
this.msg = msg;
}
private Result(CodeMsg codeMsg) {
if(codeMsg != null) {
this.code = codeMsg.getCode();
this.msg = codeMsg.getMsg();
}
}
public int getCode() {
return code;
}
public void setCode(int code) {
this.code = code;
}
public String getMsg() {
return msg;
}
public void setMsg(String msg) {
this.msg = msg;
}
public T getData() {
return data;
}
public void setData(T data) {
this.data = data;
}
}

8
src/main/resources/application.yml Normal file

@ -0,0 +1,8 @@
spring:
application:
name: datalineage
server:
port: 9600
servlet:
context-path: /

File diff suppressed because one or more lines are too long


@ -0,0 +1,36 @@
const sampleSQL = {
dbvathena: "INSERT INTO cities_usa (city,state)\nSELECT city,state FROM cities_world WHERE country='usa';",
dbvazuresql: '-- azure sql sample SQL\nCREATE VIEW [SalesLT].[vProductAndDescription]\nWITH SCHEMABINDING\nAS\n-- View (indexed or standard) to display products and product descriptions by language.\nSELECT\n p.[ProductID]\n ,p.[Name]\n ,pm.[Name] AS [ProductModel]\n ,pmx.[Culture]\n ,pd.[Description]\nFROM [SalesLT].[Product] p\n INNER JOIN [SalesLT].[ProductModel] pm\n ON p.[ProductModelID] = pm.[ProductModelID]\n INNER JOIN [SalesLT].[ProductModelProductDescription] pmx\n ON pm.[ProductModelID] = pmx.[ProductModelID]\n INNER JOIN [SalesLT].[ProductDescription] pd\n ON pmx.[ProductDescriptionID] = pd.[ProductDescriptionID];',
dbvbigquery: "-- bigquery sample SQL\nMERGE dataset.DetailedInventory T\nUSING dataset.Inventory S\nON T.product = S.product\nWHEN NOT MATCHED AND s.quantity < 20 THEN\n INSERT(product, quantity, supply_constrained, comments)\n VALUES(product, quantity, true, ARRAY<STRUCT<created DATE, comment STRING>>[(DATE('2016-01-01'), 'comment1')])\nWHEN NOT MATCHED THEN\n INSERT(product, quantity, supply_constrained)\n VALUES(product, quantity, false)\n;",
dbvcouchbase:
'-- Couchbase\nSELECT t1.country, array_agg(t1.city), sum(t1.city_cnt) as apnum\nFROM (SELECT city, city_cnt, array_agg(airportname) as apnames, country\n FROM `travel-sample` WHERE type = "airport"\n GROUP BY city, country LETTING city_cnt = count(city) ) AS t1\nWHERE t1.city_cnt > 5\nGROUP BY t1.country;\n\nINSERT INTO `travel-sample` (key UUID(), value _country)\n SELECT _country FROM `travel-sample` _country\n WHERE type = "airport" AND airportname = "Heathrow";\n \nMERGE INTO all_empts a USING emps_deptb b ON KEY b.empId\nWHEN MATCHED THEN\n UPDATE SET a.depts = a.depts + 1,\n a.title = b.title || ", " || b.title\nWHEN NOT MATCHED THEN\n INSERT { "name": b.name, "title": b.title, "depts": b.depts, "empId": b.empId, "dob": b.dob }\n;\n\nUPDATE `travel-sample`\nSET foo = 9\nWHERE city = (SELECT raw city FROM `beer-sample` WHERE type = "brewery")\n;',
dbvdatabricks: `-- databricks\nCREATE OR REPLACE VIEW experienced_employee\n (id COMMENT 'Unique identification number', Name)\n COMMENT 'View for experienced employees'\n AS SELECT id, name\n FROM all_employee\n WHERE working_years > 5;`,
dbvdb2: `-- DB2\nSELECT PART, SUPPLIER, PRODNUM, PRODUCT\nFROM (SELECT PART, PROD# AS PRODNUM, SUPPLIER\nFROM PARTS\nWHERE PROD# < 200) AS PARTX\nLEFT OUTER JOIN PRODUCTS\nON PRODNUM = PROD#;\n\n\nSELECT D.DEPTNO, D.DEPTNAME,\nEMPINFO.AVGSAL, EMPINFO.EMPCOUNT\nFROM DEPT D,\nTABLE (SELECT AVG(E.SALARY) AS AVGSAL,\nCOUNT(*) AS EMPCOUNT\nFROM EMP E\nWHERE E.WORKDEPT = D.DEPTNO)\nAS EMPINFO;\n\nDELETE FROM EMP X\nWHERE ABSENT = (SELECT MAX(ABSENT) FROM EMP Y\nWHERE X.WORKDEPT = Y.WORKDEPT);\n\nINSERT INTO SMITH.TEMPEMPL\nSELECT *\nFROM DSN8B10.EMP;\n\nINSERT INTO B.EMP_PHOTO_RESUME\n-- OVERRIDING USER VALUE\nSELECT * FROM DSN8B10.EMP_PHOTO_RESUME;\n\nMERGE INTO RECORDS AR\nUSING (VALUES (:hv_activity, :hv_description)\n--FOR :hv_nrows ROWS\n)\nAS AC (ACTIVITY, DESCRIPTION)\nON (AR.ACTIVITY = AC.ACTIVITY)\nWHEN MATCHED THEN UPDATE SET DESCRIPTION = AC.DESCRIPTION\nWHEN NOT MATCHED THEN INSERT (ACTIVITY, DESCRIPTION)\nVALUES (AC.ACTIVITY, AC.DESCRIPTION)\n-- NOT ATOMIC CONTINUE ON SQLEXCEPTION\n;\n\nCREATE TABLE T1 (COL1 CHAR(7), COL2 INT);\nINSERT INTO T1 VALUES ('abc', 10);\nMERGE INTO T1 AS A\nUSING TABLE (VALUES ('rsk', 3 ) ) AS T (ID, AMOUNT)\nON A.COL1 = T.ID\nWHEN MATCHED\nTHEN UPDATE SET COL2 = CARDINALITY(CHARA)\nWHEN NOT MATCHED\nTHEN INSERT (COL1, COL2 ) VALUES (T.ID, CARDINALITY(INTA));\n\nUPDATE DSN8B10.EMP\nSET PROJSIZE = (SELECT COUNT(*)\nFROM DSN8B10.PROJ\nWHERE DEPTNO = 'E21')\nWHERE WORKDEPT = 'E21';\n\nCREATE VIEW DSN8B10.FIRSTQTR (SNO, CHARGES, DATE) AS\nSELECT SNO, CHARGES, DATE\nFROM MONTH1\nWHERE DATE BETWEEN '01/01/2000' and '01/31/2000'\nUNION All\nSELECT SNO, CHARGES, DATE\nFROM MONTH2\nWHERE DATE BETWEEN '02/01/2000' and '02/29/2000'\nUNION All\nSELECT SNO, CHARGES, DATE\nFROM MONTH3\nWHERE DATE BETWEEN '03/01/2000' and '03/31/2000';\n\n/* not supported yet\n\nEXEC SQL\nDECLARE C1 CURSOR FOR\nSELECT BONUS\nFROM DSN8710.EMP\nWHERE WORKDEPT = 'E12'\nFOR UPDATE OF BONUS;\n\nEXEC SQL\nUPDATE DSN8710.EMP\nSET BONUS = ( SELECT .10 * SALARY FROM DSN8710.EMP Y\nWHERE EMPNO = Y.EMPNO )\nWHERE CURRENT OF C1;\n*/`,
dbvgaussdb: `-- gaussdb\n CREATE TABLE hr.section\n(\n id INTEGER NOT NULL, \n code VARCHAR(16) NOT NULL, \n name VARCHAR(50) NOT NULL, \n place_id INTEGER NOT NULL,\n place_name VARCHAR(50) NOT NULL \n);\n\nCREATE UNIQUE INDEX hr_section_index1 ON hr.section(id);\n\nCREATE TABLE hr.deptinfo AS SELECT * FROM hr.deptbase WHERE id > 365;\n\nCREATE VIEW deptinView AS\n SELECT * FROM hr.deptinfo WHERE id > 100;\n \nCREATE OR REPLACE PROCEDURE dept_proc()\nAS \ndeclare\n DEPT_NAME VARCHAR(100);\n DEPT_LOC INTEGER;\n CURSOR C1 IS \n SELECT name, place_id FROM hr.section WHERE place_id <= 50;\nBEGIN\n OPEN C1;\n LOOP\n FETCH C1 INTO DEPT_NAME, DEPT_LOC;\n EXIT WHEN C1%NOTFOUND;\n DBMS_OUTPUT.PUT_LINE(DEPT_NAME||\'---\'||DEPT_LOC);\n END LOOP;\n CLOSE C1;\nEND;`,
dbvgreenplum: `-- greenplum sample SQL\n\nWITH regional_sales AS (\n SELECT region, SUM(amount) AS total_sales\n FROM orders\n GROUP BY region\n ), top_regions AS (\n SELECT region\n FROM regional_sales\n WHERE total_sales > (SELECT SUM(total_sales)/10 FROM regional_sales)\n )\nSELECT region,\n product,\n SUM(quantity) AS product_units,\n SUM(amount) AS product_sales\nFROM orders\nWHERE region IN (SELECT region FROM top_regions)\nGROUP BY region, product;\n\nWITH RECURSIVE search_graph(id, link, data, depth, path, cycle) AS (\n SELECT g.id, g.link, g.data, 1,\n ARRAY[g.id],\n false\n FROM graph g\n UNION ALL\n SELECT g.id, g.link, g.data, sg.depth + 1,\n path || g.id,\n g.id = ANY(path)\n FROM graph g, search_graph sg\n WHERE g.id = sg.link AND NOT cycle\n)\nSELECT * FROM search_graph;\n\nINSERT INTO films SELECT * FROM tmp_films WHERE date_prod < '2004-05-07';\n\nUPDATE employees SET sales_count = sales_count + 1 WHERE id =\n (SELECT id FROM accounts WHERE name = 'Acme Corporation');\n \n \nUPDATE accounts SET (contact_last_name, contact_first_name) =\n (SELECT last_name, first_name FROM salesmen\n WHERE salesmen.id = accounts.sales_id); `,
dbvhana: `-- sap hana sample SQL\n\nCREATE TABLE t1(a INT, b INT, c INT); \nCREATE TABLE t2(a INT, b INT, c INT);\nINSERT INTO t1 VALUES(1,12,13); \nINSERT INTO t2 VALUES(2,120,130);\n \nINSERT INTO t1 WITH alias1 AS (SELECT * FROM t1) SELECT * FROM alias1;\nINSERT INTO t1 WITH w1 AS (SELECT * FROM t2) SELECT * FROM w1;\n\nSELECT TA.a1, TB.b1 FROM TA, LATERAL (SELECT b1, b2 FROM TB WHERE b3 = TA.a3) TB WHERE TA.a2 = TB.b2;\n\nUPDATE T1 SET C1 = ARRAY( SELECT C1 FROM T0 ) WHERE ID = 1;\n\nUPSERT T1 VALUES ( 1, ARRAY ( SELECT C1 FROM T0 ) ) WHERE ID = 1;\n\n\nMERGE INTO "my_schema".t1 USING "my_schema".t2 ON "my_schema".t1.a = "my_schema".t2.a\n WHEN MATCHED THEN UPDATE SET "my_schema".t1.b = "my_schema".t2.b\n WHEN NOT MATCHED THEN INSERT VALUES("my_schema".t2.a, "my_schema".t2.b);\n \n MERGE INTO T1 USING T2 ON T1.A = T2.A\n WHEN MATCHED AND T1.A > 1 THEN UPDATE SET B = T2.B\n WHEN NOT MATCHED AND T2.A > 3 THEN INSERT VALUES (T2.A, T2.B);\n \n \n /* not supported yet\n CREATE VIEW Illness_K_Anon (ID, Gender, Location, Illness)\n AS SELECT ID, Gender, City AS Location, Illness\n FROM Illness\n WITH ANONYMIZATION ( ALGORITHM 'K-ANONYMITY'\n PARAMETERS '{"data_change_strategy": "qualified", "k": 2}'\n COLUMN ID PARAMETERS '{"is_sequence": true}'\n COLUMN Gender PARAMETERS '{"is_quasi_identifier":true, "hierarchy":{"embedded": [["F"], ["M"]]}}'\n COLUMN Location PARAMETERS '{"is_quasi_identifier":true, "hierarchy":{"embedded": [["Paris", "France"], ["Munich", "Germany"], ["Nice", "France"]]}}');\n*/`,
dbvhive:
"-- Hive sample sql\nSELECT pageid, adid\nFROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid;\n\nCREATE DATABASE merge_data;\n \nCREATE TABLE merge_data.transactions(\n\tID int,\n\tTranValue string,\n\tlast_update_user string)\nPARTITIONED BY (tran_date string)\nCLUSTERED BY (ID) into 5 buckets \nSTORED AS ORC TBLPROPERTIES ('transactional'='true');\n \nCREATE TABLE merge_data.merge_source(\n\tID int,\n\tTranValue string,\n\ttran_date string)\nSTORED AS ORC;\n\n MERGE INTO merge_data.transactions AS T \n USING merge_data.merge_source AS S\n ON T.ID = S.ID and T.tran_date = S.tran_date\n WHEN MATCHED AND (T.TranValue != S.TranValue AND S.TranValue IS NOT NULL) THEN UPDATE SET TranValue = S.TranValue, last_update_user = 'merge_update'\n WHEN MATCHED AND S.TranValue IS NULL THEN DELETE\n WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.TranValue, 'merge_insert', S.tran_date)",
dbvimpala: `-- Impala sample sql\ninsert into t2 select * from t1;\ninsert into t2 select c1, c2 from t1;\n\nCREATE VIEW v5 AS SELECT c1, CAST(c3 AS STRING) c3, CONCAT(c4,c5) c5, TRIM(c6) c6, "Constant" c8 FROM t1;\n\nCREATE VIEW v7 (c1 COMMENT 'Comment for c1', c2) COMMENT 'Comment for v7' AS SELECT t1.c1, t1.c2 FROM t1;`,
dbvinformix: `-- Informix sample sql\nCREATE VIEW myview (cola, colb) AS\nSELECT colx, coly from firsttab\nUNION\nSELECT colx, colz from secondtab;\n\nCREATE VIEW palo_alto AS\nSELECT * FROM customer WHERE city = 'Palo Alto'\nWITH CHECK OPTION\n;\n\nMERGE INTO t2 AS o USING t1 AS n ON o.f1 = n.f1\nWHEN NOT MATCHED THEN INSERT ( o.f1,o.f2)\nVALUES ( n.f1,n.f2)\n;\n\nINSERT INTO t2(f1, f2)\nSELECT t1.f1, t1.f2 FROM t1\nWHERE NOT EXISTS\n(SELECT f1, f2 FROM t2\nWHERE t2.f1 = t1.f1);\n\nMERGE INTO sale USING new_sale AS n\nON sale.cust_id = n.cust_id\nWHEN MATCHED THEN UPDATE\nSET sale.salecount = sale.salecount + n.salecount\nWHEN NOT MATCHED THEN INSERT (cust_id, salecount)\nVALUES (n.cust_id, n.salecount);\n\n\nMERGE INTO customer c\nUSING ext_customer e\nON c.customer_num=e.customer_num AND c.fname=e.fname AND c.lname=e.lname\nWHEN NOT MATCHED THEN\nINSERT\n(c.fname, c.lname, c.company, c.address1, c.address2,\nc.city, c.state, c.zipcode, c.phone)\nVALUES\n(e.fname, e.lname, e.company, e.address1, e.address2,\ne.city, e.state, e.zipcode, e.phone)\nWHEN MATCHED THEN UPDATE\nSET c.fname = e.fname,\nc.lname = e.lname,\nc.company = e.company,\nc.address1 = e.address1,\nc.address2 = e.address2,\nc.city = e.city,\nc.state = e.state,\nc.zipcode = e.zipcode,\nc.phone = e.phone ;\n\n\nUPDATE nmosdb@wnmserver1:test\nSET name=(SELECT name FROM test\nWHERE test.id = nmosdb@wnmserver1:test.id)\nWHERE EXISTS(\nSELECT 1 FROM test WHERE test.id = nmosdb@wnmserver1:test.id\n);\n\nUPDATE orders\nSET ship_charge =\n(SELECT SUM(total_price) * .07 FROM items\nWHERE orders.order_num = items.order_num)\nWHERE orders.order_num = 1001;`,
dbvmdx: 'mdx',
dbvmysql:
"-- mysql sample sql\n\nSELECT\n salesperson.name,\n -- find maximum sale size for this salesperson\n (SELECT MAX(amount) AS amount\n FROM all_sales\n WHERE all_sales.salesperson_id = salesperson.id)\n AS amount,\n -- find customer for this maximum size\n (SELECT customer_name\n FROM all_sales\n WHERE all_sales.salesperson_id = salesperson.id\n AND all_sales.amount =\n -- find maximum size, again\n (SELECT MAX(amount) AS amount\n FROM all_sales\n WHERE all_sales.salesperson_id = salesperson.id))\n AS customer_name\nFROM\n salesperson;\n \n \nSELECT\n salesperson.name,\n max_sale.amount,\n max_sale_customer.customer_name\nFROM\n salesperson,\n -- calculate maximum size, cache it in transient derived table max_sale\n (SELECT MAX(amount) AS amount\n FROM all_sales\n WHERE all_sales.salesperson_id = salesperson.id)\n AS max_sale,\n -- find customer, reusing cached maximum size\n (SELECT customer_name\n FROM all_sales\n WHERE all_sales.salesperson_id = salesperson.id\n AND all_sales.amount =\n -- the cached maximum size\n max_sale.amount)\n AS max_sale_customer;\n\nSELECT\n salesperson.name,\n max_sale.amount,\n max_sale.customer_name\nFROM\n salesperson,\n -- find maximum size and customer at same time\n LATERAL\n (SELECT amount, customer_name\n FROM all_sales\n WHERE all_sales.salesperson_id = salesperson.id\n ORDER BY amount DESC LIMIT 1)\n AS max_sale;\n\nWITH RECURSIVE employee_paths (id, name, path) AS\n(\n SELECT id, name, CAST(id AS CHAR(200))\n FROM employees\n WHERE manager_id IS NULL\n UNION ALL\n SELECT e.id, e.name, CONCAT(ep.path, ',', e.id)\n FROM employee_paths AS ep JOIN employees AS e\n ON ep.id = e.manager_id\n)\nSELECT * FROM employee_paths ORDER BY path;\n\nUPDATE table1 t1 \nJOIN table2 t2 ON t1.field1 = t2.field1 \nJOIN table3 t3 ON (t3.field1=t2.field2 AND t3.field3 IS NOT NULL) \nSET t1.field9=t3.field9\nWHERE t1.field5=1\nAND t1.field9 IS NULL ",
dbvnetezza: `-- netezza sample sql\n\nINSERT INTO films SELECT * FROM tmp;\n\nINSERT INTO emp_copy WITH employee AS (select * from\nemp) SELECT * FROM employee;\n\nUPDATE emp_copy SET grp = 'gone' WHERE id =\n(WITH employee AS (select * from emp) SELECT id FROM employee WHERE id\n= 1);\n\nDELETE FROM emp_copy WHERE id IN\n(WITH employee AS (SELECT * FROM emp_copy where grp = 'gone')\nSELECT id FROM employee);\n\n\nWITH manager (mgr_id, mgr_name, mgr_dept) AS\n (SELECT id, name, grp\n FROM emp_copy\n WHERE mgr = id AND grp != 'gone'),\nemployee (emp_id, emp_name, emp_mgr) AS\n (SELECT id, name, mgr_id\n FROM emp_copy JOIN manager ON grp = mgr_dept),\nmgr_cnt (mgr_id, mgr_reports) AS\n (SELECT mgr, COUNT (*)\n FROM emp_copy\n WHERE mgr != id\n GROUP BY mgr)\nSELECT *\nFROM employee JOIN manager ON emp_mgr = mgr_id \n JOIN mgr_cnt ON emp_mgr = mgr_id \nWHERE emp_id != mgr_id\nORDER BY mgr_dept;`,
dbvopenedge: `-- openedge sample sql\n\nCREATE VIEW ne_customers AS\nSELECT name, address, city, state\nFROM customer\nWHERE state IN ( 'NH', 'MA', 'ME', 'RI', 'CT', 'VT' )\nWITH CHECK OPTION ;\n\nINSERT INTO neworders (order_no, product, qty)\nSELECT order_no, product, qty\nFROM orders\nWHERE order_date = SYSDATE ;\n\nUPDATE OrderLine\nSET (ItemNum, Price) =\n(SELECT ItemNum, Price * 3\nFROM Item\nWHERE ItemName = 'gloves')\nWHERE OrderNum = 21 ;\n\nUpdate Orderline\nSET (Itemnum) =\n(Select Itemnum\nFROM Item\nWHERE Itemname = 'Tennis balls')\nWHERE Ordernum = 20;`,
dbvoracle: `-- oracle sample sql\n\nCREATE VIEW vsal \nAS \n SELECT a.deptno "Department", \n a.num_emp / b.total_count "Employees", \n a.sal_sum / b.total_sal "Salary" \n FROM (SELECT deptno, \n Count(*) num_emp, \n SUM(sal) sal_sum \n FROM scott.emp \n WHERE city = 'NYC' \n GROUP BY deptno) a, \n (SELECT Count(*) total_count, \n SUM(sal) total_sal \n FROM scott.emp \n WHERE city = 'NYC') b \n;\n\nINSERT ALL\n WHEN ottl < 100000 THEN\n INTO small_orders\n VALUES(oid, ottl, sid, cid)\n WHEN ottl > 100000 and ottl < 200000 THEN\n INTO medium_orders\n VALUES(oid, ottl, sid, cid)\n WHEN ottl > 200000 THEN\n INTO large_orders\n VALUES(oid, ottl, sid, cid)\n WHEN ottl > 290000 THEN\n INTO special_orders\nSELECT o.order_id oid, o.customer_id cid, o.order_total ottl,\no.sales_rep_id sid, c.credit_limit cl, c.cust_email cem\nFROM orders o, customers c\nWHERE o.customer_id = c.customer_id;\n\ncreate table scott.dept( \n deptno number(2,0), \n dname varchar2(14), \n loc varchar2(13), \n constraint pk_dept primary key (deptno) \n);\n\ncreate table scott.emp( \n empno number(4,0), \n ename varchar2(10), \n job varchar2(9), \n mgr number(4,0), \n hiredate date, \n sal number(7,2), \n comm number(7,2), \n deptno number(2,0), \n constraint pk_emp primary key (empno),\n constraint fk_deptno foreign key (deptno) references dept (deptno) \n);`,
dbvpostgresql: `-- postgresql sample sql\n\ncreate view v2 as \nSELECT distributors.name\nFROM distributors\nWHERE distributors.name LIKE 'W%'\nUNION\nSELECT actors.name\nFROM actors\nWHERE actors.name LIKE 'W%';\n\n\nWITH t AS (\nSELECT random() as x FROM generate_series(1, 3)\n)\nSELECT * FROM t\nUNION ALL\nSELECT * FROM t\n;\n\ncreate view v3 as \nWITH RECURSIVE employee_recursive(distance, employee_name, manager_name) AS (\nSELECT 1, employee_name, manager_name\nFROM employee\nWHERE manager_name = 'Mary'\nUNION ALL\nSELECT er.distance + 1, e.employee_name, e.manager_name\nFROM employee_recursive er, employee e\nWHERE er.employee_name = e.manager_name\n)\nSELECT distance, employee_name FROM employee_recursive;\n\nWITH upd AS (\nUPDATE employees SET sales_count = sales_count + 1 WHERE id =\n(SELECT sales_person FROM accounts WHERE name = 'Acme Corporation')\nRETURNING *\n)\nINSERT INTO employees_log SELECT *, current_timestamp FROM upd;\n\n\n/* not implemented\nCREATE RECURSIVE VIEW nums_1_100 (n) AS\nVALUES (1)\nUNION ALL\nSELECT n+1 FROM nums_1_100 WHERE n < 100;\n*/`,
dbvpresto: '-- presto sample sql\n\nCREATE TABLE orders_column_aliased (order_date, total_price)\nAS\nSELECT orderdate, totalprice\nFROM orders;',
dbvredshift:
"-- redshift sample sql\nCreate table sales(\n dateid int,\n venuestate char(80),\n venuecity char(40),\n venuename char(100),\ncatname char(50),\nQtr int,\nqtysold int,\n pricepaid int,\nYear date\n);\ncreate view sales_vw as\nselect * from public.sales\nunion all\nselect * from spectrum.sales\n;\n\ninsert into category_ident(catgroup,catname,catdesc)\nselect catgroup,catname,catdesc from category;\n\nUPDATE category \nSET catdesc = 'Broadway Musical' \nWHERE category.catid IN (SELECT category.catid \n FROM category \n JOIN event \n ON category.catid = event.catid \n JOIN venue \n ON venue.venueid = event.venueid \n JOIN sales \n ON sales.eventid = event.eventid \n WHERE venuecity = 'New York City' \n AND catname = 'Musicals'); \n\nupdate category set catid=100\nfrom event join category cat on event.catid=cat.catid\nwhere cat.catgroup='Concerts';\n\nSELECT qtr, \n Sum(pricepaid) AS qtrsales, \n (SELECT Sum(pricepaid) \n FROM sales \n JOIN date \n ON sales.dateid = date.dateid \n WHERE qtr = '1' \n AND year = 2008) AS q1sales \nFROM sales \n JOIN date \n ON sales.dateid = date.dateid \nWHERE qtr IN( '2', '3' ) \n AND year = 2008 \nGROUP BY qtr \nORDER BY qtr; \n\n\nWITH venue_sales \n AS (SELECT venuename, \n venuecity, \n Sum(pricepaid) AS venuename_sales \n FROM sales, \n venue, \n event \n WHERE venue.venueid = event.venueid \n AND event.eventid = sales.eventid \n GROUP BY venuename, \n venuecity), \n top_venues \n AS (SELECT venuename \n FROM venue_sales \n WHERE venuename_sales > 800000) \nSELECT venuename, \n venuecity, \n venuestate, \n Sum(qtysold) AS venue_qty, \n Sum(pricepaid) AS venue_sales \nFROM sales, \n venue, \n event \nWHERE venue.venueid = event.venueid \n AND event.eventid = sales.eventid \n AND venuename IN(SELECT venuename \n FROM top_venues) \nGROUP BY venuename, \n venuecity, \n venuestate \nORDER BY venuename; ",
dbvsnowflake: `-- snowflake sample sql\n\ncreate or replace secure view myview comment='Test secure view' as select * from mytable;\n\n\ninsert into employees(first_name, last_name, workphone, city,postal_code)\n select\n contractor_first,contractor_last,worknum,null,zip_code\n from contractors\n where contains(worknum,'650');\n \ninsert into employees (first_name,last_name,workphone,city,postal_code)\n with cte as\n (select contractor_first as first_name,contractor_last as last_name,worknum as workphone,city,zip_code as postal_code\n from contractors)\n select first_name,last_name,workphone,city,postal_code\n from cte;\n\ninsert into emp (id,first_name,last_name,city,postal_code,ph)\n select a.id,a.first_name,a.last_name,a.city,a.postal_code,b.ph\n from emp_addr a\n inner join emp_ph b on a.id = b.id;\n\ninsert overwrite all\n into t1\n into t1 (c1, c2, c3) values (n2, n1, default)\n into t2 (c1, c2, c3)\n into t2 values (n3, n2, n1)\nselect n1, n2, n3 from src;\n\ninsert all\n when n1 > 100 then\n into t1\n when n1 > 10 then\n into t1\n into t2\n else\n into t2\nselect n1 from src;\n\ninsert all\ninto t1 values (key, a)\nselect src1.key as key, src1.a as a\nfrom src1, src2 where src1.key = src2.key;\n\n\nmerge into target\n using src on target.k = src.k\n when matched and src.v = 11 then delete\n when matched then update set target.v = src.v;\n\nmerge into target using (select k, max(v) as v from src group by k) as b on target.k = b.k\n when matched then update set target.v = b.v\n when not matched then insert (k, v) values (b.k, b.v);\n\nmerge into members m\n using (\n select id, date\n from signup\n where datediff(day, current_date(), signup.date::date) < -30) s on m.id = s.id\n when matched then update set m.fee = 40;\n\nupdate t1\n set t1.number_column = t1.number_column + t2.number_column, t1.text_column = 'ASDF'\n from t2\n where t1.key_column = t2.t1_key and t1.number_column < 10;\n\nupdate target set v = b.v\n from (select k, min(v) v from src group by k) b\n where target.k = b.k; `,
dbvsparksql:
"-- sparksql sample sql\n\nCREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING);\n\nSELECT * FROM person\n PIVOT (\n SUM(age) AS a, AVG(class) AS c\n FOR name IN ('John' AS john, 'Mike' AS mike)\n );",
dbvmssql:
"-- sql server sample sql\nCREATE TABLE dbo.EmployeeSales \n( DataSource varchar(20) NOT NULL, \n BusinessEntityID varchar(11) NOT NULL, \n LastName varchar(40) NOT NULL, \n SalesDollars money NOT NULL \n); \nGO \nCREATE PROCEDURE dbo.uspGetEmployeeSales \nAS \n SET NOCOUNT ON; \n SELECT 'PROCEDURE', sp.BusinessEntityID, c.LastName, \n sp.SalesYTD \n FROM Sales.SalesPerson AS sp \n INNER JOIN Person.Person AS c \n ON sp.BusinessEntityID = c.BusinessEntityID \n WHERE sp.BusinessEntityID LIKE '2%' \n ORDER BY sp.BusinessEntityID, c.LastName; \nGO \n--INSERT...SELECT example \nINSERT INTO dbo.EmployeeSales \n SELECT 'SELECT', sp.BusinessEntityID, c.LastName, sp.SalesYTD \n FROM Sales.SalesPerson AS sp \n INNER JOIN Person.Person AS c \n ON sp.BusinessEntityID = c.BusinessEntityID \n WHERE sp.BusinessEntityID LIKE '2%' \n ORDER BY sp.BusinessEntityID, c.LastName; \nGO \n\n\nCREATE VIEW hiredate_view \nAS \nSELECT p.FirstName, p.LastName, e.BusinessEntityID, e.HireDate \nFROM HumanResources.Employee e \nJOIN Person.Person AS p ON e.BusinessEntityID = p.BusinessEntityID ; \nGO \n\nCREATE VIEW view1 \nAS \nSELECT fis.CustomerKey, fis.ProductKey, fis.OrderDateKey, \n fis.SalesTerritoryKey, dst.SalesTerritoryRegion \nFROM FactInternetSales AS fis \nLEFT OUTER JOIN DimSalesTerritory AS dst \nON (fis.SalesTerritoryKey=dst.SalesTerritoryKey); \n\nGO \nSELECT ROW_NUMBER() OVER(PARTITION BY s.PostalCode ORDER BY s.SalesYTD DESC) AS \"Row Number\", \n p.LastName, s.SalesYTD, a.PostalCode \nFROM Sales.SalesPerson AS s \n INNER JOIN Person.Person AS p \n ON s.BusinessEntityID = p.BusinessEntityID \n INNER JOIN Person.Address AS a \n ON a.AddressID = p.BusinessEntityID \nWHERE s.TerritoryID IS NOT NULL \n AND s.SalesYTD <> 0 \nORDER BY s.PostalCode;",
dbvsybase:
'-- sybase sample sql\n\ncreate view accounts (title, advance, amt_due)\nas select title, advance, price * total_sales\nfrom titles\nwhere price > $5;\n\ncreate view cities\n(authorname, acity, publishername, pcity)\nas select authors.au_lname, authors.city, publishers.pub_name,\npublishers.city\nfrom authors, publishers\nwhere authors.city = publishers.city;\n\ncreate view cities2\nas select authorname = authors.au_lname,\nacity = authors.city, publishername = publishers.pub_name, pcity =\npublishers.city\nfrom authors, publishers\nwhere authors.city = publishers.city\n;\n\ncreate view psych_titles as\nselect *\nfrom (select * from titles\nwhere type = "psychology") dt_psych\n;\n\ninsert newpublishers (pub_id, pub_name)\nselect pub_id, pub_name\nfrom publishers\nwhere pub_name="New Age Data"\n;\n\nmerge into GlobalSales\n(Item_number, Description, Quantity) as G\nusing DailySales as D\nON D.Item_number = G.Item_number\nwhen not matched\nthen\ninsert (Item_number, Description, Quantity )\nvalues (D.Item_number, D.Description, D.Quantity)\nwhen matched\nthen update set\nG.Quantity = G.Quantity + D.Quantity\n;\n\nupdate titles\nset total_sales = sales.total_sales + sales.qty\nfrom titles, salesdetail, sales\nwhere titles.title_id = salesdetail.title_id\nand salesdetail.stor_id = sales.stor_id\nand salesdetail.ord_num = sales.ord_num\nand sales.date in\n(select max (sales.date) from sales)\n;',
dbvteradata:
"-- teradata sample sql\n\nINSERT INTO promotion (deptnum, empname, yearsexp)\nSELECT deptno, name, yrsexp\nFROM employee\nWHERE yrsexp > 10 ;\n\n-- Teradata Database interprets t1 in the scalar subquery as a distinct instance of\n-- t1 rather than as the target table t1 of the insert operation\n/* not supported yet\nINSERT INTO t1 VALUES (1,2,3 (SELECT d2\nFROM t2\nWHERE a2=t1.a1));\n*/\n\n\nINSERT INTO t1\nSELECT a, PERIOD(b, c)\nFROM t2;\n\nUSING (empno INTEGER,\nname VARCHAR(50),\nsalary INTEGER)\nMERGE INTO employee AS t\nUSING VALUES (:empno, :name, :salary) AS s(empno, name, salary)\nON t.empno=s.empno\nWHEN MATCHED THEN UPDATE\nSET salary=s.salary\nWHEN NOT MATCHED THEN INSERT (empno, name, salary)\nVALUES (s.empno, s.name, s.salary);\n\nUSING (empno INTEGER,\nname VARCHAR(50),\nsalary INTEGER)\nUPDATE employee\nSET salary=:salary\nWHERE empno=:empno\nELSE INSERT INTO employee\n(empno, name, salary) VALUES ( :empno, :name, :salary);\n\nUSING (empno INTEGER,\nsalary INTEGER)\nMERGE INTO employee AS t\nUSING (SELECT :empno, :salary, name\nFROM names\nWHERE empno=:empno) AS s(empno, salary, name)\nON t.empno=s.empno\nWHEN MATCHED THEN UPDATE\nSET salary=s.salary, name = s.name\nWHEN NOT MATCHED THEN INSERT (empno, name, salary)\nVALUES (s.empno, s.name, s.salary);\n\nMERGE INTO emp\nUSING VALUES (100, 'cc', 200, 3333) AS emp1 (empnum, name,\ndeptno, sal)\nON emp1.empnum=emp.s_no\nWHEN MATCHED THEN\nUPDATE SET sal=DEFAULT\nWHEN NOT MATCHED THEN\nINSERT VALUES (emp1.empnum, emp1.name, emp1.deptno, emp1.sal);\nMERGE INTO emp\nUSING VALUES (100, 'cc', 200, 3333) AS emp1 (empnum, name,\ndeptno, sal)\nON emp1.empnum=emp.s_no\nWHEN MATCHED THEN\nUPDATE SET sal=DEFAULT(emp.sal)\nWHEN NOT MATCHED THEN\nINSERT VALUES (emp1.empnum, emp1.name, emp1.deptno, emp1.sal)\n;\n\nMERGE INTO t1\nUSING t2\nON t1.x1=t2.z2 AND t1.y1=t2.y2\nWHEN MATCHED THEN\nUPDATE SET z1=10\nWHEN NOT MATCHED THEN\nINSERT (z2,y2,x2);\n\nMERGE INTO t1\nUSING (SELECT *\nFROM t2\nWHERE y2=10) AS s\nON x1=10\nWHEN MATCHED THEN\nUPDATE SET z1=z2\nWHEN NOT MATCHED THEN\nINSERT (x2, y2, z2);\n\nUPDATE sales_sum_table AS ss\nSET total_sales = (SELECT sum(amount)\nFROM sales_table AS s\nWHERE s.day_of_sale BETWEEN ss.period_start\nAND ss.period_end);\n\n-- The following UPDATE request updates the NoPI table nopi012_t1 aliased as t1.\nUPDATE t1\nFROM nopi012_t1 AS t1, nopi012_t2 AS t2\nSET c3 = t1.c3 * 1.05\nWHERE t1.c2 = t2.c2;",
dbvvertica: `-- hp vertica sample sql\n\nCREATE VIEW myview AS\nSELECT SUM(annual_income), customer_state\nFROM public.customer_dimension\nWHERE customer_key IN\n(SELECT customer_key\nFROM store.store_sales_fact)\nGROUP BY customer_state\nORDER BY customer_state ASC;\n\nINSERT INTO t1 (col1, col2) (SELECT 'abc', mycolumn FROM mytable);\n\nMERGE INTO t USING s ON (t.c1 = s.c1)\nWHEN NOT MATCHED THEN INSERT (c1, c2) VALUES (s.c1, s.c2);\n\n-- First WITH clause, regional_sales\nWITH\nregional_sales AS (\nSELECT region, SUM(amount) AS total_sales\nFROM orders\nGROUP BY region),\n-- Second WITH clause, top_regions\ntop_regions AS (\nSELECT region\nFROM regional_sales\nWHERE total_sales > (SELECT SUM (total_sales)/10 FROM regional_sales) )\n-- End of WITH clause definitions\n-- Begin main query\nSELECT region,\nproduct,\nSUM(quantity) AS product_units,\nSUM(amount) AS product_sales\nFROM orders\nWHERE region IN (SELECT region FROM top_regions)\n;`,
}
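
For context, here is a minimal sketch of how a per-dialect sample map like the one above might be looked up on the browser side. The `dbv` + vendor-name key convention is taken from the keys above; the `samples` parameter name, the `getSampleSql` helper, and the empty-string fallback are illustrative assumptions, not code from this repository:

```javascript
// Minimal sketch (assumption): resolve a bundled sample SQL script by
// database vendor. Keys in the map above follow a "dbv" + vendor-name
// convention, e.g. "dbvoracle", "dbvsnowflake".
function getSampleSql(samples, vendor) {
  const key = "dbv" + vendor.toLowerCase();
  // Return an empty string when no sample is bundled for the vendor
  // (the "dbvmdx" entry above is only a placeholder).
  return Object.prototype.hasOwnProperty.call(samples, key) ? samples[key] : "";
}

// Usage (assumption): pre-fill an editor pane with the Snowflake sample,
// given the map above in scope as `samples` and a code-editor widget `editor`.
// editor.setValue(getSampleSql(samples, "snowflake"));
```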
