first commit

@@ -0,0 +1,26 @@
# Compiled class file
*.class

# Log file
*.log

# BlueJ files
*.ctxt

# Mobile Tools for Java (J2ME)
.mtj.tmp/

# Package Files #
*.jar
*.war
*.nar
*.ear
*.zip
*.tar.gz
*.rar

.DS_Store

# virtual machine crash logs, see http://www.java.com/en/download/help/error_hotspot.xml
hs_err_pid*

@ -0,0 +1,201 @@
|
|||
Apache License
|
||||
Version 2.0, January 2004
|
||||
http://www.apache.org/licenses/
|
||||
|
||||
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
|
||||
|
||||
1. Definitions.
|
||||
|
||||
"License" shall mean the terms and conditions for use, reproduction,
|
||||
and distribution as defined by Sections 1 through 9 of this document.
|
||||
|
||||
"Licensor" shall mean the copyright owner or entity authorized by
|
||||
the copyright owner that is granting the License.
|
||||
|
||||
"Legal Entity" shall mean the union of the acting entity and all
|
||||
other entities that control, are controlled by, or are under common
|
||||
control with that entity. For the purposes of this definition,
|
||||
"control" means (i) the power, direct or indirect, to cause the
|
||||
direction or management of such entity, whether by contract or
|
||||
otherwise, or (ii) ownership of fifty percent (50%) or more of the
|
||||
outstanding shares, or (iii) beneficial ownership of such entity.
|
||||
|
||||
"You" (or "Your") shall mean an individual or Legal Entity
|
||||
exercising permissions granted by this License.
|
||||
|
||||
"Source" form shall mean the preferred form for making modifications,
|
||||
including but not limited to software source code, documentation
|
||||
source, and configuration files.
|
||||
|
||||
"Object" form shall mean any form resulting from mechanical
|
||||
transformation or translation of a Source form, including but
|
||||
not limited to compiled object code, generated documentation,
|
||||
and conversions to other media types.
|
||||
|
||||
"Work" shall mean the work of authorship, whether in Source or
|
||||
Object form, made available under the License, as indicated by a
|
||||
copyright notice that is included in or attached to the work
|
||||
(an example is provided in the Appendix below).
|
||||
|
||||
"Derivative Works" shall mean any work, whether in Source or Object
|
||||
form, that is based on (or derived from) the Work and for which the
|
||||
editorial revisions, annotations, elaborations, or other modifications
|
||||
represent, as a whole, an original work of authorship. For the purposes
|
||||
of this License, Derivative Works shall not include works that remain
|
||||
separable from, or merely link (or bind by name) to the interfaces of,
|
||||
the Work and Derivative Works thereof.
|
||||
|
||||
"Contribution" shall mean any work of authorship, including
|
||||
the original version of the Work and any modifications or additions
|
||||
to that Work or Derivative Works thereof, that is intentionally
|
||||
submitted to Licensor for inclusion in the Work by the copyright owner
|
||||
or by an individual or Legal Entity authorized to submit on behalf of
|
||||
the copyright owner. For the purposes of this definition, "submitted"
|
||||
means any form of electronic, verbal, or written communication sent
|
||||
to the Licensor or its representatives, including but not limited to
|
||||
communication on electronic mailing lists, source code control systems,
|
||||
and issue tracking systems that are managed by, or on behalf of, the
|
||||
Licensor for the purpose of discussing and improving the Work, but
|
||||
excluding communication that is conspicuously marked or otherwise
|
||||
designated in writing by the copyright owner as "Not a Contribution."
|
||||
|
||||
"Contributor" shall mean Licensor and any individual or Legal Entity
|
||||
on behalf of whom a Contribution has been received by Licensor and
|
||||
subsequently incorporated within the Work.
|
||||
|
||||
2. Grant of Copyright License. Subject to the terms and conditions of
|
||||
this License, each Contributor hereby grants to You a perpetual,
|
||||
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
||||
copyright license to reproduce, prepare Derivative Works of,
|
||||
publicly display, publicly perform, sublicense, and distribute the
|
||||
Work and such Derivative Works in Source or Object form.
|
||||
|
||||
3. Grant of Patent License. Subject to the terms and conditions of
|
||||
this License, each Contributor hereby grants to You a perpetual,
|
||||
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
||||
(except as stated in this section) patent license to make, have made,
|
||||
use, offer to sell, sell, import, and otherwise transfer the Work,
|
||||
where such license applies only to those patent claims licensable
|
||||
by such Contributor that are necessarily infringed by their
|
||||
Contribution(s) alone or by combination of their Contribution(s)
|
||||
with the Work to which such Contribution(s) was submitted. If You
|
||||
institute patent litigation against any entity (including a
|
||||
cross-claim or counterclaim in a lawsuit) alleging that the Work
|
||||
or a Contribution incorporated within the Work constitutes direct
|
||||
or contributory patent infringement, then any patent licenses
|
||||
granted to You under this License for that Work shall terminate
|
||||
as of the date such litigation is filed.
|
||||
|
||||
4. Redistribution. You may reproduce and distribute copies of the
|
||||
Work or Derivative Works thereof in any medium, with or without
|
||||
modifications, and in Source or Object form, provided that You
|
||||
meet the following conditions:
|
||||
|
||||
(a) You must give any other recipients of the Work or
|
||||
Derivative Works a copy of this License; and
|
||||
|
||||
(b) You must cause any modified files to carry prominent notices
|
||||
stating that You changed the files; and
|
||||
|
||||
(c) You must retain, in the Source form of any Derivative Works
|
||||
that You distribute, all copyright, patent, trademark, and
|
||||
attribution notices from the Source form of the Work,
|
||||
excluding those notices that do not pertain to any part of
|
||||
the Derivative Works; and
|
||||
|
||||
(d) If the Work includes a "NOTICE" text file as part of its
|
||||
distribution, then any Derivative Works that You distribute must
|
||||
include a readable copy of the attribution notices contained
|
||||
within such NOTICE file, excluding those notices that do not
|
||||
pertain to any part of the Derivative Works, in at least one
|
||||
of the following places: within a NOTICE text file distributed
|
||||
as part of the Derivative Works; within the Source form or
|
||||
documentation, if provided along with the Derivative Works; or,
|
||||
within a display generated by the Derivative Works, if and
|
||||
wherever such third-party notices normally appear. The contents
|
||||
of the NOTICE file are for informational purposes only and
|
||||
do not modify the License. You may add Your own attribution
|
||||
notices within Derivative Works that You distribute, alongside
|
||||
or as an addendum to the NOTICE text from the Work, provided
|
||||
that such additional attribution notices cannot be construed
|
||||
as modifying the License.
|
||||
|
||||
You may add Your own copyright statement to Your modifications and
|
||||
may provide additional or different license terms and conditions
|
||||
for use, reproduction, or distribution of Your modifications, or
|
||||
for any such Derivative Works as a whole, provided Your use,
|
||||
reproduction, and distribution of the Work otherwise complies with
|
||||
the conditions stated in this License.
|
||||
|
||||
5. Submission of Contributions. Unless You explicitly state otherwise,
|
||||
any Contribution intentionally submitted for inclusion in the Work
|
||||
by You to the Licensor shall be under the terms and conditions of
|
||||
this License, without any additional terms or conditions.
|
||||
Notwithstanding the above, nothing herein shall supersede or modify
|
||||
the terms of any separate license agreement you may have executed
|
||||
with Licensor regarding such Contributions.
|
||||
|
||||
6. Trademarks. This License does not grant permission to use the trade
|
||||
names, trademarks, service marks, or product names of the Licensor,
|
||||
except as required for reasonable and customary use in describing the
|
||||
origin of the Work and reproducing the content of the NOTICE file.
|
||||
|
||||
7. Disclaimer of Warranty. Unless required by applicable law or
|
||||
agreed to in writing, Licensor provides the Work (and each
|
||||
Contributor provides its Contributions) on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
implied, including, without limitation, any warranties or conditions
|
||||
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
|
||||
PARTICULAR PURPOSE. You are solely responsible for determining the
|
||||
appropriateness of using or redistributing the Work and assume any
|
||||
risks associated with Your exercise of permissions under this License.
|
||||
|
||||
8. Limitation of Liability. In no event and under no legal theory,
|
||||
whether in tort (including negligence), contract, or otherwise,
|
||||
unless required by applicable law (such as deliberate and grossly
|
||||
negligent acts) or agreed to in writing, shall any Contributor be
|
||||
liable to You for damages, including any direct, indirect, special,
|
||||
incidental, or consequential damages of any character arising as a
|
||||
result of this License or out of the use or inability to use the
|
||||
Work (including but not limited to damages for loss of goodwill,
|
||||
work stoppage, computer failure or malfunction, or any and all
|
||||
other commercial damages or losses), even if such Contributor
|
||||
has been advised of the possibility of such damages.
|
||||
|
||||
9. Accepting Warranty or Additional Liability. While redistributing
|
||||
the Work or Derivative Works thereof, You may choose to offer,
|
||||
and charge a fee for, acceptance of support, warranty, indemnity,
|
||||
or other liability obligations and/or rights consistent with this
|
||||
License. However, in accepting such obligations, You may act only
|
||||
on Your own behalf and on Your sole responsibility, not on behalf
|
||||
of any other Contributor, and only if You agree to indemnify,
|
||||
defend, and hold each Contributor harmless for any liability
|
||||
incurred by, or claims asserted against, such Contributor by reason
|
||||
of your accepting any such warranty or additional liability.
|
||||
|
||||
END OF TERMS AND CONDITIONS
|
||||
|
||||
APPENDIX: How to apply the Apache License to your work.
|
||||
|
||||
To apply the Apache License to your work, attach the following
|
||||
boilerplate notice, with the fields enclosed by brackets "[]"
|
||||
replaced with your own identifying information. (Don't include
|
||||
the brackets!) The text should be enclosed in the appropriate
|
||||
comment syntax for the file format. We also recommend that a
|
||||
file or class name and description of purpose be included on the
|
||||
same "printed page" as the copyright notice for easier
|
||||
identification within third-party archives.
|
||||
|
||||
Copyright [yyyy] [name of copyright owner]
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License");
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
|
|
@@ -0,0 +1,52 @@
## [SQLFlow](https://sqlflow.gudusoft.com) - A tool that tracks column-level data lineage

Track column-level data lineage for [more than 20 major databases](/databases/readme.md), including
Snowflake, Hive, SparkSQL, Teradata, Oracle, SQL Server, AWS Redshift, BigQuery, etc.

Build and visualize lineage from SQL scripts taken from query history, ETL scripts,
GitHub/Bitbucket, the local filesystem, and remote databases.

Explore lineage [using an interactive diagram](https://sqlflow.gudusoft.com) or programmatically using the [RESTful APIs](/api) or [SDKs](https://www.gudusoft.com/sqlflow-java-library-2/).

Discover the data lineage in this query:
```sql
insert into emp (id,first_name,last_name,city,postal_code,ph)
select a.id,a.first_name,a.last_name,a.city,a.postal_code,b.ph
from emp_addr a
inner join emp_ph b on a.id = b.id;
```

SQLFlow presents a clean graph that tells you
where the data came from, what transformations it underwent along the way,
and what other data items are derived from it.

[](https://sqlflow.gudusoft.com)

### What SQLFlow can do for you
- Scan your database and discover the data lineage instantly.
- Automatically collect SQL scripts from GitHub/Bitbucket or the local file system.
- Provide a clean diagram that lets end users understand the data lineage quickly.
- Retrieve lineage programmatically in CSV, JSON, or GraphML format via the [RESTful APIs](/api) or [SDKs](https://www.gudusoft.com/sqlflow-java-library-2/).
- Incorporate the lineage metadata decoded from complex SQL scripts into your own metadata database for further processing.
- Visualize the metadata already existing in your database to release the power of data.
- Perform impact analysis and root-cause analysis by tracing lineage backwards or forwards with a few mouse clicks.
- Process SQL scripts from more than 20 major database vendors.

### How to use SQLFlow
- Open [the official website](https://gudusoft.com/sqlflow/#/) of SQLFlow and paste your SQL script or metadata to get a clean lineage diagram.
- Call the [RESTful API](/api) of SQLFlow from your own code to get the data lineage metadata that SQLFlow decodes from your SQL scripts (see the sketch below).
- The [on-premise version](https://github.com/sqlparser/sqlflow_public/blob/master/install_sqlflow.md) of SQLFlow runs on your own server to keep your data safe.
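As a minimal sketch of the API route (a trimmed-down variant of the Python example included later in this repository; the endpoints, headers, form fields, and the `dbv` vendor prefix are taken from that example, and your own userId/secretKey must be substituted):

```python
# Minimal sketch, not the official client: exchange userId/secretKey for a token,
# then submit one SQL statement and read the lineage back as JSON.
import requests

API = "https://api.gudusoft.com/gspLive_backend"
HEADERS = {"Request-Origion": "testClientDemo", "accept": "application/json;charset=utf-8"}
user_id = "YOUR USER ID"        # see the premium-account link above for credentials
secret_key = "YOUR SECRET KEY"

# 1. get a token
token = requests.post(
    f"{API}/user/generateToken", headers=HEADERS,
    data={"userId": user_id, "secretKey": secret_key},
).json()["token"]

# 2. analyze one statement ("dbvoracle" = Oracle; vendor names carry a "dbv" prefix)
lineage = requests.post(
    f"{API}/sqlflow/generation/sqlflow?showRelationType=fdd", headers=HEADERS,
    data={"dbvendor": "dbvoracle",
          "sqltext": "insert into t2 select * from t1;",
          "userId": user_id,
          "token": token},
).json()
print(lineage)
```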
### RESTful APIs
- [SQLFlow API document](https://github.com/sqlparser/sqlflow_public/blob/master/api/sqlflow_api.md)
- [Client in C#](https://github.com/sqlparser/sqlflow_public/tree/master/api/client/csharp)

### SQLFlow architecture
- [Architecture document](sqlflow_architecture.md)

### User manual and FAQ
- [User guide](sqlflow_guide.md)
- [SQLFlow FAQ](sqlflow_faq.md)

@@ -0,0 +1,50 @@
## 1. What is SQLFlow

In a database, the data in a view comes from tables or other views, and a column in a view may be an aggregation of several columns from several tables.
Table data may in turn be imported from external systems through ETL. The path that data travels from its source, through each processing step, to its final destination is called [data lineage](https://en.wikipedia.org/wiki/Data_lineage).

[SQLFlow](https://sqlflow.gudusoft.com/) produces the complete data lineage by analyzing the definitions of database objects (DDL), DML statements, stored procedures and functions used in ETL/ELT,
triggers, and other SQL scripts.


In a large data warehouse, complete data lineage can be used for data provenance, impact analysis of table and column changes, proof of data compliance, data quality checks, and more.

For example, you may ask which subsystems (purchasing, production, sales, and so on) contribute the data behind a figure in a financial report.
When the tables and columns of one subsystem (say, the sales subsystem) change, which other subsystems are affected?
Do the tables and columns of the financial reporting subsystem need to change accordingly?

SQLFlow helps you answer these questions by presenting these relationships as a visual diagram, giving you a clear picture of how data flows through your organization's IT systems.



## 2. How SQLFlow works

1. Collect SQL scripts from databases, version control systems, and file systems.
2. Parse the SQL scripts, analyze the relationships among the database objects in them, and build the data lineage.
3. Present the data lineage in various forms, including an interactive UI and CSV, JSON, and GRAPHML formats.

## 3. What SQLFlow consists of

1. Backend: a set of Java programs responsible for SQL parsing, data lineage analysis, layout of the visual elements, authentication, and so on.
2. Frontend: a set of JavaScript and HTML code responsible for submitting SQL and visualizing the data lineage.
3. [Grabit tool](https://www.gudusoft.com/grabit/): a Java program that collects SQL scripts from databases, version control systems, and file systems and submits them to the backend for data lineage analysis.
4. [Restful API](https://github.com/sqlparser/sqlflow_public/tree/master/api): a complete set of APIs that lets users interact with the backend from languages such as Java, C#, Python, and PHP to perform data lineage analysis.



## 4. Using SQLFlow

1. Open [the SQLFlow frontend](https://sqlflow.gudusoft.com/) in a browser.
2. Upload SQL text or files in the browser.
3. Click the analyze button and view the visualized data lineage.
4. Interactively explore the complete lineage diagram of a specific table or view in the browser.
5. Use the grabit tool or the API to submit the SQL files to be processed, then view the results in the browser or post-process the returned results in your own code.

## 5. Limitations of SQLFlow

SQLFlow derives data lineage only by analyzing SQL scripts, including stored procedures, functions, and triggers.
ETL pipelines, however, often involve many other technologies and tools, and the lineage they produce is currently invisible to SQLFlow.

## 6. Learn more about SQLFlow
1. Supports as many as 21 major databases
2. [Architecture document](sqlflow_architecture.md)

@ -0,0 +1,344 @@
|
|||
## Ignore Visual Studio temporary files, build results, and
|
||||
## files generated by popular Visual Studio add-ons.
|
||||
##
|
||||
## Get latest from https://github.com/github/gitignore/blob/master/VisualStudio.gitignore
|
||||
|
||||
# User-specific files
|
||||
*.rsuser
|
||||
*.suo
|
||||
*.user
|
||||
*.userosscache
|
||||
*.sln.docstates
|
||||
|
||||
# User-specific files (MonoDevelop/Xamarin Studio)
|
||||
*.userprefs
|
||||
|
||||
# Build results
|
||||
[Dd]ebug/
|
||||
[Dd]ebugPublic/
|
||||
[Rr]elease/
|
||||
[Rr]eleases/
|
||||
x64/
|
||||
x86/
|
||||
[Aa][Rr][Mm]/
|
||||
[Aa][Rr][Mm]64/
|
||||
bld/
|
||||
[Bb]in/
|
||||
[Oo]bj/
|
||||
[Ll]og/
|
||||
|
||||
# Visual Studio 2015/2017 cache/options directory
|
||||
.vs/
|
||||
# Uncomment if you have tasks that create the project's static files in wwwroot
|
||||
#wwwroot/
|
||||
|
||||
# Visual Studio 2017 auto generated files
|
||||
Generated\ Files/
|
||||
|
||||
# MSTest test Results
|
||||
[Tt]est[Rr]esult*/
|
||||
[Bb]uild[Ll]og.*
|
||||
|
||||
# NUNIT
|
||||
*.VisualState.xml
|
||||
TestResult.xml
|
||||
|
||||
# Build Results of an ATL Project
|
||||
[Dd]ebugPS/
|
||||
[Rr]eleasePS/
|
||||
dlldata.c
|
||||
|
||||
# Benchmark Results
|
||||
BenchmarkDotNet.Artifacts/
|
||||
|
||||
# .NET Core
|
||||
project.lock.json
|
||||
project.fragment.lock.json
|
||||
artifacts/
|
||||
|
||||
# StyleCop
|
||||
StyleCopReport.xml
|
||||
|
||||
# Files built by Visual Studio
|
||||
*_i.c
|
||||
*_p.c
|
||||
*_h.h
|
||||
*.ilk
|
||||
*.meta
|
||||
*.obj
|
||||
*.iobj
|
||||
*.pch
|
||||
*.pdb
|
||||
*.ipdb
|
||||
*.pgc
|
||||
*.pgd
|
||||
*.rsp
|
||||
*.sbr
|
||||
*.tlb
|
||||
*.tli
|
||||
*.tlh
|
||||
*.tmp
|
||||
*.tmp_proj
|
||||
*_wpftmp.csproj
|
||||
*.log
|
||||
*.vspscc
|
||||
*.vssscc
|
||||
.builds
|
||||
*.pidb
|
||||
*.svclog
|
||||
*.scc
|
||||
|
||||
# Chutzpah Test files
|
||||
_Chutzpah*
|
||||
|
||||
# Visual C++ cache files
|
||||
ipch/
|
||||
*.aps
|
||||
*.ncb
|
||||
*.opendb
|
||||
*.opensdf
|
||||
*.sdf
|
||||
*.cachefile
|
||||
*.VC.db
|
||||
*.VC.VC.opendb
|
||||
|
||||
# Visual Studio profiler
|
||||
*.psess
|
||||
*.vsp
|
||||
*.vspx
|
||||
*.sap
|
||||
|
||||
# Visual Studio Trace Files
|
||||
*.e2e
|
||||
|
||||
# TFS 2012 Local Workspace
|
||||
$tf/
|
||||
|
||||
# Guidance Automation Toolkit
|
||||
*.gpState
|
||||
|
||||
# ReSharper is a .NET coding add-in
|
||||
_ReSharper*/
|
||||
*.[Rr]e[Ss]harper
|
||||
*.DotSettings.user
|
||||
|
||||
# JustCode is a .NET coding add-in
|
||||
.JustCode
|
||||
|
||||
# TeamCity is a build add-in
|
||||
_TeamCity*
|
||||
|
||||
# DotCover is a Code Coverage Tool
|
||||
*.dotCover
|
||||
|
||||
# AxoCover is a Code Coverage Tool
|
||||
.axoCover/*
|
||||
!.axoCover/settings.json
|
||||
|
||||
# Visual Studio code coverage results
|
||||
*.coverage
|
||||
*.coveragexml
|
||||
|
||||
# NCrunch
|
||||
_NCrunch_*
|
||||
.*crunch*.local.xml
|
||||
nCrunchTemp_*
|
||||
|
||||
# MightyMoose
|
||||
*.mm.*
|
||||
AutoTest.Net/
|
||||
|
||||
# Web workbench (sass)
|
||||
.sass-cache/
|
||||
|
||||
# Installshield output folder
|
||||
[Ee]xpress/
|
||||
|
||||
# DocProject is a documentation generator add-in
|
||||
DocProject/buildhelp/
|
||||
DocProject/Help/*.HxT
|
||||
DocProject/Help/*.HxC
|
||||
DocProject/Help/*.hhc
|
||||
DocProject/Help/*.hhk
|
||||
DocProject/Help/*.hhp
|
||||
DocProject/Help/Html2
|
||||
DocProject/Help/html
|
||||
|
||||
# Click-Once directory
|
||||
publish/
|
||||
|
||||
# Publish Web Output
|
||||
*.[Pp]ublish.xml
|
||||
*.azurePubxml
|
||||
# Note: Comment the next line if you want to checkin your web deploy settings,
|
||||
# but database connection strings (with potential passwords) will be unencrypted
|
||||
*.pubxml
|
||||
*.publishproj
|
||||
!linux.pubxml
|
||||
!osx.pubxml
|
||||
!win.pubxml
|
||||
|
||||
# Microsoft Azure Web App publish settings. Comment the next line if you want to
|
||||
# checkin your Azure Web App publish settings, but sensitive information contained
|
||||
# in these scripts will be unencrypted
|
||||
PublishScripts/
|
||||
|
||||
# NuGet Packages
|
||||
*.nupkg
|
||||
# The packages folder can be ignored because of Package Restore
|
||||
**/[Pp]ackages/*
|
||||
# except build/, which is used as an MSBuild target.
|
||||
!**/[Pp]ackages/build/
|
||||
# Uncomment if necessary however generally it will be regenerated when needed
|
||||
#!**/[Pp]ackages/repositories.config
|
||||
# NuGet v3's project.json files produces more ignorable files
|
||||
*.nuget.props
|
||||
*.nuget.targets
|
||||
|
||||
# Microsoft Azure Build Output
|
||||
csx/
|
||||
*.build.csdef
|
||||
|
||||
# Microsoft Azure Emulator
|
||||
ecf/
|
||||
rcf/
|
||||
|
||||
# Windows Store app package directories and files
|
||||
AppPackages/
|
||||
BundleArtifacts/
|
||||
Package.StoreAssociation.xml
|
||||
_pkginfo.txt
|
||||
*.appx
|
||||
|
||||
# Visual Studio cache files
|
||||
# files ending in .cache can be ignored
|
||||
*.[Cc]ache
|
||||
# but keep track of directories ending in .cache
|
||||
!?*.[Cc]ache/
|
||||
|
||||
# Others
|
||||
ClientBin/
|
||||
~$*
|
||||
*~
|
||||
*.dbmdl
|
||||
*.dbproj.schemaview
|
||||
*.jfm
|
||||
*.pfx
|
||||
*.publishsettings
|
||||
orleans.codegen.cs
|
||||
|
||||
# Including strong name files can present a security risk
|
||||
# (https://github.com/github/gitignore/pull/2483#issue-259490424)
|
||||
#*.snk
|
||||
|
||||
# Since there are multiple workflows, uncomment next line to ignore bower_components
|
||||
# (https://github.com/github/gitignore/pull/1529#issuecomment-104372622)
|
||||
#bower_components/
|
||||
# ASP.NET Core default setup: bower directory is configured as wwwroot/lib/ and bower restore is true
|
||||
**/wwwroot/lib/
|
||||
|
||||
# RIA/Silverlight projects
|
||||
Generated_Code/
|
||||
|
||||
# Backup & report files from converting an old project file
|
||||
# to a newer Visual Studio version. Backup files are not needed,
|
||||
# because we have git ;-)
|
||||
_UpgradeReport_Files/
|
||||
Backup*/
|
||||
UpgradeLog*.XML
|
||||
UpgradeLog*.htm
|
||||
ServiceFabricBackup/
|
||||
*.rptproj.bak
|
||||
|
||||
# SQL Server files
|
||||
*.mdf
|
||||
*.ldf
|
||||
*.ndf
|
||||
|
||||
# Business Intelligence projects
|
||||
*.rdl.data
|
||||
*.bim.layout
|
||||
*.bim_*.settings
|
||||
*.rptproj.rsuser
|
||||
|
||||
# Microsoft Fakes
|
||||
FakesAssemblies/
|
||||
|
||||
# GhostDoc plugin setting file
|
||||
*.GhostDoc.xml
|
||||
|
||||
# Node.js Tools for Visual Studio
|
||||
.ntvs_analysis.dat
|
||||
node_modules/
|
||||
|
||||
# Visual Studio 6 build log
|
||||
*.plg
|
||||
|
||||
# Visual Studio 6 workspace options file
|
||||
*.opt
|
||||
|
||||
# Visual Studio 6 auto-generated workspace file (contains which files were open etc.)
|
||||
*.vbw
|
||||
|
||||
# Visual Studio LightSwitch build output
|
||||
**/*.HTMLClient/GeneratedArtifacts
|
||||
**/*.DesktopClient/GeneratedArtifacts
|
||||
**/*.DesktopClient/ModelManifest.xml
|
||||
**/*.Server/GeneratedArtifacts
|
||||
**/*.Server/ModelManifest.xml
|
||||
_Pvt_Extensions
|
||||
|
||||
# Paket dependency manager
|
||||
.paket/paket.exe
|
||||
paket-files/
|
||||
|
||||
# FAKE - F# Make
|
||||
.fake/
|
||||
|
||||
# JetBrains Rider
|
||||
.idea/
|
||||
*.sln.iml
|
||||
|
||||
# CodeRush personal settings
|
||||
.cr/personal
|
||||
|
||||
# Python Tools for Visual Studio (PTVS)
|
||||
__pycache__/
|
||||
*.pyc
|
||||
|
||||
# Cake - Uncomment if you are using it
|
||||
# tools/**
|
||||
# !tools/packages.config
|
||||
|
||||
# Tabs Studio
|
||||
*.tss
|
||||
|
||||
# Telerik's JustMock configuration file
|
||||
*.jmconfig
|
||||
|
||||
# BizTalk build output
|
||||
*.btp.cs
|
||||
*.btm.cs
|
||||
*.odx.cs
|
||||
*.xsd.cs
|
||||
|
||||
# OpenCover UI analysis results
|
||||
OpenCover/
|
||||
|
||||
# Azure Stream Analytics local run output
|
||||
ASALocalRun/
|
||||
|
||||
# MSBuild Binary and Structured Log
|
||||
*.binlog
|
||||
|
||||
# NVidia Nsight GPU debugger configuration file
|
||||
*.nvuser
|
||||
|
||||
# MFractors (Xamarin productivity tool) working folder
|
||||
.mfractor/
|
||||
|
||||
# Local History for Visual Studio
|
||||
.localhistory/
|
||||
|
||||
# BeatPulse healthcheck temp database
|
||||
healthchecksdb
|
||||
|
|
@ -0,0 +1,25 @@
|
|||
|
||||
Microsoft Visual Studio Solution File, Format Version 12.00
|
||||
# Visual Studio Version 16
|
||||
VisualStudioVersion = 16.0.29503.13
|
||||
MinimumVisualStudioVersion = 10.0.40219.1
|
||||
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "SQLFlowClient", "SQLFlowClient\SQLFlowClient.csproj", "{8F80B6E9-F33B-4936-8111-48A9BCA9AEDC}"
|
||||
EndProject
|
||||
Global
|
||||
GlobalSection(SolutionConfigurationPlatforms) = preSolution
|
||||
Debug|Any CPU = Debug|Any CPU
|
||||
Release|Any CPU = Release|Any CPU
|
||||
EndGlobalSection
|
||||
GlobalSection(ProjectConfigurationPlatforms) = postSolution
|
||||
{8F80B6E9-F33B-4936-8111-48A9BCA9AEDC}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
|
||||
{8F80B6E9-F33B-4936-8111-48A9BCA9AEDC}.Debug|Any CPU.Build.0 = Debug|Any CPU
|
||||
{8F80B6E9-F33B-4936-8111-48A9BCA9AEDC}.Release|Any CPU.ActiveCfg = Release|Any CPU
|
||||
{8F80B6E9-F33B-4936-8111-48A9BCA9AEDC}.Release|Any CPU.Build.0 = Release|Any CPU
|
||||
EndGlobalSection
|
||||
GlobalSection(SolutionProperties) = preSolution
|
||||
HideSolutionNode = FALSE
|
||||
EndGlobalSection
|
||||
GlobalSection(ExtensibilityGlobals) = postSolution
|
||||
SolutionGuid = {3C9B0DCD-4A60-4E0C-9A35-7211C074B0D1}
|
||||
EndGlobalSection
|
||||
EndGlobal
|
||||
|
|
@ -0,0 +1 @@
|
|||
dist
|
||||
|
|
@ -0,0 +1,14 @@
|
|||
using System;
|
||||
using System.Collections.Generic;
|
||||
using System.Text;
|
||||
|
||||
namespace SQLFlowClient
|
||||
{
|
||||
class Config
|
||||
{
|
||||
public string Host { get; set; }
|
||||
public string Token { get; set; }
|
||||
public string SecretKey { get; set; }
|
||||
public string UserId { get; set; }
|
||||
}
|
||||
}
|
||||
|
|
@ -0,0 +1,31 @@
|
|||
using System;
|
||||
using System.Collections.Generic;
|
||||
using System.Text;
|
||||
using System.ComponentModel;
|
||||
|
||||
namespace SQLFlowClient
|
||||
{
|
||||
public enum DBVendor
|
||||
{
|
||||
bigquery,
|
||||
couchbase,
|
||||
db2,
|
||||
greenplum,
|
||||
hana,
|
||||
hive,
|
||||
impala,
|
||||
informix,
|
||||
mdx,
|
||||
mysql,
|
||||
netezza,
|
||||
openedge,
|
||||
oracle,
|
||||
postgresql,
|
||||
redshift,
|
||||
snowflake,
|
||||
mssql,
|
||||
sybase,
|
||||
teradata,
|
||||
vertica,
|
||||
}
|
||||
}
|
||||
|
|
@ -0,0 +1,246 @@
|
|||
using System;
|
||||
using System.Collections.Generic;
|
||||
using System.Net.Http;
|
||||
using System.Text;
|
||||
using System.Threading.Tasks;
|
||||
using System.Linq;
|
||||
using System.Net.Http.Headers;
|
||||
using Newtonsoft.Json;
|
||||
using Newtonsoft.Json.Linq;
|
||||
using System.IO;
|
||||
using System.Diagnostics;
|
||||
|
||||
namespace SQLFlowClient
|
||||
{
|
||||
public static class HttpService
|
||||
{
|
||||
private static Config config;
|
||||
|
||||
public static async Task Request(Options options)
|
||||
{
|
||||
config = new Config
|
||||
{
|
||||
Host = "https://api.gudusoft.com",
|
||||
Token = "",
|
||||
UserId = "gudu|0123456789",
|
||||
};
|
||||
try
|
||||
{
|
||||
if (File.Exists("./config.json"))
|
||||
{
|
||||
var json = JObject.Parse(File.ReadAllText("./config.json"));
|
||||
if (!string.IsNullOrWhiteSpace(json["Host"]?.ToString()))
|
||||
{
|
||||
config.Host = json["Host"].ToString();
|
||||
}
|
||||
if (!string.IsNullOrWhiteSpace(json["Token"]?.ToString()))
|
||||
{
|
||||
config.Token = json["Token"].ToString();
|
||||
}
|
||||
if (!string.IsNullOrWhiteSpace(json["SecretKey"]?.ToString()))
|
||||
{
|
||||
config.SecretKey = json["SecretKey"].ToString();
|
||||
}
|
||||
if (!string.IsNullOrWhiteSpace(json["UserId"]?.ToString()))
|
||||
{
|
||||
config.UserId = json["UserId"].ToString();
|
||||
}
|
||||
}
|
||||
}
|
||||
catch (Exception e)
|
||||
{
|
||||
Console.WriteLine($"Invalid config.json :\n{e.Message}");
|
||||
return;
|
||||
}
|
||||
//if (!string.IsNullOrWhiteSpace(options.Token))
|
||||
//{
|
||||
// config.Token = options.Token;
|
||||
//}
|
||||
//if (!string.IsNullOrWhiteSpace(options.UserId))
|
||||
//{
|
||||
// config.UserId = options.UserId;
|
||||
//}
|
||||
//if (!string.IsNullOrWhiteSpace(options.SecretKey))
|
||||
//{
|
||||
// config.SecretKey = options.SecretKey;
|
||||
//}
|
||||
if (options.Version)
|
||||
{
|
||||
await Version();
|
||||
}
|
||||
else
|
||||
{
|
||||
await SQLFlow(options);
|
||||
}
|
||||
}
|
||||
|
||||
public static async Task SQLFlow(Options options)
|
||||
{
|
||||
StreamContent sqlfile;
|
||||
if (options.SQLFile == null)
|
||||
{
|
||||
Console.WriteLine($"Please specify an input file. (e.g. SQLFlowClient test.sql)");
|
||||
return;
|
||||
}
|
||||
try
|
||||
{
|
||||
string path = Path.GetFullPath(options.SQLFile);
|
||||
sqlfile = new StreamContent(File.Open(options.SQLFile, FileMode.Open));
|
||||
}
|
||||
catch (Exception e)
|
||||
{
|
||||
Console.WriteLine($"Open file failed.\n{e.Message}");
|
||||
return;
|
||||
}
|
||||
var types = options.ShowRelationType.Split(",")
|
||||
.Where(p => Enum.GetNames(typeof(RelationType)).FirstOrDefault(t => t.ToLower() == p.ToLower()) == null)
|
||||
.ToList();
|
||||
if (types.Count != 0)
|
||||
{
|
||||
Console.WriteLine($"Wrong relation type : { string.Join(",", types) }.\nIt should be one or more from the following list : fdd, fdr, frd, fddi, join");
|
||||
return;
|
||||
}
|
||||
string dbvendor = Enum.GetNames(typeof(DBVendor)).FirstOrDefault(p => p.ToLower() == options.DBVendor.ToLower());
|
||||
if (dbvendor == null)
|
||||
{
|
||||
Console.WriteLine($"Wrong database vendor : {options.DBVendor}.\nIt should be one of the following list : " +
|
||||
$"bigquery, couchbase, db2, greenplum, hana , hive, impala , informix, mdx, mysql, netezza, openedge," +
|
||||
$" oracle, postgresql, redshift, snowflake, mssql, sybase, teradata, vertica");
|
||||
return;
|
||||
}
|
||||
if (!string.IsNullOrWhiteSpace(config.SecretKey) && !string.IsNullOrWhiteSpace(config.UserId))
|
||||
{
|
||||
// request token
|
||||
string url2 = $"{config.Host}/gspLive_backend/user/generateToken";
|
||||
using var client2 = new HttpClient();
|
||||
using var response2 = await client2.PostAsync(url2, content: new FormUrlEncodedContent(new List<KeyValuePair<string, string>>
|
||||
{
|
||||
new KeyValuePair<string, string>("userId", config.UserId),
|
||||
new KeyValuePair<string, string>("secretKey", config.SecretKey)
|
||||
}));
|
||||
if (response2.IsSuccessStatusCode)
|
||||
{
|
||||
var text = await response2.Content.ReadAsStringAsync();
|
||||
var jobject = JObject.Parse(text);
|
||||
var json = jobject.ToString();
|
||||
var code = jobject.SelectToken("code");
|
||||
if (code?.ToString() == "200")
|
||||
{
|
||||
config.Token = jobject.SelectToken("token").ToString();
|
||||
}
|
||||
else
|
||||
{
|
||||
Console.WriteLine($"{url2} error, code={code?.ToString() }");
|
||||
return;
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
Console.WriteLine($"Wrong response code {(int)response2.StatusCode} {response2.StatusCode}.url={url2}");
|
||||
return;
|
||||
}
|
||||
|
||||
}
|
||||
var form = new MultipartFormDataContent{
|
||||
{ sqlfile , "sqlfile" , "sqlfile" },
|
||||
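// the backend expects the vendor name prefixed with "dbv" (e.g. "dbvoracle"), so prepend it here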
{ new StringContent("dbv"+dbvendor) , "dbvendor" },
|
||||
{ new StringContent(options.ShowRelationType) , "showRelationType" },
|
||||
{ new StringContent(options.SimpleOutput.ToString()) , "simpleOutput" },
|
||||
{ new StringContent(options.IgnoreRecordSet.ToString()) , "ignoreRecordSet" },
|
||||
{ new StringContent(options.ignoreFunction.ToString()) , "ignoreFunction" },
|
||||
{ new StringContent(config.UserId) , "userId" },
|
||||
{ new StringContent(config.Token) , "token" },
|
||||
};
|
||||
try
|
||||
{
|
||||
var stopWatch = Stopwatch.StartNew();
|
||||
string url = $"{config.Host}/gspLive_backend/sqlflow/generation/sqlflow/" + (options.IsGraph ? "graph" : "");
|
||||
using var client = new HttpClient();
|
||||
// client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Token", config.Token);
|
||||
using var response = await client.PostAsync(url, form);
|
||||
if (response.IsSuccessStatusCode)
|
||||
{
|
||||
stopWatch.Stop();
|
||||
var text = await response.Content.ReadAsStringAsync();
|
||||
var result = new SQLFlowResult(text);
|
||||
if (result.data && result.dbobjs || result.data && result.sqlflow && result.graph)
|
||||
{
|
||||
if (options.Output != "")
|
||||
{
|
||||
try
|
||||
{
|
||||
File.WriteAllText(Path.GetFullPath(options.Output), result.json);
|
||||
Console.WriteLine($"Output has been saved to {options.Output}.");
|
||||
}
|
||||
catch (Exception e)
|
||||
{
|
||||
Console.WriteLine($"Save file failed. {e.Message}");
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
Console.WriteLine(result.json ?? "");
|
||||
}
|
||||
}
|
||||
if (result.error)
|
||||
{
|
||||
Console.WriteLine($"Success with some errors. Executed in {stopWatch.Elapsed.TotalSeconds.ToString("0.00")} seconds by host {config.Host}.");
|
||||
}
|
||||
else
|
||||
{
|
||||
Console.WriteLine($"Success. Executed in {stopWatch.Elapsed.TotalSeconds.ToString("0.00")} seconds by host {config.Host}.");
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
Console.WriteLine($"Wrong response code {(int)response.StatusCode} {response.StatusCode}.");
|
||||
}
|
||||
}
|
||||
catch (Exception e)
|
||||
{
|
||||
Console.WriteLine($"An unknown exception occurred:\n{e.Message}");
|
||||
}
|
||||
}
|
||||
|
||||
public static async Task Version()
|
||||
{
|
||||
try
|
||||
{
|
||||
string url = $"{config.Host}/gspLive_backend/version";
|
||||
using var client = new HttpClient();
|
||||
client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Token", config.Token);
|
||||
var form = new MultipartFormDataContent{
|
||||
{ new StringContent(config.UserId) , "userId" },
|
||||
};
|
||||
using var response = await client.PostAsync(url, form);
|
||||
if (response.IsSuccessStatusCode)
|
||||
{
|
||||
var text = await response.Content.ReadAsStringAsync();
|
||||
var json = JObject.Parse(text);
|
||||
var gsp = new
|
||||
{
|
||||
ReleaseDate = json.SelectToken("version.gsp.['release.date']")?.ToString(),
|
||||
version = json.SelectToken("version.gsp.version")?.ToString(),
|
||||
};
|
||||
var backend = new
|
||||
{
|
||||
ReleaseDate = json.SelectToken("version.backend.['release.date']")?.ToString(),
|
||||
version = json.SelectToken("version.backend.version")?.ToString(),
|
||||
};
|
||||
Console.WriteLine(" version release date");
|
||||
Console.WriteLine("SQLFlowClient 1.2.0 2020/12/13");
|
||||
Console.WriteLine($"gsp {gsp.version} {gsp.ReleaseDate}");
|
||||
Console.WriteLine($"backend {backend.version} {backend.ReleaseDate}");
|
||||
}
|
||||
else
|
||||
{
|
||||
Console.WriteLine($"Not connected. Wrong response code {(int)response.StatusCode} {response.StatusCode}.");
|
||||
}
|
||||
}
|
||||
catch (Exception e)
|
||||
{
|
||||
Console.WriteLine($"An unknown exception occurred:\n{e.Message}");
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
@ -0,0 +1,73 @@
|
|||
using System;
|
||||
using CommandLine;
|
||||
using CommandLine.Text;
|
||||
|
||||
namespace SQLFlowClient
|
||||
{
|
||||
public class Options
|
||||
{
|
||||
[Value(0, MetaName = "sqlfile", Required = false, HelpText = "Input sqlfile to be processed.")]
|
||||
public string SQLFile { get; set; }
|
||||
|
||||
[Option('g', "graph", Required = false, Default = false, HelpText = "Get the graph from sql.")]
|
||||
public bool IsGraph { get; set; }
|
||||
|
||||
[Option('v', "dbvendor", Required = false, Default = "oracle", HelpText = "Set the database of the sqlfile.")]
|
||||
public string DBVendor { get; set; }
|
||||
|
||||
[Option('r', "showRelationType", Required = false, Default = "fdd", HelpText = "Set the relation type.")]
|
||||
public string ShowRelationType { get; set; }
|
||||
|
||||
[Option('s', "simpleOutput", Required = false, Default = false, HelpText = "Set whether to get simple output.")]
|
||||
public bool SimpleOutput { get; set; }
|
||||
|
||||
[Option("ignoreRecordSet", Required = false, Default = false, HelpText = "Set whether to ignore record set.")]
|
||||
public bool IgnoreRecordSet { get; set; }
|
||||
|
||||
[Option("ignoreFunction", Required = false, Default = false, HelpText = "Set whether to ignore function.")]
|
||||
public bool ignoreFunction { get; set; }
|
||||
|
||||
[Option('o', "output", Required = false, Default = "", HelpText = "Save output as a file.")]
|
||||
public string Output { get; set; }
|
||||
|
||||
//[Option('t', "token", Required = false, Default = "", HelpText = "If userId and secretKey is given, token will be ignore, otherwise it will use token.")]
|
||||
//public string Token { get; set; }
|
||||
|
||||
//[Option('u', "userId", Required = false, Default = "", HelpText = "")]
|
||||
//public string UserId { get; set; }
|
||||
|
||||
//[Option('k', "secretKey", Required = false, Default = "", HelpText = "")]
|
||||
//public string SecretKey { get; set; }
|
||||
|
||||
[Option("version", Required = false, Default = false, HelpText = "Show version.")]
|
||||
public bool Version { get; set; }
|
||||
}
|
||||
|
||||
class Program
|
||||
{
|
||||
static void Main(string[] args)
|
||||
{
|
||||
var parser = new Parser(with =>
|
||||
{
|
||||
with.AutoVersion = false;
|
||||
with.AutoHelp = true;
|
||||
});
|
||||
var parserResult = parser.ParseArguments<Options>(args);
|
||||
parserResult
|
||||
.WithParsed(options =>
|
||||
{
|
||||
HttpService.Request(options).Wait();
|
||||
})
|
||||
.WithNotParsed(errs =>
|
||||
{
|
||||
var helpText = HelpText.AutoBuild(parserResult, h =>
|
||||
{
|
||||
h.AutoHelp = true;
|
||||
h.AutoVersion = false;
|
||||
return h;
|
||||
}, e => e);
|
||||
Console.WriteLine(helpText);
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
@ -0,0 +1,17 @@
|
|||
<?xml version="1.0" encoding="utf-8"?>
|
||||
<!--
|
||||
https://go.microsoft.com/fwlink/?LinkID=208121.
|
||||
-->
|
||||
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
|
||||
<PropertyGroup>
|
||||
<Configuration>Release</Configuration>
|
||||
<Platform>Any CPU</Platform>
|
||||
<PublishDir>bin\Release\netcoreapp3.0\publish\linux</PublishDir>
|
||||
<PublishProtocol>FileSystem</PublishProtocol>
|
||||
<TargetFramework>netcoreapp3.0</TargetFramework>
|
||||
<RuntimeIdentifier>linux-x64</RuntimeIdentifier>
|
||||
<SelfContained>true</SelfContained>
|
||||
<PublishSingleFile>True</PublishSingleFile>
|
||||
<PublishTrimmed>False</PublishTrimmed>
|
||||
</PropertyGroup>
|
||||
</Project>
|
||||
|
|
@ -0,0 +1,17 @@
|
|||
<?xml version="1.0" encoding="utf-8"?>
|
||||
<!--
|
||||
https://go.microsoft.com/fwlink/?LinkID=208121.
|
||||
-->
|
||||
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
|
||||
<PropertyGroup>
|
||||
<Configuration>Release</Configuration>
|
||||
<Platform>Any CPU</Platform>
|
||||
<PublishDir>bin\Release\netcoreapp3.0\publish\osx\</PublishDir>
|
||||
<PublishProtocol>FileSystem</PublishProtocol>
|
||||
<TargetFramework>netcoreapp3.0</TargetFramework>
|
||||
<RuntimeIdentifier>osx-x64</RuntimeIdentifier>
|
||||
<SelfContained>true</SelfContained>
|
||||
<PublishSingleFile>True</PublishSingleFile>
|
||||
<PublishTrimmed>False</PublishTrimmed>
|
||||
</PropertyGroup>
|
||||
</Project>
|
||||
|
|
@ -0,0 +1,18 @@
|
|||
<?xml version="1.0" encoding="utf-8"?>
|
||||
<!--
|
||||
https://go.microsoft.com/fwlink/?LinkID=208121.
|
||||
-->
|
||||
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
|
||||
<PropertyGroup>
|
||||
<Configuration>Release</Configuration>
|
||||
<Platform>Any CPU</Platform>
|
||||
<PublishDir>bin\Release\netcoreapp3.0\publish\win\</PublishDir>
|
||||
<PublishProtocol>FileSystem</PublishProtocol>
|
||||
<TargetFramework>netcoreapp3.0</TargetFramework>
|
||||
<RuntimeIdentifier>win-x64</RuntimeIdentifier>
|
||||
<SelfContained>true</SelfContained>
|
||||
<PublishSingleFile>True</PublishSingleFile>
|
||||
<PublishReadyToRun>False</PublishReadyToRun>
|
||||
<PublishTrimmed>False</PublishTrimmed>
|
||||
</PropertyGroup>
|
||||
</Project>
|
||||
|
|
@ -0,0 +1,8 @@
|
|||
{
|
||||
"profiles": {
|
||||
"SQLFlowClient": {
|
||||
"commandName": "Project",
|
||||
"commandLineArgs": "test.sql"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
@ -0,0 +1,15 @@
|
|||
using System;
|
||||
using System.Collections.Generic;
|
||||
using System.Text;
|
||||
|
||||
namespace SQLFlowClient
|
||||
{
|
||||
public enum RelationType
|
||||
{
|
||||
fdd,
|
||||
fdr,
|
||||
frd,
|
||||
fddi,
|
||||
join,
|
||||
}
|
||||
}
|
||||
|
|
@ -0,0 +1,30 @@
|
|||
<Project Sdk="Microsoft.NET.Sdk">
|
||||
|
||||
<PropertyGroup>
|
||||
<OutputType>Exe</OutputType>
|
||||
<TargetFramework>netcoreapp3.0</TargetFramework>
|
||||
<Version>1.0.9</Version>
|
||||
<AssemblyVersion>1.0.9.0</AssemblyVersion>
|
||||
<FileVersion>1.0.9.0</FileVersion>
|
||||
</PropertyGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<Compile Remove="dist\**" />
|
||||
<EmbeddedResource Remove="dist\**" />
|
||||
<None Remove="dist\**" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<PackageReference Include="CommandLineParser" Version="2.6.0" />
|
||||
<PackageReference Include="Newtonsoft.Json" Version="12.0.3" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<None Update="config.json">
|
||||
<CopyToOutputDirectory>Always</CopyToOutputDirectory>
|
||||
</None>
|
||||
<None Update="test.sql">
|
||||
<CopyToOutputDirectory>Always</CopyToOutputDirectory>
|
||||
</None>
|
||||
</ItemGroup>
|
||||
</Project>
|
||||
|
|
@ -0,0 +1,86 @@
|
|||
using System;
|
||||
using System.Collections.Generic;
|
||||
using System.IO;
|
||||
using System.Text;
|
||||
using Newtonsoft.Json;
|
||||
using Newtonsoft.Json.Linq;
|
||||
|
||||
namespace SQLFlowClient
|
||||
{
|
||||
class SQLFlowResult
|
||||
{
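// For small responses the whole payload is parsed into a JObject and the flags are set
// via SelectToken; for responses longer than maxLength only the top-level property names
// are scanned with a streaming JsonTextReader so the flags can be set without building a full tree.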
|
||||
private readonly int maxLength = 24814437;// json will not be formatted if the string length exceeds this number
|
||||
public string json;
|
||||
public bool data;
|
||||
public bool error;
|
||||
public bool dbobjs;
|
||||
public bool sqlflow;
|
||||
public bool graph;
|
||||
|
||||
public SQLFlowResult(string text)
|
||||
{
|
||||
if (text.Length <= maxLength)
|
||||
{
|
||||
var jobject = JObject.Parse(text);
|
||||
json = jobject.ToString();
|
||||
data = jobject.SelectToken("data") != null;
|
||||
error = jobject.SelectToken("error") != null;
|
||||
dbobjs = jobject.SelectToken("data.dbobjs") != null;
|
||||
sqlflow = jobject.SelectToken("data.sqlflow") != null;
|
||||
graph = jobject.SelectToken("data.graph") != null;
|
||||
}
|
||||
else
|
||||
{
|
||||
json = text;
|
||||
data = false;
|
||||
error = false;
|
||||
dbobjs = false;
|
||||
sqlflow = false;
|
||||
graph = false;
|
||||
|
||||
using var reader = new JsonTextReader(new StringReader(text));
|
||||
while (reader.Read())
|
||||
{
|
||||
if (reader.Value != null)
|
||||
{
|
||||
//Console.WriteLine("Token: {0}, Value: {1} ,Depth:{2}", reader.TokenType, reader.Value, reader.Depth);
|
||||
if (reader.Depth > 3)
|
||||
{
|
||||
goto End;
|
||||
}
|
||||
if (reader.TokenType.ToString() == "PropertyName")
|
||||
{
|
||||
switch (reader.Value.ToString())
|
||||
{
|
||||
case "data":
|
||||
data = true;
|
||||
break;
|
||||
case "error":
|
||||
error = true;
|
||||
break;
|
||||
case "dbobjs":
|
||||
dbobjs = true;
|
||||
break;
|
||||
case "sqlflow":
|
||||
sqlflow = true;
|
||||
break;
|
||||
case "graph":
|
||||
graph = true;
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
//Console.WriteLine("Token: {0}", reader.TokenType);
|
||||
if (error || dbobjs || sqlflow || graph)
|
||||
{
|
||||
reader.Skip();
|
||||
}
|
||||
}
|
||||
}
|
||||
End: { }
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
@ -0,0 +1,5 @@
|
|||
{
|
||||
"Host": "https://api.gudusoft.com",
|
||||
"SecretKey": "d126d0fb1a5a13abb97b160d571f29a2bbaa13861219082da7e9c4d62553ed7c",
|
||||
"UserId": "auth0|600acd55e68a290069f8a8db"
|
||||
}
|
||||
|
|
@ -0,0 +1,5 @@
|
|||
dotnet publish -c Release /p:PublishProfile=Properties\PublishProfiles\linux.pubxml
|
||||
dotnet publish -c Release /p:PublishProfile=Properties\PublishProfiles\osx.pubxml
|
||||
dotnet publish -c Release /p:PublishProfile=Properties\PublishProfiles\win.pubxml
|
||||
if exist dist rd dist /S /Q
|
||||
xcopy /s .\bin\Release\netcoreapp3.0\publish .\dist\
|
||||
|
|
@ -0,0 +1,33 @@
|
|||
CREATE VIEW vsal
|
||||
AS
|
||||
SELECT a.deptno "Department",
|
||||
a.num_emp / b.total_count "Employees",
|
||||
a.sal_sum / b.total_sal "Salary"
|
||||
FROM (SELECT deptno,
|
||||
COUNT(*) num_emp,
|
||||
SUM(sal) sal_sum
|
||||
FROM scott.emp
|
||||
WHERE city = 'NYC'
|
||||
GROUP BY deptno) a,
|
||||
(SELECT COUNT(*) total_count,
|
||||
SUM(sal) total_sal
|
||||
FROM scott.emp
|
||||
WHERE city = 'NYC') b
|
||||
;
|
||||
|
||||
INSERT ALL
|
||||
WHEN ottl < 100000 THEN
|
||||
INTO small_orders
|
||||
VALUES(oid, ottl, sid, cid)
|
||||
WHEN ottl > 100000 and ottl < 200000 THEN
|
||||
INTO medium_orders
|
||||
VALUES(oid, ottl, sid, cid)
|
||||
WHEN ottl > 200000 THEN
|
||||
into large_orders
|
||||
VALUES(oid, ottl, sid, cid)
|
||||
WHEN ottl > 290000 THEN
|
||||
INTO special_orders
|
||||
SELECT o.order_id oid, o.customer_id cid, o.order_total ottl,
|
||||
o.sales_rep_id sid, c.credit_limit cl, c.cust_email cem
|
||||
FROM orders o, customers c
|
||||
WHERE o.customer_id = c.customer_id;
|
||||
|
|
@@ -0,0 +1,115 @@
# Get Started
### [Download](https://sqlflow.gudusoft.com/download/) the executable program for your operating system.

- [windows](https://sqlflow.gudusoft.com/download/win/SQLFlowClient.exe)
- [mac](https://sqlflow.gudusoft.com/download/osx/SQLFlowClient)
- [linux](https://sqlflow.gudusoft.com/download/linux/SQLFlowClient)


### Configuration

#### SQLFlow Cloud server

Create a file named `config.json` in the directory where the executable (.exe) lives, fill in your `SecretKey` and `UserId`, and always set `Host` to `https://api.gudusoft.com`, for example:

```json
{
  "Host": "https://api.gudusoft.com",
  "SecretKey": "XXX",
  "UserId": "XXX"
}
```
If you want to connect to [the SQLFlow Cloud Server](https://sqlflow.gudusoft.com), you may [request a 30-day premium account](https://www.gudusoft.com/request-a-premium-account/) to
[get the necessary userId and secret code](/sqlflow-userid-secret.md).

#### SQLFlow on-premise version

Create a file named `config.json` in the directory where the executable (.exe) lives, always set `UserId` to `gudu|0123456789`, leave `SecretKey` empty, and set `Host` to your server IP, for example:

```json
{
  "Host": "http://your server ip:8081",
  "SecretKey": "",
  "UserId": "gudu|0123456789"
}
```
Please [check here](https://github.com/sqlparser/sqlflow_public/blob/master/install_sqlflow.md) to see how to install the SQLFlow on-premise version on your own server.

### Set permissions


For mac:
```
chmod +x SQLFlowClient
```

For linux:
```
chmod +x SQLFlowClient
```
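Once `config.json` is in place and the binary is executable, you can check connectivity and credentials by asking the backend for its version (this uses the `--version` flag described in the parameters table below):
```
./SQLFlowClient --version
```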
### Create a simple sql file for testing
For example, test.sql:
```sql
insert into t2 select * from t1;
```

Run the program from the command line:
```
./SQLFlowClient test.sql
```
```
./SQLFlowClient test.sql -g
```

# Usage

SQLFlowClient filepath -parameter value

### parameters

| parameter          | short | value type | default | description |
| ------------------ | ----- | ---------- | ------- | ----------- |
| --graph            | -g    | boolean | false | Get the graph from the sql. |
| --dbvendor         | -v    | one of the following list:<br />bigquery, couchbase, db2, greenplum, <br />hana, hive, impala, informix, <br />mdx, mysql, netezza, openedge, <br />oracle, postgresql, redshift, snowflake, <br />mssql, sybase, teradata, vertica | oracle | Set the database of the sqlfile. |
| --showRelationType | -r    | one or more from the following list:<br /> fdd, fdr, frd, fddi, join | fdd | Set the relation type. |
| --simpleOutput     | -s    | boolean | false | Set whether to get simple output. |
| --ignoreRecordSet  |       | boolean | false | Set whether to ignore record sets. |
| --ignoreFunction   |       | boolean | false | Set whether to ignore functions. |
| --output           | -o    | string | "" | Save output as a file. |
| --help             |       |  |  | Display this help screen. |
| --version          |       |  |  | Display version information. |

### examples
1. SQLFlowClient test.sql
2. SQLFlowClient test.sql -g
3. SQLFlowClient test.sql -g -v oracle
4. SQLFlowClient test.sql -g -v oracle -r fdr
5. SQLFlowClient test.sql -g -v oracle -r fdr,join
6. SQLFlowClient test.sql -g -v oracle -r fdr,join -s
7. SQLFlowClient test.sql -g -v oracle -r fdr,join -s --ignoreRecordSet
8. SQLFlowClient test.sql -g -v oracle -r fdr,join -s --ignoreFunction -o result.txt

# Compile and build on Windows

### Download and install the .NET Core SDK

```
https://dotnet.microsoft.com/download
```

### Download source code
```
git clone https://github.com/sqlparser/sqlflow_public.git
```

### Build from command line

```
dotnet publish -c Release /p:PublishProfile=Properties\PublishProfiles\linux.pubxml
dotnet publish -c Release /p:PublishProfile=Properties\PublishProfiles\osx.pubxml
dotnet publish -c Release /p:PublishProfile=Properties\PublishProfiles\win.pubxml
```

### [Download executable programs](https://sqlflow.gudusoft.com/download//)

@ -0,0 +1,88 @@
|
|||
"""
|
||||
How to get user_id and secret_key: https://docs.gudusoft.com/3.-api-docs/prerequisites#generate-account-secret
|
||||
|
||||
once you have user_id and secret_key,
|
||||
|
||||
user_id: <YOUR USER ID HERE>
|
||||
secret_key: <YOUR SECRET KEY HERE>
|
||||
|
||||
you can get token by:
|
||||
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/user/generateToken" -H "Request-Origion:testClientDemo" -H "accept:application/json;charset=utf-8" -H "Content-Type:application/x-www-form-urlencoded;charset=UTF-8" -d "secretKey=YOUR SECRET KEY" -d "userId=YOUR USER ID HERE"
|
||||
|
||||
and then you can use the token to call the api:
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/generation/sqlflow?showRelationType=fdd" -H "Request-Origion:testClientDemo" -H "accept:application/json;charset=utf-8" -H "Content-Type:multipart/form-data" -F "sqlfile=" -F "dbvendor=dbvoracle" -F "ignoreRecordSet=false" -F "simpleOutput=false" -F "sqltext=CREATE VIEW vsal as select * from emp" -F "userId=YOUR USER ID HERE" -F "token=YOUR TOKEN HERE"
|
||||
|
||||
"""
|
||||
|
||||
# Python code to call the API based on the description:
|
||||
|
||||
import requests
|
||||
import urllib3
|
||||
|
||||
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
|
||||
|
||||
# Function to get the token
|
||||
def get_token(user_id, secret_key):
|
||||
url = "https://api.gudusoft.com/gspLive_backend/user/generateToken"
|
||||
headers = {
|
||||
"Request-Origion": "testClientDemo",
|
||||
"accept": "application/json;charset=utf-8",
|
||||
"Content-Type": "application/x-www-form-urlencoded;charset=UTF-8"
|
||||
}
|
||||
data = {
|
||||
"secretKey": secret_key,
|
||||
"userId": user_id
|
||||
}
|
||||
response = requests.post(url, headers=headers, data=data, verify=False, proxies=None)
|
||||
|
||||
# Check if the request was successful
|
||||
response.raise_for_status()
|
||||
|
||||
# Parse the JSON response
|
||||
json_response = response.json()
|
||||
|
||||
# Check if 'token' key exists directly in the response
|
||||
if 'token' in json_response:
|
||||
return json_response['token']
|
||||
else:
|
||||
raise ValueError("Token not found in the response.")
|
||||
|
||||
# Function to call the SQLFlow API
|
||||
def call_sqlflow_api(user_id, token, sql_text):
|
||||
url = "https://api.gudusoft.com/gspLive_backend/sqlflow/generation/sqlflow?showRelationType=fdd"
|
||||
headers = {
|
||||
"Request-Origion": "testClientDemo",
|
||||
"accept": "application/json;charset=utf-8"
|
||||
}
|
||||
data = {
|
||||
"sqlfile": "",
|
||||
"dbvendor": "dbvoracle",
|
||||
"ignoreRecordSet": "false",
|
||||
"simpleOutput": "false",
|
||||
"sqltext": sql_text,
|
||||
"userId": user_id,
|
||||
"token": token
|
||||
}
|
||||
response = requests.post(url, headers=headers, data=data)
|
||||
return response.json()
|
||||
|
||||
# Example usage
|
||||
# How to get user_id and secret_key: https://docs.gudusoft.com/3.-api-docs/prerequisites#generate-account-secret
|
||||
|
||||
user_id = "your user id"
|
||||
secret_key = "your secret key"
|
||||
sql_text = "CREATE VIEW vsal AS SELECT * FROM emp"
|
||||
|
||||
try:
    # Get the token
    token = get_token(user_id, secret_key)
    print("Token:", token)

    # Call the SQLFlow API only if the token was obtained successfully
    result = call_sqlflow_api(user_id, token, sql_text)
    print(result)
except requests.exceptions.RequestException as e:
    print("Error making request:", e)
except ValueError as e:
    print("Error parsing response:", e)
|
||||
|
|
@ -0,0 +1,90 @@
|
|||
package java;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/**
 * Parses the relationship chains out of the JSON lineage returned by the
 * SQLFlow exportLineageAsJson API.
 *
 * For example, the lineage data in the demo below is parsed into the following
 * chains, i.e. a List with two elements:
 * SCOTT.DEPT -> SCOTT.EMP -> VSAL
 * SCOTT.EMP -> VSAL
 */
|
||||
|
||||
public class DataLineageParser {
|
||||
static class Node {
|
||||
String value;
|
||||
String id;
|
||||
Node next;
|
||||
|
||||
public Node(String value, String id) {
|
||||
this.value = value;
|
||||
this.id = id;
|
||||
}
|
||||
|
||||
public String key() {
|
||||
Node node = this.next;
|
||||
StringBuilder key = new StringBuilder(id);
|
||||
while (node != null) {
|
||||
key.append(node.id);
|
||||
node = node.next;
|
||||
}
|
||||
return key.toString();
|
||||
}
|
||||
}
|
||||
|
||||
public static void main(String[] args) {
|
||||
String input = "{\"jobId\":\"d9550e491c024d0cbe6e1034604aca17\",\"code\":200,\"data\":{\"mode\":\"global\",\"sqlflow\":{\"relationship\":[{\"sources\":[{\"parentName\":\"ORDERS\",\"column\":\"TABLE\",\"coordinates\":[],\"id\":\"10000106\",\"parentId\":\"86\"}],\"id\":\"1000012311\",\"type\":\"fdd\",\"target\":{\"parentName\":\"SPECIAL_ORDERS\",\"column\":\"TABLE\",\"coordinates\":[],\"id\":\"10000102\",\"parentId\":\"82\"}},{\"sources\":[{\"parentName\":\"CUSTOMERS\",\"column\":\"TABLE\",\"coordinates\":[],\"id\":\"10000103\",\"parentId\":\"94\"}],\"id\":\"1000012312\",\"type\":\"fdd\",\"target\":{\"parentName\":\"SPECIAL_ORDERS\",\"column\":\"TABLE\",\"coordinates\":[],\"id\":\"10000102\",\"parentId\":\"82\"}}]}},\"sessionId\":\"8bb7d3da4b687bb7badf01608a739fbebd61309cd5a643cecf079d122095738a_1685604216451\"}";
|
||||
try {
|
||||
ObjectMapper objectMapper = new ObjectMapper();
|
||||
JsonNode jsonNode = objectMapper.readTree(input);
|
||||
JsonNode relationshipNode = jsonNode.path("data").path("sqlflow").path("relationship");
|
||||
List<Map<String, Object>> dataList = objectMapper.readValue(relationshipNode.toString(), new TypeReference<List<Map<String, Object>>>() {
|
||||
});
|
||||
|
||||
ArrayList<Node> value = new ArrayList<>();
|
||||
Map<String, Node> nodeMap = new HashMap<>();
|
||||
for (Map<String, Object> data : dataList) {
|
||||
List<Map<String, Object>> sources = (List<Map<String, Object>>) data.get("sources");
|
||||
Map<String, Object> targetNode = (Map<String, Object>) data.get("target");
|
||||
Node target = new Node((String) targetNode.get("parentName"), (String) targetNode.get("parentId"));
|
||||
if (!sources.isEmpty()) {
|
||||
for (Map<String, Object> source : sources) {
|
||||
String parentId = (String) source.get("parentId");
|
||||
String parentName = (String) source.get("parentName");
|
||||
Node sourceNode = new Node(parentName, parentId);
|
||||
sourceNode.next = target;
|
||||
value.add(sourceNode);
|
||||
nodeMap.put(parentId, sourceNode);
|
||||
}
|
||||
} else {
|
||||
value.add(target);
|
||||
nodeMap.put((String) targetNode.get("parentId"), target);
|
||||
}
|
||||
}
|
||||
|
||||
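// if a target also appears as a source of another relationship, splice the two chains together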
for (Node node : value) {
|
||||
Node next = node.next;
|
||||
if (next != null) {
|
||||
String id = next.id;
|
||||
next = nodeMap.get(id);
|
||||
if (next != null) {
|
||||
node.next = next;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
HashSet<String> key = new HashSet<>();
|
||||
Iterator<Node> iterator = value.iterator();
|
||||
while (iterator.hasNext()) {
|
||||
Node node = iterator.next();
|
||||
String k = node.key();
|
||||
if (key.contains(k)) {
|
||||
iterator.remove();
|
||||
}
|
||||
key.add(k);
|
||||
}
|
||||
|
||||
// value now holds the de-duplicated lineage chains; print them, e.g. SCOTT.DEPT -> SCOTT.EMP -> VSAL
for (Node node : value) {
    StringBuilder chain = new StringBuilder(node.value);
    Node cur = node.next;
    while (cur != null) {
        chain.append(" -> ").append(cur.value);
        cur = cur.next;
    }
    System.out.println(chain);
}
|
||||
} catch (JsonProcessingException e) {
|
||||
e.printStackTrace();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
@ -0,0 +1,3 @@
|
|||
Manifest-Version: 1.0
|
||||
Class-Path: lib/fastjson-1.2.47.jar lib/httpclient-4.5.5.jar lib/httpcore-4.4.9.jar lib/httpmime-4.5.6.jar lib/slf4j-api-1.7.25.jar lib/slf4j-log4j12-1.7.25.jar lib/commons-codec-1.10.jar lib/commons-logging-1.2.jar
|
||||
Main-Class: com.gudusoft.grabit.Runner
|
||||
|
|
@ -0,0 +1,3 @@
|
|||
Manifest-Version: 1.0
|
||||
Class-Path: lib/fastjson-1.2.47.jar lib/httpclient-4.5.5.jar lib/httpcore-4.4.9.jar lib/httpmime-4.5.6.jar lib/slf4j-api-1.7.25.jar lib/slf4j-log4j12-1.7.25.jar lib/commons-codec-1.10.jar lib/commons-logging-1.2.jar
|
||||
Main-Class: com.gudusoft.grabit.Runner
|
||||
|
|
@ -0,0 +1,38 @@
|
|||
@ECHO OFF
|
||||
SETLOCAL enableDelayedExpansion
|
||||
|
||||
SET cur_dir=%CD%
|
||||
echo %cur_dir%
|
||||
|
||||
SET qddemo=%cur_dir%
|
||||
|
||||
SET qddemo_src=%qddemo%\src
|
||||
SET qddemo_bin=%qddemo%\lib
|
||||
SET qddemo_class=%qddemo%\class
|
||||
|
||||
echo %qddemo_class%
|
||||
echo %qddemo_bin%
|
||||
|
||||
IF EXIST %qddemo_class% RMDIR /S /Q %qddemo_class%
|
||||
IF NOT EXIST %qddemo_class% MKDIR %qddemo_class%
|
||||
|
||||
cd %cur_dir%
|
||||
CD %qddemo_src%
|
||||
FOR /R %%b IN ( . ) DO (
|
||||
IF EXIST %%b/*.java SET JFILES=!JFILES! %%b/*.java
|
||||
)
|
||||
|
||||
MKDIR %qddemo_class%\lib
|
||||
XCOPY %qddemo_bin% %qddemo_class%\lib
|
||||
XCOPY %qddemo%\MANIFEST.MF %qddemo_class%
|
||||
|
||||
cd %cur_dir%
|
||||
|
||||
javac -d %qddemo_class% -encoding utf-8 -cp .;%qddemo_bin%\commons-codec-1.10.jar;%qddemo_bin%\commons-logging-1.2.jar;%qddemo_bin%\fastjson-1.2.47.jar;%qddemo_bin%\httpclient-4.5.5.jar;%qddemo_bin%\httpcore-4.4.9.jar;%qddemo_bin%\httpmime-4.5.6.jar; %JFILES%
|
||||
|
||||
cd %qddemo_class%
|
||||
jar -cvfm %qddemo%\grabit-java.jar %qddemo%\MANIFEST-windows.MF *
|
||||
|
||||
echo "successfully"
|
||||
|
||||
pause
|
||||
|
|
@ -0,0 +1,28 @@
|
|||
#!/bin/bash
|
||||
|
||||
cur_dir=$(pwd)
|
||||
|
||||
function compile(){
|
||||
src_dir=$cur_dir/src
|
||||
bin_dir=$cur_dir/lib
|
||||
class_dir=$cur_dir/class
|
||||
|
||||
|
||||
rm -rf $src_dir/sources.list
|
||||
find $src_dir -name "*.java" > $src_dir/sources.list
|
||||
cat $src_dir/sources.list
|
||||
|
||||
|
||||
rm -rf $class_dir
|
||||
mkdir $class_dir
|
||||
cp $cur_dir/MANIFEST.MF $class_dir
|
||||
cp -r $cur_dir/lib $class_dir
|
||||
|
||||
javac -d $class_dir -cp .:$bin_dir/fastjson-1.2.47.jar:$bin_dir/commons-codec-1.10.jar:$bin_dir/commons-logging-1.2.jar:$bin_dir/slf4j-api-1.7.25.jar:$bin_dir/slf4j-log4j12-1.7.25.jar:$bin_dir/httpcore-4.4.9.jar:$bin_dir/httpclient-4.5.5.jar:$bin_dir/httpmime-4.5.6.jar -g -sourcepath $src_dir @$src_dir/sources.list
|
||||
|
||||
cd $class_dir
|
||||
jar -cvfm $cur_dir/grabit-java.jar MANIFEST.MF *
|
||||
}
|
||||
|
||||
compile
|
||||
exit 0
|
||||
|
After Width: | Height: | Size: 196 KiB |
|
|
@ -0,0 +1,56 @@
|
|||
-- sql server sample sql
|
||||
CREATE TABLE dbo.EmployeeSales
|
||||
( DataSource varchar(20) NOT NULL,
|
||||
BusinessEntityID varchar(11) NOT NULL,
|
||||
LastName varchar(40) NOT NULL,
|
||||
SalesDollars money NOT NULL
|
||||
);
|
||||
GO
|
||||
CREATE PROCEDURE dbo.uspGetEmployeeSales
|
||||
AS
|
||||
SET NOCOUNT ON;
|
||||
SELECT 'PROCEDURE', sp.BusinessEntityID, c.LastName,
|
||||
sp.SalesYTD
|
||||
FROM Sales.SalesPerson AS sp
|
||||
INNER JOIN Person.Person AS c
|
||||
ON sp.BusinessEntityID = c.BusinessEntityID
|
||||
WHERE sp.BusinessEntityID LIKE '2%'
|
||||
ORDER BY sp.BusinessEntityID, c.LastName;
|
||||
GO
|
||||
--INSERT...SELECT example
|
||||
INSERT INTO dbo.EmployeeSales
|
||||
SELECT 'SELECT', sp.BusinessEntityID, c.LastName, sp.SalesYTD
|
||||
FROM Sales.SalesPerson AS sp
|
||||
INNER JOIN Person.Person AS c
|
||||
ON sp.BusinessEntityID = c.BusinessEntityID
|
||||
WHERE sp.BusinessEntityID LIKE '2%'
|
||||
ORDER BY sp.BusinessEntityID, c.LastName;
|
||||
GO
|
||||
|
||||
|
||||
CREATE VIEW hiredate_view
|
||||
AS
|
||||
SELECT p.FirstName, p.LastName, e.BusinessEntityID, e.HireDate
|
||||
FROM HumanResources.Employee e
|
||||
JOIN Person.Person AS p ON e.BusinessEntityID = p.BusinessEntityID ;
|
||||
GO
|
||||
|
||||
CREATE VIEW view1
|
||||
AS
|
||||
SELECT fis.CustomerKey, fis.ProductKey, fis.OrderDateKey,
|
||||
fis.SalesTerritoryKey, dst.SalesTerritoryRegion
|
||||
FROM FactInternetSales AS fis
|
||||
LEFT OUTER JOIN DimSalesTerritory AS dst
|
||||
ON (fis.SalesTerritoryKey=dst.SalesTerritoryKey);
|
||||
|
||||
GO
|
||||
SELECT ROW_NUMBER() OVER(PARTITION BY PostalCode ORDER BY SalesYTD DESC) AS "Row Number",
|
||||
p.LastName, s.SalesYTD, a.PostalCode
|
||||
FROM Sales.SalesPerson AS s
|
||||
INNER JOIN Person.Person AS p
|
||||
ON s.BusinessEntityID = p.BusinessEntityID
|
||||
INNER JOIN Person.Address AS a
|
||||
ON a.AddressID = p.BusinessEntityID
|
||||
WHERE TerritoryID IS NOT NULL
|
||||
AND SalesYTD <> 0
|
||||
ORDER BY PostalCode;
|
||||
|
After Width: | Height: | Size: 54 KiB |
|
|
@ -0,0 +1,156 @@
|
|||
## JAVA Data lineage: using the SQLFlow REST API (Advanced)
|
||||
|
||||
This article illustrates how to discover the data lineage using JAVA and the SQLFlow REST API.
|
||||
|
||||
By using the SQLFlow REST API, you can code in JAVA to discover the data lineage in SQL scripts
|
||||
and get the result in an actionable diagram, json, csv or graphml format.
|
||||
|
||||
You can integrate the Java code provided here into your own project and add the powerful
|
||||
data lineage analysis capability instantly.
|
||||
|
||||
### 1. interactive data lineage visualizations
|
||||

|
||||
|
||||
### 2. [Data lineage in JSON format](java-data-lineage-result.json)
|
||||
|
||||
### 3. Data lineage in CSV, graphml format
|
||||
|
||||
|
||||
## Prerequisites
|
||||
- [SQLFlow Cloud Server or on-premise version](https://github.com/sqlparser/sqlflow_public/tree/master/api#prerequisites)
|
||||
|
||||
- Java 8 or higher version must be installed and configured correctly.
|
||||
|
||||
- set up the PATH like this (please change JAVA_HOME according to your environment):
|
||||
```
|
||||
export JAVA_HOME=/usr/lib/jvm/default-java
|
||||
|
||||
export PATH=$JAVA_HOME/bin:$PATH
|
||||
```
|
||||
|
||||
- compile and build `grabit-java.jar`
|
||||
|
||||
**mac&linux**
|
||||
```
|
||||
chmod 777 compile.sh
|
||||
|
||||
./compile.sh
|
||||
```
|
||||
|
||||
**windows**
|
||||
|
||||
```
|
||||
compile.bat
|
||||
```
|
||||
|
||||
### Usage
|
||||
|
||||
````
|
||||
java -jar grabit-java.jar /s server /p port /u userId /k userSecret /t databaseType /f path_to_config_file /r resultType
|
||||
|
||||
eg:
|
||||
java -jar grabit-java.jar /u 'auth0|xxx' /k cab9712c45189014a94a8b7aceeef7a3db504be58e18cd3686f3bbefd078ef4d /s https://api.gudusoft.com /t oracle /f demo.sql /r 1
|
||||
|
||||
note:
|
||||
If a parameter value contains symbols like "|", it must be enclosed in single quotes (' '), or in double quotes (" ") on Windows.
|
||||
````
|
||||
|
||||
Example:
|
||||
|
||||
1. Connect to the SQLFlow Cloud Server
|
||||
```
|
||||
java -jar grabit-java.jar /s https://api.gudusoft.com /u 'YOUR_USER_ID' /k YOUR_SECRET_KEY /t sqlserver /f java-data-lineage-sqlserver.sql /r 1
|
||||
```
|
||||
|
||||
2. Connect to the SQLFlow on-premise
|
||||
This will discover data lineage by analyzing the `java-data-lineage-sqlserver.sql` file. You may also specify a zip file which includes lots of SQL files.
|
||||
```
|
||||
java -jar grabit-java.jar /s http://127.0.0.1 /p 8081 /u 'gudu|0123456789' /t sqlserver /f java-data-lineage-sqlserver.sql /r 1
|
||||
```
|
||||
|
||||
This will discover data lineage by analyzing all SQL files under `sqlfiles` directory.
|
||||
```
|
||||
java -jar grabit-java.jar /s http://127.0.0.1 /p 8081 /u 'gudu|0123456789' /t mysql /f sqlfiles /r 1
|
||||
```
|
||||
|
||||
After execution, view the `logs/grabit.log` file for detailed information.
|
||||
|
||||
If the log prints **submit job to sqlflow successful**,
|
||||
then the upload to SQLFlow has succeeded.
|
||||
|
||||
Log in to the SQLFlow website to view the newly analyzed results.
|
||||
In the `Job List`, you can view the analysis results of the currently submitted tasks.
|
||||
|
||||
### Parameters
|
||||
|
||||
- **path_to_config_file**
|
||||
|
||||
This can be a single SQL file, a zip file including multiple SQL files, or a directory including lots of SQL files.
|
||||
|
||||
- **server**
|
||||
|
||||
Usually, it is the IP address of [the SQLFlow on-premise version](https://www.gudusoft.com/sqlflow-on-premise-version/)
|
||||
installed on your own servers, such as `127.0.0.1` or `http://127.0.0.1`
|
||||
|
||||
You may set the value to `https://api.gudusoft.com` if you like to send your SQL script to [the SQLFlow Cloud Server](https://sqlflow.gudusoft.com) to get the data lineage result.
|
||||
|
||||
- **port**
|
||||
|
||||
The default value is `8081` if you connect to your SQLFlow on-premise server.
|
||||
|
||||
However, if you set up an nginx reverse proxy in the nginx configuration file like this:
|
||||
```
|
||||
location /api/ {
|
||||
proxy_pass http://127.0.0.1:8081/;
|
||||
proxy_connect_timeout 600s ;
|
||||
proxy_read_timeout 600s;
|
||||
proxy_send_timeout 600s;
|
||||
|
||||
proxy_set_header Host $host;
|
||||
proxy_set_header X-Real-IP $remote_addr;
|
||||
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
||||
proxy_set_header User-Agent $http_user_agent;
|
||||
}
|
||||
```
|
||||
Then, keep the value of `serverPort` empty and set `server` to the value like this: `http://127.0.0.1/api`.
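For example, with the reverse proxy above in place, the demo can be invoked through the `/api` path without the `/p` flag (a sketch composed from the examples in this article; adjust the SQL file and database type to your own case):
```
java -jar grabit-java.jar /s http://127.0.0.1/api /u 'gudu|0123456789' /t sqlserver /f java-data-lineage-sqlserver.sql /r 1
```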
|
||||
|
||||
>Please keep this value empty if you connect to the SQLFlow Cloud Server by specifying the `https://api.gudusoft.com`
|
||||
in the `server`
|
||||
>
|
||||
- **userId, userSecret**
|
||||
|
||||
This is the user id that is used to connect to the SQLFlow server.
|
||||
Always set this value to `gudu|0123456789` and keep `userSecret` empty if you use the SQLFlow on-premise version.
|
||||
|
||||
If you want to connect to [the SQLFlow Cloud Server](https://sqlflow.gudusoft.com), you may [request a 30 days premium account](https://www.gudusoft.com/request-a-premium-account/) to
|
||||
[get the necessary userId and secret code](/sqlflow-userid-secret.md).
|
||||
|
||||
|
||||
- **databaseType**
|
||||
|
||||
This parameter specifies the database dialect of the SQL scripts that the SQLFlow has analyzed.
|
||||
|
||||
```txt
|
||||
access,bigquery,couchbase,dax,db2,greenplum,hana,hive,impala,informix,mdx,mssql,
|
||||
sqlserver,mysql,netezza,odbc,openedge,oracle,postgresql,postgres,redshift,snowflake,
|
||||
sybase,teradata,soql,vertica
|
||||
```
|
||||
|
||||
- **resultType**
|
||||
|
||||
When you submit a SQL script to the SQLFlow server, a job is created on the SQLFlow server
|
||||
and you can always see the graphical data lineage result in the browser.
|
||||
|
||||
|
||||
Even better, this demo will fetch the data lineage back to the directory where the demo is running.
|
||||
Those data lineage results are stored in the `data/result/` directory (see the sample file name after the list below).
|
||||
|
||||
This parameter specifies which kind of format is used to save the data lineage result.
|
||||
|
||||
Available values for this parameter:
|
||||
- 1: JSON, data lineage result in JSON.
|
||||
- 2: CSV, data lineage result in CSV format.
|
||||
- 3: diagram, in graphml format that can be viewed by yEd.
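The downloaded file name combines the current date, the job id and the chosen format (naming pattern taken from `Runner.java`; the date and job id below are only illustrative):
```txt
data/result/20230601_d9550e491c024d0cbe6e1034604aca17_json.json
```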
|
||||
|
||||
### SQLFlow REST API
|
||||
Please check here for the detailed information about the [SQLFlow REST API](https://github.com/sqlparser/sqlflow_public/tree/master/api/sqlflow_api.md)
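For reference, a minimal sketch of the token request that `grabit-java.jar` issues under the hood. The endpoint and form fields are taken from `SqlFlowUtil.java` in this repository; the userId and secretKey values are placeholders:
```
curl -X POST "https://api.gudusoft.com/gspLive_backend/user/generateToken" \
     -d "userId=YOUR_USER_ID" \
     -d "secretKey=YOUR_SECRET_KEY"
```
The response contains a `token` field that is then passed to the job submission and export endpoints.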
|
||||
|
|
@ -0,0 +1,28 @@
|
|||
package com.gudusoft.grabit;
|
||||
|
||||
import java.text.SimpleDateFormat;
|
||||
import java.util.Date;
|
||||
|
||||
public class DateUtil {
|
||||
public DateUtil() {
|
||||
}
|
||||
|
||||
public static String format(Date date) {
|
||||
return format(date, "yyyyMMdd");
|
||||
}
|
||||
|
||||
public static String format(Date date, String pattern) {
|
||||
if (date != null) {
|
||||
SimpleDateFormat df = new SimpleDateFormat(pattern);
|
||||
return df.format(date);
|
||||
} else {
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
public static String timeStamp2Date(Long seconds) {
|
||||
SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMdd");
|
||||
return sdf.format(new Date(seconds));
|
||||
}
|
||||
|
||||
}
|
||||
|
|
@ -0,0 +1,89 @@
|
|||
package com.gudusoft.grabit;
|
||||
|
||||
import java.io.*;
|
||||
import java.util.zip.ZipEntry;
|
||||
import java.util.zip.ZipOutputStream;
|
||||
|
||||
public class FileUtil {
|
||||
|
||||
private static final int BUFFER_SIZE = 10 * 1024 * 1024;
|
||||
|
||||
private FileUtil() {
|
||||
}
|
||||
|
||||
public static void mkFile(String filePath) throws IOException {
|
||||
File testFile = new File(filePath);
|
||||
File fileParent = testFile.getParentFile();
|
||||
if (!fileParent.exists()) {
|
||||
fileParent.mkdirs();
|
||||
}
|
||||
if (!testFile.exists()) {
|
||||
testFile.createNewFile();
|
||||
}
|
||||
}
|
||||
|
||||
public static void toZip(String srcDir, OutputStream out, boolean KeepDirStructure)
|
||||
throws RuntimeException {
|
||||
ZipOutputStream zos = null;
|
||||
try {
|
||||
zos = new ZipOutputStream(out);
|
||||
File sourceFile = new File(srcDir);
|
||||
compress(sourceFile, zos, sourceFile.getName(), KeepDirStructure);
|
||||
} catch (Exception e) {
|
||||
throw new RuntimeException("zip error from ZipUtils", e);
|
||||
} finally {
|
||||
if (zos != null) {
|
||||
try {
|
||||
zos.close();
|
||||
} catch (IOException e) {
|
||||
e.printStackTrace();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
private static void compress(File sourceFile, ZipOutputStream zos, String name,
|
||||
boolean KeepDirStructure) throws Exception {
|
||||
byte[] buf = new byte[BUFFER_SIZE];
|
||||
if (sourceFile.isFile()) {
|
||||
zos.putNextEntry(new ZipEntry(name));
|
||||
int len;
|
||||
FileInputStream in = new FileInputStream(sourceFile);
|
||||
while ((len = in.read(buf)) != -1) {
|
||||
zos.write(buf, 0, len);
|
||||
}
|
||||
zos.closeEntry();
|
||||
in.close();
|
||||
} else {
|
||||
File[] listFiles = sourceFile.listFiles();
|
||||
if (listFiles == null || listFiles.length == 0) {
|
||||
if (KeepDirStructure) {
|
||||
zos.putNextEntry(new ZipEntry(name + "/"));
|
||||
zos.closeEntry();
|
||||
}
|
||||
|
||||
} else {
|
||||
for (File file : listFiles) {
|
||||
if (KeepDirStructure) {
|
||||
compress(file, zos, name + "/" + file.getName(), KeepDirStructure);
|
||||
} else {
|
||||
compress(file, zos, file.getName(), KeepDirStructure);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
public static OutputStream outStream(String path) throws IOException {
|
||||
FileOutputStream fileOutputStream;
|
||||
try {
|
||||
fileOutputStream = new FileOutputStream(path);
|
||||
} catch (Exception ex) {
|
||||
mkFile(path);
|
||||
fileOutputStream = new FileOutputStream(path);
|
||||
}
|
||||
return fileOutputStream;
|
||||
}
|
||||
|
||||
}
|
||||
|
|
@ -0,0 +1,182 @@
|
|||
package com.gudusoft.grabit;
|
||||
|
||||
import com.alibaba.fastjson.JSONObject;
|
||||
import com.gudusoft.grabit.SqlFlowUtil;
|
||||
import com.gudusoft.grabit.DateUtil;
|
||||
import com.gudusoft.grabit.FileUtil;
|
||||
|
||||
import java.io.File;
|
||||
import java.io.IOException;
|
||||
import java.util.Arrays;
|
||||
import java.util.Date;
|
||||
import java.util.List;
|
||||
|
||||
|
||||
public class Runner {
|
||||
|
||||
|
||||
public static void main(String[] args) throws IOException {
|
||||
if (args.length < 2) {
|
||||
System.err.println("please enter the correct parameters.");
|
||||
return;
|
||||
}
|
||||
|
||||
List<String> argList = Arrays.asList(args);
|
||||
matchParam("/f", argList);
|
||||
String fileVal = detectParam("/f", args, argList);
|
||||
File file = new File(fileVal);
|
||||
if (!file.exists()) {
|
||||
System.err.println("{} is not exist." + file);
|
||||
return;
|
||||
}
|
||||
|
||||
matchParam("/s", argList);
|
||||
String server = detectParam("/s", args, argList);
|
||||
if (!server.startsWith("http") && !server.startsWith("https")) {
|
||||
server = "http://" + server;
|
||||
}
|
||||
if (server.endsWith("/")) {
|
||||
server = server.substring(0, server.length() - 1);
|
||||
}
|
||||
|
||||
if (argList.contains("/p") && argList.size() > argList.indexOf("/p") + 1) {
|
||||
server = server + ":" + detectParam("/p", args, argList);
|
||||
}
|
||||
|
||||
matchParam("/u", argList);
|
||||
String userId = detectParam("/u", args, argList).replace("'", "");
|
||||
|
||||
String userSecret = "";
|
||||
if (argList.contains("/k") && argList.size() > argList.indexOf("/k") + 1) {
|
||||
userSecret = detectParam("/k", args, argList);
|
||||
}
|
||||
|
||||
String databaseType = "dbvoracle";
|
||||
if (argList.contains("/t") && argList.size() > argList.indexOf("/t") + 1) {
|
||||
databaseType = "dbv" + detectParam("/t", args, argList);
|
||||
if ("dbvsqlserver".equalsIgnoreCase(databaseType)) {
|
||||
databaseType = "dbvmssql";
|
||||
}
|
||||
}
|
||||
|
||||
int resultType = 1;
|
||||
if (argList.contains("/r") && argList.size() > argList.indexOf("/r") + 1) {
|
||||
resultType = Integer.parseInt(detectParam("/r", args, argList));
|
||||
}
|
||||
|
||||
System.out.println("================= run start grabit ==================");
|
||||
run(file, server, userId, userSecret, databaseType, resultType);
|
||||
System.out.println("================= run end grabit ==================");
|
||||
}
|
||||
|
||||
private static void run(File file, String server, String userId, String userSecret, String databaseType, Integer resultType) throws IOException {
|
||||
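// Workflow: obtain a token, submit the SQL job, poll the job status, then export the lineage result.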
String tokenUrl = String.format("%s/gspLive_backend/user/generateToken", server);
|
||||
String token = SqlFlowUtil.getToken(tokenUrl, userId, userSecret, 0);
|
||||
if ("".equals(token)) {
|
||||
System.err.println("connection to sqlflow failed.");
|
||||
System.exit(1);
|
||||
}
|
||||
|
||||
String path = "";
|
||||
if (file.isDirectory()) {
|
||||
path = file.getPath() + ".zip";
|
||||
FileUtil.toZip(file.getPath(), FileUtil.outStream(path), true);
|
||||
} else if (file.isFile()) {
|
||||
path = file.getPath();
|
||||
}
|
||||
|
||||
String submitUrl = String.format("%s/gspLive_backend/sqlflow/job/submitUserJob", server);
|
||||
final String taskName = DateUtil.format(new Date()) + "_" + System.currentTimeMillis();
|
||||
String result = SqlFlowUtil.submitJob(path, submitUrl,
|
||||
databaseType,
|
||||
userId, token,
|
||||
taskName);
|
||||
JSONObject object = JSONObject.parseObject(result);
|
||||
if (null != object) {
|
||||
Integer code = object.getInteger("code");
|
||||
if (code == 200) {
|
||||
JSONObject data = object.getJSONObject("data");
|
||||
System.out.println("submit job to sqlflow successful. SQLFlow is being analyzed...");
|
||||
String jobId = data.getString("jobId");
|
||||
|
||||
String jsonJobUrl = String.format("%s/gspLive_backend/sqlflow/job/displayUserJobSummary", server);
|
||||
while (true) {
|
||||
String statusRs = SqlFlowUtil.getStatus(jsonJobUrl, userId, token, jobId);
|
||||
JSONObject statusObj = JSONObject.parseObject(statusRs);
|
||||
if (null != statusObj) {
|
||||
if (statusObj.getInteger("code") == 200) {
|
||||
JSONObject val = statusObj.getJSONObject("data");
|
||||
String status = val.getString("status");
|
||||
if ("success".equals(status) || "partial_success".equals(status)) {
|
||||
System.out.println("sqlflow analyze successful.");
|
||||
break;
|
||||
}
|
||||
if ("fail".equals(status)) {
|
||||
System.err.println(val.getString("errorMessage"));
|
||||
System.exit(1);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
String rsUrl = "";
|
||||
String downLoadPath = "";
|
||||
String rootPath = "data" + File.separator + "result" + File.separator + DateUtil.timeStamp2Date(System.currentTimeMillis()) + "_" + jobId;
|
||||
switch (resultType) {
|
||||
case 1:
|
||||
rsUrl = String.format("%s/gspLive_backend/sqlflow/job/exportLineageAsJson", server);
|
||||
downLoadPath = rootPath + "_json.json";
|
||||
break;
|
||||
case 2:
|
||||
rsUrl = String.format("%s/gspLive_backend/sqlflow/job/exportLineageAsCsv", server);
|
||||
downLoadPath = rootPath + "_csv.csv";
|
||||
break;
|
||||
case 3:
|
||||
rsUrl = String.format("%s/gspLive_backend/sqlflow/job/exportLineageAsGraphml", server);
|
||||
downLoadPath = rootPath + "_graphml.graphml";
|
||||
break;
|
||||
default:
|
||||
break;
|
||||
}
|
||||
|
||||
SqlFlowUtil.ExportLineageReq request = new SqlFlowUtil.ExportLineageReq();
|
||||
request.setToken(token);
|
||||
request.setJobId(jobId);
|
||||
request.setTableToTable(true);
|
||||
request.setUserId(userId);
|
||||
request.setUrl(rsUrl);
|
||||
request.setDownloadFilePath(downLoadPath);
|
||||
|
||||
System.out.println("start export result from sqlflow.");
|
||||
result = SqlFlowUtil.exportLineage(request);
|
||||
if (!result.contains("success")) {
|
||||
System.err.println("export json result failed");
|
||||
System.exit(1);
|
||||
}
|
||||
|
||||
System.out.println("export json result successful,downloaded file path is {}" + downLoadPath);
|
||||
} else {
|
||||
System.err.println("submit job to sqlflow failed.");
|
||||
System.exit(1);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
private static String detectParam(String param, String[] args, List<String> argList) {
|
||||
try {
|
||||
return args[argList.indexOf(param) + 1];
|
||||
} catch (Exception e) {
|
||||
System.err.println("Please enter the correct parameters.");
|
||||
System.exit(1);
|
||||
}
|
||||
return null;
|
||||
}
|
||||
|
||||
private static void matchParam(String param, List<String> argList) {
|
||||
if (!argList.contains(param) || argList.size() <= argList.indexOf(param) + 1) {
|
||||
System.err.println("{} parameter is required." + param);
|
||||
System.exit(1);
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
|
|
@ -0,0 +1,227 @@
|
|||
package com.gudusoft.grabit;
|
||||
|
||||
import com.alibaba.fastjson.JSONObject;
|
||||
import org.apache.http.HttpEntity;
|
||||
import org.apache.http.NameValuePair;
|
||||
import org.apache.http.client.entity.UrlEncodedFormEntity;
|
||||
import org.apache.http.client.methods.CloseableHttpResponse;
|
||||
import org.apache.http.client.methods.HttpPost;
|
||||
import org.apache.http.entity.ContentType;
|
||||
import org.apache.http.entity.mime.MultipartEntityBuilder;
|
||||
import org.apache.http.impl.client.CloseableHttpClient;
|
||||
import org.apache.http.impl.client.HttpClients;
|
||||
import org.apache.http.message.BasicNameValuePair;
|
||||
import org.apache.http.util.EntityUtils;
|
||||
|
||||
import java.io.*;
|
||||
import java.util.ArrayList;
|
||||
import java.util.HashMap;
|
||||
import java.util.List;
|
||||
import java.util.Map;
|
||||
|
||||
public class SqlFlowUtil {
|
||||
|
||||
private static String token = "";
|
||||
|
||||
private SqlFlowUtil() {
|
||||
}
|
||||
|
||||
public static String getToken(String url, String userId,
|
||||
String secretKey, Integer flag) {
|
||||
try {
|
||||
System.out.println("start get token from sqlflow.");
|
||||
Map<String, String> param = new HashMap<>();
|
||||
param.put("secretKey", secretKey);
|
||||
param.put("userId", userId);
|
||||
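// the default on-premise account does not require a real token, so a placeholder is returned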
if ("gudu|0123456789".equals(userId)) {
|
||||
return "token";
|
||||
}
|
||||
String result = doPost(url, param);
|
||||
JSONObject object = JSONObject.parseObject(result);
|
||||
if ("200".equals(object.getString("code"))) {
|
||||
token = object.getString("token");
|
||||
System.out.println("get token from sqlflow successful.");
|
||||
return token;
|
||||
}
|
||||
return "";
|
||||
} catch (Exception e) {
|
||||
if (flag == 0) {
|
||||
if (url.startsWith("http:")) {
|
||||
url = url.replace("http", "https");
|
||||
}
|
||||
return getToken(url, userId,
|
||||
secretKey, 1);
|
||||
}
|
||||
if (flag == 1) {
|
||||
System.err.println("get token from sqlflow failed.");
|
||||
}
|
||||
return token;
|
||||
}
|
||||
}
|
||||
|
||||
public static String submitJob(String filePath,
|
||||
String url,
|
||||
String dbVendor,
|
||||
String userId,
|
||||
String token,
|
||||
String jobName) throws IOException {
|
||||
System.out.println("start submit job to sqlflow.");
|
||||
CloseableHttpClient httpClient = HttpClients.createDefault();
|
||||
HttpPost uploadFile = new HttpPost(url);
|
||||
MultipartEntityBuilder builder = MultipartEntityBuilder.create();
|
||||
builder.addTextBody("dbvendor", dbVendor, ContentType.TEXT_PLAIN);
|
||||
builder.addTextBody("jobName", jobName, ContentType.TEXT_PLAIN);
|
||||
builder.addTextBody("token", token, ContentType.TEXT_PLAIN);
|
||||
builder.addTextBody("userId", userId, ContentType.TEXT_PLAIN);
|
||||
File f = new File(filePath);
|
||||
builder.addBinaryBody("sqlfiles", new FileInputStream(f), ContentType.APPLICATION_OCTET_STREAM, f.getName());
|
||||
|
||||
HttpEntity multipart = builder.build();
|
||||
uploadFile.setEntity(multipart);
|
||||
CloseableHttpResponse response = httpClient.execute(uploadFile);
|
||||
HttpEntity responseEntity = response.getEntity();
|
||||
return EntityUtils.toString(responseEntity, "UTF-8");
|
||||
}
|
||||
|
||||
|
||||
public static String getStatus(String url,
|
||||
String userId,
|
||||
String token,
|
||||
String jobId) throws IOException {
|
||||
CloseableHttpClient httpClient = HttpClients.createDefault();
|
||||
HttpPost uploadFile = new HttpPost(url);
|
||||
MultipartEntityBuilder builder = MultipartEntityBuilder.create();
|
||||
builder.addTextBody("jobId", jobId, ContentType.TEXT_PLAIN);
|
||||
builder.addTextBody("token", token, ContentType.TEXT_PLAIN);
|
||||
builder.addTextBody("userId", userId, ContentType.TEXT_PLAIN);
|
||||
|
||||
HttpEntity multipart = builder.build();
|
||||
uploadFile.setEntity(multipart);
|
||||
CloseableHttpResponse response = httpClient.execute(uploadFile);
|
||||
HttpEntity responseEntity = response.getEntity();
|
||||
return EntityUtils.toString(responseEntity, "UTF-8");
|
||||
}
|
||||
|
||||
|
||||
public static String exportLineage(ExportLineageReq req) throws IOException {
|
||||
CloseableHttpClient httpClient = HttpClients.createDefault();
|
||||
HttpPost uploadFile = new HttpPost(req.getUrl());
|
||||
MultipartEntityBuilder builder = MultipartEntityBuilder.create();
|
||||
builder.addTextBody("jobId", req.getJobId(), ContentType.TEXT_PLAIN);
|
||||
builder.addTextBody("userId", req.getUserId(), ContentType.TEXT_PLAIN);
|
||||
builder.addTextBody("token", req.getToken(), ContentType.TEXT_PLAIN);
|
||||
builder.addTextBody("tableToTable", String.valueOf(req.getTableToTable()), ContentType.TEXT_PLAIN);
|
||||
|
||||
HttpEntity multipart = builder.build();
|
||||
uploadFile.setEntity(multipart);
|
||||
CloseableHttpResponse response = httpClient.execute(uploadFile);
|
||||
|
||||
HttpEntity responseEntity = response.getEntity();
|
||||
|
||||
InputStream in = responseEntity.getContent();
|
||||
FileUtil.mkFile(req.getDownloadFilePath());
|
||||
File file = new File(req.getDownloadFilePath());
|
||||
FileOutputStream fout = new FileOutputStream(file);
|
||||
int a;
|
||||
byte[] tmp = new byte[1024];
|
||||
while ((a = in.read(tmp)) != -1) {
|
||||
fout.write(tmp, 0, a);
|
||||
}
|
||||
fout.flush();
|
||||
fout.close();
|
||||
in.close();
|
||||
return "download success, path:" + req.getDownloadFilePath();
|
||||
}
|
||||
|
||||
|
||||
private static String doPost(String url, Map<String, String> param) {
|
||||
CloseableHttpClient httpClient = HttpClients.createDefault();
|
||||
CloseableHttpResponse response = null;
|
||||
String resultString = "";
|
||||
try {
|
||||
HttpPost httpPost = new HttpPost(url);
|
||||
if (param != null) {
|
||||
List<NameValuePair> paramList = new ArrayList<>();
|
||||
for (String key : param.keySet()) {
|
||||
paramList.add(new BasicNameValuePair(key, param.get(key)));
|
||||
}
|
||||
UrlEncodedFormEntity entity = new UrlEncodedFormEntity(paramList, "utf-8");
|
||||
httpPost.setEntity(entity);
|
||||
}
|
||||
response = httpClient.execute(httpPost);
|
||||
resultString = EntityUtils.toString(response.getEntity(), "utf-8");
|
||||
} catch (Exception e) {
|
||||
e.printStackTrace();
|
||||
} finally {
|
||||
try {
|
||||
response.close();
|
||||
} catch (IOException e) {
|
||||
e.printStackTrace();
|
||||
}
|
||||
}
|
||||
|
||||
return resultString;
|
||||
}
|
||||
|
||||
public static class ExportLineageReq {
|
||||
|
||||
private String jobId;
|
||||
private String userId;
|
||||
private String token;
|
||||
|
||||
private String url;
|
||||
private String downloadFilePath;
|
||||
private Boolean tableToTable = false;
|
||||
|
||||
public String getJobId() {
|
||||
return jobId;
|
||||
}
|
||||
|
||||
public void setJobId(String jobId) {
|
||||
this.jobId = jobId;
|
||||
}
|
||||
|
||||
public String getUserId() {
|
||||
return userId;
|
||||
}
|
||||
|
||||
public void setUserId(String userId) {
|
||||
this.userId = userId;
|
||||
}
|
||||
|
||||
public String getToken() {
|
||||
return token;
|
||||
}
|
||||
|
||||
public void setToken(String token) {
|
||||
this.token = token;
|
||||
}
|
||||
|
||||
public Boolean getTableToTable() {
|
||||
return tableToTable;
|
||||
}
|
||||
|
||||
public void setTableToTable(Boolean tableToTable) {
|
||||
this.tableToTable = tableToTable;
|
||||
}
|
||||
|
||||
public String getUrl() {
|
||||
return url;
|
||||
}
|
||||
|
||||
public void setUrl(String url) {
|
||||
this.url = url;
|
||||
}
|
||||
|
||||
public String getDownloadFilePath() {
|
||||
return downloadFilePath;
|
||||
}
|
||||
|
||||
public void setDownloadFilePath(String downloadFilePath) {
|
||||
this.downloadFilePath = downloadFilePath;
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,4 @@
|
|||
/Users/g7/Documents/project/sqlflow_public/api/java/src/main/java/com/gudusoft/grabit/FileUtil.java
|
||||
/Users/g7/Documents/project/sqlflow_public/api/java/src/main/java/com/gudusoft/grabit/Runner.java
|
||||
/Users/g7/Documents/project/sqlflow_public/api/java/src/main/java/com/gudusoft/grabit/SqlFlowUtil.java
|
||||
/Users/g7/Documents/project/sqlflow_public/api/java/src/main/java/com/gudusoft/grabit/DateUtil.java
|
||||
|
After Width: | Height: | Size: 34 KiB |
|
|
@ -0,0 +1,151 @@
|
|||
<?php
|
||||
|
||||
|
||||
class Grabit
|
||||
{
|
||||
|
||||
function run($argv)
|
||||
{
|
||||
if (sizeof($argv) < 2) {
|
||||
echo 'please enter the correct parameters.';
|
||||
exit(1);
|
||||
}
|
||||
|
||||
$userSecret = '';
|
||||
$userId = '';
|
||||
$dbvendor = '';
|
||||
$sqlfiles = '';
|
||||
$server = '';
|
||||
$port = '';
|
||||
$download = 1;
|
||||
for ($i = 0; $i < sizeof($argv) - 1; $i++) {
|
||||
if ($argv[$i] == '/s') {
|
||||
$server = $argv[$i + 1];
|
||||
}
|
||||
if ($argv[$i] == '/p') {
|
||||
$port = $argv[$i + 1];
|
||||
}
|
||||
if ($argv[$i] == '/f') {
|
||||
$sqlfiles = $argv[$i + 1];
|
||||
if (!file_exists($sqlfiles)) {
|
||||
echo "The file is no exists";
|
||||
exit(1);
|
||||
}
|
||||
}
|
||||
if ($argv[$i] == '/u') {
|
||||
$userId = $argv[$i + 1];
|
||||
$userId = str_replace("'", '', $userId);
|
||||
}
|
||||
if ($argv[$i] == '/t') {
|
||||
$dbvendor = 'dbv' . $argv[$i + 1];
|
||||
if ($dbvendor == 'dbvsqlserver') {
|
||||
$dbvendor = 'dbvmssql';
|
||||
}
|
||||
}
|
||||
if ($argv[$i] == '/k') {
|
||||
$userSecret = $argv[$i + 1];
|
||||
}
|
||||
if ($argv[$i] == '/r') {
|
||||
$download = $argv[$i + 1];
|
||||
}
|
||||
}
|
||||
|
||||
if (substr($server, 0, 4) !== "http" && substr($server, 0, 5) !== "https") {
|
||||
$server = "http://" . $server;
|
||||
}
|
||||
if (substr($server, -1) === '/') {
|
||||
$server = substr($server, 0, strlen($server) - 1);
|
||||
}
|
||||
if ($port != '') {
|
||||
$server = $server . ':' . $port;
|
||||
}
|
||||
|
||||
echo '===================================== start =====================================';
|
||||
echo PHP_EOL;
|
||||
|
||||
echo('start get token.');
|
||||
echo PHP_EOL;
|
||||
|
||||
include('SqlFlowUtil.php');
|
||||
$obj = new SqlFlowUtil();
|
||||
$token = $obj->getToken($server, $userId, $userSecret);
|
||||
echo 'get token successful.';
|
||||
echo PHP_EOL;
|
||||
if (is_dir($sqlfiles)) {
|
||||
if (substr($sqlfiles, -strlen(DIRECTORY_SEPARATOR)) === DIRECTORY_SEPARATOR) {
|
||||
$sqlfiles = rtrim($sqlfiles, DIRECTORY_SEPARATOR);
|
||||
}
|
||||
|
||||
$zip = new \ZipArchive();
|
||||
$sqlfileDir = $sqlfiles . '.zip';
|
||||
if (file_exists($sqlfileDir)) {
|
||||
if (PATH_SEPARATOR == ':') {
|
||||
unlink($sqlfileDir);
|
||||
} else {
|
||||
$url = iconv('utf-8', 'gbk', $sqlfileDir);
|
||||
unlink($url);
|
||||
}
|
||||
}
|
||||
|
||||
$open = $zip->open($sqlfileDir, \ZipArchive::CREATE);
|
||||
if ($open === true) {
|
||||
$this->toZip($sqlfiles, $zip);
|
||||
$zip->close();
|
||||
}
|
||||
$sqlfiles = $sqlfileDir;
|
||||
}
|
||||
|
||||
echo 'start submit job.';
|
||||
echo PHP_EOL;
|
||||
|
||||
$result = $obj->submitJob($server, $userId, $token, $sqlfiles, time(), $dbvendor);
|
||||
if ($result['code'] == 200) {
|
||||
echo 'submit job successful.';
|
||||
echo PHP_EOL;
|
||||
|
||||
$jobId = $result['data']['jobId'];
|
||||
while (true) {
|
||||
$result = $obj->getStatus($server, $userId, $token, $jobId);
|
||||
if ($result['code'] == 200) {
|
||||
$status = $result['data']['status'];
|
||||
if ($status == 'partial_success' || $status == 'success') {
|
||||
break;
|
||||
}
|
||||
if ($status == 'fail') {
|
||||
echo 'job execution failed.';
|
||||
exit(1);
|
||||
}
|
||||
}
|
||||
}
|
||||
echo $status;
|
||||
echo 'start get result from sqlflow.';
|
||||
echo PHP_EOL;
|
||||
$filePath = $obj->getResult($server, $userId, $token, $jobId, $download);
|
||||
echo 'get result from sqlflow successful. file path is : ' . $filePath;
|
||||
} else {
|
||||
echo 'submit job failed.';
|
||||
}
|
||||
echo PHP_EOL;
|
||||
echo '===================================== end =====================================';
|
||||
}
|
||||
|
||||
function toZip($path, $zip)
|
||||
{
|
||||
$handler = opendir($path);
|
||||
while (($filename = readdir($handler)) !== false) {
|
||||
if ($filename != "." && $filename != "..") {
|
||||
if (is_dir($path . DIRECTORY_SEPARATOR . $filename)) {
|
||||
$obj = new Grabit();
|
||||
$obj->toZip($path . DIRECTORY_SEPARATOR . $filename, $zip);
|
||||
} else {
|
||||
$zip->addFile($path . DIRECTORY_SEPARATOR . $filename);
|
||||
$zip->renameName($path . DIRECTORY_SEPARATOR . $filename, $filename);
|
||||
}
|
||||
}
|
||||
}
|
||||
@closedir($handler);
|
||||
}
|
||||
}
|
||||
|
||||
$obj = new Grabit();
|
||||
$obj->run($argv);
|
||||
|
|
@ -0,0 +1,110 @@
|
|||
<?php
|
||||
|
||||
|
||||
class HttpClient
|
||||
{
|
||||
|
||||
protected static $url;
|
||||
protected static $delimiter;
|
||||
|
||||
function mkdirs($a1, $mode = 0777)
|
||||
{
|
||||
if (is_dir($a1) || @mkdir($a1, $mode)) return TRUE;
|
||||
if (!static::mkdirs(dirname($a1), $mode)) return FALSE;
|
||||
return @mkdir($a1, $mode);
|
||||
}
|
||||
|
||||
|
||||
public function __construct()
|
||||
{
|
||||
static::$delimiter = uniqid();
|
||||
}
|
||||
|
||||
private static function buildData($param)
|
||||
{
|
||||
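// build a multipart/form-data body by hand: one part per plain form field, then the sqlfiles payload as an octet-stream file part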
$data = '';
|
||||
$eol = "\r\n";
|
||||
$upload = $param['sqlfiles'];
|
||||
unset($param['sqlfiles']);
|
||||
|
||||
foreach ($param as $name => $content) {
|
||||
$data .= "--" . static::$delimiter . "\r\n"
|
||||
. 'Content-Disposition: form-data; name="' . $name . "\"\r\n\r\n"
|
||||
. $content . "\r\n";
|
||||
}
|
||||
$data .= "--" . static::$delimiter . $eol
|
||||
. 'Content-Disposition: form-data; name="sqlfiles"; filename="' . $param['filename'] . '"' . "\r\n"
|
||||
. 'Content-Type:application/octet-stream' . "\r\n\r\n";
|
||||
|
||||
$data .= $upload . "\r\n";
|
||||
$data .= "--" . static::$delimiter . "--\r\n";
|
||||
return $data;
|
||||
}
|
||||
|
||||
function postFile($url, $param)
|
||||
{
|
||||
$post_data = static::buildData($param);
|
||||
$curl = curl_init($url);
|
||||
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
|
||||
curl_setopt($curl, CURLOPT_POST, true);
|
||||
curl_setopt($curl, CURLOPT_POSTFIELDS, $post_data);
|
||||
curl_setopt($curl, CURLOPT_HTTPHEADER, [
|
||||
"Content-Type: multipart/form-data; boundary=" . static::$delimiter,
|
||||
"Content-Length: " . strlen($post_data)
|
||||
]);
|
||||
$response = curl_exec($curl);
|
||||
curl_close($curl);
|
||||
$info = json_decode($response, true);
|
||||
return $info;
|
||||
}
|
||||
|
||||
|
||||
function postFrom($url, $data)
|
||||
{
|
||||
$headers = array('Content-Type: application/x-www-form-urlencoded');
|
||||
$curl = curl_init();
|
||||
curl_setopt($curl, CURLOPT_URL, $url);
|
||||
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
|
||||
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
|
||||
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
|
||||
curl_setopt($curl, CURLOPT_AUTOREFERER, 1);
|
||||
curl_setopt($curl, CURLOPT_POST, 1);
|
||||
curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($data));
|
||||
curl_setopt($curl, CURLOPT_TIMEOUT, 30);
|
||||
curl_setopt($curl, CURLOPT_HEADER, 0);
|
||||
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
|
||||
curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
|
||||
$result = curl_exec($curl);
|
||||
if (curl_errno($curl)) {
|
||||
return 'Errno' . curl_error($curl);
|
||||
}
|
||||
curl_close($curl);
|
||||
return json_decode($result, true);
|
||||
}
|
||||
|
||||
|
||||
function postJson($url, $data, $filePath)
|
||||
{
|
||||
$headers = array('Content-Type: application/x-www-form-urlencoded');
|
||||
$curl = curl_init();
|
||||
curl_setopt($curl, CURLOPT_URL, $url);
|
||||
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
|
||||
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
|
||||
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
|
||||
curl_setopt($curl, CURLOPT_AUTOREFERER, 1);
|
||||
curl_setopt($curl, CURLOPT_POST, 1);
|
||||
curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($data));
|
||||
curl_setopt($curl, CURLOPT_TIMEOUT, 30);
|
||||
curl_setopt($curl, CURLOPT_HEADER, 0);
|
||||
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
|
||||
curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
|
||||
$result = curl_exec($curl);
|
||||
if (curl_errno($curl)) {
|
||||
return 'Errno' . curl_error($curl);
|
||||
}
|
||||
$fp = @fopen($filePath, "a");
|
||||
fwrite($fp, $result);
|
||||
fclose($fp);
|
||||
}
|
||||
|
||||
}
|
||||
|
|
@ -0,0 +1,72 @@
|
|||
<?php
|
||||
include('HttpClient.php');
|
||||
|
||||
class SqlFlowUtil
|
||||
{
|
||||
function getToken($server, $userId, $userSecret)
|
||||
{
|
||||
if ($userId == 'gudu|0123456789') {
|
||||
return 'token';
|
||||
}
|
||||
|
||||
$httpVendor = new HttpClient();
|
||||
$json['userId'] = $userId;
|
||||
$json['secretKey'] = $userSecret;
|
||||
$url = $server . '/gspLive_backend/user/generateToken';
|
||||
$result = $httpVendor->postFrom($url, $json);
|
||||
return $result['token'];
|
||||
}
|
||||
|
||||
function submitJob($server, $userId, $token, $sqlfiles, $jobName, $dbvendor)
|
||||
{
|
||||
$httpVendor = new HttpClient();
|
||||
$params = array(
|
||||
'userId' => $userId,
|
||||
'token' => $token,
|
||||
'jobName' => $jobName,
|
||||
'dbvendor' => $dbvendor,
|
||||
'filename' => $jobName,
|
||||
'sqlfiles' => file_get_contents($sqlfiles)
|
||||
);
|
||||
$url = $server . '/gspLive_backend/sqlflow/job/submitUserJob';
|
||||
$result = $httpVendor->postFile($url, $params);
|
||||
return $result;
|
||||
}
|
||||
|
||||
function getStatus($server, $userId, $token, $jobId)
|
||||
{
|
||||
$httpVendor = new HttpClient();
|
||||
$json['userId'] = $userId;
|
||||
$json['token'] = $token;
|
||||
$json['jobId'] = $jobId;
|
||||
$url = $server . '/gspLive_backend/sqlflow/job/displayUserJobSummary';
|
||||
$result = $httpVendor->postFrom($url, $json);
|
||||
return $result;
|
||||
}
|
||||
|
||||
function getResult($server, $userId, $token, $jobId, $download)
|
||||
{
|
||||
$dir = 'data' . DIRECTORY_SEPARATOR . 'result';
|
||||
$str = $dir . DIRECTORY_SEPARATOR . date("Ymd") . '_' . $jobId;
|
||||
$filePath = '';
|
||||
$url = '';
|
||||
if ($download == 1) {
|
||||
$url = $server . '/gspLive_backend/sqlflow/job/exportLineageAsJson';
|
||||
$filePath = $str . '_json.json';
|
||||
} else if ($download == 2) {
|
||||
$url = $server . '/gspLive_backend/sqlflow/job/exportLineageAsGraphml';
|
||||
$filePath = $str . '_graphml.graphml';
|
||||
} else if ($download == 3) {
|
||||
$url = $server . '/gspLive_backend/sqlflow/job/exportLineageAsCsv';
|
||||
$filePath = $str . '_csv.csv';
|
||||
}
|
||||
|
||||
$httpVendor = new HttpClient();
|
||||
$json['userId'] = $userId;
|
||||
$json['token'] = $token;
|
||||
$json['jobId'] = $jobId;
|
||||
$httpVendor->mkdirs($dir);
|
||||
$httpVendor->postJson($url, $json, $filePath);
|
||||
return $filePath;
|
||||
}
|
||||
}
|
||||
|
After Width: | Height: | Size: 196 KiB |
|
|
@ -0,0 +1,56 @@
|
|||
-- sql server sample sql
|
||||
CREATE TABLE dbo.EmployeeSales
|
||||
( DataSource varchar(20) NOT NULL,
|
||||
BusinessEntityID varchar(11) NOT NULL,
|
||||
LastName varchar(40) NOT NULL,
|
||||
SalesDollars money NOT NULL
|
||||
);
|
||||
GO
|
||||
CREATE PROCEDURE dbo.uspGetEmployeeSales
|
||||
AS
|
||||
SET NOCOUNT ON;
|
||||
SELECT 'PROCEDURE', sp.BusinessEntityID, c.LastName,
|
||||
sp.SalesYTD
|
||||
FROM Sales.SalesPerson AS sp
|
||||
INNER JOIN Person.Person AS c
|
||||
ON sp.BusinessEntityID = c.BusinessEntityID
|
||||
WHERE sp.BusinessEntityID LIKE '2%'
|
||||
ORDER BY sp.BusinessEntityID, c.LastName;
|
||||
GO
|
||||
--INSERT...SELECT example
|
||||
INSERT INTO dbo.EmployeeSales
|
||||
SELECT 'SELECT', sp.BusinessEntityID, c.LastName, sp.SalesYTD
|
||||
FROM Sales.SalesPerson AS sp
|
||||
INNER JOIN Person.Person AS c
|
||||
ON sp.BusinessEntityID = c.BusinessEntityID
|
||||
WHERE sp.BusinessEntityID LIKE '2%'
|
||||
ORDER BY sp.BusinessEntityID, c.LastName;
|
||||
GO
|
||||
|
||||
|
||||
CREATE VIEW hiredate_view
|
||||
AS
|
||||
SELECT p.FirstName, p.LastName, e.BusinessEntityID, e.HireDate
|
||||
FROM HumanResources.Employee e
|
||||
JOIN Person.Person AS p ON e.BusinessEntityID = p.BusinessEntityID ;
|
||||
GO
|
||||
|
||||
CREATE VIEW view1
|
||||
AS
|
||||
SELECT fis.CustomerKey, fis.ProductKey, fis.OrderDateKey,
|
||||
fis.SalesTerritoryKey, dst.SalesTerritoryRegion
|
||||
FROM FactInternetSales AS fis
|
||||
LEFT OUTER JOIN DimSalesTerritory AS dst
|
||||
ON (fis.SalesTerritoryKey=dst.SalesTerritoryKey);
|
||||
|
||||
GO
|
||||
SELECT ROW_NUMBER() OVER(PARTITION BY PostalCode ORDER BY SalesYTD DESC) AS "Row Number",
|
||||
p.LastName, s.SalesYTD, a.PostalCode
|
||||
FROM Sales.SalesPerson AS s
|
||||
INNER JOIN Person.Person AS p
|
||||
ON s.BusinessEntityID = p.BusinessEntityID
|
||||
INNER JOIN Person.Address AS a
|
||||
ON a.AddressID = p.BusinessEntityID
|
||||
WHERE TerritoryID IS NOT NULL
|
||||
AND SalesYTD <> 0
|
||||
ORDER BY PostalCode;
|
||||
|
After Width: | Height: | Size: 54 KiB |
|
|
@ -0,0 +1,192 @@
|
|||
## PHP Data lineage: using the SQLFlow REST API (Advanced)
|
||||
|
||||
This article illustrates how to discover the data lineage using PHP and the SQLFlow REST API.
|
||||
|
||||
By using the SQLFlow REST API, you can code in PHP to discover the data lineage in SQL scripts
|
||||
and get the result in an actionable diagram, json, csv or graphml format.
|
||||
|
||||
You can integrate the PHP code provided here into your own project and add the powerful
|
||||
data lineage analysis capability instantly.
|
||||
|
||||
### 1. interactive data lineage visualizations
|
||||

|
||||
|
||||
### 2. [Data lineage in JSON format](php-data-lineage-result.json)
|
||||
|
||||
### 3. Data lineage in CSV, graphml format
|
||||
|
||||
|
||||
## Prerequisites
|
||||
- [SQLFlow Cloud Server or on-premise version](https://github.com/sqlparser/sqlflow_public/tree/master/api#prerequisites)
|
||||
|
||||
- PHP 7.3 or higher version must be installed and configured correctly.
|
||||
|
||||
- Install the ZIP extension
|
||||
|
||||
**mac**
|
||||
|
||||
````
|
||||
wget http://pecl.php.net/get/zip-1.12.4.tgz
|
||||
|
||||
tar zxfv zip-1.12.4.tgz
|
||||
|
||||
cd zip-1.12.4
|
||||
|
||||
sudo mount -uw /
|
||||
|
||||
sudo ln -s /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/ /usr
|
||||
|
||||
sudo phpize
|
||||
|
||||
which php-config   # prints the path, e.g. /usr/bin/php-config
|
||||
|
||||
./configure --with-php-config=/usr/bin/php-config
|
||||
|
||||
sudo mount -uw /
|
||||
|
||||
sudo make
|
||||
|
||||
sudo make install
|
||||
|
||||
cd /usr/lib/php/extensions/no-debug-non-zts-20180731
|
||||
|
||||
sudo cp /private/etc/php.ini.default php.ini
|
||||
|
||||
chmod 777 php.ini
|
||||
|
||||
sudo vim php.ini   # add the line: extension=zip.so
|
||||
|
||||
sudo apachectl restart
|
||||
````
|
||||
|
||||
**linux**
|
||||
|
||||
````
|
||||
wget http://pecl.php.net/get/zip-1.12.4.tgz
|
||||
|
||||
tar zxfv zip-1.12.4.tgz
|
||||
|
||||
cd zip-1.12.4
|
||||
|
||||
sudo phpize
|
||||
|
||||
which php-config   # prints the path, e.g. /usr/bin/php-config
|
||||
|
||||
./configure --with-php-config=/usr/bin/php-config
|
||||
|
||||
sudo make
|
||||
|
||||
sudo make install
|
||||
|
||||
cd /usr/lib/php/extensions/no-debug-non-zts-20180731
|
||||
|
||||
sudo vi /usr/local/php/etc/php.ini   # add the line: extension=zip.so
|
||||
|
||||
sudo apachectl restart
|
||||
````
|
||||
|
||||
#### [Reference Documentation](https://www.php.net/manual/en/install.pecl.phpize.php)
|
||||
|
||||
### Usage
|
||||
|
||||
````
|
||||
php Grabit.php /s server /p port /u userId /k userSecret /t databaseType /f path_to_config_file /r resultType
|
||||
|
||||
eg:
|
||||
php Grabit.php /u 'auth0|xxx' /k cab9712c45189014a94a8b7aceeef7a3db504be58e18cd3686f3bbefd078ef4d /s https://api.gudusoft.com /t oracle /f demo.sql /r 1
|
||||
|
||||
note:
|
||||
If a parameter value contains symbols like "|", it must be enclosed in single quotes (' ')
|
||||
````
|
||||
|
||||
Example:
|
||||
|
||||
1. Connect to the SQLFlow Cloud Server
|
||||
```
|
||||
php Grabit.php /s https://api.gudusoft.com /u 'YOUR_USER_ID' /k YOUR_SECRET_KEY /t sqlserver /f PHP-data-lineage-sqlserver.sql /r 1
|
||||
```
|
||||
|
||||
2. Connect to the SQLFlow on-premise
|
||||
This will discover data lineage by analyzing the `PHP-data-lineage-sqlserver.sql` file. You may also specify a zip file which includes lots of SQL files.
|
||||
```
|
||||
php Grabit.php /s http://127.0.0.1 /p 8081 /u 'gudu|0123456789' /t sqlserver /f PHP-data-lineage-sqlserver.sql /r 1
|
||||
```
|
||||
|
||||
This will discover data lineage by analyzing all SQL files under `sqlfiles` directory.
|
||||
```
|
||||
php Grabit.php /s http://127.0.0.1 /p 8081 /u 'gudu|0123456789' /t mysql /f sqlfiles /r 1
|
||||
```
|
||||
|
||||
### Parameters
|
||||
|
||||
- **path_to_config_file**
|
||||
|
||||
This can be a single SQL file, a zip file including multiple SQL files, or a directory including lots of SQL files.
|
||||
|
||||
- **server**
|
||||
|
||||
Usually, it is the IP address of [the SQLFlow on-premise version](https://www.gudusoft.com/sqlflow-on-premise-version/)
|
||||
installed on your own servers, such as `127.0.0.1` or `http://127.0.0.1`
|
||||
|
||||
You may set the value to `https://api.gudusoft.com` if you like to send your SQL script to [the SQLFlow Cloud Server](https://sqlflow.gudusoft.com) to get the data lineage result.
|
||||
|
||||
- **port**
|
||||
|
||||
The default value is `8081` if you connect to your SQLFlow on-premise server.
|
||||
|
||||
However, if you set up an nginx reverse proxy in the nginx configuration file like this:
|
||||
```
|
||||
location /api/ {
|
||||
proxy_pass http://127.0.0.1:8081/;
|
||||
proxy_connect_timeout 600s ;
|
||||
proxy_read_timeout 600s;
|
||||
proxy_send_timeout 600s;
|
||||
|
||||
proxy_set_header Host $host;
|
||||
proxy_set_header X-Real-IP $remote_addr;
|
||||
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
||||
proxy_set_header User-Agent $http_user_agent;
|
||||
}
|
||||
```
|
||||
Then, keep the value of `serverPort` empty and set `server` to the value like this: `http://127.0.0.1/api`.
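With that proxy in place, the demo can then be run through the `/api` path without the `/p` flag (a sketch composed from the examples in this article; adjust the SQL file and database type to your own case):
```
php Grabit.php /s http://127.0.0.1/api /u 'gudu|0123456789' /t sqlserver /f PHP-data-lineage-sqlserver.sql /r 1
```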
|
||||
|
||||
>Please keep this value empty if you connect to the SQLFlow Cloud Server by specifying the `https://api.gudusoft.com`
|
||||
in the `server`
|
||||
>
|
||||
- **userId, userSecret**
|
||||
|
||||
This is the user id that is used to connect to the SQLFlow server.
|
||||
Always set this value to `gudu|0123456789` and keep `userSecret` empty if you use the SQLFlow on-premise version.
|
||||
|
||||
If you want to connect to [the SQLFlow Cloud Server](https://sqlflow.gudusoft.com), you may [request a 30 days premium account](https://www.gudusoft.com/request-a-premium-account/) to
|
||||
[get the necessary userId and secret code](/sqlflow-userid-secret.md).
|
||||
|
||||
|
||||
- **databaseType**
|
||||
|
||||
This parameter specifies the database dialect of the SQL scripts that the SQLFlow has analyzed.
|
||||
|
||||
```txt
|
||||
access,bigquery,couchbase,dax,db2,greenplum,hana,hive,impala,informix,mdx,mssql,
|
||||
sqlserver,mysql,netezza,odbc,openedge,oracle,postgresql,postgres,redshift,snowflake,
|
||||
sybase,teradata,soql,vertica
|
||||
```
|
||||
|
||||
- **resultType**
|
||||
|
||||
When you submit a SQL script to the SQLFlow server, a job is created on the SQLFlow server
|
||||
and you can always see the graphical data lineage result in the browser.
|
||||
|
||||
|
||||
Even better, this demo will fetch the data lineage back to the directory where the demo is running.
|
||||
Those data lineage results are stored in the `data/result/` directory.
|
||||
|
||||
This parameter specifies which kind of format is used to save the data lineage result.
|
||||
|
||||
Available values for this parameter:
|
||||
- 1: JSON, data lineage result in JSON.
|
||||
- 2: diagram, in graphml format that can be viewed by yEd.
|
||||
- 3: CSV, data lineage result in CSV format.
|
||||
|
||||
### SQLFlow REST API
|
||||
Please check here for the detailed information about the [SQLFlow REST API](https://github.com/sqlparser/sqlflow_public/tree/master/api/sqlflow_api.md)
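For reference, a minimal sketch of the job submission that `Grabit.php` performs. The endpoint and multipart fields are taken from `SqlFlowUtil.php` and `HttpClient.php` in this repository; the file name, job name and dbvendor value are placeholders:
```
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/job/submitUserJob" \
     -F "userId=YOUR_USER_ID" \
     -F "token=YOUR_TOKEN" \
     -F "dbvendor=dbvmssql" \
     -F "jobName=demo_job" \
     -F "sqlfiles=@PHP-data-lineage-sqlserver.sql"
```
The returned `jobId` is then polled against `/gspLive_backend/sqlflow/job/displayUserJobSummary` until the status becomes `success` or `partial_success`, and the lineage is finally exported via one of the `exportLineageAs*` endpoints.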
|
||||
|
|
@ -0,0 +1,77 @@
|
|||
"""
|
||||
* Parses the relationship chains in the JSON-format lineage returned by the SQLFlow exportLineageAsJson API.
|
||||
*
|
||||
* For example, the lineage data in this demo is parsed into the following chains;
|
||||
* the goal is a list containing two elements:
|
||||
* SCOTT.DEPT -> SCOTT.EMP->VSAL
|
||||
* SCOTT.EMP->VSAL
|
||||
"""
|
||||
|
||||
import json
|
||||
|
||||
class Node:
|
||||
def __init__(self, value, node_id):
|
||||
self.value = value
|
||||
self.id = node_id
|
||||
self.next = None
|
||||
|
||||
def key(self):
|
||||
node = self.next
|
||||
key = self.id
|
||||
while node:
|
||||
key += node.id
|
||||
node = node.next
|
||||
return key
|
||||
|
||||
def main():
|
||||
input_data = '{"jobId":"d9550e491c024d0cbe6e1034604aca17","code":200,"data":{"mode":"global","sqlflow":{"relationship":[{"sources":[{"parentName":"ORDERS","column":"TABLE","coordinates":[],"id":"10000106","parentId":"86"}],"id":"1000012311","type":"fdd","target":{"parentName":"SPECIAL_ORDERS","column":"TABLE","coordinates":[],"id":"10000102","parentId":"82"}},{"sources":[{"parentName":"CUSTOMERS","column":"TABLE","coordinates":[],"id":"10000103","parentId":"94"}],"id":"1000012312","type":"fdd","target":{"parentName":"SPECIAL_ORDERS","column":"TABLE","coordinates":[],"id":"10000102","parentId":"82"}}]}},"sessionId":"8bb7d3da4b687bb7badf01608a739fbebd61309cd5a643cecf079d122095738a_1685604216451"}'
|
||||
try:
|
||||
data = json.loads(input_data)
|
||||
relationship_node = data["data"]["sqlflow"]["relationship"]
|
||||
data_list = relationship_node
|
||||
|
||||
value = []
|
||||
node_map = {}
|
||||
for data_item in data_list:
|
||||
sources = data_item["sources"]
|
||||
target_node = data_item["target"]
|
||||
target = Node(target_node["parentName"], target_node["parentId"])
|
||||
if sources:
|
||||
for source in sources:
|
||||
parent_id = source["parentId"]
|
||||
parent_name = source["parentName"]
|
||||
source_node = Node(parent_name, parent_id)
|
||||
source_node.next = target
|
||||
value.append(source_node)
|
||||
node_map[parent_id] = source_node
|
||||
else:
|
||||
value.append(target)
|
||||
node_map[target_node["parentId"]] = target
|
||||
|
||||
for node in value:
|
||||
next_node = node.next
|
||||
if next_node:
|
||||
next_id = next_node.id
|
||||
next_node = node_map.get(next_id)
|
||||
if next_node:
|
||||
node.next = next_node
|
||||
|
||||
# de-duplicate chains by their key (Python list iterators have no remove(); rebuild the list instead)
key_set = set()
deduped = []
for node in value:
    k = node.key()
    if k not in key_set:
        deduped.append(node)
        key_set.add(k)
value = deduped
|
||||
|
||||
# build printable chain strings such as "SCOTT.DEPT -> SCOTT.EMP -> VSAL"
chains = []
for node in value:
    parts = [node.value]
    nxt = node.next
    while nxt:
        parts.append(nxt.value)
        nxt = nxt.next
    chains.append(" -> ".join(parts))
print(chains)
|
||||
except json.JSONDecodeError as e:
|
||||
print(e)
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
|
|
@ -0,0 +1,45 @@
|
|||
#!/usr/bin/python
|
||||
# -*- coding: UTF-8 -*-
|
||||
import requests
|
||||
|
||||
import json
|
||||
|
||||
|
||||
def getToken(sys, userId, server, port):
|
||||
if len(sys.argv) < 2:
|
||||
print('Please enter the args.')
|
||||
sys.exit(0)
|
||||
|
||||
url = '/gspLive_backend/user/generateToken'
|
||||
secretKey = ''
|
||||
for i in range(1, len(sys.argv)):
|
||||
if sys.argv[i] == '/k':
|
||||
try:
|
||||
if sys.argv[i + 1] is not None:
|
||||
secretKey = sys.argv[i + 1]
|
||||
except Exception:
|
||||
print(
|
||||
'Please enter the secretKey, the secret key of the sqlflow user for web API requests, required true. eg: /k xxx')
|
||||
sys.exit(0)
|
||||
|
||||
if port != '':
|
||||
url = server + ':' + port + url
|
||||
else:
|
||||
url = server + url
|
||||
mapA = {'secretKey': secretKey, 'userId': userId}
|
||||
header_dict = {"Content-Type": "application/x-www-form-urlencoded"}
|
||||
|
||||
print('start get token.')
|
||||
try:
|
||||
r = requests.post(url, data=mapA, headers=header_dict)
|
||||
except Exception:
|
||||
print('get token failed.')
|
||||
sys.exit(0)
|
||||
result = json.loads(r.text)
|
||||
|
||||
if result['code'] == '200':
|
||||
print('get token successful.')
|
||||
return result['token']
|
||||
else:
|
||||
print(result['error'])
|
||||
sys.exit(0)
|
||||
|
|
@ -0,0 +1,32 @@
|
|||
#!/usr/bin/python
|
||||
# -*- coding: UTF-8 -*-
|
||||
import requests
|
||||
|
||||
import json
|
||||
import sys
|
||||
|
||||
|
||||
def getStatus(userId, token, server, port, jobId):
|
||||
url = "/gspLive_backend/sqlflow/job/displayUserJobSummary"
|
||||
|
||||
if port != '':
|
||||
url = server + ':' + port + url
|
||||
else:
|
||||
url = server + url
|
||||
|
||||
data = {'jobId': jobId, 'token': token, 'userId': userId}
|
||||
datastr = json.dumps(data)
|
||||
|
||||
try:
|
||||
response = requests.post(url, data=data)
|
||||
except Exception:
|
||||
print('get job status from sqlflow failed.')
|
||||
sys.exit(0)
|
||||
|
||||
result = json.loads(response.text)
|
||||
if result['code'] == 200:
|
||||
status = result['data']['status']
|
||||
if status == 'fail':
|
||||
print(result['data']['errorMessage'])
|
||||
sys.exit(0)
|
||||
return status
|
||||
|
|
@ -0,0 +1,51 @@
|
|||
#!/usr/bin/python
|
||||
# -*- coding: UTF-8 -*-
|
||||
import requests
|
||||
|
||||
import json
|
||||
import sys
|
||||
import os
|
||||
|
||||
|
||||
def getResult(download, userId, token, server, port, jobId, filePath):
|
||||
sep = 'data' + os.sep + 'result' + os.sep
|
||||
filePath = filePath + '_' + jobId
|
||||
if download == 'json':
|
||||
url = "/gspLive_backend/sqlflow/job/exportLineageAsJson"
|
||||
filePath = sep + filePath + '_json.json'
|
||||
elif download == 'graphml':
|
||||
url = "/gspLive_backend/sqlflow/job/exportLineageAsGraphml"
|
||||
filePath = sep + filePath + '_graphml.graphml'
|
||||
elif download == 'csv':
|
||||
url = "/gspLive_backend/sqlflow/job/exportLineageAsCsv"
|
||||
filePath = sep + filePath + '_csv.csv'
|
||||
else:
|
||||
print('Please enter the correct output type.')
|
||||
sys.exit(0)
|
||||
|
||||
if port != '':
|
||||
url = server + ':' + port + url
|
||||
else:
|
||||
url = server + url
|
||||
|
||||
data = {'jobId': jobId, 'token': token, 'userId': userId, 'tableToTable': 'false'}
|
||||
datastr = json.dumps(data)
|
||||
|
||||
print('start download result to sqlflow.')
|
||||
try:
|
||||
response = requests.post(url, data=eval(datastr))
|
||||
except Exception:
|
||||
print('download result to sqlflow failed.')
|
||||
sys.exit(0)
|
||||
|
||||
if not os.path.exists(sep):
|
||||
os.makedirs(sep)
|
||||
|
||||
try:
|
||||
with open(filePath, 'wb') as f:
|
||||
f.write(response.content)
|
||||
except Exception:
|
||||
print('failed to write the result file:', filePath)
|
||||
sys.exit(0)
|
||||
|
||||
print('download result to sqlflow successful. file path is', filePath)
|
||||
|
|
@ -0,0 +1,130 @@
|
|||
#!/usr/bin/python
|
||||
# -*- coding: UTF-8 -*-
|
||||
import os
|
||||
import sys
|
||||
import GetGenerateToken
|
||||
import SubmitJob
|
||||
import time
|
||||
import GetResultToSqlflow
|
||||
import GetJobStatus
|
||||
import datetime
|
||||
|
||||
if __name__ == '__main__':
|
||||
|
||||
print('========================================grabit-python======================================')
|
||||
|
||||
userId = ''
|
||||
dbvendor = ''
|
||||
sqlfiles = ''
|
||||
server = ''
|
||||
port = ''
|
||||
download = ''
|
||||
|
||||
for i in range(1, len(sys.argv)):
|
||||
if sys.argv[i] == '/u':
|
||||
try:
|
||||
if sys.argv[i + 1] is not None:
|
||||
userId = sys.argv[i + 1]
|
||||
else:
|
||||
print(
|
||||
'Please enter the userId, the user id of sqlflow web or client, required true. eg: /u gudu|0123456789')
|
||||
sys.exit(0)
|
||||
except Exception:
|
||||
print(
|
||||
'Please enter the userId, the user id of sqlflow web or client, required true. eg: /u gudu|0123456789')
|
||||
if sys.argv[i] == '/t':
|
||||
try:
|
||||
if sys.argv[i + 1] is not None:
|
||||
dbvendor = sys.argv[i + 1]
|
||||
else:
|
||||
print(
|
||||
'Please enter the dbvendor.')
|
||||
sys.exit(0)
|
||||
except Exception:
|
||||
print(
|
||||
'Please enter the dbvendor.')
|
||||
if sys.argv[i] == '/f':
|
||||
try:
|
||||
if sys.argv[i + 1] is not None:
|
||||
sqlfiles = sys.argv[i + 1]
|
||||
else:
|
||||
print(
|
||||
'Please enter the sqlfiles,request sql files, please use multiple parts to submit the sql files, required true. eg: /f path')
|
||||
sys.exit(0)
|
||||
except Exception:
|
||||
print(
|
||||
'Please enter the sqlfiles,request sql files, please use multiple parts to submit the sql files, required true. eg: /f path')
|
||||
if sys.argv[i] == '/s':
|
||||
try:
|
||||
if sys.argv[i + 1] is not None:
|
||||
server = sys.argv[i + 1]
|
||||
else:
|
||||
print('Please enter the server. eg: /s https://api.gudusoft.com or /s https://127.0.0.1')
|
||||
sys.exit(0)
|
||||
except Exception:
|
||||
print('Please enter the server. eg: /s https://api.gudusoft.com or /s https://127.0.0.1')
|
||||
sys.exit(0)
|
||||
if sys.argv[i] == '/p':
|
||||
try:
|
||||
if sys.argv[i + 1] is not None:
|
||||
port = sys.argv[i + 1]
|
||||
except Exception:
|
||||
print('Please enter the port. eg: /p 8081')
|
||||
sys.exit(0)
|
||||
if sys.argv[i] == '/r':
|
||||
try:
|
||||
if sys.argv[i + 1] is not None:
|
||||
download = sys.argv[i + 1]
|
||||
except Exception:
|
||||
print('Please enter the download type to sqlflow,type 1:json 2:csv 3:diagram : eg: /r 1')
|
||||
sys.exit(0)
|
||||
|
||||
if userId == '':
|
||||
print('Please enter the userId, the user id of sqlflow web or client, required true. eg: /u gudu|0123456789')
|
||||
sys.exit(0)
|
||||
if dbvendor == '':
|
||||
print(
|
||||
'Please enter the dbvendor,available values:bigquery,couchbase,db2,greenplum,hana,hive,impala,informix,mdx,mysql,netezza,openedge,oracle,postgresql,redshift,snowflake,mssql,sybase,teradata,vertica. eg: /t oracle')
|
||||
sys.exit(0)
|
||||
|
||||
if dbvendor == 'sqlserver':
|
||||
dbvendor = 'mssql'
|
||||
|
||||
dbvendor = 'dbv' + dbvendor
|
||||
|
||||
if sqlfiles == '':
|
||||
print(
|
||||
'Please enter the sqlfiles,request sql files, please use multiple parts to submit the sql files, required true. eg: /f path')
|
||||
sys.exit(0)
|
||||
if server == '':
|
||||
print('Please enter the server. eg: /s https://api.gudusoft.com or /s https://127.0.0.1')
|
||||
sys.exit(0)
|
||||
|
||||
if server.find('http:') == -1 and server.find('https:') == -1:
|
||||
server = 'http://' + server
|
||||
|
||||
if server.endswith('/'):
|
||||
server = server[:-1]
|
||||
|
||||
if server == 'https://sqlflow.gudusoft.com':
|
||||
server = 'https://api.gudusoft.com'
|
||||
|
||||
if userId == 'gudu|0123456789':
|
||||
token = 'token'
|
||||
else:
|
||||
token = GetGenerateToken.getToken(sys, userId, server, port)
|
||||
|
||||
time_ = datetime.datetime.now().strftime('%Y%m%d')
|
||||
|
||||
jobId = SubmitJob.toSqlflow(userId, token, server, port, time_, dbvendor, sqlfiles)
|
||||
|
||||
if download != '':
|
||||
while True:
|
||||
status = GetJobStatus.getStatus(userId, token, server, port, jobId)
|
||||
if status == 'partial_success' or status == 'success':
|
||||
GetResultToSqlflow.getResult(download, userId, token, server, port, jobId, time_)
|
||||
break
|
||||
|
||||
print('========================================grabit-python======================================')
|
||||
|
||||
sys.exit(0)
|
||||
|
|
@ -0,0 +1,57 @@
|
|||
#!/usr/bin/python
|
||||
# -*- coding: UTF-8 -*-
|
||||
import zipfile
|
||||
|
||||
import requests
|
||||
|
||||
import json
|
||||
import sys
|
||||
import os
|
||||
|
||||
|
||||
def toSqlflow(userId, token, server, port, jobName, dbvendor, sqlfiles):
|
||||
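# Submit the SQL file (or a zipped directory of SQL files) to the SQLFlow server and return the job id.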
url = '/gspLive_backend/sqlflow/job/submitUserJob'
|
||||
|
||||
if port != '':
|
||||
url = server + ':' + port + url
|
||||
else:
|
||||
url = server + url
|
||||
|
||||
if os.path.isdir(sqlfiles):
|
||||
sqlfiles = toZip(sqlfiles)
|
||||
files = {'sqlfiles': open(sqlfiles, 'rb')}
|
||||
data = {'dbvendor': dbvendor, 'jobName': jobName, 'token': token, 'userId': userId}
|
||||
datastr = json.dumps(data)
|
||||
|
||||
print('start submit job to sqlflow.')
|
||||
|
||||
try:
|
||||
response = requests.post(url, data=eval(datastr), files=files)
|
||||
except Exception:
|
||||
print('submit job to sqlflow failed.')
|
||||
sys.exit(0)
|
||||
|
||||
result = json.loads(response.text)
|
||||
|
||||
if result['code'] == 200:
|
||||
print('submit job to sqlflow successful.')
|
||||
return result['data']['jobId']
|
||||
else:
|
||||
print(result['error'])
|
||||
sys.exit(0)
|
||||
|
||||
|
||||
def toZip(start_dir):
|
||||
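# Pack every file under start_dir into <start_dir>.zip, keeping the relative paths.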
if start_dir.endswith(os.sep):
|
||||
start_dir = start_dir[:-1]
|
||||
file_news = start_dir + '.zip'
|
||||
|
||||
z = zipfile.ZipFile(file_news, 'w', zipfile.ZIP_DEFLATED)
|
||||
for dir_path, dir_names, file_names in os.walk(start_dir):
|
||||
f_path = dir_path.replace(start_dir, '')
|
||||
f_path = f_path and f_path + os.sep or ''
|
||||
for filename in file_names:
|
||||
z.write(os.path.join(dir_path, filename), f_path + filename)
|
||||
z.close()
|
||||
return file_news
|
||||
|
After Width: | Height: | Size: 196 KiB |
|
|
@ -0,0 +1,56 @@
|
|||
-- sql server sample sql
|
||||
CREATE TABLE dbo.EmployeeSales
|
||||
( DataSource varchar(20) NOT NULL,
|
||||
BusinessEntityID varchar(11) NOT NULL,
|
||||
LastName varchar(40) NOT NULL,
|
||||
SalesDollars money NOT NULL
|
||||
);
|
||||
GO
|
||||
CREATE PROCEDURE dbo.uspGetEmployeeSales
|
||||
AS
|
||||
SET NOCOUNT ON;
|
||||
SELECT 'PROCEDURE', sp.BusinessEntityID, c.LastName,
|
||||
sp.SalesYTD
|
||||
FROM Sales.SalesPerson AS sp
|
||||
INNER JOIN Person.Person AS c
|
||||
ON sp.BusinessEntityID = c.BusinessEntityID
|
||||
WHERE sp.BusinessEntityID LIKE '2%'
|
||||
ORDER BY sp.BusinessEntityID, c.LastName;
|
||||
GO
|
||||
--INSERT...SELECT example
|
||||
INSERT INTO dbo.EmployeeSales
|
||||
SELECT 'SELECT', sp.BusinessEntityID, c.LastName, sp.SalesYTD
|
||||
FROM Sales.SalesPerson AS sp
|
||||
INNER JOIN Person.Person AS c
|
||||
ON sp.BusinessEntityID = c.BusinessEntityID
|
||||
WHERE sp.BusinessEntityID LIKE '2%'
|
||||
ORDER BY sp.BusinessEntityID, c.LastName;
|
||||
GO
|
||||
|
||||
|
||||
CREATE VIEW hiredate_view
|
||||
AS
|
||||
SELECT p.FirstName, p.LastName, e.BusinessEntityID, e.HireDate
|
||||
FROM HumanResources.Employee e
|
||||
JOIN Person.Person AS p ON e.BusinessEntityID = p.BusinessEntityID ;
|
||||
GO
|
||||
|
||||
CREATE VIEW view1
|
||||
AS
|
||||
SELECT fis.CustomerKey, fis.ProductKey, fis.OrderDateKey,
|
||||
fis.SalesTerritoryKey, dst.SalesTerritoryRegion
|
||||
FROM FactInternetSales AS fis
|
||||
LEFT OUTER JOIN DimSalesTerritory AS dst
|
||||
ON (fis.SalesTerritoryKey=dst.SalesTerritoryKey);
|
||||
|
||||
GO
|
||||
SELECT ROW_NUMBER() OVER(PARTITION BY PostalCode ORDER BY SalesYTD DESC) AS "Row Number",
|
||||
p.LastName, s.SalesYTD, a.PostalCode
|
||||
FROM Sales.SalesPerson AS s
|
||||
INNER JOIN Person.Person AS p
|
||||
ON s.BusinessEntityID = p.BusinessEntityID
|
||||
INNER JOIN Person.Address AS a
|
||||
ON a.AddressID = p.BusinessEntityID
|
||||
WHERE TerritoryID IS NOT NULL
|
||||
AND SalesYTD <> 0
|
||||
ORDER BY PostalCode;
|
||||
|
After Width: | Height: | Size: 54 KiB |
|
|
@ -0,0 +1,129 @@
|
|||
## Python Data lineage: using the SQLFlow REST API (Advanced)
|
||||
|
||||
This article illustrates how to discover the data lineage using Python and the SQLFlow REST API.
|
||||
|
||||
By using the SQLFlow REST API, you can write Python code to discover the data lineage in SQL scripts
|
||||
and get the result as an interactive diagram or in JSON, CSV, or GraphML format.
|
||||
|
||||
You can integrate the Python code provided here into your own project and add powerful
|
||||
data lineage analysis capability instantly.
|
||||
|
||||
### 1. Interactive data lineage visualizations
|
||||

|
||||
|
||||
### 2. [Data lineage in JSON format](python-data-lineage-result.json)
|
||||
|
||||
### 3. Data lineage in CSV, graphml format
|
||||
|
||||
|
||||
## Prerequisites
|
||||
- [SQLFlow Cloud Server or on-premise version](https://github.com/sqlparser/sqlflow_public/tree/master/api#prerequisites)
|
||||
- Python 2.7 or higher must be installed and configured correctly.
|
||||
- Installing Dependency Libraries:
|
||||
```
|
||||
pip install requests
|
||||
```
|
||||
|
||||
### Usage
|
||||
````
|
||||
python Grabit.py /s server /p port /u userId /k userSecret /t databaseType /f path_to_config_file /r resultType
|
||||
|
||||
eg:
|
||||
python Grabit.py /u 'auth0|xxx' /k cab9712c45189014a94a8b7aceeef7a3db504be58e18cd3686f3bbefd078ef4d /s https://api.gudusoft.com /t oracle /f demo.sql /r 1
|
||||
|
||||
note:
|
||||
If a parameter value contains symbols like "|", it must be enclosed in single quotes (' ')
|
||||
````
|
||||
|
||||
Example:
|
||||
|
||||
1. Connect to the SQLFlow Cloud Server
|
||||
```
|
||||
python Grabit.py /s https://api.gudusoft.com /u 'YOUR_USER_ID' /k YOUR_SECRET_KEY /t sqlserver /f python-data-lineage-sqlserver.sql /r 1
|
||||
```
|
||||
|
||||
2. Connect to the SQLFlow on-premise
|
||||
This will discover data lineage by analyzing the `python-data-lineage-sqlserver.sql` file. You may also specify a zip file that contains multiple SQL files.
|
||||
```
|
||||
python Grabit.py /s http://127.0.0.1 /p 8081 /u 'gudu|0123456789' /t sqlserver /f python-data-lineage-sqlserver.sql /r 1
|
||||
```
|
||||
|
||||
This will discover data lineage by analyzing all SQL files under the `sqlfiles` directory.
|
||||
```
|
||||
python Grabit.py /s http://127.0.0.1 /p 8081 /u 'gudu|0123456789' /t mysql /f sqlfiles /r 1
|
||||
```
|
||||
|
||||
|
||||
### Parameters
|
||||
|
||||
- **path_to_config_file**
|
||||
|
||||
This can be a single SQL file, a zip file containing multiple SQL files, or a directory containing many SQL files.
|
||||
|
||||
- **server**
|
||||
|
||||
Usually, it is the IP address of [the SQLFlow on-premise version](https://www.gudusoft.com/sqlflow-on-premise-version/)
|
||||
installed on your own servers, such as `127.0.0.1` or `http://127.0.0.1`
|
||||
|
||||
You may set the value to `https://api.gudusoft.com` if you would like to send your SQL script to [the SQLFlow Cloud Server](https://sqlflow.gudusoft.com) to get the data lineage result.
|
||||
|
||||
- **port**
|
||||
|
||||
The default value is `8081` if you connect to your SQLFlow on-premise server.
|
||||
|
||||
However, if you set up an nginx reverse proxy in the nginx configuration file like this:
|
||||
```
|
||||
location /api/ {
|
||||
proxy_pass http://127.0.0.1:8081/;
|
||||
proxy_connect_timeout 600s ;
|
||||
proxy_read_timeout 600s;
|
||||
proxy_send_timeout 600s;
|
||||
|
||||
proxy_set_header Host $host;
|
||||
proxy_set_header X-Real-IP $remote_addr;
|
||||
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
||||
proxy_set_header User-Agent $http_user_agent;
|
||||
}
|
||||
```
|
||||
Then, keep the value of `serverPort` empty and set `server` to a value like `http://127.0.0.1/api`.
|
||||
|
||||
>Please keep this value empty if you connect to the SQLFlow Cloud Server by specifying `https://api.gudusoft.com`
|
||||
as the `server`.
|
||||
>
|
||||
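For reference, the scripts in this repository combine `server`, `port`, and the endpoint path in a very simple way; the sketch below mirrors that logic (the `build_url` helper name is only for illustration and is not defined in the scripts):

```python
def build_url(server, port, path='/gspLive_backend/user/generateToken'):
    # With a reverse proxy, leave port empty and put the proxy prefix
    # (e.g. http://127.0.0.1/api) directly into server.
    if port != '':
        return server + ':' + port + path
    return server + path

# build_url('http://127.0.0.1', '8081') -> 'http://127.0.0.1:8081/gspLive_backend/user/generateToken'
# build_url('http://127.0.0.1/api', '') -> 'http://127.0.0.1/api/gspLive_backend/user/generateToken'
```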
- **userId, userSecret**
|
||||
|
||||
This is the user id that is used to connect to the SQLFlow server.
|
||||
Always set this value to `gudu|0123456789` and keep `userSecret` empty if you use the SQLFlow on-premise version.
|
||||
|
||||
If you want to connect to [the SQLFlow Cloud Server](https://sqlflow.gudusoft.com), you may [request a 30-day premium account](https://www.gudusoft.com/request-a-premium-account/) to
|
||||
[get the necessary userId and secret code](/sqlflow-userid-secret.md).
|
||||
|
||||
|
||||
- **databaseType**
|
||||
|
||||
This parameter specifies the database dialect of the SQL scripts to be analyzed by SQLFlow.
|
||||
|
||||
```txt
|
||||
access,bigquery,couchbase,dax,db2,greenplum,hana,hive,impala,informix,mdx,mssql,
|
||||
sqlserver,mysql,netezza,odbc,openedge,oracle,postgresql,postgres,redshift,snowflake,
|
||||
sybase,teradata,soql,vertica
|
||||
```
|
||||
|
||||
- **resultType**
|
||||
|
||||
When you submit a SQL script to the SQLFlow server, a job is created on the server,
|
||||
and you can always see the graphical data lineage result in the browser.
|
||||
|
||||
|
||||
Even better, this demo will fetch the data lineage back to the directory where the demo is running.
|
||||
Those data lineage results are stored in the `data/result/` directory.
|
||||
|
||||
This parameter specifies which kind of format is used to save the data lineage result.
|
||||
|
||||
Available values for this parameter:
|
||||
- 1: JSON, data lineage result in JSON.
|
||||
- 2: CSV, data lineage result in CSV format.
|
||||
- 3: diagram, in graphml format that can be viewed by yEd.
|
||||
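As a rough sketch of what happens under the hood (mirroring `GetResultToSqlflow.py` in this repository, which keys on the format names rather than the numbers), the chosen format selects the export endpoint and the suffix of the file written under `data/result/`:

```python
import os

# Illustrative helper; the real script uses an if/elif chain with the same values.
def pick_export(download, file_path, job_id):
    sep = 'data' + os.sep + 'result' + os.sep
    file_path = file_path + '_' + job_id
    if download == 'json':
        return '/gspLive_backend/sqlflow/job/exportLineageAsJson', sep + file_path + '_json.json'
    if download == 'graphml':
        return '/gspLive_backend/sqlflow/job/exportLineageAsGraphml', sep + file_path + '_graphml.graphml'
    if download == 'csv':
        return '/gspLive_backend/sqlflow/job/exportLineageAsCsv', sep + file_path + '_csv.csv'
    raise ValueError('unsupported output type: ' + download)
```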
|
||||
### SQLFlow REST API
|
||||
Please check here for detailed information about the [SQLFlow REST API](https://github.com/sqlparser/sqlflow_public/tree/master/api/sqlflow_api.md)
|
||||
|
|
@ -0,0 +1,232 @@
|
|||
#!/usr/bin/python
|
||||
# -*- coding: UTF-8 -*-
|
||||
import zipfile
|
||||
|
||||
import requests
|
||||
import time
|
||||
import json
|
||||
import sys
|
||||
import os
|
||||
|
||||
|
||||
def toSqlflow(userId, token, server, port, jobName, dbvendor, sqlfiles):
|
||||
url = '/api/gspLive_backend/sqlflow/job/submitUserJob'
|
||||
if 'api.gudusoft.com' in server:
|
||||
url = '/gspLive_backend/sqlflow/job/submitUserJob'
|
||||
|
||||
if port != '':
|
||||
url = server + ':' + port + url
|
||||
else:
|
||||
url = server + url
|
||||
|
||||
if os.path.isdir(sqlfiles):
|
||||
sqlfiles = toZip(sqlfiles)
|
||||
files = {'sqlfiles': open(sqlfiles, 'rb')}
|
||||
data = {'dbvendor': dbvendor, 'jobName': jobName, 'token': token, 'userId': userId}
|
||||
datastr = json.dumps(data)
|
||||
|
||||
print('start submit job to sqlflow.')
|
||||
|
||||
try:
|
||||
response = requests.post(url, data=eval(datastr), files=files, verify=False)
|
||||
except Exception:
|
||||
print('submit job to sqlflow failed.')
|
||||
sys.exit(0)
|
||||
|
||||
result = json.loads(response.text)
|
||||
|
||||
if result['code'] == 200:
|
||||
print('submit job to sqlflow successful.')
|
||||
return result['data']['jobId']
|
||||
else:
|
||||
print(result['error'])
|
||||
sys.exit(0)
|
||||
|
||||
|
||||
def toZip(start_dir):
|
||||
if start_dir.endswith(os.sep):
|
||||
start_dir = start_dir[:-1]
|
||||
file_news = start_dir + '.zip'
|
||||
|
||||
z = zipfile.ZipFile(file_news, 'w', zipfile.ZIP_DEFLATED)
|
||||
for dir_path, dir_names, file_names in os.walk(start_dir):
|
||||
f_path = dir_path.replace(start_dir, '')
|
||||
f_path = f_path and f_path + os.sep or ''
|
||||
for filename in file_names:
|
||||
z.write(os.path.join(dir_path, filename), f_path + filename)
|
||||
z.close()
|
||||
return file_news
|
||||
|
||||
|
||||
def getToken(userId, server, port, screctKey):
|
||||
|
||||
if userId == 'gudu|0123456789':
|
||||
return 'token'
|
||||
|
||||
url = '/api/gspLive_backend/user/generateToken'
|
||||
if 'api.gudusoft.com' in server:
|
||||
url = '/gspLive_backend/user/generateToken'
|
||||
if port != '':
|
||||
url = server + ':' + port + url
|
||||
else:
|
||||
url = server + url
|
||||
mapA = {'secretKey': screctKey, 'userId': userId}
|
||||
header_dict = {"Content-Type": "application/x-www-form-urlencoded"}
|
||||
|
||||
print('start get token.')
|
||||
try:
|
||||
r = requests.post(url, data=mapA, headers=header_dict, verify=False)
|
||||
print(r)
|
||||
except Exception:
|
||||
print('get token failed.')
|
||||
sys.exit(0)
|
||||
result = json.loads(r.text)
|
||||
|
||||
if result['code'] == '200':
|
||||
print('get token successful.')
|
||||
return result['token']
|
||||
else:
|
||||
print(result['error'])
|
||||
|
||||
|
||||
def getResult(dataLineageFileType, userId, token, server, port, jobId, filePath):
|
||||
sep = 'data' + os.sep + 'result' + os.sep
|
||||
filePath = filePath + '_' + jobId
|
||||
if dataLineageFileType == 'json':
|
||||
url = "/api/gspLive_backend/sqlflow/job/exportLineageAsJson"
|
||||
if 'api.gudusoft.com' in server:
|
||||
url = '/gspLive_backend/sqlflow/job/exportLineageAsJson'
|
||||
filePath = sep + filePath + '_json.json'
|
||||
elif dataLineageFileType == 'graphml':
|
||||
url = "/api/gspLive_backend/sqlflow/job/exportLineageAsGraphml"
|
||||
if 'api.gudusoft.com' in server:
|
||||
url = '/gspLive_backend/sqlflow/job/exportLineageAsGraphml'
|
||||
filePath = sep + filePath + '_graphml.graphml'
|
||||
elif dataLineageFileType == 'csv':
|
||||
url = "/api/gspLive_backend/sqlflow/job/exportLineageAsCsv"
|
||||
if 'api.gudusoft.com' in server:
|
||||
url = '/gspLive_backend/sqlflow/job/exportLineageAsCsv'
|
||||
filePath = sep + filePath + '_csv.csv'
|
||||
else:
|
||||
url = "/api/gspLive_backend/sqlflow/job/exportLineageAsJson"
|
||||
if 'api.gudusoft.com' in server:
|
||||
url = '/gspLive_backend/sqlflow/job/exportLineageAsJson'
|
||||
filePath = sep + filePath + '_json.json'
|
||||
|
||||
if port != '':
|
||||
url = server + ':' + port + url
|
||||
else:
|
||||
url = server + url
|
||||
|
||||
data = {'jobId': jobId, 'token': token, 'userId': userId, 'tableToTable': 'false'}
|
||||
datastr = json.dumps(data)
|
||||
|
||||
print('start download result to sqlflow.')
|
||||
try:
|
||||
response = requests.post(url, data=eval(datastr), verify=False)
|
||||
except Exception:
|
||||
print('download result to sqlflow failed.')
|
||||
sys.exit(0)
|
||||
|
||||
if not os.path.exists(sep):
|
||||
os.makedirs(sep)
|
||||
|
||||
try:
|
||||
with open(filePath, 'wb') as f:
|
||||
f.write(response.content)
|
||||
except Exception:
|
||||
print('failed to write the result file:', filePath)
|
||||
sys.exit(0)
|
||||
|
||||
print('download result to sqlflow successful. file path is', filePath)
|
||||
|
||||
|
||||
|
||||
def getStatus(userId, token, server, port, jobId):
|
||||
url = "/api/gspLive_backend/sqlflow/job/displayUserJobSummary"
|
||||
if 'api.gudusoft.com' in server:
|
||||
url = '/gspLive_backend/sqlflow/job/displayUserJobSummary'
|
||||
|
||||
if port != '':
|
||||
url = server + ':' + port + url
|
||||
else:
|
||||
url = server + url
|
||||
|
||||
data = {'jobId': jobId, 'token': token, 'userId': userId}
|
||||
datastr = json.dumps(data)
|
||||
|
||||
try:
|
||||
response = requests.post(url, data=eval(datastr), verify=False)
|
||||
except Exception:
|
||||
print('get job status to sqlflow failed.')
|
||||
sys.exit(0)
|
||||
|
||||
result = json.loads(response.text)
|
||||
if result['code'] == 200:
|
||||
status = result['data']['status']
|
||||
if status == 'fail':
|
||||
print(result['data']['errorMessage'])
|
||||
sys.exit(0)
|
||||
return status
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
if len(sys.argv) < 2:
|
||||
print('Please enter the args.')
|
||||
sys.exit(0)
|
||||
|
||||
# the user id of sqlflow web or client, required true
|
||||
userId = ''
|
||||
|
||||
# the secret key of sqlflow user for webapi request, required true
|
||||
screctKey = ''
|
||||
|
||||
|
||||
# sqlflow server
|
||||
server = ''
|
||||
|
||||
# sqlflow api port
|
||||
port = ''
|
||||
|
||||
# database type
|
||||
dbvendor = 'dbvmysql'
|
||||
|
||||
sqlfile = ''
|
||||
dataLineageFileType = ''
|
||||
for i in range(1, len(sys.argv)):
|
||||
if sys.argv[i] == '/f':
|
||||
try:
|
||||
if sys.argv[i + 1] is not None:
|
||||
sqlfile = sys.argv[i + 1]
|
||||
except Exception:
|
||||
print('Please enter the sqlfile path,required true. eg: /f sql.txt')
|
||||
sys.exit(0)
|
||||
elif sys.argv[i] == '/o':
|
||||
try:
|
||||
if sys.argv[i + 1] is not None:
|
||||
dataLineageFileType = sys.argv[i + 1]
|
||||
except Exception:
|
||||
dataLineageFileType = 'json'
|
||||
|
||||
token = getToken(userId, server, port, screctKey)
|
||||
|
||||
# sqlflow job name
|
||||
jobName = 'test'
|
||||
jobId = toSqlflow(userId, token, server, port, jobName, dbvendor, sqlfile)
|
||||
|
||||
while True:
|
||||
status = getStatus(userId, token, server, port, jobId)
|
||||
if status == 'fail':
|
||||
print('job execute failed.')
|
||||
break
|
||||
elif status == 'success':
|
||||
print('job execute successful.')
|
||||
break
|
||||
elif status == 'partial_success':
|
||||
print('job execute partial successful.')
|
||||
break
|
||||
time.sleep(2)
|
||||
|
||||
# data lineage file path
|
||||
filePath = 'datalineage'
|
||||
getResult(dataLineageFileType, userId, token, server, port, jobId, filePath)
|
||||
|
|
@ -0,0 +1,57 @@
|
|||
import zipfile
|
||||
import sys
|
||||
import os
|
||||
|
||||
|
||||
def toZip(start_dir):
|
||||
if start_dir.endswith(os.sep):
|
||||
start_dir = start_dir[:-1]
|
||||
file_news = start_dir + '.zip'
|
||||
|
||||
z = zipfile.ZipFile(file_news, 'w', zipfile.ZIP_DEFLATED)
|
||||
for dir_path, dir_names, file_names in os.walk(start_dir):
|
||||
f_path = dir_path.replace(start_dir, '')
|
||||
f_path = f_path and f_path + os.sep or ''
|
||||
for filename in file_names:
|
||||
z.write(os.path.join(dir_path, filename), f_path + filename)
|
||||
z.close()
|
||||
return file_news
|
||||
|
||||
|
||||
def buildSqltextParam(userId, token, delimiter, export_include_table, showConstantTable,
|
||||
treatArgumentsInCountFunctionAsDirectDataflow, dbvendor, sqltext):
|
||||
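# Build the form parameters for a lineage request that passes the SQL as text.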
data = {'dbvendor': dbvendor, 'token': token, 'userId': userId}
|
||||
if delimiter != '':
|
||||
data['delimiter'] = delimiter
|
||||
if export_include_table != '':
|
||||
data['export_include_table'] = export_include_table
|
||||
if showConstantTable != '':
|
||||
data['showConstantTable'] = showConstantTable
|
||||
if treatArgumentsInCountFunctionAsDirectDataflow != '':
|
||||
data['treatArgumentsInCountFunctionAsDirectDataflow'] = treatArgumentsInCountFunctionAsDirectDataflow
|
||||
if sqltext != '':
|
||||
data['sqltext'] = sqltext
|
||||
return data
|
||||
|
||||
|
||||
def buildSqlfileParam(userId, token, delimiter, export_include_table, showConstantTable,
|
||||
treatArgumentsInCountFunctionAsDirectDataflow, dbvendor, sqlfile):
|
||||
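# Build the form parameters plus the multipart file payload for a lineage request that uploads a SQL file.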
files = ''
|
||||
if sqlfile != '':
|
||||
if os.path.isdir(sqlfile):
|
||||
print('The SQL file cannot be a directory.')
|
||||
sys.exit(0)
|
||||
files = {'sqlfile': open(sqlfile, 'rb')}
|
||||
|
||||
data = {'dbvendor': dbvendor, 'token': token, 'userId': userId}
|
||||
if delimiter != '':
|
||||
data['delimiter'] = delimiter
|
||||
if export_include_table != '':
|
||||
data['export_include_table'] = export_include_table
|
||||
if showConstantTable != '':
|
||||
data['showConstantTable'] = showConstantTable
|
||||
if treatArgumentsInCountFunctionAsDirectDataflow != '':
|
||||
data['treatArgumentsInCountFunctionAsDirectDataflow'] = treatArgumentsInCountFunctionAsDirectDataflow
|
||||
return data, files
|
||||
|
||||
|
|
@ -0,0 +1,44 @@
|
|||
import requests
|
||||
|
||||
import json
|
||||
|
||||
def getToken(userId, server, port, screctKey):
|
||||
if userId == 'gudu|0123456789':
|
||||
return 'token'
|
||||
url = '/api/gspLive_backend/user/generateToken'
|
||||
if 'api.gudusoft.com' in server:
|
||||
url = '/gspLive_backend/user/generateToken'
|
||||
if port != '':
|
||||
url = server + ':' + port + url
|
||||
else:
|
||||
url = server + url
|
||||
mapA = {'secretKey': screctKey, 'userId': userId}
|
||||
header_dict = {"Content-Type": "application/x-www-form-urlencoded"}
|
||||
|
||||
try:
|
||||
r = requests.post(url, data=mapA, headers=header_dict, verify=False)
|
||||
except Exception as e:
|
||||
print('get token failed.', e)
|
||||
result = json.loads(r.text)
|
||||
|
||||
if result['code'] == '200':
|
||||
return result['token']
|
||||
else:
|
||||
print(result['error'])
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
|
||||
server = ''
|
||||
|
||||
port = ''
|
||||
|
||||
# the user id of sqlflow web or client, required true
|
||||
userId = ''
|
||||
|
||||
# the secret key of sqlflow user for webapi request, required true
|
||||
screctKey = ''
|
||||
|
||||
token = getToken(userId, server, port, screctKey)
|
||||
|
||||
print(token)
|
||||
|
|
@ -0,0 +1,61 @@
|
|||
#!/usr/bin/python
|
||||
# -*- coding: UTF-8 -*-
|
||||
import requests
|
||||
import json
|
||||
import GenerateToken
|
||||
|
||||
|
||||
def check(server, port, sql, dbvendor, userId, token):
|
||||
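# Send the SQL text to the SQLFlow syntax check endpoint and print the check result.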
url = "/api/gspLive_backend/demo/syntax/check"
|
||||
if 'api.gudusoft.com' in server:
|
||||
url = '/gspLive_backend/demo/syntax/check'
|
||||
if port != '':
|
||||
url = server + ':' + port + url
|
||||
else:
|
||||
url = server + url
|
||||
|
||||
data = {'sql': sql, 'dbvendor': dbvendor, 'userId': userId, 'token': token}
|
||||
header_dict = {"Content-Type": "application/x-www-form-urlencoded;charset=UTF-8"}
|
||||
try:
|
||||
r = requests.post(url, data=data, headers=header_dict, verify=False)
|
||||
except Exception as e:
|
||||
print('syntax error.', e)
|
||||
result = json.loads(r.text)
|
||||
|
||||
if result['code'] == 200:
|
||||
usedTime = result['data']['usedTime']
|
||||
version = result['data']['gsp.version']
|
||||
print('syntax correct. elapsed time: ' + usedTime + ', gsp version: ' + version)
|
||||
else:
|
||||
usedTime = result['data']['usedTime']
|
||||
version = result['data']['gsp.version']
|
||||
print('syntax error. elapsed time: ' + usedTime + ', gsp version: ' + version + ', error info:')
|
||||
errorInfos = result['data']['errorInfos']
|
||||
for error in errorInfos:
|
||||
print(error['errorMessage'])
|
||||
|
||||
if __name__ == '__main__':
|
||||
# the user id of sqlflow web or client, required true
|
||||
userId = ''
|
||||
|
||||
# the secret key of sqlflow user for webapi request, required true
|
||||
screctKey = ''
|
||||
|
||||
# sqlflow server, For the cloud version, the value is https://api.gudusoft.com
|
||||
server = 'https://api.gudusoft.com'
|
||||
|
||||
|
||||
# sqlflow api port, For the cloud version, the value is 80
|
||||
port = ''
|
||||
|
||||
# The token is generated from userid and usersecret. It is used in every Api invocation.
|
||||
token = GenerateToken.getToken(userId, server, port, screctKey)
|
||||
|
||||
# sql to be checked
|
||||
sql = 'select * fro1m table1'
|
||||
|
||||
# database type, dbvansi,dbvathena,dbvazuresql,dbvbigquery,dbvcouchbase,dbvdb2,dbvgreenplum,dbvgaussdb,dbvhana,dbvhive,dbvimpala,dbvinformix,dbvmdx,dbvmysql,dbvnetezza,dbvopenedge,dbvoracle,dbvpresto,dbvpostgresql,dbvredshift,dbvsnowflake,dbvmssql,dbvsparksql,dbvsybase,dbvteradata,dbvvertica
|
||||
dbvendor = 'dbvoracle'
|
||||
|
||||
# check syntax
|
||||
check(server, port, sql, dbvendor, userId, token)
|
||||
|
|
@ -0,0 +1,105 @@
|
|||
#!/usr/bin/python
|
||||
# -*- coding: UTF-8 -*-
|
||||
import requests
|
||||
import json
|
||||
import sys
|
||||
import GenerateToken
|
||||
import GenerateLineageParam
|
||||
|
||||
|
||||
def getResult(server, port, data, files):
|
||||
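# Post the prepared parameters (and optional SQL file) to the exportFullLineageAsCsv endpoint and return the CSV text.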
url = "/api/gspLive_backend/sqlflow/generation/sqlflow/exportFullLineageAsCsv"
|
||||
if 'api.gudusoft.com' in server:
|
||||
url = '/gspLive_backend/sqlflow/generation/sqlflow/exportFullLineageAsCsv'
|
||||
if port != '':
|
||||
url = server + ':' + port + url
|
||||
else:
|
||||
url = server + url
|
||||
|
||||
datastr = json.dumps(data)
|
||||
|
||||
print('start get csv result from sqlflow.')
|
||||
try:
|
||||
if files != '':
|
||||
response = requests.post(url, data=eval(datastr), files=files, verify=False)
|
||||
else:
|
||||
response = requests.post(url, data=eval(datastr), verify=False)
|
||||
except Exception as e:
|
||||
print('get csv result from sqlflow failed.', e)
|
||||
sys.exit(0)
|
||||
|
||||
print('get csv result from sqlflow successful. result : ')
|
||||
print()
|
||||
return response.text
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
# the user id of sqlflow web or client, required true
|
||||
userId = ''
|
||||
|
||||
# the secret key of sqlflow user for webapi request, required true
|
||||
screctKey = ''
|
||||
|
||||
# sqlflow server, For the cloud version, the value is https://api.gudusoft.com
|
||||
server = 'http://127.0.0.1'
|
||||
|
||||
# sqlflow api port, For the cloud version, the value is 443
|
||||
port = '8165'
|
||||
|
||||
# For the cloud version
|
||||
# server = 'https://api.gudusoft.com'
|
||||
# port = '80'
|
||||
|
||||
# The token is generated from userid and usersecret. It is used in every Api invocation.
|
||||
token = GenerateToken.getToken(userId, server, port, screctKey)
|
||||
|
||||
# delimiter of the values in CSV, default would be ',' string
|
||||
delimiter = ','
|
||||
|
||||
# export_include_table, string
|
||||
export_include_table = ''
|
||||
|
||||
# showConstantTable, boolean
|
||||
showConstantTable = 'true'
|
||||
|
||||
# Whether treat the arguments in COUNT function as direct Dataflow, boolean
|
||||
treatArgumentsInCountFunctionAsDirectDataflow = ''
|
||||
|
||||
# database type,
|
||||
# dbvazuresql
|
||||
# dbvbigquery
|
||||
# dbvcouchbase
|
||||
# dbvdb2
|
||||
# dbvgreenplum
|
||||
# dbvhana
|
||||
# dbvhive
|
||||
# dbvimpala
|
||||
# dbvinformix
|
||||
# dbvmdx
|
||||
# dbvmysql
|
||||
# dbvnetezza
|
||||
# dbvopenedge
|
||||
# dbvoracle
|
||||
# dbvpostgresql
|
||||
# dbvredshift
|
||||
# dbvsnowflake
|
||||
# dbvmssql
|
||||
# dbvsparksql
|
||||
# dbvsybase
|
||||
# dbvteradata
|
||||
# dbvvertica
|
||||
dbvendor = 'dbvoracle'
|
||||
|
||||
# sql text
|
||||
# sqltext = 'select * from table'
|
||||
# data = GenerateLineageParam.buildSqltextParam(userId, token, delimiter, export_include_table, showConstantTable, treatArgumentsInCountFunctionAsDirectDataflow, dbvendor, sqltext)
|
||||
# resp = getResult(server, port, data, '')
|
||||
|
||||
# sql file
|
||||
sqlfile = 'test.sql'
|
||||
data, files = GenerateLineageParam.buildSqlfileParam(userId, token, delimiter, export_include_table,
|
||||
showConstantTable,
|
||||
treatArgumentsInCountFunctionAsDirectDataflow, dbvendor,
|
||||
sqlfile)
|
||||
resp = getResult(server, port, data, files)
|
||||
print(resp)
|
||||
|
|
@ -0,0 +1,104 @@
|
|||
## Python Data lineage: using the SQLFlow REST API (Basic)
|
||||
|
||||
A basic tutorial for using the Python version of the SQLFlow API.
|
||||
|
||||
An advanced version, showing how to use [Python to get the data lineage](https://github.com/sqlparser/sqlflow_public/tree/master/api/python/advanced), is also available.
|
||||
|
||||
### Prerequisites
|
||||
|
||||
- Python 2.7 or higher must be installed and configured correctly.
|
||||
|
||||
- Installing Dependency Libraries:
|
||||
|
||||
```
|
||||
pip install requests
|
||||
```
|
||||
|
||||
### GenerateTokenDemo.py
|
||||
|
||||
This demo shows how to obtain a token from the SQLFlow system; the token is required to authorize calls to the other interfaces.
|
||||
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **userSecret**: the secret key of the sqlflow user for the web API request; required **false** for sqlflow web, required **true** for the sqlflow client
|
||||
|
||||
This is the user id that is used to connect to the SQLFlow server.
|
||||
Always set this value to `gudu|0123456789` and keep `userSecret` empty if you use the SQLFlow on-premise version.
|
||||
|
||||
If you want to connect to [the SQLFlow Cloud Server](https://sqlflow.gudusoft.com), you may [request a 30-day premium account](https://www.gudusoft.com/request-a-premium-account/) to
|
||||
[get the necessary userId and secret code](/sqlflow-userid-secret.md).
|
||||
|
||||
**set the parameters in the code**
|
||||
|
||||
Connect to the SQLFlow Cloud Server:
|
||||
|
||||
````python
|
||||
url = 'https://api.gudusoft.com/gspLive_backend/user/generateToken'
|
||||
userId = 'YOUR USER ID'
|
||||
screctKey = 'YOUR SECRET KEY'
|
||||
````
|
||||
|
||||
Connect to the SQLFlow on-premise version:
|
||||
|
||||
````python
|
||||
url = 'http://127.0.0.1:8081/gspLive_backend/user/generateToken'
|
||||
userId = 'gudu|0123456789'
|
||||
screctKey = ''
|
||||
````
|
||||
|
||||
**start script**
|
||||
|
||||
`python GenerateTokenDemo.py`
|
||||
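Under the hood the demo simply posts the `userId` and `secretKey` to the generateToken endpoint and reads the token from the JSON response; a minimal sketch (error handling omitted) looks like this:

````python
import requests
import json

url = 'https://api.gudusoft.com/gspLive_backend/user/generateToken'
payload = {'userId': 'YOUR USER ID', 'secretKey': 'YOUR SECRET KEY'}
headers = {'Content-Type': 'application/x-www-form-urlencoded'}

r = requests.post(url, data=payload, headers=headers)
result = json.loads(r.text)
if result['code'] == '200':
    print(result['token'])
else:
    print(result['error'])
````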
|
||||
### GenerateDataLineageDemo.py
|
||||
|
||||
This demo shows how to get the desired SQL script analysis results from the SQLFlow system.
|
||||
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **userSecret**: the secret key of the sqlflow user for the web API request; required **false** for sqlflow web, required **true** for the sqlflow client
|
||||
* sqltext: sql text, required false
|
||||
* sqlfile: sql file, required false
|
||||
* **dbvendor**: database vendor, required **true**, available values:
|
||||
* dbvbigquery, dbvcouchbase,dbvdb2,dbvgreenplum,dbvhana,dbvhive,dbvimpala,dbvinformix,dbvmdx,dbvmysql,dbvnetezza,dbvopenedge,dbvoracle,dbvpostgresql,dbvredshift,dbvsnowflake,dbvmssql,dbvsybase,dbvteradata,dbvvertica
|
||||
* filePath: data lineage file path
|
||||
|
||||
|
||||
**set the parameters in the code**
|
||||
|
||||
Connect to the SQLFlow Cloud Server:
|
||||
|
||||
````python
|
||||
tokenUrl = 'https://api.gudusoft.com/gspLive_backend/user/generateToken'
|
||||
generateDataLineageUrl = 'https://api.gudusoft.com/gspLive_backend/sqlflow/generation/sqlflow'
|
||||
userId = 'YOUR USER ID'
|
||||
screctKey = 'YOUR SECRET KEY'
|
||||
sqlfile = 'test.sql'
|
||||
dbvendor = 'dbvoracle'
|
||||
filePath = 'datalineage'
|
||||
````
|
||||
|
||||
Connect to the SQLFlow on-premise version:
|
||||
|
||||
````python
|
||||
tokenUrl = 'http://127.0.0.1:8081/gspLive_backend/user/generateToken'
|
||||
generateDataLineageUrl = 'http://127.0.0.1:8081/gspLive_backend/sqlflow/generation/sqlflow'
|
||||
userId = 'gudu|0123456789'
|
||||
screctKey = ''
|
||||
sqlfile = 'test.sql'
|
||||
dbvendor = 'dbvoracle'
|
||||
filePath = 'datalineage'
|
||||
````
|
||||
|
||||
**start script**
|
||||
|
||||
cmd:
|
||||
|
||||
- /f: the sqlfile path, required. eg: /f sql.txt
|
||||
- /o: the data lineage file type, optional. The default value is json. eg: /o csv, /o json
|
||||
|
||||
eg:
|
||||
|
||||
`python GenerateDataLineageDemo.py /f test.sql /o csv`
|
||||
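The overall flow of the demo is: generate a token, submit the job, poll the job status, then download the lineage result. A condensed sketch of that loop, using the function names defined in GenerateDataLineageDemo.py:

````python
token = getToken(userId, server, port, screctKey)
jobId = toSqlflow(userId, token, server, port, 'test', dbvendor, sqlfile)

while True:
    status = getStatus(userId, token, server, port, jobId)
    if status in ('success', 'partial_success', 'fail'):
        break
    time.sleep(2)  # wait before polling again

getResult(dataLineageFileType, userId, token, server, port, jobId, 'datalineage')
````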
|
||||
|
||||
|
|
@ -0,0 +1,59 @@
|
|||
#!/usr/bin/python
|
||||
# -*- coding: UTF-8 -*-
|
||||
import requests
|
||||
import json
|
||||
import GenerateToken
|
||||
|
||||
|
||||
def toxml(server, port, sql, dbvendor, userId, token):
|
||||
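# Send the SQL text to the SQLFlow toXML endpoint and print the parse-tree XML.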
url = "/api/gspLive_backend/demo/xml/toXML"
|
||||
if 'api.gudusoft.com' in server:
|
||||
url = '/gspLive_backend/demo/xml/toXML'
|
||||
if port != '':
|
||||
url = server + ':' + port + url
|
||||
else:
|
||||
url = server + url
|
||||
|
||||
data = {'sql': sql, 'dbvendor': dbvendor, 'userId': userId, 'token': token}
|
||||
header_dict = {"Content-Type": "application/x-www-form-urlencoded;charset=UTF-8"}
|
||||
try:
|
||||
r = requests.post(url, data=data, headers=header_dict, verify=False)
|
||||
except Exception as e:
|
||||
print('convert failed.', e)
|
||||
result = json.loads(r.text)
|
||||
|
||||
usedTime = result['data']['usedTime']
|
||||
version = result['data']['gsp.version']
|
||||
if result['code'] == 200:
|
||||
xml = result['data']['xml']
|
||||
print('elapsed time: ' + usedTime + ', gsp version: ' + version + ', xml result: ')
|
||||
print(xml)
|
||||
else:
|
||||
print('to xml failed. elapsed time: ' + usedTime + ', gsp version: ' + version + ', error info: ')
|
||||
print(result['error'])
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
# the user id of sqlflow web or client, required true
|
||||
userId = ''
|
||||
|
||||
# the secret key of sqlflow user for webapi request, required true
|
||||
screctKey = ''
|
||||
|
||||
# sqlflow server, For the cloud version, the value is https://api.gudusoft.com
|
||||
server = 'https://api.gudusoft.com'
|
||||
|
||||
# sqlflow api port, For the cloud version, the value is 80
|
||||
port = ''
|
||||
|
||||
# The token is generated from userid and usersecret. It is used in every Api invocation.
|
||||
token = GenerateToken.getToken(userId, server, port, screctKey)
|
||||
|
||||
# sql to be checked
|
||||
sql = 'select * from table1'
|
||||
|
||||
# database type, dbvansi,dbvathena,dbvazuresql,dbvbigquery,dbvcouchbase,dbvdb2,dbvgreenplum,dbvgaussdb,dbvhana,dbvhive,dbvimpala,dbvinformix,dbvmdx,dbvmysql,dbvnetezza,dbvopenedge,dbvoracle,dbvpresto,dbvpostgresql,dbvredshift,dbvsnowflake,dbvmssql,dbvsparksql,dbvsybase,dbvteradata,dbvvertica
|
||||
dbvendor = 'dbvoracle'
|
||||
|
||||
# to xml
|
||||
toxml(server, port, sql, dbvendor, userId, token)
|
||||
|
|
@ -0,0 +1,163 @@
|
|||
|
||||
## THIS VERSION IS DEPRECATED, PLEASE USE THE CODE IN THE BASIC OR ADVANCED DIRECTORY
|
||||
|
||||
========================================================================================================================================================================================================
|
||||
SQLFlow API Python Client Documentation
|
||||
========================================================================================================================================================================================================
|
||||
|
||||
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
DESCRIPTION
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
High-level Python client of the SQLFlow API.
|
||||
|
||||
SQLFlow is a product of Gudusoft. The software's purpose is to analyze the flow of data, data relationships and dependencies coded into various SQL scripts.
|
||||
|
||||
This Python wrapper is built to process SQL scripts using the API with the option to export the API responses into JSON files.
|
||||
|
||||
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
BASIC USAGE
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
The Python client is built into a single module. To use it, one must have a valid API key (currently available for the community at https://github.com/sqlparser/sqlflow_public/tree/master/api/client/csharp).
|
||||
|
||||
****************************************************************************************************
|
||||
|
||||
SQLFlowClient(api_key, api_url) class stores relevant parameters and methods to utilize SQLFlow API.
|
||||
|
||||
It has all the default values included for both the API key (which is currently available to the public) and the API base URL.
|
||||
|
||||
Initializing it will create an object with the following variables: API key and API URL; it will also initialize the default request header and a default API parameter configuration.
|
||||
|
||||
****************************************************************************************************
|
||||
|
||||
configure_api(db_vendor, rel_type, simple_output, ignore_rs) method is provided to change the default API parameters as required. It will change the pre-set API configuration based on the provided parameter values.
|
||||
|
||||
Detailed explanations regarding API configuration can be found here: https://github.com/sqlparser/sqlflow_public/tree/master/api/client/csharp and here: https://api.gudusoft.com/gspLive_backend/swagger-ui.html#!/sqlflow-controller/generateSqlflowUsingPOST.
|
||||
|
||||
While using the method, one must provide all four parameters. Omitting a parameter or passing an invalid value will prevent the client from configuring the API request, and a notification message will be returned instead.
|
||||
|
||||
Valid parameters are as follows:
|
||||
|
||||
- db_vendor: dbvbigquery, dbvcouchbase, dbvdb2, dbvgreenplum, dbvhana, dbvhive, dbvimpala, dbvinformix, dbvmdx, dbvmysql, dbvnetezza, dbvopenedge, dbvoracle, dbvpostgresql, dbvredshift, dbvsnowflake, dbvmssql, dbvsybase, dbvteradata, dbvvertica
|
||||
|
||||
- rel_type: fdd, fdr, frd, fddi, join
|
||||
|
||||
- simple_output: true, false
|
||||
|
||||
- ignore_rs: true, false
|
||||
|
||||
****************************************************************************************************
|
||||
|
||||
analyze_script(script_path) method can be used to submit a SQL script to the SQLFlow API for analysis. If the analysis returns a response successfully, the results will be stored in the SQLFlowClient object's results variable. The results variable is a dictionary containing script paths and API responses as key-value pairs.
|
||||
|
||||
The method will not run if its built-in check finds that the provided file path does not point to a SQL script; a notification message is returned instead.
|
||||
|
||||
If the API call results in an error (e.g. invalid API key, server being busy), the response won't be stored, but a notification message will be returned instead.
|
||||
|
||||
****************************************************************************************************
|
||||
|
||||
export_results(export_folder) method simply dumps all the API call results stored already in SQLFlowClient's results variable to the specified output folder path.
|
||||
|
||||
The API responses will be saved as JSON files, with filenames corresponding to their source scripts.
|
||||
|
||||
If the provided path doesn't exist, the method will automatically build the path.
|
||||
|
||||
If there are no stored responses yet, the method does nothing and returns a notification message.
|
||||
|
||||
****************************************************************************************************
|
||||
|
||||
mass_process_scripts(source_folder, export_folder = None) method scans the entire directory tree of the provided source folder for SQL script files and submits each one to the API, storing all the responses in the results variable.
|
||||
|
||||
It can optionally export the results of the detected scripts to a desired export folder. If export_folder is left as None, this operation will be skipped.
|
||||
|
||||
Please note that this method only exports the API results of scripts that were discovered in the specified directory at the time the method was executed.
|
||||
|
||||
****************************************************************************************************
|
||||
|
||||
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
CODE EXAMPLES
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
# Initialize API client
|
||||
|
||||
client = SQLFlowClient()
|
||||
|
||||
# =============================================================================
|
||||
|
||||
# Configure the API parameters
|
||||
|
||||
client.configure_api('dbvmssql', 'fddi', 'false', 'false')
|
||||
|
||||
# Check config values after setting the parameters
|
||||
|
||||
print(client.config)
|
||||
|
||||
# =============================================================================
|
||||
|
||||
# Execute the analysis of a single script file
|
||||
|
||||
client.analyze_script('C:/Users/TESTUSER/Desktop/EXAMPLESCRIPT.sql')
|
||||
|
||||
# Check stored API response of the previous step
|
||||
|
||||
print(client.results)
|
||||
|
||||
# =============================================================================
|
||||
|
||||
# Export the stored response
|
||||
|
||||
client.export_results('C:/Users/TESTUSER/Desktop/EXPORTFOLDER')
|
||||
|
||||
# =============================================================================
|
||||
|
||||
# Execute mass processing of SQL scripts in a folder with an export folder specified
|
||||
|
||||
client.mass_process_scripts('C:/Users/TESTUSER/Desktop/SOURCEFOLDER', 'C:/Users/TESTUSER/Desktop/EXPORTFOLDER')
|
||||
|
||||
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
AUTHORS
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
Bence Kiss (vencentinus@gmail.com)
|
||||
|
||||
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
ADDITIONAL INFORMATION
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
Detailed information about the SQLFlow project can be accessed via the following links:
|
||||
|
||||
API configuration https://api.gudusoft.com/gspLive_backend/swagger-ui.html#!/sqlflow-controller/generateSqlflowUsingPOST
|
||||
|
||||
SQLFlow Git repo https://github.com/sqlparser/sqlflow_public
|
||||
|
||||
Dataflow relationship types https://github.com/sqlparser/sqlflow_public/blob/master/dbobjects_relationship.md
|
||||
|
||||
SQLFlow front end http://www.gudusoft.com/sqlflow/#/
|
||||
|
||||
C# API client https://github.com/sqlparser/sqlflow_public/tree/master/api/client/csharp
|
||||
|
||||
|
||||
In case of any questions regarding SQLFlow please contact Mr. James Wang at info@sqlparser.com.
|
||||
|
||||
In case of bugs, comments, questions etc. please feel free to contact the author at vencentinus@gmail.com or Mr. James Wang at info@sqlparser.com.
|
||||
|
||||
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
ACKNOWLEDGEMENTS
|
||||
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
The author of this project acknowledges that SQLFlow is a product and intellectual property exclusively of Gudusoft.
|
||||
|
||||
This project has been created to facilitate the utilization of the tool by the community, and the author of this Python client neither received nor expects to receive any compensation from Gudusoft in exchange.
|
||||
|
||||
This project has been created in good faith and with the intention to contribute to a great product, which the author of this wrapper has been using for free during its development period.
|
||||
|
||||
The code is free to use for anyone intending to use SQLFlow API in any manner.
|
||||
|
||||
Thanks to Mr. James Wang, CTO of Gudusoft, for his kind support, for allowing me to use the tool during its development, and for letting me contribute to his company's project.
|
||||
|
|
@ -0,0 +1,371 @@
|
|||
'''
|
||||
************************************************************************************************************************************************************
|
||||
|
||||
Properties
|
||||
================
|
||||
NAME: SQLFlow API Python Client
|
||||
DESCRIPTION: A simple wrapper written for Gudusoft's SQLFlow API.
|
||||
AUTHOR: Bence Kiss
|
||||
ORIGIN DATE: 21-MAR-2020
|
||||
PYTHON VERSION: 3.7.3
|
||||
|
||||
Additional Notes
|
||||
================
|
||||
-
|
||||
|
||||
|
||||
ADDITIONAL INFORMATION
|
||||
============================================================================================================================================================
|
||||
Resources URL
|
||||
============================== ============================================================================================================================
|
||||
API configuration https://api.gudusoft.com/gspLive_backend/swagger-ui.html#!/sqlflow-controller/generateSqlflowUsingPOST
|
||||
------------------------------ ----------------------------------------------------------------------------------------------------------------------------
|
||||
SQLFlow Git repo https://github.com/sqlparser/sqlflow_public
|
||||
------------------------------ ----------------------------------------------------------------------------------------------------------------------------
|
||||
Dataflow relationship types https://github.com/sqlparser/sqlflow_public/blob/master/dbobjects_relationship.md
|
||||
------------------------------ ----------------------------------------------------------------------------------------------------------------------------
|
||||
SQLFlow front end http://www.gudusoft.com/sqlflow/#/
|
||||
------------------------------ ----------------------------------------------------------------------------------------------------------------------------
|
||||
C# API client https://github.com/sqlparser/sqlflow_public/tree/master/api/client/csharp
|
||||
------------------------------ ----------------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
|
||||
REVISION HISTORY
|
||||
============================================================================================================================================================
|
||||
Version Change Date Author Narrative
|
||||
======= =============== ====== ============================================================================================================================
|
||||
1.0.0 21-MAR-2020 BK Created
|
||||
------- --------------- ------ ----------------------------------------------------------------------------------------------------------------------------
|
||||
0.0.0 DD-MMM-YYYY XXX What changed and why...
|
||||
------- --------------- ------ ----------------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
************************************************************************************************************************************************************
|
||||
'''
|
||||
|
||||
# ==========================================================================================================================================================
|
||||
|
||||
# Import required modules
|
||||
|
||||
import os
|
||||
import requests
|
||||
import json
|
||||
|
||||
# ==========================================================================================================================================================
|
||||
|
||||
class SQLFlowClient:
|
||||
|
||||
'''
|
||||
|
||||
Class description
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
Class containing various functions to use SQLFlow API.
|
||||
|
||||
Class instance variables
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
- api_key: The token needed for authorization. Default public token can be found here:
|
||||
|
||||
https://github.com/sqlparser/sqlflow_public/tree/master/api/client/csharp
|
||||
|
||||
- api_url: Default base URL of the API requests. Can be changed at class initialization.
|
||||
|
||||
Class methods
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
- configure_api: Set the API parameters for the requests.
|
||||
- analyze_script: Submit a single SQL script using POST request to the API. Responses are stored in the class instance's results variable.
|
||||
- export_responses: Export all stored API responses to a target folder as JSON files.
|
||||
- mass_process_scripts: Process all SQL scripts found in a directory tree, optionally exporting results to a designated folder.
|
||||
|
||||
Class dependencies
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
Packages used in the script considered to be core Python packages.
|
||||
|
||||
- os: Used to handle input/output file and folder paths.
|
||||
- requests: Used to generate POST requests and submit script files to the API.
|
||||
- json: Used to process API responses when it comes to exporting.
|
||||
|
||||
'''
|
||||
|
||||
# ==========================================================================================================================================================
|
||||
# ==========================================================================================================================================================
|
||||
|
||||
def __init__(self,
|
||||
api_key = 'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYwMzc1NjgwMCwiaWF0IjoxNTcyMjIwODAwfQ.EhlnJO7oqAHdr0_bunhtrN-TgaGbARKvTh2URTxu9iU',
|
||||
api_url = 'https://api.gudusoft.com/gspLive_backend/sqlflow/generation/sqlflow'
|
||||
):
|
||||
|
||||
'''
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
Initialize SQLFlow API client.
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
'''
|
||||
|
||||
# Set instance variables
|
||||
|
||||
self.key = api_key
|
||||
|
||||
self.url = api_url
|
||||
|
||||
# Set default request header
|
||||
|
||||
self.headers = {'Accept': 'application/json;charset=utf-8',
|
||||
'Authorization': self.key
|
||||
}
|
||||
|
||||
# =============================================================================
|
||||
|
||||
# Set lists of allowed API configuration values
|
||||
|
||||
# List of allowed database vendors
|
||||
|
||||
self.dbvendors = ['dbvbigquery',
|
||||
'dbvcouchbase',
|
||||
'dbvdb2',
|
||||
'dbvgreenplum',
|
||||
'dbvhana',
|
||||
'dbvhive',
|
||||
'dbvimpala',
|
||||
'dbvinformix',
|
||||
'dbvmdx',
|
||||
'dbvmysql',
|
||||
'dbvnetezza',
|
||||
'dbvopenedge',
|
||||
'dbvoracle',
|
||||
'dbvpostgresql',
|
||||
'dbvredshift',
|
||||
'dbvsnowflake',
|
||||
'dbvmssql',
|
||||
'dbvsybase',
|
||||
'dbvteradata',
|
||||
'dbvvertica'
|
||||
]
|
||||
|
||||
# List of allowed data relationship types
|
||||
|
||||
self.reltypes = ['fdd',
|
||||
'fdr',
|
||||
'frd',
|
||||
'fddi',
|
||||
'join'
|
||||
]
|
||||
|
||||
# List of allowed values for Boolean parameters
|
||||
|
||||
self.switches = ['true',
|
||||
'false'
|
||||
]
|
||||
|
||||
# =============================================================================
|
||||
|
||||
# Set default API configuration
|
||||
|
||||
self.config = {'dbvendor': 'dbvmssql',
|
||||
'showRelationType': 'fdd',
|
||||
'simpleOutput': 'false',
|
||||
'ignoreRecordSet': 'false'
|
||||
}
|
||||
|
||||
# Variable to store API responses
|
||||
|
||||
self.results = dict()
|
||||
|
||||
# ==========================================================================================================================================================
|
||||
# ==========================================================================================================================================================
|
||||
|
||||
def configure_api(self,
|
||||
db_vendor,
|
||||
rel_type,
|
||||
simple_output,
|
||||
ignore_rs
|
||||
):
|
||||
|
||||
'''
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
Configure the API request parameters. Only works if all parameters are provided.
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
'''
|
||||
|
||||
# Check if the provided configuration values are valid
|
||||
|
||||
if db_vendor in self.dbvendors and rel_type in self.reltypes and simple_output in self.switches and ignore_rs in self.switches:
|
||||
|
||||
# Assign valid configuration parameters to config variable
|
||||
|
||||
self.config = {'dbvendor': db_vendor,
|
||||
'showRelationType': rel_type,
|
||||
'simpleOutput': simple_output,
|
||||
'ignoreRecordSet': ignore_rs
|
||||
}
|
||||
|
||||
# If any of the provided parameters are invalid, quit function and notify user
|
||||
|
||||
else:
|
||||
|
||||
print('\n\n' + '=' * 75 + '\n\nOne or more configuration values are missing or invalid. Please try again.\n\nAllowed values for db_vendor:\n\n' +
|
||||
' / '.join(self.dbvendors) +
|
||||
'\n\nAllowed values for relation_type:\n\n' +
|
||||
' / '.join(self.reltypes) +
|
||||
'\n\nAllowed values for simple_output and ignore_rs:\n\n' +
|
||||
' / '.join(self.switches) +
|
||||
'\n\n' + '=' * 75
|
||||
)
|
||||
|
||||
# ==========================================================================================================================================================
|
||||
# ==========================================================================================================================================================
|
||||
|
||||
def analyze_script(self,
|
||||
script_path
|
||||
):
|
||||
|
||||
'''
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
Submit SQL script file for SQLFlow analysis.
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
'''
|
||||
|
||||
# Compile the API request URL
|
||||
|
||||
configuredURL = self.url + '?' + ''.join(str(parameter) + '=' + str(setting) + '&' for parameter, setting in self.config.items()).rstrip('&')
|
||||
|
||||
# =============================================================================
|
||||
|
||||
# Check if provided path points to a SQL script file
|
||||
|
||||
if os.path.isfile(script_path) and script_path.lower().endswith('.sql'):
|
||||
|
||||
# Open the script file in binary mode so it could be submitted in a POST request
|
||||
|
||||
with open(script_path, mode = 'rb') as scriptFile:
|
||||
|
||||
# Use requests module's POST function to submit file and retrieve API response
|
||||
|
||||
response = requests.post(configuredURL, files = {'sqlfile': scriptFile}, headers = self.headers)
|
||||
|
||||
# =============================================================================
|
||||
|
||||
# Add the request response to the class variable if response was OK
|
||||
|
||||
if response.status_code == 200:
|
||||
|
||||
self.results[script_path] = json.loads(response.text)
|
||||
|
||||
# If response returned a different status, quit function and notify user
|
||||
|
||||
else:
|
||||
|
||||
print('\nAn invalid response was returned for < ' + os.path.basename(script_path) + ' >.\n', '\nStatus code: ' + str(response.status_code) + '\n')
|
||||
|
||||
# If script file's path is invalid, quit function and notify user
|
||||
|
||||
else:
|
||||
|
||||
print('\nProvided path is not pointing to a SQL script file. Please try again.\n')
|
||||
|
||||
# ==========================================================================================================================================================
|
||||
# ==========================================================================================================================================================
|
||||
|
||||
def export_results(self,
|
||||
export_folder
|
||||
):
|
||||
|
||||
'''
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
Export all stored API responses as JSON files to a specified folder.
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
'''
|
||||
|
||||
# Check if there are responses to be exported
|
||||
|
||||
if len(self.results) != 0:
|
||||
|
||||
# Create the directory for the result files if it doesn't exist
|
||||
|
||||
os.makedirs(export_folder, exist_ok = True)
|
||||
|
||||
# =============================================================================
|
||||
|
||||
# Iterate the API results stored in the class
|
||||
|
||||
for scriptpath, response in self.results.items():
|
||||
|
||||
# Create a JSON file and export API results of each processed script file into the JSON file
|
||||
|
||||
with open(os.path.join(export_folder, os.path.basename(scriptpath).replace('.sql', '') + '.json'), mode = 'w') as resultFile:
|
||||
|
||||
# Write the response into the JSON file
|
||||
|
||||
json.dump(response, resultFile)
|
||||
|
||||
# If there are no responses yet, quit function and notify user
|
||||
|
||||
else:
|
||||
|
||||
print('\nThere are no API responses stored by the client yet.\n')
|
||||
|
||||
# ==========================================================================================================================================================
|
||||
# ==========================================================================================================================================================
|
||||
|
||||
def mass_process_scripts(self,
|
||||
source_folder,
|
||||
export_folder = None):
|
||||
|
||||
'''
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
Scan a directory tree for SQL script files and pass each to an API call. Optionally export results to a desired folder.
|
||||
------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
'''
|
||||
|
||||
# List to store SQL script file paths found in source folder
|
||||
|
||||
scriptPaths = list()
|
||||
|
||||
# =============================================================================
|
||||
|
||||
# Scan source folder and subfolders
|
||||
|
||||
for (dirpath, dirnames, filenames) in os.walk(source_folder):
|
||||
|
||||
# Collect all paths which refer to SQL scripts
|
||||
|
||||
scriptPaths += [os.path.join(dirpath, file) for file in filenames if os.path.isfile(os.path.join(dirpath, file)) and file.lower().endswith('.sql')]
|
||||
|
||||
# =============================================================================
|
||||
|
||||
# If there is at least one SQL script in the directory tree, execute the API calls
|
||||
|
||||
if len(scriptPaths) != 0:
|
||||
|
||||
# Iterate the SQL script paths and call the API for each file
|
||||
|
||||
[self.analyze_script(script_path = path) for path in scriptPaths]
|
||||
|
||||
# =============================================================================
|
||||
|
||||
# If an export folder is provided, save the responses to that folder (but only those analyzed during this function call)
|
||||
|
||||
if export_folder:
|
||||
|
||||
# Store the current set of API responses
|
||||
|
||||
allResults = self.results
|
||||
|
||||
# Filter for responses related to current function call
|
||||
|
||||
self.results = {scriptpath: response for scriptpath, response in self.results.items() if scriptpath in scriptPaths}
|
||||
|
||||
# Export the responses of the current function call to the desired target folder
|
||||
|
||||
self.export_results(export_folder = export_folder)
|
||||
|
||||
# Reset the results variable to contain all responses again
|
||||
|
||||
self.results = allResults
|
||||
|
||||
# If no SQL script files were found in the directory tree, quit function and notify user
|
||||
|
||||
else:
|
||||
|
||||
print('\nNo SQL script files have been found in the specified source folder and its subfolders.\n')
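# ==========================================================================================================================================================
# ==========================================================================================================================================================

# Example usage (an illustrative sketch only; 'example.sql' and 'results' are placeholder paths,
# and the default public token set above may have expired, so pass your own api_key if requests are rejected):
#
# if __name__ == '__main__':
#
#     client = SQLFlowClient()
#
#     client.configure_api(db_vendor = 'dbvmssql', rel_type = 'fdd', simple_output = 'false', ignore_rs = 'false')
#
#     client.analyze_script(script_path = 'example.sql')
#
#     client.export_results(export_folder = 'results')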
|
||||
|
|
@ -0,0 +1,121 @@
|
|||
## How to use the Rest API of SQLFlow
|
||||
|
||||
This article describes how to use the Rest API provided by SQLFlow to
|
||||
communicate with the SQLFlow server and get the generated metadata and data lineage.
|
||||
|
||||
In this article, we use `Curl` to demonstrate the usage of the Rest API,
|
||||
but you can use any programming language you prefer.
|
||||
|
||||
### Prerequisites
|
||||
|
||||
In order to use the SQLFlow rest API, you may connect to the [**SQLFlow Cloud server**](https://sqlflow.gudusoft.com),
|
||||
or set up a [**SQLFlow on-premise version**](https://www.gudusoft.com/sqlflow-on-premise-version/) on your own server.
|
||||
|
||||
|
||||
1. **SQLFlow Cloud server**
|
||||
|
||||
- User ID
|
||||
- Secret Key
|
||||
|
||||
If you want to connect to [the SQLFlow Cloud Server](https://sqlflow.gudusoft.com), you may [request a 30-day premium account](https://www.gudusoft.com/request-a-premium-account/) to
|
||||
[get the necessary userId and secret code](/sqlflow-userid-secret.md).
|
||||
|
||||
|
||||
2. **SQLFlow on-premise version**
|
||||
|
||||
Please [check here](https://github.com/sqlparser/sqlflow_public/blob/master/install_sqlflow.md) to see how to install the SQLFlow on-premise version on your own server.
|
||||
|
||||
- User ID
|
||||
- Secret Key
|
||||
|
||||
Always set userId to `gudu|0123456789` and keep `userSecret` empty when connecting to the SQLFlow on-premise version.
|
||||
|
||||
|
||||
### Difference of the API calls between SQLFlow Cloud server and SQLFlow on-premise version
|
||||
|
||||
1. TOKEN is not needed in the API calls when connecting to the SQLFlow on-premise version.
|
||||
2. userId is always set to `gudu|0123456789` and `userSecret` is left empty when connecting to the SQLFlow on-premise version.
|
||||
3. The server port is 8081 by default for the SQLFlow on-premise version, and there is no need to specify the port when connecting to the SQLFlow Cloud server.
|
||||
|
||||
Regarding the server port of the SQLFlow on-premise version, please [check here](https://github.com/sqlparser/sqlflow_public/tree/master/grabit#1-sqlflow-server) for more information.
|
||||
|
||||
|
||||
|
||||
### Using the Rest API
|
||||
|
||||
#### 1. Generate a token
|
||||
|
||||
|
||||
Once you have the `userid` and `secret key`, the first API you need to call is:
|
||||
|
||||
```
|
||||
/gspLive_backend/user/generateToken
|
||||
```
|
||||
|
||||
This API will return a temporary token that needs to be used in the subsequent API calls.
|
||||
|
||||
**SQLFlow Cloud Server**
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/user/generateToken" -H "Request-Origion:testClientDemo" -H "accept:application/json;charset=utf-8" -H "Content-Type:application/x-www-form-urlencoded;charset=UTF-8" -d "secretKey=YOUR SECRET KEY" -d "userId=YOUR USER ID HERE"
|
||||
```
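
If you prefer Python to `Curl`, a minimal sketch of the same token request using the `requests` package could look like the snippet below; the URL, headers and form fields simply mirror the curl command above, and the placeholders must be replaced with your own values.

```python
import requests

# Same endpoint and form fields as the curl command above
response = requests.post(
    'https://api.gudusoft.com/gspLive_backend/user/generateToken',
    headers = {'Request-Origion': 'testClientDemo',
               'accept': 'application/json;charset=utf-8'},
    data = {'secretKey': 'YOUR SECRET KEY', 'userId': 'YOUR USER ID HERE'}
)

# Print the raw response; the returned token is used in the subsequent API calls
print(response.status_code, response.text)
```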
|
||||
|
||||
**SQLFlow on-premise version**
|
||||
|
||||
A token is not needed in the on-premise version, so there is no need to generate one.
|
||||
|
||||
|
||||
#### 2. Generate the data lineage
|
||||
|
||||
Call this API with the SQL query to get a result that includes the data lineage.
|
||||
|
||||
```
|
||||
/gspLive_backend/sqlflow/generation/sqlflow
|
||||
```
|
||||
|
||||
|
||||
**SQLFlow Cloud Server**
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/generation/sqlflow?showRelationType=fdd" -H "Request-Origion:testClientDemo" -H "accept:application/json;charset=utf-8" -H "Content-Type:multipart/form-data" -F "sqlfile=" -F "dbvendor=dbvoracle" -F "ignoreRecordSet=false" -F "simpleOutput=false" -F "sqltext=CREATE VIEW vsal as select * from emp" -F "userId=YOUR USER ID HERE" -F "token=YOUR TOKEN HERE"
|
||||
```
|
||||
|
||||
**SQLFlow on-premise version**
|
||||
```
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/generation/sqlflow?showRelationType=fdd" -H "Request-Origion:testClientDemo" -H "accept:application/json;charset=utf-8" -H "Content-Type:multipart/form-data" -F "sqlfile=" -F "dbvendor=dbvoracle" -F "ignoreRecordSet=false" -F "simpleOutput=false" -F "sqltext=CREATE VIEW vsal as select * from emp" -F "userId=gudu|0123456789"
|
||||
```
|
||||
|
||||
|
||||
#### 3. Export the data lineage in csv format
|
||||
|
||||
Call this API with the SQL file to get a CSV result that includes the data lineage.
|
||||
|
||||
```
|
||||
/gspLive_backend/sqlflow/generation/sqlflow/exportLineageAsCsv
|
||||
```
|
||||
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/generation/sqlflow/exportLineageAsCsv" -H "accept:application/json;charset=utf-8" -H "Content-Type:multipart/form-data" -F "userId=YOUR USER ID HERE" -F "token=YOUR TOKEN HERE" -F "dbvendor=dbvoracle" -F "showRelationType=fdd" -F "sqlfile=@YOUR UPLOAD FILE PATH HERE" --output YOUR DOWNLOAD FILE PATH HERE
|
||||
```
|
||||
|
||||
Sample:
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/generation/sqlflow/exportLineageAsCsv" -H "accept:application/json;charset=utf-8" -H "Content-Type:multipart/form-data" -F "userId=auth0|5fc8e95991a780006f180d4d" -F "token=YOUR TOKEN HERE" -F "dbvendor=dbvoracle" -F "showRelationType=fdd" -F "sqlfile=@c:\prg\tmp\demo.sql" --output c:\prg\tmp\demo.csv
|
||||
```
|
||||
|
||||
|
||||
**Note:**
|
||||
* -H "Content-Type:multipart/form-data" is required.
|
||||
* Add **@** before the upload file path
|
||||
* --output is required.
|
||||
* Optionally, if you only want table-to-table relations, please add **-F "tableToTable=true"**
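
As a rough Python equivalent of the curl sample above (file paths and credentials are placeholders), the returned CSV can be written to disk just like curl's `--output` does:

```python
import requests

# Upload the SQL script as a multipart form field; (None, value) sends a plain text field
with open('demo.sql', 'rb') as script_file:
    response = requests.post(
        'https://api.gudusoft.com/gspLive_backend/sqlflow/generation/sqlflow/exportLineageAsCsv',
        files = {'sqlfile': script_file,
                 'userId': (None, 'YOUR USER ID HERE'),
                 'token': (None, 'YOUR TOKEN HERE'),
                 'dbvendor': (None, 'dbvoracle'),
                 'showRelationType': (None, 'fdd')}
    )

# Equivalent of curl's --output: write the returned CSV to a local file
with open('demo.csv', 'wb') as csv_file:
    csv_file.write(response.content)
```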
|
||||
|
||||
|
||||
#### 4. Submit multiple SQL files and get the data lineage in CSV, JSON, or graphml format.
|
||||
<a href="sqlflow-job-api-tutorial.md">Rest APIs: Job</a>
|
||||
|
||||
### The full reference to the Rest APIs
|
||||
|
||||
[SQLFlow rest API reference](sqlflow_api.md)
|
||||
|
||||
### Troubleshooting
|
||||
|
||||
- Under Windows, you may need to add the option `--ssl-no-revoke` to work around certificate revocation check errors, e.g. `curl --ssl-no-revoke`
|
||||
|
|
@ -0,0 +1,255 @@
|
|||
- [SQLFlow Job API tutorial](#sqlflow-job-api-tutorial)
|
||||
* [1. Prerequisites](#1-prerequisites)
|
||||
+ [Difference of the API calls between SQLFlow Cloud server and SQLFlow on-premise version](#difference-of-the-api-calls-between-sqlflow-cloud-server-and-sqlflow-on-premise-version)
|
||||
+ [Generate a token](#generate-a-token)
|
||||
* [2. Different type of Job](#2-different-type-of-job)
|
||||
* [3. Simple job rest API](#3-simple-job-rest-api)
|
||||
+ [1. Submit a sqlflow job](#1-submit-a-sqlflow-job)
|
||||
+ [2. Get job status](#2-get-job-status)
|
||||
+ [3. Export data lineage](#3-export-data-lineage)
|
||||
* [4. Regular job rest API](#4-regular-job-rest-api)
|
||||
|
||||
|
||||
## SQLFlow Job API tutorial
|
||||
|
||||
This article describes how to use the Job Rest API provided by SQLFlow to
|
||||
communicate with the SQLFlow server and export the data lineage in json, csv, or graphml format.
|
||||
|
||||
### 1. Prerequisites
|
||||
In order to use the SQLFlow rest API, you may connect to the [**SQLFlow Cloud server**](https://sqlflow.gudusoft.com),
|
||||
or set up a [**SQLFlow on-premise version**](https://www.gudusoft.com/sqlflow-on-premise-version/) on your own server.
|
||||
|
||||
|
||||
1. **SQLFlow Cloud server**
|
||||
|
||||
- User ID
|
||||
- Secret Key
|
||||
|
||||
If you want to connect to [the SQLFlow Cloud Server](https://sqlflow.gudusoft.com), you may [request a 30-day premium account](https://www.gudusoft.com/request-a-premium-account/) to
|
||||
[get the necessary userId and secret code](/sqlflow-userid-secret.md).
|
||||
|
||||
|
||||
2. **SQLFlow on-premise version**
|
||||
|
||||
Please [check here](https://github.com/sqlparser/sqlflow_public/blob/master/install_sqlflow.md) to see how to install the SQLFlow on-premise version on your own server.
|
||||
|
||||
- User ID
|
||||
- Secret Key
|
||||
|
||||
Always set userId to `gudu|0123456789` and keep `userSecret` empty when connecting to the SQLFlow on-premise version.
|
||||
|
||||
|
||||
#### Difference of the API calls between SQLFlow Cloud server and SQLFlow on-premise version
|
||||
|
||||
1. TOKEN is not needed in the API calls when connecting to the SQLFlow on-premise version.
|
||||
2. userId is always set to `gudu|0123456789` and `userSecret` is left empty when connecting to the SQLFlow on-premise version.
|
||||
3. The server port is 8081 by default for the SQLFlow on-premise version, and there is no need to specify the port when connecting to the SQLFlow Cloud server.
|
||||
|
||||
Regarding the server port of the SQLFlow on-premise version, please [check here](https://github.com/sqlparser/sqlflow_public/tree/master/grabit#1-sqlflow-server) for more information.
|
||||
|
||||
#### Generate a token
|
||||
|
||||
Once you have the `userid` and `secret key`, the first API you need to call is:
|
||||
|
||||
```
|
||||
/gspLive_backend/user/generateToken
|
||||
```
|
||||
|
||||
This API will return a temporary token that needs to be used in the subsequent API calls.
|
||||
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/user/generateToken" -H "Request-Origion:testClientDemo" -H "accept:application/json;charset=utf-8" -H "Content-Type:application/x-www-form-urlencoded;charset=UTF-8" -d "secretKey=YOUR SECRET KEY" -d "userId=YOUR USER ID HERE"
|
||||
```
|
||||
|
||||
For more details, please see https://github.com/sqlparser/sqlflow_public/edit/master/api/readme.md
|
||||
|
||||
### 2. Different type of Job
|
||||

|
||||
|
||||
### 3. Simple job rest API
|
||||
|
||||
#### 1. Submit a sqlflow job
|
||||
|
||||
Call this API with the SQL files to get a result that includes the data lineage. A SQLFlow job accepts both multiple files and a zip archive.
|
||||
|
||||
```
|
||||
/gspLive_backend/sqlflow/job/submitUserJob
|
||||
```
|
||||
|
||||
Example in `Curl`
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/job/submitUserJob" -H "accept:application/json;charset=utf-8" -H "Content-Type:multipart/form-data" -F "userId=YOUR USER ID HERE" -F "token=YOUR TOKEN HERE" -F "sqlfiles=@FIRST FILE PATH" -F "sqlfiles=@SECOND FILE PATH" -F "dbvendor=dbvmssql" -F "jobName=job1"
|
||||
```
|
||||
|
||||
**Note:**
|
||||
* **-H "Content-Type:multipart/form-data"** is required
|
||||
* Add **@** before the file path
|
||||
|
||||
Return data:
|
||||
```json
|
||||
{
|
||||
"code":200,
|
||||
"data":{
|
||||
"jobId":"c359aef4bd9641d697732422debd8055",
|
||||
"jobName":"job1",
|
||||
"userId":"google-oauth2|104002923119102769706",
|
||||
"dbVendor":"dbvmssql",
|
||||
"dataSource":{
|
||||
|
||||
},
|
||||
"fileNames":["1.sql","1.zip"],
|
||||
"createTime":"2020-12-15 15:14:39",
|
||||
"status":"create"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Please record the jobId field from the response; it is needed by the follow-up calls, as shown in the sketch below.
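
Here is a rough Python sketch of the same submission (paths and credentials are placeholders); note that repeating the `sqlfiles` field requires a list of tuples rather than a dict, and the jobId can be read straight from the JSON response shown above:

```python
import requests

url = 'https://api.gudusoft.com/gspLive_backend/sqlflow/job/submitUserJob'

# A list of tuples lets the sqlfiles field appear more than once, like repeating -F in curl
fields = [
    ('userId', (None, 'YOUR USER ID HERE')),
    ('token', (None, 'YOUR TOKEN HERE')),
    ('dbvendor', (None, 'dbvmssql')),
    ('jobName', (None, 'job1')),
    ('sqlfiles', open('1.sql', 'rb')),
    ('sqlfiles', open('1.zip', 'rb')),
]

response = requests.post(url, files = fields)

# Keep the jobId for the follow-up status and export calls
job_id = response.json()['data']['jobId']
print(job_id)
```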
|
||||
|
||||
#### 2. Get job status
|
||||
|
||||
* Get the specified user job's status and summary
|
||||
|
||||
```
|
||||
/gspLive_backend/sqlflow/job/displayUserJobSummary
|
||||
```
|
||||
|
||||
Example in `Curl`
|
||||
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/job/displayUserJobSummary" -F "jobId=c359aef4bd9641d697732422debd8055" -F "userId=YOUR USER ID HERE" -F "token=YOUR TOKEN HERE"
|
||||
```
|
||||
|
||||
Return data:
|
||||
```json
|
||||
{
|
||||
"code":200,
|
||||
"data":{
|
||||
"jobId":"c359aef4bd9641d697732422debd8055",
|
||||
"jobName":"job1",
|
||||
"userId":"google-oauth2|104002923119102769706",
|
||||
"dbVendor":"dbvmssql",
|
||||
"dataSource":{
|
||||
|
||||
},
|
||||
"fileNames":["1.sql","1.zip"],
|
||||
"createTime":"2020-12-15 15:14:39",
|
||||
"status":"success",
|
||||
"sessionId":"fe5898d4e1b1a7782352b50a8203ca24c04f5513446e9fb059fc4d584fab4dbf_1608045280033"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
* Get the status and summary of all jobs (including history jobs)
|
||||
|
||||
```
|
||||
/gspLive_backend/sqlflow/job/displayUserJobsSummary
|
||||
```
|
||||
|
||||
Example in `Curl`
|
||||
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/job/displayUserJobsSummary" -F "userId=YOUR USER ID HERE" -F "token=YOUR TOKEN HERE"
|
||||
```
|
||||
|
||||
|
||||
|
||||
#### 3. Export data lineage
|
||||
|
||||
When the job status is **success**, you can export the data lineage in json, csv, or graphml format.
|
||||
|
||||
* 3.1 Export data lineage in json format
|
||||
|
||||
```
|
||||
/gspLive_backend/sqlflow/job/exportLineageAsJson
|
||||
```
|
||||
|
||||
Example in `Curl`
|
||||
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/job/exportLineageAsJson" -F "userId=YOUR USER ID HERE" -F "token=YOUR TOKEN HERE" -F "jobId=c359aef4bd9641d697732422debd8055" --output lineage.json
|
||||
```
|
||||
**Note:**
|
||||
> If you want to get table-to-table relations, please add the option -F "tableToTable=true"
|
||||
|
||||
* 3.2 Export data lineage in csv format
|
||||
|
||||
```
|
||||
/gspLive_backend/sqlflow/job/exportFullLineageAsCsv
|
||||
```
|
||||
|
||||
Example in `Curl`
|
||||
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/job/exportFullLineageAsCsv" -F "userId=YOUR USER ID HERE" -F "token=YOUR TOKEN HERE" -F "jobId=c359aef4bd9641d697732422debd8055" --output lineage.csv
|
||||
```
|
||||
|
||||
**Note:**
|
||||
> If you want to get table-to-table relations, please add the option -F "tableToTable=true"
|
||||
|
||||
> If you want to change the csv delimiter, please add the option -F "delimiter=<delimiter char>"
|
||||
|
||||
|
||||
* 3.3 Export data lineage in graphml format; you can view the lineage graph in the yEd Graph Editor
|
||||
|
||||
```
|
||||
/gspLive_backend/sqlflow/job/exportLineageAsGraphml
|
||||
```
|
||||
|
||||
Example in `Curl`
|
||||
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/job/exportLineageAsGraphml" -F "userId=YOUR USER ID HERE" -F "token=YOUR TOKEN HERE" -F "jobId=c359aef4bd9641d697732422debd8055" --output lineage.graphml
|
||||
```
|
||||
|
||||
**Note:**
|
||||
> If you want to get table-to-table relations, please add the option -F "tableToTable=true"
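
Putting steps 2 and 3 together, a minimal Python sketch (jobId and credentials are placeholders) might poll the job summary until the status field is **success** and then download the JSON lineage:

```python
import time
import requests

base_url = 'https://api.gudusoft.com/gspLive_backend/sqlflow/job'

fields = {'userId': (None, 'YOUR USER ID HERE'),
          'token': (None, 'YOUR TOKEN HERE'),
          'jobId': (None, 'c359aef4bd9641d697732422debd8055')}

# Step 2: poll the job summary until the job has finished successfully
while True:
    summary = requests.post(base_url + '/displayUserJobSummary', files = fields).json()
    if summary['data']['status'] == 'success':
        break
    time.sleep(5)

# Step 3: export the lineage in JSON format (add tableToTable=true for table-level relations only)
lineage = requests.post(base_url + '/exportLineageAsJson', files = fields)

with open('lineage.json', 'wb') as output_file:
    output_file.write(lineage.content)
```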
|
||||
|
||||
### 4. Regular job rest API
|
||||
|
||||
#### 1. Submit a regular job
|
||||
|
||||
Call this API with the SQL files to get a result that includes the data lineage. A SQLFlow job accepts both multiple files and a zip archive.
|
||||
|
||||
If the job is incremental, please set incremental=true
|
||||
* On the first submit, jobId is null; record the jobId field from the response message.
|
||||
* On subsequent submits, jobId can't be null; fill in the jobId returned by the first submit's response.
|
||||
|
||||
```
|
||||
/gspLive_backend/sqlflow/job/submitPersistJob
|
||||
```
|
||||
|
||||
Example in `Curl`
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/job/submitPersistJob" -H "accept:application/json;charset=utf-8" -H "Content-Type:multipart/form-data" -F "userId=YOUR USER ID HERE" -F "token=YOUR TOKEN HERE" -F "sqlfiles=@FIRST FILE PATH" -F "sqlfiles=@SECOND FILE PATH" -F "dbvendor=dbvmssql" -F "jobName=job1" -F "incremental=true"
|
||||
```
|
||||
|
||||
Incremental submit in `Curl`
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/job/submitPersistJob" -H "accept:application/json;charset=utf-8" -H "Content-Type:multipart/form-data" -F "userId=YOUR USER ID HERE" -F "token=YOUR TOKEN HERE" -F "sqlfiles=@FIRST FILE PATH" -F "sqlfiles=@SECOND FILE PATH" -F "dbvendor=dbvmssql" -F "jobName=job1" -F "incremental=true" -F "jobId=JobId OF FIRST SUBMIT"
|
||||
```
|
||||
|
||||
**Note:**
|
||||
* **-H "Content-Type:multipart/form-data"** is required
|
||||
* Add **@** before the file path
|
||||
|
||||
Return data:
|
||||
```json
|
||||
{
|
||||
"code":200,
|
||||
"data":{
|
||||
"jobId":"c359aef4bd9641d697732422debd8055",
|
||||
"jobName":"job1",
|
||||
"userId":"google-oauth2|104002923119102769706",
|
||||
"dbVendor":"dbvmssql",
|
||||
"dataSource":{
|
||||
|
||||
},
|
||||
"fileNames":["1.sql","1.zip"],
|
||||
"createTime":"2020-12-15 15:14:39",
|
||||
"status":"create"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Please record the jobId field; the sketch below shows how the incremental flow reuses it.
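
A hedged Python sketch of the incremental flow described above (paths and credentials are placeholders): the first submit omits jobId, and every later submit re-uses the jobId returned by the first one.

```python
import requests

url = 'https://api.gudusoft.com/gspLive_backend/sqlflow/job/submitPersistJob'

common = [('userId', (None, 'YOUR USER ID HERE')),
          ('token', (None, 'YOUR TOKEN HERE')),
          ('dbvendor', (None, 'dbvmssql')),
          ('jobName', (None, 'job1')),
          ('incremental', (None, 'true'))]

# First submit: no jobId yet, so record the one returned in the response
first = requests.post(url, files = common + [('sqlfiles', open('first.sql', 'rb'))])
job_id = first.json()['data']['jobId']

# Later submits: pass the jobId from the first response so the job is updated incrementally
second = requests.post(url, files = common + [('sqlfiles', open('second.sql', 'rb')),
                                              ('jobId', (None, job_id))])
print(second.json()['data']['status'])
```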
|
||||
|
|
@ -0,0 +1,465 @@
|
|||
# SQLFlow WebAPI
|
||||
|
||||
|
||||
## JWT Client API Authorization (for sqlflow client api call)
|
||||
* All of the restful requests are based on JWT authorization. Before accessing the sqlflow WebAPI, client user needs to obtain the corresponding JWT token for legal access.
|
||||
|
||||
* How to get JWT Token
|
||||
1. Log in on [the sqlflow web](https://sqlflow.gudusoft.com) and upgrade to a premium account.
|
||||
2. Hover over the login user image, select the Account menu item, and click the "generate" button to generate the user secret key.
|
||||
3. Once you have the user secret key, you can call the **/gspLive_backend/user/generateToken** api to obtain a token; the ttl of the new token is 24 hours.
|
||||
4. **/gspLive_backend/user/generateToken**
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **secretKey**: the secret key of sqlflow user for webapi request, required **true**
|
||||
|
||||
* How to use JWT Token for security authentication?
|
||||
* Each webapi contains two parameters, named userId and token.
|
||||
|
||||
## WebAPI
|
||||
|
||||
### Sqlflow Generation Interface
|
||||
|
||||
* **/sqlflow/generation/sqlflow/graph**
|
||||
* Description: generate sqlflow model and graph
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: The token is only used when connecting to the SQLFlow Cloud server, not when connecting to the SQLFlow on-premise version.
|
||||
* sqltext: sql text, optional
|
||||
* sqlfile: sql file, optional
|
||||
* **dbvendor**: database vendor, required **true**, available values:
|
||||
* dbvbigquery, dbvcouchbase,dbvdb2,dbvgreenplum,dbvhana,dbvhive,dbvimpala,dbvinformix,dbvmdx,dbvmysql,dbvnetezza,dbvopenedge,dbvoracle,dbvpostgresql,dbvredshift,dbvsnowflake,dbvmssql,dbvsybase,dbvteradata,dbvvertica
|
||||
* simpleOutput: simple output, ignore the intermediate results, default is false.
|
||||
* ignoreRecordSet: same as simpleOutput, but keeps the output of the top-level select list, default is false.
|
||||
* dataflowOfAggregateFunction: whether to treat the dataflow generated by an aggregate function as a direct dataflow, default is direct.
|
||||
* hideColumn: whether hide the column ui, required false, default value is false
|
||||
* ignoreFunction: whether ignore the function relations, required false, default value is false
|
||||
* showConstantTable: return constant or not, default is false.
|
||||
* showLinkOnly: whether show relation linked columns only, required false, default value is true
|
||||
* showRelationType: show relation type, optional, default value is **fdd**, multiple values separated by comma like fdd,frd,fdr. Available values:
|
||||
* **fdd**: value of target column from source column
|
||||
* **join**: combine rows from two or more tables, based on a related column between them
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
|
||||
|
||||
* Sample:
|
||||
* test sql:
|
||||
```sql
|
||||
select name from user
|
||||
```
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/generation/sqlflow/graph" -H "accept:application/json;charset=utf-8" -F "userId=your user id here" -F "token=your token here" -F "dbvendor=dbvoracle" -F "ignoreFunction=true" -F "ignoreRecordSet=true" -F "sqltext=select name from user"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"mode": "global",
|
||||
"summary": {
|
||||
...
|
||||
},
|
||||
"sqlflow": {
|
||||
"dbvendor": "dbvoracle",
|
||||
"dbobjs": [
|
||||
...
|
||||
]
|
||||
},
|
||||
"graph": {
|
||||
"elements": {
|
||||
"tables": [
|
||||
...
|
||||
],
|
||||
"edges": [
|
||||
...
|
||||
]
|
||||
},
|
||||
"tooltip": {},
|
||||
"relationIdMap": {
|
||||
...
|
||||
},
|
||||
"listIdMap": {
|
||||
...
|
||||
}
|
||||
}
|
||||
},
|
||||
"sessionId": "6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051"
|
||||
}
|
||||
```
|
||||
|
||||
* **/sqlflow/generation/sqlflow/selectedgraph**
|
||||
* Description: generate sqlflow model and selected dbobject graph
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **sessionId**: request sessionId, the value is from api **/sqlflow/generation/sqlflow/graph**, required **true**
|
||||
* database: selected database, required false
|
||||
* schema: selected schema, required false
|
||||
* table: selected table, required false
|
||||
* isReturnModel: whether return the sqlflow model, required false, default value is true
|
||||
* **dbvendor**: database vendor, required **true**, available values:
|
||||
* dbvbigquery, dbvcouchbase,dbvdb2,dbvgreenplum,dbvhana,dbvhive,dbvimpala,dbvinformix,dbvmdx,dbvmysql,dbvnetezza,dbvopenedge,dbvoracle,dbvpostgresql,dbvredshift,dbvsnowflake,dbvmssql,dbvsybase,dbvteradata,dbvvertica
|
||||
* showRelationType: show relation type, required false, default value is **fdd**, multiple values separated by comma like fdd,frd,fdr. Available values:
|
||||
* **fdd**: value of target column from source column
|
||||
* **frd**: the recordset count of the target column which is affected by the value of the source column
|
||||
* **fdr**: value of target column which is affected by the recordset count of source column
|
||||
* **join**: combine rows from two or more tables, based on a related column between them
|
||||
* simpleOutput: whether output relation simply, required false, default value is false
|
||||
* ignoreRecordSet: whether ignore the record sets, required false, default value is false
|
||||
* showLinkOnly: whether show relation linked columns only, required false, default value is true
|
||||
* hideColumn: whether hide the column ui, required false, default value is false
|
||||
* ignoreFunction: whether ignore the function relations, required false, default value is false
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql:
|
||||
```sql
|
||||
select name from user
|
||||
```
|
||||
* session id: `6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051`
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/generation/sqlflow/selectedgraph" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "dbvendor=dbvoracle" -F "ignoreFunction=true" -F "ignoreRecordSet=true" -F "isReturnModel=false" -F "sessionId=6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051" -F "table=user"
|
||||
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"mode": "global",
|
||||
"summary": {
|
||||
...
|
||||
},
|
||||
"graph": {
|
||||
"elements": {
|
||||
"tables": [
|
||||
...
|
||||
],
|
||||
"edges": [
|
||||
...
|
||||
]
|
||||
},
|
||||
"tooltip": {},
|
||||
"relationIdMap": {
|
||||
...
|
||||
},
|
||||
"listIdMap": {
|
||||
...
|
||||
}
|
||||
}
|
||||
},
|
||||
"sessionId": "6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051"
|
||||
}
|
||||
```
|
||||
* **/sqlflow/generation/sqlflow/getSelectedDbObjectInfo**
|
||||
* Description: get the selected dbobject information, such as file information, sql index, dbobject positions, sql which contains selected dbobject.
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **sessionId**: request sessionId, the value is from api **/sqlflow/generation/sqlflow/graph**, required **true**
|
||||
* **coordinates**: the selected dbobject positions, a json array string; the value is from api **/sqlflow/generation/sqlflow/graph**, required **true**
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql:
|
||||
```sql
|
||||
select name from user
|
||||
```
|
||||
* session id: `6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051`
|
||||
* coordinates: `[{'x':1,'y':8,'hashCode':'3630d5472af5f149fe3fb2202c8a338d'},{'x':1,'y':12,'hashCode':'3630d5472af5f149fe3fb2202c8a338d'}]`
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/generation/sqlflow/getSelectedDbObjectInfo" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "coordinates=[{'x':1,'y':8,'hashCode':'3630d5472af5f149fe3fb2202c8a338d'},{'x':1,'y':12,'hashCode':'3630d5472af5f149fe3fb2202c8a338d'}]" -F "sessionId=6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": [
|
||||
{
|
||||
"index": 0,
|
||||
"positions": [
|
||||
{
|
||||
"x": 1,
|
||||
"y": 8
|
||||
},
|
||||
{
|
||||
"x": 1,
|
||||
"y": 12
|
||||
}
|
||||
],
|
||||
"sql": "select name from user"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### Sqlflow User Job Interface
|
||||
* **/sqlflow/job/submitUserJob**
|
||||
* Description: submit a user job for multiple sql files; zip files are supported.
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **jobName**: job name, required **true**
|
||||
* **dbvendor**: database vendor, required **true**, available values:
|
||||
* dbvbigquery, dbvcouchbase,dbvdb2,dbvgreenplum,dbvhana,dbvhive,dbvimpala,dbvinformix,dbvmdx,dbvmysql,dbvnetezza,dbvopenedge,dbvoracle,dbvpostgresql,dbvredshift,dbvsnowflake,dbvmssql,dbvsybase,dbvteradata,dbvvertica
|
||||
* **sqlfiles**: request sql files, please use **multiple parts** to submit the sql files, required **true**
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql file: D:\sql.txt
|
||||
* curl command (**Note**: please add **@** before the sql file path):
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/job/submitUserJob" -H "accept:application/json;charset=utf-8" -H "Content-Type:multipart/form-data" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "sqlfiles=@D:/sql.txt" -F "dbvendor=dbvoracle" -F "jobName=job_test"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"jobId": "6218721f092540c5a771ca8f82986be7",
|
||||
"jobName": "job_test",
|
||||
"userId": "user_test",
|
||||
"dbVendor": "dbvoracle",
|
||||
"defaultDatabase": "",
|
||||
"defaultSchema": "",
|
||||
"fileNames": [
|
||||
"sql.txt"
|
||||
],
|
||||
"createTime": "2020-09-08 10:11:28",
|
||||
"status": "create"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
* **/sqlflow/job/displayUserJobsSummary**
|
||||
* Description: get the user jobs summary information.
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql file: D:\sql.txt
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/job/displayUserJobsSummary" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"total": 1,
|
||||
"success": 1,
|
||||
"partialSuccess": 0,
|
||||
"fail": 0,
|
||||
"jobIds": [
|
||||
"bb996c1ee5b741c5b4ff6c2c66c371dd"
|
||||
],
|
||||
"jobDetails": [
|
||||
{
|
||||
"jobId": "bb996c1ee5b741c5b4ff6c2c66c371dd",
|
||||
"jobName": "job_test",
|
||||
"userId": "user_test",
|
||||
"dbVendor": "dbvoracle",
|
||||
"fileNames": [
|
||||
"sql.txt"
|
||||
],
|
||||
"createTime": "2020-09-08 10:16:11",
|
||||
"status": "success",
|
||||
"sessionId": "a9f751281f8ef6936c554432e169359190d392565208931f201523e08036109d_1599531372233"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
* **/sqlflow/job/displayUserJobSummary**
|
||||
* Description: get the specified user job information.
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **jobId**: job id, the value is from user jobs summary detail, required **true**
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql file: D:\sql.txt
|
||||
* job id: bb996c1ee5b741c5b4ff6c2c66c371dd
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/job/displayUserJobSummary" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "jobId=bb996c1ee5b741c5b4ff6c2c66c371dd"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"total": 1,
|
||||
"success": 1,
|
||||
"partialSuccess": 0,
|
||||
"fail": 0,
|
||||
"jobIds": [
|
||||
"bb996c1ee5b741c5b4ff6c2c66c371dd"
|
||||
],
|
||||
"jobDetails": [
|
||||
{
|
||||
"jobId": "bb996c1ee5b741c5b4ff6c2c66c371dd",
|
||||
"jobName": "job_test",
|
||||
"userId": "user_test",
|
||||
"dbVendor": "dbvoracle",
|
||||
"fileNames": [
|
||||
"sql.txt"
|
||||
],
|
||||
"createTime": "2020-09-08 10:16:11",
|
||||
"status": "success",
|
||||
"sessionId": "a9f751281f8ef6936c554432e169359190d392565208931f201523e08036109d_1599531372233"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
* **/sqlflow/job/deleteUserJob**
|
||||
* Description: delete the user job by job id.
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **jobId**: job id, the value is from user job detail, required **true**
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql file: D:\sql.txt
|
||||
* job id: bb996c1ee5b741c5b4ff6c2c66c371dd
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/job/deleteUserJob" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "jobId=bb996c1ee5b741c5b4ff6c2c66c371dd"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"jobId": "bb996c1ee5b741c5b4ff6c2c66c371dd",
|
||||
"jobName": "job_test",
|
||||
"userId": "user_test",
|
||||
"dbVendor": "dbvoracle",
|
||||
"fileNames": [
|
||||
"sql.txt"
|
||||
],
|
||||
"createTime": "2020-09-08 10:16:11",
|
||||
"status": "delete",
|
||||
"sessionId": "a9f751281f8ef6936c554432e169359190d392565208931f201523e08036109d_1599531372233"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
* **/sqlflow/job/displayUserJobGraph**
|
||||
* Description: get the sqlflow job's model and graph
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **jobId**: job id, the value is from user jobs summary detail, required **true**
|
||||
* database: selected database, required false
|
||||
* schema: selected schema, required false
|
||||
* table: selected table, required false
|
||||
* isReturnModel: whether return the sqlflow model, required false, default value is true
|
||||
* showRelationType: show relation type, required false, default value is **fdd**, multiple values separated by comma like fdd,frd,fdr. Available values:
|
||||
* **fdd**: value of target column from source column
|
||||
* **frd**: the recordset count of the target column which is affected by the value of the source column
|
||||
* **fdr**: value of target column which is affected by the recordset count of source column
|
||||
* **join**: combine rows from two or more tables, based on a related column between them
|
||||
* simpleOutput: whether output relation simply, required false, default value is false
|
||||
* ignoreRecordSet: whether ignore the record sets, required false, default value is false
|
||||
* showLinkOnly: whether show relation linked columns only, required false, default value is true
|
||||
* hideColumn: whether hide the column ui, required false, default value is false
|
||||
* ignoreFunction: whether ignore the function relations, required false, default value is false
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql file: D:\sql.txt
|
||||
* job id: bb996c1ee5b741c5b4ff6c2c66c371dd
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/job/displayUserJobGraph?showRelationType=fdd&showRelationType=" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "jobId=bb996c1ee5b741c5b4ff6c2c66c371dd" -F "ignoreFunction=true" -F "ignoreRecordSet=true" -F "isReturnModel=false" -F "jobId=bb996c1ee5b741c5b4ff6c2c66c371dd" -F "table=user"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"mode": "global",
|
||||
"summary": {
|
||||
...
|
||||
},
|
||||
"graph": {
|
||||
"elements": {
|
||||
"tables": [
|
||||
...
|
||||
],
|
||||
"edges": [
|
||||
...
|
||||
]
|
||||
},
|
||||
"tooltip": {},
|
||||
"relationIdMap": {
|
||||
...
|
||||
},
|
||||
"listIdMap": {
|
||||
...
|
||||
}
|
||||
}
|
||||
},
|
||||
"sessionId": "a9f751281f8ef6936c554432e169359190d392565208931f201523e08036109d_1599531372233"
|
||||
}
|
||||
```
|
||||
|
||||
* **/sqlflow/job/updateUserJobGraphCache**
|
||||
* Description: update the user job graph cache; the user can then call **/sqlflow/generation/sqlflow/selectedgraph** with the sessionId, whose value comes from the job detail.
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **jobId**: job id, the value is from user job detail, required **true**
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql file: D:\sql.txt
|
||||
* job id: bb996c1ee5b741c5b4ff6c2c66c371dd
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/job/updateUserJobGraphCache" -H "Request-Origion:SwaggerBootstrapUi" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "jobId=bb996c1ee5b741c5b4ff6c2c66c371dd"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"sessionId": "a9f751281f8ef6936c554432e169359190d392565208931f201523e08036109d_1599531372233"
|
||||
}
|
||||
}
|
||||
```
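As an illustrative Python sketch (not an official client; ids and credentials are placeholders), the cache refresh and the follow-up selected-graph call can be chained through the returned sessionId:

```python
import requests

base_url = 'http://127.0.0.1:8081/gspLive_backend'

auth = {'userId': (None, 'YOUR USER ID HERE'),
        'token': (None, 'YOUR TOKEN HERE')}

# Refresh the job graph cache and pick up the sessionId it returns
cache = requests.post(base_url + '/sqlflow/job/updateUserJobGraphCache',
                      files = {**auth, 'jobId': (None, 'bb996c1ee5b741c5b4ff6c2c66c371dd')}).json()
session_id = cache['data']['sessionId']

# Use that sessionId to request the graph for a selected table
graph = requests.post(base_url + '/sqlflow/generation/sqlflow/selectedgraph',
                      files = {**auth,
                               'dbvendor': (None, 'dbvoracle'),
                               'sessionId': (None, session_id),
                               'table': (None, 'user')})
print(graph.status_code)
```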
|
||||
## Swagger
|
||||
For more information, please check the test environment swagger document:
|
||||
|
||||
* http://111.229.12.71:8081/gspLive_backend/doc.html?lang=en
|
||||
* Token: `eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYwMzc1NjgwMCwiaWF0IjoxNTcyMjIwODAwfQ.EhlnJO7oqAHdr0_bunhtrN-TgaGbARKvTh2URTxu9iU`
|
||||
|
|
@ -0,0 +1,517 @@
|
|||
# SQLFlow WebAPI
|
||||
|
||||
## JWT WEB Authorization (Only for sqlflow web)
|
||||
* All of the restful requests are based on JWT authorization. Before accessing the sqlflow WebAPI, web user needs to obtain the corresponding JWT token for legal access.
|
||||
* How to use JWT Token for security authentication?
|
||||
* In the header of the HTTP request, please pass the parameters:
|
||||
```
|
||||
Key: Authorization
|
||||
Value: Token <token>
|
||||
```
|
||||
|
||||
## JWT Client API Authorization (for sqlflow client api call)
|
||||
* All of the restful requests are based on JWT authorization. Before accessing the sqlflow WebAPI, client user needs to obtain the corresponding JWT token for legal access.
|
||||
|
||||
* How to get JWT Token
|
||||
1. Log in on the sqlflow web
|
||||
2. Hover over the login user image and click the "generate token" menu item; you will get the user secret key and token, and the ttl of the token is 24 hours.
|
||||
3. Once you have the user secret key, you can call the **/gspLive_backend/user/generateToken** api to refresh the token; the ttl of the new token is 24 hours.
|
||||
4. **/gspLive_backend/user/generateToken**
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **secretKey**: the secret key of sqlflow user for webapi request, required **true**
|
||||
|
||||
* How to use JWT Token for security authentication?
|
||||
* Each webapi contains two parameters, named userId and token.
|
||||
|
||||
## WebAPI
|
||||
|
||||
### Sqlflow Generation Interface
|
||||
|
||||
* **/sqlflow/generation/sqlflow**
|
||||
* Description: generate sqlflow model
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* sqltext: sql text, required false
|
||||
* sqlfile: sql file, required false
|
||||
* **dbvendor**: database vendor, required **true**, available values:
|
||||
* dbvbigquery, dbvcouchbase,dbvdb2,dbvgreenplum,dbvhana,dbvhive,dbvimpala,dbvinformix,dbvmdx,dbvmysql,dbvnetezza,dbvopenedge,dbvoracle,dbvpostgresql,dbvredshift,dbvsnowflake,dbvmssql,dbvsybase,dbvteradata,dbvvertica
|
||||
* showRelationType: show relation type, required false, default value is **fdd**, multiple values separated by comma like fdd,frd,fdr. Available values:
|
||||
* **fdd**: value of target column from source column
|
||||
* **frd**: the recordset count of the target column which is affected by the value of the source column
|
||||
* **fdr**: value of target column which is affected by the recordset count of source column
|
||||
* **join**: combine rows from two or more tables, based on a related column between them
|
||||
* simpleOutput: whether simple output relation, required false, default value is false
|
||||
* ignoreRecordSet: whether ignore the record set, required false, default value is false
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql:
|
||||
```sql
|
||||
select name from user
|
||||
```
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/generation/sqlflow" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "dbvendor=dbvoracle" -F "ignoreRecordSet=true" -F "sqltext=select name from user"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"dbvendor": "dbvoracle",
|
||||
"dbobjs": [
|
||||
...
|
||||
],
|
||||
"relations": [
|
||||
...
|
||||
]
|
||||
},
|
||||
"sessionId": "6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501108040"
|
||||
}
|
||||
```
|
||||
|
||||
* **/sqlflow/generation/sqlflow/graph**
|
||||
* Description: generate sqlflow model and graph
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* sqltext: sql text, required false
|
||||
* sqlfile: sql file, required false
|
||||
* **dbvendor**: database vendor, required **true**, available values:
|
||||
* dbvbigquery, dbvcouchbase,dbvdb2,dbvgreenplum,dbvhana,dbvhive,dbvimpala,dbvinformix,dbvmdx,dbvmysql,dbvnetezza,dbvopenedge,dbvoracle,dbvpostgresql,dbvredshift,dbvsnowflake,dbvmssql,dbvsybase,dbvteradata,dbvvertica
|
||||
* showRelationType: show relation type, required false, default value is **fdd**, multiple values separated by comma like fdd,frd,fdr. Available values:
|
||||
* **fdd**: value of target column from source column
|
||||
* **frd**: the recordset count of the target column which is affected by the value of the source column
|
||||
* **fdr**: value of target column which is affected by the recordset count of source column
|
||||
* **join**: combine rows from two or more tables, based on a related column between them
|
||||
* simpleOutput: whether output relation simply, required false, default value is false
|
||||
* ignoreRecordSet: whether ignore the record sets, required false, default value is false
|
||||
* showLinkOnly: whether show relation linked columns only, required false, default value is true
|
||||
* hideColumn: whether hide the column ui, required false, default value is false
|
||||
* ignoreFunction: whether ignore the function relations, required false, default value is false
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql:
|
||||
```sql
|
||||
select name from user
|
||||
```
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/generation/sqlflow/graph" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "dbvendor=dbvoracle" -F "ignoreFunction=true" -F "ignoreRecordSet=true" -F "sqltext=select name from user"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"mode": "global",
|
||||
"summary": {
|
||||
...
|
||||
},
|
||||
"sqlflow": {
|
||||
"dbvendor": "dbvoracle",
|
||||
"dbobjs": [
|
||||
...
|
||||
]
|
||||
},
|
||||
"graph": {
|
||||
"elements": {
|
||||
"tables": [
|
||||
...
|
||||
],
|
||||
"edges": [
|
||||
...
|
||||
]
|
||||
},
|
||||
"tooltip": {},
|
||||
"relationIdMap": {
|
||||
...
|
||||
},
|
||||
"listIdMap": {
|
||||
...
|
||||
}
|
||||
}
|
||||
},
|
||||
"sessionId": "6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051"
|
||||
}
|
||||
```
|
||||
|
||||
* **/sqlflow/generation/sqlflow/selectedgraph**
|
||||
* Description: generate sqlflow model and selected dbobject graph
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **sessionId**: request sessionId, the value is from api **/sqlflow/generation/sqlflow/graph**, required **true**
|
||||
* database: selected database, required false
|
||||
* schema: selected schema, required false
|
||||
* table: selected table, required false
|
||||
* isReturnModel: whether return the sqlflow model, required false, default value is true
|
||||
* **dbvendor**: database vendor, required **true**, available values:
|
||||
* dbvbigquery, dbvcouchbase,dbvdb2,dbvgreenplum,dbvhana,dbvhive,dbvimpala,dbvinformix,dbvmdx,dbvmysql,dbvnetezza,dbvopenedge,dbvoracle,dbvpostgresql,dbvredshift,dbvsnowflake,dbvmssql,dbvsybase,dbvteradata,dbvvertica
|
||||
* showRelationType: show relation type, required false, default value is **fdd**, multiple values separated by comma like fdd,frd,fdr. Available values:
|
||||
* **fdd**: value of target column from source column
|
||||
* **frd**: the recordset count of the target column which is affected by the value of the source column
|
||||
* **fdr**: value of target column which is affected by the recordset count of source column
|
||||
* **join**: combine rows from two or more tables, based on a related column between them
|
||||
* simpleOutput: whether output relation simply, required false, default value is false
|
||||
* ignoreRecordSet: whether ignore the record sets, required false, default value is false
|
||||
* showLinkOnly: whether show relation linked columns only, required false, default value is true
|
||||
* hideColumn: whether hide the column ui, required false, default value is false
|
||||
* ignoreFunction: whether to ignore the function relations, required false, default value is false
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql:
|
||||
```sql
|
||||
select name from user
|
||||
```
|
||||
* session id: `6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051`
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/generation/sqlflow/selectedgraph" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "dbvendor=dbvoracle" -F "ignoreFunction=true" -F "ignoreRecordSet=true" -F "isReturnModel=false" -F "sessionId=6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051" -F "table=user"
|
||||
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"mode": "global",
|
||||
"summary": {
|
||||
...
|
||||
},
|
||||
"graph": {
|
||||
"elements": {
|
||||
"tables": [
|
||||
...
|
||||
],
|
||||
"edges": [
|
||||
...
|
||||
]
|
||||
},
|
||||
"tooltip": {},
|
||||
"relationIdMap": {
|
||||
...
|
||||
},
|
||||
"listIdMap": {
|
||||
...
|
||||
}
|
||||
}
|
||||
},
|
||||
"sessionId": "6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051"
|
||||
}
|
||||
```
|
||||
* **/sqlflow/generation/sqlflow/getSelectedDbObjectInfo**
|
||||
* Description: get the selected dbobject information, such as file information, sql index, dbobject positions, sql which contains selected dbobject.
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **sessionId**: request sessionId, the value is from api **/sqlflow/generation/sqlflow/graph**, required **true**
|
||||
* **coordinates**: the selected dbobject positions; it's a json array string, the value is from api **/sqlflow/generation/sqlflow/graph**, required **true**
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql:
|
||||
```sql
|
||||
select name from user
|
||||
```
|
||||
* session id: `6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051`
|
||||
* coordinates: `[{'x':1,'y':8,'hashCode':'3630d5472af5f149fe3fb2202c8a338d'},{'x':1,'y':12,'hashCode':'3630d5472af5f149fe3fb2202c8a338d'}]`
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/generation/sqlflow/getSelectedDbObjectInfo" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "coordinates=[{'x':1,'y':8,'hashCode':'3630d5472af5f149fe3fb2202c8a338d'},{'x':1,'y':12,'hashCode':'3630d5472af5f149fe3fb2202c8a338d'}]" -F "sessionId=6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": [
|
||||
{
|
||||
"index": 0,
|
||||
"positions": [
|
||||
{
|
||||
"x": 1,
|
||||
"y": 8
|
||||
},
|
||||
{
|
||||
"x": 1,
|
||||
"y": 12
|
||||
}
|
||||
],
|
||||
"sql": "select name from user"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### Sqlflow User Job Interface
|
||||
* **/sqlflow/job/submitUserJob**
|
||||
* Description: submit a user job for multiple sql files; zip files are supported.
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **jobName**: job name, required **true**
|
||||
* **dbvendor**: database vendor, required **true**, available values:
|
||||
* dbvbigquery, dbvcouchbase,dbvdb2,dbvgreenplum,dbvhana,dbvhive,dbvimpala,dbvinformix,dbvmdx,dbvmysql,dbvnetezza,dbvopenedge,dbvoracle,dbvpostgresql,dbvredshift,dbvsnowflake,dbvmssql,dbvsybase,dbvteradata,dbvvertica
|
||||
* **sqlfiles**: request sql files, please use **multiple parts** to submit the sql files, required **true**
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql file: D:\sql.txt
|
||||
* curl command: **Note: please add `@` before the sql file path**
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/job/submitUserJob" -H "accept:application/json;charset=utf-8" -H "Content-Type:multipart/form-data" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "sqlfiles=@D:/sql.txt" -F "dbvendor=dbvoracle" -F "jobName=job_test"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"jobId": "6218721f092540c5a771ca8f82986be7",
|
||||
"jobName": "job_test",
|
||||
"userId": "user_test",
|
||||
"dbVendor": "dbvoracle",
|
||||
"defaultDatabase": "",
|
||||
"defaultSchema": "",
|
||||
"fileNames": [
|
||||
"sql.txt"
|
||||
],
|
||||
"createTime": "2020-09-08 10:11:28",
|
||||
"status": "create"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
* **/sqlflow/job/displayUserJobsSummary**
|
||||
* Description: get the user jobs summary information.
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql file: D:\sql.txt
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/job/displayUserJobsSummary" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"total": 1,
|
||||
"success": 1,
|
||||
"partialSuccess": 0,
|
||||
"fail": 0,
|
||||
"jobIds": [
|
||||
"bb996c1ee5b741c5b4ff6c2c66c371dd"
|
||||
],
|
||||
"jobDetails": [
|
||||
{
|
||||
"jobId": "bb996c1ee5b741c5b4ff6c2c66c371dd",
|
||||
"jobName": "job_test",
|
||||
"userId": "user_test",
|
||||
"dbVendor": "dbvoracle",
|
||||
"fileNames": [
|
||||
"sql.txt"
|
||||
],
|
||||
"createTime": "2020-09-08 10:16:11",
|
||||
"status": "success",
|
||||
"sessionId": "a9f751281f8ef6936c554432e169359190d392565208931f201523e08036109d_1599531372233"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
* **/sqlflow/job/displayUserJobSummary**
|
||||
* Description: get the specified user job's information.
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **jobId**: job id, the value is from user jobs summary detail, required **true**
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql file: D:\sql.txt
|
||||
* job id: bb996c1ee5b741c5b4ff6c2c66c371dd
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/job/displayUserJobSummary" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "jobId=bb996c1ee5b741c5b4ff6c2c66c371dd"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"total": 1,
|
||||
"success": 1,
|
||||
"partialSuccess": 0,
|
||||
"fail": 0,
|
||||
"jobIds": [
|
||||
"bb996c1ee5b741c5b4ff6c2c66c371dd"
|
||||
],
|
||||
"jobDetails": [
|
||||
{
|
||||
"jobId": "bb996c1ee5b741c5b4ff6c2c66c371dd",
|
||||
"jobName": "job_test",
|
||||
"userId": "user_test",
|
||||
"dbVendor": "dbvoracle",
|
||||
"fileNames": [
|
||||
"sql.txt"
|
||||
],
|
||||
"createTime": "2020-09-08 10:16:11",
|
||||
"status": "success",
|
||||
"sessionId": "a9f751281f8ef6936c554432e169359190d392565208931f201523e08036109d_1599531372233"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
* **/sqlflow/job/deleteUserJob**
|
||||
* Description: delete the user job by job id.
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **jobId**: job id, the value is from user job detail, required **true**
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql file: D:\sql.txt
|
||||
* job id: bb996c1ee5b741c5b4ff6c2c66c371dd
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/job/deleteUserJob" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "jobId=bb996c1ee5b741c5b4ff6c2c66c371dd"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"jobId": "bb996c1ee5b741c5b4ff6c2c66c371dd",
|
||||
"jobName": "job_test",
|
||||
"userId": "user_test",
|
||||
"dbVendor": "dbvoracle",
|
||||
"fileNames": [
|
||||
"sql.txt"
|
||||
],
|
||||
"createTime": "2020-09-08 10:16:11",
|
||||
"status": "delete",
|
||||
"sessionId": "a9f751281f8ef6936c554432e169359190d392565208931f201523e08036109d_1599531372233"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
* **/sqlflow/job/displayUserJobGraph**
|
||||
* Description: get the sqlflow job's model and graph
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **jobId**: job id, the value is from user jobs summary detail, required **true**
|
||||
* database: selected database, required false
|
||||
* schema: selected schema, required false
|
||||
* table: selected table, required false
|
||||
* isReturnModel: whether return the sqlflow model, required false, default value is true
|
||||
* showRelationType: show relation type, required false, default value is **fdd**, multiple values separated by commas like fdd,frd,fdr. Available values:
|
||||
* **fdd**: value of target column from source column
|
||||
* **frd**: the recordset count of the target column, which is affected by the value of the source column
|
||||
* **fdr**: value of target column which is affected by the recordset count of source column
|
||||
* **join**: combine rows from two or more tables, based on a related column between them
|
||||
* simpleOutput: whether to output relations in a simplified form, required false, default value is false
|
||||
* ignoreRecordSet: whether to ignore the record sets, required false, default value is false
|
||||
* showLinkOnly: whether to show only the relation-linked columns, required false, default value is true
|
||||
* hideColumn: whether to hide the column ui, required false, default value is false
|
||||
* ignoreFunction: whether to ignore the function relations, required false, default value is false
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql file: D:\sql.txt
|
||||
* job id: bb996c1ee5b741c5b4ff6c2c66c371dd
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/job/displayUserJobGraph?showRelationType=fdd&showRelationType=" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "jobId=bb996c1ee5b741c5b4ff6c2c66c371dd" -F "ignoreFunction=true" -F "ignoreRecordSet=true" -F "isReturnModel=false" -F "jobId=bb996c1ee5b741c5b4ff6c2c66c371dd" -F "table=user"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"mode": "global",
|
||||
"summary": {
|
||||
...
|
||||
},
|
||||
"graph": {
|
||||
"elements": {
|
||||
"tables": [
|
||||
...
|
||||
],
|
||||
"edges": [
|
||||
...
|
||||
]
|
||||
},
|
||||
"tooltip": {},
|
||||
"relationIdMap": {
|
||||
...
|
||||
},
|
||||
"listIdMap": {
|
||||
...
|
||||
}
|
||||
}
|
||||
},
|
||||
"sessionId": "a9f751281f8ef6936c554432e169359190d392565208931f201523e08036109d_1599531372233"
|
||||
}
|
||||
```
|
||||
|
||||
* **/sqlflow/job/updateUserJobGraphCache**
|
||||
* Description: update the user job graph cache; the user can then call **/sqlflow/generation/sqlflow/selectedgraph** with the sessionId, whose value is taken from the job detail (a follow-up sketch appears after the sample response below).
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **jobId**: job id, the value is from user job detail, required **true**
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql file: D:\sql.txt
|
||||
* job id: bb996c1ee5b741c5b4ff6c2c66c371dd
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/job/updateUserJobGraphCache" -H "Request-Origion:SwaggerBootstrapUi" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "jobId=bb996c1ee5b741c5b4ff6c2c66c371dd"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": {
|
||||
"sessionId": "a9f751281f8ef6936c554432e169359190d392565208931f201523e08036109d_1599531372233"
|
||||
}
|
||||
}
|
||||
```
|
||||
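* follow-up call (sketch): once the cache has been updated, the returned sessionId can be passed to **/sqlflow/generation/sqlflow/selectedgraph**. This is a minimal sketch, not part of the original samples; the userId, token, and table name are placeholders, and the sessionId is the sample value returned above.

```bash
# Sketch only: fetch the graph of a selected table using the sessionId returned by /sqlflow/job/updateUserJobGraphCache.
# YOUR_USER_ID, YOUR_TOKEN and YOUR_TABLE are placeholders, not values from this document.
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/generation/sqlflow/selectedgraph" -H "accept:application/json;charset=utf-8" -F "userId=YOUR_USER_ID" -F "token=YOUR_TOKEN" -F "dbvendor=dbvoracle" -F "sessionId=a9f751281f8ef6936c554432e169359190d392565208931f201523e08036109d_1599531372233" -F "table=YOUR_TABLE" -F "isReturnModel=false"
```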
## Swagger
|
||||
For more information, please check the test environment swagger document:
|
||||
|
||||
* http://111.229.12.71:8081/gspLive_backend/doc.html?lang=en
|
||||
* Token: `eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYwMzc1NjgwMCwiaWF0IjoxNTcyMjIwODAwfQ.EhlnJO7oqAHdr0_bunhtrN-TgaGbARKvTh2URTxu9iU`
|
||||
|
|
@ -0,0 +1,51 @@
|
|||
## How to use Rest API of SQLFlow
|
||||
|
||||
This article describes how to use the Rest API provided by the SQLFlow to
|
||||
communicate with the SQLFlow server and get the generated metadata and data lineage.
|
||||
|
||||
In this article, we use `Curl` to demonstrate the usage of the Rest API,
|
||||
but you can use any programming language you prefer.
|
||||
|
||||
### Prerequisites
|
||||
To use the Rest API of the SQLFlow, you need to <a href="https://gudusoft.com">obtain a premium account</a>.
|
||||
After that, you will get the `userid` and `secret key`, which will be used in the API.
|
||||
|
||||
- User ID
|
||||
- Secret Key
|
||||
|
||||
### Call Rest API
|
||||
|
||||
#### 1. Generate a token
|
||||
|
||||
Once you have the `userid` and `secret key`, the first API you need to call is:
|
||||
|
||||
```
|
||||
/gspLive_backend/user/generateToken
|
||||
```
|
||||
|
||||
This API will return a temporary token that needs to be used in the API call thereafter.
|
||||
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/user/generateToken" -H "Request-Origion:testClientDemo" -H "accept:application/json;charset=utf-8" -H "Content-Type:application/x-www-form-urlencoded;charset=UTF-8" -d "secretKey=YOUR SECRET KEY" -d "userId=YOUR USER ID HERE"
|
||||
```
|
||||
|
||||
|
||||
#### 2. Generate the data lineage
|
||||
|
||||
Call this API by sending the SQL query to get a result that includes the data lineage.
|
||||
|
||||
```
|
||||
/gspLive_backend/sqlflow/generation/sqlflow
|
||||
```
|
||||
|
||||
Example in `Curl`
|
||||
```
|
||||
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/generation/sqlflow?showRelationType=fdd" -H "Request-Origion:testClientDemo" -H "accept:application/json;charset=utf-8" -H "Content-Type:multipart/form-data" -F "sqlfile=" -F "dbvendor=dbvoracle" -F "ignoreRecordSet=false" -F "simpleOutput=false" -F "sqltext=CREATE VIEW vsal as select * from emp" -F "userId=YOUR USER ID HERE" -F "token=YOUR TOKEN HERE"
|
||||
```
|
||||
|
||||
#### 3. Other features
|
||||
You can also use the Rest API to submit a zip file that includes many SQL files, or to generate a map of the columns in the join condition.
|
||||
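As a rough illustration of the zip-file workflow, the sketch below reuses the **/sqlflow/job/submitUserJob** endpoint from the API reference; the archive path, job name, user id, and token are placeholders, not values from this document.

```
# Sketch only: submit a zip archive containing multiple SQL files as a user job.
# /path/to/queries.zip, zip_job_demo, YOUR USER ID HERE and YOUR TOKEN HERE are placeholders.
curl -X POST "https://api.gudusoft.com/gspLive_backend/sqlflow/job/submitUserJob" -H "Request-Origion:testClientDemo" -H "accept:application/json;charset=utf-8" -H "Content-Type:multipart/form-data" -F "userId=YOUR USER ID HERE" -F "token=YOUR TOKEN HERE" -F "dbvendor=dbvoracle" -F "jobName=zip_job_demo" -F "sqlfiles=@/path/to/queries.zip"
```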
|
||||
### The full reference to the Rest APIs
|
||||
|
||||
[SQLFlow rest API reference](sqlflow_api.md)
|
||||
|
|
@ -0,0 +1,263 @@
|
|||
# SQLFlow Web UI Control
|
||||
|
||||

|
||||
|
||||
The SQLFlow Web UI has several choices to control the result:
|
||||
|
||||
1. hide all columns
|
||||
* only affects the UI; the table column UI height is 0.
|
||||
2. dataflow
|
||||
* show fdd relation.
|
||||
3. impact
|
||||
* show frd, fdr relations.
|
||||
4. show intermediate recordset
|
||||
* display or hide intermediate recordset
|
||||
5. show function
|
||||
* display or hide function
|
||||
|
||||
## Web API Call
|
||||
We use the restful api **/sqlflow/generation/sqlflow/graph** to get the sqlflow graph; it has several arguments:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* sqltext: sql text, required false
|
||||
* sqlfile: sql file, required false
|
||||
* **dbvendor**: database vendor, required **true**, available values:
|
||||
* dbvbigquery, dbvcouchbase,dbvdb2,dbvgreenplum,dbvhana,dbvhive,dbvimpala,dbvinformix,dbvmdx,dbvmysql,dbvnetezza,dbvopenedge,dbvoracle,dbvpostgresql,dbvredshift,dbvsnowflake,dbvmssql,dbvsybase,dbvteradata,dbvvertica
|
||||
* showRelationType: show relation type, required false, default value is **fdd**, multiple values separated by commas like fdd,frd,fdr. Available values:
|
||||
* **fdd**: value of target column from source column
|
||||
* **frd**: the recordset count of the target column, which is affected by the value of the source column
|
||||
* **fdr**: value of target column which is affected by the recordset count of source column
|
||||
* **join**: combine rows from two or more tables, based on a related column between them
|
||||
* simpleOutput: whether to output relations in a simplified form, required false, default value is false
|
||||
* ignoreRecordSet: whether to ignore the record sets, required false, default value is false
|
||||
* showLinkOnly: whether to show only the relation-linked columns, required false, default value is true
|
||||
* hideColumn: whether to hide the column ui, required false, default value is false
|
||||
* ignoreFunction: whether to ignore the function relations, required false, default value is false
|
||||
|
||||
## How to Control The Sqlflow Web UI
|
||||
1. hide all columns
|
||||
* it matches the `hideColumn` argument. If the argument is `true`, `hide all columns` will be checked.
|
||||
2. dataflow
|
||||
* it matches the `showRelationType` argument. If the argument contains `fdd`, `dataflow` will be checked.
|
||||
3. impact
|
||||
* it matches the `showRelationType` argument. If the argument contains `fdr,fdd`, `impact` will be checked.
|
||||
4. show intermediate recordset
|
||||
* it matches the `ignoreRecordSet` argument. If the argument is `false`, `show intermediate recordset` will be checked.
|
||||
5. show function
|
||||
* it matches the `ignoreFunction` argument. If the argument is `false`, `show function` will be checked.
|
||||
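Putting the mapping above together, here is a minimal request sketch (not part of the original samples): it asks for the dataflow (fdd) relations with all columns hidden and with intermediate recordsets and functions ignored. The server address, userId, and token are placeholders and must be replaced with your own values.

```bash
# Sketch only: a graph request matching the UI choices described above.
# SERVER, YOUR_USER_ID and YOUR_TOKEN are placeholders, not values from this document.
curl -X POST "http://SERVER:8081/gspLive_backend/sqlflow/generation/sqlflow/graph" -H "accept:application/json;charset=utf-8" -F "userId=YOUR_USER_ID" -F "token=YOUR_TOKEN" -F "dbvendor=dbvoracle" -F "sqltext=select name from user" -F "showRelationType=fdd" -F "hideColumn=true" -F "ignoreRecordSet=true" -F "ignoreFunction=true"
```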
|
||||

|
||||
|
||||
1. Visualize join
|
||||
* show join relations.
|
||||
* it matches the `showRelationType` argument. If the argument is `join`, `Visualize join` will be displayed.
|
||||
|
||||

|
||||
|
||||
If sqlflow has some errors, they will be shown in the sqlflow json.
|
||||
Sqlflow error messages have 4 types:
|
||||
* SYNTAX_ERROR
|
||||
* gsp returns some error messages while parsing the sql.
|
||||
* SYNTAX_HINT
|
||||
* gsp returns some hint messages while parsing the sql.
|
||||
* ANALYZE_ERROR
|
||||
* the dataflow analyzer ran into an error.
|
||||
* LINK_ORPHAN_COLUMN
|
||||
* the dataflow analyzer returns a hint about linking an orphan column.
|
||||
|
||||
## Get the Error Message Position
|
||||
Typically, if the dataflow returns error messages, the lineage xml will show:
|
||||
```xml
|
||||
<dlineage>
|
||||
...
|
||||
<error errorMessage="find orphan column(10500) near: quantity(4,22)" errorType="SyntaxHint" coordinate="[4,22,0],[4,30,0]" originCoordinate="[4,22],[4,30]"/>
|
||||
</dlineage>
|
||||
```
|
||||
Note the `coordinate="[4,22,0],[4,30,0]"` attribute; we can use it to get the error position. `[4,22,0]` is the start position and `[4,30,0]` is the end position; the trailing `0` is the index of the SQLInfo hashcode.
|
||||
|
||||
## How to Use WebAPI to Point the Position
|
||||
* **/sqlflow/generation/sqlflow/getSelectedDbObjectInfo**
|
||||
* Description: get the selected dbobject information, such as file information, sql index, dbobject positions, sql which contains selected dbobject.
|
||||
* HTTP Method: **POST**
|
||||
* Parameters:
|
||||
* **userId**: the user id of sqlflow web or client, required **true**
|
||||
* **token**: the token of sqlflow client request. sqlflow web, required false, sqlflow client, required true
|
||||
* **sessionId**: request sessionId, the value is from api **/sqlflow/generation/sqlflow/graph**, required **true**
|
||||
* **coordinates**: the selected dbobject positions; it's a json array string, the value is from api **/sqlflow/generation/sqlflow/graph**, required **true**
|
||||
* Return code:
|
||||
* 200: successful
|
||||
* other: failed, check the error field to get error message.
|
||||
* Sample:
|
||||
* test sql:
|
||||
```sql
|
||||
select name from user
|
||||
```
|
||||
* session id: `6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051`
|
||||
* coordinates: `[{'x':1,'y':8,'hashCode':'0'},{'x':1,'y':12,'hashCode':'0'}]`
|
||||
* curl command:
|
||||
```bash
|
||||
curl -X POST "http://127.0.0.1:8081/gspLive_backend/sqlflow/generation/sqlflow/getSelectedDbObjectInfo" -H "accept:application/json;charset=utf-8" -F "userId=google-oauth2|104002923119102769706" -F "token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiJndWR1c29mdCIsImV4cCI6MTYxMDEyMTYwMCwiaWF0IjoxNTc4NTg1NjAwfQ.9AAIkjZ3NF7Pns-hRjZQqRHprcsj1dPKHquo8zEp7jE" -F "coordinates=[{'x':1,'y':8,'hashCode':'3630d5472af5f149fe3fb2202c8a338d'},{'x':1,'y':12,'hashCode':'3630d5472af5f149fe3fb2202c8a338d'}]" -F "sessionId=6172a4095280ccce97e996242d8b4084f46e2c954455e71339aeffccad5f0d57_1599501562051"
|
||||
```
|
||||
* response:
|
||||
```json
|
||||
{
|
||||
"code": 200,
|
||||
"data": [
|
||||
{
|
||||
"index": 0,
|
||||
"positions": [
|
||||
{
|
||||
"x": 1,
|
||||
"y": 8
|
||||
},
|
||||
{
|
||||
"x": 1,
|
||||
"y": 12
|
||||
}
|
||||
],
|
||||
"sql": "select name from user"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Get SQL Information By SQLFLow Coordinate
|
||||
|
||||
### SQLInfo
|
||||
When sqlflow has finished analyzing the sql, it has recorded some sql information that we can use to locate database object positions.
|
||||
|
||||
```java
|
||||
public class SqlInfo {
|
||||
private String fileName;
|
||||
private String sql;
|
||||
private int originIndex;
|
||||
private int index;
|
||||
private String group;
|
||||
private int originLineStart;
|
||||
private int originLineEnd;
|
||||
private int lineStart;
|
||||
private int lineEnd;
|
||||
private String hash;
|
||||
}
|
||||
```
|
||||
|
||||
Each sql file matches a SqlInfo object, and the map key is the "hash" property.
|
||||
|
||||
Sqlflow provides a tool class **gudusoft.gsqlparser.dlineage.util.SqlInfoHelper**, which can transform dataflow coordinate to `DbObjectPosition`.
|
||||
|
||||
### SqlInfoHelper
|
||||
|
||||
1. First step, call api `SqlInfoHelper.getSqlInfoJson` to fetch the sqlinfo map from the DataFlowAnalyzer object, and persist it.
|
||||
```java
|
||||
public static String getSqlInfoJson(DataFlowAnalyzer analyzer);
|
||||
```
|
||||
|
||||
2. Second step, initialize the SqlInfoHelper with the sqlinfo json string.
|
||||
```java
|
||||
//Constructor
|
||||
public SqlInfoHelper(String sqlInfoJson);
|
||||
```
|
||||
|
||||
3. Third step, transform sqlflow position string to `dataflow.model.json.Coordinate` array.
|
||||
* If you use the `dataflow.model.json.DataFlow` model, you can get the Coordinate object directly; no transformation is needed.
|
||||
* If you use the `dataflow.model.xml.dataflow` model, you can call api `SqlInfoHelper.parseCoordinateString`
|
||||
```java
|
||||
public static Coordinate[][] parseCoordinateString(String coordinate);
|
||||
```
|
||||
* The parseCoordinateString method supports both the xml output coordinate string and the json output coordinate string, like these:
|
||||
```
|
||||
//xml output coordinate string
|
||||
[56,36,0],[56,62,0]
|
||||
|
||||
//json output coordinate string
|
||||
[{"x":31,"y":36,"hashCode":"0"},{"x":31,"y":38,"hashCode":"0"}]
|
||||
```
|
||||
|
||||
4. Fourth step, get the DbObjectPosition by api `getSelectedDbObjectInfo`
|
||||
```java
|
||||
public DbObjectPosition getSelectedDbObjectInfo(Coordinate start, Coordinate end);
|
||||
```
|
||||
* Each position has two coordinates, start coordinate and end coordinate. If the result of DBObject.getCoordinates() has 10 items, it matches 5 positions.
|
||||
* The position is based on the entire file, not a single statement.
|
||||
* The sql field of DbObjectPosition returns all sqls of the file.
|
||||
|
||||
5. If you just want to get the specific statement information, please call the api `getSelectedDbObjectStatementInfo`
|
||||
```java
|
||||
public DbObjectPosition getSelectedDbObjectStatementInfo(EDbVendor vendor, Coordinate start, Coordinate end);
|
||||
```
|
||||
* The position is based on the statement.
|
||||
* Returns the statement index within the sqls; the index is **0-based**.
|
||||
* Returns a single statement, not all sqls of the file.
|
||||
|
||||
### How to Use DbObjectPosition
|
||||
```java
|
||||
public class DbObjectPosition {
|
||||
private String file;
|
||||
private String sql;
|
||||
private int index;
|
||||
private List<Pair<Integer, Integer>> positions = new ArrayList<Pair<Integer, Integer>>();
|
||||
}
|
||||
```
|
||||
* file field matches the sql file name.
|
||||
* sql field matches the sql content.
|
||||
* index:
|
||||
* If the sql file is from `grabit`, it is a json file with a json array named "query"; the value of the index field is the query item index.
|
||||
* Otherwise, the value of the index field is 0.
|
||||
* positions: the locations of the database object; they match the sql field. Position x and y are **1-based**, not 0-based.
|
||||
|
||||
### Example 1 (getSelectedDbObjectInfo)
|
||||
```java
|
||||
String sql = "Select\n a\nfrom\n b;";
|
||||
DataFlowAnalyzer dataflow = new DataFlowAnalyzer(sql, EDbVendor.dbvmssql, false);
|
||||
dataflow.generateDataFlow(new StringBuffer());
|
||||
gudusoft.gsqlparser.dlineage.dataflow.model.xml.dataflow flow = dataflow.getDataFlow();
|
||||
String coordinate = flow.getTables().get(0).getCoordinate();
|
||||
Coordinate[][] coordinates = SqlInfoHelper.parseCoordinateString(coordinate);
|
||||
SqlInfoHelper helper = new SqlInfoHelper(SqlInfoHelper.getSqlInfoJson(dataflow));
|
||||
DbObjectPosition position = helper.getSelectedDbObjectInfo(coordinates[0][0], coordinates[0][1]);
|
||||
System.out.println(position.getSql());
|
||||
System.out.println("table " + flow.getTables().get(0).getName() + " position is " + Arrays.toString(position.getPositions().toArray()));
|
||||
```
|
||||
|
||||
Return:
|
||||
```java
|
||||
Select
|
||||
a
|
||||
from
|
||||
b;
|
||||
|
||||
table b position is [[4,2], [4,3]]
|
||||
```
|
||||
|
||||
### Example 2 (getSelectedDbObjectStatementInfo)
|
||||
```java
|
||||
String sql = "Select\n a\nfrom\n b;\n Select c from d;";
|
||||
DataFlowAnalyzer dataflow = new DataFlowAnalyzer(sql, EDbVendor.dbvmssql, false);
|
||||
dataflow.generateDataFlow(new StringBuffer());
|
||||
gudusoft.gsqlparser.dlineage.dataflow.model.xml.dataflow flow = dataflow.getDataFlow();
|
||||
String coordinate = flow.getTables().get(1).getCoordinate();
|
||||
Coordinate[][] coordinates = SqlInfoHelper.parseCoordinateString(coordinate);
|
||||
SqlInfoHelper helper = new SqlInfoHelper(SqlInfoHelper.getSqlInfoJson(dataflow));
|
||||
DbObjectPosition position = helper.getSelectedDbObjectStatementInfo(EDbVendor.dbvmssql, coordinates[0][0], coordinates[0][1]);
|
||||
System.out.println(position.getSql());
|
||||
System.out.println(
|
||||
"table " + flow.getTables().get(1).getName() + " position is " + Arrays.toString(position.getPositions().toArray()));
|
||||
System.out.println(
|
||||
"stmt index is " + position.getIndex());
|
||||
```
|
||||
|
||||
Return:
|
||||
```java
|
||||
Select c from d;
|
||||
table d position is [[1,20], [1,21]]
|
||||
stmt index is 1
|
||||
```
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
After Width: | Height: | Size: 10 KiB |
|
After Width: | Height: | Size: 19 KiB |
|
After Width: | Height: | Size: 9.4 KiB |
|
After Width: | Height: | Size: 19 KiB |
|
After Width: | Height: | Size: 2.7 KiB |
|
After Width: | Height: | Size: 9.0 KiB |
|
|
@ -0,0 +1,58 @@
|
|||
## Element in the data lineage xml output generated by the SQLFlow
|
||||
|
||||
### Table
|
||||
`Table` is one of the major elements in the output of the data lineage.
|
||||
|
||||
The `type` of a `table` element can be either `table` or `pseudoTable`.
|
||||
|
||||
#### 1. type = "table"
|
||||
This means a base table found in the SQL query.
|
||||
|
||||
```sql
|
||||
create view v123 as select a,b from employee a, name b where employee.id = name.id
|
||||
```
|
||||
|
||||
```xml
|
||||
<table id="2" name="employee" alias="a" type="table">
|
||||
```
|
||||
|
||||
#### 2. type = "pseudoTable"
|
||||
Due to the lack of metadata information, some columns can't be linked to a table correctly.
|
||||
Those columns will be assigned to a pseudo table with name: `pseudo_table_include_orphan_column`.
|
||||
The type of this table is `pseudoTable`.
|
||||
|
||||
In the following sample sql, columns `a` and `b` can't be linked to a specific table without enough information,
|
||||
so a pseudo table with name `pseudo_table_include_orphan_column` is created to contain those orphan columns.
|
||||
|
||||
```sql
|
||||
create view v123 as select a,b from employee a, name b where employee.id = name.id
|
||||
```
|
||||
|
||||
```xml
|
||||
<table id="11" name="pseudo_table_include_orphan_column" type="pseudoTable" coordinate="[1,1,f904f8312239df09d5e008bb9d69b466],[1,35,f904f8312239df09d5e008bb9d69b466]">
|
||||
<column id="12" name="a" coordinate="[1,28,f904f8312239df09d5e008bb9d69b466],[1,29,f904f8312239df09d5e008bb9d69b466]"/>
|
||||
<column id="14" name="b" coordinate="[1,30,f904f8312239df09d5e008bb9d69b466],[1,31,f904f8312239df09d5e008bb9d69b466]"/>
|
||||
</table>
|
||||
```
|
||||
|
||||
#### tableType
|
||||
In most SQL queries, the table used is a base table.
|
||||
However, derived tables are also used in the from clause or other places.
|
||||
|
||||
The `tableType` property in the `table` element tells you what kind of derived table this table is.
|
||||
|
||||
Take the following sql for example, `WarehouseReporting.dbo.fnListToTable` is a function that
|
||||
is used as a derived table. So, the value of `tableType` is `function`.
|
||||
|
||||
Currently (GSP 2.2.0.6), `function` is the only value of `tableType`. More values of `tableType` will be added in later versions,
|
||||
such as `JSON_TABLE` for JSON_TABLE.
|
||||
|
||||
```sql
|
||||
select entry as Account FROM WarehouseReporting.dbo.fnListToTable(@AccountList)
|
||||
```
|
||||
|
||||
```xml
|
||||
<table id="2" database="WarehouseReporting" schema="dbo" name="WarehouseReporting.dbo.fnListToTable" type="table" tableType="function" coordinate="[1,30,15c3ec5e6df0919bb570c4d8cdd66651],[1,87,15c3ec5e6df0919bb570c4d8cdd66651]">
|
||||
<column id="3" name="entry" coordinate="[1,8,15c3ec5e6df0919bb570c4d8cdd66651],[1,13,15c3ec5e6df0919bb570c4d8cdd66651]"/>
|
||||
</table>
|
||||
```
|
||||
|
After Width: | Height: | Size: 17 KiB |
|
After Width: | Height: | Size: 11 KiB |
|
After Width: | Height: | Size: 33 KiB |
|
After Width: | Height: | Size: 21 KiB |
|
After Width: | Height: | Size: 33 KiB |
|
After Width: | Height: | Size: 13 KiB |
|
|
@ -0,0 +1,70 @@
|
|||
## Automated data lineage from Azure (Command Line Mode)
|
||||
This article introduces how to discover the data lineage from azure scripts or the azure database and automatically update it.
|
||||
So the business users and developers can see the azure data lineage graph instantly.
|
||||
|
||||
### Software used in this solution
|
||||
- [SQLFlow Cloud](https://sqlflow.gudusoft.com) Or [SQLFlow on-premise version](https://www.gudusoft.com/sqlflow-on-premise-version/)
|
||||
- [Grabit tool](https://www.gudusoft.com/grabit/) for SQLFlow. It's free.
|
||||
|
||||
|
||||
### Install grabit tool
|
||||
After [downloading the grabit tool](https://www.gudusoft.com/grabit/), please [check this article](https://github.com/sqlparser/sqlflow_public/tree/master/grabit)
|
||||
to see how to set up the grabit tool.
|
||||
|
||||
### Discover data lineage in an Azure database
|
||||
- Modify the `conf-template\azure-config-template` to meet your environment.
|
||||
|
||||
Here is a sample config file: `azure-config` that grabs metadata from the remote azure database
|
||||
and sends the metadata to the SQLFlow Cloud to discover the data lineage.
|
||||
|
||||
You need [a premium account](https://github.com/sqlparser/sqlflow_public/blob/master/sqlflow-userid-secret.md) to access the SQLFlow Cloud.
|
||||
|
||||
|
||||
```json
|
||||
{
|
||||
"databaseType":"azure",
|
||||
"optionType":1,
|
||||
"resultType":1,
|
||||
"databaseServer":{
|
||||
"hostname":"azure ip address",
|
||||
"port":"1433",
|
||||
"username":"azure user name",
|
||||
"password":"your password here",
|
||||
"database":"",
|
||||
"extractedDbsSchemas":"",
|
||||
"excludedDbsSchemas":"",
|
||||
"extractedStoredProcedures":"",
|
||||
"extractedViews":"",
|
||||
"enableQueryHistory":false,
|
||||
"queryHistoryBlockOfTimeInMinutes":30
|
||||
},
|
||||
"SQLFlowServer":{
|
||||
"server":"https://api.gudusoft.com",
|
||||
"serverPort":"",
|
||||
"userId":"your sqlflow premium account id",
|
||||
"userSecret":"your sqlflow premium account secret code"
|
||||
},
|
||||
"neo4jConnection":{
|
||||
"url":"",
|
||||
"username":"",
|
||||
"password":""
|
||||
},
|
||||
"isUploadNeo4j":0
|
||||
}
|
||||
```
|
||||
|
||||
- Run the grabit command-line tool; you can find grabit.log under the logs directory.
|
||||
```
|
||||
./start.sh /f azure-config
|
||||
```
|
||||
|
||||
- Check out the diagram via this url: [https://sqlflow.gudusoft.com/#/job/latest](https://sqlflow.gudusoft.com/#/job/latest)
|
||||
|
||||
- You may save the data lineage in JSON/CSV/GRAPHML format.
|
||||
|
||||
The file will be saved under `data\datalineage` directory.
|
||||
|
||||
- Run the grabit at a scheduled time
|
||||
|
||||
[Please check the instructions here](https://github.com/sqlparser/sqlflow_public/tree/master/grabit#run-the-grabit-at-a-scheduled-time)
|
||||
|
||||
|
|
@ -0,0 +1,68 @@
|
|||
## Automated data lineage from Azure (GUI Mode)
|
||||
This article introduces how to discover the data lineage from azure scripts or the azure database and automatically update it.
|
||||
So the business users and developers can see the Azure data lineage graph instantly.
|
||||
|
||||
### Software used in this solution
|
||||
- [SQLFlow Cloud](https://sqlflow.gudusoft.com) Or [SQLFlow on-premise version](https://www.gudusoft.com/sqlflow-on-premise-version/)
|
||||
- [Grabit tool](https://www.gudusoft.com/grabit/) for SQLFlow. It's free.
|
||||
|
||||
|
||||
### Install grabit tool
|
||||
After [downloading the grabit tool](https://www.gudusoft.com/grabit/), please [check this article](https://github.com/sqlparser/sqlflow_public/tree/master/grabit)
|
||||
to see how to set up the grabit tool.
|
||||
|
||||
### Discover data lineage in an Azure database
|
||||
- After [starting up the grabit tool](https://github.com/sqlparser/sqlflow_public/tree/master/grabit#running-the-grabit-tool), this is the first UI.
|
||||
Click the `database` button.
|
||||
|
||||

|
||||
|
||||
- Select `azure` in the list
|
||||
|
||||

|
||||
|
||||
- Set the database parameters. In this example, we only discover the data lineage in DEMO_DB/PUBLIC schema.
|
||||
|
||||

|
||||
|
||||
- note
|
||||
|
||||
1. The `Database` parameter must be specified.
|
||||
|
||||
2. When the `ExtractedDBSSchemas` and `ExcludedDBSSchemas` parameters are null, all data for the currently connected database is retrieved by default.
|
||||
|
||||
3. If you just want to get all the data in the specified database, you can use the following configuration to achieve this: `ExtractedDBSSchemas: db/*`.
|
||||
|
||||
|
||||
- After grabbing the metadata from the azure database, connect to the SQLFlow server.
|
||||
You need [a premium account](https://github.com/sqlparser/sqlflow_public/blob/master/sqlflow-userid-secret.md) to access the SQLFlow Cloud.
|
||||
|
||||

|
||||
|
||||
- Submit the database metadata to the SQLFlow server and get the data lineage
|
||||

|
||||
|
||||
- Check out the diagram via this url: [https://sqlflow.gudusoft.com/#/job/latest](https://sqlflow.gudusoft.com/#/job/latest)
|
||||
|
||||

|
||||
|
||||
- You may save the data lineage in JSON/CSV/GRAPHML format
|
||||
|
||||
The file will be saved under `data\datalineage` directory.
|
||||
|
||||
### Further information
|
||||
This tutorial illustrates how to discover the data lineage of an Azure database in the grabit UI mode.
|
||||
If you would like to automate the data lineage discovery, you may use the grabit command-line mode.
|
||||
|
||||
- [Discover azure data lineage in command line mode](grabit-azure-command-line.md)
|
||||
|
||||
|
||||
This tutorial illustrates how to discover the data lineage of an Azure database by submitting the database
|
||||
metadata to the SQLFlow Cloud version. You may set up the [SQLFlow on-premise version](https://www.gudusoft.com/sqlflow-on-premise-version/)
|
||||
on your server to secure your information.
|
||||
|
||||
For more options of the grabit tool, please check this page.
|
||||
- [Grabit tool readme](https://github.com/sqlparser/sqlflow_public/tree/master/grabit)
|
||||
|
||||
The complete guide of the SQLFlow UI
|
||||
- [How to use SQLFlow](https://github.com/sqlparser/sqlflow_public/blob/master/sqlflow_guide.md)
|
||||
|
|
@ -0,0 +1,244 @@
|
|||
## Grabit Database Connection Information Document
|
||||
|
||||
Specify a database instance that grabit will connect to fetch the metadata that helps SQLFlow make a more precise analysis and get a more accurate result of data lineage.
|
||||
|
||||
#### Database Connection Information UI
|
||||

|
||||
|
||||
#### Parameter Specification Of Connection Information
|
||||
|
||||
#### hostname
|
||||
|
||||
The IP address of the database server that grabit connects to.
|
||||
|
||||
#### port
|
||||
|
||||
The port number of the database server that grabit connects to.
|
||||
|
||||
#### username
|
||||
|
||||
The database user used to login to the database.
|
||||
|
||||
#### password
|
||||
|
||||
The password of the database user.
|
||||
|
||||
note: passwords can be encrypted using the [Encrypted password](#Encrypted password) tool; using encrypted passwords is more secure.
|
||||
|
||||
#### privateKeyFile
|
||||
|
||||
Use a private key to connect. Only `snowflake` is supported.
|
||||
|
||||
#### privateKeyFilePwd
|
||||
|
||||
The password of the private key. Only `snowflake` is supported.
|
||||
|
||||
#### database
|
||||
|
||||
The name of the database instance to which it is connected.
|
||||
|
||||
For azure, greenplum, netezza, oracle, postgresql, redshift, and teradata databases, it represents the database name and is required. For other databases, it is optional.
|
||||
|
||||
`
|
||||
note:
|
||||
If this parameter is specified and the database to which it is connected is Azure, Greenplum, PostgreSQL, or Redshift, then only metadata under that database is extracted.
|
||||
`
|
||||
|
||||
#### extractedDbsSchemas
|
||||
|
||||
List of databases and schemas to extract, separated by
|
||||
commas, which are to be provided in the format database/schema;
|
||||
Or blank to extract all databases.
|
||||
`database1/schema1,database2/schema2,database3` or `database1.schema1,database2.schema2,database3`
|
||||
When parameter `database` is filled in, this parameter is considered a schema.
|
||||
Wildcard characters such as `database1/*`, `*/schema`, `*/*` are supported.
|
||||
|
||||
When the connected database is `Oracle` or `Teradata`, this parameter sets the schemas, for example:
|
||||
|
||||
````json
|
||||
extractedDbsSchemas: "HR,SH"
|
||||
````
|
||||
|
||||
When the connected database is `Mysql`, `Sqlserver`, `Postgresql`, `Snowflake`, `Greenplum`, `Redshift`, `Netezza`, or `Azure`, this parameter is set as database/schema, for example:
|
||||
|
||||
````json
|
||||
extractedDbsSchemas: "MY/ADMIN"
|
||||
````
|
||||
|
||||
|
||||
#### excludedDbsSchemas
|
||||
|
||||
This parameter works on the result set filtered by `extractedDbsSchemas`.
|
||||
List of databases and schemas to exclude from extraction, separated by commas
|
||||
`database1/schema1,database2` or `database1.schema1,database2`
|
||||
When parameter `database` is filled in, this parameter is considered a schema.
|
||||
Wildcard characters such as `database1/*`, `*/schema`, `*/*` are supported.
|
||||
|
||||
When the connected database is `Oracle` or `Teradata`, this parameter sets the schemas, for example:
|
||||
|
||||
````json
|
||||
excludedDbsSchemas: "HR"
|
||||
````
|
||||
|
||||
When the connected database is `Mysql`, `Sqlserver`, `Postgresql`, `Snowflake`, `Greenplum`, `Redshift`, `Netezza`, or `Azure`, this parameter is set as database/schema, for example:
|
||||
|
||||
````json
|
||||
excludedDbsSchemas: "MY/*"
|
||||
````
|
||||
|
||||
#### extractedStoredProcedures
|
||||
|
||||
A list of stored procedures under the specified database and schema to extract, separated by
|
||||
commas, which are to be provided in the format database.schema.procedureName or schema.procedureName;
|
||||
Or leave blank to extract all stored procedures; expressions are supported.
|
||||
`database1.schema1.procedureName1,database2.schema2.procedureName2,database3.schema3,database4` or `database1/schema1/procedureName1,database2/schema2`
|
||||
|
||||
for example:
|
||||
|
||||
````json
|
||||
extractedStoredProcedures: "database.scott.vEmp*"
|
||||
````
|
||||
|
||||
or
|
||||
|
||||
````json
|
||||
extractedStoredProcedures: "database.scott"
|
||||
````
|
||||
|
||||
#### extractedViews
|
||||
|
||||
A list of views under the specified database and schema to extract, separated by
|
||||
commas, which are to be provided in the format database.schema.viewName or schema.viewName.
|
||||
Or leave blank to extract all views; expressions are supported.
|
||||
`database1.schema1.viewName1,database2.schema2.viewName2,database3.schema3,database4` or `database1/schema1/viewName1,database2/schema2`
|
||||
|
||||
for example:
|
||||
|
||||
````json
|
||||
extractedViews: "database.scott.vEmp*"
|
||||
````
|
||||
|
||||
or
|
||||
|
||||
````json
|
||||
extractedViews: "database.scott"
|
||||
````
|
||||
|
||||
#### enableQueryHistory
|
||||
|
||||
Fetch SQL queries from the query history if set to `true`; the default is `false`.
|
||||
|
||||
#### queryHistoryBlockOfTimeInMinutes
|
||||
|
||||
When `enableQueryHistory` is `true`, the block of time over which SQL queries are extracted from the query history; the default is `30` minutes.
|
||||
|
||||
#### queryHistorySqlType
|
||||
|
||||
When `enableQueryHistory` is `true`, the DML types of SQL to extract from the query history.
|
||||
When empty, all types are extracted, and when multiple types are specified, a comma separates them, such as `SELECT,UPDATE,MERGE`.
|
||||
Currently only the snowflake database supports this parameter; supported types are **SHOW, SELECT, INSERT, UPDATE, DELETE, MERGE, CREATE TABLE, CREATE VIEW, CREATE PROCEDURE, CREATE FUNCTION**.
|
||||
|
||||
for example:
|
||||
|
||||
````json
|
||||
queryHistorySqlType: "SELECT,DELETE"
|
||||
````
|
||||
|
||||
#### snowflakeDefaultRole
|
||||
|
||||
This value represents the role of the snowflake database.
|
||||
|
||||
````
|
||||
note: You must define a role that has access to the SNOWFLAKE database, and assign WAREHOUSE permission to this role.
|
||||
````
|
||||
|
||||
Assign permissions to a role, for example:
|
||||
|
||||
````sql
|
||||
-- create role
|
||||
use role accountadmin;
|
||||
grant imported privileges on database snowflake to role sysadmin;
|
||||
grant imported privileges on database snowflake to role customrole1;
|
||||
use role customrole1;
|
||||
select * from snowflake.account_usage.databases;
|
||||
|
||||
-- To do this, grant the WAREHOUSE permission to the role
|
||||
select current_warehouse()
|
||||
use role sysadmin
|
||||
GRANT ALL PRIVILEGES ON WAREHOUSE %current_warehouse% TO ROLE customrole1;
|
||||
````
|
||||
|
||||
#### metaStore
|
||||
|
||||
If the current data source is a `Hive` or `Spark` data store, this parameter can be set to `hive` or `sparksql`. By default, this parameter is left blank.
|
||||
|
||||
|
||||
|
||||
Sample configuration of a SQL Server database:
|
||||
```json
|
||||
"hostname":"127.0.0.1",
|
||||
"port":"1433",
|
||||
"username":"sa",
|
||||
"password":"PASSWORD",
|
||||
"database":"",
|
||||
"extractedDbsSchemas":"AdventureWorksDW2019/dbo",
|
||||
"excludedDbsSchemas":"",
|
||||
"extractedStoredProcedures":"AdventureWorksDW2019.dbo.f_qry*",
|
||||
"extractedViews":"",
|
||||
"enableQueryHistory":false,
|
||||
"queryHistoryBlockOfTimeInMinutes":30,
|
||||
"snowflakeDefaultRole":"",
|
||||
"queryHistorySqlType":"",
|
||||
"metaStore":"hive"
|
||||
```
|
||||
|
||||
#### sqlsourceTableName
|
||||
|
||||
table name: **query_table**
|
||||
|
||||
| query_name | query_source |
|
||||
| ---------- | ----------------------------------- |
|
||||
| query1 | create view v1 as select f1 from t1 |
|
||||
| query2 | create view v2 as select f2 from t2 |
|
||||
| query3 | create view v3 as select f3 from t3 |
|
||||
|
||||
Suppose you save SQL queries in a specific table, one SQL query per row.
|
||||
|
||||
Let's say `query_table.query_source` stores the source code of the query.
|
||||
We can use this query to fetch all SQL queries in this table:
|
||||
|
||||
```sql
|
||||
select query_name as queryName, query_source as querySource from query_table
|
||||
```
|
||||
|
||||
By setting the values of `sqlsourceTableName`, `sqlsourceColumnQuerySource`, and `sqlsourceColumnQueryName`,
|
||||
grabit can fetch all SQL queries in this table and send them to SQLFlow to analyze the lineage.
|
||||
|
||||
In this example,
|
||||
```
|
||||
"sqlsourceTableName":"query_table"
|
||||
"sqlsourceColumnQuerySource":"query_source"
|
||||
"sqlsourceColumnQueryName":"query_name"
|
||||
```
|
||||
|
||||
Please leave `sqlsourceTableName` empty if you don't fetch SQL queries from a specific table.
|
||||
|
||||
#### sqlsourceColumnQuerySource
|
||||
In the above sample:
|
||||
```
|
||||
"sqlsourceColumnQuerySource":"query_source"
|
||||
```
|
||||
|
||||
#### sqlsourceColumnQueryName
|
||||
```
|
||||
"sqlsourceColumnQueryName":"query_name"
|
||||
```
|
||||
This parameter is optional; you don't need to specify a query name column if it doesn't exist in the table.
|
||||
|
||||
- **fetch from query history**
|
||||
|
||||
Fetch SQL queries from the query history if set to `yes`; the default is `no`. This retrieves historical SQL executions from the database to which it is connected. You can specify the time window for the history; the default is 30 minutes.
|
||||
|
||||
`
|
||||
note: Currently only Snowflake and Sqlserver are supported
|
||||
`
|
||||
|
After Width: | Height: | Size: 227 KiB |
|
|
@ -0,0 +1 @@
|
|||
## Greenplum
|
||||
|
|
@ -0,0 +1,29 @@
|
|||
### Discover data lineage from Hive alter table set location
|
||||
|
||||
```sql
|
||||
ALTER TABLE a.b SET LOCATION 's3://xxx/xx/1/xxx/';
|
||||
```
|
||||
#### output lineage in diagram
|
||||

|
||||
|
||||
#### output lineage in xml
|
||||
```xml
|
||||
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
|
||||
<dlineage>
|
||||
<path id="8" name="'s3://xxx/xx/1/xxx/'" uri="'s3://xxx/xx/1/xxx/'" type="path" coordinate="[1,30,0],[1,50,0]">
|
||||
<column id="9" name="*" coordinate="[-1,-1,0],[-1,-1,0]"/>
|
||||
</path>
|
||||
<process id="6" name="Query Set Table Location-1" procedureName="batchQueries" queryHashId="05e88d7c9059de6a9fcbf0b185930152" type="sstaltertable" coordinate="[1,1,0],[1,51,0]"/>
|
||||
<table id="4" database="a" name="a.b" type="table" processIds="6" coordinate="[1,13,0],[1,16,0]">
|
||||
<column id="5" name="*" coordinate="[1,1,0],[1,2,0]"/>
|
||||
</table>
|
||||
<relationship id="1" type="fdd" processId="6" processType="sstaltertable">
|
||||
<target id="5" column="*" parent_id="4" parent_name="a.b" coordinate="[1,1,0],[1,2,0]"/>
|
||||
<source id="9" column="*" parent_id="8" parent_name="'s3://xxx/xx/1/xxx/'" coordinate="[-1,-1,0],[-1,-1,0]"/>
|
||||
</relationship>
|
||||
</dlineage>
|
||||
```
|
||||
|
||||
This data lineage in xml is generated by [Gudu SQLFlow Java tool](https://www.gudusoft.com/sqlflow-java-library-2/)
|
||||
|
||||
|
||||
|
After Width: | Height: | Size: 2.0 KiB |
|
|
@ -0,0 +1,62 @@
|
|||
## Hive data lineage examples
|
||||
|
||||
- [Alter table set location](alter_table_set_location.md)
|
||||
|
||||
|
||||
## connect to hive metastore
|
||||
|
||||
Use the grabit command line to connect to a MySQL database that saves the
|
||||
Hive metastore. Fetch the metadata from the Hive metastore and send
|
||||
to the SQLFlow to analyze the data lineage.
|
||||
|
||||
### config file
|
||||
```json
|
||||
{
|
||||
"databaseServer":{
|
||||
"hostname":"",
|
||||
"port":"3306",
|
||||
"username":"",
|
||||
"password":"",
|
||||
"database":"",
|
||||
"extractedDbsSchemas":"",
|
||||
"excludedDbsSchemas":"",
|
||||
"extractedStoredProcedures":"",
|
||||
"extractedViews":"",
|
||||
"metaStore":"hive"
|
||||
},
|
||||
"SQLFlowServer":{
|
||||
"server":"http://127.0.0.1",
|
||||
"serverPort":"8081",
|
||||
"userId":"gudu|0123456789",
|
||||
"userSecret":""
|
||||
},
|
||||
"SQLScriptSource":"database",
|
||||
"lineageReturnFormat":"json",
|
||||
"databaseType":"mysql"
|
||||
}
|
||||
```
|
||||
|
||||
Please make sure to set the `database` to the name of the MySQL database
|
||||
which stores the Hive metastore.
|
||||
|
||||
The IP below should be the machine where the SQLFlow on-premise version is installed.
|
||||
```
|
||||
"server":"http://127.0.0.1",
|
||||
```
|
||||
|
||||
|
||||
### command line syntax
|
||||
- **mac & linux**
|
||||
```shell script
|
||||
chmod +x start.sh
|
||||
|
||||
sh start.sh /f config.json
|
||||
```
|
||||
|
||||
- **windows**
|
||||
```bat
|
||||
start.bat /f config.json
|
||||
```
|
||||
|
||||
## download the latest version grabit tool
|
||||
https://www.gudusoft.com/grabit/
|
||||
|
After Width: | Height: | Size: 118 KiB |
|
After Width: | Height: | Size: 90 KiB |
|
After Width: | Height: | Size: 202 KiB |
|
After Width: | Height: | Size: 153 KiB |
|
After Width: | Height: | Size: 229 KiB |
|
After Width: | Height: | Size: 57 KiB |