sqlflow_public/data-lineage-xml-elements.md

59 lines
2.5 KiB
Markdown

## Element in the data lineage xml output generated by the SQLFlow
### Table
`Table` is one of the major elements in the output of the data lineage.
The `type` of a `table` element can be the value of `table`, `pseudoTable`
#### 1. type = "table"
This means a base table found in the SQL query.
```sql
create view v123 as select a,b from employee a, name b where employee.id = name.id
```
```xml
<table id="2" name="employee" alias="a" type="table">
```
#### 2. type = "pseudoTable"
Due to the lack of metadata information, some columns can't be linked to a table correctly.
Those columns will be assigned to a pseudo table with name: `pseudo_table_include_orphan_column`.
The type of this table is `pseudoTable`.
In the following sample sql, columm `a`, `b` can't be linked to a specific table without enough information,
so a pseudo table with name `pseudo_table_include_orphan_column` is created to contain those orphan columns.
```sql
create view v123 as select a,b from employee a, name b where employee.id = name.id
```
```xml
<table id="11" name="pseudo_table_include_orphan_column" type="pseudoTable" coordinate="[1,1,f904f8312239df09d5e008bb9d69b466],[1,35,f904f8312239df09d5e008bb9d69b466]">
<column id="12" name="a" coordinate="[1,28,f904f8312239df09d5e008bb9d69b466],[1,29,f904f8312239df09d5e008bb9d69b466]"/>
<column id="14" name="b" coordinate="[1,30,f904f8312239df09d5e008bb9d69b466],[1,31,f904f8312239df09d5e008bb9d69b466]"/>
</table>
```
#### tableType
In the most case of SQL query, the table used is a base table.
However, derived tables are also used in the from clause or other places.
The `tableType` property in the `table` element tells you what kind of the derived table this table is.
Take the following sql for example, `WarehouseReporting.dbo.fnListToTable` is a function that
used as a derived table. So, the value of `tableType` is `function`.
Currently(GSP 2.2.0.6), `function` is the only value of `tableType`. More value of `tableType` will be added in the later version
such as `JSON_TABLE` for JSON_TABLE.
```sql
select entry as Account FROM WarehouseReporting.dbo.fnListToTable(@AccountList)
```
```xml
<table id="2" database="WarehouseReporting" schema="dbo" name="WarehouseReporting.dbo.fnListToTable" type="table" tableType="function" coordinate="[1,30,15c3ec5e6df0919bb570c4d8cdd66651],[1,87,15c3ec5e6df0919bb570c4d8cdd66651]">
<column id="3" name="entry" coordinate="[1,8,15c3ec5e6df0919bb570c4d8cdd66651],[1,13,15c3ec5e6df0919bb570c4d8cdd66651]"/>
</table>
```