Creating a Data Model
Query and Join Data Across Sources
To unify different data sources, Data Virtuality Server provides an ANSI-compatible SQL dialect that is provided by every connector. If you are familiar with SQL, you should be able to get started quickly. There are a few particularities of the Data Virtuality Server which you can find in the collection of supported SQL functions of our documentation.
This course now refers to the demo data sources available in the free trial version.
On opening a data source, such as the MySQL data source, a click on “Tables / views” will show all objects that are available in the data source:
A double-click on one of the tables opens a new SQL Editor tab with an SQL statement to query that table:
The SQL Editor is the main tool for writing, executing and analysing SQL queries. It also gives quick access to a number of wizards, e.g. for parsing CSV/XML/JSON files or exporting a result set as a CSV file.
Click the () button and run the query on the MySQL data source. The result will be shown in the Data Virtuality Studio. As both data sources support the same SQL dialect, you can do your first cross-data source join:
-- Select the Order ID from PostgreSQL and aggregate additional metrics from MySQL SELECT CAST(h.salesorderid AS STRING) AS order_id, h.ModifiedDate AS modified_date, ROUND(SUM(TaxAmt), 0) AS taxes_total, SUM(OrderQty) AS items_ordered FROM "PostgreSQL.salesorderheader" h JOIN "MySQL.salesorderdetail" d ON d.SalesOrderId = h.SalesOrderId GROUP BY 1, 2 ORDER BY 1 DESC;;
The Data Virtuality Server will execute queries on both data sources and join them in memory. The result will be shown in the Data Virtuality Studio without being cached in any way.
The SQL editor has a few particularities to consider:
- SQL statements are separated by a double semicolon (
- Identifiers, such as view, column or data source names, are optionally enclosed in double-quotes (
- Strings are enclosed in single quotes (
'This is a string')
- There are two run buttons: will run the whole content of the SQL editor, will only run the statement currently marked with your cursor
Create a Virtual Schema and Virtual View
To save the join from the previous section as part of your data model, a new virtual schema needs to be created. This can be done by right-clicking on “Virtual schemas”:
Create a new schema and name it
Virtual schemas consist of virtual views.
To create a new view in that schema, run a
CREATE VIEW statement:
CREATE VIEW sales.sales_analytics AS SELECT CAST(h.salesorderid AS STRING) AS order_id, h.ModifiedDate AS modified_date, ROUND(SUM(TaxAmt), 0) AS taxes_total, SUM(OrderQty) AS items_ordered FROM "PostgreSQL.salesorderheader" h JOIN "MySQL.salesorderdetail" d ON d.SalesOrderId = h.SalesOrderId GROUP BY 1, 2;;
After creating the view, click on the refresh button () inside the virtual schema for the view to show up.
A view can be queried just like data source tables by double-clicking them as shown before. It is also possible to create views based on other views.
View definitions are saved without impacting the underlying data source. Data sources are only accessed when a view is queried by a user.
The View Builder can help to speed up development and can be started from the main toolbar ().
When a front-end reporting tool is connected to Data Virtuality Server, the virtual schemas and views are also displayed in the front-end tool like the data sources, as if they were objects in an RDBMS.