TIBCO Data Virtualization

Virtual Views

  • Virtual Data Table
  • Defined by SQL and TDV Metadata
  • Contains SQL SELECT and any ANSI-Standard SQL
  • Often federate data from multiple physical sources
  • Encapsulate business meaning
  • Hide complexity
  • Enable re-use

Views

  • Created graphically on Model and Grid tabs
  • Created Manually on SQL tab
    • Graphical tabs will be removed
    • Graphical tabs can be re-created in most cases
    • Graphical tabs are not requried
  • Move freely between graphical and manual modes

Custom Functions

  • User-defined SQL construct

  • Accept parameters, return a single scalar output

  • May be used in SELECT or WHERE clause

  • Example: SELECT UPPER(column_name) AS column_name_upper

  • TDV custom functions created by promoting SQL scripts or custom Java procedures

  • Encapsulate business functionality specific to your enterprice

  • Enable re-use across many projects and many data sources

  • Any script or procedure that outputs a single scalar can be promoted to a Custom Function

Layered Development

  • Enhance the development process but does not reduce performance at run time
  • Enables reuse of virtual resources within and across projects
  • Layers are flexible

Physical Layer

  • Abstract away the physical characteristics of the data on individual data sources
  • We generally build virtual views that are almost identical to the underlying physical data structures

Business Layer

  • Building coarse-grained data resources that have some important enterprise wide meaning and might be used by many different projects

Application Layer

  • Tailor the business layer resources to meet the needs of individual projects

Publishing Virtual Databases

  • Makes TDV resources accessible to authorized consumers

  • Accessible as a Database via JDBC, ODBC, ADO.NET, OData

  • Accessible as Web Services via REST, SOAP

  • Many resource types may be published

  • You can create as many Virtual Databases as you like

  • Catelogs and Schemas are optional, However, Catelogs are required for ODBC clients

  • Changes are automatically propagated to published resources

Publishing Web Services

  • we could publish procedures as Web Services

  • In most cases, it is not good practice to publish entire database tables as Web Services, because there might be hundreds of millions of rows, and web services over HTTP are generally not aimed at such use cases

  • Stored procedures however, can accept input parameters which can be used to filter results, therefore, a common practice is to wrap a view within a stored procedure using one or more input parameters that force the data consumer to filter data requests

  • Changes are automatically propagated to published resources

Relational Data Source Connections

  • Specify connection details

    • Host, port, database name, user ID, password
  • Introspect the data source and gather matadata

    • Catalogs, schemas, tables, columns, etc…
  • Accept and respond to queries

  • Handle insert, update, and delete transactions

  • Execute stored procedures

  • Expose database artifacts for use in virtual views

  • Understand capabilities of physical databases

  • A JDBC Driveer .jar is required

    • TDV is permitted to bundle some drivers
    • other drivers must be downloaded and installed
      • This is a one-time process
  • Copy the driver into the appropriate TDV driver

    • TDV restart is required

REST Clients

  • Uniform interface to web resources
    • Based on URIs
    • Manipulate textual represntations of web resources
    • Based on HTTP verbs (GET, POST, DELETE, UPDATE)
    • Human-readable and self-describing
  • Stateless
  • Contract-free

Automate Authentication Process

  • Soap UI
  • cURL
  • Chrome Advanced REST Client

SQL Script

  • ANSI-SQL-compliant Scripting language

  • Enables procedural logic, including

    • Conditional execution
    • Looping
    • Pipelining
    • Exception handling
    • Etc…
  • Procedural logic

  • Parameter-driven processing

  • Calls to system procedures and custom procedures

  • Exception handling and logging

  • Scripts can call other resources, including

    • Other scripts
    • Java procedures
    • Packaged queries
    • TDV procedure library

XSLT and Streaming Transformations

  • XSLT Transformations

    • Extensible Stylesheet Language Transformations
    • Industry-standard transformation language
    • In DV, most commonly used for flattening XML
  • Streaming Transformations

    • Similar to XSLT Editor
    • Useful for large data sets
  • Relational data structures provide a common format for data federation

  • Hierarchical XML structures must be flattened in order to participate in data federation projects

Procedure Joins

  • A special type of join between a View and a Stored Procedure

  • Stored Procedure is executed once per View row - for unique values only

  • Data from the View row provides input parameters to the Procedure

  • Results from all executions are aggregated and then used for the join

  • Augment View data

    • Complex calculations
    • Procedural logic
    • External data in non-standard format

Triggers

  • Execute based on

    • Time parameters
    • System events
    • User-defined events
    • JMS messages
  • Actions include

    • Send email
    • Execute a procedure
    • Gather statistics
    • Re-introspect
  • Notify adminstrators of important TDV conditions

  • Provide automated response to important TDV conditions

  • Automate common maintenance tasks

  • Enable developers to execute multiple asynchronous actions

TDV Rights

  • System-wide capabilities, including

    • Tools, such as Studio
    • Administration, such as config, status, and users
    • Access to resources, such as Views and Procedures
  • Rights may be assigned to

    • Groups
    • Users
  • During development

    • Prevent unauthorized users from accessing developer tools and resources
    • Enable developers to access needed resources
  • During ongoing operation

    • Enable System Administrators to limit access to operational controls

SOAP Data Sources

  • Enable TDV to introspect SOAP based web services
  • Responses can be transformed, federated and published
  • Access to data on internet or intranet
  • Commonly-used API for enterprise applications

Cache Index Management

  • Indexes can improve Cache read times for

    • Views and Procedures
    • Single-Table, Multi-Table
    • Full and inCremental Refreshes
  • Index Management helps

    • Reduce cache refresh time
    • reduce index create/update time
  • Manually-Indexes Cache

  • Single-Table Cache

  • Multi-Table Cache

Caching Multi Table

  • Materialized Views or Procedures

  • Wide range of relational database targets

  • Simple or highly abstract

  • Automatic refresh on configurable schedule

  • Cache data held:

    • In a relational database - contrasts with file cache
    • On multiple tables - contrasts with single-table cache
  • Useful when

    • Response time trumps latency
    • A physical data source has restricted access
    • Efficient use of indexes is desirable (*)

A specific benefit of multi table caching is that it provides physical separation of cache versions which means indexes on a cache are more efficient.

Single table caches use a cache key column to maintain logical separation of current and previous cache versions, data rows representing the previous expired version are not deleted until all transactions using the old version have completed, this means the old and new cache data may exist simultaneously in a cache for a period of time. TDV will use the appropriate cache key to ensure that correct data is always returned. However, database indexes built on cache columns may be inefficient because they will contain extraneous rows.

Multi-table cache avoids this issue and enables indexes to be as efficient as possible, but it takes more space.

Caching Single Table

  • Materialized Views or Procedures

  • Wide range of relational database targets

  • Simple or highly abstract

  • Automatic refresh on configurable schedule

  • Cache data held:

    • In a relational database - contrasts with file cache
    • On multiple tables - contrasts with single-table cache
  • Useful when

    • Response time trumps latency
    • A physical data source has restricted access

Caching Policies

All or nothing

  • Define cache refresh/expiration schedules for groups of resources

  • Maintain consistency across multiple cached resources

  • Enhance cache performance when dependencies are present

  • Not supported for incremental caching

  • Enhance data consistency across multiple caches

  • Ensure accuracy and efficiency with dependent caches

  • Cache policy can mix single-table and multi-table caching

Caching Incremental Pull Based

  • Initial - full load

  • Subsequent refreshes - changes only

  • useful when

    • Cached data set is large
    • full refresh is time-consuming

Caching Stored Procedures

  • Cache resutls for stored procedure ‘Variants’

    • Each variant cached upon first use
    • Number of cached variants is configurable
      • Default is 32, max is 99999999
      • when limit is reached, LRU variant is purged
    • All input parameters must be scalars
  • Performance improvements for

    • Long-running procedures, external web services
    • highly repetitive, high-concurrency procedures
    • Unreliable data source connections

CORS operations

  • Cross-Origin Resource Sharing
  • HTTP security standard for scripts and other operations
  • Applies to script operations like XMLHttpRequest
  • Servers may permit or deny CORS requests
  • Does not apply to simple request like <img> src
  • Requests(GET HEAD POST) execute in one step
  • other requests must be pre-flighted using OPTIONS
  • Customers serve HTML template pages from a web server in one domain and then populate these pages with REST resources from served from a TDV instance

Custom Datasource Adapters

  • New datasource adapters created by reconfiguring a template based on an existing adapter
  • No coding required
  • The template may be a product-specific datasource, or a generic JDBC datasource

Three Common use cases

  • Settings changes on a supported datasource
  • custom functionality in a supported datasource
  • connectivity to a datasource not supported by TDV out of the box

Custom Java Procedure (CJP)

  • An API that enables Java code to interact with TDV
  • TDV provides a set of interfaces to be implemented
  • The CJP is installed as a data source in TDV