Windocks blog

Oracle and Oracle EBS Database Subsetting

Written by Paul Stanton | Sep 27, 2025 8:56:18 PM

Provisioning multi-terabyte databases has always been challenging, which is the reason behind Windocks database virtualization: it provides writable databases in seconds, with 99% storage compression. Now, Windocks is pleased to offer a complementary solution in database subsetting.

Windocks database subsetting delivers a copy of the source database, reduced in size, with all tables and relationships, and with optional bias controls.   The Windocks subsetter handles circular dependencies, composite keys, and other challenges automatically. All that is needed is a specified “percent of source.”

Windocks subsetting runs on Windows, Linux, or in a Docker container. It connects to a source database to build the subset, which is written to a new target database (on the source instance or a different one). The subsetter does not perform any writes to the source database. Windocks supports subsetting for Oracle, Oracle EBS, Snowflake, Aurora, SQL Server, Postgres, and MySQL, including databases hosted on AWS, Azure, and OCI.
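To make the idea concrete, here is a minimal sketch of percent-based subsetting that keeps parent/child relationships intact. It is not the Windocks implementation: the `customers` and `orders` tables, the `build_source` and `subset` functions, and the modulo-based sampling rule are all assumptions chosen for illustration, and SQLite stands in for the source and target databases. It does not attempt circular dependencies or composite keys.

```python
import sqlite3

# Hypothetical two-table schema used only for this illustration.
SCHEMA = """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL
    );
"""

def build_source() -> sqlite3.Connection:
    """Create a small in-memory 'source' with 100 customers and 1,000 orders."""
    src = sqlite3.connect(":memory:")
    src.executescript(SCHEMA)
    src.executemany("INSERT INTO customers VALUES (?, ?)",
                    [(i, f"customer-{i}") for i in range(1, 101)])
    src.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                    [(i, (i % 100) + 1, i * 1.5) for i in range(1, 1001)])
    src.commit()
    return src

def subset(src: sqlite3.Connection, dst: sqlite3.Connection, percent: int) -> None:
    """Copy roughly `percent` of orders plus every customer they reference.

    Only SELECT statements are issued against `src`; all writes go to `dst`.
    """
    dst.execute("PRAGMA foreign_keys = ON")  # let the target reject orphan rows
    dst.executescript(SCHEMA)

    # Deterministic sample: keep orders whose id modulo 100 is below `percent`.
    orders = src.execute(
        "SELECT id, customer_id, total FROM orders WHERE id % 100 < ?",
        (percent,),
    ).fetchall()

    # Pull in the parent rows the sampled orders depend on.
    customer_ids = sorted({row[1] for row in orders})
    customers = []
    if customer_ids:
        placeholders = ",".join("?" * len(customer_ids))
        customers = src.execute(
            f"SELECT id, name FROM customers WHERE id IN ({placeholders})",
            customer_ids,
        ).fetchall()

    # Insert parents first so the foreign key on orders is satisfied.
    dst.executemany("INSERT INTO customers VALUES (?, ?)", customers)
    dst.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    dst.commit()

if __name__ == "__main__":
    source = build_source()
    target = sqlite3.connect(":memory:")
    subset(source, target, percent=10)
    print(target.execute("SELECT COUNT(*) FROM orders").fetchone()[0], "orders copied")
```

Sampling the child table first and then collecting its parents is one simple way to avoid orphaned rows; a production subsetter has to walk the full dependency graph instead.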

Start small 

Subsetting compute and memory requirements scale with the number of tables and the size of the target database. A best practice is to start with a small target and increase it incrementally as needed. The resources needed vary with database complexity: SQL Server databases typically include scores or hundreds of tables, compared to Oracle EBS databases with 20,000 tables and half a million columns.

Use-cases

Managing multi-tenant databases is easy with a subset based on tenant ID. A large multi-tenant database is easily divided into subsets, for example tenants A to F and tenants G to Z.

        Source -> Subset A to F

        Source -> Subset G to Z
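The tenant split above can be sketched in a few lines. This is an illustrative example only, not Windocks code: the `tenant_id` column name, the `RANGES` table, and the `bucket_for` and `split_by_tenant` helpers are assumptions, and the rule of bucketing by the first letter of the tenant ID is just one possible convention.

```python
from collections import defaultdict

# Hypothetical tenant ranges, keyed by the label of the resulting subset.
RANGES = {"A-F": ("A", "F"), "G-Z": ("G", "Z")}

def bucket_for(tenant_id: str) -> str:
    """Return the subset label whose letter range contains the tenant's first letter."""
    first = tenant_id.strip()[:1].upper()
    for label, (low, high) in RANGES.items():
        if low <= first <= high:
            return label
    raise ValueError(f"tenant id {tenant_id!r} does not fall in any range")

def split_by_tenant(rows):
    """Group rows (dicts with a 'tenant_id' key) into one list per subset label."""
    subsets = defaultdict(list)
    for row in rows:
        subsets[bucket_for(row["tenant_id"])].append(row)
    return dict(subsets)

if __name__ == "__main__":
    sample = [
        {"tenant_id": "acme", "invoice": 1},
        {"tenant_id": "Globex", "invoice": 2},
        {"tenant_id": "Fabrikam", "invoice": 3},
        {"tenant_id": "Zenith", "invoice": 4},
    ]
    for label, rows in split_by_tenant(sample).items():
        print(label, [r["tenant_id"] for r in rows])
```

In a real database the same rule would typically be expressed as a filter on the tenant column, applied to every table that carries it.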

Combined with synthetic-data-based masking, subsetted databases reduce security risk and provide efficient support for analytics or machine learning pipelines.

        Source -> Subset -> Analytics and ML pipeline development

Fully synthetic populated databases are best created from a subset.   Subsets include source data that can be modeled to populate the target database with fully synthetic data that looks like the source, but is safe to share with third parties.

        Source -> Subset -> Synthetic populated database

Data privacy is always a concern when replicating production databases, and masking is easily incorporated along with subsetting.

        Source -> Subset -> Masking -> Users
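As a rough illustration of the masking step in that flow, the sketch below replaces email addresses in subset rows with deterministic stand-ins, so the same source value always maps to the same masked value and joins on that column still line up. This is a generic hashing approach, not the Windocks masking method; the `mask_email` and `mask_emails` functions, the `email` column, and the `demo-salt` value are hypothetical.

```python
import hashlib

def mask_email(email: str, salt: str = "demo-salt") -> str:
    """Map an email to a stable, non-reversible placeholder address."""
    digest = hashlib.sha256((salt + email.lower()).encode("utf-8")).hexdigest()[:12]
    return f"user_{digest}@example.com"

def mask_emails(rows, column: str = "email"):
    """Return copies of the rows with the given column masked; other columns are untouched."""
    return [
        {key: (mask_email(value) if key == column else value)
         for key, value in row.items()}
        for row in rows
    ]

if __name__ == "__main__":
    subset_rows = [
        {"id": 1, "email": "Pat@Corp.example", "plan": "gold"},
        {"id": 2, "email": "pat@corp.example", "plan": "gold"},
    ]
    for row in mask_emails(subset_rows):
        print(row)
```

A salted hash keeps the mapping consistent across tables; in practice the salt would need to be kept secret, and synthetic replacement values may be preferable where realistic-looking data matters.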

Windocks' growing capabilities

In addition to database subsetting, Windocks supports a range of capabilities:

  • Database virtualization delivers writable databases in seconds, with 99% storage compression. Database virtualization is commonly combined with Windocks-created containers, but is also delivered to fixed instances. It is also commonly combined with Windocks masking, or with in-house or third-party masking.

  • Windows SQL Server Docker containers: Windocks introduced these in 2015 and continues to be the only vendor in the marketplace supporting Windows SQL Server containers. Windocks customers streamline lower-level development and test by combining SQL Server containers with SQL Server database virtualization. Running 50 lower-level database environments on a single server drives savings, efficiency, and improved security. This combination is also the go-to solution for distributed team support.
     
  • Synthetic data and masking: Windocks synthetic data supports replacement of PII data (i.e., masking), and can populate tables and databases with fully synthetic data that is modeled to reflect the source data distribution.

  • Analytics and ML feature data pipeline: a set of UI-based tools to create high-quality analytics and ML feature data sets sourced from databases. Tools include schema parsing, cross-table and cross-database joining, data cleansing, data normalization, aggregation, statistical measures, and logical features.

Database subsetting is a powerful new capability. If you’re interested in learning more, please get in touch!