Provisioning multi-terabyte databases has always been challenging. That challenge is the reason behind Windocks database virtualization, which provides writable databases in seconds with 99% storage compression. Now, Windocks is pleased to offer a complementary solution in database subsetting.
Windocks database subsetting delivers a copy of the source database, reduced in size, with all tables and relationships, and with optional bias controls. The Windocks subsetter handles circular dependencies, composite keys, and other challenges automatically. All that is needed is a specified “percent of source.”
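To make the idea of a "percent of source" subset that keeps relationships intact more concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It is not the Windocks subsetter or its API. The `customers` and `orders` tables, the `build_subset` and `make_demo_source` functions, and the "first N rows by id" sampling rule are all assumptions made for illustration. The sketch handles one parent-child foreign key only and does not attempt circular dependencies or composite keys.

```python
import sqlite3

SCHEMA = [
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(id), total REAL)",
]


def make_demo_source() -> sqlite3.Connection:
    """Build a hypothetical source: 100 customers with 2 orders each."""
    src = sqlite3.connect(":memory:")
    for statement in SCHEMA:
        src.execute(statement)
    src.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"customer-{i}") for i in range(1, 101)],
    )
    src.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(i, 10.0 * k) for i in range(1, 101) for k in (1, 2)],
    )
    src.commit()
    return src


def build_subset(source: sqlite3.Connection, percent: float) -> sqlite3.Connection:
    """Copy about `percent` of parent rows plus every child row that references them."""
    target = sqlite3.connect(":memory:")
    for statement in SCHEMA:
        target.execute(statement)

    total = source.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    keep = max(1, round(total * percent / 100))

    # Only SELECT statements run against the source; all writes go to the target.
    parents = source.execute(
        "SELECT id, name FROM customers ORDER BY id LIMIT ?", (keep,)
    ).fetchall()
    target.executemany("INSERT INTO customers VALUES (?, ?)", parents)

    # Follow the foreign key so no copied order points at a missing customer.
    for customer_id, _name in parents:
        children = source.execute(
            "SELECT id, customer_id, total FROM orders WHERE customer_id = ?",
            (customer_id,),
        ).fetchall()
        target.executemany("INSERT INTO orders VALUES (?, ?, ?)", children)

    target.commit()
    return target
```

Calling `build_subset(make_demo_source(), 10)` on the demo data copies 10 customers and their 20 orders into the target. A real schema with many tables would need the tables ordered by dependency, which is the part a dedicated subsetter automates.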
Windocks subsetting runs on Windows, on Linux, or in a Docker container. It connects to a source database to build the subset, which is written to a new target database (on the source instance or a different one). The subsetter does not perform any writes to the source database. Windocks supports subsetting for Oracle, Oracle EBS, Snowflake, Aurora, SQL Server, Postgres, MySQL, AWS, Azure, and OCI databases.
Start small
Subsetting compute and memory requirements scale with the number of tables and the size of the target database. A best practice is to start with a small target and increase it incrementally as needed. The resources needed vary with database complexity: SQL Server databases typically include scores or hundreds of tables, while Oracle EBS databases can have 20,000 tables and half a million columns.
Use-cases
Managing multi-tenant databases is easy with a subset based on tenant ID. A large multi-tenant database is easily divided into subsets, for example one for tenants A to F and another for tenants G to Z.
Source -> Subset A to F
Source -> Subset G to Z
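As a rough illustration of the tenant split shown above, the sketch below assigns rows to an "A-F" or "G-Z" bucket by the first letter of a tenant identifier. The `tenant_bucket` and `split_rows` functions and the `tenant_id` field name are hypothetical, chosen for this example only. They do not describe how Windocks selects tenants.

```python
from typing import Dict, Iterable, List


def tenant_bucket(tenant_id: str) -> str:
    """Return the subset a tenant belongs to, based on its first letter."""
    first = tenant_id.strip()[:1].upper()
    if "A" <= first <= "F":
        return "A-F"
    if "G" <= first <= "Z":
        return "G-Z"
    # Empty IDs and IDs that start with a digit or symbol land here.
    return "other"


def split_rows(rows: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group rows into per-subset lists keyed by bucket name."""
    buckets: Dict[str, List[dict]] = {"A-F": [], "G-Z": [], "other": []}
    for row in rows:
        buckets[tenant_bucket(row["tenant_id"])].append(row)
    return buckets
```

The explicit "other" bucket is a design choice for the sketch: it makes tenants that fall outside both letter ranges visible instead of silently dropping them.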
Combined with synthetic-data-based masking, subsetted databases reduce security risk and provide efficient support for analytics or machine learning pipelines.
Source -> Subset -> Analytics and ML pipeline development
Databases populated with fully synthetic data are best created from a subset. Subsets include source data that can be modeled to populate the target database with fully synthetic data that looks like the source but is safe to share with third parties.
Source -> Subset -> Synthetic populated database
Data privacy is always a concern when replicating production databases, and masking is easily incorporated along with subsetting.
Source -> Subset -> Masking -> Users
Windocks' growing capabilities
In addition to database subsetting, Windocks supports a range of capabilities:
Database subsetting is a powerful new capability. If you’re interested in learning more, please get in touch!