Program for TPC-H Data Generation with Skew
The schema and queries of the TPC-H (formerly TPC-D) benchmark are widely used by people in the database community. Last published: April 26, 2016.
Version: 1.1
Date Published: 7/15/2024
File Name: TPCDSkew.zip
File Size: 246.0 KB
The schema and queries of the TPC-H (formerly TPC-D) benchmark are widely used in the database community. One of the requirements of the benchmark is that the data for each column in the database is generated from a uniform distribution. However, this requirement makes it hard for users to draw conclusions about the robustness and effectiveness of their systems, since real-world data distributions are often non-uniform. We have therefore created a new data generation program for TPC-H that can generate a database whose columns have non-uniform (skewed) data distributions. In particular, the program can generate data from a Zipfian distribution, where the Zipf value (z), which controls the degree of skew in the data, is a parameter passed to the program; z = 0 corresponds to uniform data, and larger values of z produce heavier skew. In addition, the program can generate a database with a "mixed" data distribution, where the skew of each column is chosen at random from the Zipf values {0, 1, 2, 3, 4}. Note that the total number of rows in the tables and the total database size are not affected by our changes.
Supported Operating Systems
Windows 10, Windows 7, Windows 8
- Click Download and follow the instructions.
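To illustrate the kind of skew the description above refers to, here is a minimal Python sketch of drawing values from a Zipfian distribution over a finite domain, plus a "mixed" mode that picks z at random from {0, 1, 2, 3, 4}. It is not the code of the downloadable program: the function names (zipf_weights, make_zipf_sampler, mixed_skew) are hypothetical, and the sketch assumes that the weight of the value with rank k is proportional to 1 / k^z.

```python
import bisect
import itertools
import random


def zipf_weights(n, z):
    """Unnormalised Zipf weights 1 / rank**z for ranks 1..n (z = 0 is uniform)."""
    return [1.0 / (rank ** z) for rank in range(1, n + 1)]


def make_zipf_sampler(n, z, rng):
    """Return a function that draws a rank in 1..n with probability ~ 1 / rank**z."""
    cumulative = list(itertools.accumulate(zipf_weights(n, z)))
    total = cumulative[-1]

    def sample():
        # Inverse-CDF lookup: locate the draw within the cumulative weights.
        idx = bisect.bisect_right(cumulative, rng.random() * total)
        # Guard against floating-point rounding pushing the draw up to `total`.
        return min(idx, n - 1) + 1

    return sample


def mixed_skew(rng):
    """Pick a per-column Zipf value at random (assumed to be a uniform choice)."""
    return rng.choice([0, 1, 2, 3, 4])


if __name__ == "__main__":
    rng = random.Random(42)
    z = mixed_skew(rng)
    draw = make_zipf_sampler(10, z, rng)
    print("z =", z, "sample:", [draw() for _ in range(20)])
```

With z = 0 every rank is equally likely, while with z = 2 and ten distinct values the top-ranked value accounts for roughly two thirds of the draws. The number of draws is chosen by the caller, which matches the note above that skew changes how values are distributed, not how many rows are produced.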