Analytics Engineering, Data Engineering
Optimizing a Data Warehouse for a Y Combinator-Backed Startup
Reducing query time of business-critical queries by 80%.
Background
A Y Combinator-backed startup engaged Beyond Data Consulting to optimize its data warehouse. The startup was loading billions of data points per month into its AWS Athena data warehouse, and business-critical queries were significantly delayed by a poorly designed partitioning scheme.
Solution
The consulting team designed a new partitioning and clustering structure for the company's data lake and migrated the existing data and pipelines to it. The work included:
- Migrating historical data from the original S3 bucket into a new, repartitioned S3 bucket.
- Applying the same optimizations to the AWS Kinesis ingestion process, so that newly ingested data lands directly in the repartitioned architecture.
- Introducing handling and reporting of low-quality data, so that records that did not fit the expected schema were flagged and reported.
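As an illustration of the repartitioning and data-quality steps above, here is a minimal Python sketch, not the implementation used in the engagement. It assumes a hypothetical record schema (`event_id`, `event_type`, `timestamp`) and hypothetical partition keys (`dt`, `event_type`); the case study does not state which keys or fields were actually used. Valid records are grouped under Hive-style `key=value/` prefixes, the layout Athena can use to prune partitions, while records that fail validation are set aside with the reasons attached so they can be reported.

```python
from datetime import datetime, timezone

# Hypothetical schema (an assumption, not from the case study):
# field name -> required Python type.
REQUIRED_FIELDS = {"event_id": str, "event_type": str, "timestamp": str}


def validate(record):
    """Return a list of schema problems; an empty list means the record is valid."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"wrong type for {field}")
    if not problems:
        try:
            datetime.fromisoformat(record["timestamp"])
        except ValueError:
            problems.append("unparseable timestamp")
    return problems


def partition_prefix(record):
    """Build a Hive-style prefix such as 'dt=2024-03-01/event_type=click/'."""
    ts = datetime.fromisoformat(record["timestamp"])
    if ts.tzinfo is not None:
        # Normalize to UTC so one event always maps to one date partition.
        ts = ts.astimezone(timezone.utc)
    return f"dt={ts:%Y-%m-%d}/event_type={record['event_type']}/"


def route(records):
    """Split records into {partition prefix: [records]} and a quarantine list."""
    partitions = {}
    quarantine = []
    for record in records:
        problems = validate(record)
        if problems:
            # Keep the reasons with the record so low-quality data can be reported.
            quarantine.append({"record": record, "problems": problems})
        else:
            partitions.setdefault(partition_prefix(record), []).append(record)
    return partitions, quarantine


if __name__ == "__main__":
    sample = [
        {"event_id": "a1", "event_type": "click", "timestamp": "2024-03-01T12:30:00+00:00"},
        {"event_id": "a2", "event_type": "click", "timestamp": "not-a-date"},
    ]
    good, bad = route(sample)
    print(sorted(good))
    print([entry["problems"] for entry in bad])
```

In a pipeline of this shape, the same routing logic could in principle be shared between the one-off backfill and the streaming ingestion path, so that both write to the same partition layout.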
Results
- Improved Efficiency: Average query time on high-frequency, business-critical queries was reduced by 80%.
- Improved Visibility: By adding handling and reporting of low-quality data, the time to respond to issues in the data ingestion stream was reduced significantly.
Technologies Used
Python, Apache Spark, AWS Kinesis, AWS Glue, AWS Athena