The practical use of repetition and definition levels in BigQuery

Google’s Dremel paper is an interesting read that explains some of the concepts that underlie BigQuery. I am still processing the paper and have noticed a few things about repetition and definition levels that are relevant to the every day use of BigQuery. Columnar Data and Records The underlying storage format for BigQuery is columnar. One of the first pieces of advice given to people using BigQuery is to only select the rows that you need....

September 12, 2021 · 2 min · Peter Boone  · Source

Unnecessary BigQuery Optimization

I was reading a post about BigQuery cost optimization which stated this: Also remember you are charged for bytes processed in the first stage of query execution. Avoid creating a complex multistage query just to optimize for bytes processed in the intermediate stages, since there are no cost implications anyway (though you may achieve performance gains). I was curious about this, because I have written some queries to minimize reading data multiple times....

September 12, 2021 · 4 min · Peter Boone  · Source