NULL in BigQuery

Keeping track of how NULL values are handled in different SQL dialects can be tricky. This post will serve as a living document where I keep track of how BigQuery does this. NULL and STRING comparison Comparison always returns NULL. SELECT CAST(NULL AS STRING) = '', -- null CAST(NULL AS STRING) = 'value', -- null CAST(NULL AS STRING) = CAST(NULL AS STRING) -- null NULL and BOOL comparison NULL is not TRUE or FALSE....

September 14, 2021 · 3 min · Peter Boone  · Source

The practical use of repetition and definition levels in BigQuery

Google’s Dremel paper is an interesting read that explains some of the concepts that underlie BigQuery. I am still processing the paper and have noticed a few things about repetition and definition levels that are relevant to the every day use of BigQuery. Columnar Data and Records The underlying storage format for BigQuery is columnar. One of the first pieces of advice given to people using BigQuery is to only select the rows that you need....

September 12, 2021 · 2 min · Peter Boone  · Source

Unnecessary BigQuery Optimization

I was reading a post about BigQuery cost optimization which stated this: Also remember you are charged for bytes processed in the first stage of query execution. Avoid creating a complex multistage query just to optimize for bytes processed in the intermediate stages, since there are no cost implications anyway (though you may achieve performance gains). I was curious about this, because I have written some queries to minimize reading data multiple times....

September 12, 2021 · 4 min · Peter Boone  · Source