The object storage stuff is new, but it has mostly confirmed that the older architecture works. MPP with shared (S3) storage at the bottom, and everything above that on local SSD and compute, delivers the best performance. Even Snowflake finally came out with "interactive" warehouses with this architecture.
Parquet, Iceberg, and other open formats seem good, but they may hit a complexity wall. There's already some inconsistency between platforms, e.g. with delete vectors.
Incremental view maintenance interests me as well, and I would like to see it more available on different platforms. It's ironic that people use dbt etc. to test every little edit of their manually coded delta pipelines, but don't look at IVM.
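For anyone who hasn't looked at IVM up close, here is a minimal sketch of the idea, assuming a single grouped-sum view and a change feed of inserts and deletes. The names (`SumByKeyView`, `apply_delta`, `recompute`) are made up for illustration and are not any platform's API; real engines handle joins, out-of-order updates, and much more.

```python
from collections import defaultdict


class SumByKeyView:
    """Toy incrementally maintained view, roughly
    SELECT key, SUM(amount) FROM t GROUP BY key.
    Hypothetical class for illustration only."""

    def __init__(self):
        self._sum = defaultdict(int)
        self._count = defaultdict(int)

    def apply_delta(self, delta):
        # delta: iterable of (key, amount, multiplicity),
        # where multiplicity is +1 for an insert and -1 for a delete.
        for key, amount, mult in delta:
            self._sum[key] += mult * amount
            self._count[key] += mult
            # Drop the group once its last row is deleted.
            if self._count[key] == 0:
                del self._sum[key]
                del self._count[key]

    def rows(self):
        return dict(self._sum)


def recompute(base_rows):
    """Full recomputation over (key, amount) rows, for comparison."""
    out = defaultdict(int)
    for key, amount in base_rows:
        out[key] += amount
    return dict(out)


view = SumByKeyView()
# Initial load expressed as inserts.
view.apply_delta([("a", 10, +1), ("b", 5, +1)])
# Later change: insert ("a", 3), delete ("b", 5).
view.apply_delta([("a", 3, +1), ("b", 5, -1)])

# The incrementally maintained result matches a full recompute
# over the base table after the same changes.
assert view.rows() == recompute([("a", 10), ("a", 3)])
```

The point is that each change costs work proportional to the delta, not to the base table, which is exactly the logic people tend to hand-code in delta pipelines.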
Before spidering the site for offline reading, be aware:
“Rather than secure rights to the recommended papers, we have simply provided links to Google Scholar searches that should help the reader locate the relevant papers.”
1. You do not talk about Sci-Hub.
2. You do NOT talk about Sci-Hub.
3. If a download says "Stop," goes limp,
or taps out, that download is over.
4. Only two tries per mirror.
5. One download at a time.
6. Shirt and shoes optional.
7. Downloads will continue until publicly funded
research is widely distributed.
8. If this is your first time at Sci-Hub, you
have to download something interesting,
actually read at least part of it, learn
something, and then fight ignorance and/or
stupidity with it.
Amazing: the website's index page has the book's index in it. While this makes perfect sense, it's the kind of feature that is becoming rare on today's tech book websites, which display all sorts of marketing fluff, social confirmations, etc., and not the structure of the book itself.
Readings in Database Systems (commonly known as the "Red Book") has offered readers an opinionated take on both classic and cutting-edge research in the field of data management since 1988. Here, we present the Fifth Edition of the Red Book — the first in over ten years.
CHAPTERS
Preface
[HTML] [PDF]
Background introduced by Michael Stonebraker
[HTML] [PDF]
Traditional RDBMS Systems introduced by Michael Stonebraker
[HTML] [PDF]
Techniques Everyone Should Know introduced by Peter Bailis
[HTML] [PDF]
New DBMS Architectures introduced by Michael Stonebraker
[HTML] [PDF]
Large-Scale Dataflow Engines introduced by Peter Bailis
[HTML] [PDF]
Weak Isolation and Distribution introduced by Peter Bailis
[HTML] [PDF]
Query Optimization introduced by Joe Hellerstein
[HTML] [PDF]
Interactive Analytics introduced by Joe Hellerstein
[HTML] [PDF]
Languages introduced by Joe Hellerstein
[HTML] [PDF]
Web Data introduced by Peter Bailis
[HTML] [PDF]
A Biased Take on a Moving Target: Complex Analytics
by Michael Stonebraker
[HTML] [PDF]
A Biased Take on a Moving Target: Data Integration
by Michael Stonebraker
[HTML] [PDF]
Complete Book: [HTML] [PDF]
Readings Only: [HTML] [PDF]
Previous Editions: [HTML]
- Vector databases and hybrid search?
- Object storage for all the things? Lake houses. Parquet and beyond.
- Continuously materialized views? I'm not sure this one has made a splash, but I think about Naiad (Materialize) and Noria (Readyset).
- NewSQL went mostly mainstream (Spanner wasn't included in the last one, but there's been more here with things like CockroachDB, TiDB, etc)
2020 (225 points, 30 comments) https://news.ycombinator.com/item?id=15436647
2017 (247 points, 44 comments) https://news.ycombinator.com/item?id=15436647
2015 (189 points, 37 comments) https://news.ycombinator.com/item?id=10694538
https://ibb.co/BVrzQRWH
I just switched networks (wifi/mobile) and it worked; only that provider seems to block it.
Some might argue the Red Book to be “NSA Trusted Networks”, a.k.a. the ugly red book that won't fit on the shelf.
Crash & Burn <3