The site is indeed instant, and those performance tricks do work (inline everything, Brotli compression, caching, an edge network like a CDN), BUT the site is also completely empty: it shows nothing except a placeholder.
Things can easily change when you start adding functionality. One site I like to visit to remind myself of how fast usable websites can be is Dlang's forum. I just navigate around to get the experience.
> One site I like to visit to remind myself of how fast usable websites can be is Dlang's forum. I just navigate around to get the experience
Interestingly, for me each page load takes a noticeably long delay. Once it starts loading all of the content snaps in almost at once. It’s slower to get there than the other forums I visit though.
It's crazy how unusable most gun websites are for browsing what's available. This though is the perfect example of what I really want when browsing catalogues.
One time I decided to check how much faster you can really go while still getting decent usability out of a "simple blog platform" type of webapp.
The end result, written in Go, took around 80-200µs to generate a post page and 150-200µs for an index page with a bunch of posts (on a cheap Linode VPS... probably far faster on my dev machine).
Core was basically
* pre-compile the templates
* load blogpost into RAM, pre-compile and cache the markdown part
The cache could easily be moved out to Redis or similar, but it's just text, so there is no need.
Fun stuff I ran into:
* runtime template loading takes a lot of time just for the type-casting; the template framework I used was basically a thin veneer over Go code that got compiled to Go code when run
* it was fast enough that multiple Write() calls vs one was noticeable on the flame graph
* smart caching will get you everywhere if you get cache invalidation right, making the "slow" parts not matter; unless you're running years of content and gigabytes of text you probably don't want to cache it anywhere else than in RAM or at the very least have over-memory cache be second tier.
The project itself was a rewrite of the same thing I had tried in Perl (using Mojolicious), and even there it achieved single-digit ms.
And it feels so... weird, using a webpage that reacts with the speed of a well-written native app. The whole design process went against ye olde "don't optimize prematurely" and it was a complete success; looking at performance in each iteration of each component paid off really quickly. We got robbed of so much time by badly running websites.
I had my page served with Go and it was instant, 100% speed score. Then I moved the static content to a CDN and it's slower now, only 96% speed. However, the question is really how fast the page is when it comes under heavy load.
My blog directory/search engine [1] runs on Cloudflare workers as well. I was able to get pretty good results, too. For example, the listing of 1200+ blogs [2], each with 5 latest posts, loads in ~500ms. A single post with a list of related posts, loads in ~200ms. Yeah, it's still a lot, but it includes all the normal web app logic like auth middlewares, loading user settings, and states; everything is rendered server-side, no rich frontend code (apart from htmx for a couple of buttons to make simple one-off requests like "subscribe to blog" or "add to favorites"). A static page (like /about) usually loads in ~100ms.
This is a bit stochastic because of regions and dynamic allocation of resources. So, e.g., if you're the first user from a large geographic region to visit the website in the last several hours, your first load will take longer.
My other project (a blog platform) contains a lot of optimizations, so posts [3] load pretty much as fast as that example from the thread, i.e. 60-70ms.
Do browsers use a custom dictionary for zstd (I don’t think so since I can precompress zstd content server-side)?
Brotli was designed for html compression so despite/while being a relatively inferior algorithm, its stock dictionary is all html/css/js-trained/optimized. Chrome/Blink recently added support for seeing content compressed with a bespoke dictionary, but that only works for massive sites that have a heavily skewed new/returning visit ratio (because of the cost of shipping both the compressed content and the dictionary).
Long story short, I could see br being better than zstd for basic web purposes.
Also, I've heard multiple times that an edge network can be worse: if you're low priority and another part of the globe is not busy, your traffic gets routed in the worst possible way.
Getting a site to load quickly isn't that difficult from a technical perspective. You just need to strip out everything that slows it down. If you can deliver a page of HTML and inlined CSS that renders without JS or images then your site will be fast (or at least it'll be perceived as fast, which is fine.) So long as you're using some fairly reputable hosting infrastructure (AWS, Azure, Google, etc), and if you're rendering on the server you're not doing silly things on the hot path, then you don't need to worry about speed.
The hard part when it comes to site optimization is persuading various stakeholders who want GTM, Clarity, Dynatrace, DataDog, New Relic, 7 different ad retargeters, Meta, X, and probably AI as well now that a fast loading website is more important than the data they get from whichever of those things they happen to be interested in.
For any individual homepage where that stuff isn't an issue because the owner is making all the decisions, it's fair to say that if your site loads slowly it's because you chose to make it slow. For any large business, it's because 'the business' chose to make it slow.
Eventually you'll want to know what users are doing, and specifically why they're not doing what you expected them to do after you spent ages crafting the perfect user journeys around your app. Then you'll start wondering if installing something to record sessions is actually a great idea that could really help you optimize things for people and get them more engaged (and spending more money.)
Fast forward three years, and you'll be looking at the source of a page wondering how things got so bad.
> Eventually you'll want to know what users are doing, and specifically why they're not doing what you expected them to do after you spent ages crafting the perfect user journeys around your app
That's putting the cart before the horse. The way it's properly done is just to invite a few users and measure and track their interaction with your software. And this way you'd have good feedback instead of frustrating your real users with slow software.
Yeah, you'll do that, and get great feedback, and then when you roll it out to other users they'll do weird stuff you've not seen any of the test group try before.
Users being weird are the fundamental root cause of all software problems. :)
Users can’t click a button that does not exist. It’s on product and engineering to curtail what the user can do. Optimizing for the happy path while not eliminating the incorrect flow is just bad software engineering.
I decided to go check my website’s PageSpeed and I do have a 100/100/100/100 with quite a lot of content on the homepage, including 6 separate thumbnails.
My site takes the straightforward path, no tricks: GitHub Pages served to the Internet by Cloudflare.
hmm, it seems the last static site I did is slightly faster https://nissestyrelsen.dk/ but probably just because it's hosted in country near me, if I was somewhere else probably much worse. Not that I care that much to really research it, I figure it's fast enough and just a funny idea.
I still appreciate that you shared this even though other comments are correct that there isn’t much content on the page. Frankly speaking, more devs should put more thought into performance.
I’m currently working on a small e-commerce store for myself, written in SvelteKit (frontend) and Go (backend) and one of my core objectives is to make it fast. Not crazy fast, but looking for TTFB < 50-70ms for an average Polish user. Will definitely share it once it’s public.
Another trick is adding speculation rules on MPA sites, so that when you hover over a link the page gets prerendered. For example, my initial page takes ~80ms, but navigating to other pages takes 20ms.
These 30 ms and 4 ms numbers were typical Apache to Netscape from MAE East and MAE West in 1998. Twenty five years and orders of magnitude more computing later? Same numbers.
But now it's that fast from almost everywhere on the planet, with nearly zero effort from the developer. We've been limited by light speed here, not compute.
I get 381ms/401ms on first load and not the claimed ~30ms. I'm not really sure what the point is here though. CDNs and browser cache headers work? Static sites are fast to paint?
Yeah, I'm not seeing fast uncached times either. I usually hit Cloudflare's Miami datacenter, which is only about 200 miles and very low latency. But I'm seeing 200+ms on this site right now.
Most cloudflare products are very slow / offer very poor performance. I was surprised by this but that’s just how it is. It basically negates any claimed performance advantage.
Durable objects, r2 as well as tunnel have been particularly poor performing in my experience. Workers has not been a great experience either.
R2 in particular has been the slowest / highest latency s3 alternative I ever had experience with, falling behind backblaze b2, wasabi and even hetzner’s object storage.
The circumference of Earth at the equator is about 40,000 km and the speed of light is about 300,000 km/s. The appropriate division results in about 0.13 s.
That seems to track. The vast majority of requests won’t go half way around the Earth, so maybe halving that time at 0.06 seems like a reasonable target.
I believe that FTL communication (if it's achievable) will start out in data centers at small scales. Perhaps millimeters.
Possibly as an extension of Quantum Computing where some probabilistic asymmetry can be taken advantage of. The QC itself might not be faster than classical computing, but the FTL comms could improve memory and cache access.
Also MetaGoog will use it to serve up hyper personalized ads in their Gemini based Metaverse.
I agree. Not impressed, frankly. Cloudflare workers is just even-more localized CDN, and the benefit is so tiny that it's not worth the investment nor maintenance costs. (I wrote extensively about this non-thing here: https://wskpf.com/takes/you-dont-need-a-cdn-for-seo). My site (https://wskpf.com), which has way more elements and, err, stuff, loads in 50ms, and unless you are superman or an atomic clock, you wouldn't care. same lighthouse scores as this one, but with no CDN nor cloudflare workers, and it actually has stuff on it.
Things can easily change when you start adding functionality. One site I like to visit to remind myself of how fast usable websites can be is Dlang's forum. I just navigate around to get the experience.
https://forum.dlang.org
It appears to have static content. Why does it need any JS at all?
Looks like the only JavaScript running on the client is for installing the service worker and some Cloudflare tracking junk.
This is a bit stochastic because of regions and dynamic allocation of resources. So, e.g., if you're the first user from a large geographic region to visit the website in the last several hours, your first load will take longer.
My other project (a blog platform) contains a lot of optimizations, so posts [3] load pretty much as fast as that example from the thread, i.e. 60-70ms.
1. https://minifeed.net/
2. https://minifeed.net/blogs
3. https://rakhim.exotext.com/but-what-if-i-really-want-a-faste...
For a dynamic service, well.. maybe implement something of interest and then we can discuss.
Why brag about how it's not static content, if you're just going to tell the browser to cache it until the end of time anyways?
Brotli is so 2024. Use zstd. (73.62%, I know. Slightly worse compression ratio, I know that too.)
Pretty much any small payload/non-javascript site is going to render very quickly (and instantly from cache) making SSL time be the long pole.
I have 5G network :)
- 3942ms
- 4281ms
Guess it depends on your region. This is from East-Asia.
https://github.com/ericfortis/mockaton/blob/main/www/src/_as...
uBlock Origin does it by default for instance.
On Brave, the workaround on that linked snippet bypasses their blocking.
The reason is that browser prefetching may hit URLs that were intended to be blocked.
I think most sites could either be static HTML and use a CDN, or they need a database and pretty much have to be located in one place anyway.
It's quite hard to think of use cases where that isn't true.
Durable objects, r2 as well as tunnel have been particularly poor performing in my experience. Workers has not been a great experience either.
R2 in particular has been the slowest / highest latency s3 alternative I ever had experience with, falling behind backblaze b2, wasabi and even hetzner’s object storage.
The site should be faster, though. I’ve had a small CF workers project that works correctly with quick load times.
Getting it closer can save you 50-150ms, but if the whole load takes 1s+, that's minuscule.
Is the site getting slower?
And with Workers they're accessible from hundreds of locations around the world so you can get this sort of speed from almost anywhere.
Maybe add some dynamic feature for the demo so that we don't need to trust you and be surprised at a nothingburger.
Add imagery and see if you get the same results. I expect you could achieve such with Base64 but the caveat would be larger file sizes.