This tool wouldn’t be useful for most, if not all, of the enterprise services I’ve worked for. For enterprise, you want a fully featured synthetics service such as ThousandEyes, plus an internal monitoring and alerting system.
Also you typically don’t want to expose your health endpoint to the outside world. It’s a security risk.
It's aimed at indie devs/startups shipping ideas quickly. We built it for ourselves while starting an app on the AWS free tier, which occasionally went down when usage spiked. It notified us so we could fix things quickly, before losing users who could download the app but not create an account. It can be set up in 30 seconds without writing any code, so it's mainly for coders who want a quick and easy solution.
So we're not aiming for enterprise on this one; we kept the pricing accessible and the features minimal.
For the health endpoint, as long as it only returns a 200 status code (without disclosing info like tokens, resource info, or server configuration), the risk is minimal.
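To make the "only returns a 200" point concrete, here is a minimal sketch of such an endpoint using only the Python standard library. The `/health` path, the `HealthHandler` and `make_server` names, and the port are assumptions for illustration, not how any particular app does it.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    # Override the default Server header, which would otherwise
    # advertise the Python version to anyone who asks.
    server_version = "health"
    sys_version = ""

    def do_GET(self):
        # 200 with an empty body on /health, 404 everywhere else.
        status = 200 if self.path == "/health" else 404
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        # Keep the sketch quiet; a real service would log somewhere.
        pass


def make_server(port: int = 8080) -> HTTPServer:
    # Bound to localhost here; an externally monitored service would
    # sit behind whatever public listener the app already has.
    return HTTPServer(("127.0.0.1", port), HealthHandler)
```

Calling `make_server().serve_forever()` runs it. The response carries no body and no version details, so there is nothing for an outside caller to learn beyond "it's up".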
Built YourServerIsDown.com as a side project that we needed for our startup... anyone else have the issue of not finding out quickly enough if your server went down?
For our app it's super important: if our server goes down, users can download the app but get stuck at the sign-in flow. There are subscription services out there that do more in-depth monitoring, but this is all we needed.
I listed an alternative solution below for those wanting to build or customize their own. Ours just gets the job done, is quick to set up, and lets you avoid the monthly Twilio/SMS fees.
Another alternative we received as feedback, for those interested: "If anyone wants an AWS-native way, and assuming there's an ALB, you can target the ELB 503 metric via a CloudWatch alarm and output to an SNS topic that goes to Slack, or use AWS Chatbot/Q, or set a phone number as the destination for SMS via SNS."
With that being said, I find these kinds of notifications produce more false positives than correct downtime detections. That ends up costing more time checking and double-checking.
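For anyone who wants to try the AWS-native route, a rough sketch of the alarm definition might look like this. It only builds the parameters; `build_503_alarm_params`, the alarm name, the period, and the threshold are all assumptions chosen for illustration, and the load balancer dimension value and SNS topic ARN have to come from your own account.

```python
def build_503_alarm_params(load_balancer: str, sns_topic_arn: str) -> dict:
    """Parameters for a CloudWatch alarm on ALB-generated 503 responses.

    load_balancer is the ALB's CloudWatch dimension value
    (the "app/<name>/<id>" suffix of its ARN).
    """
    return {
        "AlarmName": "alb-503-spike",  # hypothetical name
        "Namespace": "AWS/ApplicationELB",
        "MetricName": "HTTPCode_ELB_503_Count",
        "Dimensions": [{"Name": "LoadBalancer", "Value": load_balancer}],
        "Statistic": "Sum",
        "Period": 60,  # seconds per evaluation window (assumed)
        "EvaluationPeriods": 1,
        "Threshold": 1.0,  # alarm on any 503 in the window (assumed)
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # No 503s means no datapoints, which should not count as an alarm.
        "TreatMissingData": "notBreaching",
        # The SNS topic then fans out to Slack, SMS, etc.
        "AlarmActions": [sns_topic_arn],
    }
```

With boto3 installed and credentials configured, the dict can be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`.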
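One common way to cut down on false positives is to alert only after several consecutive failed checks, not on the first one. A minimal sketch of that idea follows; the `DowntimeDetector` name and the default threshold of 3 are assumptions, not something the tool above is known to do.

```python
class DowntimeDetector:
    """Fires one alert after `threshold` consecutive failed checks."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0
        self.alerted = False

    def record(self, ok: bool) -> bool:
        """Record one check result; return True only when an alert should fire."""
        if ok:
            # Any success resets the streak and re-arms the alert.
            self.failures = 0
            self.alerted = False
            return False
        self.failures += 1
        if self.failures >= self.threshold and not self.alerted:
            self.alerted = True  # alert once per outage, not on every check
            return True
        return False
```

A single blip (one failed check followed by a success) never reaches the threshold, so it never pages anyone. The trade-off is that a real outage is reported a few check intervals later.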
On the other hand, if you are running a service with no users and you have downtime... did you really have downtime?
If you run a service and you have downtime and no one reports it, did you have downtime?
I don't even check for my services. If something goes down, I'll find out via email from one or more of my customers. It happens very rarely.
Correct. It requires an unauthenticated endpoint that returns a 200 response. Usually this is the /health endpoint, but as long as we can send a ping it works.
OK, how does it actually work? I get that you'll check for 500 errors by hitting multiple endpoints every x units of time. But the number of endpoints you must check keeps going up for your service. Today you start with 10 endpoints; 6 months down the line you need to check 10,000 endpoints every x units of time. How do you manage scaling this?
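For illustration, a single check of this kind could be sketched as below with the Python standard library. This is not the service's actual implementation; `is_up` and the 5-second default timeout are assumptions.

```python
import http.client
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if an unauthenticated GET to `url` ends in a 200."""
    try:
        # Note: urlopen follows redirects, so a redirect that lands
        # on a 200 page also counts as "up".
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # 4xx/5xx responses, refused connections, DNS failures, and
        # timeouts are all treated as "down".
        return False
```

Run on a schedule against something like `https://example.com/health`, a `False` result is what would trigger the notification.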
There are services like Textbelt that leave the trigger mechanisms all up to you and your local tools:
https://textbelt.com/
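As a rough sketch of what "up to you" looks like, the snippet below builds the POST request for Textbelt's text endpoint using the standard library. The `phone`, `message`, and `key` form fields are written from my understanding of their public docs, so double-check them there; `build_sms_request` is a made-up helper name.

```python
import urllib.parse
import urllib.request

TEXTBELT_URL = "https://textbelt.com/text"


def build_sms_request(phone: str, message: str, key: str) -> urllib.request.Request:
    """Build (but do not send) a form-encoded POST for one SMS."""
    body = urllib.parse.urlencode(
        {"phone": phone, "message": message, "key": key}
    ).encode("utf-8")
    return urllib.request.Request(TEXTBELT_URL, data=body, method="POST")
```

Sending is then `urllib.request.urlopen(build_sms_request(...), timeout=10)`, called from whatever check or cron job decides the server is down.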