First, let's start with some working definition of a lie.
A lie - making false claims; misrepresenting truth; making someone believe a thing is in order while it isn't.
What happens when a component in a system misrepresents data to another? E.g. imagine you have this dataset:
region | app_support | cdn_support
eu     | 1           | 1
us     | 1           | 1
china  | 1           | 0
And you run this query:
select cdn_support where region = "china";
What would you think of this component if it misrepresented the value as 1?
Well, in a software component it's called a bug. In the real world, we usually don't use this word - instead, we call it a lie.
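For the record, here is roughly what the honest version of that component looks like. This is a minimal sketch using Python's built-in sqlite3 module; the table name region_support and the helper cdn_support() are made up for illustration, since the query above doesn't name a table.

```python
import sqlite3

# Hypothetical table name; the rows are the dataset from the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE region_support "
    "(region TEXT, app_support INTEGER, cdn_support INTEGER)"
)
conn.executemany(
    "INSERT INTO region_support VALUES (?, ?, ?)",
    [("eu", 1, 1), ("us", 1, 1), ("china", 1, 0)],
)


def cdn_support(region):
    """Return the stored cdn_support flag for a region, unmodified."""
    row = conn.execute(
        "SELECT cdn_support FROM region_support WHERE region = ?",
        (region,),
    ).fetchone()
    return row[0]


print(cdn_support("china"))  # prints 0
```

Anything that answers 1 here has a bug.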
What Microsoft claims
Case 1: Pricing calculator
- Go to Azure's pricing calculator (retrieved: 2015-11-30)
- Add a CDN package
- Add "1GB" of transfer for Zone 2, which includes "Asia Pacific Japan Australia"
Observation: Estimated monthly cost is €0.12. Assuming you get what you pay for, that implies the service is available and works in the region.
Case 2: POP locations
- Go to https://azure.microsoft.com/en-us/documentation/articles/cdn-pop-locations/ (retrieved: 2015-11-30)
Observation: claimed POP locations in Taiwan (Kaohsiung) and Hong Kong.
What we've learned from Microsoft support...
(...while trying to host an actual f@ck!ng website in regions including China and Taiwan, and users reporting incredibly bad performance.)
Microsoft is merely a reseller for Verizon Edgecast for their CDN offerings.
We ran traceroutes from a VPN in China. We got routed through Germany and on to the USA.
[snip a long email thread]
Us (Sunday 21:12)
Does it mean that Azure clients are only handled by US/European data centers? I might have missed a small disclaimer on your page, but the map [snip link] doesn't say anything about the limitations.
[snip apologies for delays and promises to get someone on the case]
Them (Monday 19:00)
[...] Please note however that Azure customers cannot deliver from our POPs in China today. There is no commercial agreement in place for that. So in most cases, requests that come from users inside China will route to the United States.
TLDR: 503 Service Unavailable.
The droplet that overflows the jar of crap
We have struggled with several other issues.
Some deployments of our Python application randomly caused the WebApp to start throwing server errors. A workaround suggested by MS support was to log in over FTP and replace the contents of a specific platform-managed directory (a virtualenv) with another, working version.
Where do we get the working version? Well, the platform should generate one upon each deployment. This is where I found out that our deployment script (which I'd never seen before; it's also managed by the platform) was somehow replaced by a version that... seems specific to NodeJS. Wat?
Another option is to simply run pip install -r requirements.txt from inside a virtualenv, and FTP-copy that. Except, the magic has to happen on a Windows system. (What if I have none?)
An entire region (EU-West) became unavailable. We learned this... wait for it... from our QA team, as they pointed out the website doesn't load anymore.
I dug thru the management portal to find... an "Information" notice saying that the team is investigating an alert in the region. No notification, no emails, no alert, nothing. Just a silent outage.
Azure Web Apps consistently hides simple errors and makes debugging live issues a total pain. Examples:
There's a simple shim that you're supposed to bundle with your project, that is responsible for bootstrapping the application (activate virtualenv, import the app, etcetera). It's described in the docs as "so simple you shouldn't care it exists, you can totally just copypaste it, and it certainly has no bugs".
What it also does is hide ImportErrors raised due to missing third-party dependencies during application load, and claim it can't import the application itself. The "fix" was as simple as adding a call that reports the exception inside the except ImportError as e: block - to actually see the damn error.
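A bootstrap step that doesn't swallow the real error could look something like the sketch below. To be clear, this is not the actual Azure shim: load_app, the "bootstrap" logger name, and the app attribute are all hypothetical, and the only point is that the original ImportError gets logged and re-raised instead of being replaced by a generic "can't import the application" message.

```python
import importlib
import logging

logger = logging.getLogger("bootstrap")  # hypothetical logger name


def load_app(module_name, attr="app"):
    """Import the application object, logging the real ImportError first."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as e:
        # e.name is the module that actually failed to import, which may be
        # a missing third-party dependency rather than the app module itself.
        logger.exception(
            "import of %r failed; missing module: %r", module_name, e.name
        )
        raise  # keep the original exception and traceback
    return getattr(module, attr)
```

With this, a missing dependency shows up in the logs under its own name, with a full traceback, rather than masquerading as a broken application module.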
Can you see what's wrong with this line perhaps?
with open(os.path.join("foo", "bar.txt", "r")) as f:
Well, naturally it would try to find a (nonexistent) foo\bar.txt\r file; the fix is pretty obvious (move the closing paren). This is the infinitely helpful suggestion I've found in the server logs:
- IIS was not able to access the web.config file for the Web site or application. This can occur if the NTFS permissions are set incorrectly.
- IIS was not able to process configuration for the Web site or application.
- The authenticated user does not have permission to use this
- The request is mapped to a managed handler but the .NET Extensibility Feature is not installed.
Mind you, this was a line in the application code.
I've spent an hour (or two) learning all about NTFS permission models, binging around for .NET DLL loading procedures (there was no .NET code in our application, pure CPython), and otherwise pulling my hair.
(Luckily this one time, it didn't happen in the production environment.)
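For anyone who didn't spot it: the "r" ended up as an argument to os.path.join() instead of open(). Here's a small sketch that shows the difference without touching the filesystem. It uses ntpath (the Windows flavour of os.path, importable on any platform) so the result has the same backslashes as on the server; the variable names are just for illustration.

```python
import ntpath  # Windows implementation of os.path, available everywhere

# Buggy: "r" is passed to join(), so it becomes a path component, and
# open() would receive that whole path as its single argument.
buggy = ntpath.join("foo", "bar.txt", "r")

# Fixed: the closing paren moves, and "r" becomes the mode for open(),
# i.e. open(os.path.join("foo", "bar.txt"), "r").
fixed = ntpath.join("foo", "bar.txt")

print(buggy)  # foo\bar.txt\r
print(fixed)  # foo\bar.txt
```

A plain "file not found" with that first path in it would have saved me the hour.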
TLDR: 500 Internal Server Error.
My coworker suggested:
We should do a campaign called "Internal Server Error"
Azure seems to be the perfect platform for that job.