Guide Area

Serverless architecture… not as perfect as you’re told!

I bet you came across this article because you wanted to know more about the differences between serverless platforms such as Firebase and AWS. Or you wanted to know if these platforms are as good as people tell you. I’ve been through many home projects – backend applications, REST APIs, frontends, Android applications, you name it. And just like you, I have read a lot of articles where people claimed that when kicking off a startup or creating a small-to-medium project, you should ALWAYS go with 3rd party APIs. Yes, it saves you time. Yes, it saves resources needed for development. And yes, you save yourself a whole lot of trouble and complications. But in life, there is no such thing as “100% perfect” and that’s what this article is about.

Disclaimer: I have never actually deployed an application using these platforms. But it’s not because I didn’t want to. Every time I tried, I came across a bunch of things that stood in the way of my requirements and I had to switch to manual work. I’m not writing this article to tell you “DON’T USE THEM”. I just want you to know what to expect before you start development.

Manual development takes time, resources, nerves, blah blah blah

Yes, indeed it does. Compared to serverless architecture, it’s a pain in the ass. You need to:

  • do initial research on cloud platforms
  • create your virtual machines and provision them
  • set up and optimize an application server for your apps
  • build custom-made authentication, which is a hassle
  • take care of the security of your VMs
  • handle other things, like load balancing, failover strategy, scaling, etc.

Although, there is always a “but”. Doing all of the things above gives you 100% control over EVERYTHING that you do. Entrepreneurs will tell you: “It’s overkill, use APIs and do programming instead of wasting time on architecture”. And I couldn’t agree more. But it all depends on what your plans for the future are. Do you want to host a food recipe web app for your local community, with a goal of 50 users and 5,000 visitors a month? No problem, go for Firebase (or similar).

But will your site grow to millions of users? Will you have 100 million visits every month? Is your page going to contain sensitive user data? Then you need to think about other things as well. Have a look at the following articles: How we spent $30k in Firebase in less than 72 hours and Over 3,000 iOS and Android Apps Leak 100 Million User Sensitive Records via Misconfigured Firebase Backends. I’m not going to go in depth here because we will cover these topics later in this article.

Why do you think that big companies do not use these APIs? They do everything themselves because sooner or later you get limited by what the platform lets you do, and you realize that if you want something self-tailored, you’re f***ed. Plus, there are many other reasons – keep reading.

You don’t own data

I’m not sure if this presents a big problem in your project. But you should know that if you store data in 3rd party services, you are no longer the only one who has ownership of it.

A few months back I talked to a co-founder of an API service that I won’t name. They stored end-user data on their servers, and their terms and conditions stated that by using the service you allow it to use end-user data for analytics and marketing purposes. I asked if it would be possible to cooperate without this rule (in the case of a website that works with highly sensitive user data) and I got a negative response.

Firebase, AWS, it doesn’t matter who – as far as I can tell, they all process your end-users’ data for analytics purposes (if not more).

If you want to make sure that your data is not processed by anyone else but you, you have to migrate to your own servers and databases.

Reliable for years to come?

This is a very important issue for a lot of people. When GDPR came into force, a lot of companies had to replace old dependencies that had reached their EOL (end of life).

Let’s say you chose a 3rd party OAuth provider for authentication. It’s very probable that this provider will live for a long time. But what if it doesn’t?

I’ve seen many, many projects which needed to be refactored because the services that they used were deprecated. When I say refactored, I mean completely rewritten. Just imagine that one day Firebase gets cancelled (just as Google+ was, even though a lot of people used it) and you have to switch your whole architecture to AWS instead. Yes, yes, yes, I know, I’m out of my mind, I’m going nuts, this will never happen… until it does, and you’re screwed.

Alright, let’s say that everything will be fine. You’re on Firebase and it will never reach EOL. Do you remember what happened to Java? Oracle introduced hefty licensing fees for commercial use of its Java distribution, and it was a huge punch in the face for people who expected Oracle Java to stay free forever. What if in 2 years Firebase doubles or triples its prices? You suddenly realize that all of your profits are gone and you’re diving deeper into debt every month.

Yes, I understand that this can also happen to the pricing tiers of standalone servers. But you need to realize that there is just one Firebase, while there are thousands of server providers all around the world, so all you have to do is migrate your Linux server setup to another provider – a lot easier than rewriting millions of lines of code.

Customization – general

In one of my projects, I was using AWS Cognito (Amplify) as an authentication service. It was supposed to only authenticate and store user data – the rest of the functionality was taken care of in my Lambdas. I needed my users to enter their email address, username and password while registering. Both email address and username had to be unique. Long story short – not possible. Either email or username can be unique, but not both. So what do you do? Use Cognito to authenticate, then create a database for storing additional data and write backend logic to put the whole process together.

I tried the same thing in Firebase. Firebase doesn’t even have custom fields – you can only have an email address and a password there. So you have to repeat the procedure from above.

You might not realize it but you already need 3 of the provider’s services to do one simple thing.

Now compare it with a single server where your Python or Node.js app takes care of authentication and can be customized to your needs. Extra fields? No problem. Custom encryption? No problem. Special rules? No problem. Workflow customization? No problem. I could go on and on.
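To make the “both email and username must be unique” requirement concrete, here is a minimal sketch of how a self-hosted Python app could enforce it. Everything in it is an assumption for illustration, not code from any of my projects: the table layout, the function names (create_user_store, register, authenticate), the in-memory SQLite database and the PBKDF2 iteration count are all hypothetical choices.

```python
import hashlib
import hmac
import os
import sqlite3


def create_user_store() -> sqlite3.Connection:
    """Create an in-memory user table where email AND username are unique."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE users (
            id       INTEGER PRIMARY KEY,
            email    TEXT NOT NULL UNIQUE,
            username TEXT NOT NULL UNIQUE,
            salt     BLOB NOT NULL,
            pw_hash  BLOB NOT NULL
        )
        """
    )
    return conn


def _hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2 from the standard library; the iteration count is an illustrative choice.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(conn: sqlite3.Connection, email: str, username: str, password: str) -> bool:
    """Return True on success, False if the email or the username is already taken."""
    salt = os.urandom(16)
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.execute(
                "INSERT INTO users (email, username, salt, pw_hash) VALUES (?, ?, ?, ?)",
                (email.lower(), username, salt, _hash_password(password, salt)),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised when either UNIQUE constraint is violated
        return False


def authenticate(conn: sqlite3.Connection, username: str, password: str) -> bool:
    """Check a password against the stored salted hash."""
    row = conn.execute(
        "SELECT salt, pw_hash FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    salt, pw_hash = row
    # Constant-time comparison of the stored and the recomputed hash
    return hmac.compare_digest(pw_hash, _hash_password(password, salt))
```

The database itself rejects a duplicate email or a duplicate username, so the uniqueness rule lives in one place and extra fields are just extra columns.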

Customization – security

I don’t exactly remember which of the specific services had this issue. I had an element (function? hosting? not sure) which I needed to protect, so only my (lambda) function could access the specific element. You’d think that it would be easily configurable, but it actually isn’t. You have to take care of stuff like this on the application level.

Let me try to explain this with a (possibly sci-fi) scenario: You have a hosting endpoint which you don’t want to reveal. Only your cloud function should be able to access it. So you have to write a piece of code which takes care of all this. For example, it could only allow specific domains or IPs, or users with a special token. The first problem is that you have to take care of this yourself, which means that these services are not as secure as you were told. The second is that the endpoint is still reachable by everyone on earth. Which leads us to rate limiting.
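The “piece of code” from the scenario above could look roughly like the sketch below. It is only an illustration under assumptions: the header name X-Internal-Token, the token value, the IP address and the function name is_request_allowed are all made up, and the check requires both a known IP and a valid token, which is stricter than the “or” in the scenario.

```python
import hmac

# Hypothetical shared secret; in a real deployment it would come from configuration.
EXPECTED_TOKEN = "replace-me-with-a-long-random-value"
# Hypothetical address of the one cloud function that is allowed to call us.
ALLOWED_IPS = {"10.0.0.5"}


def is_request_allowed(client_ip: str, headers: dict) -> bool:
    """Allow a request only from a known IP that also presents the shared token."""
    if client_ip not in ALLOWED_IPS:
        return False
    presented = headers.get("X-Internal-Token", "")
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(presented.encode("utf-8"), EXPECTED_TOKEN.encode("utf-8"))
```

Note that this only rejects unwanted requests after they have already reached your code, which is exactly the second problem described above.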

Rate limiting

This seems unbelievable and I’m not sure if AWS Lambdas have the same issue, so let’s talk Firebase. There is absolutely no way to specify a rate limit on your cloud functions. What does this mean? Let’s say you have an HTTP endpoint which prints “Hello world” to the browser. I, as a “hacker”, can get this URL from the developer tools in my browser (or some other way, it doesn’t matter). I can now call your function 1000x every hour, every minute or every second. And there is absolutely no way for you to limit the request rate from my IP address.

So in the end, you will have to implement some rate limiter yourself to avoid this. For example, you can create your own server, implement some functionality which limits every IP to a maximum of 5 requests per second, and then call this server from your app. This server will then call your cloud function only if the caller hasn’t exceeded the limit.
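A minimal sketch of such a limiter, assuming a single server process that keeps its counters in memory. The class name PerIpRateLimiter and the sliding-window approach are my illustrative choices; with several servers you would need a shared store instead of a local dictionary.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class PerIpRateLimiter:
    """Sliding-window limiter: at most `max_requests` per `window_seconds` per IP."""

    def __init__(self, max_requests: int = 5, window_seconds: float = 1.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # ip -> timestamps of recently accepted requests, oldest first
        self._hits = defaultdict(deque)

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        """Return True if the request may be forwarded to the cloud function."""
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Your server would call allow() with the caller’s IP for every incoming request and forward the request to the cloud function only when it returns True. The optional now parameter exists only to make the behaviour easy to test.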

And believe you me when I tell you that you will absolutely have to do this at some point. Let’s talk about why – pricing.

Pricing

So you created your function on Firebase and there is no query limit available to you – it can be triggered by anyone, as often as they like. Current pricing on Firebase Functions goes like this: $0.40/million invocations.

Let’s say that 10 of us (me and 9 of my friends) try to attack your endpoint. Consider this a very, very light attack. Each of us makes 1000 requests towards your endpoint every minute for the whole month.

10 attackers x 1000 requests per minute x 60 minutes x 24 hours x 30 days = 432 000 000 function invocations

Our small experiment costs you 432 x $0.40 = $172.80, just like that. You can’t really call this a DDoS, it’s just a very small attack. Now imagine 2000 computers all around the world, each making 10x as many requests every minute.
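If you want to play with these numbers yourself, the calculation is easy to script. This is only a back-of-the-envelope sketch: it uses the $0.40 per million invocations quoted above and ignores any free tier, compute time and networking charges. The function names are mine.

```python
PRICE_PER_MILLION_USD = 0.40  # invocation price quoted above


def monthly_invocations(clients: int, requests_per_minute: int, days: int = 30) -> int:
    """Total invocations when every client sends requests_per_minute, all month long."""
    return clients * requests_per_minute * 60 * 24 * days


def invocation_cost_usd(invocations: int) -> float:
    """Cost of the invocations alone, at the per-million price above."""
    return invocations / 1_000_000 * PRICE_PER_MILLION_USD


# The "very light attack": 10 clients, 1000 requests per minute each
light = monthly_invocations(clients=10, requests_per_minute=1000)
# The bigger scenario: 2000 clients, 10x as many requests per minute each
heavy = monthly_invocations(clients=2000, requests_per_minute=10_000)

print(light, round(invocation_cost_usd(light), 2))
print(heavy, round(invocation_cost_usd(heavy), 2))
```

The first scenario reproduces the 432 000 000 invocations and $172.80 from above. The second one comes out at 864 000 000 000 invocations, which is $345,600 for a single month of invocations alone.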

One of the articles I linked above says:

Since the campaign was released, and for the next 48 hours, we had use lot of resources of Firestore, our billing came up to $35,000 USD!!! We did more than 46 BILLION requests to Firestore. Yes, billion with a B.

That’s another thing to talk about. When you depend on cloud functions, you have to be extra-f***ing-careful about what you do. A small mistake in code can easily cost you a lot of money.

The thing is that if you had your own set of servers behind a load balancer, you’d still pay the same amount no matter how many requests came in. Two load balancers and three app servers at $10 apiece add up to $50 a month and there is no way to exceed it. What’s the worst thing that can happen? Your servers will crash under unbearable load.

When it comes to pricing, pay-as-you-go plans are incredibly risky. Unfortunately, that’s what you get with a serverless architecture.

Documentation

Don’t you just hate it when there is something that you don’t know and you can’t find documentation for it? A few days back I was trying to make Firebase Authentication work in my React app, just for fun. I was looking for a tutorial on how to make this work properly. Google’s documentation was extremely brief – there is no step-by-step introductory tutorial or anything like that. They throw a bunch of code snippets in your face and all you can do is follow a method called “fail-and-try-again”.

Roughly a year ago I was using a framework called Django to develop a Python application. Boy, did they have extensive documentation. You can easily spend a day reading all about it, including step-by-step articles about how to do stuff. Django is an open-source framework, while Firebase is a paid service. You’d expect Firebase docs to be even more thorough than Django’s, but they aren’t. So instead you have to look for unofficial sources of documentation and hope you will find something.

Conclusion

Seems like I was very negative towards serverless architecture in this article, so let me remind you again: This article is here to show you the disadvantages of this approach, not to tell you “DON’T DO IT”.

I can imagine that you had high hopes for one of the mentioned platforms and I ruined them all. In that case I’m sorry.

I hope I helped at least a few of you make your mind up.

 

Vladimir Marton

DevOps Engineer focused on cloud infrastructure, automation, CI/CD and programming in JavaScript, Python, PHP and SQL. Guidearea is my oldest project where I write articles about programming, marketing, SEO and more.