How we came to replace Amplitude and Mixpanel with our own serverless analytics stack

Disclaimer

Amplitude, Mixpanel and even Google Analytics are wonderful tools. If you are an early-stage startup looking to build your analytics stack, this is a no-brainer: set up these tools and forget about the question for a few years.

Why put so much effort into analytics?

Analytics is a light in the night.

This is especially true for an application used by hundreds of thousands of people in almost as many different ways. It allows you to know what works and what does not, scientifically. No guessing, no intuition. Proof, data, science.


For us, analytics has to serve a variety of goals, from detecting when our extension is no longer able to apply coupons on a website and needs maintenance, to detecting when a coupon shared by a user has just been disabled by the merchant.
We use this superpower to build a better service and to know what to build next.

Our historical analytics stack and data mindset

At first, Google Analytics

Traditional website businesses are used to using Google Analytics. For free, you can answer questions like:

How many users went to my website today, and from where?

or

How many of them just clicked on an ad on Google/Facebook without spending a cent on my website?


This kind of analytics solution can show you macroscopic events that are perfectly fine in a website context (page loads, visits, visitors) but truly insufficient in an app context. At Wanteeed, we want to know how our users are using the extension and whether they clicked on the button when we offered to automatically test and apply the best coupon in a few seconds (if not, well, our users no longer care about our value proposition, and this is the beginning of the end, my friend :D ). We also want to know how long it takes us to complete a successful coupon test and how much discount we apply for our users.

The Mixpanel revolution

All these micro-actions can't be easily tracked in analytics solutions specifically crafted for a website context. Where Google Analytics shows you how many users have loaded a coupon page, we want to know which coupon every single user clicked on that page and whether they interacted with other page or app elements. Sure, we know that you can now send events to GA. We tried it and we're not comfortable with it: we feel too limited.

That's why, from the very early days of Wanteeed, we introduced Mixpanel into our website and extensions. We had a good time with it for almost 2 years, contemplating our first thousands of users in various wonderful graphs like Active Users, Retention and Lifetime Value (LTV).

Oh my god, we have users and it's not my mum

The king is dead. Long live the (Amplitude) King !

Pricing considerations are unavoidable in an early-stage startup. We chose to switch to Amplitude because of their very aggressive pricing, which offered us up to 10M events a month for free! That's crazy when you think about it! We truly think we were lucky to use this incredible tool for free for so long. Thanks for this, Amplitude. It was a good strategy on their side, as we became a paying customer 1 year later with a five-figure bill, and we're still here 3 years later with ~25M events a month (the maximum allowed by our plan).

But here is the thing: ~25M events is becoming too few for us, and we're making an effort not to send low-value events in order to stay within this limit. Plus, we occasionally want to access raw events to build our own custom analyses that are not available in the Amplitude UI. This is possible, but it comes with a big increase in the bill! Also, we sometimes had trouble in production where events were sent with incorrect data or as duplicates. This is a nightmare, as it screws up all your graphs and KPIs! Imagine how sending incorrect revenue events could affect your LTV graph.

Amplitude's suggestion here is to download all your data (they have a pretty good API for that), process it on your own to clean out the incorrect events, and then resubmit the whole data history into a new Amplitude project. Not to mention the price of doing that (every resubmitted event counts as a new monthly event), we're talking about hundreds of millions of events now, and billions in a few months. We shouldn't have to do that.
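To make the clean-up step above more concrete, here is a minimal sketch of what "process it on your own" could look like once the raw events are downloaded. It is only an illustration: the `event_id` and `revenue` field names, and the rule that a negative revenue is "incorrect", are assumptions made for this example, not Amplitude's export schema or our actual code.

```python
from typing import Iterable


def clean_events(events: Iterable[dict]) -> list[dict]:
    """Drop duplicated events and events with obviously wrong revenue.

    Assumes each event is a dict with a hypothetical unique "event_id"
    and an optional numeric "revenue" field.
    """
    seen_ids = set()
    cleaned = []
    for event in events:
        event_id = event.get("event_id")
        # Without an identifier we cannot detect duplicates, so skip it.
        if event_id is None or event_id in seen_ids:
            continue
        revenue = event.get("revenue")
        # Hypothetical sanity rule: a revenue can never be negative.
        if revenue is not None and revenue < 0:
            continue
        seen_ids.add(event_id)
        cleaned.append(event)
    return cleaned


raw_events = [
    {"event_id": "a", "revenue": 5.0},
    {"event_id": "a", "revenue": 5.0},   # duplicate
    {"event_id": "b", "revenue": -1.0},  # incorrect revenue
    {"event_id": "c"},                   # no revenue, still valid
]
print(len(clean_events(raw_events)), "events kept out of", len(raw_events))
```

The cleaning itself is cheap. The painful part is the resubmission, since every cleaned event counts again towards the monthly quota.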


Amplitude, we need to talk.

So, the best players in town don’t suit our needs? Houston, we’ve got a problem

The first option was to try another fully managed analytics solution. But we reached a no-go very early in the thinking process. The pricing of these tools simply doesn't fit our business.

Why can't we afford these tools?

Wanteeed is a complicated business from a tech point of view. Think about companies where you can use SaaS products like Segment, Customer.io, Intercom.com and even Mixpanel (we never paid for it in the early days). Think about how fast and easily you can implement data and marketing automation solutions. Well, you can stop dreaming now, because you work at Wanteeed and we're a B2C business with a pretty low LTV compared to a SaaS business.

These kinds of solutions have pricing that doesn't suit our use cases.


For example, Segment pricing is based on how many MTUs (monthly tracked users) you need. In a SaaS business where a single user can bring you hundreds of euros a year, every data point has a value of that magnitude. In contrast, when a user brings you only a few euros a year, you can't afford a solution with per-user pricing. That's why this kind of product, which would save us months of work, will never be part of the Wanteeed journey, and why we have to actively reinvent the wheel, in a cost-effective way, specifically tailored to our needs and nothing more. I could also argue that the differences between an extension business and a regular website business are another major reason why we need to reinvent the wheel from a tech point of view, but that will be part of a future blog post.
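The arithmetic behind this argument can be sketched in a few lines. All the numbers below are made up for illustration: the price per tracked user is hypothetical and is not Segment's actual pricing, and the two revenue figures only stand in for "hundreds of euros a year" and "a few euros a year".

```python
def tooling_cost_share(annual_revenue_per_user: float,
                       monthly_price_per_tracked_user: float) -> float:
    """Fraction of a user's yearly revenue spent on a per-user priced tool."""
    if annual_revenue_per_user <= 0:
        raise ValueError("annual_revenue_per_user must be positive")
    yearly_tool_cost = 12 * monthly_price_per_tracked_user
    return yearly_tool_cost / annual_revenue_per_user


# Hypothetical price, in euros per tracked user per month.
HYPOTHETICAL_PRICE = 0.05

saas_share = tooling_cost_share(300.0, HYPOTHETICAL_PRICE)  # "hundreds of euros a year"
b2c_share = tooling_cost_share(3.0, HYPOTHETICAL_PRICE)     # "a few euros a year"

print(f"SaaS: {saas_share:.2%} of revenue, B2C: {b2c_share:.2%} of revenue")
```

Whatever the real price is, the ratio stays the same: with a revenue per user 100 times lower, the same per-user tool eats 100 times more of that revenue.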

Trying again to avoid the hard way

Another possibility was to build half of the analytics stack ourselves and rely on other people to build and maintain the complex parts of it. There are plenty of options out there that do the hard work for you. These solutions, called data pipeline as a service (Fivetran, StitchData), offer to collect your data wherever it is, shape it into a consistent format and store it in the location of your choice. That leaves you with the sole mission of consuming the aggregated and formatted data with any BI / DataViz tool.

Pretty cool! But... damn expensive too.

Back to the beginning: insert coin and play again.


Did I already mention that we have never wanted to raise a single euro in venture capital so far? That's another constraint that definitely obliges us to carefully control our costs to stay profitable.

Denying the evidence first, then gathering forces and diving in

The time you spend building something you could pay others to provide is time you don't spend building your product and making your users happy. Almost every time, it is badly spent time.


As this consideration is so important to us and has been an everyday part of our journey since the start of the company, we first tried to avoid going the hard way.

But after weeks of thinking and analyzing different options, we finally came to an exciting but scary conviction:

We needed to build our own analytics pipeline, own our data end to end and control the related costs.



In the following posts, we'll discuss the tech parts, the challenges we faced and how we overcame them.

  • Part 2 | What it really means to build your own data pipeline : the state of the art
  • Part 3 | ELT analytic pipeline : Extracting the data
  • Part 4 | ELT analytic pipeline : Loading the data
  • Part 5 | ELT analytic pipeline : Transforming the data
  • Part 6 | Put your data to work : DataViz with Redash
  • Part 7 | Analytic pipeline : Future and various learnings


Be sure to subscribe to our list to be notified when we release the next posts.

We're hiring, come say hi !

We are looking for talented people to join our team.
Send your resume with a few words here => Jobs at Wanteeed

Want to know what it’s like to be a software engineer at Wanteeed first? Read this