3 Habits of a Highly Effective Growth Engineering Team

There are a lot of great articles about how to set up your Growth team for success [1] [2]. At product-driven companies like Airbnb, Pinterest, Uber, and Facebook, a significant part of the Growth team is made up of engineers. Building a great Growth Engineering team is a crucial part of building a great Growth team. I’ve been working on or managing Growth Engineering teams for over 5 years and have seen Growth teams as small as 2 people and Growth orgs larger than 50 people. In this post, I’ll cover three keys to setting your Growth Engineering org up for maximum impact.


Full Stack Staffing

One of the critical traits of an effective Growth team is execution velocity. Execution velocity is important because Growth requires a lot of rapid iteration to learn and explore new ideas. In order to maximize execution velocity, you need to staff the team with all the engineers you need to work on all areas of growth. This means backend engineers, iOS engineers, Android engineers, machine learning engineers, etc. Depending on another team that has different priorities can kill the Growth team’s velocity. It often means you need to constantly negotiate to get things on their roadmap, and once you learn from an experiment and come up with follow-up iterations, you need to go through the whole negotiation process again. This process can cripple Growth’s ability to quickly iterate and learn.


From my own experience, I’ve seen what a difference a full stack team can make. At Pinterest we rely heavily on recommendations in emails. These recommendations used to be powered by another team that (rightly so) could never prioritize significant investment in the email pipelines over their other priorities. The Growth team decided to hire our own machine learning engineers to start working on the email recommendations ourselves, and within six months the team was able to launch several improvements to email recommendations that were the team’s biggest growth drivers of the year.


Once you’ve gotten headcount, you then need to hire the engineers. I’ve written before about hiring Growth engineers. Since Growth is still relatively new, there aren’t a lot of engineers who have worked specifically on Growth before, so I look for engineers who would be a good fit for Growth. Some engineers are motivated by working on hard technical problems, some by crafting the pixel-perfect product experience, but the engineers that work best on the Growth team are engineers who are motivated by making a business impact. The engineers motivated by business impact tend to be the ones most engaged with their projects and the ones coming up with ideas to further increase their project’s impact (it also helps if they have some product intuition). Which leads into my next tenet, building a culture of ownership.


Culture Of Ownership

Once you have the team staffed up, it is important to build the culture of the team. For Growth Engineering teams, building a strong culture of ownership around the projects that the engineers are working on leads to the best results in the long run. I’m a firm believer that an engineer who spends days or weeks working on a particular project is in a significantly better position to come up with ideas for how to further increase the impact of the project than the Product Manager or Engineering Manager who can only spend a fraction of their time thinking about the project.


On my team, I try to instill a sense of ownership where engineers act as a mini-PM for their projects. Engineers are responsible for their experiments beginning to end, starting from writing the doc about why we are running the experiment, implementing it, doing the final analysis and finally, making the recommendation to ship or not. They are also responsible for coming up with ideas to further increase the impact of their project beyond what was originally scoped. They are empowered to propose and run experiments on what they think will further increase its impact. I’ve seen time and again that setting the expectation that engineers are responsible for figuring out how to increase the impact of their projects and giving them the autonomy to try things out has increased the impact of the original project by 50-100% in some cases. Our team helps reinforce the culture of ownership and maximizing results by recognizing and celebrating wins. We make a point to call out when engineers went above and beyond on a project to make it successful.


Quality and Stability

Finally, there is a lot of pressure to move fast on the Growth team. However, when things break, that can be a problem if it hurts topline Growth or if it happens so often the Growth team is constantly in firefighting mode. The Growth team can be responsible for over a dozen major features including signups, logins, NUX (new user experiences), emails, push notifications, invites, app upsells and SEO. If you multiply these features by each platform, like mobile web, desktop web, iOS, and Android, the number of opportunities for something to break compounds, especially when you have hundreds of engineers committing new code every day. A minor outage on something like your signup flow can end up costing you hundreds of thousands of users when you’re growing at scale.


In addition to working on experiments to increase Growth, it is imperative that the Growth team works to build a culture around quality and stability. This means ensuring proper unit testing, alerting, and extensive monitoring are in place to protect existing growth. For instance, alerts should be set up to catch a sudden drop in signups, and an on-call rotation should then be able to quickly deal with it. The on-call person needs extensive stats and logs to be able to quickly pinpoint the issue and diagnose the fix. Finally, a post-mortem process should be in place to review how to prevent future outages.
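
As a rough illustration of the alerting idea above, the sketch below flags a sudden drop by comparing the latest hourly signup count against the average of the preceding hours. The function name, the 50% threshold, and the sample counts are all hypothetical; a real system would also need to account for daily and weekly seasonality and would page the on-call engineer rather than return a boolean.

```python
from statistics import mean


def signups_dropped(hourly_signups, drop_threshold=0.5):
    """Return True if the latest hour is far below the trailing average.

    hourly_signups: list of counts, oldest first; the last entry is the
    hour being checked. drop_threshold is an assumed cutoff: 0.5 means
    "alert if signups fall below 50% of the trailing average".
    """
    if len(hourly_signups) < 2:
        return False  # not enough history to form a baseline
    *history, latest = hourly_signups
    baseline = mean(history)
    if baseline == 0:
        return False  # nothing to compare against
    return latest < baseline * drop_threshold


# Made-up numbers for illustration only.
normal = [1000, 980, 1020, 990]
outage = [1000, 980, 1020, 300]
print(signups_dropped(normal))  # False
print(signups_dropped(outage))  # True
```

A fixed ratio against a short trailing window is the simplest possible rule; it is shown here only to make the "sudden drop" check concrete.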


Wrap Up

Over the years, I’ve seen first-hand how each of these attributes has helped amplify the impact of the Growth Engineering team. Full stack staffing allows the Growth team to execute with high velocity. At Pinterest the Growth team has taken on areas not normally associated with Growth, such as content recommendation pipelines, in order to move faster. Building a culture of ownership means engineers own the experiments they work on and are actively encouraged to come up with new ideas to improve the impact of their projects. Finally, a focus on quality and stability helps ensure Growth isn’t constantly fighting fires and existing growth is protected.

The Wrong Way To Analyze Experiments

One of the biggest mistakes I see Growth teams make when it comes to analyzing experiments is focusing too much on percentage gains. I’ve seen it time and time again: Growth teams pat themselves on the back for a 30% increase in this metric or a 15% increase in that metric. In this post, I’ll dissect the problem with percentage gains and explain why Growth teams should avoid using them when it comes to assessing the overall impact of an experiment.


The Problems

The first problem with reporting percentage gains is that every experiment has an audience bias. By audience bias, I mean that the characteristics of the users in the experiment (demographics, engagement levels, etc.) can differ significantly from one experiment to another. For instance, let’s say I run an experiment where I send users a notification when someone likes their post. That experiment is naturally going to have an audience biased heavily towards active users that are actively posting content. You need to be posting content in order to be eligible to receive a notification about someone liking your post. In that hypothetical experiment, I might see only a small increase (e.g., 1%) in daily active users (DAUs) because many people in the audience might already be DAUs. However, if I run another experiment to send an email to re-engage dormant users, I might see a 150% increase in DAUs because so few of the users in the experiment would become a DAU organically.

The second issue with percentage gains is that they obscure the true business impact. Going back to the two examples above, while one experiment had a 1% gain and the other had a 150% gain, we actually have no way of telling which experiment was more impactful for the business. The critical piece of data that is missing is the baseline population that the percentage gain applies to. The 1% gain could be on a population with 10 million DAUs, which means the experiment netted us 100,000 incremental DAUs. Conversely, the 150% gain might be on a population of 10 million dormant users of which 20,000 were coming back organically, which means that experiment netted us only 30,000 incremental DAUs (20,000 growing to 50,000).
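
To make the arithmetic explicit, here is a small sketch that converts a percentage gain into incremental users using the hypothetical baselines from the two examples. The helper name is made up for illustration. Note that a 150% lift on 20,000 organic returners adds 30,000 users, for 50,000 in total.

```python
def incremental_users(baseline_actives, percent_gain):
    """Absolute users added by a lift of percent_gain over baseline_actives."""
    return round(baseline_actives * percent_gain / 100)


# Like-notification experiment: 1% lift on 10 million existing DAUs.
like_notification = incremental_users(10_000_000, 1)

# Re-engagement email: 150% lift on the 20,000 dormant users
# who would have come back organically.
dormant_email = incremental_users(20_000, 150)

print(like_notification)  # 100000
print(dormant_email)      # 30000
```

The experiment with the far smaller percentage gain delivers more than three times the absolute impact.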


An Example

When I first started on the Growth team at Shopkick, we made the mistake of looking exclusively at percentage gains. During the first several months, the Growth team shipped a lot of experiments that had great percentage gains (70% increase in signups coming from Facebook posts, 50% increase in in-store visits, etc.). About 5 months after forming the Growth team, we realized we weren’t actually seeing our topline metrics move. Going back over the data, we realized that a lot of those big percentage gains we posted in earlier experiments weren’t actually impacting the bottom line. For instance, the experiment that delivered a 70% increase in signups was in reality only adding an incremental 20 signups a day. Looking at percentage gains had been blinding us, and we needed to focus on absolute numbers if we wanted to gauge the true impact.


Not All Bad

So, are percentage gains completely worthless? While I just spent most of the post railing against percentage gains, there are cases where they can be helpful to look at. For instance, percentage gains can be used to gauge the effectiveness of an experiment by judging the percentage increase in one metric relative to another. If you send 10% more email to dormant users and that results in a 20% gain in DAUs, that’s a good indicator that users really like the email. The inverse relationship, where the gain in DAUs is smaller than the increase in email volume, might indicate the email isn’t that great and you’re just making up for it in volume. Just don’t use those percentage gains when reporting the experiment’s impact.


Wrap Up

About a year and a half ago, at Pinterest we shifted to using absolute numbers on the Growth team when reporting results. It has helped us compare and measure the true business impact of experiments that range across many different surface areas of the product and many different segments of users. Absolute numbers have also helped us more concretely measure a team’s output, by summing up the absolute impact of all the experiments they shipped that quarter. In Growth, the number 1 priority is driving impact, and if you’re not using absolute numbers to measure your output, then you can’t be sure you’re doing that.

How Pinterest, Dropbox, and Yelp Drive Growth @ Scale

We recently held a meetup at Pinterest’s offices to discuss how some of the most successful companies drive growth at scale. We had a great turnout to hear speakers from Dropbox, Pinterest, and Yelp.

Video Link: https://youtu.be/-1ABjPs6da4

Our next meetup is September 28th, 2016 at Lyft’s HQ in San Francisco. We will have speakers from Lyft, Pinterest, Airbnb, and Yik Yak. Please RSVP here: http://www.meetup.com/Growth-Engineering/events/234131759/


Synopsis of the Talks

Using Machine Learning to Drive Growth at Pinterest

At Pinterest, we send different types of recommendation notifications to our users in order to help them discover new pins and boards. In this talk, we present how we use machine learning techniques to decide which notification types to send and how often based on the user’s historical engagement. Using these models we are able to deliver a personalized notification experience to our individual users.

Utku is a Software Engineer on the Growth team and focuses on engagement modeling. He previously worked at Linkedin as a Staff Software Engineer.


Building a Growth Team at Yelp

Over the last year and a half, Yelp’s Growth team was built from scratch. We’ll cover how we approached it, what went well, and lessons learned. We’ll walk through a few of our experiments as case studies, and provide some guidance on how we decided what to focus the Growth team on.

Alex is an Engineering Manager on the Growth team. Prior to the Growth team, Alex was an iOS and web engineer at Yelp.


Growth Platform at Dropbox

In this talk, we cover the platform that we’re building at Dropbox to scale communication with our half a billion users across several different channels such as email, mobile/desktop notifications and in-app surfaces for onboarding, engagement and monetization.

Ruchi is an Engineering Manager on the Growth Platform team at Dropbox. She has been at Dropbox for a little more than a year. She previously worked at Hearsay Social, an enterprise SaaS startup, as the tech lead for their social media based lead generation platform.

Aditi is a Product Manager on the Growth Platform team at Dropbox. She has been at Dropbox for 3 years and has worked on diverse product experiences across sharing, user identity, growth, and monetization.

How Pinterest increased MAUs with one simple trick

For many areas of growth, presenting your message with the right hook to pique a user’s interest and to get them to engage is critical. Copy is especially important in areas such as landing pages, email subject lines or blog post titles, where users make split second decisions on whether or not to engage with the content based on a short phrase. Companies like BuzzFeed have built multi-billion dollar businesses in part by getting this phrasing down to a science and doing it more effectively than their competitors.

At Pinterest, we knew copy testing could be impactful, but we weren’t regularly running copy experiments because they were tedious to set up and analyze in our existing systems. This made it difficult to do the type of iteration necessary to optimize a piece of copy. Last year, however, we built a framework called Copytune to help address these issues. The framework has helped us optimize copy across numerous languages and significantly boost MAUs (Monthly Active Users). In this post, we’ll cover how we built Copytune, the strategy we’ve found most effective for optimizing copy and some important lessons we learned along the way.


Building Copytune

When we decided to build Copytune, we had a few goals in mind:

1)   Optimize copy on a per-language basis by running an independent experiment for each language. What performs best in English won’t necessarily perform the best in German.

2)   Make copy experiments easy to set up, and eliminate the need to change code in order to set up an experiment.

3)   Have copy experiments auto-resolve themselves. When you’re running 30+ independent experiments (one experiment for each language), each with 15 different variants, it becomes too much analysis overhead to have a human go in and pick the winning variant for each language.

Copytune dashboard showing different winners among languages

To achieve these goals, we built a framework that mimicked the API for Tower, the translation library that every string passes through. We first had every string pass through Copytune, which would check the database to see if there was an experiment set up for that string. If so, it would return one of the variants. If the string was not in an experiment, Copytune would then pass the string to Tower to get the correct translation of the string. A nightly job would then compile statistics on all the copy experiments and would automatically shut down experiments when there was enough data to declare the winner.
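
The lookup flow described above could be sketched roughly as follows. This is not Copytune’s actual code: the in-memory EXPERIMENTS table, the tower_translate placeholder, the copytune_get name and the hash-based variant assignment are all assumptions made for illustration.

```python
import hashlib

# Hypothetical in-memory stand-in for the experiment database,
# keyed by (language, original string).
EXPERIMENTS = {
    ("en", "We found a few Pins you might like"): [
        "We found a few Pins you might like",
        "The latest Pins in Home Decor",
    ],
}


def tower_translate(string, language):
    """Placeholder for the translation library; returns the string unchanged."""
    return string


def copytune_get(string, language, user_id):
    """Return an experiment variant if one is set up, else the translation."""
    variants = EXPERIMENTS.get((language, string))
    if not variants:
        # No experiment for this string and language: fall through to translation.
        return tower_translate(string, language)
    # Assumed assignment rule: hash the user and string so that the same
    # user always sees the same variant.
    key = f"{user_id}:{language}:{string}".encode()
    digest = hashlib.sha256(key).hexdigest()
    return variants[int(digest, 16) % len(variants)]


print(copytune_get("We found a few Pins you might like", "en", user_id=42))
print(copytune_get("Welcome back", "en", user_id=42))
```

Because the wrapper exposes the same call shape as the translation function, call sites would not need to change when an experiment is created, which is the point of mimicking the Tower API.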

Copy optimization strategy

Testing copy requires an iterative process to achieve the best results. It’s almost impossible to identify the ‘best’ copy in one go, so we took an incremental approach to discover it.

  1. Explore Phase: You can’t know for sure what will work, so we started by testing many variants that touch on very different themes, tones, etc. We typically brainstorm 15 – 20 different variants. For example:

    • The latest Pins in Home Decor
    • Come see the top Pins in Home Decor for 12/3/2015
    • We found a few Pins you might like

  2. Refine Phase: After the Explore Phase, we began to see which tones and phrasing were performing best. Then we could refine by testing different components of the winning variants of the Explore Phase.

Let’s say that in Explore Phase, the winner was “We found some {pin_keyword} and {pin_topic} Pins and boards for you!”. There are many possible optimizations we can test in this example.

Example of component variations

We can try adding “Hey Emma!” at the beginning to catch the Pinner’s attention. We can even test whether “Hi Emma!” or just “Emma!” is better than “Hey Emma!”. We can test some phrases like “we found” vs. “we picked.” We can test if “Pins and boards” is better than just having “Pins” or “Boards.” In this example there are at least 10 components we can test. We treat them as independent components and test each of them against the winner.

  3. Combine Phase: Let’s say “Hi Emma!”, “we picked” and “{pin_topic}” were winners in the Refine Phase. We can now test if the combination of Refine Phase winners (a) performs better than the original winner (b):

    (a) “Hi Emma! We picked some {pin_topic1} and {pin_topic2} Pins and boards for you!”

    (b) “We found some {pin_keyword} and {pin_topic} Pins and boards for you!”

Note that it’s possible that some components are not independent, so we also tested other combinations that seem promising.

In one of our highest volume emails, the winning variant from the Explore Phase showed only a one percent gain in open rate. By the end of the whole iteration, optimizing the subject line on that one email boosted it to an 11 percent gain, adding hundreds of thousands more active Pinners each week.


Lessons learned

Copytune has been in place for almost a year now, and we’ve learned some lessons along the way:

Defining Success: When we initially started testing email subject lines, we defined the success criterion as driving an email open. This seemed to be the most straightforward since the Pinner reads the email subject line and the next action is to either open it or not. What we found, however, was that defining success with metrics that were further downstream (e.g., clicking on the content in the email) was more effective. Some subject lines were great at getting opens, but there was a mismatch between the expectations set by the subject line and the actual content in the email, so on net they were actually resulting in fewer clicks.

Picking Variants: The original vision for Copytune was to use a multi-armed bandit framework for picking variants and auto-resolving experiments. The difficulty we ran into was that feature owners wanted to see how the experiment performed across a variety of metrics and to be able to report concrete MAU gains from the experiment. To accommodate these needs, we ultimately needed to integrate Copytune with our internal A/B testing framework.
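
To make auto-resolving concrete without a bandit, one common fixed-sample approach is a two-proportion z-test of each variant against control, using clicks as the success metric. The post does not say which statistical rule Copytune or the internal A/B framework applies, so the test itself, the 0.05 threshold, the function names and the counts below are all assumptions.

```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - cdf)


def pick_winner(control, variants, alpha=0.05):
    """Return the best significant variant name, or None to keep running.

    control: (clicks, sends); variants: {name: (clicks, sends)}.
    """
    best_name, best_rate = None, control[0] / control[1]
    for name, (clicks, sends) in variants.items():
        rate = clicks / sends
        p = two_proportion_p_value(control[0], control[1], clicks, sends)
        if rate > best_rate and p < alpha:
            best_name, best_rate = name, rate
    return best_name


# Made-up counts for illustration only.
print(pick_winner((1000, 10000), {"a": (1010, 10000), "b": (1200, 10000)}))  # b
print(pick_winner((1000, 10000), {"a": (1010, 10000)}))  # None
```

With 15 variants per language, a production rule would also need a correction for multiple comparisons and a minimum sample size; both are omitted here to keep the sketch short.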


Acknowledgements: Koichiro Narita for co-writing this post, helping develop Copytune, and running the subject line experiments covered in this post. Devin Finzer and Sangmin Shin for helping develop Copytune.

This post was originally published on the Pinterest Engineering Blog

4 Steps To Develop Your Push Notification Strategy

Startups often struggle with how to develop their push notification strategy. While email has been around for decades and is fairly mature, push has only been around a few years and people are still trying to get a handle on it. In this post, I’ll cover the basics of how to develop a messaging strategy that applies to both push and email and how to take advantage of some of the unique aspects of push.

Step 1: Define the product’s core value proposition

Push notifications should be an extension of the product’s core value proposition. I can’t emphasize this enough. One of the biggest mistakes I see startups make (and I’ve made myself) is that they send emails/notifications about things that are not strongly tied to the value proposition. The value proposition is the reason people engage with the product and is what sets your product apart. Push notifications should further that engagement and make it easier for users to derive that core value. For users who don’t yet “get” the product, push should help them understand the value. For users who do “get” the product, push notifications should help them engage further with it.

Step 2: Figure out what you can send that is tied to that core value proposition

Content generally falls into one of three broad buckets, each with their own pros and cons. The type of content you send depends both on what makes sense with the product and what your resource constraints are.

Marketing Driven: These are notification blasts sent out by the marketing team to most or all users. A lot of ecommerce and brick and mortar retailers fall into this bucket.


Pros:
  • Coverage: Can send to every single user
  • Minimal engineering effort required, which makes it great for early stage startups

Cons:
  • Content is not personalized, which leads to low engagement rates
  • Users have lower tolerance to these types of notifications, which means you have to use them sparingly to avoid high unsubscribe and app deletion rates

Transactional: These notifications are triggered by users’ actions on the service. They inform other users about those actions. Facebook and LinkedIn are great examples of this.


Pros:
  • Generally good engagement since content is relevant by virtue of the fact that the user has a direct connection to the action
  • Higher level of tolerance since users understand what is triggering the notifications

Cons:
  • Need to have enough engagement on the site, or be connected to enough users, to get the flywheel going


Content Driven: Content driven notifications connect users with relevant and interesting content. They generally use some amount of personalization to figure out which content to recommend. Twitter, for example, will send emails/notifications to less engaged users about popular tweets they think the user will be interested in.


Pros:
  • Can get good engagement rates by sending highly personalized, relevant content.
  • Can get good coverage by sending trending and popular content to users for which you don’t have enough signal to personalize recommendations.

Cons:
  • Engagement rates get worse the less the user has engaged with the site
  • Expensive to build out recommendation algorithms from an engineering effort perspective

Step 3: Figure out your user segments

Once you’ve figured out what content to send, you then need to figure out who to send it to. Not all notifications are good for all users. Notifications should be targeted based on where the user is in their lifecycle. A very simple but powerful segmentation is classifying users into new, engaged, and unengaged.

  • New Users – Send notifications that help reinforce the product’s value and help them figure out how to get more value out of the app.
  • Engaged Users – These users are already engaged and understand the product, so only send them the best, most useful notifications that help them engage even further.
  • Unengaged Users – Unengaged users are always the toughest nut to crack since they have already shown a bias towards not engaging with the product. The signals you have on them may or may not be accurate so sending a mix of personalized notifications and broader non-personalized notifications is necessary to try and re-engage them.
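
A minimal sketch of this three-way segmentation is shown below. The 7-day "new user" window and the 28-day inactivity cutoff are assumed values, not thresholds from this post; each product would pick its own.

```python
from datetime import date, timedelta


def lifecycle_segment(signup_date, last_active_date, today,
                      new_user_days=7, unengaged_days=28):
    """Classify a user as 'new', 'engaged', or 'unengaged'.

    new_user_days and unengaged_days are illustrative defaults only.
    last_active_date may be None for users who never came back.
    """
    if today - signup_date <= timedelta(days=new_user_days):
        return "new"
    if last_active_date is None:
        return "unengaged"
    if today - last_active_date > timedelta(days=unengaged_days):
        return "unengaged"
    return "engaged"


today = date(2016, 9, 1)
print(lifecycle_segment(date(2016, 8, 30), date(2016, 8, 30), today))  # new
print(lifecycle_segment(date(2016, 1, 1), date(2016, 8, 25), today))   # engaged
print(lifecycle_segment(date(2016, 1, 1), date(2016, 6, 1), today))    # unengaged
```

The segment returned here would then decide which of the notification types from Step 2 a user is eligible for.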

Step 4: Think about what makes push unique

Up to this point, everything we’ve talked about can apply just as much to email as it does to push. However, there are a few things that really differentiate push from email and may change your approach to push.

1)   Timeliness – Since most people have their phones on them at all times, push notifications allow you to reach users more immediately than email.

2)   Location Based – Both iOS and Android have good support for geofenced notifications that allow you to notify the user when they are near a certain latitude and longitude point.

3)   Badging – Badging is a way to give the user an indicator that there is something new in the app in a way that is less intrusive than sending an email or normal push, and still triggers a lot of engagement.

The final step is to ask yourself if there is any way these attributes naturally dovetail with your product’s value proposition. For example, geofencing notifications are a great fit for location based apps, but can feel out of place if location is not a core feature of the app.

Wrap Up

As with anything Growth related, push notifications require a lot of trial and error, iteration, and experimentation. However, I’ve found thinking of push notifications as an extension of the app’s value proposition and then thinking through this framework has helped me a lot when crafting a push notification strategy.

Experiment Segmentation: Avoiding Old Dogs and Watered Down Results

One of the biggest growth bets we placed during my time at Shopkick was on geofenced notifications. Geofenced notifications are location-based alerts users received when they were near one of our partner stores. To drive more in-store visits, the notification would tell users how many reward points were available at the store and remind them to pull out the app. Since the iOS and Android support for geofencing was pretty new at the time, we had to spend a lot of engineering effort building out the feature and fine-tuning it to strike the right balance between accuracy and battery life. We chose to make such a big investment because we believed it could increase store visits by 20%-30%.

When we launched the experiment, however, we were pretty disappointed. The initial experiment results showed only a 3% increase in store visits, which was far less than our expectations. We knew something was wrong because we really believed that geofencing could be a game changer, so we spent the next several weeks on a major effort to debug and try to figure out what the problem was. The team even went as far as building a standalone iOS app for the sole purpose of testing and debugging geofencing and driving all over the Bay Area to do field tests. After all this work, we found a few minor issues but still couldn’t pinpoint any major problems.

Finally, we took a step back and took a second look at our experiment data. This time, however, we chose to isolate our analysis to just new users who had joined in the weeks since the experiment started. It was then that we saw that geofencing had increased store visits by over 20% amongst new users and substantially improved new user activation.

When it comes to experiments aimed at increasing user activity or engagement, it is critical to segment your experiment analysis to get the full and accurate picture of how the experiment is performing. There are two main effects to watch out for:

Old Dogs: We’ve all heard the idiom “you can’t teach an old dog new tricks”. The first effect to watch out for is that existing users have a strong bias towards using the product the way it was before the experiment. This is because they learned how to use the product before the experiment, they experienced enough value to stick around without the experiment existing, and they will most likely continue to use the product in the patterns they developed before the experiment. New users, however, have no preconceived notions, and as far as they know, the experiment has always been a part of the product. Looking at new users can give you valuable insight into the experiment from an unbiased population.

Watered Down Results: The second effect to look out for is that an established userbase of highly active and highly engaged users can dilute the results for experiments aimed at increasing engagement. The reason is that it can be very difficult to take someone who is already hyper-engaged with the product and increase their level of engagement. However, it can be much easier to take a less engaged user or a new user and get them to become more engaged. This effect was illustrated in an experiment I ran at Pinterest. The experiment was to send a new push notification to a group of users. Overall, the experiment showed a 3% lift in WAUs amongst the target population.

Experiment results amongst all users

However, when we segmented our analysis and looked at how the experiment performed amongst less engaged users (users who usually use the app <4 times a month), we saw that it resulted in a lift of 10% in WAUs amongst that particular group.

Experiment results from users who usually use the app <4 times a month

Sure enough, when we looked at how the experiment performed amongst Core users (users who usually use the app multiple times a week), we saw it had no impact on the WAU metric.

Experiment results from users who usually use the app multiple times a week

A/B experimentation on the surface sounds easy. However, experiments rarely affect all users equally and only looking at the macro level results can be misleading. Segmenting experiments by country, gender, the user’s level of engagement (prior to the experiment starting, of course), and being aware of Old Dogs and Watered Down Results is crucial in fully understanding the impact the experiment had. You may even discover certain segments of the userbase that perhaps don’t need the experiment and where the experiment may actually be doing more harm than good.
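
As an illustration of segmented analysis, the sketch below computes the lift in active rate separately for each segment. The row format, function names and counts are hypothetical, chosen to mirror the pattern above (a visible lift among less engaged users and none among core users). The segment label is assumed to have been assigned from behavior before the experiment started.

```python
from collections import defaultdict


def lift_by_segment(rows):
    """Percent lift in active rate (treatment over control) for each segment.

    rows: iterable of (segment, group, is_active), where group is
    'control' or 'treatment'.
    """
    # counts[segment][group] = [active_users, total_users]
    counts = defaultdict(lambda: {"control": [0, 0], "treatment": [0, 0]})
    for segment, group, is_active in rows:
        counts[segment][group][0] += int(is_active)
        counts[segment][group][1] += 1

    lifts = {}
    for segment, groups in counts.items():
        c_active, c_total = groups["control"]
        t_active, t_total = groups["treatment"]
        if c_total == 0 or t_total == 0 or c_active == 0:
            continue  # lift is undefined without a nonzero control rate
        c_rate, t_rate = c_active / c_total, t_active / t_total
        lifts[segment] = (t_rate / c_rate - 1) * 100
    return lifts


def make_rows(segment, group, active, total):
    """Build `total` user rows for a segment and group, `active` of them active."""
    return [(segment, group, i < active) for i in range(total)]


# Made-up data for illustration only.
rows = (
    make_rows("less_engaged", "control", 20, 100)
    + make_rows("less_engaged", "treatment", 22, 100)
    + make_rows("core", "control", 90, 100)
    + make_rows("core", "treatment", 90, 100)
)
for segment, lift in lift_by_segment(rows).items():
    print(segment, round(lift, 1))
```

The same grouping works for country or any other attribute, as long as it is fixed before the experiment begins.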

When do features drive growth?

As I mentioned in my previous post, I often see a belief in product development that adding new features to a product will help spur growth. The thinking behind it is basically that more features == more value == more growth. I disagreed with that thinking in my previous post and received a few questions about that point, so I will expand upon it in this post.


First I want to define what a core product feature is. A core product feature is a feature that is part of the normal everyday use of the product. So, I’m excluding typical growth features such as new user flows, invite referrals, sharing to social networks, SEO, etc., where the immediate impact on growth is very clear, but the feature is not part of the core usage of the product. My hypothesis is that a new core product feature only helps change a company’s growth trajectory if it creates an engagement loop or creates a step-change improvement in the amount of value the average user gets from the product (preferably it does both). Specifically, for a feature to accelerate growth, I think it needs to meet the following criteria:

A) Most important is mass adoption by the user base. Over 50% of the users will need to interact with the feature on a regular basis (i.e. daily or weekly basis).

B) The feature creates an engagement loop that allows you to email or notify users on a regular basis. The content in those emails/notifications also needs to be compelling enough that it maintains a high click through rate over time.

C) Or as an alternative to B, the feature is a step-change improvement in the core value of the product for a majority of the userbase.

Facebook: A Case Study 

To put this hypothesis to the test, Facebook serves as a great case study of which types of features drive Growth and which do not. Over the years, they have launched Photos, News Feed, Platform, and Chat, which have all been major drivers of growth and engagement. However, they have also launched Questions, Places, Deals, Gifts, and Timeline, which have not fared as well in terms of driving growth and engagement.

To define what Facebook’s core product value is, I’ll use one of their more succinct mission statements from 2008, which is: “Facebook helps you connect and share with the people in your life.” [1]

Features That Did Drive Growth and Engagement

Photos: Photos was launched in 2005 and was one of the first major new features Facebook added after launch. Facebook has stated before how successful the Photos product was at driving engagement. If we look through the lens of the criteria laid out above, we can start to understand why.
A) The majority of users upload photos or are tagged in photos.
B) Tagging allows Facebook to re-engage users and has been so successful that Facebook has heavily invested in facial recognition to make tagging easier.
C) Photos provides a significant increase in the core product value by allowing people to much more easily share important moments and events in their lives and allowing people to be much better connected with what their friends and family are doing.

News Feed: News Feed was very controversial when it first launched in 2006, but ultimately led to a significant uptick in user engagement. Looking at our criteria:
A) Pretty much 100% of users view their News Feed.
B) News Feed dramatically improved the content sharing engagement loop. After News Feed, when someone shared a set of photos or a status update, those shares had dramatically higher visibility compared to when they were only visible by navigating to a user’s profile. This meant that these posts now received many more likes and comments that Facebook would then notify the user about.
C) News Feed provided another step-change improvement to Facebook’s mission. It was now much easier to passively stay connected to friends by seeing their posts and updates. As previously mentioned, it also fundamentally altered Facebook’s value proposition. It was no longer just a way to keep in touch, but it was now a way to broadcast and communicate with your social graph.

Platform: In a bold move at the time, Facebook opened up their platform to third party developers in 2006 [2]. Although Facebook has since clamped down significantly on the platform in the interest of user experience, Facebook Platform was a big engagement driver for a period of a few years.
A) For the first few years, a significant percentage of users interacted directly with Facebook apps or would get notifications from others using the apps.
B) Facebook outsourced the work of constructing the engagement loop to third parties. Apps like Farmville had to create their own engagement loops to survive by using gamification mechanics such as crop harvesting to bring users back. By bringing a user back to Farmville, Zynga also brought a user back to Facebook. These apps also generated billions of notifications to Facebook users by getting users to spam app requests out to friends until Facebook finally clamped down on the app request spam.
C) Social gaming was the primary category of app that achieved significant traction on Facebook Platform. Those apps did little to improve Facebook’s core product value, so point C is not applicable.

Chat: Messages have been part of Facebook since its initial launch in 2004. However, in 2008 Facebook launched Chat [3]. Examining Chat, we can see that:
A) To ensure mass adoption, Facebook made Chat highly visible by including it as a sidebar overlay on every page.
B) Chat creates a very compelling high-frequency engagement loop.
C) Chat again significantly expanded on the ability for friends and family to stay connected through Facebook and eventually evolved into Facebook Messenger.

Features That Didn’t Drive Growth and Engagement

Questions: Launched in 2010 [4], possibly as a jab at Quora co-founder and former Facebook CTO Adam D’Angelo, Questions never really gained much traction.
A) The number of users using Questions rapidly dropped off after the initial launch.
B) Questions did create an engagement loop, but the frequency was relatively low.
C) Questions was an incremental improvement and did not significantly expand on Facebook’s value proposition since people could already ask questions by just posting a status update.

Places: Also launched in 2010 [5], Places was Facebook’s response to the surging popularity of Foursquare. Places promised to enable people to share where they were.
A) Facebook probably did get over half the users to use Places through both checkins and location tagging in photos.
B) Places did not create an engagement loop.
C) Places was an incremental improvement and did not significantly expand on the value proposition with regards to sharing since people could already share where they were through status updates or picture descriptions.

Deals and Gifts: Deals and Gifts are both pretty similar and were aimed more at monetization than growth, but they are still worth covering. Deals launched in 2010 (must have been a busy year) [6]. Gifts first launched in 2007 [7] and then re-launched in 2012 following Facebook’s acquisition of Karma [8]. Looking at our criteria:
A) Fewer than 50% of users bought a deal/gift.
B) Deals and Gifts did have the potential to create an engagement loop via a daily email, but Facebook didn’t capitalize on it. Even if they did, they would have discovered the same thing Groupon and LivingSocial did, which was that click through rates on the emails decay significantly over time as users grow tired of receiving deal after deal (or gift) they are not interested in buying.
C) Deals were not tied to Facebook’s value proposition at all. Gifts at least helped people be more connected, but the frequency was too low and the consumer adoption wasn’t there.

Timeline: Introduced in 2011, Timeline served two purposes. First, it was meant to help users rediscover things they had shared in the past, and second, it allowed users to more easily share from other apps [9]. However, according to insiders, it did not lead to any significant gains in growth or engagement.
A) Over 50% of users interact with Timeline.
B) Timeline did not create a significant engagement loop. It generated a lot more posts from apps, but due to the trivial nature of the information being shared, they did not receive many likes and comments compared to organic shares and posts.
C) Timeline was more of an incremental improvement in the core product value rather than a step-change. In terms of being connected to friends and family, sure, you could now more easily scroll through hundreds of posts your friends had made over the years, but no one does that on a regular basis. In terms of sharing, the data shared from apps was trivial information such as a song you listened to or an article you read, but it didn’t necessarily mean you liked the song or thought the article was interesting, so the share had little value.

So What Does This All Mean?

Should we stop all core product feature development that doesn’t meet these criteria? Of course not. Core product teams should continue to develop features to make the product incrementally better and make users happy. However, if you are contemplating adding a new feature on the basis of expecting it to drive growth and engagement, you should ruthlessly evaluate it against these criteria.

Agree? Disagree? Discuss this post on growthhackers.com

Acknowledgements: Special thanks to Casey Winters and Stephanie Egan for helping review and refine this post.

How Can Reddit Solve its Growth Problem?

Reddit has been undergoing a lot of turmoil lately. CEO Ellen Pao resigned, ostensibly because she felt she couldn’t deliver the growth numbers the board wanted to see in the next six months. A question was posed yesterday on /r/Entrepreneur: “What would you do as Reddit’s CEO to grow the user base in the next 6 months”. The comments in the post were filled with ideas for new features that could be added or existing features that could be tweaked. For instance, suggestions included tweaking the upvote/downvote system, building improved moderator tools, or giving greater visibility to underused features such as multireddits.

Do Features Drive Growth?

I think one mistake people often make is thinking that new features can help spur growth. They think more features == more user value == more growth. Whether you’re a tiny startup just getting off the ground or a mature product used by hundreds of millions of people, I think new features rarely lead to a significant change in growth trajectory. I believe this is because for a new feature to drive more growth it can’t just add incrementally more value; it has to create a step change in the amount of value that the average user gets from the product.

Reddit has a few rough edges, but they couldn’t have grown to 130 million monthly unique visitors if they didn’t have solid product/market fit & weren’t delivering a ton of value to users already. So, I don’t think adding new features is the answer.

So What Should Reddit Do?

I hypothesize that Reddit derives a majority of its traffic from core users who check Reddit daily out of habit, referral traffic from blogs, and SEO. I think the biggest thing Reddit is missing is an engagement loop to bring non-core users back. Reddit currently does a very poor job of utilizing email, push notifications, and other social media platforms to re-engage users. My guess is that this is because they are worried about being spammy, which is a huge mistake, since these channels can be leveraged in a non-spammy way that actually puts users first.

So what would I do?

1) Re-engagement emails for non-daily active users that give them a digest of the top 10 posts of the previous day or the previous week. I think less active redditors would get a ton of value because it would allow them to discover content they may have otherwise missed.

2) Push notifications for trending posts where timeliness matters (e.g., AMAs, breaking news, etc.). Users often complain that they discover a post too late, after it already has thousands of comments, and they feel any comment they make at that point would just get lost in the crowd.

3) Currently Reddit has about 130MM monthly uniques, but only about 9MM registered accounts. In order to make the engagement loop work, Reddit would need more aggressive signup prompts for unauth users. Reddit’s user base is pretty anti-signup, but I think communicating the value of creating an account would help convince users. Once a user is signed up they can start curating their subreddit subscriptions, engaging in discussions, and submitting links (and they can also start receiving the previously mentioned re-engagement hooks).

4) Finally, Buzzfeed, 9gag, the chive, etc., leech a ton of their traffic by repackaging stuff from Reddit and posting it to social media (namely Facebook). Invest in making sharing much more prominent so Reddit can start to capture some of that traffic. This doubles as both an acquisition strategy & a re-engagement strategy.
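To make idea 1 a bit more concrete, here is a minimal sketch of how a digest could be assembled. Everything in it is an assumption for illustration: the Post type, the score field, and the one-day inactivity threshold are hypothetical and say nothing about how Reddit actually stores or ranks posts.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Post:
    """Hypothetical post record used only for this sketch."""
    title: str
    score: int
    created_at: datetime


def top_posts_digest(posts, now, window=timedelta(days=1), limit=10):
    """Return the highest-scoring posts created within the window ending at `now`."""
    cutoff = now - window
    recent = [p for p in posts if cutoff <= p.created_at <= now]
    return sorted(recent, key=lambda p: p.score, reverse=True)[:limit]


def should_receive_digest(last_active_at, now, inactive_after=timedelta(days=1)):
    """Only target users who have not been active recently (non-daily actives)."""
    return now - last_active_at >= inactive_after
```

Passing window=timedelta(days=7) would give the weekly variant. Gating on recent activity is what keeps the email from being spammy for people who already visit every day.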

What would you do?

Share your thoughts in the discussion on growthhackers.com. If anyone from Reddit (or any other startup for that matter) wants some Growth advice, feel free to drop me a line.

Growth Tools & Frameworks

We recently held a meetup at Pinterest’s offices to discuss some of the tools & frameworks that some of the most successful companies have built in-house to enable them to drive growth at scale. We had a great turnout to hear speakers from Dropbox, Pinterest, and Facebook.

A few of the tools & frameworks we heard about were:

– Gandalf, a framework to target marketing messages & campaigns to users

– How Dropbox gets new users to bridge the gap between desktop and mobile

– Copytune, a framework for optimizing copy on a per language basis

– How to build an SEO Experimentation framework

– How Facebook uses “quick experiments” to assess the impact of even the smallest changes, such as bug fixes

If you’re an engineer interested in growth, join the SF Growth Engineering Meetup to find out more about these events in the future.

About the Speakers:

Darius Contractor – Darius works on Growth at Dropbox. He was previously VP of Engineering at Bebo (acquired by AOL) and PM/Senior Eng at Tickle.com (acquired by Monster). He focuses on building the right product as simply as possible, iterative engineering and having fun. Occasionally, he blogs about psychology at http://darius.com

Viraj Mody – Viraj is an Engineering Manager and has been at Dropbox for 2.5 years where he focuses on onboarding/education/engagement initiatives & building infrastructure for growth. Before Dropbox, Viraj was a founder of Audiogalaxy (acquired by Dropbox in 2012).

John Egan – John is a lead engineer on the Growth team at Pinterest where he leads up efforts on emails & notifications. Prior to Pinterest, he led the Growth engineering team at Shopkick (acquired by SK Planet). You can read his thoughts on growth at http://jwegan.com

Julie Ahn – Julie is a software engineer on the Growth team at Pinterest where she focuses on search engine optimization. She built out the SEO experimentation framework which allows Pinterest to demystify SEO and help drive millions of incremental visits a day to Pinterest. Prior to Pinterest, she was a mechanical engineer in South Korea.

Ran Makavy – Ran is a Director of Product Management at Facebook. He spent his first three years on the Growth team, looking at mobile and emerging markets. Today, he is running Facebook’s Local and Entities teams, building consumer products around places and location. Before Facebook, he co-founded Snaptu & grew it to over 100 million active users before it was acquired by Facebook.


Why You Should Be A/B Testing Your Infrastructure

The benefits of using a data-driven approach to product development are widely known. Most companies understand the benefits of running an A/B experiment when adding a new feature or redesigning a page. While engineers and product managers have embraced a data-driven approach to product development, few think to apply it to backend development. We’ve applied A/B testing to major infrastructural changes at Pinterest and have found it extremely helpful in validating that those changes have no negative user-facing impact.

Bugs are simply unavoidable when it comes to developing complex software. It’s often hard to prove you’ve covered all possible edge cases, all possible error cases and all possible performance issues. However, when replacing or re-architecting an existing system, you have the unique opportunity to prove that the new system is at least as good as the one it’s replacing. For rapidly growing companies like Pinterest, the need to re-architect or replace one component of our infrastructure arises relatively frequently. We rely heavily on logging, monitoring and unit tests to ensure we’re creating quality code. However, we also run A/B experiments whenever possible as a final step of validation to ensure there’s no unintended impact on Pinners. The way we run the experiment is pretty simple: half of Pinners are sent down the old code path and hit the old system and the other half use the new system. We then monitor the results to make sure there’s no impact across all our key metrics for Pinners in the treatment group. Here are the results of three such experiments.
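The split described above can be sketched roughly as follows. This is a minimal illustration rather than our actual experiment framework: the function names, the hashing scheme and the old_system/new_system callables are all assumptions made for the example.

```python
import hashlib


def assign_group(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically map a user to 'control' or 'treatment'.

    Hashing the experiment name together with the user ID keeps a user in the
    same group on every request and keeps different experiments independent.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # roughly uniform value in [0, 1)
    return "treatment" if bucket < treatment_share else "control"


def fetch_user(user_id: str, experiment: str, old_system, new_system):
    """Route the call down the old or new code path based on the user's group."""
    if assign_group(user_id, experiment) == "treatment":
        return new_system(user_id)
    return old_system(user_id)
```

Because the assignment is a pure function of the user ID and experiment name, no assignment table is needed, and the same grouping can be recomputed later when comparing metrics between the two groups.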

2013: A new web framework

Our commitment to A/B testing infrastructural changes was forged in early 2013 when we rewrote our web framework. Our legacy code had grown increasingly unwieldy over time, and its functionality was beginning to diverge from that of our mobile apps because it ran through completely independent code paths. So we built a new web framework (code-named Denzel) that was modular and composable and consumed the same API as our mobile clients. At the same time we redesigned the look and feel of the website.

When it came time to launch, we debated extensively whether we should run an experiment at all, since we were fully organizationally committed to the change and hadn’t yet run many experiments on changes of this magnitude. But when we ran the experiment on a small fraction of our traffic, we discovered not only major bugs in some clients we hadn’t fully tested but also that some minor features we hadn’t ported over to the new framework were in fact driving significant user engagement. We reinstated these features and fixed the bugs before fully rolling out the new website, which gave Pinners a better experience and allowed us to understand our product better at the same time.

This first trial by fire helped us establish a broad culture of experimentation and data-driven decision-making, as well as learn to break down big changes into individually testable components.

2014: Pyapns

We rely on an open-source library called pyapns for sending push notifications via Apple’s servers. The project was written several years ago and wasn’t well maintained. Based on our data and what we’d heard from other companies, we had concerns about its reliability. We decided to test out using a different library called PyAPNs, which seemed better written and better maintained. We set up an A/B experiment, monitored the results and found that there was a 1 percent decrease in our visitors with PyAPNs. We did some digging and couldn’t determine the cause for the drop, so we eventually decided to roll back and stick with pyapns.

Figure 1: Experiment results for replacing pyapns
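One common way to check whether a small difference like this is more than noise is a two-proportion z-test on the share of users in each group who visited. The sketch below is a generic illustration of that test, not necessarily the analysis behind the figure, and the function and parameter names are hypothetical.

```python
import math


def two_proportion_z_test(visitors_a, users_a, visitors_b, users_b):
    """Compare visit rates of control (a) and treatment (b).

    Returns the z statistic (negative when treatment is lower) and the
    two-sided p-value under the pooled-proportion null hypothesis.
    """
    p_a = visitors_a / users_a
    p_b = visitors_b / users_b
    pooled = (visitors_a + visitors_b) / (users_a + users_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (p_b - p_a) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the normal
    return z, p_value
```

With groups in the millions, even a 1 percent relative drop produces a large z statistic, which is why a change this small can still be detected reliably.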

2015: User Service

We’ve slowly been moving towards a more service-oriented architecture. Recently we extracted a lot of our code for managing users and encapsulated it into our new UserService. We took an iterative approach to building the service, extracting one piece of functionality at a time. With such a major refactor of how we handle all user-related data, we wanted to ensure nothing broke. We set up an experiment for each major piece of functionality that was extracted, for a total of three experiments. Each experiment completed successfully, showing no drop in any metrics. The results have given us strong confidence that this new UserService is at parity with the previous code.

We’ve had a lot of success with A/B testing our infrastructure. It’s helped us identify when changes have caused a serious negative impact that we probably wouldn’t have noticed otherwise. When the experiments go well, they also give us the confidence that a new system is performing as expected. If you’re not A/B testing your infrastructure changes, you really should be.

By: John Egan, a growth engineer at Pinterest, and Andrea Burbank, a data scientist at Pinterest

Acknowledgements: Thanks to Dan Feng, Josh Inkenbrandt, Nadine Harik, Vy Phan, John Egan and Andrea Burbank for helping run the experiments covered in this post.

Originally published on the Pinterest Engineering Blog