Slow & Steady? That’s for Someone Else’s Race

By Pablo Flores, Engineering Head of Customer Journey & Kevin Guo, UI Architect

StubHub
StubHub Product & Tech Blog


The team that started this endeavor.

Speed is the best user experience. Delayed service, well, is the worst. Every delay in response time in our products frustrates the customer and ultimately impacts the bottom line. And as you can imagine, that frustrates us!

The beauty of the P&T team at StubHub is that we are always quick to identify our faults and then build the solutions that help maintain our standing as a global leader in e-commerce.

In 2017, we were not as fast as we felt we could have been. We wanted to build a better, faster experience for our customers while also establishing an internal process that would ensure this evolution could happen continuously.

This meant making it simpler for our developers to code across the globe.

First, we had to identify which parts of our process were working and which ones were not.

We realized we were not building our products or developing solutions quickly enough. When we did, we were too slow to deploy them. A lot of this was the result of how our teams were structured, which impeded (rather than fostered) global cohesion when building our products and solutions.

For better (or really for worse), each team often worked on specific features for different pages, such as search fields and drop-down menus. In some instances, there was duplication of work. This lack of coordination between our teams had produced a massive code base with plenty of tech debt.

The SDLC was not as strong as we believed it could be. Merging code for global teams working on long-lived integration branches was painful. Because dependencies and resources were scattered rather than centralized, we’d sometimes surface an outdated version of a resource to the user.

For historical reasons, we were doing an unnecessary amount of integration testing for the sake of code coverage and too little unit testing, which slowed down the release process.

To solve the problem, we needed to both speed up our team’s execution and structure their work so it could be easily incorporated into the global solution. We needed to start transforming our tech and processes. If we acted like a startup and fought for the best user experience for our customers, the world could be our oyster.

Embarking on the quest to change our tech stack required dedication, courage and commitment to the cause in the face of considerable uncertainty. As we soon learned, our team was brave and bold enough for the mission at hand.

Thinking Big, Starting Small

To re-platform a product that is used daily by millions of users is a scary endeavor. We knew that to conquer the global market, we needed to delight customers with a fast and seamless experience.

We had to think globally but deliver locally. That said, where do you start?

We first set up a small team to build the foundations of the new stack, starting with our homepage. This was the Proof-of-Concept for the new stack: it needed to have faster performance, be ready for global markets (including SEO), and incorporate component architecture.

We needed faster performance on slow 3G networks in Europe using fewer downloaded bytes, especially on mobile platforms. We wanted a parallel download of HTML, fonts, and major CSS. The solution had to have an optimized PWA (Progressive Web Application) where:

  • Shared libraries and chunks are pre-processed at build time
  • A service worker takes care of static artifacts requests and their caching
  • Non-critical code is “lazy loaded” after page load
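
To make the last bullet concrete, here is a minimal sketch of lazy loading, not the actual implementation. The helper `lazyOnce` and the "reviews widget" module are hypothetical names used only for illustration. In a real bundle the loader would usually be a dynamic `import()`, which lets the bundler emit that module as a separate chunk.

```typescript
// Hypothetical helper: wraps a loader so the underlying work starts at most once.
// In a browser bundle the loader would typically be `() => import("./reviews")`.
function lazyOnce<T>(loader: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = loader(); // first caller triggers the load
    }
    return pending; // later callers share the same promise
  };
}

// Stand-in for a non-critical module; counts how often it is actually loaded.
let loadCount = 0;
const loadReviewsWidget = lazyOnce(async () => {
  loadCount += 1;
  return { render: () => "reviews widget" };
});
```

Nothing is loaded until `loadReviewsWidget()` is first called, for example after the page's load event, and repeated calls reuse the same promise. One trade-off of this simple version is that a failed load is also remembered; a production helper would probably reset and retry.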

When this new stack had about 80 percent of the functionality of the running site and hit our criteria, the home team paused their work on the old stack and finished the rest of the functionality on the new stack.

Then the first small team moved on to re-platform another part of the site, and we repeated the sequence. Over time, this evolved into smaller teams of engineers creating real and reusable components for all of our pages on top of the new stack.

We had to shift from building pages to building front-end features as components that would be reusable by different developers in different flows. We needed to go from application development to component development and build assets.

But becoming “faster” is hardly an overnight thing. Then again, as we discovered in our pursuit of speed, it also doesn’t have to be a super-long slog. We believe you have to start by, well, starting. PowerPoint presentations and talks can’t compare to running code that makes a statement. Code, after all, wins arguments.

Tackling the Problem Technically

We approached our quest for speed and efficiency by focusing on the UI platform, component architecture, BFF layer, dev tools, and infrastructure.

1. UI Platform

As we were thinking about how to rewrite our application for optimal performance, we had to first assemble a small team to think beyond performance and application features. The first step required a bit of forensics:

— How did our apps interact and integrate with other systems in our infrastructure?

— How did the apps operate within eBay and StubHub processes?

— How can we refine these details and implement them into simple interfaces and UI components?

— How can we design and build a foundation that is adaptive to infrastructure and organization changes?

We built A/B tests, toggles, tracking, logging, routing, traffic splitting, authentication, security and caching into the design of the component architecture from the start. The goal was that, when we moved our applications from our data centers to cloud solutions, only the platform domain would need to change.

2. Component Architecture

For component architecture, we decided to use React for UI components, and Redux for event/action handling. React provided a simple and straightforward interface that helped us work towards our goal of building reusable UI components, but we still needed extra work to make those UI components independent and self-contained, dynamically loadable, manageable and configurable.

So, we created our own implementation of Redux connect. It allows a React component to register and unregister reducers on the fly and manage its own state data without polluting the global state.
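
The post does not show this custom connect, so the following is only a sketch of the general idea of registering reducers on the fly. It does not use the Redux library itself, and every name in it (`createReducerRegistry`, `cartReducer`, the action types) is an assumption made for illustration.

```typescript
type Action = { type: string; payload?: unknown };
type Reducer = (state: any, action: Action) => any;
type State = { [key: string]: unknown };

// Hypothetical registry: components add or remove their own reducer by key.
function createReducerRegistry() {
  const reducers: { [key: string]: Reducer } = {};
  return {
    register(key: string, reducer: Reducer): void {
      reducers[key] = reducer;
    },
    unregister(key: string): void {
      delete reducers[key];
    },
    // Root reducer: each registered reducer only ever sees its own slice.
    // Slices whose reducer has been unregistered are dropped from the result.
    reduce(state: State = {}, action: Action): State {
      const next: State = {};
      Object.keys(reducers).forEach((key) => {
        next[key] = reducers[key](state[key], action);
      });
      return next;
    },
  };
}

// Example slice owned by a hypothetical "cart" component.
const cartReducer: Reducer = (state = { items: 0 }, action) =>
  action.type === "cart/add" ? { items: state.items + 1 } : state;

const registry = createReducerRegistry();
registry.register("cart", cartReducer); // e.g. when the component mounts
const afterAdd = registry.reduce({}, { type: "cart/add" });
registry.unregister("cart"); // e.g. when the component unmounts
const afterUnmount = registry.reduce(afterAdd, { type: "noop" });
```

Here `afterAdd` holds a `cart` slice with one item, and `afterUnmount` no longer contains that slice. In an application built on Redux, a registry like this would typically be wired in through the store's `replaceReducer` method whenever the set of reducers changes.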

As many people familiar with React know, React components can be rendered on both the client and the server side. We created a way that allows developers to simply annotate whether any React component should be rendered on the server side.

Afterwards, our server-side render (SSR) engine could dynamically figure out all API calls and user-specific experiences that are needed for a given page. Server-side rendering could be configured into different strategies for different types of traffic. For instance, we can configure API calls to fail faster, and we can let SEO crawlers wait longer for the content to be ready.
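
As an illustration of per-traffic strategies, the sketch below picks a timeout budget from the user agent. It assumes that "fail faster" applies to regular user traffic. The timeout values, the user-agent check, and all names are invented for the example and are not taken from the real SSR engine.

```typescript
type TrafficType = "crawler" | "user";

interface SsrStrategy {
  apiTimeoutMs: number;            // how long SSR waits for each API call
  fallbackToClientRender: boolean; // render on the client if data is late
}

// Illustrative values only; the post does not state the real timeouts.
const STRATEGIES: { [K in TrafficType]: SsrStrategy } = {
  user: { apiTimeoutMs: 300, fallbackToClientRender: true },
  crawler: { apiTimeoutMs: 5000, fallbackToClientRender: false },
};

// Naive user-agent check for illustration; real bot detection is more involved.
function classifyTraffic(userAgent: string): TrafficType {
  return /bot|crawler|spider/i.test(userAgent) ? "crawler" : "user";
}

function strategyFor(userAgent: string): SsrStrategy {
  return STRATEGIES[classifyTraffic(userAgent)];
}

// Rejects if the wrapped call takes longer than the given budget.
function withTimeout<T>(call: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms} ms`)),
      ms,
    );
    call.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}
```

A render pass would then wrap each API call as `withTimeout(call, strategyFor(userAgent).apiTimeoutMs)`, so regular users get a quick response while crawlers are allowed to wait for complete content.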

We needed our pages to be a composition of components where each one had been designed, built and tested to comply with specific requirements.

We needed our front-end developers to decompose a user interface design into many smaller units. Smaller units are easier to test and help us produce assets to use in other ways. We knew that we would not get every component right the first time. For example, where does one component end and another begin? But this is an iterative process; as future use cases appear, we would refactor our components as needed.

The architecture behind our new stack.

3. BFF Layer

We built a middle layer called BFF (Back-end for Front-end) on Node that effectively created back-end technology that could be managed and implemented by our front-end developers. We needed to minimize the number of HTTP requests from the applications to our business APIs that hold inventory, tickets, catalog, and the like. With BFF, the team could design custom endpoints that connected to our data centers, thereby dramatically reducing traffic, removing complexity from our applications, and providing a better experience to our users.
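
A BFF endpoint of this kind might look roughly like the sketch below: one call from the page fans out to several business APIs in parallel and returns only what the UI needs. The interfaces, field names, and the `getEventPage` function are hypothetical, and the downstream clients are injected so the example stays self-contained.

```typescript
interface EventInfo { id: string; name: string }
interface Listing { id: string; price: number }

// Hypothetical downstream clients; in practice these would wrap HTTP calls
// to the business APIs.
interface BusinessApis {
  getEvent(eventId: string): Promise<EventInfo>;
  getListings(eventId: string): Promise<Listing[]>;
}

// Shape tailored to one page, so the browser needs a single request.
interface EventPageView {
  eventName: string;
  listingCount: number;
  lowestPrice: number | null;
}

async function getEventPage(
  apis: BusinessApis,
  eventId: string,
): Promise<EventPageView> {
  // Fan out to the business APIs in parallel on the server side.
  const [event, listings] = await Promise.all([
    apis.getEvent(eventId),
    apis.getListings(eventId),
  ]);
  const prices = listings.map((listing) => listing.price);
  return {
    eventName: event.name,
    listingCount: listings.length,
    lowestPrice: prices.length > 0 ? Math.min(...prices) : null,
  };
}
```

The design choice is that aggregation and trimming happen close to the APIs, so a device on a slow network makes one small request instead of several large ones.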

4. Tools

We turned to a group of third-party tools that proved to be tremendously helpful:

Storybook

Storybook is an open-source web app that we use as an internal catalog of assets that’s accessible to our engineering teams, project managers and designers.

We asked our developers to publish their components to Storybook. Through Storybook, our product owners, designers, engineers, and quality control can discover, discuss and iterate on our components in a way that maximizes re-usability, decreases time to market, improves quality, and provides a more consistent user experience. It’s also a great setup for testing components throughout the development life cycle.

NPM: Improve Dependency Management

We used an NPM registry to publish and release all of our own components and libraries. We also used it for security and vulnerability scanning on all third-party dependencies.

We also turned to Lerna to manage dependencies between these rapidly-evolving UI components. This simplified the onerous process of manually updating version numbers and publishing components to our internal NPM registry (often one at a time). It also sped up adoption of newer component versions.

Webpack: Optimized Artifacts Bundling and Processing

Like others, we used webpack to process, build and bundle static resources. Component code could be bundled into small chunks, so that it could be lazy loaded when needed, or pre-fetched by a service worker.

5. Akamai

Akamai, a content delivery network and cloud service, is generally used to cache static content at the edge. We found that it could help us do more than cache. We used Akamai to run phased releases and ramp up traffic independently on the different pages built on the new stack. Akamai also helped us get a better understanding of user signals, such as network speed, device information and geolocation, so that we could serve up the right customer experience based on them. Where network speed was limited on a mobile device, we could choose to send less data, or only the most important content, to the user.
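
The two ideas in this section, ramping traffic gradually and adapting the response to user signals, can be illustrated with a small sketch. This is plain application-level TypeScript, not Akamai configuration. The hashing scheme, the signal fields, and the function names are all assumptions made for the example.

```typescript
// Maps a stable visitor id to a bucket in [0, 100) with a simple string hash.
function bucketFor(visitorId: string): number {
  let hash = 5381;
  for (let i = 0; i < visitorId.length; i++) {
    // ">>> 0" keeps the running value an unsigned 32-bit integer.
    hash = (hash * 33 + visitorId.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

// A visitor sees the new stack once the ramp percentage exceeds their bucket.
// The same visitor always lands in the same bucket, so raising the percentage
// only ever adds visitors to the new stack.
function routesToNewStack(visitorId: string, rampPercent: number): boolean {
  return bucketFor(visitorId) < rampPercent;
}

// Hypothetical signals of the kind an edge network can expose to the origin.
interface UserSignals {
  slowNetwork: boolean;
  mobile: boolean;
}

type Payload = "full" | "essential";

// Send only the most important content to mobile devices on slow networks.
function payloadFor(signals: UserSignals): Payload {
  return signals.slowNetwork && signals.mobile ? "essential" : "full";
}
```

Keeping a separate ramp percentage per page would allow each re-platformed page to be rolled out on its own schedule.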

Challenges

In a project as ambitious as rolling out a new tech stack to our users, there were bound to be some challenges. We were building for backwards compatibility, along with identifying tech debt to address later. We also had to think about how to split and ramp up traffic when we first started testing this new stack with users.

Without changing our domains and URL structure (an SEO restriction), we needed to figure out how to keep our customers in the same user experience as they navigated back and forth. This involved coordinating three aspects of network routing: infrastructure (Akamai and network firewall), application server, and front-end application.

We had to build new React components in existing pages (when appropriate), without sacrificing web performance, all while maintaining feature parity in the old stack and the new stack.

The Encouraging Results

The fruits of our labor are very encouraging. We went from a front-end stack that was large and monolithic to a full-stack base that serves up a PWA for a much better user experience. In some cases, we are now 3x faster than before, and this will continue to improve as we move our products and data closer to our global customers.

Fast load times provide the best experience. We will need to monitor all additions to our systems on an ongoing basis. Our code, CSS, images, caching, network latency, CI/CD, and third-party integrations all help or hinder our ability to deliver to our customers. It takes a village to build it right.

Our front-end developers are now interacting with and managing both front- and back-end components. Our code base is much smaller, resulting in increased speed on our website.

We also have fewer production issues. When they do arise, thanks to the components, they’re easier to solve. We worked to unify our global team of developers, making it easier to collaborate. We can now distribute the design and implementation of components to our development centers in Europe, Asia, and America. Developers now get to spend more time building important business functionalities inside a component instead of building a snippet of code for minor touch-ups.

Side-by-side comparison of our homepage, as measured in seconds. UK on a 3G network on the original stack (left) vs. the new stack (right). Video was generated using webpagetest.org. (This GIF is sped up for viewing efficiency.)

Fast as a ‘Bullet’

The best result we’ve achieved from our work is the one we set out to reach in the first place: speed. We are now faster than ever before.

Our customers’ user experience has tripled in speed, especially on 3G networks in international markets. Our new stack allows for faster development cycles, which lowers development costs and improves our overall performance. We’re becoming faster for our global customers and our global P&T team. We did this with our customer-first credo and by acting nimbly and tirelessly, just like a start-up.

By the way, guess what we’ve named this new tech stack?

Bullet.

Are you passionate/excited about React, Redux, Node.js, component architecture, minimizing time from conception to user, a/b testing, mobile, tools, and more? Come join us.
