How We Survived 10k Requests a Second: Switching to Signed Asset URLs in an Emergency

By Adam Fortuna

11 min read

Yesterday I woke up to a ping on my Apple Watch flagging unusual spending on my Hardcover debit card. It’s not unusual to get a ping about an expense, but this one was new: Google Cloud had charged me a round $100 – not an amount we normally spend.

The previous day I had been experimenting with Google Cloud Run, trying to migrate our Next.js staging environment there from Vercel to save some money. I assumed I’d misconfigured that service, turned it off, and went about my day.

A few hours later I got another expense alert: this time for $200 (!). Now I was worried. It didn’t help that I was working from a coffee shop, it was about to rain and my laptop battery was almost dead. 😅

After speed-walking home while listening to Bride (I’m making my way through the Trending Books), I got to work tracking down what was happening.

What’s the Issue?

My first stop was Google Cloud Billing – that should narrow the issue down. The new expense was there, so I wasn’t crazy. It seemed to have leveled off, which was a good sign.

This confirmed there was something costly happening, but what?

I clicked over to the breakdown by service and it was clear: Cloud Storage expenses were up 2,098% 😅

Welp, I’d wondered if this day would come. Let me back up real quick and cover which services we’re using.

What Google Cloud Services Does Hardcover Use?

Not much – only three, in fact. The rest of our services are hosted elsewhere.

Google Cloud Storage – We store all cover images, author avatars, user avatars, and various static images in Google Cloud Storage. I’ve previously used S3 for this, but we’re anti-Amazon.

Google Cloud Run – We have exactly one process in Cloud Run, our Image Resize Service (Imaginary) I wrote about last year. It cut our $1,000 bill down to $50/month or less with no loss in functionality. If you’re using Imgix or Cloudinary, it’s kind of amazing.

Google CDN – The Image Resize Service doesn’t do any caching, so we throw a CDN in front of it that aggressively caches images.

Cloud Run and the Google CDN have worked absolutely perfectly. I haven’t even touched them in 11 months. Not once. I hadn’t touched Google Cloud Storage either, but that was the problem. 😅

When I initially set up Google Cloud Storage, I made the bucket (the per-environment storage container) public with optional private files. This allowed user avatars, cover images, and other images we show on the site to be publicly accessed, while keeping user-uploaded CSVs private in the same place.

For their part, Google adds a lot of alert messages when a bucket has public access. It’s almost like they know something I don’t. Turns out the Internet is a Dark Forest (I just re-read The Three-Body Problem after the Netflix series and will use any chance I can to drop it into conversation 😂).

My decision to keep our Google Cloud Storage public was the culprit.

So, What Actually Happened?

We’re still working that out and looking for ways to improve, but here’s the theory.

Someone has been hitting our API pretty hard recently. They likely downloaded a large amount of data from it – including every edition with URLs for covers.

Those cover images are direct URLs to Google Cloud Storage. For example:

Example URL

https://storage.googleapis.com/hardcover/external_data/36306160/04fdf2f73287526f8326413f4d3d7ec77999b832.jpeg

These are the images that were public.

Downloading a few images from here wouldn’t have shown up on my radar. In this case, someone decided to download up to 10k images/second… for close to 7 hours.

That works out to about 650 MB/second over that time period – roughly 16 TB of data. We don’t even have that much in our bucket! They must’ve downloaded many files multiple times.
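
As a back-of-the-envelope check (my rough math, not exact figures from the billing console): 650 MB/s × 7 hours × 3,600 seconds/hour ≈ 16.4 TB, and 650 MB/s divided across 10k requests/second is an average of roughly 65 KB per image.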

Most people in our Discord use our API to create fun things based on their reading history. It turns out we need more protections in place to stop bulk access.

An Easy Solution With Issues

The solution for this isn’t technically difficult, but it’s a pain: make this bucket private. Doing that immediately would cause the site to no longer show images, and would prevent anyone using the API from getting images. We needed a plan to make it private and generate signed URLs for all images that expire after a set period of time.

Signed URLs are a variant of the original URL with additional parameters. I think of it like sending an address (the URL) and a password (the additional parameters that change).

You can generate a signed URL that allows access to a specific file for a set duration. That URL will look something like this:

Example URL

https://storage.googleapis.com/hardcover/external_data/36306160/04fdf2f73287526f8326413f4d3d7ec77999b832.jpeg?GoogleAccessId=hardcover-production%40hardcover-313100.iam.gserviceaccount.com&Expires=1723846732&Signature=gQ81vjCntrus3j5xbr2r7XEQ30DCTye4ikwdrAQvhwlGGeUJuJ760n9f2o%2B70zjM91%2Bng3C4pWgZzp2DFWdXH2%2BAZMFv1pYMzk%2F20x66NKnZ2dX%2BbQuVu6BgDsIw%2BvmLzEnsMbN6RGm0Dlq0V8e8JoEWrohR5UkN1n5YscnOBywZgsSxIQ8KrL079GeCvWvdE%2B%2BAldLuHT0JhFyo76hJE%2Ba7tCUBuL8drPQSKaHguLrbMjrtuW8%2FfuKRSOnY8wMgI%2FQcFVZPDuA5uVXchi67zVu26RFraBO93MxfhwuIpFHVft6ViRxLF0irLp%2FGDTzDPwpR5vBSLlMPk58ByxxTeg%3D%3D

This URL contains a few parameters that Google Cloud Storage uses to authenticate the request: GoogleAccessId, Expires and Signature. Unless all three are correct, you’ll get an Access Denied error from Google Cloud Storage.

On the walk home from the coffee shop I started brainstorming how to add these to Hardcover. If you’re ever wondering if a walk will help you think: the answer is most likely yes. 😂

Failed Solution #1: Change the URLs in our Database

I knew what I didn’t want to do: rewrite all image URLs in our database. Even if we make the files private, those URLs still point to the right place. Also, since signed URLs expire after a set period of time, going this route would mean regenerating 5 million signed URLs every time they expired.

This did get me thinking about how long URLs should stay valid. I landed on one day. We can’t regenerate 5 million signed URLs every day – nor do we need to. This option was out.

Failed Solution #2: Update the API to return Signed URLs

Our API (Hasura) connects straight to our database. If someone requests 100 books with covers, what they get back comes straight from our database – Ruby and Rails aren’t involved.

In other words, we have no way to override the static value from that column in the database. Nor should we – the endpoint for edition and book data should be accessible at a much higher rate. Generating signed URLs at that step would slow down the entire system that relies on book data.

Having our API return signed URLs in bulk was out.

Failed Solution #3: Generate Signed URLs in Next.js

Next.js is responsible for presenting all information to visitors of the site. This seemed like it might be a valid solution: before showing a URL for any image, we could generate a signed URL, cache it in Redis for 24 hours, then use that.

There are two major problems with this. First, generating the signed URL would need to happen on the server, but we load some cover images client side. It’d be unrealistic to fetch information about a book on the Next.js client, send it to the server to get a signed URL, then use that. Not impossible, but not great.

The other problem was bigger: API users wouldn’t have access to this! We’d have effectively taken away the ability to download images.

Actual Solution: Just use Ruby on Rails

I’m a huge fan of Ruby on Rails. I’ve been using it since before version 1.0 back in 2005. All of Hardcover’s backend business logic is in Rails.

Back when I worked at Code School, we served a lot of videos. At one point our video host went down and we scurried to get a backup solution in place. We landed on using an intermediary service that would redirect to the actual host. I thought this would work for Hardcover too.

Updating the Website to Support Signed URLs

Here’s how that works in practice.

Rather than including the image from Google Cloud Storage directly, we change it to proxy through our Ruby on Rails site. For example:

Example URL

https://projector.hardcover.app/signed/images/crop?url=https://storage.googleapis.com/hardcover/…jpeg&width=100&height=100

That URL is doing a lot. Let’s break it down.

First, it hits an endpoint defined as a route in our Ruby on Rails application.

rails/config/routes.rb

namespace :signed_assets, path: :signed do
  resource :images do
    get :crop
    get :enlarge
  end
end

This tells Rails to create two new routes at /signed/images/crop and /signed/images/enlarge. Both point to a controller that needs an enlarge and a crop action.

Those actions need to do two things: generate a signed URL for the passed-in URL and redirect to it.

It should also cache the generated URL – we don’t need to regenerate it every time someone looks at a cover. I decided to cache these for 1 day.

rails/controllers/signed_assets/images_controller.rb

module SignedAssets
  class ImagesController < ApplicationController

    # GET /signed/images/enlarge
    def enlarge
      redirect_to_signed_url("enlarge")
    end

    # GET /signed/images/crop
    def crop
      redirect_to_signed_url("crop")
    end

    private

    def signed_params
      @signed_params ||= params.permit(:url, :width, :height, :type)
    end

    def url
      @url ||= Rails.cache.fetch("/signed/images/signed/#{signed_params[:url]}", expires_in: 24.hours) do
        ExternalStorage::Google::Cloud.sign(signed_params[:url])
      end
    end

    def redirect_to_signed_url(action)
      if url.nil?
        head :unprocessable_entity
      else
        redirect_to "https://cdn.hardcover.app/#{action}?url=#{CGI.escape(url)}&width=#{signed_params[:width]}&height=#{signed_params[:height]}&type=#{signed_params[:type]}"
      end
    end
  end
end

This is a mostly standard Ruby on Rails controller. The public actions are the URL endpoints we set in the routes file; the private methods help with code reuse and organization.

The bulk of the work happens in the url method.

First, we check if the given URL is in the cache (we’re using Redis). If an entry with that key exists in Redis, we return it and never run the block that comes after.

If a signed URL isn’t in the cache, we call ExternalStorage::Google::Cloud.sign to generate a signed URL.

I’m not going to get into generating a signed URL. We’re using the google-cloud-storage Ruby gem which handles everything, including the line that actually hits the Google API:

ExternalStorage::Google::Cloud.sign

signed_url = file.signed_url method: "GET", expires: 3600*25 # 25 hours, just past the 24-hour cache window
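
For context, here’s a minimal sketch of what a signing helper along these lines might look like. This is my illustration, not Hardcover’s actual module – the bucket name, the credential setup (the gem picks up the project and a service account keyfile from the environment by default) and the nil handling are all assumptions.

require "google/cloud/storage"

module ExternalStorage
  module Google
    module Cloud
      BUCKET = "hardcover" # assumed from the example URLs above

      # Turn a public storage.googleapis.com URL into a signed URL valid for ~25 hours.
      def self.sign(public_url)
        # Strip the public host prefix to get the object path within the bucket
        path = public_url.sub("https://storage.googleapis.com/#{BUCKET}/", "")

        # Leading :: avoids colliding with our own ExternalStorage::Google namespace
        storage = ::Google::Cloud::Storage.new
        bucket  = storage.bucket(BUCKET)
        file    = bucket&.file(path)
        return nil if file.nil?

        # 25 hours, slightly longer than the 24-hour cache
        file.signed_url method: "GET", expires: 3600 * 25
      end
    end
  end
end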

These URL endpoints need to be FAST. Loading a page with 100 covers means this endpoint will be hit 100 times. My first iteration didn’t have caching, which led to a 250 ms wait PER IMAGE. With caching it’s 250 ms for the first request, then less than 1 ms after that.

Update: After this post was published, thanks to comments from Hacker News, I realized there are quicker ways to generate this URL without hitting the Google API.
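
One such approach (my assumption about what those comments pointed at, not necessarily what Hardcover shipped) is to pass skip_lookup: true when building the bucket and file references. That skips the metadata requests, so the URL is signed locally with the service account key and no network call is made at all:

storage = Google::Cloud::Storage.new
bucket  = storage.bucket "hardcover", skip_lookup: true
file    = bucket.file "external_data/36306160/04fdf2f73287526f8326413f4d3d7ec77999b832.jpeg", skip_lookup: true

signed_url = file.signed_url method: "GET", expires: 3600 * 25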

The generated URL will be the same, which also means that the image will be cached on the CDN. If a user refreshes a page, they’ll still hit Rails to generate URLs for all images, but they’ll use the images cached by their browser. (side note: I’d like to improve this even more. Any recommendations?)

Updating the API to Support Signed URLs

Things are working for website visitors, but there’s one more step for API users: create a new endpoint for generating signed URLs.

Hardcover has an API that registered users can use for accessing their book and reading data. If you’re building something with it, you’ll have our existing Google Cloud Storage URLs already.

I ended up creating a single new endpoint, image_url_signed. It takes in a URL (which could be a cover, an avatar or anything else) and returns a signed URL.

Behind the scenes, this Hasura API endpoint hits another Rails controller with a single action for generating a signed URL. It’s pretty much the same as the other controller.

rails/app/controllers/hasura/signed_assets_controller.rb

module Hasura
  class SignedAssetsController < Hasura::BaseController
    skip_before_action :set_paper_trail_whodunnit

    # Returns a signed asset URL for the given url
    # GET /hasura/signed_assets
    def index
      url = Rails.cache.fetch("/signed/images/signed/#{input_url}", expires_in: 24.hours) do
        ExternalStorage::Google::Cloud.sign(input_url)
      end
      render json: {
        url: url
      } 
    rescue ExternalStorage::Google::Cloud::AccessDenied
      head :forbidden
    end

    private

    def input_url
      @input_url ||= params.require(:url)
    end
  end
end

It even uses the same cache as for website users! 🙌

Last Step: Throttling API Use

With this setup everything works, but we haven’t fully fixed the problem. Someone could still hit our new API endpoints repeatedly and bypass our newfound security.

Hasura has an option that could help here: API rate limits. It can even rate limit by the user id from a JWT – which is exactly how we determine the owner of a request.

Unfortunately, that’s a Hasura Enterprise feature, and we’re mere peasants on the free version.

I’m considering moving back to enterprise once we can afford it, but that’s out for now.

Fortunately this endpoint is in Rails, so we can lock it down on that side instead. After a little Googling, I found the beautiful rack-attack gem, which does exactly what I was looking for.

If you’re not from the Ruby world, you might not have heard of Rack. Think of Rack as the base layer that most Ruby web frameworks are built on (Ruby on Rails, Sinatra, Hanami and others). It handles the low-level request/response cycle.
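
If it helps to picture it, a Rack application is just an object that responds to call and returns a status, headers and body. This tiny, purely illustrative config.ru is all a Rack server needs – and it’s the layer middleware like rack-attack plugs into:

# A Rack app is any object responding to #call(env) that returns [status, headers, body]
app = lambda do |env|
  [200, { "content-type" => "text/plain" }, ["Hello from Rack\n"]]
end

# Middleware wraps apps like this one and can answer a request (say, with a 429)
# before it ever reaches the framework.
run app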

The rack-attack gem builds on that with features for IP safelisting, blocklisting and throttling.

In our case we wanted to throttle the API endpoint at 60 requests a minute, and the website endpoint at 500 a minute – roughly equal to browsing at a steady pace.

This ended up being so much easier than I thought. It only needs a new initializer.

rails/config/initializers/rack-attack.rb

class Rack::Attack
  # Store throttle counters in Redis (the same instance we already use for caching)
  Rack::Attack.cache.store = ActiveSupport::Cache::RedisCacheStore.new(url: ENV["REDIS_URL"])

  # API endpoint: 60 requests per minute, keyed by the requesting user
  throttle("hasura/signed_assets", limit: 60, period: 1.minute) do |req|
    req.params["user_id"] if req.path == "/hasura/signed_assets"
  end

  # Website image endpoints: 500 requests per minute, keyed by IP address
  throttle("/signed/images/enlarge", limit: 500, period: 1.minute) do |req|
    req.ip if req.path == "/signed/images/enlarge" || req.path == "/signed/images/crop"
  end
end

This also uses Redis and will generate a key for each endpoint using the value returned from the block.

This allows us to throttle the API endpoint to 60 requests/minute based on user_id, while the public endpoint used by the website has a limit of 500 requests/minute based on IP address.

What’s great about this setup is that it runs before Rails has even handled the request – that’s the beauty of middleware. Once a client reaches their limit, they get back a 429 Too Many Requests response.
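
If you want more control over that response, newer versions of rack-attack also let you customize it. A small example – illustrative only, not something from our setup – that returns a friendlier body from the initializer:

Rack::Attack.throttled_responder = lambda do |_request|
  # Anything Rack-shaped works here: [status, headers, body]
  [429, { "content-type" => "text/plain" }, ["Too many requests. Slow down and try again in a minute.\n"]]
end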

Limiting the API More?

One of the reasons I started Hardcover in the first place was because Goodreads discontinued their API. I’ve never run a public API before, but I want to make sure we’re being good stewards of the book data contributed by the community as well as everyone’s personal book data.

I have a few other takeaways from this experience that we’ll put into place in the near future: mostly around rate limiting, reporting, limiting API access to only the areas you need, and expanding our API terms of service.

What will always be true is that you’ll be able to get any data about books, editions, authors, series, publishers, characters, your library and whatever the people who follow you have decided to share. 📚
