System Design Series: How to Build a URL Shortening Service - Part One
Let's deep dive into the world of building scalable systems
Designing a system capable of supporting millions of requests sustainably is challenging and requires constant improvements and diligent refinements.
In this article, we will dissect the components necessary to construct a robust URL-shortening service that exemplifies the principles of effective system design.
Understanding the problem and establishing the project scope
During the initial discussion, whether for the design of a new service or in a system design interview, it is important to understand the scope and ask clarifying questions. Our problem is this: given a long URL like:
https://www.amazon.ca/Pragmatic-Programmer-journey-mastery-Anniversary/dp/0135957052/ref=bmx_dp_mlby5vhn_d_sccl_2_1/139-2360007-5644232?pd_rd_w=sdEdw&content-id=amzn1.sym.d0a06cce-14a1-4b3b-af20-cd74bdfb3906&pf_rd_p=d0a06cce-14a1-4b3b-af20-cd74bdfb3906&pf_rd_r=19ANZZ932SV3YQ9RJYWM&pd_rd_wg=Xro6o&pd_rd_r=f0908fbd-23fd-4423-92a4-5e545c396b59&pd_rd_i=0135957052&psc=1
We need to shorten this URL to make it easier to share and to save storage space. The service therefore needs to accept a long URL as input and return a short URL with a unique identifier, like:
https://short-url.com/c2705e71-f4b6-4e65-b13e
High-level design
Now that we better understand the problem, let’s start with a high-level design using the C4 Model to represent the solution visually. I have an article presenting the C4 Model, which you can check to clarify how it works.
Our requirements for a high-level design on the C4 Model container level would be the following:
As we can see in the diagram, the workflow will be the following:
User submits the URL to be shortened
URL Shortener Service will generate a new Short URL and save it in the database
URL Shortener Service will return the short URL to the user
User accesses the short URL
URL Shortener Service looks for the long URL in the database
URL Shortener Service redirects the user to the long URL link
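The workflow above can be sketched in plain Java, with an in-memory map standing in for the database. This is only a minimal illustration: the class name `WorkflowSketch`, the `BASE_URL` constant, and the map-based storage are assumptions made for this sketch, not the implementation built later in the article.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

class WorkflowSketch {
    // Assumption: base URL borrowed from the example short URL shown earlier.
    static final String BASE_URL = "https://short-url.com";

    // Assumption: an in-memory map (identifier -> long URL) stands in for the database.
    private final Map<String, String> database = new HashMap<>();

    // Submission flow: accept a long URL, save it under a new identifier,
    // and return the short URL to the user.
    String shorten(String longUrl) {
        String id = UUID.randomUUID().toString();
        database.put(id, longUrl);
        return BASE_URL + "/" + id;
    }

    // Access flow: look up the long URL behind a short URL so that the
    // caller can redirect the user to it. Empty when the short URL is unknown.
    Optional<String> resolve(String shortUrl) {
        String id = shortUrl.substring(shortUrl.lastIndexOf('/') + 1);
        return Optional.ofNullable(database.get(id));
    }

    public static void main(String[] args) {
        WorkflowSketch service = new WorkflowSketch();
        String shortUrl = service.shorten("https://example.com/a/very/long/path?with=parameters");
        System.out.println(shortUrl);
        System.out.println(service.resolve(shortUrl).orElse("not found"));
    }
}
```

Running `main` prints a freshly generated short URL followed by the original long URL it maps to.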
API Design
API endpoints facilitate the communication between clients and servers. We will design the APIs REST-style. If you are unfamiliar with RESTful APIs, I highly recommend you consult the REST API Tutorial website.
I’m planning to write a guide on how to design good RESTful APIs, so stay tuned.
A URL shortener primarily needs two API endpoints:
URL shortening: To create a new short URL, a client sends a POST request containing one parameter: the original long URL.
The API looks like this: POST - /api/v1/short?longUrl={longURL}, which returns the short URL.
URL redirecting: To be redirected from a short URL to the corresponding long URL, a client sends a GET request.
The API looks like this: GET - /{shortURL}, which returns the long URL for HTTP redirection.
Once the server receives a short URL request, it responds with a 301 redirect that points to the long URL. The HTTP status code 301 indicates that the requested URL has “permanently” moved to the long URL.
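To make the shape of that response concrete, here is a small plain-Java sketch of the redirect decision. The `Response` class and the `redirect` method are hypothetical names used only for illustration, and answering an unknown short URL with 404 is the behavior this sketch assumes.

```java
import java.util.Optional;

class RedirectSketch {
    // Hypothetical stand-in for an HTTP response: a status code plus
    // the value of the Location header (null when no header is sent).
    static final class Response {
        final int status;
        final String location;

        Response(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    // A known short URL yields 301 with the long URL in the Location header;
    // an unknown one yields 404 without a Location header (assumption).
    static Response redirect(Optional<String> longUrl) {
        return longUrl
                .map(url -> new Response(301, url))
                .orElseGet(() -> new Response(404, null));
    }

    public static void main(String[] args) {
        Response found = redirect(Optional.of("https://example.com/long"));
        System.out.println(found.status + " Location: " + found.location);

        Response missing = redirect(Optional.empty());
        System.out.println(missing.status);
    }
}
```

The client never sees the long URL in a response body. It only reads the status code and follows the `Location` header.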
The browser caches the response, and subsequent requests for the same URL will not be sent to the URL shortening service. Instead, requests are redirected to the long URL server directly. Here’s an example using Insomnia REST Client:
Perfect, we have defined all the cases for this workflow, as well as the API endpoints. Since we do not have any other requirements, we can now start the implementation. Let’s begin with a basic (and naive) version and improve it incrementally.
A first and naive implementation
For the first implementation, let’s prepare the application setup. I’m going to create a Spring Boot service using Java, with MySQL as the database, and Docker to run everything easily on a local machine. You can check the setup details and try it yourself from our system design series repo on GitHub.
For brevity, let’s focus on the business logic. Here’s our URLShorter domain class:
@Entity
@Getter
@RequiredArgsConstructor
public class URLShorter {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String longURL;
    private String shortURLId;

    public URLShorter(String longURL) {
        this.longURL = longURL;
        // A random UUID identifies the short URL
        this.shortURLId = UUID.randomUUID().toString();
    }
}
When a new URLShorter is created, we generate a UUID (Universally Unique Identifier) as the shortURLId.
In the service layer, we have the logic to create and save the short URL, and to retrieve the long URL:
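As a quick check of what such an identifier looks like, the sketch below prints a random UUID, its length, and a short URL built from it. The class name `UuidLengthSketch`, the helper `shortUrlFor`, and the base URL are assumptions made for illustration.

```java
import java.util.UUID;

class UuidLengthSketch {
    // Hypothetical helper: joins a base URL and a UUID into a short URL.
    static String shortUrlFor(String baseUrl, UUID id) {
        return baseUrl + "/" + id;
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        String text = id.toString();

        System.out.println(text);
        // The textual form is 36 characters: 32 hexadecimal digits plus 4 hyphens.
        System.out.println(text.length());
        System.out.println(shortUrlFor("https://short-url.com", id));
    }
}
```

The identifier alone takes 36 characters, which is worth keeping in mind when judging how short the resulting URL really is.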
@Service
@RequiredArgsConstructor
public class URLShorterService {

    private final URLShorterRepository repository;

    @Value("${urlshortener.baseshorturlpath}")
    private String BASE_SHORT_URL_PATH;

    public String shortenURL(String longURL) {
        // Reuse the existing entry when this long URL was already shortened
        var optionalUrlShorter = repository.findByLongURL(longURL);
        if (optionalUrlShorter.isPresent()) {
            return buildShortURL(optionalUrlShorter.get().getShortURLId());
        }
        var urlShorter = repository.save(new URLShorter(longURL));
        return buildShortURL(urlShorter.getShortURLId());
    }

    public Optional<String> getLongURL(String shortURLId) {
        return repository.findByShortURLId(shortURLId)
                .map(URLShorter::getLongURL);
    }

    private String buildShortURL(String shortURLId) {
        return BASE_SHORT_URL_PATH + "/" + shortURLId;
    }
}
In the shortenURL method, we first check whether the long URL already exists in the database. If it does, we return the existing short URL, built from BASE_SHORT_URL_PATH, which is defined as an application property in application.properties. Otherwise, we add a new entry to the database and return the new short URL.
In the getLongURL method, we look up the long URL by its shortURLId.
In the controller layer, as described in the API Design section, we have two endpoints:
POST - /api/v1/short?longUrl={longURL}
GET - /{shortURL}
@RestController
@RequiredArgsConstructor
public class URLShorterController {

    private final URLShorterService service;

    @PostMapping("/api/v1/short")
    public String shortenUrl(@RequestParam("longUrl") String longURL) {
        return service.shortenURL(longURL);
    }

    @GetMapping("/{shortUrl}")
    public ResponseEntity redirect(@PathVariable("shortUrl") String shortURL) {
        return service.getLongURL(shortURL)
                // Known short URL: 301 with the long URL in the Location header
                .map(longUrl -> ResponseEntity
                        .status(HttpStatus.MOVED_PERMANENTLY)
                        .header(HttpHeaders.LOCATION, longUrl)
                        .build())
                // Unknown short URL: 404
                .orElseGet(() -> ResponseEntity
                        .status(HttpStatus.NOT_FOUND)
                        .build());
    }
}
The first endpoint is pretty straightforward: we send the long URL as a query string parameter and get the short URL as the response.
For the second endpoint, we access the short URL and get an HTTP 301 redirect to the long URL if the short URL exists in the database. Otherwise, it returns HTTP 404 Not Found.
Let’s test the first implementation
First, we run the application using Docker Compose:
docker-compose up
Then we can access the application at localhost:8080, which will be the base URL for the short URLs. Let’s add a new long URL, as described in the requirements. You can use cURL or any HTTP UI client like Postman or Insomnia:
curl --request POST --url 'localhost:8080/api/v1/short?longUrl=https%3A%2F%2Fwww.amazon.ca%2FPragmatic-Programmer-journey-mastery-Anniversary%2Fdp%2F0135957052%2Fref%3Dbmx_dp_mlby5vhn_d_sccl_2_1%2F139-2360007-5644232%3Fpd_rd_w%3DsdEdw%26content-id%3Damzn1.sym.d0a06cce-14a1-4b3b-af20-cd74bdfb3906%26pf_rd_p%3Dd0a06cce-14a1-4b3b-af20-cd74bdfb3906%26pf_rd_r%3D19ANZZ932SV3YQ9RJYWM%26pd_rd_wg%3DXro6o%26pd_rd_r%3Df0908fbd-23fd-4423-92a4-5e545c396b59%26pd_rd_i%3D0135957052%26psc%3D1'
The response is:
localhost:8080/f1c899d6-bed9-4eae-be7d-3a40efaca2e4
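Because the long URL travels as the value of the longUrl query parameter, characters such as `?`, `&`, and `=` inside it need to be percent-encoded. Otherwise the server would treat everything after the first unencoded `&` as separate parameters. The sketch below shows one way to build the request URL with the JDK's `URLEncoder`. The class and method names are hypothetical, and the example long URL is a placeholder.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class QueryEncodingSketch {
    // Builds the request URL for the shortening endpoint, percent-encoding
    // the long URL so it survives as a single query parameter value.
    static String shortenRequestUrl(String host, String longUrl) {
        String encoded = URLEncoder.encode(longUrl, StandardCharsets.UTF_8);
        return host + "/api/v1/short?longUrl=" + encoded;
    }

    public static void main(String[] args) {
        // prints localhost:8080/api/v1/short?longUrl=https%3A%2F%2Fexample.com%2Fa%3Fb%3D1%26c%3D2
        System.out.println(shortenRequestUrl("localhost:8080", "https://example.com/a?b=1&c=2"));
    }
}
```

`URLEncoder` applies form-style encoding (a space becomes `+`), which suits query parameter values like this one.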
Now let’s try to access the shortURL:
And it works! We were redirected to the original long URL.
OK, this was our first implementation, but as with real services, the requirements can change. Let’s analyze the new requirements and discuss how to improve the performance.
Change in the requirements and second version planning
Our first implementation works. However, some requirements have changed:
We are using a UUID as the identifier, but it is 36 characters long (32 hexadecimal digits plus 4 hyphens), and we want the URL to be as short as possible;
We are expecting 10 million URLs to be generated per day;
We need to store the stale URLs for at least 5 years;
Our read/write ratio is expected to be 10:1;
The read performance must be as fast as possible with a small infrastructure. Ideally, the whole response time, from entering the URL in the browser to being redirected to the original long URL, should be less than 100ms.
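To get a feel for what these numbers imply, here is a back-of-the-envelope calculation in Java. It is only a rough sketch under stated assumptions: 365-day years, traffic spread evenly over the day, and integer division that rounds the per-second rates down. The class name `CapacityEstimate` is made up for this example.

```java
class CapacityEstimate {
    static final long URLS_PER_DAY = 10_000_000L; // from the requirements
    static final int RETENTION_YEARS = 5;         // from the requirements
    static final int READS_PER_WRITE = 10;        // 10:1 read/write ratio
    static final long SECONDS_PER_DAY = 86_400L;

    // Total URLs to keep over the retention period (assumes 365-day years).
    static long totalUrls() {
        return URLS_PER_DAY * 365 * RETENTION_YEARS;
    }

    // Average write rate, assuming traffic is spread evenly over the day.
    static long writesPerSecond() {
        return URLS_PER_DAY / SECONDS_PER_DAY;
    }

    // Average read rate derived from the read/write ratio.
    static long readsPerSecond() {
        return URLS_PER_DAY * READS_PER_WRITE / SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        System.out.println("URLs stored over 5 years: " + totalUrls());   // 18250000000
        System.out.println("Average writes/second: " + writesPerSecond()); // 115
        System.out.println("Average reads/second: " + readsPerSecond());   // 1157
    }
}
```

So the service has to hold roughly 18 billion URLs and serve on the order of a thousand redirects per second on average, with peaks likely higher than these averages.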
We have a lot of work to do, and since this first part is already long enough, let’s implement those changes in the second part.
Wrapping Up
In this article, we defined the solution scope, designed the high-level view and the API contract, and developed the first version of a URL shortening service.
In the second part, we’ll implement the new requirements, improve the performance, and discuss how to prepare the service for high-scalability scenarios. Stay tuned!