Case Study: Weaverbirds

At the end of last year I was involved in the development of an application focused on building custom stories that explain difficult situations to children, such as parents divorcing or the passing of a relative. My role was predominantly backend and infrastructure focused, with some light involvement in frontend tasks, including the development of a bridging module that provided a layer for the client-side application to interact with the API I was building. It was one of the more interesting and enjoyable projects I've worked on in the last few years, since it presented a challenge I hadn't dealt with before in my career: compiling books designed by the users of the application.

Scoping

The first part of my involvement was helping to scope the project from a technical perspective and decide on the ideal infrastructure, tools, frameworks and general technology that would be involved. Naturally my recommendations leaned heavily on my professional experience, since that gave the project the best chance of a successful result. Fortunately the frontend developer who introduced me to the project was an ex-colleague, so we were also aligned on how we approached the frontend.

There were several large components to the overall project that needed to be looked at individually but that also needed to cross-communicate and behave like a single unit of work. The major parts were:

  • The main website, for users to learn about the product, find content that acts as a companion to it (e.g. support material for parents dealing with a tough situation that may affect their children) and see general company information, contact details, etc.
  • The backend that would handle all of the billing, data storage, compilation of the custom user stories, administrative functionality, asset handling and so on.
  • The frontend of the platform itself, where users go to create their account, design their characters and build their stories, place their order and review past activity.

Somewhere in the mix there also needed to be an area where customers, orders, transactions, user artefacts (characters, books) and so on could be reviewed by the client.

This project was the foundation of a new business, so it needed to be approached in a way that would enable many iterations of development over a long period and allow it to scale properly if successful. I was tremendously diligent in the choices I made about technology and approach, building a solid foundation rather than an MVP.

Infrastructure

Since DigitalOcean was the provider I personally had the most experience with, we went with them for all of the infrastructure requirements that came out of the scoping phase. The products we needed to leverage were:

  • A single Droplet (virtual machine) running Docker on Ubuntu. This is where the website, backend application and frontend application all ended up, as well as some peripheral services that assist with diagnostics, package management, etc.
  • A Spaces instance, to store all of the files generated through users' interaction with the application, i.e. their character assets and compiled stories. Keeping these runtime artefacts separate from the application itself (rather than on the application's own disk, as a more traditional setup might have it) is essential long-term: it allows the application to be scaled out across many instances behind a load balancer, keeps capacity increases cheap if the volumes grow, and gives all environments a central location for files, which makes iterating on the project over a long period significantly easier. A short sketch of how the application can talk to Spaces follows this list.
  • A managed Postgres instance, to store all of the structured data generated and used by the backend application and the website (since it is built on a CMS). Postgres is my preferred relational database since it offers the best mix of data-integrity features, proper JSONB data types, and general maturity and community adoption.
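Spaces is S3-compatible, so one common way to talk to it from Node is through the AWS SDK. The snippet below is a minimal sketch under that assumption rather than the project's actual code; the endpoint, bucket name and environment variables are illustrative.

```typescript
// Minimal sketch: a DigitalOcean Spaces bucket accessed through the S3-compatible
// AWS SDK v3 client. Endpoint, bucket and key layout are illustrative assumptions.
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const spaces = new S3Client({
  region: "us-east-1",                              // Spaces ignores the region, but the SDK requires one
  endpoint: "https://ams3.digitaloceanspaces.com",  // hypothetical Spaces region endpoint
  credentials: {
    accessKeyId: process.env.SPACES_KEY ?? "",
    secretAccessKey: process.env.SPACES_SECRET ?? "",
  },
});

// Store a generated artefact (e.g. a compiled story PDF) under a per-environment prefix,
// so a single bucket can serve every environment.
export async function storeArtefact(key: string, body: Buffer, contentType: string): Promise<void> {
  await spaces.send(
    new PutObjectCommand({
      Bucket: "weaverbirds-assets",                 // hypothetical bucket name
      Key: `${process.env.APP_ENV ?? "dev"}/${key}`,
      Body: body,
      ContentType: contentType,
    }),
  );
}
```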

CI & Deployment

Since I was responsible for the infrastructure, I was also responsible for deciding how the application would be deployed to it. The Git repositories for all of the code are hosted on GitLab, where I make use of their awesome CI/CD pipelines to build and deploy each of the separate application pieces. For the website, backend and frontend, I build Docker images and push them to GitLab's image registry alongside the code. Packages such as the bridging module are published to our Verdaccio instance when new tags are created.

Software, Libraries & Frameworks

Website

We chose Craft CMS for the website itself, again based on where my experience lay when it came to content-management systems. Using a CMS was essential for this part of the project, since it needed to be easily updatable by the client in several areas (support, blog), and investing time into a bespoke solution for something this typical would have been a huge waste of money for the client and energy for myself. I have had a lot of good experience with Craft over the last 5+ years, choosing it exclusively over alternatives such as the substantially more popular WordPress, which I deliberately steer clear of to avoid tainting my skillset.

No special CSS or JavaScript build pipelines or preprocessing were used for the website; we opted for a vanilla approach to keep things simple and get it off the ground quickly, as this was the first part due to go live in order to establish a presence and build some momentum.

Platform Backend

The backend was the most complex element since it abstracts all of the third-party integration, data storage, asset generation and other heavy lifting of the product into a single API used by the other parts of the system. This was the focus of my involvement and fully my responsibility as there were no other backend developers involved in the project.

I chose to build the backend as a Node application, with TypeScript as the language and Nest as the framework. Though I have many years of experience with PHP, which would likely be the language of choice for this style of project for a typical backend developer (using either Symfony or even Laravel), my last two years of experience are almost exclusively with TypeScript, which I have grown to love and trust far more than PHP, especially for applications this complex. My production experience with Nest spans several significant projects, both within the bounds of my day-to-day job and in other freelance work. It is a great framework that aligns very well with what I expect from something built on a language whose typing features can be taken to extremes, as well as with my personal coding style and approach to building applications.

For database interaction I leveraged TypeORM, another library that I have a lot of experience and success with. Nest provides a first-party module that connects to TypeORM in a standardised way, which is a good demonstration of how well the two work together. TypeORM migrations were used liberally during development and updates to make sure the database can quickly be brought up to its most current state, which is very valuable for ongoing development.
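To illustrate the pairing, here is a minimal sketch of a Nest module backed by a TypeORM entity and repository. The entity, its fields and the service are illustrative stand-ins rather than the project's real schema.

```typescript
// Minimal sketch of the Nest + TypeORM pairing: an entity, a repository-backed service
// and the module wiring them together. Names and fields are illustrative.
import { Injectable, Module } from "@nestjs/common";
import { InjectRepository, TypeOrmModule } from "@nestjs/typeorm";
import { Column, Entity, PrimaryGeneratedColumn, Repository } from "typeorm";

@Entity()
export class Story {
  @PrimaryGeneratedColumn("uuid")
  id!: string;

  @Column()
  title!: string;

  // Postgres JSONB is a natural fit for the user-supplied configuration answers.
  @Column({ type: "jsonb" })
  configuration!: Record<string, unknown>;
}

@Injectable()
export class StoryService {
  constructor(@InjectRepository(Story) private readonly stories: Repository<Story>) {}

  findOne(id: string): Promise<Story | null> {
    return this.stories.findOneBy({ id });
  }
}

@Module({
  imports: [TypeOrmModule.forFeature([Story])],  // TypeOrmModule.forRoot(...) is configured once at the app level
  providers: [StoryService],
})
export class StoryModule {}
```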

For PDF composition, which is ultimately what the application is responsible for, I used pdfkit. I have experience with other options such as html-pdf, but I have had trouble with their text rendering and with getting elements on the page to behave the way I want. PDFKit is purer (more explicit) in the way you attach images, text and other elements to the page and gives much better results overall, but there was much more work involved in designing a data model for what an abstracted template for a book looks like, and in converting that into a properly organised series of pages, and elements on those pages, to make a story. The results were definitely worth it and the PDFs turned out really well, both digitally and as physically printed copies.
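To give a feel for that explicitness, below is a rough sketch of laying out a single page with PDFKit and collecting the output into a buffer. The page size, font, coordinates and inputs are made up for illustration; in the real system the template model drives these values.

```typescript
// Rough sketch of PDFKit's explicit, element-by-element style: images and text are
// placed at absolute positions, and the document stream is collected into a Buffer
// (which is convenient for uploading somewhere like Spaces afterwards).
import PDFDocument from "pdfkit";

export function renderSamplePage(background: Buffer, character: Buffer, text: string): Promise<Buffer> {
  return new Promise((resolve, reject) => {
    const doc = new PDFDocument({ size: [595, 595], margin: 0 });
    const chunks: Buffer[] = [];

    doc.on("data", (chunk: Buffer) => chunks.push(chunk));
    doc.on("end", () => resolve(Buffer.concat(chunks)));
    doc.on("error", reject);

    doc.image(background, 0, 0, { width: 595, height: 595 });  // full-bleed page background
    doc.image(character, 320, 180, { width: 220 });            // character placed from its slot definition
    doc
      .font("Helvetica")
      .fontSize(16)
      .text(text, 60, 80, { width: 260, align: "left" });      // story text within its bounding box

    doc.end();
  });
}
```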

The system integrates with third parties for payment processing, CRM subscriptions and order fulfilment (printing the physical copies of stories, managing the shipment to the customers).

Overall, this project contains some of the best code I have written in terms of maintainability and scalability. It was also one of the most enjoyable projects I've had in a while, thanks to a good balance of complexity, challenges faced and interesting problems to solve.

Platform Frontend

Since the frontend developer and I had worked together in the past, and that work was built on Vue, that is the framework we selected for the frontend of this project. Although I typically reserve Vue (over React) for cases where its progressive nature is most relevant (e.g. typical websites where most of the markup is generated server-side and just needs to be enriched with interactivity), it made a lot more sense here to stick with what the person doing most of the work was most comfortable with. The main curve-ball I introduced was the use of TypeScript, to be consistent with the backend codebase and to reap the benefits of a typed language.

My main involvement on the frontend was developing a separate module that abstracted interaction with the backend API I had built. The module was written in TypeScript so that the frontend developer could simply skim the heavily commented type definitions to see which endpoints were available, exactly what they returned and what inputs they expected. This reduced the overhead of communicating how the API worked throughout the development cycle, since I just kept the module in line with what was last deployed. I set up a private Verdaccio instance to publish the module to, expecting that there would be additional isolated components needing somewhere to live in the future.
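To give an idea of its shape, the sketch below shows the kind of surface the module exposed: documented type definitions plus a thin client wrapping the HTTP calls. The endpoint paths, field names and class name are illustrative rather than the real API.

```typescript
// Cut-down sketch of the bridging module: documented types plus a small client so the
// frontend never has to guess what an endpoint accepts or returns.

/** A story as returned by the backend once it has been scaffolded. */
export interface Story {
  id: string;
  title: string;
  /** Editable page text, keyed by page number. */
  pages: Record<number, string>;
}

/** Input required to scaffold a new story from a theme. */
export interface CreateStoryInput {
  themeId: string;
  /** Answers to the theme's configuration items, keyed by item id. */
  configuration: Record<string, string>;
}

export class StoryApiClient {
  constructor(private readonly baseUrl: string, private readonly token: string) {}

  /** POST /stories — scaffold a new story and return the editable first pass. */
  createStory(input: CreateStoryInput): Promise<Story> {
    return this.request("POST", "/stories", input);
  }

  /** GET /stories/:id — fetch a single story owned by the authenticated user. */
  getStory(id: string): Promise<Story> {
    return this.request("GET", `/stories/${id}`);
  }

  private async request<T>(method: string, path: string, body?: unknown): Promise<T> {
    const response = await fetch(`${this.baseUrl}${path}`, {
      method,
      headers: { "Content-Type": "application/json", Authorization: `Bearer ${this.token}` },
      body: body === undefined ? undefined : JSON.stringify(body),
    });
    if (!response.ok) throw new Error(`Request failed with status ${response.status}`);
    return (await response.json()) as T;
  }
}
```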

Data Administration

I leveraged the Craft dashboard to display customer and order data via a custom module. The module integrates directly with the backend's administrative APIs and renders the output in Craft's Twig templates, which extend the dashboard templates. This worked pretty well: it keeps all of the content and customer management in one place, which is nicely streamlined, and it saved a lot of the time we would otherwise have invested in building a completely separate UI. With that said, it is not a great direction to continue in long-term, and a dedicated UI would eventually make more sense.

Altogether in a simplified form it looks like this:

Simplified architecture diagram

Diving Deeper: Building Stories

Since the ultimate output of the application is stories that users have created, and this was the most interesting part of the project, I'll go into more detail about how it was approached and implemented. The objective was to have a system where we could define what the skeleton of a book looks like, what belongs in it, where those elements are positioned and what user-supplied data can be injected into the pages.

Story Abstraction

The first part of the task was to define an abstraction of what a story can look like, which was made up of the following components:

  • The top-level definition of a story; its name, description, thumbnail asset and other overarching metadata.
  • Themes that belong to the root definition. The themes determine the configuration options presented to the user, the pages that will be a part of the story and the contents of those pages.
  • Theme configuration items, which are presented to the user for data collection and can be injected into the initial scaffolding of a story or make alterations to the elements that appear on the pages. Configuration items are associated with a collection type (e.g. free text, multiple choice) and the data is obviously persisted as part of the story that a user creates.
  • Pages, which provide their position within a template (i.e. the page number), the elements that should appear on them and where those elements should appear. Pages can also contain activities, which are selected from a group of potential options during the scaffolding of the story.
  • Text templates, which contain Nunjucks code that is compiled with the user input for theme configuration items, attributes of the selected characters and other metadata from various layers of the definition and template selection to produce the initial text shown during the story edit phase of the app.
  • Images, which are self-explanatory.
  • Character slots, which provide the association between the characters created by the user at the onset of building their story, the pages they should appear on, and the position and size of the characters within those pages.

This is a heavily simplified visual representation of the above:

Simplified story abstraction diagram

Having this abstracted structure to represent a theme provided the source data for the scaffolding of a story, which is the first phase of customisation presented to the user.
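Expressed as TypeScript, the abstraction looks roughly like the sketch below. The field names are illustrative, and the real data model (which lives in TypeORM entities rather than plain interfaces) carries more detail, but the relationships are the same.

```typescript
// Hand-wavy TypeScript rendering of the story/theme abstraction described above.
interface StoryDefinition {
  name: string;
  description: string;
  thumbnailAssetId: string;
  themes: Theme[];
}

interface Theme {
  name: string;
  configurationItems: ConfigurationItem[];
  pages: PageDefinition[];
}

interface ConfigurationItem {
  key: string;
  prompt: string;
  collectionType: "free-text" | "multiple-choice";
  options?: string[];                 // only used for multiple-choice items
}

interface PageDefinition {
  position: number;                   // the page number within the template
  textTemplates: TextTemplate[];
  images: PlacedImage[];
  characterSlots: CharacterSlot[];
  activityGroup?: string;             // activities are picked from a group during scaffolding
}

interface TextTemplate {
  template: string;                   // Nunjucks source, compiled with user input at scaffold time
  box: Box;
}

interface PlacedImage {
  assetId: string;
  box: Box;
}

interface CharacterSlot {
  slot: string;                       // which user-created character fills this position
  box: Box;
}

interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
  depth: number;                      // render order on the page
}
```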

Story Scaffolding

Once the user selects a story template, picks a child theme, inputs their answers to the configuration options and creates the characters that the theme requires, a first pass of that story is scaffolded for further editing. This first iteration is actually a fully-fledged story that could be purchased immediately without additional input from the user; however, we give them the option to fine-tune the text on each of the pages.

Scaffolded books are stored with references to the theme used to generate them and all of the data entered by the user. A completely new set of page data is generated by compiling the Nunjucks-based text templates found across the pages that belong to the selected theme; from that point the pages are fully editable by the user without affecting the underlying templates, which are only used to produce the initial story.
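The compilation step itself is small; the sketch below shows the general idea using Nunjucks' renderString, with an illustrative context shape rather than the real template data.

```typescript
// Sketch of the scaffolding step: each page's Nunjucks text template is rendered once with
// the user's configuration answers and character attributes, and the output is stored as
// plain, editable page text. The context shape here is an illustrative assumption.
import nunjucks from "nunjucks";

interface ScaffoldContext {
  configuration: Record<string, string>;         // answers keyed by configuration item
  characters: Record<string, { name: string }>;  // user-created characters keyed by slot
}

export function scaffoldPageText(templates: string[], context: ScaffoldContext): string[] {
  // Once rendered, these strings belong to the user's story; later edits never touch
  // the underlying theme templates.
  return templates.map((template) => nunjucks.renderString(template, context));
}

// A template such as "{{ characters.child.name }} felt {{ configuration.feeling }} that day."
// becomes concrete, editable text at scaffold time.
```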

PDF Building

PDF compilation targets a scaffolded story and is able to extract all of the associated template and user data during the rendering phase. The rendering phase goes through the following process:

  1. Load all user and theme data associated with the target story.
  2. Load all of the asset data (images, characters, etc.) from the Spaces instance into buffers so they can be written to the PDF (a short sketch of this step follows the list).
  3. Iterate over each page in the target story and execute the sequence described below.
  4. Compile the PDF and upload it to our Spaces instance.
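Step 2 is worth a quick sketch: because Spaces is S3-compatible, each referenced asset can be pulled down into an in-memory Buffer that PDFKit can embed directly. As with the earlier Spaces snippet, the client configuration, bucket and keys are illustrative assumptions.

```typescript
// Sketch of loading asset data out of Spaces into Buffers ahead of PDF rendering.
import { GetObjectCommand, S3Client } from "@aws-sdk/client-s3";

const spaces = new S3Client({
  region: "us-east-1",
  endpoint: "https://ams3.digitaloceanspaces.com",  // hypothetical Spaces endpoint
  credentials: {
    accessKeyId: process.env.SPACES_KEY ?? "",
    secretAccessKey: process.env.SPACES_SECRET ?? "",
  },
});

export async function loadAssetBuffers(keys: string[]): Promise<Map<string, Buffer>> {
  const buffers = new Map<string, Buffer>();
  for (const key of keys) {
    const object = await spaces.send(new GetObjectCommand({ Bucket: "weaverbirds-assets", Key: key }));
    // transformToByteArray() collects the response stream into memory.
    buffers.set(key, Buffer.from(await object.Body!.transformToByteArray()));
  }
  return buffers;
}
```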

For each of the pages described in step 3 above, we then:

  1. Build a list of render jobs - instances that define the type of rendering that needs to be done, the position and size of the element that will be rendered, etc.
  2. Sort the jobs on their depth, an attribute attached to the elements that can appear on pages.
  3. Iterate over the sorted render jobs and execute their render functionality, which handles actually attaching the relevant item to the page with all of its expected attributes (a code sketch of this pass follows).
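Put together, the per-page pass looks roughly like the sketch below. The element shape and render logic are simplified stand-ins for the real types, and the PDFKit.PDFDocument type assumes @types/pdfkit is available.

```typescript
// Simplified sketch of the per-page render pass: each element becomes a render job,
// jobs are sorted by depth, then each job draws itself onto the pdfkit document.
interface RenderJob {
  depth: number;                          // lower depths draw first, so they sit behind later elements
  render(doc: PDFKit.PDFDocument): void;  // PDFKit namespace comes from @types/pdfkit
}

interface PageElement {
  kind: "image" | "text";
  depth: number;
  x: number;
  y: number;
  width: number;
  assetKey?: string;                      // set for image and character elements
  text?: string;                          // set for text elements
}

function buildRenderJobs(elements: PageElement[], assets: Map<string, Buffer>): RenderJob[] {
  return elements.map((element) => ({
    depth: element.depth,
    render(doc) {
      if (element.kind === "image" && element.assetKey) {
        doc.image(assets.get(element.assetKey)!, element.x, element.y, { width: element.width });
      } else if (element.kind === "text" && element.text) {
        doc.text(element.text, element.x, element.y, { width: element.width });
      }
    },
  }));
}

export function renderPage(doc: PDFKit.PDFDocument, elements: PageElement[], assets: Map<string, Buffer>): void {
  doc.addPage();
  buildRenderJobs(elements, assets)
    .sort((a, b) => a.depth - b.depth)    // respect depth ordering before drawing anything
    .forEach((job) => job.render(doc));
}
```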

Outside of this, the system overall is pretty typical (APIs for CRUD of various entities, user registration and authentication, etc).

Things I Would Change

Although the system is very stable and reliable, there are some parts with room for improvement (in their nature, not the code behind them specifically). The first thing I would attack, given the budget, is moving PDF generation to a serverless platform such as OpenFaaS that can scale horizontally for concurrency. Currently, PDF generation for orders containing multiple stories is sequential, and each PDF can take a significant amount of time to produce. Orders with a large number of PDFs to generate will hit the response timeout limit set by Cloudflare, which we use in front of the application. Although the books will still be generated, payment taken and an invoice eventually sent to the customer, the user experience at checkout is not optimal. The current setup will also not scale for large volumes of overlapping orders, since resources are limited and quite low at this stage.

A secondary improvement on my mind is to create additional virtual machines to host the parts of the application that are not primary and user-facing, such as the Verdaccio instance, the staging version of the application and so on. Obviously at the onset of a new project with no traction yet this is not necessary and just amplifies running costs, but it should be one of the first things to change once visitor volumes start increasing, especially since it is trivial to do and has immediate benefits.

Summary

Weaverbirds was a very enjoyable and successful project. You can check out the website and platform at the following links:
