Waseem Jan Syed Waseem Jan

From Chaos to Cohesion: Migrating Multiple Repositories to a Monolithic Architecture

At Tasq.io, I led a challenging yet rewarding initiative: transitioning several loosely connected repositories into a cohesive monolithic repository. This task, however, wasn’t just about merging code; it also involved addressing fundamental issues within each repo, from missing unit tests to inconsistent code quality. Through a structured migration process and the introduction of automated tools like Black, Ruff, and Sourcery, I transformed our development workflow into one that’s maintainable.

Why Move to a Monolith?

Managing multiple repositories can complicate code management leading to issues like:

  • Code Duplication: Shared logic is often duplicated across projects.
  • Dependency Management: Each repository requires separate dependency tracking.
  • Deployment Inefficiencies: CI/CD pipelines become challenging to maintain across repos.

Moving to a monolithic repo using Polylith’s modular approach presented a solution that allowed us to consolidate shared code, simplify dependency handling, and centralize deployment.

The Initial Challenges

The transition wasn’t without hurdles. The codebases lacked essential elements like:

  1. Unit Tests: A lack of unit tests made it challenging to ensure that functionality remained intact during the migration.
  2. Code Quality: Without standardized linting or formatting, code readability and maintainability were inconsistent.
  3. Code Duplication: Shared functions and utilities were duplicated across repositories, making updates complex and error-prone.

These challenges meant that, in addition to migrating code, I had to improve code quality and enforce new standards across the monolith.

Using Polylith’s Approach to Structure the Monolith

The Polylith architectural style helped us structure the monolith in a way that balanced modularity with code reuse:

  • Components: Independent, reusable modules containing business logic.
  • Bases: Containers that allow components to interact and function as coherent applications.
  • Projects: Representing top-level applications, these projects leveraged shared components without redundant code.

Building with Pants for Scalability

Using Pants build enabled us to manage the growing codebase efficiently:

  • Selective Builds: Pants allows building only the changed code, significantly reducing build times.
  • Dependency Isolation: With Pants, we could isolate dependencies per module, avoiding version conflicts and unnecessary bloat.
  • Optimized Testing: By testing only impacted modules, Pants helped streamline the CI process, saving time and resources.

Improving Code Quality with Automated Tools

After the migration, I introduced several automated tools to improve code quality, enforce consistency, and reduce technical debt:

  • Black: An automatic code formatter that enforced a unified style across the monolith, making the codebase cleaner and more readable.
  • Ruff: A fast, Python linter that ensured our code adhered to best practices, catching potential bugs and enforcing style consistency.
  • Sourcery: A tool to detect and remove code smells, simplify logic, and provide intelligent code suggestions, Sourcery helped maintain cleaner and more efficient code.

With these tools, I configured a GitHub workflow for automated linting and formatting, ensuring code quality checks were part of our CI/CD pipeline.

Incrementally Adding Unit Tests

The lack of unit tests in the initial repos posed a risk to functionality during migration. Here’s how I approached testing:

  • Module-by-Module Testing: Before migrating each repository, I added unit tests for its core functions, focusing on critical components first.
  • Automation: After migration, unit tests were integrated into the GitHub workflow, enabling automated testing for each pull request.
  • Code Coverage Tracking: I set up code coverage tracking to measure test completeness and ensure no gaps.

The transition to a monolith not only streamlined our development process but also brought improvements in code quality. Important conclusions:

  • Reduced Complexity: A single repository reduced the complexity of working across multiple repos.
  • Enhanced Code Quality: Automated tools like Black, Ruff, and Sourcery maintained consistent, high-quality code.
  • Reduced Redundancy: Consolidating duplicated code reduced technical debt and improved maintainability.
  • Optimized Build and Testing: With Pants, our CI/CD process became more efficient, reducing time spent on builds and tests.

It was an opportunity to improve some processes and establish a robust framework for sustainable growth. By combining Polylith’s modular approach, Pants build, and automated quality tools, I transformed our codebase into a maintainable monolith. This took a significant amount of time and required many changes, but the results were well worth the effort.