From Chaos to Cohesion: Migrating Multiple Repositories to a Monolithic Architecture
At Tasq.io, I led a challenging yet rewarding initiative: transitioning several loosely connected repositories into a cohesive monolithic repository. This task, however, wasn’t just about merging code; it also involved addressing fundamental issues within each repo, from missing unit tests to inconsistent code quality. Through a structured migration process and the introduction of automated tools like Black, Ruff, and Sourcery, I transformed our development workflow into one that’s maintainable.
Why Move to a Monolith?
Spreading related code across multiple repositories complicates day-to-day development and leads to issues like:
- Code Duplication: Shared logic is often duplicated across projects.
- Dependency Management: Each repository requires separate dependency tracking.
- Deployment Inefficiencies: CI/CD pipelines become challenging to maintain across repos.
Moving to a monolithic repo using Polylith’s modular approach presented a solution that allowed us to consolidate shared code, simplify dependency handling, and centralize deployment.
The Initial Challenges
The transition wasn’t without hurdles. The codebases lacked essential elements like:
- Unit Tests: A lack of unit tests made it challenging to ensure that functionality remained intact during the migration.
- Code Quality: Without standardized linting or formatting, code readability and maintainability were inconsistent.
- Code Duplication: Shared functions and utilities were duplicated across repositories, making updates complex and error-prone.
These challenges meant that, in addition to migrating code, I had to improve code quality and enforce new standards across the monolith.
Using Polylith’s Approach to Structure the Monolith
The Polylith architectural style helped us structure the monolith in a way that balanced modularity with code reuse:
- Components: Independent, reusable modules containing business logic.
- Bases: Containers that allow components to interact and function as coherent applications.
- Projects: Representing top-level applications, these projects leveraged shared components without redundant code.
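To make the three building blocks concrete, here is a minimal single-file sketch. In a real Polylith workspace each piece would live in its own directory rather than in one file, and the signup scenario along with the names `normalize_email`, `handle_signup`, and `SIGNUP_SERVICE` are hypothetical, chosen only for illustration.

```python
# --- component: reusable business logic with no knowledge of its callers ---
def normalize_email(raw: str) -> str:
    """Trim surrounding whitespace and lowercase an email address."""
    return raw.strip().lower()


# --- base: thin entry point that wires components into an application ---
def handle_signup(payload: dict) -> dict:
    """Hypothetical request handler that delegates to the component."""
    email = normalize_email(payload["email"])
    return {"status": "ok", "email": email}


# --- project: a deployable unit is a selection of bases and components ---
# In practice this selection is declared in project configuration, not code;
# the dict below only illustrates the idea.
SIGNUP_SERVICE = {
    "bases": [handle_signup],
    "components": [normalize_email],
}
```

The point of the split is that `normalize_email` can be reused by any other base without copying it, which is exactly the duplication the separate repositories suffered from.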
Building with Pants for Scalability
The Pants build system enabled us to manage the growing codebase efficiently:
- Selective Builds: Pants allows building only the changed code, significantly reducing build times.
- Dependency Isolation: With Pants, we could isolate dependencies per module, avoiding version conflicts and unnecessary bloat.
- Optimized Testing: By testing only impacted modules, Pants helped streamline the CI process, saving time and resources.
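The idea behind selective builds and tests can be sketched in a few lines of Python. This is not how Pants is implemented; it is a simplified illustration of the underlying technique: given a dependency graph and a set of changed modules, walk the reverse dependencies to find everything that needs rebuilding or retesting. The module names and the `DEPENDS_ON` graph are made up for the example. Pants derives the real graph itself and exposes change detection through options such as `--changed-since`.

```python
from collections import deque

# Hypothetical dependency graph: module -> modules it depends on.
DEPENDS_ON = {
    "components/auth": [],
    "components/billing": ["components/auth"],
    "components/reports": [],
    "bases/api": ["components/auth", "components/billing"],
    "bases/worker": ["components/billing"],
}


def impacted_modules(changed, depends_on):
    """Return the changed modules plus every module that depends on them."""
    # Invert the graph so we can look up "who depends on X".
    dependents = {module: set() for module in depends_on}
    for module, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    # Breadth-first walk over the reverse dependencies.
    impacted = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted
```

With this graph, a change to `components/billing` impacts `bases/api` and `bases/worker` but leaves `components/auth` and `components/reports` untouched, so their tests can be skipped.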
Improving Code Quality with Automated Tools
After the migration, I introduced several automated tools to improve code quality, enforce consistency, and reduce technical debt:
- Black: An automatic code formatter that enforced a unified style across the monolith, making the codebase cleaner and more readable.
- Ruff: A fast Python linter that ensured our code adhered to best practices, catching potential bugs and enforcing style consistency.
- Sourcery: A tool that detects code smells, simplifies logic, and provides code suggestions, which helped us keep the code cleaner and more efficient.
With these tools, I configured a GitHub workflow for automated linting and formatting, ensuring code quality checks were part of our CI/CD pipeline.
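As a rough illustration of what such a CI step does, here is a small Python sketch of a check runner. It is not our actual workflow file. It assumes Black and Ruff are installed and invoked as `black --check .` and `ruff check .`, and the names `CHECKS`, `run_command`, and `run_checks` are hypothetical. Sourcery is left out because its invocation depends on how it is set up.

```python
import subprocess
from typing import Callable, Sequence

# Assumed commands: both run in "report only" mode and exit non-zero
# when they find problems, which is what a CI gate needs.
CHECKS = {
    "black": ["black", "--check", "."],
    "ruff": ["ruff", "check", "."],
}


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(cmd).returncode


def run_checks(
    checks: dict,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> list:
    """Run every check and return the names of the ones that failed."""
    failed = []
    for name, cmd in checks.items():
        # Run all checks rather than stopping at the first failure,
        # so one CI run reports every problem.
        if runner(cmd) != 0:
            failed.append(name)
    return failed
```

A CI job would call `run_checks(CHECKS)` and exit with a non-zero status if the returned list is not empty. The `runner` parameter exists so the logic can be tested without the tools installed.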
Incrementally Adding Unit Tests
The lack of unit tests in the initial repos posed a risk to functionality during migration. Here’s how I approached testing:
- Module-by-Module Testing: Before migrating each repository, I added unit tests for its core functions, focusing on critical components first.
- Automation: After migration, unit tests were integrated into the GitHub workflow, enabling automated testing for each pull request.
- Code Coverage Tracking: I set up code coverage tracking to measure how much of the code the tests exercised and to reveal untested areas.
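The kind of test I mean can be sketched as follows. The function `apply_discount` is a hypothetical stand-in for a repository's core logic, not code from our system, and the tests are written as plain `test_` functions on the assumption that a runner such as pytest collects them.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical core function standing in for real business logic."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_apply_discount_typical_case():
    # Pin down the current behaviour before the code is moved.
    assert apply_discount(200.0, 25) == 150.0


def test_apply_discount_rejects_invalid_percent():
    # Error handling is part of the behaviour worth preserving.
    try:
        apply_discount(100.0, 120)
    except ValueError:
        return
    raise AssertionError("expected ValueError for percent > 100")
```

Tests like these describe what the code does today, so if the same tests pass after the module lands in the monolith, the migration has not changed its behavior.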
The transition to a monolith not only streamlined our development process but also improved code quality. The key takeaways:
- Reduced Complexity: A single repository reduced the complexity of working across multiple repos.
- Enhanced Code Quality: Automated tools like Black, Ruff, and Sourcery maintained consistent, high-quality code.
- Reduced Redundancy: Consolidating duplicated code reduced technical debt and improved maintainability.
- Optimized Build and Testing: With Pants, our CI/CD process became more efficient, reducing time spent on builds and tests.
The migration was also an opportunity to improve our processes and establish a robust framework for sustainable growth. By combining Polylith’s modular approach, the Pants build system, and automated quality tools, I transformed our codebase into a maintainable monolith. This took a significant amount of time and required many changes, but the results were well worth the effort.