Back to a monorepo

As discussed on a call today I am putting down ideas about migrating our current 3 repositories (backend, frontend, root-model) into a single monorepo.

Reasons why we had it split:

  • Previously frontend (Flash) used to be separate so it made sense to have HTML separated, too
  • There was a fear that backend developers would be flooded with frontend commits totally unrelated to the backend
  • We used to split backend and frontend development between different people.
  • We did not have automated builds/testing on pull request level (GitHub Actions)
  • The code was not public, there were no external contributors.
  • We were using Bitbucket where pull requests were not so easy to read

It turns out that there are commits that span all three repositories. One example is the introduction of Chat:

  • A special controller was introduced in origam-source
  • The supporting data model was introduced in a chat package in origam-model
  • The frontend was, obviously, delivered in origam-html

This is obviously a big commit, but it happens from time to time. Another example is the recent introduction of frontend plugins. Smaller commits that span both backend and frontend appear regularly.

Split pull-requests cause the following issues:

  • They have to be approved simultaneously, otherwise they break the nightly build. But different reviewers might have a different pace.
  • Automated integration testing of each pull request is impossible using GitHub Actions. It is impossible to pull from 3 repositories, each on a different branch, to test a single pull request that spans multiple repositories.

This all leads me to the idea of merging all three repositories into a single one. I am not sure whether we should keep the existing three (for blame purposes) and create a 4th with a completely new structure, or reuse origam-source and reorganise it.

For me, a single repository brings more positives than negatives. I’d keep origam-source and import the others into it. I don’t see a particular need for blame yet.

Of course we should lay out the repository structure correctly, so we don’t have to change it every month.

Some reading here:

The question is how we should structure plugin development. I think plugins deserve separate repositories, as they are clearly independent modules.

So I think:

  • the main frontend/backend should be a monorepo, as the frontend depends on the backend (e.g. API definitions, expected action flow, etc.)
  • parts shared with plugins should be delivered as packages (npm, NuGet) so independent developers can just use these
  • plugins can then have independent repositories, as the dependencies will be managed by the respective package managers

This way we will partly go monorepo and partly split where it makes sense.

Automated plugin testing can work like this:

  • A new PR is opened against the monorepo
  • Frontend + backend builds and tests are executed
  • All standard plugins (maintained by us) are cloned from their master branches
  • The plugins get tested against the PR (so we are sure they don’t break)
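The steps above could be sketched as a small planner that produces the ordered commands for one PR run. This is only an illustration: the `Plugin` type, the `plugins/` target folder and the build/test commands passed in are hypothetical placeholders, not our actual scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plugin:
    name: str      # hypothetical plugin name
    repo_url: str  # hypothetical clone URL


def plan_pr_checks(plugins, core_commands, plugin_test_command):
    """Return the ordered commands (argument lists) for one pull request run.

    core_commands: frontend + backend build and test commands (assumed).
    plugin_test_command: command prefix used to test one plugin folder (assumed).
    """
    steps = [list(cmd) for cmd in core_commands]
    for plugin in plugins:
        target = f"plugins/{plugin.name}"
        # Standard plugins are always taken from their master branch.
        steps.append(
            ["git", "clone", "--depth", "1", "--branch", "master",
             plugin.repo_url, target]
        )
        # The plugin is then tested against the code of the PR.
        steps.append(list(plugin_test_command) + [target])
    return steps


if __name__ == "__main__":
    chat = Plugin("chat", "https://example.invalid/org/chat.git")
    for step in plan_pr_checks([chat], [["./build.sh"], ["./test.sh"]],
                               ["./test-plugin.sh"]):
        print(" ".join(step))
```

The point of the ordering is that plugin tests only start after the core build and tests, so a plugin failure clearly points at the PR and not at a broken build.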

I propose to create the following directory structure in the new monorepo:

  • /backend with all the current .NET sources
  • /backend-tests with the .NET test project sources
  • /docker with the contents of the current /OrigamDocker folder
  • /frontend-html with the contents of the current origam-html repo
  • /frontend-html-tests with the HTML test project sources
  • /model-root with the contents of the current origam-model repo
  • /model-tests with the contents of the current origam-demo repo
  • /build with the contents of the current origam-build repo
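To keep the layout from drifting, a check like the following could run early in the build. It is a minimal sketch that only assumes the folder names listed above; the function name and where it would be called from are made up for illustration.

```python
from pathlib import Path

# Top-level folders from the proposed monorepo layout.
EXPECTED_DIRS = (
    "backend",
    "backend-tests",
    "docker",
    "frontend-html",
    "frontend-html-tests",
    "model-root",
    "model-tests",
    "build",
)


def missing_dirs(repo_root):
    """Return the expected top-level folders that are absent from a checkout."""
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        raise SystemExit("Missing folders: " + ", ".join(missing))
    print("Repository layout looks complete.")
```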

This way the repository is self-contained and any branch will contain everything needed to build that branch.

What about origam-html-chat? It is included with our standard build.

Chat is an optional module so I would leave it on the plug-in level – separate. I described how we should handle plugin testing in the previous posts.

My tests with the monorepo look successful. The history is preserved and new commits work as well.
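For reference, one common way to import a repository into a subfolder while keeping its commits is `git subtree add`. The sketch below only generates the commands; it is not necessarily how the tests above were done, and the base URL is a placeholder.

```python
def subtree_import_commands(base_url, prefix_by_repo, branch="master"):
    """Build one `git subtree add` command per repository to import.

    base_url: hypothetical location of the old repositories.
    prefix_by_repo: old repository name -> target folder in the monorepo.
    """
    commands = []
    for repo, prefix in prefix_by_repo.items():
        commands.append(
            ["git", "subtree", "add", f"--prefix={prefix}",
             f"{base_url}/{repo}.git", branch]
        )
    return commands


if __name__ == "__main__":
    # Folder names follow the proposed layout; the URL is a placeholder.
    mapping = {"origam-html": "frontend-html", "origam-model": "model-root"}
    for cmd in subtree_import_commands("https://example.invalid/org", mapping):
        print(" ".join(cmd))
```

Two caveats if we go this way: `git subtree add` expects a clean working tree and a target folder that does not exist yet, and the imported commits keep their original file paths, so blame across the move may be less convenient than inside the old repositories.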

Now let’s discuss the migration and the impacts on CI/CD.

  • It was decided to transfer only the master branch. 2021.1 will be maintained in the old repositories. As a result we’re going to have releases at two URLs. Is this OK?
  • For the sake of simplicity I suggest starting with a new tag, 2021.2-beta.3.
  • We need to review the CODEOWNERS rule on pull requests. Each of the main folders should have at least two code owners, because a code owner can’t approve their own pull request. This might be difficult for parts that are currently a one-man show (frontend-html, docker). The other option would be not to require a code owner’s approval.
  • A new build pipeline needs to be established.
  • GitHub Actions need to be reapplied.
  • The old origam-source should be renamed to origam-source-2021.1 and the monorepo should take the name origam-source.
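The code owner point can be checked mechanically. Below is a rough sketch that renders a CODEOWNERS file from a folder-to-owners mapping and reports folders with fewer than two owners. The handles (`@dev-a` and so on) are invented placeholders, and the function names are only for illustration.

```python
def understaffed_folders(owners_by_folder, min_owners=2):
    """Return folders with fewer than min_owners distinct code owners."""
    return sorted(
        folder
        for folder, owners in owners_by_folder.items()
        if len(set(owners)) < min_owners
    )


def render_codeowners(owners_by_folder):
    """Render CODEOWNERS lines, one top-level folder pattern per line."""
    lines = [
        f"/{folder}/ " + " ".join(owners)
        for folder, owners in sorted(owners_by_folder.items())
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder handles, not real accounts.
    owners = {
        "backend": ["@dev-a", "@dev-b"],
        "frontend-html": ["@dev-c"],
    }
    print(render_codeowners(owners), end="")
    print("Needs a second owner:", understaffed_folders(owners))
```

With the sample mapping, frontend-html is reported as needing a second owner, which is exactly the one-man-show case mentioned above.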

I don’t see problems with the release locations with the suggested repository naming.

So far we require a review for all repos, don’t we? I don’t see a problem with PR review rules.

As for the pipelines - shouldn’t we, after configuring GitHub Actions for integration tests, just use those for producing a release? So dumping Azure DevOps?

We need to name code owners.

Would we still need to use the pipelines for 2021.1, or would that also be moved to GitHub Actions? If not, dumping Azure DevOps completely is not possible. Anyway, I don’t mind switching to GitHub Actions.

If we use GitHub Actions, then we don’t need the build folder.

We should keep old as is.

Why not keep the code owners as they were in the separate repositories? We had code owners everywhere except for the model, but those can be the same as for the backend.

I set up code owners as in the old repositories and we can adjust them according to our needs later.

As for GitHub Actions, it looks to me like a matter of decision. So if you say do it on GitHub Actions, the only reason not to would be finding out that we can’t achieve our goals with them. I don’t have any prior experience with them, so I can’t know that now. I had a quick look at the documentation and it should be possible.

I think that we will basically need to compile and produce everything in order to do full integration tests. So why duplicate it in DevOps again?

I agree, no need for duplication. How often do you plan to run full integration tests?

On each PR. But a release should be produced only for the master branch.

And we still have one scheduled release build per day?

If all tests go through then I see no reason why not to release immediately.
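The rule discussed here (tests on every PR, a release only from master and only when everything passes) could be expressed as a small gate like this. It is a sketch, assuming the script runs inside GitHub Actions, which sets the `GITHUB_EVENT_NAME` and `GITHUB_REF` environment variables; the `TESTS_PASSED` variable is a made-up stand-in for however the test result would be handed over.

```python
import os


def should_release(event_name, ref, tests_passed,
                   release_ref="refs/heads/master"):
    """Release only for a push to the release branch with all tests green."""
    return bool(tests_passed) and event_name == "push" and ref == release_ref


if __name__ == "__main__":
    decision = should_release(
        os.environ.get("GITHUB_EVENT_NAME", ""),
        os.environ.get("GITHUB_REF", ""),
        # Hypothetical variable set by an earlier test step.
        os.environ.get("TESTS_PASSED") == "true",
    )
    print("release" if decision else "no release")
```

A pull request run never releases under this rule, because its event name is `pull_request` and its ref is not the master branch.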