
Intro

Modern illumos development is maintained via the GitHub repositories. Older Mercurial repos are somewhat maintained, but may lag behind in updates.

Now I am learning about Git and logging the steps below, to implement the "Working on several bugs at once" and "Managing multiple workspaces" approaches with the GitHub repositories (wink)

On cloning

Each git repository is complete and self-contained, including all history and branches of the project. This data is stored in the .git/objects/pack/ subdirectory; if a clone is created in the same filesystem, these files are hardlinked into the new repository to save some time and space. The checked-out source code for the currently selected branch is stored as separate files in each clone (using disk space over and over again).

Apparently, each cloned repository has one current "head" branch (master or a feature branch), to which corresponds the checked-out source code (other files in the workspace directory). It seems that for each branched workspace, a separate repo with a particular head branch should be instantiated (cloned by git... or zfs).

Terminology

For the terms and repo names below, there will be:

  • the upstream - common illumos-gate repo where all contributions end up to make up the "vanilla" OS/Net consolidation, available at https://github.com/illumos/illumos-gate
  • the origin - each developer's personal fork of the upstream illumos-gate, publicly maintained at GitHub
  • golden - a local repository (on the developer's workstation and/or build host) kept in sync with both origin and upstream. Your development changes end up here (by pushes from branch-head repositories), and from here any upstream and downstream changes are redistributed into your public and compilation repos; directory name is illumos-gate-golden
  • build - a local repository into which you pull changes from your golden one for compilation of the whole illumos-gate; by common convention, its directory is illumos-gate, though in this example it would be illumos-gate-build.
  • featureX - a workspace for changes to the sources related to a particular bug/feature branch. Apparently, each workspace is a full repository (clone) that is switched to one or another branch as its head (where the commits will land).

Action

Create a developer's personal fork of illumos-gate (the "origin")

Log in to www.github.com and go to https://github.com/illumos/illumos-gate/fork to create the personal (or corporate) fork of illumos-gate. This will be used to publish the results of your work for others (including the common upstream) to review and ultimately merge ("pull"). This can also be used as a backup and a means to disseminate your work among the several hosts you yourself might be using (e.g. a workstation at work, a laptop on the road, a dedicated build host) – not necessarily all connected to each other at the same time.

Environment variables

The code snippets below will rely on the following variables set in the environment of root and/or the unprivileged development user (illumos-dev in examples below):
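A minimal sketch of such settings, assuming a home directory under /export/home. All variable names and values here (GATE_POOL, GATE_USER, GATE_HOME, GATE_DS, GATE_DIR, GITHUB_USER) are hypothetical and only illustrate the idea:

```shell
# Hypothetical variable names and values; adjust to your system.
GATE_POOL="rpool"                                     # ZFS pool holding the workspaces
GATE_USER="illumos-dev"                               # unprivileged development user
GATE_HOME="/export/home/${GATE_USER}"                 # home directory of that user
GATE_DS="${GATE_POOL}/export/home/${GATE_USER}/code"  # parent dataset for repositories
GATE_DIR="${GATE_HOME}/code"                          # directory where that dataset is mounted
GITHUB_USER="yourlogin"                               # GitHub login owning the "origin" fork
export GATE_POOL GATE_USER GATE_HOME GATE_DS GATE_DIR GITHUB_USER
```

Placing these lines into the shell profile of both accounts keeps the two environments consistent.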

Note that the example above is "portable" in that nearly every system has an rpool, but you might have and might want to choose a different pool (e.g. bigger, faster, whatever).

Prepare the development user's home directory and working datasets

On the developer's workstation or build host, prepare the environment including the git program suite, the workaround for HTTPS with CURL from https://www.illumos.org/issues/1536 and the working dataset (if not yet created), all done as root (or equivalent via pfexec):
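A minimal sketch of the dataset part only, assuming the dataset inherits its mountpoint from its parent. The function name prepare_workspace and the example dataset name are hypothetical; installing git and the CURL workaround depend on your distribution and are not shown:

```shell
# Sketch: create the working dataset and hand it over to the development user.
# Run as root, or prefix the zfs and chown calls with pfexec.
prepare_workspace() {
    ds="$1"; user="$2"
    # -p creates missing parent datasets and succeeds if $ds already exists
    zfs create -p "$ds" || return 1
    # Ask ZFS where the dataset got mounted
    mnt=$(zfs get -H -o value mountpoint "$ds") || return 1
    chown "$user" "$mnt"
}

# Example (hypothetical names):
# prepare_workspace rpool/export/home/illumos-dev/code illumos-dev
```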

Configure the developer's Git global credentials

There are several layers of Git configuration data that can be stored in the system paths (common for all developers), in each user's settings ("global" settings of the user), and in each single repository configuration; with the most-specific version of the config being applied. While the configuration files are just blocks of text located in different paths, it is customary to maintain them with the git config command. For example, "personal" information about the user that can be added into commit metadata belongs to the user's settings:
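A minimal sketch of setting the user-level ("global") identity; the function name set_git_identity and the sample name and address are placeholders:

```shell
# Sketch: store the committer identity in the user's global Git settings.
set_git_identity() {
    git config --global user.name "$1" || return 1
    git config --global user.email "$2"
}

# Example (placeholder values):
# set_git_identity "Jane Developer" "jane@example.org"
# Review the resulting user-level settings with:
# git config --global --list
```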

Create the local repository ("golden")

Then as the development/build user (i.e. illumos-dev above) go get the source code (the target directory should be empty; note that ZFS/CIFS autosharing might create an .$EXTEND subdirectory causing problems, and that it may be safely removed):
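A minimal sketch of this step, assuming the fork kept the default repository name illumos-gate on GitHub. The function name clone_golden and the example login and path are hypothetical:

```shell
# Sketch: clone your public fork ("origin") into the golden directory.
clone_golden() {
    github_user="$1"; target="$2"
    # Remove the autosharing leftover so that the target directory is empty
    if [ -d "$target/.\$EXTEND" ]; then
        rm -rf "$target/.\$EXTEND"
    fi
    git clone "https://github.com/${github_user}/illumos-gate.git" "$target"
}

# Example (hypothetical login and path):
# clone_golden yourlogin /export/home/illumos-dev/code/illumos-gate-golden
```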

Wait a bit...

Add the upstream link:
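A minimal sketch, assuming the remote is named upstream as in the terminology above; the function name add_upstream is hypothetical:

```shell
# Sketch: register the common illumos-gate repository as the "upstream" remote.
add_upstream() {
    ( cd "$1" || exit 1
      git remote add upstream https://github.com/illumos/illumos-gate.git || exit 1
      # List the configured remotes; both origin and upstream should show up
      git remote -v )
}

# Example (hypothetical path):
# add_upstream /export/home/illumos-dev/code/illumos-gate-golden
```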

This should leave you with your local golden repository cloned from your public "origin" repo and related to the common "upstream" repo, which makes it easy to pull in changes from both and push changes to your public "origin". Note that occasional pulling from your public "origin" might be useful for example if it is used to synchronize the code on your different hosts like a laptop on the road and the build/test server farm at work.

This local golden repository would serve as the base template for your other local work (on particular features or bugs), and when those quests are completed and tested – it will be used to pull in the resulting changesets and push them to your public "origin".

Add a script to pull in code updates

In fact, let's embed the script now to automate git pulls from your current repository's configured remote peers:
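A minimal sketch of what such an Update-pull.sh might look like, assuming it is run from inside the repository and that a GATES variable lists the remotes to pull from. The function name update_pull is hypothetical:

```shell
# Update-pull.sh (sketch): pull the master head from each configured remote.
# Uncomment one GATES line to restrict the list; by default all remotes
# configured in the current repository are used.
# GATES="upstream"
# GATES="origin upstream"

update_pull() {
    gates="${GATES:-$(git remote)}"
    for gate in $gates; do
        echo "=== Pulling master from '$gate'"
        git pull "$gate" master || return 1
    done
}

# When saved as a script, finish the file with the call:
# update_pull
```

Stopping at the first failed pull leaves the workspace in a state where a merge conflict can be resolved by hand before the remaining remotes are pulled.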

Note that for each subsequent ZFS-clone of this dataset, you end up with a copy of this script which might be left as-is to use the per-repository-clone configured remote peers, or adapted to your needs (such as by uncommenting or commenting away certain GATES definition lines). Also note that this example pulls in the master heads; for specific projects you might be interested in following some other branches – then amend the repo's copy of the script accordingly.

ZFS-cloning a Git repository

Note that there are several ways to clone Git repositories. A portable way on any platform is to run git clone which creates a copy (or hardlinks if in the same filesystem) of the repository files, and instantiates a new copy of all editable files in the selected head branch into the workspace. This does waste some space and time for making each copy, in comparison to what can be done with filesystem clones available with ZFS (unless you use deduplication, which might save space but might also be wasteful on other resources – including space, unless you have a great many copies of the same data) (smile)

Namely, the suggested procedure with ZFS gets by with just cloning a dataset with the reference repository and rewriting links to the origins or upstreams used (if you want to follow and/or merge another line of development, for example). In this process only the files that need to be changed are replaced, even if you work with a different branch, which saves quite a bit of space. Since my development attempts and/or builds are often done on older servers, or in VMs on laptops, free space is constrained and valued and such savings become a substantial factor.

Even if you use such clones to track remote projects for private compilation of a non-vanilla illumos-gate consolidation (such as additions from the illumos-gate-lxbrandz example below), the general premise is that most of the codebase is common, deviations are minimal and mostly relate to features added or changed by a particular upstream repository, and space savings thanks to ZFS dataset cloning remain in force... and also you get to easily keep some pre-defined customizations of the working directory, such as that Update-pull.sh script above, or prepared illumos.sh compilation settings, or some symlinks to separate datasets with package or proto storage directories, etc. 

In particular, the actual compilation of the illumos-gate per this procedure happens in a dedicated repository workspace which can pull in changes from a number of source repositories, so that cloning of the pure source-code repositories does not carry over the binary object files. In this example we will create a clone just for that – building, though I will generalize the code snippet for any "feature":
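A minimal sketch of such a generalized clone, assuming each repository lives in its own ZFS dataset and that the snapshot is named after the new workspace. The function name zfs_clone_repo, the snapshot naming and the example dataset names are hypothetical; the zfs calls need root, pfexec or delegated permissions:

```shell
# Sketch: ZFS-clone a repository dataset and point the "origin" of the copy
# at another repository (for example the local golden one).
zfs_clone_repo() {
    src_ds="$1"; new_ds="$2"; origin_url="$3"
    # Snapshot named after the last component of the new dataset
    snap="${src_ds}@$(basename "$new_ds")"
    zfs snapshot "$snap" || return 1
    zfs clone "$snap" "$new_ds" || return 1
    # Find the directory of the new workspace and rewrite its origin there
    new_dir=$(zfs get -H -o value mountpoint "$new_ds") || return 1
    ( cd "$new_dir" && git remote set-url origin "$origin_url" )
}

# Example (hypothetical names): a build workspace following the golden repo
# zfs_clone_repo rpool/export/home/illumos-dev/code/illumos-gate-golden \
#     rpool/export/home/illumos-dev/code/illumos-gate-build \
#     /export/home/illumos-dev/code/illumos-gate-golden
```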

The idea with modification of the "origin" for local working repositories is to have the local "golden" repository serve as the buffer between your local development and the published work... to save on traffic and as an extra layer of security from embarrassments (since an "origin" remote repository is used for pulls and pushes by default), or something like that... (smile)

Tracking a remote project via git

Let's add a replica of some non-vanilla upstream with features you want to test or use, but that have not yet been RTI'd into the common codebase. For this example I will track the (alas, unfinished) work on the lx brand to run zones with native Linux operating environments:
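A minimal sketch, assuming the workspace directory was first cloned from the golden dataset and that the remote project publishes its work on a master branch. The function name track_remote_project is hypothetical and REMOTE-PROJECT-URL is a placeholder for the actual address of the project:

```shell
# Sketch: make a cloned workspace follow a remote project instead of golden.
track_remote_project() {
    ( cd "$1" || exit 1
      # Point "origin" of this workspace at the remote project
      git remote set-url origin "$2" || exit 1
      # Bring in the remote project's changes on top of the common codebase
      git pull origin master )
}

# Example (placeholder URL, hypothetical path):
# track_remote_project /export/home/illumos-dev/code/illumos-gate-lxbrand REMOTE-PROJECT-URL
```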

Patching from a remote project's development

In a similar vein to development done locally, you might have some code to try available as a patch-file (perhaps, a part of someone's webrev or a similar way to publish the work's results). To give it a spin, you might just apply the patch to the source code replica:
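A minimal sketch using git apply, assuming the patch is in a format that git understands (a unified diff, for instance). The function name apply_patch and the example paths are hypothetical; a plain patch utility would be an alternative:

```shell
# Sketch: try a patch file against a source code replica.
# The patch file should be given as an absolute path, since the
# commands run from inside the repository directory.
apply_patch() {
    ( cd "$1" || exit 1
      # Dry run first: reports conflicts without touching any files
      git apply --check "$2" || exit 1
      git apply "$2" )
}

# Example (hypothetical paths):
# apply_patch /export/home/illumos-dev/code/illumos-gate-featureX /tmp/feature.patch
```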

Building a combined project

To automate building the illumos-gate with a merger of the current vanilla upstream, the additional patches that were received in one way or another, and perhaps your local development results, you might want to combine all these variations of the common codebase into one workspace (earlier we've defined the illumos-gate-build just for such a purpose – to be the only place where the compiled objects and binaries spawn en masse). At least, this should be an easy job if the changes touch files, or locations within files, that differ enough not to conflict either in the mechanics of the patch process or in the source code logic.

For a simple case, just git remote add lxbrand ../illumos-gate-lxbrand or somesuch, for each cloned source code repository you are interested in, and the Update-pull.sh would take care of importing the changes for you. Then you'd just run nightly.sh (assuming the other setup described in "How To Build illumos" has been completed) and wait for results.

Homework: expand the script to detect the head branch used by a particular local repo replica (if not master) and import that – because the local per-feature workspaces should really take advantage of the git branching capabilities (and not pretend to each be a master), so that overall development on each can be ultimately tracked by the same repository (laid out on the public "origin"). Also TODO: explain how this branching should properly be done with a snippet of code in this page. 
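One possible starting point for this homework, as a sketch only. The function names current_branch, pull_local_head and start_feature are hypothetical, and the branch naming is up to you:

```shell
# Sketch: detect the head branch of a local repo replica and import it.
current_branch() {
    # Prints the checked-out branch name (or "HEAD" if the head is detached)
    ( cd "$1" && git rev-parse --abbrev-ref HEAD )
}

# Run inside the build workspace: pull whatever branch the given
# local workspace currently has checked out.
pull_local_head() {
    branch=$(current_branch "$1") || return 1
    git pull "$1" "$branch"
}

# Start a named branch in a per-feature workspace instead of using master
start_feature() {
    ( cd "$1" && git checkout -b "$2" )
}

# Example (hypothetical names):
# start_feature ../illumos-gate-featureX featureX
# pull_local_head ../illumos-gate-featureX
```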

Updating the developer's personal fork from upstream

Via GitHub Web-interface (for a few changesets) 

Go to your repo on the GitHub site and click the green icon near repo title (just above the file list), on the next page click the suggestion to rebase the comparison. This goes to a link like https://github.com/illumos/illumos-gate/compare/jimklimov:master...illumos:master (in this URL jimklimov is my login name – fix to yours accordingly) and compares my own origin repo to the central "vanilla" upstream. There is a button to Create a pull request (so my origin repo can pull from upstream), for cases where merges can be processed automatically (otherwise editing files on the workstation's local repo and pushing the changes to origin is needed). After creation of the pull request, I can click "Merge the request" and then "Confirm the merge" to complete and close this pull request.

Via command-line

To fetch many changesets at a time (i.e. if dozens of commits have happened since last sync), and/or to actually merge them where manual processing and human decisions are needed, I can go to my workstation to pull the changes from upstream, merge them with my current work if needed, and push them to my public origin, as detailed in numerous posts linked above.
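A minimal sketch of that sequence, assuming the golden repository has the remotes origin and upstream configured and that master is the branch being synchronized. The function name sync_fork is hypothetical:

```shell
# Sketch: bring upstream changes into the local master and publish them.
sync_fork() {
    ( cd "$1" || exit 1
      git checkout master || exit 1
      git fetch upstream || exit 1
      # Stops here if conflicts need manual resolution
      git merge upstream/master || exit 1
      git push origin master )
}

# Example (hypothetical path):
# sync_fork /export/home/illumos-dev/code/illumos-gate-golden
```

If the merge stops on conflicts, resolve them in the working directory, commit, and then run the push by hand.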

 

  1. Feb 02, 2015

    FYI: While working with Git repositories stored on illumos-based ZFS systems, I ultimately automated as much as I felt reasonable. As a result, there are some git plugin scripts available here: https://github.com/jimklimov/git-scripts

    Relevant to this article, git zclone workspace newws would zfs-snapshot the dataset which backs the "workspace" directory and create a new cloned dataset "newws" – and this is just a simple use case (it is more flexible than that, including replication of remote HTTP/SSH repos into a newly created local ZFS dataset).