Intro

Modern illumos development takes place in the GitHub repositories. The older Mercurial repos are somewhat maintained, but may lag behind in updates.

Now I am learning about Git and logging the steps below, to implement the "Working on several bugs at once" and "Managing multiple workspaces" approaches with the GitHub repositories (wink)

Preliminary reading and links

On cloning

Each git repository is complete and self-contained, including all history and branches of the project. This data is stored under the .git/objects/ subdirectory (mostly in pack files under .git/objects/pack/); if a clone is created on the same filesystem, these files are hardlinked into the new repository to save some time and space. The checked-out source code for the currently selected branch is kept in unique files (consuming disk space anew for each clone).
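This hardlinking is easy to observe with a small throwaway repository (the paths and names below are purely illustrative):

```shell
### Create a tiny repo, clone it locally, and see that object files
### in the clone are hardlinks (link count above 1) of the originals
:; TMP=`mktemp -d` && cd "$TMP"
:; git init -q original && cd original
:; git config user.email "you@example.com" && git config user.name "You"
:; echo hello > file.txt && git add file.txt && git commit -q -m initial
:; cd .. && git clone -q original clone
:; find clone/.git/objects -type f -links +1
```

The find command should list the shared object files; a clone made over the network (or with --no-hardlinks) would show none.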

Apparently, each cloned repository has one current "head" branch (master or a feature branch), to which the checked-out source code (the other files in the workspace directory) corresponds. It seems that for each branched workspace, a separate repo with a particular head branch should be instantiated (cloned by git... or by zfs).
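For instance, the current head branch of a workspace can be inspected and switched like this (a throwaway example; the repo and branch names are illustrative):

```shell
:; TMP=`mktemp -d` && cd "$TMP"
:; git init -q ws && cd ws
:; git config user.email "dev@example.com" && git config user.name "Dev"
:; echo a > a.txt && git add a.txt && git commit -q -m init
### Show the currently checked-out head branch
:; git symbolic-ref --short HEAD
### Create and switch to a feature branch; the checked-out files now follow it
:; git checkout -q -b feature-branch
:; git symbolic-ref --short HEAD
```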

Terminology

For the terms and repo names below, there will be:

Action

Create a developer's personal fork of illumos-gate (the "origin")

Log in to www.github.com and go to https://github.com/illumos/illumos-gate/fork to create the personal (or corporate) fork of illumos-gate. This will be used to publish the results of your work for others (including the common upstream) to review and ultimately merge ("pull"). It can also serve as a backup and as a means to disseminate your work among the several hosts you yourself might be using (e.g. a workstation at work, a laptop on the road, a dedicated build host), which are not necessarily all connected to each other at the same time.

Environment variables

The code snippets below rely on the following variables being set for your root and/or unprivileged development user (illumos-dev in the examples below):

### Envvars for the snippets below
:; DEVUSER="illumos-dev"
:; DEVHOME="/export/home/$DEVUSER"
:; DEVROOTDS="rpool/export/home/$DEVUSER"
:; GITHUBLOGIN="$DEVUSER"

Note that the example above is "portable" in that nearly every system has an rpool, but you might have, and might want to choose, a different pool (e.g. a bigger or faster one).

Prepare the development user's home directory and working datasets

On the developer's workstation or build host, prepare the environment, including the git program suite, the workaround for HTTPS with cURL from https://www.illumos.org/issues/1536, and the working dataset itself (if not yet created), all done as root (or equivalent via pfexec):

### Ensure availability of GIT (beside other illumos-compilation packages)
:; pkg install git

 
### Workaround for cURL with HTTPS, needed for GIT
:; [ ! -d /etc/curl -o ! -f /etc/curl/curlCA ] && \
   mkdir -p /etc/curl && cat /etc/certs/CA/*.pem > /etc/curl/curlCA
 

### Create datasets for the build user
:; zfs create -o compression=gzip-9 -o mountpoint="$DEVHOME" "$DEVROOTDS"
:; zfs create "$DEVROOTDS"/code

 
### On a personal development machine, to match other illumos-building
### instructions, you can provide a "/code" path:
:; ln -s "$DEVHOME/code" /
 
### Create the dev(build)-user account as well, if one is not yet present
:; useradd -d "$DEVHOME" -g staff -m -s /bin/bash "$DEVUSER"
### Might complain that
###   UX: useradd: illumos-dev name too long.
### but works anyway
 
### Do this only for initializing a new user
:; ( cd /etc/skel && tar cf - . ) | ( cd "$DEVHOME" && tar xf - )
 
:; chown -R "$DEVUSER" "$DEVHOME"
:; zfs allow -l -d -u "$DEVUSER" \
   create,destroy,snapshot,rollback,clone,promote,rename,mount,send,receive \
   "$DEVROOTDS"
 
### To simplify the configuration below in context of different users
:; grep DEVHOME "$DEVHOME"/.profile >/dev/null || \
   cat >> "$DEVHOME"/.profile << EOF
DEVUSER="$DEVUSER"
DEVHOME="$DEVHOME"
DEVROOTDS="$DEVROOTDS"
GITHUBLOGIN="$GITHUBLOGIN"
EOF

Configure the developer's Git global credentials

There are several layers of Git configuration data that can be stored in the system paths (common for all developers), in each user's settings ("global" settings of the user), and in each single repository configuration; with the most-specific version of the config being applied. While the configuration files are just blocks of text located in different paths, it is customary to maintain them with the git config command. For example, "personal" information about the user that can be added into commit metadata belongs to the user's settings:

### :; su - "$DEVUSER"
:; git config --global user.name "Frodo Baggins" 
:; git config --global user.email "frodo.baggins@underhill.net" 
:; git config --global core.editor "/bin/mc -e"
 
# Add some SVN-like aliases
:; git config --global alias.st status ; \
   git config --global alias.co checkout ; \
   git config --global alias.br branch ; \
   git config --global alias.up rebase ; \
   git config --global alias.ci commit
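To check which layer a given setting comes from, and to see that a repository-local value overrides the user's --global one, something like this can be used (the e-mail value is illustrative):

```shell
:; TMP=`mktemp -d` && cd "$TMP" && git init -q demo && cd demo
### A repo-local setting overrides the --global one for this repo only
:; git config user.email "repo-specific@example.com"
:; git config user.email
### Show the config file that each effective setting comes from
:; git config --list --show-origin | grep user.email
```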

Create the local repository ("golden")

Then, as the development/build user (i.e. illumos-dev above), go fetch the source code (the target directory should be empty; note that ZFS/CIFS auto-sharing might create a .$EXTEND subdirectory that causes problems, and that it may be safely removed):

### :; su - "$DEVUSER"
:; cd "$DEVHOME"/code

:; zfs create "$DEVROOTDS"/code/illumos-gate-golden 
:; rm -rf 'illumos-gate-golden/.$EXTEND' >/dev/null 2>&1
:; git clone https://github.com/"$GITHUBLOGIN"/illumos-gate.git illumos-gate-golden

Wait a bit...

Add the upstream link:

:; ( cd illumos-gate-golden && \
   git remote add upstream https://github.com/illumos/illumos-gate )

This should leave you with your local golden repository cloned from your public "origin" repo and related to the common "upstream" repo, which makes it easy to pull in changes from both and push changes to your public "origin". Note that occasional pulling from your public "origin" might be useful for example if it is used to synchronize the code on your different hosts like a laptop on the road and the build/test server farm at work.
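At this point git remote -v in the golden repository should list both peers. The remote bookkeeping can also be tried out offline with a throwaway repo, since git remote add only records URLs without contacting the network (YOURLOGIN below is a placeholder for your own fork):

```shell
:; TMP=`mktemp -d` && cd "$TMP"
:; git init -q illumos-gate-golden && cd illumos-gate-golden
### "git remote add" only records the URLs; nothing is fetched yet
:; git remote add origin https://github.com/YOURLOGIN/illumos-gate.git
:; git remote add upstream https://github.com/illumos/illumos-gate
:; git remote -v
```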

This local golden repository will serve as the base template for your other local work (on particular features or bugs), and when those quests are completed and tested, it will be used to pull in the resulting changesets and push them to your public "origin".

Add a script to pull in code updates

In fact, let's embed the script now to automate git pulls from your current repository's configured remote peers:

:; ( cd illumos-gate-golden && \
cat > Update-pull.sh << EOF
#!/bin/bash
# Pull in updates from configured (remote) or explicit (URL/path) git peers
GATES=""
GATES="\$GATES \`git remote show\`"
#GATES="\$GATES ../illumos-gate-golden"
#GATES="\$GATES ../illumos-gate-lxbrandz"
#GATES="\$GATES upstream origin"
RES=0
for GATE in \$GATES; do
        echo "git pull \$GATE master"
        git pull \$GATE master || RES=\$?
done
exit \$RES
EOF
chmod +x Update-pull.sh )

Note that each subsequent ZFS clone of this dataset will contain a copy of this script, which may be left as-is (to use the remote peers configured in that particular repository clone) or adapted to your needs (for example, by uncommenting or commenting out certain GATES definition lines). Also note that this example pulls in the master heads; for specific projects you might be interested in following some other branches, so amend that repo's copy of the script accordingly.

ZFS-cloning a Git repository

Note that there are several ways to clone Git repositories. The portable way on any platform is to run git clone, which creates a copy of the repository files (or hardlinks them, if on the same filesystem) and instantiates a fresh copy of all editable files of the selected head branch into the workspace. Compared to the filesystem clones available with ZFS, this wastes some space and time for each copy (unless you use deduplication, which might save space but can be wasteful on other resources) (smile)

Namely, the suggested procedure with ZFS gets by with just cloning the dataset that holds the reference repository and rewriting the links to the origins or upstreams used (if you want to follow and/or merge another line of development, for example). In this process only the files that need to change are replaced, even if you work with a different branch, which saves quite a bit of space. Since my development attempts and/or builds are often done on older servers, or in VMs on laptops, free space is constrained and valued, and such savings become a substantial factor.

Even if you use such clones to track remote projects for private compilation of a non-vanilla illumos-gate consolidation (such as the additions from the illumos-gate-lxbrandz example below), the general premise holds: most of the codebase is common, deviations are minimal and mostly relate to features added or changed by a particular upstream repository, and the space savings from ZFS dataset cloning remain in force. You also get to easily keep some pre-defined customizations of the working directory, such as the Update-pull.sh script above, prepared illumos.sh compilation settings, or some symlinks to separate datasets with package or proto storage directories, etc.

In particular, the actual compilation of the illumos-gate per this procedure happens in a dedicated repository workspace which can pull in changes from a number of source repositories, so that cloning of the pure source-code repositories does not carry over the binary object files. In this example we will create a clone just for that purpose (building), though I will generalize the code snippet for any "feature":

### :; su - "$DEVUSER"
 
:; FEATURE="build"
:; TS_NOW="`date -u "+%Y%m%d%H%M%SZ"`"
:; zfs snapshot "$DEVROOTDS"/code/illumos-gate-golden@"$TS_NOW" && \
   zfs clone "$DEVROOTDS"/code/illumos-gate-golden@"$TS_NOW" "$DEVROOTDS"/code/illumos-gate-"$FEATURE" 
 
:; cd "$DEVHOME"/code/illumos-gate-"$FEATURE"
:; git remote rm origin
:; git remote add origin `pwd`/../illumos-gate-golden

The idea behind modifying the "origin" of local working repositories is to have the local "golden" repository serve as a buffer between your local development and the published work... to save on traffic, and as an extra layer of security from embarrassments (since the "origin" remote repository is used for pulls and pushes by default), or something like that... (smile)
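The whole flow (feature work pulled into the golden repo, then pushed to the public "origin") can be rehearsed locally with throwaway repositories; here a bare repo stands in for the GitHub fork, and all names are illustrative:

```shell
:; TMP=`mktemp -d` && cd "$TMP"
:; git init -q --bare public-origin.git
### The local golden repo, cloned from the (still empty) public origin
:; git clone -q public-origin.git illumos-gate-golden 2>/dev/null
:; cd illumos-gate-golden
:; git symbolic-ref HEAD refs/heads/master
:; git config user.email "dev@example.com" && git config user.name "Dev"
:; echo base > file.txt && git add file.txt && git commit -q -m base
:; git push -q origin master
:; cd ..
### A per-feature workspace cloned from the golden repo
:; git clone -q illumos-gate-golden illumos-gate-feature
:; cd illumos-gate-feature
:; git config user.email "dev@example.com" && git config user.name "Dev"
:; echo fix >> file.txt && git add file.txt && git commit -q -m "a fix"
:; cd ..
### Pull the finished work into the golden repo, then publish it
:; ( cd illumos-gate-golden && \
   git pull -q ../illumos-gate-feature master && git push -q origin master )
:; git -C public-origin.git log --oneline master
```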

Tracking a remote project via git

Let's add a replica of some non-vanilla upstream with features you want to test or use, but that have not yet been RTI'd into the common codebase. For this example I will track the (alas, unfinished) work on the lx brand, which allows zones to run native Linux operating environments:

### :; su - "$DEVUSER"
 
:; FEATURE="lxbrandz"
:; TS_NOW="`date -u "+%Y%m%d%H%M%SZ"`"
:; zfs snapshot "$DEVROOTDS"/code/illumos-gate-golden@"$TS_NOW" && \
   zfs clone "$DEVROOTDS"/code/illumos-gate-golden@"$TS_NOW" "$DEVROOTDS"/code/illumos-gate-"$FEATURE" 
 
:; cd "$DEVHOME"/code/illumos-gate-"$FEATURE"
:; git remote rm origin
:; git remote add origin `pwd`/../illumos-gate-golden
 
### Add the remote upstream and pull its code
:; git remote add lxbrand https://github.com/IRIXUser/illumos-gate
:; git pull lxbrand master

Patching from a remote project's development

In a similar vein to development done locally, you might have some code to try that is available as a patch file (perhaps part of someone's webrev or a similar publication of their work's results). To give it a spin, you might just apply the patch to a replica of the source code:

### :; su - "$DEVUSER"
 
:; FEATURE="lxbrandz-patch"
:; TS_NOW="`date -u "+%Y%m%d%H%M%SZ"`"
:; zfs snapshot "$DEVROOTDS"/code/illumos-gate-golden@"$TS_NOW" && \
   zfs clone "$DEVROOTDS"/code/illumos-gate-golden@"$TS_NOW" "$DEVROOTDS"/code/illumos-gate-"$FEATURE" 
 
:; cd "$DEVHOME"/code/illumos-gate-"$FEATURE"
:; git remote rm origin
:; git remote add origin `pwd`/../illumos-gate-golden
 
### Apply the patch, then stage and commit the changed files
:; gpatch -p1 < ../illumos-gate-lx24-lx26.patch
:; git add -A
:; git commit -m 'Imported patch illumos-gate-lx24-lx26.patch'

Building a combined project

To automate building the illumos-gate with a merger of the current vanilla upstream, the additional patches received in one way or another, and perhaps your local development results, you might want to combine all these variations of the common codebase into one workspace (earlier we defined illumos-gate-build just for this purpose: to be the only place where compiled objects and binaries spawn en masse). At least, this should be an easy job if the changes involve files, and locations within them, different enough not to conflict in the mechanics of the patch process or in the source code logic.

For a simple case, just git remote add lxbrand ../illumos-gate-lxbrandz or suchlike for each cloned source-code repository you are interested in, and Update-pull.sh will take care of importing the changes for you. Then you'd just run nightly.sh (assuming the other setup described in How To Build illumos has been completed) and wait for results.

Homework: expand the script to detect the head branch used by a particular local repo replica (if not master) and import that, because the local per-feature workspaces should really take advantage of Git's branching capabilities (and not each pretend to be a master), so that overall development on each can ultimately be tracked in the same repository (laid out on the public "origin"). Also TODO: explain on this page, with a snippet of code, how this branching should properly be done.
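As a starting point for that homework, a sketch like the following could detect each sibling replica's current head branch instead of assuming master (the loop only prints the would-be pull commands here; the simulated sibling repo and branch names are made up):

```shell
:; TMP=`mktemp -d` && cd "$TMP" && mkdir illumos-gate-build && cd illumos-gate-build
### Simulate a sibling per-feature replica sitting on its own branch
:; git init -q ../illumos-gate-lxbrandz
:; git -C ../illumos-gate-lxbrandz symbolic-ref HEAD refs/heads/lx-feature
### For each sibling replica, use its actual head branch (skip non-repos)
:; for GATE in ../illumos-gate-*; do
      BRANCH=`git -C "$GATE" symbolic-ref --short HEAD 2>/dev/null` || continue
      echo "git pull $GATE $BRANCH"
   done
```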

Updating the developer's personal fork from upstream

Via GitHub Web-interface (for a few changesets) 

Go to your repo on the GitHub site and click the green compare icon near the repo title (just above the file list); on the next page, click the suggestion to switch the base of the comparison. This leads to a link like https://github.com/illumos/illumos-gate/compare/jimklimov:master...illumos:master (in this URL jimklimov is my login name; substitute yours accordingly) and compares my own origin repo to the central "vanilla" upstream. There is a button to Create a pull request (so that my origin repo can pull from upstream) for cases where the merge can be processed automatically (otherwise, editing files in the workstation's local repo and pushing the changes to origin is needed). After creating the pull request, I can click "Merge the request" and then "Confirm the merge" to complete and close it.

Via command-line

To fetch many changesets at a time (e.g. if dozens of commits have happened since the last sync), and/or to actually merge them where manual processing and human decisions are needed, I can go to my workstation, pull the changes from upstream, merge them with my current work if needed, and push them to my public origin, as detailed in numerous posts linked above.
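The sequence boils down to fetch, merge, push. It can be rehearsed offline with bare repositories standing in for the GitHub "upstream" and your public "origin" (all names illustrative):

```shell
:; TMP=`mktemp -d` && cd "$TMP"
:; git init -q --bare upstream.git && git init -q --bare origin.git
### Seed the simulated upstream with an initial commit
:; git clone -q upstream.git seed 2>/dev/null && cd seed
:; git symbolic-ref HEAD refs/heads/master
:; git config user.email "dev@example.com" && git config user.name "Dev"
:; echo v1 > file.txt && git add file.txt && git commit -q -m v1
:; git push -q origin master
:; cd ..
### The workstation repo: "origin" is the personal fork, "upstream" the gate
:; git clone -q upstream.git local
:; cd local && git remote rename origin upstream && \
   git remote add origin ../origin.git && cd ..
### Meanwhile the upstream moves ahead by another commit
:; ( cd seed && echo v2 >> file.txt && git add file.txt && \
   git commit -q -m v2 && git push -q origin master )
### Pull the new changesets from upstream and publish them to the fork
:; cd local
:; git fetch -q upstream
:; git merge -q upstream/master
:; git push -q origin master
```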