Tag: git

New Git Server

I love GitLab, but it’s resource intensive and I wasn’t able to run it on the current hardware. Getting a new server has been a slow process, so I needed another solution. We use the Bonobo Git Server at work — it’s really simple and doesn’t provide all of the analysis and workflow functionality of GitLab (e.g. there are no pull requests!) … but it does keep revisions of code to which you can revert. And that’s better than trying to figure out what changed and broke a program. So I set one up at home.

Super simple — except they forget a few steps — like they don’t mean you need to install .NET 4.6. You need to go in the Add Roles & Features, under Application Development Features, and enable ASP.NET. 4.5 worked fine for me. You’ve also got to restart your IIS MMC if you left it running, and change the application pool over to one that actually uses ASP.NET.

Once that was installed, a few quick changes to web.config enabled AD-based authentication. And now we’ve got a git server at home that I can leave running 24×7.

Except … I tried using a new git server and got the following error:
schannel: next InitializeSecurityContext failed: Unknown error (0x80092012) –
The revocation function was unable to check revocation for the certificate.

To resolve the issue, I needed to change from Windows to OpenSSL certificate handling:
git config --global http.sslBackend openssl

Except that blew away my config line that defines the SSL CA CRT file. Once I restored my configuration … voila, I can use the git client with my new git server:
git config --system http.sslcainfo "c:\Program Files\Git\mingw64\ssl\certs\ca-bundle.crt"
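Since changing one setting cost me another, it is worth reading both values back afterwards. This is only a minimal sketch: it uses a throwaway repository and repository-level config so it never touches the real --global or --system files, and the bundle path is a placeholder I made up for the demo.

```shell
#!/bin/sh
set -e
# Throwaway repository so the sketch never touches --global or --system config.
demo=$(mktemp -d)
git init -q "$demo"

# Placeholder path (assumption): substitute your own ca-bundle.crt location.
bundle="$demo/ca-bundle.crt"
: > "$bundle"

git -C "$demo" config http.sslBackend openssl
git -C "$demo" config http.sslCAInfo "$bundle"

# Read both values back, along with the file each one was read from.
git -C "$demo" config --show-origin --get http.sslBackend
git -C "$demo" config --show-origin --get http.sslCAInfo

backend=$(git -C "$demo" config --get http.sslBackend)
cainfo=$(git -C "$demo" config --get http.sslCAInfo)
```

On a real workstation the same two --show-origin lookups, run without the throwaway repository, show whether each value is coming from the system, global, or local file.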

Git For Configuration Management

I am starting to use git to manage application server configurations — partially to ensure team members are familiarizing themselves with git and thinking about it when they update code (we’ve seen a LOT of tweaks that are not pushed to the git server), but also to reduce the administrative overhead of managing servers.

The best use case thus far has been our sendmail environment: seven servers with three configuration bases. By issuing certificates with SAN values for each host name and the VIP name, we are able to use the same cert and config file on each server in a functional group. Admins can make changes to the config offline (i.e. we’re not live-editing config files on the sendmail servers), there is a history of who made the changes (and a quick means of reverting changes), and, using a cron’d pull, we can ensure changes are consistent across the environment.
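The cron’d pull can be sketched roughly as follows. This is not our production job: the local bare repository stands in for the git server, and the directory names, the sendmail.cf contents, and the sync_config function are all placeholders for illustration.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# Stand-in for the central git server (assumption: a local bare repository).
git init -q --bare "$work/server.git"
# Pin the branch name so the sketch behaves the same regardless of init.defaultBranch.
git -C "$work/server.git" symbolic-ref HEAD refs/heads/master

# An admin edits the config offline and pushes it.
git init -q "$work/admin"
git -C "$work/admin" config user.name "Example Admin"
git -C "$work/admin" config user.email "admin@example.com"
echo "# placeholder config v1" > "$work/admin/sendmail.cf"
git -C "$work/admin" add sendmail.cf
git -C "$work/admin" commit -q -m "Initial config"
git -C "$work/admin" push -q "$work/server.git" HEAD:refs/heads/master

# A mail host holds a clone of the same repository.
git clone -q "$work/server.git" "$work/host1"

# The job cron would run. Fast-forward only, so a host with local edits
# fails loudly instead of merging silently.
sync_config() {
    git -C "$1" pull -q --ff-only
}

# The admin pushes a change; the next scheduled run picks it up.
echo "# placeholder config v2" > "$work/admin/sendmail.cf"
git -C "$work/admin" commit -q -a -m "Updated config"
git -C "$work/admin" push -q "$work/server.git" HEAD:refs/heads/master
sync_config "$work/host1"
cat "$work/host1/sendmail.cf"
```

Using --ff-only is a design choice on my part: if someone live-edits a file on a server, the pull stops with an error that can be alerted on, rather than creating a merge nobody reviewed.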

Active Directory Federation Services (ADFS) Relying Party Trust Cert Expiry

At work, we received a critical ticket for an application that was unable to authenticate to ADFS. Nothing globally wrong – other applications were authenticating. A long call later, we discovered that the app’s certificate had expired. Why would the application not monitor their certificate expiry dates?? That’s an excellent question, but not one over which I have any control.

We can monitor their certs on our side. So I wrote a quick PowerShell script to grab certificates from the relying party trusts and alert us if any certs will expire in the next 30 days. It has to run on the ADFS server – I’d love to get it moved to the automation server in the future. I expect Get-AdfsRelyingPartyTrust returns disabled agreements, and I want to filter those out.
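The actual script is PowerShell and talks to ADFS, so it is not reproduced here. The filtering and the 30-day threshold it applies can be sketched in plain shell, though. Everything in this sketch is made up for illustration: the trust names, the name|enabled|expiry layout, and the expires_soon helper.

```shell
#!/bin/sh
# Succeeds when a certificate expiring at $1 (epoch seconds) is already
# expired or expires within $3 days of $2 (epoch seconds for "now").
expires_soon() {
    [ $(( $1 - $2 )) -le $(( $3 * 86400 )) ]
}

now=$(date +%s)
in_10_days=$(( now + 10 * 86400 ))
in_90_days=$(( now + 90 * 86400 ))

# Hypothetical inventory: name|enabled|expiry (epoch seconds).
report=""
while IFS='|' read -r name enabled expiry; do
    # Skip disabled agreements; nobody needs an alert for those.
    [ "$enabled" = "true" ] || continue
    if expires_soon "$expiry" "$now" 30; then
        echo "ALERT: $name certificate expires within 30 days"
        report="$report$name "
    fi
done <<EOF
app-a|true|$in_10_days
app-b|true|$in_90_days
app-c|false|$in_10_days
EOF
```

Only app-a is reported: app-b is outside the window and app-c is disabled.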

Git Pull Requests

I have finally run through the process of submitting a pull request to suggest changes to a Git repository. Do the normal ‘stuff’ either to make a new project or to clone an existing project to your computer. Create a new branch and check out that branch.

C:\ljr\git>git clone https://github.com/ljr55555/SampleProject

Cloning into ‘SampleProject’…

remote: Counting objects: 4, done.

remote: Compressing objects: 100% (3/3), done.

remote: Total 4 (delta 0), reused 0 (delta 0), pack-reused 0

Unpacking objects: 100% (4/4), done.

C:\ljr\git>cd SampleProject

C:\ljr\git\SampleProject>git branch newEdits

C:\ljr\git\SampleProject>git checkout newEdits

Switched to branch ‘newEdits’

Make some changes and commit them to your branch

C:\ljr\git\SampleProject>git add helloworld.pl

C:\ljr\git\SampleProject>git commit -m "Added hello world script"

C:\ljr\git\SampleProject>git push origin newEdits

Counting objects: 3, done.

Delta compression using up to 4 threads.

Compressing objects: 100% (3/3), done.

Writing objects: 100% (3/3), 408 bytes | 408.00 KiB/s, done.

Total 3 (delta 0), reused 0 (delta 0)

To https://github.com/ljr55555/SampleProject

 * [new branch]      newEdits -> newEdits

On the GitHub site, click the “new pull request” button. Since you select the two branches within the pull request, it doesn’t seem to matter which branch’s “Pull request” tab you select.

Select the source branch and the one with your changes. Verify you can merge the branches (otherwise you’ve got a problem and need to resolve conflicts). Review the changes, then click “Create pull request”

Here’s another place for comments – comments on the pull request, not the commit comments. Click “Create pull request”.

Click “Create pull request” and you’ve got one! Now what do we do with it (i.e. if you’re the repository owner and receive a pull request)? If you check the “Pull request” tab on your project, you should see one now.

Click on it to explore the changes that have been made – the “Commits” tab will have the commits, and the “Files changed” tab will show you the specific changes that have been made.

You could just comment and close the pull request (if, for instance, there was a reason you had not implemented the project that way and do not wish to incorporate the changes into your master branch). Assuming you do wish to incorporate the code, there are a couple of ways you can merge the new code into your base branch. The default is generally a good choice, or read the doc at https://help.github.com/articles/about-pull-request-merges/

Select the appropriate merge type and click the big green button. You have an opportunity to edit the commit message at this point, or just click “Confirm merge”

Voila, it is merged in. You can write some comment to close out the pull request.

There is a notification that the request was completed and the branch can be deleted.

And the project no longer has any open pull requests (you can remove the “is open” filter and see the request again).

And finally, someone should delete the branch. Is that the person who created the branch? Is that the person who maintains the repository? No idea! I’d delete my own, to keep things tidy … but I wouldn’t be offended if the maintainer deleted it either.

 

Visual Studio Code

We found a free, open source code editor from Microsoft called Visual Studio Code — there are downloadable modules that include formatting for a variety of programming languages (c#, cpp, fortran), scripts (perl, php), and other useful formats like MySQL, Apache httpd config files. It also serves as a GUI front end to git. And that is something I’ve been trying to find since I inherited a git server at work — a way for people to avoid having to remember a dozen different git commands.

Certificate Error On Git

Finally got around to switching my GitLab site over to HTTPS — made an ssl folder in /etc/gitlab and then placed the public/private key pair in that folder. Files named with the external URL hostname with a key and crt suffix (gitlab.rushworth.us.crt and gitlab.rushworth.us.key in my case). Then in gitlab.rb, I changed the external_url to an https:// prefix. Voila, a secure GitLab server.

Oops – forgot about the client. Adding the secure site as the remote, I get “unable to get local issuer certificate” on the git client. Since I used a CA-signed certificate, I just had to put the CA’s certificate (the public part, never the private key) into git’s CA bundle. If you use a self-signed certificate, I believe the server’s own certificate would need to be added instead.

Where is git’s CA bundle? Ask git:

C:\Program Files\Git\bin>git config --list
core.symlinks=false
core.autocrlf=true
core.fscache=true
color.diff=auto
color.status=auto
color.branch=auto
color.interactive=true
help.format=html
rebase.autosquash=true
http.sslcainfo=C:/Program Files/Git/mingw64/ssl/certs/ca-bundle.crt
diff.astextplain.textconv=astextplain
filter.lfs.clean=git-lfs clean -- %f
filter.lfs.smudge=git-lfs smudge -- %f
filter.lfs.required=true
filter.lfs.process=git-lfs filter-process
credential.helper=manager
http.sslverify=true

Edit that file with something that understands Unix newline characters and paste your CA certificate (the PEM text) at the end of the file.
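If you would rather not hand-edit the bundle, the same append can be scripted. A minimal sketch, assuming a shell such as Git Bash: the bundle and the CA file here are stand-in files containing placeholder text, not real certificates, and the throwaway repository exists only so the demo has an http.sslCAInfo value to look up.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
git init -q "$demo"

# Stand-in files (assumption): placeholder text, not real certificates.
printf '%s\n' '-----BEGIN CERTIFICATE-----' 'EXISTING-PLACEHOLDER' \
    '-----END CERTIFICATE-----' > "$demo/ca-bundle.crt"
printf '%s\n' '-----BEGIN CERTIFICATE-----' 'MY-CA-PLACEHOLDER' \
    '-----END CERTIFICATE-----' > "$demo/my-ca.crt"
git -C "$demo" config http.sslCAInfo "$demo/ca-bundle.crt"

# Ask git where the bundle lives, keep a backup, then append the CA certificate.
bundle=$(git -C "$demo" config --get http.sslCAInfo)
cp "$bundle" "$bundle.bak"
cat "$demo/my-ca.crt" >> "$bundle"
```

Keeping the .bak copy means a bad paste is one file copy away from being undone. Note that a Git for Windows upgrade may replace the bundle, in which case the append has to be repeated.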

Git

I mentioned that I had inherited a Git implementation last week. Here is the documentation I created to teach my coworkers what Git is and how to use it. Some isn’t applicable outside of our environment (you won’t care about the AD groups that control access to the system), some is applicable for small non-dedicated development teams … but I figured I’d post the presentation and quick reference guide on the Internet in case it was useful to someone else.

Background:

Git is a system that provides version control for files – we’re using it to control script/program code versions (source control management), but I could put this document in Git and use the version control to manage edits to the document. You can use it to maintain configuration files – allowing config changes to be traceable. You could use it as a cookbook if you were so inclined – a chef tinkering with a recipe might be interested in going back a few versions and trying something else.

Git provides some functionality that is redundant to other systems – you could, for instance, import our scripts to SharePoint and make code changes within SharePoint. The individual replacing the file is recorded. If a previous version is needed, SharePoint maintains previous versions that can be recovered. Why use Git instead of SharePoint? Git makes it easier to have multiple developers working on a program, including functions to “merge” the edited files together. You can have different versions of the whole project – in SharePoint, I can see different versions of each file, but I have no way of correlating which version of file x.cs goes with y.h … which makes the versioning less useful. The inverse is also true: I’m speaking about git as a source code management platform, but we use it to maintain configuration files too. There are even uses for git that have little to do with IT or source control: anything where tracking who changed what, and when, is valuable could leverage git.

If you want the history, LMGTFY 🙂 Or, you know, read Wikipedia. TL;DR: It’s one of Torvalds’s projects, initially used for Linux kernel development, and it has since become a widely adopted source control management platform. If you have ever looked at a project on GitHub, you have seen a little bit of Git. GitHub is a massive, public Git repository. Because Git has significant adoption within the OpenSource community, there are a lot of good documents on its internal mechanisms (https://book.git-scm.com/book/en/v2/Git-Internals-Git-Objects for example, if you are interested in how data is stored), how it is used (Google “git cheatsheet” and there are thousands of them, or full books like https://git-scm.com/book/en/v2), and oddball errors that might crop up.

Implementation

We have a Bonobo Git server using Active Directory for both authentication and authorization. The server source is available on GitHub (https://github.com/jakubgarfield/Bonobo-Git-Server) where you can see issues and be included in conversation about source updates (subscribing lets you know when new versions should be available for install).

Questions and bugs regarding the program are maintained in the GitHub issues section. The Google forum that may come up in searches is not active and was retained for history.

This brainshare is primarily to show client-side usage of the Git server. Server setup, configuration, and management is not the focus. One thing I will highlight on the server config: the groups used to provide authorization are not preexisting or nested groups. This means new team members will need to be added to the appropriate “Windstream CSG Git …” group to use the server.

<add key="ActiveDirectoryMemberGroupName" value="Windstream CSG Git Users" />

<add key="ActiveDirectoryTeamMapping" value="VDI=Windstream CSG Git VDI,SharePoint=Windstream CSG Git SharePoint, Directory Design=Windstream CSG Git Directory Design"/>

<add key="ActiveDirectoryRoleMapping" value="Administrator=Windstream CSG Git Admins" />

The above snippet is from the Web.config file located on the server at F:\inetpub\www\CSG – if new groups need to be added to the Git server, that is where the magic happens.

Some Terminology:

A repository is a storage location. It can store one file, it could be a whole bunch of files that make up a single program (e.g. the CSOCheck Visual Studio project), it could be a bunch of independent programs that have similar purposes (e.g. ‘Provisioning’ that holds all our provisioning scripts). A repository could be all our code glommed into one place (don’t do this – it makes maintaining an individual program more difficult).

A branch is another server-hosted copy of the project. You don’t want to directly edit the in-use production code (we do but this is certainly not a programming best practice!) – a branch is a copy on which development is done. Once development has been completed, the branch is merged back into the master copy. Looking at Git with small projects and a small number of developers, I wouldn’t expect to see a lot of branches on a project. A large program with a lot of dedicated developers may have some break/fix branches as well as longer term feature enhancement branches.

A fork is a personal copy of a repository. In OpenSource development, forks avoid making changes in someone else’s repository. You create your fork, work within your copy, then offer the changes in your fork for inclusion in the project. We don’t have much need to create forks — we would create a branch within the project.

Project – Bonobo does not seem to have projects, but other Git implementations do. A project includes the repository, an issues log, pull requests, and sometimes even a Wiki for the application. If you see someone referring to a project, for us that is just the repository.

The project maintainer is the individual who “owns” the project – this isn’t a project sponsor (a non-tech individual who owns a business relationship) but a technical supervisor for the development work who may also have project sponsor define-requirement type roles. The maintainer decides if changes and features are added. You can suggest changes or features – in OpenSource projects, review the existing issues to see if the feature was already requested (and make a new issue to request the feature if one does not exist) before spending a lot of time working on code that will not be accepted. Your idea may be something people are excited to see included in the project. Or it may be something they don’t want (you can always make a fork and add the feature to your iteration of the project). Even a bugfix – your proposed solution may be accepted. Or there may be a reason the maintainer wants to use a different approach to the issue. We do not have project maintainers.

Commit – this is basically making changes to the branch (add a file, delete a file, or modify a file). A commit should represent a single change. By that, I don’t mean every time you change a line, make a commit. You may well have to update a hundred lines of code across five different files to resolve an issue or implement a feature. But it’s just *one* issue or feature being implemented in the commit. You shouldn’t have a commit that implements SSL encryption in LDAP authentication and allows individuals to approve requests for direct reports. These two things have nothing to do with each other, even if they happen to be the two cards you’ve worked on today.

Commit messages are associated with commits and are where you indicate what is being changed in the commit. A “good” commit message is like well commented code – don’t provide too much info like “I added XYZ to line 81 on file abc.def”, but don’t write “Bug fixes” either. A commit message should convey what has been changed without someone having to diff the versions (i.e. saves time). In more formal software development, commit messages also aid in the creation of release notes. Something like “Changed new user template to include ourOrgPerson objectClass” provides enough detail that we can tell what the commit did – if someone wants to find out what lines got edited, they can diff the files and tell. You can view the commit history in the web site or by using “git log”.
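To see what a message history looks like from the command line, here is a small sketch in a throwaway repository. The file name, its contents, and the identity are placeholders; only the second commit message comes from the example above.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
git init -q "$demo"
# Placeholder identity (assumption) so commits work on a machine with no git config.
git -C "$demo" config user.name "Example Admin"
git -C "$demo" config user.email "admin@example.com"

echo "objectClass: person" > "$demo/newuser.template"
git -C "$demo" add newuser.template
git -C "$demo" commit -q -m "Created new user template"

echo "objectClass: ourOrgPerson" >> "$demo/newuser.template"
git -C "$demo" add newuser.template
git -C "$demo" commit -q -m "Changed new user template to include ourOrgPerson objectClass"

# One line per commit, newest first: abbreviated hash, author, subject.
git -C "$demo" log --pretty=format:'%h %an: %s'

latest=$(git -C "$demo" log -1 --pretty=format:%s)
count=$(git -C "$demo" rev-list --count HEAD)
```

A reader scanning that output can tell what each commit did without opening a diff, which is the whole point of writing the message well.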

Push is the process of updating the server repository with changes you have made on your local repository.

Pull Request is a term you may encounter when reading Git documentation or participating in GitHub. The request basically clues someone into the fact you’ve got code to be reviewed or integrated into an upstream branch. The project maintainer would, once the changes had been reviewed and agreed upon, merge the feature into the repository and close the pull request. This is not a process we are following, nor are code-related discussions or issue lists tracked within the Git server.

Deploy – once the pull request has been approved, you can deploy and test the changes. If the changes do not work, you roll back by re-deploying the existing master.

Merge is used to combine an individual’s local repository with a server-housed copy of the branch or to combine two branches.

GitHub is an Internet based Git repository used by a lot of people and a lot of OpenSource projects. Projects are publicly readable (well, projects held in free accounts. There’s an add-on fee that allows you to maintain private projects). Yes, we could just get Git enterprise licenses and use the hosted service. We elected to deploy an internally hosted and maintained server.

Process Flow:

In a simple development environment like we have (we’re not dedicated programmers working on enormous applications), branches are straightforward. We’ve got a master for a project. When we encounter a problem, or wish to expand functionality, we make a working branch. Sort the issue or add the feature, commit. Deploy and test the code, then merge your working branch back into master.

If there is only one person working on a project at a time, merge conflicts are not really a thing. We don’t have ten different branches, we don’t have branches from branches (e.g. a branch for implementing external authentication which then has a branch for LDAP, DB table, and external authentication providers. Then the external authentication provider branch has a branch for Facebook, .NET, and Google authentication providers.). When you have a tree full of branches, you need to resolve merge conflicts (pick which change makes it) before you can merge your pull request branch.

A lot of “how to use Git” is process-related and not technical how-to stuff. Questions like “what are your software development lifecycle management processes?” and “What are your criteria for creating a new branch?”

When I worked with full-time developers, we created an “EmerStaging” branch for development on critical incidents and a “DevStaging” branch for development on non-critical incidents. The EmerStaging branch was only intended to be around for a few days – the branch would start out identical to the master, whatever big deal issue would be sorted, then the branch merged back into the master. These changes would then be sync’d down to all other branches (we don’t want the bug to impact development or, worse, to be reintroduced as someone merges in their long-term development project). The DevStaging branch was always present – there’s always a backlog of lower priority bug-fix type stuff to be done – and the project maintainer would ensure the downstream branches were updated when they processed pull requests. In addition to these break/fix branches, a new branch existed for the long-term development work – next version application or specific new features that had not been assigned to a specific release iteration.

Our environment is not so complex – we should be able to get by with one development branch when there is active development on a project and only the master branch when changes are not being made. Following this process, we avoid the challenges of synchronizing and merging multiple branches and sub-branches.

The Git Client

Simply put, a Git client puts files on the local disk and pushes those files back to the server. The first step is getting a git client installed. The examples I am showing today are using the CLI utilities from https://git-scm.com/download/win simply because I already use them at home (it’s the version . Yes, there are other git clients. Lots. If you have used a different client that you prefer, go for it. Different clients will not corrupt a repository.

Some IDEs have Git integration – their own Git client – which may or may not work with our implementation (some are specific to GitHub / GitHub Enterprise, which is not the same thing). If you are using an IDE, it may be convenient to research integrating your IDE directly with Git. There is no need to, though – you can use the command line utilities to retrieve files, switch over to the IDE for your development work, and then use the command line utilities to add, commit, and merge your changes.

To install the Git-SCM clients, download and run the installer. The installation defaults are sufficient – although if you do not have the Win32 port of the GNU utilities, you can select the third option to get grep and such in DOS.

Once the installation completes, grab the two files from \\CWWAPP695\c$\Program Files\Git\mingw64\ssl\certs and put them into your install path\git\mingw64\ssl\certs folder (I renamed the existing ones, but there’s no reason not to delete them). If you see the error “SSL certificate problem: unable to get local issuer certificate”, re-read the last sentence and try again.

Identify a folder on your computer into which you want to clone projects. You can store different projects in distinct locations or you can have a top-level folder in which all your projects are housed.

Creating A New Repository

Log into https://csggit.windstream.com using your Active Directory username and password (no need to specify domain). Repositories are sorted into groups – a group may be a single application project. For example, “AD Password Filter”. A group may contain several different application projects – for example, “Auth Samples”.

To create a new repository, click the big blue button in the upper right-hand corner that says exactly that. Provide a name for the repository – this cannot contain spaces, but should be descriptive enough that people do not need to actually read through the code to see what the program does. I am making a project called “HelloWorld” because … tradition.

Supplying a group name will sort the repository into a group on that first page – please do this, even if your group is your program. Otherwise it’s like creating all of your files in one folder … fine for a small number of files, but quickly difficult to look at. We may want to make a Misc group to hold oddball one-off programs.

The description field provides a place for freeform text describing the purpose of the program. This doesn’t have to be long, but it would be nice to have something. We can consider adding the server(s) to which the code is deployed – that would provide a quick way to list our scripts, what they do, and where they run.

Contributors can clone, push, and pull a repository. Administrators are additionally able to edit the repository details (i.e. change the stuff we’re putting in here now) and delete the repository.

Select the team(s) which will need to access the repository. Disclaimer – most of my experience with Git is at home using a GitLab server. There are only two of us, so permissioning isn’t really a concern. Not sure exactly how secure this is (i.e. if I don’t select Directory Design, can they still view the source but cannot write to it? Do they not even see the repository? I’m not interested enough to get another ID and add it into the security group, but if someone wants to test now … that would be cool.). Click ‘Create’ and the repository will be created.

Look near the top of the page – there will be a hyperlink to go to the new repository. Click that.

We will need the “General Url” as we begin working with the repository (i.e. copy the link address now).

Working With Your Repository

Now that you have a project and URL, clone the project to your local repository – if this is a new project, ignore the warning. If this is a project you expect to have some existing content … well, don’t ignore the error:

D:\tempCSG\ljr\Git>git clone https://csggit.windstream.com/CSG/LJR.git

Cloning into ‘LJR’…

warning: You appear to have cloned an empty repository.

 

If you are using the Git credential manager, you will be asked to authenticate to the server the first time you clone a repository. You do not need to specify the domain. When you change your password, you can use the Windows Credential Manager to edit your stored credential.

Once the connection has been authenticated, the client will clone the repository and voila, we have stuff:

D:\tempCSG\ljr\Git\LJR>dir

Volume in drive D is Data

Volume Serial Number is FA7B-B3E4

 

Directory of D:\tempCSG\ljr\Git\LJR

 

06/26/2017  02:33 PM    <DIR>          .

06/26/2017  02:33 PM    <DIR>          ..

0 File(s)              0 bytes

2 Dir(s)   9,644,441,600 bytes free

 

OK, that wasn’t a whole lot of stuff – it just created a folder for my application! Make some files in there – that may mean using the folder as your IDE project location. It may mean using notepad and making a new file. Whatever your approach, make a new file and add some code.

D:\tempCSG\ljr\Git\LJR>notepad helloworld.pl

Then add the new file(s) to the local git repository – important bit here, we are currently making changes to our copy. If you check the server, it is still an empty project.

D:\tempCSG\ljr\Git\LJR>git add *

The * is a wildcard – if you are working on a larger project, you can add just the files you are updating (i.e. I could use git add helloworld.pl here). I used the wildcard here because a lot of people like the convenience. Personally, I always recommend programmers add by name to ensure they are adding the proper ‘stuff’ to the project. There are other short-cut add options: in Git 2.x, git add . will stage new, modified, and deleted files under the current directory; git add -u will stage modified and deleted files that are already tracked (not new files); and git add -A will stage every change in the working tree.
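The difference between -u and -A is easiest to see by trying it. A minimal sketch in a throwaway repository; the file names and the identity are placeholders.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
git init -q "$demo"
# Placeholder identity (assumption) so the commit works without any git config.
git -C "$demo" config user.name "Example Admin"
git -C "$demo" config user.email "admin@example.com"

echo "first file" > "$demo/keep.txt"
echo "second file" > "$demo/old.txt"
git -C "$demo" add keep.txt old.txt
git -C "$demo" commit -q -m "Initial files"

# Modify one tracked file, delete another, and create an untracked file.
echo "changed" >> "$demo/keep.txt"
rm "$demo/old.txt"
echo "brand new content" > "$demo/new.txt"

# -u stages changes to tracked files only: the edit and the deletion.
git -C "$demo" add -u
after_u=$(git -C "$demo" status --porcelain)
echo "$after_u"

# -A additionally stages the untracked file.
git -C "$demo" add -A
after_A=$(git -C "$demo" status --porcelain)
echo "$after_A"
```

In the first status listing new.txt still shows as untracked (??); in the second it shows as added (A).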

Then commit – since this is the first file, I am not using a great commit note. Generally I’ve recommended making the first commit note a link to the requirements document … basically if I wanted to find out why we’ve got this program and what it is meant to do, where do I go? In our case, this might be an INC # or a SharePoint URL. Or it might just be a freeform text like “Provision DMZAD group memberships from acildsdb:OSR2.CWSODMZTable”.

D:\tempCSG\ljr\Git\LJR>git commit -m "Created project"

[master (root-commit) 7a66c68] Created project

1 file changed, 2 insertions(+)

create mode 100644 helloworld.pl

 

Check the web view to see what’s in the project: nothing. Push the changes:

D:\tempCSG\ljr\Git\LJR>git push

Counting objects: 3, done.

Writing objects: 100% (3/3), 256 bytes | 0 bytes/s, done.

Total 3 (delta 0), reused 0 (delta 0)

To https://csggit.windstream.com/CSG/LJR.git

* [new branch]      master -> master

I mentioned earlier that making updates in the master branch is not a best practice … the first time around is an exception … there’s no production implementation that you’re going to bugger up. Now that we’ve got a project that’s running in production (pretend), we’ll make a branch when we want to make changes. Check out the branch – this changes your git ‘context’ to the new branch.

D:\tempCSG\ljr\Git\LJR>git branch newEdits

D:\tempCSG\ljr\Git\LJR>git checkout newEdits

Switched to a new branch ‘newEdits’

Yes, there is a shortcut to doing this – “git checkout -b newBranchName”. Push the new branch:

D:\tempCSG\ljr\Git\LJR>git push origin newEdits

Total 0 (delta 0), reused 0 (delta 0)

To https://csggit.windstream.com/CSG/LJR.git

* [new branch]      newEdits -> newEdits

 

Make some more changes and add the changed file(s) to the local repo

D:\tempCSG\ljr\Git\LJR>notepad helloworld.pl

D:\tempCSG\ljr\Git\LJR>git add helloworld.pl

Commit the changes and push to the server:

D:\tempCSG\ljr\Git\LJR>git commit -m "Added international support"

[newEdits 28365d9] Added international support

1 file changed, 8 insertions(+), 1 deletion(-)

D:\tempCSG\ljr\Git\LJR>git push origin newEdits

Counting objects: 3, done.

Delta compression using up to 2 threads.

Compressing objects: 100% (2/2), done.

Writing objects: 100% (3/3), 335 bytes | 0 bytes/s, done.

Total 3 (delta 0), reused 0 (delta 0)

To https://csggit.windstream.com/CSG/LJR.git

7a66c68..28365d9  newEdits -> newEdits

 

Now if you look @ the repository browser on the web site, https://csggit.windstream.com/CSG/Repository/LJR/newEdits/Blob/helloworld.pl, you will see the additions we’ve made. Add some more and repeat the process.

D:\tempCSG\ljr\Git\LJR>notepad helloworld.pl

D:\tempCSG\ljr\Git\LJR>git add helloworld.pl

D:\tempCSG\ljr\Git\LJR>git commit -m "Added Swedish and Hungarian greetings"

[newEdits b2cbedd] Added Swedish and Hungarian greetings

1 file changed, 2 insertions(+)

 

D:\tempCSG\ljr\Git\LJR>git push origin newEdits

Counting objects: 3, done.

Delta compression using up to 2 threads.

Compressing objects: 100% (2/2), done.

Writing objects: 100% (3/3), 340 bytes | 0 bytes/s, done.

Total 3 (delta 1), reused 0 (delta 0)

To https://csggit.windstream.com/CSG/LJR.git

28365d9..b2cbedd  newEdits -> newEdits

Now look at repository explorer and see how the history is tracked – look @ each commit (notice the commit messages and who made the changes). Click into previous version and see how the differences are tracked.

Fast Forward Merging:

This is possible for simple projects like we’re using – there is a master, a branch for changes, then that branch gets collapsed back into the master when the changes have been finished.

D:\tempCSG\ljr\Git\LJR>git checkout master

Switched to branch ‘master’

Your branch is up-to-date with ‘origin/master’.

 

D:\tempCSG\ljr\Git\LJR>git merge newEdits

Updating 7a66c68..08931d5

Fast-forward

helloworld.pl | 12 +++++++++++-

1 file changed, 11 insertions(+), 1 deletion(-)

 

D:\tempCSG\ljr\Git\LJR>git push origin master

Total 0 (delta 0), reused 0 (delta 0)

To https://csggit.windstream.com/CSG/LJR.git

7a66c68..08931d5  master -> master
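What makes this a fast-forward is that master has not moved since the branch was created, so git only has to slide the master pointer along to the branch tip. A sketch of that in a throwaway repository, with placeholder file contents and identity; the --ff-only flag is my addition and makes git refuse anything that is not a fast-forward.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
git init -q "$demo"
# Pin the branch name used in the text, regardless of init.defaultBranch.
git -C "$demo" symbolic-ref HEAD refs/heads/master
# Placeholder identity (assumption) so commits work without any git config.
git -C "$demo" config user.name "Example Admin"
git -C "$demo" config user.email "admin@example.com"

printf '%s\n' 'print "Hello World\n";' > "$demo/helloworld.pl"
git -C "$demo" add helloworld.pl
git -C "$demo" commit -q -m "Created project"

# Do the work on a branch.
git -C "$demo" checkout -q -b newEdits
printf '%s\n' 'print "Hej\n";' >> "$demo/helloworld.pl"
git -C "$demo" add helloworld.pl
git -C "$demo" commit -q -m "Added Swedish greeting"

# Collapse the branch back into master; no merge commit is created.
git -C "$demo" checkout -q master
git -C "$demo" merge -q --ff-only newEdits

master_tip=$(git -C "$demo" rev-parse master)
branch_tip=$(git -C "$demo" rev-parse newEdits)
merge_commits=$(git -C "$demo" rev-list --merges --count master)
```

After the merge both names point at the same commit, which is why deleting the branch afterwards loses nothing.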

 

Check web site – you’ll see your changes in master. But newEdits branch is still there.

D:\tempCSG\ljr\Git\LJR>git branch

* master

newEdits

My recommendation is to collapse the branch (delete it) when you have completed your changes. Otherwise you need to manage branches and merges. If there’s a need for multiple branches of sustained development … that’s beyond the scope of a quick brain share. You can find information on more complex merging operations, including conflict resolution (https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging#_basic_merging) and rebasing (https://git-scm.com/book/en/v2/Git-Branching-Rebasing). Google can also tell you the ongoing debate about etiquette around creating new branches, merging, and rebasing.

To delete a branch once development has been completed and the changes have been merged into master:

D:\tempCSG\ljr\Git\LJR>git push origin –delete newEdits

To https://csggit.windstream.com/CSG/LJR.git

 - [deleted]         newEdits

 

D:\tempCSG\ljr\Git\LJR>git branch -d newEdits

Deleted branch newEdits (was 08931d5).

Notice the commit history / notes from the newEdits branch are now part of master, so we haven’t lost anything by merging our branch into the master and then deleting it.

Your local repository is not automatically updated with changes other people commit to the project. A pull retrieves changes pushed by others to the Git server and merges them into your current branch. Alternately, use separate fetch and merge operations to download the changes and then play those changes into your local repository.
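To make the pull versus fetch-and-merge distinction concrete, here is a rough sketch in a POSIX shell (Git Bash works on Windows). Everything in it is a stand-in I made up for the demo: a bare repository in a temp directory plays the Git server, "mine" and "theirs" are two people's clones, and the demo identity is not a real account.

```shell
set -e
work=$(mktemp -d)

# Throwaway identity for the demo commits (assumption, not a real account).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Seed a repository with one commit on master.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
echo 'print "hello";' > "$work/seed/helloworld.pl"
git -C "$work/seed" add helloworld.pl
git -C "$work/seed" commit -q -m "Initial commit"

# A bare clone stands in for the Git server; "mine" and "theirs" are local copies.
git clone -q --bare "$work/seed" "$work/server.git"
git clone -q "$work/server.git" "$work/mine"
git clone -q "$work/server.git" "$work/theirs"

# A colleague pushes a change. My copy knows nothing about it yet.
echo 'print "goodbye";' >> "$work/theirs/helloworld.pl"
git -C "$work/theirs" commit -q -am "Add farewell"
git -C "$work/theirs" push -q origin master

# The one-step version would be: git pull origin master
# The two-step version lets me look at what is incoming before merging it:
git -C "$work/mine" fetch -q origin
git -C "$work/mine" log --oneline master..origin/master
git -C "$work/mine" merge -q --ff-only origin/master
```

The --ff-only flag is my own cautious choice here: the merge refuses to do anything if my copy has diverged, instead of quietly creating a merge commit.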

Since we are not full-time developers, we might opt not to persistently store projects locally (i.e. we have a specific program that needs to be updated, clone the repository locally, perform the edits, push and merge these edits, then destroy the local copy). Provided two people are not simultaneously working on the same project, the newly cloned project is up-to-date each time you start working on a program.

Stashing Changes

If you are working on a particular branch but not yet ready to commit your changes – and you need to work on something else starting from the previous commit – use “git stash save” to table the changes you’ve currently made. Make whatever changes you need to make, add those changes, commit them, and then use “git stash pop” to return the tabled changes.
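A minimal sketch of that stash round trip, run in a throwaway repository (the file names and identity are made up for the demo). I use the bare “git stash” form here because newer Git releases deprecate “git stash save” in favour of “git stash push”, and plain “git stash” behaves the same way on both.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Demo User"          # demo identity, not a real account
git config user.email "demo@example.com"

echo 'print "v1";' > helloworld.pl
echo 'notes v1' > notes.txt
git add helloworld.pl notes.txt
git commit -q -m "Initial commit"

# Half-finished work that is not ready to commit.
echo 'print "work in progress";' >> helloworld.pl

# Table it; the working tree goes back to the last commit.
git stash

# Make and commit the urgent change (a different file in this demo).
echo 'notes v2' > notes.txt
git add notes.txt
git commit -q -m "Urgent fix to notes"

# Bring the tabled edit back on top of the new commit.
git stash pop
git status --short
```

If the urgent change touches the same lines as the stashed edit, the pop can stop with a conflict that you resolve like any merge conflict.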

Getting Rid Of Stuff

The first question is should you remove something? We often keep old code around for future reference (you want to do something similar, instead of re-writing the whole thing … copy this old program and tweak it for the current need). But leaving every old bit of code in the repository is a bit like never deleting an e-mail message or document on disk … eventually you’ll have a big mess of useless stuff that you’re looking through and backing up.

You could change the repository group to “Archive” (or “zzArchive” so it sorts to the bottom of the web view) – this would retain the code but sort it out into a different logical container to identify it as no longer used code.

Some companies will set up a second Git server dedicated to archive – lower I/O requirements on hardware, not frequently backed up, etc. Old code is pushed up to the archive server and then deleted from the active code server. As we don’t currently have an archive Git server, this isn’t an option. But it is a possibility if inactive code that we want to keep becomes burdensome. Other companies archive the code outside of Git and delete the project from the repository.

To remove a file from the repository, use “git rm filename.xtn”. To remove a repository, you can click the little rubbish can next to the project on the web site.
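One thing worth seeing for yourself: “git rm” removes the file going forward, but earlier commits still hold it. A small sketch, using a scratch repository and invented file names:

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Demo User"          # demo identity, not a real account
git config user.email "demo@example.com"

echo 'print "still in use";' > keep.pl
echo 'print "retired";' > old.pl
git add keep.pl old.pl
git commit -q -m "Initial commit"

# Remove the file from the working tree and stage the removal in one step.
git rm -q old.pl
git commit -q -m "Remove retired script old.pl"

# The file is gone from the current version, but history still has it.
git show HEAD~1:old.pl
```

So removing a file tidies the current tree without shrinking the repository, which is part of why the archive options above exist.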

There is no such thing as removing a group – as soon as no repositories exist in the group, it will disappear from the web view.

A Note On Binary Files

The typical solution to storing large binary files in Git is to implement LFS – this feature is not yet supported in Bonobo. As such, avoid storing binary files in Git (media, compiled binaries, compressed data).

Binary files tend to be large. Because of the distributed nature of Git, large files are transmitted and stored a lot of places. Frequently changing binary files bloat the server database too. This isn’t to say you cannot store binary files – just that it is a judgement call. Smaller and more static files, great. Three gig files that get updated daily … find another solution.

Many types of binary files do not compress well – especially already compressed files. You can disable delta compression in .gitattributes (*.mp4 binary -delta) to avoid the I/O of attempting to compress already-compressed data.

When merging binary files, diff just tells you they are different. Not particularly illuminating information if you are manually resolving merge conflicts. For non-text files, there may be a filter that allows changes to be represented in a readable format (e.g. Microsoft Word documents) by setting an appropriate filter in .gitattributes (*.docx diff=word). The diff would not include format changes (i.e. if I bolded a specific sentence, that would not be apparent in the diff), but it will display text content that has been updated.
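Here is a sketch of what those two .gitattributes entries look like in a repository, with “git check-attr” used to confirm Git picked them up. The file names are invented. The “word” diff driver also needs a text converter configured; docx2txt is my assumption for that converter and would have to be installed separately.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

cat > .gitattributes <<'EOF'
# Already-compressed video: treat as binary and skip delta compression
*.mp4  binary -delta
# Word documents: diff with the driver named "word"
*.docx diff=word
EOF

# Tell Git how the "word" driver turns a .docx into text.
# docx2txt is an assumed external tool, not something Git ships with.
git config diff.word.textconv docx2txt

# Ask Git which attributes apply to a couple of example paths.
git check-attr delta diff -- clip.mp4 report.docx
```

The paths passed to check-attr do not need to exist; Git just evaluates the patterns against the names.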

Remote Repositories

The whole point of Git is distributing copies of the repository elsewhere. It is possible to use Git locally – this would allow a single developer to track and revert changes – but typical implementations have multiple developers pulling from and pushing to a remote repository.

You can have more than one remote repository.

D:\tempCSG\ljr\Git\PKIWeb>git remote -v

origin  https://csggit.windstream.com/CSG/PKIWeb.git (fetch)

origin  https://csggit.windstream.com/CSG/PKIWeb.git (push)

You may notice that we have the word origin in some of the commands – this is the default name given to the remote repository when we clone it. You can add additional remote repositories (the example I am using is silly since they are the same location). This could be done to transfer a project to a different repository (moving an out-of-support product to an archive Git server or an acquired company moving repositories into the new company’s repository) or to pull a project from an alternate location (other users who maintain their own project for the same application).

D:\tempCSG\ljr\Git\PKIWeb>git remote add ljr https://csggit.windstream.com/CSG/PKIWeb.git

D:\tempCSG\ljr\Git\PKIWeb>git remote -v

ljr     https://csggit.windstream.com/CSG/PKIWeb.git (fetch)

ljr     https://csggit.windstream.com/CSG/PKIWeb.git (push)

origin  https://csggit.windstream.com/CSG/PKIWeb.git (fetch)

origin  https://csggit.windstream.com/CSG/PKIWeb.git (push)

Normally there’s no need for us to do this (i.e. don’t maintain your own copy of a project, create a branch in the existing one!), except if our project is a derivative work of an opensource project that we need to publish externally. You could have both the internal Git server and GitHub registered as repositories. Make your changes and do “push origin” as well as “push whateverYouCallGitHub”.

You can also “fetch origin” and “fetch whateverYouCallGitHub”, but to avoid confusion, I would use the internal Git server as the authoritative repository (anyone else in the group may be editing the code) and only push to GitHub.
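A sketch of the two-remote arrangement, with two bare repositories on local disk standing in for the internal Git server and GitHub (both paths, the README content, and the identity are placeholders for the demo):

```shell
set -e
work=$(mktemp -d)

# Stand-ins for the two servers.
git init -q --bare "$work/internal.git"
git init -q --bare "$work/github.git"

git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # demo identity, not a real account
git config user.email "demo@example.com"

echo 'Sample project' > README.md
git add README.md
git commit -q -m "Initial commit"

# Register both remotes, then push the same branch to each.
git remote add origin "$work/internal.git"
git remote add whateverYouCallGitHub "$work/github.git"
git push -q origin master
git push -q whateverYouCallGitHub master

git remote -v
```

Keeping the pushes as two explicit commands matches the advice above: the internal server stays authoritative and the external copy only receives what you deliberately send.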

When you no longer need a remote repository, you can remove it.

D:\tempCSG\ljr\Git\PKIWeb>git remote rm ljr

 

D:\tempCSG\ljr\Git\PKIWeb>git remote -v

origin  https://csggit.windstream.com/CSG/PKIWeb.git (fetch)

origin  https://csggit.windstream.com/CSG/PKIWeb.git (push)

README

If you participate in GitHub projects, you will notice a README.md file at the root of projects. This is a standard place to include documentation (hence the name), but it is also rendered out in the Git server web site. For an example, see the AD Password Filter project (https://csggit.windstream.com/CSG/Repository/ADPasswordFilter/master/Tree). If there is not a convenient external reference for the initial commit notes, you may want to consider including program documentation in the README.md file.

What if the changes don’t work?

One nice Git feature is undoing changes. The first thing you need to know is that commits have ID numbers (often called a SHA in documentation). You can find that using “git log” or by looking at the web site.

If the changes are local and haven’t been pushed to the server, just reset your local copy: git reset --hard ID#

If the changes haven’t been merged into the master branch yet (i.e. you clone your dev branch to the script server, test it … then realize that won’t work), use the git revert functions. First find the commit ID. Then use “git revert ID#” and git will create a commit that is the inverse of the commit specified (it undoes whatever the commit does). Don’t forget to push this revert back to the server.

If the problem is just the commit message, you can modify the message (i.e. remove a typo): git commit --amend -m "This is my new commit message"

You can temporarily revert to a specific commit version (say, to see if the problem you are having was introduced in this version) using “git checkout ID#”. If you intend to make changes from the old state, use “git checkout -b previousState ID#” to create a new branch from that point.
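The undo options above, strung together in one scratch repository so you can see what each does to the history. This is only a sketch: the file contents, messages, and identity are invented, and in real life you would pick the one command that fits your situation.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # demo identity, not a real account
git config user.email "demo@example.com"

echo 'print "v1";' > helloworld.pl
git add helloworld.pl
git commit -q -m "Version 1"
good=$(git rev-parse HEAD)

echo 'print "v2";' > helloworld.pl
git commit -q -am "Verzion 2"

# Only the message is wrong: rewrite it. The commit ID changes, so do this before pushing.
git commit -q --amend -m "Version 2"
bad=$(git rev-parse HEAD)

# The change is already shared: add a new commit that is the inverse of the bad one.
git revert --no-edit "$bad"

# A local experiment that never left this machine: throw it away.
echo 'print "experiment";' > helloworld.pl
git commit -q -am "Local experiment"
git reset -q --hard HEAD~1

# Start new work from the old state on its own branch.
git checkout -q -b previousState "$good"
git log --oneline master
```

Revert leaves the bad commit in the history and adds a new one on top, while reset --hard drops commits outright, which is why reset is only safe for work nobody else has pulled.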

Ingesting Existing Code

Create the repository. In the directory with your existing code, initialize the directory as a git repository. Add all files to the local repository and commit the initial file load.

D:\Scripts\ljl\wincare-oud>git init

Initialized empty Git repository in D:/Scripts/ljl/wincare-oud/.git/

 

D:\Scripts\ljl\wincare-oud>git add *

 

D:\Scripts\ljl\wincare-oud>git commit -m "Uploading existing code to project"

[master (root-commit) 231bafa] Uploading existing code to project

2 files changed, 76 insertions(+)

create mode 100644 _simulateWincare.pl

create mode 100644 res.txt

 

Add a remote repository location (you can use “git remote -v” to confirm the repository has been added) and push the local repository to the remote.

D:\Scripts\ljl\wincare-oud>git remote add origin https://csggit.windstream.com/CSG/WinCareOUDTesting.git

D:\Scripts\ljl\wincare-oud>git push origin master

Counting objects: 4, done.

Delta compression using up to 2 threads.

Compressing objects: 100% (3/3), done.

Writing objects: 100% (4/4), 1.06 KiB | 0 bytes/s, done.

Total 4 (delta 0), reused 0 (delta 0)

To https://csggit.windstream.com/CSG/WinCareOUDTesting.git

* [new branch]      master -> master

But wait …

We’ve got a whole bunch of code written and stashed somewhere … but how does that deploy it? It doesn’t. For compiled code, there would be a build process that follows the commits. Someone like a build manager (or an automated process) takes the updated source code, compiles it, hands it off for testing (may be manual testing by QA people or may be an automated test program), then supplies the compiled binaries for release or deployment.

With our interpreted code, using Git is a process change. Instead of going to the task server, copying the script file to something-ljr.xtn, editing my copy, testing, then moving my copy back to something.xtn – we would branch the master for development, clone the development branch to our workstation or elsewhere on the terminal server, make changes, test, commit and push those changes, then merge the development branch back into master.

Once the branch has been merged into master, use git on the task server to integrate changes. (The shortcut below can also be done as “git fetch origin master” followed by “git merge origin/master”.) I am assuming that fast-forward merges can be done.

D:\Scripts\ljl\wincare-oud>git pull origin master

From https://csggit.windstream.com/CSG/WinCareOUDTesting

* branch            master     -> FETCH_HEAD

Updating 231bafa..202da14

Fast-forward

_simulateWincare.pl | 10 +++++++++-

1 file changed, 9 insertions(+), 1 deletion(-)

On next script execution, the updated code will be used.

Etiquette

There are guidelines for contributing to OpenSource projects (https://opensource.guide/how-to-contribute/) – if you will be working on public projects, read the guidelines and engage with the other developers. Individual projects may have their own guidelines – Git itself is an OpenSource project on GitHub, but pull requests against the obvious repository (named Git) are ignored.

Here, we all know each other … if you see a ticket that requests a new column in a report or a different format for an export, make a development branch, sort the issue, test it, and merge the development branch back into master.

There is one part of the OpenSource guidelines that produces more readable code when multiple individuals are contributing: coding standards. Software development teams have formal documents that define all manner of form within their coding. How to name variables. Are spaces or newlines used before braces? Are spaces used before parentheses? How are functions named? What does a program or function comment block look like? How are variable and function names cased? When looking at OpenSource projects – or our internal team code – there isn’t a single coding standard. In the absence of a company-supplied standard, most individuals have one of their own. From a class, from a previous job … something.

Some people prefix variable names with type indicators (in a statically typed language, you’ve got to search up to the variable declaration otherwise). Some people appreciate concise code and write if(x == y){ doWhatever; } all on one line, others would consider that hopelessly unreadable. Some people use switch statements, some hate them and would rather long-form the if/elseif/else version. If you are making a quick change (+2 needs to be +4 or some word was misspelt), you don’t need to review the code to see how it is written. For anything beyond a quick edit, it is polite to look at how the project maintainer (or original author in our case) has written the code and follow their form.

 

Git Deployment

I ‘inherited’ the Git server at work, which means I had to learn how the back end component of Git works (beyond my file-system based implementation where there are just clients and a disk location). It is not as complicated as I feared. The chap who had deployed the Git backend at work chose Bonobo. Since he no longer works for the company, I cannot just ask why this particular implementation. It’s Windows based and priced in our $0 budget, and I am certain these were selling points. It seems quite stripped down compared to GitHub too: none of the issue tracking / Wiki / chat about it features. Which, for what my department does, is fine. We are not software developers. We have a lot of internal code for task automation, we have some internal code for departmental web sites, and we have some sample code we hand out to other developers (i.e. someone wants to start using LDAP or ADFS authentication, we can give them a sample implementation in their language). There aren’t feature requests. Generally speaking, there aren’t simultaneous development tasks on a project.

Since I deciphered the server implementation at work, I wanted to set up a Git server at home too. The limited feature set of Bonobo was off-putting. I wanted integrated issue tracking. Looking at the available opensource and free options, I selected GitLab. As a sandbox (poke around the server, see how it works and what features it offers) I wanted something ready-to-go. I noticed that there is a Docker container for the project. I helped a few friends who were testing Docker as a development and deployment methodology (I’ve even suggested it for my employer’s internal development staff … being able to develop and run an application with an integrated web server *without* needing the Windows permissions and configuration for a web server (and doing it all over again when your computer is replaced) seemed efficient). But I’d never actually used a Docker container before. It is incredibly easy.

Install docker — a bit obvious, but that was the most time consuming part of the process. I elected to install it on my Windows laptop for expediency. If we decide not to use GitLab, I haven’t thrown a bunch of unnecessary binaries on the server. Lenovo, as a default, does not enable virtualisation. Getting into the BIOS config tool (shift then click the power button, keep holding shift whilst you click restart) was the most time consuming bit of the installation.

Once Docker is installed, pull the container from the Docker store (docker pull gitlab/gitlab-ce). Then run it (docker run --detach --hostname gitlab.rushworth.us --publish 443:443 --publish 80:80 --publish 22:22 --name gitlab --restart always --volume /srv/gitlab/config://c/gldata/etc --volume /srv/gitlab/logs:/var/log/gitlab --volume /srv/gitlab/data://c/gldata/data --volume /svr/docker/gitlab/gitlab://c/gldata/gitlab gitlab/gitlab-ce:latest). You can remap ports (e.g. --publish 8443:443) if needed.

Not quite there yet: you’ve got to edit the container config (docker exec -it gitlab vi /etc/gitlab/gitlab.rb) for your environment. Set a valid external URL (external_url 'http://gitlab.rushworth.us'). I also enabled LDAP authentication to test that out.


gitlab_rails['ldap_enabled'] = true

###! **remember to close this block with 'EOS' below**
gitlab_rails['ldap_servers'] = YAML.load <<-'EOS'
  main: # 'main' is the GitLab 'provider ID' of this LDAP server
    label: 'LDAP'
    host: 'ADHostname.rushworth.us'
    port: 636
    uid: 'sAMAccountName'
    method: 'ssl' # "tls" or "ssl" or "plain"
    bind_dn: 'cn=UserID,ou=SystemAccounts,dc=domain,dc=ccTLD'
    password: 'AccountPasswordGoesHere'
    active_directory: true
    allow_username_or_email_login: false
    block_auto_created_users: false
    base: 'ou=ResourceUsers,dc=domain,dc=ccTLD'
    user_filter: '(&(sAMAccountName=*))' # Can add attribute value to restrict authorized users to GitLab access, we leave open to all valid user accounts in the OU. Should be able to authorize based on group membership using linked attribute value like (&(memberOf=cn=group,ou=groupOU,dc=domain,dc=ccTLD))
    attributes:
      username: ['uid', 'userid', 'sAMAccountName']
      email: ['mail', 'email', 'userPrincipalName']
      name: 'cn'
      first_name: 'givenName'
      last_name: 'sn'

EOS


The default is to retain a lot of log files — 30 days! This might be reasonable in a corporate environment, but even for production at home … that’s a lot of space dedicated to log files.


logging['logrotate_frequency'] = "daily" # rotate logs daily
logging['logrotate_rotate'] = 3 # keep 3 rotated logs
logging['logrotate_compress'] = "compress" # see 'man logrotate'
logging['logrotate_method'] = "copytruncate" # see 'man logrotate'


And finally configure SMTP for outbound mail. We don’t use authentication on our SMTP server; it controls relay based on source IP. We do use starttls, but the certificate is not going to be trusted without additional configuration … so I set the ssl verify mode to none.


gitlab_rails['smtp_enable'] = true
gitlab_rails['smtp_address'] = "smtp.hostname.ccTLD"
gitlab_rails['smtp_port'] = 25
# gitlab_rails['smtp_user_name'] = "smtp user"
# gitlab_rails['smtp_password'] = "smtp password"
# gitlab_rails['smtp_domain'] = "example.com"
# gitlab_rails['smtp_authentication'] = "login"
gitlab_rails['smtp_enable_starttls_auto'] = true
# gitlab_rails['smtp_tls'] = false

###! **Can be: 'none', 'peer', 'client_once', 'fail_if_no_peer_cert'**
###! Docs: http://api.rubyonrails.org/classes/ActionMailer/Base.html
gitlab_rails['smtp_openssl_verify_mode'] = 'none'


Once the config has been updated, restart the container (docker restart gitlab).

Access the web site and you’ll be prompted to set a password for the admin user, root. You can click the ‘ldap’ tab and log in with Active Directory credentials. Fin.

If we deploy this for a production system, I would set up SSL on the web site and possibly externalize the GitLab database to MySQL. The external database is more of an academic experiment because we already use MySQL (and I still don’t want to learn about vacuuming PostgreSQL).

Git, Version Management, Branches, and Sub-modules

As we have increased in staff, we’ve gained a few new programmers. While it was easy enough for a smaller group to avoid stepping on each other’s toes, we have since experienced several production problems that could be addressed by rethinking our repository configuration.

Current state: We have a monolithic repository for different batch servers. Each server has a clone of the repository, and the development equivalent has a clone of the same repository. The repository has top-level folders for each independent script. There is a SharedTools top-level folder for reusable functions.

Changes are made on forks located both on the development server and individuals’ computers, tested on the development server, then pushed to the repo. Under a CRQ, a pull is performed from the production server to elevate the new code. Glomming dozens of scripts into a single repository was simple and quick; but, with new people involved with development efforts, we have experienced challenges with changes being lost, unintentional elevation of code, and having UAT run against under-development code.

Pitfalls: Four people working on four different scripts are working in the same repository. We have had individuals developing on their laptop overwrite changes (force push is dangerous, even if force-with-lease is used), we have had individuals developing on the dev server commit other people’s edits (git add * isn’t a good idea in a shared environment – specifically add changed files to your commit), and we’ve had duplication of effort (which is certainly a problem outside of development efforts, and one that can be addressed outside of git).
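To show the “git add *” pitfall, here is a sketch where two people have uncommitted edits sitting in the same shared directory and I stage only my own file by path. The folder names follow our script-folder convention, but the file names, contents, and identity are invented.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Demo User"          # demo identity, not a real account
git config user.email "demo@example.com"

mkdir sendMassMail monitorLDAPReplication
echo '# send script' > sendMassMail/send.pl
echo '# monitor script' > monitorLDAPReplication/monitor.pl
git add sendMassMail/send.pl monitorLDAPReplication/monitor.pl
git commit -q -m "Initial scripts"

# My finished change, plus someone else's half-finished edit in the same directory.
echo '# my finished change' >> sendMassMail/send.pl
echo '# half-finished edit by a colleague' >> monitorLDAPReplication/monitor.pl

# Both show as modified; "git add *" would stage both.
git status --short

# Stage and commit only the file I actually changed.
git add sendMassMail/send.pl
git commit -q -m "sendMassMail: my change only"

# The colleague's edit is still uncommitted, waiting for them.
git status --short
```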

We could address the issues we’ve seen through training and communication – ensure anyone contributing code to the repository adequately understands what force push means, appreciates what wildcards include, and generally has a more nuanced understanding of git than the one-hour training I provided last year could give them. But I think we should consider the level of effort (LOE) and advantages of using a technical solution to ensure less experienced git users are able to successfully use our repositories.

Proposal – Functional Splits:

While we have a few more individuals with development experience, they are quite specifically Windows script developers (PowerShell, VBScript, etc). We could just stop using the Windows batch server and let the two or three Microsoft guys figure it out for themselves. This limits individual growth – I “don’t do” PowerShell development, the Windows guys don’t learn Linux. And, as the group changes over time, we have not addressed the underlying problem of multiple people working on the same codebase.

Proposal – Git Changes:

We can begin using branches for development efforts and reserve “master” for ready-for-deployment code. Doing so, we eliminate the possibility of inadvertently elevating code before it is ready – only commands targeted to “origin master” will be run on production servers.

Using descriptive branch names (Initials-ScriptFolderName-SummaryOfChange) will help eliminate duplicated efforts. If I notice we need to send a few mass mails with inline images, seeing “TJR-sendMassMail-AddInlineImages” in the branch list lets me know you’ve got it covered. And “TJR-sendMassMail-RecipientListFromLiveLDAPQuery” lets me know you’re working on something else and I’m setting myself up for merge challenges by working on my change right now. If both of our changes are high priority, we might choose to work through a merge operation. This would be an informed, upfront decision instead of a surprise message indicating that fast-forward merging is not possible.

In large development projects, branch management can become a full-time pursuit. I do not think that will be an issue in our case. Minimizing the number of branches used, and not creating branches based on branches, makes branch management a simpler task. We should be able to perform fast-forward merges to push code into master because our branches modify different files in the repository.

To begin a development effort, create a branch and push it to the git server. Make your changes within that branch, and ensure you keep your branch in sync with master – a branch that is “behind” master cannot be fast-forwarded into master, so bring it up to date (rebase it onto master, or merge master into it) rather than forcing the operation. Once you are finished with your development, merge your branch into master and delete your branch. This approach will require additional training to ensure everyone understands how to create, rebase, merge, and delete branches (and not to just force operations because it lets you complete your task).
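The whole branch life cycle, sketched end to end with a local bare repository standing in for the git server and two clones standing in for two people. The branch name follows the Initials-ScriptFolderName-SummaryOfChange pattern; the file names, contents, and identity are invented for the demo.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Seed repository with two script folders, then a bare "server" and two clones.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
mkdir "$work/seed/sendMassMail" "$work/seed/monitorLDAPReplication"
echo '# send script' > "$work/seed/sendMassMail/send.pl"
echo '# monitor script' > "$work/seed/monitorLDAPReplication/monitor.pl"
git -C "$work/seed" add sendMassMail/send.pl monitorLDAPReplication/monitor.pl
git -C "$work/seed" commit -q -m "Initial scripts"
git clone -q --bare "$work/seed" "$work/server.git"
git clone -q "$work/server.git" "$work/tjr"
git clone -q "$work/server.git" "$work/ljr"

# TJR creates a branch and pushes it so everyone can see the work is claimed.
cd "$work/tjr"
git checkout -q -b TJR-sendMassMail-AddInlineImages
git push -q origin TJR-sendMassMail-AddInlineImages
echo '# inline image support' >> sendMassMail/send.pl
git commit -q -am "sendMassMail: add inline images"

# Meanwhile someone else's change to a different script lands on master.
cd "$work/ljr"
echo '# OUD11 servers' >> monitorLDAPReplication/monitor.pl
git commit -q -am "monitorLDAPReplication: add OUD11 servers"
git push -q origin master

# TJR's branch is now behind master: rebase it, then fast-forward master.
cd "$work/tjr"
git fetch -q origin
git rebase -q origin/master
git checkout -q master
git merge -q --ff-only origin/master
git merge -q --ff-only TJR-sendMassMail-AddInlineImages
git push -q origin master

# Work is merged: delete the branch locally and on the server.
git branch -q -d TJR-sendMassMail-AddInlineImages
git push -q origin --delete TJR-sendMassMail-AddInlineImages
```

One caveat I glossed over: the branch here was only pushed at creation, so the rebase did not rewrite anything already on the server. If you push commits to your branch and then rebase, the next push of that branch needs force, which is exactly the habit we are trying not to build.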

Instead of using ‘master’ for production code, the inverse is equally viable: create a “stable” branch that is for production code and only pull that branch to PROD servers. I believe this approach is done to prevent accidental changes to prod code – you’ve got to intentionally target “origin stable” with an operation to impact production code.

Our single repository configuration is a detriment to using branches if development is performed on the DEV server, because everyone there shares one working directory and a working directory has only one branch checked out at a time. To illustrate the issue, create a BranchTesting repo and add a single file to master. Open two command windows in that same directory. Create a Branch1 branch in the first command window and check it out. Create a Branch2 in the second command window and check it out. In your first command window, add a file and commit it. In your second command window, add a file and commit it. You will find that both files have been committed to Branch2.
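The two-window experiment can be replayed as a single script, since both “windows” are just commands run in the same directory. A sketch with invented file names and identity:

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # demo identity, not a real account
git config user.email "demo@example.com"

echo 'shared file' > README.txt
git add README.txt
git commit -q -m "Initial commit on master"

# Window 1: create and check out Branch1.
git checkout -q -b Branch1
# Window 2, same directory: create and check out Branch2.
# This switches the one shared working directory, so Window 1 is now on Branch2 as well.
git checkout -q -b Branch2

# Window 1 believes it is committing to Branch1.
echo 'one' > file1.txt
git add file1.txt
git commit -q -m "Window 1: add file1"

# Window 2 commits to Branch2.
echo 'two' > file2.txt
git add file2.txt
git commit -q -m "Window 2: add file2"

# Branch1 has only the initial commit; Branch2 has everything.
git log --oneline Branch1
git log --oneline Branch2
```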

How can we address this issue?

Develop on our individual workstations instead of the DEV server. Not sharing a file set for our development efforts eliminates the branch context switching problem. If you clone the repo to your laptop, Gary clones the repo to his laptop, and I clone the repo to my laptop … you can create TJR-sendMassMail-AddInlineImages on your computer, write and test the changes locally, commit and push the changes, pull them to the DEV server for more robust testing, and then merge your changes into master when you are ready to elevate the code. I can simultaneously create LJR-monitorLDAPReplication-AddOUD11Servers, do my thing, commit and push changes, pull them to the DEV server (first using “git branch” to determine if someone else is already testing their branch on the DEV server), and merge my stuff into master when I’m ready to elevate. Other than remembering to verify that DEV has master checked out (i.e. no one else is testing, so the resource is free), we do not have resource contention.

While it may not be desirable to fill up our laptop drives with the entire code set from six different application servers, sparse-checkout allows you to select the specific folders that will be checked out in your clone.
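A sketch of sparse-checkout, assuming Git 2.25 or newer (where the “git sparse-checkout” command exists). A local repository with two invented script folders stands in for the server, and the “laptop” clone keeps only one folder in its working directory.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Stand-in for the monolithic repository on the server.
git init -q "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/master
mkdir "$work/server/sendMassMail" "$work/server/monitorLDAPReplication"
echo '# send script' > "$work/server/sendMassMail/send.pl"
echo '# monitor script' > "$work/server/monitorLDAPReplication/monitor.pl"
git -C "$work/server" add sendMassMail/send.pl monitorLDAPReplication/monitor.pl
git -C "$work/server" commit -q -m "Initial scripts"

# Clone to the "laptop", then limit the working directory to one folder.
git clone -q "$work/server" "$work/laptop"
cd "$work/laptop"
git sparse-checkout set sendMassMail

# Only the selected folder remains on disk.
ls
```

Note that this trims the files in the working directory, not the history stored under .git, so the space saving is in checked-out files only.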

The advantage of this approach is that it has no initial LOE beyond training and process change. The repositories are left as-is, and we start using them differently.

Unfortunately, this approach may not be viable in some instances – when access to data sources is restricted by IP ACL, you may not be able to do more than linting on your laptop. It may not even be possible to configure a Windows laptop to run some of our code – some Linux requirements are difficult to address in Windows (the PKI website’s cert info check, for instance), and testing code on Windows may not ensure successful operation on the Linux hosts.

Break the monolithic repositories into discrete repositories and use submodules to allow the multiple independent repositories to be “rolled up” into a top-level repository. Development is done in the submodule repositories. I can clone monitorLDAPReplication, you can clone sendMassMail, etc. Changes can be made within our branches of these completely different repositories and merged into the individual repository’s master branch for release to the production environment. Release can be done for the superset (“--recurse-submodules”) or individual sub-modules.

This would require splitting a repository into its individual components and configuring the sub-module relationships. This can be a scripted operation, and it is an incremental change to the script I used to create the repositories and ingest code; but the LOE for implementation is a few days of script writing / testing. Training will be required to ensure individuals can register their submodules within the top-level repo, and we will need to accustom ourselves to maintaining individual repos.
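Roughly what the submodule arrangement looks like, sketched with local repositories in a temp directory. The top-level name batchServer, the file names, and the identity are invented. The “-c protocol.file.allow=always” setting is only there because this demo uses local file paths, which recent Git releases refuse for submodules by default; with https URLs to a real server it should not be needed.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Two independent script repositories (stand-ins for repos on the Git server).
for name in sendMassMail monitorLDAPReplication; do
  git init -q "$work/$name"
  git -C "$work/$name" symbolic-ref HEAD refs/heads/master
  echo "# $name script" > "$work/$name/$name.pl"
  git -C "$work/$name" add "$name.pl"
  git -C "$work/$name" commit -q -m "Initial $name code"
done

# Top-level repository that rolls the script repositories up as submodules.
git init -q "$work/batchServer"
cd "$work/batchServer"
git symbolic-ref HEAD refs/heads/master
echo 'Scripts for the batch server' > README.txt
git add README.txt
git commit -q -m "Initial commit"
for name in sendMassMail monitorLDAPReplication; do
  git -c protocol.file.allow=always submodule --quiet add "$work/$name" "$name"
done
git commit -q -m "Register script repositories as submodules"

# Release of the superset: one clone brings down every submodule.
git -c protocol.file.allow=always clone -q --recurse-submodules "$work/batchServer" "$work/prod"
ls "$work/prod"
```

The top-level repository records a specific commit of each submodule, so elevating new script code also means committing the updated submodule pointer in the top-level repo. That is the extra step people will need training on.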

Or just break monolithic repositories into discrete repositories. The level of effort is about the same initially, but no one needs to learn how to set up a new submodule. We lose single-repo conveniences, but there’s literally no association between our different script folders where someone working in X could inadvertently impact Y.