Line endings handling in SVN, Git and SubGit

People who use Git on Windows often complain about the “LF will be replaced by CRLF” warning and other problems related to line endings. Googling the error messages turns up a lot of posts written in 2008-2010 that recommend setting the “core.autocrlf” config option to true or false.

These recommendations are suboptimal and obsolete, because

  1. the “core.autocrlf” option is not under version control
  2. the “core.autocrlf” option is set once for all files and doesn’t allow per-file control
  3. “core.autocrlf” can cause wrong line endings for some files (the set of problem files depends on the option value)
  4. since version 1.7.2 Git has offered a better way to control line endings: git attributes

Git attributes are specified in .gitattributes files. Line endings are controlled by “text” and “eol” attributes.

  • “text” attribute tells Git whether the file is binary (i.e. no EOL conversion is performed while checking out and in) or text (EOL conversion is performed, and line endings are always converted to LF while checking in). Possible values are set (EOL conversion is turned on), unset (EOL conversion is turned off) and “auto” (if the file is detected as binary, no conversion is performed; otherwise EOL conversion is performed). If the attribute is not specified at all, Git falls back to the “core.autocrlf” setting.
  • “eol” attribute: if set, it implicitly sets the “text” attribute and defines the EOL to which the file is converted while checking out.

The most useful combinations of these attributes are:
1. Always convert to LF in all OSes while checking out to the working copy (useful for shell scripts because some Unix shells like Dash fail when they encounter CRLFs)

/file.sh text eol=lf
EOLs conversion for eol=lf attribute


2. Always convert to CRLF in all OSes (for bat-scripts, for example)

/file.bat text eol=crlf
EOLs conversion for eol=crlf attribute


3. Always convert to OS-dependent EOL (LF for UNIX, CRLF for Windows). “!eol” means that Git will use the “core.eol” setting of .git/config. This setting is recommended for most source code files

/file.cs text !eol
EOLs conversion for !eol attribute


4. Binary file, no EOLs conversion.

/file.bin -text
No EOL conversion for -text attribute


Because the 3rd option is recommended for most files, it can be set as the default with the rule

* text !eol

but thanks to the text=auto value, the rule can combine the 3rd and 4th options:

* text=auto !eol

This means that for binary files no EOL conversion is performed, while for text files all EOLs are replaced with the OS-dependent EOL while checking out and with LF while checking in.
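To see which attributes Git actually resolves for a given path, one can ask Git itself with “git check-attr”. The following is only a minimal sketch: it assumes Git is installed, works in a throwaway temporary repository, and reuses the illustrative file names from the rules above (none of the files has to exist).

```shell
#!/bin/sh
# Sketch: inspect the line-ending attributes Git resolves for sample paths.
# Assumption: git is on PATH; the repository is a temporary scratch directory.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
git init -q .

# Default rule first, per-file overrides after it (later lines win).
cat > .gitattributes <<'EOF'
* text=auto !eol
/file.sh text eol=lf
/file.bat text eol=crlf
/file.bin -text
EOF

# Prints one "<path>: <attribute>: <value>" line per path and attribute.
attrs=$(git check-attr text eol -- file.sh file.bat file.cs file.bin)
printf '%s\n' "$attrs"
```

With these rules file.cs is covered only by the default rule, so it should be reported with “text: auto” and “eol: unspecified”, while file.bin should be reported with “text: unset”.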

In Subversion the EOL-related problem was solved long ago by the svn:eol-style and svn:mime-type properties.
1. svn:eol-style=LF — convert to LF in all OSes while checking out to the working copy
2. svn:eol-style=CRLF — convert to CRLF in all OSes while checking out to the working copy
3. svn:eol-style=native — convert to OS-dependent EOL in all OSes while checking out to the working copy
4. svn:eol-style is not set (optionally svn:mime-type is set to any value that doesn’t start with “text/”) — binary file, no conversion

As one can see, options 1-4 for Git correspond to options 1-4 for SVN, and SubGit actually performs this conversion on the fly, so Git users see the same behaviour after cloning a Git repository as SVN users do after checking out the linked SVN repository.
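The correspondence between the two lists can be written down as a small lookup. This is only an illustration of the mapping described above, not SubGit’s actual code; svn_eol_to_gitattr is a hypothetical helper name, and an empty value stands for “svn:eol-style is not set”.

```shell
# Sketch: map an svn:eol-style value to the matching .gitattributes rule.
# svn_eol_to_gitattr is a hypothetical helper, not part of SubGit.
svn_eol_to_gitattr() {
    path=$1
    eol_style=$2   # empty string means svn:eol-style is not set
    case "$eol_style" in
        LF)     printf '%s text eol=lf\n' "$path" ;;
        CRLF)   printf '%s text eol=crlf\n' "$path" ;;
        native) printf '%s text !eol\n' "$path" ;;
        "")     printf '%s -text\n' "$path" ;;
        # Subversion also accepts CR, which this article does not cover.
        *)      printf 'unsupported svn:eol-style: %s\n' "$eol_style" >&2
                return 1 ;;
    esac
}

svn_eol_to_gitattr /file.sh LF
svn_eol_to_gitattr /file.bat CRLF
svn_eol_to_gitattr /file.cs native
svn_eol_to_gitattr /file.bin ""
```

The four calls print the four per-file rules used in the Git examples earlier in this post.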

How SubGit converts line endings


Note that for the case svn:eol-style=CRLF Subversion keeps contents with CRLFs, but Git expects blobs to contain LFs, so content conversion is necessary in this case. Unlike SubGit, git-svn doesn’t perform content conversion at all, which can result in EOL problems. The picture also shows why it is better to use per-file rules rather than one global “core.autocrlf=true” (which is the same as the “* eol=crlf” rule) or “core.autocrlf=false” (which corresponds to the “* -text” rule) value.
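What “content conversion” means for the svn:eol-style=CRLF case can be shown with two tiny filters. This is a rough sketch of the idea only and not how SubGit implements it; to_lf and to_crlf are hypothetical helper names.

```shell
# Sketch: the two directions of EOL content conversion.
# to_lf and to_crlf are hypothetical helpers for illustration only.

# CRLF -> LF: what has to happen before content is stored as a Git blob.
# Note: this strips every CR byte, including stray ones inside a line.
to_lf() {
    tr -d '\r'
}

# LF -> CRLF: what has to happen before content goes back to a
# Subversion file that has svn:eol-style=CRLF.
to_crlf() {
    awk '{ printf "%s\r\n", $0 }'
}

# Round trip: two CRLF lines become LF lines and then CRLF lines again.
printf 'echo one\r\necho two\r\n' | to_lf | to_crlf | od -c
```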

Unfortunately, not all Git EOL settings can be mapped one-to-one to SVN settings. One of the reasons: Git allows recursive rules like “* attribute=value” and Subversion doesn’t. In this case SubGit applies changes in recursive Git rules to every Subversion file, but usually this doesn’t lead to any problems.

The default “* text=auto !eol” rule of .gitattributes can also be considered the Git analog of autoproperties because it applies automatically to every newly added file.

Git mirror of remote SVN repository

Some of you may know that we are already working on the next version of our tool, SubGit 2.0. This new version enables bi-directional synchronization of Git and SVN repositories located on different hosts.

You’ve heard it right: it is a remote bi-directional Git mirror of an arbitrary SVN repository.

Many users demand that functionality. The reason is obvious: some developers have no local access to SVN repositories, so they cannot use SubGit. Today I will show you a little trick with the current version, SubGit 1.0; it enables a writable Git mirror of a remote SVN repository with minimal admin access.

So, you have a Subversion repository somewhere at http://company.com/svn/repos/; let’s build a Git mirror out of it:

1. Create empty Subversion repository locally:

$ svnadmin create repos

2. Fetch the remote repository into the local one with an rdump-load cycle (see the topic on replication with svnrdump in the Subversion Book):

$ svnrdump dump http://company.com/svn/repos | svnadmin load repos

3. Install SubGit into the local repository to add Git part (see SubGit Book for more details):

$ subgit configure repos

Adjust the configuration according to your needs: specify branches and tags layout, add authors information, etc. Finally, create a translated Git repository by installing SubGit:

$ subgit install repos

In the remaining part we will use svnsync to enable synchronization of the created Git repository with the remote SVN repository.

4. Make sure the pre-revprop-change hook is enabled in the remote repository. This is needed to make svnsync work. Unfortunately, this requires admin access to the repository, but often this hook is already enabled.

$ nano /path/to/remote/repos/hooks/pre-revprop-change

#!/bin/sh
exit 0

Do not forget to make hook script executable:

$ chmod +x /path/to/remote/repos/hooks/pre-revprop-change
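The “exit 0” hook above lets any user change any revision property. If that is too permissive, a stricter variant in the spirit of the Subversion Book could accept changes only from the account that runs svnsync. The sketch below is an assumption-laden example, not a required part of this setup: “syncuser” is a hypothetical account name, and the argument order (repository path, revision, user, property name, action) is the one Subversion passes to this hook.

```shell
#!/bin/sh
# pre-revprop-change sketch: accept revision property changes only from
# the account that runs svnsync. Subversion passes these arguments:
#   $1 repository path, $2 revision, $3 user, $4 property name, $5 action
SYNC_USER="syncuser"   # hypothetical account name, adjust to your setup

check_revprop_change() {
    user=$3
    if [ "$user" = "$SYNC_USER" ]; then
        return 0
    fi
    echo "Only $SYNC_USER may change revision properties" >&2
    return 1
}

# When installed as a hook, Subversion supplies all five arguments.
if [ "$#" -ge 5 ]; then
    check_revprop_change "$@"
    exit $?
fi
```

A non-zero exit status makes Subversion reject the property change and show the message printed to stderr.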

5. Synchronize the local repository with the original one using svnsync (see topic on replication with svnsync in Subversion Book):

$ svnsync initialize http://company.com/svn/repos file:///path/to/repos
$ svnsync synchronize http://company.com/svn/repos
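svnsync takes repository URLs rather than plain paths, and a local repository is addressed as file:// followed by its absolute path, which gives three slashes in a row. A small sketch of how such a URL can be built on a Unix-like system follows; repo_file_url is a hypothetical helper name.

```shell
# Sketch: build a file:// URL for a local repository directory.
# repo_file_url is a hypothetical helper; it assumes a Unix-like system.
repo_file_url() {
    # Resolve the directory to an absolute path, then prefix the scheme.
    abs_path=$(cd "$1" && pwd) || return 1
    printf 'file://%s\n' "$abs_path"
}

# Prints the URL of the current directory.
repo_file_url .
```

The result can then be passed to svnsync in place of a hand-written file URL.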

6. Schedule svnsync runs to enable replication of the new revisions translated from Git by SubGit:

$ crontab -e

# m h dom mon dow command
3 * * * * svnsync synchronize http://company.com/svn/repos

If you’ve followed the steps above, you have created a Git mirror of the remote Subversion repository. Be aware of certain drawbacks this approach implies:

  • Due to limitations of svnsync, the remote SVN repository must be read-only. That means no one should ever commit directly to this repository; committers should use the Git mirror to publish all changes.
  • An admin has to enable the pre-revprop-change hook to make svnsync work. Hopefully, the hook is already enabled for your repository, as some admins do that beforehand.
  • The described approach does not work when the project is hosted in an SVN repository along with other projects: any modification committed to some other project inside the same Subversion repository inevitably breaks synchronization.

Fortunately, SubGit 2.0 doesn’t have any of these problems. Stay tuned!

Git mirrors of SVNKit and SqlJet repositories

Hi,

We have been using our SubGit product for self-hosting for more than a year already, but until now the Git side of the repository was not open to the public. Today I’ve found some time to make the necessary changes to the Apache configuration and, voilà, our dear users may now use Git to get SVNKit or SqlJet source code. To clone the corresponding project’s Git repository, just run:

$ git clone http://svn.svnkit.com/git/svnkit svnkit
$ git clone http://svn.svnkit.com/git/sqljet sqljet

SubGit-based server side git-svn mirror works like a charm :)

From Svn to Gitolite

Hi,

Today I’d like to describe how to migrate from Svn to Git while keeping the existing infrastructure in mind. One of the most common Subversion server configurations is a Linux server with Apache and the mod_dav_svn module serving one or more Subversion repositories. Each Subversion repository may contain one or more projects. Of course, I will use SubGit to make the migration smooth, i.e. without forcing users to switch from Subversion to Git overnight.

For Git, I’ve chosen Gitolite as a Git management tool, as it is relatively easy to install and has a lot of documentation resources.

Background

Here is the first picture. The initial configuration is shown in blue, and what we will gradually build in this post is shown in gray. This sample migration takes place on an Ubuntu Linux server:

Initial Setup

Step 1: Install Gitolite and create ‘Git’ user

To install Gitolite I’m using the “install as root” method from the Gitolite documentation. A few migration-specific additions are marked in bold.

On your workstation:

  • copy your ~/.ssh/id_rsa.pub file to /tmp/YourName.pub on the server. (The name of this file determines your gitolite username, so if you leave it as id_rsa.pub, your gitolite username will be id_rsa, which may not be what you want).

On your server, as root:

git clone git://github.com/sitaramc/gitolite
gitolite/src/gl-system-install
# defaults to being the same as:
# gitolite/src/gl-system-install /usr/local/bin /var/gitolite/conf /var/gitolite/hooks

# to upgrade gitolite, repeat the above commands. Make sure you use the
# same arguments for the last command each time.

# Git user should be in the same group as your Apache user:
useradd git -g www-data -m -s /bin/bash

# switch to the hosting user
su - git

# (now as git)
gl-setup /tmp/YourName.pub

# when prompted edit the following values in /home/git/.gitolite.rc file:
$REPO_UMASK = 0002; # default is 0077
$GL_GITCONFIG_KEYS = "core.sharedRepository"; # default is ""

# Change repositories directory permissions to let Apache user
# read its contents:
chmod ug+rx /home/git/repositories

On your workstation:

git clone git@server:gitolite-admin

Gitolite Installed

Step 2: Create Git repositories with Gitolite

Create an empty Git repository for each of your Subversion projects. In this post I assume that there are three Subversion projects in a single Subversion repository (p1, p2, p3), each with a standard trunk/branches/tags layout. You may have a different distribution as well as a different layout.

On your workstation, edit gitolite-admin/conf/gitolite.conf file:

repo    gitolite-admin
        RW+     =   alex
repo    testing
        RW+     =   @all

repo    p1
        RW+     =   alex
        config core.sharedRepository = true
repo    p2
        RW+     =   alex
        config core.sharedRepository = true
repo    p3
        RW+     =   alex
        config core.sharedRepository = true

Add more access rules if necessary, commit and push your change:


git add conf/gitolite.conf
git commit -m "repositories added"
git push
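With more than a handful of projects, typing the per-project stanzas by hand gets tedious. As a sketch only, the stanzas for p1, p2 and p3 shown above could be printed by a small loop and pasted into gitolite.conf; gitolite_stanza is a hypothetical helper, and the user name “alex” is simply the one used in this post.

```shell
# Sketch: print one gitolite.conf stanza per Subversion project.
# gitolite_stanza is a hypothetical helper; ADMIN_USER mirrors the example.
ADMIN_USER="alex"

gitolite_stanza() {
    printf 'repo    %s\n' "$1"
    printf '        RW+     =   %s\n' "$ADMIN_USER"
    printf '        config core.sharedRepository = true\n'
}

for project in p1 p2 p3; do
    gitolite_stanza "$project"
done
```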

Git Repositories Created

Step 3: Set up smooth Svn to Git migration

On the server, download and install SubGit. Use either the Debian package distribution or the zip archive.

sudo dpkg -i subgit_1.0.0-EAP-902_all.deb

Or

unzip subgit_1.0.0-EAP-902.zip

To set up migration, as www-data user, run:

$ sudo -u www-data subgit configure /var/svn/repos
SubGit version 1.0.0-EAP ('Miai') build #902
This is an EAP build, which you may not like to use in production environment.

Detecting paths eligible for translation...
Subversion to Git mapping has been configured in '/var/svn/repos':
/p1 : /var/svn/repos/git/p1.git
/p2 : /var/svn/repos/git/p2.git
/p3 : /var/svn/repos/git/p3.git

CONFIGURATION SUCCESSFUL

Adjust '/var/svn/repos/conf/subgit.conf' file
and then run
subgit install "/var/svn/repos"
to complete SubGit installation.

SubGit has detected the Subversion projects and created a default configuration. Now I will edit the SubGit configuration file at /var/svn/repos/conf/subgit.conf to specify our Gitolite repository locations instead of the default ones and to mark the repository as shared:

[core]
        # shared option must be set to 'true',
        # as long as apache and gitolite are run
        # by different users (i.e. www-data and git)
        shared = true
        ...
        # authors.txt consists of mapping lines:
        # svnUser=gitUser<gitUser@email.com>
        # this file is optional
        authorsFile = conf/authors.txt
[git "p1"]
        translationRoot = p1
        repository = /home/git/repositories/p1.git
        ....
[git "p2"]
        translationRoot = p2
        repository = /home/git/repositories/p2.git
        ...
[git "p3"]
        translationRoot = p3
        repository = /home/git/repositories/p3.git
        ...

As soon as you’re happy with configuration, enable migration:

$ sudo -u www-data subgit install /var/svn/repos
SubGit version 1.0.0-EAP ('Miai') build #902
...
INSTALLATION SUCCESSFUL
...

That’s all, Gitolite and smooth Svn to Git migration are now configured!

Smooth Svn to Git migration

Commit changes to Subversion, and Git users will pull them into their Git clones; push commits to the Git repository, and Subversion users will receive them. Try it :)

In case you have any questions or suggestions, please feel free to contact me at support@subgit.com
Thanks!

VisualSVN Subversion Server and Git

I’d like to start this blog with a few real-world examples of how to set up SubGit within infrastructure that is already in place. This post takes place in the strange world of Windows.

Initial configuration: VisualSVN Server on 64-bit Windows computer, provides read and write access to two Subversion repositories over HTTPS.

Primary Objective: Make Subversion repositories accessible for reading and modification with Git over HTTPS.
Secondary Objective: Reuse existing authentication settings, that are already configured for Subversion repositories.

In other words I’m about to set up instant bidirectional Svn to Git replication.

Configuration Details

This is a VisualSVN configuration I’ve created for this guide:

Initial VisualSVN Configuration

As you may see, the project repository is a single-project one with a standard trunk/branches/tags layout, and the main repository contains two subprojects, each with a standard layout.

VisualSVN is installed in C:\Program Files (x86)\VisualSVN Server directory.
Repositories are located in C:\Repositories directory.
VisualSVN uses standard Subversion authentication settings.

Add Git to Svn

First, install msysGit; you may download the installer from its download page. Make sure you select this option when installing msysGit (it is important for enabling HTTP access later!):

Important Git Installation Option

Second, change the account the VisualSVN service uses and modify the permissions of the C:\Repositories directory (the one where repositories are kept) as described in a VisualSVN knowledge base entry. It is important to make the VisualSVN service run on behalf of an account that has access to the Git you’ve just installed. Following the path of least resistance, I’ve used my personal account for that, but you may create a dedicated one.

After completing this step, I had the VisualSVN service running on behalf of the “HOST\alex” account and C:\Repositories was writable for “HOST\alex”. Yes, my name is Alex.

Finally, download the SubGit zip archive and unpack it into the C:\SubGit directory. Then run “subgit install” on the Subversion repositories to enable replication:

> C:\SubGit\bin\subgit install C:\Repositories\main
> C:\SubGit\bin\subgit install C:\Repositories\project

Install SubGit



Note that you must run “subgit install” on behalf of the same user that you’ve configured the VisualSVN service with.

Configure HTTP Access for Git

Important: VisualSVN comes with a somewhat truncated version of Apache, but fortunately it lacks only one of the modules needed for Git (mod_cgi). Download it (mod_cgi.so from Apache 2.2.21 for win32) and put it into the C:\Program Files (x86)\VisualSVN Server\bin directory.

Alternatively, you may get this missing module by installing Apache and taking mod_cgi.so file from the modules folder.

Then, edit C:\Program Files (x86)\VisualSVN Server\conf\httpd-custom.conf file:

LoadModule cgi_module bin/mod_cgi.so
LoadModule authz_user_module bin/mod_authz_user.so

SetEnvIf Request_URI "^/git/.*$" GIT_PROJECT_ROOT=C:/Repositories
SetEnvIf Request_URI "^/git/.*$" GIT_HTTP_EXPORT_ALL

ScriptAlias /git/ "C:/Program Files (x86)/Git/libexec/git-core/git-http-backend.exe/"

<Location /git>
  Options +ExecCGI

  Require valid-user
  AuthName "VisualSVN Server"
  AuthType Basic
  AuthBasicProvider file
  AuthUserFile "C:/Repositories/htpasswd"
</Location>

Note that this configuration example uses the “Basic” authentication option, granting read and write access to repositories to all Git users that are listed in the htpasswd file. Of course, you may configure something more complicated here.

Now restart VisualSVN Server service. That’s all!

Git for Svn

Try the following commands now:

# set this environment variable in case you're using SSL and 
# self-signed SSL certificate in VisualSVN (this is default).
> SET GIT_SSL_NO_VERIFY=true

> git clone https://alex@localhost/git/project project
> git clone https://alex@localhost/git/main/git/library lib

Thanks to SubGit, changes pushed from the cloned Git repository will be immediately propagated to the corresponding Svn repository and vice versa – new Svn revisions will be received by a pull performed from a cloned Git repository.

You may find more on SubGit at http://subgit.com/