• How To Clean Up A Repository

    It is always pleasant to start a project from scratch: you create the repository yourself, define its coding style, and much else. But sometimes we inherit “difficult children” with a troubled past, including a mix of tabs and spaces, inconsistent line endings, and different encodings. Naturally, all this drives the version control system into hysterics. But no matter. Linux to the rescue!

    Formatting

    The expand utility will come in handy.

    find . -type f -name "*.php" -exec sh -c 'expand -t 4 "$1" > "$1.tmp" && mv "$1.tmp" "$1"' _ {} \;

    This command replaces each tab in every *.php file with spaces, padding to the next 4-column tab stop. The unexpand command works in the opposite direction.
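    For the opposite direction, here is a minimal sketch. It assumes GNU coreutils, since the --first-only option is a GNU extension, and spaces_to_tabs is a hypothetical helper name. It converts only leading indentation, so runs of spaces inside a line (in string literals, for example) are left alone.

```shell
#!/bin/sh
# Hypothetical helper: turn leading runs of 4 spaces back into tabs.
# Assumes GNU unexpand; --first-only limits conversion to leading blanks.
spaces_to_tabs() {
  unexpand -t 4 --first-only "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

    It can be applied to a single file, for example spaces_to_tabs index.php, or driven from find in the same way as the expand command.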

    Line Endings

    For historical reasons, there are several ways to end a line in a text file. The most common are the Windows style (CR+LF) and the Unix style (LF). Within a repository, though, you should pick one. Another wonderful utility, dos2unix, comes to our rescue. It is even more wonderful because it converts files in place, so there is no need to bother with temporary files.

    find . -type f -name "*.php" -exec dos2unix {} +
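    Before or after converting, it can be useful to see which files still contain Windows line endings. The following is a sketch only; list_crlf_files is a hypothetical helper name, and it simply looks for a carriage return anywhere in each *.php file.

```shell
#!/bin/sh
# Hypothetical helper: print the *.php files under a directory
# (default: the current one) that contain a carriage return.
list_crlf_files() {
  cr=$(printf '\r')   # a literal CR; command substitution strips only newlines
  find "${1:-.}" -type f -name '*.php' -exec grep -l "$cr" {} +
}
```

    Running list_crlf_files with no arguments after the dos2unix pass should print nothing if every file was converted.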

    One More Solution: A Git Hook

    In this case, we process only the files changed in the current commit, not every file in the repository. We also restrict processing to certain file types; by common consent, changing tabs to spaces in binary files is not worth doing.

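    # nano repository_path/.git/hooks/pre-commit
    # chmod +x repository_path/.git/hooks/pre-commit

    #!/bin/bash
    ALLOWED_EXTENSIONS=('php' 'phtml' 'js' 'css' 'html' 'htm')
    # process only files added, copied, or modified in this commit
    git diff --cached --name-only --diff-filter=ACM | while IFS= read -r FILENAME; do
      # take the text after the last dot, so "app.min.js" yields "js"
      FILENAME_EXTENSION=${FILENAME##*.}
      for ALLOWED_EXTENSION in "${ALLOWED_EXTENSIONS[@]}"
      do
        if [ "$ALLOWED_EXTENSION" = "$FILENAME_EXTENSION" ] ; then
          # fixing line endings
          dos2unix --quiet "$FILENAME"
          # converting tabs into spaces
          expand -t 4 "$FILENAME" > "$FILENAME.tmp" && mv "$FILENAME.tmp" "$FILENAME"
          # re-staging the cleaned file
          git add "$FILENAME"
        fi
      done
    done
    exit 0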
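
    The file type check is the part of the hook most worth testing on its own. Below is a sketch of that check as a standalone function. The name has_allowed_extension is hypothetical, and it swaps the array loop for a case statement so that it also runs under plain sh. The key detail is ${1##*.}, which strips the longest prefix ending in a dot and so keeps only the last extension.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file's last extension is on the allow list.
has_allowed_extension() {
  # "##*." removes everything up to the last dot: "lib/app.min.js" -> "js"
  case "${1##*.}" in
    php|phtml|js|css|html|htm) return 0 ;;
    *) return 1 ;;
  esac
}
```

    For example, has_allowed_extension "lib/app.min.js" succeeds, while has_allowed_extension "logo.png" fails, so a binary file is never passed to expand.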