🚀 Executive Summary

TL;DR: Migrating repositories between Azure DevOps (ADO) and GitHub Enterprise can be a productivity drain due to constant context-switching. This guide provides a Python script to automate the core migration process, efficiently moving code, history, and all branches from ADO to GitHub Enterprise.

🎯 Key Takeaways

  • Successful migration requires Azure DevOps and GitHub Enterprise Personal Access Tokens (PATs) with the appropriate `Code (Read)` and `repo` scopes, respectively, along with organization/project names and a Python 3 environment.
  • The migration is automated using a Python script that leverages the `requests` library for API interactions (fetching ADO repos, creating GHE repos) and the `subprocess` module for executing Git commands.
  • Repositories are mirrored using a `git clone --bare` command from ADO, followed by a `git push --mirror` command to transfer all branches, tags, and the full commit history to the newly created GitHub Enterprise repository.
  • Configuration details and sensitive credentials are kept out of the script via a `config.env` file, with a strong recommendation to fetch these secrets from secure vaults like Azure Key Vault or HashiCorp Vault in production setups.
  • Common pitfalls include incorrect PAT scopes, URL mismatches in the `config.env` file, repository name conflicts, and potential timeouts for very large repositories during `git clone` or `git push` operations.

Migrate Azure DevOps Repos to GitHub Enterprise

Hey there, Darian Vance here. If you’re juggling repositories between Azure DevOps (ADO) and GitHub Enterprise, you know the pain. For a while, my team was split between the two, and the constant context-switching was a hidden productivity killer. We’d have pull requests in one place, pipelines in another, and security scans all over. Consolidating everything into GitHub Enterprise was a game-changer for our workflow, and I want to show you exactly how we automated the core migration process. This guide will help you move the code, the history, and all the branches, saving you hours of manual work.

Prerequisites

Before we dive in, let’s make sure you have the necessary keys to the kingdom. You’re going to need:

  • An Azure DevOps Personal Access Token (PAT): With `Code (Read)` permissions at a minimum. I recommend giving it `Read & Write` if you plan to archive the old repos later.
  • A GitHub Enterprise Personal Access Token (PAT): This needs the full `repo` scope to create and write to repositories.
  • Your Organization & Project Names: The ADO organization and project name, as well as your GitHub Enterprise organization name.
  • Python 3 Environment: A working Python 3 installation. We’ll be using the `requests` and `python-dotenv` libraries.
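Before going further, it pays to fail fast on missing settings. Here’s a small helper (my addition, using the same variable names as the `config.env` file we create in Step 1) that reports which required values are absent or empty:

```python
import os

# Variable names match the config.env file used throughout this guide.
REQUIRED_VARS = ["ADO_ORG", "ADO_PROJECT", "ADO_PAT", "ADO_USER",
                 "GHE_API_URL", "GHE_ORG", "GHE_PAT"]

def missing_config(env=os.environ):
    """Return the names of required settings that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_config()` right after loading `config.env` lets you print one clear error instead of chasing cryptic 401s later.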

The Step-by-Step Migration Guide

Step 1: Project Setup and Configuration

First things first, let’s get our workspace ready. I’ll skip the standard virtualenv setup since you likely have your own workflow for that. The important part is to create a project directory and install the required Python libraries. You can do this with a simple `pip install requests python-dotenv` command.

Next, create a file named `config.env` in your project root. This is where we’ll store our credentials and configuration details, keeping them out of the script itself. Your file should look like this:

```bash
# Azure DevOps Configuration
ADO_ORG="your-ado-organization-name"
ADO_PROJECT="YourADOProjectName"
ADO_PAT="your_personal_access_token_for_ado"
ADO_USER="your_ado_username"

# GitHub Enterprise Configuration
GHE_API_URL="https://your-github-enterprise-url/api/v3"
GHE_ORG="your-github-enterprise-organization"
GHE_PAT="your_personal_access_token_for_github"
```

Pro Tip: In my production setups, I always fetch these secrets from a secure vault like Azure Key Vault or HashiCorp Vault at runtime. For this tutorial, the `config.env` file is perfectly fine, but avoid committing it to a public repository!
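If you do want the vault pattern, here’s a hedged sketch of what it could look like: a `get_secret` helper that reads from Azure Key Vault when an `AZURE_KEY_VAULT_URL` variable is set (that variable name is my own convention, not a standard) and otherwise falls back to the environment. The vault path assumes the `azure-identity` and `azure-keyvault-secrets` packages are installed:

```python
import os

def get_secret(name):
    """Fetch a secret from Azure Key Vault when AZURE_KEY_VAULT_URL is set,
    otherwise fall back to the environment (i.e., values from config.env)."""
    vault_url = os.getenv("AZURE_KEY_VAULT_URL")
    if vault_url:
        # Lazy imports so the env-var fallback works without the Azure SDK installed.
        from azure.identity import DefaultAzureCredential
        from azure.keyvault.secrets import SecretClient
        client = SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())
        # Key Vault secret names only allow letters, digits, and dashes,
        # so map e.g. ADO_PAT -> ADO-PAT.
        return client.get_secret(name.replace("_", "-")).value
    return os.getenv(name)
```

With this in place, you’d swap each `os.getenv("...")` call in the script for `get_secret("...")` and nothing else changes.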

Step 2: The Python Script – Initialization

Now for the fun part. Let’s create our Python script, which I’ll call `migrate_repos.py`. We’ll start by importing the necessary libraries and loading our configuration from the `config.env` file.

The logic here is straightforward: we load the environment variables, then immediately set up the authentication headers we’ll need for our API calls to both ADO and GitHub. This keeps our main logic clean.

```python
import os
import requests
import subprocess
import base64
from dotenv import load_dotenv

# Load environment variables from config.env
load_dotenv('config.env')

# --- Azure DevOps Configuration ---
ADO_ORG = os.getenv("ADO_ORG")
ADO_PROJECT = os.getenv("ADO_PROJECT")
ADO_PAT = os.getenv("ADO_PAT")
ADO_USER = os.getenv("ADO_USER")
ADO_API_URL = f"https://dev.azure.com/{ADO_ORG}/{ADO_PROJECT}/_apis/git/repositories?api-version=6.0"

# --- GitHub Enterprise Configuration ---
GHE_API_URL = os.getenv("GHE_API_URL")
GHE_ORG = os.getenv("GHE_ORG")
GHE_PAT = os.getenv("GHE_PAT")

# --- API Headers ---
# ADO expects Basic auth with an empty username and the PAT as the password.
ado_pat_b64 = base64.b64encode(f":{ADO_PAT}".encode('utf-8')).decode('utf-8')
ADO_HEADERS = {
    'Authorization': f'Basic {ado_pat_b64}',
    'Content-Type': 'application/json'
}

# GitHub accepts the PAT directly as a token.
GHE_HEADERS = {
    'Authorization': f'token {GHE_PAT}',
    'Accept': 'application/vnd.github.v3+json'
}

def get_ado_repos():
    """Fetches a list of repositories from the specified ADO project."""
    print("Fetching repositories from Azure DevOps...")
    response = requests.get(ADO_API_URL, headers=ADO_HEADERS, timeout=30)

    if response.status_code != 200:
        print(f"Error fetching ADO repos: {response.status_code} - {response.text}")
        return []

    repositories = response.json().get('value', [])
    print(f"Found {len(repositories)} repositories in project '{ADO_PROJECT}'.")
    return repositories
```
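One refinement worth considering: ADO can mark repositories as disabled (recent API versions expose an `isDisabled` field on each repo object), and attempting to clone those fails. A small filter, added here for illustration, keeps them out of the migration loop:

```python
def filter_active(repositories):
    """Drop repositories that ADO reports as disabled.
    The isDisabled field may be absent on older API versions, so default to False."""
    active = [r for r in repositories if not r.get('isDisabled', False)]
    skipped = len(repositories) - len(active)
    if skipped:
        print(f"Skipping {skipped} disabled repositories.")
    return active
```

You’d call it as `ado_repos = filter_active(get_ado_repos())` in the main block.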

Step 3: Creating and Mirroring Repositories

Next, we need two key functions. The first, `create_ghe_repo`, will take a repository name and create a new, private repository in our GitHub Enterprise organization. The second, `mirror_repo`, is the workhorse. It will perform the actual Git operations to clone the repository from ADO and push it, with all its history and branches, to the newly created GitHub repo.

Here’s the code for that. I’m using Python’s `subprocess` module to run the Git commands. This is how the script automates what you would otherwise do manually in your terminal.

```python
def create_ghe_repo(repo_name):
    """Creates a new private repository in GitHub Enterprise."""
    url = f"{GHE_API_URL}/orgs/{GHE_ORG}/repos"
    payload = {
        'name': repo_name,
        'private': True,
        'description': f"Migrated from ADO project: {ADO_PROJECT}"
    }

    print(f"Creating repository '{repo_name}' in GitHub Enterprise...")
    response = requests.post(url, headers=GHE_HEADERS, json=payload, timeout=30)

    if response.status_code == 201:
        print(f"Successfully created GHE repo: {repo_name}")
        return True
    elif response.status_code == 422:  # Repository already exists
        print(f"Warning: GHE repo '{repo_name}' already exists. Skipping creation.")
        return True
    else:
        print(f"Error creating GHE repo '{repo_name}': {response.status_code} - {response.text}")
        return False

def mirror_repo(ado_repo_url, ghe_repo_url, repo_name):
    """Mirrors a repository from ADO to GHE using git commands."""
    temp_dir = f"./{repo_name}.git"

    # Authenticated URLs. Note: we rebuild the ADO URL from the repo name
    # (rather than using the passed ado_repo_url) so credentials can be embedded.
    ado_clone_url = f"https://{ADO_USER}:{ADO_PAT}@dev.azure.com/{ADO_ORG}/{ADO_PROJECT}/_git/{repo_name}"
    ghe_push_url = f"https://x-access-token:{GHE_PAT}@{ghe_repo_url.split('//')[1]}"

    print(f"Starting mirror for '{repo_name}'...")

    try:
        # Step 1: Bare clone the repository from ADO
        print("  - Cloning (bare) from ADO...")
        subprocess.run(['git', 'clone', '--bare', ado_clone_url, temp_dir], check=True, capture_output=True)

        # Step 2: Mirror push to the new GitHub Enterprise repository
        print("  - Pushing (mirror) to GHE...")
        subprocess.run(['git', 'push', '--mirror', ghe_push_url], cwd=temp_dir, check=True, capture_output=True)

        # Step 3: Clean up the local temporary clone
        # (Unix-only; on Windows, use shutil.rmtree instead.)
        print("  - Cleaning up temporary directory...")
        subprocess.run(['rm', '-rf', temp_dir], check=True)

        print(f"Successfully mirrored '{repo_name}' to GHE.")
        return True
    except subprocess.CalledProcessError as e:
        print(f"  -!! GIT COMMAND FAILED for '{repo_name}' !!")
        print(f"  - STDERR: {e.stderr.decode()}")
        return False
```

Pro Tip: We use a `git clone --bare` followed by a `git push --mirror`. This is the most reliable way to copy everything — all branches, all tags, and the full commit history — without creating a working copy on disk. It’s efficient and clean.
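One caveat with those authenticated URLs: `ADO_USER` is often an email address, and a literal `@` in the userinfo part corrupts the URL. If that bites you, a helper like this (my addition, not part of the script above) percent-encodes the credentials before embedding them:

```python
from urllib.parse import quote

def auth_url(url, user, token):
    """Embed credentials in an HTTPS remote URL, percent-encoding both parts
    so that e.g. an '@' in an email-style username doesn't break the URL."""
    scheme, rest = url.split("://", 1)
    return f"{scheme}://{quote(user, safe='')}:{quote(token, safe='')}@{rest}"
```

For example, `auth_url("https://dev.azure.com/org/proj/_git/repo", "me@example.com", "pat123")` yields a URL with `me%40example.com` as the username, which Git parses correctly.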

Step 4: Putting It All Together

Finally, let’s write the main execution block. This part of the script will call our functions in sequence: get the list of ADO repos, loop through them, create a corresponding repo in GitHub, and then trigger the mirror process.

```python
def main():
    """Main function to orchestrate the migration."""
    ado_repos = get_ado_repos()

    if not ado_repos:
        print("No repositories found or an error occurred. Exiting.")
        return

    succeeded = 0
    failed = []

    for repo in ado_repos:
        repo_name = repo['name']

        # We need to construct the GHE repo URL for the push command
        ghe_domain = GHE_API_URL.split('/api/v3')[0].replace('https://', '')
        ghe_repo_url = f"https://{ghe_domain}/{GHE_ORG}/{repo_name}.git"

        print(f"\n--- Processing: {repo_name} ---")

        # 1. Create the repo in GHE
        if create_ghe_repo(repo_name):
            # 2. If creation is successful, mirror the contents
            if mirror_repo(repo['remoteUrl'], ghe_repo_url, repo_name):
                succeeded += 1
            else:
                failed.append(repo_name)
        else:
            failed.append(repo_name)

    print("\n--- Migration Summary ---")
    print(f"Total repositories processed: {len(ado_repos)}")
    print(f"Successfully migrated: {succeeded}")
    if failed:
        print(f"Failed to migrate: {len(failed)}")
        print("Failed repositories:", ", ".join(failed))
    print("-------------------------")


if __name__ == "__main__":
    main()
```

And that’s it! Run this script from your terminal (`python3 migrate_repos.py`), and it will methodically work through your ADO project, replicating each repository in GitHub Enterprise.
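To sanity-check the result, I like comparing ref counts between the two remotes with `git ls-remote <url>`. This optional parser (my addition, not part of the script above) turns that command’s stdout into branch and tag counts you can compare side by side:

```python
def count_refs(ls_remote_output):
    """Count branch and tag refs in `git ls-remote` output.
    Takes the raw stdout text; one '<sha>\t<ref>' pair per line."""
    heads = tags = 0
    for line in ls_remote_output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        ref = parts[1]
        if ref.startswith("refs/heads/"):
            heads += 1
        elif ref.startswith("refs/tags/") and not ref.endswith("^{}"):
            tags += 1  # skip peeled tag entries (the ^{} duplicates)
    return heads, tags
```

Run `git ls-remote` against both the ADO and GHE URLs, feed each output through `count_refs`, and the two (branches, tags) tuples should match after a successful mirror.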

Common Pitfalls (Where I Usually Mess Up)

Even with a script, things can go wrong. Here are a few traps I’ve fallen into myself:

  • Incorrect PAT Scopes: The number one issue is always PAT permissions. If your GitHub token doesn’t have the full `repo` scope, the script will fail on repo creation. If the ADO token can’t read code, it will fail on the fetch. Double-check them!
  • URL Mismatches: A typo in your ADO organization name or GHE API URL in the `config.env` file can cause all requests to fail. Make sure they are copied exactly.
  • Repository Name Conflicts: The script includes a basic check for existing repos in GitHub, but if you have complex naming conventions or forks, you might need to add more sophisticated logic to handle name clashes.
  • Large Repositories & Timeouts: For massive repositories (multiple gigabytes), the `git clone` or `git push` operations might time out. You may need to run the script on a machine with a fast, stable internet connection or handle these large repos manually as a separate step.
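For that last pitfall, a retry wrapper with exponential backoff can rescue transient network failures. This is an optional sketch (my addition); it assumes the wrapped callable raises on failure, whereas the script’s `mirror_repo` returns `False`, so you’d either adapt the return convention or wrap the individual `subprocess.run` calls:

```python
import time

def with_retries(fn, attempts=3, base_delay=5):
    """Run fn(); on an exception, wait with exponential backoff and retry.
    Re-raises the last exception once attempts are exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            print(f"  - Attempt {attempt} failed ({exc}); retrying in {delay}s...")
            time.sleep(delay)
```

For example, `with_retries(lambda: subprocess.run(clone_cmd, check=True))` retries a flaky clone twice before giving up.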

Conclusion

Automating this migration saves a ton of time and, more importantly, reduces the risk of human error. This script provides a solid foundation. From here, you can expand it to migrate branch policies, user permissions, or even integrate it into a larger orchestration tool. By consolidating your source code, you’re paving the way for a more streamlined, secure, and efficient development lifecycle. Good luck!


Darian Vance

Lead Cloud Architect & DevOps Strategist

With over 12 years in system architecture and automation, Darian specializes in simplifying complex cloud infrastructures. An advocate for open-source solutions, he founded TechResolve to provide engineers with actionable, battle-tested troubleshooting guides and robust software alternatives.


🤖 Frequently Asked Questions

❓ How can I automate the migration of Git repositories from Azure DevOps to GitHub Enterprise?

You can automate this using a Python script that fetches ADO repositories via API, creates corresponding repositories in GitHub Enterprise, and then uses `git clone --bare` and `git push --mirror` commands to transfer all history, branches, and tags.

❓ How does this automated migration approach compare to alternatives?

This Python script-based automation significantly reduces manual effort and the risk of human error compared to manually cloning and pushing each repository. It provides a robust, scriptable solution for bulk migrations, ensuring full history transfer without creating a working copy on disk, which is more efficient than individual manual transfers.

❓ What are the most common implementation pitfalls during this migration process?

The most common pitfalls include incorrect Personal Access Token (PAT) scopes for both ADO and GitHub, typos in organization or API URLs in the `config.env` file, repository name conflicts, and potential timeouts when migrating exceptionally large repositories due to network or size constraints.
