Solved: Syncing Kindle Highlights to Obsidian Vault

🚀 Executive Summary

TL;DR: Kindle highlights often remain isolated, hindering knowledge connection in Obsidian. This Python script automates the extraction of highlights from `My Clippings.txt` and organizes them into an Obsidian vault, transforming passive reading into an active knowledge management system.

🎯 Key Takeaways

Kindle highlights are stored in the `My Clippings.txt` file, which can be accessed by connecting the Kindle device to a computer via USB.
The Python script parses the `My Clippings.txt` file, groups highlights by book, and writes them into corresponding Markdown files within a specified Obsidian vault path.
The script incorporates critical features like `utf-8-sig` encoding for `My Clippings.txt` to handle BOM, filename sanitization for book titles, and duplicate highlight prevention using a `set` for idempotency.


Alright, let’s talk knowledge management. I read a lot on my Kindle, and for years, my highlights were just digital dust, trapped on a device. I’d have a great insight, highlight it, and then promptly forget it. The real value comes from connecting these ideas, and for that, I use Obsidian as my digital brain. Manually transferring those highlights was a chore I kept putting off. This simple Python script I’m about to show you bridges that gap. It automatically pulls all my Kindle highlights and organizes them neatly in my Obsidian vault. This little piece of automation saves me from tedious copy-pasting and ensures my best ideas are always in one place, ready to be connected.

Prerequisites

Before we start, make sure you have the following ready:

An Amazon Kindle device.
An existing Obsidian vault on your computer.
Python 3 installed on your system.
A USB cable to connect your Kindle to your computer.

The Guide: Step-by-Step

Step 1: Locate Your “My Clippings.txt” File

This is the magic file where your Kindle stores every highlight and note you’ve ever made. It’s a simple text file, but it’s our source of truth.

Connect your Kindle to your computer using a USB cable. It should appear as a removable drive, like a USB stick.
Open the Kindle drive and navigate into the “documents” folder.
Inside, you’ll find a file named My Clippings.txt. This is what we’ll be working with. Copy this file to a project directory on your computer where we’ll build our script.

Step 2: The Python Script Setup

Now, let’s get our environment ready. You know the standard workflow for a new Python project: create a directory, spin up a virtual environment, and so on. I’ll skip detailing those initial setup commands since you likely have your own preferred way of doing things. For this script, we don’t need any external libraries, just a standard Python installation.

Create a file named kindle_sync.py in the same directory where you placed your My Clippings.txt file.

Step 3: The Core Logic – Our Python Script

This script does the heavy lifting. It reads the clippings file, parses out each highlight, groups them by book, and then writes them into corresponding notes in your Obsidian vault. It’s also smart enough not to create duplicate entries if you run it multiple times.
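To make the parsing step concrete, here is a minimal sketch of how an entry is split apart. The sample text is made up for illustration (the book, author, and timestamps are not from a real file), and it assumes the usual layout: title line, metadata line, blank line, highlight text, then a row of equals signs. `split_entries` is a hypothetical helper name that mirrors the logic in the full script.

```python
# Illustrative sample only: the title, author, and metadata are invented.
SAMPLE = (
    "Example Book (Jane Doe)\n"
    "- Your Highlight on page 12 | Location 180-182 | Added on Monday, January 1, 2024 10:00:00 AM\n"
    "\n"
    "The first highlighted sentence.\n"
    "==========\n"
    "Example Book (Jane Doe)\n"
    "- Your Bookmark on page 20 | Location 300 | Added on Monday, January 1, 2024 10:05:00 AM\n"
    "\n"
    "\n"
    "==========\n"
)


def split_entries(content):
    """Split raw clippings text into (title, highlight) pairs."""
    pairs = []
    for entry in content.split("=========="):
        lines = entry.strip().split("\n")
        # Entries without a body (bookmarks, the trailing empty chunk) are skipped.
        if len(lines) < 4:
            continue
        pairs.append((lines[0].strip(), lines[3].strip()))
    return pairs


pairs = split_entries(SAMPLE)
print(pairs)
```

Only the first entry survives here: the bookmark has no text after the metadata line, so it falls under the four-line minimum and is dropped.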

Here is the full script. Paste this into your kindle_sync.py file:


import re
import os
from collections import defaultdict

# --- Configuration ---
# You MUST change these paths to match your setup.
OBSIDIAN_VAULT_PATH = "/path/to/your/obsidian/vault/kindle-highlights"
CLIPPINGS_FILE_PATH = "My Clippings.txt"
# --- End Configuration ---

def sanitize_filename(name):
    """Removes characters that are invalid for filenames."""
    return re.sub(r'[\\/*?:"<>|]', "", name)

def parse_clippings():
    """Parses the 'My Clippings.txt' file and groups highlights by book."""
    try:
        with open(CLIPPINGS_FILE_PATH, 'r', encoding='utf-8-sig') as f:
            content = f.read()
    except FileNotFoundError:
        print(f"Error: The file {CLIPPINGS_FILE_PATH} was not found.")
        return None

    highlights_by_book = defaultdict(list)
    # The delimiter for each highlight entry
    separator = "=========="
    
    entries = content.split(separator)
    
    for entry in entries:
        lines = entry.strip().split('\n')
        if len(lines) < 4:
            continue
            
        book_title_author = lines[0].strip()
        # Clean up potential invisible characters from the file
        book_title_author = book_title_author.lstrip('\ufeff')

        highlight_content = lines[3].strip()
        
        if book_title_author and highlight_content:
            highlights_by_book[book_title_author].append(highlight_content)
            
    return highlights_by_book

def sync_to_obsidian(highlights_by_book):
    """Writes new highlights to the Obsidian vault, avoiding duplicates."""
    if not os.path.isdir(OBSIDIAN_VAULT_PATH):
        # The directory is deliberately not created here: a typo in the path
        # should stop the run instead of scattering notes in a stray folder.
        raise SystemExit(f"Error: The directory {OBSIDIAN_VAULT_PATH} does not exist. Create it and run again.")

    for book, highlights in highlights_by_book.items():
        sanitized_title = sanitize_filename(book)
        note_path = os.path.join(OBSIDIAN_VAULT_PATH, f"{sanitized_title}.md")
        
        existing_highlights = set()
        if os.path.exists(note_path):
            with open(note_path, 'r', encoding='utf-8') as f:
                for line in f:
                    # Highlights are stored as blockquotes; drop only the "> " prefix
                    # so text that ends in '>' is not mangled.
                    if line.startswith('> '):
                        existing_highlights.add(line[2:].strip())

        new_highlights_added = 0
        with open(note_path, 'a', encoding='utf-8') as f:
            if os.path.getsize(note_path) == 0:
                f.write(f"# {book}\n\n")

            for highlight in highlights:
                if highlight not in existing_highlights:
                    f.write(f"> {highlight}\n\n")
                    # Remember it so repeats within the same run are skipped too
                    existing_highlights.add(highlight)
                    new_highlights_added += 1
        
        if new_highlights_added > 0:
            print(f"Added {new_highlights_added} new highlights to '{sanitized_title}.md'")

def main():
    print("Starting Kindle highlights sync...")
    all_highlights = parse_clippings()
    if all_highlights:
        sync_to_obsidian(all_highlights)
        print("Sync complete.")
    else:
        print("Sync failed. No highlights found or file error.")

if __name__ == "__main__":
    main()

Before running, you must update the OBSIDIAN_VAULT_PATH variable to the absolute path of the folder within your vault where you want the Kindle notes to be stored. I personally use a subfolder called “Kindle Highlights” to keep things tidy.

Pro Tip: The sanitize_filename function is critical. Book titles can contain characters like ‘:’ or ‘?’, which Windows rejects in filenames, and ‘/’, which is a path separator on every platform. This function strips them out to prevent errors when the script tries to create the .md files.
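Here is a quick way to see what that function does to an awkward title. The title is invented for the example; the function body is the same regex used in the script.

```python
import re


def sanitize_filename(name):
    """Same regex as the script: drop characters that are unsafe in filenames."""
    return re.sub(r'[\\/*?:"<>|]', "", name)


title = 'Why? A Book: Part 1/2 (Jane Doe)'
safe = sanitize_filename(title)
print(safe + ".md")
```

One thing to keep in mind: characters are removed, not replaced, so two titles that differ only in stripped characters would end up in the same note.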

Step 4: Running the Sync and Automation

To run the sync manually, just connect your Kindle, copy the latest My Clippings.txt file over, and run the script from your terminal. It’s that simple.

But we’re DevOps engineers, so we automate things. For a “set-and-forget” solution on a Linux or macOS system, I use a simple cron job that runs the script on a schedule. For example, to run it every Monday at 2 AM, you would set up a cron job like this:

0 2 * * 1 cd /path/to/your/project && python3 kindle_sync.py

The cd matters because CLIPPINGS_FILE_PATH is a relative path, and cron does not start jobs in your project directory. This also assumes you’ve set up a workflow to automatically get the `My Clippings.txt` file into your project directory. For a simpler, semi-automated flow, just run the script manually whenever you connect your Kindle to your computer.
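One way to handle the "get the file into the project directory" part is a small helper that copies the file only when the Kindle is actually mounted. This is a rough sketch, not part of the original script: `fetch_clippings` is a hypothetical name, and both paths in the example call are placeholders, since the mount point depends on your OS and setup.

```python
import shutil
from pathlib import Path


def fetch_clippings(kindle_root, project_dir):
    """Copy My Clippings.txt from a mounted Kindle; return True if copied."""
    source = Path(kindle_root) / "documents" / "My Clippings.txt"
    if not source.is_file():
        # Kindle not connected (or mounted elsewhere): keep the old copy.
        return False
    shutil.copy2(source, Path(project_dir) / "My Clippings.txt")
    return True


if __name__ == "__main__":
    # Placeholder paths: adjust the mount point and project directory.
    copied = fetch_clippings("/Volumes/Kindle", "/path/to/your/project")
    print("Copied fresh clippings." if copied else "Kindle not mounted; skipping copy.")
```

Running this before the sync means a scheduled job simply reuses the last copy when the device isn't plugged in, and the duplicate check keeps that harmless.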

Common Pitfalls

Here’s where I stumbled when I first built this, so you don’t have to:

File Encoding: The My Clippings.txt file can have a weird Byte Order Mark (BOM) at the beginning. Using encoding='utf-8-sig' when reading the file handles this gracefully and prevents parsing errors on the first line.
Invalid Vault Path: The most common error is a simple typo in the OBSIDIAN_VAULT_PATH. Double-check that it’s an absolute path and that the directory exists. The script doesn’t create the directory for you, to be safe.
Duplicate Entries: My first version created tons of duplicates. The logic to read the existing file first and store its highlights in a set for a quick lookup is key to making the script idempotent.
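The first and third pitfalls are easy to reproduce in isolation. The sketch below uses a throwaway temp file and invented sample strings; it is a demonstration of the two ideas, not a piece of the script itself.

```python
import os
import tempfile

# Pitfall 1: a file that starts with a UTF-8 byte order mark (EF BB BF).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "wb") as f:
        f.write(b"\xef\xbb\xbfExample Book (Jane Doe)\n")

    with open(path, "r", encoding="utf-8") as f:
        plain = f.readline().strip()
    with open(path, "r", encoding="utf-8-sig") as f:
        stripped = f.readline().strip()

print(repr(plain))     # the BOM survives as '\ufeff' at the start of the title
print(repr(stripped))  # the BOM has been removed by the codec

# Pitfall 3: a set of already-saved highlights makes repeated runs harmless.
existing = {"Already saved highlight."}
incoming = ["Already saved highlight.", "Brand new highlight.", "Brand new highlight."]

to_write = []
for highlight in incoming:
    if highlight not in existing:
        to_write.append(highlight)
        existing.add(highlight)  # also guards against repeats in the same batch

print(to_write)
```

With plain utf-8, the title of the first book carries an invisible extra character, so it would be grouped (and named) separately from the same book's later entries.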

Conclusion

And that’s it. You now have a robust pipeline to turn your passive reading highlights into active, connected notes in your knowledge base. This kind of small, targeted automation is what DevOps is all about—identifying friction in a workflow and eliminating it with a bit of code. It frees up your time and mental energy for more important work. Happy syncing!

Darian Vance

Lead Cloud Architect & DevOps Strategist

With over 12 years in system architecture and automation, Darian specializes in simplifying complex cloud infrastructures. An advocate for open-source solutions, he founded TechResolve to provide engineers with actionable, battle-tested troubleshooting guides and robust software alternatives.

🤖 Frequently Asked Questions

❓ How can I automatically sync my Kindle highlights to my Obsidian vault?

You can use a Python script that reads your Kindle’s `My Clippings.txt` file, parses the highlights, and then writes them into individual Markdown files within a specified folder in your Obsidian vault, grouped by book title.

❓ What are the advantages of this script over manually transferring Kindle highlights?

This script automates the entire process, saving significant time and effort compared to manual copy-pasting. It ensures all highlights are consistently formatted, grouped by book, and prevents duplicate entries, making your Obsidian vault a more complete and organized knowledge base.

❓ What is a common issue when implementing this Kindle highlight sync script?

A common pitfall is file encoding errors with `My Clippings.txt` due to a Byte Order Mark (BOM). This is resolved by opening the file with `encoding='utf-8-sig'` in the Python script. Another common issue is an incorrect `OBSIDIAN_VAULT_PATH`.
