🚀 Executive Summary
TL;DR: Manually verifying WordPress core file integrity is a time-consuming and error-prone task. This Python script automates the process by fetching official MD5 checksums from the WordPress API and comparing them against local file hashes, quickly identifying any modified, added, or missing core files.
🎯 Key Takeaways
- The Python script leverages the official WordPress API (`api.wordpress.org/core/checksums/1.0/`) as the authoritative source for MD5 checksums of core files for a given version.
- It calculates local MD5 hashes for files within `wp-admin`, `wp-includes`, and core files in the installation root (such as the `wp-*.php` files), explicitly excluding user-specific content like `wp-content` and `wp-config.php`.
- The integrity checker identifies ‘modified files’ (hash mismatch), ‘unexpected files’ (local but not official), and ‘missing files’ (official but not local) to provide a comprehensive security report.
Detecting Modified Core WordPress Files with a Python Integrity Checker
Hey team, Darian here. Let’s talk about a task that used to be a real time-sink for me: verifying the integrity of core WordPress files. After any security alert, I’d find myself spending hours manually running diff commands, trying to spot unauthorized changes. It was tedious and, frankly, error-prone. After one incident where a sneaky modification to a core file almost went unnoticed, I decided to automate the process. This Python script is the result. It turned a two-hour manual slog into a two-minute automated check, and it’s given our team a much faster way to validate our deployments. Let’s build it.
Prerequisites
Before we dive in, make sure you have the following ready:
- Python 3 installed on the machine you’ll run the check from.
- Access to the command line on your server to run the script.
- The path to your WordPress installation directory.
I’ll skip the standard virtualenv setup since you likely have your own workflow for that. Let’s jump straight to the Python logic. Just make sure you install the `requests` library (for example with `pip install requests`), as we’ll need it to fetch the official checksums from the WordPress API.
The Guide: Building the Integrity Checker
Step 1: The Strategy – How It Works
The logic is straightforward. We’re not going to guess what the files should look like. We’ll use the official WordPress API as our “source of truth.” The script will:
- Fetch the official list of MD5 checksums for a specific WordPress version directly from WordPress.org.
- Recursively scan our own WordPress installation (excluding user content like `wp-content`) and calculate the MD5 hash for each core file.
- Compare our local hashes against the official list and flag any discrepancies.
This tells us immediately if a file has been modified, if a core file is missing, or if a non-core file has been added to a core directory.
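To make the comparison step concrete, here is a minimal sketch that uses two tiny, made-up dictionaries in place of the real API response and the real disk scan. The file names and hash strings are invented for illustration, and `compare_hashes` is a hypothetical helper name; only the set logic matters.

```python
def compare_hashes(official, local):
    """Classify paths as modified, unexpected, or missing."""
    official_paths = set(official)
    local_paths = set(local)
    # Present on both sides, but the hashes differ
    modified = sorted(
        p for p in official_paths & local_paths if official[p] != local[p]
    )
    # Present locally, absent from the official list
    unexpected = sorted(local_paths - official_paths)
    # Present in the official list, absent locally
    missing = sorted(official_paths - local_paths)
    return modified, unexpected, missing


# Made-up hashes, purely for illustration
official = {"wp-load.php": "aaa", "wp-login.php": "bbb", "wp-includes/version.php": "ccc"}
local = {"wp-load.php": "aaa", "wp-login.php": "XXX", "wp-includes/shell.php": "ddd"}

modified, unexpected, missing = compare_hashes(official, local)
print(modified)    # ['wp-login.php']
print(unexpected)  # ['wp-includes/shell.php']
print(missing)     # ['wp-includes/version.php']
```

Set intersection and difference keep the three categories mutually exclusive, so a path can only ever land in one bucket.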
Step 2: Fetching the Official Checksums
First, we need a function to communicate with the WordPress API. It takes a version number and returns a dictionary of official file paths and their corresponding checksums.
import hashlib
import os

import requests


def get_official_checksums(version):
    """Fetch the official WordPress checksums for a given version."""
    api_url = f"https://api.wordpress.org/core/checksums/1.0/?version={version}"
    try:
        # A timeout keeps the script from hanging if the API is unreachable
        response = requests.get(api_url, timeout=30)
        response.raise_for_status()  # Raises an exception for bad status codes
    except requests.exceptions.RequestException as e:
        print(f"Error fetching checksums: {e}")
        return None
    try:
        data = response.json()
    except ValueError:
        # JSON decode errors are ValueError subclasses in every requests version
        print("Error: Failed to decode JSON from the API response.")
        return None
    # The checksums are nested under a 'checksums' key
    return data.get('checksums', {})
Pro Tip: The WordPress API is pretty reliable, but I always include error handling for network issues or unexpected API changes. The `try…except` block here saves a lot of headaches if the API is temporarily down.
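To see what the function is unpacking without hitting the network, the sketch below parses a hand-written payload shaped the way the code above assumes: a JSON object whose `checksums` key maps relative paths to MD5 strings. The sample path and hash are made up, and `extract_checksums` is a hypothetical helper, not part of the script.

```python
import json


def extract_checksums(payload_text):
    """Return the path -> MD5 mapping from a checksums payload, or {} if absent."""
    data = json.loads(payload_text)
    checksums = data.get("checksums") if isinstance(data, dict) else None
    # Be defensive: treat anything that is not a mapping as "no checksums"
    return checksums if isinstance(checksums, dict) else {}


# Made-up payloads for illustration only
sample = '{"checksums": {"wp-load.php": "0123456789abcdef0123456789abcdef"}}'
print(extract_checksums(sample))
print(extract_checksums('{"checksums": false}'))
```

Returning an empty dict for anything unexpected means the caller's `if not official_hashes` check still aborts cleanly.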
Step 3: Calculating Local File Hashes
Next, we need to walk through our local WordPress directory and calculate the MD5 hash for each file. It’s critical to skip the `wp-content` directory and the `wp-config.php` file, as they contain user-specific data and will not match the official release.
# Root-level core files in the official list whose names do not start with 'wp-'.
# Without these, they would always be reported as missing.
ROOT_CORE_FILES = {'index.php', 'xmlrpc.php', 'license.txt', 'readme.html'}


def calculate_local_hashes(wp_path):
    """Calculate MD5 hashes for local WordPress core files."""
    local_hashes = {}
    # Core directories to scan
    core_dirs = ['wp-admin', 'wp-includes']
    for directory in core_dirs:
        dir_path = os.path.join(wp_path, directory)
        for root, _, files in os.walk(dir_path):
            for filename in files:
                file_path = os.path.join(root, filename)
                try:
                    with open(file_path, 'rb') as f:
                        file_hash = hashlib.md5(f.read()).hexdigest()
                    # We need the relative path to match the API's format
                    relative_path = os.path.relpath(file_path, wp_path).replace('\\', '/')
                    local_hashes[relative_path] = file_hash
                except OSError as e:
                    print(f"Warning: Could not read file {file_path}: {e}")
    # Also include root files, but be careful to exclude user files
    for item in os.listdir(wp_path):
        item_path = os.path.join(wp_path, item)
        is_core_name = item.startswith('wp-') or item in ROOT_CORE_FILES
        if os.path.isfile(item_path) and is_core_name and item != 'wp-config.php':
            try:
                with open(item_path, 'rb') as f:
                    file_hash = hashlib.md5(f.read()).hexdigest()
                relative_path = os.path.relpath(item_path, wp_path).replace('\\', '/')
                local_hashes[relative_path] = file_hash
            except OSError as e:
                print(f"Warning: Could not read file {item_path}: {e}")
    return local_hashes
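The function above reads each file fully into memory before hashing it. Core files are small, so that is usually fine, but if memory is a concern the hash can be fed in chunks instead. This is a sketch of that alternative; `md5_of_file` and the 64 KiB chunk size are illustrative choices, not something the original script defines.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The digest is identical to `hashlib.md5(f.read()).hexdigest()`, so it could replace both hashing calls in `calculate_local_hashes` without changing the results.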
Step 4: Putting It All Together and Comparing
Now we combine everything into a main function. This part orchestrates the process and prints a clear report of any issues it finds.
def main():
    # In a real script, you'd get these from command-line arguments
    wordpress_path = './public_html'  # IMPORTANT: Change this to your WP path
    wordpress_version = '6.4.3'  # IMPORTANT: Match this to your WP version
    print(f"Starting integrity check for WordPress {wordpress_version} at {wordpress_path}")
    official_hashes = get_official_checksums(wordpress_version)
    if not official_hashes:
        print("Could not retrieve official checksums. Aborting.")
        return
    # Filter out wp-content from the official list, we don't check it
    official_hashes = {k: v for k, v in official_hashes.items() if not k.startswith('wp-content/')}
    local_hashes = calculate_local_hashes(wordpress_path)
    official_files = set(official_hashes.keys())
    local_files = set(local_hashes.keys())
    # Find modified, added, and missing files (sorted for a stable report)
    modified_files = sorted(
        name for name in official_files & local_files
        if official_hashes[name] != local_hashes[name]
    )
    added_files = sorted(local_files - official_files)
    missing_files = sorted(official_files - local_files)
    # --- Reporting ---
    if not modified_files and not added_files and not missing_files:
        print("\nSUCCESS: All core files match the official checksums.")
    else:
        print("\nALERT: Integrity check failed. See details below.")
        if modified_files:
            print("\n[!] MODIFIED CORE FILES (hashes do not match):")
            for f in modified_files:
                print(f"  - {f}")
        if added_files:
            print("\n[!] UNEXPECTED FILES in core directories:")
            for f in added_files:
                print(f"  - {f}")
        if missing_files:
            print("\n[!] MISSING CORE FILES:")
            for f in missing_files:
                print(f"  - {f}")


if __name__ == "__main__":
    main()
Pro Tip: For my production setups, I make the WordPress path and version configurable via command-line arguments (using Python’s `argparse` module). This makes the script much more flexible and reusable across different sites without having to edit the code.
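A minimal sketch of that `argparse` setup might look like the following. The flag names `--path` and `--wp-version` and the default path are assumptions chosen for illustration; the original script does not define them.

```python
import argparse


def parse_args(argv=None):
    """Parse the WordPress path and version from the command line."""
    parser = argparse.ArgumentParser(
        description="Check WordPress core file integrity."
    )
    parser.add_argument(
        "--path", default="./public_html",
        help="Path to the WordPress installation",
    )
    parser.add_argument(
        "--wp-version", required=True,
        help="Installed WordPress version, e.g. 6.4.3",
    )
    # argv=None makes argparse fall back to sys.argv[1:]
    return parser.parse_args(argv)


# Passing a list explicitly makes the parser easy to try out
args = parse_args(["--path", "/var/www/html", "--wp-version", "6.4.3"])
print(args.path, args.wp_version)  # /var/www/html 6.4.3
```

Inside `main()`, `args.path` and `args.wp_version` would then stand in for the hard-coded `wordpress_path` and `wordpress_version`.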
Step 5: Scheduling the Check
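This script is most powerful when it runs automatically. In a Linux environment, a simple cron job is perfect for this. I typically set mine to run once a week in the early morning. A command for your crontab might look something like this (the paths are placeholders; cron does not start in your project directory, so use absolute paths for the script and for the WordPress path inside it):
0 2 * * 1 /usr/bin/python3 /path/to/integrity_checker.py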
This runs the script at 2:00 AM every Monday. You could also integrate this into a CI/CD pipeline as a post-deployment verification step.
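For cron wrappers and CI/CD steps, the script's exit status is usually what decides whether a run "failed". As written, `main()` only prints its findings, so the process exits with status 0 either way. A small sketch of mapping the results to an exit code (`exit_code_for` is a hypothetical helper):

```python
import sys


def exit_code_for(modified_files, added_files, missing_files):
    """Return 0 when everything matches, 1 when any discrepancy was found."""
    return 1 if (modified_files or added_files or missing_files) else 0


# Example values; in the real script these come from the comparison step
code = exit_code_for(["wp-login.php"], [], [])
print(f"exit code would be {code}")  # exit code would be 1
# sys.exit(code)  # uncomment when wiring this into main()
```

A pipeline step can then fail the deployment on any non-zero status without parsing the printed report.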
Common Pitfalls
Here are a few places I’ve stumbled in the past, so you can avoid them:
- Version Mismatch: The most common error. If your site is running WordPress 6.4.2 but your script is checking against 6.4.3 checksums, you’ll get a ton of false positives. Always ensure the version number in the script matches your installed version.
- Forgetting `wp-config.php`: If you don’t explicitly exclude `wp-config.php`, it will be flagged on every run. It is unique to your installation and is not part of the official release, so the script reports it as an unexpected file.
- File Permissions: The script needs read access to the WordPress files. If it runs as a user without sufficient permissions, it will fail to read files and might report them as missing.
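One way to reduce the version-mismatch risk is to read the version from the installation itself. WordPress stores it as `$wp_version` in `wp-includes/version.php`. The sketch below pulls it out with a regular expression; `detect_wp_version` is a hypothetical helper, and the pattern assumes the usual `$wp_version = '6.4.3';` assignment.

```python
import os
import re


def detect_wp_version(wp_path):
    """Return the version string from wp-includes/version.php, or None if not found."""
    version_file = os.path.join(wp_path, "wp-includes", "version.php")
    try:
        with open(version_file, "r", encoding="utf-8", errors="replace") as f:
            content = f.read()
    except OSError:
        # Missing or unreadable file: let the caller decide what to do
        return None
    match = re.search(r"\$wp_version\s*=\s*['\"]([^'\"]+)['\"]", content)
    return match.group(1) if match else None
```

Keep in mind that `version.php` is itself one of the files being checked. If it has been tampered with, the detected version may be wrong, so treat a failed checksum lookup or a flood of mismatches as a reason to confirm the version by other means.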
Conclusion
And there you have it. A lean, effective Python script that provides a solid baseline for your WordPress security monitoring. While it doesn’t replace a dedicated security plugin or a Web Application Firewall (WAF), it serves as an excellent, low-overhead integrity check that can alert you to unauthorized changes the next time it runs. It’s a simple tool, but one that provides real value and peace of mind. Feel free to adapt it to your needs, perhaps by adding email or Slack notifications for failed checks. Happy scripting!
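As a starting point for the notification idea, here is a rough sketch that turns the three result lists into a message and posts it to a Slack incoming webhook, which accepts a JSON body with a `text` field. Both helper names are hypothetical, the webhook URL is something you would supply yourself, and the standard library's `urllib` is used here so the sketch has no extra dependencies.

```python
import json
import urllib.request


def build_alert_text(modified_files, added_files, missing_files):
    """Format the findings as a short plain-text message."""
    sections = [
        ("Modified", modified_files),
        ("Unexpected", added_files),
        ("Missing", missing_files),
    ]
    lines = ["WordPress core integrity check failed:"]
    for label, files in sections:
        for name in sorted(files):
            lines.append(f"{label}: {name}")
    return "\n".join(lines)


def send_to_slack(webhook_url, text, timeout=10):
    """POST the message to a Slack incoming webhook and return the HTTP status."""
    body = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `send_to_slack(url, build_alert_text(modified_files, added_files, missing_files))` from the failure branch of `main()` would be enough to get a message on every failed check.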
🤖 Frequently Asked Questions
❓ How does the Python script identify unauthorized changes in WordPress core files?
The script compares locally calculated MD5 hashes of WordPress core files against a list of official MD5 checksums fetched directly from the WordPress API for a specified version. Discrepancies indicate unauthorized modifications, additions, or missing files.
❓ How does this integrity checker compare to other WordPress security solutions?
This Python script serves as a lean, low-overhead integrity check specifically for core WordPress files. It complements, but does not replace, broader security measures like dedicated WordPress security plugins or Web Application Firewalls (WAFs), which offer more extensive protection.
❓ What are the most common issues encountered when setting up this integrity checker?
Key pitfalls include ensuring the script’s `wordpress_version` matches the actual installed WordPress version to avoid false positives, explicitly excluding `wp-config.php` from checks, and verifying the script has sufficient file permissions to read all WordPress files.