🚀 Executive Summary
TL;DR: Scripts that work interactively often fail under systemd or cron due to critical differences in execution context, including user permissions, environment variables (like PATH), and the working directory. The solution involves explicitly defining these environmental factors within systemd unit files or using a wrapper script to ensure a predictable and complete execution environment for the service.
🎯 Key Takeaways
- Automated services (systemd, cron) run in a sterile, minimalist environment, lacking the rich user context (PATH, custom ENV_VARS, home directory) available in an interactive SSH session.
- The core issues causing script failures are differences in the executing User & Permissions, missing Environment Variables, and an unexpected Working Directory.
- Robust solutions involve explicitly defining `User`, `Group`, `WorkingDirectory`, `EnvironmentFile`, and using absolute paths in `systemd` unit files, or creating a wrapper script to manually set the required environment before executing the main application.
SEO Summary: Frustrated by scripts that work for you but fail under systemd or cron? Unravel the common but maddening pitfalls of user context, permissions, and environment variables to finally fix those “Permission Denied” and “File Not Found” errors.
You’re Not Crazy: Why Your Script Works For You But Fails For Systemd
I remember it like it was yesterday. 2 AM, a production deployment window for a major e-commerce client. The final step was a simple bash script to clear a cache directory. I’d tested it a dozen times from my own account on `prod-app-01`. It was flawless. I triggered the final pipeline step, which executed the script via a systemd service, and… nothing. The logs were infuriatingly vague: `clear_cache.sh: command not found`. I SSH’d back in, ran `/opt/scripts/clear_cache.sh` myself, and it worked perfectly. It was one of those moments where you question your own sanity. The problem, as it almost always is in these situations, wasn’t the script. It was the context.
The Root of the Problem: You Are Not Your Service
When you log into a server via SSH, you enter a rich, comfortable world tailored just for you. You have your `~/.bashrc` loading aliases, your `PATH` environment variable is full of useful locations like `/usr/local/bin`, and your home directory (`~`) is, well, yours. Your scripts know where to find things because your shell knows where you are.
A `systemd` service, a `cron` job, or even a Jenkins agent runs in a completely different world. It’s a sterile, minimalist, and often non-interactive environment. It logs in as a different user (e.g., `www-data`, `nginx`, or a dedicated service account like `app-runner`), has a barebones `PATH`, and has no idea what your personal `~/.aws/credentials` file is.
The core issue boils down to these three things:
- User & Permissions: The service is running as a user who can’t read/write/execute the files your user can.
- Environment Variables: The service has a minimal `PATH` and is missing any custom `ENV_VARS` you’ve set in your profile.
- Working Directory: The script isn’t running from the directory you think it is. Relative paths like `../data/file.csv` will break instantly.
| Context | Your SSH Session (`darian`) | Systemd Service (`app-runner`) |
| --- | --- | --- |
| Who Am I? | `darian` | `app-runner` (or `root` if not specified) |
| Home Directory (`~`) | `/home/darian` | `/var/lib/app-runner` |
| `$PATH` | `/home/darian/bin:/usr/local/bin:/usr/bin:/bin` | `/usr/bin:/bin` |
| Working Directory | Usually your home directory or project dir | Often `/` unless specified |
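You can get a feel for that stripped-down world without touching systemd at all: `env -i` launches a command with a completely empty environment, so anything your script silently inherits from your profile disappears. A minimal sketch:

```shell
# Simulate a service-like empty environment with env -i.
# HOME, your custom ENV_VARS, and your profile's PATH additions all vanish.
env -i /bin/sh -c 'echo "HOME=[$HOME]"; echo "PATH=[$PATH]"; pwd'
```

Note that many shells substitute a compiled-in default `PATH` when it is unset, so the exact `PATH` output varies; the point is that none of your personal additions survive.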
The Solutions: From Quick Diagnosis to a Rock-Solid Fix
Once you accept that you’re debugging an environment problem, not a code problem, you can start making progress. Here are the three approaches I use, from discovery to deployment.
Solution 1: The Quick Fix (Replicate The Environment)
Never debug the service directly. First, prove your theory by becoming the user the service runs as. This is the fastest way to see the exact error the service is seeing. Use `sudo` to run a command, or even open a shell, as that user.
Let’s say your service is supposed to run as the `deploy-bot` user.
```bash
# Try to run the command directly as the user
sudo -u deploy-bot /opt/scripts/backup.sh

# If it's more complex, get a full shell as that user
# The -s flag gives you a shell, making it interactive
sudo -u deploy-bot -s /bin/bash

# Once inside the new shell, try running your script
whoami
# Output: deploy-bot
pwd
# Output: / (or the user's home dir)
/opt/scripts/backup.sh
# Now you'll see the REAL error, like "Permission denied: /root/secrets.txt"
```
This is purely for diagnostics, but it’s the most important step. It tells you immediately if the problem is permissions, a missing program in the `PATH`, or a bad relative path.
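If `sudo -u` behaves but the real unit still fails, you can go one step closer to reality with `systemd-run`, which launches the command as a throwaway transient service, complete with the genuine service context (cgroup, minimal environment, no TTY). A sketch, assuming a reasonably recent systemd (`--working-directory=` needs v236+) and reusing the `deploy-bot` example:

```shell
# Run the script as a transient systemd unit under the service account.
# --wait blocks until it exits; --collect cleans up the transient unit.
sudo systemd-run --uid=deploy-bot --gid=deploy-bot \
    --working-directory=/opt/scripts \
    --wait --collect \
    /opt/scripts/backup.sh

# The unit's stdout/stderr land in the journal; filter by the account's UID:
sudo journalctl _UID="$(id -u deploy-bot)" --since "10 min ago"
```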
Solution 2: The Permanent Fix (A Proper `systemd` Unit File)
The “right” way to fix this is to be explicit in your `systemd` unit file. You’re giving the service the exact context it needs to run successfully and repeatably. Stop relying on implicit environments.
Here’s a typical, problematic `[Service]` block:
```ini
# /etc/systemd/system/bad-backup.service
[Unit]
Description=Nightly Database Backup - PROBLEMATIC

[Service]
Type=oneshot
ExecStart=/opt/scripts/backup.sh
```
And here is the robust, production-ready version:
```ini
# /etc/systemd/system/good-backup.service
[Unit]
Description=Nightly Database Backup - ROBUST

[Service]
Type=oneshot
# Be explicit about WHO runs this
User=deploy-bot
Group=deploy-bot
# Be explicit about WHERE it runs from
WorkingDirectory=/opt/scripts
# Be explicit about WHAT it needs
# This file can contain KEY=VALUE pairs like DB_HOST=prod-db-01
EnvironmentFile=/etc/default/backup-service-env
# Use absolute paths!
ExecStart=/usr/bin/bash /opt/scripts/backup.sh

[Install]
WantedBy=multi-user.target
```
By defining the `User`, `WorkingDirectory`, and `EnvironmentFile`, you’ve created a hermetic, predictable environment for your script. No more guesswork.
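For completeness, there's nothing magic about the `EnvironmentFile`: it's plain `KEY=VALUE` lines, one per line (the values below are hypothetical):

```ini
# /etc/default/backup-service-env -- hypothetical values
DB_HOST=prod-db-01
DB_PORT=5432
BACKUP_DIR=/var/backups/nightly
```

Keep in mind systemd reads this file itself; it does not run it through a shell, so don't expect `export` statements or `$VAR` expansion to work here.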
Pro Tip: Before you `systemctl daemon-reload`, always run `systemd-analyze verify /etc/systemd/system/your-service-name.service`. It will catch syntax errors and save you a lot of headaches.
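Putting that tip into a concrete sequence (using the example unit name from above):

```shell
# Catch syntax errors before systemd ever loads the unit
systemd-analyze verify /etc/systemd/system/good-backup.service

# Reload unit definitions, run the service once, and inspect the result
sudo systemctl daemon-reload
sudo systemctl start good-backup.service
systemctl status good-backup.service
journalctl -u good-backup.service -n 50
```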
Solution 3: The ‘Nuclear’ Option (The Wrapper Script)
Sometimes you’re dealing with a legacy application, a complex build tool, or you just don’t have time to refactor the environment properly. This is the “get it done” approach. It’s a bit hacky, but it’s a lifesaver. You create a “wrapper” shell script that builds the environment and then executes your real program.
Your `systemd` file becomes very simple:
```ini
# /etc/systemd/system/my-app.service
[Service]
User=app-runner
ExecStart=/opt/scripts/run_app_wrapper.sh
```
And the magic happens inside `run_app_wrapper.sh`:
```bash
#!/bin/bash
# A wrapper script to build a sane environment

# Force a specific working directory
cd /opt/my-app || exit 1

# Source environment variables if needed
if [ -f "/etc/default/my-app-env" ]; then
    source "/etc/default/my-app-env"
fi

# Manually set the PATH to include things the service might not have
export PATH="/opt/node/bin:$PATH"

# Log what we're doing for easier debugging
echo "Starting my-app as user $(whoami) from $(pwd)"
echo "Using Node version: $(node --version)"

# Use exec to replace the wrapper process with the actual application
# This ensures signals (like SIGTERM) are passed correctly to the app
exec /opt/node/bin/node server.js
```
This approach gives you total control. You can log variables, check for files, and build the exact context you need before handing off control to the main application. It’s not elegant, but when it’s 3 AM and you just need the cache cleared, “elegant” can wait.
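The same medicine works for `cron`, which suffers from the identical minimal-environment problem. Most common cron implementations (Vixie cron, cronie) let you set simple variables at the top of a crontab, and you can force the working directory inline. A sketch, with an illustrative schedule and paths:

```
# crontab -e (as the service account)
PATH=/usr/local/bin:/usr/bin:/bin
SHELL=/bin/bash

# Nightly at 03:00: cd explicitly, use absolute paths, capture output
0 3 * * * cd /opt/my-app && /opt/scripts/backup.sh >> /var/log/backup.log 2>&1
```

Redirecting stdout and stderr to a log file matters here; without it, cron's only failure signal is usually an email nobody reads.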
🤖 Frequently Asked Questions
❓ Why does my script work when I run it but fail in systemd or cron?
Your script fails in systemd or cron because these services execute in a different, minimalist context. This includes running as a different user with limited permissions, a barebones `PATH` environment variable, missing custom `ENV_VARS`, and often a different working directory than your interactive session.
❓ What are the recommended ways to fix script failures in systemd services?
The recommended ways include: diagnosing the issue by replicating the service’s environment (`sudo -u <service-user>`), making the `systemd` unit file explicit by setting `User`, `Group`, `WorkingDirectory`, and `EnvironmentFile` and using absolute paths, or wrapping the application in a script that builds the required environment before `exec`ing the real program.
❓ What is a common implementation pitfall when configuring systemd services for scripts?
A common pitfall is relying on implicit environment settings from a user’s SSH session, such as a rich `PATH` or custom environment variables. This leads to errors like ‘command not found’ or ‘Permission denied’ when the systemd service runs in its isolated, barebones environment without these settings.