How to Safely Unit Test Shell Scripts from LLMs

June 13, 2025 · 5 min read

vivek

So, you just got a shiny new shell script from ChatGPT (or Copilot, or your favorite AI buddy). It looks legit. It even feels right. But then that creeping doubt sets in:

"Wait… is this thing safe to run on production?"

Welcome to the world of unit testing shell scripts generated by LLMs — where the stakes are high, sudo is dangerous, and one wrong rm -rf can ruin your whole day.

In this post, we'll walk through a battle-tested way to safely test and validate scripts that manage real services like PM2, Docker, Nginx, or anything that touches system state.

The Problem With Trusting LLM Shell Scripts

Large Language Models like ChatGPT are awesome at generating quick shell scripts. But even the best LLM:

Can make assumptions about your environment
Might use the wrong binary name (like pgrep -x PM2 instead of pm2)
Can forget that systemctl restart docker is never a no-op: it restarts the daemon and, unless live restore is configured, stops running containers

Even if the logic is 90% correct, that 10% can:

Restart your services at the wrong time
Write to incorrect log paths
Break idempotency (runs that shouldn't change state do)

According to a recent study on AI-generated code, about 15% of LLM-generated shell scripts contain potentially dangerous commands when run in production environments.

Strategy 1: Add a `--dry-run` Mode

Every LLM-generated script should support a --dry-run flag. This lets you preview what the script would do — without actually doing it.

Here's how you add it:

DRY_RUN=false
[[ "${1:-}" == "--dry-run" ]] && DRY_RUN=true

# Log a command, then run it unless --dry-run was given.
log_action() {
    if [[ "$DRY_RUN" == true ]]; then
        echo "$(date): [DRY RUN] $1"
    else
        echo "$(date): $1"
        eval "$1"
    fi
}

# Example usage
log_action "sudo systemctl restart nginx"

This pattern gives you traceable operations that you can preview before they change anything.

For more advanced dry-run implementations, check this guide.
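One way to avoid eval in the pattern above is to pass the command as separate arguments instead of a single string. The sketch below is a minimal version of that idea; the run helper name is illustrative and not part of the original script.

```shell
# Sketch: dry-run wrapper that takes the command as separate arguments,
# so nothing is re-parsed by eval. `run` is a hypothetical helper name.
DRY_RUN=false
[ "${1:-}" = "--dry-run" ] && DRY_RUN=true

run() {
    if [ "$DRY_RUN" = true ]; then
        # Print the command instead of executing it
        echo "[DRY RUN] $*"
    else
        "$@"
    fi
}

# Example: run sudo systemctl restart nginx
```

Because the arguments are never turned back into a string and re-evaluated, a value containing spaces or shell metacharacters reaches the command unchanged.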

Strategy 2: Mock External Commands

You don't want docker restart or pm2 resurrect running during testing. You can override them like this:

mkdir -p mock-bin
# The mock prints its own name and arguments instead of doing anything
printf '%s\n' '#!/bin/bash' 'echo "[MOCK] $(basename "$0") $*"' > mock-bin/docker
chmod +x mock-bin/docker
export PATH="$(pwd)/mock-bin:$PATH"

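Now, any call to docker will echo a harmless line instead of nuking your containers. To cover other dangerous binaries such as systemctl, pm2, and rm, symlink them to the same mock (for example, ln -s docker mock-bin/systemctl). Keep in mind that sudo is often configured to use its own PATH (secure_path), so commands run through sudo may bypass mock-bin unless you mock sudo as well.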

PATH-based mocking like this is a common pattern in test suites written for the Bash Automated Testing System (BATS).
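A mock becomes more useful when a test can check what was called. The sketch below extends the idea with a call log; MOCK_LOG and the temporary directory layout are assumptions made for illustration, not part of the original setup.

```shell
# Sketch: a mock that records every call so a test can assert on it.
# MOCK_LOG and the temp directory layout are illustrative assumptions.
workdir=$(mktemp -d)
export MOCK_LOG="$workdir/calls.log"
mkdir -p "$workdir/mock-bin"

# The mock appends its own name and arguments to MOCK_LOG
cat > "$workdir/mock-bin/docker" <<'EOF'
#!/bin/sh
echo "$(basename "$0") $*" >> "$MOCK_LOG"
EOF
chmod +x "$workdir/mock-bin/docker"

# Reuse the same mock under another name
ln -s docker "$workdir/mock-bin/systemctl"

PATH="$workdir/mock-bin:$PATH"

# The script under test would run here; these two calls stand in for it
docker restart my-app
systemctl restart nginx

# Compare the recorded calls with what the script was expected to do
expected="docker restart my-app
systemctl restart nginx"
if [ "$(cat "$MOCK_LOG")" = "$expected" ]; then
    echo "PASS"
else
    echo "FAIL"
fi
```

Recording calls to a file rather than a variable means the log survives even when the script under test runs in a subshell or a separate process.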

Strategy 3: Use `shellcheck`

LLMs sometimes mess up quoting, variables, or command usage. ShellCheck is your best friend.

Just run:

shellcheck myscript.sh

And it'll tell you:

If variables are unquoted ("$var" vs $var)
If commands are used incorrectly
If your if conditions are malformed

It's like a linter, but for your shell’s sanity.
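To see why the quoting warning matters, here is a small sketch of what happens to a path containing a space. The count_args helper is hypothetical; ShellCheck reports the unquoted expansion as SC2086.

```shell
# Sketch: why unquoted variables matter. count_args is a hypothetical helper.
count_args() { echo "$#"; }

log_path="my logs/app.log"

# Unquoted: the space splits the value into two arguments (ShellCheck SC2086)
unquoted=$(count_args $log_path)

# Quoted: the value stays a single argument
quoted=$(count_args "$log_path")

echo "unquoted=$unquoted quoted=$quoted"   # prints: unquoted=2 quoted=1
```

With a command like rm in place of count_args, the unquoted form would act on two different paths, neither of which is the one you meant.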

Strategy 4: Use Functions, Not One Big Blob

Break your script into testable chunks:

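# Succeeds if a PM2 process is running; the [P] keeps grep from matching itself
check_pm2() {
    ps aux | grep '[P]M2' > /dev/null
}

# Bring every managed service back up
restart_all() {
    pm2 resurrect
    docker restart my-app
    systemctl restart nginx
}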

Now you can mock and call these functions directly in a test harness without running the whole script. This modular approach mirrors modern software testing principles.
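As a rough sketch of such a harness, shell functions can shadow the real commands, because functions take precedence over executables found on PATH. The CALLS variable and the stub bodies below are assumptions made for illustration.

```shell
# Sketch of a test harness: shell functions shadow the real commands,
# so restart_all can be called without touching any service.
# CALLS and the stub bodies are illustrative assumptions.
CALLS=""
pm2()       { CALLS="$CALLS pm2:$1"; }
docker()    { CALLS="$CALLS docker:$1"; }
systemctl() { CALLS="$CALLS systemctl:$1"; }

# Function under test, as defined in the script
restart_all() {
    pm2 resurrect
    docker restart my-app
    systemctl restart nginx
}

restart_all

if [ "$CALLS" = " pm2:resurrect docker:restart systemctl:restart" ]; then
    echo "PASS: all three services restarted in order"
else
    echo "FAIL: got '$CALLS'"
fi
```

In a real setup you would source the script to get its functions instead of redefining them. A guard such as `[[ "${BASH_SOURCE[0]}" == "$0" ]] && main "$@"` (assuming the script has a main function, which the original may not) keeps sourcing from executing anything.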

Strategy 5: Log Everything. Seriously.

Log every decision point. Why? Because "works on my machine" isn't helpful when the container didn't restart or PM2 silently failed.

log() {
    echo "$(date '+%F %T') [LOG] $1" >> /var/log/pm2_watchdog.log
}
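Writing to /var/log usually needs root, which gets in the way during testing. A possible variation, sketched below, lets the destination be overridden; LOG_FILE is a hypothetical variable, and the default matches the path above.

```shell
# Sketch: same log format, but the destination can be overridden in tests.
# LOG_FILE is a hypothetical variable; the default matches the path above.
LOG_FILE="${LOG_FILE:-/var/log/pm2_watchdog.log}"

log() {
    echo "$(date '+%F %T') [LOG] $*" >> "$LOG_FILE"
}

# Example: LOG_FILE=./test.log ./myscript.sh --dry-run
```

A test can then point LOG_FILE at a temporary file and assert on its contents.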

Strategy 6: Test in a Sandbox

If you've got access to Docker or a VM, spin up a replica and try running the script in that environment. Better to break a fake server than your actual one.

Try:

docker run -it ubuntu:20.04
# Then install what you need: nginx and docker via apt, pm2 via npm, etc.

Check this Docker-based testing guide
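For a non-interactive run, something like the following sketch may help. The run_in_sandbox helper is hypothetical, and the image tag simply follows the example above; adjust both to your setup.

```shell
# Sketch: run a script inside a throwaway container.
# run_in_sandbox is a hypothetical helper; the image tag follows the example above.
run_in_sandbox() {
    script_dir=$(cd "$(dirname "$1")" && pwd)
    script_name=$(basename "$1")
    shift
    # --rm deletes the container on exit; --network none cuts off network access;
    # :ro mounts the script read-only so the container cannot modify it
    docker run --rm --network none \
        -v "$script_dir/$script_name:/work/script.sh:ro" \
        ubuntu:20.04 bash /work/script.sh "$@"
}

# Example: run_in_sandbox ./myscript.sh --dry-run
```

Drop --network none if the script needs to download packages; it is included here so that an untrusted script cannot reach anything outside the container.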

Bonus: Tools You Might Love

BATS: Bash unit testing framework
shunit2: xUnit-style testing for POSIX shell
assert.sh: dead-simple shell assertion helper
shellspec: full-featured, RSpec-like shell test framework

Final Thoughts: Don't Just Run It — Test It

It's tempting to copy-paste that LLM-generated shell script and run it. But in production environments — especially ones with critical services like PM2 and Nginx — the safer path is to test before trust.

Use dry-run flags. Mock your commands. Run scripts through shellcheck. Add logging. Test in Docker. Break things in safe places.

With these strategies, you can confidently validate AI-generated shell scripts and ensure they behave as expected before hitting your production servers.

Nife, a hybrid cloud platform, offers a seamless solution for deploying and managing applications across edge, cloud, and on-premise infrastructure. If you're validating shell scripts that deploy services via Docker, PM2, or Kubernetes, it's worth exploring how Nife can simplify and secure that pipeline.

Its containerized app deployment capabilities allow you to manage complex infrastructure with minimal configuration. Moreover, through features like OIKOS Deployments, you gain automation, rollback support, and a centralized view of distributed app lifecycles — all crucial for testing and observability.

Best Practices For Testing And Security in DevOps, Including Automated Security

March 19, 2023 · 6 min read

Tiyasha Bera

DevOps security brings together three disciplines: development, operations, and security. Its goal is to remove any barriers that may exist between software development and IT operations.

A survey found that over 58% of businesses had a data breach in the previous year, with 41% resulting from software flaws. Breaches can cost businesses millions of dollars and damage their reputation in the industry.

Application development processes have nonetheless made tremendous progress. Modern businesses often use DevOps practices and technologies when developing new applications and systems. The DevOps method emphasizes incremental deployment rather than a single massive deployment, and in some cases releases happen daily. Identifying security flaws in daily updates is not simple, however, which makes security an extremely important part of the DevOps workflow. Every team involved in application development (development, testing, operations, and production) must take security precautions to prevent breaches. This article discusses recommended DevOps security practices for developing and deploying apps safely.

DevOps Security Challenges and Considerations

The DevOps philosophy has revolutionized how businesses create, run, and maintain their applications and IT infrastructure, whether on-premises or in the cloud. DevOps merges IT development with IT operations, covering requirements and specifications, coding, testing, high availability, implementation, and more.

DevOps is often paired with agile software development practices, which encourage cross-team alignment, cooperation, and individualized development. DevOps software development is characterized by a constant pursuit of velocity, automation, and monitoring across the whole process, from code integration and testing through release and deployment, as well as infrastructure management. These methods shorten the time it takes to create a product and get it to market while ensuring its features and capabilities evolve in response to market demand and company goals.

Best practices of security in DevOps

When it comes to security, what impact does DevOps have? Let's explore how DevOps methods and popular tools create unique security concerns, and the practices that address them.

1. Implementation of the DevSecOps Model

Another well-known term in the DevOps field is "DevSecOps." DevSecOps is the core security approach that IT companies have been adopting. The term refers to the combination of three distinct but interrelated disciplines: development, security, and operations.

DevSecOps is an approach to leveraging security technologies in the DevOps life cycle. Hence, from the outset of application development, security has to be a part of it. By incorporating security into the DevOps process, businesses can create apps that are both reliable and safe from exploits. This strategy is also useful for breaking down barriers between different departments, such as IT and security.

A few fundamental practices are required for a DevSecOps methodology:

Embed security technologies into your development workflow.
Experts in cyber security must review all automated testing.
Developing threat models requires cooperation between development and security teams.
The product backlog should give top priority to security needs.
Before deploying any new infrastructure, all existing security policies should be examined.

2. Review code in small chunks

Code is easier to understand when you review it in small pieces. Reviewing too much code at once is a bad idea, as is reviewing the whole program in one sitting. Examine the code piece by piece to ensure a thorough review.

3. Establish a system for dealing with future changes

Set up a process for handling upcoming changes. Once an application has reached the deployment phase, developers should no longer add or remove features ad hoc. A change management strategy is what keeps this under control.

Every application modification should go through that change management process, and developers should make adjustments only after the change has been authorized.

4. Maintain active application monitoring

Security is often overlooked when an application is deployed to a production environment.

The application should be under continuous evaluation. To ensure no new vulnerabilities have been added, you should routinely analyze its code and conduct security tests.

5. Train the development team on security

Security best practices should also be taught to the development team.

For example, if a new developer doesn't know about SQL injection, explain what it is, how it works, and how it might damage the program. You don't need to go deep into technical detail; the aim is to keep the development team informed of new security regulations and best practices at a high level.

6. Secure Coding Standards

Developers focus on the features of an app rather than its security since it is not a top concern for them. Yet, with the growth of cyber risks in the modern day, you must ensure that your development team understands the best security measures before building the application.

For this reason, developers need to be familiar with security technologies that may detect flaws in their code during development and suggest solutions.

7. Use DevOps Security Automation Tools

If you want to save time and effort in the DevOps processes, you should use security automation tools.

Use automation tools to test an application and create repeatable tests. It will be simple to create safe products with the help of automated tools for code analysis, remote management, configuration management, vulnerability management, etc.
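As one small illustration of a repeatable automated check, the sketch below fails a pipeline step when any shell script in a directory has lint findings. The lint_scripts function and the LINTER variable are hypothetical names, and shellcheck is only the assumed default tool.

```shell
# Sketch: fail a CI step when any shell script has lint findings.
# LINTER and lint_scripts are hypothetical names; shellcheck is the assumed default tool.
LINTER="${LINTER:-shellcheck}"

lint_scripts() {
    status=0
    for f in "$1"/*.sh; do
        [ -e "$f" ] || continue   # no matches: the glob stays literal
        "$LINTER" "$f" || status=1
    done
    return "$status"
}

# Example: lint_scripts ./scripts || exit 1
```

The loop keeps going after the first failure, so a single run reports findings for every script.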

8. Segregate the DevOps Network

Segmenting the network is a good idea for the company.

A company's resources, including software, hardware, data storage, and more, should not depend on a single network. If everything sits on one network, an attacker who breaches it gains access to all of your company's resources. Hence, having a distinct network for each logical element would be best.

For instance, keeping your development and production networks completely separate is recommended.

Conclusion

DevOps security can help detect and fix code vulnerabilities and operational shortcomings before they cause problems. It builds security into application and system development from the start. This increases availability, reduces data breaches, and supports the development and distribution of sophisticated technology that fulfills corporate objectives.

A company that cares about its customers' data security should adhere to these DevOps security best practices. Combining security best practices with the DevOps approach may save a company millions. Start using the security best practices described here for safer and quicker app releases.
