ChatGPT vs. My DIY Cleanup Script: Who Cleans Up Better?

Some time ago, while learning Bash scripting, I looked for extra practice by taking on any task, however small. One of them was a script that deletes temporary files, old dumps, and node_modules folders from long-forgotten projects. I stumbled upon it again the other day, completely by accident. I tested it on a virtual machine: the script works, but it is terribly hacky and unpleasant to look at.

So I had an idea: check whether ChatGPT could do the same thing (and how well), but more competently. The result was quite instructive: the AI did a great job with the architecture, but nearly wrecked the system with a couple of lines. Below is how it went.

The task is simple: automatically find and delete unneeded files according to certain rules. My old script was a monolith with a pile of repeated find calls, rm -rf, and awkward attempts at error handling. Please don't judge me too harshly; I was just learning Bash and its capabilities.


The main problems of my creation
Commands like rm -rf with variable concatenation are a game of Russian roulette (concatenation meaning the joining of two or more strings into one). Any space in a path, and the script silently "flies" past the target or deletes the wrong thing.

To change the rules you have to edit the code directly; there is no proper configuration block at the top.

The script did not log what exactly it deleted (or failed to delete). It worked in silence, which is always alarming.

I sent ChatGPT the spec: "Write a secure and customizable script to search for and delete temporary files, caches, and old logs. Add a whitelist of folders that must never be touched. Add logging."
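To make the word-splitting danger concrete (the path here is a hypothetical example, not from my script): when the shell expands an unquoted variable, a space inside a path splits it into separate arguments, so rm receives two targets instead of one. A minimal Python sketch of what the shell does:

```python
import shlex

# shlex.split follows shell-style word splitting: the single path
# "/tmp/My Project" becomes two separate arguments.
command = "rm -rf /tmp/My Project"
args = shlex.split(command)
print(args)  # ['rm', '-rf', '/tmp/My', 'Project']
# rm would now try to delete /tmp/My AND Project - neither is what we meant.
```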


Step-by-step code analysis before and after
I'll start by demonstrating that very "cheat" script, for which I am extremely ashamed. It was really hard to share this.
My version (comments were added by me before writing the article for better understanding)
#!/bin/bash
# If a path in $DIRS contains a space, it splits into two arguments
DIRS="/tmp ~/cache ~/projects/*/node_modules"
# Remove everything at once
for dir in $DIRS; do
  echo "Removing $dir"
  rm -rf "$dir"  # Quotes are here, but the unquoted $DIRS above has already split the list
done
# Find and delete all .log and .tmp files older than 30 days
find ~/ -name "*.log" -mtime +30 -exec rm {} \; 2>/dev/null
find ~/ -name "*.tmp" -mtime +30 -exec rm {} \; 2>/dev/null
echo "Cleanup done!"

This code is a product of colossal laziness (after all, I could have "read the internet" and done better...). It deletes node_modules recursively without looking, and it silently swallows any errors (2>/dev/null), which can hide real permission problems.


Python version from ChatGPT
I want to point out up front that the choice of Python is not accidental but deliberate. I wanted to emphasize the difference in approach between a "script on the knee" and a well-thought-out, reliable tool.

ChatGPT, when asked to produce quality, secure software, understands that Bash is not the right tool for the job. The main requirement for any such script is safety, and Bash has no built-in mechanisms for safe file deletion: safety literally depends on correct escaping and on the hope that no variable accidentally turns out empty.

I gave it the prompt: "Write a safe and configurable system cleanup script. It should remove old files, have a blocklist for dangerous directories (like /, /home, etc.), and log its actions to a file. Avoid using shell commands directly."

Here's what the AI suggested (my comments and subsequent edits included):
# cleanup_agent.py
import os
import logging
import time
from pathlib import Path

# === CONFIGURATION ===
TARGETS = [
    {
        "path": os.path.expanduser("~/Downloads"),
        "patterns": ["*.tmp", "*.temp", "*.log"],
        "max_age_days": 7,
    },
    {
        "path": os.path.expanduser("~/.cache"),
        "patterns": ["*"],
        "max_age_days": 30,
    },
]

# Directories that must never be touched (!)
BLOCKLIST = {
    Path("/"),
    Path.home(),
    Path.home() / ".ssh",
}

# Logging setup
logging.basicConfig(
    filename=os.path.expanduser("~/cleanup_agent.log"),
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)

def safe_remove(target: Path):
    """Safely remove a file or empty folder, with logging."""
    try:
        if target.is_file():
            target.unlink()  # Remove a file
            logging.info(f"Removed file: {target}")
        elif target.is_dir():
            target.rmdir()  # Remove an empty folder only
            logging.info(f"Removed empty folder: {target}")
    except OSError as e:
        # Log the error but don't crash
        logging.error(f"Failed to remove {target}: {e}")

def cleanup_directory(config: dict):
    """Clean a directory based on its config entry."""
    base_dir = Path(config["path"])
    if not base_dir.exists():
        logging.warning(f"Directory not found: {base_dir}")
        return
    # Refuse to clean a protected directory itself. (Checking parents here
    # would block everything: "/" is a parent of every absolute path.)
    if base_dir.resolve() in BLOCKLIST:
        logging.error(f"Attempted cleanup of protected directory: {base_dir}. Aborting.")
        return
    current_time = time.time()
    for pattern in config["patterns"]:
        for target in base_dir.rglob(pattern):  # Recursive search
            if not target.exists():
                continue
            # Check the file/folder age
            age_days = (current_time - target.stat().st_mtime) / (60 * 60 * 24)
            if age_days > config["max_age_days"]:
                safe_remove(target)

if __name__ == "__main__":
    logging.info("=== Cleanup started ===")
    for job in TARGETS:
        cleanup_directory(job)
    logging.info("=== Cleanup finished ===")


Comparative analysis
Where ChatGPT was right
Made the script as safe as possible. Direct rm -rf calls are gone, replaced with Python's built-in methods (unlink, rmdir). A BLOCKLIST appeared that flatly refuses any attempt to touch / or $HOME.

Added customizability. Instead of hardcoded paths there is a proper config in the form of a list of dictionaries, TARGETS. Need to clean another folder or change the "age"? Just edit the list without touching the code. In my opinion, the right and competent solution.

The script now keeps a full log file. You can see not only what was deleted, but also why something went wrong.

It uses pathlib.Path instead of string concatenation, which is the correct way to work with paths: separators and platform differences are handled automatically.
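A quick illustration of that difference (the paths are hypothetical): pathlib joins components with the / operator instead of gluing strings together, so a forgotten separator can't silently merge two components.

```python
from pathlib import PurePosixPath

# String concatenation can silently produce a wrong path...
base = "/home/user/projects"
broken = base + "node_modules"            # forgot the separator
# ...while pathlib joins components explicitly.
correct = PurePosixPath(base) / "node_modules"

print(broken)   # /home/user/projectsnode_modules
print(correct)  # /home/user/projects/node_modules
```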


Where ChatGPT was not quite right (in my opinion, please correct me if I'm wrong)
A somewhat dangerous recursive search. Initially the AI used base_dir.rglob("*") for the pattern "*" in ~/.cache. That literally means: walk recursively through EVERYTHING in the cache and check the age of EVERY file. For a cache directory, which holds a huge number of small files, this can easily turn into incredibly long and useless work. I would add a minimum-age (or minimum-size) condition for such aggressive cleaning.

Imitation of safety. The safe_remove function deletes a folder only if it is empty. That is safe, but completely useless for node_modules: non-empty directories are simply skipped. At the very least this should be stated explicitly in the log.

Not the most practical patterns. The pattern "*" for ~/.cache is too broad. Something narrower would be better: ["*.bin", "cache/*", "thumbnails/*"] and so on.

What conclusion can be drawn? ChatGPT turned a low-quality and slightly dangerous bash script into a nearly production-ready utility with a config and logs. But blind faith in recursively traversing "everything and everyone" could easily hang the system. The AI structures and secures code very well, but it seems to lack a concrete understanding of "what exactly should I clean?". As an assistant for generation it is indispensable, but you need to know the material well and review the generated code very carefully to avoid dangerous consequences.
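Here is a sketch of the minimum-age/minimum-size guard I have in mind (the helper name and thresholds are my own illustrative choices, not from the generated script): besides the usual age check, skip tiny files when the catch-all "*" pattern is in play, so a recursive walk of ~/.cache doesn't churn through millions of near-empty entries.

```python
import time
from pathlib import Path
from typing import Optional

def should_remove(path: Path, max_age_days: float, pattern: str,
                  min_size: int = 4096, now: Optional[float] = None) -> bool:
    """Hypothetical guard: age check plus a size floor for the "*" pattern."""
    now = time.time() if now is None else now
    stat = path.stat()
    age_days = (now - stat.st_mtime) / 86400
    if age_days <= max_age_days:
        return False  # too fresh to touch
    if pattern == "*" and path.is_file() and stat.st_size < min_size:
        return False  # aggressive pattern + tiny file: not worth the churn
    return True
```

cleanup_directory could call this instead of comparing age_days directly.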


Example of use
As usual, the instructions for the script (maybe someone will need them):

Save the code to a file named cleanup_agent.py.

Edit the TARGETS config for your tasks. Need to clean Downloads once a week? Done. Need to purge __pycache__ from Projects? Add a rule.

Launch it and look at the logs.
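For example, a __pycache__ rule for TARGETS might look like this (the path, patterns, and age are illustrative choices of mine):

```python
import os

# Hypothetical extra rule for the TARGETS list in cleanup_agent.py:
# sweep compiled-Python leftovers in ~/Projects older than two weeks.
pycache_rule = {
    "path": os.path.expanduser("~/Projects"),
    "patterns": ["*/__pycache__/*", "*.pyc"],
    "max_age_days": 14,
}

# TARGETS.append(pycache_rule) would register it with the script.
print(sorted(pycache_rule))  # ['max_age_days', 'path', 'patterns']
```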
# Make the script executable
chmod +x cleanup_agent.py
# Run the script
python3 cleanup_agent.py


Check the log output
tail -f ~/cleanup_agent.log
The output in the log will look something like this:
2025-08-19 11:05:32,123 - INFO - === Cleanup started ===
2025-08-19 11:05:32,456 - INFO - Removed file: /home/user/Downloads/old_report.tmp
2025-08-19 11:05:33,001 - ERROR - Failed to remove /home/user/.cache/some_file: [Errno 13] Permission denied
2025-08-19 11:05:33,002 - INFO - === Cleanup finished ===

Tip: run with sudo only if you really need to touch protected paths. Otherwise, leave the permission errors as they are.

Similar Posts


Revolutionize Your Brand: Real-Time Perception Tracking with AI-Powered Sentiment Analysis

This is a submission for the AI Agents Challenge powered by n8n and Bright Data


What I Built
Right now, companies spend millions on agencies and PR firms that only scratch the surface of brand perception — and usually react after the fact. What's missing is a clear, real-time picture of how people actually see the brand online — the balance between complaints and praise, the preferences customers express, and the sentiment trends that shape the brand's reputation.

This workflow automation combines monitoring of public discussion platforms with sentiment analytics, letting companies track the online perception of their brand in real time. It also acts as an early-warning radar: by detecting rising negative sentiment early, it alerts the company so it can make informed decisions before small issues snowball into full-blown crises.


Demo
Workflow JSON


n8n Workflow



Technical Implementation



Tools Used


n8n: The automation platform that orchestrates the workflow

Bright Data: For scraping Reddit comments without restrictions or rate limits

Google Gemini: AI model for intelligent sentiment analysis

Google Sheets: For storing and tracking sentiment analysis results



Bright Data Verified Node



Journey
Seeing this challenge, I knew I had to jump on it, given that I had been hearing a lot about n8n for a while. The challenge really opened me up to all of its possibilities and capabilities, and I must say it exceeded my expectations. Building an AI agent on the platform was seamless, and I faced little to no trouble setting up the workflow for the agent.

Here's to the DEV, n8n, and Bright Data teams for hosting this amazing challenge. I really enjoyed the process of building out this solution. Cheers!
Similar


Revolutionizing Small Business: How AI Supercharges Growth and Efficiency

The Unveiled Impact of Artificial Intelligence on Small Business Growth and Productivity
As we dwell in the epoch of innovation and technology, artificial intelligence (AI) has undeniably left its imprint on diverse industries globally. However, its influence is not limited to giant tech corporations; small businesses are also increasingly harnessing its capabilities to amplify growth and productivity. This article delves into how AI impacts small businesses, analysing its role in enhancing efficiency and unlocking entrepreneurial potential.


Escalating Efficiency with AI Integration
For small businesses, maintaining high productivity levels and efficient operations can be challenging, but AI technology offers promising solutions. AI-based applications and software can automate routine tasks, reducing manual labour and paving the way for quicker response times and enhanced performance.

Advancements in AI have led to the development of intelligent virtual assistants, serving as competent office administrators that offer services 24/7. They help manage tasks such as emails, scheduling meetings, and customer service, enabling staff to focus on complex work requiring a human touch and creative thinking.

AI tools also narrow the margin of human error, improving the accuracy of tasks. For example, AI-powered accounting software can handle invoicing, payroll, tax preparation, and financial reports with impressive precision, avoiding errors that could result in financial loss or legal issues.


Unlocking Growth Opportunities through AI Analytics
The advent of AI has revolutionized data analytics, allowing businesses to make well-informed, strategic decisions. AI makes it feasible to process and interpret vast amounts of data in real-time, delivering valuable insights for small businesses. Through predictive analytics, AI can forecast custo
Similar


Unlock the Power of THOAD: Revolutionize PyTorch Graphs with High-Order Derivatives

Intro
I’m excited to share thoad (short for PyTorch High Order Automatic Differentiation), a Python-only library that computes arbitrary-order partial derivatives directly on a PyTorch computational graph. The package has been developed within a research project at Universidad Pontificia de Comillas (ICAI), and we are considering publishing an academic article in the future that reviews the mathematical details and the implementation design.

At its core, thoad takes a one-output, many-inputs view of the graph and pushes high-order derivatives back to the leaf tensors. Although a 1→N problem can be rewritten as 1→1 by concatenating flattened inputs, as in functional approaches such as jax.jet or functorch, thoad’s graph-aware formulation enables an optimization based on unifying independent dimensions (especially batch). This delivers asymptotically better scaling with respect to batch size. We compute derivatives vectorially rather than component by component, which is what makes a pure PyTorch implementation practical without resorting to custom C++ or CUDA.

The package is easy to maintain because it is written entirely in Python and uses PyTorch as its only dependency. The implementation stays at a high level and leans on PyTorch’s vectorized operations, which means no custom C++ or CUDA bindings, no build systems to manage, and fewer platform-specific issues. With a single dependency, upgrades and security reviews are simpler, continuous integration is lighter, and contributors can read and modify the code quickly. The UX follows PyTorch closely, so triggering a high-order backward pass feels like calling tensor.backward(). You can install from GitHub or PyPI and start immediately:
GitHub: https://github.com/mntsx/thoad

PyPI: https://pypi.org/project/thoad/

In our benchmarks, thoad outperforms torch.autograd for Hessian calculations even on CPU. See the notebook that reproduces the comparison: https://github.com/mntsx/thoad/blob/master/examples/benchmarks/benchmark_vs_torch_autograd.ipynb

The user experience has been one of our main concerns during development. thoad is designed to align closely with PyTorch’s interface philosophy, so running the high-order backward pass is practically indistinguishable from calling PyTorch’s own backward. When you need finer control, you can keep or reduce Schwarz symmetries, group variables to restrict mixed partials, and fetch the exact mixed derivative you need. Shapes and independence metadata are also exposed to keep interpretation straightforward.


Using the Package
thoad exposes two primary interfaces for computing high-order derivatives:

thoad.backward: a function-based interface that closely resembles torch.Tensor.backward. It provides a quick way to compute high-order gradients without needing to manage an explicit controller object, but it offers only the core functionality (derivative computation and storage).

thoad.Controller: a class-based interface that wraps the output tensor’s subgraph in a controller object. In addition to performing the same high-order backward pass, it gives access to advanced features such as fetching specific mixed partials, inspecting batch-dimension optimizations, overriding backward-function implementations, retaining intermediate partials, and registering custom hooks.




thoad.backward
The thoad.backward function computes high-order partial derivatives of a given output tensor and stores them in each leaf tensor’s .hgrad attribute. Arguments:

tensor: A PyTorch tensor from which to start the backward pass. This tensor must require gradients and be part of a differentiable graph.

order: A positive integer specifying the maximum order of derivatives to compute.

gradient: A tensor with the same shape as tensor to seed the vector-Jacobian product (i.e., custom upstream gradient). If omitted, the default is used.

crossings: A boolean flag (default=False). If set to True, mixed partial derivatives (i.e., derivatives that involve more than one distinct leaf tensor) will be computed.

groups: An iterable of disjoint groups of leaf tensors. When crossings=False, only those mixed partials whose participating leaf tensors all lie within a single group will be calculated. If crossings=True and groups is provided, a ValueError will be raised (they are mutually exclusive).

keep_batch: A boolean flag (default=False) that controls how output dimensions are organized in the computed gradients.



When keep_batch=False: Gradients are returned in a fully flattened form. Concretely, think of the gradient tensor as having:


A single “output” axis that lists every element of the original output tensor (flattened into one dimension).
One axis per derivative order, each listing every element of the corresponding input (also flattened).





For an N-th order derivative of a leaf tensor with input_numel elements and an output with output_numel elements, the gradient shape is:



Axis 1: indexes all output_numel outputs

Axes 2…(N+1): each indexes all input_numel inputs







When keep_batch=True: Gradients preserve both a flattened “output” axis and each original output dimension before any input axes. You can visualize it as:



Axis 1 flattens all elements of the output tensor (size = output_numel).

Axes 2...(k+1) correspond exactly to each dimension of the output tensor (if the output was shape (d1, d2, ..., dk), these axes have sizes d1, d2, ..., dk).

Axes (k+2)...(k+N+1) each flatten all input_numel elements of the leaf tensor, one axis per derivative order.





However, if a particular output axis does not influence the gradient for a given leaf, that axis is not expanded and instead becomes a size-1 dimension. This means only those output dimensions that actually affect a particular leaf’s gradient “spread” into the input axes; any untouched axes remain as 1, saving memory.






keep_schwarz: A boolean flag (default=False). If True, symmetric (Schwarz) permutations are retained explicitly instead of being canonicalized/reduced—useful for debugging or inspecting non-reduced layouts.

Returns:
An instance of thoad.Controller wrapping the same tensor and graph.
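As a purely arithmetic illustration of the layout described above (the example sizes are mine, not from the docs): one "output" axis plus one input axis per derivative order means the gradient holds output_numel · input_numel^N entries in total.

```python
# Size arithmetic for an N-th order derivative, using illustrative shapes:
# an output of shape (10, 20) and a leaf tensor of shape (10, 15), order N = 2.
output_numel = 10 * 20  # 200 output elements
output_shape = (10, 20)
input_numel = 10 * 15   # 150 leaf elements
order = 2

# One "output" axis plus one input axis per derivative order:
total_entries = output_numel * input_numel ** order
print(total_entries)  # 4500000

# keep_batch=True rearranges axes (and lets output axes that don't influence
# a given leaf collapse to size 1) but does not change this worst-case total.
```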

import torch
import thoad
from torch.nn import functional as F

#### Normal PyTorch workflow
X = torch.rand(size=(10,15), requires_grad=True)
Y = torch.rand(size=(15,20), requires_grad=True)
Z = F.scaled_dot_product_attention(query=X, key=Y.T, value=Y.T)

#### Call thoad backward
order = 2
thoad.backward(tensor=Z, order=order)

#### Checks
## check derivative shapes
for o in range(1, 1 + order):
    assert X.hgrad[o - 1].shape == (Z.numel(), *(o * tuple(X.shape)))
    assert Y.hgrad[o - 1].shape == (Z.numel(), *(o * tuple(Y.shape)))
## check first derivatives (jacobians)
fn = lambda x, y: F.scaled_dot_product_attention(x, y.T, y.T)
J = torch.autograd.functional.jacobian(fn, (X, Y))
assert torch.allclose(J[0].flatten(), X.hgrad[0].flatten(), atol=1e-6)
assert torch.allclose(J[1].flatten(), Y.hgrad[0].flatten(), atol=1e-6)
## check second derivatives (hessians)
fn = lambda x, y: F.scaled_dot_product_attention(x, y.T, y.T).sum()
H = torch.autograd.functional.hessian(fn, (X, Y))
assert torch.allclose(H[0][0].flatten(), X.hgrad[1].sum(0).flatten(), atol=1e-6)
assert torch.allclose(H[1][1].flatten(), Y.hgrad[1].sum(0).flatten(), atol=1e-6)


thoad.Controller
The Controller class wraps a tensor’s backward subgraph in a controller object, performing the same core high-order backward pass as thoad.backward while exposing advanced customization, inspection, and override capabilities.

Instantiation
Use the constructor to create a controller for any tensor requiring gradients:

controller = thoad.Controller(tensor=GO)  ## takes the graph output tensor

tensor: A PyTorch Tensor with requires_grad=True and a non-None grad_fn.
Properties

.tensor → Tensor The output tensor underlying this controller. Setter: Replaces the tensor (after validation), rebuilds the internal computation graph, and invalidates any previously computed gradients.

.compatible → bool Indicates whether every backward function in the tensor’s subgraph has a supported high-order implementation. If False, some derivatives may fall back or be unavailable.

.index → Dict[Type[torch.autograd.Function], Type[ExtendedAutogradFunction]] A mapping from base PyTorch autograd.Function classes to thoad’s ExtendedAutogradFunction implementations. Setter: Validates and injects your custom high-order extensions.
Core Methods

.backward(order, gradient=None, crossings=False, groups=None, keep_batch=False, keep_schwarz=False) → None
Performs the high-order backward pass up to the specified derivative order, storing all computed partials in each leaf tensor’s .hgrad attribute.

order (int > 0): maximum derivative order.

gradient (Optional[Tensor]): custom upstream gradient with the same shape as controller.tensor.

crossings (bool, default False): If True, mixed partial derivatives across different leaf tensors will be computed.

groups (Optional[Iterable[Iterable[Tensor]]], default None): When crossings=False, restricts mixed partials to those whose leaf tensors all lie within a single group. If crossings=True and groups is provided, a ValueError is raised.

keep_batch (bool, default False): controls whether independent output axes are kept separate (batched) or merged (flattened) in stored/retrieved gradients.

keep_schwarz (bool, default False): if True, retains symmetric permutations explicitly (no Schwarz reduction).
.display_graph() → None
Prints a tree representation of the tensor’s backward subgraph. Supported nodes are shown normally; unsupported ones are annotated with (not supported).

.register_backward_hook(variables: Sequence[Tensor], hook: Callable) → None
Registers a user-provided hook to run during the backward pass whenever gradients for any of the specified leaf variables are computed.

variables (Sequence[Tensor]): Leaf tensors to monitor.

hook (Callable[[Tuple[Tensor, Tuple[Shape, ...], Tuple[Indep, ...]], dict[AutogradFunction, set[Tensor]]], Tuple[Tensor, Tuple[Shape, ...], Tuple[Indep, ...]]]): Receives the current (Tensor, shapes, indeps) plus contextual info, and must return the modified triple.
.require_grad_(variables: Sequence[Tensor]) → None
Marks the given leaf variables so that all intermediate partials involving them are retained, even if not required for the final requested gradients. Useful for inspecting or re-using higher-order intermediates.

.fetch_hgrad(variables: Sequence[Tensor], keep_batch: bool = False, keep_schwarz: bool = False) → Tuple[Tensor, Tuple[Tuple[Shape, ...], Tuple[Indep, ...], VPerm]]
Retrieves the precomputed high-order partial corresponding to the ordered sequence of leaf variables.

variables (Sequence[Tensor]): the leaf tensors whose mixed partial you want.

keep_batch (bool, default False): if True, each independent output axis remains a separate batch dimension in the returned tensor; if False, independent axes are distributed/merged into derivative dimensions.

keep_schwarz (bool, default False): if True, returns derivatives retaining symmetric permutations explicitly.
Returns a pair:

Gradient tensor: the computed partial derivatives, shaped according to output and input dimensions (respecting keep_batch/keep_schwarz).

Metadata tuple



Shapes (Tuple[Shape, ...]): the original shape of each leaf tensor.

Indeps (Tuple[Indep, ...]): for each variable, indicates which output axes remained independent (batch) vs. which were merged into derivative axes.

VPerm (Tuple[int, ...]): a permutation that maps the internal derivative layout to the requested variables order.


Use the combination of independent-dimension info and shapes to reshape or interpret the returned gradient tensor in your workflow.
import torch
import thoad
from torch.nn import functional as F

#### Normal PyTorch workflow
X = torch.rand(size=(10,15), requires_grad=True)
Y = torch.rand(size=(15,20), requires_grad=True)
Z = F.scaled_dot_product_attention(query=X, key=Y.T, value=Y.T)

#### Instantiate thoad controller and call backward
order = 2
controller = thoad.Controller(tensor=Z)
controller.backward(order=order, crossings=True)

#### Fetch Partial Derivatives
## fetch T0 and T1 2nd order derivatives
partial_XX, _ = controller.fetch_hgrad(variables=(X, X))
partial_YY, _ = controller.fetch_hgrad(variables=(Y, Y))
assert torch.allclose(partial_XX, X.hgrad[1])
assert torch.allclose(partial_YY, Y.hgrad[1])
## fetch cross derivatives
partial_XY, _ = controller.fetch_hgrad(variables=(X, Y))
partial_YX, _ = controller.fetch_hgrad(variables=(Y, X))
NOTE. A more detailed user guide with examples and feature walkthroughs is available in the notebook: https://github.com/mntsx/thoad/blob/master/examples/user_guide.ipynb
If you give it a try, I would love feedback on the API, corner cases, and models where you want better plug and play support.