File Handling in Bash: Read, Write, and Manage Files Safely for DevOps

Learn file handling in Bash: cat, head, tail, redirection, here-doc, tee, line-by-line reading, permission checks, find, and xargs. Includes DevOps examples for manual log rotation and config backup.

File Handling in Bash for DevOps

In the previous article we used loops to process multiple servers, many log lines, and retry commands that may fail temporarily. The next natural step is file handling: reading logs, writing reports, appending output, creating configuration files, finding old files, and passing file lists to other commands.

In DevOps, most small automation tasks touch files in one way or another: service logs, .env files, Nginx configs, YAML manifests, host lists, build artifacts, or backups. This article focuses on practical Bash patterns: cat, head, tail, redirection, here-doc, tee, line-by-line reading, file checks, find, xargs, and two hands-on examples.


Quickly reading file content with cat, tac, head, and tail

cat reads one or more files and writes the content to standard output. According to GNU Coreutils, if no file is provided, cat reads from standard input.

cat /etc/os-release
cat app.log deploy.log

A few commands you will often use when inspecting logs or config files:

head -n 20 app.log        # first 20 lines
head -c 1K app.log        # first 1 KiB
tail -n 50 app.log        # last 50 lines
tail -f app.log           # follow a growing log file
tail -F app.log           # better when the log may be rotated
tac app.log | head -n 20  # view the last 20 lines in reverse order

With GNU tail, -f follows the open file descriptor. When a log is rotated, the old file is typically renamed and the application opens a new file, so tail -f keeps reading the renamed file and misses the new one. GNU tail -F is equivalent to --follow=name --retry: it tracks the file by name and keeps trying to reopen it, which is usually what you want in operations.
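
To make the difference between these commands concrete, here is a small sketch that runs head, tail, and tac against a throwaway file. The file and its ten numbered lines are made up for illustration; only the commands themselves come from the list above.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sample log: ten numbered lines in a temporary file.
SAMPLE_LOG="$(mktemp)"
for i in 1 2 3 4 5 6 7 8 9 10; do
  echo "line ${i}"
done > "${SAMPLE_LOG}"

FIRST_TWO="$(head -n 2 "${SAMPLE_LOG}")"            # line 1, line 2
LAST_TWO="$(tail -n 2 "${SAMPLE_LOG}")"             # line 9, line 10
NEWEST_FIRST="$(tac "${SAMPLE_LOG}" | head -n 2)"   # line 10, line 9

printf '%s\n' "--- head ---" "${FIRST_TWO}" \
              "--- tail ---" "${LAST_TWO}" \
              "--- tac | head ---" "${NEWEST_FIRST}"

rm -f -- "${SAMPLE_LOG}"
```

tail -n 2 and tac | head -n 2 return the same two lines; only the order differs, which is handy when you want the newest entry at the top.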


Redirecting output: overwrite, append, and split stdout/stderr

Bash processes redirections from left to right. Common forms include:

command > output.log        # write stdout, overwrite file
command >> output.log       # append stdout
command 2> error.log        # write stderr
command > output.log 2>&1   # write stdout + stderr to the same file
command &> output.log       # Bash shorthand for stdout + stderr
command &>> output.log      # append stdout + stderr (Bash 4+)
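
Because redirections are applied left to right, the order of > file and 2>&1 changes where stderr ends up. The sketch below shows this with a made-up function, emit_both, that stands in for any command writing to both streams; the temporary directory and file names are also just for the demo.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical command that writes one line to stdout and one to stderr.
emit_both() {
  echo "to stdout"
  echo "to stderr" >&2
}

WORK_DIR="$(mktemp -d)"

# stdout is sent to the file first, then stderr is pointed at the same file.
emit_both > "${WORK_DIR}/both.log" 2>&1

# stderr is pointed at the current stdout (here the command substitution),
# and only afterwards is stdout sent to the file.
LEAKED="$(emit_both 2>&1 > "${WORK_DIR}/stdout-only.log")"

BOTH_LINES="$(wc -l < "${WORK_DIR}/both.log")"
STDOUT_ONLY="$(cat "${WORK_DIR}/stdout-only.log")"

echo "both.log lines: ${BOTH_LINES}"
echo "stdout-only.log: ${STDOUT_ONLY}"
echo "escaped the file: ${LEAKED}"

rm -rf -- "${WORK_DIR}"
```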

Example: write logs for a deploy step:

#!/usr/bin/env bash
set -euo pipefail

LOG_DIR="./logs"
LOG_FILE="${LOG_DIR}/deploy.log"
mkdir -p "${LOG_DIR}"

{
  echo "===== Deploy started at $(date -Is) ====="
  echo "Running migration..."
  ./migrate.sh
  echo "Restarting service..."
  ./restart-service.sh
  echo "===== Deploy finished at $(date -Is) ====="
} >> "${LOG_FILE}" 2>&1

Using a block like { ...; } >> "${LOG_FILE}" 2>&1 lets you collect logs from many commands into the same file without repeating the redirection on every line.

Note: > overwrites the file. If you want to reduce the risk of accidental overwrites in the current shell, you can enable set -o noclobber; when you intentionally need to overwrite, use >| file.
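
A quick way to see noclobber in action is to try both operators on a file that already exists. This is only a sketch: the temporary file and the variable names are invented for the demo.

```shell
#!/usr/bin/env bash
set -euo pipefail

TARGET_FILE="$(mktemp)"            # hypothetical file we do not want to lose
echo "original" > "${TARGET_FILE}"

set -o noclobber

# With noclobber on, a plain > on an existing file fails instead of overwriting.
# The braces only hide the shell's error message for this demo.
if { echo "replacement" > "${TARGET_FILE}"; } 2>/dev/null; then
  PLAIN_RESULT="overwritten"
else
  PLAIN_RESULT="refused"
fi
AFTER_PLAIN="$(cat "${TARGET_FILE}")"

# >| overrides noclobber for this one redirection.
echo "forced" >| "${TARGET_FILE}"
AFTER_FORCED="$(cat "${TARGET_FILE}")"

set +o noclobber
rm -f -- "${TARGET_FILE}"

echo "plain >: ${PLAIN_RESULT} (content: ${AFTER_PLAIN}); after >|: ${AFTER_FORCED}"
```

noclobber only affects the shell where it is set, so each script has to enable it itself.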


Here-doc: create configuration files from a script

A here-doc (<<EOF) sends multiple lines of text to a command’s stdin. It is very useful for creating sample configs, unit files, or small JSON payloads.

cat > app.env <<EOF
APP_ENV=production
APP_PORT=8080
LOG_LEVEL=info
EOF

If the delimiter is not quoted, Bash expands variables inside the here-doc:

APP_PORT="8080"

cat > app.env <<EOF
APP_PORT=${APP_PORT}
EOF

If you want to keep $, backticks, or ${VAR} literally in the output file, quote the delimiter:

cat > template.env <<'EOF'
APP_PORT=${APP_PORT}
DATABASE_URL=${DATABASE_URL}
EOF

According to the Bash Manual, when using <<-EOF, Bash strips leading tabs from the here-doc. This can make a script easier to read, but it strips tabs only, not spaces.
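
Tabs are easy to lose when copying code from a web page, so the sketch below builds a tiny script with printf, where each \t is an explicit tab, and feeds it to a child bash. The sample lines are made up; the point is only that <<-EOF removes leading tabs and leaves leading spaces alone.

```shell
#!/usr/bin/env bash
set -euo pipefail

# The generated script is:
#   cat <<-EOF
#   <TAB><TAB>indented with tabs
#       indented with spaces
#   <TAB>EOF
STRIPPED="$(
  printf 'cat <<-EOF\n\t\tindented with tabs\n    indented with spaces\n\tEOF\n' | bash
)"

printf '%s\n' "${STRIPPED}"
```

The first output line has no indentation left, while the second keeps its four spaces. If your editor converts tabs to spaces, <<-EOF will quietly stop stripping anything.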


tee: show output and save it at the same time

Redirection with > sends output to a file and usually no longer shows it in the terminal. tee copies standard input to standard output and writes it to a file at the same time.

./healthcheck.sh | tee healthcheck.log
./healthcheck.sh | tee -a healthcheck.log

According to GNU Coreutils, tee overwrites the file unless you use -a; tee -a appends to the file.

Example: watch build logs while saving an artifact log:

#!/usr/bin/env bash
set -euo pipefail

LOG_DIR="./logs"
mkdir -p "${LOG_DIR}"

./build.sh 2>&1 | tee -a "${LOG_DIR}/build.log"

Here, 2>&1 is placed before the pipe so stderr also flows through tee. If you write only ./build.sh | tee ..., many errors printed to stderr will still appear in the terminal but will not be saved to the log file.
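
The effect is easy to check with a stand-in for the build script. In this sketch, fake_build is a hypothetical function that prints one line to each stream, and the log directory is a temporary one.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for ./build.sh: one line on stdout, one on stderr.
fake_build() {
  echo "compiling"
  echo "warning: deprecated flag" >&2
}

LOG_DIR="$(mktemp -d)"

# Without 2>&1 only stdout reaches tee; the warning goes straight to the terminal.
fake_build | tee "${LOG_DIR}/stdout-only.log" > /dev/null

# With 2>&1 before the pipe, both streams flow through tee.
fake_build 2>&1 | tee "${LOG_DIR}/combined.log" > /dev/null

STDOUT_ONLY_LINES="$(wc -l < "${LOG_DIR}/stdout-only.log")"
COMBINED_LINES="$(wc -l < "${LOG_DIR}/combined.log")"

echo "stdout-only.log: ${STDOUT_ONLY_LINES} line(s)"
echo "combined.log:    ${COMBINED_LINES} line(s)"

rm -rf -- "${LOG_DIR}"
```

tee's own output is discarded here only to keep the demo quiet; in a real script you would normally leave it attached to the terminal.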


Reading a file line by line

A safe pattern for line-by-line reading is while IFS= read -r LINE; do ...; done < file.

#!/usr/bin/env bash
set -euo pipefail

SERVER_FILE="${1:-servers.txt}"

if [[ ! -r "${SERVER_FILE}" ]]; then
  echo "ERROR: Cannot read ${SERVER_FILE}" >&2
  exit 1
fi

while IFS= read -r SERVER; do
  [[ -n "${SERVER}" ]] || continue
  [[ "${SERVER}" != \#* ]] || continue

  echo "Checking ${SERVER}"
done < "${SERVER_FILE}"

Quick explanation:

  • IFS= preserves leading and trailing whitespace.
  • read -r prevents \ from being treated as an escape character.
  • done < "${SERVER_FILE}" avoids the cat file | while ... pattern. In Bash, each part of a pipeline runs in a subshell by default, so variables set inside a piped loop are lost when the loop ends.
  • Quote "${SERVER_FILE}" so paths containing spaces still work.
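
The subshell point is the one that surprises people most, so here is a small sketch comparing the two loop styles. The host list and the counter names are invented for the demo, and it assumes Bash's default settings (lastpipe not enabled).

```shell
#!/usr/bin/env bash
set -euo pipefail

SERVER_FILE="$(mktemp)"   # hypothetical host list
printf '%s\n' "web-01" "# retired" "" "db-01" > "${SERVER_FILE}"

# Piped loop: the counter is updated inside a subshell and then thrown away.
PIPED_COUNT=0
cat "${SERVER_FILE}" | while IFS= read -r SERVER; do
  PIPED_COUNT=$((PIPED_COUNT + 1))
done

# Redirected loop: runs in the current shell, so the counter survives.
REDIRECTED_COUNT=0
while IFS= read -r SERVER; do
  [[ -n "${SERVER}" ]] || continue        # skip empty lines
  [[ "${SERVER}" != \#* ]] || continue    # skip comments
  REDIRECTED_COUNT=$((REDIRECTED_COUNT + 1))
done < "${SERVER_FILE}"

rm -f -- "${SERVER_FILE}"
echo "piped loop counted: ${PIPED_COUNT}, redirected loop counted: ${REDIRECTED_COUNT}"
```

The piped loop does iterate over every line; it is only the assignment to PIPED_COUNT that never reaches the parent shell.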

Check files, directories, and permissions before operating

Before reading, writing, or deleting files in automation, check the conditions explicitly. Some useful tests inside [[ ... ]]:

[[ -e "${PATH_NAME}" ]]  # exists
[[ -f "${PATH_NAME}" ]]  # regular file
[[ -d "${PATH_NAME}" ]]  # directory
[[ -r "${PATH_NAME}" ]]  # readable
[[ -w "${PATH_NAME}" ]]  # writable
[[ -x "${PATH_NAME}" ]]  # executable/searchable
[[ -s "${PATH_NAME}" ]]  # exists and size > 0

Example: back up a config only when the file exists and is readable:

#!/usr/bin/env bash
set -euo pipefail

CONFIG_FILE="${1:-/etc/example/app.conf}"
BACKUP_DIR="${BACKUP_DIR:-./backup}"

if [[ ! -f "${CONFIG_FILE}" ]]; then
  echo "ERROR: Not a regular file: ${CONFIG_FILE}" >&2
  exit 1
fi

if [[ ! -r "${CONFIG_FILE}" ]]; then
  echo "ERROR: Cannot read: ${CONFIG_FILE}" >&2
  exit 1
fi

mkdir -p "${BACKUP_DIR}"
cp -- "${CONFIG_FILE}" "${BACKUP_DIR}/$(basename "${CONFIG_FILE}").$(date +%Y%m%d%H%M%S).bak"

The -- after cp ends the option list. This is a good habit when a variable may start with -.
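
To see why this matters, the sketch below copies a file whose name starts with a dash, first without and then with --. The file name -app.conf and the directory layout are made up for the demo.

```shell
#!/usr/bin/env bash
set -euo pipefail

WORK_DIR="$(mktemp -d)"
mkdir -p "${WORK_DIR}/backup"
cd "${WORK_DIR}"

# Hypothetical config whose name happens to start with a dash.
CONFIG_FILE="-app.conf"
echo "port=8080" > "./${CONFIG_FILE}"

# Without --, cp tries to parse "-app.conf" as a bundle of options and fails.
if cp "${CONFIG_FILE}" backup/ 2>/dev/null; then
  WITHOUT_DASHES="copied"
else
  WITHOUT_DASHES="failed"
fi

# With --, everything after it is treated as a file name.
cp -- "${CONFIG_FILE}" backup/
WITH_DASHES="$(cat -- "backup/${CONFIG_FILE}")"

cd - > /dev/null
rm -rf -- "${WORK_DIR}"

echo "without --: ${WITHOUT_DASHES}; with --: copied '${WITH_DASHES}'"
```

A failed copy is the harmless case. With a command such as rm, a file name that looks like an option could change what the command does.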


find: search for files by condition

find is useful when you need to search by name, type, time, size, or nested directories.

find /var/log -type f -name "*.log"
find /var/log -type f -name "*.log" -mtime +7
find ./backup -type f -name "*.bak" -size +100M

Common predicates:

  • -type f: regular files only.
  • -type d: directories only.
  • -name "*.log": match by file name.
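  • -mtime +7: files last modified more than 7 full days ago. find counts age in whole 24-hour periods and ignores the fraction, so +7 matches files that are at least 8 days old.
  • -size +100M: files larger than 100 MiB, using GNU find syntax.
  • -maxdepth 1: examine only the starting directory and its direct entries, without descending into subdirectories. Place it before tests such as -name.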

Example: review logs older than 14 days without deleting them yet:

find /var/log/my-app -type f -name "*.log" -mtime +14 -print

When writing a cleanup script, run with -print first to review the list, then change it to a delete or archive action.
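
Here is a sketch of that review-then-delete workflow on a throwaway directory. The file names are invented, and touch -d with a relative date is GNU touch syntax, so treat that part as an assumption about your environment.

```shell
#!/usr/bin/env bash
set -euo pipefail

LOG_ROOT="$(mktemp -d)"               # hypothetical log directory
mkdir -p "${LOG_ROOT}/archive"

touch "${LOG_ROOT}/fresh.log"
touch -d '20 days ago' "${LOG_ROOT}/old.log"
touch -d '20 days ago' "${LOG_ROOT}/old.txt"
touch -d '20 days ago' "${LOG_ROOT}/archive/nested.log"

# Step 1: review. -maxdepth 1 keeps find out of archive/.
REVIEW_LIST="$(find "${LOG_ROOT}" -maxdepth 1 -type f -name '*.log' -mtime +14 -print)"
printf 'Would delete:\n%s\n' "${REVIEW_LIST}"

# Step 2: the same expression, with -delete appended.
find "${LOG_ROOT}" -maxdepth 1 -type f -name '*.log' -mtime +14 -print -delete > /dev/null

REMAINING="$(find "${LOG_ROOT}" -type f | sort)"
printf 'Still present:\n%s\n' "${REMAINING}"

rm -rf -- "${LOG_ROOT}"
```

Only old.log is removed: fresh.log is too new, old.txt does not match the name pattern, and archive/nested.log sits below the depth limit.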


xargs: pass file lists to another command

xargs reads data from stdin, groups it into arguments, and runs a command. It is useful when the file list is long or when you need to pass find results to another command.

Avoid the default form for arbitrary file names:

find ./logs -type f -name "*.log" | xargs gzip

By default, xargs splits input on blanks and newlines and also interprets quotes and backslashes, so file names containing spaces, tabs, newlines, or quote characters may be interpreted incorrectly. GNU findutils recommends using find -print0 together with xargs -0 to separate entries with the NUL character:

find ./logs -type f -name "*.log" -print0 | xargs -0 gzip

If you use GNU xargs, add -r so the command is not run when input is empty:

find ./logs -type f -name "*.log" -print0 | xargs -r -0 gzip

Another option is find -exec ... {} +, which is portable and avoids the pipe:

find ./logs -type f -name "*.log" -exec gzip -- {} +
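
To check how many arguments each variant really passes along, the sketch below replaces gzip with a tiny sh -c command that prints its argument count. The directory and the file name my app.log are made up, and the demo assumes the temporary path itself contains no whitespace.

```shell
#!/usr/bin/env bash
set -euo pipefail

LOG_DIR="$(mktemp -d)"              # hypothetical directory
touch "${LOG_DIR}/my app.log"       # one file, with a space in its name

# sh -c 'echo "$#"' sh ARGS...  prints how many ARGS it received.
DEFAULT_ARGS="$(find "${LOG_DIR}" -type f -name '*.log' | xargs sh -c 'echo "$#"' sh)"
NUL_ARGS="$(find "${LOG_DIR}" -type f -name '*.log' -print0 | xargs -0 sh -c 'echo "$#"' sh)"
EXEC_ARGS="$(find "${LOG_DIR}" -type f -name '*.log' -exec sh -c 'echo "$#"' sh {} +)"

rm -rf -- "${LOG_DIR}"

echo "default xargs saw ${DEFAULT_ARGS} argument(s)"
echo "xargs -0 saw ${NUL_ARGS} argument(s)"
echo "find -exec saw ${EXEC_ARGS} argument(s)"
```

The default form splits the single file name into two arguments, which gzip would then report as two missing files.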

DevOps practice: manual log rotation

In production, log rotation should usually be handled by logrotate or a logging stack. But understanding a small rotate script helps you grasp file operations clearly: check size, rename, create a new file, compress, and clean up.

#!/usr/bin/env bash
set -euo pipefail

LOG_FILE="${1:-./logs/app.log}"
MAX_SIZE_MB="${MAX_SIZE_MB:-100}"
KEEP_DAYS="${KEEP_DAYS:-14}"

if [[ ! -f "${LOG_FILE}" ]]; then
  echo "ERROR: Log file not found: ${LOG_FILE}" >&2
  exit 1
fi

LOG_DIR="$(dirname "${LOG_FILE}")"
LOG_NAME="$(basename "${LOG_FILE}")"
SIZE_MB="$(du -m "${LOG_FILE}" | awk '{print $1}')"

if (( SIZE_MB < MAX_SIZE_MB )); then
  echo "OK: ${LOG_FILE} is ${SIZE_MB}MB, no rotation needed"
  exit 0
fi

TIMESTAMP="$(date +%Y%m%d%H%M%S)"
ROTATED_FILE="${LOG_DIR}/${LOG_NAME}.${TIMESTAMP}"

mv -- "${LOG_FILE}" "${ROTATED_FILE}"
: > "${LOG_FILE}"
gzip -- "${ROTATED_FILE}"

find "${LOG_DIR}" -type f -name "${LOG_NAME}.*.gz" -mtime +"${KEEP_DAYS}" -print -delete

echo "Rotated ${LOG_FILE} -> ${ROTATED_FILE}.gz"

A few notes:

  • : > "${LOG_FILE}" creates a new empty file using the shell builtin : and an empty output redirection.
  • This script is suitable for learning or for a small service. An application that keeps its log file open continues writing to the renamed file through the existing file descriptor, so manual rotation usually needs a signal or a log-reopen mechanism, depending on the application.
  • Use find ... -print -delete so you can see which files are removed while cleaning up.
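
One detail worth knowing: du -m reports allocated disk space in MiB units, not the byte length of the file. If you would rather compare the actual length, one possible sketch reads it with wc -c. The helper names file_size_bytes and needs_rotation are hypothetical and not part of the script above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: byte length of a file, read via wc -c.
file_size_bytes() {
  local size
  size="$(wc -c < "$1")"
  echo "$((size))"      # arithmetic expansion drops any padding wc may add
}

# Hypothetical helper: succeeds when the file is at least max_mb MiB long.
needs_rotation() {
  local file="$1" max_mb="$2"
  (( $(file_size_bytes "${file}") >= max_mb * 1024 * 1024 ))
}

SAMPLE_LOG="$(mktemp)"
head -c 2097152 /dev/zero > "${SAMPLE_LOG}"    # exactly 2 MiB of data

SIZE_BYTES="$(file_size_bytes "${SAMPLE_LOG}")"
if needs_rotation "${SAMPLE_LOG}" 1; then ROTATE_AT_1="yes"; else ROTATE_AT_1="no"; fi
if needs_rotation "${SAMPLE_LOG}" 5; then ROTATE_AT_5="yes"; else ROTATE_AT_5="no"; fi

rm -f -- "${SAMPLE_LOG}"
echo "size=${SIZE_BYTES} rotate@1MiB=${ROTATE_AT_1} rotate@5MiB=${ROTATE_AT_5}"
```

Either measurement works for a rotation threshold; the difference mostly shows up with sparse or very small files.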

DevOps practice: back up config before deployment

Before deployment, a safe step is backing up important config files into a timestamped directory.

#!/usr/bin/env bash
set -euo pipefail

BACKUP_ROOT="${BACKUP_ROOT:-./config-backups}"
TIMESTAMP="$(date +%Y%m%d%H%M%S)"
BACKUP_DIR="${BACKUP_ROOT}/${TIMESTAMP}"

CONFIG_FILES=(
  "/etc/example/app.conf"
  "/etc/example/worker.conf"
  "/etc/nginx/conf.d/example.conf"
)

mkdir -p "${BACKUP_DIR}"

for CONFIG_FILE in "${CONFIG_FILES[@]}"; do
  if [[ ! -f "${CONFIG_FILE}" || ! -r "${CONFIG_FILE}" ]]; then
    echo "WARN: Skip missing or unreadable config: ${CONFIG_FILE}" >&2
    continue
  fi

  TARGET="${BACKUP_DIR}${CONFIG_FILE}"
  mkdir -p "$(dirname "${TARGET}")"
  cp -p -- "${CONFIG_FILE}" "${TARGET}"
  echo "Backed up ${CONFIG_FILE}"
done

find "${BACKUP_ROOT}" -mindepth 1 -maxdepth 1 -type d -mtime +30 -print -exec rm -rf -- {} +

cp -p preserves mode, ownership, and timestamps when system permissions allow it. TARGET="${BACKUP_DIR}${CONFIG_FILE}" keeps the original path structure inside the backup directory, which makes restore easier.
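
Because each copy sits under the backup directory at its original absolute path, the original location can be recovered by stripping the backup directory prefix. The sketch below only prints a restore plan and copies nothing. list_restore_plan is a hypothetical helper, the temporary directory stands in for one timestamped backup, and sort -z assumes GNU sort.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print "backup_copy -> original_path" for every file
# stored under one timestamped backup directory. Nothing is copied.
list_restore_plan() {
  local backup_dir="${1%/}"
  local backup_copy
  while IFS= read -r -d '' backup_copy; do
    printf '%s -> %s\n' "${backup_copy}" "${backup_copy#"${backup_dir}"}"
  done < <(find "${backup_dir}" -type f -print0 | sort -z)
}

BACKUP_DIR="$(mktemp -d)"     # stands in for ./config-backups/<timestamp>
mkdir -p "${BACKUP_DIR}/etc/example" "${BACKUP_DIR}/etc/nginx/conf.d"
touch "${BACKUP_DIR}/etc/example/app.conf" \
      "${BACKUP_DIR}/etc/nginx/conf.d/example.conf"

RESTORE_PLAN="$(list_restore_plan "${BACKUP_DIR}")"
printf '%s\n' "${RESTORE_PLAN}"

rm -rf -- "${BACKUP_DIR}"
```

Printing the plan first follows the same review-before-acting habit used for cleanup: once the list looks right, the printf can be swapped for a cp -p -- call.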


Common mistakes

  • Using > when you meant append: > overwrites the file; use >> or tee -a if you want to append.
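  • Putting redirections in the wrong order: Bash processes redirections from left to right. command >file 2>&1 sends both streams to the file; command 2>&1 >file sends only stdout to the file, because stderr was duplicated from stdout while stdout still pointed at the terminal.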
  • Forgetting to quote paths: Always use "${FILE}", especially with paths read from input.
  • Using default xargs with arbitrary file names: Prefer find -print0 | xargs -0 or find -exec ... {} +.
  • Deleting files before reviewing: For cleanup, run find ... -print first, then add -delete or rm.
  • Unexpected here-doc expansion: Quote the delimiter (<<'EOF') if you want to keep ${VAR} literally in the output file.

Implementation notes

  • When applying this to your own project, pick the tool by operation type:
    • Quick file/log viewing → cat, head, tail, tail -F.
    • Script logging → redirection block or tee -a.
    • Multi-line config creation → here-doc, with a quoted delimiter when you need literal templates.
    • Searching/processing many files → find, find -exec ... {} +, or find -print0 | xargs -0.
  • Best practices:
    • Start scripts with set -euo pipefail when appropriate.
    • Check -r, -w, -f, and -d before dangerous operations.
    • Use -- before path variables in commands such as cp, mv, rm, and gzip.
    • For cleanup, log affected files with -print or echo.
  • Troubleshooting:
    • Log file does not capture stderr? → Make sure 2>&1 appears before the pipe or that redirection order is correct.
    • tail -f does not show new logs after rotation? → Try tail -F.
    • Script breaks with file names containing spaces? → Check variable quoting and how xargs is used.

🎯 Final thoughts

File handling is a core skill when writing Bash for DevOps. Once you understand redirection, here-doc, tee, line-by-line reading, permission checks, find, and xargs, you can write small but safer scripts for logs, configs, backups, and cleanup.

In the next article, we will move into text processing with grep, awk, and sed: filtering logs, extracting columns, changing config, and counting data from text files more effectively. 🚀


References