Python 101 - 26: Files and the 'with' Keyword

The Context Manager (`with`)

In the old days, you had to manually call open() and then f.close() for every file. If your program crashed after opening a file but before closing it, buffered writes could be lost and the file could stay “locked,” preventing other programs from using it.

The with statement works with a Context Manager, and the file object returned by open() is one. Think of it as a self-closing door: as soon as your code leaves the indented block, Python slams the file shut for you automatically, even if an exception is raised inside the block.

# The safe, modern way to work with files
with open("test.txt", "r") as f:
    content = f.read()
    print(content)

# Once we are out of the 'with' block, the file is ALREADY closed.
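
For comparison, here is roughly what the with block saves you from writing by hand. This is a minimal sketch: the file name demo_with.txt and the temporary directory are assumptions made only so the example runs anywhere.

```python
import tempfile
from pathlib import Path

# Hypothetical demo file in a temporary directory (not part of the lesson's files)
demo_path = Path(tempfile.mkdtemp()) / "demo_with.txt"
demo_path.write_text("hello\n", encoding="utf-8")

# Manual version: we must remember to close, even if reading raises an exception
f = open(demo_path, "r", encoding="utf-8")
try:
    manual_content = f.read()
finally:
    f.close()

# 'with' version: same guarantee, less code
with open(demo_path, "r", encoding="utf-8") as g:
    with_content = g.read()

# The .closed attribute confirms that both files were closed
print(f.closed, g.closed)
```

Both versions read the same text; the with version simply cannot forget the close step.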

Modes: How are you opening it?

When you open a file, the mode tells Python what you plan to do with it.

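r (Read): Default. Opens for reading. Raises FileNotFoundError if the file doesn’t exist.
w (Write): Danger! Erases everything in the file, or creates a new one if it doesn’t exist.
a (Append): Adds new text to the end of the existing content, or creates the file if it doesn’t exist.
r+ (Read/Write): Allows you to do both without erasing the file first. Raises FileNotFoundError if the file doesn’t exist.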

# Writing (Overwrites the file)
with open("notes.txt", "w") as f:
    f.write("This is a new note.")

# Appending (Adds to the end)
with open("notes.txt", "a") as f:
    f.write("\nThis is a second note.")
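
To see the difference between "w" and "a" for yourself, the sketch below writes to a throwaway file and checks its content after each step. The file name notes_demo.txt and the temporary directory are assumptions for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch file so we do not touch any real notes
notes = Path(tempfile.mkdtemp()) / "notes_demo.txt"

with open(notes, "w", encoding="utf-8") as f:
    f.write("first")

with open(notes, "a", encoding="utf-8") as f:
    f.write(" second")  # added after the existing text

after_append = notes.read_text(encoding="utf-8")

with open(notes, "w", encoding="utf-8") as f:
    f.write("third")  # "w" erased the earlier content before writing

after_overwrite = notes.read_text(encoding="utf-8")

print(after_append)
print(after_overwrite)
```

After the append the file holds both pieces of text; after the second "w" only the last write survives.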

Reading Files Without Crashing Your RAM

If you call .read() on a 100GB file, Python will try to load all 100GB into your RAM at once, which typically ends in a MemoryError or a machine that grinds to a halt. You need to choose the right “tool” based on the size and structure of your data.

“Small File” Method (`f.read()`)

Best for small configuration files or scripts. It pulls everything into memory as a single string.

with open("config.txt", "r") as f:
    # Only use this if you know the file is small!
    content = f.read()

“Big File” (Lazy Iteration)

This is the standard way to handle text files with newlines. It is Lazy, meaning it only keeps one line in memory at a time. You can process a massive file this way on a cheap laptop without any issues.

with open("huge_logs.txt", "r") as f:
    for line in f:
        if "ERROR" in line:
            print(line.strip()) # strip() removes the extra \n
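
The same lazy loop can be wrapped in a generator so the caller also gets line numbers. This is only a sketch: find_matching_lines and the sample log content are hypothetical names and data, not part of any library.

```python
import tempfile
from pathlib import Path

def find_matching_lines(path, needle):
    """Yield (line_number, text) for each line containing needle, one line at a time."""
    with open(path, "r", encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if needle in line:
                yield line_number, line.rstrip("\n")

# Hypothetical sample log in a temporary directory
log_path = Path(tempfile.mkdtemp()) / "sample_logs.txt"
log_path.write_text(
    "INFO start\nERROR disk full\nINFO retry\nERROR timeout\n",
    encoding="utf-8",
)

matches = list(find_matching_lines(log_path, "ERROR"))
print(matches)
```

Because the function yields results as it goes, only the current line is held in memory inside the generator, no matter how large the log is.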

“No Newline” Problem (Chunking)

What if your file is 100GB but has no newlines? (This is common with massive JSON blobs or binary data.) A for line in f loop would still run out of memory because Python would treat the entire file as one giant “line.”

To solve this, you pass a number to .read(size) to read at most that many characters at a time (fewer only when the end of the file is reached). When .read(size) returns an empty string, there is nothing left to read.

with open("huge_blob.txt", "r") as f:
    chunk_size = 1024 # Read up to 1,024 characters at a time (text mode counts characters, not bytes)
    content = f.read(chunk_size)
    
    while len(content) > 0:
        print(content, end="")        # Process the chunk
        content = f.read(chunk_size)  # Pull the next chunk
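
If you need chunking in more than one place, the loop can be packaged as a generator. The sketch below assumes a hypothetical helper name, read_in_chunks, and a small generated file standing in for the huge one.

```python
import tempfile
from pathlib import Path

def read_in_chunks(path, chunk_size=1024):
    """Yield successive pieces of a text file, each at most chunk_size characters."""
    with open(path, "r", encoding="utf-8") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty string means end of file
                break
            yield chunk

# Hypothetical stand-in for a huge file: 2,500 characters and no newlines
blob_path = Path(tempfile.mkdtemp()) / "blob_demo.txt"
blob_path.write_text("x" * 2500, encoding="utf-8")

sizes = [len(chunk) for chunk in read_in_chunks(blob_path, chunk_size=1024)]
print(sizes)
```

Every chunk is capped at chunk_size, so memory use stays flat regardless of the file's total size.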

`readlines()` and `readline()`

Notice the “s” at the end of the first one:

f.readlines(): This reads the entire file and stores every line in a List.
- Danger: If the file is 100GB, your list will take up 100GB of RAM. Avoid this for large files.
f.readline(): This reads exactly one line and stops. Useful for grabbing a specific line, like a header.

with open("data.csv", "r") as f:
    header = f.readline() # Grab just the 1st line (the column names)
    print(f"Columns: {header.strip()}")
    
    # The playhead is now at line 2, ready for the rest of your code
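
To make the “rest of your code” concrete, here is one possible continuation: read the header with readline(), then let a lazy loop pick up from line 2. The file name people_demo.csv and its rows are made up for the example, and a real project would likely use the csv module for proper parsing.

```python
import tempfile
from pathlib import Path

# Hypothetical CSV file created just for this sketch
csv_path = Path(tempfile.mkdtemp()) / "people_demo.csv"
csv_path.write_text("name,age\nAda,36\nLinus,54\n", encoding="utf-8")

with open(csv_path, "r", encoding="utf-8") as f:
    header = f.readline().strip().split(",")         # consumes line 1 only
    rows = [line.strip().split(",") for line in f]   # iteration resumes at line 2

print(header)
print(rows)
```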

“Playhead” and `f.seek()`

Think of a file like an old VHS tape or a music track. Every read moves the “playhead” forward: after you read 100 characters, the next read starts at character 101. Once the playhead reaches the end of the file, reading again returns an empty string because there is nothing left to play.

Use f.seek(0) to move the playhead back to the very beginning of the file.

with open("test.txt", "r") as f:
    print(f.read()) # Reads to the end
    print(f.read()) # Prints a blank line (read() returns "" at the end)
    
    f.seek(0)       # Rewind to the start!
    print(f.read()) # Reads the whole thing again
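
The playhead also moves on partial reads, not only when you read to the end. A small sketch, assuming a hypothetical file tape_demo.txt with a short line of text:

```python
import tempfile
from pathlib import Path

# Hypothetical demo file
tape_path = Path(tempfile.mkdtemp()) / "tape_demo.txt"
tape_path.write_text("Hello, tape!", encoding="utf-8")

with open(tape_path, "r", encoding="utf-8") as f:
    first = f.read(5)   # playhead moves 5 characters in
    rest = f.read()     # continues from where the last read stopped
    at_end = f.read()   # already at the end, so this is an empty string
    f.seek(0)           # rewind to the start
    again = f.read(5)   # same 5 characters as the first read

print(repr(first), repr(rest), repr(at_end), repr(again))
```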

Binary Files (Images, Music, etc.)

By default, Python opens files in text mode and decodes them with your platform’s default encoding (often UTF-8). If you try to read a .jpg image or a .mp3 music file as text, you will get gibberish or a UnicodeDecodeError.

To handle these, add a b (Binary) to your mode:

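# Copying an image file in fixed-size chunks
with open("original.png", "rb") as source, open("copy.png", "wb") as destination:
    # Iterating a binary file splits on newline bytes, so read fixed-size chunks instead
    while chunk := source.read(64 * 1024):  # up to 64 KiB of bytes per read (Python 3.8+)
        destination.write(chunk)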
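
The standard library can also do the copying loop for you: shutil.copyfileobj() reads from one open file and writes to another in chunks. The sketch below uses made-up file names (original.bin, copy.bin) and generated bytes in place of a real image.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical binary file: 25,600 bytes of repeating values 0..255
workdir = Path(tempfile.mkdtemp())
original = workdir / "original.bin"
copy = workdir / "copy.bin"
original.write_bytes(bytes(range(256)) * 100)

# Both files are opened in binary mode, so no text decoding happens
with open(original, "rb") as source, open(copy, "wb") as destination:
    shutil.copyfileobj(source, destination)

# Reading in "rb" mode gives a bytes object, not a str
with open(copy, "rb") as f:
    first_bytes = f.read(4)

print(type(first_bytes), first_bytes)
```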

Handling Missing Files

In a perfect world, every file your program needs would be right where it belongs. In reality, files get moved, renamed, or deleted. If you try to read a missing file, Python raises a FileNotFoundError, and your program crashes unless you handle it.

“Look Before You Leap” (LBYL)

Using pathlib, you check whether the file is there before you try to open it.

from pathlib import Path

file_path = Path("secrets.txt")

if file_path.exists():
    with open(file_path, "r") as f:
        print(f.read())
else:
    print("Error: The file is missing!")

“Easier to Ask Forgiveness” (EAFP)

You just try to open it, and if it fails, your except block catches the error. This is often considered more “Pythonic” because it avoids race conditions (where a file exists during the check but is deleted a millisecond before the open call).

try:
    with open("secrets.txt", "r") as f:
        print(f.read())
except FileNotFoundError:
    print("Oops! That file doesn't exist.")
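
The EAFP pattern fits neatly into a small helper that returns a fallback value instead of printing. This is a sketch only: read_text_or_default is a hypothetical name, and the files live in a temporary directory so the example is self-contained.

```python
import tempfile
from pathlib import Path

def read_text_or_default(path, default=""):
    """Return the file's text, or default if the file does not exist."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return default

# Hypothetical files: one that exists and one that does not
workdir = Path(tempfile.mkdtemp())
present = workdir / "present.txt"
present.write_text("top secret", encoding="utf-8")

found = read_text_or_default(present)
missing = read_text_or_default(workdir / "missing.txt", default="(no file)")

print(found)
print(missing)
```

Only FileNotFoundError is caught here on purpose; other problems, such as missing permissions, still surface as errors.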

How to check if a file is empty

A file might exist, but it might be 0 bytes. Trying to process an empty file can lead to math errors (like dividing by zero when calculating an average).

Modern Way

stat().st_size returns the size in bytes. If it’s 0, the file is empty.

from pathlib import Path

file_path = Path("data.txt")

if file_path.exists() and file_path.stat().st_size == 0:
    print("File is empty! Skipping...")

Classic Way

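If you are working on older code, you will see the os module. os.path.getsize() returns the same size in bytes, but it raises FileNotFoundError if the file is missing, so this snippet assumes the file exists.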

import os

if os.path.getsize("data.txt") == 0:
    print("File is empty! Skipping...")

“Playhead” Way

If you’ve already opened the file, you can move the playhead to the end and ask for its position using .tell().

with open("data.txt", "r") as f:
    # Move playhead to the very end (0 bytes from the end)
    f.seek(0, 2) 
    
    if f.tell() == 0:
        print("Empty file detected.")
    else:
        f.seek(0) # IMPORTANT: Rewind to the start before reading!
        print(f.read())

“Defensive” File Pattern

In most cases, we combine everything into a single pattern. This pattern separates Validation (checking if the file is ready) from Operation (actually reading it).

from pathlib import Path

file_path = Path("user_data.txt")

try:
    # Validation: Does it exist?
    if not file_path.exists():
        raise FileNotFoundError(f"Missing file: {file_path}")

    # Validation: Is it empty?
    if file_path.stat().st_size == 0:
        print(f"Notice: '{file_path}' is empty. Skipping logic.")
    
    else:
        # Operation: Safe reading
        with open(file_path, "r") as f:
            for line in f:
                print(f"Processing: {line.strip()}")

except FileNotFoundError as e:
    print(f"File Error: {e}")
except OSError as e:
    # Other file-system problems (like missing permissions or a locked file)
    print(f"Unexpected Error: {e}")
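
If several parts of a program need this pattern, one option is to wrap it in a function. The sketch below is an assumption about how that might look, not the only way to do it: load_lines is a hypothetical name, and it relies on a single stat() call to cover both the “missing” and the “empty” case.

```python
import tempfile
from pathlib import Path

def load_lines(path):
    """Return the file's stripped lines; return [] if it is missing or empty."""
    path = Path(path)
    try:
        # Validation: stat() raises FileNotFoundError for a missing file
        if path.stat().st_size == 0:
            return []
        # Operation: lazy, line-by-line read
        with open(path, "r", encoding="utf-8") as f:
            return [line.strip() for line in f]
    except FileNotFoundError:
        return []

# Hypothetical files covering the three cases
workdir = Path(tempfile.mkdtemp())
full = workdir / "full.txt"
full.write_text("alice\nbob\n", encoding="utf-8")
empty = workdir / "empty.txt"
empty.touch()

full_result = load_lines(full)
empty_result = load_lines(empty)
missing_result = load_lines(workdir / "nope.txt")

print(full_result, empty_result, missing_result)
```

Returning an empty list for all “nothing to process” cases keeps the calling code simple, at the cost of not telling the caller which case occurred.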