Working with Files and Directories in Python

Introduction

Working with files and directories is a common task when developing in Python. Let's look at several useful tools and methods for working with files and directories.

If you work with files, you might also find my tutorial Working with Binary Data in Python useful.

Working with files

The Python standard library includes several packages that help with working with files, such as os and shutil. There are also several built-in functions. We'll look at several tasks below.

Does a file exist

The function os.path.exists() lets you check if a path exists. It accepts a string or, in Python 3.6+, a Path object.

import os

print(os.path.exists('test.txt'))
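
Since Path objects are accepted too, here is a minimal sketch of the same check using pathlib. It works inside a temporary directory so it does not depend on test.txt being present; the file names present.txt and absent.txt are placeholders chosen for illustration.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    existing = Path(tmp_dir) / 'present.txt'  # hypothetical file name
    missing = Path(tmp_dir) / 'absent.txt'    # never created
    existing.write_text('hello')

    # os.path.exists() accepts a Path object as well as a string
    exists_via_os = os.path.exists(existing)
    # Path objects also have their own exists() method
    exists_via_path = existing.exists()
    missing_exists = os.path.exists(missing)

print(exists_via_os, exists_via_path, missing_exists)
```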

Get file size

The function os.stat() can be used to get information about a file, including its size in bytes. It returns an os.stat_result object.

import os

stats = os.stat('test.txt')
print(stats.st_size)

Truncate a file

The function os.truncate() will let you truncate a file to an arbitrary length:

import os

# Truncate file to be empty
os.truncate('test.txt', 0)
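
Truncating to a non-zero length keeps the first part of the file. The following is a small sketch of that, using a temporary file (the name sample.txt is made up for this example) so it can run anywhere.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sample.txt')  # hypothetical file name
    with open(path, 'w') as sample_file:
        sample_file.write('Hello, world!')

    # Keep only the first 5 bytes of the file
    os.truncate(path, 5)

    with open(path) as sample_file:
        remaining = sample_file.read()

print(remaining)  # Hello
```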

Get file permissions

The function os.stat() can be used to get information about a file, including its permissions. It returns an os.stat_result object.

import os

stats = os.stat('test.txt')
print(stats.st_mode)
print(oct(stats.st_mode))  # Octal output, e.g. 0o100644 (file type bits plus permission bits)
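
The raw st_mode value also encodes the file type, so it is not just the familiar three octal digits. As a sketch, the stat module's S_IMODE() and filemode() helpers can isolate the permission bits and render them like ls -l does. This assumes a POSIX system; the file name sample.txt is a placeholder, and the mode is set explicitly first so the result does not depend on the umask.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sample.txt')  # hypothetical file name
    open(path, 'w').close()
    os.chmod(path, 0o644)  # Known starting permissions for the demo

    mode = os.stat(path).st_mode
    # Strip the file type bits, leaving only the permission bits
    permission_bits = stat.S_IMODE(mode)
    # Human-readable form, similar to the first column of `ls -l`
    mode_string = stat.filemode(mode)

print(oct(permission_bits))
print(mode_string)
```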

Set file permissions (chmod)

To change file permissions, you can use os.chmod(). You can bitwise OR the following options to set the permissions the way you want. These values come from the stat package: Python stat package documentation.

# import stat
stat.S_IRUSR # Read, user  
stat.S_IWUSR # Write, user
stat.S_IXUSR # Execute, user

stat.S_IRGRP # Read, group
stat.S_IWGRP # Write, group
stat.S_IXGRP # Execute, group

stat.S_IROTH # Read, other
stat.S_IWOTH # Write, other
stat.S_IXOTH # Execute, other

stat.S_IRWXU # Read, write, and execute for user
stat.S_IRWXG # Read, write and execute for group
stat.S_IRWXO # Read, write, and execute for other

For example, to set all permissions for user only, you would bitwise OR all the user permissions like this: S_IRUSR|S_IWUSR|S_IXUSR. The example below shows how to grant all permissions, the equivalent of chmod 777.

import os
import stat

# Equivalent to `chmod 777`
os.chmod('test.txt', stat.S_IRWXU|stat.S_IRWXG|stat.S_IRWXO)
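
The same constants can also be used to add a single permission to whatever a file already has. Here is a rough sketch, assuming a POSIX system, that starts from the equivalent of chmod 644 and then adds execute for the user, similar to chmod u+x. The file name script.sh is hypothetical.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'script.sh')  # hypothetical file name
    open(path, 'w').close()

    # Equivalent to `chmod 644`: rw for user, r for group and other
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH)
    current_mode = stat.S_IMODE(os.stat(path).st_mode)

    # Add execute for user on top of the existing bits, like `chmod u+x`
    os.chmod(path, current_mode | stat.S_IXUSR)
    new_mode = stat.S_IMODE(os.stat(path).st_mode)

print(oct(current_mode), oct(new_mode))
```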

Change ownership (chown)

There are two functions to change ownership: shutil.chown() and os.chown(). The one from shutil is built on top of the os version so we'll focus on that one.

from shutil import chown

# You can use username or uid
# Must provide at least one of user/group
chown('test.txt', user='nanodano')
chown('test.txt', group='sudo')
chown('test.txt', user='root', group='root')
chown('test.txt', user=0, group=0)  # root uids

Get file timestamp

The function os.stat() can be used to get information about a file, including its timestamps. It returns an os.stat_result object.

The times returned are in seconds since the epoch:

st_atime - last access time
st_mtime - last modify time
st_ctime - create time (or last metadata change in *nix)

import os

stats = os.stat('test.txt')
print(stats.st_atime)
print(stats.st_mtime)
print(stats.st_ctime)
print(type(stats.st_mtime))  # <class 'float'>

# If you want more readable/usable format use datetime package
import datetime
date_object = datetime.datetime.fromtimestamp(stats.st_ctime)
print(date_object)
print(date_object.strftime('%Y-%m-%d-%H:%M'))

Set file timestamp

The function os.utime() lets you update the timestamp of a file.

import os
import datetime

# To simply update the timestamps to the current time:
os.utime('test.txt')

# To specify the timestamp to set, in seconds as tuple:
# (access_time, modify_time)
time_in_seconds = datetime.datetime.now().timestamp()
os.utime('test.txt', times=(time_in_seconds, time_in_seconds))
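
To set a timestamp other than the current time, any datetime can be converted to seconds first. The sketch below sets both times to an arbitrary date (noon on 2020-01-01, local time, picked only for illustration) on a temporary file and reads the modify time back.

```python
import datetime
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sample.txt')  # hypothetical file name
    open(path, 'w').close()

    # Arbitrary example date, interpreted as local time
    target = datetime.datetime(2020, 1, 1, 12, 0, 0).timestamp()
    os.utime(path, times=(target, target))

    # Read the modify time back and convert it to a datetime
    stats = os.stat(path)
    restored = datetime.datetime.fromtimestamp(stats.st_mtime)

print(restored)
```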

Create a file

To create a new file, you simply need to open it for writing, and it will be created automatically if it does not exist. To check whether a file exists first, use os.path.exists(). When opening, you need at least the w flag to specify that the file is for writing; otherwise open() defaults to reading and will not create a new file. Note that w mode empties the file if it already exists.

my_file = open('new_file.txt', 'w')
my_file.close()
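
If the goal is to create a file only when it does not already exist, one option is the x (exclusive creation) mode of open(), which raises FileExistsError instead of overwriting. A minimal sketch, using a temporary directory so it leaves nothing behind:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'new_file.txt')

    # First attempt succeeds because the file does not exist yet
    with open(path, 'x') as my_file:
        my_file.write('first version\n')

    # Second attempt fails instead of overwriting the existing file
    try:
        with open(path, 'x') as my_file:
            my_file.write('second version\n')
        second_attempt = 'created'
    except FileExistsError:
        second_attempt = 'already exists'

    with open(path) as my_file:
        contents = my_file.read()

print(second_attempt)
print(contents)
```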

Delete a file

The function os.remove() will let you delete a file:

import os

os.remove('test.txt')

Copy a file

You can always copy a file by opening the first one, reading it byte-by-byte, and writing the contents to a new file, but there is a more convenient way with shutil.copy().

import shutil

shutil.copy('test.txt', 'test_duplicate.txt')
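
shutil also provides copy2(), which behaves like copy() but additionally tries to preserve file metadata such as timestamps. The sketch below compares the two on a temporary file; the file names and the old timestamp value are made up for the demonstration.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    source = os.path.join(tmp_dir, 'source.txt')  # hypothetical file name
    with open(source, 'w') as source_file:
        source_file.write('some data')

    # Give the source an old modify time so the difference is visible
    old_time = 1_000_000_000
    os.utime(source, times=(old_time, old_time))

    # Both functions return the path of the newly created file
    plain_copy = shutil.copy(source, os.path.join(tmp_dir, 'plain_copy.txt'))
    meta_copy = shutil.copy2(source, os.path.join(tmp_dir, 'meta_copy.txt'))

    plain_mtime = os.stat(plain_copy).st_mtime  # Time of the copy
    meta_mtime = os.stat(meta_copy).st_mtime    # Carried over from source

    with open(meta_copy) as copied_file:
        copied_contents = copied_file.read()

print(plain_mtime, meta_mtime)
```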

Move a file

You have a couple of options for moving or renaming a file, including os.rename() and shutil.move().

Let's look at shutil.move():

import shutil

shutil.move('test.txt', 'data.txt')

Get file extension

Use os.path.splitext() to split a file path apart and grab the file extension. The return value is a tuple containing the path without the extension, followed by the extension. The file extension includes the leading dot (.).

import os

print(os.path.splitext('test.txt'))
# Output: ('test', '.txt')
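
A few edge cases are worth knowing. As this short sketch shows, only the last extension is split off, a name without a dot yields an empty extension, and a leading dot is not treated as an extension. The suffixes attribute from pathlib is included as one way to get every extension; the file names are arbitrary examples.

```python
import os
from pathlib import Path

# Only the final extension is separated
double_extension = os.path.splitext('archive.tar.gz')
# No dot in the name means an empty extension
no_extension = os.path.splitext('README')
# A leading dot (hidden file) is not an extension
hidden_file = os.path.splitext('.bashrc')
# pathlib can list all suffixes of a name
all_suffixes = Path('archive.tar.gz').suffixes

print(double_extension)  # ('archive.tar', '.gz')
print(no_extension)      # ('README', '')
print(hidden_file)       # ('.bashrc', '')
print(all_suffixes)      # ['.tar', '.gz']
```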

Write to a text file

To write a file, open it with w mode. You can use the with statement to automatically handle closing the file. Use write() to output data to the file.

with open('test.txt', 'w') as my_file:
    my_file.write('Hello, world!\n')

Append to a binary file

To open a file for append, include the a flag instead of w, but it will still create the file if it does not exist. To open a file in binary mode instead of text, include the b flag. Use write() just like you would with text, but you can include raw bytes.

with open('data.dat', 'ab') as my_binary_file:
    my_binary_file.write(b'\x00\x00\xFF\xFF')
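
To illustrate the difference from w mode, this sketch writes two bytes, appends two more, and reads the result back. It uses a temporary file rather than data.dat so repeated runs give the same result.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.dat')

    # 'wb' starts the file fresh
    with open(path, 'wb') as my_binary_file:
        my_binary_file.write(b'\x01\x02')

    # 'ab' keeps the existing bytes and writes after them
    with open(path, 'ab') as my_binary_file:
        my_binary_file.write(b'\x03\x04')

    with open(path, 'rb') as my_binary_file:
        contents = my_binary_file.read()

print(contents)  # b'\x01\x02\x03\x04'
```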

Read entire file contents into memory

To read the entire file contents into memory at once, you can simply call read() with no parameters. This applies to text and binary files.

Here is an example with a text file where you get a str object:

with open('test.txt') as my_file:
    data = my_file.read()
    print(type(data))  # <class 'str'>
    print(len(data))
    print(data)

Here is an example with a binary file where you get a bytes object:

with open('test.txt', 'rb') as my_binary_file:
    data = my_binary_file.read()
    print(type(data))  # <class 'bytes'>
    print(len(data))
    print(data)

Read all lines from a text file

To open a text file for reading, you don't need to provide any flags when opening since those are the default settings. File objects that you have open for reading have a readlines() method to easily read all of the lines into a list.

with open('test.txt') as my_file:
    lines = my_file.readlines()
    print(len(lines))
    print(type(lines))  # <class 'list'>
    for line in lines:
        print(line.strip())
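
For large files, loading every line into a list may use more memory than necessary. As an alternative sketch, a file object can be iterated directly, which yields one line at a time. The temporary file and its three lines are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'lines.txt')  # hypothetical file name
    with open(path, 'w') as my_file:
        my_file.write('first\nsecond\nthird\n')

    numbered_lines = []
    with open(path) as my_file:
        # Iterating the file object reads one line at a time
        for line_number, line in enumerate(my_file, start=1):
            numbered_lines.append((line_number, line.strip()))

print(numbered_lines)  # [(1, 'first'), (2, 'second'), (3, 'third')]
```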

Read specific number of bytes

To read a specific number of bytes from a file, you just call read() with a parameter that specifies how many bytes to read. After you read, it advances the file cursor forward that many positions as well, so the next read will continue moving forward in the file.

with open('test.txt', 'rb') as my_binary_file:
    one_byte = my_binary_file.read(1)  # Read only a single byte
    print(one_byte)
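
Because each read() continues where the last one stopped, a file can be processed in fixed-size chunks with a simple loop. Here is a sketch of that pattern; the chunk size of 4 and the ten-byte sample file are arbitrary choices for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sample.bin')  # hypothetical file name
    with open(path, 'wb') as my_binary_file:
        my_binary_file.write(b'0123456789')

    chunk_size = 4
    chunks = []
    with open(path, 'rb') as my_binary_file:
        while True:
            chunk = my_binary_file.read(chunk_size)
            if not chunk:  # Empty bytes object means end of file
                break
            chunks.append(chunk)

print(chunks)  # [b'0123', b'4567', b'89']
```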

Seek a position in a file

You may want to jump to the beginning of a file, somewhere in the middle, or directly to the end. With seek() you can jump anywhere you want. If you provide only a single argument, it will take you directly to that byte position. If you specify a second argument, you can tell seek to start relative to the beginning, your current cursor position, or the end of the file.

Options:

os.SEEK_SET - Set position from beginning
os.SEEK_CUR - Seek relative to current cursor position
os.SEEK_END - Seek relative to end of file

Example usage:

import os

with open('test.txt', 'rb') as my_file:
    # Jump to beginning of file
    my_file.seek(0)
    # Equivalent, jump to beginning
    my_file.seek(0, os.SEEK_SET)

    # Read 2 bytes, moving cursor forward
    print(my_file.read(2))
    # Move 2 bytes backwards from current cursor (back to beginning)
    my_file.seek(-2, os.SEEK_CUR)
    # Re-read the same two bytes
    print(my_file.read(2))

    # Go to the very end of the file
    my_file.seek(0, os.SEEK_END)
    # Move back two bytes from current position (end of file)
    my_file.seek(-2, os.SEEK_CUR)
    # Read last two bytes of the file
    print(my_file.read(2))
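
To find out where the cursor currently is, file objects also have a tell() method, and seek() itself returns the new position. The following sketch shows both on a ten-byte temporary file created just for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sample.bin')  # hypothetical file name
    with open(path, 'wb') as my_file:
        my_file.write(b'0123456789')

    with open(path, 'rb') as my_file:
        start = my_file.tell()       # Cursor starts at 0
        my_file.read(3)
        after_read = my_file.tell()  # Reading 3 bytes moves it to 3

        # seek() returns the new absolute position,
        # so seeking to the end gives the file size
        file_size = my_file.seek(0, os.SEEK_END)
        at_end = my_file.tell()

print(start, after_read, file_size, at_end)  # 0 3 10 10
```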

Is a file a symlink?

Use os.path.islink() to find out if a path is a symbolic link. This always fetches the latest information, whereas the is_symlink() method of os.DirEntry objects (returned by os.scandir()) may return cached results.

import os

print(os.path.islink('test.txt'))

Create a symlink

Use os.symlink() to create a symbolic link.

import os

os.symlink('test.txt', 'symlink_to_test.txt')
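
Once a symlink exists, os.readlink() reports what it points to. The sketch below creates a link in a temporary directory and inspects it. It assumes a POSIX system (on Windows, creating symlinks may require extra privileges), and the file names are placeholders.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, 'target.txt')        # hypothetical names
    link = os.path.join(tmp_dir, 'link_to_target.txt')
    open(target, 'w').close()

    # First argument is the existing target, second is the new link
    os.symlink(target, link)

    is_link = os.path.islink(link)
    target_is_link = os.path.islink(target)
    # Path stored in the symlink itself
    points_to = os.readlink(link)
    # True when both paths resolve to the same underlying file
    same_file = os.path.samefile(target, link)

print(is_link, target_is_link, same_file)
```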

Check hard file links

The function os.stat() can be used to get information about a file, including info about its hard links. It returns an os.stat_result object.

import os

stats = os.stat('test.txt')

# Number of hard links (Should always be at least 1, itself)
print(stats.st_nlink)

Create a hard file link

To create a hard link, use os.link():

import os

os.link('test.txt', 'hard_linked_test.txt')
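
Creating a hard link increases the link count reported by os.stat(). As a quick sketch, assuming a filesystem that supports hard links, the count can be checked before and after; the file names here are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    original = os.path.join(tmp_dir, 'original.txt')     # hypothetical names
    hard_link = os.path.join(tmp_dir, 'hard_linked.txt')
    open(original, 'w').close()

    links_before = os.stat(original).st_nlink
    os.link(original, hard_link)
    links_after = os.stat(original).st_nlink

    # Both names refer to the same underlying file
    same_file = os.path.samefile(original, hard_link)

print(links_before, links_after, same_file)  # 1 2 True
```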

Start a file with default application

The function os.startfile() can be used (only in Windows) to launch a file using its default application. It is like double clicking the file from the Windows Explorer.

import os

os.startfile('test.txt')

Get absolute path of a file

If you only have a filename but you need the full absolute path of the file, you can use os.path.abspath().

import os

print(os.path.abspath('test.txt'))

Get name of current running Python script

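The dunder __file__ contains the path of the currently running Python script. Depending on the Python version and how the script was launched, it may be a relative path rather than the full absolute path (since Python 3.9, the main script's __file__ is absolute). To always get the absolute path, use os.path.abspath().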

import os

print(__file__)
print(os.path.abspath(__file__))

Get directory name for a file

Use os.path.dirname() to get just the directory path to a file without including the file itself.

import os

# Get the directory name where the current running Python file resides
print(os.path.dirname(os.path.abspath(__file__)))

Working with directories

While files and directories have a lot in common, and are ultimately both treated as files, there are some operations you want to perform on a directory that you don't necessarily want to do on files. Let's look at some common tasks with directories.

Print current working directory

Use os.getcwd() to get the directory you are currently in, just like pwd.

import os

print(os.getcwd())

Change working directory

Use os.chdir() to change directories just like using cd. You can use .. and relative or absolute paths.

import os

os.chdir('/')

Join directory paths

You can use os.path.join() to join directories using the system-specific directory separator. This is better than manually combining directories with plus signs and slashes yourself.

import os

# Generate a relative path using system-specific directory separators
print(os.path.join('path', 'to', 'myfile.txt'))
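
If you prefer pathlib, Path objects support the / operator for joining path components. This small sketch shows that the result matches os.path.join() for the same components.

```python
import os
from pathlib import Path

joined_with_os = os.path.join('path', 'to', 'myfile.txt')

# The / operator joins components using the system-specific separator
joined_with_pathlib = Path('path') / 'to' / 'myfile.txt'

print(joined_with_os)
print(joined_with_pathlib)
```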

Normalize a path

If you have an ugly path that has redundant slashes or includes relative operators, you can clean it up using the os.path.normpath() function.

import os

ugly_path = '/home//nanodano/../nanodano//test.txt'
pretty_path = os.path.normpath(ugly_path)
print(pretty_path)  # /home/nanodano/test.txt

Make directories

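There are a few options for creating directories:

os.mkdir() - Make a single directory
os.makedirs() - Create a directory including parent directories
pathlib.Path.mkdir() - Create a directory, optionally including parent directories

Let's look at each one: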

import os

# Make a single directory
os.mkdir('new_directory')

import os

# Create a directory including parent directories
os.makedirs('path/to/new/dirs')

from pathlib import Path

# Create a directory including parent directories
target_path = Path('/path/to/new/dirs')
target_path.mkdir(parents=True, exist_ok=True)
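
Like Path.mkdir(), os.makedirs() accepts an exist_ok argument. Without it, creating a directory that already exists raises FileExistsError. A minimal sketch of that behavior, run inside a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    nested = os.path.join(tmp_dir, 'path', 'to', 'new', 'dirs')
    os.makedirs(nested)

    # Calling it again without exist_ok raises an error
    try:
        os.makedirs(nested)
        second_call = 'no error'
    except FileExistsError:
        second_call = 'FileExistsError'

    # With exist_ok=True an existing directory is silently accepted
    os.makedirs(nested, exist_ok=True)
    created = os.path.isdir(nested)

print(second_call, created)
```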

Is a directory a mount point?

Use os.path.ismount() to see if a directory is a mount point. For example, in Windows Subsystem for Linux (WSL), /mnt/c is the default mount point to your Windows drive.

import os

print(os.path.ismount('/mnt/c'))

List contents of a directory

Use os.listdir() to get the contents of a directory, excluding . and ..:

import os

dir_contents = os.listdir(os.getcwd())
print(type(dir_contents))  # <class 'list'>
print(dir_contents)
for entry in dir_contents:
    print(entry)

Check if path is file or directory

Building on the previous example that lists directory contents, we can inspect an entry to see whether it is a file or a directory using os.path.isfile() and os.path.isdir().

import os

print(os.path.isdir('test.txt'))
print(os.path.isfile('test.txt'))
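
Putting listing and checking together, here is a sketch that sorts the entries of a directory into files and directories. Note that os.listdir() returns bare names, so each one is joined with the directory path before checking it. The directory contents are created just for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical contents: one sub-directory and one file
    os.mkdir(os.path.join(tmp_dir, 'sub_dir'))
    open(os.path.join(tmp_dir, 'a_file.txt'), 'w').close()

    files = []
    directories = []
    for entry in os.listdir(tmp_dir):
        # Build the full path; the bare name alone would be checked
        # relative to the current working directory
        full_path = os.path.join(tmp_dir, entry)
        if os.path.isdir(full_path):
            directories.append(entry)
        elif os.path.isfile(full_path):
            files.append(entry)

print(files)
print(directories)
```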

Walk a directory

You can use os.walk() to recursively go through directories.

Here is a very basic example of how to use it:

import os

# Recursively go through all directories
for (root_path, directories, files) in os.walk(os.getcwd()):
    print(root_path)
    print(directories)
    print(files)
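
As a slightly more practical sketch, the walk can be combined with os.path.join() to collect the paths of all files with a given extension. The directory layout and file names below are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical tree: top.txt, a/mid.txt, a/b/deep.log
    os.makedirs(os.path.join(tmp_dir, 'a', 'b'))
    for relative in ('top.txt',
                     os.path.join('a', 'mid.txt'),
                     os.path.join('a', 'b', 'deep.log')):
        open(os.path.join(tmp_dir, relative), 'w').close()

    text_files = []
    for root_path, directories, files in os.walk(tmp_dir):
        for file_name in files:
            if file_name.endswith('.txt'):
                # root_path is the directory currently being visited
                full_path = os.path.join(root_path, file_name)
                text_files.append(os.path.relpath(full_path, tmp_dir))
    text_files.sort()

print(text_files)
```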

For more detailed examples, see my tutorial Walk a Directory in Python.

Copy directory tree

Use shutil.copytree() to recursively copy a directory and its contents.

import shutil

shutil.copytree('original_dir', 'copied_dir')
# In Python 3.8+, the `dirs_exist_ok` option is available
# shutil.copytree('original_dir', 'copied_dir', dirs_exist_ok=True)

Delete a directory

Use os.rmdir() to remove a single empty directory. It raises an OSError if the directory is not empty.

import os

# Remove a single directory
os.rmdir('my_useless_dir')

Delete directory tree

To remove an entire directory tree, including sub-directories, you can use shutil.rmtree().

import shutil

# Recursively delete the directory and anything inside
shutil.rmtree('my_useless_dir')
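
The difference between the two functions shows up when the directory has contents. In this sketch, os.rmdir() fails on a non-empty directory while shutil.rmtree() removes it. Everything happens inside a temporary directory, and the file name is a placeholder.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, 'my_useless_dir')
    os.mkdir(target)
    open(os.path.join(target, 'leftover.txt'), 'w').close()  # hypothetical file

    # os.rmdir() only works on empty directories
    try:
        os.rmdir(target)
        rmdir_result = 'removed'
    except OSError:
        rmdir_result = 'failed'

    # shutil.rmtree() deletes the directory and everything inside
    shutil.rmtree(target)
    still_exists = os.path.exists(target)

print(rmdir_result, still_exists)  # failed False
```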

Get user home directory

The function os.path.expanduser() will let you use the special character ~ to represent the user's home directory.

import os

print(os.path.expanduser('~/Downloads'))

Conclusion

After reading this guide you should have a good understanding of how to do many common tasks with files and directories in Python.
