
hydro_s3

The hydro_s3 module provides utilities for interacting with S3-compatible storage services, supporting both MinIO and AWS S3.

MinIO Functions

minio_upload_file

def minio_upload_file(client: Minio, bucket_name: str, object_name: str, file_path: str) -> list

Uploads a file to MinIO S3-compatible storage.

Example:

from minio import Minio
from hydroutils.hydro_s3 import minio_upload_file

client = Minio('play.min.io', access_key='...', secret_key='...')
objects = minio_upload_file(client, 'mybucket', 'data.csv', './data.csv')
print(f"Bucket contents: {objects}")

minio_download_file

def minio_download_file(client: Minio, bucket_name: str, object_name: str, file_path: str, version_id: str = None) -> None

Downloads a file from MinIO S3-compatible storage.

Example:

from minio import Minio
from hydroutils.hydro_s3 import minio_download_file

client = Minio('play.min.io', access_key='...', secret_key='...')
minio_download_file(client, 'mybucket', 'data.csv', './downloaded.csv')

AWS S3 Functions

boto3_upload_file

def boto3_upload_file(client, bucket_name: str, object_name: str, file_path: str) -> list

Uploads a file to AWS S3 using boto3.

Example:

import boto3
from hydroutils.hydro_s3 import boto3_upload_file

client = boto3.client('s3')
objects = boto3_upload_file(client, 'mybucket', 'data.csv', './data.csv')
print(f"Bucket contents: {objects}")

boto3_download_file

def boto3_download_file(client, bucket_name: str, object_name: str, file_path: str) -> None

Downloads a file from AWS S3 using boto3.

Example:

import boto3
from hydroutils.hydro_s3 import boto3_download_file

client = boto3.client('s3')
boto3_download_file(client, 'mybucket', 'data.csv', './downloaded.csv')

Common Features

  • Automatic bucket creation if not exists
  • UTF-8 text file handling
  • Version control support (MinIO)
  • List bucket contents after upload
  • Simple and consistent API for both services (see the sketch below)
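A minimal round-trip sketch of that consistent API, assuming a local MinIO deployment at localhost:9000 with placeholder credentials and file names (none of these values come from the module itself):

from minio import Minio
import boto3

from hydroutils.hydro_s3 import (
    minio_upload_file,
    minio_download_file,
    boto3_upload_file,
    boto3_download_file,
)

# Placeholder endpoint and credentials -- replace with your own deployment's values.
ENDPOINT = "localhost:9000"
ACCESS_KEY = "access_key"
SECRET_KEY = "secret_key"

# Round trip through MinIO's native client
minio_client = Minio(ENDPOINT, access_key=ACCESS_KEY, secret_key=SECRET_KEY, secure=False)
minio_upload_file(minio_client, "mybucket", "data.csv", "./data.csv")
minio_download_file(minio_client, "mybucket", "data.csv", "./from_minio.csv")

# The same round trip through boto3 pointed at the same endpoint
boto3_client = boto3.client(
    "s3",
    endpoint_url=f"http://{ENDPOINT}",
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
)
boto3_upload_file(boto3_client, "mybucket", "data.csv", "./data.csv")
boto3_download_file(boto3_client, "mybucket", "data.csv", "./from_boto3.csv")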

API Reference

Author: Wenyu Ouyang
Date: 2023-10-27 15:08:16
LastEditTime: 2023-10-27 15:31:13
LastEditors: Wenyu Ouyang
Description: Functions for working with the S3 file system
FilePath: /hydroutils/hydroutils/hydro_s3.py
Copyright (c) 2023-2024 Wenyu Ouyang. All rights reserved.

boto3_download_file(client, bucket_name, object_name, file_path)

Download a file from S3 using boto3.

This function downloads an object from S3 storage to a local file using the boto3 client. It provides a simple wrapper around boto3's download_file method.

Parameters:

Name         Type    Description                                   Default
client       client  Initialized boto3 S3 client instance.        required
bucket_name  str     Name of the bucket containing the object.    required
object_name  str     Name of the object to download.               required
file_path    str     Local path where the file should be saved.    required
Example

import boto3
client = boto3.client('s3',
                      endpoint_url='http://localhost:9000',
                      aws_access_key_id='access_key',
                      aws_secret_access_key='secret_key')
boto3_download_file(client,
                    'mybucket',
                    'data/file.csv',
                    '/local/path/file.csv')
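Because the function is a thin wrapper, any boto3 error propagates to the caller; for instance, requesting an object that does not exist raises botocore's ClientError. A small sketch of how calling code might guard against that, reusing the placeholder client above (the object key is illustrative):

from botocore.exceptions import ClientError

try:
    boto3_download_file(client, 'mybucket', 'data/missing.csv', '/local/path/missing.csv')
except ClientError as err:
    # boto3 reports a missing key as a client error (HTTP 404)
    print(f"Download failed: {err}")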

Source code in hydroutils/hydro_s3.py
def boto3_download_file(client, bucket_name, object_name, file_path: str):
    """Download a file from S3 using boto3.

    This function downloads an object from S3 storage to a local file using
    the boto3 client. It provides a simple wrapper around boto3's download_file
    method.

    Args:
        client (boto3.client): Initialized boto3 S3 client instance.
        bucket_name (str): Name of the bucket containing the object.
        object_name (str): Name of the object to download.
        file_path (str): Local path where the file should be saved.

    Example:
        >>> import boto3
        >>> client = boto3.client('s3',
        ...                      endpoint_url='http://localhost:9000',
        ...                      aws_access_key_id='access_key',
        ...                      aws_secret_access_key='secret_key')
        >>> boto3_download_file(client,
        ...                    'mybucket',
        ...                    'data/file.csv',
        ...                    '/local/path/file.csv')
    """
    client.download_file(bucket_name, object_name, file_path)

boto3_upload_file(client, bucket_name, object_name, file_path)

Upload a file to S3 using boto3.

This function uploads a local file to S3 storage using the boto3 client. If the specified bucket doesn't exist, it will be created automatically. After upload, it returns a list of all objects in the bucket.

Parameters:

Name         Type    Description                               Default
client       client  Initialized boto3 S3 client instance.    required
bucket_name  str     Name of the bucket to upload to.           required
object_name  str     Name to give the object in S3 storage.     required
file_path    str     Path to the local file to upload.          required

Returns:

Type         Description
list[str]    List of all object keys in the bucket after upload.

Note
  • Creates bucket if it doesn't exist
  • Uses upload_file for efficient file upload
  • Lists all objects in bucket after upload
  • Handles bucket listing and creation using boto3's API
Example

import boto3
client = boto3.client('s3',
                      endpoint_url='http://localhost:9000',
                      aws_access_key_id='access_key',
                      aws_secret_access_key='secret_key')
objects = boto3_upload_file(client,
                            'mybucket',
                            'data/file.csv',
                            '/local/path/file.csv')
print(objects)
# ['data/file.csv', 'data/other.csv']
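Since the return value is the full key listing of the bucket, a caller can use it as a lightweight post-upload check; the object key below is just an illustrative placeholder:

objects = boto3_upload_file(client, 'mybucket', 'results/run1.csv', './run1.csv')
if 'results/run1.csv' not in objects:
    raise RuntimeError("uploaded object not found in bucket listing")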

Source code in hydroutils/hydro_s3.py
def boto3_upload_file(client, bucket_name, object_name, file_path):
    """Upload a file to S3 using boto3.

    This function uploads a local file to S3 storage using the boto3 client.
    If the specified bucket doesn't exist, it will be created automatically.
    After upload, it returns a list of all objects in the bucket.

    Args:
        client (boto3.client): Initialized boto3 S3 client instance.
        bucket_name (str): Name of the bucket to upload to.
        object_name (str): Name to give the object in S3 storage.
        file_path (str): Path to the local file to upload.

    Returns:
        list[str]: List of all object keys in the bucket after upload.

    Note:
        - Creates bucket if it doesn't exist
        - Uses upload_file for efficient file upload
        - Lists all objects in bucket after upload
        - Handles bucket listing and creation using boto3's API

    Example:
        >>> import boto3
        >>> client = boto3.client('s3',
        ...                      endpoint_url='http://localhost:9000',
        ...                      aws_access_key_id='access_key',
        ...                      aws_secret_access_key='secret_key')
        >>> objects = boto3_upload_file(client,
        ...                            'mybucket',
        ...                            'data/file.csv',
        ...                            '/local/path/file.csv')
        >>> print(objects)
        ['data/file.csv', 'data/other.csv']
    """
    # Make a bucket
    bucket_names = [dic["Name"] for dic in client.list_buckets()["Buckets"]]
    if bucket_name not in bucket_names:
        client.create_bucket(Bucket=bucket_name)
    # Upload an object
    client.upload_file(file_path, bucket_name, object_name)
    return [dic["Key"] for dic in client.list_objects(Bucket=bucket_name)["Contents"]]

minio_download_file(client, bucket_name, object_name, file_path, version_id=None)

Download a file from MinIO S3-compatible storage.

This function downloads an object from MinIO storage to a local file. It supports versioned objects and handles UTF-8 encoded text files. The function ensures proper cleanup of resources after download.

Parameters:

Name         Type    Description                                            Default
client       Minio   Initialized MinIO client instance.                     required
bucket_name  str     Name of the bucket containing the object.              required
object_name  str     Name of the object to download.                        required
file_path    str     Local path where the file should be saved.             required
version_id   str     Version ID for versioned objects. Defaults to None.    None
Note
  • Assumes UTF-8 encoding for text files
  • Properly closes and releases connection after download
  • Uses context managers for file handling
  • Handles cleanup in finally block for robustness
Example

client = Minio('play.min.io',
               access_key='access_key',
               secret_key='secret_key')
minio_download_file(client,
                    'mybucket',
                    'data/file.csv',
                    '/local/path/file.csv')
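Since the module advertises version control support for MinIO, a specific revision of an object can be requested through the version_id argument once bucket versioning is enabled. The version ID string below is a made-up placeholder:

minio_download_file(client,
                    'mybucket',
                    'data/file.csv',
                    '/local/path/file_v1.csv',
                    version_id='dfbd25b3-abec-4184-a4e8-5a35a5c1174d')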

Source code in hydroutils/hydro_s3.py
def minio_download_file(
    client: Minio, bucket_name, object_name, file_path: str, version_id=None
):
    """Download a file from MinIO S3-compatible storage.

    This function downloads an object from MinIO storage to a local file. It
    supports versioned objects and handles UTF-8 encoded text files. The function
    ensures proper cleanup of resources after download.

    Args:
        client (Minio): Initialized MinIO client instance.
        bucket_name (str): Name of the bucket containing the object.
        object_name (str): Name of the object to download.
        file_path (str): Local path where the file should be saved.
        version_id (str, optional): Version ID for versioned objects.
            Defaults to None.

    Note:
        - Assumes UTF-8 encoding for text files
        - Properly closes and releases connection after download
        - Uses context managers for file handling
        - Handles cleanup in finally block for robustness

    Example:
        >>> client = Minio('play.min.io',
        ...               access_key='access_key',
        ...               secret_key='secret_key')
        >>> minio_download_file(client,
        ...                    'mybucket',
        ...                    'data/file.csv',
        ...                    '/local/path/file.csv')
    """
    response = None
    try:
        # version_id must be passed by keyword; positionally it would be read as the byte offset
        response = client.get_object(bucket_name, object_name, version_id=version_id)
        res_csv: str = response.data.decode("utf8")
        with open(file_path, "w+", encoding="utf8") as fp:
            fp.write(res_csv)
    finally:
        # guard the cleanup so a failed get_object does not raise UnboundLocalError here
        if response is not None:
            response.close()
            response.release_conn()

minio_upload_file(client, bucket_name, object_name, file_path)

Upload a file to MinIO S3-compatible storage.

This function uploads a local file to MinIO storage. If the specified bucket doesn't exist, it will be created automatically. After upload, it returns a list of all objects in the bucket.

Parameters:

Name         Type    Description                                  Default
client       Minio   Initialized MinIO client instance.           required
bucket_name  str     Name of the bucket to upload to.              required
object_name  str     Name to give the object in MinIO storage.     required
file_path    str     Path to the local file to upload.             required

Returns:

Type         Description
list[str]    List of all object names in the bucket after upload.

Note
  • Creates bucket if it doesn't exist
  • Uses fput_object for efficient file upload
  • Lists all objects recursively after upload
Example

client = Minio('play.min.io',
               access_key='access_key',
               secret_key='secret_key')
objects = minio_upload_file(client,
                            'mybucket',
                            'data/file.csv',
                            '/local/path/file.csv')
print(objects)
# ['data/file.csv', 'data/other.csv']

Source code in hydroutils/hydro_s3.py
def minio_upload_file(client, bucket_name, object_name, file_path):
    """Upload a file to MinIO S3-compatible storage.

    This function uploads a local file to MinIO storage. If the specified bucket
    doesn't exist, it will be created automatically. After upload, it returns a
    list of all objects in the bucket.

    Args:
        client (minio.Minio): Initialized MinIO client instance.
        bucket_name (str): Name of the bucket to upload to.
        object_name (str): Name to give the object in MinIO storage.
        file_path (str): Path to the local file to upload.

    Returns:
        list[str]: List of all object names in the bucket after upload.

    Note:
        - Creates bucket if it doesn't exist
        - Uses fput_object for efficient file upload
        - Lists all objects recursively after upload

    Example:
        >>> client = Minio('play.min.io',
        ...               access_key='access_key',
        ...               secret_key='secret_key')
        >>> objects = minio_upload_file(client,
        ...                            'mybucket',
        ...                            'data/file.csv',
        ...                            '/local/path/file.csv')
        >>> print(objects)
        ['data/file.csv', 'data/other.csv']
    """
    # Make a bucket
    bucket_names = [bucket.name for bucket in client.list_buckets()]
    if bucket_name not in bucket_names:
        client.make_bucket(bucket_name)
    # Upload an object
    client.fput_object(bucket_name, object_name, file_path)
    # List objects
    objects = client.list_objects(bucket_name, recursive=True)
    return [obj.object_name for obj in objects]