boto3 usage

Installation

pip install boto3
pip install awscli
aws configure

When prompted, enter your access_key_id, secret_access_key, and region. By default, the credentials are stored in ~/.aws/credentials:

[default]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_KEY

The region is stored in ~/.aws/config:

[default]
region=us-east-1

Quick Start

import boto3

# Let's use Amazon S3
s3 = boto3.resource('s3')
# Print out bucket names
for bucket in s3.buckets.all():
    print(bucket.name)
# Upload a new file
with open('test.jpg', 'rb') as data:
    s3.Bucket('my-bucket').put_object(Key='test.jpg', Body=data)

Working with buckets

Create a bucket

import logging
import boto3
from botocore.exceptions import ClientError

def create_bucket(bucket_name):
    """Create an Amazon S3 bucket

    :param bucket_name: Unique string name
    :return: True if bucket is created, else False
    """
    s3 = boto3.client('s3')
    try:
        s3.create_bucket(Bucket=bucket_name)
    except ClientError as e:
        logging.error(e)
        return False
    return True
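A hypothetical usage sketch (the bucket name is a placeholder; S3 bucket names must be globally unique):

if create_bucket('my-unique-bucket-name'):
    print('Bucket created')
else:
    print('Bucket creation failed; see the log for details')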

List buckets

# Retrieve the list of existing buckets
s3 = boto3.client('s3')
response = s3.list_buckets()

# Output the bucket names
print('Existing buckets:')
for bucket in response['Buckets']:
    print(f'  {bucket["Name"]}')

Upload files

Basic Usage

S3 provides two ways to upload a file: upload_file() and upload_fileobj(). upload_file() splits a large file into chunks and uploads them in parallel (a multipart upload), so it transfers quickly and is well suited to uploading files on disk. upload_fileobj() uploads from a readable binary stream (a file-like object).
upload_file() example:

import logging
import boto3
from botocore.exceptions import ClientError


def upload_file(file_name, bucket, object_name=None):
    """Upload a file to an S3 bucket

    :param file_name: File to upload
    :param bucket: Bucket to upload to
    :param object_name: S3 object name. If not specified then file_name is used
    :return: True if file was uploaded, else False
    """

    # If S3 object_name was not specified, use file_name
    if object_name is None:
        object_name = file_name

    # Upload the file
    s3_client = boto3.client('s3')
    try:
        response = s3_client.upload_file(file_name, bucket, object_name)
    except ClientError as e:
        logging.error(e)
        return False
    return True

upload_fileobj() example:

s3 = boto3.client('s3')
with open("FILE_NAME", "rb") as f:
    s3.upload_fileobj(f, "BUCKET_NAME", "OBJECT_NAME")

The file passed to upload_fileobj() must be opened in binary mode ('rb').

The Client, Bucket, and Object classes each offer both upload_file() and upload_fileobj(). The same-named functions are equivalent across the three classes; none is better or worse, so you are free to call the upload function of whichever object is most convenient.
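As an illustration, a minimal sketch of the three equivalent call forms (all names are placeholders):

import boto3

# Client-level call
boto3.client('s3').upload_file('FILE_NAME', 'BUCKET_NAME', 'OBJECT_NAME')

s3 = boto3.resource('s3')
# Bucket-level call
s3.Bucket('BUCKET_NAME').upload_file('FILE_NAME', 'OBJECT_NAME')
# Object-level call
s3.Object('BUCKET_NAME', 'OBJECT_NAME').upload_file('FILE_NAME')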

ExtraArgs

ExtraArgs supplies additional parameters for an upload; these can control, for example, the uploaded object's read/write permissions (ACL) and its metadata. S3Transfer is an important object that defines a number of transfer parameters; boto3.s3.transfer.S3Transfer.ALLOWED_UPLOAD_ARGS defines the list of parameters available in ExtraArgs.

s3.upload_file(
    'FILE_NAME', 'BUCKET_NAME', 'OBJECT_NAME',
    ExtraArgs={'Metadata': {'mykey': 'myvalue'}}
)
s3.upload_file(
    'FILE_NAME', 'BUCKET_NAME', 'OBJECT_NAME',
    ExtraArgs={'ACL': 'public-read'}
)
s3.upload_file(
    'FILE_NAME', 'BUCKET_NAME', 'OBJECT_NAME',
    ExtraArgs={
        'GrantRead': 'uri="http://acs.amazonaws.com/groups/global/AllUsers"',
        'GrantFullControl': 'id="01234567890abcdefg"',
    }
)

Upload progress callback

Upload progress can be reported by passing a Callback: the callable is invoked as bytes are transferred, so progress can be printed while the upload runs.

s3.upload_file(
    'FILE_NAME', 'BUCKET_NAME', 'OBJECT_NAME',
    Callback=ProgressPercentage('FILE_NAME')
)

ProgressPercentage

import os
import sys
import threading

class ProgressPercentage(object):

    def __init__(self, filename):
        self._filename = filename
        self._size = float(os.path.getsize(filename))
        self._seen_so_far = 0
        self._lock = threading.Lock()

    def __call__(self, bytes_amount):
        # To simplify, assume this is hooked up to a single filename
        with self._lock:
            self._seen_so_far += bytes_amount
            percentage = (self._seen_so_far / self._size) * 100
            sys.stdout.write(
                "\r%s  %s / %s  (%.2f%%)" % (
                    self._filename, self._seen_so_far, self._size,
                    percentage))
            sys.stdout.flush()

Download files

Downloading is almost perfectly symmetrical with uploading: the Client, Bucket, and Object classes all provide download_file() and download_fileobj(). download_file() downloads in parallel, download_fileobj() writes to a file-like object, and both also accept ExtraArgs and Callback parameters. boto3.s3.transfer.S3Transfer.ALLOWED_DOWNLOAD_ARGS describes the ExtraArgs parameters available for downloads.

import boto3

s3 = boto3.client('s3')
s3.download_file('BUCKET_NAME', 'OBJECT_NAME', 'FILE_NAME')

with open('FILE_NAME', 'wb') as f:
    s3.download_fileobj('BUCKET_NAME', 'OBJECT_NAME', f)

Transfer configuration

When uploading, downloading, or copying files, the AWS SDK automatically manages retries and other network settings. The default configuration suits most situations; the transfer configuration only needs to be modified in special cases.
The transfer configuration is encapsulated in a boto3.s3.transfer.TransferConfig object, and upload_file() and related functions take a Config parameter that accepts a TransferConfig object.

Modify the multipart threshold

import boto3
from boto3.s3.transfer import TransferConfig

# Set the desired multipart threshold value (5GB)
GB = 1024 ** 3
config = TransferConfig(multipart_threshold=5*GB)

# Perform the transfer
s3 = boto3.client('s3')
s3.upload_file('FILE_NAME', 'BUCKET_NAME', 'OBJECT_NAME', Config=config)

Set the number of concurrent threads

upload_file() and download_file() use multi-threaded transfers by default. To reduce bandwidth usage (or to increase it), the concurrency can be controlled through the transfer configuration.

# To consume less downstream bandwidth, decrease the maximum concurrency
config = TransferConfig(max_concurrency=5)

# Download an S3 object
s3 = boto3.client('s3')
s3.download_file('BUCKET_NAME', 'OBJECT_NAME', 'FILE_NAME', Config=config)

Control how concurrency is implemented

In boto3, concurrency is achieved with multiple threads. If thread use is disabled, no concurrency is possible and the max_concurrency parameter is ignored.

# Disable thread use/transfer concurrency
config = TransferConfig(use_threads=False)

s3 = boto3.client('s3')
s3.download_file('BUCKET_NAME', 'OBJECT_NAME', 'FILE_NAME', Config=config)

Using a bucket as a static website

Get a bucket's static website configuration

import boto3

# Retrieve the website configuration
s3 = boto3.client('s3')
result = s3.get_bucket_website(Bucket='BUCKET_NAME')

Set a bucket's static website configuration

# Define the website configuration
website_configuration = {
    'ErrorDocument': {'Key': 'error.html'},
    'IndexDocument': {'Suffix': 'index.html'},
}

# Set the website configuration
s3 = boto3.client('s3')
s3.put_bucket_website(Bucket='BUCKET_NAME',
                      WebsiteConfiguration=website_configuration)

Delete a bucket's website configuration

# Delete the website configuration
s3 = boto3.client('s3')
s3.delete_bucket_website(Bucket='BUCKET_NAME')

Get a bucket's ACL

import boto3

# Retrieve a bucket's ACL
s3 = boto3.client('s3')
result = s3.get_bucket_acl(Bucket='my-bucket')
print(result)
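The response is a dictionary; a minimal sketch of reading the individual grants out of it:

# Each grant pairs a grantee with a permission such as 'FULL_CONTROL' or 'READ'
for grant in result['Grants']:
    print(grant['Grantee'], grant['Permission'])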

Presigned URLs

Share an Object

import logging
import boto3
from botocore.exceptions import ClientError


def create_presigned_url(bucket_name, object_name, expiration=3600):
    """Generate a presigned URL to share an S3 object

    :param bucket_name: string
    :param object_name: string
    :param expiration: Time in seconds for the presigned URL to remain valid
    :return: Presigned URL as string. If error, returns None.
    """

    # Generate a presigned URL for the S3 object
    s3_client = boto3.client('s3')
    try:
        response = s3_client.generate_presigned_url('get_object',
                                                    Params={'Bucket': bucket_name,
                                                            'Key': object_name},
                                                    ExpiresIn=expiration)
    except ClientError as e:
        logging.error(e)
        return None

    # The response contains the presigned URL
    return response

The URL can then be used directly in a GET request:

import requests    # To install: pip install requests

url = create_presigned_url('BUCKET_NAME', 'OBJECT_NAME')
if url is not None:
    response = requests.get(url)

Other tips

  • A presigned URL can be generated for other operations as well, which makes it possible, for example, to dynamically view a directory listing under a bucket.
  • A presigned URL can also be generated for uploading a file; see the sketch below.
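A minimal sketch of a presigned upload URL, assuming placeholder bucket and object names; the holder of the URL uploads with an ordinary HTTP PUT:

import boto3
import requests

s3_client = boto3.client('s3')
# Presign a put_object call; anyone holding the URL can upload until it expires
url = s3_client.generate_presigned_url(
    'put_object',
    Params={'Bucket': 'BUCKET_NAME', 'Key': 'OBJECT_NAME'},
    ExpiresIn=3600)

with open('FILE_NAME', 'rb') as f:
    response = requests.put(url, data=f)
print(response.status_code)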

boto3 design

  1. Interface levels
    boto3 provides two levels of interface to AWS services: the high-level Resource interface and the low-level Client interface.
    Client-level interfaces return the queried resource information as dictionaries. The Resource-level interface is an object-oriented wrapper around the Client level, and its return values are mostly Resource objects (whenever the return value is the information of a resource), on which further operations (delete, modify, and so on) can then be performed.

  2. Session
    A session is an abstraction over a set of configuration; all API calls made through the same session share that configuration. The session is also the entry point to all of the APIs.
    Resource-level and Client-level API objects are generally obtained through session-level calls:
    res = boto3.resource('service-name')
    client = boto3.client('service-name')
  3. Resource
    This is a concept within the Resource-level interface, divided into Service Resources and Individual Resources. A Service Resource represents a service, while an Individual Resource represents a resource within a service.

3.1 Resource identifier
A resource's unique identifier, generally an id or a URL. A Service Resource obviously needs no identifier, because it is identified by its name (for example: 'ec2').

3.2 Attribute
The properties of a Resource.

3.3 Action
The operations a Resource supports.

3.4 SubResource
A child Resource of a Resource; the Resource object can be obtained from its identifier.

3.5 Collection
A collection of a Resource's child Resources. Collections can be queried and filtered, and they are lazy: a network request to AWS is only issued when an operation is performed on the collection (iterating, converting to a list, batch processing). See the sketch after section 4.2.

  4. General usage

4.1 Resource level

service_res = boto3.resource('ec2')  # get the Service Resource
instances = service_res.create_instances(**kwargs)  # perform a Service Resource-level operation
instance = service_res.Instance(id)  # get a Sub Resource (Individual Resource) by its identifier

instances = service_res.instances.filter(Filters=filters)  # get a collection of Sub Resources via a filter

instance.start()  # perform an operation on the Sub Resource

4.2 Client level

ec2_client = boto3.client('ec2')  # get the client for the corresponding service
ec2_client.run_instances(**kwargs)  # perform the operation
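To make the collection behavior from section 3.5 concrete, here is a minimal sketch of its lazy evaluation, assuming an S3 bucket with a placeholder name:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('BUCKET_NAME')

# Building the filtered collection sends no request yet
logs = bucket.objects.filter(Prefix='logs/')

# The request to AWS is only issued when the collection is consumed
for obj in logs:
    print(obj.key)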

class boto3.session.Session(aws_access_key_id=None, aws_secret_access_key=None, aws_session_token=None, region_name=None, botocore_session=None, profile_name=None)

A session stores configuration state and allows you to create service clients and resources.
Parameters:

  • aws_access_key_id (string) - the AWS access key ID
  • aws_secret_access_key (string) - the AWS secret access key
  • aws_session_token (string) - the AWS temporary session token
  • region_name (string) - the default region when creating new connections
  • botocore_session (botocore.session.Session) - use this botocore session instead of creating a new default one
  • profile_name (string) - the name of the profile to use; if not given, the default profile is used

Create a connection:

import boto3
s3 = boto3.resource('s3')

Create a bucket
In boto3, all action parameters must be passed as keyword arguments, and a bucket's configuration (such as its location) must be set explicitly:

s3.create_bucket(Bucket='mybucket')
s3.create_bucket(Bucket='mybucket',CreateBucketConfiguration={'LocationConstraint': 'us-west-1'})

Access a bucket
Getting a bucket through a boto3 resource is very convenient, but it does not automatically verify that the bucket actually exists:

import botocore

bucket = s3.Bucket('mybucket')
exists = True
try:
    s3.meta.client.head_bucket(Bucket='mybucket')
except botocore.exceptions.ClientError as e:
    # If a client error is thrown, check whether it was a 404 error.
    # If it was a 404 error, the bucket does not exist.
    error_code = int(e.response['Error']['Code'])
    if error_code == 404:
        exists = False

available_profiles
The profiles available in the session's credential configuration.

client(service_name, region_name=None, api_version=None, use_ssl=True, verify=None, endpoint_url=None, aws_access_key_id=None, aws_secret_access_key=None, aws_session_token=None, config=None)
Creates a low-level service client by name.
Parameters:

service_name (string) - the name of the service, for example 's3' or 'ec2'; the list of valid service names can be obtained from get_available_services()
region_name (string) - the name of the region associated with the client; a client is tied to a single region
api_version (string) - the API version to use. By default, botocore uses the latest API version when creating a new client; specify this parameter only if you want a previous API version.
use_ssl (boolean) - whether to use SSL; SSL is used by default. Note: not all services support non-SSL connections.
verify (boolean or string) - whether to verify SSL certificates; certificates are verified by default. You can provide the following values: False - do not verify SSL certificates (SSL is still used, but the certificate is not verified); path/to/cert/bundle.pem - the filename of a CA certificate bundle to use.
endpoint_url (string) - the complete URL the client should use. Normally, botocore automatically constructs the appropriate URL for exchanges with the service; you can specify a complete URL (including the http/https scheme) to override this behavior. If this value is provided, use_ssl is ignored.
aws_access_key_id (string) - entirely optional; if not provided, the session configuration is used
aws_secret_access_key (string) - like the parameter above, may be supplied automatically by the session
aws_session_token (string) - like the above
config (botocore.client.Config) - advanced client configuration options

Return value:
a service client instance

events
The session's event emitter.

get_available_partitions()
Lists the available partitions.
Return type: list
Returns: a list of partition names

get_available_regions(service_name, partition_name='aws', allow_non_regional=False)
Lists the region and endpoint names for a particular partition.
Returns: a list of region names, for example ['us-east-1']
get_available_resources()
Gets a list of service names whose resources can be loaded via Session.resource().
Return type: list
Returns: a list of service names

get_available_services()
Gets a list of service names that can be loaded as low-level clients via Session.client().
Return type: list
Returns: a list of service names
resource(service_name, region_name=None, api_version=None, use_ssl=True, verify=None, endpoint_url=None, aws_access_key_id=None, aws_secret_access_key=None, aws_session_token=None, config=None)
Creates a resource service client by name.
Return value: a ServiceResource subclass
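A brief sketch of these session methods in use (the exact output depends on your environment):

import boto3

session = boto3.session.Session()
print(session.available_profiles)                 # profiles from your AWS config files
print(session.get_available_partitions())        # e.g. ['aws', 'aws-cn', ...]
print(session.get_available_regions('s3'))       # regions where S3 is available
print('s3' in session.get_available_services())  # True

s3 = session.resource('s3')  # equivalent to boto3.resource('s3')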

Tips

Random reads can be done with an HTTP Range request (the response carries a Content-Range header), the S3 equivalent of fseek:
https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.16
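A minimal sketch of a ranged read with get_object, assuming placeholder names; Range uses standard HTTP byte-range syntax:

import boto3

s3 = boto3.client('s3')
# Read bytes 100-199 of the object, analogous to fseek(100) followed by read(100)
resp = s3.get_object(Bucket='BUCKET_NAME', Key='OBJECT_NAME', Range='bytes=100-199')
data = resp['Body'].read()
print(len(data), resp.get('ContentRange'))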

Reference material

https://boto3.amazonaws.com/v1/documentation/api/latest/index.html
