Amazon S3 Client NOT listing all folders in the bucket

saadi :

I'm trying to list all so-called folders and sub-folders in an S3 bucket. Because I want to list every folder under a path recursively, I am not using the withDelimiter() function. All the so-called folder names should end with /, so my logic is to list every key under the path and keep the ones ending in /.

Here's the Scala code (I've intentionally left out the catch block):

import com.amazonaws.auth.BasicAWSCredentials
import com.amazonaws.services.s3.AmazonS3Client
import com.amazonaws.services.s3.model.ListObjectsRequest
import scala.collection.JavaConverters._

val awsCredentials = new BasicAWSCredentials(awsKey, awsSecretKey)
val client = new AmazonS3Client(awsCredentials)

def listFoldersRecursively(bucketName: String, fullPath: String): List[String] = {
  try {
    val listObjectsRequest = new ListObjectsRequest()
      .withPrefix(fullPath)
      .withBucketName(bucketName)
    // List every key under fullPath (no delimiter) and keep those ending in "/".
    val folderPaths = client
      .listObjects(listObjectsRequest)
      .getObjectSummaries()
      .asScala
      .map(_.getKey)
    folderPaths.filter(_.endsWith("/")).toList
  }
}

Here's the structure of my bucket through an s3 client

Here's the list I am getting using this scala code

Without any apparent pattern, many folders are missing from the list of retrieved folders. I did not use

client.listObjects(listObjectsRequest).getCommonPrefixes.toList

because it was returning an empty list for some reason.

P.S. I couldn't add the photos to the post directly because I'm a new user.

Michael - sqlbot :

"Without any apparent pattern, many folders are missing from the list of retrieved folders."

Here's your problem: you are assuming there should always be objects with keys ending in / to symbolize folders.

This is an incorrect assumption. They will only be there if you created them, either via the S3 console or the API. There's no reason to expect them: S3 doesn't need them or use them for anything, and the service never creates them on its own.

If you use the API to upload an object with key foo/bar.txt, this does not create the foo/ folder as a distinct object. It will appear as a folder in the console for convenience, but it isn't there unless at some point you deliberately created it.

Of course, the only way to upload such an object through the console is to "create" the folder first, unless it already appears there. But "appears in the console" does not necessarily equate to "exists as a distinct object."

Filtering on endsWith("/") is invalid logic.

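This is why the underlying API includes CommonPrefixes with each ListObjects response when a delimiter is specified (optionally together with a prefix). It lists only the next level of "folders"; to find deeper levels, you have to drill down into each one recursively. It is also why getCommonPrefixes came back empty for you: your request sets no delimiter, so nothing is grouped.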

"If you specify a prefix, all keys that contain the same string between the prefix and the first occurrence of the delimiter after the prefix are grouped under a single result element called CommonPrefixes. If you don't specify the prefix parameter, the substring starts at the beginning of the key. The keys that are grouped under the CommonPrefixes result element are not returned elsewhere in the response."

https://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html
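
The drill-down described above might look roughly like this with the AWS SDK for Java v1, the SDK used in the question. This is only a sketch: the object name S3Folders and the helper childPrefixes are made up for illustration, and the prefix passed in is assumed to be either empty or to end with /. It also follows truncated listings, since a single ListObjects response returns at most 1000 entries.

```scala
import com.amazonaws.services.s3.AmazonS3
import com.amazonaws.services.s3.model.{ListObjectsRequest, ObjectListing}
// On Scala 2.13+ the non-deprecated import is scala.jdk.CollectionConverters._
import scala.collection.JavaConverters._

object S3Folders {
  // Hypothetical helper: the immediate "sub-folders" of `prefix`, read from
  // CommonPrefixes. `prefix` is assumed to be "" or to end with "/".
  def childPrefixes(client: AmazonS3, bucketName: String)(prefix: String): List[String] = {
    val request = new ListObjectsRequest()
      .withBucketName(bucketName)
      .withPrefix(prefix)
      .withDelimiter("/")

    // Follow truncated listings so prefixes beyond the first page are not lost.
    @annotation.tailrec
    def loop(listing: ObjectListing, acc: List[String]): List[String] = {
      val found = acc ++ listing.getCommonPrefixes.asScala
      if (listing.isTruncated) loop(client.listNextBatchOfObjects(listing), found)
      else found
    }
    loop(client.listObjects(request), Nil)
  }

  // Depth-first walk: each prefix, followed by everything beneath it.
  // `children` is a parameter so the walk can be exercised without S3.
  def listFoldersRecursively(children: String => List[String], prefix: String): List[String] =
    children(prefix).flatMap(p => p :: listFoldersRecursively(children, p))
}
```

With the client from the question, a call would be along the lines of S3Folders.listFoldersRecursively(S3Folders.childPrefixes(client, bucketName), fullPath). Note that this issues at least one request per folder, so it can be slow on deep or wide hierarchies.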

You need to access this functionality with whatever library you are using, or you need to iterate over the entire list of keys and discover the actual common prefixes on / boundaries using string splitting.
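
The string-splitting alternative needs no S3-specific calls once you have the keys. Here is a minimal sketch in plain Scala; the names KeyPrefixes, prefixesOf, and folderPrefixes are hypothetical, and the input is assumed to be the complete list of keys, with every page of a truncated listing already collected.

```scala
object KeyPrefixes {
  // Every "folder" prefix implied by a single key: one per "/" in the key.
  // A key with no "/" implies no folders at all.
  def prefixesOf(key: String): List[String] =
    key.zipWithIndex.collect { case ('/', i) => key.substring(0, i + 1) }.toList

  // Distinct prefixes across all keys, sorted for stable output.
  def folderPrefixes(keys: Iterable[String]): List[String] =
    keys.flatMap(prefixesOf).toList.distinct.sorted
}
```

For example, the keys foo/bar.txt and foo/baz/qux.txt imply the folders foo/ and foo/baz/, whether or not objects with those exact keys exist. To restrict the result to one path, filter it with startsWith on that path.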
