I'm trying to list all so-called folders and sub-folders in an S3 bucket. Since I want to list all the folders in a path recursively, I am not using the withDelimiter() function. All the so-called folder names should end with /, and that is the basis of my logic for listing all the folders and sub-folders.
Here's the Scala code (the catch code is intentionally left out):
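import com.amazonaws.auth.BasicAWSCredentials
import com.amazonaws.services.s3.AmazonS3Client
import com.amazonaws.services.s3.model.ListObjectsRequest
import scala.collection.JavaConverters._

val awsCredentials = new BasicAWSCredentials(awsKey, awsSecretKey)
val client = new AmazonS3Client(awsCredentials)

def listFoldersRecursively(bucketName: String, fullPath: String): List[String] = {
  try {
    // No delimiter is set, so keys under fullPath are returned flat, not grouped by folder.
    val listObjectsRequest = new ListObjectsRequest()
      .withPrefix(fullPath)
      .withBucketName(bucketName)
    val folderPaths = client
      .listObjects(listObjectsRequest)
      .getObjectSummaries()
      .asScala
      .map(_.getKey)
    // Keep only the keys that look like folders.
    folderPaths.filter(_.endsWith("/")).toList
  } // catch block omitted
}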
Here's the structure of my bucket, as seen through an S3 client.
Here's the list I am getting using this Scala code.
Without any apparent pattern, many folders are missing from the list of retrieved folders. I did not use client.listObjects(listObjectsRequest).getCommonPrefixes.toList because it was returning an empty list for some reason.
P.S. I couldn't add photos to the post directly because I'm a new user.
"Without any apparent pattern, many folders are missing from the list of retrieved folders."
Here's your problem: you are assuming there should always be objects with keys ending in / to symbolize folders.
This is an incorrect assumption. They will only be there if you created them, either via the S3 console or the API. There's no reason to expect them, as S3 doesn't actually need them or use them for anything, and the S3 service does not create them on its own.
If you use the API to upload an object with key foo/bar.txt, this does not create the foo/ folder as a distinct object. It will appear as a folder in the console for convenience, but it isn't there unless at some point you deliberately created it.
Of course, the only way to upload such an object with the console is to "create" the folder unless it already appears, but "appears in the console" does not necessarily equate to "exists as a distinct object".
Filtering on endsWith("/") is invalid logic.
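To make that concrete, here is a small illustration in Scala. The key names are invented for this example: assume foo/bar.txt and foo/baz/qux.txt were uploaded through the API, and only empty/ was ever created as a folder object.

```scala
object EndsWithDemo {
  // Hypothetical bucket contents; only "empty/" exists as a distinct folder object.
  val keys: List[String] = List("foo/bar.txt", "foo/baz/qux.txt", "empty/")

  // The filter from the question: it only sees explicitly created folder objects.
  val markers: List[String] = keys.filter(_.endsWith("/"))
  // "foo/" and "foo/baz/" are never reported, although the console would show them.
}
```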
This is why the underlying API includes CommonPrefixes with each ListObjects response when a delimiter is specified (optionally together with a prefix). This is a list of the next level of "folders", which you have to recursively drill down into in order to find the next level. It is also why getCommonPrefixes came back empty for you: your request never set a delimiter.
If you specify a prefix, all keys that contain the same string between the prefix and the first occurrence of the delimiter after the prefix are grouped under a single result element called CommonPrefixes. If you don't specify the prefix parameter, the substring starts at the beginning of the key. The keys that are grouped under the CommonPrefixes result element are not returned elsewhere in the response.
https://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html
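A minimal sketch of that drill-down in Scala, assuming the same AWS SDK for Java v1 classes the question already uses (AmazonS3, ListObjectsRequest). The names S3Folders, commonPrefixes and allFolders are made up for this illustration, and error handling is left out.

```scala
import com.amazonaws.services.s3.AmazonS3
import com.amazonaws.services.s3.model.ListObjectsRequest
import scala.collection.JavaConverters._
import scala.collection.mutable.ListBuffer

object S3Folders {
  // One level of "folders" directly under `prefix`, following pagination.
  // Hypothetical helper, not part of the SDK.
  def commonPrefixes(client: AmazonS3, bucket: String, prefix: String): List[String] = {
    val request = new ListObjectsRequest()
      .withBucketName(bucket)
      .withPrefix(prefix)
      .withDelimiter("/") // without this, CommonPrefixes stays empty
    var listing = client.listObjects(request)
    val found = ListBuffer[String]()
    found ++= listing.getCommonPrefixes.asScala
    while (listing.isTruncated) {
      listing = client.listNextBatchOfObjects(listing)
      found ++= listing.getCommonPrefixes.asScala
    }
    found.toList
  }

  // Each common prefix is a folder and is itself listed for sub-folders.
  // `listLevel` is a parameter so the recursion can be tried without S3.
  def allFolders(prefix: String, listLevel: String => List[String]): List[String] =
    listLevel(prefix).flatMap(p => p :: allFolders(p, listLevel))
}
```

With a real client this could be called as S3Folders.allFolders("", S3Folders.commonPrefixes(client, bucketName, _)). It issues at least one request per folder, which is the price of letting S3 do the grouping.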
You need to access this functionality with whatever library you are using, or you need to iterate the entire list of keys and discover the actual common prefixes on / boundaries using string splitting.
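The string-splitting alternative could look roughly like this in Scala. FolderPrefixes and foldersFromKeys are hypothetical names, and the sketch assumes you have already collected the complete list of keys (following pagination) into a Seq[String].

```scala
object FolderPrefixes {
  // Derive every "folder" prefix implied by a flat list of keys.
  // For "a/b/c.txt" this yields "a/" and "a/b/".
  def foldersFromKeys(keys: Seq[String]): List[String] =
    keys.flatMap { key =>
      // Split on "/" and drop the last part (the file name, or "" for "foo/").
      val parts = key.split("/", -1).init
      // Rebuild each leading run of parts as a prefix ending in "/".
      parts.indices.map(i => parts.take(i + 1).mkString("", "/", "/"))
    }.distinct.sorted.toList
}
```

For example, FolderPrefixes.foldersFromKeys(Seq("foo/bar.txt", "a/b/c.txt", "top.txt")) returns List("a/", "a/b/", "foo/"). The folders are found whether or not a foo/ object was ever created.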