This is a public benchmark logs dataset hosted on Amazon S3. It is intended for testing log ingestion, search, compression, storage, and observability pipeline performance.
https://ctrlb-5tb-benchmark-logs-public.s3.ap-south-1.amazonaws.com
logs-benchmark/YYYY/MM/DD/HH/*.log.gz
manifests/
    all-files.txt
    all-files.txt.gz
    samples-1gb.txt
    samples-5gb.txt
samples/
    1GB/
    5GB/
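Because log files are laid out by year, month, day, and hour, a single hour can be selected from a manifest by matching on the path. The following is a minimal sketch, assuming each manifest line is a URL that follows the logs-benchmark/YYYY/MM/DD/HH/ layout shown above; the function name filter_hour is hypothetical and not part of the dataset.

```shell
# filter_hour YYYY MM DD HH
# Reads manifest lines (one URL per line) on stdin and prints only the
# lines under logs-benchmark/YYYY/MM/DD/HH/.
# Assumption: arguments are already zero-padded (for example 01, not 1).
filter_hour() {
    # -F matches the path as a fixed string, not as a regular expression.
    grep -F "/logs-benchmark/$1/$2/$3/$4/"
}
```

For example, filter_hour 2024 01 05 07 < all-files.txt > one-hour.txt would write the matching URLs to one-hour.txt; the date here is only a placeholder, not a statement about which dates the dataset covers.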
aws s3 cp \
s3://ctrlb-5tb-benchmark-logs-public/samples/1GB/ \
./ctrlb-logs-sample-1gb \
--recursive \
--no-sign-request
aws s3 cp \
s3://ctrlb-5tb-benchmark-logs-public/samples/5GB/ \
./ctrlb-logs-sample-5gb \
--recursive \
--no-sign-request
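If the AWS CLI is not available, the sample manifests may offer an alternative. The sketch below only builds the HTTPS URL of a manifest from the bucket endpoint and the manifests/ folder listed above. It assumes, without confirmation from this page, that samples-1gb.txt and samples-5gb.txt list the URLs of the corresponding sample files; the helper name manifest_url is hypothetical.

```shell
# Bucket endpoint as given at the top of this page.
BASE_URL="https://ctrlb-5tb-benchmark-logs-public.s3.ap-south-1.amazonaws.com"

# manifest_url NAME
# Prints the HTTPS URL of a file in the manifests/ folder.
manifest_url() {
    printf '%s/manifests/%s\n' "$BASE_URL" "$1"
}
```

A manifest could then be fetched with curl -O "$(manifest_url samples-1gb.txt)".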
Manifest files contain direct HTTPS URLs to log files. This works even if public S3 listing is disabled.
curl -O https://ctrlb-5tb-benchmark-logs-public.s3.ap-south-1.amazonaws.com/manifests/all-files.txt.gz
gunzip all-files.txt.gz
while IFS= read -r url; do
  wget -c "$url"
done < all-files.txt
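The loop above downloads one file at a time. For a large manifest, several downloads can run concurrently; the following is a rough sketch using xargs -P, which is available in GNU and BSD xargs but is not required by POSIX. The function name fetch_manifest and the FETCH override variable are hypothetical, and the sketch assumes URLs contain no quotes, backslashes, or spaces, since xargs treats those specially.

```shell
# fetch_manifest MANIFEST [JOBS]
# Downloads every URL in MANIFEST, running up to JOBS downloads at once
# (default 4).
fetch_manifest() {
    manifest=$1
    jobs=${2:-4}
    # Skip blank lines, then pass one URL per downloader invocation.
    # FETCH is a hypothetical override; the default is the resumable wget -c.
    # The expansion is left unquoted on purpose so "wget -c" splits into words.
    grep -v '^[[:space:]]*$' "$manifest" | xargs -n 1 -P "$jobs" ${FETCH:-wget -c}
}
```

For example, fetch_manifest all-files.txt 8 would run up to eight wget -c processes at a time. Because wget -c resumes partial files, rerunning the same command after an interruption should continue where it stopped.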
If public bucket listing is enabled, you can use:
aws s3 sync \
s3://ctrlb-5tb-benchmark-logs-public/logs-benchmark/ \
./ctrlb-5tb-benchmark-logs \
--no-sign-request
If listing is not enabled, use the manifest-based download method above.
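The choice between the two methods can be probed from a script. The sketch below tries an anonymous listing of the logs-benchmark/ prefix and reports which method to use. It assumes that aws s3 ls exits with a non-zero status when listing is denied; the names list_bucket, choose_method, and LIST_CMD are hypothetical.

```shell
BUCKET_PREFIX="s3://ctrlb-5tb-benchmark-logs-public/logs-benchmark/"

# list_bucket: attempts an unsigned listing of the top level of the prefix.
list_bucket() {
    aws s3 ls "$BUCKET_PREFIX" --no-sign-request
}

# choose_method: prints "sync" if the listing succeeds, otherwise "manifest".
# LIST_CMD is a hypothetical override that replaces the real listing command.
choose_method() {
    if "${LIST_CMD:-list_bucket}" >/dev/null 2>&1; then
        echo sync
    else
        echo manifest
    fi
}
```

A wrapper script could call choose_method and then run either the aws s3 sync command or the manifest-based loop accordingly.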
Log files are gzip-compressed:
*.log.gz
To inspect a file:
gunzip -c file.log.gz | head
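Before feeding a downloaded file into a benchmark, it can help to confirm that the archive is intact and to see how many log lines it holds. This is a minimal sketch; the function name check_log_gz is hypothetical.

```shell
# check_log_gz FILE
# Verifies gzip integrity, then prints the number of lines in the
# decompressed log. Returns non-zero if the archive is damaged.
check_log_gz() {
    # gzip -t tests the archive without writing any output file.
    gzip -t "$1" || { echo "corrupt: $1" >&2; return 1; }
    # awk prints a bare number, avoiding the padding some wc versions add.
    gunzip -c "$1" | awk 'END { print NR }'
}
```

For example, check_log_gz file.log.gz prints the line count of file.log.gz, or an error on stderr if the download was truncated.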
To resume an interrupted download, rerun the same command with wget -c.