CodewCaro.
When working in IT, how to handle storage is always part of the first architecture discussions. The topic keeps evolving as a project does: at the start you pick storage, say AWS S3 or Azure Blob Storage, for uploading those images; later, uploading thumbnails to the same S3 bucket or Blob container helps keep the frontend streamlined and performant. Then comes data handling, keeping the media organized, perhaps with a DynamoDB table holding filenames, IDs, upload dates, last-updated timestamps, access levels and so forth.
But how can we handle storage efficiently on Azure? Which tiers should we choose when weighing retention, scalability and cost? This article will try to dive into that. Specifically Azure, since it is a popular cloud provider here in Sweden.
If you're diving into Azure and wondering how to store all your data in the cloud, whether it's files, logs, or even backups, you're going to need an Azure Storage Account. Think of it as your personal cloud locker, supercharged for all kinds of data and built to scale. Whether you're storing photos, running a web app, or managing a massive database, a storage account is where it all happens.
Azure’s storage is pretty versatile. Here’s what you can chuck in there:
Azure Blob Storage is Microsoft's go-to for object storage, and it's great for storing massive amounts of unstructured data: text files, images, videos, or pretty much anything that doesn't fit into a traditional database, even those .pickle files. It's optimized for non-relational data, making it flexible for a ton of use cases.
Here's how you work with it:
You can access your blobs from anywhere over HTTP or HTTPS, and you've got plenty of options for working with Blob Storage: a plain URL, the Azure Storage REST API, Azure PowerShell, or the Azure CLI. Azure also provides client libraries for .NET, Node.js, Python, PHP, Java and Ruby.
Blob Storage offers flexibility, whether you’re building an app or just need a reliable way to store and stream data.
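To make that concrete, here is a minimal sketch in Python, assuming the azure-storage-blob package; the account name, container, blob and connection string are all placeholders, and the SDK calls are guarded so the sketch also runs without the package or real credentials:

```python
# Sketch: upload a blob with the Python SDK (azure-storage-blob).
# Account name, container, blob and connection string are placeholders.

ACCOUNT = "mystorageaccount"   # hypothetical account name
CONTAINER = "images"           # hypothetical container
BLOB = "cat.png"

# Every blob is addressable by a plain URL:
blob_url = f"https://{ACCOUNT}.blob.core.windows.net/{CONTAINER}/{BLOB}"

uploaded = False
try:
    from azure.storage.blob import BlobClient

    conn_str = (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={ACCOUNT};"
        "AccountKey=<your-base64-key>;"   # placeholder, not a real key
        "EndpointSuffix=core.windows.net"
    )
    client = BlobClient.from_connection_string(
        conn_str, container_name=CONTAINER, blob_name=BLOB
    )
    with open(BLOB, "rb") as f:           # local file to upload
        client.upload_blob(f, overwrite=True)
    uploaded = True
except Exception as exc:  # missing SDK, credentials or network
    print(f"Skipped real upload: {exc}")

print(blob_url)
```

The same BlobClient object can also download (`download_blob`) or delete the blob once you have real credentials in place.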
Hot tier
Use case: Frequently accessed data, such as active content or workloads.
Storage cost:
Around $0.0184 per GB/month
Around 0.2061 SEK per GB/month (varies slightly by region).
Access Costs:
Lower than other tiers, with read operations costing around $0.004 per 10,000 reads (or 0.0448 SEK per 10,000 reads) and $0.05 per 10,000 writes (or 0.5600 SEK per 10,000 writes).
Retrieval Time: Instant access to your data with low latency.
Best For: Applications with high read/write operations, active websites, and data processing tasks that need real-time data.
Cool tier
Use case: Infrequently accessed data that still needs to be available when required.
Storage cost:
Significantly cheaper than Hot storage, at around $0.01 per GB/month
Or 0.1120 SEK per GB/month.
Access Costs:
Higher than the Hot tier, with read operations costing $0.01 per 10,000 reads (or 0.1120 SEK per 10,000 reads) and $0.10 per 10,000 writes (or 1.1200 SEK per 10,000 writes).
Retrieval Time: Data is immediately available but with slightly higher access costs compared to Hot.
Best For: Data that is accessed occasionally, such as backups, disaster recovery files, or archived user-generated content.
Archive tier
Use case: Rarely accessed data that can tolerate retrieval delays.
Storage cost:
Extremely low, around $0.00099 per GB/month
Or 0.0111 SEK per GB/month.
Access Costs:
High retrieval costs, with $5 per 10,000 list operations and read operations costing $0.02 per 10,000 reads (or 0.2240 SEK per 10,000 reads) and $0.50 per 10,000 writes (or 5.6000 SEK per 10,000 writes). Additionally, there is a rehydration fee to move data to Hot or Cool for access.
Retrieval Time: Data retrieval can take several hours (up to 15 hours depending on size).
Best For: Long-term data retention, regulatory archives, compliance records, or backup data that you rarely need but must keep for extended periods.
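To compare the tiers side by side, here is a back-of-the-envelope cost model in Python using the approximate USD prices quoted above. The numbers are illustrative only; real prices vary by region and operation type, and rehydration and list fees are ignored:

```python
# Rough monthly cost model for the three blob access tiers, using the
# approximate USD prices quoted above (real prices vary by region).

PRICES = {
    #          $/GB-month, $/10k reads, $/10k writes
    "hot":     (0.0184,    0.004,      0.05),
    "cool":    (0.01,      0.01,       0.10),
    "archive": (0.00099,   0.02,       0.50),
}

def monthly_cost(tier: str, gb: float, reads: int, writes: int) -> float:
    """Storage + access cost per month; ignores rehydration and list fees."""
    per_gb, per_10k_reads, per_10k_writes = PRICES[tier]
    return (gb * per_gb
            + reads / 10_000 * per_10k_reads
            + writes / 10_000 * per_10k_writes)

# Example: 1 TB of backups touched ~100 times a month. Storage dominates,
# so Cool beats Hot and Archive beats both.
for tier in PRICES:
    print(tier, round(monthly_cost(tier, gb=1024, reads=100, writes=100), 2))
```

Once access counts climb into the millions per month, the higher per-operation prices of Cool and Archive start eating the storage savings, which is exactly the trade-off the tiers are designed around.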
There are ways to securely replicate your data in Azure. The replication options are the following:
Locally Redundant Storage (LRS)
Zone Redundant Storage (ZRS)
Geo-Redundant Storage (GRS)
Geo-Zone-Redundant Storage (GZRS)
Now what do they really mean?
LRS is the budget-friendly option, but it offers the lowest durability. Your data is kept as three copies within a single data center, so if that center suffers a disaster (like fire or flooding), your data might be lost. LRS works well if:
You can easily recreate your data if needed.
Your data changes constantly, like in live feeds.
You’re restricted to keeping data in one region for compliance.
ZRS steps it up by replicating your data across three storage clusters in a single region, each in its own availability zone. This means if one zone goes down, your data stays accessible. ZRS is great for low-latency and high-performance needs but isn’t available in every region yet.
GRS replicates your data to a secondary region far away (hundreds of miles) from the primary one. It’s designed to handle regional outages, with an incredible 16 9’s durability (99.99999999999999%). You can choose:
GRS: Replication to a secondary region, with the data readable only after a Microsoft-initiated failover.
RA-GRS: Like GRS, but you can read from the secondary region anytime, even before a failover.
GZRS gives you the best of both worlds—combining ZRS’s local zone replication with GRS’s regional disaster protection. Your data is replicated across three availability zones in your primary region and also backed up in a secondary region for extra durability. Like GRS, it boasts 16 9’s durability and can optionally enable read access to the secondary region (RA-GZRS).
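When you create a storage account, each replication option corresponds to a SKU name. Here is a quick lookup table in Python; the SKU strings and copy counts are to the best of my knowledge (three synchronous copies in the primary, doubled when a secondary region is added), so double-check against the current Azure docs:

```python
# Replication option -> (Azure SKU name, total copies of your data).
# LRS/ZRS keep 3 copies in the primary region; the geo-redundant
# variants add 3 more in a secondary region.
REDUNDANCY = {
    "LRS":     ("Standard_LRS",    3),
    "ZRS":     ("Standard_ZRS",    3),
    "GRS":     ("Standard_GRS",    6),
    "RA-GRS":  ("Standard_RAGRS",  6),
    "GZRS":    ("Standard_GZRS",   6),
    "RA-GZRS": ("Standard_RAGZRS", 6),
}

def sku_for(option: str) -> str:
    """Return the SKU string to pass when creating the storage account."""
    return REDUNDANCY[option][0]

print(sku_for("GZRS"))
```

The SKU string is what you would pass to the `--sku` flag of `az storage account create`, or the equivalent ARM/Bicep property.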
Every object you store in Azure Storage has a unique URL. Your storage account name forms the subdomain part of that URL; combine the subdomain with the domain name for the service and you get the endpoint for your storage account.
Blob service? storageaccountname + blob + core.windows.net = storageaccountname.blob.core.windows.net.
Table service? storageaccountname + table + core.windows.net = storageaccountname.table.core.windows.net.
Queue service? storageaccountname + queue + core.windows.net = storageaccountname.queue.core.windows.net.
File service? storageaccountname + file + core.windows.net = storageaccountname.file.core.windows.net.
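Since the pattern is so mechanical, you can wrap it in a tiny helper; a sketch in Python:

```python
# Build the endpoint for each storage service from the account name,
# following the pattern above: <account>.<service>.core.windows.net
SERVICES = ("blob", "table", "queue", "file")

def endpoint(account: str, service: str) -> str:
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service}")
    return f"https://{account}.{service}.core.windows.net"

print(endpoint("storageaccountname", "blob"))
```

Note that sovereign clouds and Azure Data Lake Storage use different suffixes, so treat `core.windows.net` as the public-cloud default rather than a universal constant.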
I wrote thing(s) because there is only one thing I've discovered so far to keep in mind. Keep me posted, you say ;)
Shared Access Signature (SAS) tokens enable fine-grained, time-limited access to resources in a storage account. They are great for serverless scenarios where you want to dynamically generate temporary access without long-lived connections. This way you won't have to define firewall and virtual network rules, and access can be further restricted by headers on the frontend so only certain applications are allowed to talk to the storage account.
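As a sketch of how generating one looks in code, assuming the azure-storage-blob Python package (the account, container, blob and key below are fake placeholders; signing happens locally, so no real account is needed, and the call is guarded in case the package isn't installed):

```python
import base64
from datetime import datetime, timedelta, timezone

# Placeholder credentials: signing is a local HMAC computation, so a
# fake base64 key is enough to demonstrate the call shape.
ACCOUNT = "mystorageaccount"
KEY = base64.b64encode(b"not-a-real-account-key-0123456789").decode()

token = None
try:
    from azure.storage.blob import BlobSasPermissions, generate_blob_sas

    token = generate_blob_sas(
        account_name=ACCOUNT,
        container_name="images",
        blob_name="cat.png",
        account_key=KEY,
        permission=BlobSasPermissions(read=True),                # read-only
        expiry=datetime.now(timezone.utc) + timedelta(hours=1),  # short-lived
    )
    # Append the token to the blob URL to hand out temporary access:
    print(f"https://{ACCOUNT}.blob.core.windows.net/images/cat.png?{token}")
except ImportError:
    print("azure-storage-blob not installed; showing the call shape only")
```

Anyone holding the resulting URL can read that one blob until the expiry passes, with no firewall changes needed.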
When comparing SAS tokens to firewall and VNet rules for controlling access to Azure Storage, they differ in several ways.
SAS tokens provide a more fine-grained approach, allowing access to specific objects like blobs or queues, with the ability to define exact permissions such as read or write and set time limits on access. In contrast, firewall and VNet (Virtual Network) rules offer a broader level of control, where access is based on IP ranges or which virtual networks are permitted.
SAS tokens are resource-specific, meaning you can grant access to individual blobs, queues, or specific resources. Firewall and VNet rules, however, apply to the entire storage account, meaning any rule affects all the resources within that account.
SAS tokens are designed for temporary access, as they expire after a defined time period. Firewall rules don't expire automatically and need to be manually updated when access requirements change.
SAS tokens can be used without relying on any network configurations, making them a more independent option. Firewall and VNet rules, on the other hand, depend on network configurations and require defining IP ranges or specifying virtual networks.
SAS tokens have minimal setup requirements and can be generated dynamically in code, making them ideal for scenarios where you need agile, on-demand access. Firewall and VNet rules require managing network infrastructure, which introduces more complexity.
Finally, when it comes to security, SAS tokens offer flexibility, as you can easily revoke access by rotating keys or deleting tokens. Firewall and VNet rules provide strong network-level security, but they aren't as adaptable when it comes to controlling access to specific resources.
Linking the most comprehensive guide to storage I've found below. Happy coding!