Implement and manage storage
- 8/5/2024
Implementing and managing storage is one of the most important aspects of building or deploying a new solution using Azure. There are several services and features available for use, and each has its own place. Azure Storage is the underlying storage for most of the services in Azure. It provides service for the storage and retrieval of blobs and files, and it has services that are available for storing large volumes of data through tables. Azure Storage includes a fast and reliable messaging service for application developers with queues. This chapter reviews how to implement and manage storage with an emphasis on Azure storage accounts.
Skills covered in this chapter:
Skill 2.1: Configure access to storage
Skill 2.2: Configure and manage storage accounts
Skill 2.3: Configure Azure Files and Azure Blob Storage
Skill 2.1: Configure access to storage
An Azure storage account is a resource that you create that is used to store data objects such as blobs, files, queues, tables, and disks. Data in an Azure storage account is durable and highly available, secure, massively scalable, and accessible from anywhere in the world over HTTP or HTTPS.
Create and configure storage accounts
Azure storage accounts provide a cloud-based storage service that is highly scalable, available, performant, and durable. Within each storage account, a number of separate storage services are provided:
Blobs Provides a highly scalable service for storing arbitrary data objects such as text or binary data.
Tables Provides a NoSQL-style store for storing structured data. Unlike a relational database, tables in Azure Storage do not require a fixed schema, so different entries in the same table can have different fields.
Queues Provides reliable message queueing between application components.
Files Provides managed file shares that can be used by Azure VMs or on-premises servers.
Disks Provides a persistent storage volume for Azure VMs that can be attached as a virtual hard disk.
There are three types of storage blobs: block blobs, append blobs, and page blobs. Page blobs are generally used to store VHD files when deploying unmanaged disks. (Unmanaged disks are an older disk storage technology for Azure virtual machines. Managed disks are recommended for new deployments.)
When creating a storage account, there are several options that must be set: Performance Tier, Account Kind, Replication Option, and Access Tier. There are some interactions between these settings. For example, only the Standard performance tier allows you to choose the access tier. The following sections describe each of these settings. We then describe how to create storage accounts using the Azure portal, PowerShell, and Azure CLI.
Storage account names
When you name an Azure storage account, you need to remember these points:
The storage account name must be globally unique across all existing storage account names in Azure.
The name must be between 3 and 24 characters and can contain only lowercase letters and numbers.
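The length and character rules above are easy to check before you attempt a deployment. The following is a minimal sketch; the function name is hypothetical, and global uniqueness cannot be verified locally because only Azure can confirm it at creation time.

```python
import re

# Rules from the text: 3 to 24 characters, lowercase letters and digits only.
_NAME_PATTERN = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if the name satisfies the length and character rules.

    This does not check global uniqueness; Azure enforces that when the
    account is created.
    """
    return _NAME_PATTERN.fullmatch(name) is not None


for candidate in ["examrefstorage", "ExamRef", "ab", "exam-ref-01"]:
    print(candidate, is_valid_storage_account_name(candidate))
```

Only the first candidate passes; the others fail on uppercase letters, length, and the hyphen, respectively.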
Performance tiers
When creating a storage account, you must choose between the Standard and Premium performance tiers. This setting cannot be changed later.
Standard This tier supports all storage services: blobs, tables, files, queues, and unmanaged Azure virtual machine disks. It uses magnetic disks to provide cost-efficient and reliable storage.
Premium This tier is designed to support workloads with greater demands on I/O and is backed by high-performance SSD disks. Premium storage accounts support block blobs, page blobs, and file shares.
Account types
There are three possible storage account types for the Standard tier: StorageV2 (General-Purpose V2), Storage (General-Purpose V1), and BlobStorage. There are four possible storage account types for the Premium tier: StorageV2 (General-Purpose V2), Storage (General-Purpose V1), BlockBlobStorage, and FileStorage. Table 2-1 shows the features for each kind of account. Key points to remember are:
The Blob Storage account is a specialized storage account used to store Block Blobs and Append Blobs. You can’t store Page Blobs in these accounts; therefore, you can’t use them for unmanaged disks.
Only General-Purpose V2 and Blob Storage accounts support the Hot, Cool, and Archive access tiers.
TABLE 2-1 Storage account types and their supported features
|  | General-Purpose V2 | General-Purpose V1 | Blob Storage | Block Blob Storage | File Storage |
|---|---|---|---|---|---|
| Services supported | Blob, File, Queue, Table | Blob, File, Queue, Table | Blob (Block Blobs and Append Blobs only) | Blob (Block Blobs and Append Blobs only) | File only |
| Unmanaged Disk (Page Blob) support | Yes | Yes | No | No | No |
| Supported Performance Tiers | Standard, Premium | Standard, Premium | Standard | Premium | Premium |
| Supported Access Tiers | Hot, Cool, Archive | N/A | Hot, Cool, Archive | N/A | N/A |
| Replication Options | LRS, ZRS, GRS, RA-GRS, GZRS, RA-GZRS | LRS, GRS, RA-GRS | LRS, GRS, RA-GRS | LRS, ZRS | LRS, ZRS |
General-Purpose V1 and Blob Storage accounts can both be upgraded to a General-Purpose V2 account. This operation is irreversible. No other changes to the account kind are supported.
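To make the combinations in Table 2-1 easier to reason about, the table can be encoded as data and queried. This is only an illustrative sketch: the class and function names are hypothetical, the account kind identifiers are the ones given in the text, and the check covers only what Table 2-1 lists. The portal enforces further interactions between settings that this lookup does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountKind:
    page_blobs: bool              # unmanaged disk (page blob) support
    performance_tiers: frozenset  # "Standard" and/or "Premium"
    access_tiers: frozenset       # empty set means N/A
    replication: frozenset


TIERED = frozenset({"Hot", "Cool", "Archive"})

# Rows of Table 2-1, keyed by account kind.
ACCOUNT_KINDS = {
    "StorageV2": AccountKind(
        True, frozenset({"Standard", "Premium"}), TIERED,
        frozenset({"LRS", "ZRS", "GRS", "RA-GRS", "GZRS", "RA-GZRS"})),
    "Storage": AccountKind(
        True, frozenset({"Standard", "Premium"}), frozenset(),
        frozenset({"LRS", "GRS", "RA-GRS"})),
    "BlobStorage": AccountKind(
        False, frozenset({"Standard"}), TIERED,
        frozenset({"LRS", "GRS", "RA-GRS"})),
    "BlockBlobStorage": AccountKind(
        False, frozenset({"Premium"}), frozenset(),
        frozenset({"LRS", "ZRS"})),
    "FileStorage": AccountKind(
        False, frozenset({"Premium"}), frozenset(),
        frozenset({"LRS", "ZRS"})),
}


def is_listed_combination(kind: str, performance_tier: str, replication: str) -> bool:
    """True if Table 2-1 lists both the tier and the replication option for this kind."""
    entry = ACCOUNT_KINDS[kind]
    return (performance_tier in entry.performance_tiers
            and replication in entry.replication)


def kinds_supporting_access_tiers() -> list:
    """Account kinds for which Table 2-1 lists access tiers."""
    return sorted(k for k, v in ACCOUNT_KINDS.items() if v.access_tiers)


print(is_listed_combination("BlobStorage", "Standard", "GRS"))
print(kinds_supporting_access_tiers())
```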
Replication options
When you create a storage account, you can also specify how your data will be replicated for redundancy and resistance to failure. There are six options, as described in Table 2-2.
TABLE 2-2 Storage account replication options
| Replication Type | Description |
|---|---|
| Locally redundant storage (LRS) | Makes three synchronous copies of your data within a single datacenter. Available for General-Purpose or Blob Storage accounts, at both the Standard and Premium Performance tiers. |
| Zone redundant storage (ZRS) | Makes three synchronous copies to three separate availability zones within a single region. Available for General-Purpose V2 storage accounts only, at the Standard Performance tier only. Also available for Block Blob Storage and File Storage accounts. |
| Geographically redundant storage (GRS) | This is the same as LRS (three local synchronous copies), plus three additional asynchronous copies to a second Azure region hundreds of miles away from the primary region. Data replication typically occurs within 15 minutes, although no SLA is provided. Available for General-Purpose or Blob Storage accounts, at the Standard Performance tier only. |
| Read access geographically redundant storage (RA-GRS) | This has the same capabilities as GRS, plus you have read-only access to the data in the secondary data center. Available for General-Purpose or Blob Storage accounts, at the Standard Performance tier only. |
| Geographically zone redundant storage (GZRS) | This is the same as ZRS (three synchronous copies across multiple availability zones in the selected region), plus three additional asynchronous copies to a different Azure region hundreds of miles away from the primary region. Data replication typically occurs within 15 minutes, although no SLA is provided. Available for General-Purpose V2 storage accounts only, at the Standard Performance tier only. |
| Read access geographically zone redundant storage (RA-GZRS) | This has the same capabilities as GZRS, plus you have read-only access to the data in the secondary data center. Available for General-Purpose V2 storage accounts only, at the Standard Performance tier only. |
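One way to remember Table 2-2 is to reduce each option to three properties: whether the primary copies span availability zones, whether additional copies exist in a second region, and whether those secondary copies are readable. The sketch below encodes that reading of the table; the type and function names are hypothetical.

```python
from typing import NamedTuple


class Replication(NamedTuple):
    zone_redundant: bool   # primary copies spread across availability zones
    geo_redundant: bool    # three additional asynchronous copies in a second region
    read_secondary: bool   # secondary copies are readable


OPTIONS = {
    "LRS": Replication(False, False, False),
    "ZRS": Replication(True, False, False),
    "GRS": Replication(False, True, False),
    "RA-GRS": Replication(False, True, True),
    "GZRS": Replication(True, True, False),
    "RA-GZRS": Replication(True, True, True),
}


def matching_options(zone: bool = False, geo: bool = False,
                     read_secondary: bool = False) -> list:
    """Return the options that provide at least the requested properties."""
    return [name for name, r in OPTIONS.items()
            if (r.zone_redundant or not zone)
            and (r.geo_redundant or not geo)
            and (r.read_secondary or not read_secondary)]


def copy_count(name: str) -> int:
    """Three copies in the primary region, plus three more if geo-redundant."""
    return 6 if OPTIONS[name].geo_redundant else 3


print(matching_options(zone=True, geo=True))
print(copy_count("GRS"))
```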
Access tiers
Azure Blob Storage supports four access tiers: Hot, Cool, Cold, and Archive. Each represents a trade-off of availability and cost. There is no trade-off on the durability (probability of data loss), which is defined by the SKU and replication, not the access tier.
The tiers are as follows:
Hot This access tier is used to store frequently accessed objects. Relative to other tiers, data access costs are low while storage costs are higher.
Cool This access tier is used to store large amounts of data that is not accessed frequently and that is stored for at least 30 days. The availability SLA can vary depending on the replication model selected. Relative to the Hot tier, data access costs are higher and storage costs are lower.
Cold This access tier is used for data that is rarely accessed or modified but needs to be accessible without delay. Data in this tier should be stored for at least 90 days. The Cold tier pricing model has lower storage capacity costs but higher access costs compared to the Cool and Hot tiers.
Archive This access tier is used to archive data for long-term storage that is accessed rarely, can tolerate several hours of retrieval latency, and will remain in the Archive tier for at least 180 days. This tier is the most cost-effective option for storing data, but accessing that data is more expensive than accessing data in other tiers. Blob rehydration might take up to 15 hours before the blob is accessible.
New blobs will default to the access tier that is set at the storage account level, though you can override that at the blob level by setting a different access tier, including the archive tier.
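The minimum storage durations above (30, 90, and 180 days) suggest a rough rule of thumb for picking a tier. The following sketch is an assumption-laden simplification, not an official selection algorithm: the function and parameter names are hypothetical, and a real decision should also weigh current pricing and access patterns.

```python
def suggest_access_tier(days_stored: int,
                        frequently_accessed: bool,
                        tolerates_hours_of_latency: bool) -> str:
    """Suggest a tier using only the minimum durations given in the text."""
    if frequently_accessed:
        return "Hot"
    # Archive requires rehydration, so it only fits data that can wait for hours.
    if tolerates_hours_of_latency and days_stored >= 180:
        return "Archive"
    if days_stored >= 90:
        return "Cold"
    if days_stored >= 30:
        return "Cool"
    # Data kept for less than 30 days does not meet the Cool tier minimum.
    return "Hot"


print(suggest_access_tier(365, False, True))
print(suggest_access_tier(45, False, False))
```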
Create an Azure storage account
To create a storage account using the Azure portal, type storage accounts in the search box. On the Storage Accounts blade, click Create to open the Create A Storage Account blade (see Figure 2-1). You must choose a unique name for the storage account. Storage account names must be globally unique and may only contain lowercase characters and digits. Select the Azure region (Location), the performance tier, and replication mode for the account. The blade adjusts based on the settings you choose so that you cannot select an unsupported feature combination.
FIGURE 2-1 Creating an Azure storage account using the Azure portal
The Advanced tab of the Create A Storage Account blade is shown in Figure 2-2. This tab defines additional security settings, hierarchical namespace support, and access protocols.
FIGURE 2-2 The advanced settings that can be set when creating an Azure storage account using the portal
The Networking tab of the Create A Storage Account blade is shown in Figure 2-3. On this tab, choose to maintain storage account access either publicly by choosing Enable Public Access From All Networks or privately by choosing Disable Public Access And Use Private Access.
FIGURE 2-3 The networking properties that can be set when creating an Azure storage account using the portal
The Data Protection tab provides options for configuring the recovery, tracking, and access control of the storage account. This includes soft delete options, retention periods, blob versioning, and version-level immutability support. Figure 2-4 shows the Data Protection tab.
FIGURE 2-4 The data protection properties that can be set when creating an Azure storage account using the portal
The Encryption tab provides options for configuring the encryption type, support for customer-managed keys, and infrastructure encryption. By default, storage accounts are encrypted using Microsoft-managed keys. However, you can configure customer-managed keys to encrypt data using your own keys. Figure 2-5 shows the Encryption tab.
FIGURE 2-5 The encryption properties that can be set when creating an Azure storage account using the portal
Configure Azure Storage firewalls and virtual networks
Storage accounts are managed through Azure Resource Manager. Management operations are authenticated and authorized using Microsoft Entra ID RBAC. Each storage service exposes its own endpoint used to manage the data in that storage service (blobs in Blob Storage, entities in tables, and so on). These service-specific endpoints are not exposed through Azure Resource Manager; instead, they are (by default) internet-facing endpoints.
Access to these internet-facing storage endpoints must be secured, and Azure Storage provides several ways to do so. In this section, you will review the network-level access controls: the storage firewall and service endpoints. This section also discusses Blob Storage access levels. The following sections then describe the application-level controls: shared access signatures and access keys. In later sections, you will learn about Azure Storage replication and how to leverage Microsoft Entra ID authentication for a storage account.
Storage firewall
Using the storage firewall, you can limit access to specific IP addresses or an IP address range. It applies to all storage services endpoints (blobs, tables, queues, and files). For example, by limiting access to the IP address range of your company, access from other locations will be blocked. Service endpoints are used to restrict access to specific subnets within an Azure virtual network.
To configure the storage firewall using the Azure portal, open the storage account blade and click Networking. Under Public Network Access, select Enabled From Selected Virtual Networks And IP Addresses to reveal the Firewall and Virtual Networks settings, as shown in Figure 2-6.
When accessing the storage account via the internet, use the storage firewall to specify the internet-facing source IP addresses (for example, 32.54.231.0/24, as shown in Figure 2-6) which will make the storage requests. All internet traffic is denied, except the defined IP addresses in the storage firewall. You can specify a list of either individual IPv4 addresses or IPv4 CIDR address ranges. (CIDR notation is explained in Skill 4.1 in Chapter 4, “Configure and manage virtual networking.”)
FIGURE 2-6 Configuring a storage account firewall and virtual network service endpoint access
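The allow-list behavior of the storage firewall can be illustrated with Python's standard `ipaddress` module. This is a local sketch of the matching logic only, not a call to any Azure API; the function name is hypothetical, and the range is the example value from the text.

```python
import ipaddress
from typing import Iterable


def is_source_allowed(source_ip: str, allowed_rules: Iterable[str]) -> bool:
    """Return True if source_ip matches any allowed IPv4 address or CIDR range.

    With no matching rule the request is denied, mirroring the
    deny-by-default behavior described in the text.
    """
    address = ipaddress.ip_address(source_ip)
    for rule in allowed_rules:
        # A bare address such as "203.0.113.7" is treated as a /32 network.
        network = ipaddress.ip_network(rule, strict=False)
        if address in network:
            return True
    return False


rules = ["32.54.231.0/24"]
print(is_source_allowed("32.54.231.17", rules))
print(is_source_allowed("32.54.232.1", rules))
```

The first address falls inside the /24 range and is allowed; the second is outside it and is denied.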
The storage firewall includes an option to allow access from trusted Microsoft services, such as Azure Backup, Azure Site Recovery, and Azure Networking. For example, selecting Allow Trusted Microsoft Services To Access This Account allows access to storage for NSG flow logs. Separately, you can enable Allow Read Access To Storage Logging From Any Network or Allow Read Access To Storage Metrics From Any Network to allow read-only access to storage logs and metrics.
Virtual network service endpoints
In some scenarios, a storage account is only accessed from within an Azure virtual network. In this case, it is desirable from a security standpoint to block all internet access. By configuring virtual network service endpoints for your Azure storage account, you can remove access from the public internet and only allow traffic from a virtual network for improved security.
Another benefit of using service endpoints is optimized routing. Service endpoints create a direct network route from the virtual network to the storage service. If forced tunneling is being used to force internet traffic to your on-premises network or to another network appliance, requests to Azure Storage will follow that same route. By using service endpoints, you can use a direct route to the storage account instead of the on-premises route, so no additional latency is incurred.
Configuring service endpoints requires two steps. First, to update the subnet settings, you should choose your virtual network from the Virtual Networks blade. Then select Subnets on the left under Settings. Click the subnet you plan to configure to access the subnet settings. After selecting the desired subnet, under Service Endpoints, choose Microsoft.Storage from the Services drop-down menu. This creates the route from the subnet to the storage service but does not restrict which storage account the virtual network can use. Figure 2-7 shows the subnet settings, including the service endpoint configuration.
FIGURE 2-7 Configuring a subnet with a service endpoint for Azure Storage
The second step is to configure which virtual networks can access a particular storage account. From the storage account blade, click Networking. Under Public Network Access, click Enabled From Selected Virtual Networks And IP Addresses to reveal the Firewall and Virtual Networks settings, as shown previously in Figure 2-6. Under Virtual Networks, select Add Existing Virtual Network to add the virtual networks and subnets that should have access to this storage account.
Blob Storage access levels
Storage accounts support an additional access control mechanism that is limited only to Blob Storage. By default, no public read access is enabled for anonymous users, and only users with rights granted through RBAC or with the storage account name and key will have access to the stored blobs. To enable anonymous user access, you must enable Allow Blob Anonymous Access (shown in Figure 2-8) and configure the container access level (shown in Figure 2-9).
FIGURE 2-8 Storage account configuration
FIGURE 2-9 Blob Storage access levels
The anonymous access level for a container can be specified during creation, or modified after it has been created. The supported levels of blob containers are as follows:
Private Only principals with permissions can access the container and its blobs. Anonymous access is denied.
Blob Only blobs within the container can be accessed anonymously.
Container Blobs and their containers can be accessed anonymously.
You can change the access level through the Azure portal, Azure PowerShell, Azure CLI, programmatically using the REST API, or by using Azure Storage Explorer. The access level is configured separately on each blob container.
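The interaction between the account-level Allow Blob Anonymous Access setting and the per-container access level can be summarized as a small decision function. This is a sketch of the rules as described above; the function name and the operation labels `read_blob` and `list_container` are hypothetical.

```python
def anonymous_request_allowed(account_allows_anonymous: bool,
                              container_level: str,
                              operation: str) -> bool:
    """Decide whether an anonymous request succeeds.

    container_level: "Private", "Blob", or "Container".
    operation: "read_blob" or "list_container" (illustrative labels).
    """
    # The account-level switch must be on before any container level applies.
    if not account_allows_anonymous:
        return False
    if container_level == "Container":
        return operation in ("read_blob", "list_container")
    if container_level == "Blob":
        return operation == "read_blob"
    # "Private" (or anything unrecognized): anonymous access is denied.
    return False


print(anonymous_request_allowed(True, "Blob", "read_blob"))
print(anonymous_request_allowed(True, "Blob", "list_container"))
```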
A shared access signature token (SAS token) is a URI query string parameter that grants access to containers, blobs, queues, and/or tables. Use a SAS token to grant access to a client or service that should not have access to the entire contents of the storage account (and therefore, should not have access to the storage account keys) but still requires secure authentication. By distributing a SAS URI to these clients, you can grant them access to a specific resource, for a specified period of time, and with a specified set of permissions. SAS tokens are commonly used to read and write data in users’ storage accounts. They are also widely used to copy blobs or files to another storage account.
Create and use shared access signature (SAS) tokens
There are a few different ways you can create a SAS token. A SAS token is a way to granularly control how a client can access data in an Azure storage account. You can also use an account-level SAS to access the account itself. You can control many things, such as what services and resources the client can access, what permission the client has, how long the token is valid for, and more.
This section examines how to create SAS tokens using various methods. The simplest way to create one is by using the Azure portal. Browse to the Azure storage account and open the Shared Access Signature blade (see Figure 2-10). Select the services, resource types, and permissions you require, along with the validity period of the SAS token and the IP addresses that are allowed access. Lastly, you have an option to choose which key you want to use as the signing key for this token.
FIGURE 2-10 Creating a shared access signature using the Azure portal
Once the token is generated, it will be listed along with connection string and SAS URLs, as shown in Figure 2-11.
FIGURE 2-11 Generated SAS token with connection string and SAS URLs
Also, you can create SAS tokens using Storage Explorer or the command-line tools (or programmatically using the REST APIs/SDK). To create a SAS token using Storage Explorer, you need to first select the resource (storage account, container, blob, and so on) for which the SAS token needs to be created. Then right-click the resource and select Get Shared Access Signature. Figure 2-12 demonstrates how to create a SAS token using Azure Storage Explorer.
FIGURE 2-12 Creating a shared access signature using Azure Storage Explorer
Use shared access signatures
Each SAS token is a query string parameter that can be appended to the full URI of the blob or other storage resource for which the SAS token was created. Create the SAS URI by appending the SAS token to the full URI of the blob or other storage resource.
The following example shows the combination in more detail. Suppose the storage account name is examrefstorage, the blob container name is examrefcontainer, and the blob path is sample-file.png. The full URI to the blob in storage is
https://examrefstorage.blob.core.windows.net/examrefcontainer/sample-file.png
The combined URI with the generated SAS token is
https://examrefstorage.blob.core.windows.net/examrefcontainer/sample-file.png?sv=2024-01-02&ss=bfqt&srt=sco&sp=rwdlacupx&se=2024-02-02T08:50:14Z&st=2024-01-01T00:50:14Z&spr=https&sig=65tNhZtj2lu0tih8HQtK7aEL9YCIpGGprZocXjiQ%2Fko%3D
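The combination of resource URI and SAS token can be sketched with Python's standard library. The helper names here are hypothetical; the URI and token are the example values from the text. Parsing the query string also makes the token's fields visible, such as the permissions (`sp`), start and expiry times (`st`, `se`), and allowed protocol (`spr`).

```python
from urllib.parse import parse_qs, urlsplit


def build_sas_uri(resource_uri: str, sas_token: str) -> str:
    """Append a SAS token to a resource URI as its query string."""
    token = sas_token.lstrip("?")
    separator = "&" if "?" in resource_uri else "?"
    return f"{resource_uri}{separator}{token}"


def describe_sas(sas_uri: str) -> dict:
    """Return the SAS query parameters as a flat dict (values URL-decoded)."""
    query = parse_qs(urlsplit(sas_uri).query)
    return {name: values[0] for name, values in query.items()}


blob_uri = ("https://examrefstorage.blob.core.windows.net/"
            "examrefcontainer/sample-file.png")
token = ("sv=2024-01-02&ss=bfqt&srt=sco&sp=rwdlacupx"
         "&se=2024-02-02T08:50:14Z&st=2024-01-01T00:50:14Z&spr=https"
         "&sig=65tNhZtj2lu0tih8HQtK7aEL9YCIpGGprZocXjiQ%2Fko%3D")

sas_uri = build_sas_uri(blob_uri, token)
fields = describe_sas(sas_uri)
print(fields["sp"], fields["st"], fields["se"], fields["spr"])
```

Note that the signature (`sig`) is percent-encoded in the URI, so `%2F` and `%3D` decode to `/` and `=` when parsed.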
Currently, stored access policy is not supported for account-level SAS.
Use user delegation SAS
You can also create a user delegation SAS using Microsoft Entra ID credentials. The user delegation SAS is only supported by Blob Storage, and it can grant access to containers and blobs. Currently, stored access policies are not supported for user delegation SAS.
Configure stored access policies
A SAS token incorporates the access parameters (start and end time, permissions, and so on) as part of the token. The parameters cannot be changed without generating a new token, and the only way to revoke an existing token before its expiry time is to regenerate the storage account key used to generate the token or to delete the blob. In practice, these limitations can make standard SAS tokens difficult to manage.
Stored access policies allow the parameters for a SAS token to be decoupled from the token itself. The access policy specifies the start time, end time, and access permissions, and the access policy is created independently of the SAS tokens. SAS tokens are generated that reference the stored access policy instead of embedding the access parameters explicitly.
With this arrangement, the parameters of existing tokens can be modified by simply editing the stored access policy. Existing SAS tokens remain valid and use the updated parameters. You can revoke the SAS token by deleting the access policy, renaming it (changing the identifier), or changing the expiry time.
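The decoupling described above can be illustrated with a toy model in which a token carries only a policy identifier and the container holds the parameters. This is a conceptual sketch, not the Azure SDK: all class and method names are hypothetical, and signature validation, which the real service also performs, is omitted.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class StoredAccessPolicy:
    permissions: str   # for example "rl" for read and list
    start: datetime
    expiry: datetime


class Container:
    def __init__(self) -> None:
        self.policies = {}  # policy identifier -> StoredAccessPolicy

    def is_token_valid(self, policy_id: str, now: datetime, needed: str) -> bool:
        """Evaluate a token that references policy_id at request time."""
        policy = self.policies.get(policy_id)
        if policy is None:
            # Policy deleted or renamed: every token referencing it is revoked.
            return False
        return (policy.start <= now <= policy.expiry
                and needed in policy.permissions)


container = Container()
container.policies["read-only"] = StoredAccessPolicy(
    permissions="rl",
    start=datetime(2024, 1, 1, tzinfo=timezone.utc),
    expiry=datetime(2024, 2, 2, tzinfo=timezone.utc),
)
request_time = datetime(2024, 1, 15, tzinfo=timezone.utc)
print(container.is_token_valid("read-only", request_time, "r"))

# Shortening the expiry invalidates existing tokens without reissuing them.
container.policies["read-only"].expiry = datetime(2024, 1, 10, tzinfo=timezone.utc)
print(container.is_token_valid("read-only", request_time, "r"))
```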
Figure 2-13 shows the creation of stored access policies in the Azure portal.
FIGURE 2-13 Creating stored access policies using the Azure portal
Figure 2-14 shows stored access policies being created in Azure Storage Explorer.
FIGURE 2-14 Creating stored access policies using Azure Storage Explorer
To use the created policies, reference them by name when creating a SAS token using Storage Explorer or when creating a SAS token using PowerShell or the CLI tools.
Manage access keys
The simplest way to manage access to a storage account is to use access keys. With the storage account name and an access key to the Azure storage account, you have full access to all data in all services within the storage account. You can create, read, update, and delete containers, blobs, tables, queues, and file shares. In addition, you have full administrative access to everything other than the storage account itself. (You cannot delete the storage account or change settings on the storage account, such as its type.)
Applications will use the storage account name and key for access to Azure Storage. Sometimes, this is to grant access by generating a SAS token, and sometimes, it is for direct access with the name and key.
To access the storage account name and key, open the storage account from within the Azure portal and click Access Keys. Figure 2-15 shows the primary and secondary access keys for a storage account.
FIGURE 2-15 Access keys for an Azure storage account
Each storage account has two access keys. This means you can modify applications to use the second key instead of the first and then regenerate the first key. This technique is known as “key rolling” or “key rotation.” You can reset the primary key with no downtime for applications that directly access storage using an access key.
Storage account access keys can be regenerated using the Azure portal or the command-line tools. In PowerShell, this is accomplished with the New-AzStorageAccountKey cmdlet; with Azure CLI, you will use the az storage account keys renew command.
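The key rolling sequence can be simulated locally to show why two keys allow rotation without downtime. This is a toy model with hypothetical class and function names; it generates random stand-in keys and does not call Azure. In a real environment, the regeneration step would be performed with the portal or the command-line tools mentioned above.

```python
import secrets


class StorageAccount:
    """Holds two access keys, either of which grants access."""

    def __init__(self) -> None:
        self.keys = {name: secrets.token_urlsafe(48)
                     for name in ("primary", "secondary")}

    def regenerate(self, name: str) -> None:
        # The old value stops working as soon as the key is regenerated.
        self.keys[name] = secrets.token_urlsafe(48)

    def accepts(self, key: str) -> bool:
        return key in self.keys.values()


class Application:
    def __init__(self, account: StorageAccount, key_name: str) -> None:
        self.account = account
        self.use_key(key_name)

    def use_key(self, key_name: str) -> None:
        self.key_name = key_name
        self.key = self.account.keys[key_name]

    def can_connect(self) -> bool:
        return self.account.accepts(self.key)


def roll_primary_key(account: StorageAccount, app: Application) -> None:
    """Rotate the primary key while the application stays connected."""
    app.use_key("secondary")       # 1. move the application to the second key
    assert app.can_connect()
    account.regenerate("primary")  # 2. regenerate the key no longer in use


account = StorageAccount()
app = Application(account, "primary")
old_primary = account.keys["primary"]
roll_primary_key(account, app)
print(app.can_connect(), account.accepts(old_primary))
```

After the roll, the application still connects with the secondary key, while the old primary key is rejected.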
Managing access keys in Azure Key Vault
It is important to protect the storage account access keys because they provide full access to the storage account. Azure Key Vault helps safeguard cryptographic keys and secrets used by cloud applications and services, such as authentication keys, storage account keys, data encryption keys, and certificate private keys.
Keys in Azure Key Vault can be protected in software or by using hardware security modules (HSMs). HSM keys can be generated in place or imported. Importing keys is often referred to as bring your own key, or BYOK.
Accessing and decrypting the stored keys is typically done by a developer, although keys from Key Vault can also be accessed from ARM templates during deployment.