In a typical multi-tenant application, application resources are shared across all tenants, and the application may need to charge tenants based on their usage of each resource. A typical Windows Azure application uses Windows Azure Storage (a combination of blobs, tables, and queues). When an application needs a way to meter storage usage for each tenant, Windows Azure Storage Analytics can provide the necessary data.
Storage Analytics
Storage Analytics provides two views into storage usage: Logs and Metrics.
Logs provide details of every executed request against a storage account, including both successful and failed requests. This data enables applications to:
- Track requests to a storage account. Each log entry includes the source IP address and timestamp, as well as the specific request executed.
- Analyze data usage trends. With simple queries, an application can easily consume this data.
- Diagnose issues. With data such as server latency and end-to-end latency, performance issues can be detected. Because each log entry includes the request status, error conditions are also queryable.
Metrics, on the other hand, provides hourly rollups, with details on the following:
- Capacity consumed by each storage service (blobs, tables, queues)
- Number of requests against each storage service
- Total ingress/egress and average latency for each service
Storage Analytics is controlled through a REST API, and hence it may be enabled and configured through any programming language. The Windows Azure SDK for Java provides a complete implementation of this API, making it very simple to enable Storage Analytics from Java applications.
Here, we will look at enabling logging for Blob Storage. Additionally, we recently published CloudNinja for Java to GitHub, a reference application illustrating how to build multi-tenant Java-based applications for Windows Azure. CloudNinja for Java incorporates Storage Analytics as described here.
Storage Analytics Logging
When a storage account is created, Storage Analytics is disabled by default: operations are not logged, and no hourly rollups are provided. Storage Analytics must be explicitly enabled.
Enabling Storage Analytics Logging for Blob Storage
Enabling Storage Analytics logging is done separately for blobs, tables, and queues (the three storage services). To enable logging for Blob Storage, perform the following simple steps:
1. Retrieve the storage account
2. Create a blob client object referencing the storage account
3. Configure logging properties
4. Upload the logging properties to the Blob Storage service
Retrieve the Storage Account
We first need to retrieve the storage account using the CloudStorageAccount class. The storage account can be retrieved by parsing the connection string with the CloudStorageAccount.parse method. The connection string consists of the default endpoint protocol, the storage account name, and the storage account key.
Here is the sample code for retrieving the cloud storage account.
// Connection string for the storage account (replace the placeholders with your values)
public static final String storageConnectionString =
    "DefaultEndpointsProtocol=http;" +
    "AccountName=your_storage_account;" +
    "AccountKey=your_storage_account_key";

// Parse the connection string to obtain a reference to the storage account
CloudStorageAccount storageAccount = CloudStorageAccount.parse(storageConnectionString);
Create a Blob Client
To access the Blob service, a blob client is required. We use the CloudBlobClient class to get references to blobs and containers, initializing it from the storage account. Here is the sample code for creating a blob client.
// Create a client for the Blob service from the storage account
CloudBlobClient blobClient = storageAccount.createCloudBlobClient();
Configure Logging Properties
To enable logging for the Blob Storage service, we instantiate and configure a LoggingProperties object. Here is an example of how we set the logging properties.
// Configure analytics logging for the Blob service
LoggingProperties logging = new LoggingProperties();
logging.setVersion("1.0");

// Retain log data in the $logs container for one day
logging.setRetentionIntervalInDays(1);

// Log only READ and WRITE operations
EnumSet<LoggingOperations> logOperationTypes =
    EnumSet.of(LoggingOperations.READ, LoggingOperations.WRITE);
logging.setLogOperationTypes(logOperationTypes);
In the above code, the property RetentionIntervalInDays sets a retention policy for the $logs container. According to this policy, the container retains the data for the specified number of days, after which the data is automatically purged.
You can also specify the operations that you want to log. The READ, WRITE, and DELETE operations can be logged. Here, we chose to log the READ and WRITE operations.
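To capture every operation type instead, you can pass the full enum set; a minimal variant of the line above:
// Log all operation types: READ, WRITE, and DELETE
logging.setLogOperationTypes(EnumSet.allOf(LoggingOperations.class));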
Upload the Logging Properties to the Blob Storage Service
After defining the logging properties, they must be added to the service properties, which are then uploaded to the Blob Storage service using blobClient.
The following code sample adds LoggingProperties to ServiceProperties and then uploads ServiceProperties using blobClient.
// Wrap the logging configuration in a ServiceProperties object
ServiceProperties serviceProperties = new ServiceProperties();
serviceProperties.setLogging(logging);

// Upload the configuration to the Blob Storage service
blobClient.uploadServiceProperties(serviceProperties);
After uploading ServiceProperties to the Blob Storage service, Storage Analytics logging is enabled for blob storage and the $logs container is created.
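One caveat: uploading a freshly constructed ServiceProperties object replaces the service's entire analytics configuration, including any metrics settings already in place. A safer pattern, sketched here on the assumption that the SDK's downloadServiceProperties method is available, is to download the current configuration, change only the logging section, and upload the result.
// Download the current configuration so existing settings (e.g., metrics) are preserved
ServiceProperties currentProperties = blobClient.downloadServiceProperties();
currentProperties.setLogging(logging);
blobClient.uploadServiceProperties(currentProperties);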
$logs Blob Container
When Storage Analytics logging is enabled, a blob container named $logs is created in the blob namespace of the storage account. The $logs container contains block blobs. The URL format for accessing the $logs container is http://<storage-account-name>.blob.core.windows.net/$logs.
$logs is a reserved container to which the Windows Azure Storage service writes its logs. The container is not returned by a standard container listing operation and cannot be deleted. The storage account administrator can read and delete the log blobs in the $logs container but cannot update them.
Listing the $logs Blob Container
Within the $logs container, each hour’s logging is stored in its own log file, with the following naming convention:
<service name>/YYYY/MM/DD/hhmm/<Counter>.log
Note that multiple logs may be generated for a given hour, with the Counter value being 0-based and the minute value always set to 00. For example, the first Blob service log for the 5:00 A.M. hour of March 26, 2012 would be named blob/2012/03/26/0500/000000.log. Also note that log data is buffered and flushed after approximately 5 minutes of inactivity or after 4MB have been written. Log files are capped at 150MB.
Log files are generated only for hours in which requests were made. For example, if requests are made only between 8 and 9 A.M. during the period from 8 to 11 A.M., log blobs are created only for the 8 to 9 A.M. hour; none are created for 9 to 11 A.M.
Log entries are added to the blobs for eligible requests made against the service, and only for the operations configured in LoggingProperties. For example, because we configured only the READ and WRITE operations in the Configure Logging Properties section, no log entries will be created for DELETE operations.
To read log entries from the blobs, we first get the URIs of the blobs available in the $logs container. To list the blobs and subsequently get their URIs, we use the listBlobs operation on the $logs container. Here is the sample code to get the URIs of the blobs.
// Get a reference to the reserved $logs container
CloudBlobContainer container = blobClient.getContainerReference("$logs");

// List all blobs in the container, using a flat listing
for (ListBlobItem blobItem : container.listBlobs("", true, null, null, null)) {
    // Get the URI of each log blob
    URI blobUri = blobItem.getUri();
}
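If you are interested only in a particular service or hour, the prefix argument of the same listBlobs call can narrow the listing, per the naming convention above; a minimal sketch (the date and hour are illustrative):
// List only the Blob service logs for the 08:00 hour of March 26, 2012
for (ListBlobItem blobItem : container.listBlobs("blob/2012/03/26/0800", true, null, null, null)) {
    System.out.println(blobItem.getUri());
}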
The blobItem.getUri call returns blob URIs in the following format, which reflects the naming convention described earlier:
http://<storage-account-name>.blob.core.windows.net/$logs/<service-name>/<year>/<month>/<day>/<hour-minute>/<counter>.log
After getting the blob URI, we can read the blob using that URI. The following code uses the blob URI to read the blob.
// Get a reference to the log blob from its URI
CloudBlob blob = container.getBlockBlobReference(blobItem.getUri().toString());

// Download the blob's content into an in-memory stream
ByteArrayOutputStream stream = new ByteArrayOutputStream();
blob.download(stream);

// Read the downloaded log entries line by line
InputStream inputStream = new ByteArrayInputStream(stream.toByteArray());
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(inputStream));
String line;
while ((line = bufferedReader.readLine()) != null) {
    System.out.println(line);
}
The above code downloads the blob content (the log entries) and prints each line of the log. A log entry contains operation details such as the operation type, request start time, request status, HTTP status code, end-to-end latency, server latency, and request and response packet sizes. The format of a log entry is:
<version-number>;<request-start-time>;<operation-type>;<request-status>;<http-status-code>;<end-to-end-latency-in-ms>;<server-latency-in-ms>;<authentication-type>;<requestor-account-name>;<owner-account-name>;<service-type>;<request-url>;<requested-object-key>;<request-id-header>;<operation-count>;<requestor-ip-address>;<request-version-header>;<request-header-size>;<request-packet-size>;<response-header-size>;<response-packet-size>;<response-content-length>;<request-md5>;<server-md5>;<etag-identifier>;<last-modified-time>;<conditions-used>;<user-agent-header>;<referrer-header>;<client-request-id>
Following is a sample log entry for an operation on a blob.
1.0;2012-03-26T05:47:33.7661087Z;PutBlob;Success;201;5;5;authenticated;<requestor-account-name>;<owner-account-name>;blob;"http://<storage-account-name>.blob.core.windows.net/container-name-13/testBlob?timeout=90";"/<storage-account-name>/container-name-13/testBlob";251460ca-a1ff-494c-a6e1-0a4249ee7ec2;0;203.199.147.5:60470;2011-08-18;459;25;225;0;25;;"tq2Xp9kjmUeZN2aI0jPnGA==";"0x8CED92B128A5EBA";Monday, 26-Mar-12 05:47:32 GMT;;"WA-Storage/Client v0.1.2";;
More information about the log format is available in the Storage Analytics Log Format section of the MSDN Library Web site.
Ingress and Egress Computation
Ingress refers to incoming data, and Egress refers to outgoing data. We can determine the total Ingress and Egress of blob storage by parsing the logs. These values can be stored in a SQL Azure table and later retrieved to analyze Blob Storage usage.
Parsing the Logs
We need to write our own parsing logic that splits each log entry on the semicolon (;) delimiter to obtain the request-packet-size and response-packet-size fields. The request-packet-size field provides the Ingress, and the response-packet-size field provides the Egress, of blob storage. After parsing the logs, the total Ingress and Egress can be calculated by summing these sizes across all logged operations.
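Here is a minimal sketch of such parsing logic, building on the line-reading loop shown earlier. It uses a naive split on semicolons, which works for typical entries; quoted fields such as the request URL can themselves contain semicolons, so a production parser should honor the quoting. Per the log format above, request-packet-size is the 19th field and response-packet-size the 21st (0-based indices 18 and 20); logLines is a hypothetical collection of the lines read from the $logs blobs.
// Accumulate total ingress (bytes in) and egress (bytes out) from log entries
long totalIngress = 0;
long totalEgress = 0;
for (String entry : logLines) {
    String[] fields = entry.split(";");
    if (fields.length > 20) {              // skip malformed or truncated entries
        if (!fields[18].isEmpty()) {       // field 18: request-packet-size (Ingress)
            totalIngress += Long.parseLong(fields[18]);
        }
        if (!fields[20].isEmpty()) {       // field 20: response-packet-size (Egress)
            totalEgress += Long.parseLong(fields[20]);
        }
    }
}
System.out.println("Total Ingress (bytes): " + totalIngress);
System.out.println("Total Egress (bytes): " + totalEgress);
These totals can then be written to the SQL Azure table mentioned above, keyed by tenant and time period, so that usage can be charged back per tenant.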
Cost of Enabling Storage Analytics Logging
Enabling Storage Analytics logging generates log data, and writing that data creates blobs, which incurs billable storage transactions; so do the Delete operations that later remove it. The storage consumed by the analytics data also counts toward total billable storage usage. However, if you configure a data retention policy, you are not charged for the Delete transactions issued when Storage Analytics purges old log data.
Impact on production performance
Enabling Storage Analytics logging results in some additional storage transactions against Blob Storage. However, since logging data is buffered, the number of storage transactions is fairly low and shouldn’t cause any noticeable performance impact.
This walkthrough focused on enabling Storage Analytics logging specifically for Blob Storage, but the same techniques apply to Tables and Queues.