WUSM Data Lake Pricing Guide
When using the WUSM Data Lake for your data exploration and analysis needs, it is important to be aware of the associated costs. The costs are broken down into two categories: storage and compute.
Storage Costs
Storage costs are small and far more predictable than compute costs. In most cases, storage is charged on actual usage at a rate of $25/TB per month. We also typically see the overall size of the data shrink during ingestion into data lake storage, because the data is converted to a more compressed format.
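As a rough illustration of the storage rate above, the following Python sketch estimates a monthly storage charge. The function name and the example volume are hypothetical; only the $25/TB rate comes from this guide, and the real bill depends on the compressed size actually stored.

```python
STORAGE_RATE_PER_TB_MONTH = 25.00  # USD per TB per month, as quoted in this guide


def monthly_storage_cost(stored_tb: float,
                         rate_per_tb: float = STORAGE_RATE_PER_TB_MONTH) -> float:
    """Return the estimated monthly storage charge in USD.

    stored_tb is the size of the data as stored in the data lake
    (after compression), not the size of the original source data.
    """
    if stored_tb < 0:
        raise ValueError("stored_tb must be non-negative")
    return round(stored_tb * rate_per_tb, 2)


print(monthly_storage_cost(2.5))  # 62.5
```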
Compute Costs
Compute costs are determined based on cluster and SQL warehouse usage. The rates for this usage are determined by the configuration and size of the resources. The following estimates are based on our standard configurations.
Both regular clusters and SQL warehouses average $4.50 - $5.00 per hour, with DBU (Databricks Unit) charges included. All clusters are serverless and “turn off” when not in use, so costs are lower than you would see with a traditional, always-on database. Assuming 4 hours of usage per workday, you can expect the total cost to be ~$400/month.
Here is a breakdown of the pricing for regular clusters and SQL warehouses:
Regular Clusters
- General compute costs $0.45 per DBU per hour, with an average consumption of 6 DBUs ($2.70/hour)
- Machines cost approximately $0.75/hour each (using F16s) x 3 nodes ($2.25/hour)
- Total cost for regular clusters: around $4.95/hour
SQL Warehouse
- We use the “small” warehouse size, whose machines cost $0.57/hour each x 4 nodes ($2.28/hour)
- A “small” SQL Warehouse consumes 12 DBUs, at $0.22 per DBU per hour ($2.64/hour)
- Total cost for SQL warehouses: around $4.92/hour
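The arithmetic behind the two hourly totals, and behind the ~$400/month estimate, can be sketched in Python as follows. The rates are copied from the breakdown above; the function names and the figure of 20 workdays per month are assumptions made for illustration, not part of the official pricing.

```python
# Rates copied from the breakdown above; they are estimates and may change.
REGULAR = {"dbu_count": 6, "dbu_rate": 0.45, "node_count": 3, "node_rate": 0.75}
SQL_WAREHOUSE = {"dbu_count": 12, "dbu_rate": 0.22, "node_count": 4, "node_rate": 0.57}


def hourly_cost(dbu_count, dbu_rate, node_count, node_rate):
    """Hourly cost in USD: DBU charges plus machine charges."""
    return round(dbu_count * dbu_rate + node_count * node_rate, 2)


def monthly_cost(hourly, hours_per_workday=4, workdays_per_month=20):
    """Monthly cost in USD.

    workdays_per_month=20 is an assumption, not a figure from this guide.
    """
    return round(hourly * hours_per_workday * workdays_per_month, 2)


regular_hourly = hourly_cost(**REGULAR)
sql_hourly = hourly_cost(**SQL_WAREHOUSE)

print(regular_hourly, monthly_cost(regular_hourly))  # 4.95 396.0
print(sql_hourly, monthly_cost(sql_hourly))          # 4.92 393.6
```

Changing hours_per_workday shows how sensitive the monthly bill is to usage: because the clusters shut down when idle, hours of active use drive the compute cost.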
Please note that these prices are subject to change and may vary depending on your specific usage patterns and requirements. For more information on pricing for the WUSM Data Lake, please feel free to contact the WUSM Data Lake team.
More information can be found here: Azure Databricks Pricing
OpenAI Costs
With OpenAI fully integrated into Azure, it is easy to deploy WUSM unit-specific endpoints for popular models such as GPT, Text Similarity, and Text Embeddings. The price for using OpenAI differs by model and is generally quoted per 1,000 tokens. Understanding what a token is can be challenging, but THIS article helps to break it down a bit. For GPT-3.5 (the most popular model), the price per 1,000 tokens is $0.002. For up-to-date pricing, please visit the Azure OpenAI site HERE. While it is difficult to predict an average monthly OpenAI bill, we have found that light users average $50-100/month and heavy users average $150-250/month.
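To make the per-1,000-token pricing concrete, here is a minimal Python sketch that converts a token count into a cost at the GPT-3.5 rate quoted above. The function name is hypothetical, and the sketch assumes a single flat rate applied to all tokens in a request (prompt and response combined); check the current Azure pricing for the model you deploy.

```python
GPT35_PRICE_PER_1K_TOKENS = 0.002  # USD per 1,000 tokens, as quoted in this guide


def token_cost(total_tokens: int,
               price_per_1k: float = GPT35_PRICE_PER_1K_TOKENS) -> float:
    """Return the estimated cost in USD for a given number of tokens.

    Assumes one flat rate for all tokens (an assumption for illustration).
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return round(total_tokens / 1000 * price_per_1k, 4)


print(token_cost(1_500))      # 0.003
print(token_cost(1_000_000))  # 2.0
```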
*Please note: The prices outlined in this document are subject to change annually.