
AI servers drive storage demand not simply because “there are more servers,” but because training, inference, multimodal data, RAG retrieval, and enterprise private data integration all make data move frequently across GPUs, CPUs, memory, SSDs, and HDDs. HBM solves the bandwidth bottleneck on the GPU side. Server DRAM supports system-side workloads. Enterprise SSDs handle hot data and inference cache. HDDs provide capacity for data lakes and archives. If you follow AI infrastructure or the storage supply chain, you need to look at speed, capacity, cost, power consumption, and supply cycles at the same time.

The core reason AI servers drive storage demand is that AI workloads have evolved from purely “compute-intensive” to “data-intensive + memory-intensive + storage-intensive.” Training requires continuous reading of large datasets. Inference requires loading model weights and retaining context states. Multimodal models process images, audio, video, and logs. Adding more GPUs alone is not enough. Data must move across HBM, DRAM, SSDs, and HDDs in layers so that compute resources can keep working efficiently.
Traditional servers often face bottlenecks in CPUs, networks, or disk I/O. AI server bottlenecks are more complex. GPUs can perform massive matrix operations in extremely short periods, but if model weights, activations, KV cache, training samples, or checkpoints cannot reach the compute units in time, GPUs will wait for data. You can think of an AI data center as a layered funnel: HBM closest to the GPU is the fastest and most expensive layer; CPU-side DRAM offers larger capacity but lower bandwidth; NVMe SSDs handle hot data and low-latency reads/writes; HDDs provide large-scale, low-cost capacity pools.
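
The "GPUs will wait for data" point can be sketched as a simple feed-rate bound. This is a minimal illustration, assuming the data pipeline is the only limit; the function name and the job numbers are hypothetical, not measurements from any real cluster.

```python
def gpu_feed_utilization(samples_per_s: float, bytes_per_sample: float,
                         storage_bytes_per_s: float) -> float:
    """Upper bound on GPU utilization when data delivery is the only limit."""
    # Bytes per second the GPUs could consume if they never waited.
    required = samples_per_s * bytes_per_sample
    return min(1.0, storage_bytes_per_s / required)


# Hypothetical job: GPUs can consume 20,000 samples/s of 500 KB each (10 GB/s),
# but the storage tier delivers only 4 GB/s.
print(gpu_feed_utilization(20_000, 500_000, 4e9))  # 0.4
```

Under these assumed numbers the GPUs could be busy at most 40% of the time, which is why the lower layers of the funnel matter even when the GPUs themselves are fast.
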
NVIDIA’s GB300 NVL72 specifications illustrate this trend. A rack-scale system integrates 72 Blackwell Ultra GPUs and 36 Grace CPUs, with 37TB of fast memory (20TB of GPU memory plus 17TB of CPU memory) within the same architecture. In other words, an AI server is no longer just “GPUs plugged into a motherboard.” It is a rack-scale data system where GPUs, CPUs, memory, interconnects, and storage are designed together.
Different AI scenarios correspond to different storage layers:
| AI Scenario | Main Data | Key Metrics | Main Storage Layer |
|---|---|---|---|
| Large model training | Training sets, parameters, activations | Bandwidth, throughput, stability | HBM, DRAM, SSD, HDD |
| Model fine-tuning | Private data, checkpoints | Read/write speed, recovery speed | SSD, DRAM, HDD |
| Online inference | Model weights, KV cache | Latency, concurrency, capacity | HBM, DRAM, SSD |
| RAG retrieval | Vector indexes, document libraries | IOPS, query latency | Enterprise SSD, DRAM |
| Multimodal AI | Images, videos, audio | Capacity, throughput, cost | SSD, HDD |
| Backup and archiving | Historical versions, logs, compliance data | $/TB, reliability | HDD, object storage |
In this chain, HBM handles the “fastest data,” server DRAM handles “system-side active data,” enterprise SSDs handle “frequent reads/writes and hot data,” and HDDs handle “long-term large-capacity data.” Therefore, AI servers do not benefit only one storage category. They reorganize the entire storage pyramid.
Summary: AI servers drive storage demand not because of server count alone, but because AI workloads simultaneously expand data scale, access frequency, context length, and system concurrency. Training requires high-throughput data input. Inference requires low-latency caching and longer context. Multimodal models require more raw materials to be stored. Enterprise AI also brings private data into model systems. HBM, server DRAM, enterprise SSDs, and HDDs each solve problems at different layers. The closer the layer is to the GPU, the more bandwidth and latency matter. The closer the layer is to the data lake, the more capacity, cost, and reliability matter. When analyzing AI storage demand, you should separate speed-driven demand from capacity-driven demand, rather than assuming one type of storage will replace all others.

The role of HBM in AI servers is to solve the GPU-side “bandwidth wall” and “memory capacity wall.” The larger an AI model becomes, the more space its parameters, activations, and KV cache require. GPUs need to read massive amounts of data in extremely short periods. HBM sits close to the GPU package and offers much higher bandwidth than ordinary server DRAM, so it directly affects training throughput, inference concurrency, long-context capability, and cost per token.
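
One way to see the bandwidth wall is a rough upper bound on single-sequence decode speed: each generated token has to read the model weights from HBM at least once. The sketch below assumes a hypothetical 70B-parameter model in 16-bit precision and an assumed 8 TB/s of HBM bandwidth; it ignores KV cache reads, batching, and multi-GPU sharding, so treat it as an order-of-magnitude estimate only.

```python
def decode_tokens_per_s_bound(params: float, bytes_per_param: float,
                              hbm_bytes_per_s: float) -> float:
    """Bandwidth-bound ceiling on tokens/s for one sequence (batch size 1)."""
    weight_bytes = params * bytes_per_param  # bytes read per generated token
    return hbm_bytes_per_s / weight_bytes


# Assumed values: 70e9 parameters, 2 bytes each, 8e12 bytes/s of HBM bandwidth.
bound = decode_tokens_per_s_bound(70e9, 2, 8e12)
print(f"{bound:.1f} tokens/s")  # 57.1 tokens/s
```

Doubling bandwidth doubles this ceiling, while adding compute alone does not move it, which is the sense in which HBM affects cost per token.
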
You can think of HBM as the “high-speed workbench” of an AI GPU. If the workbench is too small, model weights and context cannot fit. If data movement is too slow, GPU compute resources are left idle. When introducing Blackwell Ultra, NVIDIA noted that a single GPU can be configured with 288GB of HBM3e for larger models, longer contexts, and inference workloads. This shows that HBM upgrades are not only about capacity expansion. Capacity, bandwidth, energy efficiency, and packaging capability are improving together.
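
As a quick consistency check, the 288GB per-GPU figure lines up with the roughly 20TB of GPU memory quoted for the 72-GPU GB300 NVL72 rack. The helper name below is hypothetical, and decimal units (1TB = 1000GB) are assumed.

```python
def rack_hbm_tb(gpus: int, hbm_per_gpu_gb: float) -> float:
    """Total HBM in a rack, in decimal terabytes."""
    return gpus * hbm_per_gpu_gb / 1000


print(rack_hbm_tb(72, 288))  # 20.736
```
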
HBM also has higher supply barriers than ordinary DRAM. It requires multi-layer stacking, TSV, advanced packaging, yield control, and customer qualification. It cannot be quickly replaced by ordinary memory production lines. Micron announced that its HBM4 36GB 12-high for NVIDIA Vera Rubin entered high-volume production, with per-stack bandwidth above 2.8TB/s and improved power efficiency. This indicates that HBM has moved from being a “high-end memory product” to becoming part of AI platform roadmaps.
The division of labor between HBM and other memory types can be understood as follows:
| Type | Location | Strengths | Limitations | AI Scenarios |
|---|---|---|---|---|
| HBM | Near the GPU package | Extremely high bandwidth, low latency | High cost, tight supply | Training, inference, KV cache |
| Server DRAM | CPU and system memory | Larger capacity, general-purpose | Lower bandwidth than HBM | Scheduling, caching, databases |
| LPDDR/CXL memory | Expanded memory layer | Better power efficiency or scalability | Ecosystem still evolving | Inference expansion, memory pooling |
| SSD | Local or network storage | Larger capacity, lower cost | Higher latency than memory | Checkpointing, hot data |
HBM demand also feeds back into the ordinary DRAM market. In its discussion of AI server demand, TrendForce noted that DRAM suppliers continue to shift capacity toward HBM and server applications, while contract prices for traditional DRAM and NAND Flash are also influenced by tight supply and demand. This means HBM is not just a standalone product. It changes wafer capacity allocation, capital expenditure, and customer lock-in strategies.
For investors, tracking HBM should not stop at “how much prices have increased.” Several variables matter: GPU platform roadmaps, HBM capacity per GPU, supplier share, capacity ramp-up, and downstream customer lock-in.
Summary: HBM is the storage category most directly tied to GPU computing in AI servers. It solves GPU-side bandwidth, capacity, and energy-efficiency problems. Large model training, inference concurrency, long-context workloads, and agentic AI all increase pressure on HBM. But HBM’s value does not come only from “strong demand.” It also comes from process technology, packaging, yield, customer qualification, and long-term supply relationships. HBM does not simply replace ordinary DRAM. It sits in a high-speed layer closer to the GPU. When analyzing the HBM supply chain, you need to track GPU platform roadmaps, HBM capacity per card, supplier share, capacity ramp-up, and downstream customer lock-in.

Server DRAM is driven by AI because a large number of system-side tasks outside GPUs still require CPUs and host memory. HBM handles GPU-local computation, but data preprocessing, task scheduling, databases, vector retrieval, network protocol stacks, inference service frameworks, virtualization, and container management all rely on server DRAM. The more AI inference spreads, the more likely general server memory configurations are to move higher.
A common misunderstanding is that server DRAM may become less important because HBM is closer to the GPU. The opposite is true. AI clusters are heterogeneous systems. GPUs do not complete every task independently. Model services need CPUs to handle request distribution, token scheduling, cache management, logging, and security policies. RAG systems also call vector databases and enterprise document libraries. Private cloud deployments add containers, monitoring, permissions, and data governance. All these tasks require higher-capacity, higher-performance, and lower-power server memory.
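
To make the RAG memory pressure concrete, here is a rough footprint estimate for an uncompressed (flat) vector index held in host DRAM. The corpus size and embedding dimension are hypothetical, and real indexes add overhead for graph links or metadata that this sketch does not model.

```python
def flat_index_gb(num_vectors: int, dim: int, bytes_per_value: int = 4) -> float:
    """Raw size of a flat float32 vector index, in decimal gigabytes."""
    return num_vectors * dim * bytes_per_value / 1e9


# Assumed corpus: 100 million chunks embedded into 768-dimensional float32 vectors.
print(flat_index_gb(100_000_000, 768))  # 307.2
```

Even this modest assumed corpus needs about 300GB before any index overhead, which is system memory that HBM does not provide.
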
In its third-quarter 2025 results, SK hynix noted that strong sales of HBM and high-performance server products supported quarterly performance, and stated that supply discussions for next year’s HBM had been completed. This type of commentary shows that server-side demand is not concentrated only in HBM. It also includes high-capacity DDR5, enterprise SSDs, and the broader memory portfolio for AI servers.
The main demand scenarios for server DRAM include:
| Scenario | DRAM Role | Demand Change |
|---|---|---|
| AI training preprocessing | Cleaning, splitting, and caching training data | Larger datasets increase memory pressure |
| Online inference services | Request scheduling, batching, context management | Higher concurrency requires higher memory configurations |
| RAG and vector databases | Index caching, query acceleration | Enterprise data integration increases memory demand |
| Multi-tenant cloud services | Containers, virtualization, isolation | Higher capacity and stability are required |
| CPU-GPU coordination | Data movement, protocol stacks, service orchestration | System-side memory pressure rises |
CXL, memory pooling, and larger-capacity modules are also worth watching. The significance of CXL is that it can free memory expansion from single-server motherboard limitations, allowing data centers to configure memory resources more flexibly. It will not immediately replace traditional DRAM, but it may change server memory procurement structures. Some high-frequency workloads will continue to use local DRAM, while some capacity-expansion workloads may use CXL memory pools or new module types.
The investment logic for server DRAM differs from that of HBM. HBM is a high-barrier, high-ASP, customer-concentrated product. Server DRAM has broader exposure and is influenced by AI inference, general-purpose servers, enterprise IT spending, and cloud capital expenditure together. Its elasticity may not be as extreme as HBM’s, but its durability depends more on whether AI inference can truly spread from a few hyperscale model platforms to enterprise applications.
Summary: Server DRAM is driven by AI because AI servers are not powered by GPUs alone. CPU-side scheduling, data preprocessing, vector retrieval, cache management, inference service frameworks, and enterprise private cloud deployments all require larger and higher-performance system memory. HBM solves GPU-side bandwidth, while server DRAM supports system-side workloads. They divide responsibilities rather than replace each other. When assessing server DRAM demand, you should look beyond training clusters and also track inference service deployment, RAG applications, enterprise AI privatization, cloud server refresh cycles, and new architectures such as CXL.
The core role of enterprise SSDs in AI servers is to handle frequent reads/writes, hot data, model loading, checkpointing, vector retrieval, and some inference cache workloads. They are not as close to GPUs as HBM, and they do not focus on low-cost capacity like HDDs. Instead, they occupy a critical layer between speed and capacity. The more AI inference, RAG, and agentic AI develop, the more important enterprise SSDs become in the data path.
During training, SSDs are often used to read training samples, write checkpoints, store intermediate results, and support task recovery. Large training jobs run for long periods. Once interrupted, they need to resume quickly from checkpoints. If the storage system writes too slowly, it may not only affect recovery but also slow training progress. During inference, the role of SSDs expands further: model weight loading, embedding indexes, vector databases, user context, and retrieval result caching may all rely on low-latency NVMe SSDs.
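
The effect of write speed on checkpointing can be estimated with simple arithmetic. The sketch below assumes a hypothetical 70B-parameter model, about 14 bytes of saved state per parameter (16-bit weights plus 32-bit master weights and two optimizer moments, a common but not universal layout), and an assumed aggregate write bandwidth.

```python
def checkpoint_write_seconds(params: float, bytes_per_param: float,
                             write_bytes_per_s: float) -> float:
    """Time to write one full checkpoint at a sustained aggregate bandwidth."""
    return params * bytes_per_param / write_bytes_per_s


# Assumed: 70e9 parameters, 14 bytes of state each, 10 GB/s aggregate SSD writes.
print(checkpoint_write_seconds(70e9, 14, 10e9))  # 98.0
```

If the job pauses while the checkpoint is written, that time is lost on every save, so slower storage either lengthens training or forces less frequent checkpoints.
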
In “AI to Reshape the Global Technology Landscape,” TrendForce noted that QLC SSDs are being used in warm and cold AI data storage layers, such as model checkpointing and dataset archiving, and expected QLC SSDs to account for about 30% of the enterprise SSD market by 2026. This shows that enterprise SSD growth comes not only from high-performance TLC products, but also from QLC products that emphasize capacity cost.
KV cache is another area worth watching. Long-context inference and multi-turn conversations can make KV cache grow quickly, and GPU HBM and host DRAM may not be enough. In its observations on the enterprise SSD market, TrendForce noted that AI Agent services and CSP procurement demand drove enterprise SSD revenue to a record high, and mentioned that DRAM cost and capacity constraints are pushing the market to include high-performance SSDs in the memory hierarchy. Academic research is also exploring SSD-backed KV cache. For example, Tutti introduces NVMe SSDs into long-context LLM serving to relieve HBM and DRAM capacity constraints.
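
How quickly KV cache grows can be estimated from the model shape. The sketch below uses the standard size formula for a transformer (keys and values, per layer, per KV head, per token); the specific configuration is a hypothetical example, not a description of any particular model.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bytes_per_value: int = 2) -> int:
    """KV cache size: 2 tensors (K and V) per layer, per KV head, per token."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_value


# Assumed config: 80 layers, 8 KV heads, head_dim 128, 128,000-token context, 16-bit values.
one_seq = kv_cache_bytes(80, 8, 128, 128_000, batch=1)
eight_seq = kv_cache_bytes(80, 8, 128, 128_000, batch=8)
print(f"{one_seq / 1e9:.1f} GB")    # 41.9 GB
print(f"{eight_seq / 1e9:.1f} GB")  # 335.5 GB
```

Under these assumptions, eight concurrent long-context sequences already exceed 288GB before counting model weights, which is why spilling KV cache to DRAM or NVMe SSDs is being explored.
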
Enterprise SSDs also need to be divided into layers:
| SSD Type | Strengths | Better-Fit Scenarios | Notes |
|---|---|---|---|
| High-performance TLC SSD | Low latency, high endurance | Online databases, RAG, hot data | Higher cost |
| QLC SSD | Larger capacity, lower cost | Checkpointing, datasets, warm data | Write endurance should be evaluated |
| SCM / XL-FLASH-type SSD | Lower latency | KV cache, GPU direct storage | Ecosystem and cost still evolving |
| Consumer SSD | Low cost | Non-critical local tasks | Not suitable for intensive enterprise workloads |
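
The note that QLC write endurance should be evaluated can be turned into a quick check using drive writes per day (DWPD) and rated terabytes written (TBW). The workload and drive numbers below are hypothetical; actual ratings come from the vendor datasheet.

```python
def required_dwpd(daily_writes_tb: float, capacity_tb: float) -> float:
    """Drive writes per day that a workload imposes on one drive."""
    return daily_writes_tb / capacity_tb


def rated_life_years(rated_tbw_tb: float, daily_writes_tb: float) -> float:
    """Years until the rated TBW is consumed at a constant daily write volume."""
    return rated_tbw_tb / (daily_writes_tb * 365)


# Assumed: 10 TB of checkpoint writes per day on a 20 TB drive rated for 18,250 TBW.
print(required_dwpd(10, 20))        # 0.5
print(rated_life_years(18_250, 10)) # 5.0
```

If the required DWPD exceeds the drive's rating, the workload is a better fit for a higher-endurance TLC product or should be spread across more drives.
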
When analyzing enterprise SSD demand, you need to distinguish “performance-driven demand” from “capacity-driven demand.” Performance-driven demand comes from low-latency inference, vector databases, and caching. Capacity-driven demand comes from training sets, checkpointing, multimodal materials, and enterprise data lakes. Both types of demand are driven by AI, but they correspond to different products, margins, and suppliers.
Summary: Enterprise SSDs are one of the most underestimated layers in AI storage. They are not as directly tied to GPUs as HBM, and their capacity economics are not as obvious as HDDs, but they handle hot data, checkpointing, model loading, RAG retrieval, and some KV cache workloads. As AI moves from training to large-scale inference, SSDs are shifting from “data storage devices” to part of the inference service path. Analysis should separate TLC, QLC, SCM, and nearline SSDs because they correspond to different needs: low latency, high endurance, capacity cost, and new memory hierarchy expansion.
HDDs remain the capacity foundation of AI data centers because not all data needs to be stored in SSDs or memory. Raw training data, images, videos, audio, logs, backups, compliance records, and historical versions often belong to warm data or cold data. For this data, $/TB, reliability, and large-scale deployment cost matter more than access speed. The more data AI produces, the clearer the value of HDDs as low-cost capacity becomes.
The AI era has not made HDDs disappear. Instead, it has clarified their role. SSDs handle hot data and low-latency access. HDDs handle massive data lakes and long-term storage. This is especially true for multimodal models and enterprise private data integration, which generate large amounts of unstructured data. If all data were stored on SSDs, cost, power consumption, and rack density would become major constraints.
Western Digital’s “The Long-Term Case for HDD Storage” emphasizes that data centers consider TCO, acquisition cost, power consumption, density, performance, and lifecycle when choosing storage, and cites IDC’s view that SSDs carry a 5–10x $/TB premium over HDDs. This gap explains why hyperscale data centers may buy large volumes of SSDs while still keeping HDDs in the capacity layer.
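
The economics of tiering follow directly from that premium. The sketch below compares acquisition cost for an all-SSD pool against a tiered pool, using the cited 5x and 10x premiums; the HDD price per TB, the pool size, and the hot-data fraction are illustrative assumptions, and power and density are not modeled.

```python
def capacity_cost(total_tb: float, hot_fraction: float,
                  hdd_usd_per_tb: float, ssd_premium: float) -> float:
    """Acquisition cost when the hot fraction sits on SSD and the rest on HDD."""
    ssd_usd_per_tb = hdd_usd_per_tb * ssd_premium
    hot_tb = total_tb * hot_fraction
    cold_tb = total_tb - hot_tb
    return hot_tb * ssd_usd_per_tb + cold_tb * hdd_usd_per_tb


# Assumed: 1,000 TB pool, 10% hot data, HDD at a hypothetical $15 per TB.
for premium in (5, 10):
    tiered = capacity_cost(1000, 0.1, 15, premium)
    all_ssd = capacity_cost(1000, 1.0, 15, premium)
    print(f"premium {premium}x: tiered ${tiered:,.0f} vs all-SSD ${all_ssd:,.0f}")
```

With these assumed inputs the all-SSD pool costs several times more than the tiered one at either premium, which matches the article's point that the capacity layer stays on HDDs.
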
Seagate also connected high-capacity drives with AI data centers, data sovereignty, and hybrid data center investment when announcing its 30TB drives. High-capacity nearline HDDs, HAMR, and UltraSMR all pursue the same goal: to increase usable capacity density and reduce long-term data storage cost under the same rack, power, and maintenance constraints.
The division of labor between SSDs and HDDs can be understood as follows:
| Dimension | Enterprise SSD | Enterprise HDD |
|---|---|---|
| Core strength | Low latency, high IOPS | Low $/TB, high capacity |
| Main data | Hot data, indexes, cache | Data lakes, backups, archives |
| AI scenarios | Inference, RAG, checkpointing | Training materials, multimodal data |
| Cost sensitivity | Medium to high | Extremely high |
| Replacement relationship | Replaces some high-frequency HDD scenarios | Retains the large-capacity base role |
The risks for HDDs should also be clear. As QLC SSD capacity grows, it may replace some nearline HDD scenarios. If AI data centers place greater emphasis on power consumption and read speed, SSDs may take over part of the warm data layer. But as long as AI continues to generate massive unstructured data, HDDs still have hard-to-replace economics.
Summary: The value of HDDs in AI data centers is not speed, but the economics of large-scale capacity. AI training, multimodal data, enterprise data lakes, logs, backups, and compliance archives all generate data that does not require real-time access but must be stored for long periods. SSDs will expand in hot data and some warm data scenarios, and QLC SSDs may also erode some HDD use cases, but HDDs still serve as the capacity foundation. When analyzing the HDD supply chain, focus on hyperscale customer orders, high-capacity nearline products, HAMR progress, long-term purchasing agreements, inventory levels, and cost per unit of capacity.
To judge AI storage demand, you need to separate long-term data growth from short-term pricing cycles. In the long run, training data, multimodal content, inference cache, enterprise private data, and data sovereignty all increase storage capacity needs. In the short run, storage prices may be affected by capacity shifts, early customer lock-ins, inventory corrections, and capital expenditure cycles. Real demand exists, but that does not mean prices and stock prices will rise linearly.
The storage industry is naturally cyclical. DRAM, NAND, and HDDs have all gone through cycles of price increases, capacity expansion, inventory buildup, and price declines. AI changes demand structure and customer lock-in strength, but it does not eliminate cycles. When reporting on rising capital expenditure by technology giants for data centers and AI infrastructure, Reuters noted that data storage firms benefited from AI data demand, and that Western Digital and Seagate saw improved orders and market expectations. But if cloud capex slows or customers overstock early, inventory adjustments may still follow.
You can track AI storage demand using the following checklist:
| Observation Area | Key Indicators | Related Categories |
|---|---|---|
| Cloud capex | Data centers, GPU clusters, rack-scale systems | HBM, DRAM, SSD, HDD |
| GPU platform roadmaps | HBM capacity per GPU, bandwidth, delivery pace | HBM |
| Inference deployment | Long context, concurrency, RAG applications | DRAM, SSD |
| Enterprise AI | Private data integration, local deployment, compliance storage | SSD, HDD |
| Storage pricing | DRAM, NAND, enterprise SSD contract prices | DRAM, SSD |
| HDD orders | Nearline HDD, long-term purchase agreements | HDD |
| Inventory levels | Customer, channel, and supplier inventory | All categories |
A trading-cost perspective also matters. If you follow AI storage supply chain stocks, such as memory, hard drive, semiconductor equipment, server, and data center companies, you need to study not only financial results and orders but also actual transaction costs. U.S. stock trading costs usually include more than commissions. They may also include platform fees, external agency fees, trading activity fees, and other charges. Biya’s published U.S. stock trading fee schedule states that the U.S. stock trading commission is $0, while platform fees, external agency fees, and other charges are subject to the fee center and order page. Availability of relevant services depends on the user’s location, identity verification results, platform rules, and applicable laws and regulations. Public market information and fee structures do not constitute investment advice.
Risk factors are equally important. If AI capex is front-loaded too aggressively, storage suppliers may deliver strong short-term results but face slower orders later. If HBM capacity expands faster than demand, pricing elasticity may weaken. QLC SSDs may replace some HDD scenarios. CXL and memory pooling may change server DRAM procurement structures. Long-term customer agreements can reduce volatility, but they may also limit further upside in pricing.
Summary: AI storage demand is both a long-term trend and a short-term cycle. The long-term trend comes from data scale, inference deployment, multimodal content, enterprise AI, and data sovereignty. The short-term cycle comes from capacity allocation, pricing pace, inventory levels, customer lock-ins, and capital expenditure volatility. For investors, the most important point is not to equate “demand growth” with “all storage companies will continue to benefit.” Different companies have different exposure to HBM, DRAM, NAND, enterprise SSDs, HDDs, and equipment. Their profit elasticity, valuation levels, and cycle risks also differ. A more reliable analysis method is to track product categories, customers, capacity, orders, and prices separately.
If you follow the AI server supply chain, storage should be placed within a broader observation framework. Upstream, you can track HBM, DRAM, NAND, HDDs, and advanced packaging. Midstream, you can look at servers, switches, liquid cooling, power systems, and data center construction. Downstream, you can monitor cloud capex, enterprise AI deployment, and inference usage. You can also use a U.S. stock information search tool to track related company quotes, earnings dates, and basic information, and use Biya to record multi-asset transactions, FX costs, and billing details. If the service is available in your region, you should review platform rules, order fees, and local regulatory requirements before use. The storage supply chain can be volatile, and any trading decision should be based on independent judgment and personal risk tolerance.
AI servers need both HBM and server DRAM because they serve different computing layers. HBM sits close to the GPU and supports high-speed access to model weights, activations, and KV cache. Server DRAM sits closer to the CPU and supports scheduling, caching, databases, inference service frameworks, and system management. They complement rather than replace each other.
Enterprise SSDs in AI inference mainly support model loading, hot data reads, vector retrieval, RAG data access, checkpointing, and some KV cache scenarios. Low latency, high IOPS, and stable write performance can affect inference service quality, but the exact configuration depends on model size, concurrency, context length, and system architecture.
HDDs have not been fully replaced by SSDs because AI data centers still need a low-cost, high-capacity storage foundation. SSDs are better for hot data and low-latency tasks, while HDDs are better for data lakes, training materials, logs, backups, and archives. In layered storage architectures, the two are complementary rather than purely substitutive.
QLC SSDs are better suited for capacity-sensitive AI data center scenarios with moderate access frequency, such as model checkpointing, dataset archiving, warm data, some object storage, and nearline data layers. High-write, high-endurance, and low-latency workloads still require evaluation of TLC SSDs, SCM SSDs, or other enterprise-grade options.
Investors can track cloud capex, AI server shipments, HBM supply agreements, DRAM/NAND contract prices, enterprise SSD revenue, high-capacity HDD orders, and inventory levels. A single price-increase headline is not enough to judge a long-term trend. Financial results, capacity, customer mix, and valuation risk should be analyzed together.
AI storage demand growth does not necessarily benefit all storage companies. Different companies have different exposure to HBM, server DRAM, NAND, enterprise SSDs, HDDs, controllers, and equipment. Profit elasticity also varies. Customer concentration, capacity expansion, cost control, inventory position, and current valuation all need to be considered.
*This article is provided for general information purposes and does not constitute legal, tax or other professional advice from BiyaPay or its subsidiaries and affiliates, and it is not intended as a substitute for obtaining advice from a financial advisor or any other professional.
We make no representations or warranties, express or implied, as to the accuracy, completeness or timeliness of the contents of this publication.



