---
title: Storage
sort_rank: 5
---
# Storage
Prometheus includes a local on-disk time series database, but also optionally integrates with remote storage systems.
## Local storage
Prometheus's local time series database stores data in a custom, highly efficient format on local storage.
### On-disk layout
Ingested samples are grouped into blocks of two hours. Each two-hour block consists
of a directory containing a chunks subdirectory containing all the time series samples
for that window of time, a metadata file, and an index file (which indexes metric names
and labels to time series in the chunks directory). The samples in the chunks directory
are grouped together into one or more segment files of up to 512MB each by default. When
series are deleted via the API, deletion records are stored in separate tombstone files
(instead of deleting the data immediately from the chunk segments).
The current block for incoming samples is kept in memory and is not fully
persisted. It is secured against crashes by a write-ahead log (WAL) that can be
replayed when the Prometheus server restarts. Write-ahead log files are stored
in the `wal` directory in 128MB segments. These files contain raw data that
has not yet been compacted; thus they are significantly larger than regular block
files. Prometheus will retain a minimum of three write-ahead log files.
High-traffic servers may retain more than three WAL files in order to keep at
least two hours of raw data.
A Prometheus server's data directory looks something like this:
```
./data
├── 01BKGV7JBM69T2G1BGBGM6KB12
│ └── meta.json
├── 01BKGTZQ1SYQJTR4PB43C8PD98
│ ├── chunks
│ │ └── 000001
│ ├── tombstones
│ ├── index
│ └── meta.json
├── 01BKGTZQ1HHWHV8FBJXW1Y3W0K
│ └── meta.json
├── 01BKGV7JC0RY8A6MACW02A2PJD
│ ├── chunks
│ │ └── 000001
│ ├── tombstones
│ ├── index
│ └── meta.json
├── chunks_head
│ └── 000001
└── wal
├── 000000002
└── checkpoint.00000001
└── 00000000
```
Note that a limitation of local storage is that it is not clustered or
replicated. Thus, it is not arbitrarily scalable or durable in the face of
drive or node outages and should be managed like any other single node
database.
[Snapshots](querying/api.md#snapshot) are recommended for backups. Backups
made without snapshots run the risk of losing data that was recorded since
the last WAL sync, which typically happens every two hours. With proper
architecture, it is possible to retain years of data in local storage.
Alternatively, external storage may be used via the
[remote read/write APIs](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage).
Careful evaluation is required for these systems as they vary greatly in durability,
performance, and efficiency.
For further details on file format, see [TSDB format](/tsdb/docs/format/README.md).
## Compaction
The initial two-hour blocks are eventually compacted into longer blocks in the background.
Compaction will create larger blocks containing data spanning up to 10% of the retention time,
or 31 days, whichever is smaller.
## Operational aspects
Prometheus has several flags that configure local storage. The most important are:
- `--storage.tsdb.path`: Where Prometheus writes its database. Defaults to `data/`.
- `--storage.tsdb.retention.time`: How long to retain samples in storage. If neither
this flag nor `storage.tsdb.retention.size` is set, the retention time defaults to
`15d`. Supported units: y, w, d, h, m, s, ms.
- `--storage.tsdb.retention.size`: The maximum number of bytes of storage blocks to retain.
The oldest data will be removed first. Defaults to `0` or disabled. Units supported:
B, KB, MB, GB, TB, PB, EB. Ex: "512MB". Based on powers-of-2, so 1KB is 1024B. Only
the persistent blocks are deleted to honor this retention although WAL and m-mapped
chunks are counted in the total size. So the minimum requirement for the disk is the
peak space taken by the `wal` (the WAL and Checkpoint) and `chunks_head`
(m-mapped Head chunks) directory combined (peaks every 2 hours).
- `--storage.tsdb.wal-compression`: Enables compression of the write-ahead log (WAL).
Depending on your data, you can expect the WAL size to be halved with little extra
cpu load. This flag was introduced in 2.11.0 and enabled by default in 2.20.0.
Note that once enabled, downgrading Prometheus to a version below 2.11.0 will
require deleting the WAL.
Prometheus stores an average of only 1-2 bytes per sample. Thus, to plan the
capacity of a Prometheus server, you can use the rough formula:
```
needed_disk_space = retention_time_seconds * ingested_samples_per_second * bytes_per_sample
```
To lower the rate of ingested samples, you can either reduce the number of
time series you scrape (fewer targets or fewer series per target), or you
can increase the scrape interval. However, reducing the number of series is
likely more effective, due to compression of samples within a series.
If your local storage becomes corrupted to the point where Prometheus will not
start it is recommended to backup the storage directory and restore the
corrupted block directories from your backups. If you do not have backups the
last resort is to remove the corrupted files. For example you can try removing
individual block directories or the write-ahead-log (wal) files. Note that this
means losing the data for the time range those blocks or wal covers.
CAUTION: Non-POSIX compliant filesystems are not supported for Prometheus'
local storage as unrecoverable corruptions may happen. NFS filesystems
(including AWS's EFS) are not supported. NFS could be POSIX-compliant,
but most implementations are not. It is strongly recommended to use a
local filesystem for reliability.
If both time and size retention policies are specified, whichever triggers first
will be used.
Expired block cleanup happens in the background. It may take up to two hours
to remove expired blocks. Blocks must be fully expired before they are removed.
## Right-Sizing Retention Size
If you are utilizing `storage.tsdb.retention.size` to set a size limit, you
will want to consider the right size for this value relative to the storage you
have allocated for Prometheus. It is wise to reduce the retention size to provide
a buffer, ensuring that older entries will be removed before the allocated storage
for Prometheus becomes full.
At present, we recommend setting the retention size to, at most, 80-85% of your
allocated Prometheus disk space. This increases the likelihood that older entries
will be removed prior to hitting any disk limitations.
## Remote storage integrations
Prometheus's local storage is limited to a single node's scalability and durability.
Instead of trying to solve clustered storage in Prometheus itself, Prometheus offers
a set of interfaces that allow integrating with remote storage systems.
### Overview
Prometheus integrates with remote storage systems in four ways:
- Prometheus can write samples that it ingests to a remote URL in a [Remote Write format](https://prometheus.io/docs/specs/remote_write_spec_2_0/).
- Prometheus can receive samples from other clients in a [Remote Write format](https://prometheus.io/docs/specs/remote_write_spec_2_0/).
- Prometheus can read (back) sample data from a remote URL in a [Remote Read format](https://github.com/prometheus/prometheus/blob/main/prompb/remote.proto#L31).
- Prometheus can return sample data requested by clients in a [Remote Read format](https://github.com/prometheus/prometheus/blob/main/prompb/remote.proto#L31).
![Remote read and write architecture](images/remote_integrations.png)
The remote read and write protocols both use a snappy-compressed protocol buffer encoding over
HTTP. The read protocol is not yet considered as stable API.
The write protocol has a [stable specification for 1.0 version](https://prometheus.io/docs/specs/remote_write_spec/)
and [experimental specification for 2.0 version](https://prometheus.io/docs/specs/remote_write_spec_2_0/),
both supported by Prometheus server.
For details on configuring remote storage integrations in Prometheus as a client, see the
[remote write](configuration/configuration.md#remote_write) and
[remote read](configuration/configuration.md#remote_read) sections of the Prometheus
configuration documentation.
Note that on the read path, Prometheus only fetches raw series data for a set of
label selectors and time ranges from the remote end. All PromQL evaluation on the
raw data still happens in Prometheus itself. This means that remote read queries
have some scalability limit, since all necessary data needs to be loaded into the
querying Prometheus server first and then processed there. However, supporting
fully distributed evaluation of PromQL was deemed infeasible for the time being.
Prometheus also serves both protocols. The built-in remote write receiver can be enabled
by setting the `--web.enable-remote-write-receiver` command line flag. When enabled,
the remote write receiver endpoint is `/api/v1/write`. The remote read endpoint is
available on [`/api/v1/read`](https://prometheus.io/docs/prometheus/latest/querying/remote_read_api/).
### Existing integrations
To learn more about existing integrations with remote storage systems, see the
[Integrations documentation](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage).
## Backfilling from OpenMetrics format
### Overview
If a user wants to create blocks into the TSDB from data that is in
[OpenMetrics](https://openmetrics.io/) format, they can do so using backfilling.
However, they should be careful and note that it is not safe to backfill data
from the last 3 hours (the current head block) as this time range may overlap
with the current head block Prometheus is still mutating. Backfilling will
create new TSDB blocks, each containing two hours of metrics data. This limits
the memory requirements of block creation. Compacting the two hour blocks into
larger blocks is later done by the Prometheus server itself.
A typical use case is to migrate metrics data from a different monitoring system
or time-series database to Prometheus. To do so, the user must first convert the
source data into [OpenMetrics](https://openmetrics.io/) format, which is the
input format for the backfilling as described below.
Note that native histograms and staleness markers are not supported by this
procedure, as they cannot be represented in the OpenMetrics format.
### Usage
Backfilling can be used via the Promtool command line. Promtool will write the blocks
to a directory. By default this output directory is ./data/, you can change it by
using the name of the desired output directory as an optional argument in the sub-command.
```
promtool tsdb create-blocks-from openmetrics [