57 changes: 54 additions & 3 deletions docs/Caching.md
@@ -48,12 +48,63 @@ See https://github.com/mozilla/sccache/blob/8567bbe2ba493153e76177c1f9a6f98cc7ba

### C/C++ preprocessor

In "preprocessor cache mode", [explained in the local doc](Local.md), an
extra key is computed to cache the preprocessor output itself. It is very close
to the C/C++ compiler one, but with additional elements:
In "preprocessor cache mode" explained below, an extra key is computed to cache the preprocessor output itself.
It is very close to the C/C++ compiler one, but with additional elements:

* The path of the input file
* The hash of the input file

Note that some compiler options can disable preprocessor cache mode. As of this
writing, only `-Xpreprocessor` and `-Wp,*` do.
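
As a rough illustration of the two additional elements listed above, the following sketch derives a preprocessor key by extending a set of compiler key inputs with the input file's path and content hash. The function name `preprocessor_key`, its parameters, and the use of SHA-256 are assumptions made for this example; they are not taken from the sccache source.

```python
import hashlib


def preprocessor_key(compiler_key_inputs, input_path, input_bytes):
    """Hypothetical helper: mix compiler key inputs with the input file's path and hash."""
    h = hashlib.sha256()  # stand-in hash; sccache's real choice may differ
    for item in list(compiler_key_inputs) + [input_path]:
        data = item.encode("utf-8")
        # Length-prefix each field so adjacent fields cannot be confused.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    # The hash of the input file, not its raw contents, goes into the key.
    h.update(hashlib.sha256(input_bytes).digest())
    return h.hexdigest()


if __name__ == "__main__":
    args = ["clang", "-O2", "-DNDEBUG"]
    print(preprocessor_key(args, "src/a.c", b"int x;\n"))
```

Under these assumptions, the same compiler inputs yield a different key whenever the source path or the source contents change.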

#### Preprocessor cache mode

This is inspired by [ccache's direct mode](https://ccache.dev/manual/3.7.9.html#_the_direct_mode) and works roughly the same.
It adds a cache that lets sccache skip preprocessing when compiling C/C++. This can make it much faster to return compilation results
from the cache, since preprocessing is a major expense for such cache hits.

Preprocessor cache mode is controlled by a configuration option which is true by default, as well as additional conditions described below.

To ensure that the cached preprocessor results for a source file correspond to the un-preprocessed inputs, sccache needs
to remember, among other things, all files included by the source file. sccache also needs to recognize
when "external factors" may change the results, such as system time if the `__TIME__` macro is used
in a source file. How conservative sccache is about some of these external factors is configurable; see below.

Preprocessor cache mode will be disabled in any of the following cases:

- Not compiling C or C++
- The configuration option is false
- Not using GCC or Clang
- Not using local storage for the cache

Collaborator

That's not true anymore, is it? Other information in this PR indicates that maybe it'll work with caches other than local disk as long as they support random seeks - but in any case, now it always is local disk? (That's somewhat contradictory, so please make it consistent)

I am also still wondering if requiring random seek capability is very important or if it just saved three lines of code somewhere and the requirement could be lifted. Well, that's an optional improvement.

Contributor Author

Based on my experience, random write is definitely required. To create a shared cache for our team, I tried to use mountpoint-s3 to mount an S3 bucket at `$HOME/.cache/sccache`. What I observed is that, if the preprocessor cache is enabled, mountpoint-s3 fails because it does not support random writes. This implies that random writes are used somewhere in the sccache code. This is what motivated me to extract the preprocessor cache from the disk cache.

Collaborator

My question is not if random writes are used, but if the random writes are fundamentally necessary or a sort of frivolous requirement that could be easily lifted. You don't want to write hundreds of lines of code to work around a tiny problem that you could just fix instead.

Collaborator

Note, making the preprocessor cache compatible with sequential-only devices wouldn't make it useless to have a feature to split the caches: the preprocessor cache is smaller than the main cache and especially benefits from fast storage, which would be reasons to keep it on a local SSD.

Contributor Author

@xis19, Jan 27, 2026

TBH, I had not read the LRU code before. Digging into the details makes me feel that it does not need random writes. However, mountpoint-s3 was observed to complain that it does not support random writes, and I do not understand why this happens. I am not sure I want to dig into that project at this time, though. My best guess at this stage is that the LRU cache does some kind of random write (together with the tempfile) somewhere outside the current code, maybe inside some Rust internal implementation detail.

Collaborator

OK, that just leaves adapting the documentation to the new (potentially) split caches reality as a thing to do.

- Any of the compiler options `-Xpreprocessor`, `-Wp,` are present
- The modification time of one of the header files is too new (avoids a race condition)
- Certain strings such as `__DATE__`, `__TIME__`, `__TIMESTAMP__` are present in the source code,
indicating that the preprocessor result may change based on external factors
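
The conditions in this list can be read as a single eligibility check. The sketch below is a simplified model, not sccache's implementation: the `Compilation` record, the function name, and the `min_header_age_s` threshold are all hypothetical, and the storage-backend condition is left out because the review thread above questions whether it still applies.

```python
from dataclasses import dataclass

TIME_MACROS = ("__DATE__", "__TIME__", "__TIMESTAMP__")
BLOCKING_OPTIONS = ("-Xpreprocessor", "-Wp,")


@dataclass
class Compilation:
    """Hypothetical description of one compiler invocation."""
    language: str               # e.g. "c", "c++", "rust"
    compiler: str               # e.g. "gcc", "clang", "msvc"
    args: list                  # compiler command-line arguments
    source_text: str            # contents of the source file
    newest_header_age_s: float  # seconds since the newest header was modified


def preprocessor_cache_usable(c, use_preprocessor_cache_mode=True,
                              ignore_time_macros=False, min_header_age_s=2.0):
    """Return True if none of the disabling conditions applies.

    min_header_age_s is an assumed threshold for "too new"; the real value
    used by sccache is not stated in this document.
    """
    if not use_preprocessor_cache_mode:
        return False
    if c.language not in ("c", "c++"):
        return False
    if c.compiler not in ("gcc", "clang"):
        return False
    if any(arg.startswith(BLOCKING_OPTIONS) for arg in c.args):
        return False
    if c.newest_header_age_s < min_header_age_s:
        return False
    if not ignore_time_macros and any(m in c.source_text for m in TIME_MACROS):
        return False
    return True
```

Setting `ignore_time_macros` to true in this model re-enables the cache for sources that mention the time macros, which is exactly the case the document warns can produce stale results.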

The preprocessor cache may silently produce stale results in any of the following cases:

- A header file that would have been included if it existed did not exist when the source file was compiled and its results were cached.
sccache does not know about such files, so it cannot invalidate the result if the header file is created later.
- A macro such as `__TIME__` (etc) is used in the source code and `ignore_time_macros` is enabled
- There are other external factors influencing the preprocessing result that sccache does not know about

Configuration options and their default values:

- `use_preprocessor_cache_mode`: `true`. Whether to use preprocessor cache mode. This can be overridden for an sccache invocation by setting the environment variable `SCCACHE_DIRECT` to `true`/`on`/`1` or `false`/`off`/`0`.
- `file_stat_matches`: `false`. If false, only compare header files by hashing their contents. If true, will use size + ctime + mtime to check whether a file has changed. See other flags below for more control over this behavior.
- `use_ctime_for_stat`: `true`. If true, uses the ctime (file status change on UNIX, creation time on Windows) to check that a file has/hasn't changed. Can be useful to disable when backdating modification times in a controlled manner.

- `ignore_time_macros`: `false`. If true, ignore `__DATE__`, `__TIME__` and `__TIMESTAMP__` being present in the source code. Will speed up preprocessor cache mode, but can produce stale results.

- `skip_system_headers`: `false`. If true, the preprocessor cache will only add the paths of included system headers to the cache key but ignore the headers' contents.

- `hash_working_directory`: `true`. If true, will add the current working directory to the cache key to distinguish two compilations from different directories.
- `max_size`: `10737418240`. The maximum size of the preprocessor cache in bytes (10 GiB). Defaults to the default disk cache size.
- `rw_mode`: `ReadWrite`. Whether the cache is opened in `ReadOnly` or `ReadWrite` mode.
- `dir`: `path_to_cache_directory`. Path to the preprocessor cache. By default, it uses DiskCache's directory, under the subdirectory `preprocessor`.
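
To illustrate how `file_stat_matches` and `use_ctime_for_stat` interact, here is a minimal sketch of a header change check. The `HeaderRecord` type and both functions are hypothetical, SHA-256 is a stand-in hash, and the sketch assumes that enabling `file_stat_matches` makes the decision on file metadata alone; sccache's actual behaviour may differ in detail.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class HeaderRecord:
    """Hypothetical snapshot of a header taken when the result was cached."""
    digest: str
    size: int
    mtime_ns: int
    ctime_ns: int


def record_header(path):
    st = os.stat(path)
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return HeaderRecord(digest, st.st_size, st.st_mtime_ns, st.st_ctime_ns)


def header_unchanged(path, rec, file_stat_matches=False, use_ctime_for_stat=True):
    if file_stat_matches:
        # Cheap path: compare size and timestamps without reading the file.
        st = os.stat(path)
        same = st.st_size == rec.size and st.st_mtime_ns == rec.mtime_ns
        if use_ctime_for_stat:
            same = same and st.st_ctime_ns == rec.ctime_ns
        return same
    # Default path: compare by hashing the file contents.
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest() == rec.digest
```

The metadata comparison avoids reading every header, at the cost of trusting timestamps; disabling `use_ctime_for_stat` in this model drops the ctime comparison, which matches the documented use case of backdating modification times in a controlled manner.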

See where to write the config in [the configuration doc](Configuration.md).

`sccache --debug-preprocessor-cache` can be used to investigate the content of the preprocessor cache.

The preprocessor cache uses random read and write; thus, certain file systems, including `s3fs`, are not supported.
8 changes: 7 additions & 1 deletion docs/Configuration.md
@@ -33,7 +33,7 @@ dir = "/tmp/.cache/sccache"
size = 7516192768 # 7 GiBytes

# See the local docs on more explanations about this mode
[cache.disk.preprocessor_cache_mode]
[cache.preprocessor_cache_mode]
# Whether to use the preprocessor cache mode
use_preprocessor_cache_mode = true
# Whether to use file times to check for changes
@@ -46,6 +46,12 @@ ignore_time_macros = false
skip_system_headers = false
# Whether to hash the current working directory
hash_working_directory = true
# Maximum size of the preprocessor cache, in bytes
max_size = 1048576
# ReadOnly/ReadWrite mode
rw_mode = "ReadWrite"
# Path to the cache
dir = "/tmp/.cache/sccache-preprocess/"

[cache.gcs]
# optional oauth url
45 changes: 0 additions & 45 deletions docs/Local.md
@@ -6,51 +6,6 @@ The default cache size is 10 gigabytes. To change this, set `SCCACHE_CACHE_SIZE`

The local storage only supports a single sccache server at a time. Multiple concurrent servers will race and cause spurious build failures.

## Preprocessor cache mode

This is inspired by [ccache's direct mode](https://ccache.dev/manual/3.7.9.html#_the_direct_mode) and works roughly the same.
It adds a cache that allows to skip preprocessing when compiling C/C++. This can make it much faster to return compilation results
from cache since preprocessing is a major expense for these.

Preprocessor cache mode is controlled by a configuration option which is true by default, as well as additional conditions described below.

To ensure that the cached preprocessor results for a source file correspond to the un-preprocessed inputs, sccache needs
to remember, among other things, all files included by the source file. sccache also needs to recognize
when "external factors" may change the results, such as system time if the `__TIME__` macro is used
in a source file. How conservative sccache is about some of these external factors is configurable, see below.

Preprocessor cache mode will be disabled in any of the following cases:

- Not compiling C or C++
- The configuration option is false
- Not using GCC or Clang
- Not using local storage for the cache
- Any of the compiler options `-MP`, `-Xpreprocessor`, `-Wp,` are present
- The modification time of one of the header files is too new (avoids a race condition)
- Certain strings such as `__DATE__`, `__TIME__`, `__TIMESTAMP__` are present in the source code,
indicating that the preprocessor result may change based on external factors

The preprocessor cache may silently produce stale results in any of the following cases:

- When a source file was compiled and its results were cached, a header file would have been included if it existed, but it did
not exist at the time. sccache does not know about such files, so it cannot invalidate the result if the header file later exists.
- A macro such as `__TIME__` (etc) is used in the source code and `ignore_time_macros` is enabled
- There are other external factors influencing the preprocessing result that sccache does not know about

Configuration options and their default values:

- `use_preprocessor_cache_mode`: `true`. Whether to use preprocessor cache mode. This can be overridden for an sccache invocation by setting the environment variable `SCCACHE_DIRECT` to `true`/`on`/`1` or `false`/`off`/`0`.
- `file_stat_matches`: `false`. If false, only compare header files by hashing their contents. If true, will use size + ctime + mtime to check whether a file has changed. See other flags below for more control over this behavior.
- `use_ctime_for_stat`: `true`. If true, uses the ctime (file status change on UNIX, creation time on Windows) to check that a file has/hasn't changed. Can be useful to disable when backdating modification times in a controlled manner.

- `ignore_time_macros`: `false`. If true, ignore `__DATE__`, `__TIME__` and `__TIMESTAMP__` being present in the source code. Will speed up preprocessor cache mode, but can produce stale results.

- `skip_system_headers`: `false`. If true, the preprocessor cache will only add the paths of included system headers to the cache key but ignore the headers' contents.

- `hash_working_directory`: `true`. If true, will add the current working directory to the cache key to distinguish two compilations from different directories.

See where to write the config in [the configuration doc](Configuration.md).

## Read-only cache mode

By default, the local cache operates in read/write mode. The `SCCACHE_LOCAL_RW_MODE` environment variable can be set to `READ_ONLY` (or `READ_WRITE`) to modify this behavior.