Several dimensions of the data affect performance and scalability. This section provides an overview of the most critical components that drive the scale of a Carbon Black EDR server installation.
- Incoming data rates: Each sensor sends a stream of data to the server; that data requires indexing and storage. The incoming data volume that is generated for an installation is impacted by three measures:
  - The number of concurrently active sensors: The server must process event data from each endpoint. More sensors increase CPU and disk I/O requirements.
  - The number of processes per sensor per day: Planning requires estimates for the typical rate of processes per endpoint per day, but there can be wide variation between installations. Operating systems vary in the number of processes that are generated per sensor.
  - The average activity per process: Similar to the count of processes per endpoint per day, the activity of those processes (for example, file modifications, registry changes, network connections, and child processes) can vary between endpoints. The amount of storage each process document takes depends on the activity of the process.
- Carbon Black EDR uses a storage model that is based on per-process storage where the activities of a given process are stored within a set of per-process containers. These are referred to as process documents. Each process document represents a snapshot of the process activity in a period of time (typically five minutes).
- Each process can have one or more associated process documents. New process documents are created each time the sensor sends data to the server. Therefore, longer-living processes (such as svchost.exe) can have many associated process documents. The average process-to-document ratio is between two and three.
- The Carbon Black EDR datastore collates multiple process documents into logical data volumes called shards. Multiple shards are created (based on size or number of days) to achieve the desired retention period while maintaining optimum search performance.
- The process document count is a key driver of performance and storage capacity.
- Threat intelligence feeds: Depending on the number of enabled feeds, the server monitors for activity related to a number of unique indicators. Most organizations do not notice any impact; however, some organizations might monitor a very large number of indicators, and therefore require additional resources.
- Watchlists: For each configured watchlist, the server runs a search every ten minutes. Performance is impacted by the number and complexity of these searches. Most organizations will not notice an impact, but organizations that monitor many watchlists or run complex ones (for example, lengthy watchlists or searches that contain wildcards) will see reduced performance.
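
The relationships described above can be turned into a rough back-of-the-envelope estimate. The following Python sketch is illustrative only and is not part of Carbon Black EDR. The names `SizingInputs`, `daily_process_documents`, `retained_process_documents`, and `watchlist_searches_per_day` are hypothetical, as are the example sensor count, process rate, and retention period. Only the document-per-process ratio (between two and three) and the ten-minute watchlist interval come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizingInputs:
    """Hypothetical planning inputs; replace with values measured on site."""
    active_sensors: int                  # concurrently active sensors
    processes_per_sensor_per_day: float  # varies widely by OS and installation
    docs_per_process: float = 2.5        # midpoint of the "two to three" ratio
    retention_days: int = 30             # assumed retention target


def daily_process_documents(inputs: SizingInputs) -> float:
    """Estimate the number of process documents created per day."""
    return (inputs.active_sensors
            * inputs.processes_per_sensor_per_day
            * inputs.docs_per_process)


def retained_process_documents(inputs: SizingInputs) -> float:
    """Estimate the number of process documents held across the retention period."""
    return daily_process_documents(inputs) * inputs.retention_days


def watchlist_searches_per_day(watchlists: int, interval_minutes: int = 10) -> int:
    """Count watchlist searches per day, at one search per watchlist per interval."""
    return watchlists * (24 * 60 // interval_minutes)


if __name__ == "__main__":
    # Example figures are assumptions, not recommendations.
    example = SizingInputs(active_sensors=1000, processes_per_sensor_per_day=5000)
    print(f"Process documents per day: {daily_process_documents(example):,.0f}")
    print(f"Process documents retained: {retained_process_documents(example):,.0f}")
    print(f"Watchlist searches per day: {watchlist_searches_per_day(50)}")
```

The estimate deliberately ignores average activity per process, which determines how much storage each process document consumes. Treat the result as a document count, not a disk-space figure.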