SaaS Shield Tenant Security Proxy

The Tenant Security Proxy (TSP) is a service delivered as a Docker container that you run within your SaaS infrastructure. The TSP acts as a gateway between your application and your customers' cloud security infrastructure. It can communicate with cloud KMS instances running in AWS, GCP, and Azure (see the full list of supported services here), depending on which providers your tenants are using. Tenant admins configure their own KMSes using the Configuration Broker.
The TSP service does not require any persistent storage. When it starts, it reads configuration that is provided to the container, and it uses that information to fetch further configuration values, including the tenant KMS configurations, from the Configuration Broker and decrypt them. All KMS configurations are held in memory.

Logging Service

The Docker container also includes a second service, the LogDriver. This service supports SaaS Shield's Tenant Security Event Logging, which creates rich audit trails containing detailed security events from your TSP, such as requests to wrap or unwrap keys, and pushes them to your tenants' logging / SIEM systems.
Like the TSP, LogDriver can communicate with logging systems running in GCP and Splunk (see the full list of supported logging services here). Tenant admins control their logging configuration in the Configuration Broker.
The LogDriver service requires persistent storage; when events are pushed to the LogDriver, they are stored to disk before they are bundled and delivered so that events can be recovered if the service is prematurely shut down. On the next startup of the service, it will attempt to deliver all of the stored events.

Installation

The TSP Docker container is hosted publicly on the IronCore Labs Docker registry. Find the latest tag available and pull down the image by running this command:
docker pull gcr.io/ironcore-images/tenant-security-proxy:{tag}
You can see the changes in each version of the TSP in its changelog.
After you have successfully pulled the TSP, the image appears in your local docker images list, as shown below.
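For example, the following command lists just the TSP images you have pulled (the tags shown will depend on which versions you pulled):
docker images gcr.io/ironcore-images/tenant-security-proxy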

Startup

To start the Docker container, you need to provide it with a configuration generated from the Configuration Broker and a volume to persist logging events. Once you have this configuration, you can run:
docker run \
  --env-file config-broker-config.env \
  -p 32804:7777 \
  -m 512M \
  --mount 'type=volume,src=<VOLUME-PATH>,dst=/logdriver' \
  gcr.io/ironcore-images/tenant-security-proxy:{tag}
The published port 32804 can be changed to any port you choose; 7777 is the fixed port the service listens on inside the container. If the image starts successfully, then the TSP service is running locally on the provided port. You'll need to provide the full domain and port where the container is running when using the Tenant Security Client Libraries.
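For example, to run the TSP on host port 8080 instead, change only the host side of the -p mapping; the container side stays 7777:
  -p 8080:7777
Your applications would then point their Tenant Security Client at an address such as http://tsp.internal.example.com:8080 (the hostname here is hypothetical).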
The volume you provide is used to store security events so they can be recovered if the container restarts. Events are removed from the store as they are delivered to tenant logging systems, so the volume acts as an elastic buffer: it will not grow without bound, and the consequence of using zone-local storage is minimal. We recommend providing a 50GB volume, which should be more than sufficient to handle busy logging streams.
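If you use a Docker named volume, a minimal sketch looks like this (the volume name logdriver-events is hypothetical):
docker volume create logdriver-events
Then substitute the volume name for <VOLUME-PATH> in the --mount argument above:
  --mount 'type=volume,src=logdriver-events,dst=/logdriver'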
The -m argument to Docker sets the amount of memory available to the container. We recommend allocating at least 512MB; the logging service uses the disk as a backup, but for performance, events are normally processed using in-memory queues. 512MB should allow for a substantial rate of event logging. If you anticipate very high volumes, increase the amount of memory available to the container.
When the TSP service starts, it immediately attempts to make a request to the Configuration Broker to retrieve and decrypt all of the KMS configurations that were created by each of your tenants. The decrypted configurations are kept in memory only.
Every 10 minutes, the container re-fetches the list of encrypted tenant configurations from the Configuration Broker. This allows the TSP to get updates on any new or changed KMS configurations for your tenants.
When the LogDriver service starts, it also requests all of the logging configurations created by your tenants. These configurations are retrieved and decrypted; they too are kept in memory only. LogDriver checks for logging configuration updates every 10 minutes as well.
Once the container has successfully started, it can be accessed from your applications via the Tenant Security Client, allowing your apps to encrypt and decrypt your customers' data using keys they control and to generate security events that are delivered to your customers' logging and SIEM services.

Health and Liveness Checks

The Docker container also exposes endpoints for checking the liveness and health of the container. The checks are implemented based on the Kubernetes lifecycle concepts. The exposed URLs and their meanings are:
  • /health: Returns a 200 status code when the container is ready to accept requests. Returns a 500 status code when the server is shutting down or is still initializing.
  • /live: Returns a 200 status code when the container is not shutting down. Returns a 500 status code when the server is shutting down.
  • /ready: Returns a 200 status code when the container is ready to accept requests. Returns a 500 status code when the server is not ready to accept requests.
The container will not report as "ready" until it has retrieved and decrypted the initial set of tenant KMS configurations from the Configuration Broker. Each of these health endpoints is served on port 9000 within the Docker image.
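Note that port 9000 is not published in the docker run example above; to probe these endpoints from the host, publish it as well (for example by adding -p 9000:9000) or query from within the container's network. Assuming that extra port mapping, a quick readiness check from the host might look like:
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:9000/ready
A 200 response indicates the TSP has fetched and decrypted its initial set of tenant KMS configurations and is ready for traffic.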

Configuration

Beyond the configuration mentioned in Startup, there are several optional environment variables that allow for tuning. In general, we recommend leaving these unset (so the container uses the default values) unless you are instructed to adjust them to resolve an issue; a sketch showing how to override them follows the list.
  • LOGDRIVER_CHANNEL_CAPACITY. Default: 1000. Controls the number of messages that can be held in buffers between logdriver pipeline stages. Increasing this will have a memory impact.
  • LOGDRIVER_SINK_BATCH_SIZE. Default: 1000. Maximum number of events that can be bundled into a single batch call to a tenant's logging system. Increasing this may slow down network calls to cloud logging sinks, but may allow for faster draining of high volume tenants' buffers.
  • LOGDRIVER_BUFFER_POLL_INTERVAL. Default: 2000. Interval (in milliseconds) between each reaping pass of tenant buffers. Decreasing this will increase data rate but may result in a buildup of uncompleted network calls that can eventually use up the container's network resources.
  • LOGDRIVER_CONFIG_REFRESH_INTERVAL. Default: 600. Interval (in seconds) between each logdriver configuration cache refresh.
  • LOGDRIVER_CHANNEL_TIMEOUT. Default: 250. Time (in milliseconds) that a send to a pipeline channel is allowed to take before it is abandoned.
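If you are instructed to adjust one of these values, pass it to Docker alongside the generated configuration. For example (the override values here are illustrative, not recommendations):
docker run \
  --env-file config-broker-config.env \
  --env LOGDRIVER_SINK_BATCH_SIZE=2000 \
  --env LOGDRIVER_BUFFER_POLL_INTERVAL=1000 \
  -p 32804:7777 \
  -m 512M \
  --mount 'type=volume,src=<VOLUME-PATH>,dst=/logdriver' \
  gcr.io/ironcore-images/tenant-security-proxy:{tag}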

Troubleshooting

File Descriptor Limits

Use caution when performing batch operations within the Proxy if the batch size is large enough to cause many requests to a tenant's KMS. When a batch operation is performed, the Proxy makes multiple parallel requests to the tenant's KMS, one for each key to wrap. If the batch is large enough, the number of file descriptors requested by the container can exceed the available resources, causing errors. On Linux, the default file descriptor limit is 1024, so it is best to limit batch operations to no more than 1000 items at a time.
File descriptor limit errors may also appear if enough concurrent traffic flows through a single TSP deployment. If you notice these errors in the logs and are not submitting large batches, you should either increase the file descriptor limit on that TSP (see the sketch below) or horizontally scale and load balance the traffic.
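One way to raise the limit is Docker's --ulimit flag at container startup (the value below is an example, not a tuned recommendation):
docker run \
  --ulimit nofile=8192:8192 \
  --env-file config-broker-config.env \
  -p 32804:7777 \
  -m 512M \
  --mount 'type=volume,src=<VOLUME-PATH>,dst=/logdriver' \
  gcr.io/ironcore-images/tenant-security-proxy:{tag}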
