Securing Connections with SSL
Network communication between the coordinator, agent, and client machines can be configured to use SSL, in which case it is encrypted using TLS1.2. With the proper certificate configuration, agents and clients can also authenticate the coordinator. This topic covers coordinator configuration, agent and portal configuration, mutual SSL, thumbprint verification, and troubleshooting.
Coordinator Configuration
A self-signed or CA-signed certificate is required to set up SSL. How you generate such a certificate depends on your network's public key infrastructure (PKI) and on how your network administration group usually issues certificates.
- The coordinator uses the certificate to encrypt the communications, and therefore needs permissions to the private key. Import the certificate with its private key into the Personal store for the Local Computer (Figure 1).
- Make sure that the user running the coordinator service has permissions to the private key.
- In Coordinator Settings, select the certificate that you imported into the Windows certificate store (Figure 2). You can use the View button to display the details of the selected certificate.
- After activating SSL and selecting the certificate, restart the coordinator service.

Figure 1: Windows Certificate Store used for certificates on the Coordinator machine.
Figure 2: Coordinator SSL Connection Settings.
Agent and Portal Configuration
The user interface for configuring SSL is the same for both the Agent service and the Portal's BookKeeper service. On an agent machine, configure SSL through the Agent Settings Connection window (Figure 3), accessible via the Agent Tray Application. On a machine with the STK Parallel Computing Portal installed, open the Portal Settings Connection window via the Portal Tray Application.
The first checkbox, Use Secure Connection (TLS1.2), instructs the service to connect to the coordinator using TLS1.2 (Figure 3).
Figure 3: Enabling Secure Connections on the Agent.
The second checkbox, Allow Coordinator Self-Signed Certificate, defines how the certificate presented by the coordinator as part of the authentication handshake is treated:
- When self-signed certificates are not allowed (recommended), the service requires a valid certificate and will verify that the certificate is trusted through its chain. The service also checks for certificate revocation.
- If self-signed certificates are allowed (not recommended), the service does not try to validate the certificate and accepts it as is. In this mode, the communication between the coordinator and the agent/portal is encrypted using the certificate. If you are using a self-signed certificate without the proper root certification authority, you have to use this option.
You can use the Test button to verify that the connection can be established.
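The difference between the two validation modes can be illustrated with Python's standard ssl module. This is only an analogy under stated assumptions, not the way the Agent or Portal service is implemented: the function name make_client_context and its allow_self_signed parameter are hypothetical, and unlike the service, this sketch does not check certificate revocation.

```python
import ssl


def make_client_context(allow_self_signed: bool) -> ssl.SSLContext:
    """Build a TLS 1.2 client context that mimics the two validation modes.

    Hypothetical helper for illustration only; it is not part of the product.
    """
    # The default context verifies the certificate chain and the host name.
    ctx = ssl.create_default_context()
    # Pin the protocol to TLS 1.2, matching the protocol named in this topic.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    if allow_self_signed:
        # Encrypt only: accept whatever certificate the server presents.
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


strict = make_client_context(allow_self_signed=False)
relaxed = make_client_context(allow_self_signed=True)
print("strict :", strict.verify_mode, strict.check_hostname)
print("relaxed:", relaxed.verify_mode, relaxed.check_hostname)
```

In the relaxed context the traffic is still encrypted, but nothing proves that the peer is the intended coordinator, which is why that mode is not recommended.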
Configuring Mutual SSL
Mutual (also called two-way) SSL adds a second layer on top of SSL in which clients (Agents/Portal) are required to provide a certificate to the server (Coordinator) for verification.
To enable mutual SSL, check the Require Client Certificates checkbox on the Coordinator's Connection settings (Figure 4).
Figure 4: Enabling Mutual SSL
Then, in the Agent or Portal's Connection Settings, check the Provide Client Certificate checkbox and select one of the certificates that has been imported to the machine's Windows certificate store (Figure 5).
Figure 5: Agent Provide Client Certificate checkbox.
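As a rough illustration of what requiring client certificates means on the server side, the following sketch uses Python's standard ssl module. It is an analogy, not the Coordinator's implementation: make_server_context and all of its parameters are hypothetical names, and the file paths are left to the caller.

```python
import ssl
from typing import Optional


def make_server_context(require_client_cert: bool,
                        certfile: Optional[str] = None,
                        keyfile: Optional[str] = None,
                        client_ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a server-side TLS context, optionally demanding a client certificate.

    Hypothetical helper for illustration only; it is not part of the product.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if certfile:
        # The server's own certificate and private key.
        ctx.load_cert_chain(certfile, keyfile)
    if require_client_cert:
        # Mutual SSL: the handshake fails unless the client presents
        # a certificate that the server can validate.
        ctx.verify_mode = ssl.CERT_REQUIRED
        if client_ca_file:
            ctx.load_verify_locations(cafile=client_ca_file)
    return ctx


one_way = make_server_context(require_client_cert=False)
mutual = make_server_context(require_client_cert=True)
print("one-way:", one_way.verify_mode)
print("mutual :", mutual.verify_mode)
```

On the client side, the equivalent of the Provide Client Certificate checkbox in this analogy would be a call to load_cert_chain on the client context.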
Configuring Thumbprint Verification
Certificate thumbprint verification is an additional security measure available when using mutual SSL. When enabled, the client (Agent/Portal) must provide the thumbprint of the security certificate used by the Coordinator to which it is attempting to connect. When a connection is attempted, the client verifies that the thumbprint of the Coordinator's security certificate matches the expected value.
To enable certificate thumbprint verification, check the Only allow Coordinator with this certificate thumbprint checkbox and then supply the thumbprint of the Coordinator's security certificate in the textbox (Figure 6). To obtain the thumbprint of the Coordinator's security certificate, click the Copy button next to the Thumbprint display on the Coordinator's Connection Settings.
Figure 6: Enabling Certificate Thumbprint Verification
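The idea behind a thumbprint comparison can be sketched in a few lines of Python. This assumes the thumbprint is the one shown by the Windows certificate store, which is the SHA-1 digest of the DER-encoded certificate; whether the product compares thumbprints exactly this way is not stated here. The function names are hypothetical, and the bytes used in the demonstration are a stand-in, not a real certificate.

```python
import hashlib
import hmac

HEX_DIGITS = "0123456789abcdefABCDEF"


def thumbprint(der_bytes: bytes) -> str:
    """Return the SHA-1 digest of a DER-encoded certificate as uppercase hex."""
    return hashlib.sha1(der_bytes).hexdigest().upper()


def normalize(value: str) -> str:
    """Keep only hex digits, so pasted spaces, colons, or invisible
    characters do not cause a false mismatch."""
    return "".join(ch for ch in value if ch in HEX_DIGITS).upper()


def thumbprint_matches(der_bytes: bytes, expected: str) -> bool:
    """Compare the computed thumbprint with the expected one."""
    return hmac.compare_digest(thumbprint(der_bytes), normalize(expected))


# Stand-in bytes for demonstration; a real check would use the DER bytes
# of the certificate presented by the Coordinator.
placeholder_der = b"abc"
pasted_value = "a9 99 3e 36 47 06 81 6a ba 3e 25 71 78 50 c2 6c 9c d0 d8 9d"
print(thumbprint(placeholder_der))
print(thumbprint_matches(placeholder_der, pasted_value))
```

Normalizing the pasted value before comparing is a design choice of this sketch: thumbprints copied from a dialog often pick up spaces or mixed case.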
Troubleshooting
Configuring SSL connections can be challenging. This section describes how to troubleshoot connection issues.
If the agent cannot connect to the coordinator, the first place to look for clues is the coordinator log. The log can be accessed from the Logging page in the Coordinator Settings (by default, the log is located on the file system at C:\ProgramData\AGI\STK Parallel Computing Server 3.0\logs\coordinator.log). Scroll down to the bottom of the log and search up for the last time the coordinator restarted (look for the “starting coordinator” string). The next line indicates the connection configuration and the certificate being used by the coordinator:
[initialization] waiting for connections on port 9090 (using SSL, certificate name: [CN=…], protocols: [Tls12])
Verify that the certificate name corresponds to what you expect. Now look for error messages when the agent tried to connect. Client connection attempts will be indicated by “Authenticating connection from <client IP address>”. At this point, you may notice the following error messages in the log:
Coordinator log error message | Resolution |
---|---|
Client failed authentication (The server mode SSL must use a certificate with the associated private key) | The private key for the certificate you selected is missing from the Windows Certificate Store. Import the private key into the Coordinator machine Windows Certificate Store along with the certificate. |
Certificate not found in store (CN=…). | The certificate you selected is no longer available in the Coordinator machine Windows Certificate Store (Personal store for the Local Computer). Re-import the certificate. |
Client failed authentication (The credentials supplied to the package were not recognized). | The user running the coordinator service does not have permissions to the certificate private key. Change the private key permissions to allow the user running the coordinator service to access it. This can be done with several Windows tools, for instance mmc and its “Manage Private Keys…” task. |
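The manual search described above (find the last restart, then read the connection lines that follow it) can be sketched as a small Python script. It relies only on the message fragments quoted in this topic; the rest of the log line format (timestamps, prefixes) is unknown, so matching is done on substrings, and the function names are hypothetical.

```python
from typing import List

RESTART_MARKER = "starting coordinator"
# Message fragments quoted in this topic.
INTERESTING = (
    "waiting for connections on port",
    "Authenticating connection from",
    "Client failed authentication",
    "Certificate not found in store",
)


def lines_since_last_restart(lines: List[str]) -> List[str]:
    """Return the lines from the last restart marker to the end of the log.

    If no marker is found, the whole log is returned.
    """
    start = 0
    for index, line in enumerate(lines):
        if RESTART_MARKER in line.lower():
            start = index
    return lines[start:]


def connection_events(lines: List[str]) -> List[str]:
    """Keep only the lines after the last restart that mention a known fragment."""
    tail = lines_since_last_restart(lines)
    return [
        line for line in tail
        if any(key.lower() in line.lower() for key in INTERESTING)
    ]


# Sample built from the messages quoted in this topic, placeholders included.
sample = [
    "starting coordinator",
    "Authenticating connection from <client IP address>",
    "starting coordinator",
    "[initialization] waiting for connections on port 9090 "
    "(using SSL, certificate name: [CN=…], protocols: [Tls12])",
    "Authenticating connection from <client IP address>",
    "Client failed authentication (The credentials supplied to the package "
    "were not recognized).",
]
for event in connection_events(sample):
    print(event)
```

To run it against a real log, read the file into a list of lines first, for example with pathlib's read_text followed by splitlines; the file encoding is an assumption you may need to adjust.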
If you do not see any of these errors in the Coordinator log, it is now time to look at the Agent side of the connection. Open the Agent log. The log can be accessed from the Logging page in the Agent Settings (by default, the log is located on the file system at C:\ProgramData\AGI\STK Parallel Computing Server 3.0\logs\agent.log). Scroll down to the bottom of the log and search up for the last time the agent tried to connect to the coordinator (look for the “trying to connect to coordinator endpoint” message):
trying to connect to coordinator endpoint tcp://…
A successful SSL connection to the coordinator will be logged as:
Tls12 connection established, client authenticated/signed/encrypted, remote certificate validated [CN=…] valid … to …, certificate revocation list checked, cipher: …
Using this line you can verify that the SSL connection has the proper security settings based on your requirements.
If the connection cannot be established, this line will be missing. Instead, the next line(s) may indicate one of the following errors.
Agent log error message | Resolution |
---|---|
The server certificate could not be validated because of certificate chain errors (A required certificate is not within its validity period when verifying against the current system clock or the timestamp in the signed file). | The certificate presented by the coordinator has expired. Obtain a new certificate and deploy it to the coordinator machine. |
The server certificate could not be validated because of certificate chain errors (The certificate is revoked). | The certificate presented by the coordinator has been revoked. Obtain a new certificate and deploy it to the coordinator machine. |
The server certificate could not be validated because there was a certificate name mismatch. | The name mismatch error indicates that the common name (domain name) or subject alternative name in the SSL certificate presented by the coordinator doesn't match the address that has been used to configure the agent connection. Edit the coordinator connection address in the Agent Connection settings to match the name in the certificate, or obtain a new certificate with the name matching the coordinator machine name. |
The server certificate could not be validated because of certificate chain errors (The revocation function was unable to check revocation for the certificate, The revocation function was unable to check revocation because the revocation server was offline). | The agent tried to check the certificate for revocation and could not access the revocation server. Make sure that the agent machine can access the revocation server. |
The server certificate could not be validated because of certificate chain errors (A certificate chain could not be built to a trusted root authority, The revocation function was unable to check revocation for the certificate, The revocation function was unable to check revocation because the revocation server was offline). | The certificate presented by the coordinator is not trusted. Make sure that the root certificate and the other certificates in the certificate chain can be validated on the Agent machine. |
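The table above can be turned into a small lookup that labels an Agent log line with its likely cause. This is a sketch assuming the messages appear in the log as quoted in the table; the diagnose function and the short labels are hypothetical and are not produced by the product.

```python
from typing import Optional

# (fragment to look for, short diagnosis). Order matters: the untrusted-root
# message also mentions the revocation server, so it must be checked first.
AGENT_ERROR_HINTS = [
    ("certificate name mismatch", "name-mismatch"),
    ("not within its validity period", "expired"),
    ("The certificate is revoked", "revoked"),
    ("chain could not be built to a trusted root", "untrusted-root"),
    ("revocation server was offline", "revocation-server-unreachable"),
]


def diagnose(log_line: str) -> Optional[str]:
    """Return a short label for a known Agent log error, or None if unknown."""
    lowered = log_line.lower()
    for fragment, label in AGENT_ERROR_HINTS:
        if fragment.lower() in lowered:
            return label
    return None


line = (
    "The server certificate could not be validated because of certificate "
    "chain errors (The certificate is revoked)."
)
print(diagnose(line))
```

Each label corresponds to one row of the table, so the matching resolution can be looked up there.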
If the agent machine cannot validate the certificate presented by the coordinator and the above suggestions did not resolve the issue, check that the certificate can be validated on the Agent machine using the Microsoft Windows certutil utility. First, on the coordinator machine, export your certificate (without the private key) to a file. Copy that file to the Agent machine. Open a command prompt as the user account configured to run the Coordinator service. Run the following command:
certutil -f -urlfetch -verify <path to cert>
If this command returns errors (look for ERROR: in the output), resolve those errors using the standard Windows utilities, and then try again to connect the Agent to the Coordinator. Also, if this command takes a long time to complete, investigate and resolve the cause of the delay, because the same delay can slow down connection establishment and reduce the performance of your cluster.
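Both checks described above (scanning the output for ERROR: and noticing a slow run) can be scripted. The sketch below wraps the same certutil command with Python's subprocess module; the function names are hypothetical, the script must run on the Windows machine where certutil is available, and the sample text in the demonstration is a placeholder rather than real certutil output.

```python
import subprocess
import time
from typing import List, Tuple


def error_lines(output: str) -> List[str]:
    """Return the lines of the command output that contain 'ERROR:'."""
    return [line.strip() for line in output.splitlines() if "ERROR:" in line]


def verify_certificate(cert_path: str) -> Tuple[List[str], float]:
    """Run 'certutil -f -urlfetch -verify <path>' and time it.

    Returns the error lines found in the output and the elapsed seconds.
    Requires Windows; certutil is not available on other platforms.
    """
    started = time.monotonic()
    result = subprocess.run(
        ["certutil", "-f", "-urlfetch", "-verify", cert_path],
        capture_output=True,
        text=True,
        check=False,  # inspect the output ourselves instead of raising
    )
    elapsed = time.monotonic() - started
    return error_lines(result.stdout + result.stderr), elapsed


# Placeholder text, not real certutil output, to show the filtering step.
placeholder_output = "first line\nERROR: <message>\nlast line"
print(error_lines(placeholder_output))
```

On the Agent machine, calling verify_certificate with the path to the exported certificate returns both the error lines and the elapsed time, so a slow revocation or URL fetch shows up as a large second value.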
If all of this fails, one temporary workaround is to check the Allow Coordinator Self-Signed Certificate checkbox in the Agent Settings Connection window. In that mode, the agent does not perform any validation of the Coordinator certificate. This is not recommended, because it is less secure than using a certificate that can be properly validated.