I can confirm the problem. I am also in the process of setting up a VA. The proxy is defined, and I can verify that it is reachable via curl --proxy proxy.name.company.
Getting a pairing code with va-bootstrap pair also works, so the proxy does get used in some cases.
In the firewall logs I can see that the VA tries to contact AWS without using the proxy.
I had to use the --https-proxy parameter to get to that step, but we are blocked afterwards: va-bootstrap set-passphrase --https-proxy=http://1.2.3.4:8080/
We’ve also had issues with the proxy at the beginning of the year; it is unfortunate that SailPoint doesn’t always learn from previous issues.
Did you already try “va-bootstrap -v pair”? There you should see the proxy in the “HTTPSProxy” field of the JSON payload.
That’s where I concluded that my proxy.yaml is probably working fine.
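For anyone comparing notes, a minimal proxy.yaml (in /home/sailpoint/) looks roughly like the sketch below. The key names are what I recall from SailPoint's VA documentation, and the host/port/credentials are placeholders, so please verify against the official docs rather than trusting this:

```yaml
# /home/sailpoint/proxy.yaml - placeholder values; verify key names against SailPoint docs
proxyProtocol: http
proxyServer: proxy.name.company
proxyPort: 3128
proxyUser: svc-proxy        # only if your proxy requires authentication
proxyPassword: changeme     # only if your proxy requires authentication
```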
When I execute “export” on the command line, I can also see that the environment variables http_proxy, HTTP_PROXY, https_proxy, and HTTPS_PROXY are not set on the new VA. That’s most likely the problem.
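A quick way to print all four variables at once (a generic POSIX-sh snippet, nothing VA-specific); on the broken VA every line came back as "&lt;unset&gt;":

```shell
# Print the proxy variables the VA services expect; "<unset>" means the
# variable is missing (which was the case on the broken VA).
for v in http_proxy HTTP_PROXY https_proxy HTTPS_PROXY; do
  eval "val=\${$v:-<unset>}"
  printf '%s=%s\n' "$v" "$val"
done
```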
Hi @rdoebele, I think you are right. I may have got the error from va-bootstrap before creating the proxy.yaml, but indeed the set-up code setting the environment variables and the docker env is not being called / working correctly.
SailPoint Support asked me if I still have the issue, just after I opened the support ticket.
Running touch ~/config.yaml fixes the problem and causes the HTTP proxy environment variables to be set.
According to the support this has been fixed in the latest VA image.
Hi @rdoebele, that is one workaround (the other one is probably to create the docker.env with the correct configuration)
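For completeness, my assumption for that second workaround (untested; both the path and the variable names are guesses based on what the containers would need, so copy the file from a working VA instead if you have one):

```
# /home/sailpoint/docker.env - assumed location and contents; the proxy
# address 1.2.3.4:8080 is the placeholder from the earlier post
HTTP_PROXY=http://1.2.3.4:8080
HTTPS_PROXY=http://1.2.3.4:8080
http_proxy=http://1.2.3.4:8080
https_proxy=http://1.2.3.4:8080
```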
I find it frustrating that every time a new “enhancement” or application is introduced, the HTTP proxy scenario is not tested, and we have to invest time and energy to get it fixed. At least this time around we didn’t have to wait for weeks until a workaround or solution was provided.
After a second session with SailPoint Support we followed roughly these steps (for the second VA, after doing a lot more with the first one):
cp proxy.yaml proxy.yaml.bak
va-bootstrap -v reset
Reset the VA (this will also delete the proxy.yaml).
The reset command is NOT documented (why?)
cp proxy.yaml.bak proxy.yaml
va-bootstrap -v set-passphrase
Add the code to the UI for the VA.
A comment from the SailPoint support team: the page where you input the code should not be closed for at least 2-3 minutes after pairing a new VA. Why?
touch config.yaml
wait at least 30 seconds (not sure what for)
sudo systemctl restart charon
tail -f log/charon.log
wait until the VA is restarted automatically (5-6 minutes)
2024-10-25T12:18:52Z 0228e51ada45 /usr/local/bin/confd[33]: INFO Target config /opt/sailpoint/workflow/jobs/VA_REBOOT has been updated
tail -f log/charon.log again; after another 5-7 minutes charon starts to pick up the proxy configuration and actually does something
SailPoint internal ticket: SAASVA-290
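The steps above (minus the UI part) can be sketched as a single script. This is only a sketch under assumptions: the files live in /home/sailpoint, and va-bootstrap exists only on a SailPoint VA, so the whole sequence is skipped on any other machine:

```shell
#!/bin/sh
# Sketch of the recovery steps above. Assumes all files live in /home/sailpoint.
if command -v va-bootstrap >/dev/null 2>&1; then
  cd /home/sailpoint
  cp proxy.yaml proxy.yaml.bak     # "reset" also deletes proxy.yaml, so back it up first
  va-bootstrap -v reset            # the undocumented reset command
  cp proxy.yaml.bak proxy.yaml     # restore the proxy configuration
  va-bootstrap -v set-passphrase   # then enter the code in the UI; keep that page open 2-3 minutes
  touch config.yaml                # the workaround that makes the proxy env variables get set
  sleep 30                         # support suggested waiting at least 30 seconds here
  sudo systemctl restart charon    # the VA reboots itself ~5-6 minutes later; watch log/charon.log
else
  echo "va-bootstrap not found; this sketch only applies on a SailPoint VA"
fi
```

After the automatic reboot, tailing log/charon.log for another 5-7 minutes is still needed before charon picks up the proxy configuration.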
These steps didn’t work for our second VA. The VA was added to the cluster, but the CCG service isn’t getting started.
Maybe someone should tell the SP developers about automated tests. Then something like this wouldn’t happen as often.
I have yet to hear from SailPoint when this issue will be fixed.
We have now been going back and forth with SailPoint Support for 3 weeks trying to add the second VA. They asked for logs, then for more logs, and now, after they “analyzed” them, they assume that some domains need to be whitelisted (although the same domains worked fine for the first VA). The actual issue is that their tools don’t use the proxy.
Later edit (15.11.2024):
The ccg service was suddenly started by the VA yesterday. In the worker.log I can see some related entries; we don’t know which changes were pushed by SailPoint.
["2024-11-14T16:21:08.759"] INFO - worker : "Processing job /opt/sailpoint/workflow/jobs/SERVICE_SETUP-fluentccgredis"
SERVICE_SETUP fluentccgredis locked.
SERVICE_SETUP fluentccgredis service setup called.
'/opt/sailpoint/workflow/services/fluent.service' -> '/etc/systemd/system/fluent.service'
Created symlink /etc/systemd/system/multi-user.target.wants/fluent.service → /etc/systemd/system/fluent.service.
Enabled fluent 0
Started fluent 0
'/opt/sailpoint/workflow/services/ccg.service' -> '/etc/systemd/system/ccg.service'
Created symlink /etc/systemd/system/multi-user.target.wants/ccg.service → /etc/systemd/system/ccg.service.
Enabled ccg 0
Started ccg 0
'/opt/sailpoint/workflow/services/redis.service' -> '/etc/systemd/system/redis.service'
Created symlink /etc/systemd/system/multi-user.target.wants/redis.service → /etc/systemd/system/redis.service.
Enabled redis 0
Started redis 0
Job result /opt/sailpoint/workflow/result/SERVICE_SETUP-fluentccgredis SUCCESS
["2024-11-14T16:21:20.363"] INFO - worker : "Processing job /opt/sailpoint/workflow/jobs/SERVICE_SETUP-otel_agent"
SERVICE_SETUP otel_agent locked.
SERVICE_SETUP otel_agent service setup called.
'/opt/sailpoint/workflow/services/otel_agent.service' -> '/etc/systemd/system/otel_agent.service'
Enabled otel_agent 0
Started otel_agent 0
Job result /opt/sailpoint/workflow/result/SERVICE_SETUP-otel_agent SUCCESS
["2024-11-14T16:21:38.845"] INFO - worker : "Processing job /opt/sailpoint/workflow/jobs/SYSTEM_EXEC-usrbincp1731601242"
SYSTEM_EXEC usrbincp1731601242 locked.
SYSTEM_EXEC usrbincp1731601242 service setup called.
Successfully ran [ /usr/bin/cp -f /opt/sailpoint/share/fc/cln/update.conf /etc/flatcar/update.conf && /usr/bin/systemctl restart update-engine.service ]
Job result /opt/sailpoint/workflow/result/SYSTEM_EXEC-usrbincp1731601242 SUCCESS
["2024-11-14T16:22:00.036"] INFO - worker : "Processing job /opt/sailpoint/workflow/jobs/SERVICE_SETUP-otel_agent"
SERVICE_SETUP otel_agent already processing.
["2024-11-14T16:22:00.040"] INFO - worker : "Processing job /opt/sailpoint/workflow/jobs/SYSTEM_EXEC-usrbincp1731601242"
SYSTEM_EXEC usrbincp1731601242 already processing.
The charon.log was still showing some errors today; a manual reboot was still needed.