
Friday, 19 February 2021

Elasticsearch on Workspace ONE Access (formerly vIDM) starts and exits with status 7

Issue


This week I had a problem with the Elasticsearch service on Workspace ONE Access (formerly vIDM), part of a new VMware Cloud Foundation environment (VCF version 4.x). The service fails during the startup phase on all the nodes that compose the cluster: 'elasticsearch start' exits with status 7.
The Workspace ONE Access version is 3.3.2-15951611.

Opening the console, an error message like “Error: Error log is in /var/log/boot.msg.” was displayed.
Part of the message is reported below:
 No JSON object could be decoded
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib64/python2.6/json/__init__.py", line 267, in load
    parse_constant=parse_constant, **kw)
  File "/usr/lib64/python2.6/json/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.6/json/decoder.py", line 319, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.6/json/decoder.py", line 338, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
Number of nodes in cluster is : 
Configuring /opt/vmware/elasticsearch/config/elasticsearch.yml file
Starting elasticsearch: 
<notice -- Feb 15 15:05:17.122319000> 'elasticsearch start' exits with status 7
<notice -- Feb 15 15:05:17.130417000> hzn-dots start
Application Server already running.
<notice -- Feb 15 15:05:17.339108000> 'hzn-dots start' exits with status 0
Master Resource Control: runlevel 3 has been reached
Failed services in runlevel 3: elasticsearch
Skipped services in runlevel 3: splash
<notice -- Feb 15 15:05:17.340630000> 
killproc: kill(456,3)

Solution


Disclaimer: If you are not fully aware of what you are changing, it is advisable to perform the procedures described below with the help of VMware GSS, to prevent the environment from becoming unstable. Use them at your own risk.

Short Answer
We just need to run the following commands on each Workspace ONE Access appliance to understand whether the nodes communicate with each other, and so on.

  • Check how many nodes are part of the cluster:
    curl -s -XGET http://localhost:9200/_cat/nodes
  • Check cluster health:
    curl http://localhost:9200/_cluster/health?pretty=true
  • Check the RabbitMQ queue list:
    rabbitmqctl list_queues | grep analytics
  • If the cluster health is red, run these commands (a scripted version follows this list):
    • to find UNASSIGNED SHARDS:
      curl -XGET localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason | grep UNASSIGNED
    • to DELETE SHARDS:
      curl -XGET http://localhost:9200/_cat/shards | grep UNASSIGNED | awk {'print $1'} | xargs -i curl -XDELETE "http://localhost:9200/{}"
  • Recheck the health to ensure it is green:
    curl http://localhost:9200/_cluster/health?pretty=true
  • Once green, check whether Elasticsearch is working.
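
For convenience, the red-status cleanup can be wrapped in a small script. This is only a sketch of the same two commands shown above (the ES variable and the confirmation prompt are my additions): it lists the indices that own UNASSIGNED shards and asks for confirmation before deleting them. Deleting indices destroys the analytics data they hold, so treat it with the same caution as the manual commands.

    #!/bin/bash
    # Sketch: delete the indices that own UNASSIGNED shards (same logic as
    # the manual commands above). Destructive: it removes analytics data.
    ES=http://localhost:9200
    indices=$(curl -s -XGET "$ES/_cat/shards" | grep UNASSIGNED | awk '{print $1}' | sort -u)
    [ -z "$indices" ] && { echo "No unassigned shards found."; exit 0; }
    echo "Indices with UNASSIGNED shards:"; echo "$indices"
    read -p "Delete these indices? [y/N] " answer
    if [ "$answer" = "y" ]; then
        for idx in $indices; do
            curl -s -XDELETE "$ES/$idx"; echo    # one DELETE per index
        done
    fi

Note the sort -u: the original one-liner issues one DELETE per unassigned shard, which is why the long answer below shows a string of index_not_found_exception responses after the first deletion of each index succeeds.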

  • Nodes may need to be restarted. Proceed as follows (a sketch of the wait step follows this list):
    • turn off two nodes and leave one active
    • turn one node back on, wait for it to appear in the cluster and start correctly
    • do the same with the third node
    • when the third node is active and present in the cluster, perform a clean restart cycle for the first node as well.
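
A minimal sketch of the "wait for it to appear in the cluster" step, assuming the default port 9200 on localhost and a three-node cluster (adjust EXPECTED otherwise):

    # Poll the cluster health until the expected node count is reached.
    EXPECTED=3
    until [ "$(curl -s http://localhost:9200/_cluster/health | sed 's/.*"number_of_nodes":\([0-9]*\).*/\1/')" = "$EXPECTED" ]; do
        echo "waiting for nodes to join..."; sleep 10
    done
    echo "cluster has $EXPECTED nodes"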


Long Answer (with command output)
The commands in the long answer are the same as those already explained above, but here we also report the output (of one node only). Remember that the commands must be performed on each node of the cluster.

  • Check how many nodes are part of the cluster:
    custm-vrsidm1:~ # curl -s -XGET http://localhost:9200/_cat/nodes
    10.174.28.18 10.174.28.18 6 98 0.31 d * Exploding Man
  • Check cluster health:
    custm-vrsidm1:~ # curl http://localhost:9200/_cluster/health?pretty=true
    {
      "cluster_name" : "horizon",
      "status" : "red",
      "timed_out" : false,
      "number_of_nodes" : 1,
      "number_of_data_nodes" : 1,
      "active_primary_shards" : 74,
      "active_shards" : 74,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 146,
      "delayed_unassigned_shards" : 0,
      "number_of_pending_tasks" : 0,
      "number_of_in_flight_fetch" : 0,
      "task_max_waiting_in_queue_millis" : 0,
      "active_shards_percent_as_number" : 33.63636363636363
    }
  • Check the RabbitMQ queue list:
    custm-vrsidm1:~ #  rabbitmqctl list_queues | grep analytics
    -.analytics.127.0.0.1   0
  • If the cluster health is red, run these commands:
    • to find UNASSIGNED SHARDS:
      custm-vrsidm1:~ # curl -XGET localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason| grep UNASSIGNED
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100 11440  100 11440    0     0   270k      0 --:--:-- --:--:-- --:--:--  279k
      v4_2021-02-14     4 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-14     1 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-14     2 p UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-14     2 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-14     3 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-14     0 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-03     4 p UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-03     4 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-03     3 p UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-03     3 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-01-28     4 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-01-28     3 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-01-28     2 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-01-28     1 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-01-28     0 r UNASSIGNED CLUSTER_RECOVERED
      v2_searchentities 4 p UNASSIGNED CLUSTER_RECOVERED
      v2_searchentities 4 r UNASSIGNED CLUSTER_RECOVERED
      v2_searchentities 1 r UNASSIGNED CLUSTER_RECOVERED
      v2_searchentities 2 r UNASSIGNED CLUSTER_RECOVERED
      v2_searchentities 3 r UNASSIGNED CLUSTER_RECOVERED
      v2_searchentities 0 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-06     4 p UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-06     4 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-01-27     0 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-05     4 p UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-05     4 r UNASSIGNED CLUSTER_RECOVERED
      .................................................
      v4_2021-02-05     2 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-05     1 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-05     0 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-01-26     4 p UNASSIGNED CLUSTER_RECOVERED
      v4_2021-01-26     4 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-04     1 r UNASSIGNED CLUSTER_RECOVERED
      v4_2021-02-04     0 r UNASSIGNED CLUSTER_RECOVERED
    • to DELETE SHARDS:
      custm-vrsidm1:~ # curl -XGET http://localhost:9200/_cat/shards | grep UNASSIGNED | awk {'print $1'} | xargs -i curl -XDELETE "http://localhost:9200/{}"
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100 16060  100 16060    0     0   589k      0 --:--:-- --:--:-- --:--:--  627k
      {"acknowledged":true}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-14","index":"v4_2021-02-14"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-14","index":"v4_2021-02-14"},"status":404}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-14","index":"v4_2021-02-14"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-14","index":"v4_2021-02-14"},"status":404}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-14","index":"v4_2021-02-14"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-14","index":"v4_2021-02-14"},"status":404}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-14","index":"v4_2021-02-14"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-14","index":"v4_2021-02-14"},"status":404}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-14","index":"v4_2021-02-14"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-14","index":"v4_2021-02-14"},"status":404}{"acknowledged":true}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-03","index":"v4_2021-02-03"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-03","index":"v4_2021-02-03"},"status":404}
      ..........................................................
      {"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-01-28","index":"v4_2021-01-28"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-01-28","index":"v4_2021-01-28"},"status":404}{"acknowledged":true}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v2_searchentities","index":"v2_searchentities"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v2_searchentities","index":"v2_searchentities"},"status":404}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v2_searchentities","index":"v2_searchentities"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v2_searchentities","index":"v2_searchentities"},"status":404}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v2_searchentities","index":"v2_searchentities"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v2_searchentities","index":"v2_searchentities"},"status":404}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v2_searchentities","index":"v2_searchentities"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v2_searchentities","index":"v2_searchentities"},"status":404}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v2_searchentities","index":"v2_searchentities"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v2_searchentities","index":"v2_searchentities"},"status":404}{"acknowledged":true}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-06","index":"v4_2021-02-06"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-06","index":"v4_2021-02-06"},"status":404}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-06","index":"v4_2021-02-06"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-06","index":"v4_2021-02-06"},"status":404}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-06","index":"v4_2021-02-06"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-06","index":"v4_2021-02-06"},"status":404}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-04","index":"v4_2021-02-04"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-04","index":"v4_2021-02-04"},"status":404}{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no 
such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-04","index":"v4_2021-02-04"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"v4_2021-02-04","index":"v4_2021-02-04"},"status":404}
  • Recheck the health to ensure it is green:
    custm-vrsidm1:~ # curl http://localhost:9200/_cluster/health?pretty=true
    {
      "cluster_name" : "horizon",
      "status" : "green",
      "timed_out" : false,
      "number_of_nodes" : 1,
      "number_of_data_nodes" : 1,
      "active_primary_shards" : 0,
      "active_shards" : 0,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 0,
      "delayed_unassigned_shards" : 0,
      "number_of_pending_tasks" : 0,
      "number_of_in_flight_fetch" : 0,
      "task_max_waiting_in_queue_millis" : 0,
      "active_shards_percent_as_number" : 100.0
    }
  • Once green, check whether Elasticsearch is working.

  • After all nodes have been rebooted, number_of_nodes and number_of_data_nodes are now three (in my case), as they should be:
    custm-vrsidm1:~ # curl http://localhost:9200/_cluster/health?pretty=true
    {
      "cluster_name" : "horizon",
      "status" : "green",
      "timed_out" : false,
      "number_of_nodes" : 3,
      "number_of_data_nodes" : 3,
      "active_primary_shards" : 5,
      "active_shards" : 10,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 0,
      "delayed_unassigned_shards" : 0,
      "number_of_pending_tasks" : 0,
      "number_of_in_flight_fetch" : 0,
      "task_max_waiting_in_queue_millis" : 0,
      "active_shards_percent_as_number" : 100.0
    }
    custm-vrsidm1:~ #
    custm-vrsidm1:~ #  curl -s -XGET http://localhost:9200/_cat/nodes
    10.174.28.19 10.174.28.19 14 97 0.20 d * Orka
    10.174.28.20 10.174.28.20  5 97 0.18 d m Mongoose
    10.174.28.18 10.174.28.18 11 96 0.47 d m Urthona
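
Since the same checks must be repeated on every node, a small loop can save some typing. A sketch, assuming root SSH access between the appliances; custm-vrsidm2 and custm-vrsidm3 are hypothetical names patterned on the first node:

    # Run the node check on each appliance; replace the hostnames with yours.
    for node in custm-vrsidm1 custm-vrsidm2 custm-vrsidm3; do
        echo "=== $node ==="
        ssh root@$node "curl -s -XGET http://localhost:9200/_cat/nodes"
    done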


So now vIDM seems to be up and running. If we check the NSX-T load balancer, we can see that the pool is successfully contacting all nodes. We are also able to log in and verify graphically that everything is fine.

A double check can be done by verifying the file /var/log/boot.msg:
<notice -- Feb 16 18:31:28.776900000> 
elasticsearch start

horizon-workspace service is running
Waiting for IDM: ..........
<notice -- Feb 16 18:33:44.203450000> checkproc: /opt/likewise/sbin/lwsmd 1419
<notice -- Feb 16 18:33:44.530367000> 
checkproc: /opt/likewise/sbin/lwsmd 
1419

... Ok.
Number of nodes in cluster is : 3
Configuring /opt/vmware/elasticsearch/config/elasticsearch.yml file
Starting elasticsearch: done.
    elasticsearch logs: /opt/vmware/elasticsearch/logs
    elasticsearch data: /db/elasticsearch
<notice -- Feb 16 18:34:39.403558000> 
'elasticsearch start' exits with status 0
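
Rather than reading the whole file, the relevant lines can be pulled out with a grep such as:

    # Show the elasticsearch exit status recorded at boot, with one line of context.
    grep -B1 -A1 "'elasticsearch start' exits" /var/log/boot.msg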


That's it.

Wednesday, 27 June 2018

VMware Identity Manager 2.9 – Could not Pull the Required Object From Identity Manager

Disclaimer: Some of the procedures described below are not officially supported by VMware. Use them at your own risk.

Problem
I was contacted by a customer because the Identity Manager I had set up some time ago no longer seemed to authenticate users correctly. It looked as if the IDM was no longer contacting the Domain Controllers.
After logging in as an administrator user to the IDM appliance (in my case IDM version 2.9.2.0 build 6095217), I checked the Directories under the Identity & Access Management tab and noticed that, next to the Sync Now button in the row of the affected Directory Name, there was a red X indicating that the IDM could not correctly synchronize with/contact AD (Active Directory).
I then clicked Sync Now and checked the Sync Log to see what was happening; it showed the following error message:

Could not pull the required object from Identity Manager


We connect via SSH to the Identity Manager to gather more information from the log files.
The vIDM log files live in the directory "/opt/vmware/horizon/workspace/logs/"; specifically, I found some interesting information in the connector.log file (see below).

2018-06-20 14:11:28,514 INFO  (Timer-24) [3002@WSIDM;;] com.vmware.horizon.connector.admin.StateService - Saving config for 3002@WSIDM to file /usr/local/horizon/conf/states/WSIDM/3002/config-state.json
2018-06-20 14:11:28,521 INFO  (Timer-24) [3002@WSIDM;;] com.vmware.horizon.connector.admin.StateService - Saving state config to disk DONE.
2018-06-20 14:12:26,057 INFO  (Timer-18) [3002@WSIDM;;] com.vmware.horizon.connector.utils.RestClient - END   sendRequestBase (https://wsidm.<CUSTOMER NAME>.it/SAAS/t/wsidm/jersey/manager/api/connectormanagement/directoryconfigs/19321c7b-0172-4eaa-89b4-c54a316a4514/syncprofile, ..., application/vnd.vmware.horizon.manager.connector.management.directory.sync.profile+json, GET, null, ...)
2018-06-20 14:12:26,057 WARN  (Timer-18) [3002@WSIDM;;] com.vmware.horizon.engine.ObjectPullEngine - Code from Service :-404
2018-06-20 14:12:26,057 ERROR (Timer-18) [3002@WSIDM;;] com.vmware.horizon.engine.ObjectPullEngine - Error message from Service :-Request timed out..[response-Request timed out.
2018-06-20 14:12:26,057 ERROR (Timer-18) [3002@WSIDM;;] com.vmware.horizon.engine.ObjectPullEngine - Could not retrieve required object from Horizon
com.vmware.horizon.connector.exception.PullEngineException: Could not retrieve required object from Horizon
        at com.vmware.horizon.engine.ObjectPullEngine.getObjectFromHorizon(ObjectPullEngine.java:98)
        at com.vmware.horizon.connector.connectormanagement.DirectorySyncConfigPullEngine.getDirectorySyncConfigFromService(DirectorySyncConfigPullEngine.java:44)
        at com.vmware.horizon.connector.admin.DirectorySyncConfigUpdateService.updateDirectorySyncConfigFromService(DirectorySyncConfigUpdateService.java:43)
        at com.vmware.horizon.connector.admin.SyncScheduleService.syncIfAppropriate(SyncScheduleService.java:153)
        at com.vmware.horizon.connector.admin.ScheduleService$1.run(ScheduleService.java:83)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)
2018-06-20 14:12:26,060 ERROR (Timer-18) [3002@WSIDM;;] com.vmware.horizon.connector.admin.ScheduleService - Sync of Directory aborted.
com.vmware.horizon.connector.exception.PullEngineException: Could not retrieve required object from Horizon
        at com.vmware.horizon.engine.ObjectPullEngine.getObjectFromHorizon(ObjectPullEngine.java:98)
        at com.vmware.horizon.connector.connectormanagement.DirectorySyncConfigPullEngine.getDirectorySyncConfigFromService(DirectorySyncConfigPullEngine.java:44)
        at com.vmware.horizon.connector.admin.DirectorySyncConfigUpdateService.updateDirectorySyncConfigFromService(DirectorySyncConfigUpdateService.java:43)
        at com.vmware.horizon.connector.admin.SyncScheduleService.syncIfAppropriate(SyncScheduleService.java:153)
        at com.vmware.horizon.connector.admin.ScheduleService$1.run(ScheduleService.java:83)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)
2018-06-20 14:12:26,060 INFO  (SimpleAsyncTaskExecutor-171294) [3002@WSIDM;;] com.vmware.horizon.client.rest.Utils - BEGIN sendRequestBase (https://wsidm.<CUSTOMER NAME>.it/SAAS/t/wsidm/API/1.0/REST/auth/cert, ..., application/x-www-form-urlencoded, GET, null, ...)
2018-06-20 14:12:26,060 ERROR (Timer-18) [3002@WSIDM;;] com.vmware.horizon.connector.mvc.UIAlerts - Could not pull the required object from Identity Manager. Request timed out..[response-Request timed out.]
2018-06-20 14:12:26,060 INFO  (Timer-18) [3002@WSIDM;;] com.vmware.horizon.connector.admin.SyncScheduleService - Directory sync method: end.
2018-06-20 14:12:26,070 INFO  (SimpleAsyncTaskExecutor-171294) [3002@WSIDM;;] com.vmware.horizon.client.rest.Utils - END   sendRequestBase (https://wsidm.<CUSTOMER NAME>.it/SAAS/t/wsidm/API/1.0/REST/auth/cert, ..., application/x-www-form-urlencoded, GET, null, ...)
I notice an entry with "Could not retrieve required object from Horizon", which makes me think of some change I am not aware of :-) ... Googling the string shown in the web interface, "Could not Pull the Required Object From Identity Manager", I see that I am not the only one with this problem, and I find an interesting post by Matt Allford at this link.

I then ask the customer whether anything has changed with the Domain Controllers, and I discover that one of them has been powered off (because it was decommissioned).

Once configured, Identity Manager does not automatically query which Domain Controllers are available; it refers to a "domain_krb.properties" file created during the configuration phase.
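
For reference, the entries in domain_krb.properties map a domain to an ordered, comma-separated list of domain controllers with their ports, along the lines of this hypothetical example:

    # /usr/local/horizon/conf/domain_krb.properties (hypothetical entry)
    example.com=dc1.example.com:389,dc2.example.com:389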

Official VMware documentation, in the Active Directory integration section


Having identified the name of the decommissioned DC, we proceed to manually edit the domain_krb.properties file as follows (a condensed sketch of steps 2-6 appears after the list):

1. Log in to the vIDM (use sshuser and then become root).

2. Make a copy of the original file:
cp /usr/local/horizon/conf/domain_krb.properties /usr/local/horizon/conf/domain_krb.properties.ORIG

3. Edit the file: vi /usr/local/horizon/conf/domain_krb.properties

4. Apply the changes required by the Domain Controller changes and save. In our case, remove the entry for the DC that no longer exists.

5. Change the ownership of the file by running:
chown horizon:www /usr/local/horizon/conf/domain_krb.properties

6. Restart the service by running service horizon-workspace restart

7. Launch a Sync Now to verify that everything works correctly.
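
Steps 2 to 6 condensed into a sketch. olddc.example.com is a hypothetical name for the decommissioned Domain Controller, and the sed assumes the host:port list format shown earlier; replace both with your actual values (or simply edit the file with vi as in step 3):

    #!/bin/bash
    # Sketch of steps 2-6; olddc.example.com is a placeholder.
    F=/usr/local/horizon/conf/domain_krb.properties
    cp "$F" "$F.ORIG"                                  # step 2: backup
    sed -i 's/olddc\.example\.com:389,\?//g' "$F"      # step 4: drop the dead DC entry
    chown horizon:www "$F"                             # step 5: restore ownership
    service horizon-workspace restart                  # step 6: restart the service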

To my surprise, I realize that the procedure just described has had no effect: the problem is still there, and the vIDM still does not synchronize correctly with the Active Directory infrastructure.


Solution
Analyzing the connector log files in more depth, and inspired by KB2145438, I noticed that all the requests seem to be started by an initiator (in my case 3002@<SERVER NAME>) whose configuration file is located at /usr/local/horizon/conf/states/<SERVER NAME>/3002/config-state.json (see the first line of the log excerpt above).

Checking the content of the config-state.json file, I find that there are entries with the name of the decommissioned Domain Controller.


I therefore decide to proceed as follows (a final verification sketch appears after the steps)...

8. Make a copy of the source file (before making any change) as follows: cp /usr/local/horizon/conf/states/<SERVER NAME>/3002/config-state.json /usr/local/horizon/conf/states/<SERVER NAME>/3002/config-state.json.ORIG

wsidm:~ # cp /usr/local/horizon/conf/states/WSIDM/3002/config-state.json /usr/local/horizon/conf/states/WSIDM/3002/config-state.json.ORIG


9. Replace the name of the decommissioned DC with an existing one: sed -i 's/<OLD DC>/<NEW DC>/g' /usr/local/horizon/conf/states/<SERVER NAME>/3002/config-state.json

wsidm:~ # sed -i 's/sgrsodc2017/sgrbodc2/g' /usr/local/horizon/conf/states/WSIDM/3002/config-state.json

10. Restart the service by running service horizon-workspace restart


11. Launch a Sync Now to verify that everything works correctly.



12. Verify access with a service account. Having pointed the vIDM at a nearby DC, the authentication procedure felt faster (though I did not actually measure it).
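
Before trusting the final Sync Now, it can be worth confirming that no stale reference to the decommissioned DC survives in either file (using the same names as in the transcript above):

    # Should print nothing if the old DC has been removed everywhere.
    grep -i 'sgrsodc2017' /usr/local/horizon/conf/domain_krb.properties \
        /usr/local/horizon/conf/states/WSIDM/3002/config-state.json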


Conclusion

I recommend verifying all the steps indicated: both editing the domain_krb.properties file (removing the DCs that no longer exist) and replacing the entries for the decommissioned DC inside the config-state.json file with a DC for which DNS Service Location (SRV record) lookup is enabled, as indicated here.