I was stumped by a recent escalation: vSphere reported invalid credentials when logging in as a user who belongs to another Active Directory forest.
Some background info:
- Both forests are in a two-way transitive trust
- vCenter was joined with Integrated Windows Authentication to a primary domain in one of the forests
- This is supported as per this VMware KB
- No DNS related issues. Forward/Reverse lookup works fine.
Logs just reported the error Native platform error [code: 851968]
[2021-09-23T03:42:14.708Z tomcat-http--3 vsphere.local a340b2e0-b5ac-47cd-8f48-12223ef8eaa1 INFO com.vmware.identity.idm.server.IdentityManager] Authentication failed for user [test@gs.lab] in tenant [vsphere.local] in [34] milliseconds with provider [corp.local] of type [com.vmware.identity.idm.server.provider.activedirectory.ActiveDirectoryProvider]
[2021-09-23T03:42:14.708Z tomcat-http--3 vsphere.local a340b2e0-b5ac-47cd-8f48-12223ef8eaa1 ERROR com.vmware.identity.idm.server.ServerUtils] Exception 'com.vmware.identity.idm.IDMLoginException: Native platform error [code: 851968][null][null]' com.vmware.identity.idm.IDMLoginException: Native platform error [code: 851968][null][null]
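A side note on the number itself: 851968 is 0xD0000 in hex. If the code is a GSS-API major status (an assumption, since the log does not say what kind of code it is), it splits into the three fields laid out in RFC 2744, and routine error 13 is GSS_S_FAILURE, a generic failure that carries no detail of its own. A minimal sketch of that split, where decode_gss_major is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Split a GSS-API major status word into its fields (layout per RFC 2744).
# Assumption: the IDM "Native platform error" code is such a status word.
decode_gss_major() {
  local status=$1
  local calling=$(( (status >> 24) & 0xff ))   # calling error field
  local routine=$(( (status >> 16) & 0xff ))   # routine error field
  local supplementary=$(( status & 0xffff ))   # supplementary info bits
  printf 'calling=%d routine=%d supplementary=%d\n' \
    "$calling" "$routine" "$supplementary"
}

decode_gss_major 851968
# prints: calling=0 routine=13 supplementary=0
```

Read this way, the code only says "something failed", which is consistent with having to dig further into the logs.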
Validated the two-way transitive trust using the command: /opt/likewise/bin/lw-lsa get-status
Trust was normal.
Enabled trace logging - more information in this VMware KB
/opt/likewise/bin/lwsm set-log-level trace
/opt/likewise/bin/lwio-set-log-info trace
/opt/likewise/bin/lwnet-set-log-level trace
/opt/likewise/bin/lw-set-log-level trace
From /var/log/messages
2021-09-23T07:00:42.105837+00:00 vcsa lsassd[1475]: 0x7fb346ffd700:[NtlmServerAcquireCredentialsHandle() ../lsass/server/ntlm/acquirecreds.c:103] Error code: 40506 (symbol: LW_ERROR_NO_CRED)
2021-09-23T07:00:42.106486+00:00 vcsa lsassd[1475]: 0x7fb369d7c700:[lwmsg_peer_log_message() ../lwmsg/src/peer-task.c:212] (assoc:0x7fb338000e00 >> 14355) CALL RES NTLM_R_GENERIC_FAILURE:
{
dwError = 40506
}
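Trace logging is very noisy, so it can help to reduce the lsassd lines to the distinct error codes and symbols they report. The sketch below assumes the "Error code: N (symbol: NAME)" layout seen in the excerpt above; summarize_lsass_errors is a hypothetical helper, not a Likewise tool:

```shell
#!/usr/bin/env bash
# Print each distinct "code symbol" pair logged by lsassd, with a count,
# most frequent first. Assumes the line layout shown in the excerpt above.
summarize_lsass_errors() {
  grep 'lsassd\[' "$1" \
    | sed -n 's/.*Error code: \([0-9][0-9]*\) (symbol: \([A-Z0-9_]*\)).*/\1 \2/p' \
    | sort | uniq -c | sort -rn
}

# Usage: summarize_lsass_errors /var/log/messages
```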
The error NTLM_R_GENERIC_FAILURE was inconclusive, so we decided to capture packets and look at the Kerberos responses coming back from the domain controller.
tcpdump -i eth0 -w vcsa-2309.capture
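The command above captures everything on eth0. If the capture file gets too large, a pcap filter can narrow it to the protocols of interest. The sketch below builds such a filter from a list of ports; build_port_filter is a hypothetical helper, and the port choice (88 for Kerberos, 389 for LDAP, 53 for DNS) is an assumption about what is relevant here:

```shell
#!/usr/bin/env bash
# Build a pcap filter expression that matches any of the given ports.
build_port_filter() {
  local filter="" port
  for port in "$@"; do
    # Join the terms with " or " once there is at least one term.
    filter+="${filter:+ or }port ${port}"
  done
  printf '%s\n' "$filter"
}

# 88 = Kerberos, 389 = LDAP, 53 = DNS
# tcpdump -i eth0 -w vcsa-2309.capture "$(build_port_filter 88 389 53)"
```

No host filter is used on purpose: in a cross-forest login the appliance may talk to more than one domain controller.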
From the packet capture, we were able to trace the root cause to the environment:
- AS-REQ was sent to the domain controller from vCenter Server. An AS-REQ is an Authentication Service message which exchanges credentials for tickets.
as-req
pvno: 5
msg-type: krb-as-req (10)
padata: 1 item
PA-DATA pA-REQ-ENC-PA-REP
padata-type: pA-REQ-ENC-PA-REP (149)
padata-value: <MISSING>
req-body
Padding: 0
kdc-options: 00000010
cname
name-type: kRB5-NT-PRINCIPAL (1)
cname-string: 1 item
CNameString: test
realm: gs.labs
sname
name-type: kRB5-NT-SRV-INST (2)
sname-string: 2 items
SNameString: krbtgt
SNameString: gs.labs
till: 2021-09-24 07:03:46 (UTC)
nonce: 158691297
etype: 3 items
ENCTYPE: eTYPE-AES256-CTS-HMAC-SHA1-96 (18)
ENCTYPE: eTYPE-AES128-CTS-HMAC-SHA1-96 (17)
ENCTYPE: eTYPE-ARCFOUR-HMAC-MD5 (23)
- AS-REP was returned successfully by the domain controller with the key. AS-REP is the response to the Authentication Service message.
as-rep
pvno: 5
msg-type: krb-as-rep (11)
padata: 1 item
PA-DATA pA-ETYPE-INFO2
padata-type: pA-ETYPE-INFO2 (19)
padata-value: XXXX
crealm:
cname
name-type: kRB5-NT-PRINCIPAL (1)
cname-string: 1 item
CNameString: vcsa-test
ticket
tkt-vno: 5
realm: gs.labs
sname
name-type: kRB5-NT-SRV-INST (2)
sname-string: 2 items
SNameString: krbtgt
SNameString: gs.labs
enc-part
etype: eTYPE-AES256-CTS-HMAC-SHA1-96 (18)
kvno: 2
cipher: XXX
enc-part
etype: eTYPE-AES256-CTS-HMAC-SHA1-96 (18)
kvno: 5
cipher: XXX
- TGS-REQ was sent next from the vCenter Server. A TGS-REQ is a Ticket Granting Service request. It is similar to the Authentication Service message, but it names the service principal that a ticket is being requested for. In the example below it is host/vcsa.gs.labs
tgs-req
pvno: 5
msg-type: krb-tgs-req (12)
padata: 2 items
PA-DATA pA-TGS-REQ
padata-type: pA-TGS-REQ (1)
padata-value: xxxx
PA-DATA pA-FX-FAST
padata-type: pA-FX-FAST (136)
padata-value: xxxx
req-body
Padding: 0
kdc-options: 00810000
realm: gs.labs
sname
name-type: kRB5-NT-SRV-HST (3)
sname-string: 2 items
SNameString: host
SNameString: vcsa.gs.labs
till: 2021-09-23 17:03:45 (UTC)
nonce: 1632380625
etype: 3 items
ENCTYPE: eTYPE-AES256-CTS-HMAC-SHA1-96 (18)
ENCTYPE: eTYPE-AES128-CTS-HMAC-SHA1-96 (17)
ENCTYPE: eTYPE-ARCFOUR-HMAC-MD5 (23)
- The domain controller is supposed to reply with a TGS-REP. However, in this instance we got an error: KRB5KDC_ERR_S_PRINCIPAL_UNKNOWN
krb-error
pvno: 5
msg-type: krb-error (30)
stime: 2021-09-23 07:03:45 (UTC)
susec: 569859
error-code: eRR-S-PRINCIPAL-UNKNOWN (7)
realm: gs.labs
sname
name-type: kRB5-NT-SRV-HST (3)
sname-string: 2 items
SNameString: host
SNameString: vcsa.gs.labs
- The KRB5KDC_ERR_S_PRINCIPAL_UNKNOWN error will need to be diagnosed further from the domain controller. Why is the domain controller saying the principal is unknown when the vCenter Server has good membership with a domain in the trusted forest?
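As an aid for reading the dumps above, the kdc-options values in the two requests are bit masks. RFC 4120 numbers the flags from the most significant bit, so flag N corresponds to the mask 1 << (31 - N). A minimal sketch of a decoder follows; decode_kdc_options is a hypothetical helper, and its table lists only a subset of the defined flags:

```shell
#!/usr/bin/env bash
# Decode a Kerberos KDCOptions hex string into flag names.
# Bit numbering follows RFC 4120 (bit 0 is the most significant bit);
# canonicalize (bit 15) comes from RFC 6806. Subset of flags only.
decode_kdc_options() {
  local value=$(( 16#$1 ))
  local -a names=(
    [1]=forwardable [2]=forwarded [3]=proxiable [4]=proxy
    [5]=allow-postdate [6]=postdated [8]=renewable [15]=canonicalize
    [26]=disable-transited-check [27]=renewable-ok
    [28]=enc-tkt-in-skey [30]=renew [31]=validate
  )
  local bit out=""
  for bit in "${!names[@]}"; do
    if (( value & (1 << (31 - bit)) )); then
      out+="${out:+ }${names[bit]}"
    fi
  done
  printf '%s\n' "${out:-none}"
}

decode_kdc_options 00000010   # AS-REQ value from the capture
decode_kdc_options 00810000   # TGS-REQ value from the capture
# prints:
#   renewable-ok
#   renewable canonicalize
```

Neither request sets anything unusual, which fits the picture that the failure lies in the service principal lookup on the domain controller rather than in how vCenter built the requests.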