Malware Technology

Personal Journal on Defensive Cybersecurity, DevOps, and anything that comes to mind


Killing Them (Passwords) Slowly: Architecting Zero-Trust with Passwordless Infrastructure Access

Killing them (Passwords) Softly

Organizations are still getting breached in 2023 due to credential abuse. Phishing and the purchase of credentials from malicious insiders are avenues of initial access used by prolific ransomware groups. Phishing training for users is common practice, but it’s a band-aid put on the problem. Even tech-savvy users may make a mistake; they might be distracted or fatigued, and the phishermen are always improving their tactics and lures.

The real problem isn’t the user; it’s the reliance upon weak forms of authentication.

Using modern security practices and protocols, is it possible to solve the problem of credential phishing and password abuse?

Are Password Managers the One Way Forward?


Lastpass Hacked


Passwords are everywhere, despite being commonly known to be a weak form of authentication. The push to 2FA/MFA is evidence of that. Passwords are weak even at a conceptual level because there is no physical component to them. Once a password is known to an attacker, it can be abused. It can be brute-forced, guessed, or cracked if its hash is obtained. Application-integrated mitigations such as account lockouts or alerting on impossible travel reduce the risk but do not eliminate it. Even utilizing a strong hashing algorithm for credentials only reduces risk, as weak passwords can still be guessed in a dictionary attack.
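
To make the “hashing only reduces risk” point concrete, here is a minimal sketch of a dictionary attack in Python. It assumes a leaked store of unsalted SHA-256 hashes (a weak scheme, chosen for brevity); the accounts, passwords, and wordlist are made up for illustration.

```python
import hashlib

def sha256_hex(password: str) -> str:
    return hashlib.sha256(password.encode()).hexdigest()

# A "breached" credential store of unsalted SHA-256 hashes (illustrative only).
leaked_hashes = {
    sha256_hex("hunter2"): "alice",
    sha256_hex("Summer2023!"): "bob",
}

# A tiny wordlist plus common mangling rules stands in for a real dictionary.
wordlist = ["password", "hunter2", "letmein", "summer2023"]
mangles = [
    lambda w: w,
    lambda w: w + "!",
    lambda w: w.capitalize(),
    lambda w: w.capitalize() + "!",
]

for word in wordlist:
    for mangle in mangles:
        candidate = mangle(word)
        owner = leaked_hashes.get(sha256_hex(candidate))
        if owner:
            print(f"cracked {owner}: {candidate}")
```

Note that account lockouts don’t help here: the guessing happens entirely offline against the stolen hashes.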


Bandaids are put on the problem, but none solve the root cause.

Problems with Password Storage and Generation


Password policies are one such “solution”, requiring users to remember and utilize long and complex passwords that are rotated every 30/60/90 days (policy-dependent). But that solution can create its own vulnerabilities as users, unwilling or unable to repeatedly remember a new and complex password, develop their own workarounds to IT policy.

Users may end up writing their password down and leaving it in an unsecured location (ever seen a movie where the password is on a sticky note under the keyboard?). The digital equivalent would be a user storing credentials in a plaintext file, perhaps even with a name like passwords.txt or credentials.xls. I recently heard of a variation of this method being used by somebody who works in IT!

Slightly more security-conscious users might use new passwords that are based upon some common pattern that satisfies an organization’s password policy. They might use something like %Hunter2, %Hunter3, %Hunter4, all of which might meet character and length requirements, but which could be easily guessed if an attacker knew the common pattern.

Worse yet, since users don’t want to have to remember a bunch of different passwords for different environments, they may end up using the same password for different applications, both internal and external to an organization. This increases risk because if any of those applications are breached, an attacker who obtains a user’s password may be able to move laterally from a system of seemingly little value and bad security practices (e.g. a 3rd-party event registration page with passwords stored in plaintext) to one with extreme value (the organization’s Active Directory).

The best, but still not perfect solution for using passwords is to use a password manager.

Enforcing Stronger Passwords


Password managers allow users to remember a single “master” password, which then allows them to view and store unique, complex passwords for each application. The underlying data store is always encrypted on disk, with decryption only taking place in memory. When a password is generated, it can be configured to follow specific length, character, and case requirements. This takes the onus of generating a strong password off the user, and if the target application follows good security practices, like using a strong salted hash, it can make the password infeasible for an attacker to crack with today’s technology.
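
As a rough sketch of both halves of that claim, the snippet below generates a password the way a manager might (a CSPRNG with a configurable length and character set) and stores it the way a well-behaved application might (a random per-user salt plus a memory-hard hash). The parameters are illustrative, not a tuning recommendation.

```python
import hashlib
import os
import secrets
import string

def generate_password(length: int = 24) -> str:
    # Roughly what a password manager does: draw characters from a CSPRNG,
    # with length and character classes set by policy.
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))

def hash_for_storage(password: str) -> tuple[bytes, bytes]:
    # Roughly what a well-behaved application does: per-user salt plus a
    # deliberately expensive, memory-hard hash (scrypt here).
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return salt, digest

password = generate_password()
salt, digest = hash_for_storage(password)
print(password)
print(salt.hex(), digest.hex())
```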

This isn’t a bad solution if used properly, but it doesn’t solve all the problems already described.

Problems with Password Managers


First, users have to be trained or even forced to use a password manager. Web applications and desktop applications have no way to determine whether a user has made up the password themselves or had one generated by the password manager. Even if an organization pre-installs a password manager on all workstations, there is no guarantee that a user will use it when registering for a new application. User training remains a necessity, but compliance will be hard to test.

Another problem with password managers is availability. If a user utilizes an online password manager such as 1Password or LastPass (which increases risk, as attackers have already been shown to be capable of compromising the latter), what happens when they do not have Internet access? If they use an offline password manager such as KeePassXC, what happens when they use a device that doesn’t have the local encrypted database file? In both cases, some form of sync will be required; this additional requirement increases complexity and potentially increases risk through the involvement of an online third party. But what about when a user is utilizing an untrusted computer, such as one in a public library? Online password managers usually have some form of browser extension that will apply the appropriate password to a login form given the URL (which would still be susceptible to traffic interception when the POST is eventually sent), but what of offline password managers? Copying from an offline password manager utilizes the system clipboard, which is vulnerable to malware and is oftentimes targeted by “stealers” that intend to gather credentials for a threat actor. Keyloggers can capture keyboard input in the case that a user manually types out a password displayed in their password manager.

A password manager is intended to securely generate and store credentials, but what happens when a user doesn’t use its password generation capability? If a user suspects that they may need to remember the password in an environment where they may not have access to their password manager, they may opt to create a (weaker) password themselves. In this case, a user may use almost all aspects of a password manager but still use weak practices that nullify nearly all the value.

Clearly, although password managers are useful, they are not bulletproof. Password managers complement password policies; they do not replace them.

Mitigating Risks Associated with Password-based Authentication


Organizations attempt to mitigate the problems associated with passwords in a couple of different ways. First, they may try to reduce the number of passwords required for a user. This is done through the introduction of single sign-on (SSO). The second mitigation is the introduction of two-factor (2FA) or multi-factor (MFA) authentication.

Single Sign-On in the Ancient Times


FreeIPA

One of the first (and still widely adopted) forms of single sign-on might be the Lightweight Directory Access Protocol (LDAP). It is present in Linux and Windows environments through the use of tools like FreeIPA or Active Directory. LDAP allows a user to remember only one password and be able to utilize all LDAP-enabled applications required for work. LDAP is mature technology, so it is commonly supported by enterprise applications deployed on-premises. This convenience doesn’t come free, though. Introducing LDAP into an environment has the cost of additional infrastructure to deploy and manage.

The deployment of FreeIPA in a Linux environment requires the usage of an agent like sssd that can “bind” to the directory. Authorization policies such as:

  • HBAC rules
    • Who can access the server/service
  • Sudoer rules
    • What they are allowed to do with elevated permissions

as well as public keys are stored on the FreeIPA master(s), and then sssd agents pull that information in order to make authorization decisions. All of this is built upon the LDAP protocol.
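
To make the sssd/LDAP relationship concrete, here is a minimal sketch of the kind of lookup an agent performs against a FreeIPA directory, using the third-party ldap3 library. The hostname, base DN, and bind credentials are placeholders; the attributes queried (memberOf, ipaSshPubKey) are the ones FreeIPA publishes for group membership and SSH public keys.

```python
from ldap3 import Server, Connection, ALL

# Placeholder host and bind identity for illustration only.
server = Server("ipa.example.internal", use_ssl=True, get_info=ALL)
conn = Connection(
    server,
    user="uid=svc-bind,cn=users,cn=accounts,dc=example,dc=internal",
    password="changeme",  # a real deployment protects this bind credential
    auto_bind=True,
)

# Pull a user's group memberships and published SSH public keys, the same
# kind of data sssd uses for authorization decisions and key-based login.
conn.search(
    search_base="cn=users,cn=accounts,dc=example,dc=internal",
    search_filter="(uid=alice)",
    attributes=["memberOf", "ipaSshPubKey"],
)
for entry in conn.entries:
    print(entry.entry_to_json())
```

Every one of these lookups assumes the agent can reach the directory, which is exactly the fragility described below.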

Larger environments spread across disparate networks highlight an issue with the protocol. LDAP is an older protocol and was designed with the assumption that LDAP-enabled applications will be able to directly connect to the LDAP server (or proxy). In disparate environments, especially spanning the Internet, this requires the use of site-to-site VPNs or SaaS-based LDAPS (such as that provided by Jumpcloud).

The introduction of VPNs for applications to communicate increases the complexity and maintenance requirements of the environment, adding another point of failure, and also potentially requiring a different VPN from what users are already utilizing (e.g. Cisco AnyConnect).

Ensuring that the agents installed on each server are able to communicate with the FreeIPA server(s), including across any required VPNs, is of the utmost importance. If authorization policies change, or if a user is disabled, deleted, or has changed their password or key, and the sssd agent is unaware of the change, all Hell breaks loose (only slightly exaggerating)!

In the event that the agent is unable to communicate with the LDAP server, users that should be able to log into a server may be unable to do so. Worse yet, terminated or disabled users that should NOT be able to utilize a server may still have access due to caching. This is definitely not good for audits! Network outages, partitions, replication failures, and sssd timeouts can all cause such a problem. In smaller environments, the chance of such a failure may be low, but in larger environments, intermittent issues across environments can tax the infrastructure security team responsible, distracting them from improving the environment and forcing them to utilize their time in putting out fires.

In effect, network placement is a form of authentication, as LDAP clients are assumed to be on a trusted network that has access to the LDAP server. There are bridging solutions such as Jumpcloud’s Cloud LDAP, but they should not be seen as the end goal. Rather, they should be seen as a way to support legacy applications while migrating to the company’s cloud directory and stronger authentication methods like OIDC/SAML through Jumpcloud SSO.

Active Directory is definitely not immune! In addition to the problems that arise when an agent is unable to communicate with the server, Active Directory can experience problems where time deltas between servers cause login issues that are hard to diagnose. It also depends upon weak authentication protocols such as NTLM and Kerberos. Neither supports MFA, and both introduce vulnerability classes such as password cracking (in the case of NTLM), replay attacks, forced authentication, and unconstrained delegation that can allow an attacker to move laterally through a network.


2FA/MFA to the rescue? Not quite…


MFA Fatigue


Due to the age of the LDAP protocol, and it being the underpinning of both FreeIPA and Active Directory, a common issue exists. LDAP was designed before the concepts of 2FA/MFA gained traction in the security space, so neither directory has native support. Instead, solutions have to be layered on top, or else the onus of supporting stronger authentication is placed upon the end-user applications.

This means:

1) An application must support and require 2FA/MFA
2) A user must register another form of authentication besides a password with each application that supports it

If one application supports only TOTP and another supports push-based MFA like Duo, the user must have both. If they lose their TOTP key, they must contact support to temporarily remove 2FA from that application and enroll a new seed prior to using the application again. This taxes the team that supports these kinds of requests, as they must be versed in security-related operations for each application deployed in an environment.
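
The reason a lost TOTP enrollment is such a pain is visible in the algorithm itself: the code is just an HMAC over the current time window, keyed by a seed that both the server and the authenticator app hold. A minimal RFC 6238 sketch in Python (the base32 seed below is a made-up example):

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32: str, period: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP: HMAC-SHA1 over the current 30-second counter."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int(time.time()) // period
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = (struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF) % (10 ** digits)
    return str(code).zfill(digits)

# The shared seed is the whole ballgame: the server and the authenticator both
# hold it, so losing (or leaking) it means re-enrolling with a new seed.
print(totp("JBSWY3DPEHPK3PXP"))  # example seed, base32-encoded
```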

If an application is old and only supports SMS-based 2FA, the user must utilize the feature despite known issues with SIM-swapping. Each application’s developers dictate the level of security it supports, as they must develop the feature, and since those security features don’t directly drive monetization as product features do, they may not be prioritized. This leaves security-conscious organizations and users out of luck if they want to utilize a stronger form of authentication.

Two-factor or multi-factor authentication is good, but support must exist at the core of the authentication protocol, not layered on top or delegating the task of supporting it to the end-user applications.

Even if device-based multi-factor (such as Okta or Duo’s push requests) is enabled, if passwords are utilized as one factor, threat actors can utilize MFA-fatigue, endlessly triggering push requests until a user gives in.

In the end, if you’re implementing 2FA/MFA that way, you’re just putting lipstick on a pig.


Living in the Present with OIDC and Passkeys


Yubikey

Passwords and authentication protocols that are dependent upon them must go. Reducing the number of passwords a user must remember doesn’t go far enough, and utilizing weak forms of authentication when better alternatives exist must change.

The solution is a combination of stronger forms of authentication both for users AND applications.

OIDC/SAML allow us to utilize applications that can work with an identity provider, regardless of network placement. An application deployed on-premises with no outbound connectivity to the Internet can cryptographically validate a user who has authenticated with a trusted identity provider.

Passkeys can be utilized as the method of user authentication to an OIDC/SAML identity provider, providing a phishing-resistant form of MFA.

But what is a passkey? A good explanation is provided by Apple, but the gist is that a passkey is an application-scoped asymmetric key-pair. The application-scoping feature is similar to how a password manager knows the correct credentials to use based upon the URL. The asymmetric key-pair, composed of one public key and one private key, is likewise similar to a form of authentication most Linux users are familiar with: SSH keys!

Although the cryptographic algorithm used to generate the key-pair may be different, the functionality is conceptually the same; the server/application need only know the user’s public key. A strong key-pair is infeasible for an attacker to crack, so when an application that enforces passkeys gets breached, even if user data is stolen, attackers would not be able to recover credentials.

Passkeys are device-bound, whether using a phone that supports the technology (such as iPhones) or dedicated hardware security keys. There are pros and cons to each, as the phone-based passkeys provide exportable keypairs to allow for backup, and hardware security keys such as Yubikey are non-exportable by design.

Passkeys are also multi-factor by design. In addition to some form of tap (really just a method of validating the intent to authenticate), a PIN is required. At first glance, this can seem eerily similar to passwords. However, unlike passwords, the PIN is never sent to the server; instead, it is used by the device to unlock the private key as part of a challenge-response process that signs a message to the server, proving ownership of the private key for a given user.
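
A conceptual sketch of that challenge-response, using the third-party cryptography package. This strips out everything WebAuthn layers on top (origin binding, attestation, signature counters, on-device user verification), but it shows the core property: the server stores only a public key and verifies a signature over a fresh challenge.

```python
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# Enrollment: the authenticator creates a per-site key pair; only the
# public key is sent to (and stored by) the relying party.
device_private_key = ec.generate_private_key(ec.SECP256R1())
server_stored_public_key = device_private_key.public_key()

# Login: the server sends a random challenge...
challenge = os.urandom(32)

# ...the device signs it after the user proves presence/PIN locally...
signature = device_private_key.sign(challenge, ec.ECDSA(hashes.SHA256()))

# ...and the server verifies the signature with the stored public key
# (verify() raises InvalidSignature on failure).
server_stored_public_key.verify(signature, challenge, ec.ECDSA(hashes.SHA256()))
print("challenge verified; no shared secret ever crossed the wire")
```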

You can read more about passkeys here.

Goal: Authenticate Everywhere with a Yubikey!


CISA's zero-trust model

In my homelab, I wanted to demonstrate a few of the tenets of CISA’s model for zero-trust, namely:

  • “All communication is secured regardless of network location”
  • “All data sources and computing services are considered resources”
  • “Access to individual enterprise resources is granted on a per-session basis”
  • “All resource authentication and authorization are dynamic and strictly enforced before access is allowed”

Further, in the Identity pillar, I wanted to demonstrate the usage of phishing-resistant MFA that would be required for each application used.

Ultimately, what I sought to achieve was the ability to log into an internal SSH server from the Internet while using only a Yubikey. No passwords allowed. Along the way, I would break down the barrier of the traditional network perimeter and utilize short-lived credentials. I would question traditional IT axioms, such as the prohibition of shared accounts, through the usage of auditable SSH certificates.

I accomplished these goals (with one caveat), using four technologies:

Federated Identity Management with OpenID Connect (OIDC)


OpenID Connect (OIDC) is an authentication protocol built upon OAuth2. If you’ve ever seen those buttons like:

OAuth2 buttons

Then you’ve seen OAuth2 in action. OAuth2 is a protocol intended for resource sharing. OIDC builds upon OAuth2 to provide authentication (providing the identity of the person). At the core of both, there is the concept of an identity provider (IdP) and a relying party (RP). The relying party is any application that needs to be able to verify a user’s identity and provide services based upon it. In the case of my lab, all of the services described below (Tailscale, Boundary, Smallstep) are OIDC clients. By splitting the identity provider from the application, we can federate identity, keeping a single source of truth, all without the client application ever seeing the user’s credentials.

When a user wants to access an OIDC client in the lab, they will be redirected to Google Workspace (my IdP), authenticate, and then get redirected back to the application, passing along the token from Google. The client application is able to cryptographically verify that the token received from the user was created by Google and then provide services to the user based upon their identity (username, groups, etc.). Because the client application never has to talk directly to Google, it can utilize this authentication method without having Internet access (and mine do)!
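
As a sketch of what that cryptographic verification looks like, here is how a relying party might validate an ID token with the PyJWT library. The token value and client ID below are placeholders; Google’s signing keys come from its published JWKS endpoint and can be cached, so routine validation doesn’t require a round-trip to the IdP (the snippet fetches them live only for simplicity).

```python
import jwt
from jwt import PyJWKClient

raw_id_token = "eyJhbGciOi..."  # placeholder: the ID token handed back after login

# Google's signing keys are published at a well-known JWKS URL and can be cached.
jwks_client = PyJWKClient("https://www.googleapis.com/oauth2/v3/certs")
signing_key = jwks_client.get_signing_key_from_jwt(raw_id_token)

claims = jwt.decode(
    raw_id_token,
    signing_key.key,
    algorithms=["RS256"],
    audience="my-client-id.apps.googleusercontent.com",  # this app's OIDC client ID
    issuer="https://accounts.google.com",
)
# Identity attributes the relying party can act on.
print(claims["sub"], claims.get("email"))
```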

Moreover, this centralized user identity management provides additional benefits in the user lifecycle. Because every application has its identity store tied back to Google, disabling or deleting a user in Google has the effect of disabling them everywhere. You don’t have to worry about a stray SSH key or local user allowing a terminated user to have continued access to the environment.

Limitations with Google Workspace


Unfortunately, as I worked through this experiment, I found some serious limitations with using Google Workspace (really GCP with an Internal OIDC application requiring an organization Workspace account). Namely, Google Workspace has no ability to add custom claims!

OIDC claims are essentially user attributes that can give more descriptive information such as group membership.

GCP OIDC Config


The image above is from Google’s OIDC configuration endpoint, providing a JSON-based configuration at a standardized endpoint to inform client applications of its support for various aspects of OIDC. Nowhere in those standard claims is group membership shown. When looking through Google documentation, I found that no support for custom claims was available. Other OIDC identity providers such as Okta and Auth0 provide that support.
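
You can pull that same configuration yourself: the discovery document lives at a standardized well-known path, and its claims_supported field is where group or custom claims would be advertised if they were available.

```python
import json
import urllib.request

# Google's OIDC discovery document, published at the standard well-known path.
url = "https://accounts.google.com/.well-known/openid-configuration"
with urllib.request.urlopen(url) as resp:
    config = json.load(resp)

print(config["issuer"])
print(config["claims_supported"])  # note: no group or custom claims advertised
```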

The result of the limitations in GCP custom claims support is that we can’t create granular rules in client applications based upon a user’s groups in Google Workspace. This might be solvable by integrating Google Workspace with Jumpcloud but was out of scope for my lab at this time.

Replacing the Traditional VPN


Tailscale


Tailscale is a permissioned overlay network based upon Wireguard, a modern VPN protocol. It’s different from traditional VPNs in that authenticated users do not all receive the same network capabilities (i.e. the ability to talk to the same hosts on the same ports). Instead, the ability for a user to talk to specific hosts and ports is based upon their permissions, which can be sourced from their configured groups. This could allow, for example, developers to communicate only with development infrastructure, and HR users likewise only with sensitive HR applications, despite users of both being connected to the same Tailscale subnet router (analogous to a VPN concentrator). Essentially, Tailscale provides programmable permissions (Identity-as-Code) capabilities over layers 3 and 4 of the TCP/IP model.
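
For a flavor of what Identity-as-Code looks like at layers 3 and 4, the sketch below builds a Tailscale-style ACL policy from group membership and emits it as JSON. The field names follow Tailscale’s ACL schema, but the groups, addresses, and ports are invented for illustration.

```python
import json

# Permissions as a reviewable artifact derived from identity, rather than
# something implied by network placement.
policy = {
    "groups": {
        "group:dev": ["alice@example.com"],
        "group:hr": ["bob@example.com"],
    },
    "acls": [
        # Developers may reach dev infrastructure on SSH/HTTPS only.
        {"action": "accept", "src": ["group:dev"], "dst": ["10.10.20.0/24:22,443"]},
        # HR may reach only the HR application, and nothing else.
        {"action": "accept", "src": ["group:hr"], "dst": ["10.10.30.5:443"]},
    ],
}

print(json.dumps(policy, indent=2))
```

A policy like this can live in version control and be applied by automation, which is exactly the Terraform/Identity-as-Code direction described later.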

In my lab, I utilized the bare minimum of features for Tailscale, opting for it as it would allow me to forgo having to expose ports on my home network to the public Internet. Boundary is also less widely adopted as of the time of this writing, so I didn’t want to expose it when I don’t have 24/7 SOC capabilities. Finally, Boundary provides the session-based access to resources, but when a user wants to access only web-applications running on HTTPS, it becomes difficult to work with due to the way it uses local port forwards.

Using a configured OIDC-based user pool with Google Workspace, users’ traffic gets piped through an OPNSense-based router in a heavily locked-down DMZ in the lab network that grants them access to Boundary and specific web-applications, and nothing else. The OPNSense instance acts as a Tailscale node and connects out to the control plane node (which is running Headscale, an OSS version of the Tailscale control plane).

Hashicorp Boundary


Boundary is similar to Tailscale in that it provides programmable, permissioned access to resources at layers 3 and 4. As discussed earlier, I found it unsuitable for users who need to access only web-applications that are protected by HTTPS. Boundary works using a model of Controllers and Workers, with users communicating with the former’s API, authenticating before they receive a local port forward through a Worker node to their target. The Worker nodes act essentially as identity-aware proxies, and allow users to communicate with resources in networks to which they do not have direct access.

As I worked through the experiment, I found a lot of overlap between Tailscale and Boundary in the supported use-cases. Both are being actively developed with features to meet parity with Teleport, arguably the forerunner in modern remote infrastructure access.

In the end, due to the desire to avoid listening/forwarded ports on my home network’s edge router, as well as the relative immaturity (<1.0 version) of Boundary, I opted to run Tailscale on a cheap Digital Ocean node as the first hop for users talking to applications in the network. Once authenticated via Tailscale, they must authenticate to Boundary in order to access services, except those based upon HTTPS, which are explicitly port-forwarded to the DMZ.

Authenticating to the Server


Smallstep CA


Once authenticated to Tailscale and Boundary, a user intending to SSH to a server must provide credentials. When dealing with authentication to SSH, many people and organizations opt to use SSH keys. But there are several problems associated with key lifecycle management that make for painful maintenance processes. Trying to create time-bounded access to a service (e.g. letting a security engineer perform tasks required for IR, but limiting access to 1 hour) requires custom tooling that must be built. Still, verifying that access doesn’t fail open due to issues such as caching becomes another concern. Issuing short-lived certificates as well as utilizing the Boundary and Indent integration would allow for ensuring that access is immediately and verifiably terminated after an amount of time specified during the approval process.

Smallstep CA OSS is deployed in my lab, port-forwarded through the subnet router and accessible to the DMZ for Tailscale or on-prem users. It is configured as an OIDC client, utilizing Google Workspace as the identity provider. Once authenticated, users will receive a certificate in their ssh-agent that can be utilized to transparently authenticate to lab hosts running SSH.

When users authenticate to a host’s SSH service, they do so with a shared user.

Screaming mom

Shared Users Aren’t Bad, M’kay?


WAIT! It’s systems administration 101 that shared accounts are bad practice. So why am I intentionally using it here? Am I just lazy? Well, no (kinda yes, though).

After managing FreeIPA in the past, I came to realize that as we collectively adopt the DevOps mindset of treating infrastructure as cattle, not pets, a lot of infrastructure has no use for individual users. Would a production application server need users of different privilege levels? Anybody who needs access to that machine is likely an administrator. Do users plan on storing personal files on that server? I would argue that in most cases that wouldn’t be likely. In fact, having personal users creates more overhead through the need for heavy agents like sssd (along with any infrastructure needed for identity management).

However, administrators may need occasional or emergency access to that machine. Some organizations may solve this by using a shared password (very bad) or having individual keys for a shared user stored in ~/.ssh/authorized_keys. Again, though, lifecycle management of those keys can be a pain.

The avoidance of shared users on infrastructure really stems from the concern for auditability. When a user logs onto the server and makes some change, we need to know who did it. We want to ensure only authorized personnel are able to log in, and if some breach were to occur, understanding which account was compromised could be key in resolving the issue.

Shared users with shared credentials create issues of auditability. If you are able to utilize a shared user but different credentials, then you can reap the benefits of auditability without incurring the overhead of additional identity infrastructure.

A better solution, and the one I chose, was to use Smallstep CA to generate short-lived certificates that include an additional Principal in the certificate body. Authorized principals for a given certificate would include the personal user, and then a shared user (in my case, sysadmin). When a user logs into a server, we can see the authenticated user as well as the main principal on the certificate utilized.
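
The moving parts are easier to see in a toy model. The sketch below is not the OpenSSH certificate format, just a conceptual stand-in: the certificate’s key ID carries the individual identity for auditing, its principals list names the accounts (personal and shared) it may log in as, and its expiry bounds access in time without any key cleanup.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class SshUserCert:
    key_id: str            # individual identity (e.g. the OIDC email), used for audit
    principals: list[str]  # accounts this certificate may log in as
    valid_before: datetime # hard expiry enforced by the server

def authorize(cert: SshUserCert, login_as: str) -> bool:
    # The server checks the requested account against the principals and
    # rejects anything presented after the expiry; no revocation list needed.
    now = datetime.now(timezone.utc)
    return login_as in cert.principals and now < cert.valid_before

cert = SshUserCert(
    key_id="alice@example.com",
    principals=["alice", "sysadmin"],
    valid_before=datetime.now(timezone.utc) + timedelta(hours=16),
)

print(authorize(cert, "sysadmin"))  # True: shared account, but key_id records who
print(authorize(cert, "root"))      # False: not an authorized principal
```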

SSH certificates are auditable

Multiple Personality Disorder a.k.a Multiple Sources of Identity


There is one more benefit in the solution described above. By avoiding the implementation and operation (costly in maintenance) of identity infrastructure such as Active Directory or FreeIPA, and reusing your corporate IT identity infrastructure (Google Workspace, Okta, etc.), you can avoid having multiple personalities.

A common IT philosophy is having a Single Source of Truth. This means there aren’t multiple data providers that must be interrogated in order to gather factual information about an environment. An example of having multiple sources of truth would be a local file or hosted wiki describing cloud-based infrastructure; that information can rapidly become out of date. When an organization relies upon such a method for accurate reporting, they are setting themselves up for failure. Instead, why not keep the cloud service provider as the single source of truth, and dynamically query it for the most accurate, real-time information about the live environment?
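
To make that concrete with one (assumed) example: if the infrastructure lives in AWS, the live inventory is a paginated API call away with the boto3 library, which beats any hand-maintained document. The region and tag names below are illustrative.

```python
import boto3

# Ask the provider, not the wiki: enumerate the live EC2 inventory.
ec2 = boto3.client("ec2", region_name="us-east-1")
paginator = ec2.get_paginator("describe_instances")

for page in paginator.paginate():
    for reservation in page["Reservations"]:
        for instance in reservation["Instances"]:
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            print(instance["InstanceId"], instance["State"]["Name"], tags.get("Name", "-"))
```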

In the context of identity, such a problem can arise from having a corporate IT identity (e.g. the identity you use to log into your email account) and a secondary identity for application infrastructure (e.g. LDAP). This difference might stem from the makeup of the organization, wherein the enterprise IT team/infrastructure does not have great communication with application developers. It could also be due to the reliance upon older authentication protocols such as LDAP that present network-connectivity challenges between an on-premises corporate identity infrastructure deployment and a cloud-based application infrastructure. Ultimately, corporate identity might allow an employee to access shared drives and their email, but won’t allow them to SSH (or utilize some other authenticated service) to a server deployed in the cloud.

Multiple sources of identity present issues with management, both for those responsible for infrastructure operations and for those who utilize it. When an employee is onboarded, it’s unlikely that there will be a streamlined process for provisioning access, as their access requests must traverse multiple teams that might utilize different systems for work intake (different ticketing systems with different information requirements). They must remember at least two different sets of credentials. When they are terminated, it will most likely be done at the corporate IT level. In the best case, the team responsible for application infrastructure must develop some hook or receive some notification of employee terminations, and must develop the automation to action those items. Attributes of identity must be mapped; most likely, the application team must be able to tie the email address of the employee to their username in the application identity infrastructure in order to avoid naming conflicts. But edge-cases abound when you must deal with employees changing their names (e.g. after marriage).

It becomes a maintenance nightmare when dealing with multiple sources of identity.

KISS (Keep It Simple, Stupid!) with the Login Flow


Ultimately, we are able to keep server authentication simple by designing the environment with the following:

  • Reuse of existing Identity infrastructure
    • By utilizing a single source of identity, we avoid maintenance issues and make the UX better for users, who will not need to remember multiple credentials
  • Breaking down the concept of trusted networks by implementing more granular, user-based controls at TCP/IP layers 3 and 4
    • Boundary
  • Using short-lived credentials (i.e. SSH certificates) to avoid key lifecycle management issues and password management

For a user to log into the network from the Internet, they first authenticate to Tailscale (although I’m running the more limited, but OSS, Headscale control plane). Once on the DMZ, they are only allowed to communicate with specific web-applications and Boundary. After authenticating to Boundary, they can talk to the SSH service on a server. Logging into Smallstep CA and requesting a daily certificate then allows them to authenticate to the SSH service.

When an engineer connects from the Internet to a server for the first time of the day, they must perform the following steps:


  • Authenticate to Tailscale
  • Authenticate to Boundary
  • Authenticate to Smallstep
  • Authenticate to SSH w/ Smallstep-provided certificate

Limitations


Each application is an OIDC client application, and so the engineer will be redirected to the identity provider (in my case, Google Workspace) for each. Using passkeys, this should be a fast experience. Unfortunately, in my case, and for reasons that are not readily apparent, Google Workspace (i.e. paying) users are not able to utilize passkeys as of 05/03/2023, but free Google accounts are allowed to do so (thanks, Google!). As a result, in my experiment, the login flow looks clunky since, to avoid passwords and utilize my security key, I am actually recovering my account every time I log in. Horrible UX, I know. Once Google supports Workspace account authentication with passkeys, this issue should be resolved.

Although it may seem like a lot of authentications, in practice, engineers should not need to SSH to infrastructure if modern deployment practices are enforced. CI/CD, immutable infrastructure, and containerization preclude the need for access to the underlying OS, except to troubleshoot some complicated issue that can’t be resolved by replacing the problematic infrastructure.

Demo



Plans for the Future


In the end, I was able to implement the zero-trust controls I had planned, and even achieved passwordless authentication. However, the inability to utilize passkeys with Google Workspace accounts was annoying and something I intend to fix as soon as support is rolled out. The fact that GCP didn’t allow custom OIDC attribute mapping was another surprise, and it limits the types of group-based access controls I can implement in the future.

Due to the lack of support for custom OIDC attribute mapping in GCP, I plan to try to utilize Jumpcloud’s Google Workspace integration. Jumpcloud will act, in essence, as an identity broker, authenticating users to Google but being able to provide the attributes needed for more granular controls in Tailscale and Boundary. With those abilities, I will be able to utilize Terraform and the principles of Identity-as-Code as well as Privileged Access Management through Indent. This will allow me to then simulate an enterprise application environment, keeping user permissions to the bare minimum and allowing for automated, approval-gated, time-bounded access to the production environment.