Identification, Authentication, and Access Control, Web Security Course, Cybersecurity Courses and Resources

Identification and Authentication

Web 1.0 was mainly static, with public information served directly from the server's file system. Information was mostly public and serious security breaches were not a major concern. As we moved to Web 2.0, with its focus on dynamic, data-driven content, the confidentiality and integrity of information emerged as security concerns. Data had to be classified (e.g., public, private, confidential) and protected with measures such as access control.

NIST Special Publication 800-63B [1] provides comprehensive guidelines for Digital Identity and Authentication. The publication defines Digital Identity as the "online persona of a subject" engaged in online transactions. This identity can be represented in several ways such as someone's email, user ID, or someone's laptop. Proving someone is who they say they are, digitally, is hard and opens opportunities for attackers to steal someone's identity.

Authentication

Digital authentication is the "process of determining the validity of one or more authenticators used to claim a digital identity" [1]. In other words, it is the process of verifying that you are who you claim to be. In real life, in an airport, for example, your identity is verified by examining your passport information, including your photo (compared to your face). On a website, a login mechanism is used, the most common of which is the use of a username and a password.

In digital systems, such as websites, we employ credentials to perform authentication. A credential binds an authenticator (e.g., a password) to the subscriber (i.e., the user) via an identifier (e.g., a username or email address). Here is a question for you: do you think a password alone is enough to protect your online identity?

Figure 3.1: Authentication (Login)

Digital authentication supports privacy protection by mitigating risks of unauthorized access to individuals’ information. The classic paradigm for authentication systems identifies three factors as the cornerstones of authentication [1]:

Something you know (e.g., a password).
Something you have (e.g., an ID badge or a cryptographic key).
Something you are (e.g., a fingerprint or other biometric data).

Multifactor Authentication (MFA) refers to the use of more than one of the above factors. In general, each additional independent factor makes the authentication system harder to defeat. The NIST guidelines require at least two factors at the higher assurance levels. An example of MFA is the use of a one-time passcode (OTP). Using the first factor, a user enters their password (something they know) to verify their identity. This, however, is not enough, because someone else may have obtained the correct password one way or another (e.g., guessed it or stole it). This is when the second factor comes in handy: an OTP is sent to a registered phone number or email address. The user then enters the OTP, which shows they hold the registered phone or mailbox (something they have), to complete the process.
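The OTP step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the in-memory store, the function names `issue_otp` and `verify_otp`, the 6-digit length, and the 5-minute lifetime are all hypothetical choices made for the example.

```python
import hmac
import secrets
import time
from typing import Optional

OTP_TTL_SECONDS = 300  # assumed lifetime of a code (5 minutes)

# Hypothetical in-memory store: username -> (code, expiry timestamp)
_pending_otps = {}


def issue_otp(username: str, now: Optional[float] = None) -> str:
    """Create a 6-digit one-time passcode and remember it until it expires."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"  # cryptographically strong random digits
    _pending_otps[username] = (code, now + OTP_TTL_SECONDS)
    # A real system would send the code by SMS or email instead of returning it.
    return code


def verify_otp(username: str, submitted: str, now: Optional[float] = None) -> bool:
    """Accept a code at most once, and only before it expires."""
    now = time.time() if now is None else now
    entry = _pending_otps.pop(username, None)  # pop: every code is single-use
    if entry is None:
        return False
    code, expires_at = entry
    if now > expires_at:
        return False
    # Constant-time comparison avoids leaking how many digits matched.
    return hmac.compare_digest(code.encode(), submitted.encode())
```

In this sketch a wrong guess also consumes the pending code, so the user has to request a new one. That is a deliberately strict choice; a real site might allow a small number of retries instead.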

Figure 3.2: Securing Payments With OTP

Authorization (verifying access permissions) is not the same as Authentication (verifying one's identity).

Identification and Authentication Failures (A07:2021)

In terms of Web Security, we need to confirm the identity of a user through authentication in order to protect against authentication-related attacks. The OWASP Top Ten 2021 list includes a category covering exactly that: A07:2021 – Identification and Authentication Failures [2]. According to OWASP, there may be authentication weaknesses if the application:

Permits automated attacks such as credential stuffing* where the attacker has a list of valid usernames and passwords.
Permits brute force** or other automated attacks. When a website has no policy to limit repeated login attempts and to reject common passwords, an attacker can use lists of common usernames and passwords to brute-force the login form until authentication succeeds.
Permits default, weak, or well-known passwords, such as "Password1" or "admin/admin".
Uses weak or ineffective credential recovery and forgot-password processes, such as "knowledge-based answers," which cannot be made safe. Security questions should not be used as they may be easily guessable or obtainable by attackers.
Uses plain-text, encrypted, or weakly hashed password data stores. When passwords are stored by a website, they must be adequately protected through proper measures such as hashing and salting***. As a first step, a website must enforce a strong password policy. The next step is to ensure that each password is salted and hashed before it is stored.
Has missing or ineffective multi-factor authentication. As mentioned in the previous section, a minimum of two-factor authentication should be employed.
Exposes the session identifier in the URL. See session section below.
Reuses the session identifier after successful login. See session section below.
Does not correctly invalidate session IDs. User sessions or authentication tokens (mainly single sign-on (SSO) tokens) aren't properly invalidated during logout or after a period of inactivity. See session section below.

*Credential stuffing is the automated injection of stolen username and password pairs (“credentials”) into website login forms, in order to fraudulently gain access to user accounts. [3]

**Brute-force attacks: in essence, brute-forcing refers to "trying until it works." In a password-related brute-force attack, an attacker may use a list of username/password combinations to gain access.

***Salting is the process of adding a randomly generated string to a password as part of the hashing process.

Hashing vs. Encryption: In the context of password storage, passwords should be hashed, NOT encrypted. Encryption is a two-way function: anyone who holds the key can decrypt the ciphertext and retrieve the original plaintext password. Hashing is a one-way function: a hash can NEVER be decrypted, so the website can verify a password without being able to recover it.
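To make salting and hashing concrete, here is a minimal sketch using only Python's standard library (`hashlib.pbkdf2_hmac`). The function names, the storage format, and the iteration count are assumptions made for this example; a real application would normally rely on a maintained password-hashing library and tune the work factor to its own hardware.

```python
import hashlib
import hmac
import os

PBKDF2_ITERATIONS = 600_000  # assumed work factor; tune for your hardware


def hash_password(password: str, iterations: int = PBKDF2_ITERATIONS) -> str:
    """Return a storable record of the form 'iterations$salt_hex$hash_hex'."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Re-hash the candidate with the stored salt and compare the results."""
    iterations_text, salt_hex, hash_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        bytes.fromhex(salt_hex),
        int(iterations_text),
    )
    # Constant-time comparison of the two digests.
    return hmac.compare_digest(digest, bytes.fromhex(hash_hex))
```

Because the salt is random, hashing the same password twice gives two different records, which is what stops an attacker from spotting reused passwords or using precomputed tables. Nothing in the record can be decrypted back into the password; verification only repeats the one-way computation.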

Attack Scenario

Let's say there was a data breach on Website A and the attacker managed to get hundreds of username/password credentials. Many users reuse the same username/password combination across multiple sites:

One of the users of Website A is Victim X, and he has an account on our target website (Insecure Bank)
Victim X uses the same username/password combination on Website A and on Insecure Bank's website
Insecure Bank does not check for the number of login attempts
The attacker automates an attack that sends large-scale login requests directed against multiple web applications
One of those websites is our Insecure Bank

In this case, although the attacker does not know that our victim has an Insecure Bank account where he uses the same username/password combination, the attacker manages to gain access to the victim's account.
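The missing control in this scenario, counting login attempts, can be sketched as a small sliding-window counter. The class name `LoginThrottle`, the threshold, and the window length are hypothetical values chosen for illustration. The key can be a username, a source IP address, or both.

```python
import time
from collections import defaultdict, deque
from typing import Optional

MAX_FAILURES = 5       # assumed threshold
WINDOW_SECONDS = 900   # assumed window (15 minutes)


class LoginThrottle:
    """Track recent failed logins per key and lock the key when there are too many."""

    def __init__(self, max_failures: int = MAX_FAILURES, window: float = WINDOW_SECONDS):
        self.max_failures = max_failures
        self.window = window
        self._failures = defaultdict(deque)  # key -> timestamps of recent failures

    def _recent(self, key: str, now: float) -> deque:
        failures = self._failures[key]
        # Drop failures that have fallen out of the window.
        while failures and now - failures[0] > self.window:
            failures.popleft()
        return failures

    def is_locked(self, key: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        return len(self._recent(key, now)) >= self.max_failures

    def record_failure(self, key: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._recent(key, now).append(now)

    def record_success(self, key: str) -> None:
        self._failures.pop(key, None)  # a successful login clears the history
```

A per-username counter mainly slows down brute-force guessing against one account. Credential stuffing usually tries each stolen pair only once, so counting failures per source address and adding MFA matter at least as much in the scenario above.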

Web Sessions

HTTP is a stateless protocol in the sense that each request-response pair is independent, and unaware, of any other pair. But what does this really mean? Imagine you visit a website like Google and you change the default language from Arabic to English. You save your language preference and, by doing so, you send an HTTP request and receive an English homepage as a response. You then search for a term and send a new request to Google. With nothing but stateless HTTP, the results would come back in Arabic again (based on your location). That would be really annoying. This would happen because the Google server doesn't know who you are or what your language preference is; it simply "forgot" you.

The fact that HTTP is stateless poses a real problem. On its own, it would make the Web as we know it impossible: no client would be able to maintain a continuous "conversation" with the server. Luckily, a solution is available in the form of the SESSION object. The session is the mechanism by which the client and the server can establish a conversation.

So how does the session work? Let's use the same Google example. You visit Google for the first time from Dubai and you get the homepage in Arabic. You change the preference and send it via an HTTP request. The server responds, but this time it sends you a unique identifier called the SESSION TOKEN along with the response. Next, you search for something, and this time you include the session token with the new request. The server recognizes this token and associates it with you and your preferences.
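The exchange described above can be sketched as follows. This is a toy model under stated assumptions: the in-memory dictionary and the function names `start_session`, `set_language`, and `handle_search` are hypothetical stand-ins for a real web server and its session store.

```python
import secrets

# Hypothetical server-side store: session token -> data remembered for that visitor
_sessions = {}

DEFAULT_LANGUAGE = "ar"  # assumed default for a visitor from Dubai


def start_session() -> str:
    """First visit: issue a random, unguessable token and remember the defaults."""
    token = secrets.token_urlsafe(32)
    _sessions[token] = {"language": DEFAULT_LANGUAGE}
    return token


def set_language(token: str, language: str) -> None:
    """The visitor saves a preference; the server stores it under the token."""
    _sessions[token]["language"] = language


def handle_search(token: str, query: str) -> str:
    """Later request: the token lets the server recall who is asking."""
    session = _sessions.get(token)
    language = session["language"] if session else DEFAULT_LANGUAGE
    return f"[{language}] results for {query!r}"
```

The token itself carries no information about the visitor. It is only a key into the state the server keeps, which is why it has to be long and random enough that nobody can guess another visitor's token.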

Session Management

Because of its importance, the Session must be handled and managed very carefully. The session is not only used to save user preferences; it is also often used in tandem with authentication and access control. Session Management involves the following steps:

When the user first accesses the server, they are issued a pre-authentication token (not necessarily a very secure token, and it may be transmitted without encryption)
The user then logs in and gets authenticated. A new, and more secure, session token is issued and associated with the user
Subsequent handling of the session is now more secure (e.g., the token is transmitted only over TLS/HTTPS)
The server also determines what the user can and cannot see and do
When the user logs off, closes their browser, or the session times out, the session is disposed of

On the client side, the session token typically lives only as long as the browser session. This means that when the browser is closed, the token is discarded and the server issues a new one on the next visit. However, when a website does not involve highly sensitive or confidential data, it may opt to keep the session token in a persistent cookie, a small piece of data that the browser stores on your computer, so that the session survives a browser restart. A website that handles sensitive data, on the other hand, also terminates the session on the server after a period of inactivity (timeout).
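The lifecycle described in this section (a pre-authentication token, a fresh token at login, an idle timeout, and disposal at logout) might look like the sketch below. The `SessionStore` class, its method names, and the 15-minute timeout are assumptions made for this example rather than any framework's actual API.

```python
import secrets
import time
from typing import Optional

IDLE_TIMEOUT_SECONDS = 15 * 60  # assumed idle timeout


class SessionStore:
    def __init__(self, idle_timeout: float = IDLE_TIMEOUT_SECONDS):
        self.idle_timeout = idle_timeout
        self._sessions = {}  # token -> {"user": ..., "last_seen": ...}

    def create(self, user: Optional[str] = None, now: Optional[float] = None) -> str:
        """Issue a new random token (user=None means pre-authentication)."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._sessions[token] = {"user": user, "last_seen": now}
        return token

    def login(self, old_token: str, user: str, now: Optional[float] = None) -> str:
        """After authentication, discard the old token and issue a fresh one."""
        self._sessions.pop(old_token, None)
        return self.create(user=user, now=now)

    def get_user(self, token: str, now: Optional[float] = None) -> Optional[str]:
        """Return the logged-in user, or None if the session is unknown or idle too long."""
        now = time.time() if now is None else now
        session = self._sessions.get(token)
        if session is None:
            return None
        if now - session["last_seen"] > self.idle_timeout:
            del self._sessions[token]  # expire the session on the server side
            return None
        session["last_seen"] = now
        return session["user"]

    def logout(self, token: str) -> None:
        self._sessions.pop(token, None)
```

Issuing a new token at login means a token that was exposed before authentication is worthless afterwards. The timeout is enforced on the server, so it does not depend on the browser having discarded anything.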

Attack Scenario

Let's say someone accessed their online banking website from a public computer (not a good idea to start with):

The user opens a web browser and logs in (e.g., https://insecurebank.ae)
When done, the user closes the browser window without logging out (another bad idea)
The bank's website does not properly handle sessions, in the sense that the session never times out
An hour later, a hacker walks in, uses the same computer, opens the web browser, and opens the history
Immediately, they notice the bank's URL and, of course, give it a try (click the URL)
Because the session is still "alive", the hacker is logged in and given access to the user's account

In this case, we didn't even need a hacker. Anyone who tried to access their own insecurebank.ae account from that computer would immediately gain access to the previous user's account, without logging in.

Prevention Measures

In order to prevent Identification and Authentication failures, measures can be taken in two main areas:

Credentials Management
Session Management

Here are a few specific prevention measures:

Multi-Factor Authentication (MFA): Use at least two-factor authentication.
No Default Credentials: Default usernames and passwords should not be used.
Strong Passwords: Enforce a strong password policy. The website should reject weak passwords.
Standard Error Message: Ensure that, regardless of the login outcome, the error message is worded so that it does not leak information (e.g., "Invalid Username/Password Combination").
Login Attempts: Limit repeated login attempts to protect against brute-force attacks.
Session Management: Use secure session management techniques, including generating a new random session ID after login, keeping the session identifier out of the URL, invalidating sessions after logout and idle time, and storing sessions securely.
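One common way to keep the session identifier out of the URL is to send it in a cookie with restrictive attributes. The sketch below uses Python's standard `http.cookies` module. The cookie name `session_id`, the helper name `build_session_cookie`, and the 15-minute lifetime are assumptions made for the example.

```python
import secrets
from http.cookies import SimpleCookie


def build_session_cookie(token: str, max_age: int = 900) -> str:
    """Return the value of a Set-Cookie header carrying the session token."""
    cookie = SimpleCookie()
    cookie["session_id"] = token
    morsel = cookie["session_id"]
    morsel["secure"] = True        # only sent over HTTPS
    morsel["httponly"] = True      # not readable from page scripts
    morsel["samesite"] = "Strict"  # not sent on cross-site requests
    morsel["path"] = "/"
    morsel["max-age"] = max_age    # the browser forgets it after this many seconds
    return morsel.OutputString()


# Example: a fresh random token issued after a successful login.
header_value = build_session_cookie(secrets.token_urlsafe(32))
```

These attributes only restrict how the browser stores and sends the token. The server still has to invalidate the session at logout and after idle time, because the client cannot be trusted to do so.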

References

[1] NIST Special Publication 800-63B
[2] OWASP A07:2021
[3] Credential stuffing

Access Control

Access control (or Authorization) is a fundamental information security component that dictates who is authorized to access information resources. Information Security involves the protection of information and its critical elements, including the systems and hardware that use, store, and transmit that information. The gold standard for Information Security is based on protecting the C.I.A triad of information.

Confidentiality is the ability to hide information from those people unauthorized to view it.
Integrity is the ability to ensure that data is an accurate and unchanged representation of the original secure information.
Availability refers to the ability to ensure that the information concerned is readily accessible to the authorized viewer at all times.

Figure 3.3: The C.I.A Triad

A breach in Confidentiality means unauthorized people can view information they should not be able to view. Imagine a patient is able to see not only their medical records but also the records of other patients. A breach in Integrity, on the other hand, means unauthorized people can change information they should not be able to modify. Should an online shopper be able to change the price of an item before they buy it? Of course, NOT!

Access control should regulate:

Who can use the system
What authorized users can access
When authorized users can access the system
Where authorized users can access the system from

Access control dictates that users cannot act outside of their intended permissions. In other words, they cannot view, modify, or delete information without permission. Broken Access Control is a failure to implement or enforce access control measures.

Broken Access Control (A01:2021)

When Access Control fails, hackers may be able to gain access to confidential resources and make unauthorized modifications to them. In Web applications, this may have a devastating impact; a hacker may bypass the authentication mechanism (i.e., login) and gain access to restricted areas of the website. Let's consider the following scenario:

Attack Scenario

Once logged in, the URL of the customer's dashboard webpage for Insecure Bank looks like this:

https://insecurebank.ae/customer?id=1234

Notice the URL parameter named id with a value equal to 1234. If no additional access control measures are in place, any user can modify the value, say from id=1234 to id=5678, and potentially access another customer's dashboard. In this scenario, the only access control measure would have been the login authentication using the customer's credentials (username and password).
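The missing check is a comparison between the id requested in the URL and the customer the session belongs to. A minimal sketch follows; the `ACCOUNT_OWNERS` table, the `Forbidden` exception, and the `get_dashboard` function are hypothetical names introduced for illustration.

```python
# Hypothetical data: customer id -> username of the customer who owns it
ACCOUNT_OWNERS = {1234: "victim_x", 5678: "customer_y"}


class Forbidden(Exception):
    """Raised when the logged-in user may not see the requested resource."""


def get_dashboard(session_user: str, requested_id: int) -> str:
    """Serve a dashboard only if the id in the URL belongs to the logged-in user."""
    owner = ACCOUNT_OWNERS.get(requested_id)
    # Unknown ids and other customers' ids get the same refusal,
    # so the response does not reveal which ids exist.
    if owner is None or owner != session_user:
        raise Forbidden("You are not allowed to view this page")
    return f"dashboard for customer {requested_id}"
```

The important detail is that the identity comes from the server-side session, not from the request. The id parameter only says what the user is asking for, never who they are.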

The main security flaw in this website is Broken Access Control. The attacker in this case, however, exploited a specific vulnerability: bypassing access control by modifying the URL (parameter tampering or force browsing). This, in turn, implies that a number of other vulnerabilities may also lead to a breach (e.g., missing access controls for unsafe HTTP methods such as POST, PUT, and DELETE).

Authorization (verifying access permissions) is not the same as Authentication (verifying one's identity).

Prevention Measures

One way to prevent breaches due to Broken Access Control is to fix all possible vulnerabilities (see the OWASP list [4]). But what if there are zero-day vulnerabilities? A much better approach is to design access control from the start. Security should be part of the Software Development Lifecycle (SDLC), rather than an afterthought. The following are Access Control design principles that must be considered at early stages of web application development:

Enforce Least Privileges: The Least Privileges principle refers to granting users only the minimum privileges necessary to complete their job. When given access to a college portal, a student should only be granted permission to view their own records, including schedule, grades, etc., but NOTHING more.
Deny by Default: Web applications must make a decision, whether implicitly or explicitly, to either deny or permit a certain requested access. In some instances, a rule may exist that explicitly tells the application what to do (e.g., a student may NOT modify their grades). What if a rule doesn't explicitly exist? A Deny by Default approach implicitly denies access in ALL unanticipated scenarios.
Don't Hardcode Roles: Role-Based Access Control is a model for controlling access to resources based on user roles rather than individual identities. As an example, access control rules can be created for a Student role rather than an individual student; when a new student, Maryam, is assigned the role of a Student, ALL relevant rules automatically apply to her. The design principle in this case would be NOT to hard-code these roles into the application itself. Consider this example using pseudo code:

if user.role == "teacher": Grade.Change()

Code 3.1: Hard-Coded Role Rule

In the above scenario, if the user managed to escalate their privilege, say from Student to Teacher, then they will be able to change their grade. Now let us consider a better approach:

if user.hasaccess == "Grade_Change": Grade.Change()

Code 3.2: Attribute-Based Rule

In the second scenario, using attribute-based rules allows us to also check the origin of the request. For instance, a grade change may only happen on the teachers' portal, and access to that portal requires Multifactor Authentication (e.g., an OTP sent to the teacher's mobile phone).
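The Deny by Default and Don't Hardcode Roles principles can be combined in a short Python sketch in the spirit of Code 3.2. The permission names, the `ROLE_PERMISSIONS` table, and the functions below are hypothetical; in a real application the table would live in configuration or a database rather than in the code.

```python
# Hypothetical policy table: role -> set of granted permissions
ROLE_PERMISSIONS = {
    "student": {"grade:view_own"},
    "teacher": {"grade:view_any", "grade:change"},
}


def is_allowed(role: str, permission: str) -> bool:
    """Deny by default: unknown roles and unlisted permissions are refused."""
    return permission in ROLE_PERMISSIONS.get(role, set())


def change_grade(role: str, grades: dict, student: str, new_grade: str) -> None:
    """The code asks for a permission, never for a specific role name."""
    if not is_allowed(role, "grade:change"):
        raise PermissionError("grade change not permitted")
    grades[student] = new_grade
```

Because the check names the permission rather than the role, granting grade changes to a new role (say, a registrar) means editing the policy table, not every `if` statement in the application. Anything the table does not mention is refused.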

References

[4] OWASP A01:2021