- What’s the difference between 401 and 403 error codes?
- What is a 401 status code (Unauthorized), and what triggers it?
- What is a 403 status code (Forbidden), and what triggers it?
- What are the Similarities Between 403 and 401 Status Codes?
- How 401 and 403 status codes can impact SEO
- Search engines can’t index pages
- Crawl budget is wasted on restricted pages
- Users get frustrated and leave your pages quickly
- Rankings may drop over time
- How to monitor 401 and 403 HTTP errors on your website
- Identifying 401 and 403 error codes with Google Search Console
- Identifying 401 and 403 error codes with the Website Audit Tool
- Identifying 401 and 403 error codes with log file analysis
- “Blocked due to unauthorized request (401)” in GSC: How to fix 401 errors
- “Blocked due to access forbidden (403)” in GSC: How to fix 403 errors
- Conclusion
When accessing a website, you may stumble upon 401 or 403 HTTP response codes. Both mean the server refused your request: either you haven’t supplied valid credentials, or your credentials don’t grant access to the resource.
But what’s the actual difference between these two HTTP status codes? At first glance, they look very similar, but that’s where the trick lies. Once you understand the difference between 403 Forbidden vs 401 Unauthorized, you can better diagnose and fix issues around user authentication and access control.
This guide breaks down the complexities of 401 vs 403 error codes. It also illustrates how they differ and offers detailed solutions for various scenarios.
What’s the difference between 401 and 403 error codes?
The key difference between 401 (Unauthorized) and 403 (Forbidden) codes lies in why access was denied.
An HTTP 401 response code is returned when a user attempts to access a resource but hasn’t provided the necessary authentication credentials, like a valid username and password. It’s like trying to open a locked door without having the right keys.
On the other hand, an HTTP 403 status code typically occurs after the user has provided the correct login details. The main difference here is that they can’t access the requested resource due to insufficient permissions. Even though the user is authenticated, they lack the necessary authorization to proceed further. It’s like having the keys to the door but being told you’re not allowed inside. A server can also return 403 to a visitor who never logged in, for example when it blocks an IP range, because logging in would not change the outcome.
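The distinction can be sketched as a small decision function. This is an illustrative sketch only: the `USERS` and `ROLES` dictionaries and the `status_for` name are hypothetical stand-ins for whatever user store and permission model a real site uses.

```python
from typing import Optional

# Hypothetical in-memory stores, used only for illustration.
USERS = {"alice": "s3cret", "bob": "hunter2"}
ROLES = {"alice": {"admin"}, "bob": {"reader"}}


def status_for(username: Optional[str], password: Optional[str], required_role: str) -> int:
    """Return the HTTP status a server might send for this request."""
    # Missing or non-matching credentials: the caller is not authenticated.
    if username not in USERS or USERS[username] != password:
        return 401
    # Authenticated, but the account lacks the role this resource requires.
    if required_role not in ROLES.get(username, set()):
        return 403
    return 200


print(status_for(None, None, "admin"))        # no credentials at all
print(status_for("bob", "hunter2", "admin"))  # valid login, wrong role
print(status_for("alice", "s3cret", "admin")) # valid login, right role
```

The order of the checks is the point: authentication is evaluated first, and authorization is only considered once the identity is known.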
What is a 401 status code (Unauthorized), and what triggers it?
The HTTP 401 status code signals that the client’s request lacked valid authentication credentials for the target resource.
When the client (usually a web browser) tries to access a protected resource, the site requires the client to provide valid authentication. Depending on the website, this could take the form of a username and password, API keys, or other methods.
After that, the website processes the credentials to validate their legitimacy. This process could involve cross-referencing the credentials with a stored user database, contacting an external authentication provider, or another validation method. Upon successful authentication, the server returns the requested resource, typically with a 200 status code, and the user never notices the check.
But in cases where authentication isn’t successful, the website issues the 401 status code.
There are several situations where the 401 status code may appear:
- Wrong login method used: The user tries to access the website with an inappropriate or unsupported authentication method.
- Expired or canceled login details: Sometimes, authentication credentials and tokens can expire after a certain period, or the user’s account is intentionally revoked. This is when the user must renew their access permissions.
- Login details missing or wrong: The request is made without authentication credentials, or the credentials provided are incorrect.
- No authorization header used: The request lacks the necessary authorization header, which typically carries information about the user’s credentials.
- Issues with cookies: The user’s browser doesn’t accept cookies (or deletes them regularly), so the site has trouble remembering the user’s login information.
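Several of the triggers above (no authorization header, unsupported method, wrong details) can be illustrated with a minimal sketch of a server-side check for HTTP Basic authentication. The `VALID` store, the realm name, and the `authenticate` function are assumptions made for this example, not part of any particular framework. The HTTP specification does require a 401 response to carry a `WWW-Authenticate` header telling the client how to authenticate.

```python
import base64
from typing import Dict, Tuple

# Hypothetical credential store for the example.
VALID = {"alice": "s3cret"}


def authenticate(headers: Dict[str, str]) -> Tuple[int, Dict[str, str]]:
    """Return (status, response headers) for a Basic-auth protected resource."""
    challenge = {"WWW-Authenticate": 'Basic realm="example"'}
    auth = headers.get("Authorization")
    if auth is None:
        return 401, challenge  # no Authorization header sent
    scheme, _, token = auth.partition(" ")
    if scheme.lower() != "basic":
        return 401, challenge  # unsupported authentication method
    try:
        decoded = base64.b64decode(token, validate=True).decode("utf-8")
    except ValueError:
        return 401, challenge  # malformed credentials
    user, _, password = decoded.partition(":")
    if VALID.get(user) != password:
        return 401, challenge  # unknown user or wrong password
    return 200, {}
```

Every failure path returns the same challenge, so the client learns which scheme to use without learning which part of the credentials was wrong.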
What is a 403 status code (Forbidden), and what triggers it?
The HTTP 403 response code indicates that the server has understood the client’s request but refuses to fulfill it. Typically, the client has been authenticated but is not permitted to access the requested resource. Unlike a 401 error, which indicates an authentication issue, a 403 error signifies a broader problem related to authorization.
For example, if a client attempts to access a link intended only for administrators, the server would respond with a 403 error to signify that they are not authorized to access the specified link.
Here are some reasons why the 403 status code may appear:
- User lacks permission: The authenticated user lacks the necessary permissions to access the specific resource.
- Login failure or expired session: Temporary authentication failure, an expired session, or suspicious activity noticed by the server can result in restricted access, even for authenticated users.
- Geo or IP-based restrictions: Some servers impose restrictions based on IP addresses or geographical locations.
- Content access restrictions: Some websites or online services limit access to content based on age, location, membership status, etc.
- Directory access restriction: The server is configured to block directory listings, so a request for a directory URL that has no index file is refused.
- Access Control Lists (ACLs) restrictions: Some servers use ACLs to set specific permissions for different users. If a user isn’t included in the ACL, the server will limit access for them.
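Two of the triggers above, IP-based restrictions and ACLs, can be sketched with Python’s standard `ipaddress` module. The blocked network, the `ACL` mapping, and the `is_forbidden` function are hypothetical values chosen for illustration; real servers usually express these rules in their own configuration files.

```python
import ipaddress

# Hypothetical blocked range (a documentation-only network).
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
# Hypothetical ACL: path prefix -> users allowed to read it.
ACL = {"/admin/": {"alice"}}


def is_forbidden(client_ip: str, user: str, path: str) -> bool:
    """Return True when the request should be answered with 403."""
    ip = ipaddress.ip_address(client_ip)
    # IP-based restriction: refuse regardless of who the user is.
    if any(ip in network for network in BLOCKED_NETWORKS):
        return True
    # ACL restriction: refuse users not listed for a protected prefix.
    for prefix, allowed_users in ACL.items():
        if path.startswith(prefix) and user not in allowed_users:
            return True
    return False
```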
What are the Similarities Between 403 and 401 Status Codes?
Understanding the difference between 401 vs 403 error codes can be confusing as both signal access denial, security, and user authentication issues. This can also make it difficult to figure out how to deal with each one. Let’s figure out what else these two codes have in common.
- Both 401 and 403 codes are HTTP status codes.
- Both errors lead to denied access. Even though a 401 error indicates the lack of valid authentication credentials and a 403 error denotes that access is forbidden, each prevents the client from landing on the requested website page.
- Both status codes communicate issues related to access control and security. Unauthorized users (or authorized ones with improper permissions) can’t access private data.
- Both errors are typically visible to users and are displayed in the browser.
- Both the 401 and 403 error codes address user authentication but at different stages. 401 refers to the lack of valid authentication credentials, whereas the 403 error occurs after authentication, signaling the absence of necessary permissions to access a resource.
How 401 and 403 status codes can impact SEO
401 and 403 HTTP errors are SEO issues that can result in incomplete or inaccurate indexing. They can also degrade user experience and engagement, and increase bounce rates. Let’s take a closer look at the impact of these status codes on SEO results.
Search engines can’t index pages
Since both HTTP errors indicate access denial, search engines won’t be able to crawl or index pages that return these codes. It’s okay if those pages weren’t meant to be available to the public. But if you planned to have them in search results, they won’t appear there. This means that your website’s overall visibility will be lower.
Crawl budget is wasted on restricted pages
When pages return 401 or 403 HTTP status codes, search engine bots expend crawl resources in an attempt to access content that they are ultimately unable to include in search results. This reduces the overall efficiency of the crawling process because other important pages or new content on your website may not be crawled as frequently or thoroughly.
Users get frustrated and leave your pages quickly
When the user stumbles upon these errors, it leads to frustration and a negative overall experience. The website experiences higher bounce rates, lower engagement metrics, and reduced time spent on the site.
Rankings may drop over time
The combination of blocked indexing, wasted crawl budget, and negative user experience can all lead to potential ranking drops in search engine results.
How to monitor 401 and 403 HTTP errors on your website
To keep 401 vs 403 errors under control, you must actively monitor them. This is pretty easy to do, especially with tools like Google Search Console and SE Ranking’s Website SEO Audit tool. You can also analyze log files. More on this below.
Identifying 401 and 403 error codes with Google Search Console
Google Search Console helps you identify issues related to 401 and 403 HTTP response codes by providing a detailed report on crawl errors. These errors indicate instances where Googlebot has difficulty accessing certain pages on your website.
To find issues with 401 and 403 status codes in the Google Search Console, go to the Indexing report and open the Pages tab. Scroll down to the Why pages aren’t indexed section to see the list of reasons. If your website contains blocked pages, you’ll see Blocked due to unauthorized request (401) or Blocked due to access forbidden (403).
Clicking on the reason will take you to a detailed report on the blocked URLs.
Another option is to go to Settings and open the Crawl stats report.
Scroll down to the By response section, and check for Unauthorized (401/407) and Other client error (4XX).
Identifying 401 and 403 error codes with the Website Audit Tool
SE Ranking’s Website Audit tool makes identifying forbidden vs unauthorized status codes a breeze. It helps with technical audits to identify the overall health of your website. It also detects HTTP errors, page indexing issues, redirect problems, and much more.
To detect 401 and 403 HTTP errors, launch a website check with SE Ranking. The tool is available both as a project and a stand-alone solution. Once the analysis is complete, go to the Issue report within the Website Audit tool.
Proceed to the HTTP Status Code section and check all the 4xx-related issues, including:
- 4XX pages in XML sitemap
- 4XX HTTP Status Codes
- Canonical URLs with a 4XX Status Code
- External links to 4XX, etc.
Click on the issue to see a description of the problem with tips on how to solve it.
You can also use the Crawled Pages report to see all of the pages on your website that were found by SE Ranking, as well as use filters to sort pages by 4xx status code. Click on Filters, choose HTTP Status Code in Issues, select the error of your interest, and Apply filters.
If you’ve run several website checks, use the Crawl Comparison to see how the situation regarding 4xx HTTP errors has changed over time.
Identifying 401 and 403 error codes with log file analysis
Log files generated by web servers contain valuable information about every request made to the server, including details about the status codes returned.
Here are the steps for monitoring HTTP errors using log file analysis:
- Access log retrieval: Get access logs from your web server. These logs contain detailed records of every request made to the server and response codes returned.
- Filtering by status code: Use a log analysis tool or script to filter log entries for HTTP status codes 401 and 403.
- Timestamp analysis: Review the timestamps associated with each entry to determine when the errors occurred and gain insights into potential issues.
- IP address and user agent examination: Investigate the IP addresses and user agents associated with the requests that resulted in 401 or 403 errors. This helps with identifying the sources of the access attempts and understanding whether they are legitimate users, bots, or potential security threats.
- URLs and referrers: Analyze the URLs and referrers in the log entries to identify the pages or resources that triggered the 401 or 403 errors. This pinpoints the location of access issues.
- User Authentication Insights (401): For 401 errors, examine the log entries for details about user authentication failures. Look for patterns such as failed login attempts, incorrect credentials, or expired sessions.
- Forbidden Access Insights (403): For 403 errors, analyze the log entries to determine the reasons for denied access. Investigate directory permissions, access controls, or any other configurations that may be restricting access to certain resources.
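The filtering steps above can be sketched in a few lines of Python. This sketch assumes the Apache/Nginx "combined" log format; the regular expression, the `SAMPLE` lines, and the `access_errors` function are illustrative and would need adjusting if your server logs in a different format.

```python
import re
from collections import Counter

# Pattern for the "combined" access log format (an assumption about your server).
LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Made-up sample entries; in practice, read these lines from your access log file.
SAMPLE = [
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /members/ HTTP/1.1" 401 381 "-" "Googlebot/2.1"',
    '198.51.100.7 - - [10/Oct/2023:13:56:02 +0000] "GET /admin/ HTTP/1.1" 403 199 "https://example.com/" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2023:13:56:10 +0000] "GET /blog/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]


def access_errors(lines):
    """Return parsed entries whose status code is 401 or 403."""
    hits = []
    for line in lines:
        match = LINE.match(line)
        if match and match["status"] in {"401", "403"}:
            hits.append(match.groupdict())
    return hits


errors = access_errors(SAMPLE)
# Count errors per (status, URL) to see which pages are affected most often.
by_url = Counter((entry["status"], entry["url"]) for entry in errors)
for (status, url), count in by_url.most_common():
    print(status, url, count)
```

Each parsed entry also carries the IP address, timestamp, referrer, and user agent, so the same list can be grouped by any of those fields to separate search engine bots from ordinary visitors.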
“Blocked due to unauthorized request (401)” in GSC: How to fix 401 errors
Before fixing the “Blocked due to unauthorized request (401)” page on your website, decide whether you want this 401 page to be indexed. Not all pages on your website need to be indexed (e.g., pages behind a login wall), but every situation is unique. You may have your own reasons for indexing them. Filter pages by the ones listed in your sitemap to see which of them can and can’t be indexed. You can also run an on-page SEO check if you want to see whether a specific URL has been indexed.
If you decide to index the 401 pages, adjust server settings to permit Googlebot to access these URLs and treat them differently than users’ browsers. However, presenting distinct content to Google can lead to cloaking penalties, so be cautious. Address this by using structured data on paywalled pages and following Google’s guidelines for adding appropriate data on subscription pages.
If you decide that the 401 pages don’t need to be indexed, disallow these parts of your website in the robots.txt file. This will optimize your crawl budget.
It’s also recommended to edit or remove unnecessary links to the 401 page from the referring pages. Doing so will keep your website’s internal linking safe and sound. Use the URL Inspection Tool in GSC to identify the links directing the crawler to a specific 401 page.
You can also use SE Ranking’s Website Audit tool to see the internal and external links pointing to the pages on your website. Go to the Found links report to view the table with link URLs, their status codes, and pages where those links were found. Use filters for a more convenient search.
“Blocked due to access forbidden (403)” in GSC: How to fix 403 errors
Just like with the 401 error, you must decide whether it’s worth fixing the “Blocked due to access forbidden (403)” issue at all. Do you want to show 403 pages to Googlebot?
If you don’t, then you can block them from crawling with a robots.txt rule. This will prevent Googlebot from wasting its crawl budget on restricted pages. You can use the following directives, placed under a User-agent line, to disallow access to a specific folder or URL on your website:
- Disallow: /folder-name/
- Disallow: /page-url.html
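Before deploying such rules, it can help to check that they block what you expect. The sketch below uses Python’s standard `urllib.robotparser` module; the `User-agent: *` group, the `example.com` URLs, and the `is_crawlable` helper are assumptions added for the example.

```python
from urllib.robotparser import RobotFileParser

# The two Disallow rules from above, placed under a catch-all user-agent group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /folder-name/
Disallow: /page-url.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url: str, agent: str = "Googlebot") -> bool:
    """Return True if the parsed robots.txt rules allow this agent to fetch the URL."""
    return parser.can_fetch(agent, url)


for url in (
    "https://example.com/folder-name/report.html",
    "https://example.com/page-url.html",
    "https://example.com/blog/",
):
    print(url, is_crawlable(url))
```

Python’s parser may not match Google’s handling of every edge case, so treat this as a quick sanity check and not as a guarantee of how Googlebot will behave.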
If you have pages that you want indexed by search engines but restricted for non-logged-in users (such as paywalled content), you can grant access to Googlebot. Adjust the server settings so that Googlebot isn’t stopped by the login wall. Note that displaying different content to Googlebot than to users requires the addition of structured data. This informs the crawler about paywalled content.
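As a rough illustration, paywalled-content markup is usually expressed as JSON-LD with schema.org’s `isAccessibleForFree` property. The sketch below builds such a snippet in Python; the headline, the `.paywall` CSS selector, and the `paywall_jsonld` function name are placeholders, and the exact properties you need should be checked against Google’s current guidelines.

```python
import json


def paywall_jsonld(headline: str, css_selector: str) -> str:
    """Build a JSON-LD string marking part of an article as paywalled."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "isAccessibleForFree": False,
        "hasPart": {
            "@type": "WebPageElement",
            "isAccessibleForFree": False,
            # Selector for the element that wraps the paywalled section (placeholder).
            "cssSelector": css_selector,
        },
    }
    return json.dumps(data, indent=2)


print(paywall_jsonld("Example subscriber-only article", ".paywall"))
```

The resulting string would go inside a `script` tag of type `application/ld+json` on the paywalled page.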
However, there may be certain pages on your site that are intended for public access, but they currently return a 403 status code to Googlebot due to various reasons.
Let’s look at the reasons why and how to fix them:
- Errors in your .htaccess file: Disable the existing .htaccess file and generate a new one. Then crawl the affected pages as Googlebot to see the website from the crawler’s perspective and verify that the issue is resolved.
- File permissions: Check if you have enabled the permissions for the files that you want search engines to see. If not, grant the necessary permissions.
- Incompatible plugin: If you use a CMS like WordPress, you may encounter plugin problems. Update your plugins and check whether they are compatible with your current version of WordPress. Deactivate them if they aren’t.
- Wrong IP address: Verify your Address record (A-record), which is used to map a domain or subdomain to a specific IPv4 address.
- Malware infection: Inspect your website for malware infections and eliminate any that you find.
- Hosting issues: If the suggestions above don’t help, it’s best to contact your hosting provider. There may be a problem on their side.
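For the file permissions item in the list above, a quick check can be scripted. This is a minimal sketch that assumes a Unix-like server where the web server process runs as a different user than the file owner, so files need the "others can read" bit. The `world_readable` name and the `public_html/index.html` path are hypothetical.

```python
import os
import stat


def world_readable(path: str) -> bool:
    """Return True if the 'others' read bit is set on the file."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


if __name__ == "__main__":
    # Hypothetical path; replace with a file from your own web root.
    target = "public_html/index.html"
    if os.path.exists(target):
        print(target, "readable by others:", world_readable(target))
```

A file with mode 644 passes this check, while one with mode 600 does not. Whether that matters depends on how your hosting environment runs the web server.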
Conclusion
Understanding the nuances of 401 vs 403 errors is crucial for maintaining a healthy and well-optimized website. These HTTP status codes can have significant implications for SEO, affecting indexing, crawl budget utilization, user experience, and potential ranking drops.
Source: https://seranking.com/blog/401-vs-403-error-codes/