Why is the hostname declared invalid when creating a URI?

Eugen Covaci :

Running this code with JDK 1.8:

try {
    System.out.println( new URI(null, null, "5-12-145-35_s-81", 443, null, null, null));
} catch (URISyntaxException e) {
    e.printStackTrace();
}

results in this error: java.net.URISyntaxException: Illegal character in hostname at index 13: //5-12-145-35_s-81:443

Where does this error come from, considering all the hostname characters seem legit, according to Types of URI characters?


If I use these URLs: //5-12-145-35_s-81:443 or /5-12-145-35_s-81:443 the error is gone.


From the comments, I understand that, according to RFC 2396, the hostname cannot contain underscore characters.

The question that remains is: why is a hostname starting with a slash or double slash allowed to contain underscores?

Andreas :

A host name must match the following syntax (RFC 2396, section 3.2.2, as applied by java.net.URI):

hostname      = domainlabel [ "." ] | 1*( domainlabel "." ) toplabel [ "." ]
domainlabel   = alphanum | alphanum *( alphanum | "-" ) alphanum
toplabel      = alpha | alpha *( alphanum | "-" ) alphanum

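As you can see, apart from letters and digits, only . and - are allowed; _ is not. That underscore is exactly the character at index 13 in the error message.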
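
To make the grammar concrete, here is a minimal sketch that transcribes the three rules above into a regular expression. The class name HostnameGrammar and the method matchesGrammar are made up for illustration; this is not how the JDK parses host names internally, only a direct reading of the grammar.

```java
import java.util.regex.Pattern;

class HostnameGrammar {
    // domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
    private static final String DOMAIN_LABEL = "[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?";
    // toplabel = alpha | alpha *( alphanum | "-" ) alphanum
    private static final String TOP_LABEL = "[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?";
    // hostname = domainlabel [ "." ] | 1*( domainlabel "." ) toplabel [ "." ]
    private static final Pattern HOSTNAME = Pattern.compile(
            DOMAIN_LABEL + "\\.?|(?:" + DOMAIN_LABEL + "\\.)+" + TOP_LABEL + "\\.?");

    // Hypothetical helper: true if the whole string matches the hostname rule.
    static boolean matchesGrammar(String host) {
        return HOSTNAME.matcher(host).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesGrammar("5-12-145-35_s-81")); // false: '_' is in no label rule
        System.out.println(matchesGrammar("5-12-145-35-s-81")); // true: a single domainlabel
        System.out.println(matchesGrammar("example.com"));      // true: domainlabel "." toplabel
    }
}
```

Replacing the underscore with a hyphen is enough to satisfy the grammar, which suggests the underscore is the only offending character in the original host name.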


You then say that //5-12-145-35_s-81:443 is allowed, and it is, but not as a host name.

To see how that pans out:

URI uriBadHost = URI.create("//5-12-145-35_s-81:443");
System.out.println("uri = " + uriBadHost);
System.out.println("  authority = " + uriBadHost.getAuthority());
System.out.println("  host = " + uriBadHost.getHost());
System.out.println("  port = " + uriBadHost.getPort());
URI uriGoodHost = URI.create("//example.com:443");
System.out.println("uri = " + uriGoodHost);
System.out.println("  authority = " + uriGoodHost.getAuthority());
System.out.println("  host = " + uriGoodHost.getHost());
System.out.println("  port = " + uriGoodHost.getPort());

Output

uri = //5-12-145-35_s-81:443
  authority = 5-12-145-35_s-81:443
  host = null
  port = -1
uri = //example.com:443
  authority = example.com:443
  host = example.com
  port = 443

As you can see, when the authority contains a valid host name, the host and port are parsed out. When it does not, the authority is treated as free-form text (a registry-based authority, in RFC 2396 terms) and is not parsed any further.
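
If you want the lenient parse to fail loudly instead of returning a null host, URI.parseServerAuthority() asks the URI to re-parse its authority as user-info, host and port. The sketch below wraps that call; the class name ServerAuthorityCheck and the helper hostOrNull are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class ServerAuthorityCheck {
    // Hypothetical helper: returns the host if the authority is server-based,
    // or null if the URI (or its authority) cannot be parsed that way.
    static String hostOrNull(String uriText) {
        try {
            return new URI(uriText).parseServerAuthority().getHost();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(hostOrNull("//5-12-145-35_s-81:443")); // null
        System.out.println(hostOrNull("//example.com:443"));      // example.com
    }
}
```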


UPDATE

From comment:

System.out.println( new URI(null, null, "/5-12-145-35_s-81", 443, null, null, null)) outputs: ///5-12-145-35_s-81:443. I'm giving it as hostname

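The URI constructor you're calling is a convenience constructor: it simply builds a full URI string and then parses it, as if by calling URI(String) followed by parseServerAuthority(). That second step is why it throws for an invalid host name, whereas URI.create() quietly falls back to a free-form authority.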

Passing "5-12-145-35_s-81", 443 becomes //5-12-145-35_s-81:443.
Passing "/5-12-145-35_s-81", 443 becomes ///5-12-145-35_s-81:443.

In the first case, the text after // must be a host and port, and it fails to parse.
In the second case, the authority part is empty, and /5-12-145-35_s-81:443 is a path.

URI uri1 = new URI(null, null, "/5-12-145-35_s-81", 443, null, null, null);
System.out.println("uri = " + uri1);
System.out.println("  authority = " + uri1.getAuthority());
System.out.println("  host = " + uri1.getHost());
System.out.println("  port = " + uri1.getPort());
System.out.println("  path = " + uri1.getPath());

Output

uri = ///5-12-145-35_s-81:443
  authority = null
  host = null
  port = -1
  path = /5-12-145-35_s-81:443
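
To put the two constructor calls side by side, here is a small sketch that passes each host string to the same seven-argument constructor and reports what comes back. The class name ConstructorComparison and the method describe are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class ConstructorComparison {
    // Hypothetical helper: builds a URI from host and port only and summarises
    // the result, or reports why the constructor rejected the input.
    static String describe(String host, int port) {
        try {
            URI uri = new URI(null, null, host, port, null, null, null);
            return "authority=" + uri.getAuthority() + " path=" + uri.getPath();
        } catch (URISyntaxException e) {
            return "rejected: " + e.getReason();
        }
    }

    public static void main(String[] args) {
        // Parsed as host and port, so the underscore is rejected.
        System.out.println(describe("5-12-145-35_s-81", 443));
        // The leading slash ends the (empty) authority, so the rest is a path.
        System.out.println(describe("/5-12-145-35_s-81", 443));
    }
}
```

In the second call the underscore is never checked against the host name grammar, because by the time the parser reaches it, it is already reading the path component.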
