How to Encode Special Characters in Java’s URI Class

You would think adding query parameters with special characters to a URI would be easy in Java, but you’d be wrong.  The java.net.URI class tries to do some URL encoding, but runs into trouble with characters like ampersands, question marks, and slashes.  Here’s a quick URI workaround that doesn’t rely on third-party libraries.

 
Failure 1 – No Encoding

Your first attempt to encode special characters in a query string might be to pass it to the URI’s constructor just as you received it.

URI is smart enough to encode some characters like spaces and percent signs, but it leaves other symbols, such as ampersands, untouched.  If you try this first approach, you’ll get a garbled URL instead of the properly encoded one.
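
As a minimal sketch of this failure, the snippet below passes a raw query string to URI's multi-argument constructor. The host example.com, the path /search, and the company parameter are made up for illustration and are not the post's original sample values.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class NoEncodingDemo {

    // Builds a URI from an unencoded query string.
    // Host, path, and parameter names are hypothetical, chosen for illustration.
    static String build(String rawQuery) {
        try {
            // The multi-argument constructor quotes characters that are illegal
            // in a query (such as spaces) and always quotes '%', but it leaves
            // legal characters like '&' untouched.
            URI uri = new URI("http", null, "example.com", -1, "/search", rawQuery, null);
            return uri.toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // "A&B Co" is meant to be a single parameter value.
        System.out.println(build("company=A&B Co"));
        // prints http://example.com/search?company=A&B%20Co
    }
}
```

The space is escaped but the ampersand is not, so a server would split the value at that point. An encoded form such as company=A%26B%20Co is what was intended.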

Failure 2 – Double Encoding

For the next attempt, you’d reasonably try to encode the query parameters with java.net.URLEncoder first, before passing them on to URI.

Makes sense.  Unfortunately, this causes URI to encode the percent sign (%) that was produced by the first encoding to %25.  Definitely not what we want.
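
The double encoding can be reproduced with a short sketch like the one below. As before, the host, path, and parameter values are hypothetical stand-ins rather than the post's original sample.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class DoubleEncodingDemo {

    // Encodes the value with URLEncoder, then hands the result to URI.
    // Host and path are hypothetical, chosen for illustration.
    static String build(String name, String value) {
        try {
            // First encoding: "A&B Co" becomes "A%26B+Co".
            String encoded = URLEncoder.encode(value, "UTF-8");
            // Second encoding: URI's constructor quotes every '%',
            // so "%26" turns into "%2526".
            URI uri = new URI("http", null, "example.com", -1, "/search",
                    name + "=" + encoded, null);
            return uri.toString();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("company", "A&B Co"));
        // prints http://example.com/search?company=A%2526B+Co
    }
}
```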

Bypass URI’s Constructor with Reflection

That last attempt came really close.  If only there was a way to bypass the encoding step in URI’s constructor and set the query string to your properly encoded value directly.

Well there is a way — using reflection.  This approach won’t be blessed by any of the high OOP priests, but using reflection to set URI’s private fields directly does get the job done.  It doesn’t require any extra libraries and works in Oracle’s JDK 6, 7, and 8.

The important points with this approach are to:

  1. Call Class.getDeclaredField(String) instead of Class.getField(String), since the latter only looks for public fields while query is declared private.
  2. Call field.setAccessible(true) to allow you to modify the value in this private field.
  3. Force the URI to rebuild on the next toString() call by setting its string field to null.
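
The three points above can be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation. It relies on private fields named query and string, which are internals of URI rather than public API. The helper name withRawQuery and the example.com URL are made up. On newer JDKs that strongly encapsulate JDK internals, setAccessible is expected to be refused unless the java.net package is explicitly opened, so the sketch returns null in that case instead of failing.

```java
import java.lang.reflect.Field;
import java.net.URI;

public class ReflectionQueryDemo {

    // Overwrites URI's private query field with an already-encoded value.
    // Hypothetical helper: returns null if the JDK refuses the reflective access.
    // Intended for a freshly created URI, before other code has read it.
    static URI withRawQuery(URI uri, String encodedQuery) {
        try {
            // getDeclaredField finds private fields; getField would not.
            Field query = URI.class.getDeclaredField("query");
            Field string = URI.class.getDeclaredField("string");
            query.setAccessible(true);
            string.setAccessible(true);

            // Store the pre-encoded query without any further quoting.
            query.set(uri, encodedQuery);
            // Clear the cached text so the next toString() rebuilds it.
            string.set(uri, null);
            return uri;
        } catch (Exception e) {
            // Missing field or access denied (for example on a newer JDK).
            return null;
        }
    }

    // Applies the helper to a hypothetical base URL with a pre-encoded query.
    static String demo() {
        URI uri = withRawQuery(URI.create("http://example.com/search"), "company=A%26B+Co");
        return uri == null ? null : uri.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
        // When the reflection is permitted, prints
        // http://example.com/search?company=A%26B+Co
    }
}
```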

 

 

About Dele Taylor

Dele Taylor is the founder of StackHunter.com -- a tool to track Java exceptions. You can follow him on Twitter, G+, and LinkedIn.

8 Responses to “How to Encode Special Characters in Java’s URI Class”

  1. I think I found a better way, using the java.net.URL.toURI() method:

    // double encoded java.net.URI
    System.out.println(new URI("https", null, "foo.bar", -1, "/baz", "fuz=a%26b&q=w", null));
    // good java.net.URL
    System.out.println(new URL("https", "foo.bar", -1, "/baz?fuz=a%26b&q=w"));
    // good java.net.URI
    System.out.println(new URL("https", "foo.bar", -1, "/baz?fuz=a%26b&q=w").toURI());

    • stribika,

      You are right! Creating a URL object first and then converting it to a URI object seems to honor the encodings already performed. This avoids using Reflection as the author of this article suggests. Great job!

  2. Great article!
    This is exactly what I was looking for in order to get “?” string value

  3. Thank God for this! …err I mean Thank you good sir for this!!!

    I needed to add two fragments (parameters with #) to a URI and this is exactly what I needed. It would not be possible any other way, since two URI fragments are completely non-standard. The only difference was that I changed the field lookup to URI.class.getDeclaredField("fragment");

    Now why would anyone need such a twistedness you may ask – Robohelp –
    it uses a special linking to its webhelp files with two URI fragments to open help chapters and topics within them, they are some twisted m*** f***rs.
