Java Forum / General / August 2006

Convert a given string into how Java would interpret it if used in code?

steve.chambers@gmail.com - 03 Aug 2006 10:38 GMT
Hi,

Is there any way to do this in Java? I'll try to explain what I mean a
bit better.

Given the following string in a text file:

   This is a return:\r\nThis is a tab:\tand this is a backslash:\\

I want Java to interpret this as it would if this string were in code
by changing the escape characters into their single character
representations. I know I could do this with a number of
String.replaceAll() calls, but was wondering if there's a method that
parses the string and produces the result in a nicer way?

Thanks for any help with this...

Cheers,
Steve
steve.chambers@gmail.com - 03 Aug 2006 10:55 GMT
In fact I've just realised it's a bit more complicated than I thought
if using replaceAll(). If there was a double backslash followed by an
escape character, e.g. \\t, whichever way round I do the replacing it's
going to run into problems. If I replace the \\ with \ first then \t
will still be replaced with the literal tab character afterwards. And
replacing the \t first wouldn't work either - what we want to end up
with is a literal backslash followed by a "t". I'm stuck! But clinging
onto the hope that there might be an API call somewhere that will take
the work away...

> Hi,
>
[quoted text clipped - 15 lines]
> Cheers,
> Steve
Robert Klemme - 03 Aug 2006 11:06 GMT
Please don't top post.

> In fact I've just realised it's a bit more complicated than I thought
> if using replaceAll(). If there was a double backslash followed by an
[quoted text clipped - 5 lines]
> onto the hope that there might be an API call somewhere that will take
> the work away...

I don't see the point.  If you have the sequence "\\t" in your string
then what you want afterwards is a single literal backslash and a t.
Otherwise you would have to have either one or three backslashes.

The way I typically code this is a single loop that replaces the
sequence "backslash anything" with the appropriate value.

HTH

Regards

    robert
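
A minimal sketch of the "backslash anything" idea, assuming a regex-driven loop rather than whatever Robert writes by hand. The class name BackslashAnything and the method name unescape are made up for illustration, and only a handful of escapes are mapped. Each match is consumed exactly once, so a replacement can never be replaced again.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BackslashAnything {
    // A backslash followed by any single character (DOTALL so a line break counts too).
    private static final Pattern ESCAPE = Pattern.compile("\\\\(.)", Pattern.DOTALL);

    // Hypothetical helper: maps a few escapes and leaves unknown ones untouched.
    static String unescape(String s) {
        Matcher m = ESCAPE.matcher(s);
        StringBuffer out = new StringBuffer(s.length());
        while (m.find()) {
            String replacement;
            switch (m.group(1).charAt(0)) {
                case 'n':  replacement = "\n"; break;
                case 'r':  replacement = "\r"; break;
                case 't':  replacement = "\t"; break;
                case '\\': replacement = "\\"; break;
                default:   replacement = m.group(); break; // unknown escape: keep as is
            }
            // quoteReplacement stops '\' and '$' being treated specially in the replacement.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal below is what a text file containing \r\n, \t and \\ looks like in memory.
        String fromFile = "This is a return:\\r\\nThis is a tab:\\tand this is a backslash:\\\\";
        System.out.println(unescape(fromFile));
    }
}
```

A trailing lone backslash does not match the pattern, so it is copied through unchanged by appendTail.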
Simon - 03 Aug 2006 11:20 GMT
> Please don't top post.
>
[quoted text clipped - 11 lines]
> then what you want afterwards is a single literal backslash and a t.
> Otherwise you would have to have either one or three backslashes.

Take the string "\\t" and consider Steve's algorithm using replaceAll(). It can
be implemented in two ways:

Option 1:
1) First replace all "\\" by "\"
  Result: "\t"
2) Then replace all "\t" by "[TAB]"
  Result: "[TAB]"

Option 2:
1) First replace all "\t" by "[TAB]"
  Result: "\[TAB]"
2) Then replace all "\\" by "\"
  Result: "\[TAB]" (no "\\" is left to replace)

Neither is what you want ("\t"). It fails because a replacement character may
be replaced again. Iterating once over all characters of the original string and
appending the characters or their replacements to a new string should do the
job. This implementation should also be quick and easy to check for correctness.

Still, the question remains open whether this is implemented somewhere already.
Actually I think I did that a dozen times already...

Cheers,
Simon
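
The two orderings above can be checked with a small sketch. It assumes String.replace (literal matching) instead of replaceAll, purely to avoid regex escaping; the class and method names are made up for illustration.

```java
class ReplaceOrderDemo {
    // Option 1: collapse "\\" to "\" first, then turn "\t" into a real tab.
    static String option1(String s) {
        return s.replace("\\\\", "\\").replace("\\t", "\t");
    }

    // Option 2: turn "\t" into a real tab first, then collapse "\\" to "\".
    static String option2(String s) {
        return s.replace("\\t", "\t").replace("\\\\", "\\");
    }

    public static void main(String[] args) {
        String input = "\\\\t";  // three characters: backslash, backslash, t
        String wanted = "\\t";   // two characters: backslash, t

        System.out.println("option 1 correct? " + option1(input).equals(wanted));
        System.out.println("option 2 correct? " + option2(input).equals(wanted));
    }
}
```

Option 1 ends up with a lone tab and option 2 with a backslash followed by a tab, so neither produces the wanted backslash followed by "t".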
steve.chambers@gmail.com - 03 Aug 2006 11:53 GMT
> Take the string "\\t" and consider Steve's algorithm using replaceAll(). It can
> be implemented in two ways:
[quoted text clipped - 21 lines]
> Cheers,
> Simon

Well said, Simon, that was what I meant. What I could do to botch this
is a replaceAll on the double backslashes first, replacing them with a
placeholder string that would never be expected in the input and which
doesn't include a backslash character (e.g. "`¬backslash~@"), then
replace the other escape sequences, and finally replace all the
placeholders with single backslashes.
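
A rough sketch of that placeholder idea, assuming a private-use character as the placeholder (a hypothetical choice, not the string mentioned above) and String.replace for literal matching. It breaks if the input ever contains the placeholder itself, which is why it is only a botch.

```java
class PlaceholderUnescape {
    // Hypothetical placeholder: a private-use code point assumed never to occur in the input.
    private static final String PLACEHOLDER = "\uE000";

    static String unescape(String s) {
        return s.replace("\\\\", PLACEHOLDER)  // park escaped backslashes out of the way
                .replace("\\r", "\r")
                .replace("\\n", "\n")
                .replace("\\t", "\t")
                .replace(PLACEHOLDER, "\\");   // restore them as single backslashes
    }

    public static void main(String[] args) {
        String input = "\\\\t"; // backslash, backslash, t
        System.out.println(unescape(input).equals("\\t"));
    }
}
```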

However, as this is a bit of a botch, I think I'll go with the suggested
looping method instead. Here's my first attempt, which converts the
particular escape sequences that I need to be able to use in my text
file but doesn't bother with backspaces, Unicode escapes, etc.:

   /**
    * Replaces the following escape characters in a string with their
    * literal equivalents:
    * \\f  -> \f (form feed)
    * \\n  -> \n (new line)
    * \\r  -> \r (carriage return)
    * \\t  -> \t (tab)
    * \\'  -> \' (single quote)
    * \\"  -> \" (double quote)
    * \\\\ -> \\ (backslash)
    *
    * @param inputString The string in which the replacements will be
    *        made
    * @return The string with all escape characters replaced by their
    *         equivalent literals
    */
   public static String replaceEscapesWithLiterals(String inputString) {
       String returnString = "";
       int inputStringLength = inputString.length();

       int charNum = 0;
       while (charNum < inputStringLength) {
           char currentChar = inputString.charAt(charNum);
           char literal = '\0';
           if ((currentChar == '\\') && (charNum + 1 < inputStringLength)) {
               char nextChar = inputString.charAt(charNum + 1);
               switch (nextChar) {
                   case 'f':
                       literal = '\f';
                       break;
                   case 'n':
                       literal = '\n';
                       break;
                   case 'r':
                       literal = '\r';
                       break;
                   case 't':
                       literal = '\t';
                       break;
                   case '\'':
                       literal = '\'';
                       break;
                   case '\"':
                       literal = '\"';
                       break;
                   case '\\':
                       literal = '\\';
                       break;
               }
           }
           if (literal == '\0') {
               returnString += currentChar;
               charNum++;
           } else {
               returnString += literal;
               charNum += 2;
           }
       }
       
       return returnString;
   }
Roland de Ruiter - 03 Aug 2006 12:29 GMT
>> Take the string "\\t" and consider Steve's algorithm using replaceAll(). It can
>> be implemented in two ways:
[quoted text clipped - 35 lines]
>
> [...]

Here's one I did earlier ;-)

// begin of ConvertEscapedChars.java
public class ConvertEscapedChars {
    public static void main(String[] args) {
        final String orig = "This is a return:\\r\\nThis is "
        + "a tab:\\tand this is a backslash:\\\\.\r\n"
        + "Others are the backspace \\b\r\nthe formfeed \\f\r\n"
        + "single quote \\'\r\ndouble quote \\\"\r\n"
        + "one character octal \\7\r\ntwo character octal \\76\r\n"
        + "three character octal \\176\r\n"
        + "two character octal escape followed by octal digit \\767\r\n"
        + "\t(note that this is not a 3-char octal escape because "
        + "first digit is bigger than 3)\r\n"
        + "two character octal escape followed by non-octal digit "
        + "\\768\r\n"
        + "one character octal escape followed by non-octal digit "
        + "\\78\r\n"
        + "an invalid escape sequence \\w\r\n"
        + "an unterminated escape sequence \\";

        System.out.println("------- Original -------");
        System.out.println(orig);
        System.out.println("------------------------");
        System.out.println();

        final String conv = convertEscapedChars(orig);
        System.out.println("------- Converted ------");
        System.out.println(conv);
        System.out.println("------------------------");
    }

    /**
     * Replaces escape sequences in the given String <code>s</code> by
     * the actual characters they represent. The escape sequences that
     * are recognized are those that are defined in section 3.10.6 of
     * the JLS, 3rd ed. Invalid escape sequences are left untouched.
     *
     * @param s
     *            the String to convert
     * @return a String where escape sequences have been replaced by
     *         their actual character
     * @see http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#101089
     */
    public static String convertEscapedChars(String s) {
        int n = s == null ? 0 : s.length();
        if (n == 0) {
            return s; // null or empty string
        }
        StringBuffer result = new StringBuffer(n);
        for (int i = 0; i < n; i++) {
            char c = s.charAt(i);
            if (c != '\\') {
                result.append(c);
            } else {
                if (++i < n) {
                    c = s.charAt(i);
                    switch (c) {
                    case 'b':
                        result.append('\b');
                        break;
                    case 't':
                        result.append('\t');
                        break;
                    case 'n':
                        result.append('\n');
                        break;
                    case 'f':
                        result.append('\f');
                        break;
                    case 'r':
                        result.append('\r');
                        break;
                    case '\"':
                        result.append('\"');
                        break;
                    case '\'':
                        result.append('\'');
                        break;
                    case '\\':
                        result.append('\\');
                        break;
                    case '0':
                    case '1':
                    case '2':
                    case '3':
                    case '4':
                    case '5':
                    case '6':
                    case '7':
                        // OctalEscape
                        StringBuffer octal =
                            new StringBuffer(3).append(c);
                        if (i + 1 < n && (c = s.charAt(i + 1)) >= '0'
                                && c <= '7') {
                            i++;
                            octal.append(c);
                            if (i + 1 < n && (c = s.charAt(i + 1))>= '0'
                                    && c <= '7') {
                                i++;
                                octal.append(c);
                            }
                        }
                        if (octal.length()==3 && octal.charAt(0)>'3') {
                            i--;
                            octal.setLength(2);
                        }
                        result.append(
                           (char) Integer.parseInt(octal.toString(),
                                8));
                        break;
                    default:
                        System.err.println(
                            "Invalid escape sequence: \\" + c);
                        result.append('\\').append(c);
                        break;
                    }
                } else {
                    System.err.println("Unterminated escape sequence");
                    result.append('\\');
                }
            }
        }
        return result.toString();
    }
}
// end of ConvertEscapedChars.java

Regards,

Roland
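
The octal rule in the code above (three digits only when the first is 0 to 3, otherwise at most two) can also be written as a regex alternation. This is only a sketch of the digit-count rule, with made-up names. It ignores every other escape, so an escaped backslash directly before a digit would be misread by it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OctalEscapeDemo {
    // Alternatives are tried left to right: three digits if the first is 0-3,
    // otherwise one or two digits (greedy, so two when available).
    private static final Pattern OCTAL =
            Pattern.compile("\\\\([0-3][0-7]{2}|[0-7]{1,2})");

    // Hypothetical helper: decodes octal escapes only, everything else is copied through.
    static String decodeOctal(String s) {
        Matcher m = OCTAL.matcher(s);
        StringBuffer out = new StringBuffer(s.length());
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 8);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeOctal("\\176")); // three-digit escape
        System.out.println(decodeOctal("\\767")); // two-digit escape followed by the digit 7
    }
}
```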

steve.chambers@gmail.com - 03 Aug 2006 14:19 GMT
> Here's one I did earlier ;-)
>
[quoted text clipped - 128 lines]
>
> Roland

Thanks Roland, looks like I've reinvented the wheel with a slightly
inferior version!

