How to replace \n with tags in Java

I have a string of the following form

Hello \n\tHow are you?\n\t

I want to replace it with

<li>Hello</li> <li>How are you?</li>

I am able to replace \n\t with <br> using message.replaceAll("\n\t","<br>");

But I am not sure how to replace with <li>***</li> tags. I tried this message.replace("/[^\r\n]+/g", "<li>$&</li>"); but it does not work.

Answer

Well, you seem to have drawn this conclusion:

  • Hey, I can replace newlines with <br /> tags using .replace. I generalize that turning string inputs into HTML structures therefore must always be done with replace…
  • so how do I use replace to make <li>-structured HTML instead?

This is a brainfart: you’ve made jumps that you shouldn’t be making. Specifically, .replace is generally not a good tool for converting unstructured text into structured HTML. The ‘newlines to br tags’ transformation is about the only one that can feasibly be done with .replace, and only because it happens to be a plain one-for-one substitution. Treating text-to-HTML conversion as string replacement is an oversimplification that fails for almost everything else.

Specifically, you tried message.replace("/[^\r\n]+/g", "<li>$&</li>"), which can’t work: in Java, replace does not apply regular expressions at all, and the slash delimiters, the g flag, and $& are JavaScript syntax in any case. Perhaps you wanted replaceAll, where the whole match is written $0. You CAN probably use replaceAll to do this job, but it wouldn’t be a good idea. It’s like the question: “Can you butter your toast using a claw hammer?” Yes, you can. That is no proof that it’s a good idea. Just use a butter knife if you want to do it right; ‘but it works with this claw hammer’ doesn’t change that fact.
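To make the difference concrete, here is a small sketch contrasting the two methods. The class and method names (ReplaceDemo, literal, regex) are made up for illustration. Note that even the working replaceAll version does no trimming and no HTML escaping, which is part of why it is the claw hammer here.

```java
final class ReplaceDemo {
    // String.replace treats its first argument as plain text, never as a regex,
    // so this only matches input that literally contains "[^\r\n]+".
    static String literal(String s) {
        return s.replace("[^\r\n]+", "<li>$0</li>");
    }

    // String.replaceAll compiles its first argument as a regex;
    // $0 in the replacement stands for the whole match (Java has no $&).
    static String regex(String s) {
        return s.replaceAll("[^\r\n]+", "<li>$0</li>");
    }

    public static void main(String[] args) {
        String message = "Hello\nHow are you?";
        System.out.println(literal(message)); // unchanged: no literal match
        System.out.println(regex(message));   // each line wrapped in <li>...</li>
    }
}
```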

So let’s break it down.

What you actually want to do is convert your input, which is unstructured but represents structured data, into its actual structure first, and only then convert that structure into <li>-tag based HTML. In other words, it’s a two-step model:

  1. Convert your input string into the structure you believe it holds.
  2. Convert that structure back into the output you want.

Each step has its own proper solution, depending on what kind of structure the data holds, how the input is formatted, and what output you want.

In this specific case it seems easy enough: The structure is ‘the input consists of a number of lines, separated by a newline character’, and the desired output is ‘each line, HTML-escaped, as a list item in a <ul> HTML element’.

So let’s do that:

First, we recover the actual structured data. Given that your structure is ‘a bunch of lines’, List<String> sounds like a data type that properly represents this. So, we need to convert the input into a List<String>. Let’s get to it:

// \R matches any line break (\n, \r\n, ...); it needs Java 8+ and a doubled backslash in source.
List<String> myData = Arrays.asList(input.split("\\R"));
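One caveat with the sample input from the question: each line break is followed by a tab, so a plain split leaves stray whitespace on the lines and a final whitespace-only entry. A minimal sketch that also trims each line and drops blank ones could look like this. The helper name toLines is hypothetical, and the trimming is an assumption about what the asker wants:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class LineSplitter {
    // Hypothetical helper: split on any line break, trim each piece,
    // and discard pieces that are empty after trimming.
    static List<String> toLines(String input) {
        return Arrays.stream(input.split("\\R"))
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample input from the question: a tab follows each newline.
        System.out.println(toLines("Hello \n\tHow are you?\n\t"));
        // prints [Hello, How are you?]
    }
}
```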

Then, you solve the completely separate problem of converting your now properly structured data into a desired output, which is now also quite easy:

StringBuilder html = new StringBuilder();
html.append("<ul>");
for (String line : myData) {
  html.append("<li>").append(htmlEscape(line)).append("</li>");
}
html.append("</ul>");
return html.toString();

Here htmlEscape could, for example, be powered by Guava’s HtmlEscapers.htmlEscaper(). Usually, if you’re doing web stuff, you have some framework or other on the classpath with this functionality. If you must write it yourself, it’s probably as simple as:

// The ampersand must stay first; otherwise the '&' of entities
// inserted by earlier replacements would itself be escaped again.
private static final String HTML_PROTECTED = "&<>\"'";
// "#39" is used for the apostrophe because &apos; is not defined in HTML 4.
private static final String[] HTML_REPLACEMENTS = {"amp", "lt", "gt", "quot", "#39"};

public static String htmlEscape(String s) {
  for (int i = 0; i < HTML_PROTECTED.length(); i++) {
    s = s.replace("" + HTML_PROTECTED.charAt(i), "&" + HTML_REPLACEMENTS[i] + ";");
  }
  return s;
}

That said, those frameworks no doubt have faster implementations of this idea.
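For reference, here is a self-contained sketch of the rendering step together with a hand-rolled escaper. The class name ListRenderer and the method name render are hypothetical, and the chained replace calls are just one simple way to write the escaper, not a tuned implementation:

```java
import java.util.Arrays;
import java.util.List;

final class ListRenderer {
    // Escapes the five characters that are significant in HTML text and attributes.
    // The ampersand is handled first so that later entities are not escaped twice.
    static String htmlEscape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    // Renders already-structured data (a list of lines) as a <ul> element.
    static String render(List<String> lines) {
        StringBuilder html = new StringBuilder("<ul>");
        for (String line : lines) {
            html.append("<li>").append(htmlEscape(line)).append("</li>");
        }
        return html.append("</ul>").toString();
    }

    public static void main(String[] args) {
        System.out.println(render(Arrays.asList("Hello", "Tom & Jerry <3")));
        // prints <ul><li>Hello</li><li>Tom &amp; Jerry &lt;3</li></ul>
    }
}
```

Because render only ever sees a List<String>, swapping the escaper for a library one, or changing the output to <ol>, does not touch the parsing step at all.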