I have a bunch of data that looks like this:
Minimum system requirements CPU: Celeron M 420 1.6GHz, Sempron 3100+ RAM: 1 GB VGA: GeForce 205, Radeon HD 6320 OS: Windows XP/Windows Vista/Windows 7/Windows 8 HDD: 4 GB Sound: DirectX compatible sound card DirectX: 9.0c
I’m trying to find a way to organize it so it’s easier to read later. I was thinking that adding a line break before every word with a colon attached would be the simplest way, but I’m not too familiar with regex and don’t really know how to approach the problem. I could search for each component separately, like “CPU:”, “OS:”, etc., but the labels aren’t consistent. Sometimes it’s listed as Processor, sometimes as CPU; sometimes it’s RAM, other times it’s Memory.
You can try
str = str.replaceAll("(?=\\b\\w+:)", "\n");
(?=\b\w+:) is a positive lookahead that matches every zero-length position that is followed by \b\w+: (without including \b\w+: in the actual match). \b is a word boundary, \w+ is one or more word characters (alphanumeric characters and underscores, equivalent to [a-zA-Z0-9_]), and : is a literal colon. We replace every zero-length match of this regex with a newline, \n.
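Here is a minimal, self-contained sketch of the same idea, assuming the code is Java (the replaceAll call suggests it). The class name SpecFormatter and the helper splitFields are made up for illustration. Note that inside a Java string literal each regex backslash has to be doubled.

```java
public class SpecFormatter {
    // Zero-width lookahead: matches the empty position right before a "word:" label.
    // In a Java string literal the regex \b and \w are written as \\b and \\w.
    private static final String BEFORE_LABEL = "(?=\\b\\w+:)";

    // Hypothetical helper: inserts a newline before every label such as "CPU:" or "RAM:".
    static String splitFields(String specs) {
        return specs.replaceAll(BEFORE_LABEL, "\n");
    }

    public static void main(String[] args) {
        // Sample line taken from the question.
        String specs = "Minimum system requirements CPU: Celeron M 420 1.6GHz, Sempron 3100+ "
                + "RAM: 1 GB VGA: GeForce 205, Radeon HD 6320 "
                + "OS: Windows XP/Windows Vista/Windows 7/Windows 8 HDD: 4 GB "
                + "Sound: DirectX compatible sound card DirectX: 9.0c";

        // Each label now starts on its own line.
        System.out.println(splitFields(specs));
    }
}
```

Because the pattern keys on the colon rather than on specific names, it should not matter whether a line says CPU or Processor, RAM or Memory. Two things to watch for: the space that preceded each label stays at the end of the previous line, and a multi-word label (for example "Sound card:") would only be split before its last word.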