How to extract all integers from a string and store them in an int array in Java

I want to take a string, isolate all the integers in it, and store them in an array.

The input string will only ever contain letters a-z (both upper and lower case), digits 0-9 and “-” (read as a minus sign).

So far I’ve written:

String str = readString();

String[] arr = str.split("[^0-9-]+");

If my input string is, for example, “15abc-59abc31abc100”, the code above works fine and I just need to convert each element of the string array to int. However, if my input string has no letters between the numbers to separate them, it won’t work properly. Example: “abc59-12abc56-10abc10” produces an array with only 3 elements: 59-12, 56-10, 10. How do I make it recognize the minus sign as the start of a new element in the array without losing the sign itself?

Ideally I want the input “abc59-12abc56-10abc10” to look like this after the split:

String[] arr = {"59","-12","56","-10","10"};

The readString() method will always provide the type of string I described above, by the way.

Answer

What you really want here is that - is a valid ‘number symbol’, but only at the start of any given number block.

However, by attempting to match the negative space in between them, you’ve made it hard on yourself: That is tricky to put in terms of regexes.

But, had you gone the positive route (write a regexp that describes a number), that’d be trivial: Pattern.compile("-?\\d+") describes it perfectly: an optional minus sign followed by 1 or more digits. Simple enough. That’ll even ‘work’ if your input is “aa--5”, which has one matching sequence (-5).

So.. do that, then. You’re abusing split here. Don’t abuse systems, it tends to go badly once you make things even a tiny little bit more complicated.

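import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// An optional minus sign followed by one or more digits.
// The backslash must be doubled inside a Java string literal.
private static final Pattern NUMBER = Pattern.compile("-?\\d+");

public List<Integer> getNumbers(String in) {
  Matcher m = NUMBER.matcher(in);
  var out = new ArrayList<Integer>();
  // Each find() advances to the next match; group(0) is the whole match.
  while (m.find()) out.add(Integer.parseInt(m.group(0)));
  return out;
}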

A few tricks are being used here:

  • m.find() finds the next subsequence in the input string that matches the provided regexp.
  • Regexps are ‘greedy’ by default. Meaning, the string “1234” can be interpreted equally legally in many ways: Is that a single item (1234)? Or several shorter ones? Just the ‘1’ also matches “-?\d+”, after all. ‘Greedy’ means that regexps will match the longest sequence they can, which is exactly what you want, no doubt.
  • m.group(0) gets you the match. 0 is a special group comprising the entire found sequence. If you use parentheses in regexes, you make groups, and you can get those too, e.g. if you want to exclude the minus sign you could have done "-?(\\d+)" – note the parentheses. Now you can do m.group(1).
  • Note that if you must have an int[], converting a list of integers to int[] requires a loop (or a stream with mapToInt). toArray can’t do it: Integer is not int, and List<int> is, for now, still illegal Java.
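
If an int[] is what you need in the end, the last point could be sketched roughly like this. It is only an illustration: the class name NumberExtractor and the method toIntArray are made up for this example and are not part of the code above, and it assumes every number fits in an int (Integer.parseInt throws NumberFormatException otherwise).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption made for this sketch.
class NumberExtractor {
    // An optional minus sign followed by one or more digits.
    private static final Pattern NUMBER = Pattern.compile("-?\\d+");

    // Collects every match as an Integer, same idea as getNumbers above.
    static List<Integer> getNumbers(String in) {
        Matcher m = NUMBER.matcher(in);
        List<Integer> out = new ArrayList<>();
        while (m.find()) out.add(Integer.parseInt(m.group(0)));
        return out;
    }

    // Hypothetical method: unboxes the list into an int[] with an explicit loop.
    static int[] toIntArray(List<Integer> numbers) {
        int[] result = new int[numbers.size()];
        for (int i = 0; i < result.length; i++) result[i] = numbers.get(i);
        return result;
    }

    public static void main(String[] args) {
        int[] arr = toIntArray(getNumbers("abc59-12abc56-10abc10"));
        System.out.println(Arrays.toString(arr)); // prints [59, -12, 56, -10, 10]
    }
}
```

If you prefer streams over the loop, numbers.stream().mapToInt(Integer::intValue).toArray() should give the same int[].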