New array based on one property in existing object array

I’m trying to figure out the cleanest way of using the string-similarity library in NodeJS with the 2 arrays used in my project.

The first is an array of objects that look something like this:

{
    eventName: "Some event name",
    tournamentName: "US Open",
    city: "New York"
}

The second array contains objects that look slightly different, for example:

{
    eventName: "Some event name",
    temperature: "28",
    spectators: "15000"
}

What I’m trying to do is build something that iterates through the first array and finds the closest matching event name in the second array, based of course ONLY on the eventName property using the “string-similarity” NodeJS library.

The below method works really well:

stringSimilarity.findBestMatch(eventName, arrayOfEventNames)

But of course the 2nd parameter requires an array consisting only of event names. I don’t have that. I have an array consisting of objects. It’s true that one of the properties of these objects is the event name, so what I’m trying to figure out is the best way to pass that into this function. I built the below function (calling it inside forEach on the first array), which takes in the name of the event I want to search for and the second array of objects, and then creates a new temporary array inside it of ONLY the event names. Then I have the 2 inputs I need to call the stringSimilarity.findBestMatch method.

function findIndexOfMatchingEvent(eventName, arrayToCompareAgainst) {
    let onlyEventNames = [];
    
    arrayToCompareAgainst.forEach(e => {
        onlyEventNames.push(e.eventName);
    });
    
    if (arrayToCompareAgainst.length !== onlyEventNames.length) {
        throw new Error("List of events array length doesn't match event names array length!");
    }
    
    const bestMatch = stringSimilarity.findBestMatch(eventName, onlyEventNames);
    const bestMatchEventName = bestMatch.bestMatch.target;
    const bestMatchAccuracyRating = bestMatch.bestMatch.rating;

    const index = arrayToCompareAgainst.findIndex(e => {
        return e.eventName === bestMatchEventName;
    });

    if (index === -1) {
        throw new Error("Could not find matched event in original event list array");
    } else if (bestMatchAccuracyRating >= 0.40) {
        return index;
    }
}

This works but it feels very wrong to me. I’m creating this new temporary array so many times. If my first array has 200 objects, then for each of those I’m calling my custom function which is then creating this temporary array (onlyEventNames) 200 times as well. And even worse, it’s not really connected to the original array in any way, which is why I’m then using .findIndex to go back and find which object inside the array the found event refers to.

Would really appreciate some feedback/advice on this one. Thanks in advance!

Answer

In my earlier answer I misunderstood the question.

There’s no need to recreate the array of event names for each entry in the other array you want to compare. Create the array of event names once, then reuse that array when looping through the other array’s entries. You can create the array of event names the way you did in findIndexOfMatchingEvent, but the more idiomatic way is with map.

Assuming these arrays:

const firstArray = [
    {
        eventName: "Some event name",
        tournamentName: "US Open",
        city: "New York"
    },
    // ...
];
const secondArray = [
    {
        eventName: "Some event name",
        temperature: "28",
        spectators: "15000"
    },
    // ...
];

Then you can do this:

const onlyEventNames = secondArray.map(e => e.eventName);
let bestResult;
let bestRating = 0;
for (const {eventName} of firstArray) {
    const result = stringSimilarity.findBestMatch(eventName, onlyEventNames);
    // findBestMatch reports the rating on `result.bestMatch`, not on `result` itself
    if (!bestResult || bestRating < result.bestMatch.rating) {
        // Better match
        bestResult = secondArray[result.bestMatchIndex];
        bestRating = result.bestMatch.rating;
    }
}
if (bestRating >= 0.4) {
    // Use `bestResult`
}

When done with the loop, bestResult will be the object from the second array with the single highest rating across all the events in the first array, and bestRating will be that rating. (That assumes there are entries in the arrays. If there are no entries in firstArray, bestResult will be undefined and bestRating will be 0; if there aren’t any in the second array, I don’t know what findBestMatch returns [or if it throws].)
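If what you need is a match for every entry in firstArray rather than one overall best, the same build-once idea could be sketched roughly like this. matchEvents and standInFindBestMatch are names made up for this sketch, and the sample data is invented. The stand-in only mimics the result shape ({ bestMatch: { target, rating }, bestMatchIndex }) with a crude word-overlap score so the snippet runs without the package; in real code you would pass stringSimilarity.findBestMatch instead. The empty-array guard is there because I’m not relying on how findBestMatch behaves with no candidates.

```javascript
// Hypothetical stand-in with the same result shape as
// stringSimilarity.findBestMatch, so this sketch runs without the package.
// It scores by shared whole words, which is NOT the library's algorithm.
function standInFindBestMatch(main, targets) {
    const words = s => new Set(s.toLowerCase().split(/\s+/));
    const mainWords = words(main);
    const ratings = targets.map(target => {
        const targetWords = words(target);
        let shared = 0;
        for (const w of mainWords) {
            if (targetWords.has(w)) shared++;
        }
        // Overlap score between 0 and 1
        return { target, rating: (2 * shared) / (mainWords.size + targetWords.size) };
    });
    let bestMatchIndex = 0;
    ratings.forEach((r, i) => {
        if (r.rating > ratings[bestMatchIndex].rating) bestMatchIndex = i;
    });
    return { ratings, bestMatch: ratings[bestMatchIndex], bestMatchIndex };
}

// For each entry in `first`, look up the best-matching entry in `second`.
// `match` is null when the rating is below `minRating`.
function matchEvents(first, second, findBestMatch, minRating = 0.4) {
    const names = second.map(e => e.eventName); // built once, reused for every entry
    if (names.length === 0) {
        // Nothing to compare against; don't call findBestMatch at all
        return first.map(entry => ({ entry, match: null, rating: 0 }));
    }
    return first.map(entry => {
        const { bestMatch, bestMatchIndex } = findBestMatch(entry.eventName, names);
        return {
            entry,
            // Same index in `names` and `second`, so no findIndex is needed
            match: bestMatch.rating >= minRating ? second[bestMatchIndex] : null,
            rating: bestMatch.rating
        };
    });
}

// Invented sample data
const first = [
    { eventName: "US Open final", tournamentName: "US Open", city: "New York" },
    { eventName: "Wimbledon semi", tournamentName: "Wimbledon", city: "London" }
];
const second = [
    { eventName: "Wimbledon semi final", temperature: "21", spectators: "15000" },
    { eventName: "US Open final", temperature: "28", spectators: "23000" }
];

const pairs = matchEvents(first, second, standInFindBestMatch);
for (const { entry, match, rating } of pairs) {
    console.log(entry.eventName, "->", match ? match.eventName : "(no match)", rating);
}
```

Passing the matcher in as a parameter is just a convenience for the sketch; calling stringSimilarity.findBestMatch directly inside matchEvents would work the same way.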

About your specific concerns:

I’m creating this new temporary array so many times.

Yes, that’s definitely not ideal (though with 200 elements, it’s really not a big problem). That’s why in the above I create it only once and reuse it.

…it’s not really connected to the original array in any way…

It is: by index. You know for sure that if the match was found at index 2 of onlyEventNames, that match is for index 2 of secondArray. In the code above I grab the entry using the index returned by findBestMatch.
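To make the index point concrete, here is a tiny sketch with invented data. It doesn’t use string-similarity at all; indexOf just stands in for whatever lookup hands you a position in onlyEventNames.

```javascript
// Invented sample data
const secondArray = [
    { eventName: "Event A", temperature: "28" },
    { eventName: "Event B", temperature: "19" },
    { eventName: "Event C", temperature: "23" }
];

// map keeps order and length, so position i in onlyEventNames
// always corresponds to position i in secondArray
const onlyEventNames = secondArray.map(e => e.eventName);

const i = onlyEventNames.indexOf("Event B"); // stand-in for a match lookup
const entry = secondArray[i];

console.log(i, entry.temperature);
```

That’s why there’s no need to search secondArray again with findIndex once you have the index of the match.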