Code Bites GS

So... Let's talk about Google Sheets. Years ago when I was Director of Internet Technologies for Carl Kartcher Enterprises, I had to learn Visual Basic (Viz-you-all Bzz-Kick) for Excel. It was a lot of no fun for a lot of effort, and really, after building UI for a bunch of smaller apps thre was really no point in any of it.

Come to 2019 (almost 22 years later) and now we have wonderful things like AI and Machine Learning and Google Docs and CSS! Why would anyone learn anything, when almost everythig is smart enough to fix it's own problems? Because, is why. Sometimes you need a thing to do more than look pretty—it needs to be smart. Google Sheets has UNIQUE for getting rid of duplicates, but to list duplicate items, you have to make the hoops to jump through.

Engineers only check off the boxes they are given.

M. Rasamoto

We are going to start with a problem my wife had. She was given an Excel spreadsheet with ~90,000 emails on it. These emails were in columns A and B I'm guessing it was all part of some gross data-dump, I didn't ask. She needed to know how many of those emails were duplicate, and then she needed to populate another column with the duplicate emails so she could sort them using voodoo and chicanery.

Her new $$$ work laptop could NOT process that many rows, so I advised her to load the EXCEL into Google and let Sheets make the Cloud do the work. Which it did.

Holy shit, if you don't know what I'm talking about, I understand. None of these words make sense if you're not familiar with Google Docs. The Sentence reads like garbage. "Sheets", "Cloud" yeah, it's all inside-baseball nonsence.

So the first thing we did was Google the issue, and every goddamn nutjob and his sister posted a video or tutorial on how to highlight a duplicate using custom formatting. Which is pretty, but a helpful as a box of broken hammers. So with a little more jiggering, we find form post after forum post on how to find duplicate files to delete them. Surly, now we can all agree that this is proof that god is indeed dead.

At this time, my poor wife has been asleep for an hour, and I'm still clawing through Google like an idiot on a mission, because... well (the answer's in the riddle). Then, all of the sudden - Eureka.

I have no idea how I found it, but there it was, some kind of "the madman from San Fran", Finn Smith, wrote a plug-in script that actually LISTS out duplicate entries. Then another link pops up on how to non-destrctivly combine two columns into an array (technically, he [Skyyang] does it using three columns, because he is a show off.)


function NOT_UNIQUE(range) {
  
  var allValues = '';
  var duplicates = [];
 
  for (i in range) {
    var thisRow = range[i];
    var thisRowString = thisRow.toString();
    if (allValues.indexOf(thisRowString) == -1) {
      allValues = allValues + ' ' + thisRowString;
    } else {
      var duplicatesString = duplicates.toString();
      if (duplicatesString.indexOf(thisRowString) == -1) {
        duplicates.push(thisRow);
      }
    }
  }
  return duplicates;
    
    /*
Find all the duplacates in Column A: 
    =NOT_UNIQUE(A:A)
Find all the duplicates in Columns A & B:
    =NOT_UNIQUE(FILTER({A:A;B:B}, LEN({A:A;B:B})))
    */
}

Yup, that's it... a lot of talk. I could have just given you the code and an example, but you could have also just scrolled down. But you didn't, and now you are reading this.