Wednesday, February 16, 2011

Explode (Part One)

I have always been a fan of PHP's explode (quite unlike many other parts of that language), which made me want to recreate it in C. Thinking about the problem, I came up with a non-equivalent function that I liked better and thought I should share. It takes a string and a separator character on which to split the string. The function returns a NULL-terminated array of strings that can (and has to) be freed with a single free. This is achieved by placing the pointer array in front of the resulting strings, so that everything lives in one contiguous piece of memory.
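
To make the single-free layout concrete, here is a minimal sketch of just the packing step, separate from any splitting. The helper name pack_strings and its parameters are made up for illustration; it assumes the caller already has the pieces as ordinary C strings and a C99 compiler.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the post's explode(): copies n strings
 * into one malloc'ed block laid out as [n pointers][NULL][string bytes]. */
char **pack_strings(const char *const *items, size_t n)
{
    size_t bytes = 0;
    for (size_t k = 0; k < n; k++)
        bytes += strlen(items[k]) + 1;      /* +1 for each terminator */

    /* One allocation holds the pointer table and all characters. */
    char **table = malloc((n + 1) * sizeof *table + bytes);
    if (!table)
        return NULL;

    /* Character storage starts right after the n + 1 pointers. Adding
     * n + 1 to a char ** already advances by n + 1 pointer widths. */
    char *storage = (char *)(table + n + 1);
    for (size_t k = 0; k < n; k++) {
        size_t len = strlen(items[k]) + 1;  /* include the terminator */
        memcpy(storage, items[k], len);
        table[k] = storage;
        storage += len;
    }
    table[n] = NULL;                        /* sentinel marks the end */
    return table;
}
```

Because the pointers and the characters share one allocation, calling free on the returned table releases everything at once; freeing an individual string from it would be an error.
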
While working on it I decided to collapse repeated occurrences of the separator instead of inserting empty strings, which may or may not be what you want. One could improve this in several ways: several possible separator chars, a separator of more than one char, a limit argument, and so on, none of which I need right now. I am planning to write up a couple of other possible implementations over time, discuss their respective merits, and benchmark them.
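
The collapsing rule can be illustrated on its own with a small counting sketch. The helper count_tokens is hypothetical and only counts non-empty pieces; how leading or trailing separators should be treated is a separate choice that this sketch does not settle.

```c
#include <stddef.h>

/* Hypothetical helper, not from the post: counts the non-empty pieces
 * of s when a run of consecutive separators is treated as one. */
size_t count_tokens(const char *s, char sep)
{
    size_t count = 0;
    int in_token = 0;               /* are we inside a piece right now? */

    for (; *s; s++) {
        if (*s == sep) {
            in_token = 0;           /* a separator ends the current piece */
        } else if (!in_token) {
            in_token = 1;           /* first character of a new piece */
            count++;
        }
    }
    return count;
}
```

Under this rule "a,,b" counts as two pieces, whereas PHP's explode would return three elements for the same input, with an empty string in the middle.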

#include <stdlib.h>
#include <string.h>

char **explode(const char * source, char sep){
  int i =0, j =0;
  const char * runner=NULL, *oldrunner=NULL;
  int length = strlen(source)+1;
  int rlength=0;
  char ** resArr = NULL;
  char * finalArr = NULL;

  if (!(*source)){
    return NULL;
  }
  if (! (resArr = malloc(sizeof(char *) * length)) )
    return NULL;
  memset(resArr,'\0', length*sizeof(char*));

  runner = oldrunner = source;

  while (*runner){
    if (*runner == sep){
      if ( (length = runner - oldrunner) ){
        rlength+=length+1;
        resArr[i++] = (char *) oldrunner;
      }
      oldrunner = runner+1;
    }
    runner++;
  }
  resArr[i++] = (char *) oldrunner;
  rlength += runner - oldrunner+1;

  /* this part would be optional if i malloc'ed and strcpy'ed
   * above but that way the user only needs to free once*/
  if  ( !(finalArr = realloc(resArr, sizeof(char *)*(i+1) + rlength)) ){
    free (resArr);
    return NULL;
  }
  resArr = (char **) finalArr;
  /* pointer arithmetic on a char ** already scales by sizeof(char *) */
  finalArr = (char *) (resArr + (i+1));
  j = i;

  for (i=0; i<j; i++){
    /* measure each token up to the next separator or the end of the
     * string; subtracting neighbouring pointers would also count any
     * repeated separators that were skipped above */
    const char * tok = resArr[i];
    length = 0;
    while (tok[length] && tok[length] != sep)
      length++;
    memcpy(finalArr, tok, length);
    finalArr[length] = '\0';
    resArr[i] = finalArr;
    finalArr += length+1;
  }
  return resArr;
}
