The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Wednesday, February 01, 2012

Not quite full service

We took pity on Edvard this month. We invited him for cake.

[You scream, I scream, we all scream for ice cream]

Sadly, there was no ice cream.


“You mean there are worse programmers than rabid howler monkeys on crack?”

I broke The Protocol Stack From Hell™. Again. It's a common occurrence whenever I attempt to run a load test (nominally against our own code, but it has to run through The Protocol Stack From Hell™ and, well, The Protocol Stack From Hell™ just tends to crumple). It's not fatal, just a severe annoyance at having to restart everything and hope it all comes back up.

I talked to R about this, seeing how he has over twenty years of experience with telephony protocols. I mentioned just how bad The Protocol Stack From Hell™ is, and asked if there was anything better.

I was informed that most of the major telephony players, like AT&T, wrote their own stack, but there do exist two commercial offerings, one being The Protocol Stack From Hell™ that I keep going on and on about. The other one …

R said that the other one is not only more expensive, but it's worse!

The stack we're using, the one written by rabid howler monkeys on crack, is the better of the two.

[My head asplode]

99 ways to program a hex, Part 24: more lookup tables

So we went from a character encoding specific version to a character encoding agnostic version to today's version: another character encoding specific version (ASCII, to be exact). But today's version also eliminates a branch point in the code, using a 256-element string to pick which character to display as part of the hexadecimal dump.

/*************************************************************************
*
* Copyright 2012 by Sean Conner.  All Rights Reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
*
* Comments, questions and criticisms can be sent to: sean@conman.org
*
*************************************************************************/

/* Style: C89, const correctness, assertive, system calls, full buffering */
/*	  lookup tables */

#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <assert.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

#define LINESIZE	16

/********************************************************************/

extern const char *sys_errlist[];
extern int         sys_nerr;

static void	do_dump		(const int,const int);
static size_t	dump_line	(char **const,unsigned char *,size_t,const unsigned long);
static void	hexout		(char *const,unsigned long,size_t,const int);
static void	myperror	(const char *const);
static size_t	myread		(const int,char *,size_t);
static void	mywrite		(const int,const char *const,const size_t);

/********************************************************************/

int main(const int argc,const char *const argv[])
{
  if (argc == 1)
    do_dump(STDIN_FILENO,STDOUT_FILENO);
  else
  {
    int i;
    
    for (i = 1 ; i < argc ; i++)
    {
      int fhin;
      
      fhin = open(argv[i],O_RDONLY);
      if (fhin == -1)
      {
        myperror(argv[i]);
        continue;
      }
      
      mywrite(STDOUT_FILENO,"-----",5);
      mywrite(STDOUT_FILENO,argv[i],strlen(argv[i]));
      mywrite(STDOUT_FILENO,"-----\n",6);
      
      do_dump(fhin,STDOUT_FILENO);
      if (close(fhin) < 0)
        myperror(argv[i]);
    }
  }
  
  return EXIT_SUCCESS;
}
      
/************************************************************************/     

static void do_dump(const int fhin,const int fhout)
{
  unsigned char  buffer[4096];
  char           outbuffer[75 * 109];
  char          *pout;
  unsigned long  off;
  size_t         bytes;
  size_t         count;
  
  assert(fhin  >= 0);
  assert(fhout >= 0);

  memset(outbuffer,' ',sizeof(outbuffer));
  off      = 0;
  count    = 0;
  pout     = outbuffer;
  
  while((bytes = myread(fhin,(char *)buffer,sizeof(buffer))) > 0)
  {
    unsigned char *p = buffer;
    
    for (p = buffer ; bytes > 0 ; )
    {
      size_t amount;
      
      amount    = dump_line(&pout,p,bytes,off);
      p        += amount;
      bytes    -= amount;
      off      += amount;
      count++;
      
      if (count == 109)
      {
        mywrite(fhout,outbuffer,(size_t)(pout - outbuffer));
        memset(outbuffer,' ',sizeof(outbuffer));
        count    = 0;
        pout     = outbuffer;
      }      
    }
  }
  
  if ((size_t)(pout - outbuffer) > 0)
    mywrite(fhout,outbuffer,(size_t)(pout - outbuffer));
}

/********************************************************************/

static size_t dump_line(
	char                **const pline,
	unsigned char              *p,
	size_t                      bytes,
	const unsigned long         off
)
{
  char   *line;
  char   *dh;
  char   *da;
  size_t  count;
  
  assert(pline  != NULL);
  assert(*pline != NULL);
  assert(p      != NULL);
  assert(bytes  >  0);
  
  line = *pline;
  
  hexout(line,off,8,':');
  if (bytes > LINESIZE)
    bytes = LINESIZE;
  
  p  += bytes;
  dh  = &line[10 + bytes * 3];
  da  = &line[58 + bytes];
  
  for (count = 0 ; count < bytes ; count++)
  {
    p  --;
    da --;
    dh -= 3;
    
    *da = "................................ !\"#$%&'()*+,-./0123456789:;<=>?"
	"@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~."
	"................................................................"
	"........................................................"
	"........"[*p];
    
    hexout(dh,(unsigned long)*p,2,' ');
  }
  
  line[58 + count] = '\n';
  *pline = &line[59 + count];
  return count;
}

/**********************************************************************/  

static void hexout(char *const dest,unsigned long value,size_t size,const int padding)
{
  assert(dest != NULL);
  assert(size >  0);
  assert((padding >= ' ') && (padding <= '~'));
  
  dest[size] = padding;
  while(size--)
  {
    dest[size] = "0123456789ABCDEF"[value & 0x0f];
    value >>= 4;
  }
}

/************************************************************************/

static void myperror(const char *const s)
{
  int err = errno;
  
  assert(s != NULL);
  
  mywrite(STDERR_FILENO,s,strlen(s));
  mywrite(STDERR_FILENO,": ",2);
  
  if ((err < 0) || (err >= sys_nerr))	/* valid indices are 0 .. sys_nerr - 1 */
    mywrite(STDERR_FILENO,"(unknown)",9);
  else
    mywrite(STDERR_FILENO,sys_errlist[err],strlen(sys_errlist[err]));
  mywrite(STDERR_FILENO,"\n",1);
}

/************************************************************************/

static size_t myread(const int fh,char *buf,size_t size)
{
  size_t amount = 0;
  
  assert(fh   >= 0);
  assert(buf  != NULL);
  assert(size >  0);
  
  while(size > 0)
  {
    ssize_t bytes;
    
    bytes = read(fh,buf,size);
    if (bytes < 0)
    {
      myperror("read()");
      exit(EXIT_FAILURE);
    }
    if (bytes == 0)
      break;
    
    amount += bytes;
    size   -= bytes;
    buf    += bytes;
  }
  
  return amount;
}

/*********************************************************************/  
  
static void mywrite(const int fh,const char *const msg,const size_t size)
{
  assert(fh   >= 0);
  assert(msg  != NULL);
  assert(size >  0);
  
  if (write(fh,msg,size) < (ssize_t)size)
  {
    if (fh != STDERR_FILENO)
      myperror("output");
      
    exit(EXIT_FAILURE);
  }
}

/***********************************************************************/

And it is faster:

[spc]lucy:~/projects/99/src>time ./23 ~/bin/firefox/libxul.so >/dev/null

real    0m0.258s
user    0m0.247s
sys     0m0.011s
[spc]lucy:~/projects/99/src>time ./24 ~/bin/firefox/libxul.so >/dev/null

real    0m0.186s
user    0m0.178s
sys     0m0.008s

Only about 1.4 times faster, but it is faster.

The conversion string is fixed, but that doesn't preclude a port to, say, an EBCDIC system from using a different one, or the string being constructed at run time. Runtime generation would be more portable, but to me, that's time spent generating a string that will always be the same (and frankly, if we're using this hack for speed, it defeats the purpose).
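For the curious, runtime construction might look something like the following. This is only a minimal sketch: it assumes the C library's `isprint()` (under the current locale) is an acceptable definition of "displayable," and `display_table`, `build_display_table()` and `display_for()` are names I made up for illustration; none of them appear in the program above.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

/* Hypothetical table: one display character per possible byte value. */
static char display_table[UCHAR_MAX + 1];

/* Fill the table once at startup, asking the C library (rather than */
/* a hard-coded ASCII string) which byte values are printable.       */
static void build_display_table(void)
{
  int c;

  for (c = 0 ; c <= UCHAR_MAX ; c++)
    display_table[c] = isprint(c) ? (char)c : '.';
}

/* Branch-free lookup, the same idea as indexing the fixed string. */
static char display_for(const int c)
{
  assert(c >= 0);
  assert(c <= UCHAR_MAX);
  return display_table[c];
}
```

Call `build_display_table()` once before the first dump; after that, `display_for(*p)` stands in for indexing the string literal. The cost is a 256-iteration loop at startup, which is exactly the wasted time I was complaining about.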

Perhaps better would be several such strings (ASCII, EBCDIC, Baudot, PETSCII), with a command line option to select which one to use (defaulting to whatever character set is native for the platform the program is running on). It could be a useful thing.

But such a modification I'm leaving as an exercise for the reader.

Now, is this the fastest version possible? I'm not going to say yes this time. There might be something else that could be done to wring that last bit of performance out of this code, but at this point, I am definitely done with wringing out the speed.

I think.

Thursday, February 02, 2012

99 ways to program a hex, Part 25: C♯

Jeff Cuscutis sent in a version written in C♯ (C-sharp, in case you don't have a font with the sharp symbol). He assured me the code works, but I can't test it as I don't use Microsoft Windows; nor have I installed Mono, as I don't really have a need to interoperate with the Microsoft Windows environment (at home, or at The Corporation).

// *************************************************************************
//
// Copyright 2012 by Jeff Cuscutis.  All Rights Reserved.
//
// This program is free software; you can redistribute it and/or
// modify it under the terms of the GNU General Public License
// as published by the Free Software Foundation; either version 2
// of the License, or (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
// GNU General Public License for more details.
// 
// You should have received a copy of the GNU General Public License
// along with this program; if not, write to the Free Software
// Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
//
// Comments, questions and criticisms can be sent to: sean@conman.org
//
// ***********************************************************************

// C#

using System;
using System.IO;
using System.Text;

namespace Hex
{
    class Program
    {
        static void Main(string[] args)
        {
            if (args.Length == 0)
            {
                DoDump(Console.In, Console.Out);
            }
            else
            {
                foreach (var fileName in args)
                {
                    try
                    {
                        using (var file = new FileStream(fileName, FileMode.Open, FileAccess.Read))
                        {
                            TextReader tr = new StreamReader(file, Encoding.ASCII);
                            Console.Out.WriteLine("-----{0}-----",fileName);
                            DoDump(tr, Console.Out);
                            file.Close();
                        }
                    }
                    catch (Exception e)
                    {
                        Console.Error.WriteLine(e.Message);
                    }
                    
                }
            }
        }

        static void DoDump(TextReader inFile, TextWriter outFile)
        {
            const int blockLength = 16;
            int actuallyRead;
            var buf = new char[blockLength];
            var offset = 0;

            while ((actuallyRead = inFile.Read(buf, 0, blockLength)) > 0)
            {
                var display = new char[blockLength+1];

                outFile.Write("{0:X8} ",offset);

                var j = 0;
                do
                {
                    outFile.Write("{0:X2} ", (byte)buf[j]);
                    if (!char.IsControl(buf[j]))
                        display[j] = buf[j];
                    else
                        display[j] = '.';
                    offset++;
                    j++;
                    actuallyRead--;
                } while ((j < blockLength) && (actuallyRead > 0));
                if (j < blockLength)
                    for (var i = j; i < blockLength; i++) outFile.Write("   ");

                // write only the characters filled in; C# strings are not NUL terminated
                outFile.WriteLine(display, 0, j);

                outFile.Flush();
            }
        }
    }
}

About the only question I have about this version is that it appears to open the input file in text mode, and the hex dump program should work on any type of file, text or binary.

Friday, February 03, 2012

99 ways to program a hex, part 26: C89, system calls and mmap()

I still stand by what I said in part 24:

“Now, is this the fastest version possible? I'm not going to say yes this time. There might be something else that could be done to wring that last bit of performance out of this code, but at this point, I am definitely done with wringing out the speed.”

That didn't prevent Dave Täht from sending in a patch to part 24 that used mmap() (a system call that does some magic to make a file suddenly appear in memory) which did better on his system, although it was a percentage gain, not an order of magnitude gain.

/*************************************************************************
*
* Copyright 2012 by Sean Conner.  All Rights Reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
*
* Comments, questions and criticisms can be sent to: sean@conman.org
*
*************************************************************************/

/* Style: C89, const correctness, assertive, system calls, full buffering */
/*	  lookup tables, mmap() */

#define _GNU_SOURCE

#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <assert.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

#define LINESIZE	16

/********************************************************************/

extern const char *sys_errlist[];
extern int         sys_nerr;

static void	do_dump		(const int,const int);
static size_t	do_dump_memory	(unsigned char *,size_t,size_t,const int);
static size_t	dump_line	(char **const,unsigned char *,size_t,const unsigned long);
static void	hexout		(char *const,unsigned long,size_t,const int);
static void	myperror	(const char *const);
static size_t	myread		(const int,char *,size_t);
static void	mywrite		(const int,const char *const,const size_t);

/********************************************************************/

int main(const int argc,const char *const argv[])
{
  if (argc == 1)
    do_dump(STDIN_FILENO,STDOUT_FILENO);
  else
  {
    int i;
    
    for (i = 1 ; i < argc ; i++)
    {
      int fhin;
      
      fhin = open(argv[i],O_RDONLY);
      if (fhin == -1)
      {
        myperror(argv[i]);
        continue;
      }
      
      mywrite(STDOUT_FILENO,"-----",5);
      mywrite(STDOUT_FILENO,argv[i],strlen(argv[i]));
      mywrite(STDOUT_FILENO,"-----\n",6);
      
      do_dump(fhin,STDOUT_FILENO);
      if (close(fhin) < 0)
        myperror(argv[i]);
    }
  }
  
  return EXIT_SUCCESS;
}
      
/************************************************************************/     

static void do_dump(const int fhin,const int fhout)
{
  struct stat info;
  
  assert(fhin  >= 0);
  assert(fhout >= 0);
  
  if (fstat(fhin,&info) < 0)
  {
    myperror("fstat()");
    return;	/* info is not valid, so there is nothing to go on */
  }
  
  if (!S_ISREG(info.st_mode))
  {
    unsigned char buffer[4096];
    size_t        bytes;
    size_t        off = 0;
    
    while((bytes = myread(fhin,(char *)buffer,sizeof(buffer))) > 0)
      off = do_dump_memory(buffer,bytes,off,fhout);
  }
  else
  {
    unsigned char *buffer;
    
    if (info.st_size == 0)	/* mmap() rejects a zero length mapping */
      return;
    
    buffer = mmap(NULL,info.st_size,PROT_READ,MAP_SHARED,fhin,0);
    if (buffer == MAP_FAILED)
    {
      myperror("mmap()");
      return;
    }
    
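    /* advice values are plain integers, not bit flags, so each needs its own call */
    if (
            (madvise(buffer,info.st_size,MADV_SEQUENTIAL) < 0)
         || (madvise(buffer,info.st_size,MADV_WILLNEED)   < 0)
       )
      myperror("madvise()");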
    
    do_dump_memory(buffer,info.st_size,0,fhout);
    munmap(buffer,info.st_size);
  }
}

/********************************************************************/

static size_t do_dump_memory(
	unsigned char *p,
	size_t         bytes,
	size_t         off,
	const int      fhout
)
{
  char    outbuffer[75 * 109];
  char   *pout;
  size_t  count;
  
  assert(p     != NULL);
  assert(fhout >= 0);
  
  memset(outbuffer,' ',sizeof(outbuffer));
  count = 0;
  pout  = outbuffer;
  
  while(bytes)
  {
    size_t amount;
    
    amount = dump_line(&pout,p,bytes,off);
    p     += amount;
    bytes -= amount;
    off   += amount;
    count++;
  
    if (count == 109)
    {
      mywrite(fhout,outbuffer,(size_t)(pout - outbuffer));
      memset(outbuffer,' ',sizeof(outbuffer));
      count = 0;
      pout  = outbuffer;
    }
  }
  
  if ((size_t)(pout - outbuffer) > 0)
    mywrite(fhout,outbuffer,(size_t)(pout - outbuffer));
  return off;
}

/******************************************************************/  

static size_t dump_line(
	char                **const pline,
	unsigned char              *p,
	size_t                      bytes,
	const unsigned long         off
)
{
  char   *line;
  char   *dh;
  char   *da;
  size_t  count;
  
  assert(pline  != NULL);
  assert(*pline != NULL);
  assert(p      != NULL);
  assert(bytes  >  0);
  
  line = *pline;
  
  hexout(line,off,8,':');
  if (bytes > LINESIZE)
    bytes = LINESIZE;
  
  p  += bytes;
  dh  = &line[10 + bytes * 3];
  da  = &line[58 + bytes];
  
  for (count = 0 ; count < bytes ; count++)
  {
    p  --;
    da --;
    dh -= 3;
    
    *da = "................................ !\"#$%&'()*+,-./0123456789:;<=>?"
	"@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~."
	"................................................................"
	"........................................................"
	"........"[*p];
    
    hexout(dh,(unsigned long)*p,2,' ');
  }
  
  line[58 + count] = '\n';
  *pline = &line[59 + count];
  return count;
}

/**********************************************************************/  

static void hexout(char *const dest,unsigned long value,size_t size,const int padding)
{
  assert(dest != NULL);
  assert(size >  0);
  assert((padding >= ' ') && (padding <= '~'));
  
  dest[size] = padding;
  while(size--)
  {
    dest[size] = "0123456789ABCDEF"[value & 0x0f];
    value >>= 4;
  }
}

/************************************************************************/

static void myperror(const char *const s)
{
  int err = errno;
  
  assert(s != NULL);
  
  mywrite(STDERR_FILENO,s,strlen(s));
  mywrite(STDERR_FILENO,": ",2);
  
  if ((err < 0) || (err >= sys_nerr))	/* valid indices are 0 .. sys_nerr - 1 */
    mywrite(STDERR_FILENO,"(unknown)",9);
  else
    mywrite(STDERR_FILENO,sys_errlist[err],strlen(sys_errlist[err]));
  mywrite(STDERR_FILENO,"\n",1);
}

/************************************************************************/

static size_t myread(const int fh,char *buf,size_t size)
{
  size_t amount = 0;
  
  assert(fh   >= 0);
  assert(buf  != NULL);
  assert(size >  0);
  
  while(size > 0)
  {
    ssize_t bytes;
    
    bytes = read(fh,buf,size);
    if (bytes < 0)
    {
      myperror("read()");
      exit(EXIT_FAILURE);
    }
    if (bytes == 0)
      break;
    
    amount += bytes;
    size   -= bytes;
    buf    += bytes;
  }
  
  return amount;
}

/*********************************************************************/  
  
static void mywrite(const int fh,const char *const msg,const size_t size)
{
  assert(fh   >= 0);
  assert(msg  != NULL);
  assert(size >  0);
  
  if (write(fh,msg,size) < (ssize_t)size)
  {
    if (fh != STDERR_FILENO)
      myperror("output");
      
    exit(EXIT_FAILURE);
  }
}

/***********************************************************************/

I tried it on my system, and saw no difference in performance whatsoever. But Dave was using a 64-bit system, and I was using a 32-bit system. Okay, there could be a difference there. I then tried it on a 64-bit system (The Corporation provided laptop, running a 64-bit version of Linux) and there was a difference, but well:

[spc]saltmine:~/source/99>time ./24 libxul.so >/dev/null

real    0m0.043s
user    0m0.030s
sys     0m0.010s
[spc]saltmine:~/source/99>time ./26 libxul.so >/dev/null

real    0m0.054s
user    0m0.040s
sys     0m0.010s

The version with mmap() is slower! It's more noticeable with a large (759,012,536 bytes) file:

[spc]saltmine:~/source/99>time ./24 largedata >/dev/null

real    0m1.682s
user    0m1.500s
sys     0m0.170s
[spc]saltmine:~/source/99>time ./26 largedata >/dev/null

real    0m1.809s
user    0m1.680s
sys     0m0.120s

Yes, time spent in the kernel goes down (understandable, since we no longer have to copy data out of the file through the kernel) but the overall time goes up. We've reached the point of diminishing returns, where the gain no longer justifies the effort. It could be that Dave's machine had more memory than mine, or a faster hard drive, or a later version of the kernel that handled mmap()/madvise() better. There are no real gains to be had at this point.

Another interesting thing is to run the time command, but with more verbose output:

[spc]saltmine:~/source/99>time -v ./24 largedata >/dev/null
        Command being timed: "./24 largedata"
        User time (seconds): 1.47
        System time (seconds): 0.22
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.69
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 1600
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 139
        Voluntary context switches: 1
        Involuntary context switches: 171
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

Here we see the code from part 24 taking 139 page faults. Now, today's version:

[spc]saltmine:~/source/99>time -v ./26 largedata >/dev/null
        Command being timed: "./26 largedata"
        User time (seconds): 1.71
        System time (seconds): 0.10
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.82
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 2965440
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 185449
        Voluntary context switches: 1
        Involuntary context switches: 182
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

The number of page faults skyrockets from 139 to 185,449, three orders of magnitude more than the previous version. The 759,012,536-byte file spans about 185,306 pages of 4,096 bytes, so that works out to roughly one fault per page. That could account for the time lost on my 64-bit system (and quite possibly points to a sub-optimal implementation of mmap()).
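A program can also count its own page faults instead of relying on time. What follows is a minimal sketch, assuming a Linux-style getrusage() that fills in the `ru_minflt` field (POSIX itself only promises the CPU time fields). `minor_faults()` and `faults_via_mmap()` are hypothetical names for illustration and are not part of any of the programs above.

```c
#include <assert.h>
#include <stddef.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <fcntl.h>
#include <unistd.h>

/* Minor page faults taken by this process so far, or -1 on error. */
static long minor_faults(void)
{
  struct rusage usage;

  if (getrusage(RUSAGE_SELF,&usage) < 0)
    return -1;
  return usage.ru_minflt;
}

/* Map the named file, read every byte, and return how many minor  */
/* faults that took (-1 on error).  The byte sum is handed back so */
/* the compiler cannot optimize the reads away.                    */
static long faults_via_mmap(const char *const fname,unsigned long *const psum)
{
  struct stat    info;
  unsigned char *p;
  unsigned long  sum = 0;
  long           before;
  long           after;
  off_t          i;
  int            fh;

  assert(fname != NULL);
  assert(psum  != NULL);

  fh = open(fname,O_RDONLY);
  if (fh < 0)
    return -1;

  if ((fstat(fh,&info) < 0) || (info.st_size == 0))
  {
    close(fh);
    return -1;
  }

  p = mmap(NULL,(size_t)info.st_size,PROT_READ,MAP_SHARED,fh,0);
  close(fh);	/* the mapping stays valid after the close */
  if (p == MAP_FAILED)
    return -1;

  before = minor_faults();
  for (i = 0 ; i < info.st_size ; i++)
    sum += p[i];
  after = minor_faults();

  munmap(p,(size_t)info.st_size);
  if ((before < 0) || (after < 0))
    return -1;

  *psum = sum;
  return after - before;
}
```

Calling `faults_via_mmap("largedata",&sum)` and printing the result would give a figure to compare against what time reports, although the exact count will depend on the kernel and how much of the file is already cached.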

Your mileage may vary, though.

Saturday, February 04, 2012

99 ways to program a hex, Part 27: C♯, binary stream

Jeff Cuscutis sent in another version of the C♯ program. He writes:

From: Jeffrey Cuscutis <XXXXXXXXXXXXXXXXXXXXXXX>
To: Sean Conner <sean@conman.org>
Subject: Re: 99 Programs
Date: Sat, 4 Feb 2012 22:51:37 -0500

Modified to use BinaryReader instead of TextReader. It sort of worked on binary files before this, but replaced unprintable characters with “?” automatically when read.

It now does this correctly, but I had to make a wrapper function to read data from the Console as that is a TextReader.

// *************************************************************************
//
// Copyright 2012 by Jeff Cuscutis.  All Rights Reserved.
//
// This program is free software; you can redistribute it and/or
// modify it under the terms of the GNU General Public License
// as published by the Free Software Foundation; either version 2
// of the License, or (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
// GNU General Public License for more details.
// 
// You should have received a copy of the GNU General Public License
// along with this program; if not, write to the Free Software
// Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
//
// Comments, questions and criticisms can be sent to: sean@conman.org
//
// ***********************************************************************

// C#, binary stream

using System;
using System.IO;

namespace Hex
{
    class Program
    {
        static void Main(string[] args)
        {
            if (args.Length == 0)
            {
                DoDump(ReadFromConsole, Console.Out);
            }
            else
            {
                foreach (var fileName in args)
                {
                    try
                    {
                        using (var file = new FileStream(fileName, FileMode.Open, FileAccess.Read))
                        {
                            var tr = new BinaryReader(file);
                            Console.Out.WriteLine("-----{0}-----",fileName);
                            DoDump(tr.Read, Console.Out);
                            file.Close();
                        }
                    }
                    catch (Exception e)
                    {
                        Console.Error.WriteLine(e.Message);
                    }
                    
                }
            }
        }

        // wrapper to fake reading from a TextReader to 
        // make it look like it is a BinaryReader
        static int ReadFromConsole(byte[] buf, int index, int count)
        {
            var charBuf = new char[count];

            int actuallyRead = Console.In.Read(charBuf, 0, count);

            // copy only what was read, honoring the caller's starting index
            for (int i = 0; i < actuallyRead; i++)
            {
                buf[index + i] = (byte)charBuf[i];
            }

            return actuallyRead;
        }

        static void DoDump(Func<byte[], int, int, int> readFunc, TextWriter outFile)
        {
            const int blockLength = 16;
            int actuallyRead;
            var buf = new byte[blockLength];
            var offset = 0;

            while ((actuallyRead = readFunc(buf, 0, blockLength)) > 0)
            {
                var display = new char[blockLength+1];

                outFile.Write("{0:X8} ",offset);

                var j = 0;
                do
                {
                    outFile.Write("{0:X2} ", buf[j]);
                    if (!char.IsControl((char)buf[j]))
                        display[j] = (char)buf[j];
                    else
                        display[j] = '.';
                    offset++;
                    j++;
                    actuallyRead--;
                } while ((j < blockLength) && (actuallyRead > 0));
                if (j < blockLength)
                    for (var i = j; i < blockLength; i++) outFile.Write("   ");

                // write only the characters filled in; C# strings are not NUL terminated
                outFile.WriteLine(display, 0, j);

                outFile.Flush();
            }
        }
    }
}

Update on Monday, February 6th, 2012

Jeff wrote me to add:

I forgot to mention that it also uses a generic signature to handle the readFunc parameter in DoDump()

Sunday, February 05, 2012

99 ways to program a hex, Part 28: K&R C, system calls, full buffering

So, how would the version based on system calls have looked in the 80s? You know, probably before the mmap() system call existed? Probably like this, vowel impairments, sorry, vwlmprmnts and all.

/*************************************************************************
*
* Copyright 2012 by Sean Conner.  All Rights Reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
*
* Comments, questions and criticisms can be sent to: sean@conman.org
*
*************************************************************************/

/* Style: K&R, system calls, full buffering */

#include <stdlib.h>
#include <string.h>
#include <errno.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

#define LINESIZE	16

/********************************************************************/

main(argc,argv)
char **argv;
{
	int i,fhin;

	if (argc == 1) {
		hexdmp(0,1);
	} else {
		for (i = 1 ; i < argc ; i++) {
			fhin = open(argv[i],O_RDONLY);
			if (fhin == -1) {
				myperr(argv[i]);
				continue;
			}

			mywrt(1,"-----",5);
			mywrt(1,argv[i],strlen(argv[i]));
			mywrt(1,"-----\n",6);
      
			hexdmp(fhin,1);
			if (close(fhin) < 0) {
				myperr(argv[i]);
			}
		}
	}

	return 0;
}

/************************************************************************/     

char buffer[4096],outbuf[75 * 109];

hexdmp(fhin,fhout)
{
	int off,bytes,count,amount;
	char *pout,*p;

	memset(outbuf,' ',sizeof(outbuf));
	off = count = 0;
	pout = outbuf;

	while((bytes = myread(fhin,(char *)buffer,sizeof(buffer))) > 0) {
		p = buffer;
		for (p = buffer ; bytes > 0 ; ) {
			amount = hexln(&pout,p,bytes,off);
			p += amount;
			bytes -= amount;
			off += amount;
			count++;

			if (count == 109) {
				mywrt(fhout,outbuf,pout - outbuf);
				memset(outbuf,' ',sizeof(outbuf));
				count = 0;
				pout = outbuf;
			}      
		}
	}

	if (pout - outbuf > 0) {
		mywrt(fhout,outbuf,pout - outbuf);
	}
}

/********************************************************************/

hexln(pline,p,bytes,off)
char **pline,*p;
{
	char *line,*dh,*da;
	int count;
  
	line = *pline;
  
	hexout(line,off,8,':');
	if (bytes > LINESIZE) {
		bytes = LINESIZE;
  	}
	
	p += bytes;
	dh = &line[10 + bytes * 3];
	da = &line[58 + bytes];

	for (count = 0 ; count < bytes ; count++) {
		p  --;
		da --;
		dh -= 3;
    
		if ((*p >= ' ') && (*p <= '~')) {
			*da = *p;
		} else {
			*da = '.';
		}

		hexout(dh,(unsigned long)*p,2,' ');
	}

	line[58 + count] = '\n';
	*pline = &line[59 + count];
	return count;
}

/**********************************************************************/  

hexout(dest,value,size,padding)
char *dest;
{
	dest[size] = padding;
	while(size--) {
		dest[size] = (char)((value & 0x0F) + '0');
		if (dest[size] > '9') {
			dest[size] += 7;
		}
		value >>= 4;
	}
}

/************************************************************************/

myperr(s)
char *s;
{
	extern char *sys_errlist[];
	extern int sys_nerr;
	int err = errno;

	mywrt(2,s,strlen(s));
	mywrt(2,": ",2);

	if (err >= sys_nerr) {
		mywrt(2,"(unknown)",9);
	} else {
		mywrt(2,sys_errlist[err],strlen(sys_errlist[err]));
	}
	mywrt(2,"\n",1);
}

/************************************************************************/

myread(fh,buf,size)
char *buf;
{
	int amount = 0,bytes;

	while(size > 0) {
		bytes = read(fh,buf,size);
		if (bytes < 0) {
			myperr("read()");
			exit(1);
		}
		if (bytes == 0) {
			break;
		}    
		amount += bytes;
		size -= bytes;
		buf += bytes;
	}
	return amount;
}

/*********************************************************************/  

mywrt(fh,msg,size)
char *msg;  
{
	if (write(fh,msg,size) < size) {
		if (fh != 2) {
			myperr("output");
		}
		exit(1);
	}
}

/***********************************************************************/

Actually, the vwlmprmnt (vowel impairment) in the code was due to linker restrictions at the time. Linkers were fairly limited, and one of the limits was the length of identifiers they could handle, around six characters (some might have handled more, but the first C standard in 1989 only guaranteed six significant characters for external identifiers, so that's probably the smallest size at the time). With only six characters (makes you wonder where that limit comes from) and vowels typically being redundant (“f y cn rd ths y t cn wrt prgrms”), is it any wonder early code was typically vwlmprd?


I can't quite put my finger on it

I can't quite shake the feeling that this commercial is ripping something off. What, I don't know … but it's something …

Anybody? Anybody?

Monday, February 06, 2012

99 ways to program a hex, Part 29: K&R, system calls, full buffering, obfuscated

I suspect that many entries in the IOCCC start out as normal code, are converted to K&R as a first step, and then have all variables and functions renamed to one- or two-character names and unneeded spaces removed.

Much like today's version.

/*************************************************************************
*
* Copyright 2012 by Sean Conner.  All Rights Reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
*
* Comments, questions and criticisms can be sent to: sean@conman.org
*
*************************************************************************/

/* Style: K&R, system calls, full buffering, obfuscated */

#include <errno.h>
#include <fcntl.h>

main(a,b)char**b;{int i,f;if(a==1)fa(0,1);else{for(i=1;i<a;i++){f=open
(b[i],O_RDONLY);if(f==-1){fd(b[i]);continue;}ff(1,"-----",5);ff(1,b[i],
strlen(b[i]));ff(1,"-----\n",6);fa(f,1);if(close(f)<0)fd(b[i]);}}return 0;}

char a[4096],b[75*109];
fa(c,d){int e,f,g,h;char*i,*p;memset(b,' ',sizeof(b));e=g=0;i=b;
while((f=fe(c,(char *)a,sizeof(a)))>0){p=a;for(p=a;f>0;){h=fb(&i,p,f,e);
p+=h;f-=h;e+=h;g++;if(g==109){ff(d,b,i-b);memset(b,' ',sizeof(b));g=0;
i=b;}}}if (i-b>0)ff(d,b,i-b);}

fb(a,p,c,d)char**a,*p;{char*e,*f,*g;int h;e=*a;fc(e,d,8,':');if(c>16)
{c=16;}p+=c;f=&e[10+c*3];g=&e[58+c];for(h=0;h<c;h++){p--;g--;f-=3;
if((*p>=' ')&&(*p<='~'))*g=*p;else*g = '.';fc(f,*p,2,' ');}e[58+h]='\n';
*a=&e[59+h];return h;}

fc(a,b,c,d)char*a;{a[c]=d;while(c--){a[c]=
(b&0x0F)+'0';if(a[c]>'9')a[c]+=7;b>>=4;}}

fd(a)char*a;{extern char*sys_errlist[];extern int sys_nerr;int b=errno;
ff(2,a,strlen(a));ff(2,": ",2);if(b>=sys_nerr){ff(2,"(unknown)",9);}else
{ff(2,sys_errlist[b],strlen(sys_errlist[b]));}ff(2,"\n",1);}

fe(a,b,c)char*b;{int d=0,e;while(c>0){e=read(a,b,c);if(e<0){fd("read()");
exit(1);}if(e==0){break;}d+=e;c-=e;b+=e;}return d;}

ff(a,b,c)char*b;{if(write(a,b,c)<c){if(a!=2){fd("output");}exit(1);}}

The sad thing—I've seen production code like this (and no, The Protocol Stack From Hell™ isn't this bad, thankfully).

Tuesday, February 07, 2012

99 ways to program a hex, Part 30: K&R, really obfuscated

And here we have a fully obfuscated version of our program—a nearly impenetrable wall of characters that nonetheless compiles and works.

/*************************************************************************
*
* Copyright 2012 by Sean Conner.  All Rights Reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
*
* Comments, questions and criticisms can be sent to: sean@conman.org
*
*************************************************************************/

/* Style: K&R, system calls, full buffering, obfuscated 2 */

#include <errno.h>
#include <fcntl.h>

main(a,b)char **b;{int i,f;if(a==1)fa(0,1);else{for(i=1;i<a;i++){f=open(b[i],
O_RDONLY);if(f==-1){fd(b[i]);continue;}ff(1,"-----",5);ff(1,b[i],strlen(b[i])
);ff(1,"-----\n",6);fa(f,1);if(close(f)<0)fd(b[i]);}}return 0;}char a[4096],b
[75*109];fa(c,d){int e,f,g,h;char*i,*p;memset(b,' ',sizeof(b));e=g=0;i=b;while
((f=fe(c,(char *)a,sizeof(a)))>0){p=a;for(p=a;f>0;){h=fb(&i,p,f,e);p+=h;f-=h;
e+=h;g++;if(g==109){ff(d,b,i-b);memset(b,' ',sizeof(b));g=0;i=b;}}}if (i-b>0)
ff(d,b,i-b);}fb(a,p,c,d)char**a,*p;{char*e,*f,*g;int h;e=*a;fc(e,d,8,':');if(
c>16){c=16;}p+=c;f=&e[10+c*3];g=&e[58+c];for(h=0;h<c;h++){p--;g--;f-=3;if((*p
>=' ')&&(*p<='~'))*g=*p;else*g = '.';fc(f,*p,2,' ');}e[58+h]='\n';*a=&e[59+h]
;return h;}fc(a,b,c,d)char*a;{a[c]=d;while(c--){a[c]=(b&0x0F)+'0';if(a[c]>'9'
)a[c]+=7;b>>=4;}}fd(a)char*a;{extern char*sys_errlist[];extern int sys_nerr;int
b=errno;ff(2,a,strlen(a));ff(2,": ",2);if(b>=sys_nerr){ff(2,"(unknown)",9);}
else{ff(2,sys_errlist[b],strlen(sys_errlist[b]));}ff(2,"\n",1);}fe(a,b,c)char
*b;{int d=0,e;while(c>0){e=read(a,b,c);if(e<0){fd("read()");exit(1);}if(e==0)
{break;}d+=e;c-=e;b+=e;}return d;}ff(a,b,c)char*b;{if(write(a,b,c)<c){if(a!=2
){fd("output");}exit(1);}}

And because it's so obfuscated, it's mercifully short as well.

Wednesday, February 08, 2012

99 ways to program a hex, Part 31, has been delayed indefinitely

Yesterday's version is the last version I'll be posting for now. When I was initially inspired, I ripped through a majority of what you've seen in just three days. It's not really surprising given that a majority of the “variations” differed by a line or two of code.

But I've run out. And now, having done 21 variations in C (one more than I originally planned), five in Lua (I could do one more in Lua—the actual original code I based the Lua versions off of, but oddly enough, it doesn't actually handle files), two in a dialect of BASIC I can't currently test and two I didn't expect in C♯ (both submitted by Jeff Cuscutis), I don't think I have it in me to do many more.

I've exhausted C, and I've pretty much exhausted Lua, my two “go to” languages these days. I could probably push out a couple of Perl versions, and a PHP version (PHP isn't nearly expressive enough, compared to Lua or even Perl, to bother with more than one version) but that's about the limit.

There are a few other languages I could do (Common Lisp, Scheme, SNOBOL (seriously!), Forth, Awk, Erlang, Python and Ruby) but those would require significant time hitting up documentation and what not because I don't know those languages all that well (if at all).

So I'll probably continue this series, but it'll probably be a post or two every few months and not every XXXXXXX day as I have been doing.

Thursday, February 16, 2012

“You don't really own your data, as much as we let you use it”

I made a comment recommending against using “the cloud” to store your data on GoogleFacePlusBook and someone took offense to that remark. I know, I know, but in my defense, we were both in the wrong, and in the end I hope we all learned something. I learned that “buying a book” is more “licensing to read” than actual ownership (even the dead tree type, and this from a lawyer I called (and if I knew his website, I would link to it here)) and the other person learned that yes, Virginia, you can successfully sue Amazon for having eaten your homework.

I still stand on my original remark, not to use “the cloud” to store your data. To present your data (like pictures, idiotic blog posts, what have you) to the public, sure, use “the cloud.” To store your data (or even a backup of your data)? Not on your life.

I do have my reasons and they range from the reasonable (it's not reliable, as even Google has bad hair days), the debatable (you have no control over your data as in the aforementioned Amazon eating your homework, sites going down with little to no notification) to the downright “wearing a tin hat in a shack in the woods” (actual remark by the other person, and here we go into government snooping through your data in “the cloud”—and if you think you are not a “person of interest” I'm sure Ted Kennedy never thought he would be on the “No Fly List”—ponder that for a while).

But it didn't occur to me that a company hosting “the cloud” could conceivably mine your own data. I mean, it's there, right? And then I read this little gem of an article:

… Target has a baby-shower registry, and Pole started there, observing how shopping habits changed as a woman approached her due date, which women on the registry had willingly disclosed. He ran test after test, analyzing the data, and before long some useful patterns emerged. …

About a year after Pole created his pregnancy-prediction model, a man walked into a Target outside Minneapolis and demanded to see the manager. He was clutching coupons that had been sent to his daughter, and he was angry, according to an employee who participated in the conversation.

“My daughter got this in the mail!” he said. “She's still in high school, and you're sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?”

The manager didn't have any idea what the man was talking about. He looked at the mailer. Sure enough, it was addressed to the man's daughter and contained advertisements for maternity clothing, nursery furniture and pictures of smiling infants. The manager apologized and then called a few days later to apologize again.

On the phone, though, the father was somewhat abashed. “I had a talk with my daughter,” he said. “It turns out there's been some activities in my house I haven't been completely aware of. She's due in August. I owe you an apology.”

Via Hacker News, How Companies Learn Your Secrets

Okay, it's not about a company mining “the cloud,” but it does illustrate just how much data we willingly (or unknowingly) give out.

Update a few minutes later

Perhaps government overreach isn't quite as “tin hat crazy” as I thought …

Tuesday, February 21, 2012

Hey, as long as you paid taxes on the income, I don't see the IRS having any issues with this …

Okay, let me see if I have the pitch right—hypothetically speaking, let's say I have made, through many illegal means, a metric buttload of money and I want to have it laundered. I don't want to mess with the banking systems as they seem to be under high scrutiny these days (besides, everybody knows they're criminals anyway).

So, I grab a few hundred pages from Wikipedia, bundle them into a “book” which I “publish and sell” via Amazon. This costs me nearly nothing. Then I “buy” as many copies of this “book” as I need, getting nearly 45% of the money back as the “author.” Amazon won't complain, as they're getting a nice chunk of change. Heck, I could even get a higher percentage of the money if the “book” is bought through an affiliate program I set up (this might push my percentage over 50%).

And who's to say this isn't going on right now?

And you know, money laundering might explain some of these Etsy sites (link via Regretsy).

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: start with the base link for this site: http://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

http://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2017 by Sean Conner. All Rights Reserved.