Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / General / March 2007

Tip: Looking for answers? Try searching our database.

Algorithm for performing a rollup

Thread view: 
Chris - 18 Mar 2007 00:11 GMT
This is a really basic question for an experienced programmer to ask,
but does anyone have a preferred algorithm for doing a rollup of items
in a list? I find myself writing ugly code over and over again to do this.

As a simple example, let's say we have an array of strings, sorted, and
we want to get list of the unique strings along with the count for each.
Sort of the way you do with a SQL "group by" clause:

input:
{"A", "A", "A", "B", "B", "C", "D", "D"}

output:
"A", 3
"B", 2
"C", 1
"D", 2

This is what I usually do:
***********************************************
String [] array = {"A", "A", "A", "B", "B", "C", "D", "D"};
   
String str = null;
String prevStr = null;
int count = 1;

for (int i = 0; i < array.length; i++) {
    str = array[i];

    if (str.equals(prevStr)) {
        count++;
    } else {
        if (prevStr != null) {
            System.out.println(prevStr + " " + count);
            count = 1;
        }
        prevStr = str;
    }
}

// catch the last one
if (str != null) {
    System.out.println(prevStr + " " + count);
}
*************************************************

This is just plain ugly. I really hate the fact that I have to dump the
results of the rollup in two places.

To get rid of "catch the last one", I could try something that peeks at
the next element to see if it marks the start of a new run. The trouble
is, this peek can sometimes be expensive (depending what getting the
next element entails), and sometimes it's impossible. It can be
impossible if you're using an Iterator, because calling iterator.next()
burns the element.

Anybody got a better way?
Arne Vajhøj - 18 Mar 2007 00:23 GMT
> This is a really basic question for an experienced programmer to ask,
> but does anyone have a preferred algorithm for doing a rollup of items
[quoted text clipped - 43 lines]
> This is just plain ugly. I really hate the fact that I have to dump the
> results of the rollup in two places.

You could store string and count in an ArrayList and print at the end.

Arne
Arne Vajhøj - 18 Mar 2007 00:24 GMT
> You could store string and count in an ArrayList and print at the end.

If you used HashMap the array did not need to be sorted (but you may
need to sort keys when printing).

Arne
Stefan Ram - 18 Mar 2007 00:58 GMT
>input: {"A", "A", "A", "B", "B", "C", "D", "D"} output:
>"A", 3
>"B", 2
>"C", 1
>"D", 2

 I have not studied your code, I just took the first approach
 which would come to my mind:

public class Main
{ public static void main( final java.lang.String[] args )
 { final java.lang.String source[] =
   { "A", "A", "A", "B", "B", "C", "D", "D", null };
   for( int i = 0; i < source.length - 1; )
   { int j = i;
     while( source[ i ].equals( source[ j ]))++j;
     java.lang.System.out.println( "\"" + source[ i ]+ "\", " +( j - i ));
     i = j; }}}

 Your code does not fulfill the output-specification,
 because you do not write the text enclosed in double quotes.

 The »null« is called a »sentinel«. Search for »sentinel« in

http://www.oberon.ethz.ch/WirthPubl/AD.pdf
http://www.oberon.ethz.ch/WirthPubl/ProgInOberon.pdf

 However, without the sentinel only

     while( source[ i ].equals( source[ j ] ))++j;

 would have to be slightly extended to

     while( j < source.length && source[ i ].equals( source[ j ]))++j;

 and the »- 1« would have to be removed.
Chris - 18 Mar 2007 04:23 GMT
>> input: {"A", "A", "A", "B", "B", "C", "D", "D"} output:
>> "A", 3
[quoted text clipped - 14 lines]
>       java.lang.System.out.println( "\"" + source[ i ]+ "\", " +( j - i ));
>       i = j; }}}

Thank you. That is more elegant. In cases where a sentinel is possible,
it can work. It does do a peek() at the next element, though, which
won't work in many cases in my code.
Stefan Ram - 18 Mar 2007 06:01 GMT
>Thank you. That is more elegant. In cases where a sentinel is possible,
>it can work. It does do a peek() at the next element, though, which
>won't work in many cases in my code.

 OK, I remove the sentinel and the peek:

public class Main
{
 final static java.lang.String source[] =
 { "A", "A", "A", "B", "B", "C", "D", "D" };
 static int position = 0;
 static java.lang.String getNext(){ return source[ position++ ]; }
 static boolean hasNext(){ return position < source.length; }

 public static void main( final java.lang.String[] args )
 { java.lang.String first = null;
   java.lang.String next = null;
   while( true )
   { if( next != null ){ first = next; next = null; }
     else if( hasNext() )first = getNext(); else break;
     int count = 1;
     while( true )if( !hasNext() ){ next = null; break; }
     else if( first.equals( next = getNext() ))++count; else break;
     java.lang.System.out.println( "\"" + first + "\", " + count ); }}}
Lew - 18 Mar 2007 16:18 GMT
>> Thank you. That is more elegant. In cases where a sentinel is possible,
>> it can work. It does do a peek() at the next element, though, which
[quoted text clipped - 20 lines]
>       else if( first.equals( next = getNext() ))++count; else break;
>       java.lang.System.out.println( "\"" + first + "\", " + count ); }}}

This is so complicated compared to Arne's suggestion of using a Map.

-- Lew
Lew - 18 Mar 2007 16:19 GMT
> public class Main
> {
[quoted text clipped - 14 lines]
>       else if( first.equals( next = getNext() ))++count; else break;
>       java.lang.System.out.println( "\"" + first + "\", " + count ); }}}

Consider using indentation to make your code readable, as per the Sun Java
Coding Conventions extant since 1999.

-- Lew
Chris - 18 Mar 2007 19:08 GMT
>> Thank you. That is more elegant. In cases where a sentinel is possible,
>> it can work. It does do a peek() at the next element, though, which
[quoted text clipped - 20 lines]
>       else if( first.equals( next = getNext() ))++count; else break;
>       java.lang.System.out.println( "\"" + first + "\", " + count ); }}}

This is very clever. An outer and an inner loop, with essentially a
two-element pipeline. (It is difficult to follow; I had to reformat it
to understand it).

My guess that there is a way to simplify the code a bit and get a good
general algorithm.
Stefan Ram - 18 Mar 2007 19:46 GMT
>My guess that there is a way to simplify the code a bit and get
>a good general algorithm.

 I do not like the »while( true ){ ... break; ... }«-style
 myself, but sometimes it is the last ressort, when the loop
 control logic is too complex to fit into the parentheses.

 The loop control with some embedded side-effects was put into
 the body of the loop:

/* begin of the control section of the outer loop */
while( true )
{ if( next != null ){ first = next; next = null; }
 else if( hasNext() )first = getNext(); else break;
 /* end of the control section of the outer loop */

 /* begin of the actual body of the outer loop */
 int count = 1;

       /* begin of the control section of the inner loop */
       while( true )if( !hasNext() ){ next = null; break; }
       else if( !first.equals( next = getNext() ))break; else
       /* end of the control section of the inner loop */

       /* begin of the actual body of the inner loop */
       ++count;
       /* end of the actual body of the inner loop */

 java.lang.System.out.println( "\"" + first + "\", " + count );
 /* end of the actual body of the outer loop */ }

 The problem reminds me of the question how to structure a read
 loop, where the user has to redo his input if it is not valid
 without repeating the input statement »i = input()« within the
 source code, as in:

i = input(); while( invalid( i )){ i = input(); }

 This appears in languages without a »do{ ... }while( ...)«-loop
 or when additional constraints are given.

 It can be solved in the same way, using a »dirty« »while( true
 ){ ... break; ... }« style.

while( true ){ i = input(); if( valid( i ))break; }

 In fact, there was a time when such questions were research
 topics, namely, if I recall correct, Donald E. Knuth wrote an
 article about this input-loop in the 1960ies. Or, possibly, it
 was Niklaus Wirth, who wrote this.

>I had to reformat it to understand it

 Feel free not to read the following rationale for my
 formatting style. I append it just for people who enjoy
 reading about formatting styles.

 One Way to Format Parentheses

 There are several different ways to format texts with braces
 and parentheses. One of them is being described here.

 Indentation within Braces

 An indentation of just one space often is too small to be seen
 clearly, because the natural width and form of characters
 often varies by an amount that is not very much smaller than a
 space. Therefore, the indentation should amount to at least
 two positions. In order not to waste horizontal spaces, an
 indentation of exactly two positions  is chosen. This means,
 that the left position of the next level is two larger than
 the position of the directly enclosing level.

 Indentation by two positions within a block

{ ++x;
 ++x; }
^ ^
0 2

 Bad A small indentation by one position is not always visible
 clearly

{++x;
++x; }

 Good The indentation by two positions is visible clearly

{ ++x;
 ++x; }

 Bad A large indentation by more than two positions wastes
 horizontal space with no additional benefit

{    ++x;
    ++x; }

Spaces within braces

 In mathematics, there are often no spaces at the inner side of
 parentheses or braces in expressions, but spaces are used
 indeed at the inner side of braces in set notation, when the
 braces contain a description (not when they contain a list).
 Spaces in set notation

{ x | x  > 2 }

 This style is adopted here: One space is written at the inner
 side of braces.

 Spaces at the inner side of parentheses within a block

{ ++x; }

 This style is consistent with the indentation by two
 positions, because only using this style, corresponding parts
 of two lines have the same position.

 Bad No space after the first brace, the two statements are
 misaligned

{++x;
 ++x; }

 Good One space after the first brace, the two statements are
 properly aligned

{ ++x;
 ++x; }

 Bad Two spaces after the first brace, the two statements are
 misaligned

{  ++x;
 ++x; }

 There are some exceptions to this rule: No spaces are used
 within empty braces "{}" and between two or more closing
 braces of the same direction "}}", except, when the first one
 of them is part of an empty pair "{} }" (an empty pair of
 braces if treated like a single non-braces character).
 Unified rules for all Brackets

 For simplicity and uniformity, the rules from above apply to
 all kinds of brackets, including parentheses, braces (curly
 brackets), square brackets, and angle brackets.

 Spaces within parentheses and square brackets

{ y = f( x )+ g() + a[ 2 ]; }

 Binary operators are sorrounded by a space, but the space is
 omitted, when there already is a space on the other side of a
 sequence of bracket directly beside the operator: By this rule
 " )+" is written instead of " ) +".

 Representation of the Syntactical Structure

 A method declaration in Java consists of a head and a body.
 The following representation shows this structure:

 Good formatting according to the structure

void alpha() // head
{ beta(); }  // body

 The following formatting is misleading, because the line break
 does not match the structural break:

 Bad line break within the body

void alpha() { // head and the beginning of the body
 beta(); }    // the rest of the body

 This formatting also would make no sense for blocks within
 blocks. So it is often not used for such blocks. Therefore
 even the adopters of this style can not use it uniformly.

 Left Braces Look Like "bullets"

 There is a well known style to publish lists in typography
 using bullets sticking out on the left, looking like this:
 Common list representation with bullets in typography

o This is the first point
 of this list, it is written
 here just as an example.

o Here is another entry

o This is another example given
 just as an example to show
 an example

 The braces of the beginnings of blocks stand out on the left
 just the same, when the formatting being described here is
 used, so they look quite naturally as beginning-of-a-block
 markers, when one is used to the typographical list notation:

Left braces look like bullets to mark blocks

{ printf(); printf();
 printf(); printf(); printf();
 printf(); printf();   }

{ printf(); printf(); }

{ printf(); printf(); printf();
 printf(); printf();
 printf(); }

 Neutrality

 Someone wrote this C code:

 Code someone wrote

while( fgets( eingabe, sizeof eingabe, stdin ))
 if( sscanf( eingabe, "%d", &wert )!= 1 )
   fprintf( stderr, "Please enter a number!\n" );
 else
   summe += wert;

 It amazes me that I can add braces by my style conventions
 without the need to change the position of any character of
 the given code:

 The code from above plus braces

while( fgets( eingabe, sizeof eingabe, stdin ))
{ if( sscanf( eingabe, "%d", &wert )!= 1 )
 { fprintf( stderr, "Please enter a number!\n" ); }
 else
 { summe += wert; }}
Lew - 18 Mar 2007 20:14 GMT
>> My guess that there is a way to simplify the code a bit and get
>> a good general algorithm.
[quoted text clipped - 230 lines]
>   else
>   { summe += wert; }}

There are two conventional styles for brace placement in Java, dating back to
C days. The first, recommended by Sun since 1999, is to place the opening
brace at the end of the control structure line, and the closing brace to the
indent position of the control line, on its own line. The statements in the
controlled block are indented one notch further, two spaces by your
convention. Thus:

  while ( condition ) {
    doSomething();
  }

The other variant is to place the opening brace on its own line, indented the
same as the control line:

  while ( condition )
  {
    doSomething();
  }

Placing the braces on the same line as contained block statements makes the
source structure harder to discern.

I recommend that you follow one of these two conventions with respect to
braces, since that will align you with the Java subculture.

It's a good idea to familiarize yourself with Sun's guidelines, whether you
choose to follow them or not (at your peril):
<http://java.sun.com/docs/codeconv/>

-- Lew
Stefan Ram - 18 Mar 2007 22:33 GMT
>I do not like the »while( true ){ ... break; ... }«-style
>myself, but sometimes it is the last ressort, when the loop
>control logic is too complex to fit into the parentheses.

 The next version:

class Source
implements java.util.Iterator<java.lang.String>
{ private final java.lang.String source[] =
 { "A", "A", "A", "B", "B", "C", "D", "D" };
 private int position;
 public Source()
 { this.position = 0; }
 public java.lang.String next(){ return source[ position++ ]; }
 public boolean hasNext(){ return position < source.length; }
 public void remove(){ throw new java.lang.UnsupportedOperationException(); }}

/** For an Iterator<E> object delivering components, allow to count how many
components appear in a sequence of consecutive equal components.
@param E the type of the components */

class Counter<E>
{
 /** Initialize a new Counter object for a source.
 @param source an iterator delivering each component of the source in turn */

 public Counter( final java.util.Iterator<E> source )
 { first = null;
   next = null;
   this.source = source; }

 /** Advance the Counter object to the beginning of the next
 sequence of consecutive equal components.
 A true result indicates success.
 This might be called directly after object creation.
 Between a call of this and the next call there
 must be exactly one call of "count". */

 public boolean advance()
 { final boolean advanced;
   if( next != null )
   { first = next; next = null; advanced = true; }
   else if( source.hasNext() )
   { first = source.next(); advanced = true; }
   else advanced = false;
   return advanced; }

 /** After a succesful "advance",
 count the number of consecutive equal source components.
 This must be called exactly once after a successful call of "advance". */

 public int count()
 { int count = 1; while( this.isExtendable() )++count;
   return count; }

 /** The current value of the component, valid after a successful
 advance operation */

 public E value()
 { return first; }

 /** Is the next component equal to the first?
 side-effect: advance the (perceived) position to the next component if it is
 equal to the first */
 private boolean isExtendable()
 { final boolean result;
   if( !source.hasNext() ){ next = null; result = false; }
   else result = first.equals( next = source.next() );
   return result; }

 /** The first component in a sequence of consecutive
 equal source components */
 private E first;

 /** The next component as just read from the source */
 private E next;

 /** The component source for this Counter */
 private final java.util.Iterator<E> source; }

public class Main
{
 public static void main( final java.lang.String[] args )
 { Source source = new Source();
   Counter<java.lang.String> counter = new Counter<java.lang.String>( source );
   while( counter.advance() )
   { java.lang.System.out.println
     ( "\"" + counter.value() + "\", " + counter.count() ); }}}
Stefan Ram - 19 Mar 2007 01:26 GMT
>Between a call of this and the next call there
>must be exactly one call of "count".

 This burden placed on the client has been removed in the
 following version, where the client is free to call »value«
 and »count« an arbitrary number of times and in an arbitrary
 order after each successful »advance«.

class Source
implements java.util.Iterator<java.lang.String>
{ private final java.lang.String source[] =
 { "A", "A", "A", "B", "B", "C", "D", "D" };
 private int position;
 public Source()
 { this.position = 0; }
 public java.lang.String next(){ return source[ position++ ]; }
 public boolean hasNext(){ return position < source.length; }
 public void remove(){ throw new java.lang.UnsupportedOperationException(); }}

/** For an Iterator<E> object delivering components, allow to count how many
components appear in a sequence of consecutive equal components.
@param E the type of the components */

class Counter<E>
{
 /** Initialize a new Counter object for a source.
 @param source an iterator delivering each component of the source in turn */

 public Counter( final java.util.Iterator<E> source )
 { this.first = null;
   this.next = null;
   this.count = -1;
   this.source = source; }

 /** Advance the Counter object to the beginning of the next
 sequence of consecutive equal components.
 A true result indicates success. */

 public boolean advance()
 { this.first = null;
   this.count = -1;
   final boolean advanced;
   if( next != null )
   { first = next; next = null; advanced = true; }
   else if( source.hasNext() )
   { first = source.next(); advanced = true; }
   else advanced = false;
   if( advanced )doCount();
   return advanced; }

 /** The current value of the component, valid after a successful
 advance operation */

 public E value()
 { return first; }

 /** The current value of the counter, valid after a successful
 advance operation */

 public int count()
 { return count; }

 /** count the number of consecutive equal source components */
 private void doCount()
 { count = 1; while( this.isExtendable() )++count; }

 /** Is the next component equal to the first?
 side-effect: advance the (perceived) position to the next component if it is
 equal to the first */
 private boolean isExtendable()
 { final boolean result;
   if( !source.hasNext() ){ next = null; result = false; }
   else result = first.equals( next = source.next() );
   return result; }

 /** The first component in a sequence of consecutive
 equal source components */
 private E first;

 /** The next component as just read from the source */
 private E next;

 /** The component source for this Counter */
 private final java.util.Iterator<E> source;

 /** The last count */
 private int count = -1; }

public class Main
{
 public static void main( final java.lang.String[] args )
 { Source source = new Source();
   Counter<java.lang.String> counter = new Counter<java.lang.String>( source );
   while( counter.advance() )
   { java.lang.System.out.println
     ( "\"" + counter.value() + "\", " + counter.count() ); }}}
Red Orchid - 18 Mar 2007 01:51 GMT
Chris <spam_me_not@goaway.com> wrote or quoted in
Message-ID: <45fc7346$0$7452$9a6e19ea@news.newshosting.com>:

> input:
> {"A", "A", "A", "B", "B", "C", "D", "D"}
[quoted text clipped - 18 lines]
> This is just plain ugly. I really hate the fact that I have to dump the
> results of the rollup in two places.

I propose the following code:

<code 0>
void dumpArrayDuplicateInfo(String [] a) {
   int dcount = 1;
   for (int i = 0; i < a.length - 1; i++) {
       if (a[i].equals(a[i + 1])) {
           dcount++;
           continue;
       }
       print(a[i], dcount);
       dcount = 1;
   }
   if (a.length != 0) {
       print(a[a.length - 1], dcount);
   }
}
   
void print(String e, int dcount) {       
   System.out.println(e + " " + dcount);
}
</code 0>

But, if you hate dumping in two places, maybe, the following
code is possible:

<code 1>
void dumpArrayInfo(String [] arr) {
   String[] a = new String[arr.length + 1];
   System.arraycopy(arr, 0, a, 0, arr.length);

   int dcount = 1;
   for (int i = 0; i < a.length - 1; i++) {
       if (a[i].equals(a[i + 1])) {
           dcount++;
           continue;
       }
       System.out.println(e + " " + dcount);
       dcount = 1;
   }
}
</code 1>
Chris - 18 Mar 2007 04:26 GMT
> Chris <spam_me_not@goaway.com> wrote or quoted in
> Message-ID: <45fc7346$0$7452$9a6e19ea@news.newshosting.com>:
[quoted text clipped - 64 lines]
> }
> </code 1>

Thanks. Similar to the code from Stefan above. Works when you can get at
elements in the array cheaply, but doesn't work when a peek ahead is
expensive or when you're iterating over elements using an Iterator.
Lew - 18 Mar 2007 05:10 GMT
>> Chris <spam_me_not@goaway.com> wrote or quoted in
>> Message-ID: <45fc7346$0$7452$9a6e19ea@news.newshosting.com>:
[quoted text clipped - 7 lines]
>>> "C", 1
>>> "D", 2

I'd push things into Collections.

public class Foo
{

  public static void main( String [] args )
  {
    List< String > starters =
      Arrays.asList( "A", "A", "A", "B", "B", "C", "D", "D" );

    Map< String, Integer > kounts = new HashMap< String, Integer >();
    for( String key : starters )
    {
      Integer k = kounts.get( key );
      kounts.put( key, Integer.valueOf(k == null? 1 : k + 1 ));
    }

    List< String > outters = new ArrayList< String >();
    outters.addAll( kounts.keySet() );
    Collections.sort( outters );

    for ( String key : outters )
    {
      System.out.println( "\""+ key + "\", "+ kounts.get( key ));
    }

  }

}

-- Lew
Chris - 18 Mar 2007 18:44 GMT
>>> Chris <spam_me_not@goaway.com> wrote or quoted in
>>> Message-ID: <45fc7346$0$7452$9a6e19ea@news.newshosting.com>:
[quoted text clipped - 37 lines]
>
> }

Thanks. This works, and has the advantage that it doesn't require the
input to be in order. The downside is that it isn't scalable. It needs
to store every unique element in memory. When you're processing very
large data sets, memory fills quickly.
Lew - 18 Mar 2007 18:51 GMT
> Thanks. This works, and has the advantage that it doesn't require the
> input to be in order. The downside is that it isn't scalable. It needs
> to store every unique element in memory. When you're processing very
> large data sets, memory fills quickly.

Don't all the other solutions shown so far in this discussion require
maintaining all the elements in memory? It's a rhetorical question: of course
they do.

How is this different?

If your problem space is too large for memory, use an RDBMS. Get the report
you want with a SQL SELECT value, count(value) FROM somewhere GROUP BY value
ORDER BY value.

-- Lew
Patricia Shanahan - 18 Mar 2007 18:58 GMT
>> Thanks. This works, and has the advantage that it doesn't require the
>> input to be in order. The downside is that it isn't scalable. It needs
[quoted text clipped - 6 lines]
>
> How is this different?

Remember the original problem statement started with sorted data. It
could be coming from a disk file.

Patricia
Lew - 18 Mar 2007 19:20 GMT
>>> Thanks. This works, and has the advantage that it doesn't require the
>>> input to be in order. The downside is that it isn't scalable. It
[quoted text clipped - 9 lines]
> Remember the original problem statement started with sorted data. It
> could be coming from a disk file.

Here is the original statement of the inputs:
> As a simple example, let's say we have an array of strings, sorted, and we want to get list of the unique strings along with the count for each. Sort of the way you do with a SQL "group by" clause:
>
> input:
> {"A", "A", "A", "B", "B", "C", "D", "D"}

This puts all the elements in memory at once. The OP then went on to show code
using an array kept entire in memory.

-- Lew
Chris - 19 Mar 2007 00:15 GMT
>> Remember the original problem statement started with sorted data. It
>> could be coming from a disk file.
[quoted text clipped - 9 lines]
> This puts all the elements in memory at once. The OP then went on to
> show code using an array kept entire in memory.

For the input, that was just an example. For the output, no, my code did
not store everything in memory. It streamed out the results as they were
discovered.
Chris Uppal - 18 Mar 2007 20:34 GMT
Law wrote:

> Don't all the other solutions shown so far in this discussion require
> maintaining all the elements in memory?

No.

> It's a rhetorical question: of
> course they do.

Oh no, they don't !

   -- chris
Lew - 18 Mar 2007 21:26 GMT
> Law wrote:
>
[quoted text clipped - 7 lines]
>
> Oh no, they don't !

OK, so the use of arrays in every example cited was not the use of in-memory
constructs.

-- Lew
Lew - 18 Mar 2007 21:32 GMT
> Law wrote:
>
[quoted text clipped - 7 lines]
>
> Oh no, they don't !

Oh, I see. I was misled by the fact that every example in this thread did
maintain all the elements in memory, starting with the OP's.

The "lookahead" algorithm does not require all elements to be in memory.

-- Lew
Lew - 18 Mar 2007 05:03 GMT
Chris <spam_me_not@goaway.com> wrote:

>> input:
>> {"A", "A", "A", "B", "B", "C", "D", "D"}
[quoted text clipped - 4 lines]
>> "C", 1
>> "D", 2

Arne gave the hint with the Map idea. Here's a version of how I would attempt
such a thing:

First off, let's put the Strings into a List so we can go all Collections.

public class Foo
{

  public static void main( String [] args )
  {
    List< String > starters =
        Arrays.asList( "A", "A", "A", "B", "B", "C", "D", "D" );

    Map< String, Integer > kounts = new HashMap();
    for( String key : starters )
    {
      Integer k = kounts.get( key );
      if ( k == null )
      {
        kounts.put( key, 1 );
      }
      else
      {
        kounts.put( key, k + 1 );
      }
    }

    List< String > outters = new ArrayList< String >();
    outters.addAll( kounts.keySet() );
    Collections.sort( outters );

    for ( String key : outters )
    {
      System.out.println( "\""+ key + "\", "+ kounts.get( key ));
    }
  }
}

-- Lew
Chris Uppal - 18 Mar 2007 02:58 GMT
> This is a really basic question for an experienced programmer to ask,
> but does anyone have a preferred algorithm for doing a rollup of items
> in a list? I find myself writing ugly code over and over again to do this.

"writing ugly code over and over again" -- that's a /big/ mistake.  Even if the
code were as beautiful as the day, it would still be a mistake that a halfway
decent programmer should not make.

It seems as if a utility method which creates a LinkedHashMap<X, Integer> would
do the trick quite nicely.   Alternatively, if you don't want the expense of
creating an extra Collection, you could set up a little helper class as in the
attached code; with it you can say:

   for (Group<String> g : Group.groups(args))
       System.out.println(g);

where (in this example) "args" is a List<String> or String[] array.

Code follows.  All sorts of elaborations are possible.

BTW, it looks a bit long, but it's really very short -- almost no real code,
just lots of Java-ish padding...

   -- chris

=========== Group.java ===========
import java.util.*;

public class Group<X>
{
     private final X m_value;
     private final int m_count;

     public // ctor
     Group(X value, int count)
     {
           m_value = value;
           m_count = count;
     }

     public X
     getValue()
     {
           return m_value;
     }

     public int
     getCount()
     {
           return m_count;
     }

     public String
     toString()
     {
           return m_count + " * " + m_value;
     }

     public static <T>
     GroupingIteratable<T>
     groups(List<T> list)
     {
           return new GroupingIteratable<T>(list);
     }

     public static <T>
     GroupingIteratable<T>
     groups(T... elements)
     {
           return new GroupingIteratable<T>(elements);
     }
}
======= GroupingIteratable.java =======
import java.util.*;

public class GroupingIteratable<T>
implements Iterable<Group<T>>
{
private final List<T> m_list;

public // ctor
GroupingIteratable(List<T> list)
{
 m_list = list;
}

public // ctor
GroupingIteratable(T... elements)
{
 this(Arrays.asList(elements));
}

public Iterator<Group<T>>
iterator()
{
           return new GroupingIterator<T>(m_list.iterator());
     }
}
======= GroupingIterator.java =======
import java.util.*;

public class GroupingIterator<T>
implements Iterator<Group<T>>
{
     private final Iterator<T> m_iterator;
     private boolean m_hasReadAhead;
     private T m_readAhead;

     public // ctor
     GroupingIterator(Iterator<T> iterator)
     {
           m_iterator = iterator;
           m_hasReadAhead = false;
     }

     public boolean
     hasNext()
     {
           return m_hasReadAhead || m_iterator.hasNext();
     }

     public Group<T>
     next()
     {
           T current = m_hasReadAhead
                                         ? m_readAhead
                                         : m_iterator.next();
           m_hasReadAhead = false;
           m_readAhead = null;
           int count = 1;

           while (m_iterator.hasNext())
           {
                 T next = m_iterator.next();
                 if (!current.equals(next))
                 {
                       m_readAhead = next;
                       m_hasReadAhead = true;
                       break;
                 }

                 count++;
           }

           return new Group<T>(current, count);
     }

     public void
     remove()
     {
           throw new UnsupportedOperationException();
     }
}
===============================


Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.