Hello,
I have a text paragraph and a String[] of StopWords. I have to compare
each word of the paragraph with the StopWords array; if a word in the
paragraph does not match any stop word, that word should be pushed into
a Vector. To compare each word of the text paragraph, I have put the
words into a String[] like this:
String[] arrAbstractText = txtAbstract.split("\\ ");
txtAbstract is the text paragraph below.
******************************Here is the
txtAbstract****************************************
Abstract : A comparative transcriptome analysis for successive stages
of Arabidopsis developmental leaf senescence (NS),
darkening-induced senescence of individual leaves attached to the plant
(DIS) and senescence in dark-incubated detached
leaves (DET) revealed many novel senescence-associated genes with
distinct expression profiles. The three senescence
processes share a high number of regulated genes, although the overall
number of regulated genes during DIS and DET is about
two times lower than during NS. Consequently, the number of NS-specific
genes is much higher than of DIS- or DET-specific
genes. The expression profiles of transporters, receptor like kinases,
autophagy genes and hormone pathways were analysed in
detail. The Arabidopsis transporters and other integral membrane
proteins were systematically re-classified based on the
Transporter Classification System. Coordinate activation or
inactivation of several genes is observed in some transporter
families in all three or only in individual senescence types,
indicating differences in the genetic programs for
remobilization of catabolites. Characteristic senescence type-specific
differences were also apparent in the expression
profiles of (putative) signaling kinases. For eight hormones the
expression of biosynthesis, metabolism, signaling and
(partially) response genes was investigated. In most pathways novel
senescence-associated genes were identified. The
expression profiles of hormone homeostasis and signaling genes reveal
additional players in the senescence regulatory
network.
*****************************************************************************************************
After splitting with split("\\ "), the above paragraph becomes an
array, and when I debug, the value of "arrAbstractText" looks like
this:
***********************************************arrAbstractText
array*********************************
[A, comparative, transcriptome, analysis, for, successive, stages, of,
Arabidopsis, developmental, leaf, senescence, (NS),, darkening-induced,
senescence, of, individual, leaves, attached, to, the, plant, (DIS),
and, senescence, in, dark-incubated, detached, leaves, (DET), revealed,
many, novel, senescence-associated, genes, with, distinct, expression,
profiles., The, three, senescence, processes, share, a, high, number,
of, regulated, genes,, although, the, overall, number, of, regulated,
genes, during, DIS, and, DET, is, about, two, times, lower, than,
during, NS., Consequently,, the, number, of, NS-specific, genes, is,
much, higher, than, of, DIS-, or, DET-specific, genes., The,
expression, profiles, of, transporters,, receptor, like, kinases,,
autophagy, genes, and, hormone, pathways, were, analysed, in, detail.,
The, Arabidopsis, transporters, and, other, integral, membrane,
proteins, were, systematically, re-classified, based, on, the,
Transporter, Classification, System., Coordinate, activation, or,
inactivation, of, several, genes, is, observed, in, some, transporter,
families, in, all, three, or, only, in, individual, senescence, types,,
indicating, differences, in, the, genetic, programs, for,
remobilization, of, catabolites., Characteristic, senescence,
type-specific, differences, were, also, apparent, in, the, expression,
profiles, of, (putative), signaling, kinases., For, eight, hormones,
the, expression, of, biosynthesis,, metabolism,, signaling, and,
(partially), response, genes, was, investigated., In, most, pathways,
novel, senescence-associated, genes, were, identified., The,
expression, profiles, of, hormone, homeostasis, and, signaling, genes,
reveal, additional, players, in, the, senescence, regulatory, network.]
********************************************************************************************************
The StopWords array looks like this when I debug the code:
**************************************************StopWords
array***********************************
[a, a's, able, about, above, according, accordingly, across, actually,
after, afterwards, again, against, ain't, all, allow, allows, almost,
alone, along, already, also, although, always, am, among, amongst, an,
and, another, any, anybody, anyhow, anyone, anything, anyway, anyways,
anywhere, apart, appear, appreciate, appropriate, Approximately, are,
aren't, around, as, aside, ask, asking, associated, at, available,
away, awfully, b, be, became, because, become, becomes, becoming, been,
before, beforehand, behind, being, believe, below, beside, besides,
best, better, between, beyond, both, brief, but, by, c, c'mon, c's,
came, can, can't, cannot, cant, cause, causes, certain, certainly,
changes, clearly, co, com, come, comes, concerning, conditions,,
consequently, consider, considering, contain, containing, contains,
corresponding, could, couldn't, course, currently, d, definitely,
described, despite, did, didn't, different, do, does, doesn't, doing,
don't, done, down, downwards, during, e, each, edu, eg, eight, either,
else, elsewhere, enough, entirely, especially, et, etc, even, ever,
every, everybody, everyone, everything, everywhere, ex, exactly,
example, except, f, far, few, fifth, first, five, followed, followin,
follows, for, former, formerly, forth, four, from, further,
furthermore, g, get, gets, getting, given, gives, go, goes, going,
gone, got, gotten, greetings, h, had, hadn't, happens, hardly, has,
hasn't, have, haven't, having, he, he's, hello, help, hence, her, here,
here's, hereafter, hereby, herein, hereupon, hers, herself, hi, him,
himself, his, hither, hopefully, how, howbeit, however, i, i'd, i'll,
i'm, i've, ie, if, ignored, immediate, in, inasmuch, inc, indeed,
indicate, indicated, indicates, inner, insofar, instead, into, inward,
is, isn't, it, it'd, it'll, it's, its, itself, j, just, k, keep, keeps,
kept, know, knows, known, l, last, lately, later, latter, latterly,
least, less, lest, let, let's, like, liked, likely, little, look,
looking, looks, ltd, m, mainly, many, may, maybe, me, mean, meanwhile,
merely, might, more, moreover, most, mostly, much, must, my, myself, n,
name, namely, nd, near, nearly, necessary, need, needs, neither, never,
nevertheless, new, next, nine, no, nobody, non, none, noone, nor,
normally, not, nothing, novel, now, nowhere, o, obviously, of, off,
often, oh, ok, okay, old, on, once, one, ones, only, onto, or, other,
others, otherwise, ought, our, ours, ourselves, out, outside, over,
overall, own, p, particular, particularly, per, perhaps, placed,
please, plus, possess, possible, presumably, probably, provides, q,
que, quite, qv, r, rather, rd, re, really, reasonably, regarding,
regardless, regards, relatively, respectively, right, s, said, same,
saw, say, saying, says, second, secondly, see, seeing, seem, seemed,
seeming, seems, seen, self, selves, sensible, sent, serious, seriously,
seven, several, shall, she, should, shouldn't, since, six, so, some,
somebody, somehow, someone, something, sometime, sometimes, somewhat,
somewhere, soon, sorry, specified, specify, specifying, still, sub,
such, sup, sure, t, t's, take, taken, tell, tends, th, than, thank,
thanks, thanx, that, that's, thats, the, The, their, theirs, them,
themselves, then, thence, there, there's, thereafter, thereby,
therefore, therein, theres, thereupon, these, they, they'd, they'll,
they're, they've, think, third, this, thorough, thoroughly, those,
though, three, through, throughout, thru, thus, to, together, too,
took, toward, towards, tried, tries, truly, try, trying, twice, two, u,
un, under, unfortunately, unless, unlikely, until, unto, up, upon, us,
use, used, useful, uses, using, usually, uucp, v, value, various, very,
via, viz, vs, w, want, wants, was, wasn't, way, we, we'd, we'll, we're,
we've, welcome, well, went, were, weren't, what, what's, whatever,
when, whence, whenever, where, where's, whereafter, whereas, whereby,
wherein, whereupon, wherever, whether, which, while, whither, who,
who's, whoever, whole, whom, whose, why, will, willing, wish, with,
within, without, won't, wonder, would, would, wouldn't, x, y, yes, yet,
you, you'd, you'll, you're, you've, your, yours, yourself, yourselves,
z, zero, -, %, !, @, #, $, ^, &, *, (, ), +, =, ,, ., /, ?, <, >, ~, `]
*********************************************************************************************************
and the code I wrote to compare and filter the words is
*********************************************************************************************************
String[] arrAbstractText = txtAbstract.split("\\ ");
boolean match = false;
for (int k = 0; k < arrAbstractText.length; k++) {
    for (int l = 0; l < stopWords.length; l++) {
        s = String.valueOf(arrAbstractText[k]).trim();
        if (s.length() > 0 && stopWords[l].trim().equals(s.toLowerCase())) {
            match = true;
        }
    }
    if (!match) {
        vFWords.add(arrAbstractText[k].toString().toLowerCase());
        System.out.println("Words do not match :" +
                arrAbstractText[k].toLowerCase().trim());
    }
}
*********************************************************************************************************
I am not sure if I am doing it right in the above code snippet, but I
am missing a lot of words while comparing the text.
Here is a set of words that it is supposed to return:
"Arabidopsis"
"senescence"
"proteins" and many words like this.
Could someone please help me with this? It's highly appreciated, as I am
close to the deadline for my project.
thanks
-L
Rhino - 29 Apr 2006 23:51 GMT
> Hello,
> I have a text paragraph and a String[] of StopWords. Now I will have
[quoted text clipped - 179 lines]
> Could some one please help me with this? Its higly appreciated as I am
> close to the dead line to my project.
Have you ever read the "trail" (chapter) on Collections in the Java
Tutorial? If not, I think you should have a good look at it. You should find
several techniques in there that will help you do what you want. The "trail"
starts here: http://java.sun.com/docs/books/tutorial/collections/index.html.
Several of the topics should be quite helpful to you, even if they don't do
_exactly_ what you are doing, particularly "Set Interface Bulk Operations"
in "The Set Interface" and "Multimaps" in "The Map Interface".
--
Rhino
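
As a rough illustration of the Set-based approach the Collections trail describes, the sketch below loads the stop words into a HashSet and removes them from the word list with the bulk operation removeAll(). The class name StopWordSetFilter and the helper normalize() are hypothetical, not from the original post. Stripping punctuation from both ends of each token is an added assumption, made so that tokens such as "Consequently," or "(DIS)" can match plain stop words.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordSetFilter {

    // Hypothetical helper: strips non-letter, non-digit characters from both
    // ends of a token and lower-cases it, so "(NS)," becomes "ns".
    static String normalize(String token) {
        return token.replaceAll("^[^\\p{L}\\p{N}]+|[^\\p{L}\\p{N}]+$", "")
                    .toLowerCase(Locale.ROOT);
    }

    // Returns the normalized words of text that are not stop words.
    static List<String> filter(String text, String[] stopWords) {
        Set<String> stop = new HashSet<String>();
        for (String w : stopWords) {
            stop.add(w.trim().toLowerCase(Locale.ROOT));
        }
        List<String> words = new ArrayList<String>();
        for (String token : text.split("\\s+")) {
            String word = normalize(token);
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        // Bulk operation: drop every element that is contained in the stop set.
        words.removeAll(stop);
        return words;
    }

    public static void main(String[] args) {
        String[] stopWords = {"consequently", "the", "of", "is"};
        System.out.println(filter(
                "Consequently, the number of NS-specific genes (DIS) is higher.",
                stopWords));
    }
}
```

With the sample input in main(), this should print [number, ns-specific, genes, dis, higher]. Lookups in a HashSet avoid scanning the whole stop-word array for every word of the paragraph.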
learner9 - 30 Apr 2006 03:58 GMT
Hello Rhino,
Sure, I will definitely go through the link you provided. I am kind of a
newbie, and any useful links will be helpful to me. By the way, I
solved the problem.
thanks for the reply,
-L
Trung Chinh Nguyen - 29 Apr 2006 23:58 GMT
> String[] arrAbstractText = txtAbstract.split("\\ ");
> boolean match = false;
[quoted text clipped - 10 lines]
> }
> }
I think this line
> boolean match = false;
should be put inside the first loop instead? Otherwise, once one word
matches a stop word, match stays true and every later word is skipped.
Also, it might be better to use equalsIgnoreCase() instead of equals().
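
A minimal sketch of the two suggestions above, assuming the same nested-loop structure as the posted code: match is reset for every word, and the comparison uses equalsIgnoreCase(). The class name StopWordLoopFix and the method filter() are hypothetical. Unlike the posted code, this version skips empty tokens and returns a List rather than filling the Vector vFWords.

```java
import java.util.ArrayList;
import java.util.List;

public class StopWordLoopFix {

    // Returns the words of text that match no entry of stopWords,
    // compared case-insensitively.
    static List<String> filter(String text, String[] stopWords) {
        List<String> kept = new ArrayList<String>();
        String[] words = text.split(" ");
        for (int k = 0; k < words.length; k++) {
            String s = words[k].trim();
            if (s.length() == 0) {
                continue; // skip empty tokens produced by consecutive spaces
            }
            boolean match = false; // reset for every word, not once before the loops
            for (int l = 0; l < stopWords.length; l++) {
                if (stopWords[l].trim().equalsIgnoreCase(s)) {
                    match = true;
                    break; // no need to check the remaining stop words
                }
            }
            if (!match) {
                kept.add(s.toLowerCase());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String[] stopWords = {"a", "of", "the", "The", "for"};
        System.out.println(filter("The senescence of Arabidopsis leaves", stopWords));
    }
}
```

With the sample input in main(), this should print [senescence, arabidopsis, leaves]. Words with attached punctuation, such as "genes," in the posted array, would still not match a stop word here, because the tokens are compared as they come out of split().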
learner9 - 30 Apr 2006 01:34 GMT
Hey, it works, thanks for the heads-up. I was kind of lost and
wondering where I had made a mistake :)
Thanks once again. By the way, do you have any clue how I can print a
variable or text in bold using System.out.println()?
For instance
System.out.println("this is bold text");
How do I print that in bold?
-L
Chris Uppal - 30 Apr 2006 10:45 GMT
> By the way you got any clue how do I print a
> variable or text in bold using System.out.println.()?
There is no easy way to do it, so you might as well give up on the idea.
(It /can/ be done if you happen to know exactly where your code will be running
and exactly which -- if any -- escape sequences cause the console or console
window to change mode, but you really don't want to be messing around with that
kind of stuff.)
-- chris
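
For readers who do know where their code will run, here is a sketch of the escape-sequence approach mentioned above. It assumes a terminal that honours ANSI SGR escape sequences (ESC [ 1 m for bold, ESC [ 0 m to reset); many Unix terminal emulators do, while a console without that support will print the raw characters instead. The class name BoldConsole and the helper bold() are hypothetical.

```java
public class BoldConsole {

    // ANSI SGR sequences; they only take effect on terminals that interpret them.
    static final String BOLD = "\u001B[1m";
    static final String RESET = "\u001B[0m";

    // Wraps text so that an ANSI-aware terminal renders it in bold.
    static String bold(String text) {
        return BOLD + text + RESET;
    }

    public static void main(String[] args) {
        System.out.println(bold("this is bold text"));
        System.out.println("this is normal text");
    }
}
```

The reset sequence matters: without it, the terminal stays in bold mode for everything printed afterwards.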