Java Forum / General / October 2006

what are the most frequently used functions?

Xah Lee - 28 Oct 2006 08:24 GMT
I had an idea today.

I wanted to know which functions are used most frequently in the
Emacs Lisp language. I thought I could write a quick script that goes
through all the elisp library locations and produces the word-frequency
report I want.

I started with a simple program:
http://xahlee.org/p/titus/count_word_frequency.py

and applied it to a Shakespeare text. Here's a sample result:
http://xahlee.org/p/titus/word_frequency.html

Then I wrote a more elaborate one that recurses through directories to
work on the elisp code treasury.

The code is here:
http://xahlee.org/x/count_word_frequency.py
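The linked script is not reproduced in this thread. As a rough sketch of the idea only, a recursive word count over a directory tree might look like the following. The function names, the `.el` suffix filter, and the letters-only definition of a "word" are assumptions made for illustration, not details taken from the script above.

```python
import os
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters, compared case-insensitively.
WORD_RE = re.compile(r"[A-Za-z]+")


def count_words(text):
    """Return a Counter of lowercased words found in text."""
    return Counter(word.lower() for word in WORD_RE.findall(text))


def count_words_in_tree(root, suffix=".el"):
    """Walk root recursively and total the word counts of matching files."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                # errors="replace" keeps the walk going on odd encodings.
                with open(path, encoding="utf-8", errors="replace") as handle:
                    totals.update(count_words(handle.read()))
    return totals
```

Calling `count_words_in_tree(some_dir).most_common(20)` would list the top twenty words. Because this counts every word in every file, doc strings and comments are included.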

and I got a strange result. The word “the” appeared at the top,
along with many other English words. I quickly realized that these come
from the doc strings of Lisp functions (not from comments).

At this point it dawned on me that there is no easy way to work around
this, unless I write the script in elisp, which has functions that
read Lisp code and can easily filter out doc strings.

Originally I planned to use the word-frequency script on Perl, Python,
and Java as well as Elisp. However, it now seems to me that this task
is nigh impossible: each of these languages has its own doc string
syntax. It would be a heavy undertaking to make the word-frequency
script work with all of them, since that amounts to writing a parser
for each language.

Alternatively, one could write a separate word-frequency script in each
language in question, since most languages have facilities to deal with
their own syntax. However, this is still not trivial, and amounts to
several programming efforts.
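For Python, the standard library's `tokenize` module is one such facility. The sketch below counts only NAME tokens (identifiers and keywords), so string literals, including doc strings, and comments never reach the tally. The function name `count_identifiers` is made up for this example.

```python
import io
import tokenize
from collections import Counter


def count_identifiers(source):
    """Count NAME tokens in Python source, skipping strings and comments."""
    counts = Counter()
    reader = io.StringIO(source).readline
    for token in tokenize.generate_tokens(reader):
        # NAME covers identifiers and keywords; STRING and COMMENT
        # tokens are simply never counted.
        if token.type == tokenize.NAME:
            counts[token.string] += 1
    return counts
```

The same approach does not transfer to the other languages; each would need its own tokenizer, which is the "several programming efforts" cost mentioned above.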

Would anyone be interested in this problem?

PS: bpalmer on #emacs (irc.freenode) wrote an elisp quickie to deal with
Lisp, but that program is currently not fully working. See the bottom of
http://paste.lisp.org/display/28840

 Xah
 xah@xahlee.org
http://xahlee.org/
robert - 28 Oct 2006 12:29 GMT
> I had an idea today.
>
[quoted text clipped - 17 lines]
> along with many other English words. I quickly realized that these come
> from the doc strings of Lisp functions (not from comments).

It would be interesting to see whether the type-checking "the" in Lisp is still frequent. I doubt it.

> At this point it dawned on me that there is no easy way to work around
> this, unless I write the script in elisp, which has functions that
[quoted text clipped - 11 lines]
> syntax. However, this is still not trivial, and amounts to several
> programming efforts.

Editor code (maybe best scintilla/sc1; check also Emacs itself, ...) has libraries for colorizing comments in all kinds of programming languages ...

> Would anyone be interested in this problem?

I have a theory that "bad source code" has more if/else/elif/case/switch dispatching statements per number of code words (or lines) than "good code", independent of the language.

If you can count this ratio and correlate it with, say, an sf-ranking and with languages, that would be highly interesting to me... (if so, drop a pointer in this thread / repeated subject)
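The ratio itself is cheap to compute, whatever one makes of the theory. A minimal sketch, assuming that "code words" means identifier-like tokens and that the dispatching keywords are exactly the five named above; the function name `branch_ratio` is hypothetical, and nothing here tests the claim about code quality.

```python
import re

# Assumption: the five keywords listed in the post above.
BRANCH_KEYWORDS = {"if", "else", "elif", "case", "switch"}
TOKEN_RE = re.compile(r"[A-Za-z_]\w*")


def branch_ratio(source):
    """Return dispatching keywords divided by all identifier-like words."""
    words = TOKEN_RE.findall(source)
    if not words:
        return 0.0
    branches = sum(1 for word in words if word in BRANCH_KEYWORDS)
    return branches / len(words)
```

Words inside strings and comments are counted too, so a real measurement would first need the per-language filtering discussed earlier in the thread.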

-robert
Jürgen Exner - 28 Oct 2006 12:36 GMT
>> I had an idea today.

Oh, really? You should mark your calendar and celebrate the day annually!!!

>> I wanted to know what are the top most frequently used functions in
>> the emacs lisp language.

And the relationship with Perl, Python, Java is exactly what?

jue
robert - 28 Oct 2006 13:04 GMT
>>> I had an idea today.
>
[quoted text clipped - 4 lines]
>
> And the relationship with Perl, Python, Java is exactly what?

Read more of the context and reply to the OP.
Dr.Ruud - 28 Oct 2006 14:03 GMT
robert wrote:

> read more of the context and answer to the OP

That OP is invisible in most relevant contexts.

Signature

Affijn, Ruud

"Gewoon is een tijger."

Barry Margolin - 28 Oct 2006 15:40 GMT
> I had an idea today.
>
[quoted text clipped - 21 lines]
> this, unless I write the script in elisp, which has functions that
> read Lisp code and can easily filter out doc strings.

For Lisp, just look for symbols that are immediately preceded by ( or
#'.  The tokens after ( are not always functions, since ( is also
used for constructing literal lists and for subforms of special
operators (e.g. the variable names in LET bindings), but I think the ones
that aren't functions will have low enough frequency that they won't
impact the results.
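A minimal sketch of that heuristic as a regular expression, in Python. The pattern and the name `count_lisp_heads` are illustrative assumptions. It does not skip comments or strings, so a parenthesized phrase inside a doc string would still be counted.

```python
import re
from collections import Counter

# A symbol directly after "(" or "#'"; it ends at whitespace, a paren,
# or a quote-like character.
HEAD_RE = re.compile(r"(?:\(|#')([^\s()'\"`,;#]+)")


def count_lisp_heads(source):
    """Count symbols that immediately follow ( or #' in Lisp source."""
    return Counter(HEAD_RE.findall(source))
```

On a definition such as `(defun firsts (xs) "Return the first of each." (mapcar #'car xs))` this counts `defun`, `mapcar` and `car`, plus the parameter `xs`, which is the kind of low-frequency non-function noted above. The word "the" in the doc string is not counted.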

Perl would be harder, I think.  For ordinary function calls you can look
for a word followed by (, but built-in functions allow use without
parentheses around the parameters.
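The "word followed by (" rule can be sketched the same way. The exclusion list of control keywords and declarators below is an assumption added for this example, as is the name `count_perl_calls`. As noted above, calls written without parentheses are missed entirely.

```python
import re
from collections import Counter

# A bare word (optionally package-qualified) followed by "(".
# The lookbehind rejects variables such as $x, @list and %hash.
CALL_RE = re.compile(r"(?<![\w$@%])([A-Za-z_]\w*(?:::\w+)*)\s*\(")

# Assumption: words that take parentheses but are not function calls.
NOT_FUNCTIONS = {
    "if", "elsif", "unless", "while", "until",
    "for", "foreach", "my", "our", "local", "return",
}


def count_perl_calls(source):
    """Count words followed by ( in Perl source, minus control keywords."""
    names = CALL_RE.findall(source)
    return Counter(name for name in names if name not in NOT_FUNCTIONS)
```

Given `print join(",", @s);` the sketch counts `join` but not `print`, which shows the undercounting of built-ins used without parentheses.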

Signature

Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

Xah Lee - 29 Oct 2006 06:55 GMT
« For Lisp, just look for symbols that are immediately preceded by (
...»

Thanks a lot! Great thought.

I've done it accordingly, and it counts satisfactorily.
http://xahlee.org/emacs/function-frequency.html

Will take a break and think about Perl, Python, and Java later...  For
Python and Java, I think the report will also have to count method
calls, since that is what these languages deal with... somewhat more
complex than plain functional languages...
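For Python, one possible way to count both plain function calls and method calls is the standard `ast` module, which parses the source so doc strings and comments never enter the count. This is only a sketch: the name `count_calls` and the choice to prefix method names with a dot are assumptions made here.

```python
import ast
from collections import Counter


def count_calls(source):
    """Count call sites in Python source by function or method name."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                # Plain call such as len(xs).
                counts[func.id] += 1
            elif isinstance(func, ast.Attribute):
                # Method or attribute call such as out.append(...).
                counts["." + func.attr] += 1
    return counts
```

Without type information the sketch cannot tell which class a method belongs to, so `.append` on a list and `.append` on some other object are counted together.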

 Xah
 xah@xahlee.org
http://xahlee.org/

