Java Forum / General / May 2006
Tabs versus Spaces in Source Code
Xah Lee - 15 May 2006 03:04 GMT
Tabs versus Spaces in Source Code
Xah Lee, 2006-05-13
In coding a computer program, there is often a choice between tabs and spaces for code indentation. There is a large amount of confusion about which is better. It has become what's known as a religious war: a heated fight over trivia. In this essay, i like to explain what is the situation behind it, and which is proper.
Simply put, tabs is proper, and spaces are improper. Why? This may seem ridiculously simple given the de facto ball of confusion: the semantics of tabs is what indenting is about, while, using spaces to align code is a hack.
Now, tech geekers may object to this simple conclusion because they itch to drivel about different editors and so on. The alleged problem created by tabs, as seen by industry coders, is caused by two things: (1) tech geekers' sloppiness and lack of critical thinking, which lead them to not understand the semantic purposes of the tab and space characters. (2) Due to the first reason, they have created and propagated a massive non-understanding and misuse, to the degree that many tools (e.g. vi) do not deal with tabs well and using spaces to align code has become widely practiced, so that in the end spaces seem to be actually better by popularity and seeming simplicity.
In short, this is a phenomenon of misunderstanding begetting a snowball of misunderstanding, such that it created a cultural milieu to embrace this malpractice and kick what is true or proper. Situations like this happen a lot in unix. One non-unix example is the file name's suffix known as the extension, where the code for the file's type became part of the file name (e.g. .txt, .html, .jpg). Another well-known example is HTML practice in the industry, where badly designed tags born of corporations' competitive greed, and stupid coding and misunderstanding by coders and their tools, are so widespread that they force the correct way to the side through the eventual standardization caused by the sheer quantity of improper but established practice.
Now, tech geekers may still object, that using tabs requires the editors to set their positions, and plain files don't carry that information. This is a good question, and the solution is to advance the sciences such that your source code in some way embed such information. This would be progress. However, this is never thought of because the unix philosophies already conditioned people to hack and be shallow. In this case, many will simply use the character intended to separate words for the purpose of indentation or alignment, and spread the practice with militant drivels.
Now, given the already messed up situation of the tabs vs spaces by the unixers and unix brain-washing of the coders in the industry... Which should we use today? I do not have a good proposition, other than just use whichever works for you but put more critical thinking into things to prevent mishaps like this.
Tabs vs Spaces can be thought of as parameters vs hard-coded values, or HTML vs ascii format, or XML/CSS vs HTML 4, or structural vs visual, or semantic vs format. In these, it is always easy to convert from the former to the latter, but near impossible from the latter to the former. And, that is because the former encodes information that is lost in the latter. If we look at the issue of tabs vs spaces, indeed, it is easy to convert tabs to spaces in a source code, but more difficult to convert from spaces to tabs. Because, tabs as indentation actually contains the semantic information about indentation. With spaces, this critical information is lost in space.
This issue is intimately related to another issue in source code: soft-wrapped lines versus physical, hard-wrapped lines by EOL (end of line character). This issue has far more consequences than tabs vs spaces, and the unixer's unthinking has done far-reaching damage in the computing industry. Due to unix's EOL ways of thinking, it has created languages based on EOL (just about ALL languages except the Lisp family and Mathematica) and tools based on EOL (cvs, diff, grep, and basically every tool in unix), thoughts based on EOL (software value estimation by counting EOL, hard-coded email quoting system by ">" prefix, and silent line-truncations in many unix tools), such that any progress or development towards an algorithmic code unit concept or language syntaxes is suppressed. I have not written a full account on this issue, but i've touched it in this essay: The Harm of hard-wrapping Lines, at http://xahlee.org/UnixResource_dir/writ/hard-wrap.html
----
This post is archived at: http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html
Xah xah@xahlee.org http://xahlee.org/
Eli Gottlieb - 15 May 2006 03:44 GMT
Actually, spaces are better for indenting code. The exact amount of space taken up by one space character will always be (or at least tend to be) the same, while every combination of keyboard driver, operating system, text editor, content/file format, and character encoding changes precisely what the tab key does.
There's no use in typing "tab" for indentation when my text editor will simply convert it to three spaces, or worse, autoindent and mix tabs with spaces so that I have no idea how many actual whitespace characters of what kinds are really taking up all that whitespace. I admit it doesn't usually matter, but then you go back to try and make your code prettier and find yourself asking "WTF?"
Undoubtedly adding the second spark to the holy war, Eli
 Signature The science of economics is the cleverest proof of free will yet constructed.
David Steuber - 15 May 2006 05:35 GMT
Spaces work better. Hitting the TAB key in my Emacs will auto-indent the current line. Only spaces will be used for fill. The worst thing you can do is mix the two regardless of how you feel about tab vs space.
The next step in evil is to give tab actual significance like in make.
Xah Lee is getting better at trolling. He might fill up Google's storage.
 Signature http://www.david-steuber.com/ 1998 Subaru Impreza Outback Sport 2006 Honda 599 Hornet (CB600F) x 2 Crash & Slider The lithobraker. Zero distance stops at any speed.
Bent C Dalager - 16 May 2006 14:28 GMT
>Spaces work better. Hitting the TAB key in my Emacs will auto-indent
>the current line. Only spaces will be used for fill. The worst thing
>you can do is mix the two regardless of how you feel about tab vs
>space.
This really hits at the crux of the matter if, like me, you largely code in Java. Sun uses a Java coding standard that includes tab-based full indents and space-based half indents. And, of course, their full indents are sometimes 8 spaces and sometimes 1 tab. This makes Sun-formatted code unreadable unless you have the exact same Tab-size as Sun does and, quite frankly, 8 spaces to the tab is just too much.
Since I look at the Java API sources every now and then while programming, I therefore set my Tab to be 8 spaces to make it readable, and run with 3-space indents in my own source code.
I wouldn't really mind much using Tab as indent instead, but that cannot coexist very happily with a Sun-mandated 8-space Tab setting.
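A minimal sketch, in Python and not part of the original post, of why a mix of tabs and eight-space runs only lines up at one tab size. The two sample lines are made up for illustration: both are meant to sit at the same nesting level, one indented with a tab and the other with eight spaces.

```python
# Hypothetical sample: two statements at the same nesting level,
# one indented with a tab, the other with eight spaces.
LINES = ["\tfoo();", "        bar();"]


def indent_columns(lines, tab_size):
    """Return the display column at which each line's text starts."""
    columns = []
    for line in lines:
        # str.expandtabs pads each tab out to the next multiple of tab_size.
        expanded = line.expandtabs(tab_size)
        columns.append(len(expanded) - len(expanded.lstrip(" ")))
    return columns


if __name__ == "__main__":
    for size in (8, 4, 3):
        print(size, indent_columns(LINES, size))
```

With a tab size of 8 both lines start in the same column; with 4 or 3 they drift apart, which matches the readability problem described above.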
>The next step in evil is to give tab actual significance like in
>make.
Since I don't tend to use make a lot, I think of it as "quaint" rather than "evil" :-)
Cheers Bent D
 Signature Bent Dalager - bcd@pvv.org - http://www.pvv.org/~bcd powered by emacs
Kaz Kylheku - 16 May 2006 19:04 GMT
> >Spaces work better. Hitting the TAB key in my Emacs will auto-indent
> >the current line. Only spaces will be used for fill. The worst thing
[quoted text clipped - 7 lines]
> Sun-formatted code unreadable unless you have the exact same Tab-size
> as Sun does and, quite frankly, 8 spaces to the tab is just too much.
The 8 space tab is industry standard. Anything else is an abomination.
What Sun's source code is doing is the only acceptable use of tabs in source code whatsoever: groups of eight spaces are replaced by a tab in a maximally greedy way to save space.
Many editors support this mode of indentation: use a greedy number of tabs instead of spaces, and then add a few spaces of padding to achieve the indentation.
Vim, Emacs, even the editor in Microsoft's Visual Studio.
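The "greedy tabs, then a few spaces of padding" scheme described above can be sketched in a few lines of Python. This is only an illustration under the 8-column assumption from the post; the function name `indent_prefix` is made up here and is not any editor's actual API.

```python
def indent_prefix(column, tab_size=8):
    """Build leading whitespace that reaches `column`:
    as many tabs as fit, then spaces for the remainder."""
    if column < 0 or tab_size < 1:
        raise ValueError("column must be >= 0 and tab_size >= 1")
    tabs, spaces = divmod(column, tab_size)
    return "\t" * tabs + " " * spaces


if __name__ == "__main__":
    # A 12-column indent becomes one tab plus four spaces.
    print(repr(indent_prefix(12)))
```

The result only renders at the intended column when the reader's tab size matches the `tab_size` used to build it, which is the dependency the rest of the thread argues about.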
> Since I look at the Java API sources every now and then while
> programming, I therefore set my Tab to be 8 spaces to make it
> readable, and run with 3-space indents in my own source code.
If you have to set your tab to be 8 spaces, your editor is a braindamaged pile of crap.
It should be the default setting, and ideally not even overrideable.
And by golly, there is such a tool.
Microsoft's NOTEPAD.EXE has an 8 space tab, which cannot be changed.
So if your pointy-haired boss ever views a text file that you produced, and he uses Notepad on his Windows laptop, it behooves you to have assumed 8 space tabs.
> I wouldn't really mind much using Tab as indent instead, but that
> cannot coexist very happily with a Sun-mandated 8-space Tab setting.
Or would that be Notepad-mandated? Hahahaha.
Pascal Bourguignon - 17 May 2006 01:18 GMT
> The 8 space tab is industry standard. Anything else is an abomination.
Yes, that's the reason why they shouldn't be used for indenting, they're too wide. Space is good for indenting.
> What Sun's source code is doing is the only acceptable use of tabs in
> source code whatsoever: groups of eight spaces are replaced by a tab in
> a maximally greedy way to save space.
To save what?
http://www.shopping.com/xGS-LaCie-Bigger-Disk~NS-1~linkin_id-3068036
Let me see, the LOC/programmer/month is anything between 40 and 800. Let's take 400 lines of 40 characters, or 16000 c/p/m. We've been programming for 50 years, there's about 4 million programmers worldwide, that's 320 GB to store all the code produced ever, half that if you compress it, and you still have a lot of space to store movies and mp3.
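A quick way to check this kind of back-of-the-envelope estimate is to multiply it out. The sketch below (Python, added for illustration) uses the figures from the post and two assumptions of its own: one byte per character, and every one of the 4 million programmers producing code for all 50 years, which makes it an upper bound. Under those assumptions the product comes to about 38 TB, which is larger than the 320 GB quoted above; the post's figure may rest on different assumptions, but either way the total is a modest amount of disk.

```python
CHARS_PER_LINE = 40
LINES_PER_MONTH = 400
MONTHS_PER_YEAR = 12
YEARS = 50
PROGRAMMERS = 4_000_000


def total_source_bytes(lines_per_month=LINES_PER_MONTH,
                       chars_per_line=CHARS_PER_LINE,
                       years=YEARS,
                       programmers=PROGRAMMERS):
    """Upper-bound estimate, assuming one byte per character and
    every programmer active for the whole period."""
    per_programmer_month = lines_per_month * chars_per_line
    return per_programmer_month * MONTHS_PER_YEAR * years * programmers


if __name__ == "__main__":
    total = total_source_bytes()
    print(f"{total / 1e12:.1f} TB uncompressed")
```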
 Signature __Pascal Bourguignon__ http://www.informatimago.com/ Un chat errant se soulage dans le jardin d'hiver Shiki
Chris Smith - 18 May 2006 03:55 GMT
> Let me see, the LOC/programmer/month is anything between 40 and
> 800.
40 lines of code per programmer per month? Wow! That's less than two lines per day ON AVERAGE! Suddenly I don't feel so bad about the occasional unproductive day at work...
 Signature Chris Smith
Alain Picard - 18 May 2006 10:48 GMT
>> Let me see, the LOC/programmer/month is anything between 40 and
>> 800.
>
> 40 lines of code per programmer per month? Wow! That's less than two
> lines per day ON AVERAGE! Suddenly I don't feel so bad about the
> occasional unproductive day at work...
40 lines of DEBUGGED, TESTED, DOCUMENTED, and SHIPPED code.
Yeah. Sounds about right. :-(
Patricia Shanahan - 18 May 2006 15:32 GMT
>>>Let me see, the LOC/programmer/month is anything between 40 and
>>>800.
[quoted text clipped - 6 lines]
>
> Yeah. Sounds about right. :-(
But the objective was to bound the total amount of code. You don't assume all code is debugged, tested, documented, and shipped?
Patricia
Pascal Bourguignon - 18 May 2006 18:27 GMT
>>>>Let me see, the LOC/programmer/month is anything between 40 and
>>>>800.
[quoted text clipped - 7 lines]
> But the objective was to bound the total amount of code. You don't
> assume all code is debugged, tested, documented, and shipped?
Well not really, you don't need to keep scratch and prototype code that's 30 years old. Just keep a few source snapshots. But even if you wanted to keep all the keypresses of all the programmers ever, that'd be less than 200 TB, quite a realizable capacity today, even more so in a few years.
 Signature __Pascal Bourguignon__ http://www.informatimago.com/
PLEASE NOTE: Some quantum physics theories suggest that when the consumer is not directly observing this product, it may cease to exist or will exist only in a vague and undetermined state.
Rob Warnock - 19 May 2006 04:53 GMT
+---------------
| > But the objective was to bound the total amount of code. You don't
| > assume all code is debugged, tested, documented, and shipped?
[quoted text clipped - 4 lines]
| that'd be less than 200 TB, quite a realizable capacity today, even more so
| in a few years.
+---------------
Indeed, see <http://www.agami.com/products/>. With the AIS-6119 (on the far right), you can get just under 154 TB in a single rack, today. [That's with 400 GB drives. 192 TB in a single rack, when the 500 GB drives become available.]
-Rob
p.s. Obligatory disclosure: I'm currently employed by Agami...
----- Rob Warnock <rpw3@rpw3.org> 627 26th Avenue <URL:http://rpw3.org/> San Mateo, CA 94403 (650)572-2607
Kaz Kylheku - 18 May 2006 17:17 GMT
> > The 8 space tab is industry standard. Anything else is an abomination.
>
[quoted text clipped - 7 lines]
> To save what?
> http://www.shopping.com/xGS-LaCie-Bigger-Disk~NS-1~linkin_id-3068036
Exactly. So the only justification for these tabs is very flimsy, isn't it?
Maybe it's the cycles. Think of all the cycles that are wasted lexically analyzing eight spaces compared to one tab.
:)
James Of Tucson - 17 May 2006 06:23 GMT
>The 8 space tab is industry standard. Anything else is an abomination.
It may be a standard, but it's a standard from the days of the ASR-33 TTY, and I suspect possibly from typewriters of the early 1900s. But even the vintage typewriters I've seen had adjustable tab stops.
>So if your pointy-haired boss ever views a text file that you produced,
>and he uses Notepad on his Windows laptop, it behooves you to have
>assumed 8 space tabs.
I would have the confidence to insist that he use an editor that is approved by our departmental policies, and I would go further and require him to observe our coding standards. I have the confidence in my skills and in my value to the company, to look such a person square in the eye and explain his problem to him. I can't imagine the situation you describe ever being a problem.
Roedy Green - 20 May 2006 02:45 GMT
>The 8 space tab is industry standard.
I think you overstate it. It is pretty common on Windows though.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
John Gagon - 23 May 2006 12:18 GMT
> The 8 space tab is industry standard. Anything else is an abomination.
Perhaps it is some industry's standard but it is a huge waste of useful space.
John Gagon
jmcgill - 15 May 2006 05:43 GMT
If I work on your project, I follow the coding and style standards you specify.
Likewise if you work on my project you follow the established standards.
Fortunately for you, I am fairly liberal on such matters.
I like to see 4 spaces for indentation. If you use tabs, that's what I will see, and you're very likely to have your code reformatted by the automated build process, when the standard copyright header is pasted and missing javadoc tags are generated as warnings.
I like the open brace to start on the line of the control keyword. I can deal with the open brace being on the next line, at the same level of indentation as the control keyword. I don't quite understand the motivation behind the GNU style, where the brace itself is treated as a half-indent, but I can live with it on *your* project.
Any whitespace or other style that isn't happy to be reformatted automatically is an error anyway.
I'd be very laissez-faire about it except for the fact that code repositories are much easier to manage if everything is formatted before it goes in, or as a compromise, as a step at release tags.
Roedy Green - 20 May 2006 02:44 GMT
>Actually, spaces are better for indenting code.
Agreed. All it takes is one programmer to use a different tab expansion convention to screw up a project. Spaces are unambiguous.
Ideally though you should run code through a beautifier before checkin to avoid false deltas with people manually formatting code slightly differently.
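Short of a full beautifier, a pre-checkin step could at least flag files whose indentation mixes conventions. The following Python sketch is only an illustration of that idea; the function names are made up here and it is not taken from any particular tool.

```python
def indentation_kinds(text):
    """Return the set of indentation styles found in `text`:
    'tabs', 'spaces', or 'mixed' (both within one line's indent)."""
    kinds = set()
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank and whitespace-only lines
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if not indent:
            continue  # unindented line tells us nothing
        if "\t" in indent and " " in indent:
            kinds.add("mixed")
        elif "\t" in indent:
            kinds.add("tabs")
        else:
            kinds.add("spaces")
    return kinds


def is_consistent(text):
    """True if the file uses at most one style and never mixes within a line."""
    kinds = indentation_kinds(text)
    return "mixed" not in kinds and len(kinds) <= 1


if __name__ == "__main__":
    sample = "if x:\n\ty = 1\n    z = 2\n"
    print(indentation_kinds(sample), is_consistent(sample))
```

A check like this does not reformat anything; it only reports, so it can run before the beautifier or in place of it on projects that have none.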
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
mystilleef - 15 May 2006 08:56 GMT
Mumia W. - 15 May 2006 09:00 GMT
> Tabs versus Spaces in Source Code
>
[quoted text clipped - 5 lines]
> a heated fight over trivia. In this essay, i like to explain what is
> the situation behind it, and which is proper.
Thanks Xah. I value your posts. Keep posting. And since your posts usually cover broad areas of CS, keep crossposting. Don't go anywhere Xah :-)
> Simply put, tabs is proper, and spaces are improper. Why? This may seem
> ridiculously simple given the de facto ball of confusion: the semantics
> of tabs is what indenting is about, while, using spaces to align code
> is a hack.
I wouldn't say that spaces are a hack, but tabs are superior.
> Now, tech geekers may object to this simple conclusion because they itch
> to drivel about different editors and so on. The alleged problem
[quoted text clipped - 6 lines]
> align code has become widely practiced, so that in the end spaces seem
> to be actually better by popularity and seeming simplicity.
Don't forget the laziness of programmers like me who don't put the tabbing information in the source file. Vim deals with tabs well IMO, but I almost never used to put the right auto-commands in the file to get it set up right for other users.
> In short, this is a phenomenon of misunderstanding begetting a snowball
> of misunderstanding, such that it created a cultural milieu to embrace
[quoted text clipped - 14 lines]
> the sciences such that your source code in some way embed such
> information.
Vim does this. We just have to use it.
> This would be progress. However, this is never thought of
> because the “unix philosophies” already conditioned people to hack
[quoted text clipped - 14 lines]
> former. And, that is because the former encodes information that is
> lost in the latter.
Nope. Conversion is relatively easy. I've written programs to do this myself, and everyone and his brother has also done this. Virtually every programmer's editor that I've ever used can do this, and a great, great many independent programs convert tabs to spaces. It's like saying, "it's near impossible to write a calculator program." :-)
I bet that someone has a Perl one-liner to do it.
On any Debian system, try a "man expand" and see what you find. Also, emacs and vim do it. Perl has a Text::Tabs module. TCL's ::textutil::(un)?tabify routines do it. The birds do it, and the bees do it. Oh wait, that's something else :-)
> If we look at the issue of tabs vs spaces, indeed,
> it is easy to convert tabs to spaces in a source code, but more
> difficult to convert from spaces to tabs.
Nope again. It's easy, you just keep track of the virtual character position as you decide whether to write a space or a tab. Computers do the "counting" thing fairly well.
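Both conversions described in this post can be sketched in Python by tracking the display column, as suggested. This is an illustrative sketch only: `detab` and `entab` are names chosen here, a fixed 8-column tab is assumed by default, and the code works line by line without any knowledge of the language being edited, so spaces inside string literals would be converted too.

```python
def detab(line, tab_size=8):
    """Replace each tab with spaces up to the next tab stop."""
    out = []
    column = 0
    for ch in line:
        if ch == "\t":
            width = tab_size - column % tab_size
            out.append(" " * width)
            column += width
        else:
            out.append(ch)
            column += 1
    return "".join(out)


def entab(line, tab_size=8):
    """Replace each run of two or more spaces that ends exactly on a
    tab stop with a single tab; leave all other spaces alone."""
    out = []
    column = 0
    pending = 0  # spaces seen but not yet written out
    for ch in line:
        if ch == " ":
            pending += 1
            column += 1
            if column % tab_size == 0:
                # The run reaches a tab stop: one tab covers it.
                out.append("\t" if pending > 1 else " ")
                pending = 0
        else:
            out.append(" " * pending)  # run ended short of a tab stop
            pending = 0
            out.append(ch)
            if ch == "\t":
                column += tab_size - column % tab_size
            else:
                column += 1
    out.append(" " * pending)
    return "".join(out)


if __name__ == "__main__":
    original = "    if x:       # note"
    packed = entab(original)
    print(repr(packed), detab(packed) == original)
```

The round trip only reproduces the original when both directions use the same `tab_size`, which is the piece of information a plain text file does not carry.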
> Because, tabs as indentation
> actually contains the semantic information about indentation. With
[quoted text clipped - 22 lines]
> xah@xahlee.org
> ∑ http://xahlee.org/
I've never thought of tabs-vs-spaces as a religious war. Anyway, the authority of the programming environment will determine which one is used. Have a good week Xah.
Iain King - 16 May 2006 10:39 GMT
Oh God, I agree with Xah Lee. Someone take me out behind the chemical sheds...
Iain
> Tabs versus Spaces in Source Code
>
[quoted text clipped - 84 lines]
> xah@xahlee.org
> http://xahlee.org/
Dale King - 16 May 2006 17:14 GMT
> Oh God, I agree with Xah Lee. Someone take me out behind the chemical
> sheds...
<more worthless nonsense>
Please don't feed the troll!
And for the record, spaces are 100% portable, tabs are not. That ends the argument for me.
Worse than either tabs or spaces however is Sun's mixture of the two.
 Signature Dale King
Stephen Kellett - 16 May 2006 17:27 GMT
>Oh God, I agree with Xah Lee. Someone take me out behind the chemical
>sheds...
Jack Bauer is on his way...
 Signature Stephen Kellett Object Media Limited http://www.objmedia.demon.co.uk/software.html Computer Consultancy, Software Development Windows C++, Java, Assembler, Performance Analysis, Troubleshooting
numeromancer - 16 May 2006 14:48 GMT
An old debate. My $0.02 :
http://numeromancer.dyndns.org/~timothy/tab-width-independence/description.html
The idea can be extended to other programming languages.
TS
Oliver Bandel - 16 May 2006 16:23 GMT
> Tabs versus Spaces in Source Code
>
[quoted text clipped - 7 lines]
>
> Simply put, tabs is proper, and spaces are improper.
[...]
I fullheartedly disagree :)
So, no "essay" on this is necessary to read :->
Ciao, Oliver
opalpa@gmail.com opalinski from opalpaweb - 16 May 2006 16:31 GMT
> Simply put, tabs is proper, and spaces are improper.
> Why? This may seem
> ridiculously simple given the de facto ball of confusion: the semantics
> of tabs is what indenting is about, while, using spaces to align code
> is a hack.
The reality of programming practice trumps the original intent of tab characters. The tab character and space character are pliable in that if their use changes their semantics change.
> ... and the solution is to advance
> the sciences such that your source code in some way
> embed such information.
If/when the time comes where such info is embedded perhaps then tabs will be OK.
---------------------------------------------------------------
I use spaces because of the many sources I've opened I have many times sighed on opening tabbed ones and never done so opening spaced ones.
I don't get mad, but sighing is a clear indicator of negativity. Anyway, the more code I write and read the less indentation matters to me. My brain can now parse awkward source correctly far better than it did a few years ago.
All the best, Opalinski opalpa@gmail.com http://www.geocities.com/opalpaweb/
Pascal Bourguignon - 16 May 2006 16:40 GMT
>> Simply put, tabs is proper, and spaces are improper.
>> Why? This may seem
[quoted text clipped - 22 lines]
> me. My brain can now parse awkward source correctly far better than it
> did a few years ago.
And anyways, C-x h C-M-\ comes automatically after C-x C-f source RET
Just add this to your ~/.emacs :
(add-hook 'find-file-hook
          (lambda ()
            (indent-region (point-min) (point-max))
            (pop-mark)))
 Signature __Pascal Bourguignon__ http://www.informatimago.com/
IMPORTANT NOTICE TO PURCHASERS: The entire physical universe, including this product, may one day collapse back into an infinitesimally small space. Should another universe subsequently re-emerge, the existence of this product in that universe cannot be guaranteed.
Oliver Bandel - 16 May 2006 17:15 GMT
>>Simply put, tabs is proper, and spaces are improper.
>>Why? This may seem
[quoted text clipped - 5 lines]
> characters. The tab character and space character are pliable in that
> if their use changes their semantics change.
[...]
Yes, when I started programming I also preferred tabs. And with growing experience of how to handle this in real life (different editors/systems/languages...) I saw that converting the "so fine tabs" was annoying.
The only thing that always worked were spaces. Tab: nice idea but makes programming an annoyance.
Ciao, Oliver
Edmond Dantes - 17 May 2006 23:40 GMT
...
> Yes, when I started programming I also preferred tabs.
> And with growing experience of how to handle this in real life
[quoted text clipped - 6 lines]
> Ciao,
> Oliver
It all depends on your editor of choice. Emacs editing of Lisp (and a few other languages, such as Python) makes the issue more or less moot. I personally would recommend choosing one editor to use with all your projects, and Emacs is wonderful in that it has been ported to just about every platform imaginable.
The real issue is, of course, that ASCII is showing its age and we should probably supplant it with something better. But I know that will never fly, given the torrents of code, configuration files, and everything else in ASCII. Even Unicode couldn't put a dent in it, despite the obvious growing global development efforts. Not sure how many compilers would be able to handle Unicode source anyway. I suspect the large majority of them would choke big time.
Oh well...
 Signature -- Edmond Dantes, CMC And Now for something Completely Different: http://gift-basket.prosperitysprinkler.com http://sewing-machine.womencraft.com http://coveralls.whiteboystuff.com http://eyewear.blackboystuff.com http://dinette.funiturenow.com http://wheels.whiteboystuff.com http://patio.funiturenow.com
Pascal Bourguignon - 18 May 2006 12:10 GMT
> It all depends on your editor of choice. Emacs editing of Lisp (and a few
> other languages, such as Python) makes the issue more or less moot. I
[quoted text clipped - 9 lines]
> handle Unicode source anyway. I suspect the large majority of them would
> choke big time.
All right, unicode support is not 100% perfect yet, but my main compilers support it perfectly well, only 1/5 don't support it, and 1/5 support it partially:
------(unicode-script.lisp)---------------------------------------------
(defun clisp (file)
  (ext:run-program "/usr/local/bin/clisp"
                   :arguments (list "-ansi" "-norc" "-on-error" "exit"
                                    "-E" "utf-8" "-i" file "-x" "(ext:quit)")
                   :input nil :output :terminal :wait t))

(defun gcl (file)
  (ext:run-program "/usr/local/bin/gcl"
                   :arguments (list "-batch" "-load" file "-eval" "(lisp:quit)")
                   :input nil :output :terminal :wait t))

(defun ecl (file)
  (ext:run-program "/usr/local/bin/ecl"
                   :arguments (list "-norc" "-load" file "-eval" "(si:quit)")
                   :input nil :output :terminal :wait t))

(defun sbcl (file)
  (ext:run-program "/usr/local/bin/sbcl"
                   :arguments (list "--userinit" "/dev/null" "--load" file
                                    "--eval" "(sb-ext:quit)")
                   :input nil :output :terminal :wait t))

(defun cmucl (file)
  (ext:run-program "/usr/local/bin/cmucl"
                   :arguments (list "-noinit" "-load" file
                                    "-eval" "(extensions:quit)")
                   :input nil :output :terminal :wait t))

(dolist (implementation '(clisp gcl ecl sbcl cmucl))
  (sleep 3)
  (terpri) (print implementation) (terpri)
  (funcall implementation "unicode-source.lisp"))
------(unicode-source.lisp)---------------------------------------------
;; -*- coding: utf-8 -*-

(eval-when (:compile-toplevel :load-toplevel :execute)
  (format t "~2%~A ~A~2%"
          (lisp-implementation-type)
          (lisp-implementation-version))
  (finish-output))

(defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
  (loop :for i :from בכוכ :to номер :by 단계 :collect i))

(defun test ()
  (format t "~%Calling ~S --> ~A~%"
          '(ιοτα :номер 10 :단계 2 :בכוכ 2)
          (ιοτα :номер 10 :단계 2 :בכוכ 2)))

(test)
------------------------------------------------------------------------
(load"unicode-script.lisp")
;; Loading file unicode-script.lisp ...

CLISP
  i i i i i i i       ooooo    o        ooooooo   ooooo   ooooo
  I I I I I I I      8     8   8           8     8     o  8    8
  I  \ `+' /  I      8         8           8     8        8    8
   \  `-+-'  /       8         8           8      ooooo   8oooo
    `-__|__-'        8         8           8           8  8
        |            8     o   8           8     o     8  8
  ------+------       ooooo    8oooooo  ooo8ooo   ooooo   8

Copyright (c) Bruno Haible, Michael Stoll 1992, 1993
Copyright (c) Bruno Haible, Marcus Daniels 1994-1997
Copyright (c) Bruno Haible, Pierpaolo Bernardi, Sam Steingold 1998
Copyright (c) Bruno Haible, Sam Steingold 1999-2000
Copyright (c) Sam Steingold, Bruno Haible 2001-2006

;; Loading file unicode-source.lisp ...

CLISP 2.38 (2006-01-24) (built 3347193361) (memory 3347193794)

Calling (ΙΟΤΑ :НОМЕР 10 :단계 2 :בכוכ 2) --> (2 4 6 8 10)
;; Loaded file unicode-source.lisp
Bye.
GCL
GNU Common Lisp (GCL) GCL 2.6.7
Calling (ιοτα :номер 10 :단계 2 :בכוכ 2) --> (2 4 6 8 10)
ECL ;;; Loading "unicode-source.lisp"
ECL 0.9g
Calling (ιοτα :номер 10 :단계 2 :בכוכ 2) --> (2 4 6 8 10)
SBCL This is SBCL 0.9.12, an implementation of ANSI Common Lisp. More information about SBCL is available at <http://www.sbcl.org/>.
SBCL is free software, provided as is, with absolutely no warranty. It is mostly in the public domain; some portions are provided under BSD-style licenses. See the CREDITS and COPYING files in the distribution for more information.
SBCL 0.9.12
Calling (|ιοτα| :|номер| 10 :|ˋ¨ʳÂ| 2 :|בכוכ| 2) --> (2 4 6 8 10)
CMUCL
; Loading #P"/local/users/pjb/src/lisp/encours/unicode-source.lisp".

CMU Common Lisp 19c (19C)

Reader error at 214 on #<Stream for file "/local/users/pjb/src/lisp/encours/unicode-source.lisp">:
Undefined read-macro character #\Ã
   [Condition of type READER-ERROR]

Restarts:
  0: [CONTINUE] Return NIL from load of "unicode-source.lisp".
  1: [ABORT ] Skip remaining initializations.

Debug (type H for help)

(LISP::%READER-ERROR #<Stream for file "/local/users/pjb/src/lisp/encours/unicode-source.lisp"> "Undefined read-macro character ~S" #\Ã)
Source: Error finding source:
Error in function DEBUG::GET-FILE-TOP-LEVEL-FORM: Source file no longer exists:
  target:code/reader.lisp.
0] abort
* Received EOF on *standard-input*, switching to *terminal-io*.
* (extensions:quit)
;; Loaded file unicode-script.lisp
T
[4]>
 Signature __Pascal Bourguignon__ http://www.informatimago.com/ Grace personified, I leap into the window. I meant to do that.
Jonathon McKitrick - 18 May 2006 15:42 GMT
> (defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
> (loop :for i :from בכוכ :to номер :by 단계 :collect i))
How do you even *enter* these characters? My browser seems to trap all the special character combinations, and I *know* you don't mean selecting from a character palette.
hey, this is weird...
î
I've got something happening, but I can't tell what.
Yes, I'm an ignorant Western world ASCII user. :-)
Pascal Bourguignon - 18 May 2006 18:24 GMT
>> (defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
>> (loop :for i :from בכוכ :to номер :by 단계 :collect i))
>
> How do you even *enter* these characters? My browser seems to trap all
> the special character combinations, and I *know* you don't mean
> selecting from a character palette.
Why? Of course! Aren't you either an emacs or a Mac user?
On a Mac, you just select the input keyboard from the Input menu (the little flag on the right of the menubar; you may activate it from the International System Preference panel).
On emacs, it's as simple: M-x set-input-method RET
I've bound C-F9, C-F10, C-F11, and C-F12 to various input methods:
(global-set-key [C-f9]  (lambda()(interactive)(set-input-method 'chinese-py-b5)))
(global-set-key [C-f10] (lambda()(interactive)(set-input-method 'cyrillic-yawerty)))
(global-set-key [C-f11] (lambda()(interactive)(set-input-method 'greek)))
(global-set-key [C-f12] (lambda()(interactive)(set-input-method 'hebrew)))
C-\ is bound to toggle-input-method, which lets you switch back to the usual input method.
For the alphabetic scripts, there's no difficulty, it's like with roman scripts: each key is a character. For ideographic scripts, the input methods are more sophisticated.
Then, you have to learn some of these strange languages. I learned several (but I forgot everything but: לודג גד דג ינד, здраствуйте, я люблю тибе, 我 聽龍, 我 不 中国人). For the Korean, I copy-and-pasted it from some web translation service. But keying them in is the easiest part.
 Signature __Pascal Bourguignon__ http://www.informatimago.com/ Cats meow out of angst "Thumbs! If only we had thumbs! We could break so much!"
Oliver Bandel - 18 May 2006 19:31 GMT
>>(defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
>> (loop :for i :from בכוכ :to номер :by 단계 :collect i))
>
> How do you even *enter* these characters? My browser seems to trap all
> the special character combinations, and I *know* you don't mean
> selecting from a character palette.
Haven't you heard of those big keyboards?
12 meter x 2 meter wide I think.... you need a long stick (maybe if you play golf, that can help).
Then you have all UTF-8 characters there, that's fine, but typing needs some time. But it's good, because when you're done typing your email, it's not necessary to go to sports after work. So your boss can insist that you stay longer at work.
Ciao, Oliver
;-)
Oliver Wong - 23 May 2006 16:14 GMT
>> (defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
>> (loop :for i :from בכוכ :to номер :by 단계 :collect i))
[quoted text clipped - 10 lines]
>
> Yes, I'm an ignorant Western world ASCII user. :-)
What OS are you using? In Windows XP, you'd have to let XP know that you're interested in input in languages other than English via "Control Panel -> Regional Settings -> Languages -> Text Services and Input Languages". There, you'd add input methods other than English. Each "input method" works in a sort of unique way, so you'll just have to learn them. For example, under English, you can use the "keyboard" input method which probably is what you're using now, or the "handwriting recognition" input method, or the "speech recognition" input method to insert English text. There are other input methods for the Asian languages (e.g. Chinese, Japanese, etc.)
- Oliver
Kaz Kylheku - 16 May 2006 18:51 GMT > Tabs vs Spaces can be thought of as parameters vs hard-coded values, or > HTML vs ascii format, or XML/CSS vs HTML 4, or structural vs visual, or > semantic vs format. In these, it is always easy to convert from the > former to the latter, but near impossible from the latter to the > former. Bahaha, looks like someone hasn't thought things through very well.
Spaces, under a mono font, offer greater precision and expressivity in achieving specific alignment. That expressivity cannot be captured by tabs.
The difficulty in converting spaces to tabs rests not in any bridgeable semantic gap, but in the lack of having any way whatsoever to express using tabs what the spaces are expressing.
It's not /near/ impossible, it's /precisely/ impossible.
For instance, tabs cannot express these alignments:
/*
 * C block
 * comment
 * in a common style.
 */

(lisp
  (nested list
          with symbols
          and things))

(call to a function
      with many parameters)
;; how do you align "to" and "with" using tabs?
;; only if "to" lands on a tab stop; but dependence on specific tab stops
;; destroys the whole idea of tabs being parameters.
To do these alignments structurally, you need something more expressive than spaces or tabs. But spaces do the job under a mono font, /and/ they do it in a completely unobtrusive way.
If you want to do nice typesetting of code, you have to add markup which has to be stripped away if you actually want to run the code.
Spaces give you decent formatting without markup. Tabs do not. Tabs are only suitable for aligning the first non-whitespace character of a line to a stop. Only if that is the full extent of the formatting that you need to express in your code can you achieve the ideal of being able to change your tab parameter to change the indentation amount. If you need to align characters which aren't the first non-whitespace in a line, tabs are of no use whatsoever, and proportional fonts must be banished.
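The point about hanging indents can be checked mechanically. The following is only a rough sketch in Python (the helper names `render` and `column_of` are made up for illustration, and it assumes a fixed-width display): it measures where "with" lands under several tab widths when the second line is indented with six spaces versus one tab.

```python
def render(line: str, tab_width: int) -> str:
    """Show the line as displayed with tab stops every tab_width columns."""
    return line.expandtabs(tab_width)


def column_of(line: str, token: str, tab_width: int) -> int:
    """Return the display column where token first starts."""
    return render(line, tab_width).index(token)


# First line of a folded call; "to" starts at display column 6.
first = "(call to a function"

# Continuation line indented with six spaces, so "with" sits under "to".
spaces_second = " " * 6 + "with many parameters)"

# The same continuation line indented with a single tab instead.
tab_second = "\twith many parameters)"

for width in (2, 4, 6, 8):
    target = column_of(first, "to", width)
    spaces_ok = column_of(spaces_second, "with", width) == target
    tab_ok = column_of(tab_second, "with", width) == target
    print(f"tab width {width}: spaces aligned={spaces_ok}, tab aligned={tab_ok}")
```

In this sketch the space-indented line stays aligned at every width, while the tab-indented line lines up only when the viewer's tab width happens to equal 6.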
achates - 16 May 2006 19:46 GMT > If you want to do nice typesetting of code, you have to add markup > which has to be stripped away if you actually want to run the code. Typesetting code is not a helpful activity outside of the publishing industry. You might like the results of your typesetting; I happen not to. You probably wouldn't like mine. Does that mean we shouldn't work together? Only if you insist on forcing me to conform to your way of displaying code.
You are correct in pointing out that tabs don't allow for 'alignment' of the sort you mention: (lisp (nested list with symbols and things)) But then neither does Python. I happen to think that's a feature.
(And of course you can do what you like inside a comment. That's because tabs are for indentation, and indentation is meaningless in that context. Spaces are exactly what you should use then. I may or may not like your layout, but it won't break anything when we merge our code.)
achates - 16 May 2006 20:22 GMT argh, sorry; missed the cross-post. Was replying from comp.lang.python..
Kaz Kylheku - 16 May 2006 23:01 GMT > > If you want to do nice typesetting of code, you have to add markup > > which has to be stripped away if you actually want to run the code. > > Typesetting code is not a helpful activity outside of the publishing > industry. Be that as it may, code writing involves an element of typesetting. If you are aligning characters, you are typesetting, however crudely.
> You might like the results of your typsetting; I happen not > to. You probably wouldn't like mine. Does that mean we shouldn't work > together? Only if you insist on forcing me to conform to your way of > displaying code. Someone who insists that everyone should separate line indentation into tabs which achieve the block level, and spaces that achieve additional alignment, so that code could be displayed in more than one way based on the tab size without loss of alignment, is probably a "space cadet", who has a bizarre agenda unrelated to developing the product.
There is close to zero value in maintaining such a scheme, and consequently, it's hard to justify with a business case.
Yes, in the real world, you have to conform to someone's way of formatting and displaying code. That's how it is.
You have to learn to read, write and even like more than one style.
> You are correct in pointing out that tabs don't allow for 'alignment' > of the sort you mention: That alignment has a name: hanging indentation.
All forms of aligning the first character of a line to some requirement inherited from the previous line are called indentation.
Granted, a portion of that indentation is derived from the nesting level of some logically enclosing programming language construct, and part of it may be derived from the position of a character of some parallel constituent within the construct.
> (lisp > (nested list > with symbols > and things)) > But then neither does Python. I happen to think that's a feature. Python has logical line continuation which gives rise to the need for hanging indents to line up with parallel constituents in a folded expression.
Python also allows for the possibility of statements separated by semicolons on one line, which may need to be lined up in columns.
var = 42; foo = 53
x   =  2; y   = 10
> (And of course you can do what you like inside a comment. That's > because tabs are for indentation, and indentation is meanigless in that > context. A comment can contain example code, which contains indentation.
What, I can't change the tab size to display that how I want? Waaah!!! (;_;)
Aaron Gray - 17 May 2006 01:14 GMT I was once a religious tabber until working on multiple source code sources, now I am a religious spacer :)
My 2bits worth,
Aaron
Bill Pursell - 17 May 2006 14:51 GMT > Tabs versus Spaces in Source Code > > Xah Lee, 2006-05-13 > > In coding a computer program, there's often the choices of tabs or > spaces for code indentation. <snip>
> (2) Due to the first reason, they have created and > propagated a massive none-understanding and mis-use, to the degree that > many tools (e.g. vi) does not deal with tabs well
:set ts=<n>
Yeah, that's really tough. vi does just fine handling tabs. vim does an even better job, with mode-lines, = and :retab.
In my experience, the people who complain about the use of tabs for indentation are the people who don't know how to use their editor, and those people tend to use emacs.
Alain Picard - 18 May 2006 10:46 GMT > In my experience, the people who complain about the use > of tabs for indentation are the people who don't know > how to use their editor, and those people tend to use > emacs. HA HA HA HA HA HA HA HA HA HA HA HA ....
Tee, hee heee.... snif!
Phew. Better now.
That was funny! Thanks! :-)
ashesh - 18 May 2006 06:45 GMT If I work on your project, I follow the coding and style standards you specify.
Likewise if you work on my project you follow the established standards.
Fortunately for you, I am fairly liberal on such matters.
I like to see 4 spaces for indentation. If you use tabs, that's what I will see, and you're very likely to have your code reformatted by the automated build process, when the standard copyright header is pasted and missing javadoc tags are generated as warnings.
I like the open brace to start on the line of the control keyword. I can deal with the open brace being on the next line, at the same level of indentation as the control keyword. I don't quite understand the motivation behind the GNU style, where the brace itself is treated as a half-indent, but I can live with it on *your* project.
Any whitespace or other style that isn't happy to be reformatted automatically is an error anyway.
I'd be very laissez-faire about it except for the fact that code repositories are much easier to manage if everything is formatted before it goes in, or as a compromise, as a step at release tags.
Ashesh..
Xah Lee - 23 May 2006 12:02 GMT the following are 2 FAQ following this thread. Thanks.
Addendum: 2006-05-15
Q: What do you mean by embedding tab position info into the source code? How's that gonna be done?
A: Tech geekers may not realize it, but such embedding of meta info does exist in many technologies by various means, because there is a need for it. For example, Mac OS Classic's resource fork and Mac OS X's bundling system, unix shell script's shebang (#!), emacs and Python's encoding declaration #-*- coding: utf-8 -*-, Unicode's BOM, CVS's change-log insertion, Mathematica's source code system the Notebook, Microsoft Word's transparent meta data, as well as HTML and XML's various declarations embedded in the file. Some of these systems are good designs and some are hacks.
Somehow tech geekers have the sense that source code must be a plain text file containing nothing else but the programming code. This may be a defensible position, but as we can see in the above examples, this idea is primitive and does not address the various needs. If the tech geekers had thought through these issues, computing languages and their source code might have developed into more powerful and flexible integrated systems like the standardized examples above. For instance, many commercial development systems actually already have such meta-data embodied with the source code. (e.g. Borland Delphi, Metrowerks's CodeWarrior, Microsoft Visual Studio, Wolfram Research's Mathematica.) Some of them not only embody development-related info such as debug points or linking files, but also allow programmers to highlight code for visual purposes like a word processor, or even display it visually as typeset mathematics.
Q: Converting spaces to tabs is actually easy. I don't see how spaces lose info.
A: Here is an illustration of how it is not possible to convert spaces to tabs. Suppose you are writing in a language where the indentation is part of the semantics, not just for appearance. Now, suppose you have these two lines:
1234567890
  A
    B
The first line has a 2-space prefix and the second line has a 4-space prefix. If you convert this to tabs, how do you know whether that's 1 and 2 tabs, or 2 and 4 tabs? In essence, there is no way to tell how many tabs n represents, where n is the smallest space prefix in the code, unless n == 1.
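The ambiguity in the A/B example can be made concrete. This is just a sketch in Python, with a hypothetical helper `spaces_to_tabs` that is not taken from any tool discussed in the thread; it converts leading spaces to tabs under an assumed number of spaces per indentation level, which is exactly the piece of information the spaces themselves do not record.

```python
def leading_spaces(line: str) -> int:
    """Count the spaces at the start of a line."""
    return len(line) - len(line.lstrip(" "))


def spaces_to_tabs(lines, indent_width):
    """Convert leading spaces to tabs, ASSUMING indent_width spaces per level.

    Leftover spaces that do not fill a whole level are kept as spaces.
    """
    converted = []
    for line in lines:
        n = leading_spaces(line)
        tabs, rest = divmod(n, indent_width)
        converted.append("\t" * tabs + " " * rest + line[n:])
    return converted


source = ["  A", "    B"]

# Two readings of the same text, both internally consistent:
print(spaces_to_tabs(source, 2))  # one and two tabs
print(spaces_to_tabs(source, 1))  # two and four tabs
```

Both outputs are valid conversions; which one is right depends on a width that has to be supplied from outside the file.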
The above demonstrates the information loss in using spaces for indentation in a theoretical way. There are also practical problems. In practice, many languages allow string literals like this myName="i love you", and strings can easily have a run of spaces. One cannot simply run a blind find-n-replace operation to replace all spaces with tabs. Also, many unix languages contain a construct called heredoc as a means to embed a literal block of text. For example, here's a PHP heredoc:
$novelText = <<<arbitraryCharsHereAsDelimiter
         (__)
         (oo)
  /-------\/
 / |     ||
*  ||----||
   ~~    ~~
arbitraryCharsHereAsDelimiter;
}
Regardless of its design as a language construct, the purpose of heredoc is that it allows programmers to easily embed a text (a large string) without worrying about the text containing sequences of characters that may be meaningful to the language. If a language has a heredoc construct, then it is basically impossible to convert from spaces to tabs, as that will botch literal strings embedded in heredocs. However, it is less of a problem to convert tabs to spaces, because spaces appear in literal strings far more often than literal tabs do.
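To see why a conversion tool cannot be purely textual, here is a small Python sketch. The function names `blind_replace` and `convert_leading` are invented for this illustration, and a width of 4 spaces per tab is an assumption. It compares a blind find-and-replace with a conversion restricted to leading whitespace, and then shows that even the restricted version still rewrites the body of a heredoc, because it has no idea where string literals begin and end.

```python
import re


def blind_replace(text: str, width: int = 4) -> str:
    """Replace every run of `width` spaces with a tab, anywhere in the text."""
    return text.replace(" " * width, "\t")


def convert_leading(text: str, width: int = 4) -> str:
    """Replace only the spaces at the start of each line."""
    def to_tabs(match):
        n = len(match.group(0))
        return "\t" * (n // width) + " " * (n % width)
    return re.sub(r"^ +", to_tabs, text, flags=re.MULTILINE)


code = 'if ok:\n    name = "i    love you"\n'

# The blind version also rewrites the spaces inside the string literal.
print(repr(blind_replace(code)))

# The leading-only version leaves the string literal alone...
print(repr(convert_leading(code)))

# ...but it still changes literal text inside a heredoc body.
heredoc = "$t = <<<END\n    (oo)\nEND;\n"
print(repr(convert_leading(heredoc)))
```

A safe converter would need to parse the language well enough to skip string and heredoc contents, which is a much bigger job than a regular expression.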
Another practical issue is error recovery. Suppose one uses 4 spaces for an indentation level. Now, it is not uncommon to see lines with irregular space prefixes such as 7 or 10, out of common sloppiness. Such errors happen more often when spaces are used for indentation; the essence is that tabs enforce a semantic association and make a half-indentation impossible.
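Sloppy prefixes of this kind are easy to detect, even if they are not always easy to repair. A minimal Python sketch follows; the function name `irregular_lines` and the default width of 4 are assumptions chosen to match the example in the text.

```python
def irregular_lines(text: str, indent_width: int = 4):
    """Return (line_number, prefix_length) for lines whose leading-space
    count is not a whole multiple of indent_width. Blank lines are skipped."""
    found = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        prefix = len(line) - len(line.lstrip(" "))
        if prefix % indent_width != 0:
            found.append((number, prefix))
    return found


# Sample text with a 7-space and a 10-space prefix (not meant to be run).
sample = (
    "def f():\n"
    "    a = 1\n"
    "       b = 2\n"
    "          c = 3\n"
)

print(irregular_lines(sample))
```

The check only reports where the prefix is off; whether 7 spaces was meant to be one level or two is the same kind of missing information discussed above.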
Q: Well, i just like spaces because they are most compatible.
A: Sure, crass simplicity is always more compatible. Suppose a unixer says he doesn't like HTML because it is fraught with problems and incompatibilities. He'd rather have plain text. And, indeed, a lot of unixers seriously think that.
--------------------------- PS in the answer to the first question, i gave the following examples of IDE/Language that actually embed formatting info in the source code: Borland Delphi, Metrowerks's CodeWarrior, Microsoft Visual Studio, Wolfram Research's Mathematica
actually, i know Mathematica does, but i'm not quite sure about the other examples. So, my question is, does anyone know a language or IDE that actually allows the coder to manually highlight parts of the code and have this highlight stick with the file upon reopening, as in a word processor?
Xah xah@xahlee.org http://xahlee.org/
> Tabs versus Spaces in Source Code > This post is archived at: > http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html
Mumia W. - 23 May 2006 14:19 GMT > the following are 2 FAQ following this thread. Thanks. > [quoted text clipped - 12 lines] > various declarations embedded in the file. Some of these systems are > good designs and some are hacks. Vim's mode-lines do this too.
> Somehow tech geekers have the sense that “source code” must be a > plain text file containing nothing else but the programing code. This [quoted text clipped - 3 lines] > and its source code may have developed into more powerful and flexible > integrated systems as the above standardized examples. The tech geekers have thought about it. Donald Knuth invented TeX, and went on to invent the WEB literate programming system. You don't get any geekier than that :)
> For instance, > many commercial development systems actually already have such [quoted text clipped - 12 lines] > part of the semantics, not just for appearance. Now, suppose you have > these two lines: I'd say that such a language removes the choice of whether to use tabs or spaces, and the discussion is over when you don't have a choice.
> 1234567890 > A [quoted text clipped - 5 lines] > represents, where n is the smallest space prefix in the code, unless n > == 1. vim: tabstop=4
The argument for spaces over tabs says that you have to include some metadata in order for the document to look right on other people's computers if you use tabs. This example, plus my example mode-line for vim, reinforces that idea IMO.
> The above demonstrates the information loss in using spaces for > indentation in a theoretical way. There are also practical problems. In [quoted text clipped - 14 lines] > arbitraryCharsHereAsDelimiter; > } Yes, there are lots of situations like this where you can't just willy-nilly convert between tabs and spaces. But even this case shows that, if you use consistent tab widths, the text has a chance of surviving. I converted your little doggie to and from text with tab sizes of eight, and he survived. (I did it with tabs set to four too, and it worked.)
> Regardless of its design as a language construct, the purpose of > “heredoc” is that it allows programers to easily embed a text (a > large string), without worrying about the text containing sequence of > characters that may be meaningful to the language. If a language has > heredoc construct, then it is basically impossible to convert from > spaces to tabs, as that will botch literal string embedded in heredoc. Yes it would. Upon printing, if the terminal tab width was set to eight, but the text conversion was done with tabs at four, bye bye doggie.
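The round trip described here can be reproduced in a few lines. This is a sketch in Python under stated assumptions: `entab_leading` is a hypothetical helper, and only the first three lines of the picture are used. Leading spaces are converted to tabs at one width, then expanded again at the same width and at a different one.

```python
def entab_leading(line: str, width: int) -> str:
    """Replace leading spaces with tabs, assuming `width` columns per tab."""
    n = len(line) - len(line.lstrip(" "))
    return "\t" * (n // width) + " " * (n % width) + line[n:]


art = [
    "         (__)",
    "         (oo)",
    "  /-------\\/",
]

tabbed = [entab_leading(line, 4) for line in art]

# Expanding with the same width the conversion used restores the picture.
same_width = [line.expandtabs(4) for line in tabbed]
print(same_width == art)

# Expanding with a different width (say a terminal set to 8) does not.
other_width = [line.expandtabs(8) for line in tabbed]
print(other_width == art)
for line in other_width:
    print(line)
```

With matching widths the text survives; with a mismatch the head drifts to the right of the body, which is the "bye bye doggie" case.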
> However, it is less of a problem to convert tabs to spaces, because the > frequency of spaces appearing in literal strings are far higher than [quoted text clipped - 6 lines] > essence is that tabs enforce a semantic association and is impossible > to make a half-indentation. What I've learned is that, if I'm going to use tabs for indentation, I have to be consistent.
> Q: Well, i just like spaces because they are most compatible. > [quoted text clipped - 8 lines] > Borland Delphi, Metrowerks's CodeWarrior, Microsoft Visual Studio, > Wolfram Research's Mathematica Perl's POD and Java's javadoc do it too.
> actually, i know Mathematica does, but i'm not quite sure about the > other examples. So, my question is, does any one knows a language or [quoted text clipped - 9 lines] >> This post is archived at: >> http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html I'm slowly moving into the "spaces" camp. After reading your earlier post on tabs vs. spaces and other people's responses, I began thinking about why I like tabs so much, and there is only one answer--backspace.
If I use tabs, when I backspace I go back to the previous tab position, which is what I want. With spaces, I have to hit the backspace key several times to get back. That's it--one feature is the only reason I like tabs, so I decided to investigate vim's features to see if vim would let me backspace to the previous tab position with one keystroke.
'Softtabstop' (sts) is the feature. I would have never thought to look for this feature without your post. Thanks again Xah.
Your posts are on topic, informative, engaging and necessary. Keep them coming Xah. :)