Java Forum / General / December 2005
ligatures in Java 2D
Des Small - 16 Dec 2005 15:35 GMT
I've been trying to work out how to get Java to handle ligatures such as "fi" correctly, but without much success. Various online documents have suggested that I need to be intervening in the rendering process and laying out my own GlyphVectors based on information associated with Fonts, but I can't find out the details I need to actually do that systematically.
I have discovered (<http://mindprod.com/jgloss/ligature.html>) that 'fi' has its own code point (\ufb01) in Unicode - should I be manually checking whether my Font has specific glyphs for such ligatured characters and manually substituting them into my char[] to feed to layoutGlyphVector? Do Fonts come with lists of ligatures they support, and if so how do I get at them?
And most of all, what is TFM that I am currently foolishly neglecting to R?
Des
Roedy Green - 16 Dec 2005 16:25 GMT
>I have discovered (<http://mindprod.com/jgloss/ligature.html>) that
>'fi' has its own code point (\ufb01) in Unicode - should I be manually
>checking whether my Font has specific glyphs for such ligatured
>characters and manually substituting them into my char[] to feed to
>layoutGlyphVector? Do Fonts come with lists of ligatures they
>support, and if so how do I get at them?
It is up to you to use them if they exist. You have to check manually, because the automatic method for checking whether a char renders LIES. See http://mindprod.com/jgloss/font.html
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Thomas Weidenfeller - 16 Dec 2005 16:31 GMT
> I've been trying to work out how to get Java to handle ligatures such
> as "fi" correctly, but without much success.
Correctly? Well, ligatures are evil :-) They are different in different languages (the sets used are different), and they are used for a number of purposes (e.g. to get a more pleasing typographic view, but also as semantic elements of languages).
Let's just assume you want to do it for aesthetic reasons. But that's still difficult to automate, because it depends on language rules.
> Various online documents
> have suggested that I need to be intervening in the rendering process
I guess this assumes that you intend to emulate ligatures by using the corresponding normal glyphs and moving them closer together. This works in some fonts, but looks extremely ugly in others.
You can simply move glyphs closer together by placing one glyph at a time, e.g. via Graphics.drawString() with single-character strings. This is inefficient.
> and laying out my own GlyphVectors based on information associated
> with Fonts, but I can't find out the details I need to actually do
> that systematically.
A loop, a font-specific table to look up how close you want to move two glyphs, some variables to keep track of how far you must move the remaining parts of a GlyphVector once you move one glyph, and GlyphVector.setGlyphPosition().
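The loop-plus-table approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name PairTightener and the tightening table are hypothetical, the amounts in the table would have to be tuned by hand for each font, and the sketch assumes one glyph per character, which holds for simple Latin text but not in general.

```java
import java.awt.font.GlyphVector;
import java.awt.geom.Point2D;
import java.util.Map;

public class PairTightener {
    /**
     * For each character, returns how far it must move left in total.
     * The table maps two-character strings such as "fi" to the amount
     * (in user-space units) by which that pair should be tightened.
     */
    public static double[] cumulativeShifts(String text, Map<String, Double> tighten) {
        double[] shift = new double[text.length()];
        double total = 0.0;
        for (int i = 1; i < text.length(); i++) {
            // Every tightened pair also drags along all glyphs that follow it.
            total += tighten.getOrDefault(text.substring(i - 1, i + 1), 0.0);
            shift[i] = total;
        }
        return shift;
    }

    /** Applies the shifts to an existing GlyphVector (assumes one glyph per char). */
    public static void apply(GlyphVector gv, double[] shift) {
        int n = Math.min(gv.getNumGlyphs(), shift.length);
        for (int i = 0; i < n; i++) {
            Point2D p = gv.getGlyphPosition(i);
            gv.setGlyphPosition(i, new Point2D.Double(p.getX() - shift[i], p.getY()));
        }
    }
}
```

A GlyphVector to feed into apply() could come from Font.createGlyphVector(frc, text), and the adjusted vector can then be drawn with Graphics2D.drawGlyphVector().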
> I have discovered (<http://mindprod.com/jgloss/ligature.html>) that
> 'fi' has its own code point (\ufb01) in Unicode - should I be manually
> checking whether my Font has specific glyphs for such ligatured
> characters and manually substituting them into my char[] to feed to
> layoutGlyphVector?
If your font provides glyphs for ligatures, it is of course better to use these than to emulate ligatures with single glyphs. And that indeed means you have to get the corresponding Unicode characters into your char[].
However, the problem is how to automate this. E.g. in my mother tongue the rules for using ligatures are partly related to hyphenation rules. Any automatic replacement of character sequences with corresponding ligatures would have to take these into account to get things right.
> Do Fonts come with lists of ligatures they
> support, and if so how do I get at them?
I am not aware that typical font file formats contain separate lists for ligatures. However, you don't need this. Characters in Java are Unicode-encoded. All you need to know are the Unicode code points for the ligatures you want to support. Check the tables at www.unicode.org; there aren't many. Then you need to check whether a particular font contains matching glyphs. Font.canDisplay() and Font.canDisplayUpTo() are your friends.
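As a rough sketch of that substitution approach: the class below replaces f-ligature sequences with their Unicode presentation forms (U+FB00 to U+FB04), but only where a caller-supplied check says the code point is displayable. The class name LigatureSubstituter is made up for illustration, and the check is passed in as a predicate so that a more trustworthy test than Font.canDisplay() can be swapped in if that method turns out to be unreliable for a given font.

```java
import java.util.function.IntPredicate;

public class LigatureSubstituter {
    // Longest sequences first, so that "ffi" wins over "ff" and "fi".
    private static final String[][] LIGATURES = {
        {"ffi", "\uFB03"}, {"ffl", "\uFB04"},
        {"ff", "\uFB00"}, {"fi", "\uFB01"}, {"fl", "\uFB02"},
    };

    /** Replaces ligature sequences whose presentation form passes the check. */
    public static String substitute(String text, IntPredicate canDisplay) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            boolean replaced = false;
            for (String[] lig : LIGATURES) {
                if (text.startsWith(lig[0], i) && canDisplay.test(lig[1].charAt(0))) {
                    out.append(lig[1]);
                    i += lig[0].length();
                    replaced = true;
                    break;
                }
            }
            if (!replaced) {
                out.append(text.charAt(i++));
            }
        }
        return out.toString();
    }
}
```

With a real font this might be called as substitute(text, cp -> font.canDisplay(cp)). Note that it knows nothing about language-specific rules such as the hyphenation constraint mentioned above.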
> And most of all, what is TFM that I am currently foolishly neglecting > to R? /Thomas
 Signature The comp.lang.java.gui FAQ: ftp://ftp.cs.uu.nl/pub/NEWS.ANSWERS/computer-lang/java/gui/faq http://www.uni-giessen.de/faq/archiv/computer-lang.java.gui.faq/
Des Small - 16 Dec 2005 16:56 GMT > > I've been trying to work out how to get Java to handle ligatures such > > as "fi" correctly, but without much success. [quoted text clipped - 6 lines] > Let's just assume you want to do it for aesthetic reasons. But that's > still difficult to automate, because it depends on language rules. That's OK. I mean, it's fiddly, but it's OK.
> > Various online documents > > have suggested that I need to be intervening in the rendering process > > I guess this assumes that you intend to emulate ligatures by using the > corresponding normal glyphs and move them closer together. This only > works in some fonts, but looks extremely ugly in others. No; I'm specifically thinking of <http://java.sun.com/j2se/1.3/docs/guide/2d/spec/j2d-fonts.fm5.html> which confidently announces that "In Figure 4-14, the custom layout algorithm replaces the fi substring with the ligature _fi_". Without giving any hints as to how such a custom layout algorithm could be implemented.
[...]
> > I have discovered (<http://mindprod.com/jgloss/ligature.html>) that
> > 'fi' has its own code point (\ufb01) in Unicode - should I be manually
[quoted text clipped - 6 lines]
> indeed means you have to get the corresponding unicode into your
> char[].
OK. It seems slightly odd, though, that current Java handles the complexities of Arabic and Hindi writing but neglects to provide any hook into this for Latin. (The documentation intermittently boasts that the fancy i18n stuff also allows convenient high-end typography of Latin scripts, but I find, as you are suggesting, this not to be the case.)
> However, the problem is how to automate this. E.g. in my mother tongue > the rules for using ligatures are partly related to hyphenation > rules. Any automatic replacement of character sequences with > corresponding ligatures would have to take these into account to get > things right. If you mean German and especially "ß", then that is certainly more ligaturisation than I plan to automate. (In particular, I'm not implementing hyphenation, so I would treat it as a separate character.)
> > Do Fonts come with lists of ligatures they support, and if so how
> > do I get at them?
[quoted text clipped - 6 lines]
> particular font contains matching glyphs. Font.canDisplay() and
> Font.canDisplayUpTo() are your friends.
Fair enough. If that's how it is, that's what I'll do.
Des
Roedy Green - 17 Dec 2005 03:10 GMT >No; I'm specifically thinking of ><http://java.sun.com/j2se/1.3/docs/guide/2d/spec/j2d-fonts.fm5.html> >which confidently announces that "In Figure 4-14, the custom layout >algorithm replaces the fi substring with the ligature _fi_". Without >giving any hints as to how such a custom layout algorithm could be >implemented. See the notes I have added on GlyphVectors at http://mindprod.com/jgloss/ligature.html
Chris Uppal - 17 Dec 2005 13:39 GMT > No; I'm specifically thinking of > <http://java.sun.com/j2se/1.3/docs/guide/2d/spec/j2d-fonts.fm5.html> > which confidently announces that "In Figure 4-14, the custom layout > algorithm replaces the fi substring with the ligature _fi_". Without > giving any hints as to how such a custom layout algorithm could be > implemented. I get the impression that this stuff is still all "work in progress", and the level of pluggability that the architecture promises has not (as yet) been exposed in the public API. As of now, the only route /I/ can see for using a custom layout algorithm is to create your own subclass of GlyphVector which allows you to fill in the details as desired. That would (as far as I can see) be a /lot/ of work. Check the source for sun.font.StandardGlyphVector (it's part of the platform source, but not present in src.zip) for an idea of how much work. And then too, you'd have to side-step the convenience methods for text handling, and do everything at the lowest level of the API. It seems to me that it would be /much/ easier just to substitute the 7 defined ligature Unicode characters into your text and let the normal processing handle it.
-- chris
Thomas Weidenfeller - 19 Dec 2005 08:10 GMT
> If you mean German and especially "ß", then that is certainly more
> ligaturisation than I plan to automate.
No, "ß" is yet another case. It started out as a ligature but is pretty much treated as a single character these days. In contrast to ligatures, there is no voluntary replacement of two single "s" with one "ß".
What I mean, in a simplified form (the real rules are even trickier), is that one doesn't use a ligature if there could possibly be a hyphenation between the original letters. There need not be an actual hyphenation, just the possibility of one.
/Thomas
Thomas Weidenfeller - 19 Dec 2005 08:32 GMT Forgot to answer that part:
>>I guess this assumes that you intend to emulate ligatures by using the >>corresponding normal glyphs and move them closer together. This only >>works in some fonts, but looks extremely ugly in others. > > No; I'm specifically thinking of > <http://java.sun.com/j2se/1.3/docs/guide/2d/spec/j2d-fonts.fm5.html> But that's pretty much what they suggest :-) This magic custom layout algorithm they are talking about would exactly have to be an algorithm that moves otherwise independent glyphs (Font data) together, so they look like a ligature (or draws glyphs from some other source which represent ligatures).
> which confidently announces that "In Figure 4-14, the custom layout
> algorithm replaces the fi substring with the ligature _fi_". Without
> giving any hints as to how such a custom layout algorithm could be
> implemented.
Of course not - because it is difficult. I consider that paragraph in the guide to be marketing blah-blah for showing off. You can summarise the paragraph as follows:
/If you want some special handling you can't use the usual drawString() methods (which you can notice in the first figure). Instead, you are completely on your own. You somehow have to create a GlyphVector with some magic algorithm./
The algorithms they are talking about are algorithms to do proper typesetting. You have a stream of characters, font attributes, etc. as input, and the algorithm has to properly size, color, position (and whatever) the corresponding glyphs; the result is recorded in a GlyphVector.
I know of one guy who has published such an algorithm: Knuth for his TeX typesetting system, and I seem to remember that the Unix troff authors, as well as the GNU groff authors published some information about their typesetting algorithms, too.
I doubt it is worth the effort to implement these just so you can emulate ligatures.
If I needed to emulate ligatures, I would consider manipulating an already generated GlyphVector and try to reposition glyphs in the vector to a very limited extent.
/Thomas
Des Small - 19 Dec 2005 09:23 GMT > Forgot to answer that part: > [quoted text clipped - 9 lines] > look like a ligature (or draws glyphs from some other source which > represent ligatures). I'm after the latter. High quality fonts typically come with a stock of such glyphs. I want to use them on screen.
> > which confidently announces that "In Figure 4-14, the custom layout
> > algorithm replaces the fi substring with the ligature _fi_". Without
[quoted text clipped - 9 lines]
> figure). Instead, you are completely on your own. You somehow have to
> create a GlyphVector with some magic algorithm./
Yes; however the marketroids left the impression there would be hooks into the code for this. In fact, the Font class has no protocol for asking about ligature glyphs. There must be an internal API for this, since it is (as the marketroids observe) formally the same problem as rendering Arabic text, and Java can indeed render Arabic text. It is just that the APIs aren't public. (Another poster suggested that this stuff is work in progress, but the APIs don't seem to have changed since 1999.)
> The algorithms they are talking about are algorithms to do proper
> typesetting. You have a stream of characters, font attributes, etc. as
[quoted text clipped - 9 lines]
> I have my doubts if it is worth the effort to implement these so you
> can emulate ligatures.
I don't want to "emulate" ligatures; I want to just plain have and use ligatures. I want a text-editor with decent rendering of text - when writing I spend more time looking at text on the screen than anywhere else, and it seems to me worth optimising for that. (I don't want WYSIWYG; I don't care about WYSIWYG and I usually typeset with TeX.)
Under the circumstances the non-support from Java does indeed mean that it isn't worth doing in Java. Accordingly, I am currently investigating more hospitable platforms for this project.
Thanks very much for all your help, though.
Des
John C. Bollinger - 20 Dec 2005 02:24 GMT >>But that's pretty much what they suggest :-) This magic custom layout >>algorithm they are talking about would exactly have to be an algorithm [quoted text clipped - 4 lines] > I'm after the latter. High quality fonts typically come with a stock > of such glyphs. I want to use them on screen. And the problem with that is what?
[...]
> Yes; however the marketroids left the impression there would be hooks > into the code for this. In fact, the Font class has no protocol for [quoted text clipped - 4 lines] > stuff is work in progress, but the APIs don't seem to have changed > since 1999.) [...]
> I don't want to "emulate" ligatures; I want to just plain have and use > ligatures. I want a text-editor with decent rendering of text - when [quoted text clipped - 5 lines] > that it isn't worth doing in Java. Accordingly, I am currently > investigating more hospitable platforms for this project. But you still haven't explained (that I have been able to follow) what specific features you find lacking. Perhaps I'm missing it because my native writing system doesn't use any ligatures, but you seem to have started this thread with a view toward a very complicated way of handling what may be a very simple problem. To wit: what's wrong with using the appropriate Unicode characters for the ligatures you want, and relying on the Font to render them correctly? (You do assert that high-quality fonts will have the glyphs.) Why do you need to worry about glyph vectors and layout details? It may be that the lack of docs you complained about arises from there being nothing different about rendering the Unicode character for a ligature than there is for rendering any other arbitrary Unicode character.
 Signature John Bollinger jobollin@indiana.edu
Thomas Weidenfeller - 20 Dec 2005 09:08 GMT
>> I'm after the latter. High quality fonts typically come with a stock
>> of such glyphs. I want to use them on screen.
>
> And the problem with that is what?
To me, the OP's problem sounds as follows:
A font, as part of its data, provides information about mappings of charsets to the glyphs contained in the font. Not every such mapping can point to all glyphs in a font. E.g. an ISO Latin 1 mapping provided by a font would of course only point to the glyphs in the font which are part of ISO Latin 1. A Unicode mapping can only point to glyphs in the font which have a place in Unicode.
Now, Unicode defines very few ligatures, far fewer than one would need in serious typesetting. Fonts (apparently his fonts) can contain glyphs for many more ligatures than Unicode knows about.
But he can't get to these additional (ligature) glyphs in the font, because they have no Unicode code point, are therefore not in a font's Unicode-to-glyph mapping, and are therefore not addressable by Java. Because Java only deals with Unicode.
I just learned that (contrary to what I originally wrote) high quality fonts even contain "glyph substitution" data: lists which provide a mapping of character combinations to special glyphs for these combinations, e.g. ligatures. Java provides no API to access such lists. And even if it did, the resulting glyph has no Unicode code point, so it could not be represented in a Java char.
So, from a typesetting point of view, Java's Font API is already very poor.
He also can't simply circumvent the Font API and somehow create his own GlyphVector, at least not without some huge effort. From what I see, doing this would require (a) writing his own font file reader, capable of getting all the necessary information out of a font file, and (b) writing a layout algorithm to place glyph data in a GlyphVector, which in turn requires (c) writing his own font renderer to interpret the glyph data in a font and convert it to Java glyph information.
> Perhaps I'm missing it because my > native writing system doesn't use any ligatures, I would be very surprised if you haven't seen your native language (American English?) printed with ligatures. It is a matter of aesthetic typesetting. High-quality newspaper, magazine or book printers for sure use ligatures.
Do you happen to have the GoF book at hand? For example, the first paragraph of the first page (xi) of the preface contains a ligature. "This book assumes you are reasonably proficient ...". See the "fi" in "proficient"? IMHO it is printed as a ligature.
> started this thread with a view toward a very complicated way of > handling what may be a very simple problem. To wit: what's wrong with > using the appropriate Unicode characters for the ligatures you want, There aren't any Unicode characters.
/Thomas
Roedy Green - 20 Dec 2005 10:31 GMT On Tue, 20 Dec 2005 10:08:34 +0100, Thomas Weidenfeller <nobody@ericsson.invalid> wrote, quoted or indirectly quoted someone who said :
>I just learned that (contrary to what I originally wrote) high quality >fonts even contain "glyph substitution" data. Lists which provide a >mapping of character combinations to special glyphs for these >combinations, e.g. ligatures. Java provides no API to access such lists. >And even it it would, the resulting glyph has no Unicode code point, so >could not be represented in a Java char. Those could be done by mapping the ligatures into the unicode private area. When I was reading up on ligatures a while back that is how they were supposed to be handled. Unicode itself wanted to wash its hands of them.
A curiosity itch that is getting more and more insistent is to find out just what information is encoded in a font, and how the mapping from Unicode to glyph gets handled. Are there 8 and 16 bit fonts, or are 16 bit fonts faked with multiple 8-bit fonts and duct tape? I have been looking at fonts on the net. The glyphs often seem to be placed in quite arbitrary slots. How do you get them moved where you want them?
How can you tell what glyphs you really have available?
Chris Uppal - 20 Dec 2005 12:11 GMT
> Do you happen to have the GoF book at hand? For example the first
> paragraph of the first page (xi) of the preface contains a ligature.
> "This book assumes you are reasonably proficient ...". See the "fi" in
> "proficient"? IMHO it is printed as a ligature.
Oddly enough, my copy of GoF (11th printing, May '97) does not appear to use ligatures. The f and i in that word are clearly separated, and the i has a dot of its own.
-- chris
Des Small - 20 Dec 2005 10:02 GMT > >>But that's pretty much what they suggest :-) This magic custom layout > >>algorithm they are talking about would exactly have to be an algorithm [quoted text clipped - 5 lines] > > And the problem with that is what? The problem with that is there is no protocol to ask fonts whether they do or not have such ligatures.
[...]
> > I don't want to "emulate" ligatures; I want to just plain have and use
> > ligatures. I want a text-editor with decent rendering of text - when
[quoted text clipped - 7 lines]
> But you still haven't explained (that I have been able to follow) what
> specific features you find lacking.
The correct model is to produce a GlyphVector, which includes all the ligatured glyphs, from a sequence of logical characters, which is entirely oblivious to ligatures. What I lack is hooks into that process; what you are suggesting is to munge the characters instead, which is a presumably workable hack for some cases, but see below.
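To make that model concrete, here is a minimal sketch of the substitution step working on glyph codes rather than characters. Everything specific in it is an assumption: the class name GlyphSubstitution is invented, the glyph ids in any table would be hypothetical, and the ligature pairs would have to come from font data that Java's public API does not expose. The public API does offer Font.createGlyphVector(FontRenderContext, int[]), which accepts font-specific glyph codes, so the output of a step like this could in principle be handed to it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GlyphSubstitution {
    // Maps a (first glyph, second glyph) pair to the ligature glyph replacing it.
    private final Map<Long, Integer> pairs = new HashMap<>();

    private static long key(int first, int second) {
        return ((long) first << 32) | (second & 0xFFFFFFFFL);
    }

    public void addLigature(int first, int second, int ligature) {
        pairs.put(key(first, second), ligature);
    }

    /**
     * Scans left to right, merging the last emitted glyph with the next one
     * whenever the table has an entry. Because the merged glyph can merge
     * again, chains such as f + f -> ff, then ff + i -> ffi work.
     */
    public int[] apply(int[] glyphs) {
        List<Integer> out = new ArrayList<>();
        for (int g : glyphs) {
            int last = out.size() - 1;
            Integer lig = last >= 0 ? pairs.get(key(out.get(last), g)) : null;
            if (lig != null) {
                out.set(last, lig);
            } else {
                out.add(g);
            }
        }
        int[] result = new int[out.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = out.get(i);
        }
        return result;
    }
}
```

The logical character sequence stays untouched in this scheme; only the glyph sequence derived from it changes, which is the separation asked for above.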
I don't think it's unreasonable to want what I want, and it certainly isn't unanticipated: the Java official documentation says: "In Figure 4-14, the custom layout algorithm replaces the fi substring with the ligature fi." <http://java.sun.com/j2se/1.3/docs/guide/2d/spec/j2d-fonts.fm5.html#67469>
That's what I want to do. Write such a "custom layout algorithm".
> Perhaps I'm missing it because my native writing system doesn't use > any ligatures, but you seem to have started this thread with a view > toward a very complicated way of handling what may be a very simple > problem. To wit: what's wrong with using the appropriate Unicode > characters for the ligatures you want, and relying on the Font to > render them correctly? They are essentially deprecated:
""" The existing ligatures exist basically for compatibility and round-tripping with non-Unicode character sets. Their use is discouraged. No more will be encoded in any circumstances. """ <http://www.unicode.org/faq/ligature_digraph.html>
It's the rendering engine's job to do this stuff. The current Java rendering engine does not do this stuff. The current Java rendering engine does not have hooks to allow me to do this stuff. (Yes, I can feed it deprecated characters. No, I'm not going to.)
Note that some fonts have a fancy 'st' ligature which has no Unicode codepoint and which isn't going to get one.
> (You do assert that high-quality fonts will have the glyphs.) Why > do you need to worry about glyph vectors and layout details? It may > be that the lack of docs you complained about arises from there > being nothing different about rendering the Unicode character for a > ligature than there is for rendering any other arbitrary Unicode > character. There isn't just a lack of docs, there's a lack of hooks into the character rendering pipeline.
Just to put the matter beyond reasonable doubt, though, I have decided that the showpiece for my implementation will be the correct handling of Fraktur writing, which has many ligatures not encoded in Unicode code-points.
Des
Roedy Green - 20 Dec 2005 10:33 GMT >The problem with that is there is no protocol to ask fonts whether >they do or not have such ligatures. It sounds like it MIGHT be encoded in some fonts. The problem is Java is not handing the information to you on a plate. Presumably you can go analyse the font yourself.
Chris Uppal - 20 Dec 2005 12:10 GMT
> The problem with that is there is no protocol to ask fonts whether
> they do or not have such ligatures.
If you were content to use the 7+4 common ligatures found in English text then you could use Font.canDisplay() with the corresponding Unicode code point. I know you are not content to limit yourself thusly, but just for completeness.
> [The Unicode English text ligatures] are essentially deprecated:
"Deprecated" is the wrong word. It is true that they are purely presentational, and that that does not fit with what Unicode is intended for. Nevertheless, they exist, and they always /will/ exist (Unicode code points never change once assigned). They are indeed intended for legacy purposes, and while they probably should not appear in Unicode /text/ (i.e. shared between applications), using them internally to an application is a legitimate use. In this case, using them to "talk" to the Java font APIs is legitimate.
For reference, Unicode defines ligatured characters for the following: ff, fi, fl, ffi, ffl, ſt (long s + t, which looks much like ft) and st, at U+FB00 to U+FB06. It also defines ae, oe, AE, and OE (plus various accented derivatives). This is a superset of the ligatures in the Adobe "Standard Roman" + "Expert" character sets.
> It's the rendering engine's job to do this stuff. Agreed.
> The current Java rendering engine does not do this stuff. Agreed. Although it does have the hooks internally, as I said before they are not (yet) exposed in the public API.
> The current Java rendering engine does not have hooks to allow me to > do this stuff. (Yes, I can feed it deprecated characters. No, I'm > not going to.) If the Unicode-derived restrictions are too much for you then the existing API is undoubtedly inadequate. But if it /would/ suffice if you were willing to use it, then refusing to do so just because you think the character assignments are "deprecated" would be a mistake.
> Note that some fonts have a fancy 'st' ligature which has no Unicode > codepoint and which isn't going to get one. It already has one, U+FB06. ;-)
> Just to put the matter beyond reasonable doubt, though, I have decided > that the showpiece for my implementation will be the correct handling > of Fraktur writing, which has many ligatures not encoded in Unicode > code-points. Which confirms that the Java standard font stuff is not suitable for your purposes.
-- chris
Des Small - 20 Dec 2005 12:38 GMT > > The problem with that is there is no protocol to ask fonts whether > > they do or not have such ligatures. [quoted text clipped - 3 lines] > Unicode code point. I know you are not content to limit yourself > thusly, but just for completeness. Noted. [...]
> > It's the rendering engine's job to do this stuff.
>
[quoted text clipped - 4 lines]
> Agreed. Although it does have the hooks internally, as I said
> before they are not (yet) exposed in the public API.
From 1999 to 2005 is a lot of yet, in my considered opinion. And Java's roadmap doesn't look very client-bound.
> > The current Java rendering engine does not have hooks to allow me to
> > do this stuff. (Yes, I can feed it deprecated characters. No, I'm
[quoted text clipped - 5 lines]
> you think the character assignments are "deprecated" would be a
> mistake.
Well, it's uglier than that. Since I also want goodies like search and replace, I have to keep a spare copy of the unmunged text as well as a munged copy for the typesetting, outside of the rendering pipeline where we both (and the docs) agree it properly belongs.
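For the Unicode-encoded ligatures only, the unmunged text can be recovered from the munged copy, because U+FB00 to U+FB06 carry compatibility decompositions that java.text.Normalizer (available from Java 6) will expand. The sketch below is an assumption-laden illustration rather than a full answer: the class name LogicalText is invented, it does nothing for ligatures that have no Unicode code point, and it does not map offsets between the two strings, which a real search-and-replace would also need.

```java
import java.text.Normalizer;

public class LogicalText {
    /** Expands the Latin ligature presentation forms U+FB00..U+FB06 back to plain letters. */
    public static String fromDisplay(String display) {
        StringBuilder out = new StringBuilder(display.length() + 8);
        for (int i = 0; i < display.length(); i++) {
            char c = display.charAt(i);
            if (c >= '\uFB00' && c <= '\uFB06') {
                // NFKD applies the compatibility decomposition, e.g. U+FB01 -> "fi".
                out.append(Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFKD));
            } else {
                // Everything else is left alone, so other compatibility
                // characters are not silently rewritten.
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Restricting the expansion to that small range is a deliberate choice here: normalizing the whole string would also change unrelated characters.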
> > Note that some fonts have a fancy 'st' ligature which has no Unicode > > codepoint and which isn't going to get one. > > It already has one, U+FB06. ;-) D'oh, my bad! How about <fj> then? It's all the rage in Scandiwegian.
> > Just to put the matter beyond reasonable doubt, though, I have decided > > that the showpiece for my implementation will be the correct handling [quoted text clipped - 3 lines] > Which confirms that the Java standard font stuff is not suitable for > your purposes. Vigorous agreement there. I didn't especially come here for an argument, I thought my information might have been out of date or incomplete.
Thanks to everyone who has helped clear this up.
Des
Roedy Green - 20 Dec 2005 21:35 GMT On Tue, 20 Dec 2005 12:10:44 -0000, "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> wrote, quoted or indirectly quoted someone who said :
>For reference, Unicode defines ligatured characters for the following: > ff [quoted text clipped - 7 lines] >derivatives). This is a superset of the ligatures in the Adobe "Standard >Roman" + "Expert" character sets. For future reference, these are documented at http://mindprod.com/jgloss/ligature.html
Roedy Green - 20 Dec 2005 21:37 GMT On Tue, 20 Dec 2005 12:10:44 -0000, "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> wrote, quoted or indirectly quoted someone who said :
>> Just to put the matter beyond reasonable doubt, though, I have decided
>> that the showpiece for my implementation will be the correct handling
[quoted text clipped - 3 lines]
>Which confirms that the Java standard font stuff is not suitable for your
>purposes.
To handle that in Java you could modify the encoding of the font to insert the ligature glyphs into the private area of Unicode, if they were not already there. Then you would have to use the roll-your-own GlyphVector technique.
Roedy Green - 20 Dec 2005 05:27 GMT
> In fact, the Font class has no protocol for
>asking about ligature glyphs.
Let's say you were to invent such an API. You would have to give the Unicode slot where the ligature was, often in the private area. You would also have to specify what pair of characters it was intended to replace.
Then you have things like the German ß which does its own thing.
Then consider Arabic. Arrgh as you run screaming from the room.
Suppose you then came back and solved it. You would then realise that your table would somehow have to be embedded in every font, and that you would have to convince all the font software makers to support your addition and all font designers to fill it in.
What you will have to do is get a copy of any new font, examine it manually, and add it to your ligaturiser code.
Consider too that you might not WANT to use some ligatures. They may look too archaic.
What you perhaps want instead is a tool that lets your user examine a font for ligatures, so the end user can decide whether to use them, and for which pairs. It might report the findings back to a central site. You could then build a consensus table to use as the defaults.
Surely you are not the first person to want to handle ligatures. Perhaps if you study the OpenType format there is something in there. You might examine the font file directly or use some platform specific tool to get you more font trivia.
I am not optimistic. Font.canDisplay LIES. It returns true even if you just get a blob back. If they can't even get that right, what hope is there for ligatures?
Des Small - 20 Dec 2005 10:16 GMT > > In fact, the Font class has no protocol for > >asking about ligature glyphs. [quoted text clipped - 3 lines] > would also have to specify what pair of characters it was intended to > replace. Yes, of course.
> Then you have things like the German ß which does its own thing. German ß is usually considered a character in its own right; iso-latins give it its own codepoint.
> Then consider Arabic. Arrgh as you run screaming from the room. Arabic isn't really that bad, if you abandon a naive belief in a 1-1 mapping of characters and glyphs. Since I have abandoned such a belief, I am not very intimidated. What I am asking does indeed amount to a recognition that proper typesetting of even English requires comparable resources to typesetting Arabic. But that's because it does.
> You then came back, and solved it, then you realised that somehow your
> table would have to be embedded in every font, and you would have to
> convince all the Font software makers to support your addition and all
> Font designers to fill it in.
They do. That's the point. They already do. Consider the following fragment of the Adobe Font Metrics (afm) file for a passing Palatino:
"""
C 102 ; WX 333 ; N f ; B 23 -3 341 728 ; L i fi ; L l fl ;
"""
This is the letter 'f', and the Ls say how it combines with 'i' and 'l' to form ligatures. This is typical of Adobe's Type 1 fonts.
> What you will have to do is get a copy of any new font, examine it > manually, and add it to your ligaturiser code. No, it can be done programmatically from afm files (or equivalent data included with, say, TrueType fonts).
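As a small illustration of reading that data programmatically, the sketch below pulls the glyph name and the ligature pairs out of a single character-metrics line like the Palatino one quoted above. The class name AfmLigatures is made up, and the parser handles only the semicolon-separated "N name" and "L successor ligature" fields of one line; a real AFM reader would also have to find the character-metrics section and cope with the rest of the format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AfmLigatures {
    /** Returns the glyph name from the "N name" field, or null if there is none. */
    public static String glyphName(String line) {
        for (String field : line.split(";")) {
            String[] tok = field.trim().split("\\s+");
            if (tok.length == 2 && tok[0].equals("N")) {
                return tok[1];
            }
        }
        return null;
    }

    /** Returns successor glyph name -> ligature glyph name from the "L" fields. */
    public static Map<String, String> ligatures(String line) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String field : line.split(";")) {
            String[] tok = field.trim().split("\\s+");
            if (tok.length == 3 && tok[0].equals("L")) {
                result.put(tok[1], tok[2]);
            }
        }
        return result;
    }
}
```

For the quoted line, glyphName() yields "f" and ligatures() yields the pairs i -> fi and l -> fl, which is the information a ligature table for that font would be built from.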
> Consider too that you might not WANT to use some ligatures. They may > look too archaic. That's a crossable bridge, if I come to it.
> What perhaps you want instead is a tool for you user to examine a font > for ligatures and the end user can decide if they want to use them, > and for what pair. It might report back to central on their findings. > You can build a consensus table to use as the defaults. Maybe. But at the moment the user is me, and I want them all.
> Surely you are not the first person to want to handle ligatures. I can Google up other forlorn unsuccesses with Java. At the moment I'm working on parsing Type 1 font afm files, and the most hospitable framework seems to be the Gnome project's Pango, which specialises in internationalised text and already has an Arabic shaper module.
Des
Thomas Weidenfeller - 20 Dec 2005 10:28 GMT > I can Google up other forlorn unsuccesses with Java. At the moment > I'm working on parsing Type 1 font afm files, and the most hospitable > framework seems to be the Gnome project's Pango, which specialises in > internationalised text and already has an Arabic shaper module. Deep inside somewhere in Apache FOP should also be some classes to read TTF and Type 1 fonts.
/Thomas
Roedy Green - 20 Dec 2005 10:36 GMT >I can Google up other forlorn unsuccesses with Java. At the moment >I'm working on parsing Type 1 font afm files, and the most hospitable >framework seems to be the Gnome project's Pango, which specialises in >internationalised text and already has an Arabic shaper module. You have three basic font formats to deal with, truetype, Adobe type 1 and OpenType. I'm going to see if I can read up on the format.
I have the White Book for Adobe Type 1, but it is very old.
Roedy Green - 20 Dec 2005 10:52 GMT
>I've been trying to work out how to get Java to handle ligatures such
>as "fi" correctly,
MS OpenType file layout spec: http://www.microsoft.com/typography/otspec/otff.htm
Roedy Green - 20 Dec 2005 11:29 GMT
>I've been trying to work out how to get Java to handle ligatures such
>as "fi" correctly,
OpenType does encode the ligatures, with mind-boggling complexity. No wonder Java ignored it all! http://www.microsoft.com/typography/otspec/gsub.htm