Discussion:
[geda-user] A complete set of CJK glyphs rendered as PCB symbols
Erich Heinzle
2014-09-14 15:29:50 UTC
I have batch processed the GNU Unifont BDF file

http://unifoundry.com/pub/unifont-7.0.03/font-builds/unifont-7.0.03.bdf.gz

available from

http://unifoundry.com/unifont.html

to produce around 20,000 Chinese, Japanese and Korean (CJK) symbols that
can be used in gEDA PCB.

The uncompressed text file weighs in at around 20MB, but compressed it is
only 1.5MB

http://users.on.net/~esh/geda/pcb/src/fonts/fireflyR16-CJK-glyphs.pcb.gz

Users requiring a few glyphs can now include them and map them to spare
ASCII symbols until there is an easier way to include Unicode symbols,
e.g.

Symbol['6' 1200]
#gEDA PCB compatible symbol with drawn elements depicting uni9ED6
#Symbol['uni9ED6' 1200]
( etc...
)
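
As a rough illustration of the relabelling step, the header of a symbol block can be rewritten with a short Python helper. This is only a sketch under assumptions: the function name `relabel_symbol` is made up for the example, and it assumes the block starts with an uncommented `Symbol['...' N]` header line like the one shown above. Commented-out headers (lines starting with `#`) are left alone.

```python
import re

# Matches an uncommented header such as: Symbol['uni9ED6' 1200]
_HEADER = re.compile(r"^([ \t]*Symbol[ \t]*\[[ \t]*')[^']*('.*)$", re.MULTILINE)


def relabel_symbol(symbol_text, new_char):
    """Return symbol_text with its first Symbol[...] header mapped to new_char."""
    if len(new_char) != 1 or new_char == "'":
        raise ValueError("new_char must be a single character other than a quote")
    new_text, count = _HEADER.subn(
        lambda m: m.group(1) + new_char + m.group(2), symbol_text, count=1
    )
    if count == 0:
        raise ValueError("no Symbol[...] header found")
    return new_text
```

The remainder of the block (comments and SymbolLine entries) passes through unchanged, so the result can be pasted into a board file in place of a spare ASCII symbol.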

The symbols are based on the open 16x16 bitmapped firefly CJK font set.
Pixels that are contiguous vertically, horizontally or diagonally have been
converted into SymbolLine[] strokes, and any orphan pixels are rendered
as dots. A default stroke width of 800 has been used.
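
To make the output format concrete, here is a minimal Python sketch that turns a list of pixel-grid strokes into a symbol block. It is not the conversion utility itself. The function name `format_symbol`, the `pitch` parameter (distance between pixel centres) and its default of 1000 are assumptions for illustration; the stroke width of 800 and the `1200` in the header are taken from the post. An orphan pixel is written as a zero-length SymbolLine, which is one plausible way to draw a dot.

```python
def format_symbol(char, strokes, dots, pitch=1000, thickness=800):
    """Render pixel-grid strokes and dots as a gEDA PCB Symbol block.

    strokes: list of ((x1, y1), (x2, y2)) in pixel coordinates.
    dots:    list of (x, y) orphan pixels, drawn as zero-length strokes.
    pitch:   assumed distance between pixel centres, in PCB units.
    """
    lines = ["Symbol['%s' 1200]" % char, "("]
    for (x1, y1), (x2, y2) in strokes:
        lines.append(
            "\tSymbolLine[%d %d %d %d %d]"
            % (x1 * pitch, y1 * pitch, x2 * pitch, y2 * pitch, thickness)
        )
    for x, y in dots:
        lines.append(
            "\tSymbolLine[%d %d %d %d %d]"
            % (x * pitch, y * pitch, x * pitch, y * pitch, thickness)
        )
    lines.append(")")
    return "\n".join(lines) + "\n"
```

Since the glyphs can be scaled within PCB, the exact pitch matters less than keeping it consistent with the stroke width.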

The smaller 11x11, 12x12, 13x13, 14x14 and 15x15 bitmapped CJK glyphs could
also easily be converted if necessary, but they may lack the fidelity of
the 16x16 glyphs, which can be scaled within PCB anyway, so I have not
bothered to convert the lower resolution bitmaps at this stage.

I will release the conversion utility shortly, once I finish tidying it
up, as GPL2+/-3.

I expected to walk away from the computer for at least a few minutes to
convert the 20,000+ CJK bdf archive but it was done in seconds, meaning
that on the fly importing of single CJK glyph bdf definitions from the
freely downloadable gnu unifont bdf could in theory be done from within PCB
if a suitable menu option were available.

I hope this is useful to anyone desperately in need of some functional CJK
glyphs before gEDA PCB supports either an integrated conversion process
like this or TTF support.

I cannot vouch for the rendering of all of the glyphs, as I do not read
Chinese, haven't had the chance to review them all, and the heuristics may
have joined the occasional diagonally adjacent pixels which should remain
unjoined.

The symbol archive is released under GPL2 or, at your option, a later
version, and can be freely distributed.

I would make the observation that this approach to glyph rendering makes
for quite a compact symbol definition. My initial efforts involving
conversion of curved paths produced symbol definitions 3-4 times the
overall size, with implications for final PCB file size. It would be
interesting to see how much bigger or smaller gerbers might end up being if
text is rendered as polygons derived from TTF fonts.

If/when Unicode support is implemented, I think it would be a useful
feature if users could retain the ability to use a traditional default_font
style font for the ASCII character code page, and a separate Unicode font
for the other code pages, since available fonts containing CJK may not have
ASCII glyphs ideally suited to rendering with strokes, whereas the
existing gEDA PCB ASCII fonts are pretty optimal, compact and Gerber
friendly.

Cheers,

Erich.
Stefan Salewski
2014-09-14 18:51:07 UTC
Post by Erich Heinzle
I have batch processed the gnu unifont bdf
Is there a detailed description available about the conversion process?

I can remember that some years ago I was not really happy with PCB's way
of including fonts as lines in each pcb file, so I did some thinking
about how to do on the fly conversion of arbitrary fonts -- inkscape's
ability to convert bitmaps to vector graphics was one possible way at that
time, but I have never investigated it. (My general idea was to use
ordinary fonts for screen display, and convert to lines only for gerber
export.)

I have never seen a pcb board with a single Chinese character, and I can
not imagine why an Asian person should have the wish to have such glyphs
on a pcb board. But the conversion process is interesting of course...
Larry Doolittle
2014-09-14 19:22:27 UTC
Stefan -
Post by Stefan Salewski
I have never seen a pcb board with a single Chinese character, and I can
not imagine why an Asian person should have the wish to have such glyphs
on a pcb board.
Oh, gee, I don't know. To sign their name, maybe?

- Larry
Erich Heinzle
2014-09-15 02:45:51 UTC
The fonts for X windows are commonly available in bdf format.

http://en.wikipedia.org/wiki/Glyph_Bitmap_Distribution_Format

The bdf font file is a series of consecutive symbol definitions of the form:

STARTCHAR U+004E
ENCODING 78
SWIDTH 500 0
DWIDTH 8 0
BBX 8 16 0 -2
BITMAP
00
00
00
00
42
62
62
52
52
4A
4A
46
46
42
00
00
ENDCHAR
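
For illustration, a block like the one above can be read with a few lines of Python. This is a minimal sketch, not the utility discussed in this thread: the function name `parse_bdf_glyphs` and the dictionary keys are made up for the example, it ignores the font-level header, and it assumes BBX appears before BITMAP, as it does above.

```python
def parse_bdf_glyphs(lines):
    """Yield one dict per STARTCHAR..ENDCHAR block in an iterable of lines."""
    glyph = None
    in_bitmap = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("STARTCHAR"):
            glyph = {"name": line[len("STARTCHAR"):].strip(), "rows": []}
            in_bitmap = False
        elif glyph is None:
            continue  # font-level header lines are skipped in this sketch
        elif line.startswith("ENCODING"):
            glyph["encoding"] = int(line.split()[1])
        elif line.startswith("DWIDTH"):
            glyph["dwidth"] = int(line.split()[1])
        elif line.startswith("BBX"):
            w, h, xoff, yoff = (int(v) for v in line.split()[1:5])
            glyph.update(width=w, height=h, xoff=xoff, yoff=yoff)
        elif line == "BITMAP":
            in_bitmap = True
        elif line == "ENDCHAR":
            yield glyph
            glyph = None
            in_bitmap = False
        elif in_bitmap and line:
            # Each row is hex, padded on the right to whole bytes;
            # keep only the leftmost `width` bits.
            bits = bin(int(line, 16))[2:].zfill(4 * len(line))
            glyph["rows"].append([int(b) for b in bits[: glyph["width"]]])
```

Each glyph comes back with its label, encoding, bounding box and a list of pixel rows (lists of 0/1), which is a convenient starting point for any of the stroke detection steps.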


My utility
1) acquires a valid BDF symbol definition via stdin
2) extracts the glyph label, glyph height, glyph width, display width, and
the bitmap nibbles
2.1) optionally exports an xbm bitmap
2.2) optionally exports a "Dot matrix" PCB symbol rendition of the glyph
using SymbolLine strokes to depict dots
3) stores the nibbles for each line in the glyph as a single integer
4) creates arrays in which each pixel is depicted as an integer
5) steps through the single integer representation of the rows and scores
each pixel, putting the score into the integer per pixel row array
6) exports consecutive rows of pixels as SymbolLine strokes
7) steps through each column of scored pixels and does further scoring of
each pixel, putting the score into the integer per pixel column array
8) exports consecutive columns of pixels as SymbolLine strokes
8.1) optionally exports a symbol without diagonal row detection and
conversion to strokes
9) creates left and right skewed arrays of the final pixel scores after
column and row export
10) steps through the single integer representation of the right skewed
array columns and detects diagonals, exports SymbolLine strokes
11) steps through the single integer representation of the left skewed
array columns and detects diagonals, exports SymbolLine strokes
12) identifies any left over/orphan pixels and exports a SymbolLine stroke
to depict a "dot"
13) exports a complete symbol with vertical, horizontal, and diagonal
strokes
14) looks for another BDF symbol via stdin
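
The core of steps 5 to 12 can be approximated in a few lines of Python. This is a deliberately simplified sketch, not the utility's actual method: instead of per-pixel scoring and skewed arrays, it scans the bitmap directly in four directions and reports every run of two or more set pixels as a stroke, so it may emit redundant strokes where runs overlap. The function name `find_strokes` and the return format are assumptions for the example.

```python
def find_strokes(rows):
    """Return (strokes, dots) for a bitmap given as a list of rows of 0/1.

    strokes: ((x1, y1), (x2, y2)) for each run of two or more set pixels,
             horizontally, vertically, or along either diagonal.
    dots:    (x, y) for set pixels that belong to no run at all.
    """
    h = len(rows)
    w = len(rows[0]) if rows else 0

    def lit(x, y):
        return 0 <= x < w and 0 <= y < h and rows[y][x] == 1

    strokes = []
    covered = set()
    # Horizontal, vertical, diagonal down-right, diagonal down-left.
    for dx, dy in ((1, 0), (0, 1), (1, 1), (-1, 1)):
        for y in range(h):
            for x in range(w):
                # Start a run only where no set pixel precedes this one.
                if not lit(x, y) or lit(x - dx, y - dy):
                    continue
                ex, ey = x, y
                run = [(x, y)]
                while lit(ex + dx, ey + dy):
                    ex, ey = ex + dx, ey + dy
                    run.append((ex, ey))
                if len(run) > 1:
                    strokes.append(((x, y), (ex, ey)))
                    covered.update(run)
    dots = [
        (x, y)
        for y in range(h)
        for x in range(w)
        if lit(x, y) and (x, y) not in covered
    ]
    return strokes, dots
```

Because every diagonally adjacent pair is joined, this sketch has the same weakness mentioned earlier in the thread: it will occasionally connect pixels that should remain unjoined.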

This produces output containing one gEDA PCB symbol definition for
each glyph, and is what is in the gz file I mentioned.

Until we have a mechanism for seamlessly adding Unicode symbols or
rendering TTF fonts, users needing glyphs can search the gzipped archive,
cut and paste the symbol they need, and relabel it to assign it to an
unused ASCII character.
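
Searching the archive by hand can also be scripted. The following Python sketch pulls one symbol block out of a stream of lines. It rests on assumptions about the archive layout that should be checked against the real file: each block is taken to end with a line containing only `)`, and the glyph label (for example `uni9ED6`) is assumed to appear somewhere in the block's header or comment lines. The function name `extract_symbol` is made up for the example.

```python
def extract_symbol(lines, label):
    """Return the text of the first symbol block that mentions label, or None.

    lines: any iterable of text lines (an open file works).
    label: glyph label to look for, e.g. "uni9ED6".
    """
    block = []
    for line in lines:
        block.append(line)
        if line.strip() == ")":  # assumed end-of-symbol marker
            if any(label in entry for entry in block):
                return "".join(block)
            block = []  # not the one we want; start collecting the next block
    return None
```

With Python's standard `gzip` module the compressed archive can be read directly, e.g. by passing `gzip.open(path, "rt")` as `lines`, so the 20MB text file never needs to be unpacked on disk.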

Anything that increases the potential user base by > 1 billion has got to
be a good thing.

Cheers

Erich.
Matthew Skala
2014-09-15 04:38:04 UTC
Post by Erich Heinzle
Anything that increases the potential user base by > 1 billion has got to be
a good thing.
I think that's a bit of an exaggeration. For it to be true, lack of this
feature would have to be stopping the ENTIRE population of China from
becoming users of the software, and be the ONLY thing stopping them.

Do we have any indication that anybody actually wants to put what comes
out of bitmap-to-stroke conversion from low-res Chinese bitmap fonts onto
a PCB at all?

I maintain a Japanese-language stroked font project
(tsukurimashou.sourceforge.jp). My project wouldn't be suitable for
Chinese and doesn't really have full character coverage for any CJK
language; but other projects do provide stroke data for these kinds of
characters with better coverage. It doesn't have to come from bitmap
conversion. I'm most familiar with the Japanese-language ones, which
include the Hershey fonts from the 1960s (incomplete coverage,
unfortunately); KanjiVG (complete coverage of Japanese in SVG format -
these would probably be easiest to convert for gEDA use); and Wadalab (the
original source of many of the Asian-language fonts shipped in Linux
distributions to this day). Chinese-language projects of similar nature
do exist.

If someone wanted to put stroked CJK characters onto a PCB, I think they'd
be much happier using something derived from one of those sources, instead
of from an attempt at converting low-res bitmaps back to strokes.
Low-res bitmaps always contain significant compromises of the basic
geometry of the characters, in order to get them to fit the grid at all.
As a simple English-language example, in some of my terminal windows a
lowercase "m" appears as just a solid rectangle, because there isn't
enough horizontal resolution in the low-res bitmap to render the three
vertical strokes separately with nice spacing. That's readable in its
correct context, but imagine what it would look like, and whether it would
be acceptable, after being converted "back" to strokes. CJK bitmap fonts
are rife with such cases. That's why doing the conversion in the other
direction, from strokes to bitmaps, is a largely manual process despite
the expense of doing it at the scale of these character sets: knowing
where to make the compromises is a very difficult thing to automate.
--
Matthew Skala
People before principles.
http://ansuz.sooke.bc.ca/
Erich Heinzle
2014-09-15 05:21:36 UTC
I don't particularly mind if the final solution employs stroke -> PCB
derived fonts instead of my initial symbol set. Obviously, stroke -> symbol
derived fonts are going to be nicer to look at. It was an interesting
exercise for me whether or not the symbols eventually get used.

Nevertheless, I think having something available in the interim for people
needing a few glyphs here or there is better than nothing, and I was
primarily hoping that providing a full CJK symbol set would give some
impetus to implementing Unicode support.

Which CJK symbol set should ideally be used is not the rate limiting step
here; Unicode support within symbol definitions is.

Cheers,

Erich.
Matthew Skala
2014-09-15 06:08:06 UTC
Post by Erich Heinzle
Which CJK symbol set should ideally be used is not the rate limiting step
here, unicode support within symbol definitions is.
I agree - and that would have many other benefits too, because there
are many characters and symbols people might want to use on PCBs besides
Chinese. Even just a few of the symbols in the Zapf Dingbats range would
be nice to have.
--
Matthew Skala
People before principles.
http://ansuz.sooke.bc.ca/
Atommann
2014-09-15 05:23:47 UTC
Post by Stefan Salewski
I have never seen a pcb board with a single Chinese character, and I can
not imagine why an Asian person should have the wish to have such glyphs
on a pcb board. But the conversion process is interesting of course...
This morning a technician came to our office to fix the laser
cutter, and when he opened the door of the machine I smiled: there were
a lot of Chinese characters on the PCB! (In my humble opinion, putting
Chinese on a PCB is not good. But it exists.)
https://www.flickr.com/photos/atommann/15240430621/sizes/h/
--
Best regards,
Atommann