

There are broadly three categories of font in PDF: fonts with PostScript outlines, fonts with TrueType outlines, and user-defined (Type 3) fonts. You can use FreeType to render glyphs from fonts with PostScript and TrueType outlines (you can also have it return the path if you would rather use that). In any event, extracting the path is as much work as rendering the bitmap, possibly more. User-defined (Type 3) fonts you have to treat as a series of PDF operations, scaled by the text matrix.

I believe the only way to get truly accurate information is to actually render the glyphs at the given point size and collect the extents of the resulting bitmap. Even extracting the path describing the glyph won't give you completely accurate information, because hinting can subtly (or in some cases, not so subtly) alter the way the glyph is rendered.

Is there a tool that does this off the shelf? How did you calculate those positions? (I realize this is a lot to ask, given the complexity of PDF.) It would be a huge help to have a walkthrough, and I'm sure it would help others in the future.
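As a rough illustration of the "render and collect the extents" idea, here is a minimal sketch of the measuring step only. It assumes you already have an 8-bit grayscale glyph bitmap with a positive pitch, such as the one FreeType leaves in the glyph slot after rendering; the parameter names mirror FreeType's `bitmap.buffer`, `bitmap.width`, `bitmap.rows`, `bitmap.pitch`, `bitmap_left` and `bitmap_top` fields, while the function name `ink_extents` is made up for this example.

```python
from typing import Optional, Sequence, Tuple


def ink_extents(
    buffer: Sequence[int],
    width: int,
    rows: int,
    pitch: int,
    left: int,
    top: int,
) -> Optional[Tuple[int, int, int, int]]:
    """Tight box (xmin, ymin, xmax, ymax) around the non-zero pixels.

    Assumptions: 8-bit gray pixels, positive pitch, row 0 is the top row.
    `left` is the x offset of column 0 from the glyph origin, and `top` is
    the distance from the baseline up to the top edge of row 0.
    The result is in pixels relative to the glyph origin, with y pointing up.
    Returns None for a blank bitmap (for example a space character).
    """
    min_col, max_col = width, -1
    min_row, max_row = rows, -1
    for r in range(rows):
        row = buffer[r * pitch : r * pitch + width]
        inked = [c for c, value in enumerate(row) if value]
        if inked:
            min_row = min(min_row, r)
            max_row = max(max_row, r)
            min_col = min(min_col, inked[0])
            max_col = max(max_col, inked[-1])
    if max_col < 0:
        return None
    # Pixel (r, c) covers x in [left + c, left + c + 1]
    # and y in [top - r - 1, top - r].
    return (left + min_col, top - max_row - 1, left + max_col + 1, top - min_row)
```

The box is only accurate to whole pixels, so the render size has to match the resolution you care about; anti-aliased edge pixels are counted as ink here, which slightly overestimates the extents.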

I know it's incorrect because it says the first glyph ("&") horizontally intersects the second ("\u02d9"), which you can see isn't true when you view the PDF in a PDF reader.
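A sanity check like the one described above can be automated. The following is a small sketch, assuming each box is an `(xmin, ymin, xmax, ymax)` tuple in device space; the function name and the `tolerance` parameter are made up for illustration.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def horizontally_overlap(a: Box, b: Box, tolerance: float = 0.0) -> bool:
    """True if the x-ranges of two boxes overlap by more than `tolerance`."""
    overlap = min(a[2], b[2]) - max(a[0], b[0])
    return overlap > tolerance
```

Running it over consecutive glyphs flags suspicious boxes, although genuinely kerned or italic glyphs can overlap legitimately, so a small tolerance may be needed.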
What is the "ground truth" bounding box positioning for these 10 glyphs, in device space? My current code produces the following, but it's incorrect.

I'm trying to calculate the exact bounding box of every text glyph in a vector PDF. This involves keeping track of the current transformation matrix (CTM), the drawing and positioning instructions in the PDF, etc., but also calculating the boundaries of every specific glyph in "glyph space" (using the information from the glyf tables in the embedded fonts).
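For reference, the matrix bookkeeping can be sketched as follows. This is not the code from the question; it is a minimal illustration that assumes horizontal writing mode, a glyph box given in font units (for example xMin/yMin/xMax/yMax from a glyf entry), and the text rendering matrix from the PDF specification, Trm = [Tfs*Th 0 0 Tfs 0 Trise] x Tm x CTM. All function and parameter names are made up for the example.

```python
from typing import Tuple

Matrix = Tuple[float, float, float, float, float, float]  # (a, b, c, d, e, f)
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

IDENTITY: Matrix = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def multiply(m: Matrix, n: Matrix) -> Matrix:
    """Return m x n for PDF matrices [a b 0; c d 0; e f 1] (row-vector convention)."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def apply(m: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Transform the point (x, y) by m."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def glyph_box_to_device(
    glyph_box: Box,
    units_per_em: float,
    font_size: float,
    tm: Matrix,
    ctm: Matrix,
    h_scale: float = 1.0,
    rise: float = 0.0,
) -> Box:
    """Map a glyph-space box to an axis-aligned box in device space.

    Assumes `tm` already includes the displacement of this glyph
    (advance widths, character and word spacing, TJ adjustments).
    """
    # Glyph space -> text space: 1/unitsPerEm (1/1000 for Type 1 fonts).
    x0, y0, x1, y1 = (v / units_per_em for v in glyph_box)
    # Text rendering matrix: font size, horizontal scaling and rise, then Tm, then CTM.
    trm = multiply(
        multiply((font_size * h_scale, 0.0, 0.0, font_size, 0.0, rise), tm), ctm
    )
    # Transform all four corners, because rotation or skew moves the extremes.
    corners = [apply(trm, x, y) for x in (x0, x1) for y in (y0, y1)]
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return (min(xs), min(ys), max(xs), max(ys))
```

For example, a glyph box of `(0, 0, 500, 1000)` in a 1000-unit font at 12 pt with `Tm = (1, 0, 0, 1, 100, 200)` and an identity CTM maps to `(100, 200, 106, 212)`. Note that this transforms the outline's bounding box only; it does not account for hinting or for Type 3 fonts, whose glyphs go through the font's own FontMatrix.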
