metrics and malloc

Dear Lazyweb, how do I get real font metrics on iOS?

On OSX, the only way I've found to do it is:

  1. Make an NSTextStorage and NSLayoutManager;
  2. Get an NSGlyph from that;
  3. [NSBezierPath appendBezierPathWithGlyph];
  4. currentPoint to find the width;
  5. bounds to find the bounding box.

(You'd think you could get the NSGlyph with [NSFont glyphWithName], but that expects full Unicrud names like "LATIN CAPITAL LETTER A WITH ACUTE" and there is no way to get one of those from a single-character NSString, so I don't understand why that API even exists.)

Naturally, none of these APIs exist on iOS.

  1. [NSString sizeWithFont] gives us width and ascent, but nothing about the bounding box or bearings.
  2. [NSString drawAtPoint] to an offscreen CGContext returns the width of the character and the overall ascent of the font.
  3. CGFontGetGlyphBBoxes() might help (or it might just return the ascent/width again instead of the bounding box, I don't know) but there seems to again be no way to map a single-character NSString to a CGGlyph.

So I'm stuck with assuming that all characters have a 0 lbearing and rbearing, and things get clipped. See query_font() in xscreensaver/OSX/jwxyz.m.
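For concreteness, here is roughly what a real per-character query would compute, sketched in plain C. This is only an illustration under stated assumptions: the glyph's bounding box has its origin on the baseline with y pointing up, and an advance width is available from somewhere. GlyphBox, CharMetrics and metrics_from_box are hypothetical names, not anything in jwxyz.m; CharMetrics just mirrors the fields of X11's XCharStruct.

```c
/* Hypothetical stand-in for a glyph bounding box.
   Assumed: origin on the baseline, y pointing up, units in pixels. */
typedef struct { double x, y, width, height; } GlyphBox;

/* Hypothetical stand-in mirroring the fields of X11's XCharStruct. */
typedef struct { short lbearing, rbearing, width, ascent, descent; } CharMetrics;

/* Integer floor and ceiling without needing libm. */
static int floor_int(double v) { int i = (int) v; return (v < i) ? i - 1 : i; }
static int ceil_int(double v)  { int i = (int) v; return (v > i) ? i + 1 : i; }

/* Round outward so that the ink never falls outside the reported box. */
static CharMetrics metrics_from_box(GlyphBox box, double advance)
{
    CharMetrics m;
    m.lbearing = (short) floor_int(box.x);               /* origin to left ink edge  */
    m.rbearing = (short) ceil_int(box.x + box.width);    /* origin to right ink edge */
    m.width    = (short) floor_int(advance + 0.5);       /* advance, rounded         */
    m.ascent   = (short) ceil_int(box.y + box.height);   /* baseline to top of ink   */
    m.descent  = (short) ceil_int(-box.y);               /* baseline to bottom of ink */
    return m;
}
```

With a zero lbearing and an rbearing equal to the width, any glyph whose ink extends left of the origin or past the advance gets clipped, which is the symptom described above.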


Does the malloc on iOS contain any tricks for allocating out of a private heap? What the LISPMs called *default-cons-area* and PostScript called save/restore. E.g., it would be convenient to do something like this in my code:

#define malloc(x) malloc_in_heap (my_heap, x)

And then at the end, flush that heap in one swell foop, that is, I want to be able to say, "I swear that nothing references any pointer that has been allocated out of this area any more, regardless of whether all corresponding free calls have been made; unmap it all right now."

I know about NSAutoreleasePool, but that doesn't help with malloc, only NSObject.
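To make the request concrete, this is a minimal sketch of the kind of private heap meant here, in portable C. It assumes nothing about the iOS malloc: it grabs big blocks from the ordinary malloc, hands out pieces by bumping an offset, and frees the blocks wholesale rather than unmapping anything. Heap, flush_heap, HEAP_BLOCK_SIZE and HEAP_ALIGN are hypothetical names; malloc_in_heap is the name from the macro above. The 16-byte alignment holds only if malloc itself returns 16-byte-aligned blocks.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define HEAP_BLOCK_SIZE ((size_t) 64 * 1024)  /* hypothetical default block size */
#define HEAP_ALIGN      ((size_t) 16)         /* assumed alignment for payloads  */

typedef struct HeapBlock {
    struct HeapBlock *next;   /* previously allocated block, or NULL */
    size_t used, size;        /* bytes handed out / payload capacity */
} HeapBlock;

typedef struct { HeapBlock *blocks; } Heap;

static size_t round_up(size_t n)
{
    return (n + HEAP_ALIGN - 1) & ~(HEAP_ALIGN - 1);
}

/* Bump-allocate n bytes out of the heap; there is no per-pointer free. */
static void *malloc_in_heap(Heap *heap, size_t n)
{
    size_t header = round_up(sizeof(HeapBlock));
    HeapBlock *b = heap->blocks;

    if (n > SIZE_MAX / 2) return NULL;        /* avoid overflow when rounding */
    n = round_up(n ? n : 1);

    if (!b || b->size - b->used < n) {        /* need a fresh block */
        size_t size = n > HEAP_BLOCK_SIZE ? n : HEAP_BLOCK_SIZE;
        b = malloc(header + size);
        if (!b) return NULL;
        b->next = heap->blocks;
        b->used = 0;
        b->size = size;
        heap->blocks = b;
    }

    void *p = (char *) b + header + b->used;
    b->used += n;
    return p;
}

/* Release everything at once; every pointer from this heap becomes invalid. */
static void flush_heap(Heap *heap)
{
    HeapBlock *b = heap->blocks;
    while (b) {
        HeapBlock *next = b->next;
        free(b);
        b = next;
    }
    heap->blocks = NULL;
}
```

The catch with doing it by hand is that it only covers allocations routed through the wrapper; anything a library allocates with the real malloc still lands in the regular heap.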

18 Responses:

  1. I have a patch here from my Firefox-for-iOS port that reads in font metrics, I'm not sure if it gets you exactly what you want, or if it's just way too much code:

    The implementation of InitMetricsFromSfntTables is over here:

    I think we do that because operating systems can't be trusted to actually parse fonts safely (and we parse random fonts from random websites nowadays).

  2. Jon Parise says:

    Perhaps you're looking for something like NSZone?

    • jwz says:

      From that page: "zones no longer allow one to mass-deallocate objects without messing about with actual deallocation".

  3. mikeash says:

    The ultimate font API on iOS is CoreText. I don't know offhand how to get this information with it, but that's where you'll want to look.

    For your allocation stuff, if you just need temporary allocations without wanting to worry about freeing the stuff at the end, you can use this fun hack to get a memory region whose lifetime is tied to the autorelease pool: [[NSMutableData dataWithLength: yourLengthHere] mutableBytes].

  4. rob mayoff says:

    You need to get a CTFont for your font. This is different from a CGFont or a UIFont. You can convert from a CGFont to a CTFont using CTFontCreateWithGraphicsFont. Once you have the CTFont, you can use CTFontGetGlyphsForCharacters to get the glyphs for an array of UniChar (16-bit Unicode character codes). You can then use CTFontCreatePathForGlyph on each glyph, or CTFontGetBoundingRectsForGlyphs or CTFontGetAdvancesForGlyphs on the whole array.

    • jwz says:

      That works, thanks!

      It does seem really weird that there seems to be no CoreGraphics way to transform a UniChar to a CGGlyph, given that CoreGraphics has all of these functions that take CGGlyphs as arguments, but it looks like CTFontGetGlyphsForCharacters is the only game in town for that.

      • Dusk says:

        There often isn't a 1:1 relationship between characters and glyphs. For instance, some fancy fonts (e.g., Zapfino) have ligatures which will convert particular sequences of characters (e.g., "fi", "fl", "st", "Zapfino") into special glyphs. There are also non-Latin scripts where the glyph used for a single character is customarily rendered differently if the character appears at the start or end of a word. (English used to have something similar: "s" -> long-s "ſ".) Additionally, some combining characters will generate different glyphs based on the base character -- for instance, the accent on "é" is in a different location than the one on "É", even though both characters may have been generated using the same combining acute mark.

        I'm sure that most of this isn't relevant to your use-case, and you're probably wondering why there isn't some shortcut for what you're doing but… well, now you know.

        • Nick Lamb says:

          Yeah, this also means that "divide and conquer" strategies (trying to divide a string into characters for rendering) will usually appear to work right up until you render something more interesting than "Hello, world" whereupon they will fall apart into tiny sharp pieces that can pierce a lung.

          Sorry, blame several thousand years of scribes inventing various strange methods of turning spoken language into symbols none of which have been fully deprecated except arguably boustrophedon. The only way Unicode made this any worse was by bringing about the situation that there's also no 1:1 relationship between characters (to most people's understanding) and Unicode code points AND Unicode code points can consist of more than one Unicode code unit in the encodings people actually use. The terrifying truth is that this was probably the least insane way forward and the main consequence is that anything claiming to be a "character" data type in the Unicode era is either a small integer type with a misleading name (C, Java, C# etc.), or the worst example of leaky abstractions you'll encounter all year and sometimes both. Just use strings everywhere and try to foist as much of the work onto the OS or libraries as possible.

          • Brains are fuzzy pattern-matchers, not digital computers, and all information that is meant for humans to process is a mess to deal with for computers.

            Try to model dates and times. Or postal addresses. Or personal names. Or devise an internationalisation framework that lets programs produce output translated to other languages with correct pluralisation (what in English you’d do with 'unit' + (num == 1 ? '' : 's') – you know).

            It’s not just scribes. Everywhere computers intersect with human cognition, the code is some sort of mess. And the mess differs by culture.

          • Steen says:

            none of which have been fully deprecated except arguably boustrophedon.

            And Comic Sans.

      • hattifattener says:

        CoreText is my suggestion too. It's painfully inflexible (all the iOS guys at Apple have apparently forgotten what object orientation was good for), but works quite well if you stay inside what they thought about when they wrote it. If I were you I'd actually lay out the text and get CTRuns, then measure glyph properties of those in the aggregate. The "image bounds" is presumably what you want.

  5. Adam Goode says:

    For malloc stuff, I like to use the tiny halloc library for this.

  6. Ewen McNeill says:

    On the malloc() pool front, I'd normally have suggested talloc (a hierarchical malloc()/free() wrapper, from the Samba project). But the LGPL license might be an issue for use on iOS. So the (BSD-licensed) halloc mentioned by Adam Goode might be more suitable for that use case.


  7. Michael G says:

    Not quite exactly the same as a private heap but this will give you convenient automatic cleanup using the regular NSAutoreleasePool mechanism. You just stick your malloc onto an NSData and tell it to free it when the NSData is released:

    void* malloc_autorelease( size_t size )
    {
        // The memory will be freed when the current autorelease pool is cleared.
        void* p = malloc( size );
        if ( p ) [NSData dataWithBytesNoCopy:p length:size freeWhenDone:YES];
        return p;
    }

    - Michael

  8. Jesus. That crap actually makes me miss PostScript...

  9. Eric says:

    Take a look at the malloc_zone stuff in malloc.h; I think it'll do what you're looking for.