For quite some time I managed to keep the number of “construction sites” in PDCLib to a minimum. Sure, there were plenty of unfinished parts (floating point, multibyte / wide characters, locales, …), but my actual work was focussed on one part of the library only.
Unfortunately, I have strayed a bit from that path, and ended up with more “action items” than I am comfortable with. That is why I opened this “drawing board”: to write down my thoughts about all those “construction sites”, get them organized, and make the path onward a bit clearer.
With later versions of glibc now finally supporting <threads.h>, we can expect software to emerge that actually uses this header (instead of <pthread.h>). However, when conducting some more involved tests with my implementation, I found a couple of severe defects. (The thrd_join() return value handling is broken, for one.)
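A regression check for the thrd_join() defect could look roughly like the sketch below. It is not PDCLib code: the names echo_worker and check_join_roundtrip are made up for illustration, and the sketch assumes that both the status returned by thrd_join() and the result code stored through its second argument need checking. Only standard <threads.h> calls are used.

```c
#include <assert.h>
#include <threads.h>

/* Thread entry point: returns the int it was handed, so the caller can
 * verify that thrd_join() reports exactly that value. */
static int echo_worker(void *arg)
{
    assert(arg != NULL);
    return *(int *)arg;
}

/* Hypothetical helper (illustration only).
 * Returns 0 if the thread could be created and joined and the result
 * code arrived intact; a negative value names the step that failed. */
int check_join_roundtrip(int expected)
{
    thrd_t thread;
    int result = ~expected;   /* poison value; thrd_join() must overwrite it */

    if (thrd_create(&thread, echo_worker, &expected) != thrd_success)
        return -1;
    if (thrd_join(thread, &result) != thrd_success)
        return -2;
    return (result == expected) ? 0 : -3;
}
```

Calling check_join_roundtrip() with a few different values, including 0, tells a join that never stores the result apart from one that stores the wrong value.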
<threads.h> we get the ability to handle thread-local storage. While I implemented a thread-local
errno (which was simple enough), thread-local locale handling might turn out to be a bit more complicated. It might require initializing things, and we don't get to call functions from
The idea was to write a function _PDCLIB_load_lc_<category> for each locale category (collate, ctype, monetary, numeric, time, messages). This worked rather well at first. For ctype I delved into Unicode, getting the “right” character classes directly from the Unicode database (
Then I wanted to do the same for collate (sorting equivalence), and this was where I got stuck. Unicode collation is a pretty big subject in the Unicode standard, and information about it is scattered over multiple chapters, even multiple documents. In a kind of repeat performance of the block I had with <stdio.h>, I did not find the necessary uninterrupted time to really grasp what was before me.
The thing to do here would be to identify which data from which Unicode input files I would need, in which format, in order to implement (initially) strxfrm. Ideally, whatever architecture I come up with would also serve for (upcoming) multibyte and wide character collation.
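For orientation, this is the contract any strxfrm implementation has to meet, whatever the collation data behind it ends up looking like: comparing two transformed strings with strcmp must give the same ordering as strcoll on the originals. The sketch below uses only standard library calls. The helper names make_sort_key and keys_agree_with_strcoll are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Reduce a comparison result to -1, 0 or +1. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}

/* Hypothetical helper: build a heap-allocated sort key for s.
 * strxfrm(NULL, s, 0) is the standard way to ask for the key length. */
char *make_sort_key(const char *s)
{
    assert(s != NULL);
    size_t needed = strxfrm(NULL, s, 0) + 1;   /* +1 for the terminator */
    char *key = malloc(needed);

    if (key != NULL)
        strxfrm(key, s, needed);
    return key;                                /* caller frees */
}

/* Returns 1 if strcmp() on the keys orders a and b the same way as
 * strcoll() on the strings themselves, 0 otherwise (or on allocation
 * failure). */
int keys_agree_with_strcoll(const char *a, const char *b)
{
    char *ka = make_sort_key(a);
    char *kb = make_sort_key(b);
    int ok = 0;

    if (ka != NULL && kb != NULL)
        ok = sign(strcmp(ka, kb)) == sign(strcoll(a, b));
    free(ka);
    free(kb);
    return ok;
}
```

In the “C” locale the key is effectively a copy of the input, so the check is trivial there. It only becomes interesting once a locale with real collation data is loaded.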
Several time functions are not implemented yet. The mktime group requires timezone information (which in turn requires me to look into the timezone database for proper support code). For ctime I need alternative access to the “C” strings in the time locale category, because they are both defined as locale-independent.
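To make the locale-independence point concrete: the C standard fixes the output of asctime (and thereby ctime) to English abbreviations in the layout "Www Mmm dd hh:mm:ss yyyy", no matter which locale is active. The sketch below hard-codes those abbreviations in static tables. That is an assumption made for illustration, not how PDCLib stores its “C” locale data, and the name format_c_time is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Fixed English abbreviations, independent of the current locale. */
static const char day_abbr[7][4] = {
    "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
};
static const char mon_abbr[12][4] = {
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};

/* Hypothetical helper: write the asctime-style representation of *t
 * into buf. Returns the snprintf result, or -1 if the weekday or month
 * is out of range. */
int format_c_time(char *buf, size_t size, const struct tm *t)
{
    assert(buf != NULL && t != NULL);
    if (t->tm_wday < 0 || t->tm_wday > 6 || t->tm_mon < 0 || t->tm_mon > 11)
        return -1;

    /* Same format string the C standard gives for asctime. */
    return snprintf(buf, size, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
                    day_abbr[t->tm_wday], mon_abbr[t->tm_mon],
                    t->tm_mday, t->tm_hour, t->tm_min, t->tm_sec,
                    1900 + t->tm_year);
}
```

A struct tm describing Sunday, 16 September 1973, 01:03:52 comes out as "Sun Sep 16 01:03:52 1973" followed by a newline, which is the example the standard itself uses.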
I picked up the IANA reference implementation, tzcode, and am working toward integrating it into PDCLib proper. This will take care of time zone and leap second handling; it also means I will not have to maintain a separate database for time zone data. The Olson database provided by IANA will do.
A request from downstream was to add FP support to my printf() implementation (which currently breaks for %f/%g et al. because it doesn't draw the accompanying value from the stack – not nice!).
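The underlying rule is that a formatter has to consume one argument per conversion specifier, even for conversions it cannot print yet. Otherwise every later va_arg call reads the wrong argument, although on some ABIs the damage is masked because floating-point arguments travel separately. The toy formatter below is not PDCLib's printf(). mini_format and its "<fp>" placeholder are invented for this sketch, which handles only %d and %f.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical miniature formatter (illustration only).
 * %d prints an int; %f prints the placeholder "<fp>" but still consumes
 * its double, so the arguments that follow stay in sync.
 * Returns the number of characters written, excluding the terminator. */
int mini_format(char *out, size_t size, const char *fmt, ...)
{
    va_list ap;
    size_t used = 0;

    assert(out != NULL && size > 0 && fmt != NULL);
    va_start(ap, fmt);
    for (; *fmt != '\0' && used + 1 < size; ++fmt) {
        if (fmt[0] != '%' || fmt[1] == '\0') {
            out[used++] = *fmt;                  /* ordinary character */
            continue;
        }
        ++fmt;                                   /* step onto the specifier */
        int n = 0;
        if (*fmt == 'd') {
            n = snprintf(out + used, size - used, "%d", va_arg(ap, int));
        } else if (*fmt == 'f') {
            (void)va_arg(ap, double);            /* consume, even if unformatted */
            n = snprintf(out + used, size - used, "<fp>");
        } else {
            out[used++] = *fmt;                  /* unknown: copy it through */
        }
        if (n > 0) {
            /* snprintf reports the untruncated length; clamp to what fit. */
            size_t room = size - used - 1;
            used += ((size_t)n < room) ? (size_t)n : room;
        }
    }
    va_end(ap);
    out[used] = '\0';
    return (int)used;
}
```

With the format "%d %f %d" and the arguments 1, 2.5, 3, the buffer ends up holding "1 <fp> 3". The final 3 is only guaranteed to be correct because the double was consumed in between.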
I got a good introduction to the Dragon4 binary-to-string conversion algorithm as well as the paper for the Grisu3 small integer optimization, but this would be another major construction site (touching <fenv.h> matters as well), and I feel it would be just one thing too many to tackle at this point.