Here I jot down thoughts, roadmaps, to-dos and other things related to PDCLib. Newest entry first.
Enjoying two weeks of vacation at the North Sea, I spend quite some time relaxing at the keyboard. (Yes, this can actually be relaxing, if you go at it the right way.)
I integrated dlmalloc (using default settings only for the time being), and am making some progress toward implementing <threads.h>. That was not at all on the to-do list, since it's C11 and I had declared that out of scope until I got C99 covered. But I received feedback from several adopters of PDCLib, and since the subject of multithreading support popped up in almost every single response, I bowed to popular demand.
The example platform will implement <threads.h> as a wrapper for pthread, but it should be comparatively easy to come up with other adaptations. Note that contributions supporting other mainstream APIs and / or platforms will always be welcome!
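To give an idea of how thin such a wrapper can be, here is a minimal sketch for the mutex part only. All names carry a hypothetical `my_` prefix so they do not clash with a system <threads.h>; this is not the PDCLib code, and it leaves out the mutex type argument that C11's mtx_init() takes.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical result codes, modelled on C11's thrd_success / thrd_error. */
enum { my_thrd_success, my_thrd_error };

/* The wrapper type is simply the pthread type. */
typedef pthread_mutex_t my_mtx_t;

/* Plain (non-recursive, non-timed) mutex with default attributes. */
int my_mtx_init(my_mtx_t *mtx)
{
    assert(mtx != NULL);
    return pthread_mutex_init(mtx, NULL) == 0 ? my_thrd_success : my_thrd_error;
}

int my_mtx_lock(my_mtx_t *mtx)
{
    assert(mtx != NULL);
    return pthread_mutex_lock(mtx) == 0 ? my_thrd_success : my_thrd_error;
}

int my_mtx_unlock(my_mtx_t *mtx)
{
    assert(mtx != NULL);
    return pthread_mutex_unlock(mtx) == 0 ? my_thrd_success : my_thrd_error;
}

void my_mtx_destroy(my_mtx_t *mtx)
{
    assert(mtx != NULL);
    pthread_mutex_destroy(mtx);
}
```

The only real work is translating pthread's error numbers into the two result codes; a port to another threading API would replace the four pthread calls and the typedef.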
It's also simpler to implement those pthread wrapper functions than to dig through the Unicode specs.
Once I have the functionality nailed down, I will wade through the existing code to implement thread safety as required. (Looking at you, <stdio.h>…) I might add some C11 extensions while I am at it (strtok_s was among the requested functions, and I do not see a reason not to oblige, really).
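For reference, strtok_s comes from C11 Annex K and keeps its state in a caller-supplied pointer plus a remaining-size counter instead of a hidden static variable. The following is a simplified sketch under a hypothetical name, not the PDCLib implementation: it uses size_t where the standard uses rsize_t, and it simply returns NULL on a runtime-constraint violation instead of calling a constraint handler.

```c
#include <stddef.h>
#include <string.h>

/* Simplified strtok_s-style tokenizer (hypothetical name).
   s1max counts the characters remaining in the buffer, terminator included. */
char *my_strtok_s(char *s1, size_t *s1max, const char *s2, char **ptr)
{
    if (s1max == NULL || s2 == NULL || ptr == NULL)
        return NULL;
    if (s1 == NULL) {
        if (*ptr == NULL)
            return NULL;
        s1 = *ptr;              /* continue where the previous call stopped */
    }

    /* Skip leading delimiters. */
    while (*s1max > 0 && *s1 != '\0' && strchr(s2, *s1) != NULL) {
        ++s1;
        --*s1max;
    }
    if (*s1max == 0 || *s1 == '\0') {
        *ptr = s1;
        return NULL;            /* no further token */
    }

    char *token = s1;

    /* Scan to the end of the token. */
    while (*s1max > 0 && *s1 != '\0' && strchr(s2, *s1) == NULL) {
        ++s1;
        --*s1max;
    }
    if (*s1max == 0)
        return NULL;            /* ran off the buffer before the token ended */

    if (*s1 != '\0') {          /* terminate the token, step past the delimiter */
        *s1++ = '\0';
        --*s1max;
    }
    *ptr = s1;
    return token;
}
```

As with strtok, the first call passes the string and later calls pass NULL, but all state lives in the two caller-owned variables, which is what makes the function usable from several threads at once.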
So… yes. Progress is being made.
It's been a long time since I last did anything with / for PDCLib, but I won't make excuses for it. I just could not get myself to dig into that Unicode standard again. And as I said to a fellow developer some time ago:
A hobby should always be a CAN do, not a TO do. Have a good hard look at what each of your hobbies is giving you, and be ready to drop hobbies that drain your energy instead of recharging it.
After the ePub debacle, and due to several other (private) issues, my energy was drained. So I focussed on more enjoyable things… but I'm back.
Since I still could not bear the thought of going full Unicode mode again, I had a look at integrating Doug Lea's malloc(), properly this time, to replace the makeshift free() implementations PDCLib currently “offers”.
To do this with a minimum of changes to the dlmalloc() code (desirable because that makes it easier to maintain in the face of future changes), I had to tackle the issue of symbol visibility, which dlmalloc() supports and PDCLib doesn't (yet).
That in turn meant I had to test the stuff, which in turn meant it was time to enable building PDCLib as a shared library instead of the static one it currently is. But that meant touching the Makefile… and that thing, while I liked its results, was not exactly a beauty to behold in an editor.
So I started working toward supporting CMake, which would bring several other benefits as well. And today I committed the first version of just that, so…
Let's see if I get back on track on this.
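For readers unfamiliar with the symbol visibility issue mentioned above, this is the usual shape of it when building a shared library. The macro and function names are hypothetical and not taken from PDCLib or dlmalloc; the sketch only assumes a GCC-compatible compiler or a Windows toolchain.

```c
/* Hypothetical export macros for a shared-library build. */
#if defined(_WIN32) || defined(__CYGWIN__)
  #define MYLIB_PUBLIC __declspec(dllexport)
  #define MYLIB_LOCAL
#elif defined(__GNUC__) && __GNUC__ >= 4
  #define MYLIB_PUBLIC __attribute__((visibility("default")))
  #define MYLIB_LOCAL  __attribute__((visibility("hidden")))
#else
  #define MYLIB_PUBLIC
  #define MYLIB_LOCAL
#endif

/* Internal helper: usable across the library's own translation units,
   but not exported from the shared object. */
MYLIB_LOCAL int mylib_helper(int x)
{
    return x * 2;
}

/* Public entry point: part of the exported interface. */
MYLIB_PUBLIC int mylib_api(int x)
{
    return mylib_helper(x) + 1;
}
```

In a static library none of this matters, which is why the question only comes up once a shared build exists to test it with.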
Quickly saving a link for later reference: What Every Computer Scientist Should Know About Floating-Point Arithmetic.
The ePub conversion was a dead end; I should have spent the time reading instead of working on “conversion to better readable format”. So now I am looking at wasted time, a reading backlog, and lots of things I neglected while working on the now-abandoned conversion.
Since I was asked, I thought I could just as well give the answer here:
Why are you doing <locale.h> first? I would think floating point support would be more important.
Three reasons, really. The first is just a minor snag – FP I/O is locale-dependent (decimal point vs. decimal comma).
The second is that, to do the FP logic right (instead of naïve 80-20 solutions), you need to take lots of platform specifics into account. This will blow up <_PDCLIB_config.h> significantly, and result in lots of rather ugly conditional code.
Third, it is quite simply the area I have the least expertise in. I want to save the hardest part for last.
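To illustrate the first reason: the character sequence that printf() emits between the integer and fractional digits is whatever the current locale's `decimal_point` says, so floating-point output cannot be finished before the locale data is in place. A small sketch using only standard library calls (the helper name is made up for this example):

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Formats value with one fractional digit into buf and reports whether the
   result contains the current locale's decimal-point string.
   Returns 1 if it does, 0 otherwise (including when buf is too small). */
int uses_locale_decimal_point(double value, char *buf, size_t size)
{
    const struct lconv *lc = localeconv();
    int n = snprintf(buf, size, "%.1f", value);

    if (n < 0 || (size_t)n >= size)
        return 0;
    return strstr(buf, lc->decimal_point) != NULL;
}
```

In the "C" locale the decimal point is "."; a locale with a decimal comma would produce a different string from the very same format call.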
Sometimes we find ourselves approaching new technologies from rather unexpected angles. Right now I am working on an ePub conversion of The Unicode Standard for easier reading, as the PDF displays poorly on my tolino ebook reader.
I would probably never have bothered with looking into the ePub format if it had not been for PDCLib… we live and learn.
There is no way around it. Too much of the whole ctype, wctype, uchar, locale issue is pointing to Unicode all over again. And I have been cursing at getting tangled up in lots of cross-references and internal dependencies, so now I made myself sit down and tackle the monster that is The Unicode Standard. From cover to cover, as there seems to be no real shortcut to “just what I need right now”.
So… yeah. Stay put.
In these past two days, I learned a lot about the Unicode Collation Algorithm. Yes, I can do this, I can make this part of PDCLib.
But no, not in the immediate future. That will have to “make do” with the “C” locale.
I have added _PDCLIB_load_lc_*() functions for all the locale categories mandated by C99, plus LC_MESSAGES, a C99-compliant POSIX extension that is required anyway for perror() to be locale-aware.
The one thing left is LC_COLLATE. Collation in the C locale is comparatively simple, but Unicode-aware collation?
Let's just say that the corresponding Unicode document, converted to PDF for easier offline reading, amounts to 61 pages. I will have to dig through that at some point, so why not now.
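The "comparatively simple" half can be shown in a few lines. This sketch assumes, as POSIX specifies for its POSIX / "C" locale, that collation order is simply the order of the byte values; the function name is hypothetical.

```c
/* Collation as in the "C" locale: compare the strings byte by byte,
   with each byte interpreted as unsigned char.
   Returns a value <0, 0, or >0, like strcmp(). */
int c_locale_collate(const char *a, const char *b)
{
    const unsigned char *ua = (const unsigned char *)a;
    const unsigned char *ub = (const unsigned char *)b;

    while (*ua != '\0' && *ua == *ub) {
        ++ua;
        ++ub;
    }
    return (*ua > *ub) - (*ua < *ub);
}
```

Unicode-aware collation replaces that single byte comparison with multi-level weight lookups, which is where the 61 pages come from.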
Bah. Think first. There already is a function to load contents for the various locale-data structures from file, and its name is
Also, while loading from the filesystem is rather “raw”, any other mechanic will be even more “raw”, and less standard (as in,
So stop dithering and make setlocale() do more than
Looking at what I already had in <locale.h>, I decided some reworking was required. Stuffing everything into struct lconv was not the smartest idea I had, so I split things up into separate struct _PDCLIB_lc_*. I also moved the extern declaration of the actual data instances from <_PDCLIB_int.h> where they are less confusing to the casual observer.
I am currently thinking in terms of _PDCLIB_load_lc() to load contents for the various locale-data structures from file. I do not like the idea of having raw filesystem access inside PDCLib, though… this needs some pondering.
With get-uctypes (the source of which is in the repo at auxiliary/uctype/), I now have a program to get character classification information (as required by <ctype.h> and, more importantly, <wctype.h>) directly from data files available from unicode.org.
The shepherd branch already had this functionality, but it a) was written in Python (which IMHO has no place in the source tree of a C library); b) included the raw data files, which made them prone to getting outdated and required additional legalese due to Unicode licensing; and c) did not give correct results and, more importantly, did not offer an easy way to test against the system library's results.
Now I have to provide a way to actually use the derived information in PDCLib proper.
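One possible way to use such derived data is a sorted table of code point ranges searched with bsearch(). This is only a sketch: the struct, the flag names and the function are hypothetical, the actual output format of get-uctypes is not shown here, and the table is a tiny hand-written excerpt covering ASCII digits and letters.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical classification flags. */
enum { UC_UPPER = 1, UC_LOWER = 2, UC_DIGIT = 4 };

/* One contiguous range of code points sharing the same flags. */
struct uc_range {
    unsigned long first;
    unsigned long last;
    unsigned flags;
};

/* Sorted by code point, non-overlapping. Excerpt only. */
static const struct uc_range uc_table[] = {
    { 0x0030, 0x0039, UC_DIGIT },   /* 0-9 */
    { 0x0041, 0x005A, UC_UPPER },   /* A-Z */
    { 0x0061, 0x007A, UC_LOWER },   /* a-z */
};

/* bsearch() comparison: is the code point below, above, or inside the range? */
static int uc_compare(const void *key, const void *elem)
{
    unsigned long cp = *(const unsigned long *)key;
    const struct uc_range *r = elem;

    if (cp < r->first) return -1;
    if (cp > r->last)  return 1;
    return 0;
}

/* Returns the flags for a code point, or 0 if no range covers it. */
unsigned uc_flags(unsigned long cp)
{
    const struct uc_range *r = bsearch(&cp, uc_table,
                                       sizeof uc_table / sizeof uc_table[0],
                                       sizeof uc_table[0], uc_compare);
    return r != NULL ? r->flags : 0u;
}
```

Range tables keep the data compact because long runs of code points share a classification, and the lookup stays logarithmic in the number of ranges.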