Re: [PHP-DOC] On serializing the id index
| Richard Quadling 2007-11-30, 8:02 am |
| If, at the least, the file were a simple array, $a_Index, which could be
written using ...
file_put_contents('./index.cache', '<?php $a_Index = ' .
var_export($a_Index, True) . '; ?>');
sort of thing, then you could just include it and the index would be
available instantly. No need for hand-written file parsing.
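A minimal sketch of this approach, for illustration only. The sample entries and the cache path are assumptions, not the real PhD index, and the cache file here returns the array rather than assigning $a_Index, so the including script picks the variable name.

```php
<?php
// Hypothetical sample index; the real index structure is not shown in this thread.
$a_Index = array(
    'intro'   => array('title' => 'Introduction', 'children' => array('install')),
    'install' => array('title' => 'Installation', 'children' => array()),
);

// Assumed cache location, chosen only so the example is runnable anywhere.
$cacheFile = sys_get_temp_dir() . '/index.cache.php';

// Write the array out as PHP source code.
file_put_contents($cacheFile, '<?php return ' . var_export($a_Index, true) . ';');

// Loading is a plain include; PHP's own parser rebuilds the array.
$loaded = include $cacheFile;
```

Omitting the closing ?> tag avoids stray whitespace being sent as output when the file is included.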
On 30/11/2007, Edward Z. Yang <edwardzyang@thewritingpot.com> wrote:
> I understand this is the least of our worries right now, but I'd like to
> get the indexer cache working so I can get karma for PhD. :-D
>
> The main troubles with serialize are that it 1. wastes space by not
> preserving internal references and 2. requires lots of memory. I recall
> SQLite being proposed as a possible solution, but since the entire index
> is loaded into memory I don't think that's necessary: all we need is an
> easily parseable file format.
>
> So, we have a file, with each id separated by a newline, and the fields
> separated by tabs (neither of which should ever occur in the fields; if
> they do I'll use a control character or something). children is
> collapsed into a list of IDs.
>
> Reassembling is as simple as parsing the file line by line, constructing
> the ID array from the fields. Then the children are reconstituted by
> running through $IDs a second time, replacing them with references to
> the appropriate indexes.
>
> Thoughts?
>
> --
> Edward Z. Yang GnuPG: 0x869C48DA
> HTML Purifier <http://htmlpurifier.org> Anti-XSS Filter
> [[ 3FA8 E9A9 7385 B691 A6FC B3CB A933 BE7D 869C 48DA ]]
>
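The format proposed in the quoted message could be sketched roughly as follows. The field layout (id, title, comma-separated child IDs), the function names, and the sample data are all assumptions made for illustration; the thread does not specify them.

```php
<?php
// Write one line per ID: fields separated by tabs, children collapsed to IDs.
// Assumes IDs and titles never contain tabs, newlines, or commas.
function write_index(array $IDs, $file)
{
    $lines = array();
    foreach ($IDs as $id => $entry) {
        $childIds = implode(',', array_keys($entry['children']));
        $lines[] = implode("\t", array($id, $entry['title'], $childIds));
    }
    file_put_contents($file, implode("\n", $lines) . "\n");
}

function read_index($file)
{
    $IDs = array();
    // First pass: build the ID array, with children still plain ID strings.
    foreach (file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
        list($id, $title, $children) = explode("\t", $line);
        $IDs[$id] = array(
            'title'    => $title,
            'children' => $children === '' ? array() : explode(',', $children),
        );
    }
    // Second pass: replace each child ID with a reference to that entry.
    foreach ($IDs as $id => $entry) {
        $refs = array();
        foreach ($entry['children'] as $childId) {
            $refs[$childId] = &$IDs[$childId];
        }
        $IDs[$id]['children'] = $refs;
    }
    return $IDs;
}

// Hypothetical sample data: 'intro' has 'install' as a child, by reference.
$IDs = array(
    'intro'   => array('title' => 'Introduction', 'children' => array()),
    'install' => array('title' => 'Installation', 'children' => array()),
);
$IDs['intro']['children']['install'] = &$IDs['install'];

$file = sys_get_temp_dir() . '/index.tsv';
write_index($IDs, $file);
$restored = read_index($file);
```

In this sketch a child ID that has no line of its own would end up as an empty entry, so a real implementation would probably want to validate the IDs during the second pass.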
--
-----
Richard Quadling
Zend Certified Engineer : http://zend.com/zce.php?c=ZEND002498&r=213474731
"Standing on the shoulders of some very clever giants!"
| Edward Z. Yang 2007-11-30, 7:02 pm |
| -----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Richard Quadling wrote:
> If, at the least, the file were a simple array, $a_Index, which could be
> written using ...
>
> file_put_contents('./index.cache', '<?php $a_Index = ' .
> var_export($a_Index, True) . '; ?>');
>
> sort of thing, then you could just include it and the index would be
> available instantly. No need for hand-written file parsing.
That's an interesting technique. However, PHP's source code parser is
quite a bit more complicated than what our index format needs, so it's
quite possible that the simplicity of our custom format would offset the
performance gain PHP's parser gets from being written in C (I suppose
we'd have to benchmark to find out, but I remember MediaWiki's
developers talking about this).
The other implementation difficulty is getting the references to point
to the right locations with var_export, which is probably impossible.
That means we would have to make a copy of $IDs in order to export a
non-referenced structure (or, alternatively, reversibly de-reference
and re-reference the children's entries).
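A rough sketch of the copy-based variant, under assumptions: the helper names, the 'title' and 'children' fields, and the sample data are hypothetical. Since var_export() writes out values rather than references, the children are collapsed to plain IDs in a copy before exporting and turned back into references after the include.

```php
<?php
// Return a copy of $IDs in which each children list holds plain IDs.
function dereference_children(array $IDs)
{
    $flat = array();
    foreach ($IDs as $id => $entry) {
        $entry['children'] = array_keys($entry['children']);
        $flat[$id] = $entry;
    }
    return $flat;
}

// Reverse step: replace each child ID with a reference to its entry.
function rereference_children(array $flat)
{
    foreach ($flat as $id => $entry) {
        $refs = array();
        foreach ($entry['children'] as $childId) {
            $refs[$childId] = &$flat[$childId];
        }
        $flat[$id]['children'] = $refs;
    }
    return $flat;
}

// Hypothetical sample data with one parent/child reference.
$IDs = array(
    'intro'   => array('title' => 'Introduction', 'children' => array()),
    'install' => array('title' => 'Installation', 'children' => array()),
);
$IDs['intro']['children']['install'] = &$IDs['install'];

// Export the flattened copy as PHP source, then load it back.
$cacheFile = sys_get_temp_dir() . '/index.flat.cache.php';
$source = '<?php return ' . var_export(dereference_children($IDs), true) . ';';
file_put_contents($cacheFile, $source);

$flatLoaded = include $cacheFile;
$restored = rereference_children($flatLoaded);
```

The cost is the temporary copy of $IDs during export, which is the memory trade-off mentioned above.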
- --
Edward Z. Yang Portable GnuPG: 0x995A2C84
HTML Purifier <http://htmlpurifier.org> Anti-XSS Filter
[[ C8D5 9E3C 15AD 1467 5561 2C0E 719A 2D9D 995A 2C84 ]]
This Message Courtesy of Thunderbird Portable
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
iD8DBQFHUFCBcZotnZlaLIQRAh5NAJsGo++SnO/DH0rkPFqErDoaVMBv9wCbBQmZ
Qd0CpeQDLcHJyPuv25wZ/mk=
=mFPf
-----END PGP SIGNATURE-----