| anno4000@radom.zrz.tu-berlin.de 2006-10-13, 7:56 am |
| <bekijkfotos@gmail.com> wrote in comp.lang.perl.misc:
> Dear perl newsgroups,
>
> I am looking for someone with Perl skills that is willing to help me in
> a small research project.
I don't know what you call small. What you describe below looks like
quite a challenge.
[...]
> This procedure should contain the following steps:
> - Parsing of a pdf (or text) file, which lists all abstracts of the
> ISMRM, to create a database with relevant information.
> - Performing a set of queries in Pubmed for each abstract, to maximize
> the sensitivity of the procedure.
>
> For example, I will use one of my own abstracts to illustrate the
> possible different appearance of an abstract and its final publication
> in a journal:
>
> Abstract:
> J.F.A. Jansen, J.M. Hakumäki, L. Ifeanyi, M. Shamblott, J. Gearhart
> and P.C.M. van Zijl
> 1H-NMR spectroscopy of stem cells in vitro demonstrates high
> proliferation state.
> Proc. Int. Soc. Mag. Reson. Med. 10, Honolulu, Hawaii, USA, May 2002:
> 131
>
> Paper:
> Jacobus F.A. Jansen, Michael J. Shamblott, Peter C.M. van Zijl, Kimmo
> K. Lehtimäki, Jeff W.M. Bulte, John D. Gearhart, and Juhana M.
> Hakumäki
> Stem Cell Profiling by Nuclear Magnetic Resonance Spectroscopy.
> Magnetic Resonance in Medicine 56 (3): 666-670 SEP 2006
>
> Although both the abstract and the paper describe essentially the same
> research, the title has changed, and the authors are different. It
> requires a relatively sophisticated search algorithm to find the paper
> based on the information from the abstract, and to consider it as a
> match.
>
> Unfortunately my Perl skills are very limited, and I don't think it
> is plausible that I will succeed in creating this sophisticated search
> algorithm.
You're mixing up two stages. First you need to develop an algorithm.
Then you can decide in which language to implement it (Perl is a likely
choice) and find a programmer to do so.
The hard part will be the development of an algorithm that makes
sense of the mess. I don't think a totally automatic procedure
will suffice. Aim for a pre-selection that finds plausible pairs
but prepare to hand-select the final list from that.
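To make the pre-selection idea concrete, here is a rough sketch, in Python for brevity (it ports to Perl easily). It scores an abstract/paper pair by the overlap of author surnames and of title words, and keeps pairs above a threshold for manual review. The function names, the 0.7 author weight, the 0.3 threshold and the stopword list are all illustrative assumptions, not tuned values, and taking the last word of each name as the surname is a crude heuristic.

```python
import re

# Hypothetical stopword list; a real one would be longer.
STOPWORDS = {"of", "in", "by", "the", "a", "an", "and", "for", "on", "with"}

def surnames(author_line):
    """Lowercased last token of each name in a comma/'and' separated list."""
    parts = re.split(r",|\band\b", author_line)
    return {p.split()[-1].lower() for p in parts if p.strip()}

def title_words(title):
    """Lowercased title words, minus stopwords and very short tokens."""
    words = re.findall(r"\w+", title.lower())
    return {w for w in words if w not in STOPWORDS and len(w) > 2}

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def score(abstract, paper, author_weight=0.7):
    """Weighted mix of author overlap and title overlap, between 0 and 1."""
    a = jaccard(surnames(abstract["authors"]), surnames(paper["authors"]))
    t = jaccard(title_words(abstract["title"]), title_words(paper["title"]))
    return author_weight * a + (1 - author_weight) * t

def preselect(abstract, papers, threshold=0.3):
    """Return (score, paper) pairs at or above threshold, best first."""
    scored = [(score(abstract, p), p) for p in papers]
    kept = [sp for sp in scored if sp[0] >= threshold]
    return sorted(kept, key=lambda sp: sp[0], reverse=True)

# The example pair from the original post.
abstract = {
    "authors": "J.F.A. Jansen, J.M. Hakumäki, L. Ifeanyi, M. Shamblott, "
               "J. Gearhart and P.C.M. van Zijl",
    "title": "1H-NMR spectroscopy of stem cells in vitro demonstrates high "
             "proliferation state.",
}
paper = {
    "authors": "Jacobus F.A. Jansen, Michael J. Shamblott, Peter C.M. van Zijl, "
               "Kimmo K. Lehtimäki, Jeff W.M. Bulte, John D. Gearhart, and "
               "Juhana M. Hakumäki",
    "title": "Stem Cell Profiling by Nuclear Magnetic Resonance Spectroscopy.",
}

candidates = preselect(abstract, [paper])
```

Authors are weighted more heavily than titles here because, as in the example pair, most surnames survive from abstract to paper while the title wording changes a lot. Whatever comes out of such a filter is only a candidate list; the final matching still has to be checked by hand.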
> Therefore I was hoping that someone could help me out with
> this. (e.g. a student who could use it to improve his/her resume?). If
> possible, I'd like to submit the results as an abstract for the ISMRM
> 2007, for which the deadline is 15 November 2006.
Four weeks? That won't do for something this size, not as a one-man
job.
Anno
|