Replacing subsequences

Chris Capel

2004-12-10, 8:59 am

I had a need just now to replace one substring in my string with a different
substring, of a different length (for quoting parameters to SQL queries,
incidentally). It appears that accomplishing this isn't straightforward in
CL. Below is what I came up with. Does anyone have a better way than this?
You'd figure CL would have something that does this, but I couldn't find
anything. Maybe it's time to read the standard cover-to-cover.

Chris Capel

(defun replace-sequence (array old new)
  (do ((new-array (make-array (length array)
                              :adjustable t
                              :element-type (array-element-type array)))
       (new-array-position 0 (+ new-array-position (length old)))
       (array-position (search old array)
                       (search old array :start2 (1+ array-position)))
       (prev-array-position 0 array-position)
       (difference (if (> 0 #1=(- (length new) (length old))) nil #1#)))
      ((not array-position) (progn (replace new-array array
                                            :start1 new-array-position
                                            :start2 prev-array-position)
                                   new-array))
    (replace new-array array
             :start1 new-array-position
             :start2 prev-array-position :end2 array-position)
    (incf new-array-position (- array-position prev-array-position))
    (when difference
      (adjust-array new-array (+ (length new-array) difference)))
    (replace new-array new :start1 new-array-position)))

Bulent Murtezaoglu

2004-12-10, 3:58 pm


Perhaps something like:

(defun replace-string (string old new)
  (let ((end-of-first (search old string)))
    (if (not end-of-first)
        string ;; nothing to replace
        (concatenate 'string
                     (subseq string 0 end-of-first)
                     new
                     (subseq string (+ end-of-first (length old)))))))

This _is_ consy, and not general purpose (we are assuming old occurs
at most once in string), but that's what I'd do first.

cheers,

BM

Wade Humeniuk

2004-12-10, 3:58 pm

Chris Capel wrote:

> I had a need just now to replace one substring in my string with a different
> substring, of a different length (for quoting parameters to SQL queries,
> incidentally). It appears that accomplishing this isn't straightforward in
> CL. Below is what I came up with. Does anyone have a better way than this?
> You'd figure CL would have something that does this, but I couldn't find
> anything. Maybe it's time to read the standard cover-to-cover.
>


(defun replace-sequence (array old new &key (test #'eql))
  (let ((new-array (make-array (length array)
                               :adjustable t
                               :fill-pointer 0
                               :element-type (array-element-type array))))
    (loop with index = 0
          while (< index (length array))
          if (and (funcall test (elt array index) (elt old 0))
                  (search old array :start2 index :end2 (+ index (length old)) :test test))
            do
               (loop for new-index from 0 below (length new)
                     do (vector-push-extend (elt new new-index) new-array))
               (incf index (length old))
          else do
                 (vector-push-extend (elt array index) new-array)
                 (incf index))
    new-array))

CL-USER 9 > (replace-sequence "test spot the dog" "dog" "the dogs"
:test #'char=)
"test spot the the dogs"

CL-USER 10 > (replace-sequence "test spot the dog" "the" "many"
:test #'char=)
"test spot many dog"

CL-USER 11 >

It is not clear to me whether you want this function to work on all
sequences (in which case it needs some work) or just on arrays (in
which case the name should change).

Wade
Wade Humeniuk

2004-12-10, 3:58 pm

There are a few bugs in the code I posted above.

A better version is:

(defun replace-sequence (array old new &key (test #'eql))
  (assert (> (length old) 0))
  (let ((new-array (make-array (length array)
                               :adjustable t
                               :fill-pointer 0
                               :element-type (array-element-type array))))
    (loop with index = 0
          while (< index (length array))
          if (search old array :start2 index
                               :end2 (min (length array) (+ index (length old)))
                               :test test)
            do
               (loop for new-index from 0 below (length new)
                     do (vector-push-extend (elt new new-index) new-array))
               (incf index (length old))
          else do
                 (vector-push-extend (elt array index) new-array)
                 (incf index))
    new-array))
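
The growth mechanism this version relies on can be seen in isolation. The sketch below is only an illustration: the function name collect-into-growable-string is made up here, and the initial size of 2 is an arbitrary choice to force the buffer to grow.

```lisp
;; Hypothetical helper (not from the thread): shows how an adjustable
;; vector with a fill pointer grows as elements are pushed onto it.
(defun collect-into-growable-string (chars)
  "Append each character of the vector CHARS to a string that starts
with room for only two characters."
  (let ((buffer (make-array 2 :element-type 'character
                              :adjustable t
                              :fill-pointer 0)))
    ;; VECTOR-PUSH-EXTEND enlarges BUFFER whenever the fill pointer
    ;; reaches the current capacity.
    (loop for ch across chars
          do (vector-push-extend ch buffer))
    buffer))
```

The result is a non-simple string whose LENGTH reflects the fill pointer, not the allocated capacity, which is why the replacement function can start from (length array) and still return a correctly sized result.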

Wade
Thomas A. Russ

2004-12-10, 3:58 pm

Chris Capel <ch.ris@iba.nktech.net> writes:

>
> I had a need just now to replace one substring in my string with a different
> substring, of a different length (for quoting parameters to SQL queries,
> incidentally). It appears that accomplishing this isn't straightforward in
> CL. Below is what I came up with. Does anyone have a better way than this?
> You'd figure CL would have something that does this, but I couldn't find
> anything. Maybe it's time to read the standard cover-to-cover.


If I need to do something like that, I usually end up using a string
stream and writing various substrings to it, splicing in the substituted
values.

The advantage is that the string maintenance is nice and clean.

One drawback is that the setup overhead and consing are not trivial, so
it is often worthwhile to first do a search on the string to see if the
substring to be replaced even occurs. It adds an additional pass
through the string, but in cases where no substitution is needed, it can
provide a big speedup. That makes it perhaps more useful in a general
utility than in special-purpose code where you know or expect to be
doing the substitution.

Herewith is my entry:

(defun string-substitute (string old new)
  (if (search old string)
      (with-output-to-string (out)
        (loop with old-len = (length old)
              for start = 0 then (+ end old-len)
              as end = (search old string :start2 start)
              do (write-string (subseq string start end) out)
                 (if end
                     (write-string new out)
                     (loop-finish))))
      string))
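
One possible refinement, sketched here under the assumption that the input is always a string: WRITE-STRING accepts :start and :end, so the intermediate SUBSEQ copies can be avoided. The name string-substitute-bounded is hypothetical, and the assertion on OLD is an added guard against an empty pattern.

```lisp
;; Hypothetical variant of the stream-based approach: same structure,
;; but ranges of STRING are written directly instead of via SUBSEQ.
(defun string-substitute-bounded (string old new)
  (assert (plusp (length old)))          ; an empty OLD would never advance
  (if (search old string)
      (with-output-to-string (out)
        (loop with old-len = (length old)
              for start = 0 then (+ end old-len)
              for end = (search old string :start2 start)
              ;; :end NIL means "up to the end of STRING".
              do (write-string string out :start start :end end)
              while end
              do (write-string new out)))
      string))
```

For example, (string-substitute-bounded "aXbXc" "X" "--") returns "a--b--c", and a string with no match is returned unchanged.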


--
Thomas A. Russ, USC/Information Sciences Institute

Jason Kantz

2004-12-10, 3:58 pm

You could be better off using a library like CL-PPCRE.
http://www.weitz.de/cl-ppcre/#regex-replace

Chris Capel

2004-12-10, 3:58 pm

Thomas A. Russ wrote:

> Chris Capel <ch.ris@iba.nktech.net> writes:
>
>
> If I need to do something like that, I usually end up using a string
> stream and writing various substrings to it, splicing in the substituted
> values.
>
> The advantage is that the string maintenance is nice and clean.


Very elegant, I agree. I'm of the opinion, however, that you write it once,
and forget about it. So do it well, make it fast, and you're set
forevermore.

Chris Capel
Chris Capel

2004-12-10, 3:58 pm

Jason Kantz wrote:

> You could be better off using a library like CL-PPCRE.
> http://www.weitz.de/cl-ppcre/#regex-replace


Wow. That's about seven times as fast as mine.

Wow.

Chris Capel
Thomas F. Burdick

2004-12-10, 3:58 pm

Chris Capel <ch.ris@iba.nktech.net> writes:

> Thomas A. Russ wrote:
>
>
> Very elegant, I agree. I'm of the opinion, however, that you write it once,
> and forget about it. So do it well, make it fast, and you're set
> forevermore.


Thomas Russ's version is both efficient (assuming an efficient
with-output-to-string) and, very importantly, it scales nicely to large
strings. Here's what I get on SBCL: on small strings with small
replacements, yours is about 30% faster; on large strings with large
replacements, it's many orders of magnitude slower. When you're
evaluating stuff for library use, don't make the mistake of only
considering best-case behavior!
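
A rough way to check scaling yourself is to time the same function on inputs of increasing size. The sketch below is an assumption-laden harness, not the measurement used above: the names make-input, seconds-to-run and compare-scaling are made up, the input pattern "abc-" and the replacement "12345" are arbitrary, and the function under test is assumed to take (string old new) in that order.

```lisp
;; Hypothetical timing harness; all names here are illustrative.
(defun make-input (repeats)
  "Build a string containing REPEATS copies of \"abc-\"."
  (with-output-to-string (out)
    (dotimes (i repeats)
      (write-string "abc-" out))))

(defun seconds-to-run (function &rest args)
  "Return the wall-clock seconds taken by one call of FUNCTION on ARGS."
  (let ((start (get-internal-real-time)))
    (apply function args)
    (/ (- (get-internal-real-time) start)
       internal-time-units-per-second)))

(defun compare-scaling (function &key (sizes '(10 1000 100000)))
  "Return a list of (SIZE . SECONDS) pairs for FUNCTION, which is
assumed to accept (string old new)."
  (loop for size in sizes
        collect (cons size
                      (seconds-to-run function (make-input size) "b" "12345"))))
```

Calling (compare-scaling #'string-substitute) and comparing the result with the same call for another implementation shows how each one grows with input size; a single small input only shows the best case.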
Frank Buss

2004-12-10, 8:58 pm

Chris Capel <ch.ris@iba.nktech.net> wrote:

> Below is what I came up with.


Looks like it doesn't work correctly. I've tested it with LispWorks:

CL-USER > (replace-sequence "abc" "b" "123")
"a1bc^P"

A function, which works with every sequence (I hope):

(defun replace-sequence (seq old new &key (start 0))
  (let ((seq-len (length seq))
        (old-len (length old))
        (new-len (length new)))
    (if (or (< (- seq-len start) old-len) (= old-len 0))
        (copy-seq seq)
        (if (search seq old :start1 start :end1 (+ start old-len))
            (replace-sequence
             (concatenate (type-of seq)
                          (subseq seq 0 start)
                          new
                          (subseq seq (+ start old-len)))
             old
             new
             :start (+ start new-len))
            (replace-sequence seq old new :start (1+ start))))))

CL-USER : 2 > (replace-sequence "" "" "")
""

CL-USER : 2 > (replace-sequence "abc" "b" "123")
"a123c"

CL-USER : 2 > (replace-sequence "Hello universe!" "universe" "world")
"Hello world!"

CL-USER : 2 > (replace-sequence '(1 2 3 4 5) '(3 4) nil)
(1 2 5)

CL-USER : 2 > (replace-sequence '#(1 2 3 4 5) '#(2 3) '#(10 11 12))
#(1 10 11 12 4 5)
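
The function above uses a bounded SEARCH call to ask whether OLD occurs at exactly one position. The same question can be put more directly with MISMATCH, which returns NIL when the two ranges agree. This is only a sketch; the name matches-at-p is hypothetical.

```lisp
;; Hypothetical helper: true when OLD occurs in SEQ starting at INDEX.
(defun matches-at-p (old seq index &key (test #'eql))
  (let ((end (+ index (length old))))
    (and (<= end (length seq))           ; OLD must fit in what is left of SEQ
         ;; MISMATCH returns NIL when OLD equals SEQ[INDEX, END).
         (not (mismatch old seq :start2 index :end2 end :test test)))))
```

For instance, (matches-at-p "b" "abc" 1) is true, while (matches-at-p "bc" "abc" 2) is false because the pattern would run past the end of the string.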

--
Frank Buß, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
Wade Humeniuk

2004-12-10, 8:58 pm

Chris Capel wrote:

>
> It'd probably need to be able to create a blank sequence of the same format
> as the input (nil in the case of a list). It'd probably be best to provide
> two sections of the body that worked only on vectors and lists,
> respectively. Unless you can figure out how this abstraction applies to any
> array. Is there anything besides lists and vectors to worry about?
>


Like Frank's version, a generalized sequence version:

(defun replace-sequence (seq old new &key (start 0) (end nil) (test #'eql))
  (if (< start (or end (length seq)))
      (let ((position (search old seq :start2 start :end2 end :test test)))
        (if position
            (concatenate (type-of seq)
                         (subseq seq start position)
                         new
                         (replace-sequence seq
                                           old new
                                           :start (+ position (length old))
                                           :end end
                                           :test test))
            (subseq seq start)))
      (subseq seq start)))

Wade

Thomas A. Russ

2004-12-10, 8:58 pm

Chris Capel <ch.ris@iba.nktech.net> writes:

> (defun replace-sequence (array old new)
>   (let ((hits (list (- 0 (length old))))
>         (n-hits 0))
>     (let (hit (pos 0))
>       (while (setf hit (search old array :start2 pos))
>         (push hit hits)
>         (incf n-hits)
>         (setf pos (1+ hit)))
>       (setf hits (nreverse hits)))


Hmmm. This is interesting. You seem to allow for overlapping
replacements, since each match position is incremented by 1 instead
of the length of OLD. Do you get what you expect from the
following call:

(replace-sequence "aaaaWWWWaaaWWWaa" "aa" "bb")

BTW, when I tried the above, I got an error....

>     (let* ((difference (- (length new) (length old)))
>            (new-array (make-array (+ (length array)
>                                      (* difference n-hits))
>                                   :element-type (array-element-type array)))
>            (new-array-position 0))
>       (docdr (x hits)
>         (let ((start (+ (car x) (length old)))
>               (next (cadr x)))
>           (replace new-array array
>                    :start1 new-array-position
>                    :start2 start :end2 next)
>           (when next
>             (incf new-array-position (- next start))
>             (replace new-array new
>                      :start1 new-array-position)
>             (incf new-array-position (length new)))))
>       new-array)))
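
The overlap question raised above can be made concrete by collecting match positions both ways. This is a sketch only; the function name match-positions and its :overlapping keyword are made up for illustration.

```lisp
;; Hypothetical helper: list the positions at which OLD occurs in SEQ.
;; With :OVERLAPPING T the scan resumes one element after each hit;
;; otherwise it resumes just past the end of the hit.
(defun match-positions (old seq &key (overlapping nil) (test #'eql))
  (assert (plusp (length old)))
  (loop with step = (if overlapping 1 (length old))
        for pos = (search old seq :test test)
          then (search old seq :start2 (+ pos step) :test test)
        while pos
        collect pos))
```

On the test string from this post, (match-positions "aa" "aaaaWWWWaaaWWWaa") gives (0 2 8 14), while adding :overlapping t gives (0 1 2 8 9 14). A result buffer sized from the second list would not match what a left-to-right replacement actually writes.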


--
Thomas A. Russ, USC/Information Sciences Institute

Jeff

2004-12-11, 3:56 am

Chris Capel wrote:

> I had a need just now to replace one substring in my string with a
> different substring, of a different length


If you want something that's quick to just "drop in and use", might I
suggest using my regular expression package (url below). It allows very
quick and simple replacing of text in a string:

(re:replace-all "Some random string"
#/\s(\a+)$/
(format nil "~D" (length $1)))

==> "Some random 6"

--
http://www.retrobyte.org
mailto:massung@gmail.com
Chris Capel

2004-12-11, 8:56 am

Chris Capel wrote:

> You'd figure CL would have something that does this,
> but I couldn't find anything.


Well, I guess it doesn't. How did something like this not make it into the
standard? Did people just not use strings back then? Every other language I
know of has something to do this, that works at least for strings, built
in. Was the ANSI committee simply too pressed for time?

Chris Capel
David Sletten

2004-12-11, 8:56 am

Chris Capel wrote:
> I had a need just now to replace one substring in my string with a different
> substring, of a different length (for quoting parameters to SQL queries,
> incidentally). It appears that accomplishing this isn't straightforward in
> CL. Below is what I came up with. Does anyone have a better way than this?
> You'd figure CL would have something that does this, but I couldn't find
> anything. Maybe it's time to read the standard cover-to-cover.
>
> Chris Capel
>
> (defun replace-sequence (array old new)
>   (do ((new-array (make-array (length array)
>                               :adjustable t
>                               :element-type (array-element-type array)))
>        (new-array-position 0 (+ new-array-position (length old)))
>        (array-position (search old array)
>                        (search old array :start2 (1+ array-position)))
>        (prev-array-position 0 array-position)
>        (difference (if (> 0 #1=(- (length new) (length old))) nil #1#)))
>       ((not array-position) (progn (replace new-array array
>                                             :start1 new-array-position
>                                             :start2 prev-array-position)
>                                    new-array))
>     (replace new-array array
>              :start1 new-array-position
>              :start2 prev-array-position :end2 array-position)
>     (incf new-array-position (- array-position prev-array-position))
>     (when difference
>       (adjust-array new-array (+ (length new-array) difference)))
>     (replace new-array new :start1 new-array-position)))
>

May I suggest that whichever implementation you choose, you follow the
convention of parameter order in SUBST and SUBSTITUTE? Namely,
(defun replace-sequence (new old seq) ...)
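
For reference, this is the argument order the two standard functions use: the new item first, then the old item, then the sequence or tree.

```lisp
;; SUBSTITUTE works on sequences, element by element: new, old, sequence.
(substitute #\x #\a "banana")        ; => "bxnxnx"

;; SUBST works on trees of conses: new, old, tree.
(subst 'new 'old '(old (old x)))     ; => (NEW (NEW X))
```

Both replace single elements only, which is why neither can stand in for a subsequence replacement, but a function that follows the same order is easier to remember alongside them.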

David Sletten
David Sletten

2004-12-11, 8:56 am

Frank Buss wrote:


> A function, which works with every sequence (I hope):
>
> (defun replace-sequence (seq old new &key (start 0))
>   (let ((seq-len (length seq))
>         (old-len (length old))
>         (new-len (length new)))
>     (if (or (< (- seq-len start) old-len) (= old-len 0))
>         (copy-seq seq)
>         (if (search seq old :start1 start :end1 (+ start old-len))
>             (replace-sequence
>              (concatenate (type-of seq)

                            ^^^^^^^^^^^^^

This doesn't seem to be portable. In LispWorks:
(type-of "foo") => SIMPLE-TEXT-STRING
But in CLISP and SBCL:
(type-of "foo") => (SIMPLE-BASE-STRING 3)

So your code works as long as the NEW sequence is the same length as the
OLD:
(replace-sequence "Is this not pung?" "this" "that") => "Is that not pung?"

But if the lengths are different you violate the type specifier:
(replace-sequence "abc" "b" "123") =>
*** - sequence type forces length 3, but result has length 5

I think you need something like:
(concatenate (typecase seq (string 'string) (vector 'vector)) ...)
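
Extending that idea a little, the result type can be chosen once and the pieces concatenated in a single call. The sketch below is one way to do it, not a fix taken from the thread: the names result-type-for and replace-subsequence are hypothetical, lists are added as a third case, and any vector that is not a string comes back as a general vector.

```lisp
;; Hypothetical helper: a length-free type specifier for CONCATENATE.
(defun result-type-for (seq)
  (etypecase seq
    (list 'list)
    (string 'string)      ; must come before VECTOR, since strings are vectors
    (vector 'vector)))

;; Hypothetical portable variant: replace every non-overlapping
;; occurrence of OLD in SEQ with NEW.
(defun replace-subsequence (seq old new &key (test #'eql))
  (assert (plusp (length old)))
  (let ((type (result-type-for seq))
        (old-len (length old)))
    (loop with start = 0
          for pos = (search old seq :start2 start :test test)
          ;; Keep the untouched stretch before the match (or the tail).
          collect (subseq seq start pos) into pieces
          while pos
          collect new into pieces
          do (setf start (+ pos old-len))
          finally (return (apply #'concatenate type pieces)))))
```

With this, (replace-subsequence "abc" "b" "123") returns "a123c" regardless of what TYPE-OF reports for the input. Because the pieces are passed through APPLY, the number of matches is bounded by CALL-ARGUMENTS-LIMIT.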

David Sletten
Jeff

2004-12-11, 3:58 pm

Chris Capel wrote:

> Chris Capel wrote:
>
>
> Well, I guess it doesn't. How did something like this not make it
> into the standard? Did people just not use strings back then? Every
> other language I know of has something to do this, ...


What languages?

Aside from scripting languages that have some form of regular
expression parsing (eg, Perl, Python), I don't know of any [compiled]
language that can replace sections of a string of different lengths. It
is just easier in some languages to find the string you want replaced,
and easier to concatenate your new string with the previous characters
that you want to keep.

Jeff M.

--
http://www.retrobyte.org
mailto:massung@gmail.com
Frank Buss

2004-12-13, 4:08 pm

David Sletten <david@slytobias.com> wrote:

> But if the lengths are different you violate the type specifier:
> (replace-sequence "abc" "b" "123") =>
> *** - sequence type forces length 3, but result has length 5
>
> I think you need something like:
> (concatenate (typecase seq (string 'string) (vector 'vector)) ...)


But if I create my own sequence type, I have to add it to this typecase
list. How can I write a concatenate, which works for all sequence types,
and which perhaps creates a (SIMPLE-BASE-STRING 5) in CLISP, if the inputs
are (SIMPLE-BASE-STRING 2) and (SIMPLE-BASE-STRING 3)?

--
Frank Buß, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
Barry Margolin

2004-12-13, 9:01 pm

In article <cpkqbb$r3$1@newsreader2.netcologne.de>,
Frank Buss <fb@frank-buss.de> wrote:

> David Sletten <david@slytobias.com> wrote:
>
>
> but if I create my own sequence type, I have to add it to this typecase
> list. How can I write a concatenate, which works for all sequence types,
> and which perhaps creates a (SIMPLE-BASE-STRING 5) in CLISP, if the inputs
> are (SIMPLE-BASE-STRING 2) and (SIMPLE-BASE-STRING 3)?


CL doesn't provide a way for you to create new sequence types, so it
naturally doesn't provide a way to extend the sequence functions, either.

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
Dominic Robinson

2004-12-16, 8:58 am

How about this variation which recurses rather than collecting the
hits:

(with-apologies-for-google-groups-taking-liberties-with-indentation

(defun replace-sequence (string old new)
(let ((old-length (length old))
(new-length (length new)))
(labels ((replacer (pos new-pos)
(let ((match-pos (search old string :start2 pos)))
(if match-pos
(let* ((match-offset (- match-pos pos))
(next-pos (+ match-pos old-length))
(next-new-pos (+ new-pos match-offset
new-length))
(new-string (replacer next-pos
next-new-pos)))
(replace new-string string
:start1 new-pos
:start2 pos :end2 match-pos)
(replace new-string new
:start1 (+ new-pos match-offset)))
(if (= pos 0)
string
(let ((new-string (make-array (+ new-pos (-
(length string) pos)) :element-type (array-element-type string))))
(replace new-string string
:start1 new-pos
:start2 pos)))))))
(replacer 0 0))))

)

It won't handle overlapping hits of course as the matching process only
sees the original string.


Dominic Robinson

Dominic Robinson

2004-12-16, 8:58 am

How about this variation which recurses rather than collecting the
hits:

;(defun replace-sequence (string old new)
;  (let ((old-length (length old))
;        (new-length (length new)))
;    (labels ((replacer (pos new-pos)
;               (let ((match-pos (search old string :start2 pos)))
;                 (if match-pos
;                     (let* ((match-offset (- match-pos pos))
;                            (next-pos (+ match-pos old-length))
;                            (next-new-pos (+ new-pos match-offset new-length))
;                            (new-string (replacer next-pos next-new-pos)))
;                       (replace new-string string
;                                :start1 new-pos
;                                :start2 pos :end2 match-pos)
;                       (replace new-string new
;                                :start1 (+ new-pos match-offset)))
;                     (if (= pos 0)
;                         string
;                         (let ((new-string (make-array (+ new-pos (- (length string) pos))
;                                                       :element-type (array-element-type string))))
;                           (replace new-string string
;                                    :start1 new-pos
;                                    :start2 pos)))))))
;      (replacer 0 0))))

Overlapping replacements are not handled as the matching pass only sees
the original string.

Apologies for the ;s - but at least they stop google groups eating the
indentation
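
For comparison, the same allocate-once idea can be written iteratively by collecting the hits first and then filling a string of the final size. This is a sketch under stated assumptions: the name replace-all-preallocated is hypothetical, the input is assumed to be a string (MAKE-STRING is used for the result), and matches are taken left to right without overlap.

```lisp
;; Hypothetical two-pass variant: find all hits, allocate the result
;; once at its final length, then copy the pieces into place.
(defun replace-all-preallocated (string old new)
  (assert (plusp (length old)))
  (let* ((old-len (length old))
         (new-len (length new))
         (hits (loop for pos = (search old string)
                       then (search old string :start2 (+ pos old-len))
                     while pos
                     collect pos)))
    (if (null hits)
        string                               ; nothing to replace
        (let ((result (make-string (+ (length string)
                                      (* (length hits) (- new-len old-len)))))
              (src 0)                        ; read position in STRING
              (dst 0))                       ; write position in RESULT
          (dolist (hit hits)
            ;; Copy the text between the previous hit and this one.
            (replace result string :start1 dst :start2 src :end2 hit)
            (incf dst (- hit src))
            ;; Write the replacement in place of OLD.
            (replace result new :start1 dst)
            (incf dst new-len)
            (setf src (+ hit old-len)))
          ;; Copy whatever follows the last hit.
          (replace result string :start1 dst :start2 src)
          result))))
```

The recursion depth of the version above grows with the number of matches; this form trades that for a temporary list of hit positions.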

Dominic Robinson

Chris Capel

2004-12-18, 12:51 pm

Chris Capel wrote:

> You'd figure CL would have something that does this,
> but I couldn't find anything.


Well, I guess it doesn't. How did something like this not make it into the
standard? Did people just not use strings back then? Every other language I
know of has something to do this, that works at least for strings, built
in. Was the ANSI committee simply too pressed for time?

Chris Capel
David Sletten

2004-12-18, 12:51 pm

Chris Capel wrote:
> I had a need just now to replace one substring in my string with a different
> substring, of a different length (for quoting parameters to SQL queries,
> incidentally). It appears that accomplishing this isn't straightforward in
> CL. Below is what I came up with. Does anyone have a better way than this?
> You'd figure CL would have something that does this, but I couldn't find
> anything. Maybe it's time to read the standard cover-to-cover.
>
> Chris Capel
>
> (defun replace-sequence (array old new)
>   (do ((new-array (make-array (length array)
>                               :adjustable t
>                               :element-type (array-element-type array)))
>        (new-array-position 0 (+ new-array-position (length old)))
>        (array-position (search old array)
>                        (search old array :start2 (1+ array-position)))
>        (prev-array-position 0 array-position)
>        (difference (if (> 0 #1=(- (length new) (length old))) nil #1#)))
>       ((not array-position) (progn (replace new-array array
>                                             :start1 new-array-position
>                                             :start2 prev-array-position)
>                                    new-array))
>     (replace new-array array
>              :start1 new-array-position
>              :start2 prev-array-position :end2 array-position)
>     (incf new-array-position (- array-position prev-array-position))
>     (when difference
>       (adjust-array new-array (+ (length new-array) difference)))
>     (replace new-array new :start1 new-array-position)))
>

May I suggest that whichever implementation you choose, you follow the
convention of parameter order in SUBST and SUBSTITUTE? Namely,
(defun replace-sequence (new old seq) ...)
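
For reference, the standard functions take the new item first, then the old one, then the tree or sequence. The sketch below shows that order and applies it to a hypothetical string-only REPLACE-SUBSEQ (an illustrative name, not from this thread) that replaces just the first occurrence.

```lisp
;; Standard functions: NEW comes first, then OLD, then the tree or sequence.
(substitute #\- #\Space "a b c")    ; => "a-b-c"
(subst 'x 'b '(a b (c b)))          ; => (A X (C X))

;; Hypothetical string-only function using the same argument order.
;; It replaces only the first occurrence of OLD.
(defun replace-subseq (new old string)
  (let ((hit (search old string)))
    (if hit
        (concatenate 'string
                     (subseq string 0 hit)
                     new
                     (subseq string (+ hit (length old))))
        string)))                   ; no match: the original string is returned

(replace-subseq "that" "this" "Is this not pung?")   ; => "Is that not pung?"
```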

David Sletten
David Sletten

2004-12-18, 12:51 pm

Frank Buss wrote:


> A function, which works with every sequence (I hope):
>
> (defun replace-sequence (seq old new &key (start 0))
>   (let ((seq-len (length seq))
>         (old-len (length old))
>         (new-len (length new)))
>     (if (or (< (- seq-len start) old-len) (= old-len 0))
>         (copy-seq seq)
>         (if (search seq old :start1 start :end1 (+ start old-len))
>             (replace-sequence
>              (concatenate (type-of seq)
                            ^^^^^^^^^^^^^

This doesn't seem to be portable. In LispWorks:
(type-of "foo") => SIMPLE-TEXT-STRING
But in CLISP and SBCL:
(type-of "foo") => (SIMPLE-BASE-STRING 3)

So your code works as long as the NEW sequence is the same length as the
OLD:
(replace-sequence "Is this not pung?" "this" "that") => "Is that not pung?"

But if the lengths are different you violate the type specifier:
(replace-sequence "abc" "b" "123") =>
*** - sequence type forces length 3, but result has length 5

I think you need something like:
(concatenate (typecase seq (string 'string) (vector 'vector)) ...)
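
A slightly fuller sketch of that idea follows. The helper names RESULT-TYPE-OF and SPLICE are hypothetical and not from Frank's code. It uses ETYPECASE and adds a LIST clause, since with plain TYPECASE a list would fall through and yield NIL, which is not a usable result type for CONCATENATE.

```lisp
;; Hypothetical helper: choose a CONCATENATE result type with no length baked in.
(defun result-type-of (seq)
  (etypecase seq
    (string 'string)    ; must precede VECTOR, since strings are vectors
    (vector 'vector)
    (list 'list)))

;; Hypothetical helper showing the type choice in use.
(defun splice (seq start end new)
  "Return a fresh sequence like SEQ with the elements in [START, END) replaced by NEW."
  (concatenate (result-type-of seq)
               (subseq seq 0 start)
               new
               (subseq seq end)))

(splice "abc" 1 2 "123")          ; => "a123c"
(splice '(a b c) 1 2 '(1 2 3))    ; => (A 1 2 3 C)
(splice #(a b c) 1 2 #(1 2 3))    ; => #(A 1 2 3 C)
```

One trade-off: passing plain VECTOR as the result type may not preserve a specialized element type such as that of a bit vector.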

David Sletten
Jeff

2004-12-18, 12:51 pm

Chris Capel wrote:

> Chris Capel wrote:
>
>
> Well, I guess it doesn't. How did something like this not make it
> into the standard? Did people just not use strings back then? Every
> other language I know of has something to do this, ...


What languages?

Aside from scripting languages that have some form of regular
expression support (e.g., Perl, Python), I don't know of any [compiled]
language that can replace a section of a string with one of a different
length. It is just easier in some languages to find the string you want
replaced, and easier to concatenate your new string with the surrounding
characters that you want to keep.

Jeff M.

--
http://www.retrobyte.org
mailto:massung@gmail.com

Copyright 2008 codecomments.com