# Essays/Substring Replacement

## Substring Replacement

`x replace p;q` replaces all occurrences of `p` by `q` in `x` .

replace=: 4 : 0 'p q'=. y j=. p nosindx x if. ''-:j do. x return. end. d=. p-&#q if. d do. b=. 1$~#x if. d>0 do. b=. 0 (,j+/i.d)} b else. b=. (1 j. -d) j} b end. x=. b # x j=. j-d*i.#j end. k=. <^:2"_1 j+/i.#q x=. q k} x ) nosindx=: 4 : 0 s=. x I.@E. y i=. s I. s+#x (i.&_1 {. ]) (s,_1) {~ (i,_1) {~^:a: 0 )

`replace` uses `nosindex` , adapted from
Non-Overlapping Substrings
to find the indices of non-overlapping substring occurrences. `nosindex` is
equivalent to `I.@E.` if substring occurrences
do not overlap (*i.e.* if no non-trivial prefix of `p` is a suffix of `p` ).

Examples of `replace` in use:

x=: 'of the people, by the people, for the people' x replace 'the';'a' of a people, by a people, for a people x replace 'the';'one' of one people, by one people, for one people x replace 'the';'them there' of them there people, by them there people, for them there people

## Program Logic

Let `d=. p-&#q` be the difference between the length
of `p` (the target string) and the length of `q` (the replacement string).
The processing depends on whether `d` is positive, zero, or negative:

- If
`d`is positive (`p`is longer than`q`),`x`is first shrinked to drop excessive elements and then amended by`q`. - If
`d`is zero (`p`has the same length as`q`),`x`is simply amended by`q`. - If
`d`is negative (`p`is shorter than`q`),`x`is first expanded to make room and then amended by`q`.

Let `j=. x nosindex p` be the indices of non-overlapping occurrences
of `p` in `x` . If `p` and `q` differ in length, then `j` should be adjusted: `j=. j-d*i.#j` . The indices to be amended are `k=. j+/i.#q` . An additional post-processing to k is required to avoid scatter-modify, so the final expression will be `k=. <^:2"_1 j+/i.#q` .

The following examples illustrate the three cases:

x=: 'of the people, by the people, for the people' p=: 'the' ] j=: p nosindx x 3 18 34

### p is longer than q (d>0)

q=: 'a' ] d=: p-&#q 2 j 3 18 34 ,j+/i.d 3 4 18 19 34 35 x ,: '.1' {~ 0 (,j+/i.d)} b=. 1$~#x of the people, by the people, for the people 111..1111111111111..11111111111111..11111111 ] x=. (0 (,j+/i.d)} b) # x of e people, by e people, for e people ] j=. j-d*i.#j 3 16 30 ] k=. <^:2"_1 j +/ i. # q +---+----+----+ |+-+|+--+|+--+| ||3|||16|||30|| |+-+|+--+|+--+| +---+----+----+ ] x=. q k} x of a people, by a people, for a people

### p has the same length as q (d=0)

q=: 'one' ] d=: p-&#q 0 j 3 18 34 ] k=. <^:2"_1 j +/ i. # q +-------+----------+----------+ |+-----+|+--------+|+--------+| ||3 4 5|||18 19 20|||34 35 36|| |+-----+|+--------+|+--------+| +-------+----------+----------+ ] x=. q k} x of one people, by one people, for one people

### p is shorter than q (d<0)

q=: 'them there' ] d=: p-&#q _7 j 3 18 34 1 j. -d 1j7 (t # x) ,: '.1' {~ 1 #~ t=. (1 j. - d) j} b=. 1$~#x of t he people, by t he people, for t he people 1111.......111111111111111.......1111111111111111.......111111111 ] x=. t # x of t he people, by t he people, for t he people ] j=. j - d * i. # j 3 25 48 ] k=. <^:2"_1 j +/ i. # q +------------------------+-------------------------------+-------------------------------+ |+----------------------+|+-----------------------------+|+-----------------------------+| ||3 4 5 6 7 8 9 10 11 12|||25 26 27 28 29 30 31 32 33 34|||48 49 50 51 52 53 54 55 56 57|| |+----------------------+|+-----------------------------+|+-----------------------------+| +------------------------+-------------------------------+-------------------------------+ ] x=. q k} x of them there people, by them there people, for them there people

## Non-Overlapping Occurrences

As stated above, `nosindx` finds the indices of non-overlapping
substring occurrences. It is the same as `I.@E.` if substring
occurrences do not overlap. What if they do overlap?

] x=: 40$'abcabcabc_' abcabcabc_abcabcabc_abcabcabc_abcabcabc_ x replace 'abcabc';'syzygy' syzygyabc_syzygyabc_syzygyabc_syzygyabc_ ] j0=: 'abcabc' I.@E. x 0 3 10 13 20 23 30 33 ] j1=: 'abcabc' nosindx x 0 10 20 30 x abcabcabc_abcabcabc_abcabcabc_abcabcabc_ '1' j0} '.'$~#x 1..1......1..1......1..1......1..1...... '1' j1} '.'$~#x 1.........1.........1.........1......... 'syzygy' (j0+/i.6)} x syzsyzygy_syzsyzygy_syzsyzygy_syzsyzygy_ 'syzygy' (j1+/i.6)} x syzygyabc_syzygyabc_syzygyabc_syzygyabc_ x replace 'abcabc';'syzygy' syzygyabc_syzygyabc_syzygyabc_syzygyabc_

Contributed by Roger Hui, with further contribution by Igor Zhuravlov.