On Sun, 02 Dec 2007 03:35:55 -0800, William James wrote:
> On Dec 1, 6:00 pm, Steffen Schuler <schuler.stef...@[EMAIL PROTECTED]> wrote:
>> Hello netlanders,
>>
>> is there a faster POSIX awk function solving task SEPSPLIT below?
>>
>> task SEPSPLIT
>>
>> searched:
>>
>> function sepsplit(re, a, s) in POSIX awk with: s is split into the
>> fields a[0], a[2], ...,a[2n] by the field separator regex re and
>> a[2*i+1] is the string matched by re between a[2*i] and a[2*i+2] for i
>> = 0, 1, ..., n - 1. sepsplit() returns n.
>>
>> solution:
>>
>> function sepsplit(re, a, s,    suffix, m) {
>>     m = 0
>>     suffix = s
>>     while (match(suffix, re)) {
>>         a[m++] = substr(suffix, 1, RSTART - 1)
>>         a[m++] = substr(suffix, RSTART, RLENGTH)
>>         suffix = substr(suffix, RSTART + RLENGTH)
>>     }
>>     a[m] = suffix
>>     return m / 2
>> }
>>
>> remark:
>>
>> a gawk solution using gensub() and index() is also possible.
>>
>> Any useful answer is appreciated.
>>
>> Regards,
>>
>> Steffen "goedel" Schuler
>
> I wrote this some years ago. Note that the string cannot contain ASCII
> 1.
>
> # Produces array of nonmatching and matching
> # substrings. The size of the array will
> # always be an odd number. The first and the
> # last item will always be nonmatching.
> function shatter( s, shards, regexp ) {
>     gsub( regexp, "\1&\1", s )
>     return split( s, shards, "\1" )
> }
Hi William, hello netlanders,
thanks a lot for your beautiful code, William. It's better than the code
I posted.
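
For anyone who wants the SEPSPLIT interface (fields in a[0], a[2], ...,
a[2n], return value n) on top of shatter(), here is a rough sketch. The
wrapper name sepsplit2() and the sample input "a1b22c" are only my
assumptions for illustration; shatter() itself is unchanged, so the
input still must not contain ASCII 1. I have not timed it.

```sh
out=$(awk '
# shatter() as posted above; s must not contain ASCII 1.
function shatter(s, shards, regexp) {
    gsub(regexp, "\1&\1", s)
    return split(s, shards, "\1")
}

# Hypothetical wrapper (not from the original posts): copies the
# 1-based result of shatter() into a 0-based array and returns the
# number of separators, as task SEPSPLIT asks.
function sepsplit2(re, a, s,    tmp, k, i) {
    k = shatter(s, tmp, re)
    if (k == 0) {          # empty input: split() yields no elements
        a[0] = ""
        return 0
    }
    for (i = 1; i <= k; i++)
        a[i - 1] = tmp[i]
    return (k - 1) / 2
}

BEGIN {
    n = sepsplit2("[0-9]+", a, "a1b22c")
    line = n ":"
    for (i = 0; i <= 2 * n; i++)
        line = line " [" a[i] "]"
    print line
}')
printf '%s\n' "$out"
```

With that input the fields land in a[0], a[2], a[4] and the separators
in a[1], a[3], so sepsplit2() returns 2 where shatter() alone returns 5.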
Kind regards,
Steffen "goedel" Schuler

