
Re: Splitting huge XML Files into fixsized wellformed parts

by Malapha <malapha@[EMAIL PROTECTED] > Mar 18, 2008 at 08:43 AM

On 18 Mar., 00:01, Hermann Peifer <pei...@[EMAIL PROTECTED]> wrote:
> Malapha wrote:
>
> > As I come from the VBA world, I tried to get familiar with awk. What
> > I do have is a theoretical solution in the form of a structured process
> > diagram :-)
>
> > Copy Header and Footer from Original to Var
> > Set Start_Offer = First Offer (from <Offer> to </Offer>)
> > Set End_Transaction = 0
> > Set Part = 0
> > Set FileSize = 0
> > Set MaxFileSize = 250
> > while Start_Offer < EOF(OriginalXMLFile)
> >      Part = Part + 1
> >      Open NewFile OriginalXMLFileName + Part + ".xml"
> >      Paste Header from Var to NewFile
> >      While filesize(NewFile) < MaxFileSize do
> >          Copy Offer (Start_Offer) from OriginalXMLFile to NewFile
> >          Start_Offer = Start_Offer + 1
> >      wend
> >      Paste Footer from Var to NewFile
> > wend
>
> > I am trying to translate this into awk right now. Please don't ask me
> > how far I am, it's frustrating :-)
>
> Below is one solution for splitting into well-formed chunks, here: 100
> OfferInfos each. There might be better solutions (I just don't know
> them ;-) It only works if the XML data is in "pretty print format", like
> the sample data you posted.
>
> $ cat split_bigfile.awk
>
> BEGIN { new_chunk = 1 ; size = 100 }   # size: OfferInfo elements per chunk
>
> # Lines 1 and 2 form the header. The footer is built from the first word
> # of line 2, so the root start tag must carry attributes ($1 is "<Name").
> NR == 1 { header = $0 ; next }
> NR == 2 { header = header ORS $0 ; footer = "</" substr($1,2) ">" ; next }
>
> # Copy every line except the original footer into the current chunk.
> $0 !~ footer {
>         if (new_chunk) {
>                 outfile = "chunk" sprintf("%07d", num) ".xml"
>                 print header > outfile
>                 new_chunk = 0
>         }
>         print > outfile
> }
>
> # After each complete OfferInfo, check whether the chunk is full.
> /<\/OfferInfo>/ {
>         # Pre-increment, so the first chunk also holds exactly size elements
>         # (count++ would put size + 1 elements into the first chunk).
>         num = int(++count/size)
>         if (num > prev_num) {
>                 print footer > outfile
>                 close(outfile)   # avoid running out of open files
>                 new_chunk = 1
>         }
>         prev_num = num
> }
>
> END { if (!new_chunk) print footer > outfile }

Hermann, you are great. As I wrote to Jürgen, I am unable to
check it yet, but I'll give it a try as soon as possible!

Thanks again
Malapha
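
The process diagram quoted above asks for a size limit per part, while the quoted script splits by element count. A size-based variant might look roughly like the sketch below. It is only a sketch under assumptions that are not from the thread: the sample data, the root element name `Offers`, the variable names `max_size` and `is_open`, and the limit of 80 characters are all made up for illustration, and `length()` counts characters, so the limit is approximate for multibyte text. Like the quoted script, it relies on pretty-printed input with the header on lines 1 and 2.

```sh
#!/bin/sh
# Work in a scratch directory so the demo does not touch real files.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Tiny made-up sample (hypothetical data, not the original poster's file).
cat > sample.xml <<'EOF'
<?xml version="1.0"?>
<Offers source="demo">
<OfferInfo>
<Id>1</Id>
</OfferInfo>
<OfferInfo>
<Id>2</Id>
</OfferInfo>
<OfferInfo>
<Id>3</Id>
</OfferInfo>
</Offers>
EOF

awk -v max_size=80 '
NR == 1 { header = $0; next }
NR == 2 {
    header = header ORS $0
    tag = $1                                # "<Offers" or "<Offers>"
    sub(/^</, "", tag); sub(/>$/, "", tag)  # keep only the element name
    footer = "</" tag ">"
    next
}
{
    line = $0
    gsub(/^[ \t]+|[ \t]+$/, "", line)
    if (line == footer) next                # original footer is re-added per chunk
    if (!is_open) {
        outfile = sprintf("chunk%07d.xml", num++)
        print header > outfile
        written = length(header) + 1        # +1 for the trailing newline
        is_open = 1
    }
    print > outfile
    written += length($0) + 1
}
# Only cut after a complete OfferInfo, so every chunk stays well-formed.
/<\/OfferInfo>/ && written >= max_size {
    print footer > outfile
    close(outfile)
    is_open = 0
}
END { if (is_open) { print footer > outfile; close(outfile) } }
' sample.xml

ls chunk*.xml
```

Because the cut only happens after a closing `</OfferInfo>`, a chunk can exceed `max_size` by up to one element plus the footer. With the small sample and limit above, each chunk ends up holding a single OfferInfo.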



