Re: Splitting huge XML Files into fixsized wellformed parts

by Hermann Peifer <peifer@[EMAIL PROTECTED]>, Mar 18, 2008 at 12:01 AM

Malapha wrote:
> 
> As I come from the VBA world, I tried to get familiar with awk. What
> I do have is a theoretical solution in the form of a structured process
> diagram :-)
> 
> Copy Header and Footer from Original to Var
> Set Start_Offer = First Offer (from <Offer> to </Offer>)
> Set End_Transaction = 0
> Set Part = 0
> Set FileSize = 0
> Set MaxFileSize = 250
> while Start_Offer < EOF(OriginalXMLFile)
>      Part=Part+1
>      Open NewFile OriginalXMLFileName + Part + ".xml"
>      Paste Header from Var to NewFile
>      While filesize(NewFile)<MaxFileSize do
>          Copy Offer (Start_Offer) from OriginalXMLFile to NewFile
>          Start_Offer=Start_Offer + 1
>      wend
>      Paste Footer from Var to NewFile
> wend
> 
> I am right now trying to translate this into awk. Please don't ask me
> how far I am, it's frustrating :-)
> 
> 

Below is one solution for splitting into well-formed chunks, here with 100 
OfferInfo elements each. There might be better solutions (I just don't know 
them ;-). It only works if the XML data is in "pretty print" format, like 
the sample data you posted.
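
What "pretty print" means here, as far as the script is concerned: the XML declaration sits on line 1, the root start tag on line 2, and nothing of the next offer follows an </OfferInfo> on the same line. A rough pre-check could look like the sketch below. The file name sample.xml, the root element Offers, and the Id element are made up for illustration; only OfferInfo comes from your data.

```shell
# Hypothetical sample; line 5 deliberately breaks the assumed layout.
cat > sample.xml <<'EOF'
<?xml version="1.0"?>
<Offers>
  <OfferInfo>
    <Id>1</Id>
  </OfferInfo><OfferInfo>
    <Id>2</Id>
  </OfferInfo>
</Offers>
EOF

# Report every line where more markup follows </OfferInfo> on the same line.
# awk exits with status 1 if at least one such line was found.
status=0
bad=$(awk '
/<\/OfferInfo>[ \t]*[^ \t\r]/ { print FILENAME ": line " NR ; n++ }
END { exit (n > 0) }
' sample.xml) || status=$?

echo "$bad"
```

If this reports any lines, the file would probably need to be re-indented (for example with an XML formatter) before a line-based split can be trusted.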


$ cat split_bigfile.awk

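BEGIN {	new_chunk = 1 ; size = 100 }	# size = OfferInfo elements per chunk

# Line 1 (XML declaration) and line 2 (root start tag) form the header
NR == 1 { header = $0 ; next }
NR == 2 {
	header = header ORS $0
	root = substr($1,2) ; sub(/>.*/, "", root)	# also handles a root tag without attributes
	footer = "</" root ">"
	next
}

# Copy every line except the original closing root tag
$0 !~ footer {
	if (new_chunk) {
		outfile = "chunk" sprintf("%07d", num) ".xml"
		print header > outfile
		new_chunk = 0
	}
	print > outfile
}

# After every size-th </OfferInfo>, finish the current chunk
/<\/OfferInfo>/ {
	num = int(++count/size)		# pre-increment: exactly size offers in every chunk
	if (num > prev_num) {
		print footer > outfile
		close(outfile)		# avoid running out of open files
		new_chunk = 1
	}
	prev_num = num
}

# Close the last, possibly shorter, chunk
END { if (!new_chunk) print footer > outfile }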
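
Your process diagram splits by file size rather than by number of offers. A variant along those lines might look like the sketch below. It is untested against your real data, and the names split_by_size.awk, max_bytes, part*.xml and offers.xml are invented here. It assumes the same "pretty print" layout as the script above, and it treats your MaxFileSize of 250 as megabytes, which is a guess. A chunk is closed at the first </OfferInfo> after the limit is reached, so chunks end up slightly larger than max_bytes, just as in your while loop.

```shell
cat > split_by_size.awk <<'EOF'
# Sketch: start a new chunk once the current one has reached max_bytes,
# but only at an </OfferInfo> boundary, so every chunk stays well-formed.
BEGIN { if (max_bytes == "") max_bytes = 250 * 1024 * 1024 }	# assumed default

NR == 1 { header = $0 ; next }
NR == 2 {
	header = header ORS $0
	root = substr($1, 2) ; sub(/>.*/, "", root)
	footer = "</" root ">"
	next
}

index($0, footer) { next }	# skip the original closing root tag

{
	if (outfile == "") {	# no chunk open: start the next one
		outfile = sprintf("part%07d.xml", num++)
		print header > outfile
		bytes = length(header) + 1
	}
	print > outfile
	# length() counts characters, so this is only approximate
	# for multi-byte text; +1 accounts for the newline
	bytes += length($0) + 1
}

/<\/OfferInfo>/ && bytes >= max_bytes {
	print footer > outfile
	close(outfile)
	outfile = ""
}

END { if (outfile != "") print footer > outfile }
EOF

# Tiny made-up input: 5 offers of 36 bytes each.
{
	printf '<?xml version="1.0"?>\n<Offers>\n'
	for i in 1 2 3 4 5; do
		printf '<OfferInfo>\n<Id>%s</Id>\n</OfferInfo>\n' "$i"
	done
	printf '</Offers>\n'
} > offers.xml

awk -v max_bytes=100 -f split_by_size.awk offers.xml
```

With the 100-byte limit used in this toy run, the five offers are spread over three part files holding 2, 2 and 1 offers. For the real file you would drop the -v option or set max_bytes to whatever limit you need.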



