On 3/18/2008 7:53 AM, iaminsik wrote:
> On 3=BF=F918=C0=CF, =BF=C0=C8=C48=BD=C343=BA=D0, Ed Morton <mor...@[EMAIL PROTECTED]
> wrote:
>=20
>>On 3/17/2008 11:30 PM, iaminsik wrote:
>>
>>
>>
>>
>>
>>
>>>On 3=BF=F918=C0=CF, =BF=C0=C8=C412=BD=C341=BA=D0, Joel Reicher <j...@[EMAIL PROTECTED]
> wrote:
>>
>>>>iaminsik <iamin...@[EMAIL PROTECTED]
> writes:
>>>
>>>>>Hi, I compared one element with others in a Hash (size : about 7,000=
).
>>>>
>>>>>--------------------------------------------------------------------=
----
>>>>>for(word1 in arrword)
>>>>>{
>>>>> for(word2 in arrword)
>>>>> {
>>>>> # compare word1 with word2.
>>>>> }
>>>>>}
>>>>>--------------------------------------------------------------------=
----
>>>>
>>>>>But, in every iteration, the memory-usage increased.
>>>>
>>>>>Somebody help me?
>>>>
>>>>I would guess your comparison procedure is causing memory allocation,=
>>>>but you haven't said what it's doing.
>>>
>>>I experimented word-similarity with predicates.
>>>My source code is very long.... In the sub-routines, I frequently
>>>split a tring into an array, and delete the array.
>>
>>>If you have some times, let me know memory bugs...
>>
>>> print "calc similarity";
>>> for(word1 in arrword)
>>> {
>>> I1 =3D I_of_word(arrword[word1], cnt_all_noun_type);
>>> for(word2 in arrword)
>>> {
>>> if(word1 !=3D word2 && arrwordfreq[word1]>10 &&
>>>arrwordfreq[word2]>10 )
>>> #if(word1 !=3D word2 )
>>> {
>>> I2 =3D I_of_word(arrword[word2], cnt_all_noun_type);
>>> ret =3D Co_I_of_words(arrword[word1], cnt_all_noun_type, word2=
);
>>> ln=3Dsplit(ret, arrret, SUBSEP);
>>> ISame =3D arrret[1];
>>> ISame =3D ISame * 2;
>>> I =3D ISame / ( I1 + I2);
>>> if( I )
>>> print word1 "(" arrwordfreq[word1] ")", I, word2 "("
>>>arrwordfreq[word2] ")", arrret[2], arrret[3] > "Dekang_over10.txt";
>>> delete arrret;
>>> }
>>> }
>>> }
>>>}
>>
>>># Co-Information Sum between Two Words
>>>function Co_I_of_words (features,cnt_all_type,word2) # divided by [|]
>>>{
>>> cofeatures1=3Dcofeatures2=3D"";
>>> I=3D0.0;
>>> word1num=3Dsplit(features,arrfeature2,"[|]");
>>> for(i=3D1;i<=3Dword1num;i++)
>>> {
>>> # check whether word2 has the same feature.
>>> if(arrwordfeaturerelation[word2 SUBSEP arrfeature2[i]] !=3D 0)
>>> {
>>> I +=3D I_of_feature(arrfeature2[i], cnt_all_type);
>>> arrfeaturestr=3Darrfeature2[i];
>>> gsub(SUBSEP, "/", arrfeaturestr);
>>> cofeatures2=3Dcofeatures2 (cofeatures2=3D=3D""?"":"|") arrfeatur=
estr \
>>> "(" arrwordfeaturerelation[word2 SUBSEP arrfeature2[i]]
>>>")";
>>> cofeatures1=3Dcofeatures1 (cofeatures1=3D=3D""?"":"|") arrfeatur=
estr \
>>> "(" arrwordfeaturerelation[word1 SUBSEP arrfeature2[i]]
>>>")";
>>> }
>>> else
>>> delete arrwordfeaturerelation[word2 SUBSEP arrfeature2[i]];
>>> }
>>> delete arrfeature2;
>>
>>> retval=3DI SUBSEP cofeatures1 SUBSEP cofeatures2;
>>> return retval;
>>>}
>>
>>># Information Sum of Word
>>>function I_of_word (features,cnt_all_type) # divided by [|]
>>>{
>>> I=3D0.0;
>>> word1num=3Dsplit(features,arrfeature1,"[|]");
>>> for(i=3D1;i<=3Dword1num;i++)
>>> {
>>> I +=3D I_of_feature(arrfeature1[i], cnt_all_type);
>>> }
>>> delete arrfeature1;
>>
>>> return I;
>>>}
>>
>>># Information A Feature
>>>function I_of_feature (feature_relation,cnt_all_type) # divided by [|]=
>>>{
>>> lchild=3Dsplit(arrfeature[feature_relation],div,"[|]");
>>> delete div;
>>> return -log(lchild / cnt_all_type);
>>>}
>>
>>># NonDuplicateString
>>>function InsertArrayString (ArrString, newbie)
>>>{
>>> n=3Dsplit(ArrString, Container, "[|]");
>>> bflag=3D0;
>>> for(i=3D1;i<=3Dn;i++)
>>> {
>>> if(Container[i]=3D=3Dnewbie) bflag=3D1;
>>> }
>>> delete Container;
>>> if(bflag=3D=3D0)
>>> ArrString =3D ArrString (ArrString=3D=3D""?"":"|") newbie;
>>> return ArrString;
>>>}
>>
>>># Get Function String
>>>function CleanStringToFunc (String)
>>>{
>>> retstr=3D"";
>>> n=3Dsplit(String, Arrstr, " [+] ");
>>> for(i=3D1;i<=3Dn;i++)
>>> {
>>> if(Arrstr[i] ~ /\/fjc/)
>>> {
>>> tempstr=3DArrstr[i];
>>> gsub(/^[^\/]*\//,"",tempstr);
>>> delete Arrstr;
>>> return tempstr;
>>> }
>>> }
>>> delete Arrstr;
>>> return "NULL";
>>>}
>>
>>># CleanString
>>>function CleanStringToVerb (String)
>>>{
>>> retstr=3D"";
>>> n=3Dsplit(String, Arrstr, " [+] ");
>>> for(i=3D1;i<=3Dn;i++)
>>> {
>>> if(Arrstr[i] ~ /\/CMC/ || Arrstr[i] ~ /\/YBDO/ \
>>> || Arrstr[i] ~ /\/YBHO/ \
>>> || Arrstr[i] ~ /\/fpd/ || Arrstr[i] ~ /\/fph/)
>>> {
>>> if(retstr=3D=3D"") retstr =3D retstr Arrstr[i];
>>> else retstr =3D retstr " + " Arrstr[i];
>>> }
>>> }
>>> delete Arrstr;
>>> return retstr;
>>>}
>>
>>>Thanks!
>>>Remi.
>>
>>>>Cheers,
>>>
>>>> - Joel- =B5=FB=BF=C2 =C5=D8=BD=BA=C6=AE =BC=FB=B1=E2=B1=E2 -
>>>
>>>>- =B5=FB=BF=C2 =C5=D8=BD=BA=C6=AE =BA=B8=B1=E2 -
>>>
>>It may be possible for some elements of arrwordfeaturerelation[] not to=
get
>>deleted since in Co_I_of_words() you only delete them if the condition
>>"arrwordfeaturerelation[word2 SUBSEP arrfeature2[i]] !=3D 0" is false (=
though I
>>don't see where you increment that array element anyway so I assume it'=
s in some
>>piece of code you didn't post) and you only delete the word2-indexed en=
tries
>>
>> delete arrwordfeaturerelation[word2 SUBSEP arrfeature2[i]]
>>
>=20
>=20
> In the 'else' control structure where I deleted an element of
> arrwordfeaturerelation,
> I tried to check that if I refer an empty element of HASH in the 'if'
> condition, something(null value) will be created.
> I mean it is a verification code.
>
If you want to check for an index being present in an array, use the "in"=
operator:
if (index in array) ...
don't create the element then destroy it:
if (!array[index])
delete array[index]
>=20
>>while in your code you use both word2 and word1indices and so create ar=
ray
>>entries for both:
>>
>> cofeatures2=3D...arrwordfeaturerelation[word2 SUBSEP arrfeature2[=
i]]
>> cofeatures1=3D...arrwordfeaturerelation[word1 SUBSEP arrfeature2[=
i]]
>>
>=20
>=20
> In my algorithm, both of 'arrwordfeaturerelation[word1 SUBSEP
> arrfeature2[i]]' and 'arrwordfeaturerelation[word2 SUBSEP
> arrfeature2[i]]'
> cannot generate an empty element in HASH.
I don't know what that means. The above code allocates 2 pieces of memory=
in the
array whether or not that array is being used as a hash table.
> But, I will check whether or not.
OK.
Ed.


|