In Perl, how does a hash store data in memory?


I have a large XML file, and parsing it takes too much memory. I believed much of that was due to the many user names in the file,
so I changed the length of each user name from 28 bytes to 10 bytes
and ran it again. But it still takes almost the same amount of memory.
The XML file is currently parsed with SAX, and during handling the result is stored in a hash structure like this:
$this->{'Date'}->{'School 1'}->{$Class}->{$Student} ...

Why is memory usage still so high after I reduced the length of the student names? Is it possible that when data is stored in a hash there is a lot of overhead, regardless of how long the string itself is?

All keys that have the same hash value (see the macro PERL_HASH_INTERNAL) go into the same bucket, which is a linear list.
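To make the bucket idea concrete, here is a toy model of bucket chaining written in Perl itself. It is only a sketch: Perl's real hashes are implemented in C with a different hash function, and `toy_hash`, `toy_store`, `toy_fetch` and the bucket count of 8 are hypothetical names and values chosen for illustration.

```perl
use strict;
use warnings;

# Toy model only. The hash function and the bucket count are
# assumptions for this sketch, not what Perl uses internally.
my $NUM_BUCKETS = 8;

sub toy_hash {
    my ($key) = @_;
    my $h = 0;
    $h = ($h * 33 + ord($_)) % 1_000_003 for split //, $key;
    return $h;
}

# Store a key/value pair: pick a bucket, then walk its linear list.
sub toy_store {
    my ($buckets, $key, $value) = @_;
    my $chain = $buckets->[ toy_hash($key) % $NUM_BUCKETS ] //= [];
    for my $entry (@$chain) {
        if ($entry->[0] eq $key) {
            $entry->[1] = $value;    # key already present: overwrite
            return;
        }
    }
    push @$chain, [ $key, $value ];  # new key: append to the chain
    return;
}

# Fetch a value: same bucket choice, same linear scan.
sub toy_fetch {
    my ($buckets, $key) = @_;
    my $chain = $buckets->[ toy_hash($key) % $NUM_BUCKETS ] or return;
    for my $entry (@$chain) {
        return $entry->[1] if $entry->[0] eq $key;
    }
    return;
}

my @buckets;
toy_store(\@buckets, "student$_", $_) for 1 .. 20;

my $used = grep { defined $_ } @buckets;
print "$used/$NUM_BUCKETS buckets used\n";
```

Every entry costs a small structure of its own (here a two-element array) on top of the key string, which is why the length of the key is only part of the memory picture.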

According to the perldata documentation:

If you evaluate a hash in scalar context, it returns false if the hash is empty. If there are any key/value pairs, it returns true; more precisely, the value returned is a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash. This is pretty much useful only to find out whether Perl's internal hashing algorithm is performing poorly on your data set. For example, you stick 10,000 things in a hash, but evaluating %HASH in scalar context reveals "1/16", which means only one out of sixteen buckets has been touched, and presumably contains all 10,000 of your items. This isn't supposed to happen. If a tied hash is evaluated in scalar context, a fatal error will result, since this bucket usage information is currently not available for tied hashes.

To see whether your dataset has a pathological distribution, you can inspect the various levels in scalar context, for example:

print scalar(%$this), "\n", scalar(%{ $this->{'Date'}{"School 1"} }), "\n", ...
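A runnable version of that inspection might look like the sketch below. One caveat: starting with Perl 5.26, `scalar(%hash)` returns the number of keys rather than the bucket string, and `Hash::Util::bucket_ratio` provides the older behaviour. The helper `bucket_usage` and the sample class and student names are hypothetical, made up for this example.

```perl
use strict;
use warnings;
use Hash::Util ();

# Return the "used/allocated" bucket string for a hash reference.
# Prefers Hash::Util::bucket_ratio where it exists (newer Perls) and
# falls back to scalar(%h), which reports buckets on older Perls.
sub bucket_usage {
    my ($href) = @_;
    return Hash::Util::bucket_ratio(%$href)
        if defined &Hash::Util::bucket_ratio;
    return scalar(%$href);
}

# Sample data shaped like the structure in the question (made-up values).
my $this = {};
for my $class (qw(ClassA ClassB ClassC)) {
    $this->{'Date'}{'School 1'}{$class}{"student$_"} = 1 for 1 .. 100;
}

# Inspect each level of nesting in turn.
print bucket_usage($this), "\n",
      bucket_usage($this->{'Date'}), "\n",
      bucket_usage($this->{'Date'}{'School 1'}), "\n",
      bucket_usage($this->{'Date'}{'School 1'}{'ClassA'}), "\n";
```

If the used count is close to the allocated count at every level, the keys are spreading well and the hashing algorithm is not the cause of the memory use.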

For a somewhat dated overview, see perl.com.

A modest reduction in the length of the students' names, which are keys four levels down, will not make a significant difference. In general, Perl has a strong bias toward throwing memory at problems. This is not your father's FORTRAN.
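One way to check this on your own machine is to measure two hashes that differ only in key length. The sketch below assumes the CPAN module Devel::Size is installed (it is not part of core Perl) and skips the measurement otherwise; `build_names` and the count of 10,000 keys are hypothetical choices. The exact figures depend on the Perl version and build, so none are quoted here.

```perl
use strict;
use warnings;

# Build a hash with $count distinct keys, each exactly $key_length
# characters long (zero-padded numbers), all mapping to the value 1.
sub build_names {
    my ($count, $key_length) = @_;
    my %names;
    $names{ sprintf("%0${key_length}d", $_) } = 1 for 1 .. $count;
    return \%names;
}

my $long  = build_names(10_000, 28);
my $short = build_names(10_000, 10);

# Devel::Size is an assumption: load it at run time and degrade
# gracefully if it is missing.
if (eval { require Devel::Size; 1 }) {
    for my $case ([ '28-byte keys', $long ], [ '10-byte keys', $short ]) {
        my ($label, $href) = @$case;
        my $bytes = Devel::Size::total_size($href);
        printf "%s: %d bytes total, %.1f bytes per entry\n",
            $label, $bytes, $bytes / scalar(keys %$href);
    }
}
else {
    print "Devel::Size not installed; skipping measurement\n";
}
```

Comparing the bytes-per-entry figure with the 18 bytes saved per key shows how much of each entry is fixed overhead rather than key text.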