gnuite
Posts: 1,245 | Thanked: 421 times | Joined on Dec 2005
#142
Originally Posted by murphy
About the database: is it really not possible to check for duplicate maps when using GDBM? You're talking about 5% of empty duplicate maps... with Google, I think, but with OpenStreetMap it's more like 90-95% "empty" maps in France!
I can work on getting a dup-checking solution for GDBM, but it was really a lot easier with sqlite. As for the space savings, even if 90% of maps are duplicates, those duplicate maps are the smallest ones, at about 106 bytes each, whereas busy maps can be as large as 30k, and almost-empty maps are around 4k.

If 90% of maps are 106 bytes, and the other 10% of maps average 15k, then out of every 100 maps the duplicates take up only 90 x 106 = 9,540 bytes, against roughly 153,600 bytes for the other 10. Removing the duplicates (even with no overhead) saves you just 5.85% of your disk space. And, in reality, tracking duplicates requires about 16 bytes per duplicate, which drops the savings to about 5%, so it'd be difficult to get savings that are worthwhile.
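
For anyone who wants to plug in their own numbers, here is a rough sketch of that arithmetic in Python. The function name and parameters are made up for illustration and are not part of the application. It assumes "15k" means 15 x 1024 bytes and that every duplicate map is removed outright, which is what the 5.85% figure implies.

```python
KIB = 1024  # assumption: "15k" in the post means 15 * 1024 bytes


def dedup_savings(dup_fraction, dup_size, other_size, overhead_per_dup=0):
    """Fraction of total disk space saved by dropping duplicate maps.

    dup_fraction     -- share of maps that are duplicates (0.0 to 1.0)
    dup_size         -- size in bytes of one duplicate map
    other_size       -- average size in bytes of a non-duplicate map
    overhead_per_dup -- bytes of bookkeeping kept for each removed duplicate
    """
    # Average bytes per map before de-duplication.
    total = dup_fraction * dup_size + (1 - dup_fraction) * other_size
    # Each removed duplicate frees its own size minus the tracking overhead.
    saved = dup_fraction * (dup_size - overhead_per_dup)
    return saved / total


# 90% duplicates at 106 bytes, the rest averaging 15k, no overhead.
print(f"{dedup_savings(0.9, 106, 15 * KIB):.2%}")      # 5.85%
# Same mix, but with about 16 bytes of tracking per duplicate.
print(f"{dedup_savings(0.9, 106, 15 * KIB, 16):.2%}")  # 4.97%
```

Raising dup_fraction to 0.95 or changing other_size shows how sensitive the result is to the mix of maps in a given region.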