Super-BitLocality-SensitiveHashing
JianqiuJi ,JianminLi ,ShuichengYan ,BoZhang ,QiTian
StateKeyLaboratoryofIntelligentTechnologyandSystems,
TsinghuaNationalLaboratoryforInformationScienceandTechnology(TNList),
DepartmentofComputerScienceandTechnology,
TsinghuaUniversity,Beijing100084,China
jijq10@http://www.77cn.com.cn,
{lijianmin,dcszb}@http://www.77cn.com.cn
DepartmentofElectricalandComputerEngineering,
NationalUniversityofSingapore,Singapore,117576
eleyans@nus.edu.sg
DepartmentofComputerScience,UniversityofTexasatSanAntonio,
OneUTSACircle,UniversityofTexasatSanAntonio,SanAntonio,TX78249-1644
qitian@cs.utsa.edu
Abstract
Sign-random-projectionlocality-sensitivehashing(SRP-LSH)isaprobabilisticdimensionreductionmethodwhichprovidesanunbiasedestimateofangularsim-ilarity,yetsuffersfromthelargevarianceofitsestimation.Inthiswork,wepro-posetheSuper-Bitlocality-sensitivehashing(SBLSH).Itiseasytoimplement,whichorthogonalizestherandomprojectionvectorsinbatches,anditistheoreti-callyguaranteedthatSBLSHalsoprovidesanunbiasedestimateofangularsim-ilarity,yetwithasmallervariancewhentheangletoestimateiswithin(0,π/2].Theextensiveexperimentsonrealdatawellvalidatethatgiventhesamelengthofbinarycode,SBLSHmayachievesigni cantmeansquarederrorreductioninestimatingpairwiseangularsimilarity.Moreover,SBLSHshowsthesuperiorityoverSRP-LSHinapproximatenearestneighbor(ANN)retrievalexperiments.1Introduction
Locality-sensitivehashing(LSH)methodaimstohashsimilardatasamplestothesamehashcodewithhighprobability[7,9].ThereexistvariouskindsofLSHforapproximatingdifferentdistancesorsimilarities,e.g.,bit-samplingLSH[9,7]forHammingdistanceand 1-distance,min-hash[2,5]forJaccardcoef cient.AmongthemaresomebinaryLSHschemes,whichgeneratebinarycodes.BinaryLSHapproximatesacertaindistanceorsimilarityoftwodatasamplesbycomputingtheHammingdistancebetweenthecorrespondingcompactbinarycodes.SincecomputingHammingdistanceinvolvesmainlybitwiseoperations,itismuchfasterthandirectlycomputingotherdis-tances,e.g.Euclidean,cosine,whichrequiremanyarithmeticoperations.Ontheotherhand,thestorageissubstantiallyreducedduetotheuseofcompactbinarycodes.Inlarge-scaleapplications
[20,11,5,17],e.g.near-duplicateimagedetection,objectandscenerecognition,etc.,weareoftenconfrontedwiththeintensivecomputingofdistancesorsimilaritiesbetweensamples,thenbinaryLSHmayactasascalablesolution.
1.1Locality-SensitiveHashingforAngularSimilarity
Formanydatarepresentations,thenaturalpairwisesimilarityisonlyrelatedwiththeanglebetweenthedata,e.g.,thenormalizedbag-of-wordsrepresentationfordocuments,images,andvideos,andthenormalizedhistogram-basedlocalfeatureslikeSIFT[18].Inthesecases,angularsimilarity
百度搜索“77cn”或“免费范文网”即可找到本站免费阅读全部范文。收藏本站方便下次阅读,免费范文网,提供经典小说教育文库2012--Super-Bit Locality-Sensitive Hashing在线全文阅读。
相关推荐: