ietf-irnss message



Subject: RE: Transport requirements for DNS-like protocols


--On Saturday, June 29, 2002 10:11 AM +0200 Patrik Fältström
<paf@cisco.com> wrote:

> --On 2002-06-28 10.24 -0700 Nicolas Popp <nico@realnames.com>
> wrote:
> 
>> As soon as you do fuzzy matching that forces you to retrieve
>> multiple records and rank them, the operational complexity is
>> increased ten-fold (and your query response time becomes way
>> more unpredictable unless you do a few "right things").
> 
> Doing fuzzy-matching is most efficiently done by doing a
> calculation of a hash on the search string (something like
> soundex) and then exact matching in the database.
> 
> So, fuzzy-matching is for me just another version of
> "preparation" of the search string.

Patrik,

In a number of areas, matching by distance function --i.e.,
knowing all of the things that might match and determining which
one(s) are closest-- has turned out to be much more useful than
matching on a canonical form.  In one of the classic examples,
the first-generation theory of how to do OCR was to try to
standardize ("prepare" in your terminology) characters,
font-independent, down to a common abstraction.  Nice idea, but
it basically didn't work.  Instead, we now assume (with English)
that a given character has to match one of 62, and make a
tentative decision based on similarity functions.  Then we
repeat the process, looking up word-candidates in a dictionary
to see which candidates can be excluded because they are
uncommon in, or absent from, the language.
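As a rough illustration of matching by distance function, here is a minimal Python sketch. It assumes plain Levenshtein edit distance as the similarity measure and a small in-memory candidate list; the names `edit_distance` and `closest` are hypothetical, and a real OCR or name-lookup system would use a domain-specific similarity function instead.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def closest(query: str, candidates: list[str], limit: int = 3) -> list[str]:
    """Rank every known candidate by distance to the query (assumed helper).

    Ties are broken alphabetically so the ordering is deterministic.
    """
    ranked = sorted(candidates, key=lambda c: (edit_distance(query, c), c))
    return ranked[:limit]
```

Note that `closest` has to look at every candidate and then rank them, which is exactly the "retrieve multiple records and rank them" cost Nicolas describes above.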

Sonex/soundex matches are fuzzy matching, but they are not
fuzzy search; I think that fuzzy search is going to be needed
here.
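
To make the distinction concrete, here is a small Python sketch under stated assumptions: `soundex` is a simplified version of classic American Soundex (Patrik only said "something like soundex"), the `build_index` helper and the sample names are hypothetical, and the standard library's `difflib.get_close_matches` stands in for a real fuzzy search.

```python
import difflib

# Consonant classes used by classic American Soundex.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Canonical key: first letter plus up to three digits, zero-padded."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    digits = []
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        if ch not in "HW":  # H and W do not separate repeated codes
            prev = code
    return (letters[0] + "".join(digits) + "000")[:4]


def build_index(names):
    """Map each Soundex key to the names that share it (exact-match lookup)."""
    index = {}
    for name in names:
        index.setdefault(soundex(name), []).append(name)
    return index


names = ["Robert", "Rupert", "Rubin"]
index = build_index(names)

# Fuzzy matching via a prepared key: one hash, one exact lookup.
by_key = index.get(soundex("Rupert"), [])

# A typo in the first letter changes the key, so the exact lookup misses.
missed = index.get(soundex("Kobert"), [])

# Fuzzy search compares the query against the candidates themselves.
found = difflib.get_close_matches("Kobert", names, n=3, cutoff=0.6)
```

In this sketch the key lookup finds both "Robert" and "Rupert" for the query "Rupert", but returns nothing for "Kobert", while the similarity search over the candidate list still surfaces "Robert".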

So, I hope you are right -- it would be a lot easier.  But...

     john


