
Just a sample of the Echomail archive

Cooperative anarchy at its finest, still active today. Darkrealms is the Zone 1 Hub.

   DBRIDGE      D'Bridge Support Echo      10,398 messages   


   Message 8,471 of 10,398   
   Rob Swindell to mark lewis   
   Dupeloops   
   20 Jun 18 11:44:26   
   
     Re: Dupeloops   
     By: mark lewis to Rob Swindell on Wed Jun 20 2018 08:08 am   
      
    >   
    >  On 2018 Jun 19 22:43:24, you wrote to me:   
    >   
    >  >> AFAIK, seenbys and paths are not included in most dupe detection   
    >  >> schemes... other non-changing control lines are fine to be included...   
    >  >> one of the problems comes when some system sort those control lines on   
    >  >> messages they are passing along... we don't see so much of that like we   
    >  >> did at one time ;)   
    >   
    >  RS> So some metadata is included in the data that is hashed for dupe   
    >  RS> detection and some is not?   
    >   
    > yes...   
    >   
    >  RS> Are you sure about that?   
    >   
    > yes... in fact, and i don't recall who pointed this out to me back in the
    > '90s, dbridge does exactly this in a manner of speaking... it takes the
    > whole message header plus X bytes immediately following the message header
    > and uses all of that as at least part of the checksum calculation... this
    > was pointed out to me when i was working on my posting tool and was adding
    > MSGID support to it...
    >   
    > i was using a library and just letting it do its thing... some of my test
    > posts were reported as dupes when they clearly weren't... IIRC, they were
    > detected as dupes because they were posted within the same second... it
    > turned out that my MSGID was somewhere in the middle of the control lines
    > at the beginning of the message body and only my dbridge using testers
    > were seeing this... someone pointed out this thing about dbridge also
    > using X bytes from the beginning of the message body in addition to the
    > message header so i moved my posting tool's MSGID to the top of the list
    > and no more dupes were detected by those dbridge systems...
    >   
    > i don't know what other systems do... there's only a very few that provide
    > this information... SBBS is one of them... when i was testing Mystic,
    > there was some discussion about dupe detection as james worked to try to
    > figure out the best method he liked... i have used fastecho here for
    > decades but i don't know what data it uses for its checksums... i do know
    > it uses two checksums, though... i know this because i was being nosy one
    > day and looking at FE's dupe database file (one for all message areas)
    > with a hex viewer and noticed that groups of bytes were repeated all
    > throughout the file... i asked about this and was told i found a bug...
    > basically, FE has two checksums that it uses for each message and both are
    > supposed to be stored in the database... what i found was that only one
    > was being used and written to both fields... toby fixed that problem right
    > quick... i just don't know what data is used to calculate them...
    >   
    > back in the day, dupe detection formulas were not really shared around...
    > maybe a couple of developers talking amongst themselves would tell each
    > other what they were doing but this information was not published where
    > everyone could find it... it was more or less black majik to a point...
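
The "header plus X bytes" behaviour described in the quote above can be illustrated with a rough sketch. This is not D'Bridge's actual code: the checksum algorithm (CRC-32 here), the value of X (`PREFIX_BYTES = 64`), the header layout, and every name below are assumptions made purely for illustration. It only shows why a MSGID buried deep in the control lines may fall outside the hashed region, while a MSGID at the top does not.

```python
import zlib

# Hypothetical stand-in for the unknown "X" mentioned in the quoted text.
PREFIX_BYTES = 64


def header_prefix_checksum(header: bytes, body: bytes,
                           prefix_bytes: int = PREFIX_BYTES) -> int:
    """Checksum the whole header plus only the first prefix_bytes of the body.

    CRC-32 is an assumption; the real algorithm is not stated in the thread.
    """
    return zlib.crc32(header + body[:prefix_bytes])


# Two posts made within the same second: identical (made-up) headers.
header = b"From: tester|To: All|Subj: test|Date: 20 Jun 18 11:44:26"

# Control lines that are the same in both posts (75 bytes, more than X).
other_kludges = (b"\x01CHRS: CP437 2\r"
                 b"\x01TZUTC: -0400\r"
                 b"\x01PID: ExampleTool 1.0\r"
                 b"\x01TID: ExampleTosser 1.0\r")

msgid_a = b"\x01MSGID: 1:1/1 00000001\r"
msgid_b = b"\x01MSGID: 1:1/1 00000002\r"
text = b"test post\r"

# MSGID after the other control lines: it lies beyond the hashed prefix.
late_a = other_kludges + msgid_a + text
late_b = other_kludges + msgid_b + text

# MSGID first: the differing serial number is inside the hashed prefix.
early_a = msgid_a + other_kludges + text
early_b = msgid_b + other_kludges + text

looks_like_dupe_late = (header_prefix_checksum(header, late_a)
                        == header_prefix_checksum(header, late_b))
looks_like_dupe_early = (header_prefix_checksum(header, early_a)
                         == header_prefix_checksum(header, early_b))
```

Under these assumptions `looks_like_dupe_late` comes out true and `looks_like_dupe_early` false, which matches the report that moving the MSGID to the top of the control lines made the false dupes go away.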
      
   To complete the discussion, Synchronet (smblib) actually uses multiple methods   
   of body text dupe detection:   
      
   1. A "legacy" CRC-32 hash of the body text, excluding any metadata, like FTN   
      control lines and excluding any trailing white-space or control-characters   
   2. A tuple of hashes (MD5 digest, CRC-32, and CRC-16) and length (char count)   
      of the body text excluding any metadata and *all* white-space characters   
      
   These are in addition to duplicate checks on the Internet (RFC-822)
   compliant Message-ID and the FTN-compliant Message-ID.
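
For readers who want to see the shape of the two body-text methods listed above, here is a minimal Python sketch. It is not smblib's code. The set of lines treated as metadata (SOH-prefixed kludge lines and `SEEN-BY:` lines), the UTF-8 encoding, the CRC-16 variant (`binascii.crc_hqx`, the CCITT polynomial), and all function names are assumptions chosen for illustration.

```python
import binascii
import hashlib
import re
import zlib

# Characters 0x00-0x20: control characters plus the space character.
_TRAILING_JUNK = "".join(chr(c) for c in range(0x21))


def strip_metadata(text: str) -> str:
    """Drop lines assumed to be FTN metadata: SOH kludges and SEEN-BY lines."""
    lines = re.split(r"\r\n|\r|\n", text)
    kept = [ln for ln in lines
            if not ln.startswith("\x01") and not ln.startswith("SEEN-BY:")]
    return "\n".join(kept)


def legacy_crc32(text: str) -> int:
    """Method 1: CRC-32 of the body without metadata or trailing
    white-space/control characters."""
    body = strip_metadata(text).rstrip(_TRAILING_JUNK)
    return zlib.crc32(body.encode("utf-8"))


def body_fingerprint(text: str) -> tuple:
    """Method 2: (MD5, CRC-32, CRC-16, length) of the body without metadata
    and with *all* white-space removed."""
    body = "".join(strip_metadata(text).split())
    data = body.encode("utf-8")
    return (hashlib.md5(data).hexdigest(),
            zlib.crc32(data),
            binascii.crc_hqx(data, 0),  # CRC-16 variant is an assumption
            len(body))                  # character count


# Same text, different MSGID and different trailing white-space.
msg_a = "\x01MSGID: 1:103/705 12345678\rHello world\r\rBye\r\r\n"
msg_b = "\x01MSGID: 1:103/705 deadbeef\rHello world\r\rBye"
# Same text again, but re-spaced in the middle.
msg_c = "Hello  world\r\rBye"
```

In this sketch `msg_a` and `msg_b` match under both methods, while `msg_c` only matches under the second one, since the legacy CRC-32 still sees the interior white-space. That seems to be the practical reason for keeping both.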
      
   No black majik here. :-)   
      
                                               digital man   
      
   Synchronet "Real Fact" #64:   
   Synchronet PCMS (introduced w/v2.0) is Programmable Command and Menu Structure.   
   Norco, CA WX: 77.6°F, 57.0% humidity, 8 mph ENE wind, 0.00 inches rain/24hrs
   --- SBBSecho 3.05-Linux   
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)   



(c) 1994,  bbs@darkrealms.ca