h a l f b a k e r y
This ain't rocket surgery.


IPv6 Worded Addresses

Allow words to be made out of IPv6 Addresses, or addresses chosen by word
  (+5, -2)

EDITED: I originally started with trying to map something like Microsoft:Domain Controller to an IPv6 address, but as [pmarks] pointed out, the network segment is used to implement hierarchical routing. But there's still hope for this dough....

The key problem I'm trying to solve is the memorability of IPv6 addresses. I believe any standard that makes an arbitrary IPv6 address more memorable is helpful (as long as it complements the IPv6 standard).

Consider the following random address: 862A:7373:3386:BF1F:8D77:D3D2:220F:D7E0 for SomeDomain.com

DNS is down and you need to ping it to see if it's up. In IT we need to memorize IPv4 addresses all the time. As pointed out by [pmarks], we can't let a word determine the address, but we do need to be able to extract a more memorable representation.

The best way I see to do this is similar to "scheme1": a one-to-many relationship between 4-bit values and characters. With every base16 value having 3 possible characters, an algorithm can search for a more memorable representation, using a dictionary of English words to help.

The result may be (only network segment shown - and indicative only): FEET3CATFFDOOR90:..... {Remembered as FEET 3 CAT FF DOOR 90}

Still some randomness in there, but heaps easier to remember.
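
To make the suggestion step concrete, here is a minimal sketch. The mapping table is an assumption made up for illustration (hex digits keep their own value, G..V take 0..15, W..Z wrap around to 0..3); it is not the table used by the ipv6namer tool, and `word_to_hex` and `suggest` are hypothetical names.

```python
# Hypothetical mapping (an assumption, not the ipv6namer table): hex digits keep
# their own value, G..V take 0..15, and W..Z wrap around to 0..3.
CHAR_TO_VALUE = {c: int(c, 16) for c in "0123456789ABCDEF"}
for i, c in enumerate("GHIJKLMNOPQRSTUVWXYZ"):
    CHAR_TO_VALUE[c] = i % 16


def word_to_hex(word):
    """Translate a word into the hex digits it would stand for."""
    return "".join(format(CHAR_TO_VALUE[c], "X") for c in word.upper())


def suggest(hex_string, dictionary):
    """Return sorted (position, word) hits for every word that fits somewhere."""
    hex_string = hex_string.upper()
    hits = []
    for word in dictionary:
        encoded = word_to_hex(word)
        start = hex_string.find(encoded)
        while start != -1:
            hits.append((start, word.upper()))
            start = hex_string.find(encoded, start + 1)
    return sorted(hits)
```

Under this made-up table, FEET3CATFFDOOR90 stands for the hex digits FEED3CADFFD88B90, and `suggest` finds FEET, CAT and DOOR in it when given a word list that contains them.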

So hopefully, we're still baking - I'll try and update my application to suggest memorable strings.

toadth, Nov 24 2009

Demo Tool https://sourceforge...projects/ipv6namer/
An open source tool which suggests words to use to aid in remembering IPv6 addresses. [toadth, Nov 27 2009, last modified Dec 17 2009]


       Interesting. (But I'm sticking with DNS)
phoenix, Nov 24 2009

       The problem is that if you limit the addresses to letter binhex, and then limit the letters to sequences that make sense, the number of available addresses falls very sharply. If you then allow that people will elect to set "wordy" things in there, suddenly we have a problem of duplication.
WcW, Nov 24 2009

       Comparison with DNS and IPv4:
       * DNS: 1 record mapping IP to name, synchronised across thousands of routers worldwide.
       * IPv4: Not very feasible; can't get enough characters, or have to use Scheme4, which has too high a probability of collisions.
       * IPv6: Numbers too hard to memorise with the current scheme. Plenty of space for a worded scheme without collisions. It is sort of a built-in DNS system, but it would be best to register your allocation with an authority to avoid any disputes.
toadth, Nov 25 2009

       [WcW] I believe that with Scheme1 (my preferred method), duplication is not a problem, probabilistically. If you just use 26 letters and 10 digits, you have 36 characters binding to 16 values, i.e. at most a digit and two letters are bound to the same value: 0, G and X all map to 0; F and Y map to 15.

       More research into English words - more specifically likely business nouns, would be able to determine the best mapping.   

       But even this shouldn't be a problem.   

       Consider the words "Apple" and "Microsoft". Unless {A and M}, {p and i}, {p and c}, {l and r}, etc. all mapped to the same values AND the hashed padding came out the same, there would be no collision.

       It's possible that a word like "Potato" could map to exactly the same values as "Corned", but even if that happened, the hashed padding of each word is based on its ASCII values, so the two would have different padding.

       Being "wordy", or even using short words, has little if any impact on the chance of collisions.

       Shorter words have a better chance of having the same Base16 values, but a much lower chance of having the same padding.

       Longer words have a much lower chance of having the same Base16 values, and a better chance of having the same padding.

       So hopefully, you now better understand the mechanics of the idea and the extremely rare probabilities involved regarding collisions.
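
       A minimal sketch of the word-plus-hashed-padding mechanics, under stated assumptions: the character table is made up for illustration (hex digits keep their value, G..V take 0..15, W..Z wrap to 0..3), and SHA-256 over the ASCII bytes is only a stand-in for the padding hash, which is not specified above. `word_to_host_segment` is a hypothetical name.

```python
import hashlib

# Hypothetical mapping (an assumption, not the real Scheme1 table).
CHAR_TO_VALUE = {c: int(c, 16) for c in "0123456789ABCDEF"}
for i, c in enumerate("GHIJKLMNOPQRSTUVWXYZ"):
    CHAR_TO_VALUE[c] = i % 16

HOST_NIBBLES = 16  # a 64-bit host segment is 16 hex digits


def word_to_host_segment(word):
    """Map a word to 16 hex digits: mapped letters first, hashed padding after."""
    word = word.upper()
    if len(word) > HOST_NIBBLES:
        raise ValueError("word does not fit in a 64-bit host segment")
    mapped = "".join(format(CHAR_TO_VALUE[c], "X") for c in word)
    # Padding comes from the ASCII bytes of the word; SHA-256 is a stand-in hash.
    digest = hashlib.sha256(word.encode("ascii")).hexdigest().upper()
    return mapped + digest[: HOST_NIBBLES - len(mapped)]
```

       With this table, BAT and RAT map to the same three values (B and R share a value), yet the padding differs because it is derived from the original ASCII text, which is the situation described for "Potato" and "Corned".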
toadth, Nov 26 2009

       [UnaBubba] ICANN don't have to. Like I said, you might just use it for the host segment (which you control). And in an IPv6 network I'd imagine you wouldn't have to type in the network segment when pinging a LAN computer by IPv6, for instance.

       E.g. Ping LOCAL:DomainController -t

       The OS would resolve LOCAL to be your local network segment, and resolve DomainController to a host segment (faster than a network DNS query).
toadth, Nov 27 2009

       [bobofthefuture] I don't think so. A network segment name wouldn't represent a "single" host (like DNS does); it represents the whole network, in which there can be 18 billion billion hosts.

       The DNS query of microsoft.com may return an IPv6 Readable Address of [Microsoft:Web]. Considering you don't need a billion billion addresses at home, your home IP may be in the form {ISP:Account} [iiNet Australia:John Smith]

       I would think that all network segments will be dispensed by IANA, and that buying a network segment would cost much more than buying a domain name.

       The issue here is, that even if my scheme is not used for IPv6 wording, the fact is there are enough bytes to make words and therefore there can be added value. And as you pointed out there may be auctions for words which describe subjects and topics (with Nouns being reserved for use by Companies owning the trademarks).
toadth, Dec 02 2009

       This idea shows a lack of understanding of what IPv6 addresses actually are. An address is not simply a random string of bits; address blocks are delegated to ISPs and their customers in a hierarchical manner, so that routes can be aggregated to prevent the size of the global routing table from exploding. For that reason, deriving the first half of the address from words is not feasible.   

       Typically with IPv6, the last 64 bits of the address are free to be assigned within a LAN segment, so word-derived suffixes could be somewhat interesting.   

       However, this idea suggests a non-reversible mapping which "probably won't collide". If you want a one-way mapping with collision resistance, you should be using a real hash function.   

       But, I think this idea would only remotely make sense if the mapping were bidirectional, so you could look at an IPv6 address and read the text from the suffix. You can pack A-Z and some punctuation into 6 bits, so a bidirectional mapping could be constructed that lets you store 10 characters in a 64-bit suffix.   
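
       A minimal sketch of such a bidirectional packing, for illustration only. The alphabet is an assumption (padding symbol, A-Z, digits, hyphen and dot, 39 symbols in all, so it fits in 6 bits), and `pack` and `unpack` are hypothetical names.

```python
import string

# Hypothetical alphabet (an assumption): index 0 is the padding symbol.
ALPHABET = "_" + string.ascii_uppercase + string.digits + "-."
BITS = 6          # 6 bits per character allows up to 64 symbols
MAX_CHARS = 10    # 10 * 6 = 60 bits, so the top 4 bits of the suffix stay zero


def pack(text):
    """Pack up to 10 characters into a 64-bit integer suffix."""
    text = text.upper()
    if len(text) > MAX_CHARS:
        raise ValueError("at most 10 characters fit in a 64-bit suffix")
    suffix = 0
    for ch in text.ljust(MAX_CHARS, "_"):
        suffix = (suffix << BITS) | ALPHABET.index(ch)
    return suffix


def unpack(suffix):
    """Read the text back out of a suffix produced by pack()."""
    chars = []
    for shift in range((MAX_CHARS - 1) * BITS, -1, -BITS):
        chars.append(ALPHABET[(suffix >> shift) & 0x3F])
    return "".join(chars).rstrip("_")
```

       Because every character has exactly one code, `unpack(pack(text))` returns the original text, which is the property a "probably won't collide" mapping lacks.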

       However, the whole idea is stupid because IPv6 addresses are low-level constructs for communicating between machines, not for looking pretty to humans. If you want human-readability, we've had the DNS system in place for 25 years now.
pmarks, Dec 07 2009

       [pmarks] Yeah, you're right - I'm going to change the original post. I did know there were provisions for hierarchical routing, but didn't think this meant building the hierarchy into the addresses. See RFC 2374. The first 48 bits of the network segment are for public routing and the last 16 bits are for private routing, leaving the last 64 bits for the host interface.
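
       As a small illustration of that 48/16/64 split, here is a sketch using Python's standard `ipaddress` module on the example address from the original post. `split_address` is a hypothetical helper name.

```python
import ipaddress


def split_address(text):
    """Split an IPv6 address into (public routing, private routing, host) parts."""
    value = int(ipaddress.IPv6Address(text))
    public = value >> 80                  # first 48 bits
    private = (value >> 64) & 0xFFFF      # next 16 bits
    host = value & ((1 << 64) - 1)        # last 64 bits (host interface)
    return public, private, host


public, private, host = split_address("862A:7373:3386:BF1F:8D77:D3D2:220F:D7E0")
print(f"{public:012X} {private:04X} {host:016X}")
```

       Only the last of the three parts is free to carry a user-chosen word.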

       Now that you mention it, implementing an 8 byte hash would've been easier, but more destructive and less reversible.

       My scheme is still recoverable though. The purpose of the scheme wasn't aesthetic human-readability - it's all about memorability.
toadth, Dec 11 2009

       You can still use a user-defined word to specify the host interface segment. But the network segment side cannot be derived from a word.
toadth, Dec 11 2009

       Go to the IPv6 namer link on SourceForge. I've updated the code to work the other way - helping you use English words to make IPv6 addresses easier to remember.

       Examples:
       A362D63D907327A2 = A3MIDMUD9GNU27A2
       5E8630C9659BC9B1 = LEOMUGS9MLPRCPR1

       I think it would be better to suggest words based on English phonetics, looking for common sounds {CH, ES, ED, ING, PH, BO, BI, BA, TT} to make up words such as CHEDING, BATTING, PHED, etc. They are still memorable, and this increases the chances of finding words.

       Feel free to contribute. I'm sure there's a linguist who can help, or someone who knows where I can find a list of English word components (phonetics).
toadth, Dec 17 2009

       Did some volume testing - here are some statistics: Probability of Finding an N length word [N,Chances out of 1mil.] = { [2, 999981], [3,973134], [4,506495], [5,87047], [6,8804], [7,615], [8,47], [9,4]}   

       Coverage of N characters with words of at least X length [N, X, Chances out of 100k] = {[16,2,82], [8,2,45249], [16,3,.5], [8,3,9579], [8,4,160], [8,5,5]}   

       In my testing I found the following HEX: B5ED 80BE 1EB6 C58E, which produced a fully covered address, using words no smaller than 3 characters - BLED OX BE HERM SLOE. Unfortunately, the chances of this happening are about 1 in 200,000.

       There's a 50/50 chance to have 8 characters covered with words at least 2 characters long. This needs to be much closer to 100%.   
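
       For anyone wanting to reproduce this kind of coverage test, here is a rough sketch of how it could be done. It is not the code from the SourceForge project: the character table is a made-up assumption (hex digits keep their value, G..V take 0..15, W..Z wrap to 0..3), the names are hypothetical, and the numbers it produces depend entirely on the word list supplied.

```python
import random

# Hypothetical mapping (an assumption, not the ipv6namer table).
CHAR_TO_VALUE = {c: int(c, 16) for c in "0123456789ABCDEF"}
for i, c in enumerate("GHIJKLMNOPQRSTUVWXYZ"):
    CHAR_TO_VALUE[c] = i % 16


def word_to_hex(word):
    return "".join(format(CHAR_TO_VALUE[c], "X") for c in word.upper())


def can_cover(hex_string, dictionary, min_len):
    """True if hex_string splits entirely into words of at least min_len characters."""
    hex_string = hex_string.upper()
    encoded = {word_to_hex(w) for w in dictionary if len(w) >= min_len}
    # reachable[k] is True when the first k hex digits can be covered by words.
    reachable = [True] + [False] * len(hex_string)
    for end in range(1, len(hex_string) + 1):
        reachable[end] = any(
            reachable[start] and hex_string[start:end] in encoded
            for start in range(end)
        )
    return reachable[-1]


def estimate_coverage(n_digits, dictionary, min_len, trials=10_000, seed=0):
    """Monte Carlo estimate of the chance that a random hex string is fully covered."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = "".join(rng.choice("0123456789ABCDEF") for _ in range(n_digits))
        hits += can_cover(sample, dictionary, min_len)
    return hits / trials
```

       Calling `estimate_coverage(8, words, 2)` with a full dictionary as `words` would correspond to the [8,2,...] entry in the coverage statistics.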

       I think the best way to achieve this is to go for the strategy of using small phonetics.

       Even better! Show the possibilities for each character and let the user string together something they think will be most suitable. I'll do this first.
toadth, Dec 18 2009

       Implemented 2 changes:

       1. Redistributed letters in the mapping to increase the probability of getting word matches. First I analysed the probability of a letter being used based on Dictionary.txt, then logically adjusted the mapping (e.g. there are 6 values which could only have one letter, so I assigned the most common letters there; then, where values had two letters, I paired the most common letters with the least common letters). Here are the updated stats:
       a) Probability of Finding an N length word [N,Chances out of 1mil.] = { [2, 1000000], [3,982931], [4,527349], [5,89684], [6,8848], [7,704], [8,40], [9,1]}
       b) Coverage of N characters with words of at least X length [N, X, Chances out of 100k] = {[16,2,102], [8,2,51631], [16,3,1.5], [8,3,10502], [8,4,178], [8,5,6]}

       2. Implemented "Pairs". The probability of having two specific letters next to each other is quite low. But if we take the most common pairs and bind them to values, we have better chances of making words. I analysed the probability of any two letters appearing together based on Dictionary.txt, then distributed 22 of the most common pairs across the mapping, starting at 0 (which is the least common end - with single character mapping). Here are the updated stats:
       a) { [2, 1000000], [3,999973], [4,841067], [5,185467], [6,15534], [7,847], [8,35], [9,1]}
       b) {[16,2,3780], [8,2,96882], [16,3,40], [8,3,38070], [8,4,790], [8,5,10]}
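
       The two analyses above could be sketched roughly as follows. This is an illustration under assumptions, not the project's code: it covers only the 26 letters (digits are left out), which value number each group lands on is arbitrary, and `redistribute` and `common_pairs` are hypothetical names. With 26 letters and 16 values, 6 values get one letter and 10 values get two.

```python
import string
from collections import Counter


def redistribute(words, values=16):
    """Assign letters to values: most common letters alone, the rest paired
    most-common with least-common."""
    counts = Counter(c for w in words for c in w.upper() if c in string.ascii_uppercase)
    ranked = sorted(string.ascii_uppercase, key=lambda c: (-counts[c], c))
    singles = 2 * values - len(ranked)        # 32 - 26 = 6 single-letter values
    groups = [[c] for c in ranked[:singles]]
    rest = ranked[singles:]
    while rest:
        groups.append([rest.pop(0), rest.pop()])  # most common with least common
    return {c: value for value, group in enumerate(groups) for c in group}


def common_pairs(words, top=22):
    """Return the most frequent two-letter sequences in the word list."""
    counts = Counter(w[i:i + 2].upper() for w in words for i in range(len(w) - 1))
    return [pair for pair, _ in counts.most_common(top)]
```

       In practice `words` would be the contents of a dictionary file such as the Dictionary.txt mentioned above.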

       I had to make some additional changes to support string to value mapping. When reading an IPv6 string, while iterating through each letter the algorithm must:
       i] look for pair matches first (looking ahead N characters).
       ii] if a string matches, ignore the following N characters in the string/pair (i.e. they can't be used as another pair).
       iii] if no string matches, use the regular character to value mapping.

       Because of the pairs, the words can actually be longer in characters than the original HEX string. E.g. SHORTEN in the old scheme would map to 7 HEX characters. In the new scheme EN is a common pair, so the word SHORTEN would map to 6 HEX characters. I believe the loss in efficiency is more than made up by the improvement in memorability. Remembering a series of single or twin characters is a lot harder than remembering words. The human brain doesn't just remember the spelling of "SHORTEN", but the action of SHORTENING. I encourage any academic institution to perform further research, where human subjects are given several tests made up of IPv6 Words to see which schemes are the best for memorability.
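
       The pair-first reading rule can be sketched like this. Both tables are assumptions invented for the example (the single-character table keeps hex digits as-is, G..V take 0..15, W..Z wrap to 0..3, and the pair values are arbitrary); only the fact that EN is a pair comes from the text above. `decode` is a hypothetical name.

```python
# Hypothetical tables (assumptions, not the ipv6namer mapping).
SINGLE = {c: int(c, 16) for c in "0123456789ABCDEF"}
for i, c in enumerate("GHIJKLMNOPQRSTUVWXYZ"):
    SINGLE[c] = i % 16
PAIRS = {"EN": 0x4, "RI": 0x7, "TH": 0x9}


def decode(text, pairs=PAIRS):
    """Read a worded string back into hex digits, trying pairs before single letters."""
    text = text.upper()
    values = []
    i = 0
    while i < len(text):
        candidate = text[i:i + 2]
        if candidate in pairs:
            values.append(pairs[candidate])
            i += 2          # both letters are consumed; neither can start another pair
        else:
            values.append(SINGLE[text[i]])
            i += 1
    return "".join(format(v, "X") for v in values)


print(len(decode("SHORTEN", pairs={})))  # 7 hex digits without pairs
print(len(decode("SHORTEN")))            # 6 hex digits once EN is a pair
```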

       Finally, I set out to achieve a much better chance of having 8 characters covered with words of at least 2 characters. With change 2, we went from a 50/50 chance to a 97/100 chance! The aim has been to make IPv6 addresses memorable, and to achieve this we need to ensure a high chance of providing words. I've just got to make a few final changes to the project before updating it on the website. I'll let you know when it's up.
toadth, Dec 18 2009

       The SourceForge project is updated. You can now demo using "pairs". While doing this and testing, I found a little setback. E.g. RI is a pair, but it is possible to select R (for its value - 14) and I (11) separately. I included a check to notify the user when such "Incident Pairs" occur. The user simply needs to select other words or change a value position to rectify such an error.
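
       One way such a check could work, sketched under assumptions: re-read the chosen characters with the pair-first rule and see whether they come back the same way they were picked. The pair set here is hypothetical apart from RI, and `tokenize` and `has_incident_pair` are made-up names, not functions from the project.

```python
# Hypothetical pair set (an assumption); only RI is taken from the text above.
PAIRS = {"RI", "EN", "TH"}


def tokenize(text):
    """Split a string into tokens the way a pair-first reader would."""
    tokens = []
    i = 0
    while i < len(text):
        step = 2 if text[i:i + 2] in PAIRS else 1
        tokens.append(text[i:i + step])
        i += step
    return tokens


def has_incident_pair(chosen):
    """True if the chosen tokens would be read back differently than intended."""
    chosen = [t.upper() for t in chosen]
    return tokenize("".join(chosen)) != chosen
```

       Here `has_incident_pair(["R", "I"])` is true, because the reader would see the single pair RI instead of two separate values, while `has_incident_pair(["RI"])` is false.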

       Due to this, I also need to update the program to avoid those problems automatically. This will also reduce the statistics slightly.
toadth, Dec 21 2009

