Thursday, March 6, 2014

Saturation of .ly Domains with English Adverbs

The .ly TLD is uniquely suited for domain hacks because “ly” is a common ending for many English adverbs.  I looked into the saturation of .ly domains suitable for domain hacks by comparing a list of common English words ending in “ly” against the Libyan ccTLD registry.  Then I analyzed the prevalence of these words in Google Books Ngram data to see how the frequency of a word in writing affects its probability of being registered for a domain hack.

Almost every popular adverb that can make a domain hack on “.ly” has already been registered.  Essentially all common words that remain unregistered fall into one of three undesirable categories: 1) short words that can only be registered by Libyans, 2) words with negative connotations, and 3) words not permitted by the Libyan registrar.



There are many country-code top-level domains (ccTLDs) on the internet, most of which are two characters from the Latin alphabet (for example, .ca for Canada, .de for Germany, .es for Spain).[1]  The Libyan ccTLD .ly has gained prominence for myriad domain hacks (using the second-level and top-level domains together to create a word) such as brief.ly or simp.ly.  Other well-known examples include bit.ly (a .ly domain that is not an English word) and del.icio.us (a three-level domain hack using the .us ccTLD for the United States).

Registration of a .ly domain is relatively expensive ($75/year) compared to other TLDs, so a number of domains remain available.  I analyzed the 2,000 most common English words ending in “ly” with 4–9 letters.[2]  Table 1 shows the frequency of registration based on word length for 6–9 letter words.[3]  Unsurprisingly, the data indicates that shorter domains are in higher demand in relative terms.

Table 1: Domain Registrations Based on Word Length
Number of Letters in Word    Registered Domains    Unregistered Domains    Total Domains
Six                          186 (66%)             96 (34%)                282
Seven                        153 (37%)             263 (63%)               416
Eight                        158 (29%)             382 (71%)               540
Nine                         150 (23%)             494 (77%)               644

Even more interesting is the distribution of domain registrations based on word popularity.  I pulled numbers from Google Books Ngram data to check how frequently each word appears across the corpus of English-language books.[4]  Although Ngrams is not perfect, it is a good proxy for the words’ prevalence in everyday usage.  To make the proxy even more relevant, I use only the fifteen years 1994–2008.
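
One way to turn yearly Ngram counts into a single frequency for 1994–2008 is to pool the counts over the window.  The sketch below assumes rows of (year, match_count) for one word and a year-to-total mapping for the corpus.  The pooling choice, the function name, and the toy numbers are all assumptions for illustration, not the procedure or data behind the tables in this post.

```python
def window_frequency(word_rows, totals, first_year=1994, last_year=2008):
    """Pooled relative frequency (as a percentage) over an inclusive year window.

    word_rows: iterable of (year, match_count) pairs for one word.
    totals:    dict mapping year -> total word count in the corpus that year.
    """
    years = range(first_year, last_year + 1)
    matches = sum(count for year, count in word_rows if year in years)
    total = sum(totals.get(year, 0) for year in years)
    if total == 0:
        raise ValueError("no corpus totals for the requested window")
    return 100 * matches / total

# Toy numbers, invented for illustration only.
rows = [(1993, 900), (1994, 1000), (2008, 2000), (2009, 5000)]
totals = {1993: 1_000_000, 1994: 1_000_000, 2008: 2_000_000, 2009: 2_000_000}
print(window_frequency(rows, totals))  # rows outside 1994-2008 are ignored
```

Pooling weights each year by the size of its corpus; averaging the yearly percentages instead would be an equally defensible choice and can give slightly different numbers.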

Table 2 lists the ten most common English words ending in “ly” in Ngrams, along with their frequency in English and registration status.  The first unregistered word, “gradually,” does not appear until rank 56.

Table 2: Most Common Words, Frequency, and Domain Status
Rank                  Word         English Frequency    Registered
1                     only         0.112%               Yes
2                     family       0.034%               Yes
3                     early        0.030%               Yes
4                     really       0.019%               Yes
5                     usually      0.018%               Yes
6                     likely       0.015%               Yes
7                     probably     0.015%               Yes
8                     simply       0.014%               Yes
9                     generally    0.013%               Yes
10                    actually     0.012%               Yes
56 (first unreg’d)    gradually    0.003%               No

The probability of a domain being registered is highly dependent on the word’s prevalence in common usage.  For domains corresponding to the top 1,000 words used, 56% are registered.  For domains corresponding to the next 1,000 (that is, words 1,001–2,000), only 16% are registered.  Table 3 shows the collective English frequency of each group of 500 words.

Table 3: Common Words and Domain Status by Prevalence Group
Word Prevalence    Domains Registered    Collective Frequency
1–500              382 (76%)             0.797%
501–1,000          179 (36%)             0.013%
1,001–1,500        106 (21%)             0.001%
1,501–2,000        49 (10%)              0.000%
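
The 56% and 16% figures follow directly from the per-group counts in Table 3.  As a quick check, here is a short sketch that recomputes them; the variable and function names are mine.

```python
# (rank group, registered domains, collective frequency in %) from Table 3.
groups = [
    ("1-500", 382, 0.797),
    ("501-1,000", 179, 0.013),
    ("1,001-1,500", 106, 0.001),
    ("1,501-2,000", 49, 0.000),
]
GROUP_SIZE = 500  # each prevalence group covers 500 words

def registered_share(rows):
    """Percent of domains registered across the given groups."""
    registered = sum(reg for _, reg, _ in rows)
    return 100 * registered / (GROUP_SIZE * len(rows))

top_1000 = registered_share(groups[:2])   # words 1-1,000
next_1000 = registered_share(groups[2:])  # words 1,001-2,000
print(f"{top_1000:.1f}% vs {next_1000:.1f}%")
```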

The 561 registered domains out of the top 1,000 represent a whopping 0.769% collective frequency, while the 439 unregistered domains out of the top 1,000 represent only 0.041% collectively.  In fact, the 44 most common words alone sum to over 0.500% frequency, and all of these words are already registered as domain hacks.  The next 44 (that is, words 45–88) add an additional 0.126% collective frequency.
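
The "how many words does it take" question is a running-total calculation.  Below is a minimal sketch using only the ten frequencies published in Table 2, since the full ranked list is not reproduced in this post; the function name is hypothetical.

```python
from itertools import accumulate

def words_needed(frequencies, threshold):
    """Number of top-ranked words whose frequencies sum to at least threshold.

    frequencies must be sorted from most to least common.  Returns None if
    the threshold is never reached.
    """
    for count, running_total in enumerate(accumulate(frequencies), start=1):
        if running_total >= threshold:
            return count
    return None

# Frequencies (in %) of the ten most common words, from Table 2.
top_ten = [0.112, 0.034, 0.030, 0.019, 0.018, 0.015, 0.015, 0.014, 0.013, 0.012]
print(words_needed(top_ten, 0.25))
```

With just these ten values, the first eight words already pass 0.25%; reaching 0.500% requires the longer ranked list.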

Libya now requires that a registrant be based in Libya in order to register second-level domains of two or three characters (four- and five-letter words, respectively).  This was not always the case, but the rule now makes these domains much more difficult to obtain, so registration levels for such short names are relatively low.  Table 4 shows the most common five-letter words, their relative frequency in English, and the registration status of the corresponding domains.  Note that #8 (bad.ly) is unregistered, perhaps because of its negative connotation.  Of all the words examined, “negative” words appear to be registered less often.

Table 4: Most Common Five-Letter Words, Frequency, and Domain Status
Rank                 Word     English Frequency    Registered
1                    early    0.030%               Yes
2                    fully    0.007%               Yes
3                    daily    0.008%               Yes
4                    apply    0.007%               Yes
5                    truly    0.004%               Yes
6                    newly    0.003%               Yes
7                    reply    0.002%               Yes
8 (first unreg’d)    badly    0.002%               No
9                    imply    0.001%               Yes
10                   belly    0.001%               Yes
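
The residency rule described before Table 4 depends only on the length of the second-level domain, so it is easy to express as a small helper.  This is a sketch under the rule as stated in this post (two- or three-character second-level domains require a Libya-based registrant); the function names are hypothetical and this is not an official NIC.LY check.

```python
TLD = "ly"

def to_domain_hack(word):
    """Split a word ending in "ly" into a second-level domain plus .ly."""
    word = word.lower()
    if not word.endswith(TLD) or len(word) <= len(TLD):
        raise ValueError(f"{word!r} cannot form a .ly domain hack")
    return f"{word[:-len(TLD)]}.{TLD}"

def needs_libyan_registrant(word):
    """True when the second-level domain has two or three characters."""
    second_level = to_domain_hack(word).split(".")[0]
    return len(second_level) in (2, 3)

for w in ["only", "badly", "likely"]:
    print(w, to_domain_hack(w), needs_libyan_registrant(w))
```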

Six-letter words yield the shortest domains (four-character second-level domains) eligible for registration by non-Libyan entities.  Table 5 shows the most common six-letter words, their relative frequency in English, and the registration status of the corresponding domains.  Again we see that the first unregistered word is one with a negative connotation (“weakly”).  The frequency of “weakly” (0.0004%) rounds to zero at the three decimal places used in the table.

Table 5: Most Common Six-Letter Words, Frequency, and Domain Status
Rank                  Word      English Frequency    Registered
1                     family    0.034%               Yes
2                     really    0.019%               Yes
3                     likely    0.015%               Yes
4                     simply    0.014%               Yes
5                     highly    0.009%               Yes
6                     easily    0.008%               Yes
7                     nearly    0.008%               Yes
8                     supply    0.008%               Yes
9                     merely    0.006%               Yes
10                    slowly    0.006%               Yes
53 (first unreg’d)    weakly    0.000%               No

Domains for seven-, eight-, and nine-letter words are also eligible for registration by non-Libyan entities.  Table 6 shows the most common seven-letter words, their relative frequency in English, and the registration status of the corresponding domains.  Here too the first unregistered word (“heavily”) often carries a negative connotation.

Table 6: Most Common Seven-Letter Words, Frequency, and Domain Status
Rank                  Word       English Frequency    Registered
1                     usually    0.018%               Yes
2                     clearly    0.011%               Yes
3                     quickly    0.009%               Yes
4                     finally    0.015%               Yes
5                     exactly    0.006%               Yes
6                     largely    0.005%               Yes
7                     closely    0.005%               Yes
8                     equally    0.005%               Yes
9                     rapidly    0.004%               Yes
10                    greatly    0.003%               Yes
13 (first unreg’d)    heavily    0.003%               No

Table 7 shows the most common eight-letter words, their relative frequency in English, and the registration status of the corresponding domains.  Note that “sexually” is actually the first eight-letter word (#25) that is unregistered, rather than “narrowly” (#46).  But Libyan regulations prohibit domain names from being “obscene, scandalous, indecent, or contrary to Libyan law or Islamic morality words, phrases [or] abbreviations,” so sexual.ly is not eligible for registration.[5]  “Narrowly” may not have a precisely negative connotation, but it is not usually positive.  The frequency of “narrowly” is 0.0005%.

Table 7: Most Common Eight-Letter Words, Frequency, and Domain Status
Rank                  Word        English Frequency    Registered
1                     probably    0.015%               Yes
2                     actually    0.012%               Yes
3                     directly    0.010%               Yes
4                     recently    0.006%               Yes
5                     entirely    0.005%               Yes
6                     suddenly    0.006%               Yes
7                     slightly    0.005%               Yes
8                     possibly    0.004%               Yes
9                     commonly    0.004%               Yes
10                    strongly    0.004%               Yes
46 (first unreg’d)    narrowly    0.000%               No

Finally, Table 8 shows the most common nine-letter words, their relative frequency in English, and the registration status of the corresponding domains.  “Gradually” is the only word of six to nine letters that ranks in the top ten for its length yet remains an unregistered domain.

Table 8: Most Common Nine-Letter Words, Frequency, and Domain Status
Rank                  Word         English Frequency    Registered
1                     generally    0.013%               Yes
2                     certainly    0.009%               Yes
3                     primarily    0.005%               Yes
4                     carefully    0.005%               Yes
5                     extremely    0.005%               Yes
6                     typically    0.005%               Yes
7                     currently    0.004%               Yes
8                     precisely    0.004%               Yes
9                     obviously    0.004%               Yes
10 (first unreg’d)    gradually    0.003%               No

In summary, of the English words that end in “ly,” many of which are adverbs, the vast majority have already been registered as domain hacks using the .ly Libyan ccTLD.  The words that remain unregistered all occur less frequently in English.  Of the relatively popular words that remain unregistered, three categories emerge: 1) four- and five-letter words that can only be registered by a Libyan entity, 2) words with some negative connotation (such as “heavily,” “badly,” and “narrowly”), and 3) words that the Libyan registrar does not permit (such as “sexually”).  Although there are a number of popular companies that have found clever names without using real words (for example, optimize.ly and plot.ly), the data suggest that the .ly ccTLD is not the right place for people looking to secure a valuable English domain hack.



Notes
[1] There are currently 295 ccTLDs according to the IANA Root Zone Database.

[2] Note that a six-letter word (for example, “likely”) has four characters plus “ly,” so the relevant registration is for a four-letter second-level domain (here, “like”).  I ran this analysis against 2,630 words from a list of common words ending in “ly.”  There are certainly many more words that are less frequently used.  One Libyan domain host claims to list more than 18,000, although these include such gems as “quedly,” “tachytely,” and “placeyourfeetcarefully.”  Data was gathered on February 26, 2014.  “Most common” is determined by words’ relative prevalence in the Google Books Ngram dataset (see [4]).

[3] In order to register second-level domains of two or three characters (four- and five-letter words, respectively), the registrant must be based in Libya.  Frequency of registration for those words appears in Table 9.  All tables refer to English words ending in “ly” and domain status for .ly domains.

Table 9: Domain Registrations Based on Word Length for Short Domains
Number of Letters in Word    Registered Domains    Unregistered Domains    Total Domains
Four                         14 (70%)              6 (30%)                 20
Five                         55 (56%)              43 (44%)                98

[4] Google Books Ngram data version 2 (July 2012).  Ngram data is only available through 2008.

[5] See NIC.LY Regulations, section 4.2.


Additional Note
A number of common words have been registered as .ly domain hacks but either do not have a website (for example, fami.ly) or redirect to the sales site for a domain squatter (for example, usual.ly).  Interestingly, the most common squatter I encountered is amazing.ly, which is purportedly affiliated with Libyan Spider, one of the few domain-name registrars for .ly domains.  If the affiliation is real, that represents questionable behavior from a registrar and might even raise concerns of domain-name front running.
